Vox-E: Text-guided Voxel Editing of 3D Objects

Supplementary Material

 


 


Ablations

We ablate the key components in our framework by comparing a variety of losses against our selected loss term:

  1. Image space L1 loss (col 2)
  2. Image space L1 loss with density-guided postprocessing* (col 3)
  3. Image space L2 loss (col 4)
  4. Image space L2 loss with density-guided postprocessing* (col 5)
  5. Volumetric L1 loss (col 6)
  6. Volumetric L2 loss (col 7)
  7. Our correlation-based volumetric regularization (col 8)
  8. Our final result, after refinement (col 9)
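To make the difference between the volumetric variants concrete, the following is a minimal sketch, not the authors' implementation. It assumes that each loss compares the density grid of the initial object with that of the edited object, and that "correlation-based" means one minus the Pearson correlation of the flattened grids; the function names, the `eps` stabilizer, and the toy grids are hypothetical.

```python
import numpy as np

def volumetric_l1(initial, edited):
    """Mean absolute difference between two density grids."""
    return float(np.mean(np.abs(edited - initial)))

def volumetric_l2(initial, edited):
    """Mean squared difference between two density grids."""
    return float(np.mean((edited - initial) ** 2))

def correlation_loss(initial, edited, eps=1e-8):
    """One minus the Pearson correlation of the flattened grids.

    Assumption: this is one plausible reading of a correlation-based
    volumetric regularizer; eps only guards against division by zero.
    """
    a = initial.ravel() - initial.mean()
    b = edited.ravel() - edited.mean()
    corr = np.dot(a, b) / (np.sqrt(np.dot(a, a) * np.dot(b, b)) + eps)
    return float(1.0 - corr)

# Toy grids: the "edited" grid is a scaled and shifted copy of the initial one.
rng = np.random.default_rng(0)
initial = rng.random((8, 8, 8))
scaled = 2.0 * initial + 0.5

l1_penalty = volumetric_l1(initial, scaled)
l2_penalty = volumetric_l2(initial, scaled)
corr_penalty = correlation_loss(initial, scaled)
```

Under these assumptions, a grid that is only rescaled and shifted is penalized by the L1 and L2 terms but leaves the correlation term near zero, since correlation is invariant to positive affine changes of the values.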

* Because image-space losses yield high levels of noise, we additionally show results obtained with density-guided postprocessing, which removes voxels whose density in the initial grid falls below a threshold.
Note that this postprocessing does not allow for geometric changes (as is evident, for instance, from the removal of the hat in the third row); we include it only to better visualize the differences between the losses.
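The density-guided postprocessing described above can be sketched as a simple mask over the voxel grid. This is an illustrative sketch under stated assumptions rather than the authors' code: the function name, the threshold value of 0.1, and the choice of zero as the "empty" density are all hypothetical.

```python
import numpy as np

def density_guided_filter(edited_density, initial_density,
                          threshold=0.1, empty_value=0.0):
    """Keep edited densities only where the INITIAL grid is occupied.

    Assumption: a voxel counts as occupied when its initial density is
    at least `threshold`; all other voxels are reset to `empty_value`.
    """
    keep = initial_density >= threshold
    return np.where(keep, edited_density, empty_value)

# Toy example: a 2x2x2 solid cube inside an otherwise empty 4x4x4 grid.
initial = np.zeros((4, 4, 4))
initial[1:3, 1:3, 1:3] = 1.0

# The edited grid has low-level noise everywhere, plus one new dense voxel
# outside the original shape (standing in for added geometry such as a hat).
edited = initial + 0.25
edited[0, 0, 0] = 5.0

filtered = density_guided_filter(edited, initial)
```

In this toy case the noise outside the cube is removed, but so is the new voxel at `[0, 0, 0]`, which mirrors the limitation noted above: geometry that was not present in the initial grid cannot survive the filter.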

Columns, left to right: Input | Image space L1 (IS-L1) | IS-L1 w/ postprocessing | Image space L2 (IS-L2) | IS-L2 w/ postprocessing | Volumetric L1 | Volumetric L2 | Ours unrefined | Ours refined

A kangaroo wearing a christmas sweater



A dog wearing big sunglasses



A cat wearing a birthday party hat



A kangaroo wearing a birthday party hat



A dog wearing a christmas sweater



A cat wearing big sunglasses