Nvidia's DiffUHaul AI tool can move objects in images

Nvidia's DiffUHaul AI model can move objects in images (Image Source: Omri Avrahami on YouTube)
Researchers from Nvidia have developed a new AI tool that can relocate objects in images. The tool can change the position of an object in an image without affecting the background.

Researchers from Nvidia have released a paper on a new AI tool, DiffUHaul, that can understand and move objects within an image without changing the object's size or the background. The paper says the tool "harnesses the spatial understanding of a localized text-to-image model, for the object dragging task."

Current text-to-image models struggle with complex image-editing tasks because they lack "spatial reasoning." DiffUHaul addresses this by building that reasoning into the editing process, which lets it track an object within an image and "seamlessly" relocate it without altering anything else.

To achieve this, the tool masks the object during the denoising steps, which helps the model pin down the object's location and separate it from the background. It then interpolates between the original and the generated image to place the object in its new position without touching the background. Finally, finer details and features from the original image are carried over to the new one for consistency.
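The masking and interpolation ideas can be illustrated with a toy, pixel-level sketch. This is not DiffUHaul's implementation, which works inside a diffusion model's denoising steps; it only shows, under simplifying assumptions, how a mask separates an object from its background and how a position can be interpolated between a start and an end point. The function names `lerp_box` and `move_object`, the constant `fill` value (a stand-in for real background inpainting), and the tiny example image are all hypothetical.

```python
import numpy as np


def lerp_box(src_pos, dst_pos, t):
    """Linearly interpolate between two (row, col) positions, t in [0, 1]."""
    src = np.asarray(src_pos, dtype=float)
    dst = np.asarray(dst_pos, dtype=float)
    return (1.0 - t) * src + t * dst


def move_object(image, mask, dy, dx, fill):
    """Shift the masked pixels by (dy, dx) and leave all other pixels as-is.

    `fill` is an assumed constant used to cover the vacated area; a real
    editor would have to synthesize plausible background there instead.
    """
    out = image.copy()
    out[mask] = fill  # clear the object's old location

    ys, xs = np.nonzero(mask)
    new_ys, new_xs = ys + dy, xs + dx

    # Keep only the object pixels that still land inside the image.
    inside = (
        (new_ys >= 0) & (new_ys < image.shape[0])
        & (new_xs >= 0) & (new_xs < image.shape[1])
    )
    out[new_ys[inside], new_xs[inside]] = image[ys[inside], xs[inside]]
    return out


# A 4x6 "image" with a 2x2 object (value 7) on a blank background (value 0).
image = np.zeros((4, 6), dtype=int)
image[1:3, 0:2] = 7
mask = image == 7

# Move the object three columns to the right.
moved = move_object(image, mask, dy=0, dx=3, fill=0)

# Halfway point between the old and new top-left corners of the object.
midpoint = lerp_box((1, 0), (1, 3), 0.5)
```

Because the mask marks exactly which pixels belong to the object, every pixel outside the old and new object locations is copied through unchanged, which is the property the article describes as leaving the background untouched.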

DiffUHaul is based on BlobGEN, a model that uses spatial understanding to compose images from complex prompts. The paper says the tool is training-free, which means it needs no additional training or fine-tuning on top of the existing model and works out of the box.

Rohith Bhaskar, 2024-12-03 (Update: 2024-12-03)