Nvidia's AI Tool DiffUHaul Lets You Move Objects in Photos

On Monday, Nvidia researchers unveiled a new artificial intelligence (AI) model that can move objects within an image. The tool, called DiffUHaul, can relocate an object from one position to another without altering the rest of the image or its backdrop, because it has a spatial understanding of the scene. Its distinctive feature is that it is training-free, meaning the method itself requires no additional training or fine-tuning. The company presented the technique at SIGGRAPH Asia 2024, a conference of the Special Interest Group on Computer Graphics and Interactive Techniques.

Nvidia researchers described the tool in a research paper. It was developed in collaboration with Reichman University, Tel Aviv University, and The Hebrew University of Jerusalem. The researchers’ goal was to address a major weakness of AI image generation models: their inability to move elements within an image in a spatially aware way.

The paper notes that this editing task has remained a hurdle for researchers because AI models lack spatial reasoning. Existing image models can grasp the overall context of a picture, but they do not understand how moving an object within the 2D image corresponds to a change in the scene’s space.

Nvidia says DiffUHaul can resolve this problem. The tool is built on an image diffusion architecture and applies attention masking during each denoising step to preserve the object’s high-level appearance. Spatial awareness comes from BlobGEN, a method that the tool incorporates. The researchers also introduced new techniques for reconstructing real images with this localized model, so that edits land in the appropriate location.
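
For readers unfamiliar with the term, the following is a minimal sketch of attention masking in general, not DiffUHaul's actual implementation. It assumes a toy "image" of four tokens, each labelled as object or background, and a hypothetical function `masked_attention` that lets each token attend only to tokens in its own region. The labels, token values, and function name are all illustrative assumptions.

```python
import math

def masked_attention(queries, keys, values, mask):
    """Scaled dot-product attention; mask[i][j] is True if query i may attend to key j.

    Assumes every query has at least one allowed key.
    """
    scale = math.sqrt(len(queries[0]))
    outputs, all_weights = [], []
    for i, q in enumerate(queries):
        # Disallowed pairs get a score of -inf, so their softmax weight is exactly 0.
        scores = [
            sum(a * b for a, b in zip(q, k)) / scale if mask[i][j] else -math.inf
            for j, k in enumerate(keys)
        ]
        top = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Each output is a weighted mix of the allowed value vectors only.
        mixed = [sum(w * v[c] for w, v in zip(weights, values))
                 for c in range(len(values[0]))]
        outputs.append(mixed)
        all_weights.append(weights)
    return outputs, all_weights

# Hypothetical toy image: tokens 0 and 1 are the object, tokens 2 and 3 the background.
region = ["object", "object", "background", "background"]
mask = [[a == b for b in region] for a in region]

tokens = [[1.0, 0.0], [0.8, 0.2], [0.0, 1.0], [0.1, 0.9]]
outputs, weights = masked_attention(tokens, tokens, tokens, mask)
```

In this sketch the object tokens never blend with the background tokens, which is the general idea behind using a mask to keep one region's appearance from leaking into another during denoising.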

Users enter a text prompt on the front end that identifies the object they want to move, and the AI then repositions the object and adjusts the background accordingly. However, the company’s demonstrations did not make clear whether the tool can account for the shape changes that accompany spatial movement. For example, a balloon floating in the air changes shape when it is brought down to the ground. Because the method is training-free, the AI might not recognize that.

DiffUHaul uses BlobGEN for spatial understanding, which enables robust object dragging without fine-tuning.

Tags: AI DiffUHaul nvidia

Akinola Ajibola
