Research Highlights: An Image is Worth One Word: Personalizing Text-to-Image Generation using Textual Inversion

Title of Paper: An Image is Worth One Word: Personalizing Text-to-Image Generation using Textual Inversion

Code: https://github.com/rinongal/textual_inversion

Overview: Recently, there has been an increase in text-to-image models that allow the synthesis of novel scenes and rich images in a range of styles. In the artistic creation process with these generative models, however, crafting text descriptions that render a desired target remains a challenge. It is not clear how to generate images of specific unique concepts, modify their appearance, or compose them in different roles and novel scenes. The featured research paper proposes a new approach designed to tackle these challenges and allow for more creative freedom with these generative systems.

This new research takes a few images of a concept and learns to represent it through new “words” in the embedding space of a frozen text-to-image model. Through a process called “textual inversion,” the goal is to find new pseudo-words in the embedding space that capture both high-level semantics and fine visual details. These pseudo-words can then be composed into new sentences that guide novel, personalized creations. Results demonstrate that this approach to personalizing text-to-image generation provides high visual fidelity and enables robust editing of scenes.
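The core idea above can be sketched in miniature: keep the model's weights frozen and optimize only a single new embedding vector so that, when passed through the frozen model, it reproduces features of the concept images. The toy below is a hypothetical illustration, not the paper's implementation — a fixed random linear map stands in for the frozen diffusion model, and a plain reconstruction loss stands in for the denoising objective.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the frozen text-to-image model: a fixed linear map from the
# 4-dim text-embedding space to an 8-dim "image feature" space. Never updated.
W = rng.normal(size=(8, 4))

# Stand-in for features of the user's concept images.
target = rng.normal(size=8)

# The only trainable parameter: the embedding of the new pseudo-word.
v = rng.normal(size=4)

def loss(v):
    """Squared reconstruction error of the frozen model's output."""
    return float(np.sum((W @ v - target) ** 2))

initial_loss = loss(v)

# Plain gradient descent on the pseudo-word embedding alone.
lr = 0.01
for _ in range(2000):
    grad = 2 * W.T @ (W @ v - target)  # gradient of ||W v - target||^2 w.r.t. v
    v -= lr * grad

final_loss = loss(v)
```

In the actual method, the equivalent of `v` is a new entry in the tokenizer's embedding table, and the loss is the diffusion model's denoising objective on the concept images; everything else stays frozen, which is why a handful of images suffices.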
