Personalized Text-to-Image Generative Models


Problem of Interest


Our Method & Results


Figure 1. The architecture of StyleBoost. StyleRef images of the target style, paired with the text prompt (“A [V] style”), and Aux images, collected from the Internet and paired with the prompt (“A style”), are provided as input images. After fine-tuning, the text-to-image model can generate diverse images in the target style under the guidance of text prompts.


Figure 2. Text-to-image synthesis with StyleBoost. Personalized images generated by StyleBoost, compared with the existing DreamBooth, for three different styles. Across the person, animal, and background (landscape) categories, our model generates well-aligned, high-fidelity images that faithfully capture the target style.


Table 1. Comparison of FID for each style under different compositions of StyleRef and Aux images. For the Aux-image comparison, the StyleRef composition is fixed to Back+person.


Publications & GitHub


GitHub: ION-dgu/StyleBoost (https://github.com/ION-dgu/StyleBoost), code for the ICTC conference paper.