SerialGen: Personalized Image Generation by First Standardization Then Personalization

Global Business Unit, Baidu Inc.

*Indicates Equal Contribution

CVPR 2025

Consistency in overall appearance · Text controllability · Only one reference image required · Tuning-free

Serial images generated by SerialGen. Our method can produce personalized images that faithfully recover the reference image’s overall appearance while accurately responding to a wide range of text prompts.

Application to story creation.

Abstract

In this work, we aim to achieve both high text controllability and overall appearance consistency when generating personalized human characters. We propose a novel framework, SerialGen, a serial generation method consisting of two stages: first, a standardization stage that standardizes reference images, and then a personalized generation stage based on the standardized reference. Furthermore, we introduce two modules aimed at enhancing the standardization process. Our experimental results validate the framework's ability to produce personalized images that faithfully recover the reference image's overall appearance while accurately responding to a wide range of text prompts. Through thorough analysis, we highlight the critical contribution of the serial generation method and the standardization model, which improve appearance consistency both between reference and output images and across serial outputs generated from diverse text prompts. The term "Serial" in this work carries a double meaning: it refers to the two-stage method and also underlines our ability to generate serial images with a consistent appearance throughout.

Method

Overview of the proposed SerialGen with two stages: (1) Standardization – training a standardization model on synthetic data, and (2) Personalization – using the standardization model to create (standardized reference, target) pairs for personalized text-to-image model training. During inference, once a reference image is standardized, serial images can be generated based on different text prompts.
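
The data flow described in this overview can be illustrated with a small sketch. It is not the authors' implementation: `SerialPipeline`, `build_training_pairs`, `toy_standardize`, and `toy_generate` are hypothetical names, images are represented as plain strings so the example runs without any ML library, and building each training pair by standardizing the target image itself is an assumption about how the (standardized reference, target) pairs are formed. The sketch only shows that the reference is standardized once and then reused across prompts.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Images are plain strings here so the sketch runs without any ML library.
Image = str


def build_training_pairs(
    targets: List[Image], standardize: Callable[[Image], Image]
) -> List[Tuple[Image, Image]]:
    """Stage 2 data preparation (assumed): pair each target image with
    its standardized version, giving (standardized reference, target)."""
    return [(standardize(t), t) for t in targets]


@dataclass
class SerialPipeline:
    """Inference-time flow: standardize once, then generate per prompt."""

    standardize: Callable[[Image], Image]    # stage 1 model (hypothetical stand-in)
    generate: Callable[[Image, str], Image]  # stage 2 model (hypothetical stand-in)

    def run(self, reference: Image, prompts: List[str]) -> List[Image]:
        # The reference is standardized a single time ...
        standardized = self.standardize(reference)
        # ... and the same standardized reference conditions every prompt.
        return [self.generate(standardized, p) for p in prompts]


# Toy stand-ins that only track the data flow; real models would go here.
calls = {"standardize": 0}


def toy_standardize(img: Image) -> Image:
    calls["standardize"] += 1
    return f"std({img})"


def toy_generate(ref: Image, prompt: str) -> Image:
    return f"{ref} | {prompt}"


pipeline = SerialPipeline(toy_standardize, toy_generate)
outputs = pipeline.run("ref.png", ["reading a book", "riding a bike"])
pairs = build_training_pairs(["a.png"], toy_standardize)
```

Because every output in the series is conditioned on the same standardized reference, any variation between outputs comes from the text prompt alone, which is the intuition behind consistent serial images.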

BibTeX

@article{xie2024serialgenpersonalizedimagegeneration,
        title={SerialGen: Personalized Image Generation by First Standardization Then Personalization},
        author={Cong Xie and Han Zou and Ruiqi Yu and Yan Zhang and Zhenpeng Zhan},
        journal={arXiv preprint arXiv:2412.01485},
        year={2024},
}