Editing, Personalization & Prompts

Papers on text-guided image editing, subject-driven personalization, and prompt engineering and optimization.

Text-Guided Image Editing & Manipulation (34+ papers)

Model | Full Title | Venue | Year
WorldEdit | WorldEdit: Towards Open-World Image Editing with a Knowledge-Informed Benchmark | arXiv | 2026
SliderEdit | SliderEdit: Continuous Image Editing with Fine-Grained Instruction Control | arXiv | 2025
UltraEdit | UltraEdit: Instruction-Based Fine-Grained Image Editing at Scale | arXiv | 2024
FlexEdit | FlexEdit: Flexible and Controllable Diffusion-based Object-centric Image Editing | arXiv | 2025
MagicBrush | MagicBrush: A Manually Annotated Dataset for Instruction-Guided Image Editing | NeurIPS | 2024
OmniEdit | OmniEdit: Building Image Editing Generalist Models Through Specialist Supervision | arXiv | 2025
ICEdit | ICEdit: Instruction-based Image Editing via In-Context Learning with Multimodal Models | arXiv | 2025
Step1X-Edit | Step1X-Edit: A Practical Framework for General Image Editing | StepFun | 2025
In-Context Edit | Enabling Instructional Image Editing with In-Context Generation in Large Scale Diffusion Transformer | arXiv | 2025
SmartEdit | Exploring Complex Instruction-based Image Editing with Multimodal LLMs | CVPR | 2024
MultiEdits | Simultaneous Multi-Aspect Editing with Text-to-Image Diffusion Models | arXiv | 2024
StyleShot | A Snapshot on Any Style (Style Transfer) | arXiv | 2024
Instruct-Imagen | Image Generation with Multi-modal Instruction | CVPR | 2024
AnimateDiff | Animate Your Personalized Text-to-Image Diffusion Models without Specific Tuning | arXiv | 2023
DreamBooth | Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation | CVPR | 2023
Break-A-Scene | Extracting Multiple Concepts from a Single Image | SIGGRAPH Asia | 2023
MasaCtrl | Tuning-free Mutual Self-Attention Control for Consistent Image Synthesis and Editing | arXiv | 2023
Delta Denoising Score | Delta Denoising Score | arXiv | 2023
DiffEdit | Diffusion-based Semantic Image Editing with Mask Guidance | ICLR | 2023
Plug-and-Play Diffusion | Plug-and-Play Diffusion Features for Text-Driven Image-to-Image Translation | arXiv | 2022
Null-text Inversion | Null-text Inversion for Editing Real Images using Guided Diffusion Models | arXiv | 2022
InstructPix2Pix | Learning to Follow Image Editing Instructions | arXiv | 2022
Blended Diffusion | Text-driven Editing of Natural Images | CVPR | 2022
DiffusionCLIP | Text-Guided Diffusion Models for Robust Image Manipulation | CVPR | 2022
ManiTrans | Entity-Level Text-Guided Image Manipulation via Token-wise Semantic Alignment | CVPR | 2022
CLIPstyler | Image Style Transfer with a Single Text Condition | CVPR | 2022
Text2LIVE | Text-Driven Layered Image and Video Editing | arXiv | 2022
HairCLIP | Design Your Hair by Text and Reference Image | CVPR | 2022
CLIP-NeRF | Text-and-Image Driven Manipulation of Neural Radiance Fields | CVPR | 2022
LANIT | Language-Driven Image-to-Image Translation for Unlabeled Data | arXiv | 2022
StyleCLIP | Text-Driven Manipulation of StyleGAN Imagery | ICCV | 2021
Talk-to-Edit | Fine-Grained Facial Editing via Dialog | ICCV | 2021
Paint by Word | Paint by Word | arXiv | 2021
Lightweight T2I Manipulation | Lightweight Generative Adversarial Networks for Text-Guided Image Manipulation | NeurIPS | 2020

Subject-Driven & Personalized Generation (13+ papers)

Model | Full Title | Venue | Year
MAGREF | Masked Guidance for Any-Reference Video Generation | arXiv | 2025
Gen4Gen | Generative Data Pipeline for Generative Multi-Concept Composition | arXiv | 2024
MM-Diff | High-Fidelity Image Personalization via Multi-Modal Condition Integration | arXiv | 2024
ViCo | Plug-and-play Visual Condition for Personalized Text-to-image Generation | arXiv | 2023
DisenBooth | Disentangled Parameter-Efficient Tuning for Subject-Driven Text-to-Image Generation | arXiv | 2023
ELITE | Encoding Visual Concepts into Textual Embeddings for Customized Text-to-Image Generation | arXiv | 2023
InstantBooth | Personalized Text-to-Image Generation without Test-Time Finetuning | arXiv | 2023
Subject-driven T2I | Subject-driven Text-to-Image Generation via Apprenticeship Learning | arXiv | 2023
Controllable Textual Inversion | Controllable Textual Inversion for Personalized Text-to-Image Generation | arXiv | 2023
Lego | Learning to Disentangle and Invert Concepts Beyond Object Appearance | arXiv | 2023
P+ | Extended Textual Conditioning in Text-to-Image Generation | arXiv | 2023
Taming Encoder | Taming Encoder for Zero Fine-tuning Image Customization with T2I Diffusion Models | arXiv | 2023
Instance-Conditioned GAN | Instance-Conditioned GAN | NeurIPS | 2021

Prompt Engineering & Optimization (8 papers)

Title | Focus | Venue | Year
PromptCharm: T2I Generation through Multi-modal Prompting and Refinement | Multi-modal Prompting | CHI | 2024
Automated Black-box Prompt Engineering for Personalized T2I | Black-box Optimization | arXiv | 2024
BeautifulPrompt: Towards Automatic Prompt Engineering for T2I | Automatic Prompt Engineering | EMNLP | 2023
NeuroPrompts: Adaptive Framework to Optimize Prompts for T2I | Prompt Optimization | arXiv | 2023
Optimizing Prompts for Text-to-Image Generation | Prompt Optimization | arXiv | 2022
Best Prompts for Text-to-Image Models and How to Find Them | Aesthetic Prompt Search | arXiv | 2022
A Taxonomy of Prompt Modifiers for Text-To-Image Generation | Prompt Taxonomy | arXiv | 2022
Design Guidelines for Prompt Engineering T2I Generative Models | Design Guidelines | CHI | 2022