Adversarial (GANs)
Autoregressive
- [2404.02905] Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction
- [2410.10812] HART: Efficient Visual Generation with Hybrid Autoregressive Transformer
- [2412.01819] Switti: Designing Scale-Wise Transformers for Text-to-Image Synthesis
- [2412.04431] Infinity: Scaling Bitwise AutoRegressive Modeling for High-Resolution Image Synthesis
- [2412.04332] Liquid: Language Models are Scalable Multi-modal Generators
- [2412.03069] TokenFlow: Unified Image Tokenizer for Multimodal Understanding and Generation
- [2411.18447] Continuous Autoregressive Models with Noise Augmentation Avoid Error Accumulation
- [2412.12095] Causal Diffusion Transformers for Generative Modeling
- [2411.19722] JetFormer: An Autoregressive Generative Model of Raw Images and Text
- [ ] Liquid
Diffusion
Diffusion Transformers
- [2412.16112] CLEAR: Conv-Like Linearization Revs Pre-Trained Diffusion Transformers Up
- [2412.12391] Efficient Scaling of Diffusion Transformers for Text-to-Image Generation
Patterns
Latent Space
VAE
VQVAE
- [2312.02116] GIVT: Generative Infinite-Vocabulary Transformers
- [2411.19722] JetFormer: An Autoregressive Generative Model of Raw Images and Text
- [2412.01819] Switti: Designing Scale-Wise Transformers for Text-to-Image Synthesis
- [2412.01199] TinyFusion: Diffusion Transformers Learned Shallow
- [2412.03177] PatchDPO: Patch-level DPO for Finetuning-free Personalized Image Generation
- [2412.06774] Visual Lexicon: Rich Image Features in Language Space