The Diffusion Transformer (DiT) family has a major new entry, "Flag-DiT", which aims to cover images, video, audio, and 3D in a single framework.
Paper: https://arxiv.org/pdf/2405.05945
GitHub: https://github.com/Alpha-VLLM/Lumina-T2X
Model download: https://huggingface.co/Alpha-VLLM/Lumina-T2I/tree/main
Paper title: Lumina-T2X: Transforming Text into Any Modality, Resolution, and Duration via Flow-based Large Diffusion Transformers
Demo 1: http://106.14.2.150:10021/
Demo 2: http://106.14.2.150:10022/