HiDream.ai Wins Best Demo Award at ACM MM 2025: Redefining Conversational Visual Creation

Source: PR Newswire

BEIJING, Nov. 6, 2025 /PRNewswire/ -- HiDream.ai has received the Best Demo award at the 33rd ACM International Conference on Multimedia (ACM MM 2025), becoming the first Chinese startup team in multimodal generative AI to claim this honor. The result underscores the company's research strength and capacity for innovation in the field. The award recognizes HiDream-Agent, the company's unified multimodal agent, which turns complex visual content creation into an intuitive conversational experience.


ACM MM, organized by the Special Interest Group on Multimedia (SIGMM) of the Association for Computing Machinery (ACM), is a top-tier academic event in the global multimedia field. Dedicated to advancing research innovation and industrial application of multimedia technologies, it is widely regarded as one of the most authoritative and influential conferences in the field, attracting leading scholars and major technology companies worldwide. The Best Demo award reflects both international recognition of the research outcomes and the research team's competence in multimedia technology innovation and application.

HiDream-Agent's core strength lies in overcoming the limitations of fragmented multimodal tools. It integrates text-to-image generation, instruction-based image editing, and text/image-to-video generation within a single interface, addressing the industry-wide challenge of cross-modal semantic alignment. It is built on the 17-billion-parameter HiDream-I1 model, which features a sparse Diffusion Transformer (DiT) structure and a dynamic Mixture-of-Experts (MoE) design and delivers strong performance on international benchmarks such as HPS and GenEval. For instruction-based image editing, the team optimized HiDream-I1 with robust in-context visual conditioning, enabling precise image modifications.

This agent introduces a new paradigm for accessible, interactive visual storytelling and collaborative content creation in multimodal generative AI. By merging generation and editing into a dialogue-driven experience, it lowers the barrier to high-quality visual content creation, drastically shortens iteration cycles, and enables a "one-conversation" creative loop from idea to polished output. The prototype has since been developed into the Chat Generation feature of HiDream.ai's flagship product, vivago, giving users more natural, personalized multimodal interaction.

HiDream.ai also hosted the Identity-Preserving Video Generation (IPVG) Challenge at ACM MM 2025. The competition featured two tracks, Facial Identity-Preserving Video Generation and Full-Body Identity-Preserving Video Generation, and required participants to keep the given identity consistent throughout video generation. It also provided a new dataset to support the task of identity-preserving text-to-video generation, and it attracted numerous top-tier research teams worldwide.

HiDream.ai was founded in 2023 by Dr. Mei Tao, an Academician of the Canadian Academy of Engineering, Fellow of IEEE/IAPR/CAAI, and Senior Researcher at Microsoft Research Asia. The team he leads has over a decade of experience and is dedicated to the exploration and commercialization of generative AI technologies. HiDream.ai focuses on visual multimodal foundation models, aiming to empower the creative industry through generative AI technology. Notably, its HiDream-I1 model, launched in April 2025, topped the Artificial Analysis ranking within 24 hours, becoming the first independently developed Chinese generative AI model to enter the global top tier, and it has maintained its leading position ever since. Moving forward, the team will deepen multimodal technology innovation, accelerate the industrialization of its technologies, expand core application scenarios in digital creation and film/television post-production, foster global tech collaboration and academic exchanges, and deliver more intelligent, efficient AI creative solutions for creators worldwide.

Paper: https://doi.org/10.1145/3746027.3754467
