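This section reviews related work on Multimodal Summarization with Multimodal Output (MSMO), in which both the input and the output are multimodal.
7. Multimodal Meeting Summarization
This subsection reviews related work on multimodal meeting summarization. According to Microsoft's "Improving Productivity Through NLP", employees spend 37% of their working time in meetings, and an average meeting contains about 5,000 spoken words. Meetings this frequent and this long place a heavy burden on employees, so meeting summaries that quickly capture the core content (decisions, open questions, action items, and so on) can relieve that pressure and improve productivity. Meeting transcripts alone are not enough, however. Other modalities such as video and audio supply richer and more complete information: they show when someone joins or leaves the meeting, and gestures, tone of voice, and facial expressions help reveal whether the discussion is emotionally charged or contentious. For these reasons, multimodal meeting summarization has been attracting growing attention.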
[1]
Jindřich Libovický and Jindřich Helcl. Attention strategies for multi-source sequence-to-sequence learning. ACL 2017. https://www.aclweb.org/anthology/P17-2031
[2]
Yansen Wang, Ying Shen, Zhun Liu, P. P. Liang, Amir Zadeh, and Louis-Philippe Morency. Words can shift: Dynamically adjusting word representations using nonverbal behaviors. AAAI 2019.
[3]
Gen Li, N. Duan, Yuejian Fang, Daxin Jiang, and M. Zhou. Unicoder-VL: A universal encoder for vision and language by cross-modal pre-training. AAAI 2020.
[4]
R. Sanabria, Ozan Caglayan, Shruti Palaskar, Desmond Elliott, Loïc Barrault, Lucia Specia, and F. Metze. How2: A large-scale dataset for multimodal language understanding. NeurIPS 2018.
[5]
Shruti Palaskar, Jindřich Libovický, Spandana Gella, and F. Metze. Multimodal abstractive summarization for How2 videos. ACL 2019.
[6]
Haoran Li, Junnan Zhu, C. Ma, Jiajun Zhang, and C. Zong. Multi-modal summarization for asynchronous collection of text, image, audio and video. 2017.
[7]
Haoran Li, Junnan Zhu, Tianshang Liu, Jiajun Zhang, and C. Zong. Multi-modal sentence summarization with modality attention and image filtering. IJCAI 2018.
[8]
Junnan Zhu, Haoran Li, Tianshang Liu, Y. Zhou, Jiajun Zhang, and C. Zong. MSMO: Multimodal summarization with multimodal output. EMNLP 2018.
[9]
Junnan Zhu, Yin qing Zhou, Jiajun Zhang, Haoran Li, Chengqing Zong, and Changliang Li. Multimodal summarization with guidance of multimodal reference. AAAI 2020.
[10]
B. Erol, Dar-Shyang Lee, and J. Hull. Multimodal summarization of meeting recordings. ICME 2003.
[11]
Fumio Nihei, Yukiko I. Nakano, and Yutaka Takase. Fusing verbal and nonverbal information for extractive meeting summarization. GIFT 2018.
[12]
Manling Li, L. Zhang, H. Ji, and R. Radke. Keep meeting summaries on topic: Abstractive multimodal meeting summarization. ACL 2019.