论文名称:Counterfactual Off-Policy Training for Neural Dialogue Generation 论文作者:朱庆福,张伟男,刘挺,王威廉 原创作者:朱庆福 论文链接:https://arxiv.org/abs/2004.14507 转载须标注出处:哈工大SCIR
2. 模型结构
2.1 结构因果模型(Structural Causal Model)
2.2 干预(Intervention)
2.3 反事实推理(Counterfactual Inference)
3. 实验结果
4. 实验分析
5. 结论
参考文献
[1] Judea Pearl and Dana Mackenzie. 2018. The book of why: the new science of cause and effect. Basic Books.
[2] Lars Buesing, Theophane Weber, Yori Zwols, Nicolas Heess, Sebastien Racaniere, Arthur Guez, and Jean Baptiste Lespiau. 2019. Woulda, coulda, shoulda: Counterfactually-guided policy search. In Proceedings of the Seventh International Conference on Learning Representations.
[3] Michael Oberst and David Sontag. 2019. Counterfactual off-policy evaluation with gumbel-max structural causal models. In International Conference on Machine Learning, pages 4881–4890.
[4] Iulian V Serban, Alessandro Sordoni, Yoshua Bengio, Aaron Courville, and Joelle Pineau. 2016. Building end-to-end dialogue systems using generative hierarchical neural network models. In Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence.
[5] Jingjing Xu, Xuancheng Ren, Junyang Lin, and Xu Sun. 2018. Diversity-promoting GAN: A cross-entropy based generative adversarial network for diversified text generation. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 3940–3949.
[6] Jiwei Li, Will Monroe, Tianlin Shi, Se ́bastien Jean, Alan Ritter, and Dan Jurafsky. 2017a. Adversarial learning for neural dialogue generation. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pages 2157–2169.
[7] Yi-Lin Tuan and Hung-Yi Lee. 2019. Improving conditional sequence generative adversarial networks by stepwise evaluation. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 27(4):788–798.