Adapting the Neural Encoder-Decoder Framework from Single to Multi-Document Summarization

Abstract

Generating an abstract from a set of relevant documents remains challenging. Despite the development of the neural encoder-decoder framework, prior studies focus primarily on single-document summarization, possibly because labelled training data can be automatically harvested from the Web. In contrast, labelled data for multi-document summarization are scarce. There is thus a growing need to adapt the encoder-decoder framework from single- to multi-document summarization in an unsupervised fashion. In this paper, we present an initial investigation into a novel adaptation method. It exploits the maximal marginal relevance method to select representative sentences from the multi-document input and leverages an abstractive encoder-decoder model to fuse the disparate sentences into an abstractive summary. The adaptation method is robust and itself requires no training data. Our system compares favorably to state-of-the-art extractive and abstractive approaches, as judged by both automatic metrics and human assessors.
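
To make the sentence-selection stage concrete, below is a minimal sketch of maximal marginal relevance (MMR) applied to multi-document input. It is not the paper's exact implementation: the TF-IDF similarity, the trade-off value lam, and the function name mmr_select are illustrative assumptions.

```python
# Minimal MMR sketch: greedily pick sentences that are relevant to the
# whole multi-document input yet non-redundant with sentences already chosen.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity


def mmr_select(sentences, num_select=5, lam=0.6):
    """Return `num_select` representative sentences via MMR.

    lam balances relevance to the full input against redundancy
    with already-selected sentences (an assumed setting, not the paper's).
    """
    vec = TfidfVectorizer().fit(sentences)
    sent_vecs = vec.transform(sentences)
    doc_vec = vec.transform([" ".join(sentences)])

    relevance = cosine_similarity(sent_vecs, doc_vec).ravel()
    pairwise = cosine_similarity(sent_vecs)

    selected, candidates = [], list(range(len(sentences)))
    while candidates and len(selected) < num_select:
        def mmr_score(i):
            redundancy = max(pairwise[i][j] for j in selected) if selected else 0.0
            return lam * relevance[i] - (1 - lam) * redundancy

        best = max(candidates, key=mmr_score)
        selected.append(best)
        candidates.remove(best)
    return [sentences[i] for i in selected]
```

The selected sentences would then be passed to an abstractive encoder-decoder model, which fuses them into the final summary.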

Publication
In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing.