Selected Publications

In this paper, we explore the ability of Transformers to fuse sentences and propose novel algorithms to enhance their ability to perform sentence fusion by leveraging the knowledge of points of correspondence between sentences. Through extensive experiments, we investigate the effects of different design choices on Transformer’s performance. Our findings highlight the importance of modeling points of correspondence between sentences for effective sentence fusion.
In EMNLP, 2020

In this paper, we analyze the outputs of five state-of-the-art abstractive summarizers, focusing on summary sentences that are formed by sentence fusion. Our analysis reveals that system sentences are mostly grammatical, but often fail to remain faithful to the original article.
In EMNLP Summarization Workshop, 2019

When writing a summary, humans tend to choose content from one or two sentences and merge them into a single summary sentence. Our proposed framework attempts to model human methodology by selecting either a single sentence or a pair of sentences, then compressing or fusing the sentence(s) to produce a summary sentence.
In ACL, 2019

In this paper we present an initial investigation into a novel adaptation method. It exploits the maximal marginal relevance method to select representative sentences from multi-document input, and an abstractive encoder-decoder model to fuse disparate sentences to an abstractive summary.
In EMNLP, 2018

Recent Publications

In this paper, we present an investigation into fusing sentences drawn from a document by introducing the notion of points of correspondence, which are cohesive devices that tie any two sentences together into a coherent text. Our dataset bridges the gap between coreference resolution and summarization. It is publicly shared to serve as a basis for future work to measure the success of sentence fusion systems.
In ACL Student Research Workshop, 2020

In this paper we seek to strengthen a DPP-based method for extractive multi-document summarization by presenting a novel similarity measure inspired by capsule networks. Our findings are particularly meaningful for summarizing documents created by multiple authors containing redundant yet lexically diverse expressions.
In ACL, 2019

In this paper, we seek to identify vague content in privacy policies. We construct the first corpus of human-annotated vague words and sentences and present empirical studies on automatic vagueness detection.
In EMNLP, 2018

This paper studies the feasibility of using Abstract Meaning Representation (AMR), a semantic representation of natural language grounded in linguistic theory, as a form of content representation. Our approach condenses source documents to a set of summary graphs following the AMR formalism.
In COLING, 2018