Abstractive Related Work Generation: A Span-Based Approach







The literature review is a crucial part of conducting and publishing academic research. It provides background on prior work and highlights the novelty of the current research. The literature review appears under different section headings in different fields; in the natural language processing domain, it appears under the Related Work section. The task of automatically generating a related work section consists of generating a series of citations given the rest of the current research paper and a list of cited papers. Prior work focuses on generating an individual sentence for each cited paper. However, citations consist of variable-length text units, which can span multiple sentences or only part of a sentence and can summarize more than one cited paper. To address these issues, we build a dataset of citation spans, text fragments influenced by cited papers, and distinguish spans that describe a cited paper in detail from those that provide only high-level concepts. We train a baseline citation generation model that generates masked spans given a context paragraph from the related work section. We also discuss the role of citation span length and propose a length-controlled summarization model that generates summaries of a user-specified length. We additionally propose an approach in which the generation length is predicted and used as the desired length by a single model, and we discuss heuristic estimates of the desired length and how they affect the performance of the generation model. Finally, we discuss the importance of generating spans that fit naturally into the related work context. Traditional abstractive summarization models often generate generic outputs that do not fit well in context. To force the model to attend to contextual information, we propose a model that generates the context along with the masked span. We conduct a case study and a human evaluation to show how the proposed model improves the coherence of the generated spans.



Computer Science