Understanding omitted facts in transformer-based abstractive summarization
| dc.contributor.author | Panawenna, PH | |
| dc.contributor.author | Wickramanayake, S | |
| dc.date.accessioned | 2026-02-11T05:29:54Z | |
| dc.date.issued | 2024 | |
| dc.description.abstract | Text summarization is a natural language processing task that generates concise document summaries. It can be extractive or abstractive. The former extracts pieces of the document, while the latter generates new concise sentences after identifying the critical information from the input text. Abstractive Summarization (AS) more closely represents how a human would summarize and is used in multiple mission-critical downstream tasks in domains such as law and finance. However, the existing state-of-the-art AS models are based on black-box deep learning models such as Transformers. Hence, the users of such systems cannot understand why some facts from the document have been included in the summary while others have been omitted. This paper proposes an algorithm to explain, for Transformer-based AS, which facts have been omitted and why. We leverage the Cross-Attention (CA) in Transformers to identify the words in the input passage with the least influence on the generated summary. These identified words are then given to a Large Language Model along with the input passage and the generated summary to explain the omitted facts and the reasons for the omissions. The experimental results using a state-of-the-art AS model show that CA can help provide valuable explanations for the model’s fact selection process. | |
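
The abstract's core step — aggregating cross-attention to find the source words with the least influence on the summary — can be sketched as follows. This is a minimal, hypothetical illustration, not the authors' implementation: it assumes cross-attention weights are available as an array of shape `(layers, heads, target_len, source_len)` (as, e.g., Hugging Face models expose when attentions are returned), and the function name `least_influential_tokens` is invented here.

```python
import numpy as np

def least_influential_tokens(cross_attn, src_tokens, k=2):
    """Rank source tokens by aggregate cross-attention received.

    cross_attn: array of shape (layers, heads, tgt_len, src_len),
    one weight per (generated token, source token) pair.
    Returns the k source tokens with the lowest aggregate weight --
    candidates for facts the summarizer omitted.
    """
    # Average over layers and heads, then sum the attention each
    # source token receives across all generated summary tokens.
    per_token = cross_attn.mean(axis=(0, 1)).sum(axis=0)  # shape: (src_len,)
    order = np.argsort(per_token)  # ascending: least attended first
    return [src_tokens[i] for i in order[:k]]

# Toy example: 2 layers, 2 heads, 3 generated tokens, 4 source tokens.
rng = np.random.default_rng(0)
attn = rng.random((2, 2, 3, 4))
attn[..., 3] = 0.01  # make the last source token barely attended
tokens = ["markets", "rose", "analysts", "yesterday"]
print(least_influential_tokens(attn, tokens, k=1))  # -> ['yesterday']
```

Per the abstract, the tokens returned by such a ranking would then be passed, together with the passage and the summary, to a Large Language Model to verbalise which facts were omitted and why.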
| dc.identifier.conference | Moratuwa Engineering Research Conference 2024 | |
| dc.identifier.department | Engineering Research Unit, University of Moratuwa | |
| dc.identifier.email | pasadie.23@cse.mrt.ac.lk | |
| dc.identifier.email | sandarekaw@cse.mrt.ac.lk | |
| dc.identifier.faculty | Engineering | |
| dc.identifier.isbn | 979-8-3315-2904-8 | |
| dc.identifier.pgnos | pp. 624-629 | |
| dc.identifier.place | Moratuwa, Sri Lanka | |
| dc.identifier.proceeding | Proceedings of Moratuwa Engineering Research Conference 2024 | |
| dc.identifier.uri | https://dl.lib.uom.lk/handle/123/24842 | |
| dc.language.iso | en | |
| dc.publisher | IEEE | |
| dc.subject | Fact Selection | |
| dc.subject | Abstractive Summarization | |
| dc.subject | Cross Attention | |
| dc.subject | Transformers | |
| dc.subject | Large Language Models | |
| dc.title | Understanding omitted facts in transformer-based abstractive summarization | |
| dc.type | Conference-Full-text |
