Please use this identifier to cite or link to this item: http://hdl.handle.net/20.500.12188/10134
Title: Improving NER Performance by Applying Text Summarization on Pharmaceutical Articles
Authors: Dobreva, Jovana
Jofche, Nasi
Jovanovik, Milos 
Trajanov, Dimitar 
Keywords: Named entity recognition
Data processing
Text summarization
Knowledge extraction
Knowledge graphs
Issue Date: 30-Oct-2020
Publisher: Springer International Publishing
Conference: ICT Innovations 2020
Abstract: Analyzing long text articles in the pharmaceutical domain, for the purpose of knowledge extraction and recognizing entities of interest, is a tedious task. In our previous research efforts, we were able to develop a platform which successfully extracts entities and facts from pharmaceutical texts and populates a knowledge graph with the extracted knowledge. However, one drawback of our approach was the processing time; the analysis of a single text source was not interactive enough, and the batch processing of entire article datasets took too long. In this paper, we propose a modified pipeline where the texts are summarized before the analysis begins. With this, each source article is reduced significantly, to a compact version which contains only the most commonly encountered entities. We show that by reducing the text size, we get knowledge extraction results comparable to the full text analysis approach and, at the same time, we significantly reduce the processing time, which is essential for getting both real-time results on single text sources and faster results when analyzing entire batches of collected articles from the domain.
URI: http://hdl.handle.net/20.500.12188/10134
DOI: 10.1007/978-3-030-62098-1_8
Appears in Collections: Faculty of Computer Science and Engineering: Conference papers

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.