Please use this identifier to cite or link to this item:
http://hdl.handle.net/20.500.12188/10134
Title: | Improving NER Performance by Applying Text Summarization on Pharmaceutical Articles |
Authors: | Dobreva, Jovana; Jofche, Nasi; Jovanovik, Milos; Trajanov, Dimitar |
Keywords: | Named entity recognition; Data processing; Text summarization; Knowledge extraction; Knowledge graphs |
Issue Date: | 30-Oct-2020 |
Publisher: | Springer International Publishing |
Conference: | ICT Innovations 2020 |
Abstract: | Analyzing long text articles in the pharmaceutical domain, for the purpose of knowledge extraction and recognizing entities of interest, is a tedious task. In our previous research efforts, we were able to develop a platform which successfully extracts entities and facts from pharmaceutical texts and populates a knowledge graph with the extracted knowledge. However, one drawback of our approach was the processing time; the analysis of a single text source was not interactive enough, and the batch processing of entire article datasets took too long. In this paper, we propose a modified pipeline where the texts are summarized before the analysis begins. With this, the source article is reduced significantly, to a compact version which contains only the most commonly encountered entities. We show that by reducing the text size, we get knowledge extraction results comparable to the full text analysis approach and, at the same time, we significantly reduce the processing time, which is essential for getting both real-time results on single text sources, and faster results when analyzing entire batches of collected articles from the domain. |
URI: | http://hdl.handle.net/20.500.12188/10134 |
DOI: | 10.1007/978-3-030-62098-1_8 |
Appears in Collections: | Faculty of Computer Science and Engineering: Conference papers |