Please use this identifier to cite or link to this item:
http://hdl.handle.net/20.500.12188/27397
Title: | A Comprehensive Analysis of LayoutLM and Donut for Document Classification |
Authors: | Bajrami, Merxhan; Zdravevski, Eftim; Lameski, Petre; Stojkoska, Biljana |
Keywords: | document classification, layout analysis, OCR, intelligent document processing |
Issue Date: | Jul-2023 |
Publisher: | Ss Cyril and Methodius University in Skopje, Faculty of Computer Science and Engineering, Republic of North Macedonia |
Series/Report no.: | CIIT 2023 papers;22; |
Conference: | 20th International Conference on Informatics and Information Technologies - CIIT 2023 |
Abstract: | Document classification is important in everyday life as it allows for efficient management and organization of vast amounts of digital documents, saving time and resources. This task is essential for businesses, organizations, and individuals who handle large volumes of data and need to quickly retrieve and analyze specific information. AI-based document classification can help organizations better manage and organize their digital assets, improve information retrieval, and make better business decisions based on the insights derived from the classified documents. This paper compares the performance of two transformer-based models, LayoutLM and Donut, for image classification tasks on two different datasets. LayoutLM was trained using pre-trained weights from Microsoft, while Donut used pre-trained weights from Huggingface. Both models were fine-tuned for 100 epochs with early stopping technique, using the Adam optimizer and Cross Entropy Loss. Our results show that LayoutLM performs better than Donut on the first dataset, achieving an overall accuracy of 0.88, while Donut achieved an accuracy of 0.74. Our study demonstrates the importance of carefully selecting and evaluating different models for document classification tasks, based on the specific characteristics of the dataset and the task requirements. Additionally, we provide insights into the strengths and weaknesses of both LayoutLM and Donut models for document classification on different datasets. |
URI: | http://hdl.handle.net/20.500.12188/27397 |
Appears in Collections: | Faculty of Computer Science and Engineering: Conference papers |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
CIIT2023_paper_22.pdf | | 9.19 MB | Adobe PDF | View/Open |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.