AJAO, Ademolu Daniel and Covenant University, Theses Masters (2024) Development of an Extended Biomedical Named Entity Recognition and Relation Extraction Model for Malaria using BioBERT. Masters thesis, Covenant University.
Abstract
This study aims to enhance biomedical Named Entity Recognition and Relation Extraction for the malaria subdomain by fine-tuning the existing BioBERT (Bidirectional Encoder Representations from Transformers for Biomedical Text Mining) model. Fine-tuning adjusts the parameters of the BioBERT model to better suit the characteristics of the malaria subdomain. The model is intended to improve the recognition of entities and their relationships in malaria-related publications, addressing the inapplicability of previously developed biomedical models to this subdomain. The study uses machine learning algorithms, namely Long Short-Term Memory, Random Forests, Support Vector Machine and Gradient Boosting Machine, to fine-tune the existing BioBERT model, yielding the FT-BioBERT model. The fine-tuned model is compared with BioBERT and Multi-BioNER on three datasets: BC5CDR-Disease, BioRED, and NCBI-Disease. It achieved 92.4% accuracy, a 3.13% increase over BioBERT and a 2.33% increase over Multi-BioNER, along with 91.8% precision, 92.7% recall, and a 92.2% F1-score, which is a 3.15% improvement over BioBERT and a 2.23% improvement over Multi-BioNER. Based on these results, we conclude that the proposed model can effectively identify and extract entities and their relationships from malaria literature and is therefore suitable for biomedical text mining. We hope that the study's findings will open new avenues for the creation of domain-specific NLP applications in malaria-related fields.
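The reported precision, recall, and F1-score can be related through the standard definition of F1 as the harmonic mean of precision and recall. The sketch below is only an arithmetic check on the figures quoted in the abstract. The function name `f1_score` is illustrative, and the abstract does not state whether the thesis computes its scores with micro or macro averaging, so this is an assumption-free identity check rather than a reproduction of the thesis's evaluation code.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (the F1-score)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Figures quoted in the abstract for the fine-tuned model.
precision = 0.918
recall = 0.927

f1 = f1_score(precision, recall)
print(f"F1 = {f1 * 100:.1f}%")  # prints "F1 = 92.2%"
```

The result agrees with the 92.2% F1-score stated in the abstract, so the three reported figures are mutually consistent.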
Item Type: | Thesis (Masters) |
---|---|
Uncontrolled Keywords: | Named Entity Recognition, Relation Extraction, BioBERT, Biomedical Language Model, Biomedical Natural Language Processing |
Subjects: | Q Science > QA Mathematics > QA75 Electronic computers. Computer science; Q Science > QH Natural history; Q Science > QH Natural history > QH301 Biology |
Divisions: | Faculty of Engineering, Science and Mathematics > School of Electronics and Computer Science |
Depositing User: | Patricia Nwokealisi |
Date Deposited: | 05 Nov 2024 14:36 |
Last Modified: | 05 Nov 2024 14:36 |
URI: | http://eprints.covenantuniversity.edu.ng/id/eprint/18561 |