Covenant University Repository

A SEMANTICS-BASED CLUSTERING APPROACH FOR SIMILAR RESEARCH AREA DETECTION: A CASE STUDY OF NIGERIAN UNIVERSITIES

ADIGUN, EMMANUEL BUKUNMI and Covenant University, Theses (2018) A SEMANTICS-BASED CLUSTERING APPROACH FOR SIMILAR RESEARCH AREA DETECTION: A CASE STUDY OF NIGERIAN UNIVERSITIES. Masters thesis, COVENANT UNIVERSITY.

Abstract

The place of research collaborations is indispensable in coming up with research publications. The task of detecting similar research areas is crucial to the development and furtherance of research. Prominent and rookie researchers alike are predisposed to seek existing research publications in a research field of interest before coming up with a thesis. The manual process of searching out individuals in an already existing research field is cumbersome and time-consuming. Besides, it tends to miss publications whose keywords do not match a keyword query, which leads to inaccurate results. From extant literature, automated similar research area detection systems have been developed to solve this problem. However, most of them use keyword matching techniques which do not sufficiently capture the implicit semantics of keywords, thereby leaving out some research articles. In this work, we have proposed a similar research area detection framework to address this problem. The aim of this study is to develop a semantics-based clustering method for similar research area detection. This study employs a number of techniques such as Ontology-based pre-processing, Latent Semantic Indexing and K-Means Clustering to develop a prototype similar research area detection system that can be used to determine similar research domain publications. However, traditional document clustering techniques suffer from high dimensionality and data sparsity problems. In a bid to solve these problems, a domain ontology is used in the pre-processing stage to weight concepts and determine semantically similar concepts, while Latent Semantic Analysis is used as the topic modelling technique in order to capture the implicit semantic relationship between terms in the text corpus. To test our framework, publications from a number of Nigerian university faculties were randomly selected and used as the dataset for our clustering model. A proof-of-concept implementation was developed using the Python programming language. From the evaluation of our system, we were able to derive more accurate clustering results as a result of the integration of ontologies in the pre-processing stage in comparison with documents that were not pre-processed with the ontology.
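The abstract's central claim, that ontology-based pre-processing lets clustering group publications that share no literal keywords, can be illustrated with a minimal sketch. This is not the thesis implementation: the toy ontology, the concept weight of 2.0, the four sample titles, and the function names are all hypothetical. The Latent Semantic Analysis step is left out for brevity (in a full pipeline it would reduce the dimensionality of the weighted vectors before clustering), and K-Means is run with cosine similarity and a fixed initialisation so the result is repeatable.

```python
import math
from collections import Counter

# Hypothetical toy ontology: surface term -> (concept, weight).
ONTOLOGY = {
    "clustering": ("machine_learning", 2.0),
    "classification": ("machine_learning", 2.0),
    "neural": ("machine_learning", 2.0),
    "malaria": ("public_health", 2.0),
    "vaccine": ("public_health", 2.0),
    "epidemic": ("public_health", 2.0),
}

# Hypothetical publication titles; each pair of related titles shares no words.
DOCS = [
    "clustering of research abstracts",
    "malaria control in rural communities",
    "neural classification for images",
    "vaccine coverage during an epidemic",
]

def to_concept_vector(text, ontology=ONTOLOGY):
    """Map ontology terms to weighted concepts; keep other tokens with weight 1."""
    vec = Counter()
    for token in text.lower().split():
        concept, weight = ontology.get(token, (token, 1.0))
        vec[concept] += weight
    return vec

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def kmeans(vectors, k, iterations=10):
    """K-Means with cosine similarity; the first k vectors seed the centroids."""
    centroids = [dict(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: each document joins its most similar centroid.
        labels = [max(range(k), key=lambda c: cosine(v, centroids[c]))
                  for v in vectors]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                total = Counter()
                for m in members:
                    total.update(m)
                centroids[c] = {t: w / len(members) for t, w in total.items()}
    return labels

if __name__ == "__main__":
    # Passing an empty ontology gives plain keyword vectors.
    raw = [to_concept_vector(d, {}) for d in DOCS]
    sem = [to_concept_vector(d) for d in DOCS]
    print("keyword similarity, doc 0 vs doc 2:", cosine(raw[0], raw[2]))
    print("concept similarity, doc 0 vs doc 2:", round(cosine(sem[0], sem[2]), 3))
    print("cluster labels:", kmeans(sem, k=2))
```

With plain keyword vectors the first and third titles have zero similarity, since they share no words. After concept mapping they share the `machine_learning` concept, and the two-cluster run separates the machine learning titles from the public health ones.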

Item Type: Thesis (Masters)
Subjects: Q Science > Q Science (General)
Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Divisions: Faculty of Engineering, Science and Mathematics > School of Electronics and Computer Science
Depositing User: Mrs Hannah Akinwumi
Date Deposited: 23 Mar 2020 10:42
Last Modified: 23 Mar 2020 10:42
URI: http://eprints.covenantuniversity.edu.ng/id/eprint/13238
