AUTOMATIC SUBJECT LABELING IN DOCUMENTS BY USING ONTOLOGY AND GRAPH DATABASES
Main Article Content
Tóm tắt
Ontologies apply to many applications in recent years, such as information retrieval, information extraction, and text document classification. The purpose of domain-specific ontology is to enrich the identification of concept and the interrelationships. In our research, we use ontology to specify a set of generic subjects (concept) that characterizes the domain as well as their definitions and interrelationships. This paper introduces a system for labeling subjects of a text documents based on the differential layers of domain specific ontology, which contains the information and the vocabularies related to the computer domain. A document can contain several subjects such as data science, database, and machine learning. The subjects in text document classification are determined based on the differential layers of the domain specific ontology. We combine the methodologies of Natural Language Processing with domain ontology to determine the subjects in text document. In order to increase performance, we use graph database to store and access ontology. Besides, the paper focuses on evaluating our proposed algorithm with some other methods. Experimental results show that our proposed algorithm yields performance significantly