APPLYING DEEP LEARNING FOR AUTOMATIC REGULATION QUESTION ANSWERING SYSTEM AT INDUSTRIAL UNIVERSITY OF HO CHI MINH CITY

ĐẶNG THỊ PHÚC
NGUYỄN THANH LONG
ĐẶNG VĂN NGHIÊM
TRẦN THỊ MINH KHOA

Abstract

A large university such as the Industrial University of Ho Chi Minh City issues a large number of regulations and announcements and updates them frequently, which makes their content difficult to follow. In this paper, we build a system that automatically answers questions from the content of text files using deep learning techniques. The system extracts keywords from the question and uses the BM25 algorithm to retrieve the relevant texts. A deep learning model then extracts the answer from the text with the highest relevance. The models are trained on a training set of 10,000 question-answer pairs and tested on a test set of 1,600 pairs, all drawn from the university's announcements and regulations. We fine-tune several deep learning models and compare them on efficiency and accuracy to select the best one. Measured by F1-score, the BERT model reaches 73.93%, RoBERTa 75.59%, PhoBERT 45.13%, and DistilBERT 72.95%. The RoBERTa model, which had the highest training speed and accuracy, was selected and deployed in the system for evaluation.
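
The retrieval and scoring steps summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: the tokenizer, the BM25 parameters `k1` and `b`, the sample documents, and the token-overlap definition of F1 (the convention commonly used for extractive question answering) are all assumptions made for illustration, since the abstract does not specify them.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split on word characters (an assumed, simplistic tokenizer)."""
    return re.findall(r"\w+", text.lower())


class BM25:
    """Okapi BM25 ranking over a small in-memory collection (illustrative only)."""

    def __init__(self, docs, k1=1.5, b=0.75):
        # k1 and b are common default values, not values taken from the paper.
        self.k1 = k1
        self.b = b
        self.docs = [tokenize(d) for d in docs]
        self.doc_len = [len(d) for d in self.docs]
        self.avgdl = sum(self.doc_len) / len(self.docs)
        self.tf = [Counter(d) for d in self.docs]
        # Document frequency: number of documents containing each term.
        df = Counter()
        for d in self.docs:
            df.update(set(d))
        n = len(self.docs)
        # Smoothed IDF that stays non-negative for very common terms.
        self.idf = {t: math.log((n - f + 0.5) / (f + 0.5) + 1.0) for t, f in df.items()}

    def score(self, query, i):
        """BM25 score of document i for the given query string."""
        total = 0.0
        for term in tokenize(query):
            f = self.tf[i].get(term, 0)
            if f == 0:
                continue
            norm = f + self.k1 * (1 - self.b + self.b * self.doc_len[i] / self.avgdl)
            total += self.idf[term] * f * (self.k1 + 1) / norm
        return total

    def best(self, query):
        """Index of the highest-scoring document."""
        scores = [self.score(query, i) for i in range(len(self.docs))]
        return max(range(len(scores)), key=scores.__getitem__)


def token_f1(prediction, reference):
    """Token-overlap F1 between a predicted and a reference answer (assumed metric)."""
    pred, ref = tokenize(prediction), tokenize(reference)
    common = sum((Counter(pred) & Counter(ref)).values())
    if common == 0:
        return 0.0
    precision = common / len(pred)
    recall = common / len(ref)
    return 2 * precision * recall / (precision + recall)


# Hypothetical sample texts standing in for university announcements.
documents = [
    "Tuition fee payment deadline for the semester",
    "Library opening hours and borrowing rules",
    "Exam retake registration procedure",
]
retriever = BM25(documents)
best_index = retriever.best("When is the tuition fee deadline?")
```

In a full pipeline, the text at `best_index` would be passed together with the question to the answer-extraction model, and `token_f1` would compare each extracted answer with its reference answer.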

Article Details

Section
Information Technology, Electricity, Electronic