APPLYING DEEP LEARNING FOR AUTOMATIC REGULATION QUESTION ANSWERING SYSTEM AT INDUSTRIAL UNIVERSITY OF HO CHI MINH CITY
Abstract
At a large university such as the Industrial University of Ho Chi Minh City, regulations and announcements are numerous and frequently updated, which makes their content difficult to follow and understand. In this paper, we build a system that automatically answers questions based on the content of text files using deep learning techniques. The system extracts keywords from the question and uses them to retrieve the relevant texts with the BM25 algorithm. A deep learning model then extracts the corresponding answer from the text with the highest relevance. The model is trained on a training dataset of 10,000 and evaluated on a test dataset of 1,600 pairs of questions and corresponding answers drawn from the announcements and regulations of the university. We fine-tune several deep learning models and compare them on efficiency and accuracy to select the best one. The F1-scores obtained are 73.93% for BERT, 75.59% for RoBERTa, 45.13% for PhoBERT, and 72.95% for DistilBERT. The RoBERTa model, which had the highest training speed and accuracy, was selected and deployed in the system to evaluate the results.
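The retrieval step described in the abstract can be illustrated with a minimal sketch of Okapi BM25 ranking. This is not the system's actual implementation: the whitespace tokenizer, the function names `tokenize`, `bm25_scores`, and `top_document`, and the parameter values `k1=1.5` and `b=0.75` are all assumptions chosen for illustration, and a real deployment on Vietnamese text would likely need a proper word segmenter.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace splitting stands in for real
    # keyword extraction / Vietnamese word segmentation.
    return text.lower().split()


def bm25_scores(query, documents, k1=1.5, b=0.75):
    """Return one Okapi BM25 score per document for the given query.

    k1 and b are commonly used defaults, not values taken from the paper.
    """
    docs = [tokenize(d) for d in documents]
    n = len(docs)
    if n == 0:
        return []
    # Average document length; fall back to 1.0 if every document is empty.
    avgdl = sum(len(d) for d in docs) / n or 1.0

    # Document frequency: in how many documents each term appears.
    df = Counter()
    for d in docs:
        df.update(set(d))

    query_terms = tokenize(query)
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            # Smoothed IDF variant that is always non-negative.
            idf = math.log((n - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            # Term-frequency saturation with document-length normalization.
            denom = tf[term] + k1 * (1 - b + b * len(d) / avgdl)
            score += idf * tf[term] * (k1 + 1) / denom
        scores.append(score)
    return scores


def top_document(query, documents):
    """Return (index, score) of the highest-scoring document."""
    scores = bm25_scores(query, documents)
    best = max(range(len(documents)), key=scores.__getitem__)
    return best, scores[best]
```

In a pipeline like the one outlined above, the text returned by `top_document` would be passed, together with the question, to the fine-tuned extractive model, which selects the answer span.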