KOreaPAS: TAPAS 기반의 한국어 특화 표 질의응답 모델
KOreaPAS: TAPAS based Korean-Specific Table Question Answering Model
- Subject (keywords) Natural Language Processing, Table Question Answering, Question Answering, TAPAS
- Publisher 대한산업공학회 (Korean Institute of Industrial Engineers)
- Publication year 2023
- Series type Journal
- DOI http://dx.doi.org/10.7232/JKIIE.2023.49.6.502
- KCI ID ART003023518
- Text language Korean
Abstract/Summary
Table Question Answering (QA) aims to answer questions based on semi-structured tables. Unlike text data, tables possess a unique two-dimensional structure, which has driven the exploration of specialized learning approaches to enhance language models’ understanding of tables. However, while Table QA research is advancing rapidly in English, its development in Korean is still in its early stages. To narrow this gap, we present KOreaPAS, a model specifically designed for Korean Table QA tasks. KOreaPAS is based on the TAPAS architecture, and its learning process consists of two stages: pre-training and fine-tuning. In the publicly available Korean tabular dataset for pre-training language models, approximately 36.5% of instances lack text information related to their tables, which can hinder the models’ learning of correlations between text and tables during pre-training. To address this issue, we introduce a table-text mapping method that retrieves the most relevant text for each table from Wikipedia pages. Further, we propose a multi-granularity fine-tuning strategy that utilizes three granularities of the table structure for both training and inference. Experimental results confirm the effectiveness of the proposed approaches in enhancing language models’ comprehension of questions over tables. Specifically, KOreaPAS achieved the highest performance among currently published benchmark models on two Korean Table QA datasets, thus establishing a new standard for Korean Table QA tasks.
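The abstract does not specify how the table-text mapping scores relevance, so the following is only a minimal sketch of the general idea: given a table and candidate passages from its page, pick the passage whose vocabulary overlaps most with the table. The function names (`tokenize`, `cosine`, `map_table_to_text`), the bag-of-words cosine score, and the toy data are all illustrative assumptions, not the authors' method.

```python
import re
from collections import Counter
from math import sqrt


def tokenize(text):
    # Lowercased word tokens. Assumption: a simple regex tokenizer;
    # real Korean text would likely need a morphological analyzer,
    # since particles attach directly to nouns.
    return re.findall(r"\w+", text.lower())


def cosine(a, b):
    # Cosine similarity between two bag-of-words Counters.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def map_table_to_text(table, passages):
    # table: list of rows, each row a list of cell strings (header included).
    # passages: candidate text passages from the table's source page.
    # Returns the highest-scoring passage, or None if there are no candidates.
    if not passages:
        return None
    table_bag = Counter(tokenize(" ".join(cell for row in table for cell in row)))
    return max(passages, key=lambda p: cosine(table_bag, Counter(tokenize(p))))
```

For example, a table with the cells `City`, `Population`, `Seoul`, and `Busan` would be mapped to a passage that mentions Seoul and Busan rather than to an unrelated passage on the same page. A learned retriever could replace the lexical score without changing the surrounding interface.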

