TY - JOUR
T1 - Enhancing Transformer-based language models with commonsense representations for knowledge-driven machine comprehension
AU - Chen, Daqing
PY - 2021/3/6
Y1 - 2021/3/6
N2 - Compared with traditional machine reading comprehension (MRC), which is limited to the information in a passage, knowledge-driven MRC tasks aim to enable models to answer questions according to both the text and related commonsense knowledge. Although pre-trained Transformer-based language models (TrLMs) such as BERT and RoBERTa have shown powerful performance in MRC, external knowledge such as unspoken commonsense and world knowledge still cannot be used and explained explicitly. In this work, we present three simple yet effective injection methods integrated into the structure of TrLMs to fine-tune downstream knowledge-driven MRC tasks with off-the-shelf commonsense representations. Moreover, we introduce a mask mechanism for token-level multi-hop relationship searching to filter external knowledge. Experimental results indicate that the incremental TrLMs significantly outperform the baseline systems by 1%-4.1% on DREAM and CosmosQA, two prevalent knowledge-driven datasets. Further analysis shows the effectiveness of the proposed methods and the robustness of the incremental model in the case of an incomplete training set.
AB - Compared with traditional machine reading comprehension (MRC), which is limited to the information in a passage, knowledge-driven MRC tasks aim to enable models to answer questions according to both the text and related commonsense knowledge. Although pre-trained Transformer-based language models (TrLMs) such as BERT and RoBERTa have shown powerful performance in MRC, external knowledge such as unspoken commonsense and world knowledge still cannot be used and explained explicitly. In this work, we present three simple yet effective injection methods integrated into the structure of TrLMs to fine-tune downstream knowledge-driven MRC tasks with off-the-shelf commonsense representations. Moreover, we introduce a mask mechanism for token-level multi-hop relationship searching to filter external knowledge. Experimental results indicate that the incremental TrLMs significantly outperform the baseline systems by 1%-4.1% on DREAM and CosmosQA, two prevalent knowledge-driven datasets. Further analysis shows the effectiveness of the proposed methods and the robustness of the incremental model in the case of an incomplete training set.
KW - Commonsense
KW - Machine Reading Comprehension
KW - Transformer
U2 - 10.1016/j.knosys.2021.106936
DO - 10.1016/j.knosys.2021.106936
M3 - Article
SN - 0950-7051
SP - 106936
JO - Knowledge-Based Systems
JF - Knowledge-Based Systems
ER -