
Russian Technological Journal


Knowledge injection methods in question answering

https://doi.org/10.32362/2500-316X-2025-13-3-21-43

EDN: QKUGFZ

Abstract

Objectives. Despite the recent success of large language models, which are now capable of solving a wide range of tasks, a number of practical issues remain unsolved. For example, question answering (QA) systems built on such models may lack commonsense knowledge and reasoning proficiency. The present work considers knowledge injection methods as a means of functionally enhancing large language models by supplying necessary facts and patterns from external sources.
Methods. Knowledge injection methods leveraged in relevant QA systems are classified, analyzed, and compared. Self-supervised learning, fine-tuning, attention mechanisms, and interaction tokens for injecting supporting information are considered, along with auxiliary approaches for emphasizing the most relevant facts.
Results. The reviewed QA systems show a clear accuracy increase on the CommonsenseQA benchmark over pretrained language model baselines owing to the use of knowledge injection methods. In general, the highest results are achieved by knowledge injection methods based on language models and the attention mechanism.
Conclusions. The presented systematic review of existing external knowledge injection methods for QA systems confirms the continuing relevance of this research direction. Such methods are capable not only of increasing the accuracy of QA systems but also of mitigating issues with interpretability and factual obsolescence in pretrained models. Further investigations will be carried out to improve and optimize different aspects of the current approaches and to develop conceptually novel ideas.
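
As a minimal illustration of the general idea (not taken from the article itself), the sketch below injects external knowledge into a QA prompt by naively retrieving facts that overlap lexically with the question and prepending them to the model input. The toy knowledge base, the scoring function, and all names are hypothetical; the resulting augmented prompt would then be scored or completed by any pretrained language model.

# Hypothetical sketch of knowledge injection via input augmentation (Python).
# External facts are ranked by lexical overlap with the question, and the most
# relevant ones are prepended to the prompt passed to a pretrained language model.

def overlap_score(question: str, fact: str) -> int:
    """Count the words shared by the question and a candidate fact."""
    q_tokens = set(question.lower().split())
    f_tokens = set(fact.lower().split())
    return len(q_tokens & f_tokens)

def build_augmented_prompt(question: str, choices: list[str],
                           knowledge_base: list[str], top_k: int = 2) -> str:
    """Select the top_k most relevant facts and inject them before the question."""
    ranked = sorted(knowledge_base,
                    key=lambda fact: overlap_score(question, fact),
                    reverse=True)
    lines = ["Knowledge:"] + [f"- {fact}" for fact in ranked[:top_k]]
    lines += ["Question: " + question,
              "Choices: " + ", ".join(choices),
              "Answer:"]
    return "\n".join(lines)

if __name__ == "__main__":
    kb = [
        "A beaver builds dams on rivers.",               # toy external knowledge source
        "A library is a place where people read books.",
        "Rivers flow into the sea.",
    ]
    prompt = build_augmented_prompt(
        question="Where would a beaver most likely build a dam?",
        choices=["library", "river", "desert"],
        knowledge_base=kb,
    )
    print(prompt)  # the augmented prompt is then fed to a pretrained LM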

About the Author

D. V. Radyush
ITMO University
Russian Federation

Daniil V. Radyush, Postgraduate Student, Faculty of Software Engineering and Computer Systems 
49-A, Kronverkskii pr., Saint Petersburg, 197101 Russia
Scopus Author ID 58234958500


Competing Interests:

The author declares no conflicts of interest




Supplementary files

1. BERT model training scheme (116 KB). Type: Research tools.


For citations:


Radyush D.V. Knowledge injection methods in question answering. Russian Technological Journal. 2025;13(3):21-43. https://doi.org/10.32362/2500-316X-2025-13-3-21-43. EDN: QKUGFZ



This work is licensed under a Creative Commons Attribution 4.0 License.


ISSN 2782-3210 (Print)
ISSN 2500-316X (Online)