
Russian Technological Journal


Knowledge injection methods in question answering

https://doi.org/10.32362/2500-316X-2025-13-3-21-43

EDN: QKUGFZ

Abstract

Objectives. Despite the successes achieved over the past few years by large language models, which are capable of solving a wide range of tasks, a number of practical problems remain incompletely solved. In the context of building question answering systems, such problems include the use of commonsense knowledge and taking cause-and-effect relationships into account. The aim of this article is to review knowledge injection methods capable of improving the performance of large language models by supplying the necessary facts and regularities from external sources.
Methods. The paper classifies, analyzes, and compares the knowledge injection methods used in current implementations of question answering systems. In particular, the incorporation of auxiliary information through self-supervised learning, fine-tuning, the attention mechanism, and the use of interaction tokens is considered, together with the corresponding auxiliary approaches for emphasizing the most relevant information. A minimal illustrative sketch of the attention-based pattern is given below.
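To make the attention-based injection pattern concrete, the following is a minimal sketch, not taken from any of the systems surveyed: all names (KnowledgeFusion, token_states, knowledge_states) are illustrative assumptions. It shows one common scheme in which encoded external facts are fused into a language model's token representations via cross-attention.

```python
# Hypothetical sketch of knowledge injection via cross-attention (PyTorch).
# Not an API from any cited work; names and dimensions are assumptions.
import torch
import torch.nn as nn

class KnowledgeFusion(nn.Module):
    """Fuses token representations with embeddings of retrieved external facts."""

    def __init__(self, hidden_dim: int = 768, n_heads: int = 8):
        super().__init__()
        # Queries come from question tokens; keys/values from encoded knowledge.
        self.cross_attn = nn.MultiheadAttention(hidden_dim, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(hidden_dim)

    def forward(self, token_states: torch.Tensor,
                knowledge_states: torch.Tensor) -> torch.Tensor:
        # token_states:     (batch, seq_len, hidden_dim) from the language model
        # knowledge_states: (batch, n_facts, hidden_dim) encoded KG triplets/passages
        fused, _ = self.cross_attn(token_states, knowledge_states, knowledge_states)
        # Residual connection preserves the original contextual representation.
        return self.norm(token_states + fused)

# Usage: fuse 4 encoded facts into the representation of a 16-token question.
fusion = KnowledgeFusion()
tokens = torch.randn(2, 16, 768)
facts = torch.randn(2, 4, 768)
print(fusion(tokens, facts).shape)  # torch.Size([2, 16, 768])
```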
Results. The question answering systems considered in the review directly demonstrate an increase in accuracy over a baseline solution based on a pretrained language model through the use of knowledge injection methods, as exemplified by the CommonsenseQA benchmark. Overall, the highest results are shown by knowledge injection methods based on language models and the attention mechanism.
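As a hedged illustration of the evaluation protocol behind such comparisons, the sketch below shows how multiple-choice accuracy on a benchmark such as CommonsenseQA is typically computed: each (question, choice) pair is scored and the highest-scoring choice is selected. The score function is a stub standing in for any concrete system; its signature is an assumption for illustration only.

```python
# Generic multiple-choice QA accuracy; the scorer is a hypothetical stub.
from typing import Callable, List, Sequence, Tuple

def choose_answer(question: str, choices: Sequence[str],
                  score: Callable[[str, str], float]) -> int:
    """Return the index of the answer choice the model scores highest."""
    return max(range(len(choices)), key=lambda i: score(question, choices[i]))

def accuracy(dataset: List[Tuple[str, Sequence[str], int]],
             score: Callable[[str, str], float]) -> float:
    """dataset: list of (question, answer_choices, gold_index) triples."""
    correct = sum(choose_answer(q, ch, score) == gold for q, ch, gold in dataset)
    return correct / len(dataset)

# Usage with a toy scorer that prefers longer answer strings.
data = [("Where would you keep coins?", ["wallet", "sea"], 0)]
print(accuracy(data, lambda q, a: float(len(a))))  # 1.0 on this toy example
```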
Conclusions. The presented systematic review of existing methods for injecting knowledge from external sources into question answering systems confirms the effectiveness and promise of this research direction. These methods demonstrate not only the ability to increase the accuracy of question answering systems but also to mitigate, to some extent, the problems associated with the interpretability of results and the obsolescence of knowledge in pretrained models. Subsequent research can both improve and optimize individual aspects of existing approaches and develop conceptually new ones.

About the Author

D. V. Radyush
ITMO University (National Research University)
Russia

Daniil V. Radyush, Postgraduate Student, Faculty of Software Engineering and Computer Systems
49 Kronverksky pr., lit. A, St. Petersburg, 197101, Russia

Scopus Author ID 58234958500


Conflict of interest:

The author declares no conflict of interest.



Supplementary files

1. BERT model training scheme (research instrument, 116 KB)



For citation:


Radyush D.V. Knowledge injection methods in question answering. Russian Technological Journal. 2025;13(3):21-43. https://doi.org/10.32362/2500-316X-2025-13-3-21-43. EDN: QKUGFZ



This content is licensed under a Creative Commons Attribution 4.0 License.


ISSN 2782-3210 (Print)
ISSN 2500-316X (Online)