Abstract
Interdisciplinary research has attracted extensive attention from researchers and policymakers because it integrates knowledge from multiple disciplines to solve complex scientific problems. Beyond studies of citation-based interdisciplinary knowledge flow, recent work has sought to characterize knowledge integration in interdisciplinary research from a knowledge content perspective. To better understand the knowledge content integrated into interdisciplinary research, this study formulated two tasks. The first was to identify the knowledge units integrated by an interdisciplinary field, defined as integrated knowledge phrases (IKPs): phrases shared between citances and the cited texts of the references. The second was to classify the identified IKPs into knowledge categories that reflect their knowledge functions in the field. We proposed a methodological framework that automates the identification and classification of IKPs using natural language processing techniques and deep learning models. The framework was tested on an eHealth dataset. The experiments showed that the baseline matching method and the word-embedding-based similarity matching method are both effective for the identification task, and that the Bidirectional Encoder Representations from Transformers (BERT) model using section titles and citances as input features achieved the best performance on the classification task, with an accuracy of 0.951. We further showcased the application of IKPs in a case study on an expanded eHealth literature set. Both tasks were run on the new dataset, and co-occurrence networks of IKPs were then constructed and mapped to visualize the knowledge integration structure of the field.
This study provides a feasible content-based methodology for building a fine-grained understanding of the knowledge integration structure of an interdisciplinary field, and it could serve as a general domain analysis method.
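To make the identification task concrete, the following is a minimal sketch of an exact-match baseline for finding phrases shared between a citance and a cited text. It is an illustration under stated assumptions, not the study's actual implementation: the function names (`tokenize`, `candidate_phrases`, `shared_phrases`), the small stopword list, the three-word phrase limit, and the example sentences are all hypothetical choices made here.

```python
import re
from typing import List, Set

# Assumption: a tiny illustrative stopword list; a real pipeline would use a fuller one.
STOPWORDS = {"a", "an", "the", "of", "in", "on", "for", "and", "or", "to",
             "with", "by", "is", "are", "was", "were", "we", "our", "that",
             "this", "as"}


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def candidate_phrases(tokens: List[str], max_len: int) -> Set[str]:
    """Collect word n-grams (1..max_len) that do not start or end with a stopword."""
    grams = set()
    for n in range(1, max_len + 1):
        for i in range(len(tokens) - n + 1):
            gram = tokens[i:i + n]
            if gram[0] in STOPWORDS or gram[-1] in STOPWORDS:
                continue
            grams.add(" ".join(gram))
    return grams


def shared_phrases(citance: str, cited_text: str, max_len: int = 3) -> Set[str]:
    """Return the maximal phrases that occur verbatim in both texts."""
    shared = (candidate_phrases(tokenize(citance), max_len)
              & candidate_phrases(tokenize(cited_text), max_len))
    # Keep only phrases that are not contained in a longer shared phrase.
    return {p for p in shared
            if not any(p != q and f" {p} " in f" {q} " for q in shared)}


if __name__ == "__main__":
    # Hypothetical example sentences, not taken from the eHealth dataset.
    citance = "We adopt the technology acceptance model to study patient portals."
    cited = "The technology acceptance model explains user adoption of information systems."
    print(shared_phrases(citance, cited))
```

Exact matching of this kind misses paraphrases and inflected variants (for example, "adopt" versus "adoption"), which is one plausible reason to complement it with an embedding-based similarity matcher such as the one the abstract mentions.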
Affiliation: Wuhan University