Pre-processing of the text of the policy on the integration of physical and medical care
After searching and screening, a total of 62 documents were collected. Their contents were imported together into a single Excel file for analysis in KH Coder.
After preliminary text processing in KH Coder, the main Excel document contained 158,622 words in total (110,356 in use) and 7,645 distinct words (6,878 in use); the corpus was therefore large enough for text data mining with KH Coder (Table 3).
Keyword extraction and analysis
Using the text-mining software KH Coder, the text database of the sports-medicine integration policies was segmented into words and the word frequencies were counted, yielding a list of the characteristic words of the national-level policy together with their frequencies. The high-frequency words represent the formulation goals of the national-level policy. To reduce interference, words that carry no meaning for the analysis, such as to be, committee, new, country, China, state, and relevant, were uniformly deleted from the high-frequency word count; these terms are not directly related to health equity, and removing them makes the analysis more precise. The results are shown in Table 4.
In the plot of these word frequencies (Fig. 2), each circle on the curve represents a number of occurrences: circles closer to the y-axis correspond to more occurrences, and circles closer to the x-axis correspond to fewer.

Word frequency distribution
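The counting step described above can be sketched as follows. This is a minimal illustration and not KH Coder's implementation: the input is assumed to be already segmented into tokens, the stop list is a hypothetical one built from the examples named in the text, and the function name `word_frequencies` is invented for this sketch.

```python
from collections import Counter

# Hypothetical stop list, modeled on the examples named in the text.
STOP_WORDS = {"to be", "committee", "new", "country", "china", "state", "relevant"}

def word_frequencies(tokens, stop_words=STOP_WORDS):
    """Count tokens, skipping stop words.

    Returns (term, count) pairs sorted from most to least frequent.
    """
    kept = [t.lower() for t in tokens if t.lower() not in stop_words]
    return Counter(kept).most_common()

# Toy, already-segmented input (not data from the study).
tokens = ["health", "service", "China", "health", "relevant",
          "sports", "health", "service"]
print(word_frequencies(tokens))
# [('health', 3), ('service', 2), ('sports', 1)]
```

Lowercasing before the stop-word check is a design choice of the sketch, so that "China" and "china" are treated alike.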
In addition, to check whether occurrences are spread evenly across words, the number of occurrences of each high-frequency word (TF value) and the frequency were used as two variables for curve fitting; a power function gave the closest fit. Figure 3 shows that the distribution of high-frequency words has a clear long tail, which reflects both the number of occurrences and the number of documents. The interpretation is the same as for Fig. 2. This pattern is consistent with a pronounced Matthew effect and suggests that the data obtained in this study are representative.
Distribution of word frequency fit
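A power-function fit of this kind can be sketched with a least-squares line in log-log space. The text does not state which fitting procedure was used, so the method below is an assumption, as is the choice of rank and count as the two variables; `fit_power_law` is a hypothetical helper name.

```python
import math

def fit_power_law(xs, ys):
    """Fit y = a * x**b by ordinary least squares on (log x, log y).

    Returns (a, b, r2), where r2 is the coefficient of determination
    measured in log space. All inputs must be positive.
    """
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mean_x, mean_y = sum(lx) / n, sum(ly) / n
    sxx = sum((u - mean_x) ** 2 for u in lx)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(lx, ly))
    b = sxy / sxx                      # slope = exponent of the power law
    log_a = mean_y - b * mean_x        # intercept = log of the scale factor
    ss_res = sum((v - (log_a + b * u)) ** 2 for u, v in zip(lx, ly))
    ss_tot = sum((v - mean_y) ** 2 for v in ly)
    return math.exp(log_a), b, 1 - ss_res / ss_tot

# Synthetic Zipf-like counts (not data from the study).
ranks = [1, 2, 3, 4, 5]
counts = [1000 / r for r in ranks]
a, b, r2 = fit_power_law(ranks, counts)
print(round(a, 3), round(b, 3), round(r2, 3))
```

A strongly negative exponent `b` with `r2` close to 1 corresponds to the long-tailed shape described for Fig. 3.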
The word frequency analysis yields the following results. First, regarding health-related words, “health” is the most frequent keyword, appearing 2,758 times, which indicates that the policy attaches importance to health issues. Health-related words such as “medical”, “hygiene” and “nutrition” also appear with high frequency, indicating the policy’s demand for and attention to health care. Second, regarding macro policies and institutions, words such as “promote” and “development” appear with high frequency, reflecting the pursuit of social development and progress. In addition, words such as “organization”, “department” and “government” highlight the important role of government agencies in promoting various activities. Third, regarding community and grassroots services, words such as “community” and “grassroots” also receive some attention, indicating that grassroots services and community building have become an important goal in the field of social services. Fourth, comprehensive development is emphasized: the high frequency of terms such as “comprehensive”, “all people” and “all-round” reflects the importance of integrated development and the participation of all people. Fifth, regarding technology and information, the frequency of terms such as “technology” and “information” is high, demonstrating the important role of science, technology and informatization in various fields.
The high-frequency words above reflect the main objectives of policy formulation. In the X-axis dimension, policy tools are categorized into demand-based, supply-based, and environment-based tools. Analysis of these high-frequency words reveals that national policies predominantly employ supply-based tools, as evidenced by the frequent occurrence of terms such as “health services,” “institutions,” and “management,” which indicates the government’s focus on advancing the provision of medical services. Additionally, demand-based tools are evident in the policies, with words such as “promote,” “support,” and “implement”, suggesting that the policies aim to respond to the growing societal demand for health. Furthermore, environment-based tools are less mentioned, implying that there is less emphasis on improving the external environment in the policies.
In the Y-axis dimension of health equity, high-frequency words such as “universal,” “grassroots,” and “community” indicate that policy documents emphasize promoting health equity among different groups through the integration of medical services and physical activities. However, further analysis reveals that policies tend to focus more on specific groups (such as elderly individuals and patients with chronic diseases) while paying less attention to younger populations, indicating some deficiencies in achieving health equity.
Through the selection and cluster analysis of high-frequency words, it can be seen that the national-level sports and medicine integration policy mainly concentrates on supply-based policy tools and shows concern for promoting universal health equity. However, there are still shortcomings in terms of specific measures and implementation pathways.
Co-occurrence network analysis
KH Coder’s co-occurrence network function (minimum word frequency set to 15) was used to map the themes of the sports-medicine integration policy (Fig. 4). In this co-occurrence network diagram, health, all people, hygiene, sports, physical education, and medical care are the keywords with the most co-occurrences. This again indicates that local governments in China place more emphasis on sports when formulating integration policies and that the way to promote the health of the whole population is through the integration of sports and medical care. The words service, construction, and management lie closest to the high-frequency word health, which suggests that, when formulating policies for the integration of sports and medicine, the national government is more concerned with the government-led construction of service models that promote health. Another high-frequency word, sports, is surrounded by the words fitness, all people, and exercise. This indicates that national policy treats sports as the prerequisite for realizing the health of all people; that is, health is to be pursued in the ordinary course of daily life, through physical activity. At the same time, data, medical, information, and platforms form a separate cluster, indicating that big data intelligent systems have gradually become a means of realizing the integration of sports and medicine.
Co-occurrence network of national-level policies on the integration of sports and medicine
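How edges in such a network can be derived is sketched below. The sketch assumes that edge strength is the Jaccard coefficient between the sets of text units containing each term and that terms below a minimum term frequency are dropped first; the function name, the `min_jaccard` cut-off, and the toy data are illustrative assumptions, not settings reported in the study.

```python
from collections import Counter
from itertools import combinations

def cooccurrence_edges(documents, min_tf=15, min_jaccard=0.2):
    """Return {(term_a, term_b): jaccard} for frequent, co-occurring terms.

    documents: list of token lists, one per text unit.
    min_tf: minimum total term frequency for a term to be kept.
    min_jaccard: minimum Jaccard coefficient for an edge to be kept.
    """
    term_freq = Counter(t for doc in documents for t in doc)
    keep = {t for t, n in term_freq.items() if n >= min_tf}
    doc_sets = [set(doc) & keep for doc in documents]

    doc_freq = Counter(t for s in doc_sets for t in s)   # units containing term
    pair_freq = Counter()                                # units containing both
    for s in doc_sets:
        pair_freq.update(combinations(sorted(s), 2))

    edges = {}
    for (a, b), n_ab in pair_freq.items():
        jaccard = n_ab / (doc_freq[a] + doc_freq[b] - n_ab)
        if jaccard >= min_jaccard:
            edges[(a, b)] = jaccard
    return edges

# Toy text units (not data from the study); thresholds lowered to fit.
docs = [["health", "sports", "service"],
        ["health", "sports"],
        ["health", "service"],
        ["data", "platform"]]
print(cooccurrence_edges(docs, min_tf=2, min_jaccard=0.5))
```

With the toy data, "health" is linked to both "sports" and "service", while the weaker "service"/"sports" pair falls below the cut-off, mirroring how only the stronger links are drawn in a network diagram.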
KH Coder’s co-occurrence network analysis shows that policy-makers prioritize demand-based policy instruments, particularly emphasizing broad objectives such as the realization of health for all. This reflects a strong demand-side focus, where the dominant direction of the policy aligns with public health needs. However, the analysis reveals that while these policies stress the ultimate goal of health equity (Y-axis), they often lack supply-based measures that address specific interventions required to integrate sports and medical care in practice.
In terms of supply-based tools, there is minimal elaboration on concrete, group-oriented service programmes targeting different demographic groups. Instead, policies tend to favour certain segments of the population, such as elderly individuals, children, and citizens, with little focus on youths—highlighting a potential equity gap on the Y-axis. The lack of detailed strategies for specific groups reflects an imbalance in addressing diverse health needs, indicating that while demand is recognized, policies are not fully tailored to the varied supply-side needs required to achieve health equity.
Moreover, when examining environment-based policies, keywords such as services, medical care, human resources, elderly care, institutions, publicity, prevention and treatment, facilities, and industries suggest regional variability in implementation; this suggests that different areas employ different environmental tools to achieve health goals but without clearly assigned responsibilities for implementation or monitoring mechanisms (indicating inefficiency on the X-axis). The lack of explicit governance structures and detailed planning for personnel and resources further undermines both the effectiveness of supply-based tools and the ultimate goal of promoting health equity (Y-axis).
Therefore, while the policy framework aims to promote health equity (Y-axis), the absence of specific supply-based mechanisms and clearly defined environmental policies raises concerns about its actual impact. Policies lack the precise instruments necessary for effective implementation and monitoring, thus failing to fully realize the integration of sports and medicine for diverse populations.
Multi-dimensional scaling analysis
The results of the analysis (Fig. 5) reveal eight high-frequency word dimensions in the policy text on body-health integration, and these dimensions reflect different policy tools and pathways for achieving health equity.
Results of multi-dimensional scaling analysis
First, the health dimension corresponds to demand-based policy instruments and includes terms such as health, medical care, hygiene, sports, nutrition, disease, elderly, and chronic disease. These terms indicate the policy’s focus on promoting public health, improving medical services, enhancing hygiene, and addressing the health needs of elderly individuals and patients with chronic diseases. On the Y-axis, which represents health equity, this dimension demonstrates responsiveness to the health needs of vulnerable groups, particularly elderly individuals.
Second, the service and promotion dimension encompasses terms such as service, promote, facilitate, improve, support, develop, implement, and encourage; it is closely related to supply-based policy tools, suggesting the policy’s emphasis on promoting the supply of health services through various implementation and encouragement mechanisms. However, from a health equity perspective, although the policy stresses widespread service promotion, specific mechanisms for equitable distribution, particularly for addressing service gaps for different demographic groups, are lacking.
Third, the monitoring and prevention dimension includes terms such as monitoring, prevention and control, environment, region, residents, facilities, publicity, and patients. This dimension emphasizes environmental and disease monitoring and prevention, corresponding to environment-based policy tools. In terms of health equity, these policies aim to ensure public health through improvements in facilities and the environment, but they fall short of addressing the disparities between regions or resident groups, particularly those in economically disadvantaged or remote areas.
Fourth, the medical institution and management dimension consists of terms such as hospital, institution, management, department, responsible, personnel, unit, and the State Council. It reflects the policy’s focus on the construction and management of medical institutions and the enhancement of health care personnel and primary medical services. This dimension aligns with supply-based policy tools, focusing on strengthening health care infrastructure. However, from a health equity perspective, there is insufficient clarity regarding how medical resources are distributed among different socio-economic groups.
Fifth, the social participation and organization dimension includes terms such as society, organization, community, mass, and population, highlighting the role of social participation and organization in implementing universal health services. This dimension reflects efforts to achieve health equity by mobilizing broad societal engagement. However, the policy does not sufficiently elaborate on how different groups are involved in these initiatives, particularly the underrepresentation of the youth group in social participation efforts.
Sixth, the policy and system dimension involves terms such as policy, mechanism, system, planning, government, assessment, and the State Council. This dimension emphasizes the need to formulate and improve relevant policies and strengthen oversight and assessment mechanisms; it reflects both demand-based and supply-based policy tools. From a health equity perspective, while the policy framework stresses the importance of policy mechanisms, concrete strategies for ensuring equitable access to health services across diverse population groups are lacking.
Seventh, the data and information dimension includes terms such as information, data, knowledge, and technology, emphasizing the importance of data management and information sharing in enhancing health care technology applications. This dimension aligns with supply-based policy tools, highlighting the role of technology in promoting health services. However, in terms of health equity, the potential gaps in data access and application for marginalized groups are not adequately addressed in the policy.
Finally, the infrastructure and standard dimension includes terms such as foundation, grassroots, quality, industry, combination, and sports, reflecting the policy’s focus on developing infrastructure and establishing standards to promote sports and fitness activities. This dimension is related to supply-based policy tools. While the policy highlights infrastructure improvement, there is little discussion about how this will ensure equitable access for different socio-economic groups.
Additionally, Fig. 6 shows that high-frequency terms such as health, service, medical care, hygiene, and sports are grouped within the same dimension, with these terms occupying the largest nodes in the word frequency network; this indicates that the policy text places considerable emphasis on the provision of health, medical services, and sports activities. However, despite the focus on promoting and improving these fields, the policy lacks detailed implementation pathways, especially concerning the intersection of policy tools and health equity, with limited attention to equitable service distribution and resource allocation strategies across different population groups.
Correspondence analysis quadrant distribution results
Correspondence analysis
In the context of body-health integration policies, correspondence analysis helps us understand how different policy instruments (demand-based, supply-based, and environment-based tools) interact to promote health equity. By plotting keywords based on their thematic relevance and contribution to policy outcomes, we can assess the emphasis placed on specific dimensions, such as public health services, medical institutions, and social participation. According to the results of the correspondence analysis (Fig. 6), the keywords are distributed across four quadrants, indicating different policy approaches and their alignment with health equity goals.
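The quantity that correspondence analysis decomposes can be illustrated with a short sketch. It assumes a keyword-by-document-group contingency table (the layout of the study's actual table is not given) and computes the standardized residuals and the total inertia; the singular value decomposition that turns these residuals into plotted coordinates is deliberately omitted, and `ca_residuals` is a hypothetical helper name.

```python
import math

def ca_residuals(table):
    """Return (standardized residuals, total inertia) for a contingency table.

    table: list of rows of non-negative counts (rows = keywords,
    columns = document groups in this sketch). Every row and column
    must have a positive total.
    """
    n = sum(sum(row) for row in table)
    n_cols = len(table[0])
    row_mass = [sum(row) / n for row in table]
    col_mass = [sum(row[j] for row in table) / n for j in range(n_cols)]
    resid = [
        [(table[i][j] / n - row_mass[i] * col_mass[j])
         / math.sqrt(row_mass[i] * col_mass[j])
         for j in range(n_cols)]
        for i in range(len(table))
    ]
    # Total inertia equals the chi-square statistic divided by n.
    inertia = sum(v * v for row in resid for v in row)
    return resid, inertia

# Toy counts (not data from the study): each keyword occurs in one group only.
resid, inertia = ca_residuals([[10, 0], [0, 10]])
print(resid, inertia)
```

A positive residual marks a keyword that is over-represented in a document group and a negative one marks under-representation; keywords with similar residual patterns end up near each other, and hence in the same quadrant, once the coordinates are computed.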
First quadrant
High Policy Emphasis – High Health Equity Impact. This quadrant represents areas where policy tools exert a strong influence in terms of both addressing public demand and promoting equitable health outcomes. Words in this quadrant include community, sports, medical integration, fitness for all, mass facilities, and sports events. These terms suggest that demand-based policies in this area focus on promoting physical activity through community-based services, with an emphasis on inclusive programmes that aim to improve health equity by engaging diverse populations in sports and fitness activities. The high correlation between these terms shows that policy tools are designed to integrate health and sports at the community level, fostering broad participation and equitable health outcomes.
Second quadrant
High Policy Emphasis – Low Health Equity Impact. Keywords such as information, medical services, institutional care, elderly, technology, geriatric, and medical elderly are positioned in this quadrant. These findings indicate that supply-based policy tools are focused on improving medical care for elderly individuals, especially those with chronic diseases. While the policy strongly emphasizes improving services for an ageing population, there is less consideration of equitable access across different demographic groups, particularly younger populations; this suggests a potential gap in addressing the health needs of other vulnerable groups through these supply-based measures.
Third quadrant
Low Policy Emphasis – Low Health Equity Impact. The third quadrant contains terms such as management, responsibility, safety, education, implementation, knowledge, detection, environment, disease prevention, health, hygiene, and action. These keywords reflect environment-based policy tools that emphasize public health measures such as disease prevention and hygiene education. However, the relatively low policy emphasis and health equity impact suggest that while these measures are essential, they may not yet be fully integrated or impactful in addressing disparities in health outcomes. The policy tools in this dimension focus on broad environmental health measures, but they lack targeted strategies to ensure equitable access and outcomes across different populations.
Fourth quadrant
Low Policy Emphasis – High Health Equity Impact. This quadrant includes keywords such as city, fitness, activity, citizen, unit, and adolescent. Here, environment-based policy tools are focused on urban health initiatives, such as promoting physical activity and fitness in city settings. Although these tools have strong potential to promote health equity, particularly for adolescents and city residents, the low emphasis on these policies suggests that there are gaps in fully realizing their potential. The clustering of these terms indicates a reliance on the urban public health system to promote fitness, but the policy framework may need to strengthen its focus on ensuring equitable access to fitness programmes across urban populations, especially for youths and marginalized groups.
By categorizing the keywords into four quadrants and aligning them with the X-axis (policy type) and Y-axis (health equity), we gain insights into the strengths and weaknesses of the current policy framework on body-health integration. The “high-high” quadrant indicates a strong alignment between policy tools and health equity goals, particularly in promoting inclusive sports programmes. However, the “high-low” quadrant suggests that certain supply-based policies, such as elderly care, lack equitable access for other demographic groups. Similarly, while some environment-based policies focus on broad health improvements (third quadrant), they may not yet be addressing disparities effectively. Finally, the “low-high” quadrant highlights opportunities for improving urban fitness policies to better support health equity for younger populations and citizens.

