Introduction/Background: China's extensive use of policy pilots as a governance tool has fostered widespread adoption of difference-in-differences (DID) methods for outcome evaluation. These studies inform critical policy decisions, yet China's institutional and contextual characteristics create unique challenges for DID applications. Our analysis documents this phenomenon's remarkable scale: over 1,100 Chinese-language CSSCI articles employ DID for policy pilot analysis, while Chinese research institutions produce nearly 1,000 of the 1,398 SSCI papers on this topic, representing 71% of global output. This dominance suggests distinctive patterns in research production that warrant critical examination.
Purpose/Research Question: This study investigates both the methodological issues and the underlying factors driving the proliferation of DID evaluations of Chinese policy pilots. We explore: (1) patterns in DID applications across policy domains; (2) disconnections between policy aims and measured outcomes; (3) researchers' navigation of methodological challenges; and (4) demand-supply factors shaping this research landscape.
Methods: We conducted a comprehensive content analysis of approximately 1,100 Chinese-language CSSCI articles (through 2024) applying DID methods to policy pilots. Our analytical framework incorporated large language models (LLMs) to systematically extract and categorize information across multiple dimensions, including methodological specifications, treatment-outcome relationships, validity considerations, and contextual factors. We analyzed methodological choices (DID variants employed, robustness checks conducted), assessed alignment between policy objectives and measured outcomes, evaluated adherence to core DID assumptions, and examined patterns in reported effectiveness across policy domains. Supplementary qualitative analysis provided deeper insights into methodological adaptation strategies within the Chinese research context.
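For readers less familiar with the method under review, the canonical 2x2 DID logic can be sketched in a few lines. The numbers below are purely illustrative (not drawn from the study): the estimator compares the treated group's pre-post change against the control group's, netting out common time trends under the parallel-trends assumption.

```python
# Minimal 2x2 difference-in-differences sketch with hypothetical
# group-period mean outcomes (illustrative values only).
means = {
    ("treated", "pre"): 10.0,
    ("treated", "post"): 15.0,
    ("control", "pre"): 9.0,
    ("control", "post"): 11.0,
}

def did_estimate(m):
    """Classic 2x2 DID: (treated post-pre change) - (control post-pre change)."""
    return (m[("treated", "post")] - m[("treated", "pre")]) \
         - (m[("control", "post")] - m[("control", "pre")])

print(did_estimate(means))  # 3.0: treated gained 5, control gained 2
```

In applied work this estimate typically comes from a two-way fixed-effects regression rather than raw group means, but the identifying comparison is the same.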
Results/Findings: Our analysis revealed that 42% of studies employed staggered DID designs, while advanced methods such as triple difference (5%) and continuous-treatment DID (1.2%) remained uncommon. Recent publications demonstrated improved methodological quality. Critical issues included insufficient attention to anticipation effects, concurrent policies, and spillovers (addressed by only about 10% of studies each). Notably, 25% of studies failed to establish treatment exogeneity, a fundamental challenge with Chinese policy pilots, and one-third inappropriately claimed "natural experiment" conditions. We also identified a striking pattern of studies examining side effects unrelated to stated policy goals, with overwhelmingly positive reported outcomes across domains.
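One common (if partial) diagnostic for the exogeneity and anticipation concerns noted above is a pre-trend check: under parallel trends, treated and control groups should exhibit similar outcome trajectories before treatment. A minimal sketch, using invented pre-period group means:

```python
import numpy as np

# Hypothetical pre-treatment placebo check (illustrative data only):
# under parallel trends, pre-period growth should match across groups.
years = np.array([2015.0, 2016.0, 2017.0])      # pre-treatment years
treated_pre = np.array([10.0, 10.5, 11.0])      # treated-group means
control_pre = np.array([9.0, 9.5, 10.0])        # control-group means

# Fit a linear trend to each group and compare slopes.
slope_treated = np.polyfit(years, treated_pre, 1)[0]
slope_control = np.polyfit(years, control_pre, 1)[0]

gap = abs(slope_treated - slope_control)
print(gap)  # near zero here: no detectable pre-trend divergence
```

A passing pre-trend check does not by itself establish exogeneity; anticipation effects and concurrent policies require separate argument and evidence, which is precisely the gap the findings above document.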
Conclusion/Implications: This study reveals significant limitations in DID applications to Chinese policy pilots that stem from complex institutional and incentive structures rather than merely technical shortcomings. The research ecosystem features specific demand-side factors, including preferences for positive evaluations and methodological biases, alongside supply-side pressures from publication requirements and career-advancement incentives. These forces create a landscape characterized by unexpected treatment-outcome pairings and unaddressed selection issues. Our findings suggest that academic environments should encourage researchers to apply methods appropriate to genuine research problems while promoting methodological best practices. Such improvements would strengthen evidence-based policymaking in China and offer insights for policy evaluation globally.