Policy Research Working Paper10673Missing EvidenceTracking Academic Data Use around the WorldBrian StacyLucas KitzmüllerXiaoyu WangDaniel Gerszon MahlerUmar Serajuddin Development Economics Development Data GroupJanuary 2024 A verified reproducibility package for this paper is available at http://reproducibility.worldbank.org, click here for direct access. Public Disclosure AuthorizedPublic Disclosure AuthorizedPublic Disclosure AuthorizedPublic Disclosure AuthorizedProduced by the Research Support TeamAbstractThe Policy Research Working Paper Series disseminates the findings of work in progress to encourage the exchange of ideas about development issues. An objective of the series is to get the findings out quickly, even if the presentations are less than fully polished. The papers carry the names of the authors and should be cited accordingly. The findings, interpretations, and conclusions expressed in this paper are entirely those of the authors. They do not necessarily represent the views of the International Bank for Reconstruction and Development/World Bank and its affiliated organizations, or those of the Executive Directors of the World Bank or the governments they represent.Policy Research Working Paper 10673Data-driven research on a country is key to producing evidence-based public policies. Yet little is known about where data-driven research is lacking and how it could be expanded. This paper proposes a method for tracking academic data use by country of subject, applying natural language processing to open-access research papers. The model’s predictions produce country estimates of the number of articles using data that are highly correlated with a human-coded approach, with a correlation of 0.99. Analyzing more than 1 million academic articles, the paper finds that the number of articles on a country is strongly correlated with its gross domestic product per capita, popu-lation, and the quality of its national statistical system. The paper identifies data sources that are strongly associated with data-driven research and finds that availability of sub-national data appears to be particularly important. Finally, the paper classifies countries into groups based on whether they could most benefit from increasing their supply of or demand for data. The findings show that the former applies to many low- and lower-middle-income countries, while the latter applies to many upper-middle- and high-income countries.This paper is a product of the Development Data Group, Development Economics. It is part of a larger effort by the World Bank to provide open access to its research and make a contribution to development policy discussions around the world. Policy Research Working Papers are also posted on the Web at http://www.worldbank.org/prwp. The authors may be contacted at bstacy@worldbank.org. A verified reproducibility package for this paper is available at ...