This paper proposes a data science technique called “time-shift data augmentation,” which reuses information from multiple time periods to artificially increase a dataset’s sample size and thereby improve the performance of machine learning models. We demonstrate its effectiveness in the Predicting Fertility Data Challenge, an international data science competition in which dozens of participants from around the world, with expertise in data science, social science, and machine learning, competed to predict births and adoptions using Dutch survey data. The methodology produced the top-performing model in that competition. The strategy can be applied to many types of social science datasets, such as panel surveys and administrative data, and has the potential to improve the accuracy of predictive models in a wide range of social services settings.
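The abstract does not spell out the implementation, so the following is only a minimal sketch of one way the time-shift idea could work, not the authors' code. It assumes a panel in which each person has one feature value and one event flag per year; the names `Panel`, `build_rows`, `time_shift_augment`, and the toy data are all hypothetical. Each training row pairs features from a window ending at a cutoff year with a label drawn from the following year, and shifting the cutoff back in time yields additional rows from the same people.

```python
from typing import Dict, List, Tuple

# Hypothetical toy panel: person id -> {year: (feature value, event flag)}.
Panel = Dict[str, Dict[int, Tuple[float, int]]]


def build_rows(panel: Panel, cutoff: int, n_lags: int, horizon: int) -> List[dict]:
    """Build one training row per person for a single cutoff year.

    Features come from the n_lags years ending at `cutoff`; the label is 1
    if the event occurs in any of the `horizon` years after `cutoff`.
    """
    rows = []
    for pid, waves in panel.items():
        feature_years = [cutoff - k for k in range(n_lags)]
        label_years = [cutoff + k for k in range(1, horizon + 1)]
        # Skip people who are missing any year this window needs.
        if any(y not in waves for y in feature_years + label_years):
            continue
        row = {"person": pid, "cutoff": cutoff}
        for k, y in enumerate(feature_years):
            # Name features relative to the cutoff so shifted rows line up.
            row[f"x_lag{k}"] = waves[y][0]
        row["label"] = int(any(waves[y][1] for y in label_years))
        rows.append(row)
    return rows


def time_shift_augment(panel: Panel, latest_cutoff: int, n_lags: int,
                       horizon: int, n_shifts: int) -> List[dict]:
    """Stack the rows from the latest cutoff and from n_shifts earlier cutoffs."""
    rows = []
    for shift in range(n_shifts + 1):
        rows.extend(build_rows(panel, latest_cutoff - shift, n_lags, horizon))
    return rows


# Made-up data for illustration only.
panel: Panel = {
    "a": {2016: (0.1, 0), 2017: (0.2, 0), 2018: (0.3, 0), 2019: (0.4, 1), 2020: (0.5, 0)},
    "b": {2017: (1.0, 0), 2018: (1.1, 0), 2019: (1.2, 0), 2020: (1.3, 1)},
}

# Without augmentation: one row per person, using only the latest window.
baseline = build_rows(panel, cutoff=2019, n_lags=2, horizon=1)

# With augmentation: the same window shifted back by 0, 1, and 2 years.
augmented = time_shift_augment(panel, latest_cutoff=2019, n_lags=2,
                               horizon=1, n_shifts=2)

print(len(baseline), len(augmented))
```

Rows built from the same person at different cutoffs are not independent, so under this sketch any validation split would presumably need to group rows by person to avoid leakage between training and evaluation sets.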