Optimally Combining Outcomes to Improve Prediction
In many studies, multiple instruments are used to measure different facets of an outcome of interest. For example, in studies of childhood development, researchers assess a child's neurocognitive ability using tests that measure development in several areas (e.g., language, math, logic). Researchers want to develop prediction algorithms that identify children at risk of neurocognitive delay based on household and environmental characteristics early in life. Several multivariate methods have been proposed for combining the outcome measures into a single "score", but many are hard for applied practitioners to understand and hard to justify scientifically. We propose a simple alternative that allows researchers to learn from the data the weighted combination of outcomes that maximizes predictive performance. The result is an easy-to-interpret single outcome "score" for which predictions are optimized. Furthermore, we propose methods for obtaining honest inference about how well we are able to predict the combined outcome. We illustrate our method using data from a cohort study of childhood development in the Philippines.
I am David Benkeser, a post-doctoral researcher in the Department of Biostatistics working with Dr. Mark van der Laan. My research interests revolve around causal inference and machine learning. I am currently working with the Bill and Melinda Gates Foundation's Healthy Birth, Growth, and Development initiative, which aims to identify children in the developing world at high risk for developmental deficits and to develop targeted interventions to help them. I received my PhD from the University of Washington Department of Biostatistics, where my research focused on methods for evaluating vaccines, particularly for prevention of HIV and malaria. I have also worked extensively in cardiovascular epidemiology and end-of-life health care economics.