Title: Estimation of the optimal surrogate endpoint based on a randomized trial
Authors: Peter Gilbert - University of Washington & Fred Hutchinson Cancer Research Center (United States) [presenting]
Brenda Price - University of Washington (United States)
Mark van der Laan - University of California Berkeley (United States)
Abstract: A common scientific problem is to determine a surrogate outcome for a long-term outcome so that future randomized studies can restrict themselves to collecting only the surrogate outcome. We consider the setting in which we observe n independent and identically distributed observations of a random variable consisting of baseline covariates, a treatment, a vector of candidate surrogate outcomes at an intermediate time point, and the final outcome of interest at a final time point. It is assumed that in the current study the treatment is randomized, conditional on the baseline covariates. The goal is to use these data to learn a most-promising surrogate for use in future trials for estimation and testing of a mean contrast treatment effect on the outcome of interest. We define an optimal surrogate for the current study as the function of the data collected by the intermediate time point that satisfies the Prentice definition of a valid surrogate endpoint and that optimally predicts the final outcome in the current study. This optimal surrogate is a function of the data-generating distribution and is thus unknown. We show that this optimal surrogate is a conditional mean, and we present super-learner-based and targeted super-learner-based estimators that can accommodate high-dimensional covariates and response endpoints, and whose predicted outcomes are used as the surrogate in applications.
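
The idea of using a fitted conditional mean as the surrogate can be illustrated with a minimal sketch. This is not the authors' estimator. It assumes the conditional mean is taken given everything observed by the intermediate time point (baseline covariate W, treatment A, candidate surrogate S), a conditioning set the abstract does not spell out, and it swaps in ordinary least squares where the abstract uses a super learner. The simulated data-generating model and all variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Simulated randomized trial (illustrative assumptions, not real data).
W = rng.normal(size=n)                      # baseline covariate
A = rng.integers(0, 2, size=n)              # randomized treatment, P(A=1) = 0.5
S = 0.5 * W + 1.0 * A + rng.normal(size=n)  # candidate surrogate at intermediate time
Y = 2.0 * S + 0.5 * W + rng.normal(size=n)  # final outcome of interest

# Stand-in for a super learner: least-squares fit of Y on (1, W, A, S).
X = np.column_stack([np.ones(n), W, A, S])
beta, *_ = np.linalg.lstsq(X, Y, rcond=None)

# The predicted outcome plays the role of the estimated surrogate.
surrogate = X @ beta

# Mean-contrast treatment effect on the final outcome and on the surrogate.
effect_Y = Y[A == 1].mean() - Y[A == 0].mean()
effect_S = surrogate[A == 1].mean() - surrogate[A == 0].mean()

print(f"effect on Y:         {effect_Y:.3f}")
print(f"effect on surrogate: {effect_S:.3f}")
```

Because the least-squares fit includes an intercept and the treatment indicator, the residuals sum to zero within each arm, so the two in-sample contrasts agree up to floating-point error. That agreement is a property of this simplified linear fit on the same data, not a guarantee for a general learner or for a future trial.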