Yi Zuo dissertation defense – January 6
PhD candidate Yi Zuo will be defending his dissertation on Thursday, January 6, at 9 a.m. Central Time. All are invited. To attend the defense, which will take place online, contact our department at biostatistics[at]vumc[dot]org for the link.
Improved Variable Selection with Second-Generation P-Values
Many statistical methods for variable selection have been proposed in recent decades, but few balance inference and prediction tasks well. Here we investigate a novel variable selection approach called Penalized regression with Second-Generation P-Values (ProSGPV). It captures the true model at the best rate achieved by current standards, is easy to implement in practice, and often yields the smallest parameter estimation error. The idea is to use an l_0 penalization scheme with second-generation p-values (SGPV), instead of classical p-values, to determine which variables remain in a model. The approach yields tangible advantages for balancing support recovery, parameter estimation, and prediction tasks in linear regression, logistic regression, Poisson regression, and Cox proportional hazards regression settings. The ProSGPV algorithm maintains its good performance even when there is strong collinearity among features or when the feature space is high-dimensional (p > n). We present extensive simulations and real-world applications comparing the ProSGPV approach with current standards for variable selection. ProSGPV has superior inference performance and comparable prediction performance in certain scenarios. An R package is provided that implements the ProSGPV algorithm and reports the variable selection results.
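For readers unfamiliar with second-generation p-values, the thresholding idea can be illustrated with a minimal Python sketch. It is not the ProSGPV algorithm or the R package mentioned above: it shows only the published SGPV formula applied to confidence intervals, and omits the penalized regression step. The function names, the 1.96 multiplier, and the user-supplied null half-width `delta` are assumptions made for illustration.

```python
def sgpv(ci_lo, ci_hi, null_lo, null_hi):
    """Second-generation p-value of interval [ci_lo, ci_hi]
    against the interval null [null_lo, null_hi]."""
    ci_len = ci_hi - ci_lo
    null_len = null_hi - null_lo
    if ci_len <= 0 or null_len <= 0:
        raise ValueError("both intervals must have positive length")
    # Length of the overlap between the two intervals (0 if disjoint).
    overlap = max(0.0, min(ci_hi, null_hi) - max(ci_lo, null_lo))
    # Fraction of the interval inside the null, with a correction that
    # caps the value at 1/2 when the interval is very wide.
    return (overlap / ci_len) * max(ci_len / (2.0 * null_len), 1.0)


def select_by_sgpv(estimates, std_errors, delta, z=1.96):
    """Hypothetical helper: keep the indices of coefficients whose
    interval estimate lies entirely outside [-delta, delta]."""
    kept = []
    for j, (b, se) in enumerate(zip(estimates, std_errors)):
        if sgpv(b - z * se, b + z * se, -delta, delta) == 0.0:
            kept.append(j)
    return kept


# Made-up coefficient estimates and standard errors, for illustration only.
kept = select_by_sgpv([2.0, 0.05, -1.5], [0.3, 0.2, 0.4], delta=0.3)
```

An SGPV of 0 means the interval estimate does not overlap the null interval at all, so only those variables are kept in this sketch.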