Michael E. Kim, Chenyu Gao, Leon Y. Cai, Qi Yang, Nancy R. Newlin, Karthik Ramadass, Angela Jefferson, Derek Archer, Niranjana Shashikumar, Kimberly R. Pechman, Katherine A. Gifford, Timothy J. Hohman, Lori L. Beason-Held, Susan M. Resnick, Stefan Winzeck, Kurt G. Schilling, Panpan Zhang, Daniel Moyer, and Bennett A. Landman. “Empirical Assessment of the Assumptions of ComBat with Diffusion Tensor Imaging.” Journal of Medical Imaging (Bellingham), vol. 11, no. 2, 024011, March 2024. doi:10.1117/1.JMI.11.2.024011.
Diffusion tensor imaging (DTI) is a magnetic resonance imaging technique that provides unique insights into white matter microstructure in the brain. However, it is susceptible to confounding effects introduced by scanner or acquisition differences. ComBat is a leading approach for addressing these site biases. Despite its frequent use for harmonization, ComBat’s robustness to site dissimilarities and to overall cohort size has not yet been evaluated in the context of DTI.
To address this, we matched 358 participants from two sites to create a “silver standard” cohort for multi-site harmonization. We harmonized mean fractional anisotropy (FA) and mean diffusivity (MD) calculated from participant DTI data for regions of interest defined by the JHU EVE-Type III atlas. To quantify the reliability of ComBat, we performed bootstrapping over 10 iterations at 19 levels of total sample size, 10 levels of sample size imbalance between sites, and 6 levels of mean age difference between sites. We measured three key metrics: (i) β_AGE, the linear regression coefficient of the relationship between FA and age; (ii) γ_sf, the ComBat-estimated site-shift; and (iii) δ_sf, the ComBat-estimated site-scaling. We evaluated the reliability of ComBat by calculating the root mean squared error (RMSE) in these metrics and examined the correlation between the reliability of ComBat and the violation of model assumptions.
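The resampling procedure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the data are synthetic, participants are drawn with replacement, the reference slope is taken from the full synthetic cohort, and `harmonize` is a hypothetical hook marking where a ComBat implementation would be applied. The function names `fit_age_slope` and `bootstrap_rmse` are likewise invented for this example.

```python
import numpy as np


def fit_age_slope(age, fa):
    """Return the ordinary least-squares slope of FA on age (beta_AGE)."""
    slope, _intercept = np.polyfit(age, fa, deg=1)
    return float(slope)


def bootstrap_rmse(age, fa, site, sample_size, reference_slope,
                   harmonize=None, n_iter=10, seed=0):
    """RMSE of the resampled age slope against a reference slope.

    `harmonize` is a hypothetical hook: a callable (fa, site, age) -> fa
    where a ComBat implementation would be plugged in. None means no
    harmonization is applied.
    """
    rng = np.random.default_rng(seed)
    errors = []
    for _ in range(n_iter):
        # Assumption: participants are resampled with replacement.
        idx = rng.choice(len(age), size=sample_size, replace=True)
        fa_sub = fa[idx]
        if harmonize is not None:
            fa_sub = harmonize(fa_sub, site[idx], age[idx])
        errors.append(fit_age_slope(age[idx], fa_sub) - reference_slope)
    return float(np.sqrt(np.mean(np.square(errors))))


# Synthetic stand-in data (not from the study): 358 participants, two sites.
rng = np.random.default_rng(42)
age = rng.uniform(50, 90, size=358)
site = rng.integers(0, 2, size=358)
fa = 0.5 - 0.001 * age + rng.normal(0, 0.02, size=358)

reference = fit_age_slope(age, fa)
for n in (40, 160, 358):
    print(n, bootstrap_rmse(age, fa, site, n, reference))
```

The same loop could be repeated over levels of site imbalance or mean age difference by changing how `idx` is drawn; the RMSE for γ_sf and δ_sf would require the harmonization step to return its estimated site parameters.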
Our results indicate that ComBat performs reliably for β_AGE when the total sample size is greater than 162 and the mean age difference between sites is less than 4 years. However, the ComBat model’s assumption of normally distributed residuals is not violated even as the model becomes unstable.
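For reference, the residual assumption mentioned above belongs to the standard ComBat location-scale model. The form below follows the commonly published formulation; the indices s (site), j (participant), and f (feature) are chosen here to match the γ_sf and δ_sf notation and are an assumption about the authors' indexing.

```latex
% ComBat model for feature f of participant j at site s
y_{sjf} = \alpha_f + X_{sj}\,\beta_f + \gamma_{sf} + \delta_{sf}\,\varepsilon_{sjf},
\qquad \varepsilon_{sjf} \sim \mathcal{N}\!\left(0, \sigma_f^{2}\right)

% Harmonized value, using the empirical Bayes estimates \gamma^{*}_{sf}, \delta^{*}_{sf}
y^{\mathrm{ComBat}}_{sjf}
  = \frac{y_{sjf} - \hat{\alpha}_f - X_{sj}\,\hat{\beta}_f - \gamma^{*}_{sf}}{\delta^{*}_{sf}}
  + \hat{\alpha}_f + X_{sj}\,\hat{\beta}_f
```

Here α_f is the overall mean of the feature, X_sj holds the covariates such as age, γ_sf is the additive site-shift, and δ_sf is the multiplicative site-scaling.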
In conclusion, before harmonizing DTI data with ComBat, it is crucial to examine the size of the input cohort and the covariate distributions at each site. Direct assessment of residual distributions is less informative about stability than bootstrap analysis. We advise caution when using ComBat in situations that do not conform to the identified thresholds.