Abstract
Nat Commun. 2025 Dec 24. doi: 10.1038/s41467-025-67656-x. Online ahead of print.
Smoking is the most important behavioural determinant of morbidity and mortality. Using machine learning on plasma levels of 2,917 proteins in the UK Biobank (n = 43,914), we develop a proteomic Smoking Index (pSIN) comprising 51 proteins that accurately distinguish current from never smokers (AUC = 0.95; 95% CI 0.94-0.95). Validation in the China Kadoorie Biobank (n = 3,977) shows similar accuracy (AUC = 0.91; 95% CI 0.89-0.92). pSIN is significantly associated with the risk of all-cause mortality and 18 major chronic diseases, including cardiovascular, renal, pulmonary, neurodegenerative, and cancer outcomes. Among current and former smokers, pSIN predicts death and 11 diseases independently of self-reported smoking history and lifestyle factors. Genome-wide analysis identifies 125 genes (e.g., ALPP, CST5, IL12B) associated with pSIN, while exposome analysis highlights maternal smoking, diet, physical activity, and air pollution as key modifiers. Notably, pSIN tracks recovery among former smokers and identifies those whose disease risks remain comparable to current smokers. These findings demonstrate that plasma proteomics effectively capture the biological imprint of smoking and predict smoking-related morbidity and mortality, offering a more nuanced, molecularly grounded assessment of individual variation in biological response to smoking.
PMID: 41444232 | DOI: 10.1038/s41467-025-67656-x
UK DRI Authors