Chronic Obstructive Pulmonary Disease: Novel Genes Detection with Penalized Logistic Regression
by: Kimiya Gohari, Anoshirvan Kazemnejad, Shayan Mostafaei, Samaneh Saberi, Ali Sheidaei
Format: | Article |
---|---|
Published: | Royan Institute (ACECR), Tehran 2023-03-01 |
Description
Objective: This study aimed to introduce novel techniques for identifying the genes associated with developing chronic obstructive pulmonary disease (COPD) and to prioritize COPD candidate genes using regression methods.

Materials and Methods: This is a secondary analysis of the data from an experimental study. We used penalized logistic regressions with three different types of penalties: least absolute shrinkage and selection operator (LASSO), minimax concave penalty (MCP), and smoothly clipped absolute deviation (SCAD). The models were trained using genome-wide expression profiling to define gene networks relevant to the COPD stages. A 10-fold cross-validation scheme was used to evaluate the performance of the methods. In addition, we validated our results using an external validation approach. We reported the sensitivity, specificity, and area under the curve (AUC) of the models.

Results: There were 21, 22, and 18 significantly associated genes for the LASSO, SCAD, and MCP models, respectively. The most statistically conservative method (detecting the fewest significant features) was MCP, which detected 18 genes that were all detected by the other two approaches. The most appropriate approach was SCAD-penalized logistic regression (AUC = 96.26, sensitivity = 94.2, specificity = 86.96). In this study, a common panel of 18 genes appeared in all three models, showing significant positive or negative correlation with COPD; of these, RNF130, STX6, PLCB1, CACNA1G, LARP4B, LOC100507634, SLC38A2, and STIM2 showed an odds ratio (OR) greater than 1. However, there was a slight difference between the penalized methods.

Conclusion: Regularization solves the serious dimensionality problem in using this kind of regression. In this manner, how these genes affect the outcome and mechanism can be explored more quickly. The regression-based approaches we present could be applied to overcome this issue.
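
The three penalties named in the Materials and Methods differ mainly in how they treat large coefficients: LASSO keeps growing linearly, while SCAD and MCP level off and so shrink strong effects less. The following is a minimal sketch of the per-coefficient penalty functions, assuming the commonly used textbook definitions. The function names and the default shape parameters (`a=3.7` for SCAD, `gamma=3.0` for MCP) are illustrative assumptions; the abstract does not report the tuning values the authors used.

```python
def lasso_penalty(beta, lam):
    """LASSO (L1) penalty: grows linearly with |beta|."""
    return lam * abs(beta)


def scad_penalty(beta, lam, a=3.7):
    """SCAD penalty (assumed standard form, a > 2).

    Matches LASSO for |beta| <= lam, bends quadratically up to a*lam,
    then stays constant so large coefficients are not penalized further.
    """
    b = abs(beta)
    if b <= lam:
        return lam * b
    if b <= a * lam:
        return (2 * a * lam * b - b * b - lam * lam) / (2 * (a - 1))
    return lam * lam * (a + 1) / 2


def mcp_penalty(beta, lam, gamma=3.0):
    """MCP penalty (assumed standard form, gamma > 1).

    Starts at the LASSO rate and relaxes it continuously, becoming
    constant once |beta| exceeds gamma*lam.
    """
    b = abs(beta)
    if b <= gamma * lam:
        return lam * b - b * b / (2 * gamma)
    return gamma * lam * lam / 2


if __name__ == "__main__":
    lam = 0.5  # hypothetical regularization strength, for illustration only
    for beta in (0.25, 1.0, 10.0):
        print(
            f"beta={beta:5.2f}  "
            f"lasso={lasso_penalty(beta, lam):.4f}  "
            f"scad={scad_penalty(beta, lam):.4f}  "
            f"mcp={mcp_penalty(beta, lam):.4f}"
        )
```

In a penalized logistic regression, the sum of one of these penalties over all coefficients is added to the negative log-likelihood. The flat tails of SCAD and MCP are the reason they are often described as less biased for large effects than LASSO.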