
Detection of Cervical Cancer Using lncRNA Expression Data and Detection of Possible Biomarkers with Bagged CART Machine Learning Method

Zeynep Kucukakcali, Ipek Balikci Cicek, Cemil Colak

Abstract


Aim: Cervical cancer (CC), one of the most common gynecological cancers, arises when the cells lining the surface of the cervix become abnormal. It ranks fourth among causes of cancer-related death in women, affecting 3.6% of women in developed countries and approximately 15% of women in underdeveloped countries. Early diagnosis and treatment are the primary means of reducing the high mortality associated with the disease. Particularly in underdeveloped countries and in countries without adequately established screening programs, early diagnosis and rapid, effective treatment are very important for reducing deaths and increasing survival rates. New studies and biomarkers for the early detection of this cancer are therefore needed. In this study, open-access lncRNA expression data from CC and paracancerous tissues were classified with Bagged CART, a machine learning (ML) method, and lncRNAs that may serve as biomarkers of the cancer were identified from the resulting model.

Methods: An open-access dataset was used to investigate the relationship between lncRNAs and CC. The dataset comprises samples from 9 cervical cancer tissues and 9 paracancerous tissues. The Bagged CART model was fitted using 5-fold cross-validation. Model performance was evaluated with accuracy (ACC), balanced accuracy (b-ACC), sensitivity (SE), specificity (SP), positive predictive value (PPV), negative predictive value (NPV), and F1-score.

Results: The LASSO variable selection method was used to select the variables most strongly associated with the output variable and to reduce the number of input variables. Modeling was performed with the 14 lncRNAs selected by this method. The model yielded ACC, b-ACC, SE, SP, PPV, NPV, and F1-score values of 94.4%, 94.4%, 100%, 88.9%, 90%, 100%, and 94.7%, respectively. According to the variable importance values from the Bagged CART model, the lncRNAs that contributed most to explaining CC were RP11-80I15.4, RP11-183I6.2, AC144525.1, RP11-129B22.1, and RP11-389K14.3.

Conclusion: Possible biomarker lncRNAs for CC were identified using Bagged CART, an ML method. More comprehensive analyses on the subject could yield more accurate and reliable results and allow the reliability of these candidate marker lncRNAs to be tested.
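The seven reported performance values can be cross-checked against a single confusion matrix. The sketch below is illustrative only: the abstract does not state the confusion matrix, so the counts used here (9 true positives, 0 false negatives, 8 true negatives, 1 false positive, with CC as the positive class) are an assumption inferred from the reported percentages and the 18 samples, and the function name `classification_metrics` is hypothetical.

```python
def classification_metrics(tp, fn, tn, fp):
    """Compute the seven metrics named in the abstract, as fractions."""
    se = tp / (tp + fn)                    # sensitivity (recall)
    sp = tn / (tn + fp)                    # specificity
    ppv = tp / (tp + fp)                   # positive predictive value
    npv = tn / (tn + fn)                   # negative predictive value
    acc = (tp + tn) / (tp + fn + tn + fp)  # accuracy
    b_acc = (se + sp) / 2                  # balanced accuracy
    f1 = 2 * ppv * se / (ppv + se)         # harmonic mean of PPV and SE
    return {"ACC": acc, "b-ACC": b_acc, "SE": se, "SP": sp,
            "PPV": ppv, "NPV": npv, "F1": f1}


# Assumed counts (not given in the source): 9 CC samples all detected,
# 1 of the 9 paracancerous samples misclassified as CC.
metrics = classification_metrics(tp=9, fn=0, tn=8, fp=1)

for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

Under these assumed counts, the computed values round to the percentages reported in the Results, which suggests that the figures describe a single misclassified paracancerous sample across the cross-validation folds.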

Keywords


Cervical cancer, lncRNA, biomarker, expression, machine learning, Bagged CART.


DOI: https://doi.org/10.37591/rrjocb.v12i1.3194
