Chinese Journal of Lung Diseases (Electronic Edition), 2024, Vol. 17, Issue (04): 535-542. doi: 10.3877/cma.j.issn.1674-6902.2024.04.006

Original Article

Comprehensive analysis of single-cell and bulk RNA sequencing for predicting prognosis and treatment response in lung squamous cell carcinoma

Lu Jiang1, Ju Zhou1, Yang Mao1, Qian Dai1

  1. Clinical Medical Research Center of the Second Affiliated Hospital of Army Medical University, Chongqing 400037, China
  • Received: 2024-02-21 Published: 2024-08-25
  • Corresponding author: Qian Dai
  • Funding:
    General Program of the Natural Science Foundation of Chongqing (CSTB2023NSCQ-MSX0417)
Cite this article:

Lu Jiang, Ju Zhou, Yang Mao, Qian Dai. Comprehensive analysis of single-cell and bulk RNA sequencing for predicting prognosis and treatment response in lung squamous cell carcinoma[J]. Chinese Journal of Lung Diseases (Electronic Edition), 2024, 17(04): 535-542.

Objective

This study aims to identify differentially expressed genes (DEGs) and prognostic genes in lung squamous cell carcinoma (LUSC) by integrating single-cell RNA sequencing (scRNA-seq) data with data from The Cancer Genome Atlas (TCGA), and to construct a prognostic model based on these genes. To better understand the tumor microenvironment (TME) in LUSC, immune cell infiltration characteristics were also analyzed and their potential association with patient prognosis was examined.

Methods

Single-cell RNA-seq data for LUSC were obtained from the Gene Expression Omnibus (GEO) database (GSE118245). After quality control and data normalization, distinct cell populations were identified. Principal component analysis (PCA) and uniform manifold approximation and projection (UMAP) were performed with the Seurat package for dimensionality reduction and clustering. Bulk RNA sequencing data from LUSC patient samples in the TCGA database were obtained using the TCGAbiolinks package, and DEGs between tumor and normal samples were identified. Weighted gene co-expression network analysis (WGCNA) was used to construct a gene co-expression network. Cox regression and least absolute shrinkage and selection operator (LASSO) regression were applied to build a DEG-based prognostic model, and Kaplan-Meier survival curves were used to evaluate overall survival (OS). The CIBERSORT algorithm was used to estimate immune cell infiltration proportions, and the infiltration levels of 22 immune cell types were compared between the high-risk and low-risk groups.
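
The Kaplan-Meier step in the Methods can be illustrated with a minimal sketch. This is not the authors' code (the abstract names R packages but gives no implementation details); it is a plain-Python version of the standard product-limit estimator, and the follow-up times and event flags below are hypothetical values chosen only for illustration.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    times  -- follow-up time for each patient
    events -- 1 if death was observed at that time, 0 if censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    exits = Counter(times)  # everyone leaving the risk set at each time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d > 0:
            # Product-limit update: multiply by the fraction surviving time t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        # Patients censored at t count as at risk at t, then leave the risk set.
        at_risk -= exits[t]
    return curve


# Hypothetical follow-up times (months) and event flags, for illustration only.
times = [5, 8, 8, 12, 15, 20]
events = [1, 1, 0, 1, 0, 1]
for t, s in kaplan_meier(times, events):
    print(f"t={t:>2}  S(t)={s:.3f}")
```

Computing one such curve per risk group and comparing the two (for example with a log-rank test, not shown here) is the usual way a survival difference between high-risk and low-risk patients is assessed.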

Results

After quality control, 5,360 cells were retained from the LUSC single-cell data and grouped into 13 distinct clusters, which were annotated as 8 cell types using SingleR. Compared with the control group, the proportions of neutrophils, CD4+ T cells, and skeletal muscle cells were significantly elevated in the LUSC group. DEG analysis identified 3,396 DEGs, of which 1,851 were upregulated and 1,545 were downregulated. Gene Ontology (GO) and KEGG enrichment analyses showed that these DEGs were mainly involved in processes such as the cell cycle, infection, and the complement cascade. In the TCGA-LUSC data, WGCNA identified the blue module as significantly associated with LUSC progression, from which 61 intersecting genes were selected for further analysis. Univariate Cox regression and LASSO regression identified four independent prognostic genes (ITIH3, MME, PLAAT1, and ATP13A5), and a risk score model was constructed based on these genes. Kaplan-Meier survival curves showed that patients in the high-risk group had significantly shorter OS than those in the low-risk group. The prognostic model performed well in both GEO validation cohorts (GSE192870 and GSE180712). Immune infiltration analysis showed that the proportions of CD8+ T cells, activated memory CD4+ T cells, and follicular helper T cells were significantly higher in the high-risk group than in the low-risk group, supporting the crucial role of the TME in LUSC progression.
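
A risk score model of this kind is typically a weighted sum of the signature genes' expression values, with patients then split into high-risk and low-risk groups at a cutoff. The sketch below illustrates that idea under stated assumptions: the abstract does not report the gene coefficients or the cutoff, so the coefficient values, the median cutoff, and the patient expression values are all hypothetical placeholders.

```python
from statistics import median

# Placeholder coefficients: the abstract names the four genes but does not
# report their LASSO/Cox coefficients, so these values are hypothetical.
COEFFICIENTS = {"ITIH3": 0.10, "MME": 0.05, "PLAAT1": 0.08, "ATP13A5": 0.12}


def risk_score(expression, coefficients=COEFFICIENTS):
    """Linear predictor: sum of coefficient * expression over the signature genes."""
    return sum(coef * expression[gene] for gene, coef in coefficients.items())


def stratify(patients, coefficients=COEFFICIENTS):
    """Label each patient 'high' or 'low' risk, using the cohort median score
    as the cutoff (an assumed convention, not stated in the abstract)."""
    scores = {pid: risk_score(expr, coefficients) for pid, expr in patients.items()}
    cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}


# Hypothetical normalized expression values for four patients.
patients = {
    "P1": {"ITIH3": 1.0, "MME": 2.0, "PLAAT1": 0.5, "ATP13A5": 0.0},
    "P2": {"ITIH3": 3.0, "MME": 1.0, "PLAAT1": 2.0, "ATP13A5": 1.5},
    "P3": {"ITIH3": 0.2, "MME": 0.1, "PLAAT1": 0.3, "ATP13A5": 0.4},
    "P4": {"ITIH3": 2.5, "MME": 2.5, "PLAAT1": 1.0, "ATP13A5": 2.0},
}
print(stratify(patients))
```

The resulting group labels are what a survival comparison or an immune infiltration comparison between risk groups would be computed on.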

Conclusion

This study successfully identified key DEGs in LUSC through the integration of single-cell RNA-seq and TCGA data and developed an effective prognostic model. The model demonstrated robust prognostic prediction in multiple validation cohorts, and the immune infiltration characteristics uncovered by this study provide new insights into the TME in LUSC. The findings suggest that immune cells play a critical role in LUSC development and progression, potentially offering new therapeutic targets for immunotherapy in LUSC.

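Figure 1. Cluster annotation and cell type identification in LUSC 10x single-cell RNA sequencing data. Note: A: cluster annotation and cell type identification by UMAP; B: proportions of cell types in each group. LUSC, lung squamous cell carcinoma; scRNA-seq, single-cell RNA sequencing; UMAP, uniform manifold approximation and projection
Figure 2. Sample grouping after PCA dimensionality reduction and enrichment analysis of differentially expressed genes in the TCGA cohort. Note: A: principal component analysis (PCA) shows a clear separation between the control and tumor groups; B: heatmap of gene expression differences between the control and tumor groups; C: Venn diagrams showing the intersection of upregulated and downregulated genes identified by the three differential expression R packages; D, E: GO and KEGG enrichment analyses of the identified DEGs. DEGs, differentially expressed genes; edgeR, DESeq2, and limma, the three R packages used for differential expression analysis; PCA, principal component analysis; TCGA, The Cancer Genome Atlas; GO, Gene Ontology; KEGG, Kyoto Encyclopedia of Genes and Genomes
Figure 3. Identification of hub DEGs involved in LUSC development by WGCNA. Note: A: scale-free fit index for soft-thresholding powers. The soft-thresholding power β in WGCNA was determined based on the scale-free R² (R² = 0.90). The left panel shows the relationship between β and R². The right panel shows the relationship between β and mean connectivity; B: dendrogram of DEGs clustered based on different metrics; C: heatmap showing the correlations between gene modules and clinical traits (normal vs. tumor); D: Venn diagram identifying the DEGs shared between WGCNA module genes and marker genes. DEGs, differentially expressed genes; LUSC, lung squamous cell carcinoma; WGCNA, weighted gene co-expression network analysis
Figure 4. Construction and validation of the prognostic model for LUSC patients. Note: A-C: survival curves evaluating the risk stratification and predictive ability of the risk model in the TCGA cohort
Figure 5. Analysis of immune cell infiltration characteristics, pathway enrichment differences, and drug sensitivity in the high-risk and low-risk groups. Note: A: immune cell abundance in the different risk groups; B: stacked bar chart showing the proportions of infiltrating immune cells in the high-risk and low-risk groups. Different colors represent different cell types; C: box plot showing differences in the proportions of 22 infiltrating immune cell types between high-risk and low-risk patients. Red represents the high-risk group and blue the low-risk group; D: infiltration proportion of each immune cell type in the high-risk group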