Original Article

COL5A2-Dependent Cancer-associated Fibroblasts (CAF) Reveals Regulation of the Tumor Microenvironment and Response to Immunotherapy in Lung Adenocarcinoma

by Xiao-Qin Shi1, Xiao-Lan Wang1, Xiu-Jun Chang2, Kang-An Li3, Yiran Cai1*

1Pathology Center, Shanghai General Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China

2Department of Thoracic Surgery, Beijing Chest Hospital, Beijing, China

3Department of Radiology, Shanghai General Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China

*Corresponding author: Yi-Ran Cai, Pathology Center, Shanghai General Hospital, Shanghai Jiao Tong University School of Medicine, 100 Haining St., Hongkou District, Shanghai, China

Received Date: 27 September, 2023

Accepted Date: 19 October, 2023

Published Date: 24 October, 2023

Citation: Shi XQ, Wang XL, Chang XJ, Li KA, Cai Y (2023) COL5A2-Dependent Cancer-associated Fibroblasts (CAF) Reveals Regulation of the Tumor Microenvironment and Response to Immunotherapy in Lung Adenocarcinoma. J Oncol Res Ther 8: 10186. https://doi.org/10.29011/2574-710X.10186

Abstract

Background: Cancer-associated fibroblasts (CAFs) are crucial for tumor microenvironment (TME) remodeling and are correlated with tumor progression. The dynamic interactions of CAFs with tumor cells and immune cells in lung adenocarcinoma (LAC) remain unclear. Methods: The role of CAFs in LAC and potential novel mediators of their functions were investigated. Hallmark signals associated with CAFs and immune components in LAC were analyzed in cohorts from the TCGA and GEO databases using bioinformatic methods with R and Bioconductor packages. Twenty LAC patients treated with an anti-PD-1 drug were included to evaluate pathological response. Results: Genes based on CAF markers from the literature were clustered and filtered in LAC to find representative biomarkers that reflect the TME and predict the effect of immunotherapy. Most of the cancer hallmark signaling pathways were enriched in extracellular matrix organization-related GO terms. COL5A2 was upregulated in CAFs compared to normal tissue. The expression of COL5A2 was negatively correlated with CD8+ T cells. COL5A2 indicated poor prognostic outcomes and might be correlated with an immunosuppressive TME. Among the twenty LAC patients who received neoadjuvant therapy, those with COL5A2 overexpression had better clinical outcomes after anti-PD-1 treatment. Conclusion: CAFs play an essential role in tumor progression and the TME. We identified an extracellular protein, COL5A2, as a prognostic marker and potential therapeutic target in LACs.

Keywords: COL5A2; Lung adenocarcinoma; Cancer-associated Fibroblasts; Tumor microenvironment; Anti-PD-1 inhibitor; Neoadjuvant therapy

Introduction

Lung cancer is the most common malignant tumor worldwide. Lung adenocarcinoma (LAC) is an important histological type of non-small cell lung cancer. Recently, the management of LAC and the understanding of its biology have progressed considerably. Different histological subtypes have been identified, characterized by distinct genetic and molecular alterations that correspond to different oncogenic pathways targeted by specific therapies. The World Health Organization classification, revised in 2021, distinguishes subtypes including lepidic, acinar, papillary, micropapillary and solid LAC. Of these, micropapillary and solid LACs are regarded as poorly differentiated and are associated with poor prognosis [1]. Anti-EGFR tyrosine kinase inhibitors (EGFR-TKIs, such as gefitinib and icotinib) and immunotherapies (such as nivolumab and ipilimumab) have successively led to a radical change in patient management and represent significant therapeutic options in the treatment of LAC [2, 3]. To further enhance the benefits of these treatments, different combinations and sequences of TKIs and immunotherapies are being investigated in clinical trials [3]. Patients who relapse on these therapies are left with very few options.

The tumor microenvironment (TME) is a complex ecosystem composed of tumor cells, infiltrating immune cells, and stromal cells intertwined with non-cellular components. The co-evolution and dynamic interplay within and between these components shape the tumor's distinct biology and influence its response to cancer therapies. Cancer-associated fibroblasts (CAFs) are highly heterogeneous stromal cells and prominent components of the microenvironment in solid tumors. Functionally, CAFs can contribute to malignant development and progression by diverse mechanisms, including supporting tumor cell growth by secreting growth factors, remodeling the extracellular matrix, promoting angiogenesis, and mediating tumor-promoting inflammation [4,5]. The crucial role of the TME, which serves as the soil for seeds (cancer cells), has been demonstrated in many studies [6-8]. Cells in the TME mainly include stromal cells and immune cells. Recently, increasing evidence has highlighted that appropriate stromal cells are crucial for the development of tumors [9,10]. Among them, CAFs represent the main fraction, and accumulating evidence has indicated their role in cancer proliferation, progression and invasion [4,11]. Although various clinical trials targeting CAFs have been performed in recent years, using strategies such as targeting surface markers, reducing CAF infiltration and normalizing CAF functions, most of them are still ongoing [4]. Previous studies have identified many CAF markers, but few of them have moved into clinical practice. This may be due to the internal heterogeneity of CAFs. CAFs appear to originate from diverse cell types, such as fibrocytes, stellate cells, endothelial cells, and mesenchymal stem cells [12]. It is well accepted that most activated fibroblasts are derived from fibroblasts of adjacent normal tissues and are induced by oxidative stress or by specific cytokines and chemokines from cancer cells [13]. Hence, distinct subclusters have been identified by previous studies. Therefore, focusing on the function and mechanism of fibroblasts in the tumor microenvironment may provide a strategy for LAC treatment, especially immunotherapy.

In this study, we explored the relative infiltration level of fibroblasts in LAC and the correlation between CAFs and immune components in the TME. We further explored upregulated secreted proteins that could be used to predict CAF function in LAC. Establishing the relevance of fibroblasts and the upregulated protein in LAC using publicly available datasets and clinical samples raises the probability that targeting these biomarkers may yield clinical utility.

Methods

Datasets and tissue specimens

The Cancer Genome Atlas (TCGA) dataset was obtained using the TCGAbiolinks package and analyzed with packages in R software. The transcripts per million (TPM) value was estimated at the transcript level. Patients who were diagnosed with LAC histologically and had transcriptomic data available were included. A total of 501 LAC patients were enrolled in the TCGA cohort. Overall survival (OS) was assessed using vital status and days from diagnosis to death or the last follow-up date. Only patients with active follow-up information were included in the survival analysis. As an additional LAC dataset, GSE68465 with its expression matrices was downloaded from the Gene Expression Omnibus (GEO) (https://www.ncbi.nlm.nih.gov/geo/). The probes were mapped using the corresponding annotation platforms. The expression values were further normalized with the limma R package if necessary. The cell types were annotated using the SingleR package if necessary. No participant-identifying information was involved during download and analysis.

Tissues for immunohistochemistry and primary cell isolation were obtained at Beijing Chest Hospital and Shanghai General Hospital between January 2010 and June 2021. Informed consent was obtained from participants. The study was conducted according to the principles stated in the Declaration of Helsinki.

Molecular markers, CAF score and CAF clustering

Representative immune-related genes (IRGs) and CAF signatures were analyzed, including the T cell signature (IFNG1, STAT1, CXCL10, IDO1, CXCL9), myeloid dendritic cells (CD1A, CD1B, CD1E, CLEC10A, CLIC2), inhibitory immune ligands/receptors (HAVCR2, CTLA4, LAG3, PDCD1, CD274), immune modulators (ENTPD1, NT5E), and T cell-associated immune receptors (CD28, CD3D, CD3G, CD5, CD6, CHRM3-AS2, CTLA4, FLT3LG, ICOS, MAL, MGC40069, PBX4, SIRPG, THEMIS, TNFRSF25, TRAT1, CD8B, CD8A, EOMES, FGFBP2, GNLY, KLRC3, KLRC4, KLRD1). CAF scoring was evaluated through Tumor Immune Dysfunction and Exclusion (TIDE, http://tide.dfci.harvard.edu). Based on tumor pre-treatment expression profiles, the TIDE module can estimate multiple published transcriptomic biomarkers to predict patient response. The edgeR and limma packages were used to calculate the fold change of genes between groups. Clustering was performed in individual datasets, and the samples were further classified into high-, medium-, and low-infiltration groups using the ComplexHeatmap package. To further confirm the clustering results, principal component analysis (PCA) was applied as previously described. Inflammatory and myofibroblastic CAF features were also included to assess the internal characteristics. Comparisons of biological markers among different CAF infiltration groups are shown using the ggheatmap and ComplexHeatmap packages.

Tumor microenvironment estimation

The Estimate the Proportion of Immune and Cancer cells (EPIC), xCell, and Microenvironment Cell Populations-counter (MCP-counter) algorithms were applied to calculate the cancer-associated fibroblast scores in datasets [14,15]. To analyze the correlation among fibroblasts and immune cells, fractions of 22 immune cells were estimated using the Cell-type Identification by Estimating Relative Subsets of RNA Transcripts (CIBERSORT) algorithm [16,17]. The estimation of stromal and immune cells in malignant tumor tissues using expression data (ESTIMATE) was applied to calculate the overall stromal and immune scores in cancer [18].

Functional analysis

The GSVA package was used for gene set variation analysis (GSVA) [19]. The GSVA results were compared between the high- and low-CAF infiltration groups and are displayed. Gene set enrichment analysis (GSEA) was used to explore the biological functions and performed using GSEA 4.1.0. Hallmark and gene ontology gene sets were obtained from the MSigDB Collections (http://www.gsea-msigdb.org/gsea). 

Immunohistochemistry

Formalin-fixed, paraffin-embedded tissue was cut into 4-μm sections, mounted on glass slides, and baked in a 63 °C oven for one hour. The tissues were dewaxed and rehydrated with a sequential procedure: dimethylbenzene, 15 min × 2 times; water-free alcohol, 7 min × 2 times; 90% alcohol, 7 min; 80% alcohol, 7 min; 70% alcohol, 7 min; then triple-rinsed with water for three minutes each time. Antigen retrieval was performed for 5 min in boiling citric acid solution. Next, the tissues were incubated in blocking solution (76.8% methanol and 7.2% H2O2) for 10 min and then triple-rinsed with 1 × PBS for 5 min each time. Rabbit polyclonal COL5A2 antibody (OriGene) was diluted with Dako REAL Antibody Diluent (Dako S3022) at a ratio of 1:500 and incubated overnight in a humid container in a 4 °C refrigerator. Tissues were triple-rinsed with 1 × PBS for 5 min each time before a further incubation with Dako EnVision™+/HRP secondary antibody reagent (Dako SM802) for 30 min. The tissues were washed with PBS and stained with the substrate 3,3ʹ-diaminobenzidine (DAB; Dako EnVision DAB+ Chromogen, Dako DM827) in EnVision™ Substrate Buffer (Dako SM803) for 5 min. The nuclei were counterstained with Dako Hematoxylin. IHC staining was assessed by a score based on the percentage of positive cells (0: <5%; 1: 5%–25%; 2: 25%–50%; 3: 50%–75%; 4: >75%) multiplied by a score based on the intensity of staining (0: colorless; 1: light yellow; 2: brown; 3: dark brown), with 6–12 considered high expression and 0–4 considered low expression. The primary antibody against COL5A2 used in IHC testing was purchased from LifeSpan BioSciences, Inc. (Seattle, WA, U.S.A.).
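The scoring rule described above can be sketched in a few lines of Python. This is an illustration only, not the authors' code: the function names are hypothetical, and because the published bins share their edges (25%, 50%, 75%), the sketch assumes that an edge value falls into the lower bin.

```python
def percentage_score(pct_positive: float) -> int:
    """Score 0-4 from the percentage of positive cells."""
    if not 0 <= pct_positive <= 100:
        raise ValueError("percentage must be between 0 and 100")
    if pct_positive < 5:
        return 0
    # Assumption: shared bin edges (25, 50, 75) are assigned to the lower bin.
    for upper, score in ((25, 1), (50, 2), (75, 3)):
        if pct_positive <= upper:
            return score
    return 4


def ihc_score(pct_positive: float, intensity: int) -> int:
    """Percentage score (0-4) multiplied by staining intensity (0-3)."""
    if intensity not in (0, 1, 2, 3):
        raise ValueError("intensity must be 0, 1, 2 or 3")
    return percentage_score(pct_positive) * intensity


def expression_group(score: int) -> str:
    """6-12 is high, 0-4 is low; a product of 0-4 and 0-3 can never equal 5."""
    return "high" if score >= 6 else "low"


if __name__ == "__main__":
    score = ihc_score(60, 3)  # 50%-75% positive cells, dark brown staining
    print(score, expression_group(score))
```

Since the possible products are 0, 1, 2, 3, 4, 6, 8, 9 and 12, the two published ranges cover every attainable score.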

Western blot

For LAC tumors, cells were laser-captured from tumor bulks to isolate CAFs and tumor cells, as described in our previous study [20]. These cells were lysed directly using RIPA buffer (Beyotime) at 4 °C. Phenylmethanesulfonyl fluoride (PMSF, Beyotime) was added to reduce protein degradation during extraction. Proteins were separated in SDS-polyacrylamide gels and transferred to PVDF membranes, and nonfat milk was used to block the nonspecific binding sites on the membrane. The membranes were incubated with primary antibodies against AKT1, COL5A2 (1:1000, Proteintech) and GAPDH (1:3000, SAB) at 4 °C overnight. The secondary antibody (1:5000, Bioss) was applied on the following day, and the reaction was detected using enhanced chemiluminescence solution (ECL, Affinity).

Pathologic Assessment of Response in LACs based on COL5A2 in resected tumors after ICI therapy

Twenty LAC patients, 18 to 65 years of age, were then recruited for immune checkpoint inhibitor therapy (ICIT) with camrelizumab. Tumor size and standardized uptake value at baseline were evaluated by positron-emission tomography (PET) plus contrast-enhanced CT. These patients had histologically confirmed NSCLC (stage IB–IIIA, American Joint Committee on Cancer, 8th edition) that was surgically resectable. All patients had a treatment-naive primary tumor and adequate organ function. Exclusion criteria were as described in a previous study [21]. Surgery was performed between day 26 and day 41 after two cycles of camrelizumab treatment (200 mg, intravenously, day 1 out of 22). All patients provided written informed consent for the use of their tumor specimens. Sufficient fresh specimens were used for the following assays.

A standardized procedure was used to evaluate the percentages of (1) viable tumor, (2) necrosis, and (3) stroma (including inflammation and fibrosis). Tumors with a diameter < 3 cm were entirely sampled. If a tumor was larger than 3 cm, an approximately 0.5 cm thick cross-section of the tumor was taken in its maximum dimension. The true tumor bed consisted of viable tumor, necrosis and stroma, which included both fibrosis and inflammation confined to the tumor bed.

The percentages of viable tumor, stromal tissue, and necrosis were estimated based on the microscopic sections. Each component was assessed in 10% increments unless the amount was less than 5%. A semiquantitative approach was applied to the two stromal tissue components: fibrosis and inflammation. The final pathologic responses were determined based on the histologic features correlated with the gross findings.

Statistical methods

The best cutoff values for specific markers in each cohort were determined using the survminer package. The survival package was used for Kaplan-Meier overall survival analysis, and the log-rank test was applied for comparison. The hazard ratio (HR) was calculated via univariate Cox regression. Immune signatures were divided into two groups according to the median value and analyzed by Cox regression. The Student’s t-test or the Wilcoxon rank-sum test was used for comparison of normally and non-normally distributed variables in unpaired groups, respectively. The paired Student’s t-test was performed for paired samples. The chi-square test and Fisher’s exact test were applied for comparison of clinical features. The Spearman method was applied for correlation analysis. All P values were two-tailed. Statistical analysis was performed using R software (Version 4.3.1, https://www.r-project.org).

Results

CAF Scores Are Correlated with a Poor Prognosis in LAC

CAF scoring was evaluated with the EPIC, xCell, and MCPcounter R packages and with Tumor Immune Dysfunction and Exclusion (TIDE) on 111 CAF-related genes from MCPcounter (Table 1). We hypothesized that CAFs remodel the tumor microenvironment, promote tumor progression, and influence prognosis. To explore these possibilities, we correlated clinical data with the CAF scores based on the expression levels of CAF-related genes (Table 1).

These CAF scores were correlated with overall survival in the TCGA cohort by EPIC, MCPcounter, xCell and TIDE (Figure 1A-C; OS: p=0.025, HR=1.456, 95% CI 1.048-2.025; p=0.022, HR=1.631, 95% CI 1.068-2.493; p=0.006, HR=1.55, 95% CI 1.129-2.127, respectively). The stromal scores for the TCGA cohort were also analyzed; in contrast, patients with high stromal scores had a better prognosis than those with low stromal scores (Figure 1D, p=0.004, HR=0.651, 95% CI 0.484-0.876).
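As a reading aid for the hazard ratios above, the following Python sketch checks that each reported HR sits at the geometric mean of its confidence bounds and recovers the implied standard error of log(HR). It assumes the intervals are Wald intervals computed on the log scale, which is common for Cox regression output but is not stated in the text; the function name and the panel labels are hypothetical.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def log_hr_summary(lower: float, upper: float) -> tuple[float, float]:
    """Return (geometric midpoint of the CI, implied SE of log HR).

    Assumes a Wald interval: exp(log(HR) +/- Z_95 * SE).
    """
    midpoint = math.sqrt(lower * upper)
    se = (math.log(upper) - math.log(lower)) / (2 * Z_95)
    return midpoint, se


# (reported HR, lower bound, upper bound) as given in the text
reported = {
    "Figure 1A-C, first": (1.456, 1.048, 2.025),
    "Figure 1A-C, second": (1.631, 1.068, 2.493),
    "Figure 1A-C, third": (1.55, 1.129, 2.127),
    "Figure 1D, stromal score": (0.651, 0.484, 0.876),
}

if __name__ == "__main__":
    for label, (hr, lower, upper) in reported.items():
        midpoint, se = log_hr_summary(lower, upper)
        print(f"{label}: reported HR={hr}, midpoint={midpoint:.3f}, SE={se:.3f}")
```

Under that assumption, all four reported hazard ratios agree with their intervals to within rounding.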

Similarly, the prognosis of patients from GSE68465 in relation to CAFs was evaluated by xCell and TIDE. By xCell analysis, patients with low CAF scores had better overall survival than those with high CAF scores (Figure 1E, p=0.022, HR=1.644, 95% CI 1.069-2.529). The TIDE analysis contradicted this, showing a better clinical outcome in patients with high CAF scores than in those with low CAF scores (Figure 1F, p=0.013, HR=0.72, 95% CI 0.554-0.934). Patients with high stromal scores had better prognoses than those with low stromal scores (Figure 1G, p=0.032, HR=0.73, 95% CI 0.546-0.974), consistent with the result from the TCGA cohort.

 

Figure 1: The expression levels of CAF-related genes were correlated with overall survival in the TCGA cohort by different analyses of EPIC, MCPcounter, xCell and TIDE. (A-C) Overall survival of the high- and low-expression groups of CAF-associated genes from the TCGA cohort, compared by EPIC, MCPcounter and TIDE. (D) Overall survival of patients from TCGA with high and low stromal scores. (E-F) Overall survival of the high- and low-expression groups of CAF-associated genes from GSE68465, compared by xCell and TIDE. (G) Overall survival of patients from GSE68465 with high and low stromal scores. (H) Heatmap showing the results of correlation analyses of different modules for the TCGA cohort. (I) Heatmap showing the results of correlation analyses of different modules for the GSE68465 cohort.

Functional Enrichment Analysis of CAF-Associated Differentially Expressed Genes in LACs of TCGA and GSE68465 cohorts

The relative abundance of fibroblasts in LACs was estimated by expression profile clustering using classic CAF markers. The intersection analysis, displayed as a Venn plot, identified the genes common to TCGA and GSE68465: SPARC, GLT8D2, ANGPTL2, MMP2, COL6A3, AEBP1, INHBA, COL5A2, COL5A1, COL1A2, MXRA5, and THBS2 (Figure 2A). Results from gene ontology (GO) enrichment analysis indicated that the DEGs mapped to extracellular matrix organization-related GO terms, such as extracellular structure organization, endodermal cell differentiation and collagen fibril organization (Figure 2B). The Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analysis also showed enrichment of the protein digestion and absorption and ECM-receptor interaction pathways (Figure 2C). Thus, the overall functions of the DEGs mapped to extracellular matrix remodeling-related activities, which implied that the involvement of CAFs was a predominant feature of the TME in LACs.

 

Figure 2: Common genes from TCGA and GEO cohorts and GO and KEGG analyses. (A) Venn plots showed common CAF-associated genes from TCGA and GSE68465 cohorts. (B) Bubble GO analysis chart identified molecular functions with genes enriched in each module. (C) Bubble KEGG plot analysis identified cancer-related pathways with genes enriched in each module.

Construction of a degenerated TME subtyping method by WGCNA-LASSO

To investigate effective biomarkers derived from CAFs and their corresponding prognostic value, and to increase the accuracy and simplicity of the signature, WGCNA was performed to divide the DEGs into different modules, and the most prognosis-related genes in each module were selected by LASSO regression. The purpose of this step was to enhance the orthogonality of the genes involved in the signature. The result of the WGCNA cluster analysis on gene expression in the TCGA cohort is shown in Figure 1H. The MEyellow group was most strongly associated with the CAF score and stromal score. Likewise, the cluster analysis for the GSE68465 cohort is shown in Figure 1I. The MEred group was significantly associated with the CAF score and stromal score.

As shown in Figure 1H-I, these different modules were correlated with different microenvironmental factors and different clinical characteristics (Figure 1A-G). Subsequently, we performed a Pearson’s correlation analysis on the prognostic genes. Based on these results, the genes that we selected exhibited different microenvironmental statuses and a high orthogonality. Lastly, two modules from the TCGA cohort and 9 modules from GSE68465 were selected for LASSO regression. The MEyellow module from the TCGA cohort included UNC5B, CALD1, FSTL1, LAMA4, PRRX1, SPARC, TIMP2, GLT8D2, THY1, ANGPTL2, OLFML2B, COL6A2, ADAMTS2, MMP14, FBN1, FAP, ANTXR1, ADAMTS12, MMP2, COL6A3, AEBP1, INHBA, COL5A2, COL5A1, COL1A2, COL15A1, POSTN, COL3A1, ITGA11, MXRA5, ADAM12, SULF1, VCAN, THBS2, COL1A1 and COL12A1, and the MEred module from the GSE68465 cohort included GLT8D2, SPARC, COL6A3, ANGPTL2, ACTA2, COL1A2, CTSK, MMP2, BGN, ISLR, CDH11, AEBP1, COL5A1, DCN, THBS2, COL5A2, TAGLN, INHBA and MXRA5. Next, a prognostic model consisting of two genes, COL5A2 and COL5A1, was generated (Figure 3A-B), and the risk score of each patient was calculated as the sum of the expression of each gene multiplied by its corresponding coefficient (0.128 for COL5A2 and 0.003 for COL5A1).
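The two-gene risk score described above is a weighted sum, which the following Python sketch makes explicit. Only the two coefficients come from the text; the function name and the example expression values are hypothetical, and the expression scale the authors used as input is not specified here.

```python
# LASSO coefficients reported in the text for the two-gene model
COEFFICIENTS = {"COL5A2": 0.128, "COL5A1": 0.003}


def risk_score(expression: dict[str, float]) -> float:
    """Sum of each gene's expression multiplied by its coefficient."""
    missing = set(COEFFICIENTS) - set(expression)
    if missing:
        raise KeyError(f"missing expression values for: {sorted(missing)}")
    return sum(coef * expression[gene] for gene, coef in COEFFICIENTS.items())


if __name__ == "__main__":
    # Hypothetical expression values for one patient, for illustration only
    patient = {"COL5A2": 5.0, "COL5A1": 10.0}
    print(round(risk_score(patient), 3))
```

Because the COL5A2 coefficient is roughly forty times larger than that of COL5A1, the score is driven mainly by COL5A2 when the two genes are expressed on a similar scale.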

To avoid overfitting effects, we evaluated the correlation of CAF and risk scores of samples from the TCGA cohort with the GGally package using EPIC, MCPcounter, xCell and TIDE. All the results from different algorithms were positively correlated. Patients in the TCGA cohort were divided into high- and low-risk groups by the median risk score. According to the expression of COL5A2 and COL5A1, patients were classified into the high-risk group (n=235) and the low-risk group (n=236) (mean risk score=1.1, range 0.59-1.55) (Figure 3D). We found that with increased risk, COL5A2 and COL5A1 were up-regulated accordingly. Notably, the 2-gene panel integrated with all of the algorithms turned out to represent the CAF-related gene panel (Figure 3E). CAFs are an important component of tumors and may influence therapeutic effects. Therefore, this signature was tested with tumor immune dysfunction and exclusion (TIDE), which predicts treatment effects from tumor pre-treatment expression profiles and can estimate multiple published transcriptomic biomarkers to predict patients’ response (Figure 3F). In the low-risk group, 54% of patients responded to anti-PD-1 or anti-CTLA-4, compared with 18% in the high-risk group (Figure 3G-H, p <0.001). The accuracy of the signature was validated by the AUC, which was 0.757 (95% CI 0.708-0.801) (Figure 3I). These results indicated that this signature had a promising application in the prediction of clinical outcomes of immune therapy.
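The median split into risk groups can be sketched as follows. This is an illustration with hypothetical names and toy scores, not the authors' code, and it assumes that a patient whose score equals the median is assigned to the low-risk group. Under that assumption and without tied scores, 471 patients split into 235 high-risk and 236 low-risk, which matches the group sizes reported above.

```python
from statistics import median


def split_by_median(scores: dict[str, float]) -> dict[str, str]:
    """Label each patient 'high' or 'low' relative to the median risk score.

    Assumption: a score equal to the median is labelled 'low'.
    """
    cutoff = median(scores.values())
    return {pid: ("high" if score > cutoff else "low")
            for pid, score in scores.items()}


if __name__ == "__main__":
    # Toy risk scores for five hypothetical patients
    toy_scores = {"P1": 0.59, "P2": 0.90, "P3": 1.10, "P4": 1.30, "P5": 1.55}
    groups = split_by_median(toy_scores)
    print(groups)
```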

 

Figure 3: CAF-related genes were analysed with patients’ clinical traits, and a model was explored to predict the immunotherapeutic response of LAC patients. (A) Univariate Cox regression analysis identifying CAF-associated genes correlating with overall survival. (B) Partial likelihood deviance revealed by LASSO regression in the 10-fold cross-validation to establish a model to predict risk of CAF-associated genes. The optimal values are shown within the two dotted vertical lines. (C) Correlation of the expression of CAF-associated genes, stromal score and risk score analyzed by EPIC, MCPcounter, xCell and TIDE. (D) Heatmap for CAF-associated genes generated by comparison of the high-risk score group vs. the low-risk score group. Row names of the heatmap are gene names, and column names are sample IDs, which are not shown in the plot. (E) Correlation of the genes in the predictive model with the CAF-associated genes. (F) The upper panel shows the response of patients from the TCGA cohort to immune checkpoint inhibitor. The lower panel shows the response of patients from the GSE68465 cohort to immune checkpoint inhibitor. (G) Boxplot showing the percentage of patients responding to anti-PD-1 therapy in the high-risk and low-risk groups. (H) Violin plot showing the significant difference in response to immune checkpoint inhibitor between the high-risk and low-risk groups. (I) ROC curve of the response to immune checkpoint inhibitor.

Functional Analysis to CAF High- and Low-Risk Groups

The GSVA package was used for gene set variation analysis (GSVA). The GSVA results were compared between the high- and low-CAF infiltration groups and are displayed. Gene set enrichment analysis (GSEA) was used to explore the biological functions of COL5A2 and COL5A1 using GSEA 4.3.2; Hallmark and gene ontology gene sets were obtained from the MSigDB Collections (http://www.gsea-msigdb.org/gsea). The GO analyses demonstrated that cytosolic ribosome, large ribosomal subunit, ribosomal subunit, ribosome and structural constituent of ribosome were activated in the low-risk group (Figure 4A). In contrast, nucleosome assembly, DNA packaging complex, nucleosome, extracellular matrix structural constituent and structural constituent of chromatin were activated in the high-risk group (Figure 4B). The KEGG analysis showed that the glycine, serine and threonine metabolism, linoleic acid metabolism, oxidative phosphorylation, Parkinson's disease and ribosome pathways were involved in the low-risk group (Figure 4C). However, the cytokine-cytokine receptor interaction, ECM-receptor interaction, focal adhesion, pathways in cancer and systemic lupus erythematosus pathways were upregulated in the high-risk group (Figure 4D). These results suggested that extracellular matrix-associated pathways and functions were active in COL5A2-upregulated LACs. Correlation analyses revealed that anatomical structure formation (R=0.65, p <0.001), fibroblast migration (R=0.66, p <0.001) and fibroblast proliferation (R=0.64, p <0.001) were significantly correlated with the risk score in LACs (Figure 4E-G).

 

Figure 4: GO and KEGG analyses on LACs from low- and high-risk groups. (A-B) GO analyses of CAF-associated genes in the low- and high-risk groups. (C-D) KEGG signaling pathways based on CAF-associated genes in the low- and high-risk groups. (E-G) Scatter plots showing the correlation of structure formation, fibroblast migration and proliferation with risk score. p and R values are from Spearman correlation analyses.

COL5A2 Expression Was Associated With the Clinicopathological Staging of LAC Patients

To determine the relationship between the proportion of immune and stromal components and the clinicopathological characteristics of LAC cases from the TCGA database, we first analyzed the expression of COL5A2 in normal and tumor tissues. In both unpaired and paired comparisons, the expression level was higher in tumor than in normal tissues (Figure 5A-B). We also analyzed the expression level of COL5A2 in relation to clinical information. COL5A2 expression was not correlated with patients’ age or gender (Figure 5C-D). We noticed that COL5A2 expression was associated with T stage: T1 tumors had a lower level of COL5A2 expression than T2 and T3 tumors (Figure 5E). A similar tendency was found in stage I versus stage II (p <0.01, Figure 5H). However, within advanced stages of LAC, the expression of COL5A2 did not increase with tumor progression (Figure 5F-G). This suggested that COL5A2 expression increased with tumor enlargement. A heatmap plot showed other clinical characteristics in the groups with high and low expression of COL5A2 (Figure 5I).

As shown in Figure 5J, a nomogram model was constructed that included T stage, tumor status, pathologic stage, and COL5A2 expression level as parameters. The nomogram showed a significantly high clinical value in predicting the 1-, 3-, and 5-year survival probability of LAC patients.

Univariate Cox regression analysis was performed on clinicopathological factors, and the results indicated that the expression of COL5A2 and clinical stage were significant prognostic factors, with HR=1.174 (95% CI 1.05-1.313) and HR=1.626 (95% CI 1.413-1.871), respectively (Figure 5K). Similar results were obtained by multivariate regression (Figure 5L).