Multiple machine learning-based integrations of multi-omics data to identify molecular subtypes and construct a prognostic model for HNSCC
Hereditas volume 162, Article number: 17 (2025)
Abstract
Background
Immunotherapy has introduced new breakthroughs in improving the survival of head and neck squamous cell carcinoma (HNSCC) patients, yet drug resistance remains a critical challenge. Developing personalized treatment strategies based on the molecular heterogeneity of HNSCC is essential to enhance therapeutic efficacy and prognosis.
Methods
We integrated four HNSCC datasets (TCGA-HNSCC, GSE27020, GSE41613, and GSE65858) from TCGA and GEO databases. Using 10 multi-omics consensus clustering algorithms via the MOVICS package, we identified two molecular subtypes (CS1 and CS2) and validated their stability. A machine learning-driven prognostic signature was constructed by combining 101 algorithms, ultimately selecting 30 prognosis-related genes (PRGs) with the Elastic Net model. This signature was further linked to immune infiltration, functional pathways, and therapeutic sensitivity.
Results
CS1 exhibited superior survival outcomes in both TCGA and META-HNSCC cohorts. The PRG-based signature stratified patients into low- and high-risk groups, with the low-risk group showing prolonged survival, enhanced immune cell infiltration (B cells, T cells, monocytes), and activated immune functions (cytolytic activity, T cell co-stimulation). High-risk patients were more sensitive to radiotherapy and chemotherapy (e.g., Cisplatin, 5-Fluorouracil), while low-risk patients responded better to immunotherapy and targeted therapies.
Conclusion
Our study delineates two molecular subtypes of HNSCC and establishes a robust prognostic model using multi-omics data and machine learning. These findings provide a framework for personalized treatment selection, offering clinical insights to optimize therapeutic strategies for HNSCC patients.
Introduction
Head and neck squamous cell carcinoma (HNSCC), a heterogeneous malignancy arising from mucosal epithelia in the oral cavity, pharynx, and larynx, remains a significant global health challenge due to its aggressive behavior and poor prognosis. According to GLOBOCAN 2020, HNSCC accounts for over 830,000 new cases and 430,000 deaths annually worldwide, ranking seventh in cancer incidence and mortality [1, 2]. In the United States alone, an estimated 54,540 new cases and 11,580 deaths were projected for 2023 [3]. The disease exhibits a striking gender disparity, with males facing a 2.5-fold higher incidence than females, driven by established risk factors such as tobacco use, alcohol consumption, and high-risk human papillomavirus (HPV) infection [4, 5]. While HPV-positive oropharyngeal cancers often respond better to therapy, the majority of HNSCC patients present with advanced-stage disease characterized by high rates of locoregional recurrence, distant metastasis, and 5-year survival rates below 50% despite multimodal therapies [6,7,8].
Current standard treatments—surgery, radiotherapy, and platinum-based chemotherapy—are associated with severe toxicities, including dysphagia, mucositis, and irreversible organ dysfunction, which profoundly impair patients’ quality of life [9, 10]. The advent of immune checkpoint inhibitors (ICIs) targeting PD-1/PD-L1 (e.g., pembrolizumab, nivolumab) and CTLA-4 (e.g., tremelimumab) has revolutionized HNSCC management, offering durable responses in a subset of patients by reactivating antitumor immunity [11,12,13]. However, clinical benefits remain limited, with objective response rates below 20% in recurrent/metastatic settings, and intrinsic or acquired resistance mechanisms—such as T cell exhaustion, immunosuppressive myeloid cell infiltration, and defects in antigen presentation—continue to hinder therapeutic efficacy [14,15,16]. These challenges underscore the critical need for strategies to predict treatment responsiveness and tailor therapies based on molecular profiles.
Advances in multi-omics technologies (genomics, transcriptomics, epigenomics) have unraveled the profound molecular heterogeneity of HNSCC, revealing distinct subtypes defined by HPV status, somatic mutations (e.g., TP53, CDKN2A, PIK3CA), and immune microenvironment composition [17,18,19]. Similarly, transcriptomic analyses have identified immune “hot” and “cold” tumors, with the former showing enhanced cytotoxic lymphocyte infiltration and superior responses to ICIs [21]. Despite these insights, translating molecular subtyping into clinically actionable frameworks remains elusive, partly due to the lack of robust prognostic models integrating multi-omics data and real-world treatment outcomes.
Machine learning (ML) has emerged as a powerful tool to address this gap, enabling the integration of high-dimensional omics data for predictive modeling. Recent studies have leveraged ML algorithms to identify prognostic gene signatures, predict drug sensitivity, and stratify patients for immunotherapy [22,23,24]. However, many models suffer from overfitting, limited generalizability, or a failure to account for the dynamic interplay between tumor cells and the immune microenvironment. Furthermore, most existing signatures rely on single-omics approaches, neglecting the complementary insights offered by multi-omics integration [25].
In this study, we integrated transcriptomic, genomic, and clinical data from four independent HNSCC cohorts to define molecular subtypes and construct a machine learning-driven prognostic signature. By employing consensus clustering algorithms and ML models, we identified two robust subtypes (CS1 and CS2) and a 30-gene prognostic signature linked to immune infiltration and therapeutic sensitivity. These findings advance the paradigm of precision oncology in HNSCC, offering actionable insights to optimize therapeutic strategies and improve patient outcomes.
Materials and methods
Downloading and processing of raw data
We collected HNSCC datasets from the TCGA database, the GEO database, and previous literature according to the following criteria: HNSCC (including cancers of the oral cavity, pharynx, and larynx) as the research focus, availability of prognostic information, a dataset size of more than 80 cases, and maximal gene similarity between datasets. As a result, we collected four datasets: TCGA-HNSCC, GSE27020, GSE41613, and GSE65858 (Table 1). We acquired omics data on HNSCC from the TCGA-HNSCC cohort, including mRNA, lncRNA, microRNA, DNA methylation, somatic mutation, and clinical data. We retained only the 489 samples that had both the above omics data and clinical data. For mRNA and lncRNA data, we applied a log2 (TPM + 1) transformation to make the data more normally distributed. For microRNA data, we applied a log2 (RPM + 1) transformation for the same purpose. In addition, we acquired microarray and clinical data for three HNSCC cohorts from GEO (GSE27020, GSE41613, and GSE65858) [18,19,20,21]. For the microarray data, we applied a log2 ((RMA or RSN) + 1) transformation. Finally, we merged the cohorts and used the “ComBat” function from the “sva” package, which adjusts for batch effects within a parametric empirical Bayes framework, to remove the batch effect [22].
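The log transformation described above can be illustrated with a minimal sketch. It assumes a genes-by-samples matrix of non-negative values; the function name and the toy TPM values are hypothetical and are not taken from the study. Batch correction with ComBat is not reproduced here.

```python
import math


def log2_plus_one(matrix):
    """Apply log2(x + 1) to every entry of a genes-by-samples matrix."""
    return [[math.log2(value + 1.0) for value in row] for row in matrix]


# Hypothetical TPM values: two genes (rows) across three samples (columns).
tpm = [
    [0.0, 1.0, 3.0],
    [7.0, 15.0, 1023.0],
]

log_tpm = log2_plus_one(tpm)
for row in log_tpm:
    print(row)
```

The added pseudocount of 1 keeps zero counts defined after the transformation (log2(0 + 1) = 0) and compresses the long right tail typical of expression data.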
Multiomics consensus clustering analysis
The first step was to identify elite genes via the “getElites” function from the “MOVICS” package, which aims to reduce data dimension for multi-omics integrative clustering analysis [23]. For continuous variables (mRNA, lncRNA, microRNA, and DNA methylation data), we selected the top 1500 genes with the highest degree of variation by setting the “mad” parameter and used univariate Cox analysis to identify prognostic-related genes (PRGs) in each cohort (p < 0.05). For discrete variables (somatic mutation data), we selected the top 5% of genes with the highest mutation frequency by setting the “freq” parameter [24]. The optimal number of clusters was determined by evaluating the sum of the Clustering Prediction Index (CPI) and gap statistics [25]. Next, we obtained the optimal number of clusters with the “getClustNum” function from the “MOVICS” package and performed cluster analysis with the “getMOIC” function from the “MOVICS” package, which contains 10 clustering algorithms (SNF, CIMLR, PINSPlus, NEMO, COCA, MoCluster, LRAcluster, ConsensusClustering, IntNMF, and iClusterBayes) [25,26,27,28]. Lastly, we borrowed the idea of consensus clustering to integrate the clustering results of multiple algorithms to improve the reliability and stability of the clustering.
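The two feature-selection rules described above (highest median absolute deviation for continuous data, highest mutation frequency for binary mutation data) can be sketched as follows. This is an illustration of the idea only, not the “getElites” implementation; the function names, gene labels, and toy values are hypothetical.

```python
from statistics import median


def mad(values):
    """Median absolute deviation of a list of numbers."""
    center = median(values)
    return median([abs(v - center) for v in values])


def top_by_mad(expression, n_top):
    """Return the n_top feature names with the largest MAD.

    expression: dict mapping feature name -> list of values across samples.
    """
    ranked = sorted(expression, key=lambda g: mad(expression[g]), reverse=True)
    return ranked[:n_top]


def top_by_mutation_frequency(mutations, fraction):
    """Return the most frequently mutated genes.

    mutations: dict mapping gene -> list of 0/1 calls across samples.
    fraction: share of genes to keep (0.05 would correspond to the top 5%).
    """
    freq = {g: sum(calls) / len(calls) for g, calls in mutations.items()}
    n_keep = max(1, int(len(freq) * fraction))
    ranked = sorted(freq, key=freq.get, reverse=True)
    return ranked[:n_keep]


# Hypothetical toy data.
expression = {
    "GENE_A": [1, 1, 1, 1, 1],
    "GENE_B": [1, 2, 3, 4, 5],
    "GENE_C": [0, 5, 10, 15, 20],
}
mutations = {
    "GENE_A": [1, 1, 1, 0],
    "GENE_B": [0, 0, 0, 0],
    "GENE_C": [1, 0, 0, 0],
    "GENE_D": [1, 1, 0, 0],
}

variable_genes = top_by_mad(expression, 2)
frequent_genes = top_by_mutation_frequency(mutations, 0.5)
print(variable_genes, frequent_genes)
```

In the study the corresponding settings were the top 1500 features by MAD and the top 5% of genes by mutation frequency; the toy call uses smaller values only so that the example stays readable.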
Validation of molecular subtypes
We utilized the “runMarker” function to identify marker genes for each subtype, specifying a threshold of n.marker = 1000 and adjusting the p-value cutoff to p.adj.cutoff = 0.05. To ascertain the robustness of the identified subtypes, we validated the clustering results by leveraging the marker genes specific to each subtype within the verification set. Additionally, we compared these results with the consistency achieved by the Nearest Template Prediction (NTP) and Partition Around Medoids (PAM) classifiers, employing them for consensus clustering. This comprehensive approach ensured that the subtypes were not only well-defined but also consistent across different validation methods.
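The core idea of nearest template prediction can be sketched as follows, under simplifying assumptions: each subtype template is a binary vector marking that subtype's marker genes, samples are already standardized per gene, and a sample is assigned to the template with the smallest cosine distance. The permutation-based significance testing of the full NTP method is omitted, and all names and values below are hypothetical.

```python
import math


def cosine_distance(u, v):
    """1 - cosine similarity; returns 1.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


def nearest_template(sample, genes, markers):
    """Assign a sample to the subtype whose marker template is closest.

    sample: dict gene -> standardized expression value.
    genes: ordered list of all marker genes.
    markers: dict subtype -> set of marker genes for that subtype.
    """
    profile = [sample[g] for g in genes]
    distances = {}
    for subtype, marker_set in markers.items():
        template = [1.0 if g in marker_set else 0.0 for g in genes]
        distances[subtype] = cosine_distance(profile, template)
    return min(distances, key=distances.get), distances


# Hypothetical marker genes and standardized expression values.
genes = ["G1", "G2", "G3", "G4"]
markers = {"CS1": {"G1", "G2"}, "CS2": {"G3", "G4"}}
sample_a = {"G1": 1.5, "G2": 1.0, "G3": -1.0, "G4": -1.2}
sample_b = {"G1": -0.8, "G2": -1.1, "G3": 1.3, "G4": 0.9}

label_a, _ = nearest_template(sample_a, genes, markers)
label_b, _ = nearest_template(sample_b, genes, markers)
print(label_a, label_b)
```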
Gene expression profile, enrichment analysis, and immune cell infiltration
Firstly, we utilized Principal Component Analysis (PCA), T-distributed Stochastic Neighbor Embedding (TSNE), and Uniform Manifold Approximation and Projection (UMAP) analysis to find the gene distribution of cancer subtypes (CSs) [29]. We analyzed the gene expression patterns of genes in the follow-up model across different CSs, aiming to identify distinct regulatory patterns. Finally, we performed Gene Set Variation Analysis (GSVA) to evaluate the activation status of the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway in each CS and evaluated the level of immune cell infiltration in different CSs using the single-sample Gene Set Enrichment Analysis (ssGSEA) method [30].
Development of a prognostic signature
Given the limited sample size in some cohorts, we pooled the three GEO cohorts (GSE27020, GSE41613, and GSE65858) to form the META-HNSCC cohort. Aiming to develop a prognostic model with high accuracy and generalizability, we integrated 10 diverse machine learning methods: CoxBoost, stepwise Cox, Lasso, Ridge, elastic net (Enet), survival-SVMs, GBMs, SuperPC, plsRcox, and RSF [31,32,33,34]. The development pipeline of the machine learning-driven signature was as follows. First, we performed univariate Cox analysis within the TCGA-HNSCC and META-HNSCC datasets. Genes with p < 0.05 and the same hazard ratio (HR) direction in both datasets were considered PRGs. Then, 101 combinations of the 10 machine learning algorithms were used to develop candidate signatures. Finally, after fitting each model on the training set, we tested it across all validation cohorts. For each model, we computed the average C-index across cohorts and considered the model with the highest average C-index to be the optimal one.
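The model-selection criterion described above (highest average C-index across cohorts) can be sketched as follows. The example implements Harrell's concordance index for right-censored data in plain Python and picks the best of two candidate models. The cohort names, model names, and all values are hypothetical, and ties in survival time are simply treated as non-comparable.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored survival data.

    A pair (i, j) is comparable when subject i had an observed event and a
    shorter time than subject j. The pair is concordant when the subject
    with the shorter time also has the higher risk score.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # censored subjects cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def mean_c_index(cohorts, predictions):
    """Average the C-index of one model over all cohorts."""
    values = [
        concordance_index(times, events, predictions[name])
        for name, (times, events) in cohorts.items()
    ]
    return sum(values) / len(values)


# Hypothetical cohorts: (survival times, event indicators).
cohorts = {
    "cohort_1": ([5, 10, 15, 20], [1, 1, 0, 1]),
    "cohort_2": ([2, 4, 6], [1, 1, 1]),
}
# Hypothetical risk scores predicted by two candidate models.
model_predictions = {
    "model_x": {"cohort_1": [4, 3, 2, 1], "cohort_2": [1, 3, 2]},
    "model_y": {"cohort_1": [4, 1, 3, 2], "cohort_2": [3, 2, 1]},
}

mean_c = {m: mean_c_index(cohorts, p) for m, p in model_predictions.items()}
best_model = max(mean_c, key=mean_c.get)
print(mean_c, best_model)
```

Averaging over cohorts, rather than taking the best single-cohort value, penalizes models that fit one cohort well but generalize poorly to the others.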
Prognostic value and clinical application
We predicted the risk score using the linear predictor method based on the “predict” function in the “stats” package. We selected the best cutoff based on the “surv_cutpoint” function in the “survminer” package to classify into high- and low-risk groups. Survival analyses were conducted among these distinct groups to evaluate the prognostic significance of the signature. Additionally, we conducted a search for 19 prognostic signatures pertinent to HNSCC and computed scores for each sample [35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53]. The predictive capability of each signature within each cohort was assessed using the C-index. To compare the predictive power of the model with traditional clinical characteristics, the C-index was employed. Finally, we constructed a nomogram containing traditional clinical information and this model to predict the survival of HNSCC patients.
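The risk-score step can be illustrated with a minimal sketch: a linear predictor is the sum of each gene's expression multiplied by its model coefficient, and samples are then split at a cutoff. The coefficients, gene labels, and cutoff below are hypothetical. The data-driven cutoff search performed by “surv_cutpoint” is not implemented here, so the cutoff is passed in as a fixed value.

```python
def linear_predictor(expression, coefficients):
    """Risk score = sum over genes of coefficient * expression."""
    return sum(coefficients[g] * expression[g] for g in coefficients)


def stratify(scores, cutoff):
    """Label each sample 'high' if its score exceeds the cutoff, else 'low'."""
    return {s: ("high" if v > cutoff else "low") for s, v in scores.items()}


# Hypothetical coefficients and expression values (not from the study).
coefficients = {"GENE_A": 0.5, "GENE_B": -0.25}
samples = {
    "S1": {"GENE_A": 2.0, "GENE_B": 4.0},
    "S2": {"GENE_A": 4.0, "GENE_B": 2.0},
    "S3": {"GENE_A": 1.0, "GENE_B": 8.0},
}

risk_scores = {s: linear_predictor(x, coefficients) for s, x in samples.items()}
risk_groups = stratify(risk_scores, cutoff=0.5)
print(risk_scores, risk_groups)
```

A positive coefficient marks a gene whose higher expression raises the risk score, and a negative coefficient marks a protective gene.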
Exploration of molecular functions, pathways, and gene mutations
We obtained differentially expressed genes (DEGs) between the risk groups by differential expression analysis (|log2FC| ≥ 1 and FDR < 0.05) using the R package “limma”. Then, Gene Ontology (GO) and KEGG analyses were performed on the identified DEGs to clarify the biological functions and signaling pathways (p < 0.05) [34, 54, 55]. To explore the genomic characteristics of HNSCC and their correlation with the risk score, the R package “Maftools” was employed [56].
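The DEG thresholds above can be illustrated with a short sketch of the filtering step only. It assumes that per-gene log2 fold changes and raw p-values are already available (the moderated statistics computed by limma are not reproduced) and that the FDR is obtained with the Benjamini-Hochberg procedure, which is an assumption about the adjustment method. Gene labels and values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so monotonicity can be enforced.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_degs(results, lfc_cutoff=1.0, fdr_cutoff=0.05):
    """Keep genes with |log2FC| >= lfc_cutoff and FDR < fdr_cutoff.

    results: list of (gene, log2FC, raw p-value) tuples.
    """
    fdr = benjamini_hochberg([p for _, _, p in results])
    return [
        gene
        for (gene, lfc, _), q in zip(results, fdr)
        if abs(lfc) >= lfc_cutoff and q < fdr_cutoff
    ]


# Hypothetical differential expression results.
results = [
    ("GENE_A", 2.0, 0.001),
    ("GENE_B", -1.5, 0.01),
    ("GENE_C", 0.4, 0.002),
    ("GENE_D", 1.2, 0.5),
]

fdr_values = benjamini_hochberg([p for _, _, p in results])
degs = select_degs(results)
print(fdr_values, degs)
```

Note that the absolute value applies to the fold change alone, so strongly down-regulated genes such as the hypothetical GENE_B pass the filter as well.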
Exploration of the immune landscape
Various algorithms were utilized to deconvolve and quantify the immune cell composition within the tumor microenvironment (TME) [57,58,59,60,61,62]. To further examine the immune function of the distinct groups, the ssGSEA approach was utilized. In addition, we analyzed the expression patterns of immune checkpoint genes (ICGs) in the distinct groups. Furthermore, the TIDE approach was used to estimate the likelihood of tumor immune escape in HNSCC, thereby informing decisions regarding immunotherapy [63]. TIDE is a computational framework developed to evaluate the potential for tumor immune escape from the gene expression profiles of cancer samples.
Prediction of treatment sensitivity
Apart from immunotherapy, radiotherapy, chemotherapy, and targeted therapies are conventional treatments for cancer. We used the ssGSEA algorithm to score two radiotherapy biomarkers to predict response to radiotherapy. In addition, the “oncoPredict” package was employed to predict the sensitivity of patients in the different groups to conventional chemotherapeutic drugs (5-Fluorouracil, Cisplatin, Docetaxel, Oxaliplatin, and Paclitaxel) and targeted therapy drugs [64]. The “oncoPredict” package builds drug response models from screening data that pair bulk RNA-Seq profiles with a drug response metric.
Exploration of scRNA
We utilized the TISCH2 database to meticulously analyze the expression patterns of the signature genes in distinct cell populations isolated from HNSCC patients enrolled in the GSE103322 study [65]. This approach allowed us to delve into the TME of HNSCC at a single-cell resolution. The TISCH2 database serves as a valuable repository for the interpretation of scRNA-seq data, offering comprehensive gene expression profiles of diverse immune cells within the TME. We aimed to gain deeper insights into the potential involvement of distinct immune cell subsets in the pathogenesis and progression of HNSCC [66].
Quantitative RT-PCR analysis
The np69 and FaDu cell lines were obtained from Wuhan Pricella Biotechnology Co., Ltd. and cultured in DMEM. Total RNA was extracted from the cells using TRIzol reagent (Invitrogen). A PrimeScript™ RT Reagent Kit (Takara, Japan) was used for reverse transcription, and PowerUp SYBR Green Master Mix (Thermo Fisher Scientific) was used to perform qRT-PCR according to the manufacturer’s instructions. The primer sequences are shown in Table S5.
Results
Recognition of multiple cancer subtypes
The workflow of our research is outlined in Fig. 1. We checked the batch correction by comparing the data before and after adjustment using PCA (Fig. S1A and B). Using 10 multi-omics integrated clustering algorithms, we determined the optimal number of clusters and recognized two distinct subtypes (Fig. 2A and B). Subsequently, we integrated molecular expression patterns across various transcriptomes (including mRNAs, lncRNAs, and miRNAs), epigenetic methylation patterns, and somatic mutations using a consensus-based approach (Fig. 2C and E). Our classification system was significantly associated with overall survival (OS) (p = 0.011; Fig. 2F). Notably, CS1 showed a more favorable survival outcome than CS2.
Validation of molecular subtypes
We identified 1000 genes uniquely overexpressed in each subtype as classifiers. These genes were subsequently validated in an external cohort to confirm the stability of the identified subtypes. Comparable results were observed in the META-HNSCC cohort (Fig. 3A). Notably, the CS1 subtype within the META-HNSCC cohort, which encompassed three datasets, exhibited the most favorable prognosis among all subtypes (p < 0.001; Fig. 3B). Additionally, the NTP successfully categorized each sample in the external cohort into the previously identified CSs. The consistency of the CSs with both the NTP and PAM methods was further evaluated and found to be statistically significant (p < 0.001; Fig. 3C and E). PCA, tSNE and UMAP methods showed significant differences in mRNA expression profiles between groups (Fig. 3F and H).
Gene expression profile, enrichment analysis, and immune cell infiltration
CS2 showed considerably higher expression levels of the genes included in the subsequent prognostic model than CS1, indicating a stronger association between these genes and CS2 (Fig. 3I). Clinical analysis showed that CS2 was associated with older age, female sex, absence of metastasis, and other clinical characteristics (Fig. 4A). The GSVA results indicated that CS1 was enriched in pathways related to glycosaminoglycan biosynthesis, chondroitin sulfate, MAPK signaling pathway, regulation of actin cytoskeleton, gap junction, and glioma. CS2 was significantly enriched in pathways related to maturity onset diabetes of the young, linoleic acid metabolism, steroid hormone biosynthesis, retinol metabolism, and metabolism of xenobiotics by cytochrome P450 (Fig. 4B). The ssGSEA algorithm yielded a higher level of immune cell infiltration in CS2 (Fig. 4C).
Development of the machine learning-driven signature
The development pipeline of the machine learning-driven signature was as follows. First, we performed univariate Cox analysis within the TCGA-HNSCC and META-HNSCC datasets. Genes with p < 0.05 and the same HR direction in both datasets were considered PRGs. We screened 135 PRGs from the TCGA-HNSCC and META-HNSCC cohorts, and their expression was significantly associated with OS. Among them, 12 PRGs were risk factors, while 18 PRGs were protective factors. Then, 101 combinations of 10 machine learning algorithms were used to develop candidate signatures. Finally, after fitting each model on the training set, we tested it across all validation cohorts. For each model, we computed the average C-index and considered the model with the highest average C-index to be the optimal one (Fig. 5A). The model incorporating Enet [alpha = 0.1] demonstrated the highest average C-index and was chosen to construct the final model, which was based on 30 PRGs (Fig. 5B and C). Subsequently, we calculated risk scores for individual samples in all cohorts. Notably, the high-risk group within both the TCGA-HNSCC and META-HNSCC sets displayed a less favorable clinical outcome (Fig. 5D and E). To validate the prognostic significance of the genes included in the model, we performed Kaplan-Meier analysis in HNSCC. These results were largely consistent with those derived from the Cox analysis (Fig. S2 and S3). Our findings also indicated a significant association of these genes with disease-specific survival (DSS) and progression-free survival (PFS) in HNSCC, underscoring their prognostic relevance to patient outcomes (Fig. S4 and S5).
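For reference, the alpha value reported for the selected Enet model can be read through the elastic net penalty. The expression below is a sketch that assumes the glmnet-style parameterization (the text does not state which implementation was used), with \ell(\beta) denoting the Cox partial log-likelihood, n the number of samples, and \lambda the tuning parameter:

```latex
\hat{\beta} = \arg\max_{\beta} \left[ \frac{2}{n}\,\ell(\beta)
  - \lambda \left( \alpha \lVert \beta \rVert_1
  + \frac{1-\alpha}{2} \lVert \beta \rVert_2^2 \right) \right]
```

Under this assumption, alpha = 0.1 places most of the penalty weight on the ridge (L2) term, \lambda ( 0.1 \lVert \beta \rVert_1 + 0.45 \lVert \beta \rVert_2^2 ), so the fit behaves close to ridge regression while the small lasso (L1) component still allows some coefficients to shrink to exactly zero.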
Comparison of prognostic signatures
In order to conduct a comparison between the prognostic signature and others, we examined 17 different published models. These published models are related to various biological processes, such as angiogenesis, hypoxia, pyroptosis, circadian regulation, fatty acid metabolism, necroptosis, immune response, ferroptosis, etc. Remarkably, the prognostic signature exhibited superior C-index performance compared to all models in both the TCGA-HNSCC and META-HNSCC sets (Fig. 6A and B). Univariate and multivariate Cox analyses confirmed that the risk score derived from the signature was an independent prognostic factor (Fig. 7A and B). Furthermore, the C-index analysis validated the enhanced prognostic efficacy of the signature over clinical characteristics (Fig. 7C). To enable accurate prediction of HNSCC survival, a nomogram was developed integrating the prognostic model and clinical characteristics (Fig. 7D). These findings collectively underscore the robust predictive power and clinical utility of our prognostic signature in the context of HNSCC, positioning it as a valuable tool for informing patient outcomes and treatment strategies.
Exploration of molecular functions, pathways, and gene mutations
A total of 704 DEGs were identified between the risk groups. BP terms for the high-risk group were related to muscle system processes, muscle contraction, and muscle organ development. CC terms for the high-risk group were associated with sarcomere, myofibril, and contractile fiber. MF terms for the high-risk group were related to actin binding, structural constituents of muscle, and actin filament binding (Fig. 8A and Table S1). KEGG analysis indicated that the high-risk group was enriched in pathways including hypertrophic cardiomyopathy, dilated cardiomyopathy, cardiac muscle contraction, motor proteins, and adrenergic signaling in cardiomyocytes (Fig. 8B and Table S2). The molecular functions and pathways of the high-risk group mainly focused on cell-molecule interactions, metabolic and uptake processes, and immune and inflammatory responses. BP terms for the low-risk group were related to keratinization, epidermal cell differentiation, and skin development. CC terms for the low-risk group were related to the cornified envelope, apical plasma membrane, and apical part of the cell. MF terms for the low-risk group were related to endopeptidase inhibitor activity, structural constituents of skin epidermis, and peptidase inhibitor activity (Fig. 8C and Table S3). KEGG analysis showed that the low-risk group was enriched in pathways including primary immunodeficiency, salivary secretion, Staphylococcus aureus infection, linoleic acid metabolism, and arachidonic acid metabolism (Fig. 8D and Table S4). The molecular functions and pathways in the low-risk group were mainly focused on immunity and infection, metabolism and nutrition, endocrinology, and signaling. Moreover, the top 10 mutated genes were identified, and the frequency of mutations was higher in the high-risk group (Fig. 8E and F).
Exploration of the immune landscape
The risk score was negatively correlated with the infiltration of B cells, mast cells, monocytes, myeloid dendritic cells, neutrophils, and T cells (Fig. 9). Several immune functions were activated in the low-risk group, including CCR, cytolytic activity, inflammation-promoting, and T cell co-stimulation (Fig. 10A). Expression of ICGs, including CTLA4, PDCD1, TIGIT, and IDO1, was also higher in the low-risk group (Fig. 10B). The low-risk patients had markedly lower TIDE scores, indicating greater sensitivity to immunotherapy (p < 0.001; Fig. 10C). Finally, we found that combining the TIDE score and risk score could better predict patient prognosis (Fig. 10D and E).
Identification of drugs
Two radiotherapy-associated biomarkers (cell cycle and hypoxia) were markedly enriched in the high-risk patients, suggesting greater suitability for radiation treatment (Fig. 11A). Four commonly used chemotherapeutic drugs were more sensitive in the high-risk group, implying a greater likelihood of benefit from chemotherapy (Fig. 11B). Most EGFR antagonists were more sensitive in the low-risk group, indicating that targeted EGFR therapy is more appropriate for low-risk patients (Fig. 11C).
Exploration of scRNA
The GSE103322 dataset contains 5,902 single cells from 18 HNSCC patients, including 5 matched pairs of primary tumors and lymph node metastases. Figure 12A illustrates the distribution of 20 distinct clusters; Fig. 12B displays the distribution of 11 cell types; Fig. 12C presents the proportions of these 11 cell types; and Fig. 12D shows the proportions of the 11 cell types across different patient samples. Notably, the expression levels of genes in the model were significantly elevated in myofibroblasts, fibroblasts, and mast cells (Fig. 12E and F).
Quantitative RT-PCR analysis
We detected the expression levels of 11 PRGs (APP, PTX3, INHBB, VSIG4, CHGB, CAMK2N1, ADAMTS1, TGM2, SHANK2, ANO1, PRSS12) in control and tumor cell groups using qRT-PCR analysis. The results showed that the mRNA expression levels of 10 PRGs (APP, PTX3, INHBB, VSIG4, CAMK2N1, ADAMTS1, TGM2, SHANK2, ANO1, PRSS12) were upregulated in the tumor group (Fig. S6).
Discussion
The rapid advancement of high-throughput sequencing technologies has driven advances in oncological research and helped us gain deeper knowledge of the intrinsic mechanisms and mutational features of tumorigenesis [67]. Zhu et al. analyzed the sequencing data of HNSCC from the GEO and TCGA databases to identify the potential role of pyroptosis-related genes [38]. Yin et al. used cell differentiation trajectories to identify HNSCC molecular classifications and gene signatures that forecast prognosis and immunotherapy response, offering more integrated predictions and perspectives for the treatment of HNSCC [36]. Zhu et al. constructed an inflammation-related model for HNSCC to predict survival outcomes and treatment response by analyzing transcriptomic data from HNSCC patients, providing a new entry point and direction for the investigation of HNSCC-related immunotherapy [46]. However, single-omics analyses make it difficult to investigate complex biological mechanisms in depth, and the resulting conclusions have limited reliability and persuasive power [68]. By analyzing multiple omics data types together, researchers are able to capture key biological signals that may be missed in a single-omics analysis, resulting in a more comprehensive understanding of the complex immune response mechanisms of disease. This approach can reveal interactions between different omics layers and help identify potential immune escape mechanisms or specific immunotherapeutic targets [69]. Machine learning algorithms offer several advantages over traditional algorithms, including the ability to handle complex and non-linear relationships, adaptability to changing conditions, efficient processing of large-scale data, and superior generalization capabilities [70].
Therefore, our study adopted multi-omics integrated analysis to provide evidence supporting precise and individualized treatment of HNSCC.
In this research, we identified two CSs (CS1 and CS2) using 10 multi-omics clustering algorithms, and CS1 showed the more favorable survival outcome. CS1 also had the better prognosis in the META-HNSCC cohort. The level of immune cell infiltration was significantly higher in CS2, suggesting that CS2 patients were more sensitive to immune responses. We also screened 135 PRGs from the TCGA-HNSCC and META-HNSCC cohorts using univariate Cox analysis. We further selected 30 PRGs to obtain a prognostic signature and construct the integration framework. Based on 101 algorithm combinations, we computed the average C-index for each model to evaluate its predictive power. Risk scores revealed that high-risk patients had a worse clinical prognosis. Compared to 17 other published signatures, the prognostic signature displayed strong predictive power in each of the cohorts. We performed functional enrichment analyses of DEGs between the risk groups and found that multiple cancer-related pathways were significantly activated in the high-risk group, indicating that these patients were more susceptible to cancer development. In addition, the top 10 mutated genes had higher mutation frequencies in the high-risk group.
We screened 135 PRGs from the TCGA-HNSCC and META-HNSCC cohorts, and their expression was significantly associated with OS. Among them, 12 PRGs were risk factors, while 18 PRGs were protective factors. Consistent with previous research, these PRGs have been shown to be significantly associated with prognosis and biological functions across a variety of tumor types, including HNSCC [71,72,73,74]. APP is an androgen-induced gene that promotes the proliferative activity of breast cancer cells and has been recognized as a potent prognostic factor in patients with ER-positive breast cancer [75]. PTX3 overexpression accelerates tumor metastasis and suggests poor prognosis in hepatocellular carcinoma by driving epithelial-mesenchymal transition [76]. VSIG4 expression is associated with poor prognosis in patients with progressive gastric cancer [77]. High ADAMTS1 expression is associated with reduced survival in patients with lymph node-negative breast cancer [78]. High TGM2 expression in LUSC is associated with poorer prognosis, is strongly associated with pro-tumor inflammation, and may increase susceptibility to immunotherapy [79]. Enhanced expression of ANO1 in HNSCC led to cell migration and is associated with poor prognosis [80]. CTSG inhibited proliferation and metastasis of HNSCC by blocking the JAK2/STAT3 pathway [81]. Overexpression of MASP1 in hepatocellular carcinoma cell lines significantly inhibited proliferation, invasion, and migration [82]. BTG3 protein expression might be considered a marker of good prognosis in epithelial ovarian carcinoma [83]. MEIS1 was downregulated in most tumors, and high MEIS1 expression predicted better OS in patients with HNSCC, adrenocortical carcinoma, and clear cell renal carcinoma [84].
These findings collectively highlight the intricate roles that PRGs play in modulating tumor biology and prognosis, providing valuable insights for future therapeutic strategies and prognostic assessments.
Using the ssGSEA and TIDE algorithms, we analyzed the immune status of the risk groups. Several immune functions were activated in the low-risk patients, and the expression of ICGs (CTLA4, PDCD1, TIGIT, and IDO1) was high in the low-risk patients. CTLA4 is a protein receptor that plays a vital role in regulatory T cell activation and tolerance [85]. Blocking the immunosuppressive effects of CTLA-4 stimulates the immune response, thereby affecting the proliferation of cancer cells. Ipilimumab, a CTLA-4 inhibitor, was the first drug proven to prolong OS in patients with advanced melanoma [86]. PDCD1 (PD-1) can be expressed on the surface of immune cells [87]. PD-1 regulates the immune system by down-regulating the immune response to human cells and inhibiting T-cell inflammatory activity [88]. Pembrolizumab (an anti-PD-1 antibody) as a first-line treatment strategy was shown to improve the prognosis of advanced HNSCC [89]. TIGIT is an inhibitory receptor shared by T and NK cells that suppresses tumor cell killing by NK and T cells [90]. TIGIT could lead to NK cell exhaustion during tumor progression, and further studies revealed that an anti-TIGIT monoclonal antibody could reverse NK cell exhaustion and be used in immunotherapy for a variety of tumors [91]. In addition, based on the results of available clinical studies, anti-TIGIT antibody drugs are regarded as enhancing the human immune response against cancer cells [92]. IDO1 is the rate-limiting enzyme in the metabolic conversion of tryptophan to kynurenine [93]. Overexpression of IDO1 in tumor tissues caused tryptophan depletion in the TME, which suppressed T-cell immune function and mediated tumor immune escape [94]. IDO inhibitors (including Epacadostat and Indoximod) are currently in clinical trials and are expected to be used in the future as molecular immunotherapeutic agents for cancer [95].
Moreover, the low-risk patients had markedly lower TIDE scores, indicating greater sensitivity to immunotherapy. The above findings suggest that immunotherapy may benefit the low-risk group.
The mainstays of clinical treatment for HNSCC are surgery, radiotherapy, and chemotherapy. In the majority of advanced HNSCC patients, the tumor can be removed by surgery; however, some patients still need chemotherapy after surgery to control disease progression [96]. We found that the high-risk group was more sensitive to radiotherapy, 5-fluorouracil (5-Fu), Cisplatin, Docetaxel, and Paclitaxel. 5-Fu is a broad-spectrum antimetabolite antitumor agent that is widely used as a classical chemotherapeutic agent for multiple malignancies [97]. As a first-line anticancer drug, cisplatin is employed in the treatment of various solid tumors, but studies have shown that it has significant toxic effects on the kidneys, limiting its clinical use [98]. Since 2010, taxane-based anticancer drugs such as docetaxel and paclitaxel have been applied to the treatment of HNSCC [99]. Despite advances in chemotherapy for HNSCC, it has not prolonged the OS of patients, and the prognosis is still poor. At present, targeted drug therapy has demonstrated great therapeutic potential in HNSCC [100]. Our further analyses revealed that low-risk patients were more sensitive to targeted therapies. High EGFR expression is associated with worse survival in HNSCC patients [101]. Zalutumumab is a monoclonal antibody against EGFR that inhibits tumor growth by blocking EGFR signaling in preclinical models [102]. Cetuximab, an anti-EGFR monoclonal antibody, combined with radiotherapy, was found to improve OS in some patients with locally advanced HNSCC [103].
Fibroblasts, particularly cancer-associated fibroblasts (CAFs), are critical components of the TME and play pivotal roles in the genesis and progression of HNSCC. Signaling molecules such as TGF-β, FGF, and VEGF secreted by CAFs can activate signaling pathways in HNSCC tumor cells and promote cell division and growth. CAFs can also attract endothelial cells into the HNSCC tumor microenvironment through the secretion of chemokines, further promoting angiogenesis. CAFs also enhance the migratory ability and invasiveness of tumor cells by regulating EMT processes in HNSCC tumor cells. Immunomodulatory factors such as TGF-β and IL-6 secreted by CAFs inhibit the activity of effector T cells and promote the accumulation of immunosuppressive cells such as regulatory T cells and myeloid-derived suppressor cells. By altering the physical properties of the tumor microenvironment, such as increasing the density of the extracellular matrix, CAFs limit the penetration of drugs into tumor tissues and reduce the effects of chemotherapy drugs and the killing effect of radiotherapy on tumor cells. In summary, these effects include promotion of tumor proliferation and growth, promotion of tumor angiogenesis, promotion of tumor metastasis and invasion, modulation of immune escape, and promotion of resistance to chemotherapy and radiotherapy [104]. Analysis of scRNA-seq data indicated that the expression of the genes in our model was significantly upregulated, predominantly in fibroblasts. This suggests that these genes may further influence tumor development through their interaction with fibroblasts.
Our study has several limitations that merit careful consideration. First, the retrospective design relying on pre-existing datasets (e.g., TCGA, GEO) inherently carries risks of selection bias. Since these data were derived from specific institutions, regions, or subpopulations, the cohorts may lack representativeness across diverse geographic, ethnic, or clinical contexts. This compromises the external validity of our findings and limits their generalizability to broader clinical settings. Second, incomplete or inconsistently annotated clinical data in public repositories, such as treatment histories or HPV status, restricted our ability to perform granular subgroup analyses. Such gaps could introduce systematic bias, potentially leading to overinterpretation of results and misalignment with real-world patient outcomes. Third, while our computational models identified robust molecular subtypes and prognostic signatures, the biological mechanisms underlying these findings remain speculative. Experimental validation—via functional assays or prospective clinical cohorts—is essential to confirm the causal roles of the 30 PRGs in HNSCC progression and therapy resistance.
Despite these limitations, our conclusions are strengthened by rigorous cross-validation using independent cohorts (e.g., META-HNSCC), which confirmed the reproducibility of the subtypes (CS1/CS2) and the prognostic signature. Clinicians should interpret these results cautiously, prioritizing supplementary validation in localized patient populations before implementing risk-stratified therapies. Moving forward, prospective multi-center studies with standardized data collection protocols and integrated multi-omics profiling will be critical to refine these models and translate them into actionable clinical tools.
Conclusion
In conclusion, we identified two CSs of HNSCC using multi-omics data and predicted the prognosis and treatment response of patients by constructing a model with 30 PRGs and the Enet [alpha = 0.1] algorithm. We found that the high-risk patients were more sensitive to radiotherapy, 5-Fu, cisplatin, docetaxel, and paclitaxel, whereas the low-risk patients were more sensitive to immunotherapy and targeted therapies. These findings provide valuable insights into the prognosis and personalized treatment of HNSCC patients and offer a promising new strategy for clinical practice.
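The way a signature of this kind assigns patients to risk groups can be sketched in Python as follows. The sketch assumes that the risk score is the linear predictor of the fitted model (the sum of each gene's coefficient multiplied by its expression) and that patients are split at the median score. The cutoff, gene names, coefficients, and expression values are hypothetical placeholders, not the 30 PRGs or the fitted values.

```python
from statistics import median


def risk_score(expression, coefficients):
    """Linear predictor: sum of coefficient * expression over the signature genes."""
    return sum(coefficients[gene] * expression[gene] for gene in coefficients)


def stratify(scores, cutoff=None):
    """Label each patient 'high' or 'low' relative to the cutoff (median by default)."""
    if cutoff is None:
        cutoff = median(scores.values())
    return {pid: ("high" if score > cutoff else "low") for pid, score in scores.items()}


# Hypothetical coefficients for three placeholder genes (not the paper's PRGs).
coefficients = {"GENE_A": 0.5, "GENE_B": -0.25, "GENE_C": 0.125}

# Hypothetical normalized expression values for four patients.
patients = {
    "P1": {"GENE_A": 2.0, "GENE_B": 1.0, "GENE_C": 0.0},
    "P2": {"GENE_A": 0.0, "GENE_B": 2.0, "GENE_C": 0.0},
    "P3": {"GENE_A": 1.0, "GENE_B": 0.0, "GENE_C": 4.0},
    "P4": {"GENE_A": 0.0, "GENE_B": 0.0, "GENE_C": 0.0},
}

scores = {pid: risk_score(expr, coefficients) for pid, expr in patients.items()}
groups = stratify(scores)
print(scores)
print(groups)
```

Under a glmnet-style parameterization of the elastic net, which is assumed here, alpha = 0.1 mixes the L1 and L2 penalties with most of the weight on the L2 (ridge) term, so relatively many genes tend to retain non-zero coefficients compared with a LASSO fit.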
Subtypes of HNSCC based on multi-omics. (A) Calculation of the cluster prediction index and gap statistic to select the number of multi-omics clusters for HNSCC; (B) Calculation of the Silhouette score to evaluate the sample similarity in each subgroup; (C) Visualization of multi-omics data; (D) Consensus heatmap for two clusters by the 10 multi-omics algorithms; (F) Kaplan-Meier analysis of OS in the two cluster subtypes
Molecular landscape for different CSs. (A) Validation of molecular subtypes in the META-HNSCC dataset; (B) Survival analysis in the META-HNSCC dataset; (C) The consistency between CSs and NTP in the TCGA-HNSCC dataset; (D) The consistency between CSs and PAM in the TCGA-HNSCC dataset; (E) The consistency between NTP and PAM in the META-HNSCC dataset; (F–H) PCA, tSNE, and UMAP validated two clusters. (I) The expression patterns in the various CSs
Construction of the prognostic signature. (A) C-index values of 101 machine learning algorithms in TCGA-HNSCC and META-HNSCC cohorts; (B) Selection of hub PRGs by the Enet [alpha = 0.1] algorithm; (C) Univariate Cox analysis of hub PRGs in TCGA-HNSCC and META-HNSCC cohorts; (D–E) Survival analysis of different groups in the TCGA-HNSCC and META-HNSCC cohorts
Construction of independent prognostic analyses and a clinical nomogram. (A) Univariate Cox analyses for the prognostic and clinical features; (B) Multivariate Cox analyses for the prognostic and clinical features of patients in the TCGA-HNSCC dataset; (C) C-index analysis of prognostic accuracy in the TCGA-HNSCC dataset; (D) Nomogram construction of the model and clinicopathological characteristics for HNSCC
The TME-related molecular characteristics of various groups. (A) Comparison of immune function between various groups; (B) Differences in the expression levels of ICGs between the two groups; (C) The low-risk patients demonstrated markedly lower TIDE scores; (D) Survival probability between H-TIDE and L-TIDE; (E) Survival probability between H-TIDE + high risk, H-TIDE + low risk, L-TIDE + high risk, and L-TIDE + low risk
Comparison of drug sensitivity between different groups. (A) Two radiotherapy-associated biomarkers (cell cycle and hypoxia) were enriched in the high-risk group; (B) Comparison of drug sensitivity (5-Fu, Cisplatin, Docetaxel, and Paclitaxel) between various groups; (C) The drug sensitivity of targeted therapies between different groups
The immune microenvironment and cellular communication characteristics of HNSCC at the single-cell level. (A and B) The distribution of various clusters and cell types from the GSE103322 cohort; (C and D) The proportion of different cell types in various samples; (E) The signature expression levels were significantly higher in myo-fibroblasts, fibroblasts, and mast cells in the GSE103322 cohort
Data availability
All results generated in this study can be obtained from the corresponding authors on reasonable request. The complete code and critical data are available on GitHub (https://github.com/JYfantast/HNSCC).
References
Johnson DE, Burtness B, Leemans CR, Lui V, Bauman JE, Grandis JR. Head and neck squamous cell carcinoma. Nat Rev Dis Primers. 2020;6(1):92.
Sung H, Ferlay J, Siegel RL, Laversanne M, Soerjomataram I, Jemal A, Bray F. Global cancer statistics 2020: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin. 2021;71(3):209–49.
Siegel RL, Miller KD, Wagle NS, Jemal A. Cancer statistics, 2023. CA Cancer J Clin. 2023;73(1):17–48.
Mody MD, Rocco JW, Yom SS, Haddad RI, Saba NF. Head and neck cancer. Lancet. 2021;398(10318):2289–99.
Kitamura N, Sento S, Yoshizawa Y, Sasabe E, Kudo Y, Yamamoto T. Current trends and future prospects of molecular targeted therapy in head and neck squamous cell carcinoma. Int J Mol Sci. 2020;22(1).
Almoshari Y. Development, therapeutic evaluation and theranostic applications of Cubosomes on cancers: an updated review. Pharmaceutics. 2022;14(3).
Galloway TJ, Ridge JA. Management of squamous Cancer metastatic to cervical nodes with an unknown primary site. J Clin Oncol. 2015;33(29):3328–37.
Dagher OK, Schwab RD, Brookens SK. Posey. Advances in cancer immunotherapies. Cell. 2023;186(8):1814.
Darvin P, Toor SM, Sasidharan NV, Elkord E. Immune checkpoint inhibitors: recent progress and potential biomarkers. Exp Mol Med. 2018;50(12):1–11.
Ferris RL, Blumenschein GJ, Fayette J, Guigay J, Colevas AD, Licitra L, Harrington K, Kasper S, Vokes EE, Even C, Worden F, Saba NF, Iglesias DL, Haddad R, Rordorf T, Kiyota N, Tahara M, Monga M, Lynch M, Geese WJ, Kopit J, Shaw JW. Gillison. Nivolumab for Recurrent Squamous-Cell Carcinoma of the Head and Neck. N Engl J Med. 2016;375(19):1856–67.
Burtness B, Harrington KJ, Greil R, Soulieres D, Tahara M, de Castro GJ, Psyrri A, Baste N, Neupane P, Bratland A, Fuereder T, Hughes B, Mesia R, Ngamphaiboon N, Rordorf T, Wan IW, Hong RL, Gonzalez MR, Roy A, Zhang Y, Gumuscu B, Cheng JD, Jin F, Rischin D. Pembrolizumab alone or with chemotherapy versus cetuximab with chemotherapy for recurrent or metastatic squamous cell carcinoma of the head and neck (KEYNOTE-048): a randomised, open-label, phase 3 study. Lancet. 2019;394(10212):1915–28.
Borel C, Jung AC, Burgy M. Immunotherapy breakthroughs in the treatment of recurrent or metastatic Head and Neck squamous cell carcinoma. Cancers (Basel). 2020;12(9).
Ferris RL, Haddad R, Even C, Tahara M, Dvorkin M, Ciuleanu TE, Clement PM, Mesia R, Kutukova S, Zholudeva L, Daste A, Caballero-Daroqui J, Keam B, Vynnychenko I, Lafond C, Shetty J, Mann H, Fan J, Wildsmith S, Morsli N, Fayette J. Licitra. Durvalumab with or without tremelimumab in patients with recurrent or metastatic head and neck squamous cell carcinoma: EAGLE, a randomized, open-label phase III study. Ann Oncol. 2020;31(7):942–50.
Ruffin AT, Li H, Vujanovic L, Zandberg DP, Ferris RL. Bruno. Improving head and neck cancer therapies by immunomodulation of the tumour microenvironment. Nat Rev Cancer. 2023;23(3):173–88.
Cramer JD, Burtness B, Ferris RL. Immunotherapy for head and neck cancer: recent advances and future directions. Oral Oncol. 2019;99:104460.
Sarah J, John D. Personalised cancer medicine. Int J Cancer. 2014;13(7).
Apostolia T, Elena MF, Mina N, Razelle K. Review of precision cancer medicine: evolution of the treatment paradigm. Cancer Treat Rev. 2020;86.
Fountzilas E, Kotoula V, Angouridakis N, Karasmanis I, Wirtz RM, Eleftheraki AG, Veltrup E, Markou K, Nikolaou A, Pectasides D, Fountzilas G. Identification and validation of a multigene predictor of recurrence in primary laryngeal cancer. PLoS ONE. 2013;8(8):e70429.
Lohavanichbutr P, Mendez E, Holsinger FC, Rue TC, Zhang Y, Houck J, Upton MP, Futran N, Schwartz SM, Wang P, Chen C. A 13-gene signature prognostic of HPV-negative OSCC: discovery and external validation. Clin Cancer Res. 2013;19(5):1197–203.
Zhao Y, Chen D, Yin J, Xie J, Sun CY, Lu M. Comprehensive Analysis of Tumor Immune Microenvironment Characteristics for the prognostic prediction and immunotherapy of oral squamous cell carcinoma. Front Genet. 2022;13:788580.
Wichmann G, Rosolowski M, Krohn K, Kreuz M, Boehm A, Reiche A, Scharrer U, Halama D, Bertolini J, Bauer U, Holzinger D, Pawlita M, Hess J, Engel C, Hasenclever D, Scholz M, Ahnert P, Kirsten H, Hemprich A, Wittekind C, Herbarth O, Horn F, Dietz A, Loeffler M. The role of HPV RNA transcription, immune response-related gene expression and disruptive TP53 mutations in diagnostic and prognostic profiling of head and neck cancer. Int J Cancer. 2015;137(12):2846–57.
Leek JT, Johnson WE, Parker HS, Jaffe AE, Storey JD. The sva package for removing batch effects and other unwanted variation in high-throughput experiments. Bioinformatics. 2012;28(6):882–3.
Lu X, Meng J, Zhou Y, Jiang L, Yan F. MOVICS: an R package for multi-omics integration and visualization in cancer subtyping. Bioinformatics. 2021;36(22–23):5539–41.
Chu G, Ji X, Wang Y, Niu H. Integrated multiomics analysis and machine learning refine molecular subtypes and prognosis for muscle-invasive urothelial cancer. Mol Ther Nucleic Acids. 2023;33:110–26.
Chalise P, Fridley BL. Integrative clustering of multi-level ‘omic data based on non-negative matrix factorization algorithm. PLoS ONE. 2017;12(5):e176278.
Yin J, Xu L, Wang S, Zhang L, Zhang Y, Zhai Z, Zeng P, Grzegorzek M, Jiang T. Integrating immune multi-omics and machine learning to improve prognosis, immune landscape, and sensitivity to first- and second-line treatments for head and neck squamous cell carcinoma. Sci Rep. 2024;14(1):31454.
Hoshida Y. Nearest template prediction: a single-sample-based flexible class prediction with confidence assessment. PLoS ONE. 2010;5(11):e15543.
Pierre-Jean M, Deleuze JF, Le Floch E, Mauger F. Clustering and variable selection evaluation of 13 unsupervised methods for multi-omics data integration. Brief Bioinform. 2020;21(6):2011–30.
Van Der Maaten L. Accelerating t-SNE using tree-based algorithms. J Mach Learn Res. 2014;15(1):3221–45.
Hanzelmann S, Castelo R, Guinney J. GSVA: gene set variation analysis for microarray and RNA-seq data. BMC Bioinformatics. 2013;14:7.
Friedman J, Hastie T, Tibshirani R. Regularization paths for generalized linear models via coordinate descent. J Stat Softw. 2010;33(1):1–22.
Bastien P, Bertrand F, Meyer N, Maumy-Bertrand M. Deviance residuals-based sparse PLS and sparse kernel PLS regression for censored data. Bioinformatics. 2015;31(3):397–404.
Bair E, Tibshirani R. Semi-supervised methods to predict patient survival from gene expression data. PLoS Biol. 2004;2(4):E108.
Gu Z, Eils R, Schlesner M. Complex heatmaps reveal patterns and correlations in multidimensional genomic data. Bioinformatics. 2016;32(18):2847–9.
Yuan Z, Huang J, Teh BM, Hu S, Hu Y, Shen Y. Exploration of a predictive model based on genes associated with fatty acid metabolism and clinical treatment for head and neck squamous cell carcinoma. J Clin Lab Anal. 2022;36(11):e24722.
Yin J, Zheng S, He X, Huang Y, Hu L, Qin F, Zhong L, Li S, Hu W, Zhu J. Identification of molecular classification and gene signature for predicting prognosis and immunotherapy response in HNSCC using cell differentiation trajectories. Sci Rep. 2022;12(1):20404.
Chen Y, Feng Y, Yan F, Zhao Y, Zhao H, Guo Y. A Novel Immune-related gene signature to identify the Tumor Microenvironment and Prognose Disease among patients with oral squamous cell carcinoma patients using ssGSEA: a Bioinformatics and Biological Validation Study. Front Immunol. 2022;13:922195.
Zhu W, Zhang J, Wang M, Zhai R, Xu Y, Wang J, Wang M, Zhang H, Liu L. Development of a prognostic pyroptosis-related gene signature for head and neck squamous cell carcinoma patient. Cancer Cell Int. 2022;22(1):62.
Li Z, Shen L, Li Y, Shen L, Li N. Identification of pyroptosis-related gene prognostic signature in head and neck squamous cell carcinoma. Cancer Med. 2022;11(24):5129–44.
Chen B, Han Y, Sheng S, Deng J, Vasquez E, Yau V, Meng M, Sun C, Wang T, Wang Y, Sheng M, Wu T, Wang X, Liu Y, Lin N, Zhang L. Shao. An angiogenesis-associated gene-based signature predicting prognosis and immunotherapy efficacy of head and neck squamous cell carcinoma patients. J Cancer Res Clin Oncol. 2024;150(2):91.
Qian X, Tang J, Chu Y, Chen Z, Chen L, Shen C, Li L. A Novel pyroptosis-related gene signature for Prognostic Prediction of Head and Neck squamous cell carcinoma. Int J Gen Med. 2021;14:7669–79.
Jin Y, Wang Z, Huang S, Liu C, Wu X, Wang H. Identify and validate circadian regulators as potential prognostic markers and immune infiltrates in head and neck squamous cell carcinoma. Sci Rep. 2023;13(1):19939.
Yanan L, Hui L, Zhuo C, Longqing D, Ran S. Comprehensive analysis of mitophagy in HPV-related head and neck squamous cell carcinoma. Sci Rep. 2023;13(1):7480.
Li C, Wang X, Qin R, Zhong Z, Sun C. Identification of a ferroptosis gene set that mediates the prognosis of squamous cell carcinoma of the head and neck. Front Genet. 2021;12:698040.
Shen Y, Li L, Lu Y, Zhang M, Huang X, Tang X. Establishment and validation of a comprehensive prognostic model for patients with HNSCC metastasis. Front Genet. 2021;12:685104.
Zhu L, Wang Y, Yuan X, Ma Y, Zhang T, Zhou F, Yu G. Effects of immune inflammation in head and neck squamous cell carcinoma: Tumor microenvironment, drug resistance, and clinical outcomes. Front Genet. 2022;13:1085700.
Zhang Z, Hu X, Qiu D, Sun Y, Lei L. Development and validation of a necroptosis-related prognostic model in head and neck squamous cell carcinoma. J Oncol. 2022;2022:8402568.
Huang J, Huo H, Lu R. A novel signature of necroptosis-associated genes as a potential prognostic tool for head and neck squamous cell carcinoma. Front Genet. 2022;13:907985.
Chen L, Zhang X, Lin J, Wen Y, Chen Y, Chen CB. Construction and validation of a prognostic model based on stage-associated signature genes of head and neck squamous cell carcinoma: a bioinformatics study. Ann Transl Med. 2022;10(24):1316.
Luo J, Huang Y, Wu J, Dai L, Dong M, Cheng B. A novel hypoxia-associated gene signature for prognosis prediction in head and neck squamous cell carcinoma. BMC Oral Health. 2023;23(1):864.
Shen S, Bai J, Wei Y, Wang G, Li Q, Zhang R, Duan W, Yang S, Du M, Zhao Y, Christiani DC, Chen F. A seven-gene prognostic signature for rapid determination of head and neck squamous cell carcinoma survival. Oncol Rep. 2017;38(6):3403–11.
Yang F, Zhou LQ, Yang HW, Wang YJ. Nine-gene signature and nomogram for predicting survival in patients with head and neck squamous cell carcinoma. Front Genet. 2022;13:927614.
Ribeiro IP, Esteves L, Caramelo F, Carreira IM, Melo JB. Integrated multi-omics signature predicts survival in head and neck cancer. Cells. 2022;11(16).
Wu T, Hu E, Xu S, Chen M, Guo P, Dai Z, Feng T, Zhou L, Tang W, Zhan L, Fu X, Liu S, Bo X, Yu G. clusterProfiler 4.0: a universal enrichment tool for interpreting omics data. Innov (Camb). 2021;2(3):100141.
Ritchie ME, Phipson B, Wu D, Hu Y, Law CW, Shi W, Smyth GK. Limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res. 2015;43(7):e47.
Mayakonda A, Lin DC, Assenov Y, Plass C, Koeffler HP. Maftools: efficient and comprehensive analysis of somatic variants in cancer. Genome Res. 2018;28(11):1747–56.
Newman AM, Liu CL, Green MR, Gentles AJ, Feng W, Xu Y, Hoang CD, Diehn M, Alizadeh AA. Robust enumeration of cell subsets from tissue expression profiles. Nat Methods. 2015;12(5):453–7.
Dienstmann R, Villacampa G, Sveen A, Mason MJ, Niedzwiecki D, Nesbakken A, Moreno V, Warren RS, Lothe RA, Guinney J. Relative contribution of clinicopathological variables, genomic markers, transcriptomic subtyping and microenvironment features for outcome prediction in stage II/III colorectal cancer. Ann Oncol. 2019;30(10):1622–9.
Aran D, Hu Z, Butte AJ. xCell: digitally portraying the tissue cellular heterogeneity landscape. Genome Biol. 2017;18(1):220.
Finotello F, Mayer C, Plattner C, Laschober G, Rieder D, Hackl H, Krogsdam A, Loncova Z, Posch W, Wilflingseder D, Sopper S, Ijsselsteijn M, Brouwer TP, Johnson D, Xu Y, Wang Y, Sanders ME, Estrada MV, Ericsson-Gonzalez P, Charoentong P, Balko J, de Miranda N. Trajanoski. Molecular and pharmacological modulators of the tumor immune contexture revealed by deconvolution of RNA-seq data. Genome Med. 2019;11(1):34.
Tamminga M, Hiltermann T, Schuuring E, Timens W, Fehrmann RS. Groen. Immune microenvironment composition in non-small cell lung cancer and its association with survival. Clin Transl Immunol. 2020;9(6):e1142.
Racle J, de Jonge K, Baumgaertner P, Speiser DE, Gfeller D. Simultaneous enumeration of cancer and immune cell types from bulk tumor gene expression data. Elife. 2017;6.
Fu J, Li K, Zhang W, Wan C, Zhang J, Jiang P, Liu XS. Large-scale public data reuse to model immunotherapy response and resistance. Genome Med. 2020;12(1):21.
Maeser D, Gruener RF, Huang RS. oncoPredict: an R package for predicting in vivo or cancer patient drug response and biomarkers from cell line screening data. Brief Bioinform. 2021;22(6).
Puram SV, Tirosh I, Parikh AS, Patel AP, Yizhak K, Gillespie S, Rodman C, Luo CL, Mroz EA, Emerick KS, Deschler DG, Varvares MA, Mylvaganam R, Rozenblatt-Rosen O, Rocco JW, Faquin WC, Lin DT, Regev A. Bernstein. Single-cell transcriptomic analysis of primary and metastatic Tumor ecosystems in Head and Neck Cancer. Cell. 2017;171(7):1611–24.
Sun D, Wang J, Han Y, Dong X, Ge J, Zheng R, Shi X, Wang B, Li Z, Ren P, Sun L, Yan Y, Zhang P, Zhang F, Li T, Wang C. TISCH: a comprehensive web resource enabling interactive single-cell transcriptome visualization of tumor microenvironment. Nucleic Acids Res. 2021;49(D1):D1420–30.
Song Q, Merajver SD, Li JZ. Cancer classification in the genomic era: five contemporary problems. Hum Genomics. 2015;9:27.
Zhao L, Dong Q, Luo C, Wu Y, Bu D, Qi X, Luo Y, Zhao Y. DeepOmix: a scalable and interpretable multi-omics deep learning framework and application in cancer survival analysis. Comput Struct Biotechnol J. 2021;19:2719–25.
Ivanisevic T, Sewduth RN. Multi-omics Integration for the design of Novel therapies and the identification of novel biomarkers. Proteomes. 2023;11(4).
Maceachern SJ. Forkert. Machine learning for precision medicine. Genome. 2021;64(4):416–25.
Zou G, Ren B, Liu Y, Fu Y, Chen P, Li X, Luo S, He J, Gao G, Zeng Z, Xiong W, Li G, Huang Y, Xu K. Zhang. Inhibin B suppresses anoikis resistance and migration through the transforming growth factor-beta signaling pathway in nasopharyngeal carcinoma. Cancer Sci. 2018;109(11):3416–27.
Yoshida R, Ohuchi N, Kimura N. Clinicopathological study of chromogranin A, B and BRCA1 expression in node-negative breast carcinoma. Oncol Rep. 2002;9(6):1363–7.
Heinze K, Rengsberger M, Gajda M, Jansen L, Osmers L, Oliveira-Ferrer L, Schmalfeldt B, Durst M, Hafner N, Runnebaum IB. CAMK2N1/RUNX3 methylation is an independent prognostic biomarker for progression-free and overall survival of platinum-sensitive epithelial ovarian cancer patients. Clin Epigenetics. 2021;13(1):15.
Liu S, Liu W, Ding Z, Yang X, Jiang Y, Wu Y, Liu Y, Wu J. Identification and validation of a novel tumor driver gene signature for diagnosis and prognosis of head and neck squamous cell carcinoma. Front Mol Biosci. 2022;9:912620.
Takagi K, Ito S, Miyazaki T, Miki Y, Shibahara Y, Ishida T, Watanabe M, Inoue S, Sasano H, Suzuki T. Amyloid precursor protein in human breast cancer: an androgen-induced gene associated with cell proliferation. Cancer Sci. 2013;104(11):1532–8.
Song T, Wang C, Guo C, Liu Q, Zheng X. Pentraxin 3 overexpression accelerated tumor metastasis and indicated poor prognosis in hepatocellular carcinoma via driving epithelial-mesenchymal transition. J Cancer. 2018;9(15):2650–8.
Kim SW, Roh J, Lee HS, Ryu MH, Park YS. Park. Expression of the immune checkpoint molecule V-set immunoglobulin domain-containing 4 is associated with poor prognosis in patients with advanced gastric cancer. Gastric Cancer. 2021;24(2):327–40.
Tan IA, Frewin K, Ricciardelli C, Russell DL. ADAMTS1 promotes adhesion to Extracellular Matrix proteins and predicts Prognosis in early stage breast Cancer patients. Cell Physiol Biochem. 2019;52(6):1553–68.
Chang W, Gao W, Liu D, Luo B, Li H, Zhong L, Chen Y. The upregulation of TGM2 is associated with poor prognosis and the shaping of the inflammatory tumor microenvironment in lung squamous cell carcinoma. Am J Cancer Res. 2024;14(6):2823–38.
Ruiz C, Martins JR, Rudin F, Schneider S, Dietsche T, Fischer CA, Tornillo L, Terracciano LM, Schreiber R, Bubendorf L. Kunzelmann. Enhanced expression of ANO1 in head and neck squamous cell carcinoma causes cell migration and correlates with poor prognosis. PLoS ONE. 2012;7(8):e43265.
Hua H, Yang X, Meng D, Gan R, Chen N, He L, Wang D, Jiang W, Si D, Wang X, Zhang X, Wei X, Wang Y, Li B, Zhang H, Gao C. CTSG restraines the proliferation and metastasis of head and neck squamous cell carcinoma by blocking the JAK2/STAT3 pathway. Cell Signal. 2024;127:111562.
Yu H, Wang C, Ke S, Xu Y, Lu S, Feng Z, Bai M, Qian B, Xu Y, Li Z, Yin B, Li X, Hua Y, Zhou M, Li Z, Fu Y, Ma Y. An integrative pan-cancer analysis of MASP1 and the potential clinical implications for the tumor immune microenvironment. Int J Biol Macromol. 2024;280(Pt 3):135834.
Deng B, Zhao Y, Gou W, Chen S, Mao X, Takano Y, Zheng H. Decreased expression of BTG3 was linked to carcinogenesis, aggressiveness, and prognosis of ovarian carcinoma. Tumour Biol. 2013;34(5):2617–24.
Li H, Tang Y, Hua L, Wang Z, Du G, Wang S, Lu S, Li W. A systematic pan-cancer analysis of MEIS1 in human tumors as prognostic biomarker and immunotherapy target. J Clin Med. 2023;12(4).
Wing K, Onishi Y, Prieto-Martin P, Yamaguchi T, Miyara M, Fehervari Z, Nomura T, Sakaguchi S. CTLA-4 control over Foxp3 + regulatory T cell function. Science. 2008;322(5899):271–5.
Mcdermott D, Haanen J, Chen TT, Lorigan P, O’Day S. Efficacy and safety of ipilimumab in metastatic melanoma patients surviving more than 2 years following treatment in a phase III trial (MDX010-20). Ann Oncol. 2013;24(10):2694–8.
Bardhan K, Anagnostou T, Boussiotis VA. The PD1:PD-L1/2 pathway from discovery to clinical implementation. Front Immunol. 2016;7:550.
Pardoll DM. The blockade of immune checkpoints in cancer immunotherapy. Nat Rev Cancer. 2012;12(4):252–64.
de Sousa LG, Ferrarotto R. Pembrolizumab in the first-line treatment of advanced head and neck cancer. Expert Rev Anticancer Ther. 2021;21(12):1321–31.
Gur C, Ibrahim Y, Isaacson B, Yamin R, Abed J, Gamliel M, Enk J, Bar-On Y, Stanietsky-Kaynan N, Coppenhagen-Glazer S, Shussman N, Almogy G, Cuapio A, Hofer E, Mevorach D, Tabib A, Ortenberg R, Markel G, Miklic K, Jonjic S, Brennan CA, Garrett WS, Bachrach G. Mandelboim. Binding of the Fap2 protein of Fusobacterium nucleatum to human inhibitory receptor TIGIT protects tumors from immune cell attack. Immunity. 2015;42(2):344–55.
Zhang Q, Bi J, Zheng X, Chen Y, Wang H, Wu W, Wang Z, Wu Q, Peng H, Wei H, Sun R, Tian Z. Blockade of the checkpoint receptor TIGIT prevents NK cell exhaustion and elicits potent anti-tumor immunity. Nat Immunol. 2018;19(7):723–32.
Chu X, Tian W, Wang Z, Zhang J, Zhou R. Co-inhibition of TIGIT and PD-1/PD-L1 in Cancer Immunotherapy: mechanisms and clinical trials. Mol Cancer. 2023;22(1):93.
Pallotta MT, Rossini S, Suvieri C, Coletti A, Orabona C, Macchiarulo A, Volpi C, Grohmann U. Indoleamine 2,3-dioxygenase 1 (IDO1): an up-to-date overview of an eclectic immunoregulatory enzyme. FEBS J. 2022;289(20):6099–118.
Joyce JA, Fearon DT. T cell exclusion, immune privilege, and the tumor microenvironment. Science. 2015;348(6230):74–80.
Zhai L, Ladomersky E, Lenzen A, Nguyen B, Patel R, Lauing KL, Wu M. Wainwright. IDO1 in cancer: a Gemini of immune checkpoints. Cell Mol Immunol. 2018;15(5):447–57.
Worsham MJ. Identifying the risk factors for late-stage head and neck cancer. Expert Rev Anticancer Ther. 2011;11(9):1321–5.
Longley DB, Harkin DP, Johnston PG. 5-fluorouracil: mechanisms of action and clinical strategies. Nat Rev Cancer. 2003;3(5):330–8.
Shaloam D. T. Paul Bernard. Cisplatin in cancer therapy: molecular mechanisms of action. Eur J Pharmacol. 2014;740.
Wang T, Yu J, Liu M, Chen Y, Zhu C, Lu L, Wang M, Min L, Liu X, Zhang X, Gubat JA, Chen Y. The benefit of taxane-based therapies over fluoropyrimidine plus platinum (FP) in the treatment of esophageal cancer: a meta-analysis of clinical studies. Drug Des Devel Ther. 2019;13:539–53.
Kozakiewicz P, Grzybowska-Szatkowska L. Application of molecular targeted therapies in the treatment of head and neck squamous cell carcinoma. Oncol Lett. 2018;15(5):7497–505.
Bossi P, Resteghini C, Paielli N, Licitra L, Pilotti S. Perrone. Prognostic and predictive value of EGFR in head and neck squamous cell carcinoma. Oncotarget. 2016;7(45):74362–79.
Overdijk MB, Verploegen S, Brakel JVD, Bueren JLV, Rigter G, Vink T, Winkel JGJ, Parren PWHI, Bleeker WK. Role of ADCC in the in vivo antitumor effects of zalutumumab, a human anti-EGF receptor antibody. J Clin Oncol. 2010;28.
Patel AN, Mehnert JM, Kim S. Treatment of recurrent metastatic head and neck cancer: focus on cetuximab. Clin Med Insights Ear Nose Throat. 2012;5:1–16.
Raudenska M, Balvan J, Hanelova K, Bugajova M, Masarik M. Cancer-associated fibroblasts: mediators of head and neck tumor microenvironment remodeling. Biochim Biophys Acta Rev Cancer. 2023;1878(5):188940.
Funding
Not applicable.
Author information
Contributions
G.Q. and X.L. wrote the main manuscript text, and G.Q., X.L., and C.L. prepared Figs. 1, 2 and 3. All authors reviewed the manuscript.
Ethics declarations
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Competing interests
The authors declare no competing interests.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
About this article
Cite this article
Luo, X., Li, C. & Qin, G. Multiple machine learning-based integrations of multi-omics data to identify molecular subtypes and construct a prognostic model for HNSCC. Hereditas 162, 17 (2025). https://doiorg.publicaciones.saludcastillayleon.es/10.1186/s41065-025-00380-0
DOI: https://doiorg.publicaciones.saludcastillayleon.es/10.1186/s41065-025-00380-0