December Integration of Multi-omics Data for the Classification of Glioma Types and Identification of Novel Biomarkers Francisca G. Vieira 0 Regina Bispo 0 1 Marta B. Lopes marta.lopes@fct.unl.pt 0 1 2 Center for Mathematics and Applications (NOVA Math), NOVA School of Science and Technology , Portugal Department of Mathematics, NOVA School of Science and Technology , Portugal UNIDEMI, Department of Mechanical and Industrial Engineering, NOVA School of Science and Technology , Portugal 2023 23 2023

Glioma is currently one of the most prevalent types of primary brain cancer. Given its high level of heterogeneity along with the complex biological molecular markers, many efforts have been made to accurately classify the type of glioma in each patient, which, in turn, is critical to improve early diagnosis and increase survival. Nonetheless, as a result of the fast-growing technological advances in high throughput sequencing and evolving molecular understanding of glioma biology, its classification has been recently subject to significant alterations. In this study, we integrate multiple glioma omics modalities (including mRNA, DNA methylation, and miRNA) from The Cancer Genome Atlas (TCGA), while using the revised glioma reclassified labels, with a supervised method based on sparse canonical correlation analysis (DIABLO) to discriminate between glioma types. We were able to find a set of highly correlated features distinguishing glioblastoma from lower-grade gliomas (LGG) that were mainly associated with the disruption of receptor tyrosine kinases signaling pathways and extracellular matrix organization and remodeling. On the other hand, the discrimination of the LGG types was characterized primarily by features involved in ubiquitination and DNA transcription processes. Furthermore, we could identify several novel glioma biomarkers likely helpful in both diagnosis and prognosis of the patients, including the genes PPP1R8, GPBP1L1, KIAA1614, C14orf23, CCDC77, BVES, EXD3, CD300A and HEPN1. Overall, this classification method allowed to discriminate the different TCGA glioma patients with very high performance, while seeking for common information across multiple data types, ultimately enabling the understanding of essential mechanisms driving glioma heterogeneity and unveiling potential therapeutic targets.

Canonical Correlation Analysis Classification Glioma Multi-omics Survival
Introduction

that capture biological sources of common variability across various omics datasets. These techniques include, for instance, sparse variants of both Canonical Correlation Analysis (CCA) [ 10, 11 ] and Multi-block Partial Least Squares (PLS) [ 12 ], joint Non-negative Matrix Factorization (NMF) [ 13 ], Joint and Individual Variation Explained (JIVE) [ 14, 15 ] and Multiomics Factor Analysis (MOFA) [ 16, 17 ]. Furthermore, supervised extensions of CCA have been recently suggested in a classification framework to identify relevant multi-view features that are not only highly correlated but also optimally separate subjects into distinct groups [ 18, 19, 20 ]. In particular, Data Integration Analysis for Biomarker discovery using Latent cOmponents (DIABLO) [ 18 ] has been used in multiple omics applications, including the identification of disease mechanisms and biomarkers [ 21, 22 ]. Moreover, a recent benchmark analysis in cancer type classification [ 23 ] revealed that DIABLO performed significantly better than several other methods, including MOFA/MOFA+ [ 17 ], principal component analysis (PCA) and iClusterBayes/iClusterPlus/iCluster [ 24 ].

In this work, an integrative multi-layer omics study using mRNA, DNA methylation, and miRNA data from The Cancer Genome Atlas (TCGA) was performed to classify and characterize the different glioma types, considering its recently updated reclassification [ 7, 8 ]. Firstly, in Section 2, we describe the data collected and detail all the methods used in this work. We used DIABLO to select a subset of highly correlated features from the different modalities relevant to distinguish either the three glioma types (as present in Section 3.1) or specifically to discriminate the two LGG types (Section 3.2). This is particularly relevant considering the ongoing evolution in glioma classification and the need of identifying novel genetic markers that aid in its characterization. Aiming to delineate unknown mechanisms of glioma pathobiology and unveil novel therapeutic targets, we further investigated the correlation between the selected features, the pathways involved, and the prognostic value based on their effect in the survival probability of the patients (Section 3.3). Lastly, in Sections 4 and 5, we discuss the obtained results and highlight the main conclusions of the work developed, respectively. 2 2.1

Materials and Methods Data Description

In this study, we used the GBM and LGG datasets from TCGA, available in the Genomic Data Commons Data Portal with project names TCGA-GBM and TCGA-LGG. The TCGA level-3 data on gene expression (Illumina mRNAseq), DNA methylation (Illumina HumanMethylation450), miRNA expression (Illumina miRNAseq, BCGSC miRNA Profiling workflow) and clinical information were retrieved. RNA-seq data normalized for gene length and for sequencing depth (Transcript per million, TPM) with upper quantile normalization was extracted for DIABLO and survival analysis. RNA-seq RSEM expected counts, without normalization, were extracted for differential gene expression analysis. The R packages RTCGAToolbox v2.28.1 [ 25 ] and TCGAbiolinks 2.25.3 [ 26 ] were used to collect the data.

DNA methylation data was first preprocessed by removing non-valid entries (start = end = -1), probes mapping multiple places, non-CpG sites, sexual chromosome and missing (NA) entries. For each dataset, features with zero variance or having very few unique values relative to the number of samples (cutoff of 10%) and a large ratio of the frequency of the most common value to the frequency of the second most common value (cutoff ratio of 95/5) were filtered. The intersection of the datasets was done to keep only the samples that were present in all the datasets. From the 278 GBM individuals, 108 had both mRNA and DNA methylation data, and 5 had miRNA data, so the latter data type was not used in GBM discrimination. From the 262 astrocytoma individuals, 248 had both mRNA and DNA methylation data, and 243 had miRNA data, while from the 169 oligodendroglioma individuals, 166 had mRNA, DNA methylation and miRNA data, characterized respectively by 19068, 297602 and 1118 filtered features.

In the classification task, the data was split into train and test subsets, in a predefined ratio (70% for training and 30% for testing), while preserving relative ratios of the different classes (the different glioma types are in the same proportion as the original dataset). 2.2

Data Integration Analysis for Biomarker discovery using Latent cOmponents - DIABLO

Data Integration Analysis for Biomarker discovery using Latent cOmponents (DIABLO) [ 18 ] was applied in this work, using the R package mixOmics v6.22.0 [ 27 ], to select co-expressed variables from the different datasets that discriminate between the glioma types. DIABLO discriminant analysis extends the multivariate methodology Generalized Canonical Correlation Analysis (GCCA) [ 11 ] to a supervised framework by replacing one data matrix - (@) with the outcome dummy matrix Y. For Q omics datasets measuring the expression levels of the Pq omics variables on the same biological samples, GCCA solves for each component (canonical variate) h:

max 0ℎ(1) ,...,0ℎ(&) @, 9=1,@≠ 9 & Õ s.t. a(@) ℎ 2 = 1 and a(@) ℎ 1 (ℎ = 1, ..., ) (1) where -ℎ(@) and -ℎ( 9) represent the datasets, 0 ℎ(@) and 0 ℎ(9) are the loading vectors (component coefficients) on component h with q, j = 1,...,Q (q ≠ j). For all a ∈ R=, ℓ1 and ℓ2 norms are, respectively, defined by ||a||1 = Í1? 0 9 and ||a||2 = Í1? 029 1/2 _ (@) is a ℓ1 penalisation parameter to induce sparsity on the loading vector aℎ by shrinking some coefficients to zero. One data matrix - (@) is given by the outcome dummy matrix, Y, indicating the class membership of each individual. = 2@, 9 @, 9 is the design matrix, a Q×Q matrix that specifies whether datasets should be correlated and includes values between zero (datasets are not connected) and one (datasets are fully connected), enabling to model a particular relation between pairs of omics data and to constraint the model to only take into account those specific pairwise covariances, as expected from prior biological knowledge or experimental design. The design matrix was specified based on a data-driven approach. Using PLS as a preliminary analysis, integrating two datasets at a time to assess the common information between them, the correlation value was then used in cq,j for q, j < Q, while cq,j = 1 for q, j = Q.

In this work, 2 components were used, since in general, K - 1 components are sufficient to achieve the best classification performance, where K is the number of classes [ 27 ]. It was also verified, by evaluating the difference in overall misclassification error rate, that there was no gain in performance when adding more components to the sparse model.

When using sparse GCCA, the component scores th = Xhah are defined on a small subset of variables with non-zero coefficients, leading to variable selection that aims to optimally maximize the discrimination between the K outcome classes in Y. In DIABLO, a soft-thresholding is used, replacing the non negative parameter _ (@) that controls the amount of shrinkage in ah by the number of features to select on each dimension [ 27 ]. To select the number of features, a step-by-step approach was used, by assessing the performance of the model (measured via overall misclassification error rate) for each value of features provided as a grid, one component at a time, using 5x5 cross-validation (CV). First, a grid with a higher amplitude of values but low resolution was evaluated, and subsequently finer grids were analyzed around the previously selected values with progressively smaller steps, until the values converged.

Considering an independent test set, the predicted coordinates (scores) are computed for each new observation on the set of H latent components, and then used to predict each of the dummy variables. The final predicted class is then obtained, minimizing the distance to the centroid. The predictions are combined by weighted vote, where each omics dataset weight is defined as the correlation between the latent components associated to that particular data set and the outcome, from the training set. The final prediction is the class that obtains the highest weight across all datasets.

In this work, 30 different partitions of the data were considered, retrieving from each trained model the variables with nonzero coefficients and their absolute values, such as the performance measures (accuracy, precision, recall, F1) after predicting the labels in each test set. The receiver operating characteristic (ROC) curves were also used to evaluate the performance of each model and each component in predicting the glioma type, and the respective area under the curve (AUC) was determined. 2.3

Selection of Features and Database Search

For further analyzes we have selected the variables that, within the 30 different models, occurred more frequently in the components with non-zero coefficient and with higher median absolute value: frequency > 15 and median absolute value > 0.05 or frequency > 10, and median absolute value > 0.2. The outcome class of each feature (the glioma type for which the expression of that feature is more distinguished from the other types) was determined as follows: For the RNA features, the respective outcome type was determined by differential gene expression analysis, where, for each gene, the outcome is the class in which it is more differentially expressed (with lower false discovery rate (FDR) and higher log(FC)). For the methylation features, the outcome was determined for each methylation site as the class with highest averaged methylation level difference to the other two classes. For the miRNA features, the outcome was determined as the class with higher averaged miRNA expression (in this case there were only the two LGG types).

Each selected variable was searched for associations with glioma and cancer over different databases. In the case of DNA methylation features, they were first mapped back to gene symbols, using methylGSA v1.16.0 package [ 28 ] and the multisymbol checker tool from HUGO Gene Nomenclature Committee. The search was performed using all the human collections from the Molecular Signatures Database (MolSigDB) [ 29 ] and OMIMr [ 30 ] with the keywords "glioma", "glioblastoma", "oligodendroglioma", "astrocytoma" and in PubMedr, using as keywords the symbol of each gene and "AND glioma" or "AND cancer". The gene targets of each miRNA were determined using the MiRTarBase database [ 31 ]. 2.4

Differential Expression Analysis

Differential expression analysis was performed using the edgeR package v3.40.2 [ 32 ]. Genes with very low counts were first filtered, and trimmed mean of M values (TMM) normalization was applied to account for compositional biases [ 33 ]. Differential expression between the experimental groups was tested using quasi-likelihood F-test (considering a negative binomial distribution). Genes with an FDR-adjusted p-value (based on Benjamini and Hochberg (BH) adjustment [ 34 ]) of less than 0.05 were deemed significantly differentially expressed genes (DEG). Furthermore, to define a compromise between statistical significance and biological variation, the genes for which: − log10 ( ') > max(− log10 ( ')) + log( ) ∗ [− log10 (0.05) − <0G (− log10 ( '))] were considered positively differentially expressed while the genes for which: − log10 ( ') > max(− log10 ( ')) + log( ) ∗ [log10 (0.05) + max(− log10 ( '))] were considered negatively differentially expressed. All the remaining DEGs, although significant were considered not relevant given the low log(FC). 2.5

Overrepresentation and Gene Set Enrichment Analysis

Differentially expressed genes that demonstrated significant fold changes between conditions were screened for KEGG pathways and Gene Ontology (GO) terms with over-representation enrichment analysis (ORA) in the software WebGestalT [ 35 ]. Using the same software, the full list of expressed genes ranked by a combination of statistical significance and fold-change: was used in gene set enrichment analysis (GSEA). Terms and pathways were considered significant for a FDR < 0.05 (BH adjustment). The Pearson correlation between each pair of selected features was computed using the respective values of expression in all the patients, with the R package Hmisc v5.0-1 [ 36 ] and plotted using the R package pheatmap v1.0.12 [ 37 ].

A Kaplan-Meier survival curve was estimated using the clinical data from the patients of the TCGA-LGG and TCGA-GBM projects, with the R packages survival v3.5-3 for the analysis and survminer v0.4.9 for plotting [ 38, 39 ]. For each evaluated gene (RNA and DNA methylation selected features), the tumor samples were split into two groups: low and high expression levels, using either the median value as cut-off or when existing two distinct density peaks, the threshold is set as the expression for which the density has the local minimum between the peaks. The statistical significance of survival differences in the analysis was assessed using the log-rank test, using the 0.05 significance level. 3 3.1

Results Integration of two omics layers for the classification of glioma types

To discriminate between the three glioma types, namely GBM, astrocytoma, and oligodendroglioma, two components of mRNA and DNA methylation features were first obtained using DIABLO.

Regression analysis with PLS showed that the datasets were highly correlated (cor = 0.8796), and therefore the design matrix was set with a weight of 0.8. Based on the grid search, the number of mRNA variables with non-zero coefficient that was chosen were 41 (out of 19068) for the first component and 7 for the second component, while the number of DNA methylation variables chosen was 11 (out of 297602) for the first component and 41 for the second component.

The performance metrics evaluating the classification of the glioma types, based on the obtained components, are summarized in Table 1. Keeping only two components was sufficient to achieve a good overall accuracy of 98%, with the first component primarily differentiating GBM from the LGG and the second component mostly distinguishing between astrocytoma and oligodendroglioma, as depicted by Figure 1A. It is clear that GBM is considerably easier to identify since it has the highest F1 value and its ROC curve separating from the LGG types has an area under the curve close to 1 in all the components (Figure 1B).

DRG2 and THRAP3, both already reported in glioma studies, and the methylation sites cg24899806 (KCND2), cg19093820 (GPR156), cg26077062 (SYBU), cg04951819 (CCDC77) and cg03780927 (BCL9L), where GPR156 and CCDC77 were never reported in glioma, and the latter also never reported in any other cancer studies. Interestingly, two of the most important features to distinguish oligodendroglioma, the gene FBX042 and the methylated gene CD300LB were also never reported in glioma studies. 3.2

Discrimination of Lower-grade glioma types using three omics layers

Once shown in the previous analysis that the LGG were more challenging to distinguish, a further analysis keeping only these two classes and including an additional data view - miRNA - was performed with DIABLO, again using only two components.

Regression analysis with PLS showed smaller correlation between DNA methylation and miRNA (cor = 0.6027) than the one obtained for the other pairs of data, and therefore the design matrix was set with a weight of 0.6 for this dataset pair and 0.8 for the others. Based on the grid search, the number of mRNA variables with non-zero coefficient that was chosen was 7 in both components, 29 DNA methylation variables for the first component and 42 for the second component, and 11 miRNA variables in both components.

The performance metrics evaluating the classification of the glioma types, based on the obtained components, are present in Table 2 and the AUC obtained are shown in Table 3. An accuracy of 97% was achieved, and the first component was able to distinguish by itself the classes (Figure 2A). Indeed, when observing the circos plot in Figure 2B, we can also see that the averaged expression levels of the selected features do not always separate the two classes (the lines of expression of the second component features tend always to overlap). The features are highly correlated between blocks, where the DNA methylation features are mainly negatively correlated with the expression of the genes, while there are many positive correlations between miRNA and both genes and methylation features.

We have then selected only the variables that occurred more frequently in the first component with non-zero coefficient, given that the ones selected by the second component were not useful to distinguish the classes. The selected variables are shown in Figure 2C (mRNA features in Fig.2C1, DNA methylation features in Fig.2C2 and miRNA features in Fig.2C3), ordered by the respective importance in the components (median absolute value) and grouped by outcome. Note that the selected features are not exactly the ones already selected before when only using mRNA and DNA methylation data (shown in the previous section 2.1). Indeed, the selection still includes some genes that were already chosen before: LRRC41, SFRS4, GPBP1L1 and DRG2, highlighting their relevance for the separation of the classes, while also adding the genes PSMB2, NADK and PPP1R8, but now discarding FBX042 and CSDE1 as relevant genes. Also in the methylation sites, the more relevant methylated genes, such as CD300LB, PLCG1, MAPKAP1, FLJ37543, NANOS1, ASAP1, TMC8 and SNAPC2 are maintained in the selection, while more differences are detected within the comparison of features with lower absolute value. Furthermore, most of the genes and methylation sites are associated with the distinction of the oligodendroglioma class, while the majority of the selected miRNA are more expressed in astrocytoma. It is also interesting to note that the gene GPBP1L1 that was selected in both analyses (2 and 3 views) was never reported in any cancer study. 3.3 3.3.1

Investigation on the selected features Differential gene expression, overrepresentation and gene set enrichment analyses

Differential gene expression analysis was performed between each pair of glioma types to get a more deep insight about the differences in gene expression between the classes. It was also performed to compare the results with the genes selected by DIABLO and understand whether this method gives additional and more useful information than the one that could be directly obtained by just a single-omics analysis. Figure 3A shows the volcano plots of the differentially expressed genes (DEGs) between each pair of glioma types, where the top 15 genes (with the lowest FDR) are labelled. Given the similarity between the LGG types, the DEGs obtained for astrocytoma vs. oligodendroglioma have a considerably lower log(FC) than the ones comparing the other classes. As expected, all the mRNA features selected by DIABLO were included in at least one of the sets of DEGs. Nevertheless, it is interesting to verify that not all the top DEGs were selected by DIABLO as the most relevant features to discriminate between the classes.

Furthermore, overrepresentation (ORA) and gene set enrichment analysis (GSEA) were performed to have a more comprehensive view of the pathways and biological processes that are distinguished between classes (for more details, see Supplementary Material). The genes that are overexpressed in GBM when comparing with both LGG types are mainly associated with biological processes such as defense response, cell adhesion, extracellular matrix (ECM) structure organization and ECM-receptor interaction, while the genes that are underexpressed in GBM enrich for biological processes related with cell-cell signaling, synapse (including its organization, modulation of its transmission and vesicle cycle), regulation of transport, especially monoatomic ion transmembrane transport, and regulation of membrane potential. The biological processes that are differentially upregulated in GBM when comparing only to astrocytoma are mainly cytokine-mediated signalling and leukocyte migration, while with oligodendroglioma include the regulation of immune system process, cell population and leukocyte proliferation, cell migration, motility and secretion and response to cytokine. When comparing the LGG types, the upregulated genes in astrocytoma enrich for biological processes such as cytokine production, hemopoiesis, regulation of immune and defense response, cell and leukocyte proliferation and migration, while the downregulated genes are associated with cell-cell signaling, homeostasis, membrane potential and transport, synapse, nervous system and head development. 3.3.2

Correlation analysis

In the previous subsections, the presence of several high correlations between features of different modalities was detected, when observing the two circos plots in Figures 1 and 2. Therefore, a correlation plot was further obtained with all the selected features, to understand not only the inter-block but also the intra-block correlations, as shown in Figure 3. There is a block that clearly stands out, with 10 methylation sites (between cg10504751 and cg17105609 (Figure 3C)) that are located respectively in the genes GNAO1, MRC2, ARNTL, MCF2L2, TMEM106A, FGFRL1, KCNB1, KIAA1614, FES, FGFRL1, positively correlated with each other and negatively correlated with the group of 27 genes (between DCTD and PION (Figure 3C)). All that features are associated with the distinction of GBM, and suggest involvement in common biological processes or cellular components. Some other smaller blocks with high positive correlations can be detected, such as the one composed by the methylation sites cg13303069, cg07847030 and cg17484733, all located in the gene TCF7L1 and the one composed by the methylation sites cg22762615, cg02179499 and cg07250222, all located in the gene FGFR2. Both these blocks are negatively correlated with the group of genes GPBP1L1, LRRC41, CSDE1, FBXO42 and SFRS4, which are associated with the oligodendroglioma type. Additionally, to understand the possible prognosis value of each of the selected features, the Kaplan-Meier survival curve was then plotted using their gene expression levels. It was performed for GBM and LGG patients separately, given the remarkably different survival times from the TCGA cohorts: generally lower than 5 years for GBM but reaching 15 years in LGG patients, and to compare the different expression profiles in the two groups. The p-values of the log-rank test used to assess the statistical significance of the survival differences are summarized in Figure 3D, listing only the features which have a significant effect on survival in at least one of the groups. Notably, there are four different genes whose expression has an impact on survival probability in both groups of patients: GPBP1L1, PLCG1, SNAPC2 and EXD3. Figure 3B exemplifies the Kaplan-Meier curves of GPBP1L1, selected within the mRNA features, and PLCG1, selected within the DNA methylation features (with p-value lower than 0.01 in both groups of patients). On the one hand, the higher expression of these genes reflects a higher survival probability in GBM patients (left side), while on the other hand, it has the opposite effect in LGG patients, by negatively impacting their survival probability. It is also interesting to note that some genes have a very significant effect (p-value <0.01) in the survival of one group of patients while its expression is helpful to distinguish another type. For instance, the expression of CCDC46 gene impacts the survival of LGG patients, but is useful to discriminate the GBM type, while in the case of THRAP3 gene, we have the reverse situation, distinguishing oligodendroglioma. 4

Discussion

Over the last years, the classification of glioma tumors has been subject of significant alterations as a result of the evolving understanding of its biology derived from extensive research into the molecular profiling of glioma types. The significant role of genetic markers is being increasingly highlighted in the determination of the final diagnosis, and the publicly accessible multi-omics databases, such as TCGA, are particularly helpful in gaining new knowledge regarding glioma characterization. However, the diagnostic categories reported for the TCGA-LGG and -GBM projects datasets are still not entirely consistent with the recently published WHO-2021 guidelines [ 7, 8 ]. In this study, we used the recently revised glioma reclassification and integrate multiple data modalities (including RNA-seq, DNA methylation, and miRNA) to gain more understanding of the biological mechanisms underlying glioma heterogeneity and discover novel biomarkers that can help further characterize each type and therefore be helpful for diagnosis and prognosis of the patients.

Using DIABLO, we were able to find predictive models that classify the three glioma types with high accuracy and identify relevant features in their characterization. This method showed to be more valuable than just applying a single-omics analysis like differential gene expression, where the correlation between features of different modalities is not considered. This network of connections between data layers has thus allowed to extract not only the most differentially expressed genes between classes, but the ones that are important for the separation of the three types, considering also the importance of DNA methylation and miRNA features and the relation established between each other.

It is known that GBM tumors are characterized by a highly malignant profile with increased cell proliferation and aggressiveness relatively to the LGG types, presenting significantly lower survival times even after chemotherapy and radiation treatments [ 40 ]. In this study, two highly correlated groups of molecular features were found to distinguish GBM from LGG, with the expression of the methylated sites positively correlated with each other and, in turn, negatively correlated with the selected genes. Interestingly, all these genes were revealed to be upregulated in GBM when compared with both astrocytoma or oligodendroglioma, while the methylation sites showed a considerably lower average level of methylation. It is worth noting that KIAA0495, which was selected with the highest loading in the distinction of GBM, codes a long non-coding RNA located at the arm of chromosome 1p, the absence of which is characteristic in oligodendroglioma. Nevertheless, the same gene is also downregulated in astrocytoma, where 1p deletion events are not so prevalent. In fact, KIAA0495 has been suggested to modulate cellular apoptosis by regulation of p53-dependent antiapoptotic genes and to promote brain glioma proliferation and invasion, being associated with worse patients’ prognosis [ 41 ]. Other selected features including XKR8, EMP3 and GNAO1 also showed to be involved in the apoptosis process, but only the role of EMP3 was already elucidated in glioma. The overexpression of EMP3 in GBM is likely associated with the lack of EMP3 hypermethylation that is present in LGG types [ 42 ], and its most thoroughly researched function is its regulation of receptor tyrosine kinase (RTK) signalling. The phosphorylation of tyrosine residues in signaling proteins is catalyzed by activated RTKs, starting multi-step signaling cascades which promote differentiation and proliferation. In particular, EMP3 has been shown to foster the phosphorylation of the RTKs EGFR and ErbB2/HER2, as well as their downstream effectors (ERK, PI3K, Akt). Moreover, as a result of the EGFR gene amplification typically present in IDH-wt GBM, EGFR is frequently overactivated in this type, which thus correlates with increased proliferation, apoptosis resistance, and migration of tumor cells [ 43 ]. A set of other selected features, including MSN, EFEMP2 GNAO1, TOM1L1, FES, FGFRL1 and TMEM106A, was also reported to participate in RTK signalling pathways. It is important to highlight the association of CD58, CUL7, MRC2, SHROOM3 and the above-mentioned FES, MSN, EFEMP2, GNAO1 in the extracellular matrix (ECM) organization, and, in special, the role of the last three in the integrin pathway. The ECM act as a biomechanical scaffold and a biochemical regulator of tumor cell homeostasis and is a crucial part of the GBM tumor microenvironment (TME). The ability of GBM tumor cells to migrate by inducing changes in actin cytoskeleton dynamics and ECM remodelling is well documented [ 44 ]. Integrins, as cell adhesion transmembrane receptors, act as ECM-cytoskeletal linkers that transmit signals between cells and the environment. Indeed, tumors can take advantage of the integrin-facilitated biological communication to take part in every stage of cancer progression, including tumor initiation, proliferation, and invasiveness [ 45 ]. In particular, while the MSN gene encodes an Ezrin-radixin-moesin (ERM) family protein that connects the actin cytoskeleton to the plasma membrane, EFEMP2 was reported to be associated with the expression of matrix metalloproteinases (MMPs) [ 46 ], another group of proteins crucial in tissue remodelling. Although never described in glioma studies, EFEMP2 and GNAO1 are reported in pathway databases to be also involved with phospholipase-C (PLC) activity. PLC is stimulated by the P2Y2 receptor when coupled to specific G proteins, to hydrolyze PIP2, which is involved in calcium response, modulates a variety of actin binding proteins and activates Rac1 and RhoA [ 44 ]. These are small GTP-binding proteins of the Rho family known to act as molecular switches to control actin cytoskeleton dynamics. Interestingly, GNAO1 is also linked to G-protein signaling and calcium regulation, and other selected features SWAP70 and MCF2L2 are players in Rho GTPase cycle. Furthermore, the role of GNAO1 in purinergic signaling should be further investigated, since this pathway has been emerging as an important factor giving glioma cells invasive potential and resistance to ATP-induced cell death [ 47 ]. In this context, it has been proposed that besides alterations in the extracellular nucleotide/nucleoside metabolism, the disruption of purinergic signalling creates an inflammatory microenvironment with increase in cytokine production (specifically in IL-1V, IL-6 and TNF) and regulated platelet function [ 47 ]. Notably, several of the selected features are involved in either nucleotide metabolism or transport (DCTD and SLC43A3), in platelet function (TAGLN2) or in cytokine production or regulation (CD58, TMEM106A, FES, MSN and GNAO1). It is interesting to highlight that TMEM106A, a transmembrane protein found to be expressed on the surface of macrophages, specifically induces the release of IL-1V, IL-6 and TNF upon activation of the MAPK and NF-kappaB signaling pathways. This gene was never reported in glioma studies, but showed to be a tumor suppressor in gastric, renal and lung cancer [ 48 ], and therefore might also be also considered a novel marker candidate in glioma. Most of the remaining selected features are associated with metabolism, hemostasis and transport. For instance, ARSD that encodes the protein arylsulfatase D, essentially involved in sphingolipid, estrogen and protein metabolism, only gained attention recently for its role in amyloidosis [ 49 ]. Given that amyloid build-up is a part of the glioma tumor environment and it was inclusively indicated as a potential target for developing a novel class of anti-tumor drugs [ 50 ], more investigation should be dedicated to ARSD gene, whose function was never described in glioma.

Although all these genes were revealed to be relevant in biological processes underlying GBM and especially in its highly proliferative and invasive profile, the survival analysis showed that only the expression of the genes CCDC46 (with updated symbol CEP112), CD58, EFEMP2 and KIAA1614 were significant in patient survival probability, the first two in LGG and the last two in GBM types. In fact, even though these genes are upregulated in GBM patients, they might not directly influence their survivability and still have an impact on LGG. Indeed, CD58 was already reported before to be a prognostic biomarker in LGG [ 51 ]. It is worth noting that besides the involvement in the centrosome cycle, the function of CCDC46 and KIAA1614 in glioma is still unknown and deserves future research.

In contrast to GBM, LGG tumors typically show a more indolent course. However, many might eventually transform into a more aggressive type [ 52 ]. Further characterization of the different omics profiles can possibly offer new insights on tumor microenvironment and development from lower grade to higher grade gliomas. The current classification of LGG types is based mainly on two genetic markers, where the difference between astrocytoma and oligodendroglioma still relies only on the absence of 1p/19q codeletion [ 7 ]. In this study, multiple genetic features, including mRNA, DNA methylation and miRNA, were found to discriminate between astrocytoma and oligodendroglioma. Most of the selected genes were differentially expressed between GBM and oligodendroglioma but not between GBM and astrocytoma, with the exception of the gene DRG2, where the opposite was verified. In general, the expression of those genes was downregulated in oligodendroglioma when comparing with either astrocytoma or GBM, while the expression of DRG2 is upregulated in both oligodendroglioma and GBM when comparing with astrocytoma. Also, there are just a few selected sites where the level of methylation presents a considerable difference between astrocytoma and both the other two types. Therefore, astrocytoma seems to be the more difficult type to define and separate, with large similarities with GBM and oligodendroglioma, while these two are more easily to distinguish. Nonetheless, it is still possible to highlight some few exceptional features that were selected DRG2, KCND2 and C14orf23, which allow for the separation of astrocytoma. Indeed, DRG2 was already reported before to be typically under-expressed in IDH-mutant samples, characteristic of LGG types, explaining its difference to GBM. This gene encodes a GTP-binding protein which catalyzes the conversion of GTP to GDP, and is known to be involved in the regulation of cell growth and differentiation. Although its function in glioma is still not completely elucidated, it was recently demonstrated to be positively correlated with several steps in anti-tumor immune response [ 53 ], and its depletion was shown to promote survival [ 54 ]. In the case of KCND2, it showed a lower level of methylation and a higher expression level in astrocytoma. It encodes a voltage-gated potassium channel that mediates transmembrane potassium transport in excitable membranes, primarily in the brain, regulating neurotransmitter release. Moreover, it also participates in signalling pathways such as ERK [ 55 ], important for cellular proliferation, and GDNF [ 56 ] (Glial cell line-derived neurotrophic factor), which guides glioma-associated microglia/macrophages (GAMs) recruiting in tumor immune resistance [ 57 ].

Furthermore, C14orf23, also known as LINC01551, is a lncRNA that influences cell cycle and transcription. It was never described in glioma, but it was suggested to promote metastatic ability by post-transcriptional regulation of miRNA (by "“sponge adsorption”) [ 58 ]. It would be interesting to further investigate the relationship between this gene and the miRNA selected as key players in glioma in order to understand its role in metastic progression.

From the selected features that distinguish oligodendroglioma, with down-regulated gene expression and higher level of methylation compared to the other types, one should note that the majority are related either with ubiquitination processes (LRRC41, FBXO42, PSMB2), RNA-binding activity, especially mRNA processing and splicing (CSDE1, SFRS4, PPP1R8, GPBP1L1, THRAP3, SNAPC2, NANOS1) or signalling (LRRC41, PSMB2, PPP1R8, DRG2, GPR156, KCND2, BCL9L, PLCG1, MAPKAP1, FGFR2). The signalling pathways influenced by these genes are mainly associated with RTK cascades, including the involvement of ERK, PI3K, Akt, EGFR and Rho GTPases, that were already discussed previously to be associated with increased proliferation and migration of tumor cells, and thus with a more invasive glioma profile. Additionally, other signalling pathways such as Notch and Wnt were also found to be essential in type distinction. For instance, PSMB2 and BCL9L are known to be involved in the Wnt pathway, which activation was verified to increase the stemness of glioma cells [ 59 ]. On the other hand, FBXO42 not only participates in ubiquitination and degradation of p53/TP53, but it was also shown as a critical regulator of the Notch pathway via modulation of RBPJ-dependent global chromatin landscape changes in leukemia [ 60 ]. It would be of great importance to further research the function of FBXO42 in glioma, since it was only reported in other types of cancer. Interestingly, this gene is targeted by (hsa-)miR-186-5p, such as LRRC41 and CSDE1. MicroRNAs (miRNAs) have recently attracted interest and their potential impact on oncogenic processes has been thoroughly studied. Based on altered miRNA expression profiles, it has been possible to identify and diagnose various tumors, as well as to forecast their development, prognosis, and response to treatment. For example, the selected miR-186 was already described in multiple cancers, including glioma, with involvement in the regulation of inflammatory response and apoptosis [ 61 ]. Among the other selected miRNA, it is also highlighted miR-92b, which also targets the selected gene CSDE1 and GPBP1L1. This miRNA was indicated to restrain the proliferation, invasion, and stimulate apoptosis of glioma cells by targeting PTEN/Akt signaling pathway, suggesting a possible antitumor effect in glioma treatment [ 62 ].

Multiple genes revealed to have an important prognostic value, with significant effects on survival probability of LGG (including the genes DRG2, LRRC41, NADK, PPP1R8 and C14orf23) and GBM patients (including the genes THRAP3, PSMB2 and MAPKAP1). Notably, four genes showed to impact both LGG and GBM survival: GPBP1L1, PLCG1, SNAPC2 and EXD3. GPBP1L1, SNAPC2 and EXD3 are likely key players in transcription, where the former is predicted to enable DNA and RNA binding and regulate transcription, the second encodes a subunit of the snRNA-activating protein complex, associated with the TATA box-binding protein, which is necessary for RNA polymerase II and III dependent small-nuclear RNA gene transcription, and the latter is involved in genetic stability and correction of DNA polymerase errors. While SNAPC2 was already identified in glioma studies [ 63 ], EXD3 was only reported in gastric cancer [ 64 ] and GPBP1L1 was never reported in any cancer study. PLCG1, on the other hand, plays an important role in the intracellular transduction of receptor-mediated tyrosine kinase activators, already discussed to be fundamental in actin reorganization and cell migration. It is however very intriguing the relation of the expression profiles of GPBP1L1 and PLCG1 with the survival probability observed between GBM and LGG patients, in Figure 3B, with low expression affecting survival in GBM and high expression affecting survival in LGG. This suggests that the role of these genes and/or their interactions might be different in the two types, yet impacting patient survival in both cases. In fact, the expression of PLCG1 was demonstrated to be significantly correlated with IDH status [ 65 ], which is one of the key characteristics that distinguish LGG and GBM. 5

Conclusions

In this study, we used the recently updated glioma reclassification to further characterize the different glioma types: GBM, astrocytoma and oligodendroglioma. Based on the integration of multiple omics’ layers (mRNA, DNA methylation and miRNA data from TCGA) using a supervised and sparse variant of canonical correlation analysis (DIABLO), we were able to discriminate the three glioma types with very high performance. Indeed, the correlation between blocks showed to be of extreme importance in the selection of the features. The group of correlated features that revealed to be relevant in the distinction of GBM from LGG types were mainly associated with RTK signalling and extracellular matrix organization, which is indeed a crucial part of the GBM tumor microenvironment. On the other hand, the discrimination of the LGG types was characterized mainly by features involved in ubiquitination and DNA transcription processes.

Furthermore, we were able to identify several novel features, that deserve future attention. For instance, the methylation of the gene KIAA1614 is a potential biomarker with both diagnosis and prognosis value in GBM, yet its function is still not elucidated in glioma. Moreover, the gene GPBP1L1 has a great impact in both LGG and GBM patient survival and showed distinct downregulation in oligodendroglioma when compared to both astrocytoma and GBM, but its role in cancer is still not described. Nonetheless, this gene is known to be targeted by miR-92b, a glioma-reported miRNA which was also selected by our method as a relevant feature to distinguish LGG types. Therefore, the interaction between the selected genes and miRNA features might also be essential in further investigation of these novel biomarkers.

Supplementary information

The supplementary file contains the results from the Overrepresentation and Gene Set Enrichment Analyzes, for all pairs of glioma types using GO terms and KEGG pathways.

Acknowledgements

The results presented here are based upon data generated by The Cancer Genome Atlas (TCGA) Research Network.

Funding Statement

This work was funded by national funds through the FCT - Fundação para a Ciência e a Tecnologia, I.P., with references CEECINST/00042/2021, UIDB/00297/2020 and UIDP/00297/2020 (NOVA Math), UIDB/00667/2020 and UIDP/00667/2020 (UNIDEMI), and under the scope of the research project “MONET – Multi-omic networks in gliomas” (PTDC/CCI-BIO/4180/ 2020).

Competing interests

The authors have declared no competing interests.

Authors’ contributions

Francisca G. Vieira: Conceptualization, Methodology, Data analysis, Writing – original draft. Regina Bispo: Conceptualization, Methodology, Writing - review and editing, Supervision. Marta B. Lopes: Conceptualization, Methodology, Writing - review and editing, Supervision, Funding acquisition.

1. Nicholson JG , Fine HA . Diffuse glioma heterogeneity and its therapeutic implications . Cancer Discovery . 2021 ; 11 ( 3 ): 575 - 590 . 2. Wiestler B , Capper D , Holland-Letz T , et al. ATRX loss refines the classification of anaplastic gliomas and identifies a subgroup of IDH mutant astrocytic tumors with better prognosis . Acta Neuropathologica . 2013 ; 126 ( 3 ): 443 - 451 . 3. Capper D , Jones DT , Sill M , et al. DNA methylation-based classification of central nervous system tumours . Nature . 2018 ; 555 ( 7697 ): 469 - 474 . 4. Li L , Wei Y , Shi G , et al. Multi-omics data integration for subtype identification of Chinese lower-grade gliomas: A joint similarity network fusion approach . Computational and Structural Biotechnology Journal . 2022 ; 20 : 3482 - 3492 . 5. Sienkiewicz K , Chen J , Chatrath A , et al. Detecting molecular subtypes from multi-omics datasets using SUMO . Cell Reports Methods . 2022 ; 2 ( 1 ). 6. International Agency for Research on Cancer . WHO classification of tumours of the central nervous system . World Health Organization classification of tumoursIARC5 ed . 2022 . 7. Louis DN , Perry A , Wesseling P , et al. The 2021 WHO classification of tumors of the central nervous system: A summary . Neuro-Oncology . 2021 ; 23 ( 8 ): 1231 - 1251 . 8. Mendonça ML , Coletti R , Gonçalves CS , et al. Updating TCGA glioma classification through integration of molecular profiling data following the 2016 and 2021 WHO guidelines . bioRxiv. 2023 . 9. Qi L , Wang W , Wu T , Zhu L , He L , Wang X. Multi-Omics Data Fusion for Cancer Molecular Subtyping Using Sparse Canonical Correlation Analysis . Frontiers in Genetics . 2021 ; 12 . 10. González I , Déjean S , Martin PG , Baccini A. CCA : An R package to extend canonical correlation analysis . Journal of Statistical Software . 2008 ; 23 ( 12 ): 1 - 14 . 11. Tenenhaus A , Philippe C , Guillemot V , Le Cao KA , Grill J , Frouin V . Variable selection for generalized canonical correlation analysis . Biostatistics . 2014 ; 15 ( 3 ): 569 - 583 . 12. Li W , Zhang S , Liu CC , Zhou XJ . Identifying multi-layer gene regulatory modules from multi-dimensional genomic data . Bioinformatics . 2012 ; 28 ( 19 ): 2458 - 2466 . 13. Zeng Z , Vo AH , Mao C , Clare SE , Khan SA , Luo Y. Cancer classification and pathway discovery using non-negative matrix factorization . Journal of Biomedical Informatics . 2019 ; 96 . 14. Lock EF , Hoadley KA , Marron JS , Nobel AB . Joint and individual variation explained (JIVE) for integrated analysis of multiple data types . Annals of Applied Statistics . 2013 ; 7 ( 1 ): 523 - 542 . 15. Palzer EF , Wendt C , Bowler R , Hersh CP , Safo SE , Lock EF . sJIVE: Supervised Joint and Individual Variation Explained . arXiv e-prints. 2021 :arXiv: 2102 . 13278 . 16. Argelaguet R , Velten B , Arnol D , et al. Multi-Omics Factor Analysis-a framework for unsupervised integration of multi-omics data sets . Molecular Systems Biology . 2018 ; 14 ( 6 ). 17. Argelaguet R , Arnol D , Bredikhin D , et al. MOFA+: A statistical framework for comprehensive integration of multi-modal single-cell data . Genome Biology . 2020 ; 21 ( 1 ). 18. Singh A , Shannon CP , Gautier B , et al. DIABLO: An integrative approach for identifying key molecular drivers from multi-omics assays . Bioinformatics . 2019 ; 35 ( 17 ): 3055 - 3062 . 19. Zhang Y , Gaynanova I. Joint association and classification analysis of multi-view data . Biometrics . 2022 ; 78 ( 4 ): 1614 - 1625 . 20. Safo SE , Min EJ , Haine L . Sparse linear discriminant analysis for multiview structured data . Biometrics . 2022 ; 78 ( 2 ): 612 - 623 . 21. Jin X , Liu L , Wu J , et al. A multi-omics study delineates new molecular features and therapeutic targets for esophageal squamous cell carcinoma . Clinical and Translational Medicine . 2021 ; 11 ( 9 ). 22. Zheng P , Sun S , Wang J , et al. Integrative omics analysis identifies biomarkers of idiopathic pulmonary fibrosis . Cellular and Molecular Life Sciences . 2022 ; 79 ( 1 ). 23. Cai Z , Poulos RC , Liu J , Zhong Q. Machine learning for multi-omics data integration in cancer . iScience . 2022 ; 25 ( 2 ). 24. Mo Q , Shen R , Guo C , Vannucci M , Chan KS , Hilsenbeck SG . A fully Bayesian latent variable model for integrative clustering analysis of multi-type omics data . Biostatistics . 2018 ; 19 ( 1 ): 71 - 86 . 25. Samur MK . RTCGAToolbox: A New Tool for Exporting TCGA firehose data . PLoS ONE . 2014 ; 9 ( 9 ). 26. Colaprico A , Silva TC , Olsen C , et al. TCGAbiolinks: An R/Bioconductor package for integrative analysis of TCGA data . Nucleic Acids Research . 2016 ; 44 ( 8 ): e71 . 27. Rohart F , Gautier B , Singh A , Lê Cao KA . mixOmics: An R package for 'omics feature selection and multiple data integration . PLoS Computational Biology . 2017 ; 13 ( 11 ). 28. Ren X , Kuan PF . methylGSA: a Bioconductor package and Shiny app for DNA methylation data length bias adjustment in gene set testing . Bioinformatics . 2019 ; 35 ( 11 ): 1958 - 1959 . 29. Liberzon A , Birger C , Thorvaldsdóttir H , Ghandi M , Mesirov JP , Tamayo P. The Molecular Signatures Database Hallmark Gene Set Collection . Cell Systems . 2015 ; 1 ( 6 ): 417 - 425 . 30. Amberger JS , Bocchini CA , Scott AF , Hamosh A. OMIM. org: Leveraging knowledge across phenotype-gene relationships . Nucleic Acids Research . 2019 ; 47 ( D1 ). 31. Huang HY , Lin YCD , Cui S , et al. MiRTarBase update 2022 : An informative resource for experimentally validated miRNA-target interactions . Nucleic Acids Research . 2022 ; 50 ( D1 ): D222 - D230 . 32. Chen Y , Lun AAT , Smyth GK . From reads to genes to pathways: differential expression analysis of RNA-Seq experiments using Rsubread and the edgeR quasi-likelihood pipeline . F1000Research . 2016 ; 5 : 1438 . 33. Robinson MD , Oshlack A. A scaling normalization method for differential expression analysis of RNA-seq data . Genome Biology . 2010 ; 11 ( 3 ). 34. Benjamini Y , Hochberg Y. Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing . Journal of the Royal Statistical Society: Series B (Methodological) . 1995 ; 57 ( 1 ): 289 - 300 . 35. Liao Y , Wang J , Jaehnig EJ , Shi Z , Zhang B. WebGestalt 2019 : gene set analysis toolkit with revamped UIs and APIs . Nucleic Acids Research . 2019 ; 47 ( W1 ): W199 - W205 . 36. Harrell Jr FE. Hmisc: Harrell Miscellaneous 2023 . R package version 5 .0- 1 . 37. Kolde R. pheatmap: Pretty Heatmaps 2019 . R package version 1 .0.12. 38. Terry M. Therneau , Patricia M. Grambsch . Modeling Survival Data: Extending the Cox Model . New York: Springer 2000. 39. Kassambara A , Kosinski M , Biecek P. survminer: Drawing Survival Curves using 'ggplot2' 2021. R package version 0.4.9. 40. Smith HL , Wadhwani N , Horbinski C . Major Features of the 2021 WHO Classification of CNS Tumors . Neurotherapeutics . 2022 ; 19 ( 6 ): 1691 - 1704 . 41. Xiao S , Wang R , Wu X , Liu W , Ma S. The Long Noncoding RNA TP73-AS1 Interacted with miR-124 to Modulate Glioma Growth by Targeting Inhibitor of Apoptosis-Stimulating Protein of p53 . DNA and Cell Biology . 2018 ; 37 ( 2 ): 117 - 125 . 42. Kunitz A , Wolter M , Van Den Boom J , et al. DNA hypermethylation and Aberrant Expression of the EMP3 Gene at 19q13 . 3 in Human Gliomas . Brain Pathology . 2007 ; 17 ( 4 ): 363 - 370 . 43. Martija AA , Pusch S. The multifunctional role of emp3 in the regulation of membrane receptors associated with idh-wildtype glioblastoma . 2021 . 44. Kłopocka W , Korczyński J , Pomorski P. Cytoskeleton and Nucleotide Signaling in Glioma C6 Cells. in Advances in Experimental Medicine and Biology (Barańska J. , ed.); 1202 : 109 - 128Cham : Springer International Publishing 2020 . 45. Pang X , He X , Qiu Z , et al. Targeting integrin pathways: mechanisms and advances in therapy . Signal Transduction and Targeted Therapy . 2023 ; 8 ( 1 ). 46. Wang L , Chen Q , Chen Z , et al. EFEMP2 is upregulated in gliomas and promotes glioma cell proliferation and invasion . International Journal of Clinical and Experimental Pathology . 2015 ; 8 ( 9 ): 10385 - 10393 . 47. Braganhol E , Wink MR , Lenz G , Battastini AMO . Purinergic Signaling in Glioma Progression . in Advances in Experimental Medicine and Biology; 1202 : 87 - 108 2020. 48. Liu J , Zhu H. TMEM106A inhibits cell proliferation, migration, and induces apoptosis of lung cancer cells . Journal of Cellular Biochemistry . 2019 ; 120 ( 5 ): 7825 - 7833 . 49. Lin Y , Fan L , Zhang R , Pan H , Li Y. ARSD is responsible for carcinoma and amyloidosis of breast epithelial cells . European Journal of Cell Biology . 2022 ; 101 ( 2 ). 50. Zayas-Santiago A , Díaz-García A , Nuñez-Rodríguez R , Inyushin M. Accumulation of amyloid beta in human glioblastomas . Clinical and Experimental Immunology . 2020 ; 202 ( 3 ): 325 - 334 . 51. Sato K , Tahata K , Akimoto K. Five genes associated with survival in patients with lower-grade gliomas were identified by information-theoretical analysis . Anticancer Research . 2020 ; 40 ( 5 ): 2777 - 2785 . 52. Binder H , Willscher E , Loeffler-Wirth H , et al. DNA methylation, transcriptome and genetic copy number signatures of diffuse cerebral WHO grade II/III gliomas resolve cancer heterogeneity and development . Acta neuropathologica communications . 2019 ; 7 ( 1 ): 59 . 53. Chen R , Wu W , Liu T , et al. Large-scale bulk RNA-seq analysis defines immune evasion mechanism related to mast cell in gliomas . Frontiers in Immunology. 2022 ; 13 . 54. Pappula AL , Rasheed S , Mirzaei G , Petreaca RC , Bouley RA. A genome-wide profiling of glioma patients with an idh1 mutation using the catalogue of somatic mutations in cancer database . Cancers . 2021 ; 13 ( 17 ). 55. Adams JP , Anderson AE , Varga AW , et al. The A-type potassium channel Kv4.2 is a substrate for the mitogen-activated protein kinase ERK . Journal of Neurochemistry . 2000 ; 75 ( 6 ): 2277 - 2287 . 56. Shinoda M , Fukuoka T , Takeda M , Iwata K , Noguchi K. Spinal glial cell line-derived neurotrophic factor infusion reverses reduction of Kv4.1-mediated A-type potassium currents of injured myelinated primary afferent neurons in a neuropathic pain model . Molecular Pain . 2019 ; 15 . 57. Catalano M , D'Alessandro G , Trettel F , Limatola C . Role of Infiltrating Microglia/Macrophages in Glioma . in Advances in Experimental Medicine and Biology; 1202 : 281 - 298 2020. 58. Xue MY , Cao HX . LINC01551 promotes metastasis of nasopharyngeal carcinoma through targeting microRNA-132-5p . European review for medical and pharmacological sciences . 2020 ; 24 ( 7 ): 3724 - 3733 . 59. Tao B , Song Y , Wu Y , et al. Matrix stiffness promotes glioma cell stemness by activating BCL9L/Wnt/V-catenin signaling . Aging . 2021 ; 13 ( 4 ): 5284 - 5296 . 60. Jiang H , Bian W , Sui Y , et al. FBXO42 facilitates Notch signaling activation and global chromatin relaxation by promoting K63-linked polyubiquitination of RBPJ . Science Advances . 2022 ; 8 ( 38 ). 61. Li Q , Wu M , Fang G , et al. MicroRNA -186-5p downregulation inhibits osteoarthritis development by targeting MAPK1 . Molecular Medicine Reports . 2021 ; 23 ( 4 ). 62. Song H , Zhang Y , Liu N , et al. miR -92b regulates glioma cells proliferation, migration, invasion, and apoptosis via PTEN/Akt signaling pathway . Journal of Physiology and Biochemistry . 2016 ; 72 ( 2 ): 201 - 211 . 63. Ma J , Hou X , Li M , et al. Genome-wide methylation profiling reveals new biomarkers for prognosis prediction of glioblastoma . Journal of Cancer Research and Therapeutics . 2015 ; 11 : C212 - C215 . 64. Sun D , Liu M , Huang F . Bioinformatics analysis of expression and function of EXD3 gene in gastric cancer . Nan fang yi ke da xue xue bao = Journal of Southern Medical University. 2019 ; 39 ( 2 ): 215 - 221 . 65. Li T , Yang Z , Li H , et al. Phospholipase CW1 (PLCG1) overexpression is associated with tumor growth and poor survival in IDH wild-type lower-grade gliomas in adult patients . Laboratory Investigation . 2022 ; 102 ( 2 ): 143 - 153 .