(Higher Education Institutional Excellence Programme of the Ministry for Innovation
and Technology in Hungary, within the framework of the Bionic thematic programme of
the Semmelweis University)
(ÚNKP-19-3-IV-SE-5) Funder: Ministry for Innovation and Technology
(ELIXIR Hungary)
Subject areas:
Clinical medicine
Oncology
Medical and health sciences
Large oncology repositories have paired genomic and transcriptomic data for all patients.
We used these data to perform two independent analyses: to identify gene expression
changes related to a gene mutation and to identify mutations altering the expression
of a selected gene. All data processing steps were performed in the R statistical
environment. RNA-sequencing and mutation data were acquired from TCGA. The DESeq2
algorithm was applied for RNA-seq normalization, and transcript variants were annotated
with BioMart. Somatic mutations called by MuTect2 were used, and the MAFtools
Bioconductor package was used to summarize them. The Mann-Whitney test was used
for differential expression analysis. The established database contains 7,876 solid
tumors from 18 different tumor types with both somatic mutation and RNA-seq data.
The utility of the approach is presented via three analyses in breast cancer: gene
expression changes related to TP53 mutations, gene expression changes related to CDH1
mutations, and mutations resulting in altered PGR expression. The breast cancer database
was split into equally sized training and test sets, and these datasets were analyzed
independently. The highly significant overlap of the results (chi-square statistic
= 16719.7, p < 0.00001) validates the presented pipeline. Finally, we set up a portal
at http://www.mutarget.com enabling the rapid identification of novel mutational targets.
By linking somatic mutations and gene expression, it is possible to identify biomarkers
and potential therapeutic targets in different types of solid tumors. The registration-free
online platform can increase the speed and reduce the development cost of novel personalized
therapies.
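
The core comparison described above, testing whether the expression of a gene differs between tumors with and without a given mutation, can be illustrated with a short sketch. The authors worked in R; the Python below is not their implementation. It is a minimal illustration of a two-sided Mann-Whitney test using the normal approximation with tie and continuity corrections. The function names, the sample identifiers, and the toy expression values are hypothetical and are not taken from TCGA.

```python
import math
from itertools import groupby
from statistics import median


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation).

    Returns (U statistic for x, p-value). Ties receive average ranks,
    and the variance is corrected for ties.
    """
    n1, n2 = len(x), len(y)
    if n1 == 0 or n2 == 0:
        raise ValueError("both groups must be non-empty")
    n = n1 + n2
    # Pool the values, remembering which group each one came from.
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y],
                    key=lambda t: t[0])
    rank_sum_x = 0.0
    tie_term = 0.0
    pos = 0  # number of values already ranked
    for _, block in groupby(pooled, key=lambda t: t[0]):
        block = list(block)
        t = len(block)
        avg_rank = pos + (t + 1) / 2.0  # mean of ranks pos+1 .. pos+t
        rank_sum_x += avg_rank * sum(1 for _, g in block if g == 0)
        tie_term += t ** 3 - t
        pos += t
    u_x = rank_sum_x - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u_x, 1.0  # all values identical, no evidence of a shift
    # Continuity-corrected z score, floored at zero.
    z = max((abs(u_x - mean_u) - 0.5) / math.sqrt(var_u), 0.0)
    p_value = min(math.erfc(z / math.sqrt(2.0)), 1.0)
    return u_x, p_value


def compare_by_mutation(expression, mutated_samples):
    """Compare one gene's expression between mutant and wild-type tumors.

    expression: dict mapping sample ID -> normalized expression value
    mutated_samples: set of sample IDs carrying the mutation of interest
    """
    mutant = [v for s, v in expression.items() if s in mutated_samples]
    wild_type = [v for s, v in expression.items() if s not in mutated_samples]
    u_stat, p_value = mann_whitney_u(mutant, wild_type)
    return {
        "n_mutant": len(mutant),
        "n_wild_type": len(wild_type),
        "median_mutant": median(mutant),
        "median_wild_type": median(wild_type),
        "u": u_stat,
        "p": p_value,
    }


if __name__ == "__main__":
    # Hypothetical toy data: ten samples, five of them flagged as mutated.
    toy_expression = {f"s{i}": float(i) for i in range(1, 11)}
    toy_mutated = {"s1", "s2", "s3", "s4", "s5"}
    print(compare_by_mutation(toy_expression, toy_mutated))
```

In a genome-wide screen the same comparison would be repeated for every gene, so the resulting p-values would normally need a multiple-testing correction; the abstract does not state which correction, if any, was applied.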