Integrated metabolome and transcriptome analysis of the NCI60 dataset

Gang Su; Charles F Burant; Christopher W Beecher; Brian D Athey; Fan Meng
BMC Bioinformatics. 2011; 12(Suppl 1):S36.


Background: Metabolite profiles can be used to identify molecular signatures and mechanisms underlying diseases because they reflect the outcome of complex upstream genomic, transcriptomic, proteomic and environmental events. The scarcity of publicly accessible large-scale metabolome datasets related to human disease has been a major obstacle both to assessing the potential of metabolites as biomarkers and to understanding the molecular events underlying disease-related metabolic changes. The availability of metabolite and gene expression profiles for the NCI-60 cell lines offers the possibility of identifying significant metabolome and transcriptome features and discovering unique molecular processes related to different cancer types.

Methods: We used a combination of analytical methods in the R statistical environment to evaluate metabolic features associated with cancer cell lines from different tissues of origin, identify metabolite-gene correlations, and detect outlier cell lines based on metabolome and transcriptome data. The statistical results were integrated with metabolic pathway annotations and with the COSMIC and Tumorscape databases to explore the associated molecular mechanisms.
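
As an illustration only, the idea of correlating a gene with a metabolite across cell lines and then looking for an outlier cell line can be sketched as follows. This is a minimal sketch under stated assumptions: the abstract does not say which correlation measure or outlier criterion was used, so Pearson correlation and a leave-one-out check are stand-ins chosen here, and the cell line names and values are made-up toy data, not NCI-60 measurements.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


def leave_one_out_shift(gene, metabolite):
    """Return the full correlation and, per cell line, the change in the
    correlation when that cell line is left out (assumed outlier heuristic)."""
    full = pearson(gene, metabolite)
    shifts = []
    for i in range(len(gene)):
        g = gene[:i] + gene[i + 1:]
        m = metabolite[:i] + metabolite[i + 1:]
        shifts.append(pearson(g, m) - full)
    return full, shifts


# Hypothetical toy data: one gene and one metabolite measured in six cell lines.
cell_lines = ["line_A", "line_B", "line_C", "line_D", "line_E", "line_F"]
gene = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
metabolite = [1.1, 2.0, 2.9, 4.2, 5.0, 0.5]

full_r, shifts = leave_one_out_shift(gene, metabolite)

# The cell line whose removal changes the correlation the most is the
# candidate outlier for this gene-metabolite pair.
idx = max(range(len(shifts)), key=lambda i: abs(shifts[i]))
candidate = cell_lines[idx]
print(f"full r = {full_r:.3f}, candidate outlier = {candidate}")
```

In this toy example the gene and metabolite track each other closely in every cell line except the last, so removing that line changes the correlation far more than removing any other. A cell line flagged in this way would then be a candidate for follow-up against mutation or copy number records.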

Results: Our analysis reveals that although the NCI-60 metabolome dataset is quite noisy compared with microarray-based transcriptome data, it does contain signatures specific to tissue of origin. We also identified biologically meaningful gene-metabolite associations. Most remarkably, several abnormal gene-metabolite relationships identified by our approach can be directly linked to known gene mutations and copy number variations in the corresponding cell lines.

Conclusions: Our results suggest that integrative metabolome and transcriptome analysis is a powerful method for understanding the molecular machinery underlying various pathophysiological processes. We expect that the availability of large-scale metabolome data in the coming years will significantly promote the discovery of novel biomarkers, which will in turn improve the understanding of the molecular mechanisms underlying diseases.