Abstract
The Brain Data Alchemy Project, year 3: Public release of a pipeline for teaching research reproducibility and discovery science while mining gold from archived genomics data
Nguyen DM, Hagenauer M, Duan TQ, Flandreau EI, Rhoads C, Xiong J, Bader A, Gyles TM, Hughes BW, Mclain C, Geoghegan E, Drozman A, Bhuiyan MR, Espinoza S, Lewis A, Mensch S, Chennupati L, Nestler EJ, Watson Jr SJ, Akil H
53rd Annual Meeting of the Society for Neuroscience. 2024.
During the past decade, the landscape of neuroscience research has undergone two major transformations in the way that data are collected, analyzed, and interpreted. First, there has been an intensive push to reform scientific practices to improve research reproducibility. Second, accelerated growth in computing power and ‘omics knowledge has led to a blossoming of "discovery science". In this new landscape, trainees need to acquire skills that are not included in traditional curricula.
We have addressed this need by creating an intensive summer program that provides direct, hands-on experience with experimental design and statistical issues related to research reproducibility and discovery science. During the program, trainees conduct a systematic meta-analysis focused on a chosen neuroscience topic using a burgeoning trove of publicly available transcriptional profiling datasets (>15,000 microarray and RNA-Seq datasets).
We successfully piloted the program in 2022 (n=6 trainees), 2023 (n=5 trainees), and 2024 (n=8 trainees). Across a single summer session of 9-10 weeks, participants learned R coding and literature survey methods, and completed comprehensive genomics meta-analyses capable of serving as either preliminary data for grants or small-scale publications. The topics chosen by the trainees for their meta-analyses were diverse, spanning areas such as antidepressant usage, cocaine exposure, chronic stress, and glioblastoma. Each of the meta-analyses revealed a set of differentially expressed genes that can shed light on neuropsychiatric or neurological disorders.
Following three successful pilots, we have standardized the curriculum and meta-analysis pipeline and made them publicly available. For the curriculum, we have provided a week-by-week overview of the activities, readings, and resources used to guide the summer research projects. For the projects, we have registered the meta-analysis pipeline on Protocols.io and GitHub in a manner that can be easily referenced, reproduced, or adapted, including templates for pre-registering the projects within the Open Science Framework (OSF.io), creating PRISMA diagrams for search terms and inclusion/exclusion procedures, and reporting results in a standardized format.
In our poster, we will provide a tour of these publicly available materials as well as a detailed example of a meta-analysis project following our pipeline from beginning to end.