A new set of Answer ALS project data is now available through the Answer ALS Data Portal. This release marks another milestone, bringing the shared data to 63% of the estimated 150 TB that the project will generate and openly share. This massive trove of data from multiple biological assessments, or “multi-omics,” has been combined with study participants’ clinical data and is now publicly released. The datasets are all derived from individual Answer ALS study participants.
“These biological and clinical data lay the foundation for researchers around the world to help tackle the challenge of understanding ALS subgroups, real human-relevant pathways, and the tools to find individuals’ disease-specific drugs,” said Dr. Jeffrey Rothstein, Director of the Robert Packard Center for ALS Research at the Johns Hopkins University School of Medicine. He continued, “Representing over half of the data derived from 1,000 ALS patient participants in the study, this data can be the foundation for companies and labs to earnestly tackle ALS biology.”
The newly released epigenomic, transcriptomic, and proteomic data complement the genomic and clinical data previously released for the same study participants. Together, the multi-omic and clinical datasets will help researchers compare the genetic backgrounds of case and control populations. These comparisons are important for understanding the differences between people with and without ALS and, notably, will also help researchers identify potential differences between subgroups of ALS patients. As the number of participants with a complete set of clinical and multi-omic data grows, researchers will be able to use deep learning models to make better predictions about the causes of the disease. This will increase the likelihood of discovering genes and pathways that drive ALS disease progression.
With each new data release, the accuracy of these discoveries and predictions increases. To further help researchers understand and interpret the data, this release also incorporates preliminary analyses including joint genotyping, RNA gene expression, and protein intensity matrices. These processed data files help researchers identify gene variants and affected pathways across populations.
Ed Rapp, Answer ALS Advisory Board Chair and a person living with ALS, said, “As I’ve said before, breakthroughs in ALS are like dominoes. You have to get the first domino to fall and once you do, it leads you down a path to success.” Rapp added, “With this data release, we are tipping the dominoes by deploying exponential computing capabilities that allow large data storage, genetic sequencing in an hour, and deeper machine learning into the disease. That effort will then facilitate all other dominoes by the discovery of ALS subgroups and ultimately treatments or a cure.”
Emily Baxi, Ph.D., executive director of the Robert Packard Center for ALS Research at Johns Hopkins, added, “Answer ALS researchers continue to build 1,000’s of patient profiles, constructed piece by piece from multiple sources of data. Openly sharing and using the power of AI and machine learning to integrate and analyze these profiles will ultimately help uncover ALS patient subgroups and identify the most effective treatment strategies for each.”