
Number and Type of Mutations Relate to Risk of Death at Earlier Age, Suggesting Focus for Future Detection and Prevention
Sudden death in children is rare but devastating. A team led from Northwestern University used PSC’s flagship Bridges-2 system to study the full DNA sequences of young people up to 20 years old who had died suddenly, comparing them with the genomes of people who had not. Using known sudden-death-associated genes as well as suspected risk-carrying DNA sequences identified by artificial intelligence (AI), the scientists found that both the type of gene affected and a larger number of risk-carrying variations were associated with sudden death at a younger age. The hope is that the findings will help doctors learn to identify and reduce the risk of sudden death in young people.
WHY IT’S IMPORTANT
“Sudden death syndrome” — unexpected death in apparently healthy children — is rare but devastating. It affects about two children out of 100,000 aged one to 17 years. It’s more common in infants under one year old, striking 120 out of 100,000. Even more tragic, some families have a higher genetic risk for sudden death and suffer the loss of multiple children.
“It’s horrible when [sudden death] happens to anyone. But it’s especially difficult when it happens to a younger person and families are left trying to figure out why … As bad as sudden death is, happening twice in a family is far more than twice as bad.”
— Elizabeth McNally, Northwestern University
With colleagues at Northwestern University and elsewhere, Northwestern cardiologist Elizabeth McNally wanted to take a deep dive into the genetic causes of sudden death in young people. To do this, the scientists turned to two unique resources. One was the Sudden Death in the Young Case Registry, an effort by the National Institutes of Health and Centers for Disease Control and Prevention that collects clinical data and DNA specimens from sudden death cases in people younger than 20 years old. The other was the ecosystem of supercomputers offered by the National Science Foundation’s ACCESS program, including PSC’s flagship Bridges-2.
HOW PSC HELPED
Comparing patients’ entire DNA sequences — their genomes — requires studying many different human genomes simultaneously. There is normally a lot of variation in any single genome, so compiling the data from many genomes at the same time helps sort out what is really rare and unusual. Doing this, though, takes powerful computing.
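To make the idea concrete, here is a minimal sketch of how pooling many genomes separates common variation from rare variation. It is an illustration only, not the team’s pipeline: the function name `find_rare_variants`, the 1 percent cutoff, and the toy cohort are all assumptions made for this example.

```python
from collections import Counter


def find_rare_variants(genomes, max_frequency=0.01):
    """Return the variants carried by at most max_frequency of the genomes.

    genomes maps a person ID to the set of variants that person carries.
    The 1% default cutoff is an assumption chosen for illustration.
    """
    carrier_counts = Counter()
    for variants in genomes.values():
        # set() guards against counting one person twice for the same variant
        carrier_counts.update(set(variants))
    total = len(genomes)
    return {v for v, count in carrier_counts.items() if count / total <= max_frequency}


# Toy cohort (made up): 200 people share one common variant,
# and a single person carries one extra variant.
genomes = {f"person_{i}": {"common_A"} for i in range(200)}
genomes["person_7"].add("rare_B")

print(find_rare_variants(genomes))  # {'rare_B'}
```

With only a handful of genomes, a variant seen once could not be told apart from a common one. The larger the pooled cohort, the more confidently a variant can be called rare, which is why the whole cohort has to be held in memory and compared at once.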
Searching whole genome sequences for mutations hidden in the DNA is like a massive game of Concentration. Unlike, say, calculating the flow of air over the design for a new jet, it isn’t about splitting a giant math problem into tiny pieces that can be number-crunched in parallel. Instead, the computer compares the DNA sequences from many people at the same time to identify the very rare events that create vulnerability for the heart.
Scientists probably haven’t identified all the risk-carrying sequences for heart disorders or epilepsy, the two main suspects in sudden death. Because of that, the Northwestern-led team also used an AI tool that compared known disease gene sequences with the patients’ sequences to try to spot unknown variations that also carry risk.
“The complexity of trying to look at [the] whole genome sequence is not the same as when you only look at 150 genes … That’s where [Bridges-2] came in … Our first pass was to look at those same genes that we would [test for] in the clinical setting … But of course we have available all the rest of the genome to begin to better predict risk.”
— Elizabeth McNally, Northwestern University
For that, the team would need a computer with a lot of memory, the equivalent of the RAM in a laptop. To analyze enough patients’ genomes for good statistical results, they also needed to speed things up by processing many genomes in parallel. That combination of needs, many processors each with access to large amounts of shared memory, brought them first to Stampede2 at ACCESS member Texas Advanced Computing Center, and then to Bridges-2 at PSC, also a leading member of ACCESS.
Bridges-2 offers massive memory in multiple nodes. Its 504 regular memory nodes each have either 256 or 512 gigabytes of memory. That compares with 16 to 32 GB of memory, total, in a typical high-end laptop. Its four extreme memory nodes each sport a fearsome 4,000 GB of RAM.
McNally and her colleagues compared the genomes of 211 children in the sudden death database to 211 randomly selected people with matched sex and ancestry who had not died suddenly. They drew those “control” sequences from the 1000 Genomes Project database. The team used Bridges-2 to search for sequences in genes associated with cardiomyopathy (disorders of the heart muscle), arrhythmia (irregular heartbeats), and epilepsy, which had been linked to sudden death in infants.
The genomes of kids who had died suddenly showed a higher frequency of rare DNA variations in these genes. Kids who had a larger number of rare genetic variants in specific genes also had a higher risk of earlier death. In children older than one year, the risk of sudden death came largely from variation in cardiomyopathy or arrhythmia genes. In infants under one year, though, the risk was tied more to variation in epilepsy genes. The relationship between epileptic seizures, heart events, and sudden death will be a focus of future work for the Northwestern-led group.
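The kind of comparison described here, counting rare variants in a set of genes for each person and then comparing the sudden-death group with the control group, can be sketched in a few lines. This is a hedged illustration, not the published analysis: the gene names, variant labels, and the helper `rare_variant_burden` are hypothetical, the data are made up, and a real study would apply statistical tests rather than simply comparing averages.

```python
from statistics import mean


def rare_variant_burden(person_variants, rare_variants, gene_panel):
    """Count one person's rare variants that fall inside the gene panel.

    person_variants is a list of (gene, variant) pairs for one person.
    """
    return sum(
        1
        for gene, variant in person_variants
        if variant in rare_variants and gene in gene_panel
    )


# Hypothetical inputs, invented for this example only.
gene_panel = {"GENE_A", "GENE_B"}          # stand-in for a disease gene list
rare_variants = {"v1", "v2", "v3", "v5"}   # variants already flagged as rare

cases = {
    "case_1": [("GENE_A", "v1"), ("GENE_B", "v2")],
    "case_2": [("GENE_A", "v3")],
}
controls = {
    "control_1": [("GENE_A", "v4")],       # in the panel, but not rare
    "control_2": [("GENE_C", "v5")],       # rare, but outside the panel
}

case_burden = mean(rare_variant_burden(v, rare_variants, gene_panel) for v in cases.values())
control_burden = mean(rare_variant_burden(v, rare_variants, gene_panel) for v in controls.values())

print(case_burden, control_burden)  # 1.5 0
```

Swapping in a different gene panel, for example one for epilepsy rather than heart conditions, is how such a count could be broken down by gene type and then examined against age at death.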
The AI tool also identified several gene variants not seen before that seem to increase risk of sudden death. These sequences could become part of future diagnostic tools to identify sudden death risk and prevent it from happening.
The current paper only looks at DNA sequences that direct the body to make proteins. Human DNA also contains a lot of noncoding sequences, which don’t direct protein synthesis. Scientists aren’t exactly sure what this DNA does, but they suspect it may play a role in regulating the activity of coding DNA. Future investigations may include analysis of the sudden death risk of these sequences. The current work sets the stage for McNally and her collaborators to tackle this even larger computational problem.