
Proving the Power of Combinatorial Analytics

Validation of COVID-19 Insights


by Krystyna Taylor, Senior Portfolio Manager, PrecisionLife


At a time when the genetic predisposition to severe COVID-19 was still a mystery, PrecisionLife’s groundbreaking 2020 study on risk factors in severe COVID-19 patients [1] identified 68 novel genes associated with serious disease and hospitalization in the first small UK Biobank COVID-19 population. Over 70% of these gene targets have since been independently validated in the global COVID-19 scientific literature, in work ranging from massively powered multinational GWAS meta-analyses to phase 2 clinical trials.

In May 2020 SARS-CoV-2 was a new threat whose clinical implications for different patients were unclear.

There were no approved vaccines for COVID-19, the RECOVERY trial results had ended hopes of any clinical benefit of hydroxychloroquine, and the only genetic associations identified for severe COVID-19 risk were linked to the ABO gene and a narrow region on chromosome 3 [2]. We didn’t understand the range of symptoms of the disease, or why some patients had mild symptoms while others developed life-threatening disease.

At PrecisionLife we seized the opportunity to analyze the first small cohort of 779 severe COVID-19 patients reported in the UK Biobank. We wanted to shed light on which host genes were implicated in the most serious forms of the disease, and why there were so many different symptoms.


Finding the first genes linked to hospitalization risk

Our analysis, completed and written up two weeks later, demonstrated that 68 protein-coding genes had a significant association with hospitalization risk. For all but two of these genes, there was no published evidence linking them to COVID-19. However, PrecisionLife categorized them into six main pathways that could be associated with a dysregulated host immune response to SARS-CoV-2:

  1. Viral replication
  2. Inflammation
  3. Endothelial cell dysfunction (associated with micro-coagulation)
  4. Lipid droplet biology (associated with reduced cell membrane integrity)
  5. Wnt/β-catenin signaling (associated with senescence and apoptosis)
  6. Alzheimer’s-associated neurodegeneration (MAPT region)

This provided a range of plausible – if untested – hypotheses around the biological impact of variants in these genes on COVID-19 severity and symptoms.

Worldwide research validates combinatorial analytics

The global scientific efforts since we completed our initial analysis in June 2020 have been prodigious.

Of the 68 genes identified in our initial genetic analysis using combinatorial analytics, 48 have now been independently reported in published articles in relation to SARS-CoV-2 replication or the development of severe COVID-19.
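The "over 70%" figure quoted in this article follows directly from these two counts. As a quick check, here is a minimal sketch; the function name `validated_share` is ours, chosen for illustration only:

```python
def validated_share(validated: int, total: int) -> float:
    """Return the share of validated targets as a percentage."""
    return 100.0 * validated / total

# 48 of the 68 genes from the initial analysis have independent published support.
gene_share = validated_share(48, 68)
print(f"{gene_share:.1f}% of gene targets validated")
```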

The significance of some targets has been replicated in larger multinational genetic studies, whilst others have been evaluated in preclinical in vitro studies or are targeted by drugs currently in phase 2 clinical trials for COVID-19.

This body of evidence provides retrospective validation for genetic signals that were entirely novel to the disease at the time of our initial COVID-19 work at PrecisionLife.


Disease architecture showing high-resolution patient stratification from PrecisionLife analysis of severe COVID-19 patients in May 2020. Colors represent patient subgroups, circles represent disease-associated SNPs, and lines represent co-associated SNPs.

Examples of novel discoveries retrospectively validated:

1. Genes

After analyzing fewer than 1,000 cases, PrecisionLife was the first to report a significant association of five genes from the 17q21.31 locus (KANSL1, LINC02210, MAPT, STH and SPPL2C) with severe COVID-19 development.

These have subsequently been validated by results from the COVID-19 Host Genetics Initiative in July 2021 [3], which performed three genome-wide association meta-analyses covering up to 49,562 COVID-19 patients and over 2 million controls, and demonstrated a significant association between this group of genes and hospitalization after SARS-CoV-2 infection.

2. Proteins & Pathways

There is also now significant preclinical evidence as to how many of the genetic variants identified by PrecisionLife impact the course of COVID-19 disease development.

For example, we hypothesized that variants in MLKL could lead to over-activation of the necroptotic inflammatory response, resulting in organ damage, and proposed that an MLKL inhibitor, necrosulfonamide, could reduce COVID-19-associated inflammation. There is now evidence that MLKL is highly upregulated in immune and epithelial cells in the lungs of severe COVID-19 patients and that necrosulfonamide inhibits the release of the inflammatory cytokine IL-1β in response to SARS-CoV-2 [4].

3. Drugs

PrecisionLife’s genetic signals indicated the potential of 29 drugs and clinical candidates for use in treatment strategies to improve clinical outcomes in severe COVID-19 patients.

Thirteen of these candidates have now been evaluated in COVID-19, in studies ranging from preclinical in vitro work on SARS-CoV-2 replication and inflammatory cytokine secretion to 14 different clinical trials (recruiting, active, or completed) measuring the impact of these drugs on disease severity in COVID-19 patients.

The results of many of these clinical trials have not yet been posted. In one trial that has reported, dutasteride (an inhibitor of the PrecisionLife COVID-19 gene target SRD5A1) reduced SARS-CoV-2 viral shedding and inflammation in a randomized, double-blind, placebo-controlled interventional study.

Combinatorial analytics finds deeper insights faster from smaller patient datasets

We performed our analysis using only genetic data from fewer than 1,000 COVID-19 patients in the UK Biobank, at a time when standard analytical approaches such as GWAS had failed to find useful signals in much larger populations.
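To illustrate why looking at combinations of variants can surface signals that single-variant tests miss, here is a minimal sketch on synthetic data. It is not PrecisionLife's method or platform: the genotypes are invented for illustration, the function names `odds_ratio` and `scan` are hypothetical, and a real analysis would need proper significance testing and multiple-testing correction.

```python
from itertools import combinations

def odds_ratio(carrier_cases, carrier_controls, n_cases, n_controls):
    """Odds ratio of carrying a feature, with a 0.5 continuity correction."""
    a = carrier_cases + 0.5
    b = n_cases - carrier_cases + 0.5
    c = carrier_controls + 0.5
    d = n_controls - carrier_controls + 0.5
    return (a * d) / (b * c)

def scan(genotypes, is_case, order):
    """Score every set of `order` SNPs by the odds ratio of carrying all of them."""
    n_cases = sum(is_case)
    n_controls = len(is_case) - n_cases
    scores = {}
    for combo in combinations(range(len(genotypes[0])), order):
        in_cases = in_controls = 0
        for genotype, case in zip(genotypes, is_case):
            if all(genotype[i] for i in combo):
                if case:
                    in_cases += 1
                else:
                    in_controls += 1
        scores[combo] = odds_ratio(in_cases, in_controls, n_cases, n_controls)
    return scores

# Synthetic data, invented for illustration: 8 cases then 8 controls, 3 SNPs each.
genotypes = [(1, 1, 0)] * 4 + [(0, 0, 1)] * 4 + [(1, 0, 0)] * 4 + [(0, 1, 1)] * 4
is_case = [True] * 8 + [False] * 8

single_scores = scan(genotypes, is_case, order=1)  # one SNP at a time
pair_scores = scan(genotypes, is_case, order=2)    # pairs of SNPs
top_pair = max(pair_scores, key=pair_scores.get)
print(single_scores)
print(top_pair, pair_scores[top_pair])
```

In this toy dataset each SNP on its own is carried equally often by cases and controls, so a one-at-a-time scan sees nothing, while the pair of SNP 0 and SNP 1 occurs only in cases and stands out clearly.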

Although the targets identified by PrecisionLife were almost entirely novel at the time of publication, over 70% of them are now supported by strong scientific evidence in SARS-CoV-2 infection or COVID-19 development. This shows the power of the combinatorial analytics approach to identify the drivers of complex disease biology in specific patient subgroups, with implications for improving novel target discovery, precision drug development, clinical trials, and personalized healthcare.

Further information on our COVID-19 research, including links to our published studies, is available at



  2. COVID-19 Host Genetics Initiative (2021). Mapping the human genetic architecture of COVID-19. Nature, 600(7889), 472–477.
  3. Xu, G., Li, Y., Zhang, S., et al. (2021). SARS-CoV-2 promotes RIPK1 activation to facilitate viral propagation. Cell Research, 31(12), 1230–1243.
  4. Cadegiani, F. A., McCoy, J., Gustavo Wambier, C., & Goren, A. (2021). Early Antiandrogen Therapy With Dutasteride Reduces Viral Shedding, Inflammatory Responses, and Time-to-Remission in Males With COVID-19: A Randomized, Double-Blind, Placebo-Controlled Interventional Trial (EAT-DUTA AndroCoV Trial - Biochemical). Cureus, 13(2), e13047.
