How to find druggable targets for chronic diseases

PrecisionLife brings new insight into chronic disease biology through its unique patented combinatorial analytics platform.

Linking Targets to Patients

Despite billions invested in the development of new drugs, the average ROI on drug R&D fell from 10.8% to 1.8% in the last decade (Deloitte). This is not sustainable. Pharma companies have responded by focusing on early identification of risk in R&D to improve productivity, with significant success. AstraZeneca increased its productivity from inception to successful Phase III roughly fivefold, from 4% to 19%, using its 5Rs framework: selecting the Right Target (strong link to disease), expressed in the Right Tissue, with the Right Safety, Right Patient, and Right Commercial profile.

Nevertheless, many failures still come late, in Phase III, making them expensive and disruptive. Of the drugs that fail in Phase III, ~60% are efficacy failures, caused by a mismatch between the selected target and its effects on disease processes in the recruited trial population. Choosing the right target, based on an understanding of how its biology is influenced in your potential patient population, is both hard and pivotal to successful drug R&D.

At PrecisionLife, we seek to understand the druggability and efficacy potential of targets for a disease in the context of high-resolution stratification of patient subgroups. This enables us to prioritize novel drug targets together with the patient stratification analytics and biomarkers that evaluate their likely efficacy in a chosen patient population.

Beyond Cancer and Rare Disease

Cancer and rare diseases are often driven by single mutations, frequently in protein-coding regions, which may result in a change in the structure and function of a key protein. These signals are relatively easy to detect using Genome-Wide Association Studies (GWAS).

In contrast, most common chronic diseases are more complex and have a much broader range of causes and influences than the single variant/gene-centric approach of GWAS can detect (Tam V, 2019; Skol, 2016; Walsh, 2020). Chronic diseases are highly heterogeneous, both in their causative factors and in patients’ symptoms and treatment responses, because tens or even hundreds of genetic variants and other external factors contribute to the underlying disease processes.

When we can identify and quantify the combinatorial effects of all of the disease-causing factors properly, we can accurately stratify diseases into multiple patient subgroups (Ahlqvist E, 2018). Patients in a given subgroup are more likely to have the same disease causes and to benefit from the same drugs. This is fundamental to our strategy for finding novel disease targets.

Example Case Study in COVID-19

Our project datasets can include genomic, expression, metabolic or clinical data. For example, in our recent Severe COVID-19 Risk study, we built the dataset from UK Biobank, which at the time contained 725 severe COVID-19-positive patients (who were hospitalized or died from the disease). Controls were mild/asymptomatic (non-hospitalized) COVID-19-positive patients, gender-matched in a ratio of 2:1 against cases. The genotype data for the cohort contained 542,245 SNPs.
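To make the cohort design concrete, here is a minimal sketch of gender-matched 2:1 control sampling. It is an illustration only, not the procedure used in the study: the function name `match_controls`, the record layout and the toy data are all assumptions. With 725 cases, a 2:1 ratio would correspond to 1,450 controls.

```python
import random
from collections import defaultdict


def match_controls(cases, control_pool, ratio=2, seed=0):
    """Sample `ratio` controls per case from the same gender stratum."""
    rng = random.Random(seed)

    # Group the candidate controls by gender.
    pool_by_gender = defaultdict(list)
    for person in control_pool:
        pool_by_gender[person["gender"]].append(person)

    # Count how many cases fall in each gender stratum.
    cases_by_gender = defaultdict(int)
    for person in cases:
        cases_by_gender[person["gender"]] += 1

    matched = []
    for gender, n_cases in cases_by_gender.items():
        needed = ratio * n_cases
        available = pool_by_gender[gender]
        if len(available) < needed:
            raise ValueError(
                f"not enough {gender} controls: need {needed}, have {len(available)}"
            )
        # Sample without replacement so no control is used twice.
        matched.extend(rng.sample(available, needed))
    return matched


# Toy data (hypothetical): 3 severe cases and 10 mild/asymptomatic candidates.
cases = [{"id": f"case{i}", "gender": g} for i, g in enumerate("FMM")]
control_pool = [{"id": f"ctrl{i}", "gender": g} for i, g in enumerate("FFFFMMMMMM")]
controls = match_controls(cases, control_pool)
print(len(controls), "controls selected for", len(cases), "cases")
```

Matching here is done per stratum rather than per individual case, which is enough to preserve the overall gender balance; a real study might also match on age or other covariates.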

We used our PrecisionLife® platform to stratify patients at high resolution based upon combinations of SNPs (disease signatures) that are associated with severe disease response. This is built on our patented mathematical framework, which is able to traverse very large problem spaces (10^300) in a robust, reproducible and computationally efficient manner.
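To give a feel for why combinatorial problem spaces grow so quickly, the sketch below counts the possible k-SNP genotype combinations for the 542,245 SNPs in this cohort. It assumes three genotype states per SNP and does not attempt to reproduce how the platform's problem-space figure is derived; the function names are illustrative.

```python
import math

N_SNPS = 542_245        # SNP count reported for the cohort
GENOTYPES_PER_SNP = 3   # assumption: three genotype states per biallelic SNP


def n_combinations(k, n_snps=N_SNPS, states=GENOTYPES_PER_SNP):
    """Count distinct k-SNP genotype combinations: C(n, k) * states**k."""
    return math.comb(n_snps, k) * states ** k


def order_of_magnitude(x):
    """Return floor(log10(x)) for a positive integer, using exact arithmetic."""
    return len(str(x)) - 1


for k in (1, 2, 3, 5, 10):
    magnitude = order_of_magnitude(n_combinations(k))
    print(f"k={k:2d}: about 10^{magnitude} combinations")
```

Even small values of k put exhaustive enumeration out of reach, which is why a search strategy that avoids visiting every combination is needed.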

Generation of Candidate Targets

Our combinatorial analysis enables us to find signal in patient datasets that traditional methods miss. For example, GWAS analysis of the dataset found only a single lincRNA-coding SNP that was reported as significant in severe COVID-19 patients.

Figure 1: Manhattan plot, generated using PLINK, of genome-wide p-values of association for the severe COVID-19 UK Biobank cohort (October 2020). The horizontal red and blue lines mark the thresholds at p<5e-08 (genome-wide significance) and p<1e-05 (commonly used as a suggestive threshold), respectively.
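For readers less familiar with single-SNP association testing, the sketch below shows a simplified allelic chi-square test and how a p-value would be classified against the two thresholds drawn in Figure 1. It is a teaching example using only the Python standard library, not PLINK's implementation, and the function names are assumptions.

```python
import math

GENOME_WIDE = 5e-8   # red line in Figure 1
SUGGESTIVE = 1e-5    # blue line in Figure 1


def allelic_chi2_p(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Pearson chi-square test (1 d.f.) on a 2x2 table of allele counts."""
    table = [[case_alt, case_ref], [ctrl_alt, ctrl_ref]]
    row_sums = [case_alt + case_ref, ctrl_alt + ctrl_ref]
    col_sums = [case_alt + ctrl_alt, case_ref + ctrl_ref]
    total = sum(row_sums)
    if 0 in row_sums or 0 in col_sums:
        raise ValueError("every row and column must contain at least one allele")

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_sums[i] * col_sums[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected

    # Survival function of a chi-square with 1 d.f.: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


def classify(p_value):
    """Label a p-value against the two Manhattan-plot thresholds."""
    if p_value < GENOME_WIDE:
        return "genome-wide significant"
    if p_value < SUGGESTIVE:
        return "suggestive"
    return "not significant"


# Toy allele counts (hypothetical), not taken from the study.
chi2, p = allelic_chi2_p(case_alt=10, case_ref=20, ctrl_alt=20, ctrl_ref=10)
print(f"chi2={chi2:.3f}, p={p:.4f}, {classify(p)}")
```

Because each SNP is tested on its own, a variant that only matters in combination with others can easily fall below both lines.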

Compare this result with that of the PrecisionLife platform, which found 2,535 combinations of SNPs (disease signatures that are highly associated with the development of severe COVID-19) in the same dataset. Of the significant SNPs generated, 99% were found in combinations with two or more other SNP genotypes and would therefore not have been identified using standard GWAS techniques (Figure 2). 100% of severe cases were represented by the disease signatures found in this study.

Metric                              Value
Combinatorial Disease Signatures    2,535
RF Scored SNPs                      168
RF Scored Genes                     86
Penetrance (%)                      100

Table 1. Summary of the results generated by the PrecisionLife platform for the severe COVID-19 UK Biobank Study
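As an illustration of what it means for cases to be "represented by" disease signatures, the sketch below treats a signature as a set of SNP genotypes that must all be present in a patient, and computes the share of cases matched by at least one signature. This reading of the penetrance figure is an assumption based on the text above, and the SNP IDs, genotype coding and function names are hypothetical.

```python
def matches(patient_genotypes, signature):
    """True if the patient carries every (SNP, genotype) pair in the signature."""
    return all(patient_genotypes.get(snp) == gt for snp, gt in signature.items())


def case_coverage(cases, signatures):
    """Fraction of cases matched by at least one signature."""
    covered = sum(any(matches(p, s) for s in signatures) for p in cases)
    return covered / len(cases)


# Toy data: hypothetical SNP IDs, genotypes coded 0/1/2 as minor-allele counts.
signatures = [
    {"rsA": 2, "rsB": 1, "rsC": 0},   # a three-SNP combination
    {"rsB": 1, "rsD": 2},             # a two-SNP combination
]
toy_cases = [
    {"rsA": 2, "rsB": 1, "rsC": 0, "rsD": 0},   # matches the first signature
    {"rsA": 0, "rsB": 1, "rsC": 1, "rsD": 2},   # matches the second signature
    {"rsA": 1, "rsB": 0, "rsC": 0, "rsD": 2},   # matches neither
]
print(f"{case_coverage(toy_cases, signatures):.0%} of cases covered")
```

In this toy cohort two of the three cases are covered; the study reports that every severe case was covered by at least one of the 2,535 signatures.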

PrecisionLife Severe COVID-19 Results

  • 12 genes involved in host immune response to virus infection
  • 5 genes associated with regulation of inflammatory cytokines
  • 5 genes for lipid storage, signalling and droplet production
  • 12 genes associated with cardiovascular and endothelial cell functions
  • 9 genes associated with Wnt/β-catenin signalling
  • 6 genes associated with neurodegenerative diseases (AD, ALS, PD etc)
  • Druggable targets associated with ARDS, sepsis and other life-threatening complications
  • Further analyses using a major US health system’s full COVID-19 patient dataset (>300,000 patients)

Our preprint can be found here

Figure 2: Disease architecture of the severe COVID-19 patient population generated by the PrecisionLife platform. Each circle represents a disease-associated SNP genotype, edges represent co-association in patients, and colours represent distinct patient sub-populations.

From SNPs to Targets

The genes that map to SNPs are prioritized to identify disease-associated and clinically relevant targets. The PrecisionLife platform found 86 genes strongly associated with patients who developed severe COVID-19 in this dataset.

Clustering the SNPs by the patients in whom they co-occur allows us to generate a disease architecture (network) of severe COVID-19 patients (Figure 2), which provides useful insights into patient stratification. We can use this to find genes and biological pathways that are associated with patient sub-populations and co-morbidities, enabling the development of disease biomarkers, drug repurposing and precision medicine strategies.
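The idea of linking SNP genotypes that co-occur in the same patients can be sketched as a small graph problem. The example below builds edges between genotypes shared by at least two patients and then groups them with connected components. Connected components are a simple stand-in chosen for illustration; the platform's actual clustering method is not described here, and all names and data are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence_edges(patient_snps, min_shared=2):
    """Link two SNP genotypes when at least `min_shared` patients carry both."""
    shared = defaultdict(int)
    for snps in patient_snps.values():
        for a, b in combinations(sorted(snps), 2):
            shared[(a, b)] += 1
    return {pair: n for pair, n in shared.items() if n >= min_shared}


def connected_components(nodes, edges):
    """Group nodes into clusters with a simple union-find."""
    parent = {n: n for n in nodes}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path compression
            x = parent[x]
        return x

    for a, b in edges:
        parent[find(a)] = find(b)

    groups = defaultdict(set)
    for n in nodes:
        groups[find(n)].add(n)
    return sorted(groups.values(), key=lambda g: sorted(g))


# Toy data: each patient maps to the SNP genotypes they carry ("SNP:genotype").
patient_snps = {
    "p1": {"rsA:2", "rsB:1"},
    "p2": {"rsA:2", "rsB:1", "rsC:0"},
    "p3": {"rsD:2", "rsE:1"},
    "p4": {"rsD:2", "rsE:1"},
    "p5": {"rsC:0", "rsD:2"},
}
edges = cooccurrence_edges(patient_snps)
nodes = set().union(*patient_snps.values())
for cluster in connected_components(nodes, edges):
    print(sorted(cluster))
```

Each resulting cluster is a candidate patient sub-population signal; genes mapped to the SNPs in a cluster could then be examined for shared pathways.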

Efficient Prioritization of Tractable Targets

Targets generated by the PrecisionLife platform undergo a primary and secondary screen in order to prioritize the most druggable and tractable targets for in vitro validation.

Figure 3: Summary of novel target identification and in-silico validation using the PrecisionLife platform.

The primary screen evaluates the target’s strength of disease association, whilst the secondary screen shortlists this group further to identify the targets with the highest tractability for drug development strategies (Figure 3). In our COVID-19 study, the initial pool of 86 targets was reduced to 10 targets with high disease association and potential druggability. These targets all have 5Rs data packages and patient stratification biomarkers by default.
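The two-stage screen can be pictured as a filter followed by a ranking. The sketch below is a simplified illustration under stated assumptions: the 0-1 `association` and `tractability` scores, the threshold and the gene names are all hypothetical, and the real screens draw on far richer evidence than a single number per target.

```python
from dataclasses import dataclass


@dataclass
class Target:
    gene: str
    association: float    # hypothetical 0-1 score for strength of disease association
    tractability: float   # hypothetical 0-1 score for druggability/tractability


def primary_screen(targets, min_association=0.5):
    """Keep targets whose disease association meets the threshold."""
    return [t for t in targets if t.association >= min_association]


def secondary_screen(targets, top_n=10):
    """Rank the surviving targets by tractability and keep the best `top_n`."""
    return sorted(targets, key=lambda t: t.tractability, reverse=True)[:top_n]


def prioritize(targets, min_association=0.5, top_n=10):
    """Apply the primary screen, then the secondary screen."""
    return secondary_screen(primary_screen(targets, min_association), top_n)


# Toy pool with placeholder gene names.
pool = [
    Target("GENE1", association=0.9, tractability=0.2),
    Target("GENE2", association=0.4, tractability=0.9),
    Target("GENE3", association=0.7, tractability=0.8),
    Target("GENE4", association=0.6, tractability=0.5),
]
shortlist = prioritize(pool, top_n=2)
print([t.gene for t in shortlist])
```

Note that GENE2 is dropped despite its high tractability score, because it fails the disease-association filter first; ordering the screens this way keeps weakly associated targets out of the shortlist.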

Advantages of the Platform

The PrecisionLife platform is not based on neural nets or machine learning approaches. As a result, the system is not a black box and instead generates detailed, explainable disease models with mechanism-of-action hypotheses and interactive analysis of the factors influencing disease in patient subgroups. We typically work hypothesis-free with highly reproducible analyses. The platform is also computationally tractable, with most of our runs being executed on a 4-GPU server.

The system is data-agnostic, so findings from the analysis of a genotyped or sequenced population can be cross-referenced with analyses of epidemiological data and clinical patient populations, as we did recently using de-identified health records in the UnitedHealth Group COVID-19 Data Suite in partnership with UHG (Preprint).

To explore how we could help you find novel targets in your data, please get in touch using the form below.

References

  1. Deloitte, Measuring the return from pharmaceutical innovation
  2. Tam V et al, Benefits and limitations of genome-wide association studies, Nat Rev Genet: 2019
  3. Skol, Andrew D et al. “The genetics of breast cancer risk in the post-genome era: thoughts on study design to move past BRCA and towards clinical relevance.” Breast cancer research : BCR vol. 18,1 99. 3 Oct. 2016, doi:10.1186/s13058-016-0759-4
  4. Roddy Walsh, Rafik Tadros, Connie R Bezzina, When genetic burden reaches threshold, European Heart Journal, Volume 41, Issue 39, 14 October 2020, Pages 3849–3855, https://doi.org/10.1093/eurheartj/ehaa269
  5. Ahlqvist E et al, Novel subgroups of adult-onset diabetes and their association with outcomes: a data-driven cluster analysis of six variables, Lancet Diabetes Endocrinol 6(5): pp361-9, 2018
  6. Boyle, E. A. (2017). An Expanded View of Complex Traits: From Polygenic to Omnigenic. Cell, 1177-1186.
  7. ENCODE Project Consortium. (2012). An integrated encyclopedia of DNA elements in the human genome. Nature, 57-74.
  8. Lee S, Abecasis GR, Boehnke M, Lin X. Rare-variant association analysis: study designs and statistical tests. Am J Hum Genet. 2014;95(1):5-23. doi:10.1016/j.ajhg.2014.06.009

Contact us

Ask us a question or contact us to discuss potential collaborations and partnership opportunities by sending us a message here and we'll get back to you as soon as we can.
