Why poker players win at combinatorial analysis…

PrecisionLife brings new insight into chronic disease biology through its unique patented combinatorial analytics platform.

Chronic disease is complex

Chronic disease biology is characterized by complexity. Chronic diseases have a much broader range of causes and influences than the single SNP/gene-centric approach of GWAS can detect (Tam V, 2019). For most chronic diseases, the genetic associations of single mutations appear unexpectedly low when tested across a whole patient population, even for mutations in genes known to be important in a disease (Skol, 2016; Walsh, 2020).

Chronic diseases are highly heterogeneous, both in their causative factors and in the way patients experience the disease: their symptoms, severities, and treatment responses. There are often multiple genetic variants, possibly tens or even hundreds, along with other external factors, contributing to the underlying disease processes. When we can properly identify and quantify the effects of those factors, we can accurately stratify diseases into multiple patient subgroups (Ahlqvist E, 2018), whose patients are more likely to share the same disease causes and to benefit from the same drugs. This is fundamental to extending the promise of precision medicine beyond the narrow confines of oncology and rare disease.

It is not simply that in many diseases there is more than one variant affecting a disease process in a classical pairwise epistatic view. Chronic diseases have a complex and nuanced interconnectedness of regulatory networks (Boyle, 2017) influenced by multiple genetic control regions (ENCODE Project Consortium, 2012). Cellular biology is full of genetic networks, expression control systems and interlocking feedback loops, where changes in one component may inhibit or amplify the effects of many others in a complex interacting metabolic dance.

Chronic disease biology needs a new maths…

The combination of more than one mutation brings in the potential for feedback loops between the genes (and their precursors and protein target products), so the effect that a specific combination has on disease is inherently non-linear and unpredictable.

Existing genetic analysis tools largely assume that the contributions of multiple SNPs can simply be added together, with some sophisticated weighting schemes and statistical methods applied. This approach, however sophisticated, ignores much of the underlying biology.
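As a rough illustration of what "simply added together" means, here is a minimal sketch of a weighted additive score. The variant names, weights and genotypes are hypothetical and chosen only for illustration; real tools use far more elaborate weighting schemes.

```python
# Hypothetical per-variant weights (think of them as effect sizes estimated
# one variant at a time). These are illustrative numbers, not real estimates.
WEIGHTS = {"snp_1": 0.10, "snp_2": 0.10, "snp_3": 0.20}


def additive_score(genotype):
    """Weighted sum of risk-allele counts (0, 1 or 2 per variant)."""
    return sum(WEIGHTS[snp] * count for snp, count in genotype.items())


# Two hypothetical people carrying different combinations of variants.
person_a = {"snp_1": 1, "snp_2": 1, "snp_3": 0}
person_b = {"snp_1": 0, "snp_2": 0, "snp_3": 1}

print(additive_score(person_a), additive_score(person_b))
```

Both people receive the same score even though they carry entirely different variants. An additive score of this kind has no way of expressing that snp_1 and snp_2 occurring together might matter more (or less) than their individual weights suggest.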

…Or better poker players

This is analogous to trying to guess the winning hand in a game of poker by adding together the face value of the cards in each hand. Imagine the following scoring scheme for face value of cards (the suit is unimportant):


Figure 1. Face value scores for cards in a deck (Jack/Knave = 11, Queen = 12, King = 13 and Ace = 15)

Imagine dealing hands of five cards to four players. When we add up the face values of their individual hands, one has a total value of 60, another 46, another 30 and the last 18 (see below). If you were now asked to guess the winner from this limited information, knowing that higher cards win over lower cards in otherwise equivalent hands, you would probably choose Player 1 as the winning hand.

Player 1: Face Value = 60

Player 2: Face Value = 30

Player 3: Face Value = 46

Player 4: Face Value = 18

In fact, if you chose either of the hands with the top two face value scores, you would be wrong. Player 2, in spite of scoring just 30, holds a straight flush, the second highest hand in poker, and is the winner. In this case the combination of cards is much more important than their values – a straight flush running from 2-6 of clubs (scoring just 20 on face value) would beat four Aces (scoring 60+ on face value).
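The comparison between a low straight flush and four Aces can be sketched in a few lines of Python. This is a simplified illustration, not a full poker engine: the face values follow Figure 1, the King completing the four-Aces hand is an assumption (the text only says "60+"), and the category ranking ignores tie-breaks, Ace-low straights and the distinction between a straight flush and a royal flush.

```python
from collections import Counter

# Face values as in Figure 1 (note that Ace = 15 in this scheme).
FACE_VALUE = {str(n): n for n in range(2, 11)}
FACE_VALUE.update({"J": 11, "Q": 12, "K": 13, "A": 15})

# Rank positions, used only to detect straights (Ace high).
RANKS_LOW_TO_HIGH = ["2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K", "A"]
POSITION = {rank: i for i, rank in enumerate(RANKS_LOW_TO_HIGH)}


def face_value_score(hand):
    """Additive score: the sum of the face values of the cards."""
    return sum(FACE_VALUE[rank] for rank, _suit in hand)


def hand_category(hand):
    """Combination-based score: higher numbers beat lower numbers."""
    ranks = [rank for rank, _suit in hand]
    suits = {suit for _rank, suit in hand}
    counts = sorted(Counter(ranks).values(), reverse=True)
    positions = sorted(POSITION[rank] for rank in ranks)
    is_flush = len(suits) == 1
    is_straight = len(set(positions)) == 5 and positions[-1] - positions[0] == 4
    if is_straight and is_flush:
        return 8  # straight flush (royal flush included)
    if counts[0] == 4:
        return 7  # four of a kind
    if counts == [3, 2]:
        return 6  # full house
    if is_flush:
        return 5
    if is_straight:
        return 4
    if counts[0] == 3:
        return 3  # three of a kind
    if counts == [2, 2, 1]:
        return 2  # two pair
    if counts[0] == 2:
        return 1  # one pair
    return 0      # high card


straight_flush = [(rank, "clubs") for rank in ["2", "3", "4", "5", "6"]]
# The King kicker is an assumed fifth card, chosen only for illustration.
four_aces = [("A", suit) for suit in ["clubs", "diamonds", "hearts", "spades"]]
four_aces.append(("K", "clubs"))

for name, hand in [("straight flush", straight_flush), ("four aces", four_aces)]:
    print(name, face_value_score(hand), hand_category(hand))
```

The additive score ranks the four Aces far ahead, while the combination-based category puts the straight flush on top, which is the outcome that actually decides the game.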

For outcomes that depend on combinations of features, like multiple common variants, scoring based on adding single features together doesn’t work reliably. A combination of otherwise unremarkable features can come together to have an unpredictably large impact. As a strategy with no other information available it might (like all linear additive models) enable some prediction of some winning hands, but it will always be beaten by a predictor that knows whether a hand is a full house or a straight flush.

The same is true in genetics, where we have essentially based most of our analyses and models on guesses derived from the face value of mutations (their odds ratios for disease association when evaluated individually in isolation). In reality, non-linear, non-additive effects are hugely pervasive in complex traits and diseases. In these diseases, multiple common variants that individually have low odds ratios in a GWAS analysis can exert a large effect in combination.
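To make the idea of modest individual odds ratios but a large combined effect concrete, here is a toy calculation. The case/control counts are entirely hypothetical and were constructed to show the pattern; they do not come from any real study, and the code is a plain 2x2 odds ratio calculation rather than any particular analysis method.

```python
# Hypothetical counts for two binary variants, A and B.
# Keys are (carries_A, carries_B); values are (cases, controls).
COUNTS = {
    (0, 0): (300, 300),
    (0, 1): (150, 250),
    (1, 0): (150, 250),
    (1, 1): (200, 50),
}


def odds_ratio(is_exposed):
    """Odds ratio for disease, comparing exposed with unexposed genotypes."""
    exposed_cases = exposed_controls = 0
    unexposed_cases = unexposed_controls = 0
    for genotype, (cases, controls) in COUNTS.items():
        if is_exposed(genotype):
            exposed_cases += cases
            exposed_controls += controls
        else:
            unexposed_cases += cases
            unexposed_controls += controls
    return (exposed_cases * unexposed_controls) / (exposed_controls * unexposed_cases)


or_a = odds_ratio(lambda g: g[0] == 1)        # variant A on its own
or_b = odds_ratio(lambda g: g[1] == 1)        # variant B on its own
or_combo = odds_ratio(lambda g: g == (1, 1))  # carrying both A and B

print(f"A alone: {or_a:.2f}, B alone: {or_b:.2f}, A and B together: {or_combo:.2f}")
```

In this made-up data each variant looks only mildly associated with disease when evaluated in isolation, while the combination of both is associated several times more strongly. Almost all of the signal sits in the people who carry both variants.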

For precision medicine to become a reality in chronic diseases, we need new combinatorial analytics tools that can identify combinations of features that together are associated with disease risk or patient response, and which can measure the actual degree of association observed in the patient population.

At PrecisionLife, our patented mathematical framework is allowing us to unpick complex disease biology, using high-resolution patient stratification to explain the underlying disease mechanisms. We are identifying novel targets and patient stratification biomarkers with applications in clinical trial population inclusion, in target efficacy as it relates to patient populations, and in disease risk. We will post more blogs in the near future.

In the meantime, you can also visit our Disease Summaries to see the summary results of our analyses.

References

  1. Tam V et al. Benefits and limitations of genome-wide association studies. Nat Rev Genet, 2019.
  2. Walsh R, Tadros R, Bezzina CR. When genetic burden reaches threshold. Eur Heart J 41(39): pp3849-3855, 2020. https://doi.org/10.1093/eurheartj/ehaa269
  3. Ahlqvist E et al. Novel subgroups of adult-onset diabetes and their association with outcomes: a data-driven cluster analysis of six variables. Lancet Diabetes Endocrinol 6(5): pp361-369, 2018.
  4. Boyle EA. An expanded view of complex traits: from polygenic to omnigenic. Cell: pp1177-1186, 2017.
  5. ENCODE Project Consortium. An integrated encyclopedia of DNA elements in the human genome. Nature: pp57-74, 2012.
  6. Lee S, Abecasis GR, Boehnke M, Lin X. Rare-variant association analysis: study designs and statistical tests. Am J Hum Genet 95(1): pp5-23, 2014. doi:10.1016/j.ajhg.2014.06.009
