
Our team focuses on deciphering the dynamics of cancer growth, progression and treatment resistance using mathematical and computational approaches applied to cancer multi-omic data, with the objective of predicting the future course of the disease.
We tackle cancer as a complex system, using rational tissue sampling and integrative genomics as the basis for data generation. We combine the data we generate in the lab with mathematical models of tumour evolution and machine learning methods, with the aim of formulating clinically driven hypotheses and testing predictions that will impact the way we treat cancer.
Our work draws upon ideas from the field of theoretical population genetics that we apply to cancer. For decades, population geneticists have been developing mathematical tools to make sense of complex genetic data. Our lab combines evolutionary theory, as well as mechanistic simulations and machine learning methods, with multi-omics profiling of patient samples and experimental systems to study tumour evolution in a quantitative manner.
Our research focuses on:
- Measuring cancer evolution in patients: we use (epi)genomic heterogeneity to quantify tumour evolution in humans.
- Predicting cancer evolution: we use time-course and spatial genomic profiling to forecast the course of the disease.
- Designing evolutionary-informed treatments: we develop model systems for experimental evolution to design novel treatment strategies.
As a result of our studies, we have also made significant contributions to the debate on neutral evolution versus selection in cancer.
Measuring cancer evolution in patients

Quantifying cellular dynamics from single-cell multi-omics with biologically interpretable machine learning
Single-cell technologies have revolutionised biomedical research, and new machine learning methods can reduce the dimensionality of sparse and noisy data to tractable latent spaces. However, the results of those methods remain hard to interpret, because their transformations are inherently non-linear and do not map directly onto biological mechanisms. We developed two orthogonal methods to make sense of single-cell data in light of evolution and dynamical systems theory. First, we leveraged Archetypal Analysis to measure the evolutionary trade-offs between cellular programmes with MIDAA (Milite 2024, bioRxiv). Second, in collaboration with the Sanguinetti Lab at SISSA (Trieste), we used Neural Ordinary Differential Equations to reconstruct the gene regulatory networks that drive the temporal dynamics of single-cell data with a new tool called NeuroVelo (Kouadri Boudjelthia, 2023, bioRxiv).

Measuring the co-evolution of cancer and the patient's immune system
Together with the Graham lab (ICR), we developed a series of computational methods to quantify how cancer neoantigens, mutations that stimulate the response of the immune system, evolve during tumourigenesis, and how evolutionary forces such as immune-mediated negative selection change the repertoire of tumour immunogenic variants (Lakatos 2020, Nature Genetics; Zapata 2023, Nature Genetics; Lakatos 2024, bioRxiv). We found that these dynamics impact the way patients respond to immunotherapy and provide a framework to design better therapeutic strategies in the future.

Darwinian genetic and epigenetic evolution vs non-Darwinian cell plasticity in colorectal cancer
In the EPICC study (Evolutionary Predictions in Colorectal Cancer), joint with Trevor Graham's lab, we concomitantly measured multiple layers of cellular information from individual colon cancer crypts. We uncovered novel epigenetic drivers and tumourigenic processes that occurred without genetic alterations but instead through re-wiring of the chromatin (Heide, Househam et al. 2022, Nature). We also found that a significant proportion of transcriptional variation within cancers results from cell phenotypic plasticity, with no direct genetic control. We orthogonally confirmed that many colorectal cancers, after the onset of malignancy driven by Darwinian selection for genetic and epigenetic drivers, subsequently evolve neutrally at the genomic level, while fuelling intra-tumour phenotypic heterogeneity through non-Darwinian mechanisms.

Combining evolutionary theory with machine learning to measure clonal evolution with MOBSTER
Subclonal reconstruction methods based on machine learning aim to separate the cancer cell subpopulations (subclones) in a sample and infer their evolutionary history. However, current approaches are entirely data driven and agnostic to evolutionary theory. We demonstrate that systematic errors occur in the analysis if evolution is not accounted for, and that this is exacerbated by multi-sampling of the same tumour. We present a novel approach for model-based tumour subclonal reconstruction, called MOBSTER, which combines machine learning with theoretical population genetics. Using public whole-genome sequencing data from 2,606 samples from different cohorts, new data and synthetic validation, we show that this method is more robust and accurate than current techniques in single-sample, multiregion and longitudinal data (Caravagna et al. 2020, Nature Genetics).
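To illustrate the idea of combining a clustering model with an evolutionary expectation, the sketch below scores a mutation's variant allele frequency (VAF) under a mixture of Beta-distributed clusters plus a Pareto (power-law) tail representing neutrally accumulating mutations. It is a toy example and not the MOBSTER implementation or its API: the function names, weights and parameters are all hypothetical, nothing is fitted to data, and the Pareto tail is left untruncated above VAF = 1 for simplicity.

```python
import math


def beta_pdf(x, a, b):
    """Density of a Beta(a, b) distribution at 0 < x < 1."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(x) + (b - 1) * math.log(1 - x))


def pareto_pdf(x, shape, scale):
    """Density of a Pareto Type-I (power-law) distribution; zero below `scale`."""
    if x < scale:
        return 0.0
    return shape * scale ** shape / x ** (shape + 1)


def responsibilities(vaf, weights, tail, clusters):
    """Posterior probability that a mutation at `vaf` belongs to each component.

    weights[0] is the weight of the neutral tail; weights[1:] are the weights
    of the Beta clusters, in the same order as `clusters`.
    """
    dens = [weights[0] * pareto_pdf(vaf, *tail)]
    dens += [w * beta_pdf(vaf, a, b) for w, (a, b) in zip(weights[1:], clusters)]
    total = sum(dens)
    return [d / total for d in dens]


# Hypothetical, hand-picked parameters (assumptions for illustration only).
WEIGHTS = [0.5, 0.2, 0.3]                # neutral tail, subclone, clonal cluster
TAIL = (1.0, 0.05)                       # Pareto shape, scale (lowest detectable VAF)
CLUSTERS = [(20.0, 80.0), (50.0, 50.0)]  # Beta clusters with mean VAF 0.2 and 0.5

for vaf in (0.07, 0.20, 0.50):
    probs = responsibilities(vaf, WEIGHTS, TAIL, CLUSTERS)
    print(vaf, [round(p, 3) for p in probs])
```

Without the tail component, the low-frequency mutations would have to be absorbed by additional Beta clusters, which is one way a purely data-driven fit can report subclones that are not there.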

Measuring neutral evolution and positive selection from patient genomic data
Inspired by the Big Bang study, we used genetic data from more than 900 tumours of 14 different types to understand the patterns of tumour expansion in large cohorts. We found that many cancer mutational patterns, despite appearing complex, could be explained by a simple mathematical model of neutral tumour evolution (Williams, Werner et al. 2016, Nature Genetics). This finding suggests that the apparent chaos of cancer genomes could be simplified and understood more effectively using mathematical modelling. In a follow-up study, we also quantified non-neutral evolutionary dynamics in those cancers where subclonal selection was detectable (Williams et al. 2018, Nature Genetics). We used a combination of computational modelling and theoretical population genetics, applied to high-depth DNA sequencing of patient tumours, to simultaneously determine the detectable subclonal architecture of a tumour and measure the selection coefficient, time of occurrence and mutation rates of selected subclones. These evolutionary parameters can be used to stratify patients based on how their tumour evolves, and also to play models forward to forecast the future evolutionary trajectory of a human malignancy.
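The neutral model predicts that the cumulative number of subclonal mutations with allele frequency of at least f follows M(f) = (μ/β)(1/f − 1/f_max), where μ is the mutation rate and β the fraction of cell divisions in which both daughter lineages survive. The sketch below shows how such a relationship could be checked on a list of variant allele frequencies by regressing M(f) on 1/f − 1/f_max through the origin. It is a minimal illustration on synthetic, noise-free data, not our published analysis pipeline: the function name, the frequency window and the value of μ/β are assumptions chosen for the example.

```python
def neutral_tail_fit(vafs, f_min, f_max):
    """Fit M(f) = slope * (1/f - 1/f_max) by least squares through the origin.

    Returns (slope, r_squared). Under the neutral model the slope estimates
    mu/beta and the fit should be close to linear.
    """
    tail = sorted((v for v in vafs if f_min <= v <= f_max), reverse=True)
    xs = [1.0 / f - 1.0 / f_max for f in tail]
    ms = list(range(1, len(tail) + 1))  # M(f): number of mutations with VAF >= f
    slope = sum(x * m for x, m in zip(xs, ms)) / sum(x * x for x in xs)
    mean_m = sum(ms) / len(ms)
    ss_res = sum((m - slope * x) ** 2 for x, m in zip(xs, ms))
    ss_tot = sum((m - mean_m) ** 2 for m in ms)
    return slope, 1.0 - ss_res / ss_tot


# Synthetic VAFs placed exactly on the neutral curve (hypothetical values).
TRUE_RATE = 20.0           # assumed mu/beta
F_MIN, F_MAX = 0.05, 0.25  # assumed frequency window for the subclonal tail
N_MUTATIONS = 320          # TRUE_RATE * (1/F_MIN - 1/F_MAX)
synthetic_vafs = [1.0 / (k / TRUE_RATE + 1.0 / F_MAX) for k in range(1, N_MUTATIONS + 1)]

slope, r_squared = neutral_tail_fit(synthetic_vafs, F_MIN, F_MAX)
print(f"estimated mu/beta = {slope:.2f}, R^2 = {r_squared:.4f}")
```

On real sequencing data the tail is blurred by sampling noise, purity and copy number, so the frequency window and any goodness-of-fit cut-off need to be chosen with care.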

'Big Bang' tumour growth
We demonstrated that, after having accumulated a 'jackpot' set of driver alterations, many colorectal cancers grow as a single "Big Bang" expansion, populated by numerous intermixed subclones that are not subject to stringent selection (Sottoriva et al. 2015, Nature Genetics). This produces subclonal variegation, where some distant regions of the tumour are more related than nearby regions. We showed that this mixing behaviour is a sign of malignant potential that developed early during the growth of the cancer. Additional evidence of lack of subclonal selection in established colorectal cancers has been presented in subsequent collaborative work (Sun et al. 2017, Nature Genetics; Cross et al. 2018, Nature Ecology and Evolution).
Predicting cancer evolution

Predicting recurrence beyond 10 years in prostate cancer
Advanced localised prostate cancer is hard to prognosticate, often with half of the patients recurring within 10 years of diagnosis. Since current prognostic biomarkers are suboptimal, we reasoned that cancer evolution metrics could provide a tool to predict who will recur and who will not. Within a large clinical trial, we found that evolutionary divergence, at the genetic level but also at the level of cell morphology (Gleason grade), was a strong and independent predictor of 10-year recurrence. We also profiled the matched cancer of patients at relapse more than a decade after diagnosis, revealing striking evolution of the disease, with genomic rearrangements massively increased with respect to the primary (Fernandez-Mateos et al. 2024, Nature Cancer). This study lays the ground for evolutionary metrics to be clinical predictors of disease progression in prostate cancer.

Detecting repeated cancer evolution in large genomic datasets with machine learning
In cancer evolution, repeatability of genetic changes in different tumours can also indicate predictability. We were interested in identifying sequences of repeated cancer evolution in human malignancies to exploit their predictive potential. Tumour multi-region sequencing is now widely used to dissect a tumour's evolutionary history using phylogenetic approaches. However, the inherent stochasticity of the evolutionary process, together with noise in the data, makes the identification of repeated cancer evolution from tumour phylogenetic trees challenging. We have developed REVOLVER (Repeated EVOLution in cancER), a tool based on a type of machine learning approach called 'Transfer Learning', to analyse and compare sets of phylogenetic trees from multi-region sequencing studies, with the aim of identifying hidden patterns of repeated evolutionary trajectories in the data (Caravagna et al. 2018, Nature Methods). We analysed a total of 768 samples from 178 patients with lung, colon, breast and renal cancer, and found that some of these repeated evolutionary trajectories also correlated with prognosis. Identifying recurrent sequences of evolutionary steps allows us not only to make sense of noisy and complex data in the light of tumour evolution, but also, potentially, to predict a cancer's next step.

Mathematical modelling of treatment resistance applied to clinical trials
Treatment resistance is the main problem in cancer therapy today. Early prediction of patient relapse would be extremely valuable to determine if a treatment is failing and to intervene early, before the tumour comes back as large and heterogeneous as it was at the start. The use of so-called 'liquid biopsies' to look for traces of circulating tumour DNA (ctDNA) in the plasma of cancer patients holds immense promise. However, due to inter-patient variability, simple cohort-level measurements of presence/absence or quantity of ctDNA are unlikely to be reliable predictors. We combined frequent ctDNA profiling over time (every 4 weeks) of colorectal cancer patients undergoing EGFR-targeted treatment within a phase II trial with mathematical modelling of treatment resistance (Khan, Cunningham, Werner et al. 2018, Cancer Discovery). This allowed quantitative forecasting of the waiting time to progression in a significant proportion of patients. These results show that it is possible to make quantitative patient-specific predictions of when the tumour will relapse, thus creating a 'window of opportunity' for early intervention.
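As a simplified illustration of patient-specific forecasting from serial ctDNA measurements, the sketch below fits an exponential growth curve to the allele frequency of a resistance mutation sampled every 4 weeks and extrapolates the time at which it would cross a chosen threshold. Exponential growth, the threshold, the function names and the data are all assumptions made for this example; it is not the model used in the trial analysis.

```python
import math


def fit_log_linear(times, values):
    """Ordinary least squares fit of ln(value) = intercept + rate * time."""
    logs = [math.log(v) for v in values]
    n = len(times)
    t_mean = sum(times) / n
    y_mean = sum(logs) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, logs))
    rate = sxy / sxx
    return y_mean - rate * t_mean, rate


def time_to_threshold(intercept, rate, threshold):
    """Time at which the fitted curve reaches `threshold` (inf if not growing)."""
    if rate <= 0:
        return math.inf
    return (math.log(threshold) - intercept) / rate


# Synthetic measurements for one hypothetical patient, sampled every 4 weeks.
weeks = [0, 4, 8, 12]
mutant_vaf = [0.001 * math.exp(0.15 * t) for t in weeks]  # assumed growth rate 0.15/week

intercept, rate = fit_log_linear(weeks, mutant_vaf)
forecast = time_to_threshold(intercept, rate, threshold=0.05)  # assumed threshold
print(f"growth rate = {rate:.3f} per week, threshold reached at week {forecast:.1f}")
```

Because the fit is done per patient, two patients with the same ctDNA level at a given visit can receive very different forecasts, which is the point of moving beyond cohort-level presence/absence measurements.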
Designing evolutionary-informed treatments

Epigenetic-driven drug resistance in patient-derived organoids
Drug resistance is an unsolved problem in oncology. Genetic mutations, driven by Darwinian clonal selection under the pressure of drugs, explain only a minority of this phenomenon. We wanted to understand to what extent cancer drug resistance is heritable and what causes it. We set up a new experimental system based on patient-derived organoids to measure the predictability and causes of drug resistance ex vivo. We found that, in the context of targeted therapies, drug resistance evolution was highly repeatable and predictable, and involved heritable epigenetic changes. Moreover, different subclones responded differently to distinct drugs, supporting the use of evolutionary therapies to control drug resistance (Oliveira, Milite, Fernandez-Mateos et al. 2023, bioRxiv).

Evolutionary steering and collateral drug sensitivity in large populations without re-plating
Drug resistance mediated by clonal evolution is arguably the biggest problem in cancer therapy today. However, evolving resistance to one drug may come at a cost of decreased growth rate or increased sensitivity to another drug due to evolutionary trade-offs. This weakness can be exploited in the clinic using an approach called ‘evolutionary steering’, which aims to control the tumour cell population in order to induce collateral drug sensitivity and delay resistance. However, recapitulating cancer evolutionary dynamics experimentally remains challenging. In Acar, Nichol et al. (2020, Nature Communications) we present a novel approach for evolutionary steering based on a combination of single-cell barcoding, very large populations of 10^8–10^9 cells grown without re-plating, longitudinal non-destructive monitoring of cancer clones, and mathematical modelling of tumour evolution. We demonstrate evolutionary steering in non-small cell lung cancer, showing that it allows us to shift the clonal composition of a tumour in our favour, leading to collateral drug sensitivity and proliferative fitness costs. A blog piece in Nature Cancer Research Community explains the story and vision behind this project: https://cancercommunity.nature.com/users/390015-andrea-sottoriva/posts/66407-exploiting-evolutionary-steering-to-induce-collateral-drug-sensitivity-in-cancer.
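The logic of collateral sensitivity can be illustrated with a toy two-clone model: a first drug selects for a resistant clone, and that clone is then more sensitive to a second drug than the original population was. The sketch below uses deterministic exponential growth with made-up net growth rates. The clone names, rates and schedule are hypothetical and are not measurements from our experiments.

```python
import math

# Hypothetical net growth rates (per day) of each clone under each drug.
GROWTH = {
    "drug_A": {"sensitive": -0.10, "resistant_A": 0.05},
    "drug_B": {"sensitive": -0.02, "resistant_A": -0.15},  # collateral sensitivity
}


def evolve(population, drug, days):
    """Grow or shrink every clone exponentially for `days` under `drug`."""
    return {
        clone: size * math.exp(GROWTH[drug][clone] * days)
        for clone, size in population.items()
    }


def fraction(population, clone):
    """Fraction of the total population made up by `clone`."""
    return population[clone] / sum(population.values())


start = {"sensitive": 1e8, "resistant_A": 1e4}

after_a = evolve(start, "drug_A", days=90)         # drug A enriches the resistant clone
steered = evolve(after_a, "drug_B", days=60)       # switch: exploit the trade-off
not_steered = evolve(after_a, "drug_A", days=60)   # keep giving drug A instead

print(f"resistant fraction after drug A: {fraction(after_a, 'resistant_A'):.3f}")
print(f"cells after switching to drug B: {sum(steered.values()):.3g}")
print(f"cells after staying on drug A:   {sum(not_steered.values()):.3g}")
```

In this toy setting the switch works only because the clone selected by the first drug is the one most vulnerable to the second, which is why knowing the clonal composition over time matters for choosing the schedule.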