Undertaking the grand challenge of decoding the causality of the non-coding cancer genome, the MacMillan CSNCG is organized into four complementary themes:

  • The non-coding genome determines the gene expression states that define cell types, and the cell type of origin in cancer has been shown to strongly influence cancer progression and metastasis. In this theme, we will develop novel, scalable measurement technologies and multimodal perturbation systems to comprehensively identify non-coding genome structure and causal relationships with single-cell, spatial, and temporal resolution. By deploying these technologies in close collaboration with the other themes, we can systematically investigate:

    1) What non-coding gene regulatory networks or elements determine cell type and cell state identity?

    2) How do genetic and epigenetic drift, together with altered genomic DNA organization and the cellular microenvironment, alter cell states over time?

    3) Which specific cellular states and regulatory mechanisms drive oncogenesis, metastasis, and therapeutic resistance, and how?

  • Building on the demonstrated expertise in quantitative sciences, artificial intelligence, and machine learning assembled by the MacMillan CSNCG, we will develop novel mathematical frameworks for causal inference from high-dimensional, multimodal cellular state information. We will:

    1) incorporate genomic, epigenomic, transcriptomic, and proteomic expression states,

    2) develop fitness landscape models of non-coding elements, and

    3) develop predictive models of resistance states that incorporate genome sequence variation and structure.

  • The role of inherited variation in modifying epigenetic regulation of cell states, and its influence on epigenetic rewiring in cancer, is not well understood. Large-scale genomic instability has the potential to rewire the epigenome and 3D genome structure by changing the physical ordering of underlying DNA sequences. Large-scale ’omics initiatives have generated broad descriptive datasets and provided a basic framework for investigating normal tissues, but a systematic effort to study cancer genomes has not yet been undertaken. In this theme, we will address the following questions:

    1) How much do inherited epigenetic states and genetic structural variation of the non-coding genome alter gene expression?

    2) How does this affect the fitness landscape of cancer?

  • The relationship between DNA sequence and epigenetic regulation of fitness states is poorly understood, obscuring the causal explanations underlying metastasis and drug resistance. In this theme, we will map cell-cell interactions by systematically interrogating the diverse interactions among cell-type-specific enhancers and promoters, chromatin marks, and altered DNA methylation states. The aim is to elucidate the role of the non-coding genome in modifying cancer fitness and to investigate the following questions:

    1) How does the non-coding genome act as a cell-type-specific mediator of drug resistance and/or cancer ontogeny?

    2) Are there common causal explanations for the observed drug-resistant phenotypes?

    3) How does the non-coding genome drive racial and ethnic disparities in cancer fitness?