Project 11. The interplay between RNA splicing and R-loop formation.
Alternative splicing is crucial for generating transcriptome and proteome diversity during development and for establishing tissue identity. Consequently, splicing is often dysregulated in disease, including various types of cancer. Recurrent mutations in core splicing factors, including U2AF1, U2AF2, SF1, SRSF2 and SF3B1, are hallmarks of myelodysplastic syndromes (MDS) and subsequent leukaemia. These proteins are part of the mRNA ribonucleoprotein (mRNP) complex that assembles co-transcriptionally at the 3’ splice site during the early steps of splicing. The 3’ splice site is defined by cis-regulatory elements, including the branch point, the polypyrimidine tract (Py-tract) and the AG dinucleotide marking the splice site itself. U2AF2 and U2AF1 recognise the Py-tract and the splice site AG, respectively, while SF1 binds the branch point. Together, they recruit SF3B1 as part of the U2 snRNP (small nuclear ribonucleoprotein) to the branch point to initiate splicing. During our detailed study of this complex, we recently identified FUBP1 as a new core factor of 3’ splice site definition. FUBP1 binds upstream of the branch point and can stabilise RNA binding of U2AF2. Our data indicate that FUBP1 is particularly important for the splicing of long introns. Like the other 3’ splice site factors, FUBP1 harbours mutations that have been connected to various types of cancer and to MDS.
Intriguingly, the same mutations that trigger widespread splicing alterations in MDS were recently implicated in the formation of R-loops, three-stranded structures that consist of a DNA/RNA hybrid and a displaced single DNA strand and that have emerging roles in the regulation of gene expression. For example, work in cell lines showed that high-risk alleles in U2AF1, SRSF2 and SF1 lead to elevated R-loop levels, replication stress and activation of the DNA damage response. Furthermore, it has recently been shown that inhibition of SF3B1 can lead to the accumulation of R-loops. Although a connection between splicing and R-loops has been observed, it is not well understood why inhibition or mutation of splicing factors increases R-loop formation, or whether and how this contributes to the pathological roles of these factors in cancer. In this proposal, we will investigate the mechanistic link between pre-mRNA splicing and R-loop formation (Figure 1).
To achieve this, we will address the following aims:
- Aim 1: We will characterise the immediate effects of splicing on R-loop formation.
- Aim 2: We will explore the mechanistic link between splicing and R-loop formation.
- Aim 3: We will dissect the impact of splicing factor mutations.
To learn about the link between splicing alterations and R-loop formation, it will be important to separate direct from indirect effects. To specifically study immediate and direct effects (Aim 1), we will set up fast and efficient depletion of splicing factors with the auxin-inducible degron (AID) system. To this end, we will use CRISPR/Cas9 genome editing to endogenously tag the splicing factors U2AF1, U2AF2, SF1, SF3B1 and FUBP1 with the AID in HCT116 cells. This will enable rapid depletion of these factors, usually in less than one hour. To study the resulting changes in R-loop formation, we will set up DNA-RNA immunoprecipitation sequencing (DRIP-seq) experiments, which will provide us with genome-wide data on splicing-driven changes in cellular R-loops. Changes in R-loops upon splicing inhibition could arise directly from changes in the kinetics of splicing and/or transcription (Aim 2). To monitor transcription and splicing kinetics in our cell line system, we will perform transient transcriptome sequencing (TT-seq) experiments to characterise the nascent transcriptome in control and splicing factor-depleted cells. Bioinformatic comparison of changes in R-loops and the nascent transcriptome will tell us, for instance, whether R-loops occur preferentially at sites where the splicing of introns is delayed or where transcription slows down significantly. Knowing at which sites changes in the pre-mRNA transcriptome directly impact R-loop formation will guide us towards a mechanistic understanding of the underlying processes. In addition, we want to understand how the specific mutations that recurrently appear in cancers impact these processes (Aim 3). To this end, we will complement splicing factor depletion in our cell lines with ectopic expression of mutated protein variants to learn about their contribution to R-loop formation.
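As an illustration of the kind of bioinformatic comparison envisaged in Aim 2, the following minimal Python sketch asks whether introns with delayed splicing overlap regions of R-loop gain more often than other introns. It is a sketch under stated assumptions only: all names (`Interval`, `contingency`, `fisher_one_sided`), coordinates and delay flags are hypothetical toy values rather than results, and the one-sided Fisher's exact test is just one possible statistic, not necessarily the analysis that will be used.

```python
from dataclasses import dataclass
from math import comb


@dataclass(frozen=True)
class Interval:
    """Genomic interval; start is 0-based inclusive, end is exclusive."""
    chrom: str
    start: int
    end: int


def overlaps(a: Interval, b: Interval) -> bool:
    """True if both intervals lie on the same chromosome and share >= 1 base."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def contingency(introns, delayed, gain_peaks):
    """Build a 2x2 table: (delayed & gain, delayed & no gain,
    other & gain, other & no gain)."""
    a = b = c = d = 0
    for intron, is_delayed in zip(introns, delayed):
        has_gain = any(overlaps(intron, peak) for peak in gain_peaks)
        if is_delayed and has_gain:
            a += 1
        elif is_delayed:
            b += 1
        elif has_gain:
            c += 1
        else:
            d += 1
    return a, b, c, d


def fisher_one_sided(a, b, c, d):
    """One-sided Fisher's exact test: probability of observing at least `a`
    delayed introns with R-loop gain under the hypergeometric null."""
    n = a + b + c + d
    row1 = a + b   # all delayed introns
    col1 = a + c   # all introns with R-loop gain
    upper = min(row1, col1)
    favourable = sum(
        comb(row1, k) * comb(n - row1, col1 - k) for k in range(a, upper + 1)
    )
    return favourable / comb(n, col1)


# Hypothetical toy data (assumption): six introns, three flagged as delayed
# (e.g. from TT-seq), and three regions of R-loop gain (e.g. from DRIP-seq).
introns = [
    Interval("chr1", 100, 500),
    Interval("chr1", 1000, 1500),
    Interval("chr1", 2000, 2600),
    Interval("chr1", 3000, 3400),
    Interval("chr2", 100, 400),
    Interval("chr2", 1000, 1800),
]
delayed = [True, True, True, False, False, False]
gain_peaks = [
    Interval("chr1", 450, 600),
    Interval("chr1", 1200, 1300),
    Interval("chr2", 1790, 1900),
]

table = contingency(introns, delayed, gain_peaks)
p_value = fisher_one_sided(*table)
print(table, p_value)  # (2, 1, 1, 2) 0.5
```

With genome-wide data, the pairwise overlap loop would be replaced by an interval-tree or sorted-sweep approach, and the delay flags would be derived from a quantitative measure of splicing kinetics rather than set by hand.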
Extending beyond our cell line models, bioinformatic integration of the obtained datasets with publicly available data will allow us to learn how splicing and R-loop formation are linked in physiology and may show how this link becomes relevant in disease. Our group specialises in functional genomics approaches to RNA biology, and we will support the other groups in the RTG with our expertise in methods for studying RNA (e.g., iCLIP, miCLIP and TT-seq).