Department of Molecular Virology, Immunology, and Medical Genetics, The Ohio State University,
Columbus, OH, USA
Corresponding author: Nicola Zanesi, PhD. 1094 Biomedical Research Tower,
460 W 12th Ave., Columbus, OH 43210, USA
Ph: +1-614-292-3318; Fax: +1-614-292-4097
Keywords: CFSs, common fragile sites, aphidicolin, genomic instability, FRA3B, FRA16D, CFS tumor suppressor genes
Specific alterations in the genome that modify the expression of genetic elements involved in the regulation of cell growth and maintenance of genomic integrity are responsible for driving tumorigenesis. These changes are not random, even though each tumor has a particular set of genome alterations. Overexpression of oncogenes and inactivation of tumor suppressor genes occur often and have been extensively studied. Moreover, in malignant cells there is a group of genomic loci that is frequently unstable and contributes actively to tumorigenesis: the common fragile sites (CFSs) (Casper et al., 2012). These regions are non-random sites on chromosomes that form gaps and breaks under conditions of DNA replication stress, such as mild inhibition of DNA polymerase activity (Glover et al., 1984). As signified by the "common" in their name, CFSs occur at specific chromosome bands in all humans and are a normal component of the chromosomal structure (Durkin et al., 2008). These loci are conserved in other mammals, including, but not limited to, primates and rodents (Fungtammasan et al., 2012). Endogenous and exogenous factors, such as hypoxia, chemotherapeutics and other pharmaceuticals, exposure to radiation, pesticides, cigarette smoke, caffeine and alcohol, may trigger replication fork stress and DNA breaks at CFSs in vivo (Dillon et al., 2010). In vitro, on the other hand, a subset of CFSs (aCFSs) may be induced by aphidicolin, an inhibitor of DNA synthesis that, by affecting DNA polymerases alpha, delta and epsilon, has been shown to activate most fragile sites (Mrasek et al., 2010), inducing gaps that are microscopically visible in metaphase chromosomes.
At the molecular level, the phenomenon of common fragility on chromosomes is still not completely understood (Brueckner et al., 2012). The ATR DNA damage checkpoint pathway has been suggested to have an important role in maintaining the stability of CFSs, since a deficiency of proteins associated with this pathway, such as ATR, BRCA1, and CHK1, results in increased breakage at CFSs (Casper et al., 2002; Durkin et al., 2008). Moreover, CFS fragility has been associated with late DNA replication (Debatisse et al., 2006) and histone hypoacetylation (Jiang et al., 2009). It has also been hypothesized that, following DNA replication stress, CFS instability derives from prolonged single-stranded regions of unreplicated DNA accumulating at stalled replication forks that escaped the ATR replication checkpoint (Brueckner et al., 2012). In fact, some aCFSs whose late replication is further delayed by aphidicolin treatment can enter G2 with only 50% of some aCFS regions completely replicated (Pelliccia et al., 2008). DNA breakage within aCFSs is thought to derive from failure to complete replication prior to the end of telophase and chromosome segregation (Chan et al., 2009). It has recently been shown that the activity of topoisomerase I is necessary for CFS fragility because of the requirement for polymerase-helicase uncoupling (Arlt and Glover, 2010). It has also been suggested that impaired replication of such regions may be due to the formation of stable secondary structures in their DNA sequences (Burrow et al., 2010; Zlotorynski et al., 2003).
Many of the CFS genomic loci have not yet been molecularly defined. Thus far, the relatively well-characterized CFSs are FRA1E, FRA2C, FRA2G, FRA3B, FRA7G, FRA9G, FRA13A, FRA16D, and FRAXB (Brueckner et al., 2012), summarized in Table 1; all are AT-dinucleotide-rich sites spanning between 300 kb and 1 Mb (Schwartz et al., 2006). Unlike rare fragile sites, in which fragility is attributable to either CGG repeat expansions or AT-rich minisatellites (Sutherland, 2003), CFSs have not been found to contain such long repeat motifs. However, the nine CFSs defined at the molecular level seem to be characterized by segments of discontinuous AT-rich sequences that can potentially form secondary structures able to affect replication fork progression, thus leading to chromosomal breakage (Dillon et al., 2010; Zlotorynski et al., 2003). Accordingly, it has been reported that specific DNA sequences, such as [A/T]n and [AT/TA]n repeats, and/or the formation of non-B DNA secondary structures within aCFSs can inhibit replicative DNA polymerases (Shah et al., 2010) and the progression of replication forks (Zhang and Freudenreich, 2007). Recently, scarcity of replication origins, inefficient origin initiation, and failure to activate latent origins have all been proposed to play a role in delayed replication at specific aCFSs (Letessier et al., 2011; Ozeri-Galai et al., 2011).
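As an illustration of what "AT-rich" and "[AT/TA]n repeats" mean at the sequence level, the following minimal Python sketch computes the AT fraction in sliding windows and locates perfect AT or TA dinucleotide runs. It is not a method taken from any of the cited studies: the function names, the window size, the minimum repeat count, and the toy sequence are all assumptions chosen for illustration only.

```python
import re


def at_fraction(seq):
    """Fraction of bases in seq that are A or T (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(base in "AT" for base in seq) / len(seq)


def windowed_at_fraction(seq, window, step):
    """Return (start, AT fraction) for each full window along seq."""
    seq = seq.upper()
    return [
        (start, at_fraction(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


def at_dinucleotide_runs(seq, min_repeats=4):
    """Return (start, end, repeat_count) for perfect [AT]n or [TA]n runs.

    min_repeats is an arbitrary illustrative threshold, not a published cutoff.
    """
    pattern = re.compile(
        r"(?:AT){%d,}|(?:TA){%d,}" % (min_repeats, min_repeats)
    )
    return [
        (m.start(), m.end(), (m.end() - m.start()) // 2)
        for m in pattern.finditer(seq.upper())
    ]


# Toy sequence (hypothetical): a GC-rich flank, six AT repeats, a GC-rich flank.
toy = "GCGC" + "AT" * 6 + "GGCC"
print(windowed_at_fraction(toy, window=10, step=5))
print(at_dinucleotide_runs(toy))
```

A real analysis would read a reference sequence for the locus of interest and would also need to handle interrupted repeats, which this sketch deliberately ignores.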
The Debatisse Laboratory reported the most important new findings of recent years, showing that CFSs differ among tissue types and are caused by a paucity of replication origins within the fragile regions. In lymphocytes, for example, both FRA3B and FRA16D have replication origins flanking the fragile locus, so that replication must proceed from the flanking sites and meet in the middle in late S or in G2. In fibroblasts the placement of replication origins is different: these loci are much less fragile, while other loci are more fragile. This may apply to other tissue types as well, and it shows that the position of fragile regions in specific tissues is due to an epigenetic mechanism that determines the placement of replication origins. It also suggests that hypotheses about the effect of specific sequences in fragile regions may become questionable (Debatisse et al., 2012; Huebner, 2011; Letessier et al., 2011).
Table 1. Association of the best characterized common fragile sites with their chromosome regions and genes affected by their activity. Modified from Saxena, 2012.
While rare fragile sites are generally associated with a single DNA element, several sequence motifs spread along an aCFS locus may determine its fragility (Durkin et al., 2008; Ragland et al., 2008) thus making the characterization of aCFSs a computational challenge (Fungtammasan et al., 2012). However, previous analyses of single aCFSs showed that these loci are enriched in Alu repeats (Tsantoulis et al., 2008), gene-coding regions (Helmrich et al., 2006), histone hypoacetylation (Jiang et al., 2009), high DNA flexibility sequences, and highly AT-rich sequences (Mishmar et al., 1998). Nevertheless, these sequence characteristics seemed not to be associated with the propensity for DNA gaps, breaks, deletions and other genomic rearrangements at CFSs; for example, LINE1 elements are common in the fragile site FRA3B but quite rare in FRA16D, while Alu repeats are dominant in the latter (Ried et al., 2000).
The organization of human chromosomes was traditionally investigated by a variety of banding methods (Comings, 1978). Yunis and Soreng observed that several types of fragile sites are more frequent in R bands that have a relatively high gene and CpG island density and correspond to early replicating genomic regions (Yunis and Soreng, 1984).
The level of fragility varies among CFSs, and the most fragile and best-characterized CFS in the entire human genome is FRA3B at chromosome band 3p14.2 (Mrasek et al., 2010). The second and third most active CFSs are FRA16D and FRAXB, at 16q23.2 and Xp22.3, respectively. CFSs are generally stable in somatic cells, but in many cancers they display frequent chromosomal aberrations. Heterozygous and homozygous deletions are the most common genomic rearrangements at CFSs and are identified mainly in lung, kidney, breast, and digestive tract malignancies (Arlt et al., 2006). All CFSs investigated at the molecular level up to now contain protein-coding genes, most of which extend over hundreds of kilobases of DNA (Smith et al., 2007). The FHIT and WWOX genes, encompassing FRA3B and FRA16D, respectively, are both > 1 Mb in length and have been shown to exhibit tumor suppressor activity in vivo and in vitro (Drusco et al., 2011; Lewandowska et al., 2009; Saldivar et al., 2010). There are many reports of deletions within CFSs harboring these genes (McAvoy et al., 2007). Indeed, the fact that very large genes present in mammalian genomes are preferentially affected by deletions in tumor cells suggests that these genes are all CFSs in the cell type in which they are expressed (Debatisse et al., 2012). Mitotic sister chromatid exchanges are often described at CFSs (Durkin et al., 2008), which suggests that CFS breaks may drive loss of heterozygosity (LOH) in cancer cells when repair occurs by homologous recombination.
During neoplastic progression, damage at CFS regions seems to be among the earliest occurrences, mainly due to DNA replication stress (Halazonetis et al., 2008), as suggested by the presence of these genomic alterations in pre-neoplastic lesions (Lai et al., 2010). CFS activity has also been linked to oncogene amplification, and CFSs are preferred integration sites for some oncogenic viruses (Brueckner et al., 2012).
Germline genomic alterations in CFSs seem also to lead to other human illnesses of nonmalignant origin. In support of this possibility is the recent sequencing of breakpoint junctions in the CFS genes PARK2 at FRA6E and DMD at FRAXC in many patients affected, respectively, by juvenile Parkinsonism and muscular dystrophies (Mitsui et al., 2010). Somatic breakpoints in cancer cell lines and germline breakpoints within PARK2 and DMD shared some features that suggested involvement of common mechanisms in the generation of CFS rearrangements.
DNA replication and gene transcription are basic biological processes essential for cell division and growth. Large protein complexes moving at high speed along the chromosomes, and for long distances, make such processes possible. In mammalian cells, the RNA polymerase II (Pol II) enzyme transcribes 18-72 nucleotides of DNA per second into RNA (Darzacq et al., 2007). One of the longest human loci, the 2.2 Mb dystrophin gene, is transcribed over a period of 16 hours (Tennyson et al., 1995), and similar figures are reported for other long genes. Because the cell cycle time of typical fast-cycling mammalian cells is approximately 10 hours, these long transcription cycles are expected to interfere with replication in S phase. Unlike in bacteria, transcription and replication in higher eukaryotes are coordinated events that take place within spatially and temporally separated domains (Wei et al., 1998). Transcription usually occurs in G1 phase and sometimes in S phase; in the latter case, it is thought to be spatially separated from replication sites (Vieira et al., 2004). Induction of gene expression in mammalian cells has been shown to cause recombination within the transcription unit, suggesting that collisions between replication and transcription complexes provoke instability at the genomic level (Gottipati et al., 2008). Recently, Helmrich et al. demonstrated that the time required to transcribe human genes larger than 800 kb spans more than one complete cell cycle, while their transcription speed is equivalent to that of smaller genes. CFS instability depends on the expression of the underlying long genes and may be suppressed by the RNase H1 enzyme, which acts on R-loops: RNA:DNA hybrids formed between nascent transcripts and the DNA template strand, while the non-template strand remains single-stranded (Helmrich et al., 2011).
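The figures quoted above can be checked with simple arithmetic: transcribing 2.2 Mb in 16 hours implies an average rate of about 38 nucleotides per second, which falls within the 18-72 nucleotides per second range. The short Python sketch below makes the calculation explicit. The function names are hypothetical, and the calculation assumes a constant elongation rate with no pausing, which is a simplification.

```python
SECONDS_PER_HOUR = 3600.0


def transcription_hours(gene_length_nt, rate_nt_per_s):
    """Hours needed to transcribe a gene at a constant elongation rate."""
    return gene_length_nt / rate_nt_per_s / SECONDS_PER_HOUR


def implied_rate_nt_per_s(gene_length_nt, hours):
    """Average elongation rate implied by a gene length and transcription time."""
    return gene_length_nt / (hours * SECONDS_PER_HOUR)


def exceeds_cell_cycle(gene_length_nt, rate_nt_per_s, cell_cycle_hours):
    """True if one transcription pass takes longer than one cell cycle."""
    return transcription_hours(gene_length_nt, rate_nt_per_s) > cell_cycle_hours


# Dystrophin figures quoted in the text: 2.2 Mb transcribed in 16 hours.
dmd_rate = implied_rate_nt_per_s(2_200_000, 16)
print(f"Implied dystrophin elongation rate: {dmd_rate:.1f} nt/s")

# At that rate, one pass over the gene outlasts a 10-hour cell cycle.
print(exceeds_cell_cycle(2_200_000, dmd_rate, cell_cycle_hours=10))

# Time to transcribe an 800 kb gene at both ends of the quoted rate range.
for rate in (18, 72):
    print(f"{rate} nt/s: {transcription_hours(800_000, rate):.1f} h")
```

Under these assumptions the outcome depends strongly on both the elongation rate and the cell cycle length of the cell type considered, so the numbers are only a rough guide.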
The wealth of genome-wide profiling studies now available offers unique opportunities to study the causes of genome instability in depth. Current evidence suggests that aCFSs are caused by a combination of genomic factors (Dillon et al., 2010). Consequently, a statistical model that takes multiple factors into consideration simultaneously is thought to give a more biologically reliable estimate of the contribution of the diverse genomic features to fragility. Moreover, studies usually do not incorporate the different breakage frequencies of aCFSs in their models.
To better understand the relationship between aCFSs and their genomic contexts, Fungtammasan et al. built statistical models to explain the fragility of well-characterized aCFSs by considering their genomic neighborhoods and comparing them with non-fragile regions (NFRs) (Fungtammasan et al., 2012). The authors focused on aphidicolin-induced CFSs because they are well characterized genome-wide (Mrasek et al., 2010), are the most numerous CFSs, and because fragile sites induced by other agents might have different breakage mechanisms and characteristics. Multiple logistic regression was used to predict the probability that a given region is either an aCFS or an NFR, and multiple linear regression was used to predict the expected breakage frequency. Finally, these models were validated using mouse fragile sites (Fungtammasan et al., 2012). Results showed that local genomic features are effective predictors both of regions harboring aCFSs, explaining approximately 77% of the deviance in logistic regression models, and of aCFS breakage frequencies, explaining approximately 45% of the variance in standard regression models. In the models with the highest explanatory power, aCFSs are mainly located in G-negative chromosomal bands and far from centromeres, are enriched in Alu repeats, and have high DNA flexibility. In addition, aCFSs have high fragility when co-located with evolutionarily conserved chromosomal breakpoints (Fungtammasan et al., 2012).
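To make the notion of "deviance explained" in a multiple logistic regression concrete, the sketch below fits a two-feature logistic model to synthetic data and reports one minus the ratio of residual to null deviance. This is not the model or the data of Fungtammasan et al.: the two features stand in for hypothetical standardized scores (for example, an Alu density score and a DNA flexibility score), the data are randomly generated, and the plain gradient-descent fit is a simplification of what a statistics package would do.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(weights, row):
    """P(region is an aCFS); weights[0] is the intercept."""
    return sigmoid(weights[0] + sum(w * x for w, x in zip(weights[1:], row)))


def fit_logistic(rows, labels, lr=0.1, epochs=500):
    """Fit a multiple logistic regression by batch gradient descent."""
    n, p = len(rows), len(rows[0])
    weights = [0.0] * (p + 1)
    for _ in range(epochs):
        grad = [0.0] * (p + 1)
        for row, label in zip(rows, labels):
            err = predict(weights, row) - label
            grad[0] += err
            for j, x in enumerate(row):
                grad[j + 1] += err * x
        weights = [w - lr * g / n for w, g in zip(weights, grad)]
    return weights


def deviance(labels, probs):
    """Binomial deviance, i.e. -2 times the log-likelihood."""
    eps = 1e-12
    return -2.0 * sum(
        y * math.log(max(q, eps)) + (1 - y) * math.log(max(1.0 - q, eps))
        for y, q in zip(labels, probs)
    )


def deviance_explained(rows, labels, weights):
    """1 - residual deviance / null (intercept-only) deviance."""
    base_rate = sum(labels) / len(labels)
    null_dev = deviance(labels, [base_rate] * len(labels))
    resid_dev = deviance(labels, [predict(weights, r) for r in rows])
    return 1.0 - resid_dev / null_dev


# Synthetic data (assumption): label 1 = aCFS, label 0 = NFR; two
# hypothetical standardized features that are higher on average in aCFSs.
rng = random.Random(0)
rows, labels = [], []
for label, mean in ((1, 0.8), (0, -0.8)):
    for _ in range(100):
        rows.append([rng.gauss(mean, 1.0), rng.gauss(mean, 1.0)])
        labels.append(label)

weights = fit_logistic(rows, labels)
explained = deviance_explained(rows, labels, weights)
print(f"weights: {weights}")
print(f"fraction of deviance explained: {explained:.2f}")
```

The analogous quantity for the breakage-frequency model would be the ordinary R-squared of a multiple linear regression.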
In order to investigate the mechanisms of CFS-induced breaks, Casper et al. asked whether the flexibility peaks that have been identified within the human CFS FRA3B are hotspots of instability (Casper et al., 2012). To analyze the consequences of CFS breaks, these authors also investigated whether repair of fragile site breaks drives LOH events through mitotic homologous recombination. To gather detailed data on exact break locations within CFSs, a yeast artificial chromosome (YAC) containing the human locus FRA3B was used. The data suggested that break sites are not randomly distributed, but rather clustered at the centromere-distal end of the FRA3B sequence insert. The authors also took advantage of a naturally occurring yeast fragile site known as FS2 (fragile site 2) to study mitotic homologous recombination. Similar to human CFSs, recurrent breaks at FS2 occur where replication is impaired because of stressful conditions (Lemoine et al., 2005). The results demonstrated that LOH is, in fact, a consequence of mitotic recombination between homologous chromatids with reciprocal crossovers at FS2 induced by inhibition of yeast DNA polymerase (Casper et al., 2012).
Since not many CFSs have been molecularly characterized, despite the growing interest in understanding the precise nature of CFS instability, Brueckner et al. examined the FRA2H CFS and, after fine-mapping its location with six-color fluorescence in situ hybridization, demonstrated that it is one of the most active CFSs in the human genome (Brueckner et al., 2012). FRA2H encompasses approximately 530 kb of a gene-poor region containing a novel large intergenic non-coding RNA gene (AC097500.2). Using custom-designed array comparative genomic hybridization, gross and submicroscopic chromosomal rearrangements involving FRA2H were detected in a panel of 54 neuroblastoma, colon, and breast cancer cell lines. Genomic alterations often affected different classes of long terminal repeats (LTRs) and long interspersed nuclear elements (LINEs). Sequence analysis of breakpoint junctions revealed that DNA damage repair at FRA2H mostly appeared to occur via non-homologous end-joining events mediated by short micro-homologies (Brueckner et al., 2012).
Deletions at the FRA3B CFS occur in pre-neoplasias and may be the most frequent and earliest alterations. FRA3B overlaps the FHIT gene, and its fragility frequently results in deletions of FHIT exons and loss of FHIT expression in precancerous and cancer cells (Sozzi et al., 1998). Examination of cells that have lost FHIT revealed that the protein has some functional roles in the response to DNA damage (Saldivar et al., 2010). In particular, kidney epithelial cells established from Fhit-/- mice exhibited a >2-fold increase in chromosome breaks at fragile sites vs. corresponding Fhit+/+ cells (Turner et al., 2002), and the frequency of mutations following replicative and oxidative stress in Fhit-deficient cells was 2- to 5-fold greater than in Fhit-expressing cells (Ishii et al., 2008; Ottey et al., 2004). Despite these findings and strong evidence that Fhit acts as a tumor suppressor (Joannes et al., 2010; Pekarsky et al., 1998; Siprashvili et al., 1997), it has been proposed that deletions within the FHIT locus are secondary alterations rather than cancer-driving mutations (Bignell et al., 2010).
In a new study, Kay Huebner and colleagues (Saldivar et al., 2012) further examined the role of Fhit loss in the DNA damage process. Specifically, they showed that Fhit loss causes replication stress-induced DNA double-strand breaks in normal, transformed, and cancer-derived cell lines. In Fhit-deficient cells, a defect was observed in replication fork progression that stemmed mainly from fork stalling and collapse. The proposed mechanism for the role of Fhit in replication fork progression is regulation of thymidine kinase 1 expression and thymidine triphosphate pool levels. Interestingly, restoration of nucleotide balance rescued DNA replication defects and suppressed DNA breakage in Fhit-deficient cells. Loss of Fhit did not activate the DNA damage response or cause cell cycle arrest, allowing continued cell proliferation and ongoing chromosome instability. This result was consistent with in vivo studies, in which Fhit knockout (KO) mouse tissues showed no evidence of cell cycle arrest or senescence yet exhibited numerous somatic DNA copy number aberrations at replication-sensitive loci. Moreover, cells established from Fhit KO tissues showed rapid immortalization together with DNA deletions and amplifications. Of note, the murine gene Mdm2, an oncogene involved in cell transformation, was also amplified, with a 4-fold increase in Mdm2 mRNA expression, suggesting that genome instability induced by Fhit depletion facilitates the transformation process. In conclusion, this study proposes that Fhit depletion in precancerous lesions is the first step in the initiation of genomic instability and links alterations at CFSs to the very origin of this important phenomenon (Saldivar et al., 2012).
To conclude this brief overview of CFSs and genomic instability, we would like to draw the reader's attention to the most recent findings about the Polζ polymerase. Polζ, which consists of the catalytic subunit Rev3 and the accessory subunit Rev7, is a trans-lesion DNA synthesis (TLS) polymerase capable of bypassing certain DNA adducts efficiently (Gibbs et al., 1998). Besides its role in TLS, Rev3 is also essential for mouse embryonic development (Bemark et al., 2000), whereas no other TLS polymerase studied to date is required for this fundamental function. Rev3 has also been implicated in homologous recombination repair (Sharma et al., 2012). Because of its extremely large size (>350 kDa), little progress has been made in understanding the essential function of Rev3. Bhat et al. found that the cellular level of Rev3 is elevated in mitotic cells and that the protein is associated with chromatin. Experimental depletion of Rev3 results in elevated CFS expression and chromosomal instability, indicating that Rev3 is required for the late replication of these sites. This Rev3 activity is independent of Rev7, as depletion of cellular Rev7 does not cause CFS expression. Moreover, constitutive depletion of Rev3 in cultured human cells resulted in accumulated genomic instability and eventual arrest of cell division, suggesting that Rev3 is required not only for embryonic development but also for cell viability (Bhat et al., 2013). Interestingly, comparison of the yeast and mammalian Rev3 proteins reveals a large exon unique to the mammalian gene, which is likely to be the subject of future investigations into its role in the maintenance of mitotic genomic stability.
We thank Dr. Kay Huebner for critical reading of the manuscript and Prasanthi Kumchala for technical assistance. This work was supported by NIH grant U01CA152758 (to CMC).