CTSL (cathepsin L1)

2009-01-01 Brigitte Goulet, Alain Nepveu

Cancer Research Laboratory Program, London Regional Cancer Program at London Health Sciences Centre, the University of Western Ontario, London, Canada (BG); Department of Biochemistry, Goodman Cancer Centre, McGill University, Montreal, Canada (AN)

Identity

LOCATION: 9q21.33
ALIAS: CATL, CTSL1, MEP

DNA/RNA

Atlas Image
Shown are the eight exons and seven introns of the human Cathepsin L1 gene. The black boxes correspond to protein-coding regions.

Description

The human Cathepsin L1 gene comprises eight exons and seven introns, and spans 5411 bases. The first AUG translation initiation site is located within exon 2. Three splice variants of hCATL-A have been identified: hCATL-AI, hCATL-AII and hCATL-AIII (Rescheleit et al., 1996; Bakhshi et al., 2001; Aurora et al., 2002; Jean et al., 2002). These spliced forms lack 27 nucleotides (nt), 90 nt and 144 nt, respectively, from the 3′ end of exon 1, and give rise to mRNA species that differ in their 5′ untranslated regions (5′UTRs). However, they are translated into identical proteins. The shorter 5′UTRs lack secondary-structure loops and are translated more efficiently than that of hCATL-A (Aurora et al., 2002).

Transcription

One major transcription initiation site is situated at -290 from the starting AUG on the human cDNA sequence (Joseph et al., 1988; Chatham et al., 1993; Bakhshi et al., 2001; Aurora et al., 2002; Jean et al., 2002). This mRNA of 1.7 kb corresponds to hCATL-A. A second mRNA, hCATL-B, is transcribed from another TATA-less promoter located within the first intron of hCATL-A and encodes the same cathepsin L protein (Joseph et al., 1988; Bakhshi et al., 2001; Seth et al., 2003). hCATL-B therefore differs from hCATL-A in the 5′ untranslated region (Bakhshi et al., 2001; Jean et al., 2002). The transcription factors NF-Y, SP1 and SP3 have been shown to be responsible for more than 85% of Cathepsin L expression in melanoma cells (Jean et al., 2002). In melanoma and lymphoma cells, Cathepsin L expression is also regulated by CpG methylation, and gene amplification has been observed in one melanoma cell line (Jean et al., 2006). In tissue culture models, phorbol esters, certain oncogenes such as ras, v-src, SV40 large T and raf, cytokines such as IL-1, IL-6 and TNF-alpha, and hypoxia have all been shown to induce cathepsin L expression (Troen et al., 1991; Lemaire et al., 1994; Kakegawa et al., 1995; Lemaire et al., 1997; Heinrich et al., 2000; Gerber et al., 2001; Jean et al., 2008).
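
As a rough consistency check on the figures given for the hCATL-A transcripts, the 5′ untranslated region lengths of the splice variants can be estimated by simple subtraction. The Python sketch below rests on two assumptions that the text does not state explicitly: that the -290 start site corresponds to a 290-nt 5′UTR for hCATL-A, and that the 27, 90 and 144 nt missing from the 3′ end of exon 1 all fall within that untranslated region (consistent with the first AUG lying in exon 2). The function and variable names are illustrative only.

```python
# Illustrative arithmetic only. The 290-nt baseline is an assumption derived
# from the reported -290 transcription start site; it is not a measured value.
HCATL_A_5UTR_NT = 290

# Nucleotides reported as missing from the 3' end of exon 1 in each variant.
EXON1_DELETIONS_NT = {
    "hCATL-AI": 27,
    "hCATL-AII": 90,
    "hCATL-AIII": 144,
}


def estimated_5utr_lengths(full_length=HCATL_A_5UTR_NT, deletions=None):
    """Return estimated 5'UTR lengths (nt) for hCATL-A and its splice variants."""
    if deletions is None:
        deletions = EXON1_DELETIONS_NT
    lengths = {"hCATL-A": full_length}
    for variant, removed in deletions.items():
        if removed > full_length:
            raise ValueError(f"{variant}: deletion exceeds the 5'UTR length")
        lengths[variant] = full_length - removed
    return lengths


if __name__ == "__main__":
    for variant, length in estimated_5utr_lengths().items():
        print(f"{variant}: ~{length} nt")
```

Under these assumptions the variants would carry 5′UTRs of roughly 263, 200 and 146 nt, which fits the description of progressively shorter leaders.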

Pseudogene

Using the program TBLASTN, three pseudogenes closely related to cathepsin L were identified on chromosome 10 (Rossi et al., 2004).

Proteins

Atlas Image
Shown is a schematic representation of the various human cathepsin L isoforms. The full-length protein is composed of a signal peptide (pre), a pro-domain and the mature polypeptide. PF stands for proform, SC for single chain, HC for heavy chain, and S corresponds to the various short isoforms that are devoid of a signal peptide and whose translation is initiated from internal, in-frame AUGs located within the prodomain coding sequence.

Description

CTSL1 codes for a protein of 333 amino acids. Like other lysosomal proteases, cathepsin L is translated as an inactive pre-pro-enzyme. The pre-region, located at the amino terminus of the protein, is a 17 amino acid signal peptide (or signal sequence). Human cathepsin L is N-glycosylated at Asn 204. This glycosylation is not required for proper folding of the protein or for its enzymatic activity or stability, but is important for lysosomal targeting (Smith et al., 1989; Kane, 1993). In certain cases, especially in transformed cells, secretion of procathepsin L (MEP) is observed (Gottesman, 1978). The crystal structure of procathepsin L has been solved (Coulombe et al., 1996). Structurally, like most enzymes of the papain family, mature cathepsin L consists of two globular domains, the R-domain (right) and the L-domain (left) (Turk et al., 1997). These two domains are organized to form an open V-shaped active-site cleft. The propiece of cathepsin L can also be divided into two regions. The amino-terminal part, which consists of the first 60 amino acids, is important for proper folding and glycosylation of procathepsin L (Chapman et al., 1997). The carboxy-terminal part of the prodomain is responsible for the inhibitory role of the propiece: it prevents entry of the substrate into the active-site cleft. When the amino-terminal part binds to the prodomain binding loop, the carboxy-terminal part of the proregion bends over the groove of the active site in the direction opposite to that of a bound substrate (Coulombe et al., 1996). Removal of the propiece occurs via an intra- and/or intermolecular mechanism as the zymogen reaches the acidic environment of late endosomes or lysosomes (Nishimura et al., 1989; Salminen et al., 1990; Nomura et al., 1997; Ishidoh et al., 2002). For the enzyme to become active, the proregion must be removed (Mason et al., 1992; Ishidoh et al., 1995).
Studies have shown that other lysosomal enzymes, such as cathepsin D, can also process procathepsin L and be involved in the initial steps of activation (Nishimura et al., 1989). In some cell types, the mature single-chain form of cathepsin L is further processed into a two-chain form, consisting of a heavy and a light chain, by cleavage in the carboxy-terminal region (Mason et al., 1985; Gal et al., 1986; Erickson, 1989). Optimal activity of mature cathepsin L requires a slightly acidic pH and a reducing environment, which keeps the active-site cysteine in its reduced state. With small synthetic peptides as substrates, the maximal activity of cathepsin L is at pH 5.5. The enzyme is most stable between pH 4.5 and 5.5. As in other cysteine proteases, the active site of cathepsin L is composed of a reactive cysteine (Cys 25) and a histidine (His 159). In the active form, both residues are charged, forming a thiolate-imidazolium ion pair (McGrath, 1999). Cathepsin L prefers a hydrophobic residue (mainly L/I) in the P2 position (cleavage occurs between residues P1 and P1′) (Chapman et al., 1997). In addition, a shorter isoform of Cathepsin L has been detected. Translation initiated at downstream, in-frame AUGs generates a protein that is devoid of a signal peptide and therefore cannot be routed to the endoplasmic reticulum (Goulet et al., 2004; Goulet et al., 2007).
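
The P2 preference described in this section can be illustrated with a deliberately naive scan of a peptide sequence. The Python sketch below flags every peptide bond whose P2 residue is leucine or isoleucine, using the convention that cleavage occurs between P1 and P1′. It is not a validated cleavage-site predictor: substrate recognition depends on more than the P2 residue, and the function name, the default residue set and the example peptide are all hypothetical.

```python
# Naive illustration of the P2 rule only; not a cleavage-site predictor.
# Function name, default residue set and example peptide are hypothetical.

def candidate_cleavage_sites(peptide, p2_residues=frozenset("LI")):
    """List bonds whose P2 residue is in `p2_residues`.

    Each hit is (index_of_P1, P2, P1, P1_prime), with a 0-based index.
    The scissile bond lies between P1 and P1' (the residue after P1).
    """
    peptide = peptide.upper()
    sites = []
    # A site needs three consecutive residues: P2, P1 and P1'.
    for i in range(len(peptide) - 2):
        p2, p1, p1_prime = peptide[i], peptide[i + 1], peptide[i + 2]
        if p2 in p2_residues:
            sites.append((i + 1, p2, p1, p1_prime))
    return sites


if __name__ == "__main__":
    for index, p2, p1, p1_prime in candidate_cleavage_sites("GALRSIKD"):
        print(f"P2={p2} P1={p1} | P1'={p1_prime} (bond after position {index + 1})")
```

For the made-up peptide GALRSIKD, the scan reports two candidate bonds, R|S (P2 = L) and K|D (P2 = I).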

Localisation

Cathepsin L is ubiquitously expressed. It is generally localized to the endosomes/lysosomes or secreted. More recently, several groups have reported the presence of Cathepsin L in the nucleus and the cytoplasm of various cell types (Goulet et al., 2004; Bulynko et al., 2006; Varanou et al., 2006; Sever et al., 2007; Duncan et al., 2008). Moreover, in neuroendocrine chromaffin cells, Cathepsin L is detected in regulated secretory vesicles (Yasothornsrikul et al., 2003).

Function

Cathepsin L is a lysosomal enzyme originally thought to be involved only in terminal protein degradation. However, cathepsin knockout mice showed that terminal protein degradation is not the work of a single cathepsin, as none of these mice had defects in protein degradation. The various phenotypes of these mice suggested instead that this protease has other, specific biological roles. One cathepsin L knockout mouse was generated, and two mouse strains carrying natural mutations within the cathepsin L gene, furless and nackt, were also identified. The furless mice carry a G-to-A mutation that substitutes arginine for glycine at position 149 of the cathepsin L protein sequence, resulting in an inactive enzyme (Roth et al., 2000). The nackt mice carry a deletion in the cathepsin L gene that prevents the generation of any functional protein (Benavides et al., 2002). These three animal models have revealed that Cathepsin L plays a role in various physiological events in different tissues.
For example, Cathepsin L is important for epidermal homeostasis, regulation of the hair cycle, control of keratinocyte proliferation, MHC class II-mediated antigen presentation and selection of CD4+ T cells in cortical epithelial cells of the thymus (Nakagawa et al., 1998; Roth et al., 2000; Benavides et al., 2002; Reinheckel et al., 2005). Cathepsin L expression in thymocytes has been shown to be essential for natural killer T (NKT) cell development (Honey et al., 2002). Knockout mouse models have demonstrated that Cathepsin L directly participates in atherosclerosis and in neovascularisation induced by endothelial progenitor cells (Maehr et al., 2005; Urbich et al., 2005; Kitamoto et al., 2007). These knockout mice have been observed to develop heart disease similar to human dilated cardiomyopathy (Stypmann et al., 2002). In general, pups lacking cathepsin L also have a slightly higher mortality upon weaning than their littermates (Reinheckel et al., 2001). Cathepsin L is responsible for the processing of viral proteins (Chandran et al., 2005; Pager et al., 2005; Kaletsky et al., 2007; Bosch et al., 2008) and for the generation of neuropeptides and thyroid hormone (Funkelstein et al., 2008; Funkelstein et al., 2008). Cathepsin L activity controls adipogenesis and glucose tolerance by degrading matrix fibronectin and processing the insulin receptor and IGF-1R beta (Yang et al., 2007). It is involved in intestinal epithelial cell polarization and differentiation (Boudreau et al., 2007) and in proteinuric kidney disease (Sever et al., 2007). Nuclear Cathepsin L was shown to proteolytically process a transcription factor during the G1/S progression of the cell cycle (Goulet et al., 2004) and histone H3 during mouse embryonic stem cell differentiation (Duncan et al., 2008). The landscape of histone modifications on the Y chromosome and pericentromeric heterochromatin is stabilized by Cathepsin L (Bulynko et al., 2006).
The role of Cathepsin L in cancer has been studied extensively. Secreted Cathepsin L degrades basement membrane and extracellular matrix and could therefore promote the development of metastases. Intracellular cathepsin L activity can lead to activation of oncogenes or inactivation of tumor suppressors (Goulet et al., 2007). A recent paper also indicates that Cathepsin L plays a role in drug resistance (Zheng et al., 2008).

Homology

Human Cathepsin L1 belongs to the papain superfamily. Cathepsin L2 (formerly called Cathepsin V) originated from an ancestral Cathepsin L, as the two proteins share 77% amino acid identity. Moreover, both are similar to mouse cathepsin L (72% and 75% identity, respectively) and to cathepsin L of other mammals (Itoh et al., 1999).

Mutations

Note

Not determined.

Implicated in

Entity name
Various cancers
Oncogenesis
Cathepsin L was initially identified as the major excreted protein (MEP) secreted from transformed fibroblastic cells (Gottesman, 1978; Troen et al., 1987; Troen et al., 1988). Oncogenic signals such as Ras, Raf, v-Src, fos and SV40 large T, as well as tumor promoters like phorbol esters, can induce MEP expression and secretion (Joseph et al., 1987; Taniguchi et al., 1990; Troen et al., 1991; Lemaire et al., 1994; Heinrich et al., 2000). Moreover, cathepsin L secretion correlates with the metastatic potential of transformed cell lines (Denhardt et al., 1987; Chambers et al., 1992). Increased cathepsin L activity and secretion have been observed in many human cancers (Watanabe et al., 1987; Sheahan et al., 1989; Chauhan et al., 1991; Heidtmann et al., 1993; Nishida et al., 1995; Plebani et al., 1995; Park et al., 1996; Shuja et al., 1996; Sivaparvathi et al., 1996; Leto et al., 1997; Kim et al., 1998; Dohchin et al., 2000). Various reports also suggest that cathepsin L levels could be used as an indicator of tumor aggressiveness and metastasis (Thomssen et al., 1995; Park et al., 1996). Increased nuclear Cathepsin L expression and activity were recently found in various cancer cells, suggesting a different mechanism of cellular transformation (Goulet et al., 2007).
Therefore, although the association of cathepsin L and cancer is well established, its specific roles have not yet been fully elucidated.
Entity name
Breast cancer
Prognosis
In breast cancer, numerous studies have indicated that secreted cathepsin L could be a strong and independent prognostic factor, with a strength similar to lymph node status and grading (Castiglioni et al., 1994; Thomssen et al., 1995; Duffy, 1996; Foekens et al., 1998; Thomssen et al., 1998; Harbeck et al., 2000; Harbeck et al., 2001; Levicar et al., 2002). Cathepsin L expression could also predict response to adjuvant chemotherapy (Jagodic et al., 2005).
Entity name
Gastric carcinoma
Prognosis
Cathepsin L expression correlates with an early event in gastric carcinogenesis and with depth of invasion in early-stage gastric carcinoma. Higher expression is associated with a worse prognosis (Plebani et al., 1995; Farinati et al., 1996; Dohchin et al., 2000).
Entity name
Skin cancer
Prognosis
A higher concentration of Cathepsin L in early primary melanomas correlates with poor prognosis and indicates possible early metastatic spread (Stabuc et al., 2006). In squamous cell carcinoma, Cathepsin L is overexpressed in malignant cells mainly at the periphery of the tumor. Cathepsin L is also overexpressed in various inflammatory skin diseases such as psoriasis and atopic eczema (Bylaite et al., 2006).
Entity name
Ovarian cancer
Prognosis
Cathepsin L expression is increased in ovarian cancer samples as well as in the serum of patients with ovarian cancer. Serum levels of Cathepsin L could be used in the early detection of ovarian cancers (Nishida et al., 1995).
Entity name
Bladder cancer
Prognosis
Urinary Cathepsin L is an independent predictor of bladder urothelial cell cancer and invasiveness (Svatek et al., 2008).
Entity name
Brain cancer (neuroblastoma)
Prognosis
Cathepsin L has no prognostic value in glioma, but its expression is increased in tumor cells (Strojnik et al., 2005). In invasive benign meningioma and pituitary adenomas, Cathepsin L levels are also higher (Strojnik et al., 2001; Strojnik et al., 2005).
Entity name
Pancreatic adenocarcinoma
Prognosis
Cathepsin L is a strong independent prognostic marker in resectable cancers (Niedergethmann et al., 2004).

Other Information

Locus ID:

NCBI: 1514
MIM: 116880
HGNC: 2537
Ensembl: ENSG00000135047

Variants:

dbSNP: 1514
ClinVar: 1514
TCGA: ENSG00000135047
COSMIC: CTSL

RNA/Proteins

Gene ID | Transcript ID | Uniprot
ENSG00000135047 | ENST00000340342 | P07711
ENSG00000135047 | ENST00000340342 | A0A024R276
ENSG00000135047 | ENST00000342020 | Q5T8F0
ENSG00000135047 | ENST00000343150 | P07711
ENSG00000135047 | ENST00000343150 | A0A024R276

Pathways

Pathway | Source | External ID
Autophagy - animal | KEGG | ko04140
Apoptosis | KEGG | ko04210
Antigen processing and presentation | KEGG | ko04612
Autophagy - animal | KEGG | hsa04140
Apoptosis | KEGG | hsa04210
Antigen processing and presentation | KEGG | hsa04612
Lysosome | KEGG | ko04142
Lysosome | KEGG | hsa04142
Phagosome | KEGG | ko04145
Phagosome | KEGG | hsa04145
Rheumatoid arthritis | KEGG | ko05323
Rheumatoid arthritis | KEGG | hsa05323
Proteoglycans in cancer | KEGG | hsa05205
Proteoglycans in cancer | KEGG | ko05205
Immune System | REACTOME | R-HSA-168256
Adaptive Immune System | REACTOME | R-HSA-1280218
Class I MHC mediated antigen processing & presentation | REACTOME | R-HSA-983169
Antigen processing-Cross presentation | REACTOME | R-HSA-1236975
Endosomal/Vacuolar pathway | REACTOME | R-HSA-1236977
MHC class II antigen presentation | REACTOME | R-HSA-2132295
Innate Immune System | REACTOME | R-HSA-168249
Toll-Like Receptors Cascades | REACTOME | R-HSA-168898
Trafficking and processing of endosomal TLR | REACTOME | R-HSA-1679131
Extracellular matrix organization | REACTOME | R-HSA-1474244
Collagen formation | REACTOME | R-HSA-1474290
Assembly of collagen fibrils and other multimeric structures | REACTOME | R-HSA-2022090
Degradation of the extracellular matrix | REACTOME | R-HSA-1474228
Collagen degradation | REACTOME | R-HSA-1442490
Fluid shear stress and atherosclerosis | KEGG | ko05418
Fluid shear stress and atherosclerosis | KEGG | hsa05418

References

Pubmed ID | Year | Title | Citations
17928356 | 2007 | Proteolysis of the Ebola virus glycoproteins enhances virus binding and infectivity. | 97
19924101 | 2010 | Proteinuria: an enzymatic disease of the podocyte? | 91
22031933 | 2012 | Cathepsin cleavage potentiates the Ebola virus glycoprotein to undergo a subsequent fusion-relevant conformational change. | 69
23536651 | 2013 | TMPRSS2 activates the human coronavirus 229E for cathepsin-independent host cell entry and is expressed in viral target cells in the respiratory epithelium. | 66
15665831 | 2005 | Cathepsin L is required for endothelial progenitor cell-induced neovascularization. | 63
21555518 | 2011 | The BTB and CNC homology 1 (BACH1) target genes are involved in the oxidative stress response and in control of the cell cycle. | 60
18450756 | 2008 | Cathepsin L is responsible for processing and activation of proheparanase through multiple cleavages of a linker segment. | 59
15982660 | 2006 | Cathepsin L expression and regulation in human abdominal aortic aneurysm, atherosclerosis, and vascular cells. | 58
26953343 | 2016 | Glycopeptide Antibiotics Potently Inhibit Cathepsin L in the Late Endosome/Lysosome and Block the Entry of Ebola Virus, Middle East Respiratory Syndrome Coronavirus (MERS-CoV), and Severe Acute Respiratory Syndrome Coronavirus (SARS-CoV). | 56
19958782 | 2010 | Cathepsin L, target in cancer treatment? | 47

Citation

Brigitte Goulet; Alain Nepveu

CTSL (cathepsin L1)

Atlas Genet Cytogenet Oncol Haematol. 2009-01-01

Online version: http://atlasgeneticsoncology.org/gene/40208/ctsl