(1) Department of Chemistry and Biochemistry, Kennesaw State University, Kennesaw, Georgia, USA
(2) Department of Pediatrics, The University of Texas M.D. Anderson Cancer Center, Houston, Texas, USA
Although most naturally occurring DNA and RNA adopt the now quite familiar double-helix structure, certain sequences can, under appropriate conditions, adopt a three-stranded, triple-helical structure. Both intramolecular and intermolecular triplexes have been described. Evidence for the existence of triplex structures
More studies will be needed before the full potential of triplex nucleic acids and their role in cancer can be realised.
Although the right-handed double-helical structure of B-form DNA is now quite famous, having been determined in 1953 through the work of Watson and Crick among others[1,2,3], it was shortly thereafter noted that certain homopolymer DNA sequences preferentially adopted a three-stranded, triple-helical structure. In this structure, the third strand lies within the major groove of the double helix, interacting with purine bases on one strand of the duplex through either Hoogsteen or reverse-Hoogsteen base pairing. A schematic representation of an intermolecular triplex is shown in Figure 1A.
Triplex nucleic acids. (A) Schematic representation of an intermolecular triplex. The third strand (black) resides in the major groove of the duplex nucleic acid, hydrogen bonding to the purine-rich duplex strand. Note that the third strand may be part of a larger nucleic acid (dotted line extensions). (B) Chemical structures of T*A:T and C[+]*G:C base triplets and strand orientations present in pyrimidine motif triplexes. (C) Chemical structures of A*A:T, G*G:C, and T*A:T base triplets and strand orientations present in purine motif triplexes. An asterisk (*) indicates Hoogsteen hydrogen bonding between complementary bases; a colon (:) indicates Watson–Crick hydrogen bonding between complementary bases. Arrowhead indicates 3’ end of nucleic acid strand.
Two triplex motifs have been described[6,7]. In the purine motif, a purine-rich third strand binds with an antiparallel orientation relative to the purine-rich acceptor strand of the duplex. The primary base triplets are G*G:C and A*A:T or T*A:T, where a colon indicates the conventional Watson–Crick base pairing present in the duplex and an asterisk indicates reverse-Hoogsteen base pairing between bases in the third strand (G, A, or T) and purine acceptors in the target duplex. In the pyrimidine motif, a pyrimidine-rich third strand binds with a parallel orientation relative to the purine-rich acceptor strand of the duplex. The primary base triplets are T*A:T and C[+]*G:C, where an asterisk indicates Hoogsteen base pairing between bases in the third strand, either thymine or protonated cytosine, and purine acceptors in the target duplex. Typical triplex DNAs range from ~10 to many tens of contiguous base triplets in length. Graphical representations of the different triplex motifs and the hydrogen bonding between nucleic acid bases are shown in Figure 1B,C.
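The pairing and orientation rules of the two motifs can be illustrated with a short sketch. The function name `design_third_strand` and the two lookup tables are hypothetical and introduced here only for illustration; the sketch assumes a pure homopurine target written 5'→3', uses A rather than T opposite A in the purine motif, and says nothing about whether the resulting triplex would be stable.

```python
# Base-triplet rules taken from the text; all names are illustrative only.
PYRIMIDINE_MOTIF = {"A": "T", "G": "C"}  # T*A:T and C+*G:C (C must be protonated)
PURINE_MOTIF = {"A": "A", "G": "G"}      # A*A:T and G*G:C (T*A:T is also possible)


def design_third_strand(purine_strand: str, motif: str) -> str:
    """Return a candidate third strand, written 5'->3'.

    purine_strand is the purine-rich acceptor strand of the duplex,
    also written 5'->3'.
    """
    target = purine_strand.upper()
    if not target or set(target) - {"A", "G"}:
        raise ValueError("target must be a non-empty homopurine (A/G) sequence")
    if motif == "pyrimidine":
        # Parallel binding: map base by base in the same 5'->3' direction.
        return "".join(PYRIMIDINE_MOTIF[base] for base in target)
    if motif == "purine":
        # Antiparallel binding: map base by base over the reversed target
        # so that the result is still written 5'->3'.
        return "".join(PURINE_MOTIF[base] for base in reversed(target))
    raise ValueError("motif must be 'pyrimidine' or 'purine'")


if __name__ == "__main__":
    print(design_third_strand("AAGGA", "pyrimidine"))
    print(design_third_strand("AAGGA", "purine"))
```

The only difference between the two branches, apart from the base mapping, is the reversal step, which encodes the antiparallel orientation of the purine motif third strand.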
Although most triplex structures described to date are composed exclusively of DNA, both pure RNA and mixed RNA:DNA triplex species have also been identified[8,9,10,11]. However, it should be noted that not all possible combinations of nucleic acids form stable triplexes under physiological conditions. For example, a purine motif triplex composed entirely of DNA is stable but a comparable triplex containing a single RNA strand substitution is not. Presumably this inability to form triplexes stems from structural issues regarding steric repulsion, the width of the major groove and base pitch, these factors limiting third strand accessibility to purine acceptors in the duplex. Sequence considerations are also critically important in determining the relative stability of these species, especially under physiological conditions. Thus, pyrimidine motif triplex-forming nucleic acids tend to be deficient in cytosine, which is not highly protonated at physiological pH given its p
Triplexes are often thought of as a separate nucleic acid molecule interacting with a target duplex. One example would be a T-rich RNA interacting with a complementary homopurine–homopyrimidine DNA duplex through intermolecular pyrimidine motif triplex formation. However, examples have been found where a homopurine–homopyrimidine DNA duplex dissociates partially into single strands (e.g. as a result of superhelical torsional strain) and one strand then forms a triplex with a proximal complementary homopurine–homopyrimidine DNA duplex present in the same molecule. These are referred to as intramolecular triplexes (Figure 2A). Both pyrimidine (H-DNA) and purine (H’-DNA) motif intramolecular triplexes have been described[14,15]. In addition, four different intramolecular triplex isomers can exist for any particular homopurine–homopyrimidine sequence, depending on which motif is involved and which half-element strand serves as the third strand (Figure 2B). Most intramolecular triplexes investigated to date have been composed entirely of DNA. However, recent studies with certain long non-coding RNAs lacking 3’ poly(A) tails (e.g. MALAT1, MENβ) have found that their intrinsic high stability results from their ability to form intramolecular pyrimidine motif RNA triplexes[16,17].
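As a rough illustration of how candidate homopurine–homopyrimidine elements might be located in a sequence, the following sketch scans one strand for uninterrupted purine or pyrimidine runs. The function name `find_homopolymeric_tracts` and the default minimum length of 10 (taken from the "~10 contiguous base triplets" noted earlier) are assumptions made for this example. A run found this way is only a candidate: sequence alone does not establish that an intermolecular or intramolecular triplex actually forms.

```python
import re


def find_homopolymeric_tracts(seq: str, min_len: int = 10):
    """Return (start, end, kind, tract) for each purine-only or
    pyrimidine-only run of at least min_len bases on the given strand.

    start and end are 0-based, end-exclusive. A homopurine run on this
    strand implies a homopyrimidine run on the complementary strand.
    """
    pattern = re.compile(r"[AG]{%d,}|[CT]{%d,}" % (min_len, min_len))
    hits = []
    for match in pattern.finditer(seq.upper()):
        kind = "purine" if match.group()[0] in "AG" else "pyrimidine"
        hits.append((match.start(), match.end(), kind, match.group()))
    return hits


if __name__ == "__main__":
    for hit in find_homopolymeric_tracts("ACGTAGGAGAAGGACGAC"):
        print(hit)
```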
Intramolecular triplexes. (A) Schematic representation of an intramolecular triplex. The third strand (grey) resides in the major groove of the duplex nucleic acid and hydrogen bonds to the purine-rich duplex strand while its complement (white) remains primarily single-stranded. (B) Schematic representations of the four possible forms of intramolecular DNA triplexes. For H-DNA isomers H-y3 (1͇) and H-y5 (2͇), the third strand is pyrimidine-rich and originates from either the 3’ or 5’ end of an oligopyrimidine duplex strand (light grey), respectively. For H’-DNA isomers H-r3 (3͇) and H-r5 (4͇), the third strand is purine-rich and originates from either the 3’ or 5’ end of an oligopurine duplex strand (black), respectively.
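The four isomer names follow directly from two independent choices, the motif and the end of the half-element strand that supplies the third strand. A minimal sketch of that combinatorial naming is given below; the function name `intramolecular_isomers` and the dictionary layout are hypothetical and serve only to make the naming scheme explicit.

```python
from itertools import product

# Motif code -> (family name, third-strand composition), as described in the text.
MOTIFS = {
    "y": ("H-DNA", "pyrimidine-rich"),
    "r": ("H'-DNA", "purine-rich"),
}
ENDS = ("3", "5")  # end of the duplex strand from which the third strand originates


def intramolecular_isomers():
    """Enumerate the four intramolecular triplex isomers by motif and strand end."""
    isomers = []
    for (code, (family, third_strand)), end in product(MOTIFS.items(), ENDS):
        isomers.append(
            {
                "name": f"H-{code}{end}",
                "family": family,
                "third_strand": third_strand,
                "origin": f"{end}' end",
            }
        )
    return isomers


if __name__ == "__main__":
    for isomer in intramolecular_isomers():
        print(isomer)
```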
Although most studies with triplexes have been performed
One line of evidence for the
Triplex-binding proteins are present in many organisms. Electrophoretic mobility shift assays were assembled containing either 0.1 nM radiolabelled purine motif triplex (A) or duplex probe (B), 2 µg poly(dI-dC) carrier DNA, and various amounts of whole cell or nuclear extracts from different organisms, as indicated. (Lane 1) control reaction, no protein. (2) 6.7 µg
Triplex structures may form
The authors have referenced some of their own studies in this article. These referenced studies have been conducted in accordance with the Declaration of Helsinki (1964) and the protocols of these studies have been approved by the relevant ethics committees related to the institution in which they were performed. All human subjects in these studies gave their informed consent for their participation and use of derived materials.
Although the existence of persistent triple-helical structures
If triplexes exist and promote cancer, then the loss of proteins that prevent triplex formation and/or dissociate triplex structures should promote cancer. This is true for the DNA helicases BLM and WRN and for the bifunctional RNA and DNA helicase FANCJ. BLM is part of the BRCA1-associated genome surveillance complex and is normally involved in DNA replication and repair. Defects in BLM are the cause of Bloom’s syndrome, characterised by proportionate prenatal and postnatal growth deficiency, sun-sensitive telangiectatic hypopigmented and hyperpigmented skin, predisposition to general malignancy and chromosomal instability. WRN has both helicase and exonuclease activities and is involved in resolving inappropriate structures during recombination, DNA replication and repair. Defects in WRN are the cause of Werner syndrome, characterised by the premature onset of multiple age-related disorders, including atherosclerosis, non-insulin-dependent diabetes mellitus, ocular cataracts, osteoporosis and the appearance of rare cancers (e.g. osteosarcomas and chondrosarcomas). FANCJ (also known as BRIP1 or BACH1) is also associated with BRCA1 and is normally involved in DNA double-strand break repair by homologous recombination. Defects in FANCJ are the cause of Fanconi anaemia complementation group J and result in anaemia, leukopaenia and thrombopaenia. Loss of FANCJ has been described in several cancers but especially associated with breast carcinomas. However, it should be noted that while all of these helicases can act upon triplex structures, they can also act upon other non-B structures (e.g. G-quadruplex, cruciform, slipped-strands) as well as normal duplex nucleic acids. Thus, it cannot be said for certain which structure’s resolution is most responsible for the effects of these helicases on cancer.
By similar reasoning, if triplexes exist and promote cancer, then proteins that facilitate triplex formation and/or stabilise triplex structures may promote cancer. This has been more difficult to establish for triplex-binding proteins. For example, while the human Orc4 protein plays an essential role in the initiation of replication and has been found to preferentially bind pyrimidine motif triplex DNA, overexpression of this protein has not been strongly implicated in any cancer[24,43]. Certain high mobility group proteins (e.g. HMGB1) have been reported to promote the formation of purine motif triplexes. However, they are well known to interact with other nucleic acid structures, preferentially single-stranded DNA, and have paradoxical oncogenic and tumour suppressive roles in several cancers[44,45]. Finally, while RPA preferentially directs XPA to DNA damage proximal to triplexes and facilitates repair, overexpression of RPA adversely impacts homologous recombination and elevates genomic instability, suggesting that the aforementioned repair does not reduce the incidence of certain cancers.
The best evidence for a relationship between triplex-binding proteins and cancer can be found in the recent work of Nelson et al. Examining extracts from 63 human colorectal tumour and adjacent normal tissues using an electrophoretic mobility shift assay (EMSA), they found significantly higher levels of one triplex species (H3, originally described by Musso et al.) in tumour extracts than in corresponding normal tissue extracts. The ratio of H3 observed in tumour versus normal tissue (T/N) significantly correlated with lymph node disease (N-stage), metastasis and a reduction in overall survival following 65 months of observation. However, similar correlations were not observed for other triplex species observed in these extracts. Using affinity chromatography, nano-scale high-performance liquid chromatography and electrospray ionisation tandem mass spectrometry, they were able to identify three proteins specifically bound to their purine motif triplex DNA probe: 100-kDa polypyrimidine-tract binding-associated splicing factor PSF, 60-kDa nuclear RNA-binding protein P54nrb, and 65-kDa U2 small nuclear RNA auxiliary factor 2 isoform b (U2AF65). PSF and P54nrb form heterodimers and function as RNA polymerase II-associated splicing factors. U2AF65 is also a known RNA polymerase II-associated splicing factor, involved in the recognition of degenerate pyrimidine tracts downstream of the branch point during spliceosome assembly. Involvement of U2AF65 in the H3 species was confirmed using anti-U2AF65 MC3 antibody and a super-shift EMSA experiment. The roles of PSF or P54nrb in any other EMSA species were not confirmed. Of the 63 patient samples, 51 were then investigated by western blotting. This confirmed the correlation of U2AF65 with H3 levels and showed increased U2AF65 expression in advanced clinical stages (UICC Stage III and IV, Dukes C and D). Similar correlations could not be made with either PSF or P54nrb.
Curiously, western blotting indicated a strong correlation between H3 levels and the DNA helicase WRN, suggesting coordinate regulation of triplex-stabilising and triplex-destabilising activities.
Are triplexes and/or their interacting proteins suitable markers for cancer prognosis, diagnosis or targeting? This remains an open question at this time. Multiple lines of evidence suggesting the existence of triplex nucleic acids
MW Van Dyke is supported by the National Institute of General Medical Sciences (1R15GM104833-01) and was supported by a Faculty Research and Creative Activities Award from his previous institution (Western Carolina University, Cullowhee NC, USA).
All authors contributed to the conception, design, and preparation of the manuscript, as well as read and approved the final manuscript.
All authors abide by the Association for Medical Ethics (AME) ethical rules of disclosure.
Purine motif triplex formation.
| Triplex → / Strand ↓ | D*D:D | D*D:R | D*R:D | D*R:R | R*D:D | R*D:R | R*R:D | R*R:R |
Pyrimidine motif triplex formation.
(D) DNA, (R) RNA, (*) Hoogsteen or reverse-Hoogsteen hydrogen bonding, (:) Watson-Crick hydrogen bonding, (3) third strand of triplex, (Pu) purine-rich strand of duplex, (Py) pyrimidine-rich strand of duplex. Relative triplex stabilities range from none (–), marginal (~), weak (+), moderate (++) to strong (+++). Multiple stabilities indicate values observed for different triplex sequences. First value is dominant among three independent investigations