The most common type of repetitive DNA is the interspersed repeats, or moderately repeated sequences. These do not occur in tandem arrays. Instead, each copy sits on its own at one of very many different loci, and the elements can move, or jump, to new locations. Interspersed repeats account for almost half of human DNA. Individual copies of the same, or nearly the same, sequence, ~100 bp to ~10 kb long, are found at tens of thousands to more than 1 million different positions dispersed throughout the genome. This dispersion is the result of repeated insertions of transposons into new sites during evolution. The interspersed elements are either transposons themselves or are derived from other genomic sequences acted on by transposon enzymes.
|Class||Length||Copy number||Fraction of genome|
|DNA Transposons||2 - 3 kbp||~300,000||3%|
|LTR Retrotransposons||6 - 11 kbp||~440,000||8%|
|LINEs||6 - 8 kbp||~860,000||21%|
|SINEs||100 - 300 bp||~1,600,000||13%|
|Processed Pseudogenes||Variable||1 - ~100||~0.4%|
|Unclassified Spacer DNA|| || ||~25%|
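As a quick check on the statement that interspersed repeats make up almost half of human DNA, the genome fractions in the table can be added up. The snippet below is only a sketch of that arithmetic: the dictionary and function names are illustrative, and the unclassified spacer DNA is left out because it is not one of the repeat classes.

```python
# Genome fractions (percent) copied from the table above.
# The names used here are illustrative, not part of any library.
fractions = {
    "DNA transposons": 3.0,
    "LTR retrotransposons": 8.0,
    "LINEs": 21.0,
    "SINEs": 13.0,
    "Processed pseudogenes": 0.4,
}


def interspersed_total(table):
    """Sum the percentage of the genome occupied by the listed classes."""
    return sum(table.values())


total = interspersed_total(fractions)
print(f"Interspersed repeats: ~{total:.1f}% of the genome")
```

The classes sum to roughly 45%, which is consistent with "almost half".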
Transposons move in one of two ways: direct excision and reinsertion of a DNA element, or insertion of a reverse-transcribed RNA product. A DNA transposon can jump during S phase from a daughter strand into unreplicated DNA, thus increasing its copy number. A DNA transposon integrates by a staggered cleavage of the target DNA, followed by ligation of the target 5’ ends to the transposon and filling of the gaps.
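One consequence of the staggered cleavage is that, once the gaps are filled, the element ends up flanked by a short direct repeat of target sequence (a target site duplication). The sketch below models this on a single strand. The function name `integrate`, the toy sequences and the 4-base stagger are assumptions chosen for illustration, since real stagger lengths depend on the element.

```python
def integrate(target, element, cut_site, stagger):
    """Insert `element` into `target` after a staggered cut.

    One strand is cut at `cut_site` and the other `stagger` bases
    further along. The bases in between are single-stranded on each
    side of the element until gap filling copies them, so they end up
    present on both flanks.
    """
    if stagger < 0 or not 0 <= cut_site <= len(target) - stagger:
        raise ValueError("staggered cut must lie within the target")
    overhang = target[cut_site:cut_site + stagger]
    left = target[:cut_site] + overhang      # flank after gap filling
    right = overhang + target[cut_site + stagger:]
    return left + element + right


# Lowercase = target DNA, uppercase = transposon (toy sequences).
print(integrate("aaaccggttt", "GATTACA", cut_site=3, stagger=4))
# aaaccggGATTACAccggttt
```

In the output, the `ccgg` that was present once in the target now appears on both sides of the inserted element.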
Retrotransposons are transcribed normally, then reverse-transcribed to form a DNA copy that is inserted into a new site. LTR retrotransposons are retroviruses that have lost the ability to exit and reinfect a cell. The upstream LTR acts as a promoter and the downstream LTR contains a poly-A site, so that transcripts are produced from the integrated element. Between the two LTRs is a coding region. Its transcript both encodes the proteins needed for transposition and serves as the template for making the DNA copy.
The retroviral genomic RNA is copied into DNA via priming, extension, jumping and repriming steps:
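- A tRNA primer binds to the primer binding site near the 5' end, and reverse transcriptase extends it to copy the 5' end of the RNA.
- This short DNA jumps to the 3' end of the RNA, where it primes copying of the body of the virus.
- A second DNA strand is then primed near the 3' end and copies that end of the first strand.
- This copy of the 3' end jumps to the other end and reprimes there; extension of both strands forms a double-stranded DNA with both LTRs.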
The double-stranded DNA of an LTR retrotransposon integrates by the same mechanism as a DNA transposon, with a staggered cleavage of the target DNA followed by ligation of the target 5’ ends to the transposon and filling the gaps.
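Because a retrotransposon moves through an RNA intermediate, the donor element stays where it is and each round of transposition adds a copy. The following is a minimal sketch of that copy-and-paste behaviour. The function names and toy sequences are hypothetical, and the model deliberately ignores LTR regeneration and the staggered cut at the target.

```python
def transcribe(dna):
    """Return the RNA transcript of a DNA element (sense strand)."""
    return dna.replace("T", "U")


def reverse_transcribe(rna):
    """Return the sense-strand DNA copy of an RNA (second strand implied)."""
    return rna.replace("U", "T")


def retrotranspose(genome, element, new_site):
    """Copy-and-paste: the donor stays, a DNA copy lands at `new_site`."""
    if element not in genome:
        raise ValueError("element not present in genome")
    dna_copy = reverse_transcribe(transcribe(element))
    return genome[:new_site] + dna_copy + genome[new_site:]


# Lowercase = host DNA, uppercase = retrotransposon (toy sequences).
genome = "cccGATTACAggg"
after = retrotranspose(genome, "GATTACA", new_site=12)
print(after)                    # cccGATTACAggGATTACAg
print(after.count("GATTACA"))   # 2
```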
Non-LTR Retrotransposons (LINEs)
Non-LTR retrotransposons also encode a reverse transcriptase, but use a different mechanism for insertion. LINEs encode two proteins: ORF1, an RNA-binding protein, and ORF2, a reverse transcriptase and endonuclease. The ORF2 protein makes a nick in an A/T-rich region of the target DNA, which allows the nicked strand to prime on the poly-A tail of the LINE RNA. Reverse transcriptase (RT) extends from the 3’ end of the nicked target along the LINE RNA, making a DNA copy. RT reaches the end of the LINE RNA and continues into the target at the staggered cleavage. Insertion is completed by cellular enzymes that copy the second strand, degrade the RNA and ligate the fragments together. Most LINE elements are truncated because the element was copied incompletely during insertion. This makes them inactive for transposition, but they can still be mutagenic upon insertion and can still induce aberrant recombination events. LINEs can result in exon shuffling.
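The truncation follows from the direction of copying: reverse transcription starts at the poly-A tail and runs toward the 5’ end, so stopping early loses the 5’ portion of the element. The sketch below illustrates this under simple assumptions. The function name `target_primed_copy`, the toy RNA sequence and the `bases_copied` parameter are all made up for the example.

```python
def target_primed_copy(line_rna, bases_copied):
    """Return the DNA version of the part of `line_rna` that was copied.

    Copying begins at the 3' poly-A tail and proceeds toward the 5' end,
    so an early stop yields a 5'-truncated insertion.
    """
    if not 0 <= bases_copied <= len(line_rna):
        raise ValueError("bases_copied must be between 0 and the RNA length")
    copied = line_rna[len(line_rna) - bases_copied:]
    return copied.replace("U", "T")


line_rna = "GGCAUGCCGAUGAAAA"   # toy LINE RNA ending in a poly-A tail

full = target_primed_copy(line_rna, len(line_rna))
short = target_primed_copy(line_rna, 8)

print(full)    # GGCATGCCGATGAAAA
print(short)   # GATGAAAA
print(len(short) < len(full))   # True: the 5' end is missing
```

The truncated copy keeps the poly-A tail but lacks the 5’ region, which in a real element would include the promoter and part of the coding sequence.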
Short Interspersed Elements (SINEs) do not encode proteins but transpose by the same mechanism as LINEs, presumably using the LINE proteins.
SINEs carry within them a promoter for RNA Pol III, allowing new RNA copies to be made. The most common SINE is the Alu element, named for an AluI restriction site it contains. Alu elements were originally derived from 7SL RNA, a cytoplasmic small RNA involved in protein secretion. A single Alu element can evolve into a new exon. Alu elements can also result in exon shuffling.