
Transposable elements were first discovered by Barbara McClintock in kernels of corn, where certain mutations caused loss and reinstatement of purple pigment (due to gain and loss of an insertion element that activated pigment genes). The human genome contains ~300,000 DNA transposons, which have accelerated evolution by exploiting the modularity of exons and regulatory regions. Transposition is one avenue for exon shuffling, whereby an exon and its two flanking transposons are excised and reinserted elsewhere as a single element (potentially adding a new exon to a gene). There are conservative (cut + paste) and replicative (copy + paste) mechanisms for transposition. However, transposition is potentially mutagenic and over-transposition is very deleterious; thus, it remains a rare event that occurs in about 1 in 10⁵ to 1 in 10⁷ cells per generation. Transposable elements occur in both eukaryotes (transposons; retrotransposons) and prokaryotes (insertion elements).

Next Steps Transposons and retrotransposons are discussed in the article about eukaryotic chromosomes.

A gene is the nucleic acid sequence needed to synthesize a particular gene product. A gene includes more than just the coding region that encodes an RNA transcript; there are also control regions that govern synthesis, processing and translation of the RNA transcript. In prokaryotes, the entire coding region encodes a continuous polypeptide sequence. In eukaryotes, coding regions contain exons (50-250 nucleotides) that encode polypeptide sequences and introns (500-50,000 nucleotides, removed during RNA processing) that do not. Higher eukaryotes not only have introns within genes, but also large intergenic regions. For example, a ~80 kb region in Saccharomyces cerevisiae (baker’s yeast) contains 40 genes, whereas the ~80 kb region encompassing the human β-globin cluster contains only 5 genes. This extra DNA comes from multiple repeats. Exons often encode modular units that are included or excluded via RNA processing. Exons are usually highly conserved while introns are barely conserved. For example, SUR2 exons are 90% identical between mice and humans while SUR2 introns are less than 10% identical between mice and humans. A lack of inter-species sequence conservation often suggests a lack of function.

diagram showing exons and introns within a gene

Monocistronic Most eukaryotic genes are monocistronic, meaning their mRNAs encode a single protein. Often, a eukaryotic primary transcript forms a single mRNA that encodes a single protein. Most eukaryotic mRNAs have a 5′ cap structure that directs ribosome binding, with translation beginning only at the closest AUG codon.
Polycistronic Prokaryotic genes are mostly polycistronic, with one mRNA encoding multiple proteins involved in a biological process. Along the mRNA, there is a ribosome binding site near each coding region’s start site. Translation can initiate at any of these sites, allowing production of different proteins from one mRNA.

A transcription unit is a region of DNA that is transcribed under the control of a particular promoter. While a gene and a transcription unit (like the lac operon) are distinguishable in prokaryotes, the two terms are used interchangeably in eukaryotes. There are simple and complex eukaryotic transcription units. The RNA transcript of a simple transcription unit is processed to yield a single mRNA encoding a single protein. Complex transcription units, which are more common, encode an RNA transcript that is processed to form different monocistronic mRNAs, each encoding a different protein. A single transcript can follow different mRNA processing pathways via:

Alternative Splicing mRNAs have the same 5′ and 3′ exons but different internal exons.
Alternative Poly(A) Sites mRNAs have the same 5′ exons but different 3′ exons.
Alternative Promoters mRNAs have different 5′ exons but share 3′ exons.
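The alternative splicing case above can be sketched as a toy model in Python. This is only an illustration under simplifying assumptions: exons are represented as labels, every internal exon is treated as independently skippable, and the function name `alternative_splice_isoforms` and the exon labels are hypothetical. Real splicing is regulated and far fewer combinations are actually produced.

```python
from itertools import combinations


def alternative_splice_isoforms(exons):
    """List toy mRNA isoforms that share the first (5') and last (3') exon
    but differ in which internal exons are kept.

    Assumption: every internal exon can be skipped independently.
    """
    if len(exons) < 2:
        raise ValueError("need at least a 5' exon and a 3' exon")
    first, *internal, last = exons
    isoforms = []
    # Keep k internal exons at a time, from all of them down to none.
    # combinations() preserves the original 5'-to-3' order of the exons.
    for k in range(len(internal), -1, -1):
        for kept in combinations(internal, k):
            isoforms.append([first, *kept, last])
    return isoforms


if __name__ == "__main__":
    # Hypothetical four-exon primary transcript.
    for isoform in alternative_splice_isoforms(["E1", "E2", "E3", "E4"]):
        print("-".join(isoform))
```

Every isoform printed starts with E1 and ends with E4, matching the description of alternative splicing given in the list.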
Next Steps Read about the eukaryotic chromosome.
Chi-Squared (Χ²) Table
Degrees of Freedom | p = 0.9 | p = 0.5 | p = 0.1 | p = 0.05 | p = 0.01
1 | 0.02 | 0.46 | 2.71 | 3.84 | 6.64
2 | 0.21 | 1.39 | 4.61 | 5.99 | 9.21
3 | 0.58 | 2.37 | 6.25 | 7.82 | 11.35
4 | 1.06 | 3.36 | 7.78 | 9.49 | 13.28
5 | 1.61 | 4.35 | 9.24 | 11.07 | 15.09
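As a rough illustration of how the Χ² table can be used, the Python sketch below computes a goodness-of-fit statistic and compares it with a critical value. The critical values are copied from the table; the function names and the cross counts are hypothetical and chosen only for illustration. For a simple goodness-of-fit test with no estimated parameters, the degrees of freedom are the number of categories minus one.

```python
# Critical values copied from the table: degrees of freedom -> probability -> value.
CRITICAL = {
    1: {0.9: 0.02, 0.5: 0.46, 0.1: 2.71, 0.05: 3.84, 0.01: 6.64},
    2: {0.9: 0.21, 0.5: 1.39, 0.1: 4.61, 0.05: 5.99, 0.01: 9.21},
    3: {0.9: 0.58, 0.5: 2.37, 0.1: 6.25, 0.05: 7.82, 0.01: 11.35},
    4: {0.9: 1.06, 0.5: 3.36, 0.1: 7.78, 0.05: 9.49, 0.01: 13.28},
    5: {0.9: 1.61, 0.5: 4.35, 0.1: 9.24, 0.05: 11.07, 0.01: 15.09},
}


def chi_squared(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def exceeds_critical(statistic, df, p=0.05):
    """True if the statistic is larger than the tabulated critical value."""
    return statistic > CRITICAL[df][p]


if __name__ == "__main__":
    # Hypothetical cross with an expected 3:1 phenotype ratio among 400 progeny.
    observed = [290, 110]
    expected = [300, 100]
    stat = chi_squared(observed, expected)
    df = len(observed) - 1  # two categories -> one degree of freedom
    print(f"chi-squared = {stat:.2f}, df = {df}")
    print("deviates at p = 0.05:", exceeds_critical(stat, df))
```

In this made-up example the statistic falls below the p = 0.05 critical value for one degree of freedom, so the observed counts would be treated as consistent with the expected ratio.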

A mutation is any change in the nucleotide sequence or arrangement of DNA; mutations are one of only four evolutionary agents. There are three categories of mutations: genome mutations, which arise from chromosome missegregation and change how many intact chromosomes are in a cell, leading to aneuploidy; chromosome mutations, which arise from chromosome rearrangement and restructure an individual chromosome; and gene mutations, which are base pair mutations affecting an individual gene. An inherited mutation is a mutation passed on from ancestors; a de novo mutation is a new and non-inherited mutation. Disease mutations interfere with protein synthesis or function at one of the steps in the list below.

  1. Transcription
  2. Translation
  3. 2° & 3° Interaction
  4. Protein Processing
  5. 4° Interaction
  6. Localization
  7. Cofactor Interaction
  8. Actual Activity

Somatic mutations are not passed on to progeny and only affect some cells of the body; a tumor is a common consequence of somatic mutation. Germline mutations affect germ cells (cells which differentiate into eggs or sperm) and are passed on to progeny. Many germline mutations occur in the fertilized egg (the zygote), so that both germ and somatic cells contain the mutation. Dynamic mutations worsen during gametogenesis, leading to a worsened phenotype in each generation. Examples of dynamic mutations include Huntington Disease and Fragile X Syndrome, which involve repeat sequences extending during gametogenesis.

Some mutations are more likely to be maternal and other mutations are more likely to be paternal. Primary oocytes develop fetally and ovulate years or decades later; the older an oocyte is, the more likely it is to undergo nondisjunction. This nondisjunction leads to trisomy (which can be viable) and monosomy (which is almost never viable). For this reason, aneuploidy is maternal in at least 80% of all cases and occurs more often with increasing maternal age. On the other hand, spermatogenesis occurs throughout a man’s entire life. With mutations accumulating during each round of replication, point mutations are almost always paternal and increase with paternal age.

Point Mutations

Point mutations are substitutions of one base pair for another. A transition is when one pyrimidine is swapped for another (such as C for T, or T for C) or one purine is swapped for another (such as A for G, or G for A). A transversion is when a pyrimidine and a purine swap. Transitions are more frequent, in part because methylated cytosine (5-methylcytosine) can spontaneously deaminate to thymine.
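The transition/transversion rule can be written as a short check. The following Python sketch is illustrative only; the function name `classify_substitution` is hypothetical.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref, alt):
    """Return 'transition' or 'transversion' for a single-base substitution."""
    ref, alt = ref.upper(), alt.upper()
    bases = PURINES | PYRIMIDINES
    if ref not in bases or alt not in bases:
        raise ValueError("bases must be one of A, C, G, T")
    if ref == alt:
        raise ValueError("reference and mutant base are identical")
    # Same chemical class (purine<->purine or pyrimidine<->pyrimidine)
    # is a transition; a change of class is a transversion.
    same_class = (ref in PURINES) == (alt in PURINES)
    return "transition" if same_class else "transversion"


if __name__ == "__main__":
    for ref, alt in [("C", "T"), ("A", "G"), ("A", "T"), ("G", "C")]:
        print(f"{ref}>{alt}: {classify_substitution(ref, alt)}")
```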

Missense Mutation A missense mutation is a single base-pair change that, within a coding region, usually replaces a single amino acid in the protein product. A comparable single-base change in the 5′ or 3′ untranslated region does not alter the amino acid sequence but can lead to underexpression of the protein product.
Nonsense Mutation A nonsense mutation (aka chain termination mutation) is the replacement of a codon encoding an amino acid with a stop codon. This stop codon ends translation prematurely, and the mutant mRNA is typically unstable. Most of these mRNAs are degraded; however, the few that are translated result in truncated proteins that are quickly degraded.
RNA Processing Mutation An RNA processing mutation interferes with splicing of mRNA. If the point mutation alters a splice donor or splice acceptor site, then RNA splicing is interfered with or abolished at that location. If the point mutation creates a new splice donor or splice acceptor site, then this new site competes with normal splicing and the processed mRNA might still contain introns.
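For point mutations inside a coding region, the distinction between missense and nonsense mutations can be sketched with the standard genetic code. The Python example below is a minimal illustration: the function name `classify_point_mutation` is hypothetical, codons are written as DNA (T instead of U), and the extra "silent" label (a substitution that leaves the amino acid unchanged) is an addition not discussed in the list above. RNA processing mutations are not modeled.

```python
from itertools import product

# Standard genetic code with bases ordered T, C, A, G; '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_point_mutation(codon, position, new_base):
    """Classify a single-base substitution within one sense codon.

    position is 0, 1 or 2 (index within the codon).
    """
    codon = codon.upper()
    new_base = new_base.upper()
    if position not in (0, 1, 2) or new_base == codon[position]:
        raise ValueError("not a valid single-base substitution")
    mutant = codon[:position] + new_base + codon[position + 1:]
    before, after = CODON_TABLE[codon], CODON_TABLE[mutant]
    if before == "*":
        raise ValueError("wild-type codon is already a stop codon")
    if after == before:
        return "silent"      # label added for completeness
    if after == "*":
        return "nonsense"    # sense codon replaced by a stop codon
    return "missense"        # one amino acid replaced by another


if __name__ == "__main__":
    print(classify_point_mutation("TTT", 1, "A"))  # Phe (TTT) -> Tyr (TAT)
    print(classify_point_mutation("TCA", 1, "A"))  # Ser (TCA) -> stop (TAA)
    print(classify_point_mutation("TTT", 2, "C"))  # Phe (TTT) -> Phe (TTC)
```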
Deletions, Insertions, Inversions & Translocations

There are deletions (removal of DNA), insertions (addition of DNA), inversions (reversal of the orientation of a DNA segment), translocations (moving of a DNA segment) and combinations thereof. Some deletions and insertions are small changes which are detectable only via PCR or genome sequencing; unless the number of nucleotides added or removed is a multiple of three, these shift the reading frame (a frameshift mutation) and usually lead to truncation of the protein by an early stop codon in the new reading frame. Some larger mutations can be detected via Southern blotting. For a mutation to be detectable by chromosome banding, it must involve at least 2 million base pairs.
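The frameshift idea can be illustrated with a few lines of Python. This is a sketch with hypothetical function names and a made-up sequence: a small insertion or deletion shifts the reading frame only when the net change in length is not a multiple of three, and the shifted frame can expose an early stop codon.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def shifts_reading_frame(inserted=0, deleted=0):
    """True if the net length change is not a multiple of three."""
    return (inserted - deleted) % 3 != 0


def codons(sequence):
    """Split a coding sequence into complete codons, dropping any remainder."""
    return [sequence[i:i + 3] for i in range(0, len(sequence) - 2, 3)]


if __name__ == "__main__":
    wild_type = "ATGGCATTAGGC"               # made-up sequence: ATG GCA TTA GGC
    mutant = wild_type[:3] + wild_type[4:]   # delete the single base at index 3

    print(codons(wild_type))
    print(codons(mutant))
    print("frameshift:", shifts_reading_frame(deleted=1))
    print("early stop:", any(c in STOP_CODONS for c in codons(mutant)))
```

In this toy sequence the one-base deletion turns the third codon into TAG, a stop codon that is absent from the original reading frame.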

Deletions, insertions, inversions and translocations often arise via faulty recombination. For example, unequal crossing over is crossing over without proper exchange of genetic information, leading to insertions in one chromosome and deletions in another. Another form of faulty recombination is when mispaired chromosomes or sister chromatids exchange genetic information.

Mutation Nomenclature
Mutation | Example | Overview
Missense | Phe15Tyr | A missense mutation is described by the wild-type amino acid, its residue number and the resulting mutant amino acid. The example shows a mutation where a phenylalanine is converted to tyrosine at residue 15 of a protein.
Nonsense | Ser25X | A nonsense mutation is described by the wild-type amino acid, its residue number and then an X to represent the mutant stop codon. The example shows a mutation where serine is replaced with a stop codon at residue 25 of a protein.
Nucleotide Change (genomic sequence known) | g.3000G>A; c.1000g>a | If the full genomic sequence is known, a nucleotide change is denoted by a prefix (g for genomic and c for cDNA), followed by the number of that nucleotide, the original nucleotide, a ‘>’ symbol, and finally the mutant nucleotide. Mutations identified in genomic DNA are denoted with capitalized nucleotides; mutations identified in non-genomic DNA are denoted with lower-case nucleotides. The first example shows a genomic mutation at position 3,000 where a G transitions to an A; the second example shows the same mutation at position 1,000 on cDNA.
Nucleotide Change (genomic sequence not known) | g.IVS25+2T>A; g.IVS25-2A>T | If the full genomic sequence is not known, then nucleotides are counted within the intron (IVS, intervening sequence) either: up from the 5′ splice donor site, with +1 being the invariant G of the GT at the 5′ splice donor site; or down from the 3′ splice acceptor site, with -1 being the invariant G of the AG at the 3′ splice acceptor site. The first example shows a transversion at the T of the 5′ splice donor site; the second example shows a transversion at the A of the 3′ splice acceptor site.
Deletions | c.1000_1003delGCAT | Small deletions begin with a prefix (g or c for genomic or cDNA), followed by the locations of the deleted nucleotides, then a del for deletion, and finally the nucleotides deleted. The example shows a four-nucleotide deletion of a G, C, A and T at respective locations 1000, 1001, 1002 and 1003.
Insertions | c.1000_1001insATGC | Small insertions begin with a prefix (g or c for genomic or cDNA), followed by the nucleotides flanking the insertion, then an ins for insertion, and finally the nucleotides inserted. The example shows an insertion of A, T, G and C between nucleotides 1000 and 1001.
Nomenclature table derived from Nussbaum, McInnes & Willard: Genetics in Medicine, 7th ed. Philadelphia, Saunders, 2007. (pg 181)
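To make the naming patterns concrete, the Python sketch below parses a few of the forms described in the table with regular expressions. It is a simplified illustration, not a full implementation of any nomenclature standard: the function name `parse_mutation` and the dictionary keys are hypothetical, and intron-relative (IVS) names are deliberately not handled.

```python
import re

# Protein-level change, e.g. Phe15Tyr or Ser25X (X = stop codon).
PROTEIN = re.compile(r"^([A-Z][a-z]{2})(\d+)([A-Z][a-z]{2}|X)$")
# Single-nucleotide change, e.g. g.3000G>A or c.1000g>a.
SUBSTITUTION = re.compile(r"^([gc])\.(\d+)([ACGTacgt])>([ACGTacgt])$")
# Small deletion or insertion, e.g. c.1000_1003delGCAT or c.1000_1001insATGC.
INDEL = re.compile(r"^([gc])\.(\d+)_(\d+)(del|ins)([ACGTacgt]+)$")

REFERENCE = {"g": "genomic", "c": "cDNA"}


def parse_mutation(name):
    """Parse a mutation name into a small dictionary (illustrative only)."""
    m = PROTEIN.match(name)
    if m:
        wild_type, residue, mutant = m.groups()
        kind = "nonsense" if mutant == "X" else "missense"
        return {"kind": kind, "wild_type": wild_type,
                "residue": int(residue), "mutant": mutant}
    m = SUBSTITUTION.match(name)
    if m:
        prefix, position, ref, alt = m.groups()
        return {"kind": "substitution", "reference": REFERENCE[prefix],
                "position": int(position), "from": ref.upper(), "to": alt.upper()}
    m = INDEL.match(name)
    if m:
        prefix, start, end, op, bases = m.groups()
        return {"kind": "deletion" if op == "del" else "insertion",
                "reference": REFERENCE[prefix],
                "start": int(start), "end": int(end), "bases": bases.upper()}
    raise ValueError(f"unrecognized mutation name: {name}")


if __name__ == "__main__":
    for name in ["Phe15Tyr", "Ser25X", "g.3000G>A", "c.1000g>a",
                 "c.1000_1003delGCAT", "c.1000_1001insATGC"]:
        print(name, "->", parse_mutation(name))
```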
Effect on Protein Function
Effect | Example | Overview
Loss of Function | α-thalassemia; β-thalassemia; Turner Syndrome; Retinoblastoma | Loss of function mutations lead to a lower gene dosage. Examples of loss of function diseases are the α-thalassemias (where the entire α-globin gene is deleted), β-thalassemias (premature stop codon or coding missense), Turner Syndrome (a monosomy where a chromosome is lost) and retinoblastoma (where a somatic mutation leads to loss of function of tumor-suppressor genes). Many loss-of-function diseases are less severe in heterozygotes; oftentimes, a single functional allele is enough for a healthy or mild-disease phenotype.
Gain of Function | Trisomy 21; Charcot-Marie-Tooth 1A; Hemoglobin Kempsey; Achondroplasia | Gain of function mutations lead to increased activity of a certain protein in tissues which normally express it (as opposed to heterochronic or ectopic expression). This increased activity can be due to a higher gene dosage, as in Down Syndrome (a third copy of part or all of Chromosome 21), Charcot-Marie-Tooth Type 1A (duplication of a particular gene) or even progression of certain cancers. Alternatively, one function of a protein might be detrimentally hyperactive. Examples include: hemoglobin Kempsey, a mutant hemoglobin with such high oxygen affinity that it does not release oxygen to tissues; and achondroplasia, where an over-strong signal (from exceptional binding of a growth factor receptor) leads to growth retardation.
Novel Property | Sickle Cell | Novel property mutations give encoded proteins new properties. For example, the hemoglobin chains of sickle cells aggregate into long fibers which deform the cell and impair its function. Some novel property mutations are also loss of function mutations; mutants with novel glycosylation sites are rendered inactive by glycosylation.
Heterochronic & Ectopic Expression | Some Cancers | Certain mutations cause disease simply through gene expression at the wrong time (heterochronic) or in the wrong tissue (ectopic). For example, constitutive expression of proliferation genes (known as oncogenes) can lead to tumor formation.
Term/Abbreviation Overview
Pedigree A graphical family tree using standardized symbols.
Germ Cell Constituting the germline, germ cells differentiate into sperm or eggs and pass genetic information to progeny.
Somatic Cell Any cell that is not a germ cell; genetic information within a somatic cell will not pass on to progeny.
Gene A section of DNA (on a chromosome) that codes for a product.
Locus The location of a gene on a chromosome.
Allele A version of a gene. For example, there is a hair color gene. An allele of that gene might encode brown hair.
Wild-Type The most common form(s) of an allele in the overall population.
Mutation Any change in the nucleotide sequence or arrangement of DNA.
Mutant Arising from mutation, a mutant (aka variant) allele is any other than the wild-type allele(s).
Polymorphism If there is more than one common allele for a gene — such as with hair color — then that gene is polymorphic.
Rare Variant If an allele is present in less than 1% of the population, it is a rare variant (as opposed to a polymorphic allele).
Haplotype A set of alleles at linked loci on a single chromosome that tend to be inherited together.
Centromere A centromere (abbreviated cen) is the constricted region of a chromosome where sister chromatids are joined.
Genotype An individual’s set of alleles.
Phenotype The phenotype (aka trait) is the manifestation of a genotype, ranging from intellectual disability to hair color. Genetic variation (allelic heterogeneity, locus heterogeneity and gene modifiers) muddles the genotype-phenotype correlation.
Qualitative Trait A trait that is either present or not, such as trisomy 21.
Quantitative Trait A trait that is measured, such as height, body mass index or intelligence.
Single-Gene Trait A trait that is mostly determined by the alleles at a single locus.
Polygenic Trait Aka multigenic, any trait (or disease) that is controlled by several genes.
Allelic Heterogeneity Allelic heterogeneity is different mutations in the same gene resulting in the same disease, even if of different severity. An example is cystic fibrosis. Other diseases, like sickle cell disease, show little or no allelic heterogeneity (since a single mutation causes it).
Locus Heterogeneity Mutations in different genes (loci) causing the same phenotype. An example is retinitis pigmentosa.
Phenotypic Heterogeneity Different mutations in the same gene can cause very different diseases. A deletion in the RET gene causes Hirschsprung Disease, characterized by severe constipation; other mutations in the RET gene result in thyroid and adrenal cancer; another set of mutations causes both Hirschsprung Disease and cancer.
Inbreeding Coefficient (F)
Homozygous Having the same allele on each homologous chromosome.
Heterozygous Having different alleles on each homologous chromosome. A compound heterozygote carries two different mutant alleles (and no normal allele) for a particular gene.
Hemizygous A gene with only one copy (and thus only one allele) normally present; for example, a gene on the Y chromosome is hemizygous.
Dominant Allele In a heterozygote, only the dominant allele is expressed.
Recessive In a heterozygote, the recessive allele is not expressed.
Multifactorial Aka complex, any disease caused by complex genetic (gene-gene or polygenic) and environmental (gene-environment) interactions, and not following a Mendelian pattern.
Autosomal Not pertaining to a sex chromosome (an autosome being any chromosome not a sex chromosome).
Penetrance The likelihood that a disease genotype leads to disease phenotype.
Expressivity Either constant or variable, expressivity describes disease severity among individuals with the same genotype.
Concordance When two family members share a disease, the two individuals are concordant for the disease.
Discordance When two family members do not share a disease, the two individuals are discordant for the disease.
Genocopy A concordant relative is a genocopy if they express the disease for different genetic reasons (a phenocopy, by contrast, has a non-genetic cause).
FISH Fluorescence in situ hybridization (abbreviated ish).
p Short arm of a chromosome
q Long arm of a chromosome
mar Marker chromosome
r Ring chromosome
i Isochromosome
der Derivative chromosome
dic Dicentric chromosome
cen Centromere
ter Telomere, or at the end
del Deletion
dup Duplication
fra Fragile site
ins Insertion
inv Inversion
t Translocation
rcp Reciprocal translocation
rob Robertsonian translocation
+ Gain or addition
- Loss or omission
: Breakage
:: Breakage and joining
/ Mosaic
arr cgh Array comparative genomic hybridization
Abbreviations derived from Nussbaum, McInnes & Willard: Genetics in Medicine, 7th ed. Philadelphia, Saunders, 2007. (pg 66)

The most important message of the Hardy–Weinberg equilibrium is that allele frequencies remain the same from generation to generation unless some agent acts to change them. With that in mind, the Hardy–Weinberg equilibrium allows scientists to determine whether evolutionary agents are operating and their identity (as evidenced by the pattern of deviation from the equilibrium). The equilibrium also shows the distribution of genotypes that would be expected for a population at genetic equilibrium.

No real-life population is ever exactly at Hardy–Weinberg equilibrium, but fortunately deviations are usually small enough that the population can be assumed to be at Hardy–Weinberg equilibrium. The five requirements for Hardy–Weinberg equilibrium are:

  1. Population size is very large.
  2. There is no migration between populations.
  3. There is no mutation.
  4. Mating is random.
  5. Natural selection does not affect the alleles under consideration.

If the conditions of the Hardy–Weinberg equilibrium are met, then the frequencies of alleles at a locus remain constant from generation to generation, and the genotype frequencies do not change after one generation of random mating. If p and q represent the frequencies of the dominant and recessive alleles at a locus (so that p + q = 1), then p² and q² are the frequencies of the homozygous genotypes and 2pq (or pq + qp) is the frequency of the heterozygous genotype. These are related by the Hardy–Weinberg equation:

p² + 2pq + q² = 1
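The equation can be checked numerically. The Python sketch below is a minimal illustration for a single locus with two alleles; the function names, the genotype labels "AA", "Aa" and "aa", and the example counts are hypothetical. It computes the expected genotype frequencies from p, estimates p from genotype counts, and shows that the allele frequency is unchanged after one round of random mating.

```python
def genotype_frequencies(p):
    """Expected genotype frequencies for allele frequencies p and q = 1 - p."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    q = 1.0 - p
    return {"AA": p * p, "Aa": 2 * p * q, "aa": q * q}


def allele_frequency_from_counts(n_AA, n_Aa, n_aa):
    """Estimate p by counting alleles: each AA carries two copies, each Aa one."""
    total_alleles = 2 * (n_AA + n_Aa + n_aa)
    return (2 * n_AA + n_Aa) / total_alleles


def next_generation_p(p):
    """Allele frequency among offspring of random mating: p^2 + (1/2)(2pq)."""
    freqs = genotype_frequencies(p)
    return freqs["AA"] + 0.5 * freqs["Aa"]


if __name__ == "__main__":
    # Hypothetical sample of 1,000 individuals.
    p = allele_frequency_from_counts(490, 420, 90)
    freqs = genotype_frequencies(p)
    print("p =", round(p, 3), "q =", round(1 - p, 3))
    print({genotype: round(f, 3) for genotype, f in freqs.items()})
    print("sum of genotype frequencies:", round(sum(freqs.values()), 3))
    print("p after one generation:", round(next_generation_p(p), 3))
```

Because p² + pq = p(p + q) = p, the last function returns the same allele frequency it was given, which is the constancy the equilibrium describes.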
