Wednesday, December 15, 2010

DNA

DNA is the main information carrier molecule in a cell. DNA may be single or double stranded. A single stranded DNA molecule, also called a polynucleotide, is a chain of small molecules called nucleotides. There are four different nucleotides, grouped into two types: the purines, adenine and guanine, and the pyrimidines, cytosine and thymine. They are usually referred to as bases (in fact, the base is the only element that distinguishes one nucleotide from another, see figure below) and are denoted by their initial letters A, C, G and T (not to be confused with the one-letter codes for amino acids!).
(picture taken from On-Line Biology Book)
Different nucleotides can be linked together in any order to form a polynucleotide, for instance, like this
     A-G-T-C-C-A-A-G-C-T-T

Polynucleotides can be of any length and can have any sequence. The two ends of this molecule are chemically different, i.e., the sequence has a directionality, like this
     A->G->T->C->C->A->A->G->C->T->T->

The ends of the polynucleotide are marked 5' and 3' (the names come from the numbering of the carbon atoms in the sugar ring); by convention DNA is usually written with the 5' end on the left and the 3' end on the right, with the coding strand on top. Two such strands are termed complementary if one can be obtained from the other by exchanging A with T and C with G, and reversing the direction of the molecule. For instance,
     <-T<-C<-A<-G<-G<-T<-T<-C<-G<-A<-A
is complementary to the polynucleotide given above.
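The complement rule above is easy to express in code. The following is a minimal sketch in Python; the name reverse_complement and the choice of a plain string of the letters A, C, G and T to represent a strand are assumptions made for illustration, not part of the original text.

```python
# Pairing rules from the text: A <-> T and C <-> G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, read in its own 5'->3' direction.

    Each base is swapped for its partner and the order is reversed,
    because the complementary strand runs in the opposite direction.
    """
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


strand = "AGTCCAAGCTT"
print(reverse_complement(strand))  # AAGCTTGGACT
```

Reading the complementary strand shown above in the direction of its arrows (right to left) gives the same sequence, AAGCTTGGACT. Applying the function twice returns the original strand.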
Specific pairs of nucleotides can form weak bonds between them. A binds to T, C binds to G (to be more precise, two hydrogen bonds can be formed between each A-T pair, and three hydrogen bonds between each C-G pair). Although such interactions are individually weak, when two longer complementary polynucleotide chains meet, they tend to stick together, like this

      5' C-G-A-T-T-G-C-A-A-C-G-A-T-G-C 3'
         | | | | | | | | | | | | | | | 
      3' G-C-T-A-A-C-G-T-T-G-C-T-A-C-G 5'

The vertical lines between the two strands represent the bonds between them (to be more accurate, we could draw triple lines between each C and G and double lines between each A and T, as in the figure below). The A-T and G-C pairs are called base pairs (bp). The length of a DNA molecule is usually measured in base pairs or nucleotides (nt), which in this context are the same thing.

 
(picture taken from  On-Line Biology Book )
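To make the bond counts concrete, here is a small Python sketch that adds up the hydrogen bonds in the 15 bp duplex drawn above, using two bonds per A-T pair and three per C-G pair. The function name hydrogen_bonds is hypothetical, and the model is deliberately simplified: any pair that is not A-T or C-G is counted as contributing no bonds.

```python
# Hydrogen bonds per base pair, as stated in the text.
BONDS = {("A", "T"): 2, ("T", "A"): 2, ("C", "G"): 3, ("G", "C"): 3}


def hydrogen_bonds(top: str, bottom: str) -> int:
    """Count hydrogen bonds between two aligned strands.

    `top` is written 5'->3' and `bottom` 3'->5', so position i of one
    strand sits directly opposite position i of the other.
    """
    if len(top) != len(bottom):
        raise ValueError("strands must have the same length")
    return sum(BONDS.get(pair, 0) for pair in zip(top, bottom))


top = "CGATTGCAACGATGC"
bottom = "GCTAACGTTGCTACG"
print(hydrogen_bonds(top, bottom))  # 38
```

The duplex has 8 C-G pairs and 7 A-T pairs, which gives 8 x 3 + 7 x 2 = 38 bonds.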

Two complementary polynucleotide chains form a stable structure, which resembles a helix and is known as the DNA double helix. One full turn of this helix spans about 10 bp and is about 3.4 nm long.

(picture taken from  On-Line Biology Book )
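The helix dimensions translate directly into a length estimate for any double stranded molecule. The sketch below assumes the approximate figures quoted here (10 bp and 3.4 nm per turn) and an idealised, perfectly straight helix; the function names are hypothetical.

```python
BP_PER_TURN = 10    # approximate, from the text
NM_PER_TURN = 3.4   # approximate, from the text


def helix_turns(n_bp: int) -> float:
    """Number of full helical turns in a duplex of n_bp base pairs."""
    return n_bp / BP_PER_TURN


def helix_length_nm(n_bp: int) -> float:
    """Length in nanometres of a straight duplex of n_bp base pairs."""
    return helix_turns(n_bp) * NM_PER_TURN


n = 15  # the example duplex above
print(f"{helix_turns(n):.1f} turns, {helix_length_nm(n):.1f} nm")
```

These figures imply a rise of about 0.34 nm per base pair, so the 15 bp example spans one and a half turns and is roughly 5 nm long.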

This structure was first worked out in 1953 in Cambridge by Watson and Crick (with the help of others), and the birthplace of the discovery is often said to be the Eagle pub on Bene't Street. They were later awarded the Nobel Prize for this discovery; for more, see Watson's book The Double Helix.

Watson and Crick with their model of the DNA molecule

It is remarkable that two complementary DNA polynucleotides form a stable double helix almost regardless of the sequence of the nucleotides. This makes the DNA molecule a perfect medium for information storage. Note that because the strands are complementary, each one fully determines the other, so for information purposes it is enough to give only one strand of the molecule. Thus, for many information-related purposes, the molecule used in the example above can be represented as CGATTGCAACGATGC. The maximal amount of information that can be encoded in such a molecule is therefore 2 bits times the length of the sequence. Noting that the distance between nucleotide pairs in DNA is about 0.34 nm, and that 1 cm is 10^7 nm, we can calculate that the linear information storage density in DNA is about 6x10^7 bits/cm, which is approximately 7 MB per cm, or roughly one hundredth of the capacity of a CD-ROM.
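The "2 bits per base" argument can be sketched in Python as follows. The particular bit assignment (A=00, C=01, G=10, T=11) and the function names are arbitrary choices made for illustration; any one-to-one mapping of the four bases onto two bits would do. The last function simply redoes the density arithmetic from the 0.34 nm spacing.

```python
# Four possible bases means log2(4) = 2 bits per position.
CODE = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}  # arbitrary assignment
BASES = "ACGT"  # index in this string equals the 2-bit code

RISE_NM = 0.34    # distance between neighbouring base pairs, from the text
NM_PER_CM = 1e7


def encode(seq: str) -> int:
    """Pack a sequence into an integer, two bits per base."""
    value = 0
    for base in seq:
        value = (value << 2) | CODE[base]
    return value


def decode(value: int, length: int) -> str:
    """Unpack `length` bases; the length is needed because leading A's are zeros."""
    bases = []
    for _ in range(length):
        bases.append(BASES[value & 0b11])
        value >>= 2
    return "".join(reversed(bases))


def bits_per_cm() -> float:
    """Linear storage density: base pairs per cm times 2 bits each."""
    return NM_PER_CM / RISE_NM * 2


seq = "CGATTGCAACGATGC"
packed = encode(seq)
print(2 * len(seq), "bits are enough for", decode(packed, len(seq)))
print(f"{bits_per_cm():.2e} bits/cm")
```

Note that decoding needs the sequence length as well as the packed value, since a run of A's at the start of the sequence is indistinguishable from leading zeros.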

Complementarity of the two strands in DNA is exploited for copying (multiplying) DNA molecules in a process known as DNA replication, in which one double stranded DNA is replicated into two identical ones. (The DNA double helix unwinds and forks during the process, and a new complementary strand is synthesised by specific molecular machinery on each branch of the fork. After the process is finished there are two DNA molecules identical to the original one.) In a cell this happens during cell division (see Section 1), and a copy identical to the original goes to each of the new cells.

Note that mismatches between polynucleotide strands are possible if the total sum of the weak forces between the complementary nucleotides is strong enough. So molecules like

 
C-G-A-T-T-G-C-C-A-C-G-A-T-G-C
| | | ~ | | | ~ | | | ~ | | |
G-C-T-T-A-C-G-T-T-G-C-A-A-C-G
are chemically possible, though they may be rare in a living cell. More bonds, i.e., more complementary pairs, make the molecule more stable. If there are not enough bonds, the double stranded structure may become weak and the strands may come apart. The number of bonds needed to keep the double helix together depends on the temperature and other environmental factors; the temperature at which the strands come apart is the so-called melting temperature. DNA which is no longer in the helical form is said to be denatured.
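As a small illustration, the following Python sketch rebuilds the mismatch diagram above from the two strands, marking complementary positions with | and mismatched ones with ~. The function name annotate is hypothetical, and the fraction of paired positions it reports is only a crude stand-in for stability, which in reality also depends on temperature and other conditions.

```python
# Complementary pairs from the text; everything else is a mismatch.
PAIRS = {("A", "T"), ("T", "A"), ("C", "G"), ("G", "C")}


def annotate(top: str, bottom: str) -> str:
    """Return one marker per position: '|' for a pair, '~' for a mismatch.

    `top` is written 5'->3' and `bottom` 3'->5', aligned position by position.
    """
    if len(top) != len(bottom):
        raise ValueError("strands must have the same length")
    return "".join("|" if pair in PAIRS else "~" for pair in zip(top, bottom))


top = "CGATTGCCACGATGC"
bottom = "GCTTACGTTGCAACG"
marks = annotate(top, bottom)

print("-".join(top))
print(" ".join(marks))
print("-".join(bottom))
print(f"{marks.count('|')} of {len(marks)} positions are paired")
```

For the example above this finds 12 paired positions and 3 mismatches, at positions 4, 8 and 12.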