Design a class to represent DNA strings. A DNA
string is a sequence of the characters { A, C, G, T }
of arbitrary length. At this point,
just implement a bare-bones class with constructors, a destructor,
and other useful methods. Strip out of the class any methods that have
no useful purpose in this object. You will still need to
keep the insertion and deletion methods, though. Since DNA
strings can be very long, pay close attention to the efficiency of computation
here. There are 64 possible groupings of
three letters at a time from this set of four letters (4 x 4 x 4 = 64).
The genetic code assigns an amino acid to each such triplet.
Since there are only 20 amino acids, the code is redundant: more than
one triplet can be assigned to a given amino acid. The DNA string
can be read from any given starting point, and each triplet encountered
can be mapped to its corresponding amino acid. We can continue this
read until we are told to stop. The substring of the DNA string
between the start and stop of the read may be called a gene, and
the string of amino acids associated with this substring is called
a protein.
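
The count of 64 triplets can be made concrete by giving each base a
2-bit code and combining three codes into a single number between 0 and 63.
The sketch below is only an illustration: the function names base_code and
codon_index are hypothetical, and the base ordering T, C, A, G is an
arbitrary choice made for this example, not something the assignment requires.

```cpp
#include <cassert>

// Map a base to a 2-bit code. The ordering T, C, A, G is an assumption
// of this sketch; any fixed ordering works. Returns -1 for other characters.
int base_code(char b) {
    switch (b) {
        case 'T': return 0;
        case 'C': return 1;
        case 'A': return 2;
        case 'G': return 3;
        default:  return -1;
    }
}

// Combine three bases into an index in the range [0, 63].
// Returns -1 if any of the three characters is not a valid base.
int codon_index(char b1, char b2, char b3) {
    int c1 = base_code(b1);
    int c2 = base_code(b2);
    int c3 = base_code(b3);
    if (c1 < 0 || c2 < 0 || c3 < 0) return -1;
    return 16 * c1 + 4 * c2 + c3;  // base-4 number with digits c1 c2 c3
}
```

With this ordering, codon_index('T', 'T', 'T') is 0, codon_index('A', 'T', 'G')
is 2*16 + 0*4 + 3 = 35, and codon_index('G', 'G', 'G') is 63. An index like
this lets the triplet code be stored as a 64-entry array, so each lookup
takes constant time.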
Write such a read method for our DNA class. It needs to
be told where to start and where to stop. It also needs to
assign the correct protein to our gene.
We can do more with this later, but for now this will do.
Note that insertions and deletions correspond to genetic damage
and/or mutations.
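
One possible shape for such a class is sketched below. It is a minimal
illustration under several assumptions, not a reference solution: the class
name DNA and the members insert, erase and read are hypothetical, the
sequence is held in a std::string (so the compiler-generated destructor
suffices), proteins are written with one-letter amino acid codes, and the
64-character table is the standard genetic code in T, C, A, G order with
'*' marking a stop triplet. The table given in this assignment takes
precedence if it differs.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical minimal DNA class; names and interface are illustrative only.
class DNA {
public:
    DNA() = default;
    explicit DNA(const std::string& bases) : seq_(bases) { validate(seq_); }
    // No user-written destructor: std::string releases its own storage.

    std::size_t length() const { return seq_.size(); }
    const std::string& str() const { return seq_; }

    // Insert `bases` before 0-based position `pos`.
    void insert(std::size_t pos, const std::string& bases) {
        validate(bases);
        if (pos > seq_.size()) throw std::out_of_range("insert position");
        seq_.insert(pos, bases);
    }

    // Delete up to `count` bases starting at 0-based position `pos`.
    void erase(std::size_t pos, std::size_t count) {
        if (pos > seq_.size()) throw std::out_of_range("erase position");
        seq_.erase(pos, count);
    }

    // Read triplets from `start` up to (not including) `stop` and return
    // the protein. A trailing partial triplet is ignored.
    std::string read(std::size_t start, std::size_t stop) const {
        if (start > stop || stop > seq_.size())
            throw std::out_of_range("read range");
        std::string protein;
        protein.reserve((stop - start) / 3);
        for (std::size_t i = start; i + 3 <= stop; i += 3)
            protein += translate(seq_[i], seq_[i + 1], seq_[i + 2]);
        return protein;
    }

private:
    std::string seq_;

    // 2-bit code per base in T, C, A, G order (an assumption of this sketch).
    static int code(char b) {
        switch (b) {
            case 'T': return 0;
            case 'C': return 1;
            case 'A': return 2;
            case 'G': return 3;
            default:  return -1;
        }
    }

    static void validate(const std::string& s) {
        for (char c : s)
            if (code(c) < 0) throw std::invalid_argument("not a DNA base");
    }

    // Standard genetic code, one letter per amino acid, '*' for stop.
    static char translate(char a, char b, char c) {
        static const char table[] =
            "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG";
        return table[16 * code(a) + 4 * code(b) + code(c)];
    }
};
```

For example, a DNA object built from "ATGGCCTAA" and read from 0 to 9
gives "MA*". After insert(3, "T") the string is "ATGTGCCTAA", and reading
from 0 to 10 gives "MCL": a single inserted base shifts every later triplet,
which is why insertions and deletions can change the protein so drastically.
Note that std::string::insert and erase take time proportional to the
string length, so a different container may be worth considering for very
long strings that are edited often.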
Use the following table to construct the triplet code: