Reputation: 1111
I'm currently learning about multidimensional arrays and was given the task of analyzing strands of RNA sequences (given from a .txt file). Here is an example of a strand:
AUGCUUAUUAACUGAAAACAUAUGGGUAGUCGAUGA
Given this string, I am to figure out what protein this RNA strand would create. To do so, I need to break each strand down into codons (groups of 3 bases). So for this example, I need to look at AUG CUU AUU AAC UGA, etc. Each of these codons represents an amino acid: AUG is methionine (represented by 'M'), CUU is leucine (represented by 'L'), and so on. My output should therefore be a new string of amino acids (M-L-I...).
What would be the best way to approach this problem? From my understanding, I'm to create a 3-D array, let's say
int aminoAcid[4][4][4]
since there are 4 possible choices for each base (A, U, G, C). I'm not entirely sure where to go from here, though, since certain combinations will give the same amino acid.
EDIT: Am I going in the right direction if I were to first convert the string into number representations (A=0, U=1, G=2, C=3)? From there I can work better with a 3D array, right?
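A minimal sketch of that conversion step, assuming C++ and the A=0, U=1, G=2, C=3 numbering from the EDIT. The names `baseIndex` and `toCodons` are made up for illustration and are not part of the assignment:

```cpp
#include <array>
#include <cstddef>
#include <string>
#include <vector>

// Map a base to the index proposed above: A=0, U=1, G=2, C=3.
// Returns -1 for any other character.
int baseIndex(char base) {
    switch (base) {
        case 'A': return 0;
        case 'U': return 1;
        case 'G': return 2;
        case 'C': return 3;
        default:  return -1;
    }
}

// Split a strand into codons, each stored as three base indices.
// Trailing bases that do not fill a complete codon are ignored.
std::vector<std::array<int, 3>> toCodons(const std::string& strand) {
    std::vector<std::array<int, 3>> codons;
    for (std::size_t i = 0; i + 2 < strand.size(); i += 3) {
        std::array<int, 3> codon = {baseIndex(strand[i]),
                                    baseIndex(strand[i + 1]),
                                    baseIndex(strand[i + 2])};
        codons.push_back(codon);
    }
    return codons;
}
```

Each resulting triple can then be used directly as the three indices into a `[4][4][4]` array, provided none of them is -1.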
Upvotes: 0
Views: 250
Reputation: 3891
You can use the 3D array to map each three-base sequence to its amino acid. You should learn about enum and figure out how to use enum constants as your array indices (rather than character literals such as 'A', which would index far outside a 4x4x4 array) so that you can write something like
aminoAcid[A][U][G] = 24
where 24 also corresponds to methionine, meaning you can use another enum there. Use enums whenever you have a limited, known group of items that you want to represent with numbers.
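Here is a rough sketch of how the two enums and the table could fit together, assuming C++. The enum names, `kLetter`, `fillTable`, `toBase` and `translate` are all hypothetical, and the table is deliberately partial: only a handful of codons are filled in, and everything else reads as `Unknown`.

```cpp
#include <cstddef>
#include <string>

// Bases double as array indices (the numbering itself is arbitrary).
enum Base { A, U, G, C, BaseCount };

// Partial list for illustration; a full table needs all 20 amino acids.
enum AminoAcid { Unknown, Met, Leu, Ile, Asn, Stop };

// One-letter codes indexed by AminoAcid: '?' = not filled in, '*' = stop.
const char kLetter[] = {'?', 'M', 'L', 'I', 'N', '*'};

// Zero-initialized, so every codon starts out as Unknown.
AminoAcid aminoAcid[BaseCount][BaseCount][BaseCount] = {};

void fillTable() {
    aminoAcid[A][U][G] = Met;
    aminoAcid[C][U][U] = Leu;
    aminoAcid[C][U][C] = Leu;  // different codons can store the same value
    aminoAcid[A][U][U] = Ile;
    aminoAcid[A][A][C] = Asn;
    aminoAcid[U][G][A] = Stop;
}

// Convert a character to a Base; returns false for anything else.
bool toBase(char c, Base& out) {
    switch (c) {
        case 'A': out = A; return true;
        case 'U': out = U; return true;
        case 'G': out = G; return true;
        case 'C': out = C; return true;
        default:  return false;
    }
}

// Translate every complete codon and join the letters with '-'.
// This sketch does not halt at a stop codon; it just emits '*'.
std::string translate(const std::string& strand) {
    std::string protein;
    for (std::size_t i = 0; i + 2 < strand.size(); i += 3) {
        Base b[3] = {A, A, A};
        bool ok = true;
        for (int k = 0; k < 3; ++k) {
            ok = toBase(strand[i + k], b[k]) && ok;
        }
        AminoAcid aa = ok ? aminoAcid[b[0]][b[1]][b[2]] : Unknown;
        if (!protein.empty()) protein += '-';
        protein += kLetter[aa];
    }
    return protein;
}
```

After calling `fillTable()`, `translate("AUGCUUAUUAACUGA")` gives `M-L-I-N-*`. Codons that produce the same amino acid are not a problem for this layout: several cells of the array simply hold the same enum value, as with CUU and CUC above.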
It sounds like this is just the beginning of a larger project, so you should follow good practices from the start, thinking about how you can build components that represent your problem.
Upvotes: 1