I'm working on a fun problem: finding a more efficient way to store the genome of the human malaria parasite. I thought it would be useful to get some of your insights!
So here's the background info: suppose we're using a fixed 2 bits per nucleotide to store the 4 nucleotides of the genome (A, C, T, G). Because the genome is still SUPER long, it takes up a ton of space. However, we know that 80% of the genome is either A or T. How can we use this knowledge to our advantage to store the genome more efficiently?
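For a rough sense of how much room there is, here is a minimal Python sketch. It assumes A and T each make up 40% of the genome and C and G 10% each (the post only gives the combined 80%, so the even split is an assumption), and that bases are independent of each other. The `CODE` table is a hypothetical prefix-free code for illustration, not a recommended final scheme.

```python
import math

# Assumed base frequencies: only "A + T = 80%" comes from the problem;
# the even split within each pair is an illustrative assumption.
FREQS = {"A": 0.4, "T": 0.4, "C": 0.1, "G": 0.1}

# Hypothetical prefix-free code: shorter codewords for the common bases.
CODE = {"A": "0", "T": "10", "C": "110", "G": "111"}


def entropy_bits(freqs):
    """Shannon entropy in bits per base for independent bases."""
    return -sum(p * math.log2(p) for p in freqs.values() if p > 0)


def expected_code_length(freqs, code):
    """Average number of bits per base under the given code."""
    return sum(freqs[base] * len(code[base]) for base in freqs)


def encode(seq, code=CODE):
    """Concatenate the codeword of each base into a bit string."""
    return "".join(code[base] for base in seq)


def decode(bits, code=CODE):
    """Read bits left to right, emitting a base whenever a codeword completes."""
    inverse = {word: base for base, word in code.items()}
    bases, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:
            bases.append(inverse[current])
            current = ""
    if current:
        raise ValueError("trailing bits do not form a complete codeword")
    return "".join(bases)


if __name__ == "__main__":
    print(f"entropy:       {entropy_bits(FREQS):.3f} bits/base")
    print(f"variable code: {expected_code_length(FREQS, CODE):.3f} bits/base")
    print("fixed code:    2.000 bits/base")
```

Under these assumptions the variable-length code averages about 1.8 bits per base versus 2 for the fixed encoding, and the entropy (roughly 1.72 bits per base) is the floor for any scheme that treats bases independently. Getting closer to that floor would take something like arithmetic coding or coding several bases at a time.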
Right now I'm playing around with a couple ideas:
Anyone else have any good ideas for making this data storage as efficient as possible? I'd love to hear 'em and discuss!