Reputation: 2891
I am new to programming and F# is my first .NET language.
I am attempting this problem on Rosalind.info. Basically, given a DNA string, I am supposed to return four integers counting the respective number of times that the symbols 'A', 'C', 'G', and 'T' occur in the string.
Here is the code I have written so far:
open System.IO
open System
type DNANucleobases = {A: int; C: int; G: int; T: int}
let initialLetterCount = {A = 0; C = 0; G = 0; T = 0}
let countEachNucleobase (accumulator: DNANucleobases)(dnaString: string) =
    let dnaCharArray = dnaString.ToCharArray()
    dnaCharArray
    |> Array.map (fun eachLetter ->
        match eachLetter with
        | 'A' -> {accumulator with A = accumulator.A + 1}
        | 'C' -> {accumulator with C = accumulator.C + 1}
        | 'G' -> {accumulator with G = accumulator.G + 1}
        | 'T' -> {accumulator with T = accumulator.T + 1}
        | _ -> accumulator)
let readDataset (filePath: string) =
    let datasetArray = File.ReadAllLines filePath
    String.Join("", datasetArray)
let dataset = readDataset @"C:\Users\Unnamed\Desktop\Documents\Throwaway Documents\rosalind_dna.txt"
Seq.fold countEachNucleobase initialLetterCount dataset
However, I received the following error message:
CountingDNANucleotides.fsx(23,10): error FS0001: Type mismatch. Expecting a DNANucleobases -> string -> DNANucleobases but given a DNANucleobases -> string -> DNANucleobases [] The type 'DNANucleobases' does not match the type 'DNANucleobases []'
What went wrong? What changes should I make to correct my mistake?
Upvotes: 1
Views: 141
Reputation: 6223
countEachNucleobase returns an array of the accumulator type instead of a single accumulator like the one it received as its first parameter. Therefore, Seq.fold cannot resolve its 'State type parameter: the function takes the record as input but produces an array of records as output. The function used for folding must have the accumulator type as both its first input and its output.
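To make the shape of a folding function concrete, here is a minimal sketch that is separate from the question's code. The names countAs and numberOfAs are made up for illustration; the point is only that the state goes in as an int and comes out as an int at every step.

```fsharp
// Seq.fold : ('State -> 'T -> 'State) -> 'State -> seq<'T> -> 'State
// The folder receives the current state and one element,
// and must return the next state with the same type.
// countAs is a hypothetical helper used only for this illustration.
let countAs (count: int) (letter: char) : int =
    if letter = 'A' then count + 1 else count

// The state starts at 0 and stays an int for the whole traversal.
let numberOfAs = Seq.fold countAs 0 "GATTACA"

printfn "%d" numberOfAs // prints 3
```

If countAs returned an int array instead, the compiler would report the same kind of mismatch as in the question.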
In place of Array.map in the question's code, you could already use Array.fold:
let countEachNucleobase (accumulator: DNANucleobases) (dnaString: string) =
    let dnaCharArray = dnaString.ToCharArray()
    dnaCharArray
    |> Array.fold (fun (accumulator : DNANucleobases) eachLetter ->
        match eachLetter with
        | 'A' -> {accumulator with A = accumulator.A + 1}
        | 'C' -> {accumulator with C = accumulator.C + 1}
        | 'G' -> {accumulator with G = accumulator.G + 1}
        | 'T' -> {accumulator with T = accumulator.T + 1}
        | _ -> accumulator) accumulator
And then, the call in the last line becomes:
countEachNucleobase initialLetterCount dataset
Shorter version
let readChar accumulator = function
    | 'A' -> {accumulator with A = accumulator.A + 1}
    | 'C' -> {accumulator with C = accumulator.C + 1}
    | 'G' -> {accumulator with G = accumulator.G + 1}
    | 'T' -> {accumulator with T = accumulator.T + 1}
    | _ -> accumulator
let countEachNucleobase acc input = Seq.fold readChar acc input
Since strings are char sequences, input will take strings as well as char arrays or other char sequences.
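As a quick check of the shorter version, the sketch below runs it on a string and on a char array. The sample input "AACGTT", the names fromString and fromArray, and the explicit type annotation on accumulator are my additions for illustration; the rest follows the code above.

```fsharp
type DNANucleobases = {A: int; C: int; G: int; T: int}

let initialLetterCount = {A = 0; C = 0; G = 0; T = 0}

// Folder: takes the current counts and one character, returns the updated counts.
// The annotation on accumulator is added here only to make the type explicit.
let readChar (accumulator: DNANucleobases) = function
    | 'A' -> {accumulator with A = accumulator.A + 1}
    | 'C' -> {accumulator with C = accumulator.C + 1}
    | 'G' -> {accumulator with G = accumulator.G + 1}
    | 'T' -> {accumulator with T = accumulator.T + 1}
    | _ -> accumulator

let countEachNucleobase acc input = Seq.fold readChar acc input

// Made-up sample input; a string and a char array are both accepted.
let fromString = countEachNucleobase initialLetterCount "AACGTT"
let fromArray = countEachNucleobase initialLetterCount [| 'A'; 'A'; 'C'; 'G'; 'T'; 'T' |]

// Four space-separated counts in the order A, C, G, T.
printfn "%d %d %d %d" fromString.A fromString.C fromString.G fromString.T // prints 2 1 1 2
```

Both calls yield the same record, since records compare structurally.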
Upvotes: 3