alex_z

Reputation: 436

How to extract only words from a text file in F#?

I'm trying to extract only the words from a very simple text file:

Please note that you still have an unclaimed iPhone 7. 

We have repeatedly written to you regarding your delivery details. We do not understand why you have not yet confirmed your shipping information so we can send it to your home address. 

Your special price for the brand new  iPhone 7 phone is only £3 with shipping. 

We hope that you'll confirm your information this time. 

I have been using this function, but it seems to throw an exception ("No overloads match for method Split"):

let wordSplit (text:string) = 
  text.Split([|' ','\n','\t',',','.','/','\\','|',':',';'|])
  |> Array.toList

Upvotes: 1

Views: 156

Answers (1)

rmunn

Reputation: 36708

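In F#, items in arrays or lists are separated by the ; (semicolon) character, not the , (comma). Commas build tuples, so your code is creating an array that contains one 10-item tuple, and no overload of String.Split accepts an array of tuples. That is the error you are seeing. You should write the following if you want an array of ten items: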

let wordSplit (text:string) = 
  text.Split([|' ';'\n';'\t';',';'.';'/';'\\';'|';':';';'|])
  |> Array.toList
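
To see the difference between the two separators for yourself, here is a minimal sketch you can paste into F# Interactive. The names withCommas and withSemicolons are made up for illustration and are not part of the original code:

```fsharp
// Commas build a tuple, so this is a (char * char * char)[] holding ONE element.
let withCommas = [|' ', '\n', '\t'|]

// Semicolons separate array items, so this is a char[] holding THREE elements.
let withSemicolons = [|' '; '\n'; '\t'|]

printfn "%d %d" withCommas.Length withSemicolons.Length  // prints "1 3"
```

Only the second value has the type char[] that String.Split expects.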

If you also don't want empty strings back as part of the split operation, use the overload of String.Split that takes a StringSplitOptions parameter:

let wordSplit (text:string) = 
  text.Split([|' ';'\n';'\t';',';'.';'/';'\\';'|';':';';'|], StringSplitOptions.RemoveEmptyEntries)
  |> Array.toList

Note that StringSplitOptions is in the System namespace, so if you don't have an open System line at the top of your file, you'll need to add one.
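
Putting it together, a complete file might look like the sketch below. It runs the function on one sentence from your sample text; the sample and words bindings are only there for illustration:

```fsharp
open System

// Split on the separator characters and drop the empty strings that
// adjacent separators (such as ". ") would otherwise produce.
let wordSplit (text: string) =
    text.Split([|' '; '\n'; '\t'; ','; '.'; '/'; '\\'; '|'; ':'; ';'|], StringSplitOptions.RemoveEmptyEntries)
    |> Array.toList

let sample = "We hope that you'll confirm your information this time. "
let words = wordSplit sample

printfn "%A" words
// ["We"; "hope"; "that"; "you'll"; "confirm"; "your"; "information"; "this"; "time"]
```

Anything that is not in the separator array stays inside the tokens, so "you'll" keeps its apostrophe, and tokens such as "7" or "£3" from the rest of your text would still come back. If you want strictly alphabetic words, you would need to filter the result further.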

Upvotes: 5
