Reputation: 114
I'm trying to build an application that reads lines of CSV text from the network and inserts them into a SQLite database. I need to extract all strings that appear between commas, including empty strings. For example, a line of text that I need to parse looks like:
"1/17/09 1:23,\"Soap, Shampoo and cleaner\",,1200,Amex,Steven O' Campbell,,Kuwait,1/16/09 14:26,1/18/09 9:08,29.2891667,,48.05"
My code snippet is below. I figured I need to use a regex, since I'm trying to split the line at each "," character, but a comma may also appear as part of a quoted string.
package main

import (
	"fmt"
	"regexp"
	"strings"
)

func main() {
	re := regexp.MustCompile(`^|[^,"']+|"([^"]*)"|'([^']*)`)
	txt := "1/17/09 1:23,\"Soap, Shampoo and cleaner\",,1200,Amex,Steven O' Campbell,,Kuwait,1/16/09 14:26,1/18/09 9:08,29.2891667,,48.05"
	arr := re.FindAllString(txt, -1)
	arr2 := strings.Split(txt, ",")
	fmt.Println("Array lengths: ", len(arr), len(arr2))
}
The correct length of the split array in this case should be 13.
Upvotes: 0
Views: 432
Reputation: 3067
As Marc and Flimzy said, regex isn't the right tool here. Since you don't state that regex is a requirement, here's a snippet that parses your string with the standard library's encoding/csv package and produces the result you're looking for:
package main

import (
	"bytes"
	"encoding/csv"
	"fmt"
)

func main() {
	// A raw string literal avoids escaping the embedded double quotes.
	testdata := `1/17/09 1:23,"Soap, Shampoo and cleaner",,1200,Amex,Steven O' Campbell,,Kuwait,1/16/09 14:26,1/18/09 9:08,29.2891667,,48.05`
	reader := csv.NewReader(bytes.NewBufferString(testdata))
	// Read parses one record; the quoted field keeps its comma and empty fields are preserved.
	content, err := reader.Read()
	if err != nil {
		panic(err)
	}
	fmt.Println(len(content)) // 13
}
Upvotes: 1