Manik Taneja

Reputation: 114

Regex to match empty string or pattern

I'm trying to build an application that reads lines of CSV text from the network and inserts them into a SQLite database. I need to extract all strings that appear between commas, including empty strings. For example, a line of text that I need to parse looks like:

"1/17/09 1:23,\"Soap, Shampoo and cleaner\",,1200,Amex,Steven O' Campbell,,Kuwait,1/16/09 14:26,1/18/09 9:08,29.2891667,,48.05"

My code snippet is below. I figured I need to use a regex, since I'm trying to split the line at the "," character, but a comma may also appear as part of a quoted string.

package main

import (
    "fmt"
    "regexp"
    "strings"
)

func main() {
    re := regexp.MustCompile(`^|[^,"']+|"([^"]*)"|'([^']*)`)
    txt := "1/17/09 1:23,\"Soap, Shampoo and cleaner\",,1200,Amex,Steven O' Campbell,,Kuwait,1/16/09 14:26,1/18/09 9:08,29.2891667,,48.05"

    arr := re.FindAllString(txt, -1)
    arr2 := strings.Split(txt, ",")
    fmt.Println("Array lengths: ", len(arr), len(arr2))
}

The correct length of the split array in this case should be 13.

Upvotes: 0

Views: 432

Answers (1)

Gustavo Kawamoto

Reputation: 3067

As Marc and Flimzy said, regex isn't the right tool here. Since you don't require a regex-based solution, here's a snippet that uses the standard library's encoding/csv package to extract the fields from your string and produce the result you're looking for:

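package main

import (
    "bytes"
    "encoding/csv"
    "fmt"
)

func main() {
    // Raw string literal, so the embedded double quotes need no escaping.
    testdata := `1/17/09 1:23,"Soap, Shampoo and cleaner",,1200,Amex,Steven O' Campbell,,Kuwait,1/16/09 14:26,1/18/09 9:08,29.2891667,,48.05`
    // csv.Reader understands quoted fields, so the comma inside
    // "Soap, Shampoo and cleaner" does not split that field.
    reader := csv.NewReader(bytes.NewBufferString(testdata))
    // Read parses a single record (one line of CSV).
    content, err := reader.Read()
    if err != nil {
        panic(err)
    }
    fmt.Println(len(content)) // 13
}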

Upvotes: 1
