ediblecode

Reputation: 11971

Adapting csv reader to read unicode characters

I'm having a problem with characters in a csv file coming through as the black diamond with a ? in the middle.

I have written the code to parse the csv, but I don't get why the string isn't reading the unicode characters properly. It's probably something to do with my implementation:

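// Requires: using System.Collections.Generic; using System.IO;
var parsedFile = new List<string[]>();
bool skippedFirst = false;  // the first line is a header row and is skipped
string line;
StreamReader readFile = new StreamReader(path);

try {
  while ((line = readFile.ReadLine()) != null) {
    string[] row = { "", "", "" };
    int currentItem = 0;
    bool inQuotes = false;
    if (skippedFirst) {
      // Stop once all three columns are filled so extra fields cannot overflow row
      for (int i = 0; i < line.Length && currentItem < row.Length; i++) {
        if (!inQuotes) {
          if (line[i] == '\"')
            inQuotes = true;
          else {
            if (line[i] == ',')
              currentItem++;
            else
              row[currentItem] += line[i];
          }
        } else {
          if (line[i] == '\"')
            inQuotes = false;
          else
            row[currentItem] += line[i];
        }
      }
      parsedFile.Add(row);
    }
    skippedFirst = true;
  }
} finally {
  readFile.Close();
}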

Upvotes: 3

Views: 8039

Answers (1)

mfussenegger

Reputation: 3971

Specify the encoding when opening the file; it should match the encoding the file was actually saved in.

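// Requires: using System.IO; using System.Text;
using (var sr = new StreamReader(@"c:\Temp\csvfile.csv", Encoding.UTF8)) {
  // read and parse the lines here, e.g. with sr.ReadLine()
}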
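
To see why the encoding matters, here is a minimal sketch that decodes the same bytes twice. It assumes, purely for illustration, that the CSV was saved as ISO-8859-1 rather than UTF-8. The `EncodingDemo` class, the `Decode` helper and the sample data are made up for this example, and a `MemoryStream` stands in for the real file.

```csharp
using System;
using System.IO;
using System.Text;

public static class EncodingDemo
{
    // Decodes raw file bytes through a StreamReader, the same way a file would be read.
    public static string Decode(byte[] fileBytes, Encoding encoding)
    {
        using (var reader = new StreamReader(new MemoryStream(fileBytes), encoding))
        {
            return reader.ReadToEnd();
        }
    }

    public static void Main()
    {
        // Assumption: the file on disk is ISO-8859-1, so 'é' and 'ï' are single bytes.
        Encoding latin1 = Encoding.GetEncoding("ISO-8859-1");
        byte[] fileBytes = latin1.GetBytes("caf\u00E9,na\u00EFve,1");

        // Those single bytes are not valid UTF-8, so the decoder substitutes U+FFFD.
        string wrong = Decode(fileBytes, Encoding.UTF8);
        // Decoding with the encoding the bytes were written in restores the text.
        string right = Decode(fileBytes, latin1);

        Console.WriteLine(wrong.IndexOf('\uFFFD') >= 0);
        Console.WriteLine(right == "caf\u00E9,na\u00EFve,1");
    }
}
```

If the diamonds are still there after passing `Encoding.UTF8`, the file is probably not UTF-8, and it may be worth trying the encoding it was exported with instead.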

You might also want to look into FileHelpers for CSV parsing:

https://www.filehelpers.net/quickstart/

Upvotes: 11
