Reputation: 11
I am importing a CSV file, and while reading it, special characters such as '�' appear in the strings that are read. How can I avoid these Unicode characters?
I am using TextFieldParser to parse the data, but while reading, the space between two words in a sentence is replaced with the character '�'. I tried a Contains search on the string followed by a replace of that character, but the special character might be something different later.
Encoding DefaultEncoding = Encoding.UTF8;

public IList<string[]> ReadCsvData()
{
    using (var reader = ReadBase64File())
    {
        return CsvParser.ReadCsvData(reader);
    }
}

TextReader ReadBase64File()
{
    var bytes = Convert.FromBase64String(base64File);
    return new StreamReader(new MemoryStream(bytes), DefaultEncoding, true);
}

public static IList<string[]> ReadCsvData(TextReader reader)
{
    IList<string[]> csvData = new List<string[]>();
    using (Microsoft.VisualBasic.FileIO.TextFieldParser parser = new Microsoft.VisualBasic.FileIO.TextFieldParser(reader))
    {
        parser.SetDelimiters(",");
        parser.TrimWhiteSpace = true;
        try
        {
            while (!parser.EndOfData)
            {
                csvData.Add(parser.ReadFields());
            }
        }
        catch (Microsoft.VisualBasic.FileIO.MalformedLineException ex)
        {
            throw new FormatException($"Invalid format found when importing the CSV data (line {parser.ErrorLineNumber}).", ex);
        }
    }
    return csvData;
}
Upvotes: 1
Views: 2420
Reputation: 131
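Just use Encoding.GetEncoding(1252) instead of Encoding.UTF8 when decoding the file. The '�' is what a UTF-8 decoder emits for bytes that are not valid UTF-8, which is probably what happens here if the CSV was saved as Windows-1252 (there, a non-breaking space is the single byte 0xA0). TextFieldParser accepts an Encoding only in its Stream and file-path constructors, not in the TextReader one used in the question, so set the encoding on the StreamReader, i.e. replace:
new StreamReader(new MemoryStream(bytes), DefaultEncoding, true)
with:
new StreamReader(new MemoryStream(bytes), Encoding.GetEncoding(1252), true)
On .NET Core / .NET 5+, call Encoding.RegisterProvider(CodePagesEncodingProvider.Instance) once beforehand, otherwise code page 1252 is not available.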
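To see why the choice of encoding matters, here is a minimal sketch. It assumes the file is Windows-1252 and that the "space" is really a non-breaking space (byte 0xA0); neither is confirmed by the question. The class EncodingDemo, the helper Decode, and the three sample bytes are made up for illustration.

```csharp
using System;
using System.IO;
using System.Text;

public static class EncodingDemo
{
    // Decodes the bytes through a StreamReader, the same way ReadBase64File does.
    public static string Decode(byte[] bytes, Encoding encoding)
    {
        using (var reader = new StreamReader(new MemoryStream(bytes), encoding, true))
        {
            return reader.ReadToEnd();
        }
    }

    public static void Main()
    {
        // Required on .NET Core / .NET 5+ so that code page 1252 can be resolved.
        // On .NET Framework the code page is built in and this line can be dropped.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        // Hypothetical sample: 'a', 0xA0 (non-breaking space in Windows-1252), 'b'.
        byte[] bytes = { 0x61, 0xA0, 0x62 };

        string asUtf8 = Decode(bytes, Encoding.UTF8);
        string as1252 = Decode(bytes, Encoding.GetEncoding(1252));

        // 0xA0 on its own is not valid UTF-8, so it becomes U+FFFD (the '�' character).
        Console.WriteLine(asUtf8[1] == '\uFFFD');   // True
        // Decoded as Windows-1252, the same byte is U+00A0 (non-breaking space).
        Console.WriteLine(as1252[1] == '\u00A0');   // True
    }
}
```

If the file turns out to use some other code page, only the argument to Encoding.GetEncoding changes. Note also that a non-breaking space still is not an ordinary space, so a Replace('\u00A0', ' ') on the parsed fields may be wanted afterwards.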
Upvotes: 0