Reputation: 24116
I am using this http://joshclose.github.io/CsvHelper/ awesome library in my .NET C# project for my CSV parsing requirements.
If I have a CSV file that looks like this:
SupplierSku,MappedSageSku
EG1234,EGCD1234
EG4567,EG-XZ567
I usually create a DTO class like this:
public class SkuMapping
{
public string SupplierSku { get; set; }
public string MappedSageSku { get; set; }
}
and parse the csv file like this:
// Open & parse selected csv file
var csvReader = new CsvReader(File.OpenText(selectSkuMapping.Text));
var skuMappings = csvReader.GetRecords<SkuMapping>();
// Do something with each row
foreach (SkuMapping skuMapping in skuMappings)
{
// ...
}
This works great for pre-defined / structured CSV files.
I now have the requirement to parse arbitrary CSV files, where they may contains various column delimiter
& string enclosure
and exact number of columns on csv is not known, but the column index containing the data I need is known.
PartNumb,InStock,PrGroup
"A-X-1230",Y,103
"B-DD-1231",Y,103
Where; column delimiter
= ,
and string enclosure
= "
Data I need: Column index 0
(PartNumb) and Column index 1
(InStock)
SupplierSku,CatIds,StockStatus,Active
%ADA-BB-124%|4,5,1|%AV%|1
%XAS-E4-S11%|97,41,65|%OS%|0
Where; column delimiter
= |
and string enclosure
= %
Data I need: Column index 0
(SupplierSku) and Column index 2
(StockStatus)
So given the above, what is the best way to use the CsvHelper
library to parse arbitrary csv files (where column delimiter
, string enclosure
and column indexes
are known)? I also need the option to skip first row on csv (sometimes csv contains header row and sometimes they dont).
Upvotes: 2
Views: 11932
Reputation: 8588
There was a package upgrade and the new syntax is now:
using (var reader = new StreamReader("./myfile.csv", Encoding.UTF8))
using (var csv = new CsvReader(reader, new CsvConfiguration(CultureInfo.InvariantCulture, delimiter: ";", encoding: Encoding.UTF8)))
{
//do something with csv
}
As pointed out in the comments and seemingly already migrated in my codebase new syntax is
using (var reader = new StreamReader("./myfile.csv", Encoding.UTF8))
using (var csv = new CsvReader(reader, new CsvConfiguration(CultureInfo.InvariantCulture) { Delimiter = ";", Encoding = Encoding.UTF8 }))
{
//do something with csv
}
Upvotes: 5
Reputation: 536
CsvReader csv = new CsvReader(new StreamReader(stream), true, ';')
Configuration.Delimiter it is not working any more, now you have to pass the delimiter as a param in the CsvReader initialization
Upvotes: 0
Reputation: 30464
The answer depends on what you want:
If you don't know which delimiters are allowed, you are in trouble: Is 'A'a delimiter? Is 'B' a delimiter?. Let's assume there is a set of characters that you think are suitable as delimiters for the actual CSV stream you are going to parse.
You can do this using String.Replace(char, char) for each accepted delimiter, or use a regular expression
If you only need certain columns, create a map where you tell the CsvHelper which column must be mapped to which destination.
Example: if you need to map column "MyColumn" to Property YourProperty create a map:
private sealed class MyCsvConverterMap : CsvClassMap<MyDestinationType>
{
public MyCsvConverterMap()
{
Map(item => item.YourProperty).Name("MyColumn");
// map all properties in your destination to a column
}
}
using (TextReader txtReader = new StringReader(...))
{
CsvReader csvReader = new CsvReader(txtReader);
csvReader.Configuration.Delimiter = ";";
csvReader.Configuration.HasHeaderRecord = true;
csvReader.Configuration.RegisterClassMap(new MyCsvConverterMap());
while (csvReader.Read())
{
MyDestinationType convertedRecord = csvReader.GetRecord<MyDestinationType>();
...
Addition
It is also possible to map by column number instead of mapping by column name. See an explanation of all kinds of maps: CsvHelper Getting started
Upvotes: 3
Reputation: 696
With a bit more effort you can also create a type-safe mapping with the CsvClassMap
and inheriting from the DefaultTypeConverter
class to create a converter for the CatIds (comma separated).
Here's an example that works for your example #2:
[TestClass]
public class CsvHelperTest
{
[TestMethod]
public void Test()
{
var textToParse = "SupplierSku,CatIds,StockStatus,Active" + Environment.NewLine;
textToParse += "%ADA-BB-124%|4,5,1|%AV%|1" + Environment.NewLine;
textToParse += "%XAS-E4-S11%|97,41,65|%OS%|0";
using (var stringReader = new StringReader(textToParse))
{
using (var reader = new CsvReader(stringReader))
{
reader.Configuration.Quote = '%';
reader.Configuration.Delimiter = "|";
reader.Configuration.HasHeaderRecord = true; // If there is no header, set to false.
reader.Configuration.RegisterClassMap<StockMap>();
foreach(var stock in reader.GetRecords<Stock>())
{
// normally do something with data, now just test
Assert.IsNotNull(stock.SupplierSku);
Assert.IsTrue(stock.SupplierSku.IndexOf('%') == -1, "Quotes should be stripped");
Assert.IsNotNull(stock.CatIds);
Assert.AreEqual(3, stock.CatIds.Length, "Expected 3 CatIds");
}
}
}
}
public class StockMap : CsvClassMap<Stock>
{
public StockMap()
{
Map(stock => stock.SupplierSku).Index(0);
Map(stock => stock.CatIds).Index(1).TypeConverter<CatIdsConverter>();
Map(stock => stock.StockStatus).Index(2);
Map(stock => stock.Active).Index(3); // 1 is true, 0 is false
}
}
public class Stock
{
public string SupplierSku { get; set; }
public int[] CatIds { get; set; }
public StockStatus StockStatus { get; set; }
public bool Active { get; set; }
}
public enum StockStatus
{
AV, OS
}
public class CatIdsConverter : DefaultTypeConverter
{
public override bool CanConvertFrom(Type type)
{
return type == typeof(string);
}
public override object ConvertFromString(TypeConverterOptions options, string text)
{
if (string.IsNullOrEmpty(text))
return null;
var catIds = text.Split(',').Select(c=> Convert.ToInt32(c)).ToArray();
return catIds;
}
}
}
For example #1 just configure Quote = '"', Delimiter = ",", Add another class & CsvClassMap<> implementation and configure that within another CsvReader.
Upvotes: 1
Reputation: 2141
This seems to work, using CsvHelper:
var textToParse = @"SupplierSku,CatIds,StockStatus,Active
%ADA-BB-124%|4,5,1|%AV%|1
%XAS-E4-S11%|97,41,65|%OS%|0";
string supplierSku;
string stockStatus;
using (var stringReader = new StringReader(textToParse))
{
using (var reader = new CsvReader(stringReader))
{
reader.Configuration.Delimiter = ",";
reader.Configuration.HasHeaderRecord = true; // If there is no header, set to false.
while (reader.Read())
{
supplierSku = reader.GetField("SupplierSku"); // Or reader.GetField(0)
stockStatus = reader.GetField("StockStatus"); // Or reader.GetField(2)
Console.WriteLine($"SKU: {supplierSku}; Status: {stockStatus}");
}
}
}
However, it doesn't automatically trim/remove the quote characters - you can easily do that yourself using Trim()
or Substring()
. There is a little more manual effort involved, but it is still easier than doing it manually.
Upvotes: 4