Reputation: 7226
I am using FileHelpers 3.3.1 to import CSV data and populate DataTables
in my c# app. It works well, and here is how I'm calling it:
DataTable dt = CsvEngine.CsvToDataTable(fullPath, ',');
The problem is that some column values have padding, as in spaces to the left and/or right side of the values, and those spaces are not being trimmed. My CSV files are large and performance of my importer app is important, so I really want to avoid looping through the datatable after the fact and trimming every column value of every row.
Is there a way to invoke a "trim all column values automatically" during the call to CsvToDataTable()
?
I know there is a FieldTrim attribute that does this very thing, but I cannot bind rigid classes to my CSV files because I have many different CSV files and they all have different column names and data types. So that's not a practical option for me. It seems like there would be a built-in way to trim
using one of the generic CSV parsers like CsvToDataTable()
.
What is my best option?
Upvotes: 1
Views: 632
Reputation: 11326
The FileHelpers CsvEngine
class is quite limited. It is a sealed
class so you cannot easily inherit or override from it.
If you don't mind a hacky solution, the following works
// Set the internal TrimChars via reflection
public static class FileBaseExtensions
{
public static void SetTrimCharsViaReflection(this FieldBase field, Char [] value)
{
var prop = typeof(FieldBase).GetProperty("TrimChars", BindingFlags.NonPublic | BindingFlags.Instance);
prop.SetValue(field, value);
}
}
CsvOptions options = new CsvOptions("Records", ',', filename);
var engine = new CsvEngine(options);
foreach (var field in engine.Options.Fields)
{
field.SetTrimCharsViaReflection(new char[] { ' ', '\t' });
field.TrimMode = TrimMode.Both;
}
var dataTable = engine.ReadFileAsDT(filename);
But you would be better off using a standard FileHelperEngine
and creating your own version of CsvClassBuilder
(source code here) to create the mapping class. You would have to change the AddFields
method as follows:
public override DelimitedFieldBuilder AddField(string fieldName, string fieldType)
{
base.AddField(fieldName, fieldType);
if (base.mFields.Count > 1)
{
base.LastField.FieldOptional = true;
base.LastField.FieldQuoted = true;
base.LastField.QuoteMode = QuoteMode.OptionalForBoth;
base.LastField.QuoteMultiline = MultilineMode.AllowForBoth;
// <New>
base.LastField.TrimMode = TrimMode.Both;
base.LastField.TrimChars = " \t"; // trim spaces and tabs
// </New>
}
return base.LastField;
}
If necessary you can lift the code for CsvToDataTable
from the source code for CsvEngine
which is here.
Upvotes: 1