HerrimanCoder
HerrimanCoder

Reputation: 7226

How to trim all column values with CsvEngine.CsvToDataTable()?

I am using FileHelpers 3.3.1 to import CSV data and populate DataTables in my c# app. It works well, and here is how I'm calling it:

DataTable dt = CsvEngine.CsvToDataTable(fullPath, ',');

The problem is that some column values have padding, as in spaces to the left and/or right side of the values, and those spaces are not being trimmed. My CSV files are large and performance of my importer app is important, so I really want to avoid looping through the datatable after the fact and trimming every column value of every row.

Is there a way to invoke a "trim all column values automatically" during the call to CsvToDataTable()?

I know there is a FieldTrim attribute that does this very thing, but I cannot bind rigid classes to my CSV files because I have many different CSV files and they all have different column names and data types. So that's not a practical option for me. It seems like there would be a built-in way to trim using one of the generic CSV parsers like CsvToDataTable().

What is my best option?

Upvotes: 1

Views: 632

Answers (1)

shamp00
shamp00

Reputation: 11326

The FileHelpers CsvEngine class is quite limited. It is a sealed class so you cannot easily inherit or override from it.

If you don't mind a hacky solution, the following works

// Set the internal TrimChars via reflection
public static class FileBaseExtensions
{
    public static void SetTrimCharsViaReflection(this FieldBase field, Char [] value)
    {
        var prop = typeof(FieldBase).GetProperty("TrimChars", BindingFlags.NonPublic | BindingFlags.Instance);
        prop.SetValue(field, value);
    }
}

CsvOptions options = new CsvOptions("Records", ',', filename);
var engine = new CsvEngine(options);            
foreach (var field in engine.Options.Fields)
{
    field.SetTrimCharsViaReflection(new char[] { ' ', '\t' });
    field.TrimMode = TrimMode.Both;
}
var dataTable = engine.ReadFileAsDT(filename);

But you would be better off using a standard FileHelperEngine and creating your own version of CsvClassBuilder (source code here) to create the mapping class. You would have to change the AddFields method as follows:

public override DelimitedFieldBuilder AddField(string fieldName, string fieldType)
{
    base.AddField(fieldName, fieldType);
    if (base.mFields.Count > 1)
    {
        base.LastField.FieldOptional = true;
        base.LastField.FieldQuoted = true;
        base.LastField.QuoteMode = QuoteMode.OptionalForBoth;
        base.LastField.QuoteMultiline = MultilineMode.AllowForBoth;

        // <New>
        base.LastField.TrimMode = TrimMode.Both;
        base.LastField.TrimChars = " \t"; // trim spaces and tabs
        // </New>
    }
    return base.LastField;
} 

If necessary you can lift the code for CsvToDataTable from the source code for CsvEngine which is here.

Upvotes: 1

Related Questions