Reputation: 31
I have an issue occurring due to a very specific situation when trying to import a csv file into an SQL Server table using a SSIS package. The data in the csv file can contain double quotes and commas. As such, data with commas are double quoted and double quotes are escaped with an additional double quote. I've used the text qualifier to successfully process the initial surrounding quotes. However, there are particular instances where data is formatted like """Anne"", Annabelle" which I cannot get to process. The extra double quotes in the data appear to cause the comma to terminate the field. I have tried to use the Derived Column Transformation to replace those double quotes with something else that might not cause the issue, but to no avail. Has anyone else encountered this issue and found a solution or workaround?
Upvotes: 3
Views: 1742
Reputation: 5208
If you're OK with losing the quotes from those fields, a simple Script Task to process your file before you import it would work (the following creates a new file with "_Processed" added to the filename):
public void Main()
{
System.IO.StreamReader reader = null;
System.IO.StreamWriter writer = null;
try
{
string filepath = Dts.Variables["User::Filepath"].Value.ToString();
reader = new System.IO.StreamReader(filepath);
string fileText = reader.ReadToEnd();
string newFilepath =
System.IO.Path.Combine(
System.IO.Path.GetDirectoryName(filepath),
System.IO.Path.GetFileNameWithoutExtension(filepath) + "_Processed" + System.IO.Path.GetExtension(filepath)
);
if (System.IO.File.Exists(newFilepath))
{
System.IO.File.Delete(newFilepath);
}
writer = new System.IO.StreamWriter(newFilepath);
writer.Write(fileText.Replace("\"\"", ""));
Dts.TaskResult = (int)ScriptResults.Success;
}
catch (Exception ex)
{
Dts.Events.FireError(0, "Script Task", ex.Message, string.Empty, 0);
}
finally
{
if (reader != null)
{
writer.Close();
writer.Dispose();
}
if (writer != null)
{
writer.Close();
writer.Dispose();
}
}
}
If you want to keep the quotes, I would change:
writer.Write(fileText.Replace("\"\"", ""));
To something like:
writer.Write(fileText.Replace("\"\"", "[double quote removed]"));
You could then put the actual double quotes back in in a Derived Column Transformation.
And for all this you can just use a standard Flat File Connection, with comma as a separator and "
as a text qualifier.
Upvotes: 2