Reputation: 1887
I have a situation where I receive a CSV file from a third party, and one of the columns contains line feeds. This causes problems when uploading the CSV file to the database.
I am using an SSIS package to parse the CSV file, and I want to correct the file before SSIS uses it.
I'd prefer a PowerShell script for that.
The question "handling a CSV with line feed characters in a column in powershell" is closely related to what I am looking for, with one difference:
my data is in this format (notice [LF] at the end instead of [CR][LF]):
Column1,Column2,Column3,Column4
Text1,"Text2[LF]","text3[LF]",text4[LF]
Edit
Data Example
Column1, Column2, Column 3[LF]
1, "text text", text[LF]
2, "text[LF]
Some more text [LF]
Some more text", text[LF]
3, "text again", text[LF]
Here's the actual file https://www.dropbox.com/s/wsxfyehlnls7m53/test.csv
Can this actually be corrected, or is it just impossible?
Upvotes: 0
Views: 4714
Reputation: 1674
This is a little ugly, but it works for me and does what you need.
First, get the file contents and the column headers.
$text = Import-CSV $file
$columns = Get-Content $file -TotalCount 1
$columns = $columns.Split(",").Trim()
Next, loop through each field and replace the line feed with <br/>.
For ($r=0; $r -lt $text.Count; $r++) {
    For ($c=0; $c -lt $columns.Count; $c++) {
        $text[$r].($columns[$c]) = $text[$r].($columns[$c]).Replace("`n","<br/>")
    }
}
Then export the CSV:
$text | Export-Csv $file -NoTypeInformation
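A variation on the loop above, as a minimal sketch: instead of splitting the header line by hand, it walks each row's PSObject.Properties, so it does not depend on how the header line is written. The temp file and the sample data are made up for illustration; point Import-Csv at your real file instead.

```powershell
# Hypothetical sample: one quoted field contains a bare line feed.
$sample = "Column1,Column2,Column3`n1,`"text text`",text`n2,`"text`nSome more text`",text`n"
$tmp = [System.IO.Path]::GetTempFileName()
[System.IO.File]::WriteAllText($tmp, $sample)

# Import-Csv keeps the embedded line feed inside the quoted field.
$rows = @(Import-Csv $tmp)

foreach ($row in $rows) {
    foreach ($prop in $row.PSObject.Properties) {
        # Guard against null values, then replace CRLF or LF with the marker.
        if ($prop.Value -is [string]) {
            $prop.Value = $prop.Value -replace "`r?`n", '<br/>'
        }
    }
}

# Write the cleaned rows back and read the file again to inspect it.
$rows | Export-Csv $tmp -NoTypeInformation
$result = Get-Content $tmp
Remove-Item $tmp
```

Note that Export-Csv quotes every field on output, so the rewritten file will not be byte-for-byte identical to the original apart from the line feeds.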
Upvotes: 0
Reputation: 68273
Try this:
(Get-Content $file -Raw) -replace '\n(?=")','<br/>' |
    Set-Content $file
That should replace any newline that is immediately followed by a double quote.
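To see what the lookahead does without touching a file, here is a small sketch that runs the same pattern against an in-memory copy of the first sample from the question (the $sample string is typed in by hand for illustration):

```powershell
# Hand-built copy of the question's first sample; `n is a bare line feed.
$sample = "Column1,Column2,Column3,Column4`nText1,`"Text2`n`",`"text3`n`",text4`n"

# (?=") is a lookahead: the quote must follow the line feed,
# but it is not consumed, so it stays in the output.
$fixed = $sample -replace '\n(?=")', '<br/>'

$fixed
```

The line feeds that end the header and the data row are left alone because they are not followed by a quote. One caveat: if a row starts with a quoted field, the line feed that ends the previous row is also followed by a quote and would be replaced too.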
Or, you can do it this way:
(Get-Content $file -Raw) -replace '\n"','<br/>"' |
    Set-Content $file
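Both patterns only catch a line feed that sits directly before a double quote. For the multi-line sample in the question's edit, where the line feeds fall in the middle of a quoted field, one possible approach is to match each quoted field as a whole and replace the line feeds inside it. A rough sketch, assuming quotes inside fields are escaped by doubling them ("") and using an in-memory string in place of the file:

```powershell
# Hand-built copy of the "Data Example" from the question.
$raw = "Column1, Column2, Column 3`n1, `"text text`", text`n2, `"text`nSome more text `nSome more text`", text`n3, `"text again`", text`n"

# Match one quoted field at a time (doubled quotes count as escaped quotes).
$quotedField = '"(?:[^"]|"")*"'

# The script block is used as a MatchEvaluator: it receives each match
# and returns the field with its line feeds replaced.
$fixed = [regex]::Replace($raw, $quotedField, {
    param($m)
    $m.Value -replace "`r?`n", '<br/>'
})

$fixed
```

For the real file you would feed (Get-Content $file -Raw) in as $raw and pipe $fixed to Set-Content. Line feeds outside quoted fields, which end the rows, are never touched because they are not part of any match.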
Upvotes: 2