jjamesjohnson

Reputation: 669

Handling a CSV with line feed characters in a column in PowerShell

Currently, I have a system which creates a delimited file like the one below, in which I've mocked up the extra line feeds that appear sporadically within the columns.

Column1,Column2,Column3,Column4

Text1,Text2[LF],text3[LF],text4[CR][LF]

Text1,Text2[LF][LF],text3,text4[CR][LF]

Text1,Text2,text3[LF][LF],text4[CR][LF]

Text1,Text2,text3[LF],text4[LF][LF][CR][LF]

I've been able to remove the problematic line feeds in Notepad++ with the following regex, which leaves the valid carriage return/line feed combinations alone:

(?<![\r])[\n]

However, I can't find a solution in PowerShell. I think that when I run Get-Content on the CSV file, the line feeds within the text fields are treated as line breaks, so each fragment ends up as a separate object in the variable and the line feeds themselves are gone. My question is: how can I apply the regex to the CSV file with a replace if the cmdlet drops the line feeds when loading the data?

I've also tried the method below to load the content of my CSV. It doesn't work either: it just results in one long string, similar to using -join(get-content).

[STRING]$test = [io.file]::ReadAllLines('C:\CONV\DataOutput.csv')
$test.replace("(?<![\r])[\n]","")
$test | out-file .\DataOutput_2.csv

Upvotes: 3

Views: 6611

Answers (1)

mousio

Reputation: 10357

Nearly there. May I suggest just three changes:

  • use ReadAllText(…) instead of ReadAllLines(…)
  • use -replace … instead of .Replace(…); only then will the first argument be treated as a regex
  • do something with the replacement result (e.g. assign it back to $test)

Sample code:

[STRING]$test = [io.file]::ReadAllText('C:\CONV\DataOutput.csv')
$test = $test -replace '(?<![\r])[\n]',''
$test | out-file .\DataOutput_2.csv
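
To see why the first two changes matter, here is a minimal sketch that runs against a small temporary file rather than the real export. The sample text, the temporary path, and the variable names ($sample, $path, $lines, $text, $literal, $fixed) are assumptions made up for illustration; only the regex and the .NET calls come from the question and the sample above.

```powershell
# Hypothetical sample: two records with stray LFs inside fields and CRLF record terminators.
$sample = "Text1,Text2`n,text3`n,text4`r`nText1,Text2`n`n,text3,text4`r`n"

# Temporary file standing in for the real CSV (assumption for this demo only).
$path = [System.IO.Path]::GetTempFileName()
[System.IO.File]::WriteAllText($path, $sample)

# ReadAllLines splits on a bare LF as well as on CRLF, so the stray line feeds
# are consumed as line breaks before any replace can see them.
$lines = [System.IO.File]::ReadAllLines($path)
"ReadAllLines returned $($lines.Count) lines for 2 records"

# ReadAllText keeps every character, including CR and LF.
$text = [System.IO.File]::ReadAllText($path)

# String.Replace looks for the pattern as literal text, finds nothing, and changes nothing.
$literal = $text.Replace('(?<![\r])[\n]', '')

# The -replace operator treats the pattern as a regex: drop each LF not preceded by CR.
$fixed = $text -replace '(?<![\r])[\n]', ''
$fixed

# Clean up the temporary file.
Remove-Item $path
```

With this sample, $literal should be identical to $text, while $fixed should contain two clean records that still end in CR LF. One thing to check when writing the result back: Out-File appends its own trailing newline, and its default encoding differs between Windows PowerShell and newer PowerShell versions, so passing -Encoding explicitly may be worthwhile if the consumer of the file is strict about it.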

Upvotes: 3
