vanandsh
vanandsh

Reputation: 92

Replace commas that are not within double inverted commas("") from csv files using powershell?

I have a huge csv file(around 100GB). My problem is that I need to replace commas(,) in the file with semi-colon(;) except for the ones within double-inverted commas("").

I tried several methods but none seem to be working. Also this modification needs to be on Windows, hence sed and awk are out of option.

Example:
Input : "A,B,C",D,E,"FG","H,J",K
Output : "A,B,C";D;R;"FG";H,J;K

Once this is done, I need to remove the ".

I am able to remove the " from the file, but semi-colon replacement is failing everytime.

Please let me know if this is achievable through Powershell.

Upvotes: 0

Views: 1119

Answers (1)

mjolinor
mjolinor

Reputation: 68341

This should take care of both the delimiter replacement and removing the double quotes:

 Get-Content ./File.csv -ReadCount 1000 |
 foreach { $_ -replace ',(?=(?:[^"]|"[^"]*")*$)',';' -replace '"' } |
 Add-Content ./NewFile.csv 

and handle a large file efficiently without needing third party utilities.

Upvotes: 1

Related Questions