Peter Webster

Reputation: 103

Removing line breaks from CSV exported from Google Sheets

I have some data in the format:

-e, 's/,Chalk/,Cheese/g'
-e, 's/,Black/,White/g'
-e, 's/,Leave/,Remain/g'

in a file data.csv.

In Git Bash, the file command reports that this is ASCII text with CRLF line terminators, and cat -v shows that each line ends with ^M.

I want to remove those terminators, to leave a single line.

I've tried the following:

sed -e 's/'\r\n'//g' < data.csv > output.csv

taking care to put the \r\n in single quotes in order that the backslash is treated literally, but it does not work. No error, just no effect.

I'm using Git Bash for Windows.

Upvotes: 2

Views: 1869

Answers (1)

gnucchi

Reputation: 189

Single quotes do not nest: the second quote closes the string instead of opening an inner one, so the \r\n ends up outside the quotes. There the shell strips the backslashes, and sed receives s/rn//g. You could escape the quotes like 's|'\''\r\n'\''||g', but that would just include literal quote characters in the pattern, which would not match anything in your case.
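To see what the shell does with the nested quotes, you can print the argument instead of passing it to sed. This is only a small sketch; the variable names attempted and intended are made up for the illustration.

```shell
# Capture each sed script exactly as the shell would hand it to sed.
attempted=$(printf '%s' 's/'\r\n'//g')   # quotes close before \r\n and reopen after it
intended=$(printf '%s' 's/\r\n//g')      # one quoted string, backslashes preserved

printf 'attempted: %s\n' "$attempted"    # prints: attempted: s/rn//g
printf 'intended:  %s\n' "$intended"     # prints: intended:  s/\r\n//g
```

The first form asks sed to delete the two letters "rn", which is why the command ran without error and without effect.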

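But that is not the only problem: by default sed reads its input one line at a time and strips the trailing newline before applying the script, so a pattern containing \n never has anything to match.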
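A quick way to check the line-by-line behaviour is to count bytes before and after. The sketch below assumes GNU sed (the one Git Bash normally ships), since \r in a pattern is a GNU extension; the sample function and its two-line input are hypothetical stand-ins for data.csv.

```shell
# Hypothetical two-line sample with CRLF terminators (6 bytes in total).
sample() { printf 'a\r\nb\r\n'; }

# sed never sees the \n inside a line, so this substitution matches nothing.
unchanged=$(sample | sed 's/\r\n//g' | wc -c | tr -d ' ')

# Anchoring on end-of-line instead removes the CR from each line (GNU sed).
cr_stripped=$(sample | sed 's/\r$//' | wc -c | tr -d ' ')

echo "$unchanged $cr_stripped"   # prints: 6 4
```

The second command gets rid of the ^M characters but still leaves one line feed per line, so on its own it does not join the lines.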

If you have GNU sed, enough RAM to hold the whole file, and are sure the file contains no null characters, try adding the -z option, which makes sed split its input on null characters instead of newlines, so the whole file is processed as one string:

sed -z -e 's|\r\n||g' < data.csv > output.csv

Though I guess you probably also want to replace it with a comma:

sed -z -e 's|\r\n|,|g' < data.csv > output.csv

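For non-GNU versions of sed, you may have an easier time using tr instead. Note that tr reads only from standard input and works on single characters rather than strings, so delete the carriage returns first and then turn each line feed into a comma:

tr -d '\r' < data.csv | tr '\n' ',' > output.csv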
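Since tr takes no file operand and translates character by character, a pipeline along these lines may be closer to what you want. It is a sketch only: the temporary file and its three-line content are a hypothetical stand-in for data.csv, and the variable names are made up.

```shell
# Hypothetical sample standing in for data.csv (CRLF line endings).
tmp=$(mktemp)
printf 'x\r\ny\r\nz\r\n' > "$tmp"

# tr reads only standard input, so the file is redirected in.
# -d deletes every CR and LF, joining the lines with nothing between them.
joined=$(tr -d '\r\n' < "$tmp")

# Delete the CRs, turn each LF into a comma, then trim the trailing comma.
comma_joined=$(tr -d '\r' < "$tmp" | tr '\n' ',' | sed 's/,$//')

printf '%s\n' "$joined"         # prints: xyz
printf '%s\n' "$comma_joined"   # prints: x,y,z
rm -f "$tmp"
```

The first form matches the question as asked (terminators removed, single line); the second is the comma-separated variant.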

Upvotes: 2
