Sam T

Reputation: 1053

Remove \r (CR) from CSV

On OSX I need to remove line-ending CR (\r) characters (represented as ^M in the output from cat -v) from my CSV file:

$ cat -v myitems.csv

output:

strPicture,strEmail^M
image1xl.jpg,[email protected]^M

I have tried lots of options with sed and perl but nothing works.

Any ideas?

Upvotes: 3

Views: 12235

Answers (3)

mklement0

Reputation: 439377

Solutions with stock utilities:

Note: Except where noted (the sed -i incompatibility), the following solutions work on both OSX (macOS) and Linux.

Use sed as follows, which replaces \r\n with \n:

sed $'s/\r$//' myitems.csv

To update the input file in place, use

sed -i '' $'s/\r$//' myitems.csv

-i '' specifies updating in place, with '' indicating that no backup should be made of the input file; if you specify an extension, e.g., -i'.bak', the original input file will be saved with that extension as a backup.
Caveats:
* With GNU sed (Linux), to not create a backup file, you'd have to use just -i, without the separate '' argument, which is an unfortunate syntactic incompatibility between GNU sed and the BSD sed used on OSX (macOS) - see this answer of mine for the full story.
* -i creates a new file with a temporary name and then replaces the original file; the most notable consequence is that if the original file was a symlink, it is replaced with a regular file; for a detailed discussion, see the lower half of this answer.

Note: The above uses an ANSI C-quoted string ($'...') to create the \r character in the sed command, because BSD sed (the one used on OS X) doesn't natively recognize such escape sequences (the GNU sed used on Linux distros does).
ANSI C-quoted strings are supported in Bash, Ksh, and Zsh.

If you don't want to rely on such strings, use:

sed 's/'"$(printf '\r')"'$//' myitems.csv

Here, the \r is created via printf and spliced into the sed command with a command substitution ($(...)).
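
One possible way to sidestep the -i incompatibility mentioned in the caveats: both GNU sed and BSD sed accept a backup suffix attached directly to -i (no space), so you can always request a backup and delete it afterward. A minimal sketch, assuming a throwaway file named sample.csv (the file name and its contents are made up for illustration):

```shell
# Create a small sample file with CRLF line endings (hypothetical name and data).
printf 'strPicture,strEmail\r\nimage1xl.jpg,someone\r\n' > sample.csv

# Build the CR character portably, without relying on $'...' strings.
cr=$(printf '\r')

# -i.bak (suffix attached, no space) is accepted by both GNU and BSD sed;
# the backup is removed only if sed succeeded.
sed -i.bak "s/${cr}\$//" sample.csv && rm -f sample.csv.bak

# Count the lines that still contain a CR (grep exits non-zero on zero matches).
remaining=$(grep -c "$cr" sample.csv || true)
echo "lines still containing CR: $remaining"
```

The symlink caveat still applies, since sed replaces the original file either way.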


Using perl:

perl -pe 's/\r\n/\n/' myitems.csv | cat -v

To update the input file in place, use

perl -i -pe 's/\r\n/\n/' myitems.csv  # -i'.bak' creates backup with suffix '.bak' first; don't add -l, which strips the \n before the substitution can match

The same caveat as above for sed with regard to in-place updating applies.


Using awk:

awk '{ sub("\r$", ""); print }' myitems.csv  # shorter: awk 'sub("\r$", "")+1'

BSD awk offers no in-place updating option, so you'll have to capture the output in a different file; to use a temporary file and have it replace the original afterward, use the following idiom:

awk '{ sub("\r$", ""); print }' myitems.csv > tmpfile && mv tmpfile myitems.csv

GNU awk v4.1 or higher offers -i inplace for in-place updating, to which the same caveat as above for sed applies.
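
The temporary-file idiom can also be wrapped in a small function. The following is only a sketch: the function name strip_cr_inplace and the sample file name are made up for illustration, and it assumes mktemp is available (it is on both macOS and Linux).

```shell
# Hypothetical helper: strip a trailing CR from every line of the file "$1",
# going through a temporary file because BSD awk has no in-place option.
strip_cr_inplace() {
  file=$1
  # Create the temporary file next to the original, so that mv is a plain
  # rename on the same filesystem.
  tmp=$(mktemp "${file}.XXXXXX") || return 1
  if awk '{ sub("\r$", ""); print }' "$file" > "$tmp"; then
    mv "$tmp" "$file"
  else
    # awk failed: leave the original untouched and clean up.
    rm -f "$tmp"
    return 1
  fi
}

# Usage with a made-up sample file.
printf 'a,b\r\n1,2\r\n' > sample_awk.csv
strip_cr_inplace sample_awk.csv
```

Note that the replaced file takes on the restrictive permissions mktemp assigns, and a symlink would be replaced by a regular file, as with sed -i. Writing back with cat "$tmp" > "$file" followed by rm "$tmp" would preserve both, at the cost of the replacement no longer being a single rename.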


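Edge case: If the very last character in the input file happens to be a lone \r without a following \n, the sed and awk variants remove it too (awk and BSD sed then end the output with a \n, GNU sed does not), whereas the perl command, which matches only \r\n, leaves it in place.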


For the sake of completeness: here are additional, possibly suboptimal solutions:

None of them offers in-place updating, but you can employ the > tmpfile && mv tmpfile myitems.csv idiom introduced above.


Using tr: a very simple solution that removes all \r instances; thus, it can only be used if \r instances occur only as part of \r\n sequences; typically, however, that is the case:

tr -d '\r' < myitems.csv
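
To illustrate that limitation, the following sketch feeds a made-up line with a \r in the middle as well as at the end to both tr and the sed command from further up; tr removes both, whereas sed removes only the line-ending one.

```shell
cr=$(printf '\r')

# Made-up input: one CR in the middle of the line, one at its end.
# od -c makes any remaining control characters visible.
printf 'a\rb\r\n' | tr -d '\r' | od -c           # both CRs are gone
printf 'a\rb\r\n' | sed "s/${cr}\$//" | od -c    # the mid-line CR survives

# Byte counts of the two results, for comparison.
tr_bytes=$(printf 'a\rb\r\n' | tr -d '\r' | wc -c)
sed_bytes=$(printf 'a\rb\r\n' | sed "s/${cr}\$//" | wc -c)
echo "tr: $((tr_bytes)) bytes, sed: $((sed_bytes)) bytes"
```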

Using pure bash code: note that this will be slow; like the tr solution, this can only be used if \r instances occur only as part of \r\n sequences.

while IFS=$'\r' read -r line; do
  printf '%s\n' "$line"
done < myitems.csv

$IFS is the internal field separator, and setting it to \r causes read to read everything before \r, if present, into variable $line (if there's no \r, the line is read as is). -r prevents read from interpreting \ instances in the input.

Edge case: If the input doesn't end with \n, the last line will not print - you could fix that by using read -r line || [[ -n $line ]].
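
Putting the loop and that fix together gives roughly the following sketch. The function name strip_cr_lines is hypothetical, and the \r is created with printf and the test written as [ -n "$line" ] instead of [[ -n $line ]], an assumption made here so the code does not depend on $'...' or [[ ... ]] support.

```shell
# Hypothetical helper: read stdin line by line, dropping a line-ending CR.
strip_cr_lines() {
  cr=$(printf '\r')
  # The "|| [ -n "$line" ]" part also processes a final line that lacks a
  # trailing \n: read returns non-zero at end of input, but has still
  # stored the partial line in $line.
  while IFS=$cr read -r line || [ -n "$line" ]; do
    printf '%s\n' "$line"
  done
}

# Made-up input whose last line has neither CR nor LF after it.
result=$(printf 'a,b\r\n1,2' | strip_cr_lines)
printf '%s\n' "$result"
```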

Upvotes: 5

egh3

Reputation: 1

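Try the dos2unix command; unix2dos is its counterpart and converts in the opposite direction (it adds \r).

Example: dos2unix -n infile outfile  (in the commonly installed dos2unix implementation, -n writes the result to a new file)

http://en.wikipedia.org/wiki/Unix2dos

The Wikipedia page has some examples using perl and sed too; to remove \r rather than add it:

perl -i -p -e 's/\r\n/\n/' file
sed -i -e 's/\r$//' file  # GNU sed only; BSD sed does not understand \r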

Upvotes: -1

BMW

Reputation: 45293

Try dos2unix, which converts \r\n line endings to \n:

dos2unix myitems.csv myitems.csv

Upvotes: 2
