PYL

Reputation: 109

Can you explain sed 's/$/\r/g'?

I ran the command sed 's/$/\r/g' linux.txt > linux2win.txt to convert a text file from Linux to Windows line endings.

And it works! All \n are converted to \r\n

For example, hello, world \n is converted to hello, world \r\n

What confuses me is what exactly $ refers to. Is it \n, or an empty character before \n? I don't even know what I replaced.

Upvotes: 1

Views: 5515

Answers (3)

Ed Morton

Reputation: 203655

The answers/comments so far stating that $ matches the end of line are misleading. $ in a regexp matches end of string, that is all. The reason it appears to match end of line in sed is that by default sed reads 1 line at a time so in that context (but not in others) each string it's operating on does end at the end of the line.

So $ matches end-of-string. If your string ends at the end of a line, then $ matches at the end of the line. But if your string contains multiple lines (e.g. in sed you can build a multi-line string in a buffer), then $ does not match at the end of every line; it simply and consistently matches at the end of the string.

Similarly ^ matches start-of-string, btw, not start-of-line as you may hear people claim.
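A quick way to see this for yourself is a minimal sketch like the one below. It assumes a POSIX shell and a sed that supports the N command; the < and > markers and the variable name joined are only there for illustration. N appends the next input line to the buffer, so the string sed works on contains an embedded newline.

```shell
# Join two input lines into one buffer with N, then mark where ^ and $ match.
# The < and > markers are illustrative; they show the match positions.
joined=$(printf 'a\nb\n' | sed 'N; s/^/</; s/$/>/')
printf '%s\n' "$joined"
# prints:
# <a
# b>
```

The markers land only at the very start and the very end of the two-line string, not around each line.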

wrt your comment:

my original line is hello, world \n$ and $ is invisible, and $ is replaced by \r, now my line is hello, world\n\r$.

No, that is not what is happening. Your original line is:

hello, world\n

and sed reads one \n-separated line at a time, so what is read into sed's buffer is the string:

hello, world

Now $ is a regexp metacharacter that matches the end-of-string so given the above string $ will match after the d (and ^ would match before h) so when you do

s/$/\r/

It changes the above string to:

hello, world\r

and then when sed prints it out it adds back the newline (because a string with no terminating newline is not a text line per POSIX) and outputs:

hello, world\r\n

Note that $ is never part of the string, it's just a metacharacter that when used in a regexp matches the end of the string so you can test for characters appearing just at the end of a string or do other operations (like the above) after the end of the string.
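If you want to verify the steps above, here is a small sketch that inspects the last two bytes of the output. It assumes a POSIX shell with tail, od and tr available. One assumption of mine: instead of \r in the replacement (which, as far as I know, is a GNU sed extension), it passes a literal carriage return held in a variable I named cr, so it should behave the same with other seds.

```shell
# Hold a literal carriage return; "cr" is just an illustrative name.
cr=$(printf '\r')

# Run the substitution on one line and dump the last two bytes of the output.
# od -c prints the carriage return as \r and the newline as \n.
last_two=$(printf 'hello, world\n' | sed "s/\$/$cr/" | tail -c 2 | od -An -c | tr -d ' \n')
printf '%s\n' "$last_two"
# prints: \r\n
```

The \r comes from the substitution and the \n is the newline sed adds back when it prints the line.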

Upvotes: 3

John Bollinger

Reputation: 180286

The premise of your question is flawed. The sed command you present converts Linux-style line terminators (newline alone) to Windows-style (carriage-return / newline), not the other way around.

It works like this:

  • the $ is a regex metacharacter that matches the zero-width end of the line (i.e. just prior to the line terminator, if any).
  • the substitution string is a carriage return character (expressed as \r); it replaces the zero-width character sequence matched by the regex, in effect inserting the carriage return immediately before the newline

The trailing g in the sed command specifies that all matches in each line should be replaced; it is superfluous because there cannot be more than one match per line.
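To check that the g flag makes no difference here, a rough sketch such as the following compares the two forms byte for byte. It assumes a POSIX shell with od and tr; the variable names are mine, and a literal carriage return is used in place of \r so that it does not depend on GNU sed.

```shell
# Literal carriage return; "cr" is an illustrative name.
cr=$(printf '\r')

# Same two-line input, with and without the g flag, dumped as visible bytes.
without_g=$(printf 'one\ntwo\n' | sed "s/\$/$cr/"  | od -An -c | tr -d ' \n')
with_g=$(printf 'one\ntwo\n' | sed "s/\$/$cr/g" | od -An -c | tr -d ' \n')

printf '%s\n' "$without_g"
# prints: one\r\ntwo\r\n
[ "$without_g" = "$with_g" ] && echo "identical output"
```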

Note also that this can be slightly quirky: if the input file does not end with a newline, then the output will end with just \r, because the end of the file is then the end of the last line.
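The quirk can be probed with a sketch like this one, which looks at the final byte of the output when the input has no trailing newline. Treat it as an experiment rather than a guarantee: as far as I know GNU sed leaves the missing newline missing (so the last byte is \r), while some other sed implementations append a newline anyway. The variable names are illustrative.

```shell
# Literal carriage return; "cr" is an illustrative name.
cr=$(printf '\r')

# Input deliberately lacks a final newline; dump the last byte of the output.
last_byte=$(printf 'no trailing newline' | sed "s/\$/$cr/" | tail -c 1 | od -An -c | tr -d ' \n')
printf '%s\n' "$last_byte"
```

If this prints \r, the output file ends with a bare carriage return, which is the case described above.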

Upvotes: 0

Maroun

Reputation: 95968

$ matches the end of line, so the command:

sed 's/$/\r/g'

simply adds \r to the end of each line, which is not what you say. If the input is "hello, world \n", the output would be "hello, world \r\n".

Upvotes: 0
