Naitree

Reputation: 1148

Using the `sub()` function in awk causes repetitive replacement behavior

Suppose I have this /etc/crontab file example:

0 0 1 * * ntpdate -s pool.ntp.org && hwclock -w

What I want to achieve is to replace this line with another ntpdate cronjob, like the following:

0 0 0 * * ntpdate -s pool.ntp.org && hwclock -w

If the original ntpdate line doesn't exist, the second line should simply be appended to the end of the crontab file.

Therefore, I tried it with awk:

awk -v cronjob='0 0 0 * * ntpdate -s pool.ntp.org && hwclock -w' '/ntpdate/ { sub(/^.*$/,cronjob,$0);found=1; }; { print $0 }; END {if(!found) print cronjob}' /etc/crontab

which leads to the following (certainly wrong) repetitive replacement:

0 0 0 * * ntpdate -s pool.ntp.org 0 0 1 * * ntpdate -s pool.ntp.org && hwclock -w0 0 1 * * ntpdate -s pool.ntp.org && hwclock -w hwclock -w

What is wrong with my awk script? I must have misunderstood something, but I cannot figure out where.

Any help is appreciated. Thank you.

Upvotes: 2

Views: 68

Answers (1)

Jonathan Leffler

Reputation: 753970

Succinctly, & is a metacharacter in replacement strings; it means 'whatever you matched'. That's why you're getting the repetition.
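
To see the metacharacter in isolation, here is a minimal sketch, assuming a POSIX-style awk. The input `hello` and the pattern `/l+/` are invented for the demonstration, and the replacement strings are written as literals inside the awk program rather than passed with `-v`:

```shell
# An unescaped "&" in the replacement stands for the text that matched.
out_meta=$(echo 'hello' | awk '{ sub(/l+/, "[&]"); print }')
echo "$out_meta"       # he[ll]o

# In a string literal inside the program, "\\&" produces a literal ampersand.
out_literal=$(echo 'hello' | awk '{ sub(/l+/, "[\\&]"); print }')
echo "$out_literal"    # he[&]o
```

In the question's script, the two ampersands in `&&` each expand to the whole matched line, which accounts for the line appearing twice in the middle of the output.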

The next issue is "how to avoid it". The answer seems to be two pairs of backslashes:

awk -v cronjob='0 0 0 * * ntpdate -s pool.ntp.org \\&\\& hwclock -w' \
    '/ntpdate/ { sub(/^.*$/,cronjob,$0); found=1; } { print $0 }
     END {if(!found) print cronjob}' /etc/crontab

I was expecting a single backslash to be sufficient, but in my testing (using both BSD awk and GNU awk on Mac OS X 10.10.4), I needed a double backslash. My expectation ties in with Naitree's experience, but I'm not sure why I needed the extra backslashes and he didn't. In the interim, take whichever option works for you: try single backslashes first, and if that doesn't work, try double backslashes instead.

When I tried this on an Ubuntu 14.04 LTS VM, I found that awk was indeed mawk 1.3.3 Nov 1996, and the single backslash was sufficient. Ouch! My suspicion is that BSD awk and GNU awk hew closer to the POSIX standard in this than mawk does, if only because they're a decade or so newer (for me, awk --version for BSD awk yields awk version 20070501, while gawk --version yields GNU Awk 3.1.7 and an end copyright date of 2009).

With the single \&, gawk reports:

gawk -v cronjob='0 0 0 * * ntpdate -s pool.ntp.org \&\& hwclock -w' \
     '/ntpdate/ { sub(/^.*$/,cronjob,$0);found=1; } { print $0 }
      END {if(!found) print cronjob}' /dev/null
gawk: warning: escape sequence `\&' treated as plain `&'
0 0 0 * * ntpdate -s pool.ntp.org && hwclock -w

Note the warning; it appeared on both Ubuntu and Mac OS X. That run shows the 'addition' mode, since /dev/null doesn't contain a line that matches. If you save that output in a file x1 and then run the same command line on x1 instead of /dev/null, you get the original repeated behaviour:

0 0 0 * * ntpdate -s pool.ntp.org 0 0 0 * * ntpdate -s pool.ntp.org && hwclock -w0 0 0 * * ntpdate -s pool.ntp.org && hwclock -w hwclock -w

Alternatively

As pii_ke suggests in a comment, a simpler technique is probably:

awk -v cronjob='0 0 0 * * ntpdate -s pool.ntp.org && hwclock -w' \
    '/ntpdate/ { next } { print $0 } END {print cronjob}' /etc/crontab

This deletes the original line and simply adds the new one at the end of the output. This worked sanely with all three variants of awk.
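
If the new line needs to stay where the old one was, another way to evade the escaping problem is to assign to `$0` instead of calling `sub()`, since a plain assignment gives `&` no special meaning. This is a sketch under assumptions, not something taken from the answers above: the temporary file and its `# sample` line are hypothetical stand-ins for /etc/crontab, and the `crontab_sample` and `result` variable names are made up for the example.

```shell
# Hypothetical sample input standing in for /etc/crontab.
crontab_sample=$(mktemp)
printf '%s\n' '# sample' '0 0 1 * * ntpdate -s pool.ntp.org && hwclock -w' > "$crontab_sample"

result=$(awk -v cronjob='0 0 0 * * ntpdate -s pool.ntp.org && hwclock -w' '
    /ntpdate/ { $0 = cronjob; found = 1 }   # overwrite the record; no sub(), so "&" is not special
    { print }
    END { if (!found) print cronjob }       # append only when no line matched
' "$crontab_sample")

echo "$result"
rm -f "$crontab_sample"
```

The matching line is replaced in place, and the `END` rule still appends the cronjob when no ntpdate line exists.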

YMMV — Your mileage may vary; you were warned. These horribly subtle differences are the sort of thing that can drive you insane if you don't simply evade the problem.

Upvotes: 3
