Reputation: 51717
I'm trying to use sed to extract text from a log file. I can do a search-and-replace without too much trouble:
sed 's/foo/bar/' mylog.txt
However, I want to make the search case-insensitive. From what I've googled, it looks like appending i to the end of the command should work:
sed 's/foo/bar/i' mylog.txt
However, this gives me an error message:
sed: 1: "s/foo/bar/i": bad flag in substitute command: 'i'
What's going wrong here, and how do I fix it?
Upvotes: 127
Views: 104279
Reputation: 437090
Update: Starting with macOS Big Sur (11.0), sed now does support the I flag for case-insensitive matching, so the command in the question should now work (BSD sed doesn't report its version, but you can go by the date at the bottom of the man page, which should be March 27, 2017 or more recent); a simple example:
# BSD sed on macOS Big Sur and above (and GNU sed, the default on Linux)
$ sed 's/ö/@/I' <<<'FÖO'
F@O # `I` matched the uppercase Ö correctly against its lowercase counterpart
Note: I (uppercase) is the documented form of the flag, but i works as well.
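Since BSD sed does not report a version, one way to find out whether the installed sed accepts the I flag is simply to try it. A minimal sketch; the function name sed_supports_I is made up for illustration and is not part of any tool:

```shell
# Hypothetical helper: succeeds if the sed on PATH accepts the I flag
# of the s command and matches case-insensitively with it.
sed_supports_I() {
  [ "$(printf 'A\n' | sed 's/a/b/I' 2>/dev/null)" = "b" ]
}

if sed_supports_I; then
  echo "sed supports s///I"
else
  echo "sed lacks s///I; fall back to perl or GNU sed"
fi
```

The probe only tests an ASCII letter, so it says nothing about whether non-ASCII characters such as Ö are matched correctly.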
Similarly, starting with macOS Big Sur (11.0), awk is now locale-aware (awk --version should report 20200816 or more recent):
# BSD awk on macOS Big Sur and above (and GNU awk, the default on Linux)
$ awk '{ print tolower($0) }' <<<'FÖO'
föo # non-ASCII character Ö was properly lowercased
The following applies to macOS up to Catalina (10.15):
To be clear: On macOS, sed - which is the BSD implementation - does NOT support case-insensitive matching - hard to believe, but true. The formerly accepted answer, which itself shows a GNU sed command, gained that status because of the perl-based solution mentioned in the comments.
To make that Perl solution work with foreign characters as well, via UTF-8, use something like:
perl -C -Mutf8 -pe 's/öœ/oo/i' <<< "FÖŒ" # -> "Foo"
-C turns on UTF-8 support for streams and files, assuming the current locale is UTF-8-based.
-Mutf8 tells Perl to interpret the source code as UTF-8 (in this case, the string passed to -pe); this is the shorter equivalent of the more verbose -e 'use utf8;'. Thanks, Mark Reed.
(Note that using awk is not an option either, as awk on macOS (i.e., BWK awk and BSD awk) appears to be completely unaware of locales altogether: its tolower() and toupper() functions ignore foreign characters (and sub() / gsub() don't have case-insensitivity flags to begin with).)
A note on the relationship of sed and awk to the POSIX standard:
BSD sed and awk limit their functionality mostly to what the POSIX sed and POSIX awk specs mandate, whereas their GNU counterparts implement many more extensions.
Upvotes: 95
Reputation: 526
I had a similar need and came up with the following.
This command simply finds all the files:
grep -i -l -r foo ./*
This one excludes this_shell.sh (in case you put the command in a script called this_shell.sh), tees the output to the console so you can see what happened, and then runs sed on each file name found to replace the text foo with bar:
grep -i -l -r --exclude "this_shell.sh" foo ./* | tee /dev/fd/2 | while read -r x; do sed -b -i 's/foo/bar/gi' "$x"; done
I chose this method because I didn't like having the timestamps changed on files that were not modified. Feeding sed the grep result means only files containing the target text are touched (which likely improves performance as well).
Be sure to back up your files and test before using. May not work in some environments for files with embedded spaces. (?)
Upvotes: 0
Reputation: 1139
The Mac version of sed seems a bit limited. One way to work around this is to use a Linux container (via Docker) that has a usable version of sed:
cat your_file.txt | docker run -i busybox /bin/sed -r 's/[0-9]{4}/****/Ig'
Upvotes: 3
Reputation: 51
Use the following to replace all occurrences:
sed 's/foo/bar/gI' mylog.txt
Upvotes: 5
Reputation: 52102
The sed FAQ addresses the closely related case-insensitive search. It points out that a) many versions of sed support a flag for it and b) it's awkward to do in sed, so you should rather use awk or Perl.
But to do it in POSIX sed, they suggest three options (adapted for substitution here):
Convert to uppercase and store the original line in the hold space; this won't work for substitutions, though, as the original content is restored before printing, so it's only good for inserting or appending lines based on a case-insensitive match.
Maybe the possibilities are limited to FOO, Foo and foo. These can be covered by
s/FOO/bar/;s/[Ff]oo/bar/
To search for all possible matches, one can use bracket expressions for each character:
s/[Ff][Oo][Oo]/bar/
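Writing the bracket expressions by hand gets tedious for longer words, so they can be generated. A minimal sketch using POSIX awk; the function name ci_pattern is made up for illustration, and it only handles ASCII letters:

```shell
# Hypothetical helper: turn a word into a case-insensitive POSIX regex,
# e.g. foo -> [Ff][Oo][Oo]. Non-letters are copied unchanged, so any
# regex metacharacters in the word keep their special meaning.
ci_pattern() {
  printf '%s\n' "$1" | awk '{
    out = ""
    for (i = 1; i <= length($0); i++) {
      c = substr($0, i, 1)
      if (c ~ /[A-Za-z]/)
        out = out "[" toupper(c) tolower(c) "]"
      else
        out = out c
    }
    print out
  }'
}

pattern=$(ci_pattern foo)
printf 'Foo FOO foo\n' | sed "s/$pattern/bar/g"
# prints: bar bar bar
```

Because the generated pattern uses only bracket expressions, the resulting sed command should work with BSD sed as well as GNU sed.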
Upvotes: 7
Reputation: 81
If you are doing pattern matching first, e.g.,
/pattern/s/xx/yy/g
then you want to put the I after the pattern:
/pattern/Is/xx/yy/g
Example:
echo Fred | sed '/fred/Is//willma/g'
returns willma; without the I, it returns the string untouched (Fred).
Upvotes: 8
Reputation: 2831
Editor's note: This solution doesn't work on macOS (out of the box), because it only applies to GNU sed, whereas macOS comes with BSD sed.
Capitalize the 'I'.
sed 's/foo/bar/I' file
Upvotes: 92
Reputation: 273
Not a direct answer, but in some contexts it's okay to pipe the whole thing through tr A-Z a-z to lowercase the entire stream.
Sure, you lose the uppercase letters, but that loss may be offset by simplifying other parts of the pipeline. Numbers and dates/times are unaffected too, and the output stream is going to compress better as well. Email addresses are generally treated as case-insensitive, so that doesn't matter.
One downside is that case-sensitive identifiers might become awkward. Sendmail logs would be less useful this way.
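As a rough illustration of that pipeline (the sample input is made up; the character classes are used instead of A-Z a-z so the ranges don't depend on the locale's collation order):

```shell
# Lowercase the whole stream first, then run an ordinary
# case-sensitive substitution on the result.
printf 'Foo Started\nFOO stopped\n' | tr '[:upper:]' '[:lower:]' | sed 's/foo/bar/'
# prints:
# bar started
# bar stopped
```

Note that the rest of each line is lowercased too (Started becomes started), which is the trade-off described above.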
Upvotes: 0
Reputation:
Another work-around for sed on Mac OS X is to install gsed from MacPorts or Homebrew and then create the alias sed='gsed'.
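Aliases are not expanded in non-interactive bash scripts by default, so inside a script it may be easier to pick the binary through a variable. A minimal sketch; the variable name SED is an arbitrary choice:

```shell
# Prefer gsed (GNU sed as installed by MacPorts/Homebrew) when present;
# otherwise fall back to whatever sed is first on PATH.
if command -v gsed >/dev/null 2>&1; then
  SED=gsed
else
  SED=sed
fi

# The I flag works here if $SED is GNU sed or a recent macOS sed.
printf 'FOO\n' | "$SED" 's/foo/bar/I'
```

On a Mac without gsed and with an older BSD sed, the last command still fails with the error from the question.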
Upvotes: 30