Reputation: 2777
How do I grep a UTF-8 text file for lines containing any character outside ASCII, except for a select few characters, e.g. [æÆøØåÅ]?
So the following three lines:
ABC
ÆØÅ
ABC-ÆØÅ 😃
Should yield:
ABC-ÆØÅ 😃
Because the smiley is outside ASCII and does not belong to the extra ignored characters.
Upvotes: 1
Views: 404
Reputation: 2777
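GNU grep supports UTF-8 through its -P (PCRE) option, provided the locale is UTF-8. The following solves the problem on OS X, where Homebrew installs GNU grep as ggrep. (The homebrew/dupes tap has since been deprecated; on current Homebrew the formula is plain grep.)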
brew install homebrew/dupes/grep
ggrep -P '[^\x00-\x7FæÆøØåÅ]' *.txt
Upvotes: 0
Reputation: 33618
grep doesn't support UTF-8. Try Perl:
perl -CSD -Mutf8 -ne 'print if /[^\x00-\x7FæÆøØåÅ]/' [FILE...]
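-CSD enables UTF-8 on STDIN, STDOUT, and STDERR and makes it the default layer for opened files. -Mutf8 tells Perl that the source code itself (here, the pattern) is UTF-8.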
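If neither GNU grep nor Perl is at hand, the same character class can be sketched in Python 3. This is only an illustration of the pattern above, not a drop-in replacement for grep; the names DISALLOWED and lines_with_disallowed are made up for this example.

```python
import re

# Same character class as in the commands above: match any character that is
# neither ASCII (U+0000..U+007F) nor one of the explicitly tolerated letters.
DISALLOWED = re.compile(r"[^\x00-\x7FæÆøØåÅ]")


def lines_with_disallowed(lines):
    """Return the lines that contain at least one disallowed character.

    `lines` is any iterable of str (hypothetical helper for illustration).
    """
    return [line for line in lines if DISALLOWED.search(line)]
```

Passing the three lines from the question should return only the one with the smiley. To scan a file, the function can be fed a file object opened with open(path, encoding="utf-8"). One caveat: the match is per code point, so a letter stored in decomposed form (for example "a" followed by a combining ring instead of a single "å") would still be flagged.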
Upvotes: 1