Reputation: 22988
Is this a bug, or am I doing something wrong? I am trying to match Russian swear words in a multiplayer game chat log on CentOS 6.5 with the stock Perl 5.10.1:
# echo блядь | perl -ne 'print if /\bбля/'
# echo блядь | perl -ne 'print if /бля/'
блядь
# echo $LANG
en_US.UTF-8
Why doesn't the first command print anything?
Upvotes: 3
Views: 131
Reputation: 241908
You have to tell Perl that the source code contains UTF-8 (use utf8), and that the input (-CI) and output (-CO) are UTF-8 encoded:
echo 'помёт' | perl -CIO -ne 'use utf8; print if /\bпомё/'
Upvotes: 4