Marta Koprivnik

Reputation: 405

Why does tr give me a doubled character when I use special characters?

I have the following problem:

$ echo ača | tr 'č' 'c'
acca

Why does it give me a double "c"? How can I solve that? I want aca, not acca.

Upvotes: 1

Views: 253

Answers (1)

Andreas Louv

Reputation: 47119

č is a single Unicode code point, but it is two bytes long when encoded as UTF-8 (0xC4 0x8D):

charinfo č
U+010D LATIN SMALL LETTER C HACEK [Ll]

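tr (at least GNU tr) is not multibyte-aware, so it will see č as two characters of one byte each. When the second argument is shorter than the first, it extends the second argument by repeating its last character until every character of the first has a replacement. Both bytes of č are therefore mapped to c, which gives two c's.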
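To make the byte-level view concrete, here is a small sketch that spells č out as its two UTF-8 bytes in octal (\304 \215) instead of typing the character. It assumes GNU tr and a POSIX printf, and the variable names are only illustrative:

```shell
# č (U+010D) is encoded in UTF-8 as the two bytes 0xC4 0x8D (octal \304 \215).
char_bytes=$(printf '\304\215' | od -An -tx1 | tr -d ' \n')
byte_count=$(printf '\304\215' | wc -c | tr -d ' ')
echo "bytes: $char_bytes, count: $byte_count"

# Byte-level equivalent of the command from the question: the first set
# holds two bytes, the second set holds one, so tr repeats 'c' to cover both.
result=$(printf 'a\304\215a\n' | LC_ALL=C tr '\304\215' 'c')
echo "$result"
```

With GNU tr this should reproduce the acca from the question, since each of the two bytes is replaced by its own c.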

You could use sed, which handles multibyte characters in a UTF-8 locale (this might work only with GNU sed):

echo ača | sed 'y/č/c/'

Or Perl:

echo ača | perl -pe 'use open qw(:std :utf8);use utf8;y/č/c/'
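If you would rather not depend on the locale settings, another possible approach is to replace the whole two-byte sequence at once. This is only a sketch: it assumes the input is UTF-8, and `c_caron` and `fixed` are made-up variable names:

```shell
# Build the two-byte UTF-8 sequence for č with printf, then let sed
# replace the whole sequence with a single c. LC_ALL=C makes sed treat
# the input as plain bytes.
c_caron=$(printf '\304\215')
fixed=$(printf 'a\304\215a\n' | LC_ALL=C sed "s/$c_caron/c/g")
echo "$fixed"
```

Because the two bytes are matched together as one pattern, they collapse into a single c instead of one c per byte.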

Consider this example, which may help you see what's happening:

% echo abc | tr 'abc' 'de'
dee

Upvotes: 4
