Reputation: 59
Here is the data I want to capitalize:
molly w. bolt 334-78-5443
walter q. bugg 984-49-0032
noah p. way 887-12-0921
kerry t. bricks 431-09-1239
ping h. yu 109-32-9845
Here is the script I have written so far to capitalize the first letter of each name, including the initial:
h
s/\(.\).*/\1/
y/abcdefghijklmnopqrstuvwxyz/ABCDEFGHIJKLMNOPQRSTUVWXYZ/
G
s/\(.\)\n\(.\)\(.*\)/\1\3/
/ [a-z]/{
h
s/\([A-Z][a-z]* \)\([a-z]\).*/\2/
y/abcdefghijklmnopqrstuvwxyz/ABCDEFGHIJKLMNOPQRSTUVWXYZ/
G
s/\(.\)\n\([A-Z][a-z]* \)\(.\)\(.*\)/\2\1\4/
}
/ [a-z]/{
h
s/\([A-Z][a-z]* \)\([a-z]\).*/\2/
y/abcdefghijklmnopqrstuvwxyz/ABCDEFGHIJKLMNOPQRSTUVWXYZ/
G
s/\(.\)\n\([A-Z][a-z]* \)\(.\)\(.*\)/\2\1\4/
}
It gives me:
MOLLY W. BOLT 334-78-544Molly 3. bolt 334-78-5443
WALTER Q. BUGG 984-49-003Walter 2. bugg 984-49-0032
NOAH P. WAY 887-12-092Noah 1. way 887-12-0921
KERRY T. BRICKS 431-09-123Kerry 9. bricks 431-09-1239
PING H. YU 109-32-984Ping 5. yu 109-32-9845
I want to only have:
Molly W. Bolt 334-78-544
Walter Q. Bugg 984-49-003
Noah P. Way 887-12-092
Kerry T. Bricks 431-09-123
Ping H. Yu 109-32-984
What would I change?
Upvotes: 3
Views: 3047
Reputation: 10039
sed 's/^/ /;s/ [aA]/ A/g;s/ [bB]/ B/g;s/ [cC]/ C/g;s/ [dD]/ D/g;s/ [eE]/ E/g;s/ [fF]/ F/g;s/ [gG]/ G/g;s/ [hH]/ H/g;s/ [iI]/ I/g;s/ [jJ]/ J/g;s/ [kK]/ K/g;s/ [lL]/ L/g;s/ [mM]/ M/g;s/ [nN]/ N/g;s/ [oO]/ O/g;s/ [pP]/ P/g;s/ [qQ]/ Q/g;s/ [rR]/ R/g;s/ [sS]/ S/g;s/ [tT]/ T/g;s/ [uU]/ U/g;s/ [vV]/ V/g;s/ [wW]/ W/g;s/ [xX]/ X/g;s/ [yY]/ Y/g;s/ [zZ]/ Z/g;s/^.//' YourFile
This is a POSIX version (no GNU sed required).
It works on your sample, but not on input like {andrea,georges ...
because it assumes that a word is either at the start of the line or right after a space character.
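Staying with POSIX sed, the script from the question can also be repaired directly. Its third block fails because `\([A-Z][a-z]* \)\([a-z]\)` needs a capitalized word immediately followed by a lowercase letter, and after the second block the line reads `Molly W. bolt ...`, so nothing matches. The `s` command then leaves the line untouched, `y` uppercases all of it, and `G` appends the saved copy, which yields the doubled output. A minimal sketch of a repair, assuming every line has the shape `first initial. last number` as in the sample (the function name `capitalize_names` is made up for illustration):

```shell
# Hypothetical wrapper: reads lines on stdin, writes capitalized lines.
capitalize_names() {
  sed '
# Block 1 (unchanged): capitalize the first letter of the line.
h
s/\(.\).*/\1/
y/abcdefghijklmnopqrstuvwxyz/ABCDEFGHIJKLMNOPQRSTUVWXYZ/
G
s/\(.\)\n\(.\)\(.*\)/\1\3/
# Block 2 (unchanged): capitalize the initial after the first name.
/ [a-z]/{
h
s/\([A-Z][a-z]* \)\([a-z]\).*/\2/
y/abcdefghijklmnopqrstuvwxyz/ABCDEFGHIJKLMNOPQRSTUVWXYZ/
G
s/\(.\)\n\([A-Z][a-z]* \)\(.\)\(.*\)/\2\1\4/
}
# Block 3 (changed): skip two space-separated words, then
# capitalize the first letter of the third one.
/ [a-z]/{
h
s/^[^ ]* [^ ]* \([a-z]\).*/\1/
y/abcdefghijklmnopqrstuvwxyz/ABCDEFGHIJKLMNOPQRSTUVWXYZ/
G
s/^\(.\)\n\([^ ]* [^ ]* \)[a-z]\(.*\)/\2\1\3/
}
'
}

printf '%s\n' 'molly w. bolt 334-78-5443' | capitalize_names
```

Only the two `s` commands in the third block differ from the original. The script is still tied to exactly three name parts per line.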
Upvotes: 1
Reputation: 63932
(GNU) sed, which should work with UTF-8 too:
sed -E 's/[[:alpha:]]+/\u&/g'
#or
sed -E 's/\S+/\u&/g'
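The two patterns differ in what they treat as a word, which matters once the input contains punctuation. A small sketch, assuming GNU sed (both `\u` and `\S` are GNU extensions); the function names and the sample string are invented for illustration:

```shell
# Capitalize the first letter of every run of letters.
cap_alpha() { sed -E 's/[[:alpha:]]+/\u&/g'; }

# Capitalize the first character of every run of non-space characters.
cap_nonspace() { sed -E 's/\S+/\u&/g'; }

printf '%s\n' "o'neil (molly)" | cap_alpha      # O'Neil (Molly)
printf '%s\n' "o'neil (molly)" | cap_nonspace   # O'neil (molly)
```

With `\S+` the first character of `(molly)` is the parenthesis, so uppercasing it changes nothing. With `[[:alpha:]]+` every letter run counts as a separate word, so the letter after the apostrophe is capitalized as well.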
Or perl
perl -pe 's/(\w+)/\u$1/g'
Here each word (\w+) is substituted (s///) by itself ($1) with an uppercase first character (\u), for every match on the line (g).
Or the simpler:
perl -pe 's/\S+/\u$&/g'
The variant
perl -CSDA -pe 's/\S+/\u$&/g'
will work with UTF-8 encoded files too, e.g. from the input
павел андреевич чехов 234
γεοργε πατσασογλοθ 123
čajka šumivá 345
will print
Павел Андреевич Чехов 234
Γεοργε Πατσασογλοθ 123
Čajka Šumivá 345
For in-place file editing use:
perl -i.bak -CSDA -pe 's/\S+/\u$&/g' some filenames ....
This creates a .bak (backup) copy of each edited file.
If you have bash 4.2+ and only need to convert the values of variables, you can use:
for name in павел андреевич чехов γεοργε πατσασογλοθ čajka šumivá
do
echo "${name^}" #capitalize the $name
done
prints
Павел
Андреевич
Чехов
Γεοργε
Πατσασογλοθ
Čajka
Šumivá
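The same expansion can be applied to a whole line by splitting it into an array first, because bash applies `^` to each element when the array is subscripted with `@` or `*`. A sketch, assuming bash 4 or newer and default `IFS`; the function name `capitalize_words` is hypothetical:

```shell
#!/usr/bin/env bash
# Capitalize the first character of each whitespace-separated word in $1.
capitalize_words() {
  local -a words
  # Split the argument on whitespace into the array.
  read -r -a words <<< "$1"
  # ${words[*]^} uppercases the first character of every element,
  # then joins the elements with a single space.
  printf '%s\n' "${words[*]^}"
}

capitalize_words 'molly w. bolt 334-78-5443'
```

Runs of spaces or tabs in the input collapse to single spaces, since the line is re-joined from the array.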
Also, here is a solution for a sed that doesn't know \u:
https://stackoverflow.com/a/11804643/632407
Upvotes: 4
Reputation: 14955
Quite simple with Python too:
$ python -c 'with open("myfile") as f: print(f.read().title())'
https://docs.python.org/2/library/stdtypes.html
Upvotes: 2
Reputation: 65791
How about this (GNU sed):
$ sed 's/\b[a-z]/\u&/g' myfile
Molly W. Bolt 334-78-5443
Walter Q. Bugg 984-49-0032
Noah P. Way 887-12-0921
Kerry T. Bricks 431-09-1239
Ping H. Yu 109-32-9845
Upvotes: 6