GregB
GregB

Reputation: 5725

Capitalize strings in sed or awk

I have three types of strings that I'd like to capitalize in a bash script. I figured sed/awk would be my best bet, but I'm not sure. What's the best way given the following requirements?

  1. single word
    e.g. taco -> Taco

  2. multiple words separated by hyphens
    e.g. my-fish-tacos -> My-Fish-Tacos

  3. multiple words separated by underscores
    e.g. my_fish_tacos -> My_Fish_Tacos

Upvotes: 22

Views: 14495

Answers (7)

Jody Frankowski
Jody Frankowski

Reputation: 7

For zsh users:

str="test   spaceS-dasheS_underscoreS"
echo "${(C)str}"
Test   Spaces-Dashes_Underscores

Upvotes: 0

alinsoar
alinsoar

Reputation: 15793

Here is a solution that does not use the \u, that is not common to all seds.

Save this file into capitalize.sed, then run sed -i -f capitalize.sed FILE

s:^:.:
h
y/qwertyuiopasdfghjklzxcvbnm/QWERTYUIOPASDFGHJKLZXCVBNM/ 
G 
s:$:\n:
:r
/^.\n.\n/{s:::;p;d}
/^[^[:alpha:]][[:alpha:]]/ {
    s:.\(.\)\(.*\):x\2\1: 
    s:\n\(..\):\nx: 
    tr
}

/^[[:alpha:]][[:alpha:]]/ {
    s:\n.\(.\)\(.*\)$:\nx\2\1:
    s:..:x:
    tr
}
/^[^\n]/ {
    s:^.\(.\)\(.*\)$:.\2\1:
    s:\n..:\n.:
    tr
}

Upvotes: 3

Neale Pickett
Neale Pickett

Reputation: 306

alinsoar's mind-blowing solution doesn't work at all in Plan9 sed, or correctly in busybox sed. But you should still try to figure out how it's supposed to do its thing: you will learn a lot about sed.

Here's a not-as-clever but easier to understand version which works in at least Plan9, busybox, and GNU sed (and probably BSD and MacOS). Plan9 sed needs backslashes removed in the match part of the s command.

#! /bin/sed -f

y/PYFGCRLAOEUIDHTNSQJKXBMWVZ/pyfgcrlaoeuidhtnsqjkxbmwvz/

s/\(^\|[^A-Za-z]\)a/\1A/g
s/\(^\|[^A-Za-z]\)b/\1B/g
s/\(^\|[^A-Za-z]\)c/\1C/g
s/\(^\|[^A-Za-z]\)d/\1D/g
s/\(^\|[^A-Za-z]\)e/\1E/g
s/\(^\|[^A-Za-z]\)f/\1F/g
s/\(^\|[^A-Za-z]\)g/\1G/g
s/\(^\|[^A-Za-z]\)h/\1H/g
s/\(^\|[^A-Za-z]\)i/\1I/g
s/\(^\|[^A-Za-z]\)j/\1J/g
s/\(^\|[^A-Za-z]\)k/\1K/g
s/\(^\|[^A-Za-z]\)l/\1L/g
s/\(^\|[^A-Za-z]\)m/\1M/g
s/\(^\|[^A-Za-z]\)n/\1N/g
s/\(^\|[^A-Za-z]\)o/\1O/g
s/\(^\|[^A-Za-z]\)p/\1P/g
s/\(^\|[^A-Za-z]\)q/\1Q/g
s/\(^\|[^A-Za-z]\)r/\1R/g
s/\(^\|[^A-Za-z]\)s/\1S/g
s/\(^\|[^A-Za-z]\)t/\1T/g
s/\(^\|[^A-Za-z]\)u/\1U/g
s/\(^\|[^A-Za-z]\)v/\1V/g
s/\(^\|[^A-Za-z]\)w/\1W/g
s/\(^\|[^A-Za-z]\)x/\1X/g
s/\(^\|[^A-Za-z]\)y/\1Y/g
s/\(^\|[^A-Za-z]\)z/\1Z/g

Upvotes: 1

potong
potong

Reputation: 58430

This might work for you (GNU sed):

echo "aaa bbb ccc aaa-bbb-ccc aaa_bbb_ccc aaa-bbb_ccc"  | sed 's/\<.\|_./\U&/g'
Aaa Bbb Ccc Aaa-Bbb-Ccc Aaa_Bbb_Ccc Aaa-Bbb_Ccc

Upvotes: 0

Dennis Williamson
Dennis Williamson

Reputation: 360143

There's no need to use capture groups (although & is a one in a way):

echo "taco my-fish-tacos my_fish_tacos" | sed 's/[^ _-]*/\u&/g'

The output:

Taco My-Fish-Tacos My_Fish_Tacos

The escaped lower case "u" capitalizes the next character in the matched sub-string.

Upvotes: 33

Sergii Stotskyi
Sergii Stotskyi

Reputation: 5390

Using awk:

echo 'test' | awk '{
     for ( i=1; i <= NF; i++) {
         sub(".", substr(toupper($i), 1,1) , $i);
         print $i;
         # or
         # print substr(toupper($i), 1,1) substr($i, 2);
     }
}'

Upvotes: 8

Andrew Clark
Andrew Clark

Reputation: 208495

Try the following:

sed 's/\([a-z]\)\([a-z]*\)/\U\1\L\2/g'

It works for me using GNU sed, but I don't think BSD sed supports \U and \L.

Upvotes: 6

Related Questions