Reputation: 144

How to replace text with respect to its capitalization?

Here is the text in text.txt:

"word1 Word2 word3"

Now, I would like to have this output:

"nword1 Nword2 nword3"

What I have done so far:

sed -e 's/word1/nword1/gI' text.txt
sed -e 's/word2/nword2/gI' text.txt
sed -e 's/word3/nword3/gI' text.txt

The problem is that I do not know which words are capitalized, so I have to write the pattern in lower case (as in "sed -e 's/word3/nword3/gI' text.txt") and match case-insensitively.

So basically, I would like to replace words while respecting the capitalization of the original text. How can I do this in a bash script?

Upvotes: 2

Answers (7)

codeforester

Reputation: 42999

You can use awk for this:

awk '{for(i=1; i<=NF; i++) { if ($i ~ /^[[:lower:]]/) {$i = "n"$i} else {$i = "N"$i}}}1' file

For your test case, it outputs:

nword1 NWord2 nword3

It will work irrespective of how many words you have on each line.

Upvotes: 0

James Brown

Reputation: 37404

In awk:

$ awk -v f="n" '
{
    for(i=1;i<=NF;i++) 
        sub(/^./, ((c=substr($i,1,1))~/[[:upper:]]/?toupper(f):f) tolower(c),$i)
} 1' file

Of course, you can pipe from echo to the script as well. Explained:

awk -v f="n" the character to prepend is passed in as the variable f
for(i=1;i<=NF;i++) iterate through every word in the record
sub(/^./, ( replace the first character of the word with
(c=substr($i,1,1))~/[[:upper:]]/?toupper(f):f) tolower(c), the first character of the word is stored in c; if it is a capital, produce the capitalized f (otherwise f as given), followed by the lowercased c
$i) applied to every word

Edit: As commented, but untested.
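To illustrate the piping mentioned above, here is a minimal sketch that wraps the same idea in a shell function. The function name prefix_words is made up for this example, and the uppercase test compares the first character with its lowercased form instead of using [[:upper:]], on the assumption that this also runs on awk builds without POSIX character classes.

```shell
# prefix_words PREFIX  (hypothetical helper, not part of the answer above)
# Reads text on stdin and prepends PREFIX to every word; when a word starts
# with a capital letter, the capital moves to the prefix instead.
prefix_words() {
    awk -v f="$1" '
    {
        for (i = 1; i <= NF; i++) {
            c = substr($i, 1, 1)                      # first character of the word
            p = (c != tolower(c)) ? toupper(f) : f    # capitalize the prefix if c is uppercase
            $i = p tolower(c) substr($i, 2)           # prefix + lowered first char + rest
        }
    } 1'
}

echo "word1 Word2 word3" | prefix_words n
```

With the sample input from the question this should print nword1 Nword2 nword3. Note that assigning to $i makes awk rebuild the line, so runs of whitespace collapse to single spaces.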

Upvotes: 0

RomanPerekhrest

Reputation: 92854

AWK solution:

awk '{for(i=1;i<=NF;i++){printf "%s%s%s",(($i~/\<[[:lower:]]/)?"n":"N"),tolower($i),(i<NF?FS:ORS);}}' text.txt

The output:

nword1 Nword2 nword3

Explanation:

for(i=1;i<=NF;i++) - iterates through all fields/columns (i.e. words)

$i~/\<[[:lower:]]/ - checks whether a field/word starts with a lowercase letter.
\< is a gawk regex operator that matches the empty string at the beginning of a word. For example, /\<away/ matches ‘away’ but not ‘stowaway’.

tolower($i) - converts a word into lowercase

Upvotes: 0

Ed Morton

Reputation: 203502

With what you've shown us for sample input, all you need is:

$ awk '{for (i=1;i<=NF;i++) $i=($i ~ /^[[:upper:]]/ ? "N" : "n") tolower($i)} 1' file
nword1 Nword2 nword3

If that's NOT all you need then edit your question to show sample input that better represents your real data.

Upvotes: 0

grail

Reputation: 930

Or we could use simple bash:

replace=n

while read -r -a words
do
    out=()
    for word in "${words[@]}"
    do
        first=${word:0:1}

        if [[ "${first,}" == "${word:0:1}" ]]
        then
            word="$replace$word"
        else
            word="${replace^}${word,,}"
        fi
        out+=("$word")
    done
    echo "${out[*]}"
done<input_file

Upvotes: 1

clt60

Reputation: 63902

Perl

perl -CSDA -plE 'BEGIN{$f=shift@ARGV;$t=lc(shift@ARGV)}s/(?i)\b($f)\b/$1=~m!^\p{Upper}!?ucfirst $t:$t/xge;' word nword

This solution does more than prepend N to the word: it can convert any given word to another one while preserving the original word's capitalization.

A more readable version:

perl -CSDA -plE '
   BEGIN{ $f = shift @ARGV; $t = lc(shift @ARGV) }
   s/ (?i) \b($f)\b/ $1 =~ m!^\p{Upper}! ? ucfirst $t : $t /xge;
' word nword

I recommend wrapping it in a bash function, say casesubs:

casesubs() {
    #usage: casesubs fromword toword
    perl -CSDA -plE 'BEGIN{$f=shift@ARGV;$t=lc(shift@ARGV)}s/(?i)\b($f)\b/$1=~m!^\p{Upper}!?ucfirst $t:$t/xge;' "$1" "$2"
}

You can then use it as in the following examples:

(
    text='abcword word Word word wordlen';
    echo "$text"
    casesubs word nword <<<"$text"

) | column -t #pretty printing

abcword  word   Word   word   wordlen  #orig
abcword  nword  Nword  nword  wordlen  #changed

The solution works with any UTF-8 encoded Unicode text, i.e. not only [a-z].

(
    text='überJägermeister ÜBERJÄGERMEISTER'
    echo "$text"
    casesubs überJägermeister unterPIÑACOLÁDA <<<"$text"
) | column -t

output

überJägermeister  ÜBERJÄGERMEISTER
unterpiñacoláda   Unterpiñacoláda

And of course it works with files too. For example, given a file capfile.txt with this content:

Ut debitis eveniet molestiae iusto quis ut. Est nemo dolores
error ipsum aut überJägermeister ÜBERJÄGERMEISTER. Numquam
itaque molestias ut iusto. Quia ut nobis expedita.

you can run

casesubs überJägermeister unterPIÑACOLÁDA < capfile.txt

and get

Ut debitis eveniet molestiae iusto quis ut. Est nemo dolores
error ipsum aut unterpiñacoláda Unterpiñacoláda. Numquam
itaque molestias ut iusto. Quia ut nobis expedita.

Upvotes: 1

stephanmg

Reputation: 766

I'll informally outline what you could do, IMHO:

Read text file #1 into a variable, say textfile1

In a for loop:
1. Read text file #2 line by line, split at space into two variables pattern_to_match and replacement
2. Find pattern_to_match in textfile1 (with case insensitive search) and store this in a variable, say match
3. Find out if first character of match is lower or upper case and remember this in variable upperCase
4. Capitalize the variable replacement if upperCase is true
5. Replace in textfile1 pattern_to_match by replacement

This can all be done in Bash/Sh.
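The steps above can be sketched roughly as follows. This is only one possible reading of the outline, not a definitive implementation: the function name replace_keep_case is hypothetical, it assumes bash 4+ (for the ${var,,} and ${var^} expansions), and it assumes text file #2 holds one "pattern replacement" pair per line. Instead of inspecting each match, it simply substitutes the all-lowercase form and the capitalized form separately.

```shell
#!/usr/bin/env bash
# replace_keep_case TEXT_FILE PAIRS_FILE   (hypothetical helper name)
# Sketch only: needs bash 4+ for the ${var,,} and ${var^} case expansions.
replace_keep_case() {
    local text pattern replacement
    text=$(<"$1")                       # read text file #1 into a variable
    # Each line of text file #2: pattern_to_match replacement
    while read -r pattern replacement || [[ -n $pattern ]]; do
        [[ -n $pattern && -n $replacement ]] || continue
        pattern=${pattern,,}            # normalize both to lower case
        replacement=${replacement,,}
        # all-lowercase occurrences get the lowercase replacement
        text=${text//"$pattern"/"$replacement"}
        # capitalized occurrences get the capitalized replacement;
        # skipped when the pattern has no letter to capitalize
        if [[ ${pattern^} != "$pattern" ]]; then
            text=${text//"${pattern^}"/"${replacement^}"}
        fi
    done < "$2"
    printf '%s\n' "$text"
}
```

It would be called as replace_keep_case text.txt pairs.txt. Known limitations of this sketch: it matches substrings rather than whole words, it leaves ALL-CAPS occurrences untouched, and a later pair can rewrite text produced by an earlier one.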

Upvotes: -1
