Reputation: 144
Here is the text in text.txt:
"word1 Word2 word3"
Now, I would like to have this output:
"nword1 Nword2 nword3"
What I have done so far:
sed -e s/word1/nword1/gI text.txt
sed -e s/word2/nword2/gI text.txt
sed -e s/word3/nword3/gI text.txt
The problem is that I do not know which words are capitalized, so I have to write the pattern in lower case, as in "sed -e s/word3/nword3/gI text.txt", and the replacement then always comes out in lower case.
So basically, I would like to replace words while preserving the capitalization of the original text. How can I do this in a bash script?
Upvotes: 2
Views: 303
Reputation: 42999
You can use awk for this:
awk '{for(i=1; i<=NF; i++) { if ($i ~ /^[[:lower:]]/) {$i = "n"$i} else {$i = "N"$i}}}1' file
For your test case, it outputs:
nword1 NWord2 nword3
It will work irrespective of how many words you have on each line.
Upvotes: 0
Reputation: 37404
In awk:
$ awk -v f="n" '
{
for(i=1;i<=NF;i++)
sub(/^./, ((c=substr($i,1,1))~/[[:upper:]]/?toupper(f):f) tolower(c),$i)
} 1' file
Of course, you can pipe from echo to the script as well. Explained:
awk -v f="n"
- the char to prepend is brought in as a variable
for(i=1;i<=NF;i++)
- iterate through every word in the record
sub(/^./, (
- replace the first char of the word with
(c=substr($i,1,1))~/[[:upper:]]/?toupper(f):f) tolower(c),
- the first char of the word is stored in the var c; if it is a capital, produce the capitalized char from f, followed by the lowered char from c
$i)
- on every word
Edit: As commented, but untested.
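To illustrate piping input into the script, here is a minimal sketch that wraps the same awk logic in a shell function. The function name prepend_keepcase is hypothetical (it is not part of the answer), and the prefix passed as the first argument is assumed to be a single lowercase letter.

```shell
# Hypothetical wrapper (name is an assumption); reads text on stdin.
# $1 is the character to prepend, assumed to be one lowercase letter.
prepend_keepcase() {
  awk -v f="$1" '
  {
    for (i = 1; i <= NF; i++) {
      c = substr($i, 1, 1)                       # first character of the word
      p = (c ~ /[[:upper:]]/) ? toupper(f) : f   # prefix takes the case of that character
      sub(/^./, p tolower(c), $i)                # replace it with prefix + lowered character
    }
  } 1'
}

echo "word1 Word2 word3" | prepend_keepcase n
```

Note that only the first character of each word is lowercased; the rest of the word is left as it was.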
Upvotes: 0
Reputation: 92854
AWK solution:
awk '{for(i=1;i<=NF;i++){printf "%s%s%s",($i~/\<[[:lower:]]/)?"n":"N",tolower($i),(i<NF?FS:ORS)}}' text.txt
The output:
nword1 Nword2 nword3
Explanation:
for(i=1;i<=NF;i++)
- iterating through all fields/columns(i.e. words)
$i~/\<[[:lower:]]/
- checks if a field/word starts with lowercase letter.
\<
- a gawk regex operator which matches the empty string at the beginning of a word. For example, /\<away/ matches ‘away’ but not ‘stowaway’.
tolower($i)
- converts a word into lowercase
Upvotes: 0
Reputation: 203502
With what you've shown us for sample input, all you need is:
$ awk '{for (i=1;i<=NF;i++) $i=($i ~ /^[[:upper:]]/ ? "N" : "n") tolower($i)} 1' file
nword1 Nword2 nword3
If that's NOT all you need then edit your question to show sample input that better represents your real data.
Upvotes: 0
Reputation: 930
Or we could use simple bash:
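replace=n
while read -r -a words
do
    out=()
    for word in "${words[@]}"
    do
        first=${word:0:1}
        # lowercasing the first character changes nothing unless it was uppercase
        if [[ "${first,}" == "$first" ]]
        then
            word="$replace$word"
        else
            # capitalized word: capitalize the prefix, lowercase the word
            word="${replace^}${word,,}"
        fi
        out+=("$word")
    done
    echo "${out[*]}"
done<input_file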
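As a sketch of how the loop above could be reused, it can be wrapped in a function that reads from stdin. The function name prefix_keepcase is hypothetical, and the extra test on the read loop is an addition (not in the answer) so that a final line without a trailing newline is still processed. This assumes bash 4 or later for the ${var,}, ${var,,} and ${var^} expansions.

```shell
# Hypothetical function name; requires bash 4+ for the case-modifying expansions.
# Usage: prefix_keepcase PREFIX < input_file
prefix_keepcase() {
  local replace=$1 word first
  local -a words out
  # "|| (( ${#words[@]} ))" also handles a last line that lacks a newline
  while read -r -a words || (( ${#words[@]} )); do
    out=()
    for word in "${words[@]}"; do
      first=${word:0:1}
      if [[ ${first,} == "$first" ]]; then
        out+=("$replace$word")            # first character is not uppercase
      else
        out+=("${replace^}${word,,}")     # capitalize the prefix, lowercase the word
      fi
    done
    echo "${out[*]}"
  done
}

prefix_keepcase n <<<"word1 Word2 word3"
```

Because read -a splits on whitespace, runs of spaces or tabs between words are collapsed to single spaces in the output.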
Upvotes: 1
Reputation: 63902
Perl
perl -CSDA -plE 'BEGIN{$f=shift@ARGV;$t=lc(shift@ARGV)}s/(?i)\b($f)\b/$1=~m!^\p{Upper}!?ucfirst $t:$t/xge;' word nword
This solution does not just prepend N to the word; it can convert any given word to another one, preserving the original word's capitalization.
A more readable version:
perl -CSDA -plE '
BEGIN{ $f = shift @ARGV; $t = lc(shift @ARGV) }
s/ (?i) \b($f)\b/ $1 =~ m!^\p{Upper}! ? ucfirst $t : $t /xge;
' word nword
I recommend creating a bash function, say casesubs:
casesubs() {
#usage: casesubs fromword toword
perl -CSDA -plE 'BEGIN{$f=shift@ARGV;$t=lc(shift@ARGV)}s/(?i)\b($f)\b/$1=~m!^\p{Upper}!?ucfirst $t:$t/xge;' "$1" "$2"
}
You can now use it easily, as in the following examples:
(
text='abcword word Word word wordlen';
echo "$text"
casesubs word nword <<<"$text"
) | column -t #pretty printing
abcword word Word word wordlen #orig
abcword nword Nword nword wordlen #changed
The solution works with any UTF-8 encoded Unicode text, i.e. not only [a-z].
(
text='überJägermeister ÜBERJÄGERMEISTER'
echo "$text"
casesubs überJägermeister unterPIÑACOLÁDA <<<"$text"
) | column -t
Output:
überJägermeister ÜBERJÄGERMEISTER
unterpiñacoláda Unterpiñacoláda
And of course it works with files too. E.g., given a file capfile.txt with the content
Ut debitis eveniet molestiae iusto quis ut. Est nemo dolores
error ipsum aut überJägermeister ÜBERJÄGERMEISTER. Numquam
itaque molestias ut iusto. Quia ut nobis expedita.
you can use
casesubs überJägermeister unterPIÑACOLÁDA < capfile.txt
and get
Ut debitis eveniet molestiae iusto quis ut. Est nemo dolores
error ipsum aut unterpiñacoláda Unterpiñacoláda. Numquam
itaque molestias ut iusto. Quia ut nobis expedita.
Upvotes: 1
Reputation: 766
I'll informally outline what you could do, IMHO:
Read text file #1 into a variable, say textfile1
In a for loop:
This can all be done in Bash/Sh.
Upvotes: -1