Reputation: 5291
I have C++ code that does this. It takes a dictionary file formatted as:
blue 1
cat 2
chased 3
dog 4
. 5
....
and takes a text file:
blue cat chased dog .
yellow carrot ate brown fish .
and it converts it into:
1 2 3 4 5
88 90 121 11 133 5
......
Is there a simple one-line solution for this in Bash?
Upvotes: 0
Views: 160
Reputation: 241988
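Create a sed script from the dictionary file (the first step escapes dots, which would otherwise match any character, and the trailing g replaces every occurrence on a line):
sed 's/\./\\./g;s/^/s=/;s/ /=/;s/$/=g/' file
And run it on the input:
sed 's/\./\\./g;s/^/s=/;s/ /=/;s/$/=g/' file | sed -f- input
This might not work if a word is part of another word, e.g. cat and category, because the generated substitutions match substrings rather than whole words.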
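To make the generated script and the substring caveat concrete, here is a minimal sketch. The sample files and the temporary directory are made up for illustration, and the dot-escaping and g flag are my assumptions about what is wanted. It writes the script to a file instead of piping it, so it does not rely on sed reading a script from standard input.

```shell
#!/bin/bash
# Hypothetical sample data, created only for this demonstration.
tmp=$(mktemp -d)
printf '%s\n' 'blue 1' 'cat 2' '. 5' > "$tmp/file"
printf '%s\n' 'blue cat .' 'cat category .' > "$tmp/input"

# Turn each "word number" line into "s=word=number=g", escaping dots first.
sed 's/\./\\./g;s/^/s=/;s/ /=/;s/$/=g/' "$tmp/file" > "$tmp/script.sed"
script=$(cat "$tmp/script.sed")
printf '%s\n' "$script"

# Apply the generated substitutions to the text.
result=$(sed -f "$tmp/script.sed" "$tmp/input")
printf '%s\n' "$result"
rm -r "$tmp"
```

The second input line shows the pitfall: the cat inside category is replaced as well, so the word comes out as 2egory.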
Perl solution: read the first file into a hash table, then read the second file and replace each word with the corresponding value from the hash table.
perl -lane 'if (! $second) { $h{ $F[0] } = $F[1] }
            else { s/(\S+)/$h{$1}/g; print }
            $second = 1 if eof;' file input
Upvotes: 0
Reputation: 37424
In awk, implementing the handling of missing dictionary items that @karakfa envisioned (unknown words get new numbers above the current maximum):
$ awk 'NR==FNR {
    a[$1]=$2;                # store dict to a hash
    if($2>m)                 # m is the max number in dict
        m=$2;
    next
} {
    for(i=1;i<=NF;i++)       # iterate thru all words in record
        if($i in a)          # if a dict match is found
            $i=a[$i];        # replace it
        else {               # if not
            a[$i]=++m;       # grow m and make new dictionary entry
            # print $i, m > "new_items"   # to store them to a file
            $i=m             # ... and use it
        }
} 1' dict text
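To show how the growing dictionary behaves, here is a small sketch of the same idea run on made-up sample files. The file names, the temporary directory and the `new` variable are hypothetical. It also writes each newly assigned entry to a separate file, as the commented-out line above suggests.

```shell
#!/bin/bash
# Hypothetical sample data, created only for this demonstration.
tmp=$(mktemp -d)
printf '%s\n' 'blue 1' 'cat 2' 'chased 3' 'dog 4' '. 5' > "$tmp/dict"
printf '%s\n' 'blue cat chased dog .' 'yellow cat ate fish .' > "$tmp/text"

result=$(awk -v new="$tmp/new_items" 'NR==FNR {
        a[$1] = $2                      # load the dictionary
        if ($2 + 0 > m) m = $2 + 0      # track the largest number seen
        next
    } {
        for (i = 1; i <= NF; i++) {
            if (!($i in a)) {           # unknown word: assign the next number
                a[$i] = ++m
                print $i, m > new       # record the new entry
            }
            $i = a[$i]
        }
    } 1' "$tmp/dict" "$tmp/text")
printf '%s\n' "$result"

new_items=$(cat "$tmp/new_items")
printf '%s\n' "$new_items"
rm -r "$tmp"
```

With this sample data the largest dictionary number is 5, so yellow, ate and fish are numbered 6, 7 and 8 in order of first appearance.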
Upvotes: 0
Reputation: 104032
For silliness, here is pure Bash (you should use awk for this IMHO):
declare -A dict
while read -r k v; do
    dict[$k]=$v
done < /tmp/f1.txt
while IFS= read -r line || [[ -n $line ]]; do
    read -ra la <<< "$line"    # split on whitespace without glob expansion
    for word in "${la[@]}"; do
        [[ ${dict[$word]} ]] && printf "%s " "${dict[$word]}"
    done
    echo
done < /tmp/f2.txt
Upvotes: 0
Reputation: 67507
awk to the rescue!
$ awk 'NR==FNR {dict[$1]=$2; next}
{for(i=1;i<=NF;i++) $i=dict[$i]}1' dict file
You may want to add logic for handling words that are missing from the dictionary; as written, they are replaced with empty strings.
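One possible way to handle missing items, sketched here on made-up sample files, is to leave unknown words untouched. That policy is an assumption on my part; you could just as well print a placeholder or skip the word.

```shell
#!/bin/bash
# Hypothetical sample data, created only for this demonstration.
tmp=$(mktemp -d)
printf '%s\n' 'blue 1' 'cat 2' 'chased 3' 'dog 4' '. 5' > "$tmp/dict"
printf '%s\n' 'blue cat chased dog .' 'yellow cat .' > "$tmp/text"

# First file: fill the lookup table. Second file: replace only known words.
result=$(awk 'NR==FNR { dict[$1] = $2; next }
              { for (i = 1; i <= NF; i++) if ($i in dict) $i = dict[$i] } 1' \
         "$tmp/dict" "$tmp/text")
printf '%s\n' "$result"
rm -r "$tmp"
```

The `$i in dict` test checks for the key without creating it, so the lookup table does not fill up with empty entries for unknown words.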
Upvotes: 3
Reputation: 43039
@choroba's sed solution didn't work for me. I am not sure if there is a one-line solution for this. I would do this in Bash:
#!/bin/bash
# read the word values from the first file into an associative array
declare -A map
while IFS=' ' read -r word value; do
    map[$word]=$value
done < 1.txt
# traverse the second file and print out numbers corresponding to each word
# if there is no mapped number, print nothing
while read -r line; do
    read -ra words <<< "$line"
    for word in "${words[@]}"; do    # quoted to prevent glob expansion
        num="${map[$word]}"
        [[ $num ]] && printf "%s " "$num"
    done
    printf "\n"
done < 2.txt
Gives the following output for the files in your question:
1 2 3 4 5
5
Upvotes: 0