Reputation: 61
I have two files: an English file (the source) and an Italian file (the target). Both have the same number of lines. I run awk 'NF<3' on the Italian file to remove every string with more than two words, and I would like the corresponding source strings to be removed from the English file as well. I thought I could work by line number: since the strings in the two files differ, I would have to run a sed command on the line numbers of the removed strings. However, I don't know how to do that while awk is filtering the Italian file, because once the command has run, the line numbers in the two files no longer correspond.
Example
EN
1 Santa Claus
2 Pigs don't fly
3 The son of the father
4 Elf
IT
1 Babbo Natale
2 I maiali non volano
3 Il figlio del padre
4 Elfo
I run awk on the IT file:
OUTPUT FILE
IT
1 Babbo Natale
4 Elfo
The lines removed by awk from the IT file also need to be removed from the EN file. I can't simply run awk again on the EN file, because the word counts there differ from those in the IT file; the removal has to be driven by line number alone.
The output EN file must be:
1 Santa Claus
2 Elf
Any suggestions? If it's not clear, please ask...
Upvotes: 1
Views: 775
Reputation: 58578
This might work for you (GNU sed):
sed -rn 's/\S+//3;T;=' fileIT | sed 's/.*/&d/' | sed -f - fileEN
This uses the IT file to generate a sed script that is then run against the EN file. The first sed invocation outputs the line number of every line in the IT file that has three or more words. The second sed invocation turns each line number into a sed command that deletes that line. The third sed invocation reads that script from standard input and deletes those lines from the EN file.
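To make the intermediate step visible, the pipeline can be split so the generated script is written to a file first. This is only a sketch: it assumes GNU sed, that the sample data from the question is stored without the leading line numbers, and the names delete.sed and fileEN.filtered are made up for illustration.

```shell
# Recreate the sample data from the question (file names as used in the answer).
printf '%s\n' 'Babbo Natale' 'I maiali non volano' 'Il figlio del padre' 'Elfo' > fileIT
printf '%s\n' 'Santa Claus' "Pigs don't fly" 'The son of the father' 'Elf' > fileEN

# Stages 1 and 2: emit the number of every IT line with a third word,
# then append "d" so each number becomes a sed delete command.
sed -rn 's/\S+//3;T;=' fileIT | sed 's/.*/&d/' > delete.sed   # hypothetical file name
cat delete.sed

# Stage 3: apply the generated script to the EN file.
sed -f delete.sed fileEN > fileEN.filtered                    # hypothetical file name
cat fileEN.filtered
```

With the sample data, delete.sed should contain the two commands 2d and 3d. Note that this pipeline only filters the EN file; the IT file still has to be filtered separately, for example with the awk 'NF<3' command from the question.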
Upvotes: 0
Reputation: 14975
Having as source files:
$ cat it.dat
Babbo Natale
I maiali non volano
Il figlio del padre
Elfo
$ cat en.dat
Santa Claus
Pigs don't fly
The son of the father
Elf
This awk command:
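awk '# First file (it.dat): keep lines with fewer than 3 words and remember their line numbers.
NR==FNR { if (NF<3) { keep[FNR]=1; print > "filtered_it.dat" } next }
# Second file (en.dat): print only the lines whose numbers were remembered.
FNR in keep { print > "filtered_en.dat" }' it.dat en.dat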
Results
$ cat filtered_it.dat
Babbo Natale
Elfo
$ cat filtered_en.dat
Santa Claus
Elf
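A different way to keep the two files in step, not taken from the answer above, is to glue them together line by line with paste and filter both columns in a single pass. This is a minimal sketch under the assumption that neither file contains a tab character, since tab is used as the column separator; the file names follow the ones used above.

```shell
# Recreate the sample data.
printf '%s\n' 'Babbo Natale' 'I maiali non volano' 'Il figlio del padre' 'Elfo' > it.dat
printf '%s\n' 'Santa Claus' "Pigs don't fly" 'The son of the father' 'Elf' > en.dat

# paste joins line N of it.dat and line N of en.dat with a tab.
paste it.dat en.dat |
awk -F '\t' '
  # Count whitespace-separated words in the Italian column only;
  # when there are fewer than 3, write both halves to their output files.
  split($1, words, " ") < 3 {
    print $1 > "filtered_it.dat"
    print $2 > "filtered_en.dat"
  }
'

cat filtered_it.dat
cat filtered_en.dat
```

Because each pair of lines travels together, nothing has to be remembered between the two files, which may matter for very large inputs.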
Upvotes: 4