Reputation: 23
I have file with two columns. First column is string, second is positive number. in If first field (string) doesn't have double in file (so, first field is unique for the file), I want to copy that unique line to (let's say) result.txt. If first field does have duplicate in file, then I want to subtract second field (number) in those duplicated lines. By the way, file will have one duplicate max, no more than that. I want to save that also in result.txt. So, output file will have all lines with unique values of first field and lines in which first field is duplicated name and second is subtracted value from those duplicates. Files are not sorted. Here is example:
INPUT FILE:
hello 7
something 8
hey 9
hello 8
something 12
nathanforyou 23
OUTPUT FILE that I need (result.txt):
hello 1
something 4
hey 9
nathanforyou 23
I can't have negative numbers in ending file, so I have to subtract smaller number from bigger. What have I tried so far? All kinds of sort (I figure out how to find non-duplicate lines and put them in separate file, but choked on duplicate substraction), arrays in awk (I saved all lines in array, and do "for" clause... problem is that I don't know how to get second field from array element that is line) etc. By the way, problem is more complicated than I described (I have four fields, first two are the same and so on), but at the end - it comes to this.
Upvotes: 2
Views: 60
Reputation: 203393
$ cat tst.awk
{ val[$1,++cnt[$1]] = $2 }
END {
for (name in cnt) {
if ( cnt[name] == 1 ) {
print name, val[name,1]
}
else {
val1 = val[name,1]
val2 = val[name,2]
print name, (val1 > val2 ? val1 - val2 : val2 - val1)
}
}
}
$ awk -f tst.awk file
hey 9
hello 1
nathanforyou 23
something 4
Upvotes: 1