Jaeyoung Park

Reputation: 339

Subtracting values on two files in Linux

I have two files that contain both text and numbers. The text in the two files is identical. I want to create a new file that contains the average of the corresponding numbers from the two files.

FileA.txt (more than 10000 lines, with more than 1000 text and number entries)

textA
textB(10,2,2)
textC(2)
textD
.
.

FileB.txt (the text is identical to FileA.txt)

textA
textB(0,0,4)
textC(4)
textD
.
.

FileNew.txt (contains the averages of FileA.txt and FileB.txt)

textA
textB(5,1,3)
textC(3)
textD
.
.

One requirement is that the text must not change. Only the numbers should be changed.

I think AWK or diff could do this job.

Best,

Jaeyoung

Upvotes: 1

Views: 462

Answers (2)

repzero

Reputation: 8402

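Perhaps you can try something like this:

paste FileA.txt FileB.txt | awk '$1 !~ /\([0-9]+(,[0-9]+)*\)$/ {print $1; next} {n = split($1, f1, /[(),]/); split($2, f2, /[(),]/); out = f1[1] "("; for (i = 2; i < n; i++) out = out int((f1[i] + f2[i]) / 2) (i < n - 1 ? "," : ""); print out ")"}'

To break this down into a more readable form, create a file named awkscript containing these lines:

#!/usr/bin/awk -f
# Lines without a parenthesized number list are printed unchanged.
$1 !~ /\([0-9]+(,[0-9]+)*\)$/ { print $1; next }
{
    # Split both columns into the text and its numbers.
    n = split($1, f1, /[(),]/); split($2, f2, /[(),]/)
    out = f1[1] "("
    # f1[2] .. f1[n-1] hold the numbers; int() truncates each average.
    for (i = 2; i < n; i++)
        out = out int((f1[i] + f2[i]) / 2) (i < n - 1 ? "," : "")
    print out ")"
}

Now call your script like this:

paste FileA.txt FileB.txt | awk -f awkscript

(paste comes in handy here: it joins corresponding lines of the two files with a tab, so awk sees the FileA.txt entry as $1 and the FileB.txt entry as $2. This assumes the entries contain no whitespace.)

Result:

textA
textB(5,1,3)
textC(3)
textD
.
.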

Upvotes: 3

Kaz

Reputation: 58667

TXR solution:

@(next :list @(weave (get-lines (open-file [*args* 0]))
                     (get-lines (open-file [*args* 1]))))
@(repeat)
@  (cases)
@text(@cnum0)
@text(@cnum1)
@    (do (let* ((num0 [mapcar toint (split-str cnum0 ",")])
                (num1 [mapcar toint (split-str cnum1 ",")])
                (avg (mapcar (op trunc (+ @1 @2) 2) num0 num1)))
           (put-line `@text(@{avg ","})`)))
@  (or)
@text
@text
@    (do (put-line text))
@  (end)
@(end)


$ txr avg.txr FileA.txt FileB.txt
textA
textB(5,1,3)
textC(3)
textD
.
.

This script processes the files as if they were one file with lines interleaved from the two files. The corresponding texts have to match exactly, or the program stops with a failed termination status. Lines which don't match the text(whatever) syntax are assumed to be text which has to match exactly.

It's assumed that the numbers are integers, and truncating division is used in the averaging.
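
For readers who don't have TXR installed, the same strict matching idea can be sketched in Python. This is only an illustration under stated assumptions, not a translation of the script above: the function name average_line and the regular expression are made up for this example, and the numbers are assumed to be non-negative integers.

```python
import re
import sys

# Assumed line shape: arbitrary text, then "(" comma-separated
# non-negative integers ")" at the end of the line.
PATTERN = re.compile(r"(.*)\((\d+(?:,\d+)*)\)")


def average_line(line_a, line_b):
    """Average one pair of corresponding lines, leaving the text untouched."""
    m_a = PATTERN.fullmatch(line_a)
    m_b = PATTERN.fullmatch(line_b)
    if m_a and m_b and m_a.group(1) == m_b.group(1):
        nums_a = [int(n) for n in m_a.group(2).split(",")]
        nums_b = [int(n) for n in m_b.group(2).split(",")]
        if len(nums_a) != len(nums_b):
            raise ValueError(f"different number counts: {line_a!r} vs {line_b!r}")
        # // floors, which equals truncation here because \d+ never matches a sign.
        avgs = [(a + b) // 2 for a, b in zip(nums_a, nums_b)]
        return f"{m_a.group(1)}({','.join(map(str, avgs))})"
    if line_a == line_b:
        # Plain text lines must be identical in both files.
        return line_a
    raise ValueError(f"lines do not correspond: {line_a!r} vs {line_b!r}")


if __name__ == "__main__" and len(sys.argv) == 3:
    with open(sys.argv[1]) as fa, open(sys.argv[2]) as fb:
        # zip stops at the end of the shorter file.
        for a, b in zip(fa, fb):
            print(average_line(a.rstrip("\n"), b.rstrip("\n")))
```

Saved under a hypothetical name such as avg.py, it would be run as python3 avg.py FileA.txt FileB.txt. Because the division truncates, an odd sum loses its half: 1 and 2 average to 1.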

This pure TXR Lisp solution simply chops each line into numeric and non-numeric pieces and averages the corresponding numeric pieces. Floating-point math is used. The t argument to tok-str tells it to keep the in-between pieces which don't match the tokenizing regex.

(each ((line0 (get-lines (open-file [*args* 0])))
       (line1 (get-lines (open-file [*args* 1]))))
  (let* ((nregex #/(\d+|\d+\.\d+|\.\d+)([Ee][+\-]?\d+)?/)
         (chop0 (tok-str line0 nregex t))
         (chop1 (tok-str line1 nregex t))
         (out (mapcar (lambda (tok0 tok1)
                        (let ((n0 (tofloat tok0))
                              (n1 (tofloat tok1)))
                          (if (and n0 n1)
                            (/ (+ n0 n1) 2)
                            tok0)))
                      chop0 chop1)))
    (put-line `@{out ""}`)))

$ txr avg.tl FileA.txt FileB.txt
textA
textB(5.0,1.0,3.0)
textC(3.0)
textD
.
.
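
The tokenizing approach can also be sketched in Python, where re.split with a capturing group plays roughly the role of tok-str with the keep argument. This is a hedged illustration: average_tokens is a name invented for this example, and both lines are assumed to split into the same number of pieces. Python's alternation picks the first alternative that matches rather than the longest, so the decimal form is listed before the plain integer form.

```python
import re

# One capturing group around the whole number makes re.split return the
# numbers as well as the text between them. Inner groups are non-capturing.
NUMBER = re.compile(r"((?:\d+\.\d+|\d+|\.\d+)(?:[Ee][+-]?\d+)?)")


def average_tokens(line_a, line_b):
    """Average every numeric piece of two lines using floating-point math."""
    pieces_a = NUMBER.split(line_a)
    pieces_b = NUMBER.split(line_b)
    out = []
    # Assumption: both lines yield the same number of pieces;
    # zip would silently drop any extras.
    for i, (tok_a, tok_b) in enumerate(zip(pieces_a, pieces_b)):
        if i % 2 == 1:
            # Odd indices hold the captured numbers.
            out.append(str((float(tok_a) + float(tok_b)) / 2))
        else:
            # Even indices hold the text in between, taken from the first line.
            out.append(tok_a)
    return "".join(out)


print(average_tokens("textB(10,2,2)", "textB(0,0,4)"))
```

Note that any digits inside the text itself would be treated as numbers too, so this only suits files whose text parts contain no digits.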

Upvotes: 0
