Reputation: 9
I have 2 files and I want to sum the first columns of lines that share the same second. If a second is missing from both files, its sum should be zero; if a second appears more than once, all of its values should be added together. How can I do this?
First file:
16 /home/appuser<Apr 4, 2016 11:24:46 PM EEST
2 /home/appuser<Apr 4, 2016 11:24:47 PM EEST
3 /home/appuser<Apr 4, 2016 11:24:48 PM EEST
1 /home/appuser<Apr 4, 2016 11:24:50 PM EEST
3 /home/appuser<Apr 4, 2016 11:24:51 PM EEST
7 /home/appuser<Apr 4, 2016 11:24:52 PM EEST
9 /home/appuser<Apr 4, 2016 11:24:54 PM EEST
8 /home/appuser<Apr 4, 2016 11:24:54 PM EEST
5 /home/appuser<Apr 4, 2016 11:24:55 PM EEST
Second file:
6 /home/appuser<Apr 4, 2016 11:24:46 PM EEST
4 /home/appuser<Apr 4, 2016 11:24:49 PM EEST
7 /home/appuser<Apr 4, 2016 11:24:50 PM EEST
5 /home/appuser<Apr 4, 2016 11:24:50 PM EEST
10 /home/appuser<Apr 4, 2016 11:24:52 PM EEST
6 /home/appuser<Apr 4, 2016 11:24:52 PM EEST
10 /home/appuser<Apr 4, 2016 11:24:55 PM EEST
5 /home/appuser<Apr 4, 2016 11:24:57 PM EEST
output:
22 /home/appuser<Apr 4, 2016 11:24:46 PM EEST
2 /home/appuser<Apr 4, 2016 11:24:47 PM EEST
3 /home/appuser<Apr 4, 2016 11:24:48 PM EEST
4 /home/appuser<Apr 4, 2016 11:24:49 PM EEST
13 /home/appuser<Apr 4, 2016 11:24:50 PM EEST
3 /home/appuser<Apr 4, 2016 11:24:51 PM EEST
23 /home/appuser<Apr 4, 2016 11:24:52 PM EEST
0 /home/appuser<Apr 4, 2016 11:24:53 PM EEST
17 /home/appuser<Apr 4, 2016 11:24:54 PM EEST
15 /home/appuser<Apr 4, 2016 11:24:55 PM EEST
0 /home/appuser<Apr 4, 2016 11:24:56 PM EEST
5 /home/appuser<Apr 4, 2016 11:24:57 PM EEST
Upvotes: 0
Views: 85
Reputation: 785146
This gets pretty tricky due to the requirement of inserting 0 rows for the missing seconds. Here is an awk with sort that you can use:
awk -F '<| /' '{
  # convert the timestamp in $3 to epoch seconds via GNU date
  cmd = "date -d \"" $3 "\" +%s"
  cmd | getline ts
  close(cmd)
  # if there is a gap since the previous second, insert 0 entries for it
  if (p > 0 && (ts-p) > 1) {
    for (i = p+1; i < ts; i++) {
      sums[i] = 0
      # format the missing epoch second back into the original layout
      cmd = "TZ=EET date -d @" i " \"+%b%e, %Y %r %Z\""
      cmd | getline tsi
      close(cmd)
      data[i] = "/" c2 "<" tsi
    }
  }
  sums[ts] += $1            # duplicates of the same second accumulate here
  data[ts] = "/" $2 "<" $3
  p = ts                    # remember previous timestamp and path
  c2 = $2
}
END {
  # GNU awk: traverse keys in numeric order; plain awk iterates arbitrarily
  PROCINFO["sorted_in"] = "@ind_num_asc"
  for (i in sums)
    printf "%4d%s%s\n", sums[i], OFS, data[i]
}' <(sort -t'<' -k2 file1 file2)
Output:
22 /home/appuser<Apr 4, 2016 11:24:46 PM EEST
2 /home/appuser<Apr 4, 2016 11:24:47 PM EEST
3 /home/appuser<Apr 4, 2016 11:24:48 PM EEST
4 /home/appuser<Apr 4, 2016 11:24:49 PM EEST
13 /home/appuser<Apr 4, 2016 11:24:50 PM EEST
3 /home/appuser<Apr 4, 2016 11:24:51 PM EEST
23 /home/appuser<Apr 4, 2016 11:24:52 PM EEST
0 /home/appuser<Apr 4, 2016 11:24:53 PM EEST
17 /home/appuser<Apr 4, 2016 11:24:54 PM EEST
15 /home/appuser<Apr 4, 2016 11:24:55 PM EEST
0 /home/appuser<Apr 4, 2016 11:24:56 PM EEST
5 /home/appuser<Apr 4, 2016 11:24:57 PM EEST
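If forking date(1) for every input line is a concern, and given that all of the sample records fall on the same calendar day, a portable-awk sketch can key on seconds-since-midnight computed arithmetically instead. This assumption does not hold for data spanning days; a few sample lines are inlined here for demonstration:

```shell
# Sketch: sum per second and zero-fill gaps without calling date(1).
# Assumes every record is on the same calendar day (true of the sample data).
out=$(printf '%s\n' \
  '2 /home/appuser<Apr 4, 2016 11:24:46 PM EEST' \
  '3 /home/appuser<Apr 4, 2016 11:24:48 PM EEST' \
  '4 /home/appuser<Apr 4, 2016 11:24:46 PM EEST' |
awk -F'<| /' '
{
  split($3, f, " ")              # f[4]="11:24:46", f[5]="PM", f[6]="EEST"
  split(f[4], t, ":")
  h = t[1] % 12; if (f[5] == "PM") h += 12
  s = h*3600 + t[2]*60 + t[3]    # seconds since midnight as the key
  sums[s] += $1
  line[s] = "/" $2 "<" $3
  if (min == "" || s < min) min = s
  if (s > max) max = s
  dir = $2; pre = f[1] " " f[2] " " f[3] " "; tz = f[6]
}
END {
  for (s = min; s <= max; s++) {
    if (!(s in line)) {          # reconstruct the timestamp text for a gap
      hh = int(s/3600); mm = int(s%3600/60); ss = s%60
      ap = (hh >= 12) ? "PM" : "AM"
      h12 = hh % 12; if (h12 == 0) h12 = 12
      line[s] = sprintf("/%s<%s%02d:%02d:%02d %s %s", dir, pre, h12, mm, ss, ap, tz)
    }
    printf "%4d %s\n", sums[s], line[s]
  }
}')
echo "$out"
```

Because the key range is walked from min to max in the END block, the output comes out chronologically without an external sort.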
Upvotes: 1
Reputation:
In your original question you included output rows with a sum of 0; I'm not sure where those come from. Presuming that's additional data you don't have to worry about, the following will sum column one based on matching column twos. This can be expanded to as many files as you need; just add them to the file list in the input cat --> <(cat f1.txt f2.txt f3.txt ... fn.txt)
unset myarr && declare -A myarr            # associative array: line key -> sum
while read -r a; do
  col1=$(cut -d' ' -f1 <<< "${a}")         # the leading count
  col2=$(cut -d' ' -f3- <<< "${a}")        # everything from the path onward
  let myarr["${col2}"]+="${col1}"          # accumulate per unique column-two key
done < <(awk '{var=$1; $1=""; print var,$0}' <(cat f1.txt f2.txt))
for a in "${!myarr[@]}"; do echo "${myarr["$a"]} ${a}"; done
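As a side note, the two cut(1) forks per line aren't strictly needed: read can split off the leading count itself. A minimal sketch of that variant, with two sample lines inlined for demonstration:

```shell
declare -A myarr=()                        # line key -> running sum
while read -r col1 rest; do                # col1 = count, rest = everything after it
  # accumulate explicitly; missing keys default to 0
  myarr["$rest"]=$(( ${myarr["$rest"]:-0} + col1 ))
done <<'EOF'
2 /home/appuser<Apr 4, 2016 11:24:46 PM EEST
4 /home/appuser<Apr 4, 2016 11:24:46 PM EEST
EOF
for a in "${!myarr[@]}"; do echo "${myarr[$a]} ${a}"; done
```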
Upvotes: 0
Reputation: 77
Try using the code below; hope it helps. Note that it keys on $5 (the HH:MM:SS field), so as shown it neither sorts the output nor inserts 0 rows for missing seconds:
$ awk '{a[$5]+=$1; sub(/[0-9]+/,"",$1); line[$5]=$0}
END{for(k in a) printf "%2d %s\n",a[k],line[k]}' first second
13 /home/appuser<Apr 4, 2016 11:24:50 PM EEST
3 /home/appuser<Apr 4, 2016 11:24:51 PM EEST
23 /home/appuser<Apr 4, 2016 11:24:52 PM EEST
17 /home/appuser<Apr 4, 2016 11:24:54 PM EEST
15 /home/appuser<Apr 4, 2016 11:24:55 PM EEST
22 /home/appuser<Apr 4, 2016 11:24:46 PM EEST
2 /home/appuser<Apr 4, 2016 11:24:47 PM EEST
5 /home/appuser<Apr 4, 2016 11:24:57 PM EEST
3 /home/appuser<Apr 4, 2016 11:24:48 PM EEST
4 /home/appuser<Apr 4, 2016 11:24:49 PM EEST
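The output above is unordered because `for (k in a)` iterates in an arbitrary order. One portable way to restore chronological order is to sort on the text after the `<` delimiter, which works for this single-day data; a sketch with a few sample lines inlined for demonstration:

```shell
# Same summing one-liner as above, piped through sort on the timestamp text.
out=$(printf '%s\n' \
  '16 /home/appuser<Apr 4, 2016 11:24:46 PM EEST' \
  '9 /home/appuser<Apr 4, 2016 11:24:54 PM EEST' \
  '8 /home/appuser<Apr 4, 2016 11:24:54 PM EEST' |
awk '{a[$5]+=$1; sub(/[0-9]+/,"",$1); line[$5]=$0}
     END{for(k in a) printf "%2d %s\n",a[k],line[k]}' |
sort -t'<' -k2)
echo "$out"
```

Sorting on the string after `<` is only chronological while the date part is constant; data spanning days would need a numeric (epoch) key as in the first answer.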
Upvotes: 0