Reputation: 422

Permutation columns without repetition

Can anybody suggest some code or an algorithm to solve the following problem? I have several files, each with a different number of columns, like:

$> cat file-1   
1 2
$> cat file-2
1 2 3
$> cat file-3
1 2 3 4

For each row, I would like to take the absolute difference of each pair of different columns and divide it by the sum of all values in the row, using each pair only once (combinations, with no repeated column pairs):

In the file-1 case I need to get:

0.3333                    # because |1-2|/(1+2)

In the file-2 case I need to get:

0.1666 0.1666 0.3333      # because |1-2|/(1+2+3) and |2-3|/(1+2+3) and |1-3|/(1+2+3)

In the file-3 case I need to get:

0.1 0.2 0.3 0.1 0.2 0.1   # because |1-2|/(1+2+3+4) and |1-3|/(1+2+3+4) and |1-4|/(1+2+3+4) and |2-3|/(1+2+3+4) and |2-4|/(1+2+3+4) and |3-4|/(1+2+3+4)
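
Stated generally, as a reading of the three examples rather than a definition given anywhere above: if one row holds the values x_1 ... x_n (names assumed here for illustration) and their sum is non-zero, each output value would be

```latex
r_{ij} = \frac{\lvert x_i - x_j \rvert}{\sum_{k=1}^{n} x_k}, \qquad 1 \le i < j \le n
```

This gives n(n-1)/2 values per row, which matches the 1, 3 and 6 values listed for the two-, three- and four-column files.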

Upvotes: 4

Answers (3)

jaypal singh

Reputation: 77115

This should work, though I am guessing you made a minor mistake in your expected output. Based on the pair ordering in your third example, the file-2 result should be ordered differently.

Instead of:

In the file-2 case I need to get:

0.1666 0.1666 0.3333      # because |1-2|/(1+2+3) and |2-3|/(1+2+3) and |1-3|/(1+2+3)

It should be:

In the file-2 case I need to get:

0.1666 0.3333 0.1666     # because |1-2|/(1+2+3) and |1-3|/(1+2+3) and |2-3|/(1+2+3)

Here is the awk script:

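awk '
NF {
    a = 0;
    for (i = 1; i <= NF; i++)
        a += $i;                     # a is the sum of the row
    # Each pair (j, k+1) with j < k+1 is visited once. The negation stands in
    # for an absolute value and assumes the row is in ascending order.
    for (j = 1; j <= NF; j++) {
        for (k = j; k < NF; k++)
            printf("%s ", -($j - $(k+1)) / a)
    }
    print "";
    next;
}1' file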

Short version:

awk '
NF{for (i=1;i<=NF;i++) a+=$i; 
for (j=1;j<=NF;j++){for (k=j;k<NF;k++) printf("%2.4f ",-($j-$(k+1))/a)}
print "";a=0;next;}1' file

Input File:

[jaypal:~/Temp] cat file
1 2

1 2 3

1 2 3 4

Test:

[jaypal:~/Temp] awk '
NF{
    a=0;
    for(i=1;i<=NF;i++)
    a+=$i;
    for(j=1;j<=NF;j++)
    {
        for(k=j;k<NF;k++)
        printf("%s ",-($j-$(k+1))/a)
        }
    print "";
    next;
    }1' file
0.333333 

0.166667 0.333333 0.166667 

0.1 0.2 0.3 0.1 0.2 0.1

Test from shorter version:

[jaypal:~/Temp] awk '
NF{for (i=1;i<=NF;i++) a+=$i; 
for (j=1;j<=NF;j++){for (k=j;k<NF;k++) printf("%2.4f ",-($j-$(k+1))/a)}
print "";a=0;next;}1' file 
0.3333 

0.1667 0.3333 0.1667 

0.1000 0.2000 0.3000 0.1000 0.2000 0.1000

Upvotes: 3

Steve

Reputation: 54477

@Jaypal just beat me to it! Here's what I had:

awk '{sum=0; for (x=1;x<=NF;x++) sum += $x; for (i=1;i<=NF;i++) for (j=2;j<=NF;j++) if (i < j) printf ("%.1f ",-($i-$j)/sum); print ""}' file.txt

Output:

0.1 0.2 0.3 0.1 0.2 0.1

The %.1f format prints each value to one decimal place.

@Jaypal, is there a quick way to make printf print an absolute value? Perhaps something like abs(value)?

EDIT:

@Jaypal, yes, I've tried searching too and couldn't find anything simple :-( It seems a manual check on the difference, such as d = $i - $j; if (d < 0) d = -d, is the way to go. I guess you could also use sed to remove any minus signs:

awk '{sum=0; for (x=1;x<=NF;x++) sum += $x; for (i=1;i<=NF;i++) for (j=2;j<=NF;j++) if (i < j) printf ("%.1f ", ($i-$j)/sum)} {print ""}' file.txt | sed "s%-%%g"
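
Another option, sketched here under a few assumptions, is a small user-defined function, since awk has no built-in abs(). The wrapper name pairwise, the abs helper, the %.4f format, and the choice to skip rows with fewer than two columns or a zero sum are all illustrative choices of mine, not something taken from the question:

```shell
# Hypothetical wrapper: prints |x_i - x_j| / sum for every column pair i < j.
pairwise() {
    awk '
    # awk has no built-in abs(), so define one
    function abs(x) { return x < 0 ? -x : x }

    NF > 1 {
        sum = 0
        for (i = 1; i <= NF; i++)
            sum += $i
        if (sum == 0)          # assumption: skip rows that would divide by zero
            next
        sep = ""
        for (i = 1; i < NF; i++)
            for (j = i + 1; j <= NF; j++) {
                printf "%s%.4f", sep, abs($i - $j) / sum
                sep = " "      # separator only between values, no trailing space
            }
        print ""
    }' "$@"
}
```

Called as pairwise file-1 file-2 file-3 (or with input on stdin), it should print one line of ratios per qualifying row, whether or not the values are in ascending order.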

Cheers!

Upvotes: 1

shadyabhi

Reputation: 17234

As this looks like homework, I will only give hints.

To count the numbers in a single-row file, you can use:

wc -w < filename

Find the first_number with:

cut -d " " -f 1 filename

To find the sum of a single-row file:

tr " " "+" < filename | bc

Now that you have total_nos, use something like:

for i in $(seq 1 1 "$total_nos")
do
    # Find the numerator: first_number minus the number in column $i
    # Use the sum you got from above to get the desired value.
done

Upvotes: 0