How to check successive values in a CSV file by awk?

Question

In a two-column CSV, I want to check the successive numbers (the values are integers):

1,1
4,1
5,1
6,1
7,1
10,1
11,1
4,2
5,2
6,2
10,2
5,3
5,7
5,8
5,9

I tried

awk -F, '
{
    if($1==a+1) { n+=1 } 
    else { n = 1 } 
{a=$1}
} {print $1"," n}
' file.txt

to produce

1,1,1
4,1,1
5,1,2
6,1,3
7,1,4
10,1,1
11,1,2
4,2,1
5,2,2
6,2,3
10,2,1
5,3,1
5,7,1
5,8,1
5,9,1

where the third column counts the successive rows with respect to the first column.

Is my approach is correct (particularly performance-wise because it is a huge file)?
What is the best way to calculate the fourth column based on the second column (counting the successive values in the second column)?

This is the expected output including the fourth column.

1,1,1,1
4,1,1,1
5,1,2,1
6,1,3,1
7,1,4,1
10,1,1,1
11,1,2,1
4,2,1,1
5,2,2,1
6,2,3,1
10,2,1,1
5,3,1,1 
5,7,1,1
5,8,1,2
5,9,1,3

My idea is to sort the produced 3-column CSV (sort -t, -k1,1n -k2,2n) and do the same thing with the second column, but I am not sure.

EDIT: I made a mistake in the first dataset. When we count the first column, the second column is constant, and vice versa.

anubhava · Accepted Answer

This awk should work for you:

awk 'BEGIN {
   FS=OFS=","
}
{
    $3 = ($1 == p1+1 ? $3+1 : 1)
    $4 = ($1 == p1 && $2 == p2+1 ? $4+1 : 1)
}
{
   p1 = $1
   p2 = $2
} 1' file.csv

1,1,1,1
4,1,1,1
5,1,2,1
6,1,3,1
7,1,4,1
10,1,1,1
11,1,2,1
4,2,1,1
5,2,2,1
6,2,3,1
10,2,1,1
5,3,1,1
5,7,1,1
5,8,1,2
5,9,1,3

Answers (2)