Reputation: 724

Duplicate Lines 2 times and transpose from row to column

I would like to duplicate each line twice and print the values of columns 5 and 6 separately, one per output line (that is, transpose the values of columns 5 and 6 from a row into a column).

That is, the value of column 5 goes on the first line and the value of column 6 on the second.

Input File

08,1218864123180000,3201338573,VV,22,27
08,1218864264864000,3243738789,VV,15,23
08,1218864278580000,3244738513,VV,3,13
08,1218864310380000,3243938789,VV,15,23
08,1218864324180000,3244538513,VV,3,13
08,1218864334380000,3200538561,VV,22,27

Desired Output

08,1218864123180000,3201338573,VV,22
08,1218864123180000,3201338573,VV,27
08,1218864264864000,3243738789,VV,15
08,1218864264864000,3243738789,VV,23
08,1218864278580000,3244738513,VV,3
08,1218864278580000,3244738513,VV,13
08,1218864310380000,3243938789,VV,15
08,1218864310380000,3243938789,VV,23
08,1218864324180000,3244538513,VV,3
08,1218864324180000,3244538513,VV,13
08,1218864334380000,3200538561,VV,22
08,1218864334380000,3200538561,VV,27

I use this code to duplicate each line twice, but I can't figure out how to handle the values of columns 5 and 6:

awk '{print;print}' file

Thanks in advance

Upvotes: 0

Answers (4)

potong

Reputation: 58578

This might work for you (GNU awk):

awk '{print gensub(/((.*,).*),/,"\\1\n\\2",1)}' file

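This replaces the last comma with a newline followed by a copy of the preceding fields minus the penultimate one, so the final field ends up on its own line after the shared prefix.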
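
Since gensub is a GNU awk extension, here is a rough sketch of the same idea using only match and substr, which should be available in any POSIX awk. It assumes every line contains at least two commas, and the variable names head and last are only illustrative:

```shell
printf '%s\n' '08,1218864123180000,3201338573,VV,22,27' |
awk '{
    match($0, /,[^,]*$/)                    # find the last comma and the final field
    head = substr($0, 1, RSTART - 1)        # everything before the last comma
    last = substr($0, RSTART + 1)           # the final field on its own
    print head                              # first output line: fields 1 to 5
    match(head, /[^,]*$/)                   # find where the penultimate field starts
    print substr(head, 1, RSTART - 1) last  # shared prefix plus the final field
}'
```

This avoids putting field data into a sub() replacement string, where a literal & would be treated specially.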

Upvotes: 1

Wintermute

Reputation: 44073

In this simple case where the last field has to be removed and placed on the last line, you can do

awk -F , -v OFS=, '{ x = $6; NF = 5; print; $5 = x; print }' file

Here -F , and -v OFS=, will set the input and output field separators to a comma, respectively, and the code does

{
  x = $6    # remember sixth field
  NF = 5    # set the number of fields to 5, so the sixth one won't be printed
  print     # print those first five fields
  $5 = x    # replace value of fifth field with remembered value of sixth
  print     # print modified line
}

This approach can be extended to handle fields in the middle with a function like the one in the accepted answer of this question.
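
As a minimal sketch of what such an extension could look like (this is not the function from the linked answer; without() is a hypothetical helper written for this example), one option is to build each output line from the fields directly and leave NF alone. The sample record a,1,2,z is made up, with the two values to split sitting in fields 2 and 3:

```shell
printf '%s\n' 'a,1,2,z' |
awk -F , -v OFS=, '
# without() is a hypothetical helper, not the function from the linked answer:
# it joins every field except field n with OFS and leaves $0 and NF untouched.
function without(n,    i, out, sep) {
    for (i = 1; i <= NF; i++)
        if (i != n) { out = out sep $i; sep = OFS }
    return out
}
{
    print without(3)    # line carrying field 2
    print without(2)    # line carrying field 3
}'
```

For the input in the question, the same helper would be called as without(6) followed by without(5).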

EDIT: As Ed notes in the comments, writing to NF is not explicitly defined to trigger a rebuild of $0 (the whole-line record that print prints) in the POSIX standard. The above code works with GNU awk and mawk, but with BSD awk (as found on *BSD and probably Mac OS X) it fails to do anything.

So to be standards-compliant, we have to be a little more explicit and force awk to rebuild $0 from the modified field state. This can be done by assigning to any of the field variables $1...$NF, and it's common to use $1=$1 when this problem pops up in other contexts (for example: when only the field separator needs to be changed but not any of the data):

awk -F , -v OFS=, '{ x = $6; NF = 5; $1 = $1; print; $5 = x; print }' file

I've tested this with GNU awk, mawk and BSD awk (which are all the awks I can lay my hands on), and I believe this to be covered by the awk bit in POSIX where it says "setting any other field causes the re-evaluation of $0" right at the top. Mind you, the spec could be more explicit on this point, and I'd be interested to test if more exotic awks behave the same way.
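
To illustrate the other context mentioned above, where only the field separator needs to change, here is a small sketch of the $1 = $1 idiom on one line of the sample input. The semicolon output separator is just an arbitrary choice for the demonstration:

```shell
# Assigning $1 to itself changes no data, but it makes awk rebuild $0 with OFS
printf '%s\n' '08,1218864123180000,3201338573,VV,22,27' |
awk -F , -v OFS=';' '{ $1 = $1; print }'
```

Without the $1 = $1 assignment, print would emit the line unchanged, commas included, because $0 is only re-joined with OFS once a field has been assigned.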

Upvotes: 2

Ed Morton

Reputation: 204731

To repeatedly print the start of a line for each of the last N fields where N is 2 in this case:

$ awk -v n=2 '
    BEGIN { FS=OFS="," }
    {
        base = $0
        sub("("FS"[^"FS"]+){"n"}$","",base)
        for (i=NF-n+1; i<=NF; i++) {
            print base, $i
        }
    }
' file
08,1218864123180000,3201338573,VV,22
08,1218864123180000,3201338573,VV,27
08,1218864264864000,3243738789,VV,15
08,1218864264864000,3243738789,VV,23
08,1218864278580000,3244738513,VV,3
08,1218864278580000,3244738513,VV,13
08,1218864310380000,3243938789,VV,15
08,1218864310380000,3243938789,VV,23
08,1218864324180000,3244538513,VV,3
08,1218864324180000,3244538513,VV,13
08,1218864334380000,3200538561,VV,22
08,1218864334380000,3200538561,VV,27
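
The sub() above relies on a regex interval expression ({n}), which some older awk builds may not support. A rough equivalent that builds the shared prefix with a loop instead, assuming every line has more than n fields, could look like this:

```shell
printf '%s\n' '08,1218864123180000,3201338573,VV,22,27' |
awk -v n=2 '
    BEGIN { FS = OFS = "," }
    {
        base = $1                              # start the shared prefix with field 1
        for (i = 2; i <= NF - n; i++)
            base = base OFS $i                 # append fields 2 .. NF-n
        for (i = NF - n + 1; i <= NF; i++)
            print base, $i                     # one output line per trailing field
    }
'
```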

Upvotes: 2

RavinderSingh13

Reputation: 133780

Could you please try the following? It assumes your Input_file always has the format shown, and that you want the first four fields printed every time, followed by each of the remaining fields in turn (one per output line).

awk 'BEGIN{FS=OFS=","}{for(i=5;i<=NF;i++){print $1,$2,$3,$4,$i}}'  Input_file

Upvotes: 1
