JavaRed

Reputation: 708

Reformat a long file using awk/sed

I have a very long file. The contents of the file are like:

myserver1
kernel_version
os

myserver2
kernel_version
os

myserver3
kernel_version
os
...

The file has more than 10,000 entries, and each host takes up three lines: hostname, kernel version, and OS version.

I would like to have an output like:

myserver1, kernel_version, os
myserver2, kernel_version, os
myserver3, kernel_version, os
...

instead. So what is the best awk/sed command to provide this output?

Upvotes: 2

Views: 62

Answers (4)

bR3nD4n

Reputation: 121

While awk or sed could perform this task, I think a better way is to use Python, assuming it is installed on the *NIX system you are working on.

You could use the following Python script to process the data quite easily:

import csv

column_num = 3  # number of columns in your end-state data

with open("/path/to/your/input/file", "r") as infile, \
     open("/path/to/output/file", "w", newline="") as outfile:
  output_writer = csv.writer(outfile)
  row = []
  for line in infile:
    stripped = line.strip()  # remove the newline (\n)
    if not stripped:
      continue  # skip the blank lines between hosts
    row.append(stripped)
    if len(row) == column_num:
      output_writer.writerow(row)  # output the list as a CSV row
      row = []  # start collecting the next row
  if row:
    output_writer.writerow(row)  # write an incomplete trailing group, if any

Upvotes: 0

Benjamin W.

Reputation: 52441

With sed:

$ sed '/^$/d;N;N;s/\n/, /g' infile
myserver1, kernel_version, os
myserver2, kernel_version, os
myserver3, kernel_version, os

This works as follows:

/^$/d       # Delete line if empty (skips rest of commands)
N           # Append second line to pattern space
N           # Append third line to pattern space
s/\n/, /g   # Replace newlines by comma and a blank
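To see what the two N commands leave in the pattern space before the substitution runs, here is a small sketch using sed's l command, which prints the pattern space with embedded newlines shown as \n and the end marked by $. The one-host input is a made-up sample in the same layout as the question, and the variable names are only for illustration.

```shell
# Hypothetical one-host sample: three lines followed by a blank line.
# -n suppresses automatic printing, so only the output of l is shown.
pattern_space=$(printf 'myserver1\nkernel_version\nos\n\n' | sed -n '/^$/d;N;N;l')
printf '%s\n' "$pattern_space"

# The full command then replaces each embedded newline with ", ".
joined=$(printf 'myserver1\nkernel_version\nos\n\n' | sed '/^$/d;N;N;s/\n/, /g')
printf '%s\n' "$joined"
```

The first command should show the three lines held together as one string, and the second shows that string after the substitution.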

If you want to skip lines by line number (4, 8, 12, ...) rather than by whether they are empty, you can replace the first command (this is a GNU extension):

sed '4~4d;N;N;s/\n/, /g' infile

Upvotes: 3

Bertrand Martel

Reputation: 45433

You can use:

awk 'BEGIN{RS="";OFS=", "} {print $1,$2,$3}' data.txt

This sets the record separator (RS) to the empty string, so that awk treats each block delimited by blank lines as one record, and sets the output field separator (OFS) to ", ".

You can also use:

awk 'BEGIN{RS="";OFS=", "} {$1=$1; print $0}' data.txt

$1=$1 forces awk to rebuild the record, so the fields are rejoined with OFS.
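One caveat with RS="": with the default field separator, awk still splits fields on spaces as well as newlines, so a value that contains spaces (an OS name, for example) would be broken into several fields. A possible variant, sketched here with made-up sample values, sets FS to a newline so that each line stays a single field:

```shell
# Hypothetical sample where the kernel and OS values are invented
# and the OS name contains spaces.
awk_result=$(printf 'myserver1\n4.18.0\nRed Hat Enterprise Linux 8\n\nmyserver2\n5.4.0\nUbuntu 20.04\n' |
  awk 'BEGIN{RS=""; FS="\n"; OFS=", "} {$1=$1; print}')
printf '%s\n' "$awk_result"
```

With FS="\n" the number of lines per host does not matter either, since $1=$1 rejoins however many fields the record has.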

Upvotes: 1

MauricioRobayo

Reputation: 2356

You can also use paste:

paste -d ',,\0' - - - - <file
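Note that the -d list is made of single characters (here a comma, a comma, and \0, which stands for the empty string), so this prints "myserver1,kernel_version,os" without a space after each comma. If the space matters, one option is to post-process with sed. This is only a sketch on a made-up two-host sample, and it assumes the values themselves never contain commas:

```shell
# Hypothetical two-host sample in the same layout as the question.
# paste joins every four lines; sed then adds a space after each comma.
paste_result=$(printf 'myserver1\nkernel_version\nos\n\nmyserver2\nkernel_version\nos\n' |
  paste -d ',,\0' - - - - | sed 's/,/, /g')
printf '%s\n' "$paste_result"
```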

Upvotes: 2
