Carl Gustav

Reputation: 9

How do I extract the second element from rows in a list, and store the elements in a comma separated line in a new file using BASH?

I have rows and columns preceded by an introduction that I do not need. There are no headers for the columns. The data in my current file looks something like this (IP Addresses are fake):

This is a totally extraneous introduction and does not have anything to do with the data. It is here as a facsimile of what the output file looks like.

df    bank.com 10.10.10.1
sdfdg store.com 10.10.10.2
s     church.com 10.10.10.3

I need to skip past the introduction, extract field two from each row, and write the values as a single comma-separated line to a new .txt file, like this (the strings do not need quotes):

bank.com,store.com,church.com

Any advice on how to do this in Bash?

I tried the technique from the following question, but it only grabbed the first line of the introduction and did not go through each row:

Turning multi-line string into single comma-separated

Upvotes: 0

Views: 2433

Answers (2)

markp-fuso

Reputation: 34084

Assumptions:

  • always skip the first 2 lines of the input file
  • the second field contains no white space
  • there are no blank lines

Sample data file:

$ cat input.dat
This is a totally extraneous introduction and does not have anything to do with the data. It is here as a facsimile of what the output file looks like.

df    bank.com 10.10.10.1
sdfdg store.com 10.10.10.2
s     church.com 10.10.10.

One awk solution:

$ awk 'FNR>2 {printf "%s%s", pfx, $2; pfx=","} END {printf "\n"}' input.dat
bank.com,store.com,church.com

Explanation:

  • FNR>2 - for record (row) numbers greater than 2 ...
  • printf "%s%s", pfx, $2 - print our prefix (initially blank) plus field #2; because there is no \n in the format the cursor is left on the current line
  • pfx="," - set prefix to a comma (,) for the rest of the file
  • END {printf "\n"} - add a \n to the end of the line
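Since the question asks for the result in a new file, here is a minimal end-to-end sketch under the same assumptions as above. The file names input.dat and output.txt are placeholders, and the here-document only recreates the sample data (with the third address taken from the question) so the snippet can be run on its own:

```shell
# Recreate the sample input; the names input.dat and output.txt are placeholders.
cat > input.dat <<'EOF'
This is a totally extraneous introduction and does not have anything to do with the data. It is here as a facsimile of what the output file looks like.

df    bank.com 10.10.10.1
sdfdg store.com 10.10.10.2
s     church.com 10.10.10.3
EOF

# Skip the first 2 lines, join field 2 of every remaining row with commas,
# and redirect the single output line into a new file.
awk 'FNR>2 {printf "%s%s", pfx, $2; pfx=","} END {printf "\n"}' input.dat > output.txt

# Show the result.
cat output.txt
```

If the introduction in the real file is longer than the sample, the 2 in FNR>2 would need to be adjusted to match.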

Upvotes: 1

Eric Miller

Reputation: 56

This should work:

tail -n +3 filename | awk '{print $2}' | sed -z 's/\n/,/g;s/,$/\n/' > newfile.txt

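tail -n +3 starts its output at line 3, so it skips the first 2 lines of the file (change the number to however many lines the intro takes up, plus one).

awk '{print $2}' prints only the second column, the one you're interested in.

The sed command replaces every newline with a comma, then turns the trailing comma back into a newline. The -z option is specific to GNU sed: it separates records on NUL characters instead of newlines, so the whole input is handled as one string and \n can be matched.

The final > newfile.txt redirects the output into a new file.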
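If GNU sed is not available, one possible variant (a swap on my part, not part of the pipeline above) is to let paste do the joining instead. This is only a sketch: input.dat is a placeholder name, and the here-document just recreates the sample data from the question so the snippet runs on its own:

```shell
# Recreate the sample input; input.dat is a placeholder name.
cat > input.dat <<'EOF'
This is a totally extraneous introduction and does not have anything to do with the data. It is here as a facsimile of what the output file looks like.

df    bank.com 10.10.10.1
sdfdg store.com 10.10.10.2
s     church.com 10.10.10.3
EOF

# tail:  start at line 3, skipping the 2-line introduction
# awk:   keep only the second whitespace-separated field
# paste: -s joins all input lines into one, -d, uses a comma as the separator
tail -n +3 input.dat | awk '{print $2}' | paste -sd, - > newfile.txt

# Show the result.
cat newfile.txt
```

paste does not leave a trailing comma, so the second sed substitution is not needed in this variant.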

Upvotes: 0
