Suresh Rajasekaran

Reputation: 19

Filter text file using awk or sed or cut?

I am trying to join each pair of consecutive lines in this file with a colon.

$ cat test.txt  
server1
ec2-xx.xx.xx.xxus-west-2.compute.amazonaws.com
Search
ec2-xx-xx-xx-xx.us-west-2.compute.amazonaws.com
Web
ec2-xx-xx-xx-xx.us-west-2.compute.amazonaws.com
Web
ec2-xx-xx-xx-xx.us-west-2.compute.amazonaws.com
server2
ec2-xx.xx-xx-xx.us-west-2.compute.amazonaws.com
loaddb
ec2-xx-xx-xx-xx.us-west-2.compute.amazonaws.com
ec2dd
ec2-xx-xx-xx-xx.us-west-2.compute.amazonaws.com
server2
ec2-xx.xx-xx-xx.us-west-2.compute.amazonaws.com
loaddb
ec2-xx-xx-xx-xx.us-west-2.compute.amazonaws.com

I need output like this:

$ cat test.txt
server1:ec2-xx.xx.xx.xxus-west-2.compute.amazonaws.com  
Search:ec2-xx.xx.xx.xxus-west-2.compute.amazonaws.com  
Web:ec2-xx.xx.xx.xxus-west-2.compute.amazonaws.com  
Web:ec2-xx.xx.xx.xxus-west-2.compute.amazonaws.com  
server2:ec2-xx.xx.xx.xxus-west-2.compute.amazonaws.com  
loaddb:ec2-xx.xx.xx.xxus-west-2.compute.amazonaws.com  
ec2dd:ec2-xx.xx.xx.xxus-west-2.compute.amazonaws.com  
server2:ec2-xx.xx.xx.xxus-west-2.compute.amazonaws.com  
loaddb:ec2-xx.xx.xx.xxus-west-2.compute.amazonaws.com

Upvotes: 1

Views: 624

Answers (3)

John1024

Reputation: 113844

Using sed:

$ sed 'N;s/\n/:/' test.txt
server1:ec2-xx.xx.xx.xxus-west-2.compute.amazonaws.com
Search:ec2-xx-xx-xx-xx.us-west-2.compute.amazonaws.com
Web:ec2-xx-xx-xx-xx.us-west-2.compute.amazonaws.com
Web:ec2-xx-xx-xx-xx.us-west-2.compute.amazonaws.com
server2:ec2-xx.xx-xx-xx.us-west-2.compute.amazonaws.com
loaddb:ec2-xx-xx-xx-xx.us-west-2.compute.amazonaws.com
ec2dd:ec2-xx-xx-xx-xx.us-west-2.compute.amazonaws.com
server2:ec2-xx.xx-xx-xx.us-west-2.compute.amazonaws.com
loaddb:ec2-xx-xx-xx-xx.us-west-2.compute.amazonaws.com 

This approach uses two sed commands:

  • N reads in a second line from the file and appends it to the pattern space. This way, the pattern space always has two consecutive lines in it.

  • s/\n/:/ removes the newline from between the two lines in the pattern space and replaces it with a colon.
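If the file might contain an odd number of lines, a slightly more defensive form of the same idea is `$!N`, which skips `N` on the last line. POSIX leaves a bare `N` on the last line free to quit without printing (GNU sed prints it anyway). The sketch below is only an illustration: the function name `join_pairs_sed` and the sample host names are made up for this example.

```shell
# Hypothetical helper: join each pair of consecutive lines with a colon.
join_pairs_sed() {
    # $!N     : on every line except the last, append the next line
    # s/\n/:/ : replace the embedded newline with a colon
    sed '$!N;s/\n/:/' "$@"
}

# Made-up sample data with an unpaired third line.
printf '%s\n' server1 host-a Search | join_pairs_sed
```

With this input the first two lines come out as `server1:host-a` and the unpaired `Search` is printed unchanged.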

Using awk:

$ awk 'NR%2==1{name=$1;next} {print name ":" $0;}' test.txt
server1:ec2-xx.xx.xx.xxus-west-2.compute.amazonaws.com
Search:ec2-xx-xx-xx-xx.us-west-2.compute.amazonaws.com
Web:ec2-xx-xx-xx-xx.us-west-2.compute.amazonaws.com
Web:ec2-xx-xx-xx-xx.us-west-2.compute.amazonaws.com
server2:ec2-xx.xx-xx-xx.us-west-2.compute.amazonaws.com
loaddb:ec2-xx-xx-xx-xx.us-west-2.compute.amazonaws.com
ec2dd:ec2-xx-xx-xx-xx.us-west-2.compute.amazonaws.com
server2:ec2-xx.xx-xx-xx.us-west-2.compute.amazonaws.com
loaddb:ec2-xx-xx-xx-xx.us-west-2.compute.amazonaws.com 

Notes:

  • NR%2==1{name=$1;next}

    This matches every odd-numbered line and assigns its first field to the variable name. It then skips the remaining commands and starts over with the next line.

  • print name ":" $0

    On even-numbered lines, this prints the name, a colon, and the current line.
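Because the command above stores `$1`, a label containing whitespace would be cut at its first field. As a hedged variation, storing `$0` keeps the whole line. The function name `pair_lines_awk` and the sample data are assumptions for illustration only.

```shell
# Hypothetical helper: same odd/even logic, but keeps the full label line.
pair_lines_awk() {
    awk '
        NR % 2 == 1 { name = $0; next }    # odd line: remember it, print nothing
                    { print name ":" $0 }  # even line: print label, colon, line
    ' "$@"
}

# Made-up sample data; the first label contains a space.
printf '%s\n' 'web tier' host-a db host-b | pair_lines_awk
```

This prints `web tier:host-a` and `db:host-b`; with `name=$1` the first line would start with `web:` instead.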

Using pure shell:

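# Read two lines per iteration: the label, then the hostname.
# IFS= and -r keep leading whitespace and backslashes intact.
while IFS= read -r name && IFS= read -r line
do
        printf "%s:%s\n" "$name" "$line"
done <test.txt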

Here, one line is read from test.txt into the variable name and the next into the variable line. The two are then printed with a colon between them.

Upvotes: 4

Avinash Raj

Reputation: 174706

You could simply use the paste command:

paste -d: - - < file

Using Perl:

perl -pe 's/\n/:/g if $.%2==1' file

$. in Perl is similar to NR in awk: it holds the current input line number. The substitution therefore runs only on odd-numbered lines, replacing the trailing newline of each with a colon.
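To make the paste variant concrete: each `-` stands for standard input, so two of them make paste take two consecutive lines per output row, joined by the `-d` delimiter. A minimal sketch follows; the function name `join_pairs_paste` and the sample data are made up.

```shell
# Hypothetical helper: read standard input twice per output row.
join_pairs_paste() {
    paste -d: - -
}

# Made-up sample data.
printf '%s\n' server1 host-a Search host-b | join_pairs_paste
```

This prints `server1:host-a` and `Search:host-b`. With an odd number of input lines, paste pads the missing field, so the last row ends in a bare colon.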

Upvotes: 2

Jotne

Reputation: 41456

This awk command works, but be careful with getline unless you fully understand it.

awk '{a=$1;getline;print a":"$1}' file
server1:ec2-xx.xx.xx.xxus-west-2.compute.amazonaws.com
Search:ec2-xx-xx-xx-xx.us-west-2.compute.amazonaws.com
Web:ec2-xx-xx-xx-xx.us-west-2.compute.amazonaws.com
Web:ec2-xx-xx-xx-xx.us-west-2.compute.amazonaws.com
server2:ec2-xx.xx-xx-xx.us-west-2.compute.amazonaws.com
loaddb:ec2-xx-xx-xx-xx.us-west-2.compute.amazonaws.com
ec2dd:ec2-xx-xx-xx-xx.us-west-2.compute.amazonaws.com
server2:ec2-xx.xx-xx-xx.us-west-2.compute.amazonaws.com
loaddb:ec2-xx-xx-xx-xx.us-west-2.compute.amazonaws.com
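One concrete getline pitfall, as a hedged illustration: when the input has an odd number of lines, the final getline fails and leaves $0 untouched, so the one-liner above would pair the last line with itself. A minimal sketch that checks the return value of getline (1 on success, 0 at end of input) avoids that. The function name `pair_lines_getline` and the sample data are made up for this example.

```shell
# Hypothetical helper: only print a pair when getline actually read a line.
pair_lines_getline() {
    awk '{
        name = $1
        if ((getline) > 0) {
            print name ":" $1    # a partner line was read into $0
        } else {
            print name           # end of input: print the unpaired line alone
        }
    }' "$@"
}

# Made-up sample data with an unpaired third line.
printf '%s\n' server1 host-a Search | pair_lines_getline
```

This prints `server1:host-a` followed by `Search` on its own line.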

This is a better way to do it:

awk 'ORS=NR%2?":":RS' file
server1:ec2-xx.xx.xx.xxus-west-2.compute.amazonaws.com
Search:ec2-xx-xx-xx-xx.us-west-2.compute.amazonaws.com
Web:ec2-xx-xx-xx-xx.us-west-2.compute.amazonaws.com
Web:ec2-xx-xx-xx-xx.us-west-2.compute.amazonaws.com
server2:ec2-xx.xx-xx-xx.us-west-2.compute.amazonaws.com
loaddb:ec2-xx-xx-xx-xx.us-west-2.compute.amazonaws.com
ec2dd:ec2-xx-xx-xx-xx.us-west-2.compute.amazonaws.com
server2:ec2-xx.xx-xx-xx.us-west-2.compute.amazonaws.com
loaddb:ec2-xx-xx-xx-xx.us-west-2.compute.amazonaws.com
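The one-liner is terse, so here is a spelled-out sketch of what it appears to do: the assignment to ORS doubles as the pattern, its value is always a non-empty string and therefore true, and the default action prints the line followed by the new ORS. The function name `join_pairs_ors` and the sample data are assumptions for illustration.

```shell
# Hypothetical helper: explicit form of  awk 'ORS=NR%2?":":RS'
join_pairs_ors() {
    awk '{
        # Odd line: end the output record with a colon.
        # Even line: end it with the record separator (a newline by default).
        ORS = (NR % 2) ? ":" : RS
        print
    }' "$@"
}

# Made-up sample data.
printf '%s\n' server1 host-a Search host-b | join_pairs_ors
```

This prints `server1:host-a` and `Search:host-b`. With an odd number of lines, the output would end in a colon with no final newline.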

Upvotes: 2
