user2352938

Reputation: 13

How to use awk/sed to reformat this text

I have a text file like this:

1
question1
answer1
2
question2
answer2
3
question3
answer3
<etc>

I've read (and unsuccessfully tried) many different ways using awk. I'm not a programmer, so awk is difficult to understand.

There are so many of you who are experts in awk, so I'm hoping that you will show me the correct command to do this.

Thank you for your help!

I would like to eliminate the numbered lines (1st, 4th, 7th, etc.) then put a comma between the questions and answers so the resulting text file looks like:

question1, answer1
question2, answer2
question3, answer3
<etc>

Upvotes: 0

Views: 222

Answers (4)

potong

Reputation: 58381

This might work for you (GNU sed):

sed -n 'n;N;s/\n/, /p' file

By default, sed prints every line it processes. The -n option switches this off, so sed prints only when explicitly told to.

The n command normally prints the current line and replaces it with the next one, but because -n suppresses automatic printing, the current line is simply discarded.

The N command appends the next line to the current line. Because sed strips the trailing newline from each line it reads, N first appends a newline (\n) to the current line and then appends the next line.

The s/\n/, /p command replaces this newline with a comma followed by a space. The p flag at the end of the substitution prints the current line if the substitution succeeded. Since N has just put a newline into the current line, the substitution always succeeds.

To summarise, the commands delete the first line, join the second and third with a newline, replace that newline with a comma followed by a space, and print the result. This repeats for every group of three lines.
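The steps above can be watched one at a time. This is only a sketch: the temporary file and the variable names (sample, joined, result) are made up for illustration, and the l command is used purely to make the embedded newline visible.

```shell
# Build a small sample file in the same shape as the input in the question.
sample=$(mktemp)
printf '%s\n' 1 question1 answer1 2 question2 answer2 > "$sample"

# Stop after n;N and dump the pattern space with l, which shows the
# embedded newline as \n and marks the end of the buffer with $.
joined=$(sed -n 'n;N;l' "$sample")
printf '%s\n' "$joined"

# Full command: replace that newline with ", " and print the result.
result=$(sed -n 'n;N;s/\n/, /p' "$sample")
printf '%s\n' "$result"

rm -f "$sample"
```

The first printf should show each question/answer pair on one line with a literal \n between them; the second shows the same pairs with ", " in its place.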

A few alternatives:

sed 'N;s/.*\n//;N;s/\n/, /' file

sed 'N;N;s/.*\n\(.*\)\n/\1, /' file

sed -En 'n;N;G;s/(.)(.*)\1$/, \2/p' file

The last solution is similar to the first but never refers to a newline directly.
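A rough illustration of why that last variant works, assuming GNU sed (for -E with back-references) and an empty hold space. G appends a newline plus the hold space, so the pattern space now ends in a newline; the back-reference \1 has to match that final character, which forces (.) to capture the newline between question and answer. The p flag is what makes anything appear under -n. The file and variable names are made up for the example.

```shell
sample=$(mktemp)
printf '%s\n' 1 question1 answer1 > "$sample"

# After n;N;G the pattern space is: question, newline, answer, newline.
with_g=$(sed -n 'n;N;G;l' "$sample")
printf '%s\n' "$with_g"

# (.) can only be the first newline, because \1 must equal the last
# character of the pattern space, which is the newline added by G.
out=$(sed -En 'n;N;G;s/(.)(.*)\1$/, \2/p' "$sample")
printf '%s\n' "$out"

rm -f "$sample"
```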

Upvotes: 1

Cyrus

Reputation: 88583

With GNU sed: delete every third line starting from the first line (1~3d), append the next line of input to sed's pattern space (N), and replace the newline now contained in the pattern space with a comma and a space (s/\n/, /).

sed '1~3d; N; s/\n/, /' file

Output:

question1, answer1
question2, answer2
question3, answer3

Upvotes: 2

Ed Morton

Reputation: 203324

$ awk '{n=NR%3} n!=1{printf "%s%s", $0, (n?", ":ORS)}' file
question1, answer1
question2, answer2
question3, answer3
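For readers new to awk, here is the same one-liner spread over several lines with comments. It is meant as a reading aid rather than a different solution; the sample file and the awk_out variable are made up for the example.

```shell
sample=$(mktemp)
printf '%s\n' 1 question1 answer1 2 question2 answer2 > "$sample"

awk_out=$(awk '
    { n = NR % 3 }      # 1 = number line, 2 = question line, 0 = answer line
    n != 1 {            # number lines are skipped entirely
        # a question is followed by ", ", an answer by ORS (a newline)
        printf "%s%s", $0, (n ? ", " : ORS)
    }
' "$sample")
printf '%s\n' "$awk_out"

rm -f "$sample"
```

Because it relies only on the line position (NR), this works even when a question or answer happens to consist of digits, but it does assume every record is exactly three lines long.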

Upvotes: 3

mevets

Reputation: 10451

This seems to work:

awk '{
    if ($0 ~ /^[0-9]+$/) {
        # skip lines that consist only of digits
        next
    } else if (x == "") {
        # save the question until the answer line is available
        x = $0
    } else {
        # print the question and the answer, separated by ", "
        print x ", " $0
        # reset the saved question
        x = ""
    }
}' file

It is not the most elegant awk, as it is more procedural than the pattern -> action style that awk is designed for.
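For comparison, one way the same logic might look in pattern -> action form. This is only a sketch: the variable names q and have_q and the sample file are made up, and it keeps the assumption that number lines are the only all-digit lines.

```shell
sample=$(mktemp)
printf '%s\n' 1 question1 answer1 2 question2 answer2 > "$sample"

pa_out=$(awk '
    /^[0-9]+$/ { next }                         # skip all-digit lines
    !have_q    { q = $0; have_q = 1; next }     # first remaining line: the question
               { print q ", " $0; have_q = 0 }  # next line: the answer, print the pair
' "$sample")
printf '%s\n' "$pa_out"

rm -f "$sample"
```

Like the procedural version, this would also skip an answer that is purely numeric, so it suits input where only the counter lines are all digits.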

Upvotes: 0
