Caterina

Reputation: 987

How to keep newline(s) when selecting a given column with awk

Suppose I have a file like this (disclaimer: the size is not fixed; I can have more than 7 rows and more than 4 columns):

R H A 23
S E A 45
T E A 34
U   A 35
Y T A 35
O E A 353
J G B 23

I want the output to contain the second column of every row whose third column is A, while keeping a newline or whitespace character where the second column is empty.

output should be:

HEE TE

I tried this:

awk '{if ($3=="A") print $2}' file | awk 'BEGIN{ORS = ""}{print $1}'

But this gives:

HEETE%

Which has a weird % and is missing the space.

Upvotes: 2

Views: 116

Answers (1)

anubhava

Reputation: 785631

You may use this GNU awk solution, which relies on FIELDWIDTHS:

awk 'BEGIN{ FIELDWIDTHS = "1 1 1 1 1 1 *" } $5 == "A" {s = s $3}
END {print s}' file

HEE TE

awk splits each record using the width values provided in the FIELDWIDTHS variable.

1 1 1 1 1 1 * means that each of the first 6 fields is a single character wide and the remaining text goes into the 7th field. Since there is a space after each value, $2, $4 and $6 each hold a single space, while $1, $3 and $5 hold the values from the input.

$5 == "A" {s = s $3}: this checks whether $5 is A and, if so, appends the value of $3 to the variable s. The END block then simply prints s.

Without fixed-width parsing, awk would treat the A in the 4th row as $2, because default field splitting collapses the run of spaces left by the empty second column.
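The effect of default splitting on the 4th row can be checked with a small sketch like the one below. It works in any awk; the variable names row and out are only illustrative and not part of the original answer.

```shell
# Fourth input row from the question: the second column is blank.
row='U   A 35'

# Default splitting treats the run of spaces as one separator,
# so only three fields remain and A shifts into $2.
out=$(printf '%s\n' "$row" | awk '{print NF, $2}')
echo "$out"
```

This prints 3 A, which shows that the blank column is lost before any condition on $3 could be tested.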


Alternatively, if we let the trailing space be part of each column value, use:

awk '
   BEGIN{ FIELDWIDTHS = "2 2 2 *" }
   $3 == "A " {s = s substr($2,1,1)}
   END {print s}
' file
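If GNU awk is not available, the same fixed-position idea can be sketched with substr() on the whole record, which should work in any POSIX awk. This is a different technique from FIELDWIDTHS, not the method given above, and it assumes single-character columns separated by exactly one space, as in the sample. The temporary file and the variable names tmp and result are only for illustration.

```shell
# Recreate the sample input from the question in a temporary file.
tmp=$(mktemp)
printf '%s\n' 'R H A 23' 'S E A 45' 'T E A 34' 'U   A 35' \
              'Y T A 35' 'O E A 353' 'J G B 23' > "$tmp"

# Character 5 holds the third column and character 3 the second column,
# so test one and append the other, keeping a blank as a space.
result=$(awk 'substr($0, 5, 1) == "A" { s = s substr($0, 3, 1) }
              END { print s }' "$tmp")
echo "$result"

rm -f "$tmp"
```

For the sample input this prints HEE TE. With wider columns the character offsets would have to be adjusted by hand.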

Upvotes: 3
