Reputation: 4004

bash command to print column at specific range of line numbers

I'm trying to get the values in column X at lines 5 to 5 + Y. I'm guessing there's a quick way to do this with awk. How is this done?

Upvotes: 1

Answers (3)

Robin A. Meade

Reputation: 2474

Use awk to print column 2 of lines 5 to 10:

awk 'NR==5,NR==10 {print $2}' <file                           # white space delim. columns
awk 'NR==5,NR==10 {print $2}; NR==10 {exit}' <file            # optimized
awk -F: 'NR==5,NR==10 {print $2}; NR==10 {exit}' </etc/passwd # colon delimited columns

The optimization is that it exits after the last line of the desired range has been printed.

A range pattern was used:

A range pattern is made of two patterns separated by a comma, in the form ‘begpat, endpat’. It is used to match ranges of consecutive input records.
https://www.gnu.org/software/gawk/manual/html_node/Ranges.html

A pattern can be either a regexp pattern or an expression pattern. The commands above use expression patterns that compare NR, the current record (line) number, against the first and last line of the range.

I assumed whitespace-delimited columns, but included an example that specifies a different delimiter with the -F option.
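
To see the range pattern and the early exit in action, here is a minimal sketch run against a throwaway file. The file contents and the variable names sample and result are made up for illustration; only the awk program comes from the commands above.

```shell
# Build a small sample file: 12 lines, two whitespace-separated columns.
sample=$(mktemp)
for i in 1 2 3 4 5 6 7 8 9 10 11 12; do
  printf 'row%s value%s\n' "$i" "$i"
done > "$sample"

# Range pattern: print column 2 for lines 5 through 10, then stop reading.
result=$(awk 'NR==5,NR==10 {print $2}; NR==10 {exit}' "$sample")
printf '%s\n' "$result"   # value5 through value10, one per line

rm -f "$sample"
```

Lines 11 and 12 are never read, because the second rule exits as soon as line 10 has been printed.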

Upvotes: 0

tripleee

Reputation: 189377

If by "column" you mean you have a file with, say, comma-delimited fields and you want to extract a particular field, the accepted answer does that nicely. To recap,

awk -F , 'NR==5 { print $6 }' file

to extract the sixth field from line number 5 in a comma-separated file. If your delimiter is not comma, pass something else as the argument to the -F option. (With GNU Awk you can pass a regex to -F to specify fairly complex column delimiters, but if you need that, go find a more specific question about that particular scenario.)

If by "column" you mean a fixed character position within a line, the substr function does that.

awk 'NR == 5 { print substr($0, 6) }' file

prints the sixth column and everything after it. If you want to restrict to a fixed width,

awk 'NR == 5 { print substr($0, 6, 7) }' file

prints seven characters starting at offset 6 (Awk indexing starts at 1, so offset 1 is the first character on the line) on line 5. If you don't know exactly how many characters to extract, but you want a number, Awk conveniently allows you to extract the number from the start of a string:

awk 'NR == 5 { print 0 + substr($0, 6, 7) }' file

will extract the same 7 characters but then coerce the result to a number, effectively trimming any non-numeric suffix, and print that.
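
As a quick illustration of the difference between the plain substr call and the coerced one, here is a small sketch. The input text and the variable names input, raw and num are invented for the example.

```shell
# Made-up input: on line 5, a number followed by letters starts at character 6.
input=$(printf '%s\n' 'one' 'two' 'three' 'four' 'item 123abc tail')

# Seven characters starting at position 6, as text.
raw=$(printf '%s\n' "$input" | awk 'NR == 5 { print substr($0, 6, 7) }')

# The same seven characters, coerced to a number by adding 0.
num=$(printf '%s\n' "$input" | awk 'NR == 5 { print 0 + substr($0, 6, 7) }')

printf '[%s]\n' "$raw"   # [123abc ]
printf '[%s]\n' "$num"   # [123]
```

The brackets are only there to make the trailing space in the uncoerced result visible.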

In the most general case, you might want to perform further splitting on the value you have extracted.

awk 'NR == 5 { split(substr($0, 6), a, /:/); print a[1] }' file

will split the extracted substring on the regex /:/ (in this trivial case, the regex simply matches a literal colon character) into the array a. We then print the first element of a, meaning we ditch everything starting from the first colon in the substring which starts at index 6 and extends through to the end of the line on line number 5.
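
A small sketch of that split on invented data may help. The input line and the variable names input and first are assumptions made for the example.

```shell
# Made-up input: on line 5, the interesting text starts at character 6
# and is followed by colon-separated fields.
input=$(printf '%s\n' 'one' 'two' 'three' 'four' 'user=alice:1001:/home/alice')

# Take everything from character 6 on, split it on colons, keep the first piece.
first=$(printf '%s\n' "$input" |
  awk 'NR == 5 { split(substr($0, 6), a, /:/); print a[1] }')

printf '%s\n' "$first"   # alice
```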

(To spare you from having to look it up, $0 is the entire current input line. Awk processes a file line by line, running the body of the script on each line in turn. If you need to expose shell variables to Awk, awk -v awkvariable="$shellvariable" does that.)

Upvotes: 1

Steve

Reputation: 54392

I think this will work for you, untested:

awk 'NR >= 5 && NR <= 5 + Y { print $X }' file.txt

Obviously, replace X and Y with real values.

EDIT:

If X and Y are shell variables:

awk -v column="$X" -v range="$Y" 'NR >= 5 && NR <= 5 + range { print $column }' file.txt
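
If you use this often, it could be wrapped in a shell function. The following is only a sketch: the function name print_column_range, its argument order, and the sample data are all hypothetical, and the early exit is borrowed from the other answer rather than part of the command above.

```shell
# Hypothetical helper: print field $1 from lines $2 through $2 + $3 of file $4.
print_column_range() {
  awk -v column="$1" -v first="$2" -v range="$3" '
    NR >= first && NR <= first + range { print $column }
    NR >= first + range { exit }   # stop reading after the last wanted line
  ' "$4"
}

# Usage with made-up data: ten lines of three fields each.
data=$(mktemp)
for i in 1 2 3 4 5 6 7 8 9 10; do
  printf 'a%s b%s c%s\n' "$i" "$i" "$i"
done > "$data"

X=2 Y=3
out=$(print_column_range "$X" 5 "$Y" "$data")
printf '%s\n' "$out"   # b5, b6, b7, b8 (one per line)

rm -f "$data"
```

Note that "lines 5 to 5 + Y" is inclusive at both ends, so Y=3 prints four lines.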

Upvotes: 2
