pixel

Reputation: 10587

awk: how to print from column 4 to the end of the line when the last column contains a sentence

I have a file, fileA.txt, containing lines like this:

Toyota 2017 Corola Good condition
Honda 2013 Civic Flat back right tire
Jeep 2017 Wrangler Roof is leaking and oil needs change

I would like to print everything from column 2 onward, but the last column contains a sentence with spaces. I would like my output to look like this:

2017 Corola Good condition
2013 Civic Flat back right tire
2017 Wrangler Roof is leaking and oil needs change

But I cannot make the last column print the full sentence; I get only its last word. For example:

awk '{print $2,$NF}' fileA.txt 

This outputs

2017 condition
2013 tire
2017 change 

I know why this is happening: awk treats spaces as column separators, so $NF is simply the last whitespace-separated field on each line, which is the last word.

How do I get the desired output using awk?

Upvotes: 1

Views: 684

Answers (3)

Ed Morton

Reputation: 203985

$ cut -d' ' -f2- file
2017 Corola Good condition
2013 Civic Flat back right tire
2017 Wrangler Roof is leaking and oil needs change

$ sed 's/[^ ]* //' file
2017 Corola Good condition
2013 Civic Flat back right tire
2017 Wrangler Roof is leaking and oil needs change

$ awk '{sub(/[^ ]* /,"")} 1' file
2017 Corola Good condition
2013 Civic Flat back right tire
2017 Wrangler Roof is leaking and oil needs change

$ awk 'match($0," "){$0=substr($0,RSTART+1)} 1' file
2017 Corola Good condition
2013 Civic Flat back right tire
2017 Wrangler Roof is leaking and oil needs change

Any of the above will work no matter which values are in $1 or $2 or anywhere else in the input.
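
As a rough check of that claim, here is a minimal sketch that runs the four commands on a made-up line (not from the question) whose first field contains regex metacharacters and the text of the second field, with a doubled space later in the line. The variable names are only for illustration.

```shell
# Made-up input line: field 1 holds regex metacharacters and the text of
# field 2, and there is a doubled space near the end.
line='C++2017 2017 Corola Good  condition'

out_cut=$(printf '%s\n' "$line" | cut -d' ' -f2-)
out_sed=$(printf '%s\n' "$line" | sed 's/[^ ]* //')
out_sub=$(printf '%s\n' "$line" | awk '{sub(/[^ ]* /,"")} 1')
out_match=$(printf '%s\n' "$line" | awk 'match($0," "){$0=substr($0,RSTART+1)} 1')

# Each command should print the same remainder of the line,
# with the doubled space left untouched.
printf '%s\n' "$out_cut" "$out_sed" "$out_sub" "$out_match"
```

None of the commands uses the content of a field as a pattern, which is why the unusual first field makes no difference.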

If you wanted to separate each line into 4 fields you could do this with GNU awk for the 3rd arg to match() and \S:

$ awk -v OFS='\t' '
    match($0,/(\S+) (\S+) (\S+) (.*)/,a) {
        print a[1], a[2], a[3], a[4]
    }
' file
Toyota  2017    Corola  Good condition
Honda   2013    Civic   Flat back right tire
Jeep    2017    Wrangler        Roof is leaking and oil needs change

and then you can, of course, print whichever fields you like.
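
For instance, a minimal sketch that prints only the 2nd through 4th captured fields, which is the output the question asks for. It assumes GNU awk is installed under the name gawk; print_from_second is a hypothetical helper name used only here.

```shell
# Assumes GNU awk is available as `gawk` (needed for the 3rd arg to match() and \S).
# print_from_second is a hypothetical helper name, not part of any tool.
print_from_second() {
    gawk 'match($0, /(\S+) (\S+) (\S+) (.*)/, a) { print a[2], a[3], a[4] }' "$@"
}

# Run the demo only when gawk is actually present.
command -v gawk >/dev/null 2>&1 &&
    printf '%s\n' 'Jeep 2017 Wrangler Roof is leaking and oil needs change' | print_from_second
```

With the default OFS the captured fields are joined by single spaces, so the sentence in a[4] comes out unchanged.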

Upvotes: 2

David C. Rankin

Reputation: 84579

The only downside to assigning an empty string to field 1 is that it causes awk to rebuild the record from its fields, which collapses any additional whitespace in the original record. You can preserve the original record format (which may or may not be required) by using either sub(), or match() followed by substr() with RSTART (set by match() to the position where the 2nd field starts).

For example:

$ awk '{sub ($1 FS, "", $0)}1' file

(substitute the empty string for field 1 followed by a field separator; $1 is used as a regular expression here, so this assumes it contains no regex metacharacters)

or

$ awk '{match ($0,$2); print substr($0,RSTART)}' file

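(use match() to obtain the start of field 2 and print from there with substr(); $2 is also treated as a regular expression, and this assumes its text does not occur earlier in the line)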

Example Use/Output

With your sample data in file, you would get the following:

$ awk '{sub ($1 FS, "", $0)}1' file
2017 Corola Good condition
2013 Civic Flat back right tire
2017 Wrangler Roof is leaking and oil needs change

and

$ awk '{match ($0,$2); print substr($0,RSTART)}' file
2017 Corola Good condition
2013 Civic Flat back right tire
2017 Wrangler Roof is leaking and oil needs change
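
The sample data has only single spaces, so the whitespace difference described at the top of this answer does not show up in it. A minimal sketch with a made-up line containing runs of spaces (not from the question) makes it visible; the variable names are only for illustration.

```shell
# Made-up line with runs of spaces between fields.
line='Jeep 2017   Wrangler  Roof is leaking'

# Assigning to $1 makes awk rebuild $0 using the single-space OFS,
# so the runs of spaces are squeezed.
squeezed=$(printf '%s\n' "$line" | awk '{$1=""; print substr($0,2)}')

# sub() edits $0 directly, so the remaining spacing is left as it was.
preserved=$(printf '%s\n' "$line" | awk '{sub($1 FS, "", $0)}1')

printf '%s\n' "$squeezed" "$preserved"
```

The first line printed has single spaces throughout, while the second keeps the three spaces after 2017 and the two after Wrangler.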

See the GNU Awk User's Guide, String Functions, for detailed usage of sub(), match() and substr(), and note the difference in parameter order for match() compared with sub(), gsub(), etc. Let me know if you have questions.

Upvotes: 2

Daweo

Reputation: 36620

I would use GNU AWK in the following way. Let the content of file.txt be

Toyota 2017 Corola Good condition
Honda 2013 Civic Flat back right tire
Jeep 2017 Wrangler Roof is leaking and oil needs change

then

awk '{$1="";print substr($0,2)}' file.txt

gives output

2017 Corola Good condition
2013 Civic Flat back right tire
2017 Wrangler Roof is leaking and oil needs change

Explanation: I clear the 1st field ($1) by setting its value to an empty string, then print the rest starting at the 2nd character using the substr function in order to remove the leading space.
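
To see why the substr call is needed, here is a small sketch that wraps the result in square brackets so the leading space left by the emptied first field is visible. The brackets and variable names are added only for illustration.

```shell
line='Toyota 2017 Corola Good condition'

# After $1="" the rebuilt record still starts with the separator
# that followed the now-empty first field.
with_space=$(printf '%s\n' "$line" | awk '{$1=""; print "[" $0 "]"}')

# substr($0,2) skips that single leading space.
trimmed=$(printf '%s\n' "$line" | awk '{$1=""; print "[" substr($0,2) "]"}')

printf '%s\n' "$with_space" "$trimmed"
```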

(tested in GNU Awk 5.1.0)

Upvotes: 2
