Reputation: 10587
I have a file, fileA.txt, containing lines like this:
Toyota 2017 Corola Good condition
Honda 2013 Civic Flat back right tire
Jeep 2017 Wrangler Roof is leaking and oil needs change
I would like to print only columns 2 and 3 plus the description that follows, but the description is a sentence containing spaces. I would like my output to look like this:
2017 Corola Good condition
2013 Civic Flat back right tire
2017 Wrangler Roof is leaking and oil needs change
But I cannot make the last column print as a full sentence. I get only its last word, for example:
awk '{print $2,$NF}' fileA.txt
This outputs
2017 condition
2013 tire
2017 change
I know why this is happening: awk treats spaces as column separators, so $NF is simply the last field of each line, which is the last word.
How do I get the desired output using awk?
Upvotes: 1
Views: 684
Reputation: 203985
$ cut -d' ' -f2- file
2017 Corola Good condition
2013 Civic Flat back right tire
2017 Wrangler Roof is leaking and oil needs change
$ sed 's/[^ ]* //' file
2017 Corola Good condition
2013 Civic Flat back right tire
2017 Wrangler Roof is leaking and oil needs change
$ awk '{sub(/[^ ]* /,"")} 1' file
2017 Corola Good condition
2013 Civic Flat back right tire
2017 Wrangler Roof is leaking and oil needs change
$ awk 'match($0," "){$0=substr($0,RSTART+1)} 1' file
2017 Corola Good condition
2013 Civic Flat back right tire
2017 Wrangler Roof is leaking and oil needs change
Any of the above will work no matter which values are in $1 or $2 or anywhere else in the input.
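As a quick check of that claim, here is a minimal sketch run against a made-up record (the `line` value is hypothetical, not from the question) whose first field contains regex metacharacters and also repeats the year. All four commands only look for the first space, so the contents of the fields should not matter:

```shell
# Hypothetical record: $1 holds regex metacharacters and repeats the value of $2.
line='A+B(2017) 2017 Corola Good condition'

# Each command drops everything up to and including the first space.
printf '%s\n' "$line" | cut -d' ' -f2-
printf '%s\n' "$line" | sed 's/[^ ]* //'
printf '%s\n' "$line" | awk '{sub(/[^ ]* /,"")} 1'
printf '%s\n' "$line" | awk 'match($0," "){$0=substr($0,RSTART+1)} 1'
```

Each of the four should print `2017 Corola Good condition`.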
If you wanted to separate each line into 4 fields you could do this with GNU awk for the 3rd arg to match() and \S:
$ awk -v OFS='\t' '
match($0,/(\S+) (\S+) (\S+) (.*)/,a) {
print a[1], a[2], a[3], a[4]
}
' file
Toyota 2017 Corola Good condition
Honda 2013 Civic Flat back right tire
Jeep 2017 Wrangler Roof is leaking and oil needs change
and then you can, of course, print whichever fields you like.
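If GNU awk is not available, roughly the same split can probably be done in any POSIX awk. This is only a sketch: the function name `split4` is made up for illustration, and it assumes the fields are separated by single spaces and that every line has at least four of them.

```shell
# split4 (hypothetical helper): read lines on stdin, print 4 tab-separated fields.
# Assumes single-space separators and at least 4 fields per line.
split4() {
  awk -v OFS='\t' '{
    rest = $0
    # Remove the first three fields and their separators from a copy of the record.
    sub(/^[^ ]+ [^ ]+ [^ ]+ /, "", rest)
    print $1, $2, $3, rest
  }'
}

printf '%s\n' 'Jeep 2017 Wrangler Roof is leaking and oil needs change' | split4
```

Because only the copy in `rest` is modified, $0 is never rebuilt and the spacing inside the description is left alone.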
Upvotes: 2
Reputation: 84579
The only downside to assigning an empty string to field 1 is that it causes awk to rebuild the record from its fields, which removes any additional whitespace from the original record. You can preserve the original record format (which may or may not be required) by using either sub(), or match() followed by substr() with RSTART (set by match() to the position where the match starts, here the start of the 2nd field).
For example:
$ awk '{sub ($1 FS, "", $0)}1' file
(substitute the empty string for field 1 followed by a field separator)
or
$ awk '{match ($0,$2); print substr($0,RSTART)}' file
(use match() to obtain the start of field 2 and print from there with substr())
Example Use/Output
With your sample data in file you would receive the following:
$ awk '{sub ($1 FS, "", $0)}1' file
2017 Corola Good condition
2013 Civic Flat back right tire
2017 Wrangler Roof is leaking and oil needs change
and
$ awk '{match ($0,$2); print substr($0,RSTART)}' file
2017 Corola Good condition
2013 Civic Flat back right tire
2017 Wrangler Roof is leaking and oil needs change
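One thing to keep in mind with both commands: $1 and $2 are used as regular expressions there, so the result depends on the data. The sketch below uses a made-up record (`rec` is hypothetical) where the text of $2 also appears inside $1, and shows index() as one possible plain-string alternative for finding the first space:

```shell
# Hypothetical record: the text of $2 ("2017") also occurs inside $1.
rec='X2017 2017 Corola Good condition'

# match() treats $2 as a regex and reports its first occurrence, here inside $1.
printf '%s\n' "$rec" | awk '{match($0,$2); print substr($0,RSTART)}'
# prints: 2017 2017 Corola Good condition

# index() does a literal search for the first space, so field contents do not matter.
printf '%s\n' "$rec" | awk '{print substr($0, index($0," ")+1)}'
# prints: 2017 Corola Good condition
```

With the sample data from the question the two behave the same, so this only matters if your real input can contain such values.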
See GNU Awk User's Guide - String Functions for detailed usage of sub(), match() and substr() (and note the difference in parameter order for match() compared with sub(), gsub(), etc.). Let me know if you have questions.
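To make the whitespace point concrete, here is a small sketch using a made-up record (`rec` is hypothetical) with two spaces inside the description. Assigning to $1 rebuilds the record with single spaces, while editing $0 directly leaves the remainder as it was:

```shell
# Hypothetical record with a double space between "Flat" and "back".
rec='Honda 2013 Civic Flat  back right tire'

# Assigning to $1 rebuilds $0 using OFS (a single space), so the double space is lost.
printf '%s\n' "$rec" | awk '{$1=""; print substr($0,2)}'
# prints: 2013 Civic Flat back right tire

# sub() on $0 only removes the first field; the rest of the record is untouched.
printf '%s\n' "$rec" | awk '{sub(/^[^ ]* /,""); print}'
# prints: 2013 Civic Flat  back right tire
```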
Upvotes: 2
Reputation: 36620
I would use GNU AWK in the following way. Let the content of file.txt be
Toyota 2017 Corola Good condition
Honda 2013 Civic Flat back right tire
Jeep 2017 Wrangler Roof is leaking and oil needs change
then
awk '{$1="";print substr($0,2)}' file.txt
gives output
2017 Corola Good condition
2013 Civic Flat back right tire
2017 Wrangler Roof is leaking and oil needs change
Explanation: I clear the 1st field ($1) by setting its value to an empty string, then print the rest starting at the 2nd character using the substr function in order to remove the leading space.
(tested in GNU Awk 5.1.0)
Upvotes: 2