trufflewaffle

Reputation: 179

How to print certain fields in a column if one of the fields is less than a certain value?

I have a .txt file that contains data about 100 colleges in the format

{COLLEGE NAME} {CITY, STATE} {RANK} {TUITION} {IN STATE TUITION} {ENROLLMENT}

For example, here are two lines:

YeshivaUniversity "New York, NY" 66 "$40,670 "  "2,744" 
FordhamUniversity "New York, NY" 60 "$47,317 "  "8,855"

There are 98 more lines, and the output should list all the colleges with tuition less than $30,000.

Assuming that the field separator is a space, how could I print the {COLLEGE NAME} {CITY, STATE} {TUITION} of colleges with {TUITION} less than $30,000? Is this possible with awk or sort?

I have tried some combinations of awk and operators such as <=, but I get an error every time. For example

$ awk -F" " '{print $1, $2, $4<=30000}' data1a.txt

gives me a syntax error.

Upvotes: 1

Views: 56

Answers (1)

James Brown

Reputation: 37404

Using GNU awk, since it's got FPAT:

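$ gawk '
BEGIN {
    FPAT = "([^ ]*)|(\"[^\"]+\")"    # a field is a run of non-spaces or a quoted string
}
{
    tuition = $4                     # copy the 4th field for cleaning
    gsub(/[^0-9]/, "", tuition)      # strip everything but digits
    if (tuition + 0 < 30000)         # +0 forces a numeric, not string, comparison
        print                        # output the matching record
}' data1a.txt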

Output for sample data: none, since both sample rows have tuition above $30,000.

(Next time, please post a sample that contains both positive and negative cases.)
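
A note on the comparison itself: once gsub has cleaned the value, awk treats it as a string, and a string compared against a numeric constant is compared as text. That goes unnoticed while every tuition has five digits, but a four-digit value such as 9500 would sort after 30000. Adding 0 forces a numeric comparison. A minimal check, with a made-up value and a hypothetical variable name:

```shell
# "9500" is a string here, just like a field cleaned with gsub would be.
check=$(awk 'BEGIN {
    t = "9500"
    # first test compares as text ("9" sorts after "3"), second as numbers
    print (t < 30000), (t + 0 < 30000)
}')
printf '%s\n' "$check"    # prints: 0 1
```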

Also, a comment pointed out that the data is delimited by single spaces while a university name can itself contain a space. That was no longer the case when I saw your question, but it could be handled by counting the fields from the end, i.e. $4 would become $(NF-1).
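
As an alternative to counting from the end (this is not part of the answer above), one could split each line on the double quotes instead, which also works in non-GNU awk and keeps a name with spaces in a single field. A minimal sketch, assuming every row quotes the city and the tuition exactly as in the sample; the file name colleges_sample.txt and its third row are invented for illustration:

```shell
# Hypothetical sample file; the last row is made up to give one match.
cat > colleges_sample.txt <<'EOF'
YeshivaUniversity "New York, NY" 66 "$40,670 "  "2,744"
FordhamUniversity "New York, NY" 60 "$47,317 "  "8,855"
Example College "Sampletown, ST" 99 "$9,500 "  "1,234"
EOF

# With -F'"' the fields are: $1 = name, $2 = city/state, $3 = rank, $4 = tuition.
result=$(awk -F'"' '
{
    name = $1;  sub(/ +$/, "", name)            # trim the space after the name
    shown = $4; sub(/ +$/, "", shown)           # tuition as written, trimmed
    tuition = $4; gsub(/[^0-9]/, "", tuition)   # digits only, for comparing
    if (tuition + 0 < 30000)                    # +0 forces a numeric comparison
        print name, "\"" $2 "\"", shown
}' colleges_sample.txt)
printf '%s\n' "$result"
```

Only the invented third row has a tuition below 30000, so it is the only line printed; the two rows from the question are filtered out.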

Upvotes: 2
