Rajeev

Reputation: 46909

Linux: search for a pattern and print its count

In the following, I am trying to grep all occurrences of geoIs. How can I list the distinct values of geoIs along with their counts?

Example:

GeoIs:"Paramount","sumthing else"
GeoIs:"undefined","sumthing else"
GeoIs:"undefined","sumthing else"
GeoIs:"178","sumthing else"
GeoIs:"178","sumthing else"
and many more
...
...

Result expected:

GeoIs:"Paramount" 1
GeoIs:"undefined" 2
GeoIs:"178" 2

Current command:

zcat file.gz | grep -P '"geoIs":".*?.undefined*?"' | sort -u -T.|wc -l

EDIT1:

geoIs is found in the following string:

  012-10-02 09:32:45{"e":{"ec":100001,"st":1349170352455,"bd":"Mozilla%2F5.0%20(Windows%20NT%206.1)%20AppleWebKit%2F537.4%20(KHTMf01f02008592~rt%2366.657~rv%2366.228~as%2317~st%231349170293955~cat%231349170352431~sp%23as~c%2334~pat%231349128562942","smplCookie":"undefined","geoIPAddress":"122.107.154.58","geoCountry":"australia","geoCity":"Vermont","geoRegion":"Victoria","geoPostalCode":"undefined","geoLatitude":"undefined","geoLongitude":"undefined","geoMetro":"0","geoArea":"0","geoIs"}}

Upvotes: 0

Views: 117

Answers (2)

choroba

Reputation: 241848

To return a frequency table, use

sort | uniq -c | sort -n
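As a small illustration of why the first sort is needed: uniq only merges adjacent duplicates, so the input has to be grouped before counting. The sketch below wraps the pipeline in a helper (the name freq_table and the sample values are made up for this example) and runs it on a few lines.

```shell
# freq_table is a hypothetical helper name, not part of any tool.
# It reads lines on stdin and prints "count line", lowest count first.
freq_table() {
  sort |      # group identical lines so duplicates become adjacent
  uniq -c |   # prefix each distinct line with its number of occurrences
  sort -n     # order the table numerically by that count
}

# Made-up sample input: undefined x3, 178 x2, Paramount x1.
printf '%s\n' undefined 178 Paramount undefined 178 undefined | freq_table
```

The counts come first and are left-padded with spaces; the exact padding depends on the uniq implementation.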

For the sample data you provided, I'd use

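zcat file.gz | cut -f1 -d, | sort | uniq -c | sort -n

For the actual log lines, where the key appears in the middle of the line, extract just the key-value pair with grep -o first (replace searchstring with the key name):

zcat file.gz | grep -o '"searchstring":"[^"]*"' | sort | uniq -c | sort -n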
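A minimal sketch of the grep -o approach applied to the geoIs key, assuming the value appears as "geoIs":"..." as in the question's grep pattern. The helper name count_key and the sample lines are invented for illustration. The trailing awk step only moves the count to the end of each line to match the layout of the expected result.

```shell
# count_key is a hypothetical helper: count_key KEY reads lines on stdin
# and prints each distinct "KEY":"value" pair followed by its count.
count_key() {
  key=$1
  grep -o "\"$key\":\"[^\"]*\"" |   # keep only the matching "key":"value" text
    sort | uniq -c | sort -n |      # frequency table, count first
    awk '{ c = $1; sub(/^ *[0-9]+ /, ""); print $0, c }'  # move count to the end
}

# Made-up input lines; real input would come from: zcat file.gz | count_key geoIs
printf '%s\n' \
  '{"geoCity":"Vermont","geoIs":"Paramount"}' \
  '{"geoCity":"Vermont","geoIs":"undefined"}' \
  '{"geoCity":"Sydney","geoIs":"undefined"}' |
  count_key geoIs
# prints:
# "geoIs":"Paramount" 1
# "geoIs":"undefined" 2
```

Note that grep -o is available in GNU and BSD grep but is not required by POSIX.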

Upvotes: 3

Kent

Reputation: 195049

An awk alternative:

awk -F, '{a[$1]++;}END{for(x in a)if(x)print x,a[x]}' file


kent$  echo 'GeoIsp:"Paramount","sumthing else"
GeoIsp:"undefined","sumthing else"
GeoIsp:"undefined","sumthing else"
GeoIsp:"178","sumthing else"
GeoIsp:"178","sumthing else"
'|awk -F, '{a[$1]++;}END{for(x in a)if(x)print x,a[x]}'
GeoIsp:"Paramount" 1
GeoIsp:"undefined" 2
GeoIsp:"178" 2
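The one-liner above keys on the first comma-separated field, which fits the simplified sample. For the longer log lines in the question's edit, where the key sits in the middle of the line, a variant using awk's match() might look like the sketch below. It assumes the value is written as "geoIs":"...". The helper name count_key_awk and the sample lines are made up, and only the first match on each line is counted.

```shell
# count_key_awk is a hypothetical helper: count_key_awk KEY reads lines on
# stdin and prints each distinct "KEY":"value" pair followed by its count.
count_key_awk() {
  awk -v key="$1" '
    # Build the regex "key":"[^"]*" once.
    BEGIN { re = "\"" key "\":\"[^\"]*\"" }
    # match() sets RSTART and RLENGTH to the position and length of the first match.
    match($0, re) { seen[substr($0, RSTART, RLENGTH)]++ }
    # Iteration order of "for (k in seen)" is unspecified, so sort afterwards if needed.
    END { for (k in seen) print k, seen[k] }
  '
}

# Made-up input lines; real input would come from: zcat file.gz | count_key_awk geoIs
printf '%s\n' \
  '{"geoCity":"Vermont","geoIs":"178","x":"y"}' \
  '{"geoCity":"Sydney","geoIs":"undefined"}' \
  '{"geoIs":"178"}' |
  count_key_awk geoIs | sort
```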

Upvotes: 1
