Reputation: 1
I have a huge log file that contains more than 100M lines. It has 19 columns:
time | date | host | user | domain | category | source | port | URL | etc
example:
time date host user domain category source port URL etc
2:10:21 18.11.2014 192.168.56.101 %username1% %domainname% "many words" stackoverflow.com "80" http://stackoverflow.com/
2:10:22 18.11.2014 192.168.56.101 %username2% %domainname% "done" stackoverflow.com "80" http://stackoverflow.com/
2:10:23 18.11.2014 192.168.56.101 %username3% %domainname% "denied site" stackoverflow.com "80" http://stackoverflow.com/
2:10:24 18.11.2014 192.168.56.101 %username4% %domainname% "suspicious" stackoverflow.com "80" http://stackoverflow.com/
2:10:25 18.11.2014 192.168.56.101 %username5% %domainname% "uncategorized" stackoverflow.com "80" http://stackoverflow.com/
2:10:26 18.11.2014 192.168.56.101 %username6% %domainname% "denied site" stackoverflow.com "80" http://stackoverflow.com/
2:10:27 18.11.2014 192.168.56.101 %username7% %domainname% "many words" stackoverflow.com "80" http://stackoverflow.com/
When I try to print a single column, the output is sometimes broken:
user@stand-01:~/folder$ cat file | awk '{FS=" "; print $6}'
category
"many
"done"
"denied
"suspicious"
"uncategorized"
"denied
"many
So when I print the 7th column, it contains data from another column:
user@stand-01:~/folder$ cat file | awk '{FS=" "; print $7}'
source
words"
stackoverflow.com
site"
stackoverflow.com
stackoverflow.com
site"
words"
How can I split on spaces without breaking apart text inside quotes?
Upvotes: 0
Views: 93
Reputation: 26667
Something like this may work, as long as a quoted category contains at most two words:
$ awk '$6 ~ /^"[^"]+"$/{print $6;next} $6 ~ /^"/{print $6, $7}' input
"many words"
"done"
"denied site"
"suspicious"
"uncategorized"
"denied site"
"many words"
Upvotes: 0
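If a category can span more than two words, one option is to keep appending fields until the closing quote turns up. This is only a minimal sketch: it assumes the category always starts in field 6 and that quotes are balanced, and `quoted_category` is a name made up for illustration.

```shell
# Hypothetical helper: print the quoted category that starts in field 6,
# however many space-separated words it contains.
quoted_category() {
  awk '$6 ~ /^"/ {
    out = $6
    # Append following fields until "out" is a complete quoted string.
    for (i = 7; out !~ /^".*"$/ && i <= NF; i++)
      out = out " " $i
    print out
  }' "$1"
}
```

Called as `quoted_category file`, it skips the header line because that line's sixth field does not start with a quote.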
Reputation: 41456
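Here is one awk solution that uses the double quote as the field separator, so the category becomes the second field: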
awk -F\" 'NR>1{print $2}' file
many words
done
denied site
suspicious
uncategorized
denied site
many words
Or, to keep the surrounding quotes:
awk -F\" 'NR>1{print FS$2FS}' file
"many words"
"done"
"denied site"
"suspicious"
"uncategorized"
"denied site"
"many words"
Upvotes: 1
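To read any column by number, including the ones that come after the quoted category, a portable awk loop can tokenize each line itself with match(). Treat this as a sketch under assumptions: `print_col` is a hypothetical helper name, quotes are assumed to be balanced, and a quote is assumed never to appear inside an unquoted token.

```shell
# Hypothetical helper: print column N of a space-separated file,
# counting a "double-quoted string" as one column.
# Usage: print_col N file
print_col() {
  awk -v n="$1" '
  {
    line = $0
    count = 0
    # A token is either a quoted string or a run of non-space, non-quote characters.
    while (match(line, /"[^"]*"|[^ "]+/)) {
      count++
      if (count == n) {
        print substr(line, RSTART, RLENGTH)
        break
      }
      # Drop everything up to the end of this token and keep scanning.
      line = substr(line, RSTART + RLENGTH)
    }
  }' "$2"
}
```

On the sample from the question, `print_col 6 file` should return the whole quoted category and `print_col 7 file` should return `source` for the header line and `stackoverflow.com` for every data line. Lines with fewer than N columns print nothing.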