Bhawan

Reputation: 2491

matching a column with awk using regex

I have a CSV file in which each line contains a record number, an element symbol, and an element name. I want to extract the lines whose second column (i.e. the element symbol) does not contain any of the letters a, e, i, o, u, A, E, I, O, U. I wrote a script to do this, but it prints all the lines.

This is my script:

awk -F',' '$2~/[^aeiouAEIOU]/' sample.txt

The sample.txt file:

102,No,Nobelium
103,Lr,Lawrencium
104,Rf,Rutherfordium
105,Db,Dubnium
106,Sg,Seaborgium
107,Bh,Bohrium
108,Hs,Hassium
109,Mt,Meitnerium
110,Ds,Darmstadtium
111,Rg,Roentgenium
112,Cn,Copernicium
113,Nh,Nihonium
114,Fl,Flerovium
115,Mc,Moscovium
116,Lv,Livermorium
117,Ts,Tennessine
118,Og,Oganesson

Upvotes: 0

Views: 69

Answers (1)

Sundeep

Reputation: 23667

Try negating the match with !~:

$ awk -F',' '$2!~/[aeiouAEIOU]/' sample.txt
103,Lr,Lawrencium
104,Rf,Rutherfordium
105,Db,Dubnium
106,Sg,Seaborgium
107,Bh,Bohrium
108,Hs,Hassium
109,Mt,Meitnerium
110,Ds,Darmstadtium
111,Rg,Roentgenium
112,Cn,Copernicium
113,Nh,Nihonium
114,Fl,Flerovium
115,Mc,Moscovium
116,Lv,Livermorium
117,Ts,Tennessine
  • !~ returns false on a match, so $2!~/[aeiouAEIOU]/ is true only when the second field contains no vowel
  • $2~/[^aeiouAEIOU]/ is true if the second field contains any non-vowel character, so No matches because N is a non-vowel character
    • this can be corrected with a whole-string match: $2~/^[^aeiouAEIOU]+$/
  • tolower($2) !~ /[aeiou]/ can also be used instead of $2 !~ /[aeiouAEIOU]/
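
To see the difference between these patterns side by side, here is a small sketch that runs each one on three rows taken from the question's sample.txt. The function names (sample_rows, any_non_vowel, no_vowel, all_non_vowel, no_vowel_lower) are hypothetical helpers made up for this illustration; only the awk expressions come from the question and the answer above.

```shell
# sample_rows is a hypothetical helper: it prints three rows from sample.txt.
sample_rows() {
    printf '%s\n' '102,No,Nobelium' '103,Lr,Lawrencium' '118,Og,Oganesson'
}

# Original attempt: true when field 2 contains at least one non-vowel character.
any_non_vowel() { awk -F',' '$2 ~ /[^aeiouAEIOU]/'; }

# Negated match: true when field 2 contains no vowel anywhere.
no_vowel() { awk -F',' '$2 !~ /[aeiouAEIOU]/'; }

# Anchored match: true when every character of field 2 is a non-vowel.
all_non_vowel() { awk -F',' '$2 ~ /^[^aeiouAEIOU]+$/'; }

# Case-folded variant of the negated match.
no_vowel_lower() { awk -F',' 'tolower($2) !~ /[aeiou]/'; }

sample_rows | any_non_vowel    # prints all three rows (N, L, r and g are non-vowels)
sample_rows | no_vowel         # prints only 103,Lr,Lawrencium
sample_rows | all_non_vowel    # prints only 103,Lr,Lawrencium
sample_rows | no_vowel_lower   # prints only 103,Lr,Lawrencium
```

One difference worth keeping in mind: for an empty second field, the !~ forms are true (there is no vowel to match), while the anchored form with + is false because it requires at least one character.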

Upvotes: 1
