user5588864

Reputation:

How to grep lines containing more than 1 word in Unix

I have a file called companies.dat containing lines with several company names. How can I use the grep command to identify and display all the companies with more than 1 word in their name? I'm not searching for any specific word, just a pattern with more than 1 word per line.

Here is the file content:

id companyName placeId
1:British Airways:1
2:The New York Times:3
3:Toyota:3
4:BNP Paribas:2
5:EDF:2
6:Tesco:1
7:IBM:1
8:Google:3
9:Castlemaine:5

Upvotes: 2

Views: 2804

Answers (6)

dhamaraiselvi sekar

Reputation: 1

It's much simpler with awk:

awk 'NF>1' file

NF is the number of whitespace-separated fields on the line, so you can change the threshold to match the word count you need.

Upvotes: 0

Dimitris Koulialis

Reputation: 11

This exercise most likely wants you to identify a pattern.

If you have to use grep, start with its manual page.

There you will find the -E mode (extended regular expressions).

In your file, the words within a name are separated by a single space, so the space itself is the pattern to search for.

My solution is:

grep -E ' ' "/the literal path of the file/companies.dat"

Upvotes: 1

kvantour

Reputation: 26471

A robust way would be

awk -F: '($2 ~ /[^ ] [^ ]/)' file

It checks for a space sandwiched between two non-space characters in the second field.
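To see what that check does on the data from the question, here is a minimal sketch. It recreates the sample file in a temporary location (the `tmpfile` and `names` variables are just illustrative names, not part of the original answer) and prints only the name field instead of the whole line:

```shell
# Recreate the sample data from the question in a temporary file.
tmpfile=$(mktemp)
cat > "$tmpfile" <<'EOF'
id companyName placeId
1:British Airways:1
2:The New York Times:3
3:Toyota:3
4:BNP Paribas:2
5:EDF:2
6:Tesco:1
7:IBM:1
8:Google:3
9:Castlemaine:5
EOF

# Split on ':' and keep the second field only when it contains
# a space sandwiched between two non-space characters.
names=$(awk -F: '$2 ~ /[^ ] [^ ]/ { print $2 }' "$tmpfile")
printf '%s\n' "$names"

rm -f "$tmpfile"
```

The header line has no colon, so its second field is empty and it is skipped without any special handling.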

Upvotes: 0

Jahid

Reputation: 22428

This is a way:

grep -o '[a-zA-Z]*[[:blank:]]\+[a-zA-Z[:blank:]]*' companies.dat

Output:

id companyName placeId
British Airways
The New York Times
BNP Paribas

If you want to omit the first line (id companyName placeId), then:

tail -n +2 companies.dat | grep -o '[a-zA-Z]*[[:blank:]]\+[a-zA-Z[:blank:]]*'

Output:

British Airways
The New York Times
BNP Paribas

If you want all the other info too, then just omit the -o flag from the grep command:

tail -n +2 companies.dat | grep '[a-zA-Z]*[[:blank:]]\+[a-zA-Z[:blank:]]*'

Output:

1:British Airways:1
2:The New York Times:3
4:BNP Paribas:2

Upvotes: 0

Sobrique

Reputation: 53478

If you specifically have to use grep then check for spaces:

grep -E '\w\s+\w' companies.dat

Or perhaps:

grep '[A-Za-z] [A-Za-z]' companies.dat

This checks for a letter on either side of a space too, but personally I think it is a bit less elegant.
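Note that `\w` and `\s` are GNU extensions rather than POSIX, and that both patterns above also match the header line, since it contains spaces between words. If that matters, one possible variant is to anchor the match to the second colon-separated field. This is only a sketch under the assumption that every data line has the form id:name:placeId; the `tmpfile` and `matches` variables are illustrative names:

```shell
# Recreate the sample data from the question in a temporary file.
tmpfile=$(mktemp)
cat > "$tmpfile" <<'EOF'
id companyName placeId
1:British Airways:1
2:The New York Times:3
3:Toyota:3
4:BNP Paribas:2
5:EDF:2
6:Tesco:1
7:IBM:1
8:Google:3
9:Castlemaine:5
EOF

# ^[^:]*:        skip the first field and its colon
# [^:]*          stay inside the second field
# [^ :] [^ :]    require a space between two non-space, non-colon characters
matches=$(grep -E '^[^:]*:[^:]*[^ :] [^ :]' "$tmpfile")
printf '%s\n' "$matches"

rm -f "$tmpfile"
```

Because the pattern requires a colon before the name, the header line drops out without needing tail.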

Upvotes: 3

P.P

Reputation: 121357

It's much simpler with awk:

awk 'NF>1' file
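With the default separator, NF counts whitespace-separated fields, so a line such as 1:British Airways:1 has two fields and 3:Toyota:3 has one. The header line has three, so it is printed as well. A minimal sketch that skips it, assuming the first line is always the header and the name is the only field that can contain whitespace (the `tmpfile` and `multi` variables are illustrative names):

```shell
# Recreate the sample data from the question in a temporary file.
tmpfile=$(mktemp)
cat > "$tmpfile" <<'EOF'
id companyName placeId
1:British Airways:1
2:The New York Times:3
3:Toyota:3
4:BNP Paribas:2
5:EDF:2
6:Tesco:1
7:IBM:1
8:Google:3
9:Castlemaine:5
EOF

# NR > 1 skips the header; NF > 1 keeps lines with more than one
# whitespace-separated field, i.e. a multi-word company name.
multi=$(awk 'NR > 1 && NF > 1' "$tmpfile")
printf '%s\n' "$multi"

rm -f "$tmpfile"
```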

Upvotes: 2
