Reputation:
I have a file called companies.dat containing lines with several company names. How can I use the grep command to identify and display all the companies with more than one word in their name? I'm not searching for any specific word, just for lines whose company name has more than one word.
Here is the file content:
id companyName placeId
1:British Airways:1
2:The New York Times:3
3:Toyota:3
4:BNP Paribas:2
5:EDF:2
6:Tesco:1
7:IBM:1
8:Google:3
9:Castlemaine:5
Upvotes: 2
Views: 2804
Reputation: 1
It's much simpler with awk:
awk 'NF>1' file
This prints every line with more than one whitespace-separated field; change the threshold to match the word count you need. Note that the header line also matches.
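A variation on the same awk idea counts words only inside the company name, which also keeps the header line out of the result. This is a minimal sketch: the temporary file recreates the sample from the question, and min_words and multiword are hypothetical names chosen for illustration.

```shell
# Recreate the sample data from the question in a temporary file.
datafile=$(mktemp)
cat > "$datafile" <<'EOF'
id companyName placeId
1:British Airways:1
2:The New York Times:3
3:Toyota:3
4:BNP Paribas:2
5:EDF:2
6:Tesco:1
7:IBM:1
8:Google:3
9:Castlemaine:5
EOF

# Split each line on ':' and count the space-separated words in field 2.
# split() returns the number of pieces; the header has no ':' so its
# second field is empty and it is never printed.
multiword=$(awk -F: -v min_words=2 'split($2, words, " ") >= min_words' "$datafile")
printf '%s\n' "$multiword"
rm -f "$datafile"
```

Raising min_words to 3 would keep only names with at least three words.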
Upvotes: 0
Reputation: 11
In this exercise they most likely want you to identify a pattern.
If you have to use grep, start by reading its manual page.
There you will find the -E mode (extended regular expressions).
In your file, the words within a name are separated by a space, and that space is the pattern to search for.
My solution is:
grep -E ' ' /the literal path of the file/companies.dat
Upvotes: 1
Reputation: 26471
A robust way would be
awk -F: '($2 ~ /[^ ] [^ ]/)' file
It checks for a space sandwiched between two non-space characters in the second field.
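To see why the sandwiched-space test is more robust than looking for any space, here is a small sketch with made-up edge cases (the input lines are hypothetical, not taken from the question's file): a name with a trailing space, a name with a leading space, a real two-word name, and a line whose only space sits in the third field.

```shell
# Hypothetical edge cases fed to the same awk test.
matches=$(printf '%s\n' '1:Toyota :3' '2: EDF:2' '3:BNP Paribas:2' '4:IBM:1 x' |
  awk -F: '$2 ~ /[^ ] [^ ]/')
printf '%s\n' "$matches"
```

Only the line with a space between two non-space characters inside the second field is printed; stray leading or trailing spaces and spaces in other fields are ignored.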
Upvotes: 0
Reputation: 22428
Here is one way:
grep -o '[a-zA-Z]*[[:blank:]]\+[a-zA-Z[:blank:]]*' companies.dat
Output:
id companyName placeId
British Airways
The New York Times
BNP Paribas
If you want to omit the first line (id companyName placeId), then:
tail -n +2 companies.dat | grep -o '[a-zA-Z]*[[:blank:]]\+[a-zA-Z[:blank:]]*'
Output:
British Airways
The New York Times
BNP Paribas
If you want all the other info too, then just omit the -o flag from the grep command:
tail -n +2 companies.dat | grep '[a-zA-Z]*[[:blank:]]\+[a-zA-Z[:blank:]]*'
Output:
1:British Airways:1
2:The New York Times:3
4:BNP Paribas:2
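If only the names are wanted, another possible route is to isolate the name field with cut before grepping. This is a sketch rather than part of the answer above; names_only is a hypothetical variable, and the temporary file stands in for companies.dat. The -s option of cut drops lines that contain no ':' delimiter, so the header disappears without tail.

```shell
# Recreate the sample data from the question in a temporary file.
datafile=$(mktemp)
cat > "$datafile" <<'EOF'
id companyName placeId
1:British Airways:1
2:The New York Times:3
3:Toyota:3
4:BNP Paribas:2
5:EDF:2
6:Tesco:1
7:IBM:1
8:Google:3
9:Castlemaine:5
EOF

# Keep only field 2 of colon-delimited lines, then keep names that
# contain a space between two non-space characters.
names_only=$(cut -s -d: -f2 "$datafile" | grep '[^ ] [^ ]')
printf '%s\n' "$names_only"
rm -f "$datafile"
```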
Upvotes: 0
Reputation: 53478
If you specifically have to use grep, then check for spaces:
grep -E '\w\s+\w'
Or perhaps:
grep '[A-Za-z] [A-Za-z]'
This also checks for a letter on either side of a space, but personally I find it a bit less elegant.
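Both patterns above look at the whole line, so they would also match the header, and \w and \s are GNU extensions. A possible grep-only refinement, sketched here with POSIX extended syntax, anchors the test to the second colon-separated field. The pattern and the names name_pattern and grep_matches are illustrative assumptions, and the temporary file recreates the question's sample.

```shell
# Recreate the sample data from the question in a temporary file.
datafile=$(mktemp)
cat > "$datafile" <<'EOF'
id companyName placeId
1:British Airways:1
2:The New York Times:3
3:Toyota:3
4:BNP Paribas:2
5:EDF:2
6:Tesco:1
7:IBM:1
8:Google:3
9:Castlemaine:5
EOF

# ^[^:]*:      skip the first field and its colon
# [^:]*[^ :]   some name text ending in a non-space character
#  +           one or more spaces
# [^ :][^:]*:  more name text, up to the colon that closes field 2
name_pattern='^[^:]*:[^:]*[^ :] +[^ :][^:]*:'
grep_matches=$(grep -E "$name_pattern" "$datafile")
printf '%s\n' "$grep_matches"
rm -f "$datafile"
```

Because the header contains no colon, it cannot match, so no tail step is needed.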
Upvotes: 3