MZimmerman6

Reputation: 8623

cut command in bash terminating on quotation marks

I am trying to read a file in which each line holds an email address followed by a nickname. I want to extract the nickname, which is surrounded by parentheses, as below:

[email protected] (Tom)

My thought was just to use cut to get at the word Tom, but this is foiled when I end up with something like the following:

[email protected] ("Bob")

Because Bob has quotes around it, the cut command fails as follows

cut: <file>: Illegal byte sequence

Does anyone know of a better way of doing this, or a way to solve this problem?

Upvotes: 2

Views: 7031

Answers (3)

kallos

Reputation: 81

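Set your locale to C, which treats input as a raw, uninterpreted byte sequence, to avoid "Illegal byte sequence" errors. You can check the current character encoding with locale charmap, then prefix each command with LC_ALL=C: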

locale charmap
LC_ALL=C cut ... | LC_ALL=C sort ...
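
As a rough sketch of how this might look for the nickname problem in the question: the helper name extract_nick and the example.com sample lines below are made up for illustration, and the octal escapes \223 and \224 stand in for "smart quote" bytes that are not valid UTF-8, which is only a guess at what the asker's file contains.

```shell
# Hypothetical helper: keep the text between the first "(" and the next ")".
# LC_ALL=C makes cut treat its input as raw bytes, so bytes that are not
# valid in the current locale do not abort it.
extract_nick() {
  # -s drops lines that contain no "(" at all instead of echoing them.
  LC_ALL=C cut -s -d '(' -f 2 | LC_ALL=C cut -d ')' -f 1
}

# Made-up sample input; the second line wraps Bob in non-UTF-8 quote bytes.
printf 'tom@example.com (Tom)\nbob@example.com (\223Bob\224)\n' | extract_nick
```

Note that this only fixes the byte-sequence error. Any quote characters inside the parentheses are still part of the output and would have to be stripped separately.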

Upvotes: 8

Vijay

Reputation: 67319

perl -lne 'print $1 if /\(([^)]*)\)/'

tested here

Upvotes: 0

Floris

Reputation: 46445

I think that

grep -o '(.*)' emailFile

should do it. "Go through all lines in the file. Look for a sequence that starts with an open paren, then any characters up to the last close paren on the line. Echo only the part that matches to stdout."

This preserves the quotes around the nickname, as well as the parentheses. If you don't want those, you can strip them:

grep -o '(.*)' emailFile | sed 's/[(")]//g'

("replace any of the characters between square brackets with nothing, everywhere")
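
A possible variant of the same idea, offered as a sketch rather than a tested fix for the asker's file: the pattern '([^)]*)' stops at the first close paren instead of the last, tr -d deletes the unwanted characters, and LC_ALL=C (borrowed from the locale suggestion in another answer) is assumed to keep both tools working on bytes that are invalid in the current locale. The function name nicknames and the sample addresses are hypothetical.

```shell
# Hypothetical helper; reads the files given as arguments, or stdin if none.
# grep -o prints each "(...)" match on its own line, where '([^)]*)' runs
# from "(" to the first ")". tr -d then deletes parentheses and double quotes.
nicknames() {
  LC_ALL=C grep -o '([^)]*)' "$@" | LC_ALL=C tr -d '()"'
}

# Made-up sample input; prints Tom and Bob on separate lines.
printf 'tom@example.com (Tom)\nbob@example.com ("Bob")\n' | nicknames
```

To run it against the file from the answer above, call nicknames emailFile.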

Upvotes: 1
