Reputation: 8623
I am trying to read a file in which each line holds an email address followed by a nickname. I want to extract the nickname, which is surrounded by parentheses, like below:
[email protected] (Tom)
So my thought was just to use cut to get at the word Tom, but this is foiled when I end up with something like the following:
[email protected] ("Bob")
Because Bob has quotes around it, the cut command fails as follows:
cut: <file>: Illegal byte sequence
Does anyone know of a better way of doing this, or a way to solve this problem?
Upvotes: 2
Views: 7031
Reputation: 81
Reset your locale to C (raw, uninterpreted byte sequences) to avoid Illegal byte sequence errors.
locale charmap
LC_ALL=C cut ... | LC_ALL=C sort ...
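As a minimal sketch of how this could apply to the question, the pipeline below splits on the parentheses with cut under the C locale. The function name extract_nick and the example.com sample lines are hypothetical, chosen only for illustration. It assumes each line contains a single parenthesized nickname.

```shell
# extract_nick is a hypothetical helper name, not something from the question.
# It reads lines such as 'user@example.com (Tom)' on stdin and prints the
# text found between the first '(' and the following ')'.
extract_nick() {
    # LC_ALL=C makes cut treat its input as raw bytes, so bytes that are
    # invalid in the current multibyte locale are passed through untouched.
    LC_ALL=C cut -d '(' -f 2 | LC_ALL=C cut -d ')' -f 1
}

# Sample input (made up); quotes around a nickname are kept as-is.
printf '%s\n' 'tom@example.com (Tom)' 'bob@example.com ("Bob")' | extract_nick
```

Setting LC_ALL only on the individual commands leaves the rest of the shell session in its normal locale.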
Upvotes: 8
Reputation: 46445
I think that
grep -o '(.*)' emailFile
should do it. "Go through all lines in the file. Look for a sequence that starts with an open paren, then any characters until a close paren. Echo the bit that matches to stdout." Note that .* is greedy, so on a line with more than one parenthesized group the match runs from the first ( to the last ).
This preserves the quotes around the nickname as well as the parentheses. If you don't want those, you can strip them:
grep -o '(.*)' emailFile | sed 's/[(")]//g'
("replace any of the characters between square brackets with nothing, everywhere")
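The two steps above could also be folded into a single sed substitution. This is only a sketch: the function name strip_nick and the sample lines are hypothetical, and it assumes the nickname itself contains no quotes or parentheses. LC_ALL=C is added so that sed, like cut, treats the input as raw bytes.

```shell
# strip_nick is a hypothetical helper name used for illustration.
# It prints the nickname from lines like 'user@example.com ("Bob")',
# dropping the parentheses and any double quotes directly inside them.
strip_nick() {
    # -n with the p flag prints only lines where the substitution matched.
    # \([^")]*\) captures the nickname: a run of characters that are
    # neither a double quote nor a closing parenthesis.
    LC_ALL=C sed -n 's/.*("*\([^")]*\)"*).*/\1/p'
}

# Sample input (made up).
printf '%s\n' 'tom@example.com (Tom)' 'bob@example.com ("Bob")' | strip_nick
```

Lines without a parenthesized group produce no output here, whereas the grep and sed pipeline above would also skip them.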
Upvotes: 1