Reputation: 33
I have a file containing three kinds of lines:
[ ] APPLE
[ORANGE ] * ORANGE on XXXXXXXXXXXXXXX
[YELLOW ] + BANANA on XXXXXXXXXXXXXXX
What I want to do now is to extract the fruit name like below:
APPLE
ORANGE
BANANA
I tried to extract it with echo ${line:start:end} before I realized that the length of each line might vary. So I guess I have to do it with pattern matching.
I'm new to bash. How should I extract the fruit name: with sed, awk, or some other way?
Thanks!
Upvotes: 2
Views: 1156
Reputation: 4514
This handles two-word fruit names like "star fruit", but it assumes that the trailing garbage (if any) starts with the word "on" (i.e. the "on XXXXXX" part). It also assumes that the fruit name starts after the first right square bracket ("]"):
sed -e 's/^[^]]*][^A-Za-z]*//' -e 's/\bon\b.*$//' -e 's/\s*$//' your_file
Explanations:
-e 's/^[^]]*][^A-Za-z]*//': Removes everything from the start of the line up to and including the first "]", plus any non-alphabetic characters that follow it.
-e 's/\bon\b.*$//': Removes the whole word "on" and everything after it up to the end of the line, if it exists.
-e 's/\s*$//': Removes any trailing whitespace left after the above processing.
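To see the three expressions working together, here is a minimal sketch that wraps them in a shell function and feeds it the sample lines from the question. The function name extract_fruit and the "star fruit" line are made up for illustration, and \b and \s are GNU sed extensions, so this assumes GNU sed.

```shell
# Hypothetical wrapper around the three sed expressions above.
# Assumes GNU sed (\b and \s are GNU extensions).
extract_fruit() {
  sed -e 's/^[^]]*][^A-Za-z]*//' \
      -e 's/\bon\b.*$//' \
      -e 's/\s*$//'
}

# Sample lines from the question, plus a made-up two-word name.
printf '%s\n' \
  '[ ] APPLE' \
  '[ORANGE ] * ORANGE on XXXXXXXXXXXXXXX' \
  '[YELLOW ] + BANANA on XXXXXXXXXXXXXXX' \
  '[GREEN ] * star fruit on XXXXXXXXXXXXXXX' |
  extract_fruit
```

With GNU sed this should print APPLE, ORANGE, BANANA and star fruit, one per line.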
Upvotes: 1
Reputation: 786031
You can use this awk command with a custom field separator to get your values:
awk -F '\\[[^]]+\\][ *+]+| *on *' '{print $2}' file
APPLE
ORANGE
BANANA
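To make the splitting visible, the sketch below prints every field the separator produces. The helper name show_fields is hypothetical; the separator is the one from the command above.

```shell
# Hypothetical helper: show the field count and each field in brackets.
show_fields() {
  awk -F '\\[[^]]+\\][ *+]+| *on *' '{
    out = "NF=" NF
    for (i = 1; i <= NF; i++) out = out " [" $i "]"
    print out
  }'
}

printf '%s\n' \
  '[ ] APPLE' \
  '[YELLOW ] + BANANA on XXXXXXXXXXXXXXX' |
  show_fields
```

The bracket prefix is consumed as a separator at the very start of the line, so $1 is empty and the fruit name lands in $2. Note that the " *on *" alternative matches "on" anywhere, so a lowercase name containing "on" (such as "melon") would be cut short.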
Upvotes: 0
Reputation: 2859
Use grep with extended regex (-E) and the -o flag to return only the matching parts:
grep -o -E 'SERVICE[_0-9A-Za-z]+' file
The + matches one or more of the bracketed characters, so numbers with more than one digit (greater than 9) are still returned in full.
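A small sketch of the effect of the +, using made-up input lines (the SERVICE_1 and SERVICE_12 names and the match_services wrapper are hypothetical, chosen only to fit the pattern above):

```shell
# Hypothetical wrapper around the grep command above.
match_services() {
  grep -o -E 'SERVICE[_0-9A-Za-z]+'
}

# Made-up input: one single-digit and one two-digit suffix.
printf '%s\n' \
  'start SERVICE_1 ok' \
  'start SERVICE_12 ok' |
  match_services
```

This should print SERVICE_1 and SERVICE_12. Without the +, the bracket expression would match exactly one character, and both lines would yield only SERVICE_.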
(Edited to match the changes in the question.)
Upvotes: 1
Reputation: 1726
Try this sed command:
sed 's/^\[....\] . \([A-Za-z0-9]*\).*/\1/' file
Upvotes: 1