Reputation: 183
I am trying to parse hashtags out of a file. For instance:
Some text here #Foo Some other text here....
I would like the output to be:
#Foo
The text before and after the # can change and I'm trying to apply this to multiple lines of the file. Every line will have a # in it as I already grep'd the file for hashtags.
Basically I'm trying to create a list of the hashtags that are contained in a file. If there is also a way to remove duplicated tags from the resulting output that would be a bonus.
Upvotes: 2
Views: 4274
Reputation: 42117
With sed:
sed -E 's/^[^#]*(#[^[:blank:]]*).*/\1/'
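^[^#]* matches the portion before the first #
(#[^[:blank:]]*) matches the # followed by any number of non-space/tab characters, and puts the match in capture group 1
.* matches the rest of the line
In the replacement, only capture group \1 is used, so everything else on the line is discarded. Note that this keeps just the first hashtag on each line.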
Example:
% sed -E 's/^[^#]*(#[^[:blank:]]*).*/\1/' <<<'Some text here #Foo Some other text here'
#Foo
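If a line can contain more than one hashtag, or duplicates should be dropped, a different approach may fit better. The following is a minimal sketch, assuming a grep that supports the -o option (GNU and BSD grep do; it is not required by POSIX). The function name extract_tags and the sample input are made up for illustration. It reuses the same #[^[:blank:]]* pattern, prints every match on its own line, and lets sort -u remove the duplicates:

```shell
# extract_tags is a hypothetical helper name; it reads text on stdin.
# grep -o prints each match on its own line, so several hashtags
# on one input line are all captured.
# sort -u sorts the tags and drops duplicates.
extract_tags() {
    grep -o '#[^[:blank:]]*' | sort -u
}

# Sample input (made up): the second line has two hashtags, one repeated.
printf '%s\n' \
    'Some text here #Foo Some other text here' \
    'More text #Bar and then #Foo again' | extract_tags
```

With a real file, something like extract_tags < file should work the same way. The sed command above can also be deduplicated by piping its output through sort -u, but it would still report only the first hashtag per line.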
Upvotes: 1