sed - Remove all of a line except matching pattern

Question

I am trying to extract hashtags from a file. For instance:

Some text here #Foo Some other text here....

I would like the output to be:

#Foo

The text before and after the # varies, and I want to apply this to every line of the file. Every line contains a #, since I have already grepped the file for hashtags.

Basically, I'm trying to build a list of the hashtags contained in a file. If there is also a way to remove duplicate tags from the output, that would be a bonus.

heemayl · Accepted Answer

With sed:

sed -E 's/^[^#]*(#[^[:blank:]]*).*/\1/'

^[^#]* matches everything before the first #
(#[^[:blank:]]*) matches the # followed by any number of non-space/tab characters and stores the match in capture group 1
.* matches the rest of the line
In the replacement, \1 inserts the contents of capture group 1, so the whole line is replaced by the hashtag
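One consequence of this pattern is that a line with no # does not match and is printed unchanged, and a line with several hashtags yields only the first. The question says every line contains a #, but if that might not hold, a minimal sketch of a stricter variant uses -n together with the p flag so that only lines where the substitution succeeded are printed. The function name first_tag_only is made up for illustration:

```shell
# Hypothetical helper: read lines on stdin and print the first hashtag
# of each line that contains one; lines without a '#' are dropped.
first_tag_only() {
  # -n suppresses automatic printing; the trailing 'p' prints only
  # lines on which the substitution actually happened.
  sed -nE 's/^[^#]*(#[^[:blank:]]*).*/\1/p'
}

printf '%s\n' 'no tag on this line' 'one #Foo and #Bar here' | first_tag_only
```

With this input, the first line is skipped and the second prints #Foo only.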

Example:

% sed -E 's/^[^#]*(#[^[:blank:]]*).*/\1/' <<<'Some text here #Foo Some other text here'
#Foo
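For the bonus part of the question (removing duplicates), one possible approach is to pipe the result through sort -u. The sketch below also shows an alternative that is not part of the answer above: grep -o, which prints every match on its own line and therefore picks up several hashtags per line. The -o option is available in GNU and BSD grep but is not required by POSIX, so treat it as an assumption about your system. Both function names are made up for illustration:

```shell
# Hypothetical helper: first hashtag of each input line, duplicates removed.
first_tags_unique() {
  sed -E 's/^[^#]*(#[^[:blank:]]*).*/\1/' | sort -u
}

# Hypothetical helper: every hashtag on every line, duplicates removed.
# Assumes a grep that supports -o (GNU and BSD grep do).
all_tags_unique() {
  grep -oE '#[^[:blank:]]+' | sort -u
}

printf '%s\n' 'a #Foo b' 'c #Bar d #Baz e' 'f #Foo' | first_tags_unique
printf '%s\n' 'a #Foo b' 'c #Bar d #Baz e' 'f #Foo' | all_tags_unique
```

Note that sort -u reorders the tags; if the original order of first appearance matters, a different deduplication step would be needed.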
