Reputation: 55

Remove the strings starting with # and everything after #

How can I remove the substrings starting with # and everything after #?

There are many of them, each on its own line. Each one starts with # and runs to the end of the line, and the value after # is always different. They are all 15 characters long. I want to delete everything from # through the end of the line, using sed or awk.

http://www.somesite/play/episodes/xyz/fred-episode-110#group=p02q32xl
http://www.somesite/play/episodes/abc/simon-episode-266#group=p03d924k
http://www.somesite/play/episodes/qwe/mum-episode-39#group=p03l1jpr
http://www.somesite/play/episodes/zxc/dad-episode-41#group=p03l1j9s
http://www.somesite/play/episodes/asd/bob-episode-57#group=p03l1j7g

Upvotes: 0

Answers (3)

Benjamin W.

Reputation: 52536

With cut – declare # as the field separator and print only the first field:
```
cut -d '#' -f 1 infile
```
With sed – replace everything from # on with the empty string:
```
sed 's/#.*//' infile
```
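The sed command above only prints the result; it does not change the file. If the goal is to update the file itself, one portable approach is to write to a temporary file and rename it afterwards. The following is a minimal sketch under stated assumptions: the sample file is created with `mktemp` purely for illustration, and the `.tmp` suffix is an arbitrary choice.
```shell
# Illustration only: create a throwaway sample file with two lines from the question.
infile=$(mktemp)
printf '%s\n' \
    'http://www.somesite/play/episodes/xyz/fred-episode-110#group=p02q32xl' \
    'http://www.somesite/play/episodes/abc/simon-episode-266#group=p03d924k' > "$infile"

# Write the stripped lines to a temporary file, and replace the
# original only if sed succeeded.
sed 's/#.*//' "$infile" > "$infile.tmp" && mv "$infile.tmp" "$infile"

cat "$infile"
```
Some sed implementations offer an in-place option (`-i`), but its syntax differs between GNU and BSD sed, so the temporary-file form is the safer default.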
With awk – declare # as field separator and print the first field:
```
awk -F'#' '{ print $1 }' infile
```
With Bash, taking advantage of the fact that it's always the last 15 characters:
```
while IFS= read -r line; do
    echo "${line:0:-15}"
done < infile
```
Note that this a) is very slow and b) requires Bash 4.2-alpha or newer, which added support for a negative length in the substring expansion.
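If the fixed length of 15 characters cannot be relied on, a pure-shell variant can strip from the first # instead, using the `%%` suffix-removal expansion. This is a sketch rather than a recommendation, since a shell loop is still slow compared to sed or cut; the function name `strip_fragment` is made up for this example.
```shell
# Hypothetical helper: reads lines on stdin and prints each one
# with everything from the first # onwards removed.
strip_fragment() {
    # The "|| [ -n ... ]" part also handles a last line without a trailing newline.
    while IFS= read -r line || [ -n "$line" ]; do
        # %%#* removes the longest suffix matching "#*".
        printf '%s\n' "${line%%#*}"
    done
}

# Example with two lines from the question.
printf '%s\n' \
    'http://www.somesite/play/episodes/qwe/mum-episode-39#group=p03l1jpr' \
    'http://www.somesite/play/episodes/zxc/dad-episode-41#group=p03l1j9s' |
    strip_fragment
```
To process a file, redirect it into the function: `strip_fragment < infile`. Lines without a # are printed unchanged.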
With Perl – splitting by #, taking the first field of the list and printing it with say to include a newline:
```
perl -nE 'say ((split /#/)[0])' infile
```
or, more concise and sed-ish (pointed out by mklement0):
```
perl -pe 's/#.*//' infile
```