bob

Reputation: 55

Remove the strings starting with # and everything after #

How can I remove the substrings starting with # and everything after #?

There are many such lines. Each one ends with a substring that starts with # and is always 15 characters long, although the value after # differs from line to line. I want to delete everything from # through the end of the line, using sed or awk.

http://www.somesite/play/episodes/xyz/fred-episode-110#group=p02q32xl
http://www.somesite/play/episodes/abc/simon-episode-266#group=p03d924k
http://www.somesite/play/episodes/qwe/mum-episode-39#group=p03l1jpr
http://www.somesite/play/episodes/zxc/dad-episode-41#group=p03l1j9s
http://www.somesite/play/episodes/asd/bob-episode-57#group=p03l1j7g

Upvotes: 0

Views: 69

Answers (3)

Benjamin W.

Reputation: 52536

  • With cut – declare # as the field separator and print only the first field:

    cut -d '#' -f 1 infile
    
  • With sed – replace everything from # on with the empty string:

    sed 's/#.*//' infile
    
  • With awk – declare # as field separator and print the first field:

    awk -F'#' '{ print $1 }' infile
    
  • With Bash, taking advantage of the fact that it's always the last 15 characters:

    while IFS= read -r line; do
        echo "${line:0:-15}"
    done < infile
    

    Note that this a) is very slow and b) requires Bash 4.2-alpha or newer, which is needed for the negative length value in the parameter expansion.

  • With Perl – splitting by #, taking the first field of the list and printing it with say to include a newline:

    perl -nE 'say ((split /#/)[0])' infile
    

    or, more concise and sed-ish (pointed out by mklement0):

    perl -pe 's/#.*//' infile
    

Upvotes: 3

mklement0

Reputation: 440677

To complement Benjamin W.'s helpful answer:

grep is another option:

  • If you do NOT want to include the #:

    grep -Eo '^[^#]+' file
    
  • If you DO want to include the #:

    grep -Eo '^[^#]+.' file
    

Upvotes: 1

Ferit

Reputation: 9737

Using a Python regex: match (.*?)(#.*) and substitute \1, which keeps only the part before the first #.
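A minimal sketch of how this could look, assuming Python 3 and its standard re module. The helper name strip_fragment and the two sample URLs (taken from the question) are only for illustration and are not part of the original answer:

```python
import re

# Pattern from the answer: group 1 lazily captures everything before the
# first '#'; group 2 captures the '#' and the rest of the line.
PATTERN = re.compile(r"(.*?)(#.*)")


def strip_fragment(line: str) -> str:
    """Return the line with the first '#' and everything after it removed.

    Hypothetical helper name; lines without a '#' are returned unchanged.
    """
    return PATTERN.sub(r"\1", line, count=1)


if __name__ == "__main__":
    urls = [
        "http://www.somesite/play/episodes/xyz/fred-episode-110#group=p02q32xl",
        "http://www.somesite/play/episodes/qwe/mum-episode-39#group=p03l1jpr",
    ]
    for url in urls:
        print(strip_fragment(url))
```

Because . does not match a newline by default, a trailing newline on a line read from a file is left in place. For this task the simpler pattern #.* replaced with an empty string should give the same result.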

Upvotes: 0
