Reputation: 1614
The following Regex works as expected in Notepad++:
^.*[^a-z\r\n].*$
However, when I try to use it with sed, it won't work.
sed -r 's/\(^.*[^a-z\r\n].*$\)//g' wordlist.txt
Upvotes: 0
Views: 1516
Reputation: 14038
Two things:
Sed is a stream editor. It processes one line of the input at a time. That means the search and replace commands, etc, can only see the current line. By contrast, Notepad++ has the whole file in memory and so its search expressions can span two or more lines.
Your command
sed -r 's/\(^.*[^a-z\r\n].*$\)//g' wordlist.txt
includes \( and \). With the -r flag these match literal round brackets rather than forming a group. So the command looks for a ( immediately before the start of the line and a ) immediately after the end of it, which can never match, so nothing is replaced. Rewriting the command as
sed -r 's/^.*[^a-z\r\n].*$//g' wordlist.txt
should have the desired effect. You could also remove the \r\n to give
sed -r 's/^.*[^a-z].*$//g' wordlist.txt
since sed strips the trailing newline before matching (if the file has Windows line endings, each line still ends in a carriage return, so the \r is worth keeping in that case). But neither of these will be exactly the same as the Notepad++ command, as they leave empty lines behind. So you may find that the command
sed -r '/^.*[^a-z].*$/d' wordlist.txt
is closer to what you really want.
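To make the difference between substituting and deleting concrete, here is a small sketch. The sample words are made up for illustration, and LC_ALL=C is an assumption added so that the a-z range behaves the same regardless of locale. It assumes a sed that accepts -r, such as GNU sed.

```shell
# Hypothetical sample input: two lowercase-only words and two offending lines.
input=$(printf 'apple\nb4nana\nDate\ncherry')

# s///: offending lines are emptied, but their newlines stay in the output.
substituted=$(printf '%s\n' "$input" | LC_ALL=C sed -r 's/^.*[^a-z].*$//')

# d: offending lines are removed from the output entirely.
deleted=$(printf '%s\n' "$input" | LC_ALL=C sed -r '/^.*[^a-z].*$/d')

printf 'After s///:\n%s\n' "$substituted"   # apple, two empty lines, cherry
printf 'After d:\n%s\n' "$deleted"          # apple, cherry
```

The d form is usually what a word-list cleanup needs, because no blank lines have to be removed afterwards.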
Upvotes: 1
Reputation: 35018
You could use:
sed -i '/[^a-z]/d' wordlist.txt
This will delete each line that contains any character other than a lowercase letter a-z, editing the file in place (no need to specify linefeeds).
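Because -i rewrites the file, it can help to preview the result first. The following is a minimal sketch that runs the same expression without -i on a temporary file; the file contents are hypothetical, and LC_ALL=C is an assumption to keep the a-z range locale-independent.

```shell
# Hypothetical word list written to a temporary file.
wordlist=$(mktemp)
printf 'alpha\nBeta\ngam ma\ndelta9\nepsilon\n' > "$wordlist"

# Without -i, sed prints the filtered lines to stdout and leaves the file as it is.
# Lines containing an uppercase letter, a space, or a digit are dropped.
kept=$(LC_ALL=C sed '/[^a-z]/d' "$wordlist")
printf '%s\n' "$kept"   # prints alpha and epsilon

rm -f "$wordlist"
```

Once the preview looks right, adding -i applies the same deletion to the file itself.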
EDIT:
Your regex doesn't work because you are trying to match
( bracket
^ beginning of line
...
$ end of line
) bracket
As you won't have a bracket and then the beginning of the line, your regex simply doesn't match anything.
Note also that an expression of
s/\(^.*[^a-z\r\n].*$\)//g
wouldn't delete a line; even when it matches, it only replaces the line with a blank one.
EDIT2:
Note that in sed the -r flag changes the behaviour of \( and \). Without the -r flag they are group delimiters, but with the -r flag they are just literal brackets.
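The effect of -r on brackets can be checked with a short sketch like the one below. The input strings are made up for illustration, and it assumes a sed that accepts -r, such as GNU sed.

```shell
# Basic regex (no -r): \( \) form a group, so \1 is the captured text.
bre=$(printf 'abc\n' | sed 's/\(b\)/[\1]/')

# Extended regex (-r): plain ( ) form the group instead.
ere=$(printf 'abc\n' | sed -r 's/(b)/[\1]/')

# Extended regex (-r): \( \) match literal brackets in the input.
literal=$(printf 'a(b)c\n' | sed -r 's/\(b\)/X/')

# With no literal brackets in the input, the same expression matches nothing.
nomatch=$(printf 'abc\n' | sed -r 's/\(b\)/X/')

printf '%s\n' "$bre" "$ere" "$literal" "$nomatch"
# a[b]c
# a[b]c
# aXc
# abc
```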
Upvotes: 2