Reputation: 423
I'm using sed because the files are large. I'd like to match strings of the form
'09/07/15 16:56:36,333000000','DD/MM/RR HH24:MI:SSXFF'
and replace them with
'09/07/15 16:56:36','DD/MM/RR HH24:MI:SS'
According to a regex tester, this regex matches:
'\d{2}\/\d{2}\/\d{2}\s\d{2}:\d{2}:\d{2},\d{9}','DD\/MM\/RR HH24:MI:SSXFF'
but when I do
sed -ie "s#\(\x27\d{2}\/\d{2}\/\d{2}\s\d{2}:\d{2}:\d{2}\),\d{9}\(\x27,\x27DD\/MM\/RR HH24:MI:SS\)XFF\x27#\1\2\x27#g" inputfile
it does not replace anything. What am I missing?
Upvotes: 2
Views: 93
Reputation: 1679
NOTE: the answer below describes why your expression doesn't work in general. I would strongly suggest that you first simplify your expression as much as possible, or use @StevenPenny's excellent answer.
The problem is that the regex engines of sed and http://regexr.com/ are somewhat different. See the "RegEx engine" section on the website:
While the core feature set of regular expressions is fairly consistent, different implementations (ex. Perl vs Java) may have different features or behaviours.
RegExr uses your browser's RegExp engine for matching, and its syntax highlighting and documentation reflect the JavaScript RegExp standard.
Recent versions of GNU sed, by contrast, are mostly compatible with POSIX.2 Basic Regular Expressions (BREs). See this excerpt from the sed(1) manpage for GNU sed, version 4.2.2:
REGULAR EXPRESSIONS
POSIX.2 BREs should be supported, but they aren't completely because of performance problems. The \n sequence in a regular expression matches the newline character, and similarly for \a, \t, and other sequences.
The POSIX regex languages (BRE, Basic Regular Expressions, and ERE, Extended Regular Expressions) are described in the regex(7) manpage.
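As a small illustration of how the two dialects differ, the sketch below applies the same interval expression once in BRE and once in ERE syntax. It assumes a sed that accepts the -E option for ERE (GNU sed and the BSD seds do); the sample line is the one from the question, and N is just a placeholder replacement.

```shell
# Sample line taken from the question.
line="'09/07/15 16:56:36,333000000','DD/MM/RR HH24:MI:SSXFF'"

# BRE (sed's default): interval braces must be escaped.
bre=$(printf '%s\n' "$line" | sed 's/[0-9]\{9\}/N/')

# ERE (enabled with -E): interval braces are written bare.
ere=$(printf '%s\n' "$line" | sed -E 's/[0-9]{9}/N/')

printf '%s\n%s\n' "$bre" "$ere"
```

Both commands should print the same line, with the nine-digit fraction replaced by N.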
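In particular, concerning your expression:
for digits, you're using \d, while in BRE you should write [[:digit:]];
for white space, you're using \s, whereas in BRE there's [[:space:]] (GNU sed also accepts \s as an extension, but [[:space:]] is the portable form);
for repetition counts, you're using {, which in BRE should be \{ (and likewise \} for the closing brace).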
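Putting those three points together, one possible BRE rewrite of the command from the question is sketched below. It is an untested-on-your-data sketch, not the only way to write it: it assumes the script is double-quoted so a plain ' can stand in for \x27, uses [0-9] as a shorter equivalent of [[:digit:]] for ASCII digits, and uses a literal space instead of [[:space:]]. It reads from a variable rather than editing inputfile in place.

```shell
# Sample line taken from the question.
line="'09/07/15 16:56:36,333000000','DD/MM/RR HH24:MI:SSXFF'"

# BRE version of the original expression:
#   \d   -> [0-9]
#   {n}  -> \{n\}
#   \s   -> a literal space
# With # as the delimiter, the slashes need no escaping.
fixed=$(printf '%s\n' "$line" |
  sed "s#\('[0-9]\{2\}/[0-9]\{2\}/[0-9]\{2\} [0-9]\{2\}:[0-9]\{2\}:[0-9]\{2\}\),[0-9]\{9\}\(','DD/MM/RR HH24:MI:SS\)XFF'#\1\2'#g")

printf '%s\n' "$fixed"
```

One further detail when applying this to a file: GNU sed reads -ie as -i with the backup suffix e, so it leaves a copy named inputfilee behind. Writing -i -e avoids that.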
Upvotes: 0
Reputation: 1
Why not just use something like this?
#!/usr/bin/sed -f
s/,[[:digit:]]*//
s/XFF//
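For reference, this is how the script behaves on the sample line from the question, run inline with -e instead of from a script file. The second, stricter variant is an added assumption, not part of the script above: it guards against lines where another comma appears before the timestamp, and the row value used to show that is hypothetical.

```shell
# Sample line taken from the question.
line="'09/07/15 16:56:36,333000000','DD/MM/RR HH24:MI:SSXFF'"

# The two substitutions from the script above, passed inline with -e.
simple=$(printf '%s\n' "$line" | sed -e 's/,[[:digit:]]*//' -e 's/XFF//')

# Hypothetical line with an earlier comma (an assumption, not from the question).
# On such input the first substitution above would delete the comma after "1",
# so this variant requires exactly nine digits followed by a closing quote.
row="(1,'09/07/15 16:56:36,333000000','DD/MM/RR HH24:MI:SSXFF')"
strict=$(printf '%s\n' "$row" | sed -e "s/,[[:digit:]]\{9\}'/'/" -e "s/XFF'/'/")

printf '%s\n%s\n' "$simple" "$strict"
```

If every line in the file has exactly the shape shown in the question, the simpler form is enough.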
Upvotes: 2