Reputation: 11
A log file has these patterns appearing more than once in a line. For example, the file may look like:
dsads utc-hour_of_year:2013-07-30T17 jdshkdsjhf utc-week_of_year:2013-W31 dskjdskf
utc-week_of_year:2013-W31 dskdsld fdsfd
dshdskhkds utc-month_of_year:2013-07 gfdkjlkdf
I want to replace all date-specific info with "Y".
I tried replaceAll("_year:.*\\s", "_year:Y "); but it removes everything that occurs after the first replacement, due to the greedy match of .*, giving:
dsads utc-hour_of_year:Y
utc-week_of_year:Y
dshdskhkds utc-month_of_year:Y
but the expected result is:
dsads utc-hour_of_year:Y jdshkdsjhf utc-week_of_year:Y dskjdskf
utc-week_of_year:Y dskdsld fdsfd
dshdskhkds utc-month_of_year:Y gfdkjlkdf
Upvotes: 0
Views: 104
Reputation: 124215
I am not sure what you are really trying to do, so this answer is based only on your example. If you want something else, leave a comment below or edit your question with a more specific example.
It removes everything after _year: because you are using .*\\s, which means .* (zero or more of any character except a newline) followed by \\s (a whitespace character). So in the sentence
utc-hour_of_year:2013-07-30T17 dsfsdgfsgf utc-week_of_year:2013-W31 dsfsdgfsdgf
it will match
utc-hour_of_year:2013-07-30T17 dsfsdgfsgf utc-week_of_year:2013-W31 dsfsdgfsdgf
// the match runs from the first "_year:" through the last space, just before the final "dsfsdgfsdgf"
because by default the * quantifier is greedy. To make it reluctant, add ? after the *, so maybe try
"_year:.*?\\s"
Even better, instead of .*? match only non-whitespace characters using \\S, which is the negation of \\s and can also be written as [^\\s]. Also, if the date can appear at the end of your input, you probably shouldn't require \\s at the end of your regex (or add a space in your replacement). So try one of these, depending on whether a trailing whitespace character is guaranteed:
.replaceAll("_year:\\S*", "_year:Y")
.replaceAll("_year:\\S*\\s", "_year:Y ")
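To make the difference concrete, here is a minimal sketch that runs the greedy, reluctant, and \\S* patterns discussed above against a line from the question. The class and method names (YearMaskDemo, maskGreedy, and so on) are made up for illustration; only String.replaceAll is standard Java.

```java
public class YearMaskDemo {
    // Greedy: .* runs to the end of the line, then backs off to the LAST whitespace.
    static String maskGreedy(String line) {
        return line.replaceAll("_year:.*\\s", "_year:Y ");
    }

    // Reluctant: .*? stops at the FIRST whitespace after "_year:".
    static String maskReluctant(String line) {
        return line.replaceAll("_year:.*?\\s", "_year:Y ");
    }

    // Non-whitespace run: needs no trailing whitespace, so it also works at end of input.
    static String maskNonSpace(String line) {
        return line.replaceAll("_year:\\S*", "_year:Y");
    }

    public static void main(String[] args) {
        String line = "dsads utc-hour_of_year:2013-07-30T17 jdshkdsjhf utc-week_of_year:2013-W31 dskjdskf";
        System.out.println(maskGreedy(line));    // dsads utc-hour_of_year:Y dskjdskf
        System.out.println(maskReluctant(line)); // dsads utc-hour_of_year:Y jdshkdsjhf utc-week_of_year:Y dskjdskf
        System.out.println(maskNonSpace(line));  // dsads utc-hour_of_year:Y jdshkdsjhf utc-week_of_year:Y dskjdskf

        // Date as the very last token: only the \\S* version replaces it.
        String tail = "dshdskhkds utc-month_of_year:2013-07";
        System.out.println(maskReluctant(tail)); // dshdskhkds utc-month_of_year:2013-07
        System.out.println(maskNonSpace(tail));  // dshdskhkds utc-month_of_year:Y
    }
}
```

The \\S* variant is probably the safest default here, since it does not depend on what follows the date.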
Upvotes: 1
Reputation: 129477
Try using a reluctant quantifier, _year:.*?\s:
.replaceAll("_year:.*?\\s", "_year:Y ")
System.out.println(
    "utc-hour_of_year:2013-07-30T17 dsfsdgfsgf utc-week_of_year:2013-W31 dsfsdgfsdgf"
        .replaceAll("_year:.*?\\s", "_year:Y "));
utc-hour_of_year:Y dsfsdgfsgf utc-week_of_year:Y dsfsdgfsdgf
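One caveat with this pattern: it requires a whitespace character after the date, so a date that is the last token of the input is left alone. A possible variation, sketched below and not part of the original suggestion, swaps the trailing \\s for a lookahead (?=\\s|$) and precompiles the pattern so it can be reused for every log line. The names ReluctantYearMask, YEAR_VALUE and mask are hypothetical.

```java
import java.util.regex.Pattern;

public class ReluctantYearMask {
    // Reluctant match that stops just before the next whitespace or the end of input.
    // The lookahead consumes nothing, so the replacement needs no trailing space.
    static final Pattern YEAR_VALUE = Pattern.compile("_year:.*?(?=\\s|$)");

    static String mask(String line) {
        return YEAR_VALUE.matcher(line).replaceAll("_year:Y");
    }

    public static void main(String[] args) {
        String tail = "dshdskhkds utc-month_of_year:2013-07";

        // Original pattern: no whitespace follows the date, so nothing is replaced.
        System.out.println(tail.replaceAll("_year:.*?\\s", "_year:Y "));
        // prints: dshdskhkds utc-month_of_year:2013-07

        // Lookahead variant: the trailing date is replaced as well.
        System.out.println(mask(tail));
        // prints: dshdskhkds utc-month_of_year:Y
    }
}
```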
Upvotes: 1