Reputation: 123
Can anyone please tell me a regex that finds the numbers that are multiples of 4 in a given string? The string will contain both text and numbers.
Thanks in advance
Upvotes: 0
Views: 4988
Reputation: 3365
Complaining that regex isn't the right tool for the job doesn't really answer the question, and I think it is somewhat counterproductive. It may well be true that the asker is just unaware that there is a better way. However, maybe he is building a lexer for an entirely new language compiler which only takes numbers with certain divisors as tokens?
That may be unlikely and impractical, but my point is that passing judgement on an inferred motive doesn't do anybody any good... ANYWAY...
I think this is an interesting question, if for no other reason than the academic challenge it presents, and to answer it: yes, there is a way to use regex to pick out multiples of four.
Ultimately regex is just a pattern matcher, right? So what patterns do multiples of four form? To answer this question I wrote a quick program to print out all multiples of four from 1 to 500 (try it ;)
import java.io.FileWriter;
import java.io.IOException;

public class Four {
    public static void main(String[] args) {
        // Build "4|8|12|...|500|": every multiple of four up to 500, each followed by "|".
        StringBuilder myFour = new StringBuilder();
        int i = 1;
        int mult = 0;
        while (mult < 500) {
            mult = i * 4;
            myFour.append(mult).append("|");
            i++;
        }
        // Write the list to out.txt; try-with-resources closes the writer.
        try (FileWriter writer = new FileWriter("out.txt")) {
            writer.write(myFour.toString());
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
What I noticed is that the last digit of each number cycles through 4, 8, 2, 6, 0. You might be tempted to use this immediately and just check whether a string of digits ends in one of these, but that wouldn't work, since plenty of integers end with those digits without being divisible by four, such as 10, 14, 18, 22, 26, etc. And so the search continues. Next I looked at the last two digits and noticed a pattern that repeats between 0 and 100:
4|8|12|16|20|24|28|32|36|40|44|48|52|56|60|64|68|72|76|80|84|88|92|96|100|...|204|208|212|...
If you prefix the single digits with zeros, you'll notice that this pattern repeats with every increment of 100. So now I'm feeling pretty confident that I'm on to something. To test my theory further I pulled up Google and typed in 2147483648 % 4 (2147483648 is the next number past the maximum 32-bit signed int value that is divisible by 4). This was just the first arbitrary value that came to mind and has no other meaning that I'm aware of. As it turns out, 2147483648 % 4 = 0, and the number ends in 48, so I'm feeling really good right now. I suppose you could write out a mathematical proof that this theory works, but I'm more into application. So I figure at this point all I have to do is write up the regex, and then I can test it against the output of the program above. My next goal is to write the actual regex.
You may have noticed that I conveniently made the program print the regex OR operator between the numbers, so I can just cut and paste most of the regex and I'm halfway home. All I want are the last two digits, so the first part of my regex looks something like this:
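For anyone who would rather check the last-two-digits theory than take it on faith, here is a minimal brute-force sketch. It rests on the fact that 100 = 4 * 25, so everything above the last two digits contributes nothing to the remainder modulo 4. The class and method names (LastTwoDigitsCheck, ruleHolds, countViolations) are made up for this illustration and are not part of the original program.

```java
// Brute-force check of the "last two digits" rule (illustrative sketch).
class LastTwoDigitsCheck {
    // True if n and its last two digits (n % 100) agree on divisibility by 4.
    static boolean ruleHolds(long n) {
        return (n % 4 == 0) == ((n % 100) % 4 == 0);
    }

    // Counts the integers in [0, limit] for which the rule fails.
    static int countViolations(long limit) {
        int violations = 0;
        for (long n = 0; n <= limit; n++) {
            if (!ruleHolds(n)) {
                violations++;
            }
        }
        return violations;
    }

    public static void main(String[] args) {
        System.out.println("violations up to 1,000,000: " + countViolations(1_000_000L));
        System.out.println("rule holds for 2147483648: " + ruleHolds(2147483648L));
    }
}
```

A brute-force loop is not a proof, but it should report zero violations for any range you care to try.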
(00|04|08|12|16|20|24|28|32|36|40|44|48|52|56|60|64|68|72|76|80|84|88|92|96)
You'll notice I prefixed zeros to the single digits and added 00 to the front. Again, this is because I want to match the last TWO chars, including the 00 from 100 (this will also accept strings of zeros as a valid multiple of four, as it should). So now I have my regex suffix written. According to my theory, any string of digits suffixed by one of the aforementioned two-digit pairs is a multiple of four, so I just need to write a rule for the prefix (any digits) and I'm done. This is very easy and is just [0-9]* So now my regex looks like this:
[0-9]*(00|04|08|12|16|20|24|28|32|36|40|44|48|52|56|60|64|68|72|76|80|84|88|92|96)
Now I'm almost done. What have I forgotten? Single digits!!! 0, 4 and 8 will be rejected by the regex above, since the pattern only matches two digits preceded by zero or more digits. So I have to tweak the regex a little, and I end up with this:
(0|4|8)|([0-9]*(00|04|08|12|16|20|24|28|32|36|40|44|48|52|56|60|64|68|72|76|80|84|88|92|96))
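A quick way to test this pattern, sketched here in Java: apply it to each number written out as a string and compare against a plain modulo test. This assumes the pattern is applied to a whole string of digits (Matcher.matches()), not searched for inside longer text. The names FullMatchCheck, regexSaysMultiple and firstDisagreement are hypothetical.

```java
import java.util.regex.Pattern;

class FullMatchCheck {
    // The pattern from the text, used for whole-string matching.
    static final Pattern MULTIPLE_OF_FOUR = Pattern.compile(
        "(0|4|8)|([0-9]*(00|04|08|12|16|20|24|28|32|36|40|44|48|52|56|60|64|68|72|76|80|84|88|92|96))");

    // True if the entire digit string is accepted by the pattern.
    static boolean regexSaysMultiple(String digits) {
        return MULTIPLE_OF_FOUR.matcher(digits).matches();
    }

    // Returns the first n in [0, limit] where the regex and n % 4 disagree, or -1 if there is none.
    static int firstDisagreement(int limit) {
        for (int n = 0; n <= limit; n++) {
            boolean expected = (n % 4 == 0);
            if (regexSaysMultiple(Integer.toString(n)) != expected) {
                return n;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // -1 means the regex and the modulo test agree on every number in the range.
        System.out.println(firstDisagreement(500));
    }
}
```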
And that's pretty much it. Technically you would also have to add word boundaries, since you want to treat the entire string of digits as a word. Note that the whole alternation needs its own group, otherwise each \b binds to only one of the two alternatives. You would add the boundaries like this:
\b((0|4|8)|([0-9]*(00|04|08|12|16|20|24|28|32|36|40|44|48|52|56|60|64|68|72|76|80|84|88|92|96)))\b
But whether or not you do that depends on your application. If you were going to use this in a lexer you might be building with JFlex, for instance, you might not want to include them, since you could have other rules for similar lexemes.
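To pull the numbers out of mixed text, as the question asks, something like the following sketch could work. One assumption that differs from the regex above: \b sees no boundary between a letter and a digit, so "abc12" would be skipped. The sketch therefore swaps \b for digit-only lookarounds, (?<![0-9]) and (?![0-9]). The class name FindMultiplesOfFour and the sample text are made up for the illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FindMultiplesOfFour {
    // Two-digit endings of multiples of four, as listed in the text.
    static final String SUFFIX =
        "00|04|08|12|16|20|24|28|32|36|40|44|48|52|56|60|64|68|72|76|80|84|88|92|96";

    // Assumption: a number is a maximal run of digits, so the lookarounds only
    // forbid a digit directly before or after the match (letters are allowed).
    static final Pattern MULTIPLE_OF_FOUR = Pattern.compile(
        "(?<![0-9])(?:[048]|[0-9]*(?:" + SUFFIX + "))(?![0-9])");

    // Returns every digit run in the text that is a multiple of four, in order.
    static List<String> find(String text) {
        List<String> found = new ArrayList<>();
        Matcher m = MULTIPLE_OF_FOUR.matcher(text);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(find("abc12 and 7, then 100x, 31, 8, 1234, 2048."));
    }
}
```

For the sample string this prints [12, 100, 8, 2048]. Signs and decimal points are not handled; whether "-12" or "3.16" should count depends on your input.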
So all in all that's how I would do it. It's probably not the shortest or most concise regex, and I'm sure there are better ways to do it, but if you're looking for something quick and dirty, I don't think it gets any quicker or dirtier. Also, I thought it might help if I walked you through my thought process. The downside of being quick and dirty is that I could be entirely wrong, and if so, now you can see exactly where I was derailed and put the train back on the tracks yourself ;) Hope this helps....
Upvotes: 4
Reputation: 91518
Regex isn't the right tool for the job, but if you really want to, give this a try:
/([02468][048]|[13579][26])(\D|$)/
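The character-class idea can be sanity-checked with a short sketch. It assumes the first class is meant to cover every even tens digit, i.e. [02468], and it tests only the two-digit core of the pattern (without the trailing (\D|$)) against all strings from "00" to "99". The names SuffixPatterns and mismatches are hypothetical.

```java
import java.util.regex.Pattern;

class SuffixPatterns {
    // Even tens digit followed by 0, 4 or 8; odd tens digit followed by 2 or 6.
    static final Pattern COMPACT = Pattern.compile("[02468][048]|[13579][26]");

    // Counts the two-digit strings "00".."99" where the pattern and n % 4 disagree.
    static int mismatches() {
        int count = 0;
        for (int n = 0; n < 100; n++) {
            String twoDigits = (n < 10 ? "0" : "") + n;
            boolean byRegex = COMPACT.matcher(twoDigits).matches();
            if (byRegex != (n % 4 == 0)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println("mismatches: " + mismatches());
    }
}
```

Keep in mind that the two-digit core on its own never matches the single digits 0, 4 and 8, so those would need an extra alternative.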
Upvotes: 3
Reputation: 152284
With regex you can only extract whole numbers. However it is possible to extract only even numbers:
(\d*[02468])
Then check whether each one is divisible by 4 with a modulo test:
if ( ( number != 0 ) && ( number % 4 == 0 ) ) {
    // number is divisible by 4 and does not equal 0
}
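Put together, the extract-then-check approach might look like the sketch below. Two assumptions go beyond the snippet above: a lookahead (?!\d) is appended so that the even digit really is the last digit of the run (otherwise "12" would be picked out of "123"), and BigInteger is used so that very long digit runs don't overflow. The class name ExtractThenFilter is made up; zero is excluded, as in the test above.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ExtractThenFilter {
    // A run of digits ending in an even digit, not followed by another digit.
    static final Pattern EVEN_NUMBER = Pattern.compile("\\d*[02468](?!\\d)");
    static final BigInteger FOUR = BigInteger.valueOf(4);

    // Returns the non-zero multiples of four found in the text, in order.
    static List<String> multiplesOfFour(String text) {
        List<String> result = new ArrayList<>();
        Matcher m = EVEN_NUMBER.matcher(text);
        while (m.find()) {
            BigInteger number = new BigInteger(m.group());
            // Same condition as the modulo test: non-zero and remainder 0.
            if (number.signum() != 0 && number.mod(FOUR).signum() == 0) {
                result.add(m.group());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(multiplesOfFour("a16 b22 c0 d123 e4000000000000000000000"));
    }
}
```

For the sample string this prints [16, 4000000000000000000000]: 22 is even but fails the modulo test, 0 is excluded, and 123 is never extracted.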
Upvotes: 1