Reputation: 6487
I know regular expressions are very powerful, and to become an expert with them is not easy.
One of my colleagues once wrote a java class to parse formatted text files. Unfortunately it caused a StackOverFlowError in the first integration test. It seems difficault to find the bug, before another colleague from structural programming world came over and fixed it quickly by thowing away all regular expressions and instead using many nested conditional statements and many split and trim methods, and it works very well!
Well, why do we need regular expression in a programming language like Java? As far as I know, the only necessary usage of regular expression is the find/replace function in text editors.
Upvotes: 1
Views: 428
Reputation: 29490
If you find that a regular expression would get too complex and unmaintable, use code instead. Regular expressions can get very complex even for things that sound very simple at first. For example validation of dates in the format mm/dd/yy[yy] is as "simple" as:
^(((((((0?[13578])|(1[02]))[\.\-/]?((0?[1-9])|([12]\d)|(3[01])))|(((0?[469])|(11))[\.\-/]?((0?[1-9])|([12]\d)|(30)))|((0?2)[\.\-/]?((0?[1-9])|(1\d)|(2[0-8]))))[\.\-/]?(((19)|(20))?([\d][\d]))))|((0?2)[\.\-/]?(29)[\.\-/]?(((19)|(20))?(([02468][048])|([13579][26])))))$
Nobody can maintain that. Manually parsing the date will need more code but can be much more readable and maintainable.
Regular expressions are very powerful and useful for matching TEXT patterns, but are bad for validation with numeric parts like dates.
Upvotes: 2
Reputation: 1601
Anything that can be expressed as a regular expression can, by definition, be expressed as a chain of IFs. You use REGEX basically for two reasons:
If your expression gets too complex, the use the advice given by this answer. If it get truly nasty, think about learning how to use a parser generator like ANTLR or JavaCC. A simple grammar usually can replace a regex, and it is a lot easier to maintain.
Upvotes: 3
Reputation: 74760
Regular expressions can be easier to read, but they can also be too complicated. It depends on the format of data you want to match.
The Java RE implementation still has some quirks, with the effect that some quite simple expressions (like '((?:[^'\\]|\\.)*)'
) cause a stack overflow when matching longer strings. So make sure you test with real life data (and more extreme examples, too) - or use a regex engine with a different implementation (there are several ones, also as Java libraries).
Upvotes: 1
Reputation: 40176
Regular expression is very powerful in looking for patterns in the content. You can certainly avoid using regular expression and rely on the conditional statements, but you will soon notice that it takes many lines of code to accomplish the same task. Using too many nested conditional statements increases the cyclomatic complexity of your code, as a result, it becomes even more difficult to test because there are too many branches to test. Further, it also makes the code difficult to read and understand.
Granted, your colleague should have written testcases to test his regular expressions first.
There's no right or wrong answer here. If the task is simple, then there's no need to use regular expression. Otherwise, it is nice to sprinkle a little regular expressions here and there to make your code easy to read.
Upvotes: 0
Reputation: 39907
You can use regex cleverly by spliting those into smaller chunks, something like,
final String REGEX_SOMETHING = "something";
final String REGEX_WHATEVER = "whatever";
..
String REGEX_COMPLETE = REGEX_SOMETHING + REGEX_WHATEVER + ...
Upvotes: 1
Reputation: 16190
Regular expressions are a tool (like many others). You should use it when the work to be done could best be done with that tool. To know which tool to use, it helps ask a question like "When could I use regular expressions?". And of course it will become easier to decide which tool to use when you have many different tools in your toolbox and you know them fairly well.
Upvotes: 1
Reputation: 4385
As always, you should use the best tool for the job. I would define the "best tool" by the most simple, understandable, effective method that fulfills the requirements.
Often regexes will simplify code and make it more readable. But this is not always the case.
Also, I would not jump to conclusions that regexes caused the StackOverflowError.
Upvotes: 1
Reputation: 1387
So the multiple nested conditional statements with many split and trim methods are easier for you to debug than a single line or two with regular expressions?
My preference is regular expressions because once you learn them, they are far more maintainable and far easier to read than parsing huge nested if loops.
Upvotes: 2
Reputation: 29927
Like everything else: Use with care and KISS
I use regexes quite often, but I don't go over the top and write a 100 character regex, because I know that I (personally) won't understand it later... in fact I think my limit is about 30-40 characters, something larger than that makes me spend too much time scratching my head.
Upvotes: 4