Reputation: 7757
I have a web application that needs to do several heavy text processing tasks: removing certain characters and parsing XML files, among others. Some of them involve regular expressions.
Parts of the web application are implemented in Java and others in PHP. Is it worth using Perl or another language geared toward text processing for such tasks, or is there really no difference compared with using PHP?
I have even thought of using sed, awk, or maybe some compiled C programs to process the text. There's a lot of text to be processed...
Upvotes: 3
Views: 2061
Reputation: 1003
I too suggest you use Perl; it's made for text crunching.
However, if you are going to parse or process XML, please don't try to roll your own solution. There are several high-quality modules that do the job correctly. As a starting point, I recommend you take a look at XML::Twig.
Also, for regular expressions, there are dozens of ready-made patterns in the Regexp::Common distribution. Most probably you'll find what you need there, and it will save you time.
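To illustrate why a real parser beats a hand-rolled pattern, here is a minimal sketch. It is written in Python rather than Perl, with the standard library's xml.etree.ElementTree standing in for a module like XML::Twig. The sample document and the function names are made up for this illustration.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical sample input, invented for this illustration.
SAMPLE = """<posts>
  <post id="1"><title>Tom &amp; Jerry</title></post>
  <post id="2"><title><![CDATA[1 < 2]]></title></post>
</posts>"""


def titles_with_parser(xml_text):
    """Extract title text with a real XML parser.

    The parser decodes entities and unwraps CDATA sections for us.
    """
    root = ET.fromstring(xml_text)
    return [title.text for title in root.iter("title")]


def titles_with_regex(xml_text):
    """Naive hand-rolled extraction: returns the raw markup between the tags."""
    return re.findall(r"<title>(.*?)</title>", xml_text)


if __name__ == "__main__":
    print(titles_with_parser(SAMPLE))
    print(titles_with_regex(SAMPLE))
```

The parser version returns the decoded text, while the regex version leaves the entity and the CDATA wrapper in place, which is the kind of detail a home-grown solution tends to miss.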
Upvotes: 9
Reputation: 6802
Yes. Text processing is Perl's #1 strong point. Since you will integrate it into your existing app, you'll need to execute an external program, so think about how to run it securely and perhaps as a background process (to avoid startup delays in your real-time web app).
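As a rough sketch of the "run it securely" part only (it does not cover keeping a long-lived background worker), something like the following could work. It is in Python for brevity, and the function name run_text_filter and the timeout value are assumptions for this example. The main ideas are to pass the command as an argument list so no shell interprets user input, to send the text on stdin, and to enforce a timeout.

```python
import subprocess
import sys


def run_text_filter(command, text, timeout_seconds=10.0):
    """Send text to an external program on stdin and return its stdout.

    command is a list of arguments, so no shell is involved and nothing
    in the text or the arguments gets interpreted as shell syntax.
    Raises subprocess.TimeoutExpired if the program runs too long and
    subprocess.CalledProcessError if it exits with a non-zero status.
    """
    completed = subprocess.run(
        command,
        input=text,
        capture_output=True,
        text=True,
        timeout=timeout_seconds,
        check=True,
    )
    return completed.stdout


if __name__ == "__main__":
    # Stand-in for a Perl script: uppercase stdin with the current interpreter.
    uppercase = [
        sys.executable,
        "-c",
        "import sys; sys.stdout.write(sys.stdin.read().upper())",
    ]
    print(run_text_filter(uppercase, "hello"))
```

Assuming perl is installed and on the PATH, the command list could instead be something like ["perl", "-pe", "tr/a-z/A-Z/"]. Java's ProcessBuilder and PHP's proc_open accept commands in a similar way.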
Upvotes: 2
Reputation: 473
Perl is THE language for text processing. It was designed with this in mind.
Upvotes: 7
Reputation: 12816
Yes, Perl was designed with processing text in mind.
It has tons of useful text processing features, and it was the first language I used (long ago) that had regular expressions.
http://en.wikipedia.org/wiki/Perl
Upvotes: 3
Reputation: 134551
Text processing is exactly what Perl was created for. After all, it's the Practical Extraction and Report Language. On the other hand, for a web application I'd prefer Python.
Upvotes: 3
Reputation: 30225
Yes, Perl is a good option. As a language, it's definitely more suitable for those kinds of tasks than Java or PHP. If you have the Perl knowledge, I would recommend it for this kind of task.
Upvotes: 17