sod

Reputation: 3928

PHP dealing with huge string

I have to replace xmlns with ns in my incoming XML in order to fix SimpleXMLElement's xpath() function. Most string functions perform acceptably, but there always seems to be an overhead that grows with the string.

For example, preg_replace on a 2 MB string takes 50 ms, even if I limit the number of replacements to 1 and the match is at the very beginning.

If I substr the first few characters and replace only in that part, it is slightly faster, but that is not really what I want.

Is there a PHP method that would perform better for my problem? And if there is no such option, could a simple PHP extension help, one that just does replace => SimpleXMLElement in C?

Upvotes: 0

Views: 146

Answers (4)

rid

Reputation: 63560

If you know exactly where the offending "x", "m" and "l" are, you can overwrite them in place with something like $xml[$x_pos] = ' '; $xml[$m_pos] = ' '; $xml[$l_pos] = ' '; to turn them into spaces. Alternatively, overwrite xmlns with ns___ (where _ = space), which keeps the string length unchanged.
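A minimal sketch of that in-place overwrite, assuming the first ' xmlns=' in the document is the declaration to neutralise. The sample document and the variable names are made up for illustration, and whether PHP really avoids copying the string depends on whether the value is shared elsewhere at that point.

```php
<?php
// Hypothetical sample; the real input would be the incoming 2 MB XML.
$xml = '<root xmlns="http://example.com/ns"><item>1</item></root>';

// Find the first default-namespace declaration. The leading space and the
// trailing "=" keep this from matching prefixed forms such as xmlns:foo=.
$pos = strpos($xml, ' xmlns=');

if ($pos !== false) {
    // Overwrite the five bytes of "xmlns" with "ns" plus three spaces,
    // so the string keeps its length and nothing has to be shifted.
    $replacement = 'ns   ';
    for ($i = 0; $i < 5; $i++) {
        $xml[$pos + 1 + $i] = $replacement[$i];
    }
}

echo $xml, "\n";
```

The result is still well-formed XML, because whitespace is allowed between an attribute name and the equals sign.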

Upvotes: 2

gaRex

Reputation: 4225

Looking at the PHP source, for example here: http://svn.php.net/repository/php/php-src/branches/PHP_5_2/ext/standard/string.c

I don't see any copies being made, but I'm no expert in C. On the other hand, there are many convert-to-string calls, which at first sight could copy values. If they do copy values, then we are in trouble here.

Only if we are in trouble: try to reinvent the str_replace wheel with character-by-character processing. For example, take the string $somestring = "somevalue". In PHP we can access its characters by index: echo $somestring{0} gives "s" and echo $somestring{2} gives "m" (the square-bracket form $somestring[0] is equivalent, and the curly-brace form was removed in PHP 8). I'm not sure about this approach, but it is an option if the official implementations don't use references, as they should.

Upvotes: 0

Lightness Races in Orbit

Reputation: 385295

50ms sounds pretty reasonable to me, for something like this. The requirement itself smells of something being wrong.

Is there any particular reason that you're using regular expressions? Why do people keep jumping to the overkill regex solution?

There is a bog-standard string replace function called str_replace that may do what you want in a fraction of the time (though whether this is right for you depends on how complex your search/replace is).
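As a rough sketch of the non-regex route for a single replacement: str_replace would replace every occurrence, so this version uses strpos plus substr_replace to touch only the first one. The sample document, the ' xmlns=' needle and the XPath expression are assumptions for illustration, not taken from the question.

```php
<?php
// Hypothetical sample; the real input would be the incoming XML string.
$xml = '<root xmlns="http://example.com/ns"><item>1</item></root>';

// Needle for the default-namespace declaration (assumed form).
$needle = ' xmlns=';
$pos = strpos($xml, $needle);

// Replace only the first occurrence; leave the string alone if none is found.
$fixed = ($pos === false)
    ? $xml
    : substr_replace($xml, ' ns=', $pos, strlen($needle));

// With the default namespace gone, unprefixed XPath steps match again.
$doc = new SimpleXMLElement($fixed);
$items = $doc->xpath('/root/item');

echo count($items), "\n";
```

Note that substr_replace still builds a new string, so this avoids the regex engine but not the copy of the whole document.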

Upvotes: 0

cusimar9

Reputation: 5259

You're always going to get an overhead when trying to do this: you're dealing with a char array and trying to replace multiple matching elements of the array (i.e. words).

50ms is not much of an overhead, unless (as I suspect) you're trying to do this in a loop?
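To see how much of that overhead applies on a given machine, one could time the candidates on a synthetic document of about the same size. This is only a measurement sketch: the generated XML, the timeIt helper and the patterns are assumptions, and the numbers will vary with PHP version and hardware.

```php
<?php
// Build a synthetic document of roughly 2 MB (content is made up).
$body = str_repeat('<item>value</item>', 120000);
$xml = '<root xmlns="http://example.com/ns">' . $body . '</root>';

// Hypothetical helper: run a callable once, return [result, elapsed ms].
function timeIt(callable $fn): array
{
    $start = microtime(true);
    $result = $fn();
    return [$result, (microtime(true) - $start) * 1000.0];
}

// Candidate 1: regex with the replacement limit set to 1.
[$a, $msRegex] = timeIt(function () use ($xml) {
    return preg_replace('/ xmlns=/', ' ns=', $xml, 1);
});

// Candidate 2: plain strpos + substr_replace on the first match.
[$b, $msPlain] = timeIt(function () use ($xml) {
    $pos = strpos($xml, ' xmlns=');
    return $pos === false ? $xml : substr_replace($xml, ' ns=', $pos, 7);
});

printf("preg_replace:   %.3f ms\n", $msRegex);
printf("substr_replace: %.3f ms\n", $msPlain);
```

A single run is noisy, so repeating each measurement several times and looking at the spread gives a fairer picture than one number.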

Upvotes: 0
