buggedcom

Reputation: 1532

Matching the first function in a list of function calls

Answered: From what I have read, this seems to be nigh on impossible with a regex alone, so I'm using the regex to match the function pattern and then tokenising the results using brace positions. Not the best answer, but solved nonetheless.

I'm trying to match the first function call in a list of function calls, for example:

$string = "user('firstname'), user('lastname')";

But I don't know how to pattern match a function call that could contain any kind of argument, including a string such as "my string)", e.g.

$string = "user('my string)!'), user('lastname')";

So the pattern must not treat a parenthesis inside an argument as the end of the call, i.e. it must not stop at user('my string).

I'm not concerned about matching different types of arguments, rather just grabbing the first function as a whole. The current regex is as follows.

'/([a-z0-9\_]+)\((.*)\)/'

I would imagine some kind of negative look-ahead/behind assertion is required but I'm not yet up to that level in constructing patterns. Any help would be greatly appreciated.
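To make the problem concrete, here is a minimal sketch of what the current pattern does with the first example string. It uses only the pattern and string from the question; the variable names are just for illustration.

```php
<?php
// The pattern from the question, unchanged.
$pattern = '/([a-z0-9\_]+)\((.*)\)/';
$string  = "user('firstname'), user('lastname')";

preg_match($pattern, $string, $m);

// The greedy .* runs on to the last ")" in the string,
// so the whole list is captured as if it were a single call.
echo $m[0], "\n"; // user('firstname'), user('lastname')
echo $m[2], "\n"; // 'firstname'), user('lastname'
```

Making the quantifier lazy (.*?) would fix this particular string, but it would then stop too early on user('my string)!'), which is why the quoting and nesting have to be handled explicitly.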

Flavour of regex is PHP.

EDIT 1 The function list could also look like this.

user((5*5)+10), user(otherfunc())

In which case the pattern would have to match user((5*5)+10) first and then, after post-processing, user(otherfunc()). I have an expression tokeniser that lexes arguments and expressions. It does well on everything except multiple nested functions.

Upvotes: 1

Views: 84

Answers (3)

NikiC

Reputation: 101936

'~^[a-z0-9_]++\(([^\'"()]*+(?:(?:\'[^\'\\\\]*+(?:\\\\.[^\'\\\\]*+)*+\'|"[^"\\\\]*+(?:\\\\.[^"\\\\]*+)*+"|\((?1)\))[^\'"()]*+)*+)\)~'

Didn't test it.

Slightly more readable:

'~^
 [a-z0-9_]++
 \((
     [^\'"()]*+(?:(?:
           \'[^\'\\\\]*+(?:\\\\.[^\'\\\\]*+)*+\'
         |  "[^"\\\\]*+(?:\\\\.[^"\\\\]*+)*+"
         | \((?1)\)
     )[^\'"()]*+)*+
 )\)
 ~x'
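A quick way to try a recursive pattern of this shape against the strings from the question is sketched below. Two things here are assumptions rather than part of the answer: the escape-sequence groups are quantified with *+ so that quoted strings without backslashes also match, and firstCall() is a hypothetical helper name.

```php
<?php
// (?1) re-enters capture group 1, which is how nested parentheses are handled.
// Assumption: the escape groups carry *+ so plain strings like 'lastname' match.
$pattern = '~^[a-z0-9_]++\(([^\'"()]*+(?:(?:\'[^\'\\\\]*+(?:\\\\.[^\'\\\\]*+)*+\'|"[^"\\\\]*+(?:\\\\.[^"\\\\]*+)*+"|\((?1)\))[^\'"()]*+)*+)\)~';

// Hypothetical helper: returns the first call in the list, or null if none matches.
function firstCall(string $pattern, string $list): ?string
{
    return preg_match($pattern, ltrim($list), $m) === 1 ? $m[0] : null;
}

echo firstCall($pattern, "user('my string)!'), user('lastname')"), "\n"; // user('my string)!')
echo firstCall($pattern, "user((5*5)+10), user(otherfunc())"), "\n";     // user((5*5)+10)
```

Because the pattern is anchored with ^, getting the second call means cutting the first match and the following comma off the string and running it again.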

Upvotes: 0

Gary Green

Reputation: 22395

Try:

(?:\s*([a-z\d_]+)\('[^']+'\)),?

This will also match any number of functions when applied globally (the /g flag, or preg_match_all in PHP), e.g.:

user('firstname'), user('lastname'),user3('la!(["())!gstname')

Edit: For what you're trying to do, regex is not well suited, because you are dealing with nested structures, i.e. recursion. You are far better off looping through each character individually and parsing it the same way a real language parser does.
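A minimal sketch of that character-by-character approach might look like the following. It assumes arguments only use single or double quotes with backslash escapes, and firstCallEnd() is a hypothetical name, so treat it as a starting point rather than a full parser.

```php
<?php
// Hypothetical helper: returns the offset just past the ")" that closes the
// first call in $s, or null if the parentheses never balance.
function firstCallEnd(string $s): ?int
{
    $depth = 0;     // current parenthesis nesting level
    $quote = null;  // quote character we are currently inside, if any
    $len   = strlen($s);

    for ($i = 0; $i < $len; $i++) {
        $c = $s[$i];
        if ($quote !== null) {
            if ($c === '\\') {
                $i++;            // skip the escaped character
            } elseif ($c === $quote) {
                $quote = null;   // closing quote
            }
        } elseif ($c === "'" || $c === '"') {
            $quote = $c;         // opening quote
        } elseif ($c === '(') {
            $depth++;
        } elseif ($c === ')') {
            $depth--;
            if ($depth === 0) {
                return $i + 1;   // first call is complete
            }
            if ($depth < 0) {
                return null;     // stray ")" before any "("
            }
        }
    }
    return null;
}

$list = "user((5*5)+10), user(otherfunc())";
echo substr($list, 0, firstCallEnd($list)), "\n"; // user((5*5)+10)
```

Parentheses inside quotes are ignored because the depth counter is only touched while $quote is null, which is the part a plain pattern like .* cannot express.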

Upvotes: 1

tvkanters

Reputation: 3523

I think '/([a-z0-9\_]+)\(\'([^\']*)\'\)/' should do fine. At least, if the argument is always a string within single quotes. Is this what you need or does it have to be more advanced?

Upvotes: 0
