Reputation: 226
I have a problem to extract function arguments in R.
x="theme(legend.position='bottom',
legend.margin=(t=0,r=0,b=0,l=0,unit='mm'),
legend.background=element_rect(fill='red',size=rel(1.5)),
panel.background=element_rect(fill='red'),
legend.position='bottom')"
What I want is:
[1]legend.position='bottom'
[2]legend.margin=(t=0,r=0,b=0,l=0,unit='mm')
[3]legend.background=element_rect(fill='red',size=rel(1.5))
[4]panel.background=element_rect(fill='red')
[5]legend.position='bottom'
I tried several regular expressions without success including followings:
strsplit(x,",(?![^()]*\\))",perl=TRUE)
Please help me!
Upvotes: 1
Views: 317
Reputation: 226
Thank you for all your advice. I have parsed the sentences and get the arguments as list. Here is my solution.
x<-"theme(legend.margin=margin(t=0,r=0,b=0,l=0,unit='mm'),
legend.background=element_rect(fill='red',size=rel(1.5)),
panel.background=element_rect(fill='red'),
legend.position='bottom')"
extractArgs=function(x){
result<-tryCatch(eval(parse(text=x)),error=function(e) return("error"))
if("character" %in% class(result)){
args=character(0)
} else {
if(length(names(result)>0)){
pos=unlist(str_locate_all(x,names(result)))
pos=c(sort(pos[seq(1,length(pos),by=2)]),nchar(x)+1)
args=c()
for(i in 1:(length(pos)-1)){
args=c(args,substring(x,pos[i],lead(pos)[i]-2))
}
} else{
args=character(0)
}
}
args
}
Upvotes: 0
Reputation: 522386
I think the best answer here might be to not attempt to use a regex to parse your function call. As the name implies, regular expressions require regular language. Your function call is not regular, because it has nested parentheses. I currently see a max nested depth of two, but who knows if that could get deeper at some point.
I would recommend writing a simple parser instead. You can use a stack here, to keep track of parentheses. And you would only split a parameter off if all parentheses were closed, implying that you are not in the middle of a parameter, excepting possibly the very first one.
Upvotes: 1
Reputation: 2463
Arf, I'm really sorry but i have to go work, i will continue later but for now i just let my way to solve it partially : theme\(([a-z.]*=['a-z]*)|([a-z._]*=[a-z0-9=,'_.()]*)*\,\)?
It misses only the last part..
Here the regex101 page : https://regex101.com/r/BZpcW0/2
See you later.
Upvotes: 0