Finding letters in strings in different orders

Question

I have two strings that represent trees. Every '{' means going one level down in the tree (to the children). Each string contains letters (or words) separated by one or more '{'. I would like to reorder only the letters at the same level, so that the letters in the second (or the first) string appear in the same order as in the other one, without changing the positions of the braces in the string. Here is an example:

> str1<-"{a{b}{c{{d}{e}}}}" 
> str2<-"{a{c}{b{{e}{d}}}}"

I would like to change str2 to be "{a{b}{c{d{e}}}}". Since both 'b' and 'c' are at the same level (children of 'a' in both str1 and str2), they differ only in their order between str1 and str2. The same holds for 'd' and 'e': I would like to change them in str2, giving str2<-"{a{c}{b{{d}{e}}}}", the same order as in str1. The single letters are only examples; the labels might be longer than one letter. Is there a fast and short solution for this? I think the best approach is to sort each of the strings, for instance:

> str1<-"{a{b}{c{{d}{e}}}}"
> str2<-"{a{b}{c{{d}{e}}}}"

That would be a good intermediate solution for this problem. However, sort(x) works on vectors. I would like to work on the strings and keep the positions of the '{'. This means we can sort only among nodes at the same level (siblings), not across levels. For instance, in the following case:

> str1<-"{a{b}{c}}" 
> str2<-"{b{a}{c}}"

Since 'a' is the root and 'b' is its child in str1, and vice versa in str2, we can't sort this case. We can sort in the following case:

> str1<-"{a{b}{c}}" 
> str2<-"{a{c}{b}}"

In the example above, {a} is the root and {b} and {c} are its children at the same level.

C8H10N4O2 · Accepted Answer

Here's a hint: to be eligible for a swap, you aren't requiring that the "parent" be the same, just that the "generation" (level or depth of the tree) be the same. You can get this depth from the count of { minus the count of } that precede each label. I don't understand your brace convention, but as long as it's uniform it should identify members of the same generation.

> str1<-"{a{b}{c{{d}{e}}}}" 

> require(stringr)
> str_match <- str_extract_all(str1,"\\w+")[[1]]
> str_match
[1] "a" "b" "c" "d" "e"

> str_loc <- str_locate_all(str1,"\\w+")[[1]]
> str_loc
     start end
[1,]     2   2
[2,]     4   4
[3,]     7   7
[4,]    10  10
[5,]    13  13

> prior_str <- str_sub(str1, end=str_loc[,'start'])
> prior_str
[1] "{a"            "{a{b"          "{a{b}{c"       "{a{b}{c{{d"    "{a{b}{c{{d}{e"

> str_depth <- str_count(prior_str,"[{]") - str_count(prior_str,"[}]")
> str_depth
[1] 1 2 2 4 4

Hopefully you know enough R to take it from there.
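
For anyone who wants the remaining step spelled out, here is one possible sketch in base R, not a definitive solution. It follows the depth-only criterion from the hint, so labels at the same depth are treated as swappable even if they have different parents. The helper names label_depths() and reorder_like() are made up for illustration, and the sketch assumes labels match "\\w+" as in the transcript above. A depth is reordered only when every label of the target at that depth also occurs at that depth in the reference; otherwise it is left untouched.

```r
# Sketch only: label_depths() and reorder_like() are hypothetical helper names.

# Find every label in s and the brace depth at which it sits.
label_depths <- function(s, pattern = "\\w+") {
  m <- gregexpr(pattern, s)
  starts <- as.integer(m[[1]])
  starts <- starts[starts > 0]          # gregexpr returns -1 when nothing matches
  chars <- strsplit(s, "")[[1]]
  # Running count of "{" minus "}" up to each character.
  level <- cumsum((chars == "{") - (chars == "}"))
  list(match = m,
       labels = regmatches(s, m)[[1]],
       depth = level[starts])
}

# Reorder the labels of `target` so that, within each depth, they follow
# the order in which the same labels appear in `reference`.
reorder_like <- function(target, reference) {
  ref <- label_depths(reference)
  tgt <- label_depths(target)
  new_labels <- tgt$labels
  for (d in unique(tgt$depth)) {
    idx <- which(tgt$depth == d)
    ref_d <- ref$labels[ref$depth == d]
    # Position of each target label among the reference labels at this depth.
    pos <- match(tgt$labels[idx], ref_d)
    # Assumption: skip the depth unless both strings hold the same labels there.
    if (!anyNA(pos) && length(ref_d) == length(idx)) {
      new_labels[idx] <- tgt$labels[idx][order(pos)]
    }
  }
  # Write the reordered labels back; the braces stay where they are.
  regmatches(target, tgt$match) <- list(new_labels)
  target
}

str1 <- "{a{b}{c{{d}{e}}}}"
str2 <- "{a{c}{b{{e}{d}}}}"
reorder_like(str2, str1)
```

With the question's second pair, reorder_like("{b{a}{c}}", "{a{b}{c}}"), the labels at each depth differ between the two strings, so the input should come back unchanged, which matches the "can't sort this case" requirement.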
