upendra

Reputation: 2189

Regular expression is not working as expected

Could someone help me with a regular expression for this? I am really struggling.

Basically, I want to write a regular expression that separates a string into two substrings.

In the example below, I want to separate the full string into "comp99810_c0_seq1" and "|m.8409".

test <- "comp99810_c0_seq1|m.8409" 
c1 <- sub("([A-Za-z1-9])(\\|)(m.\\d+)", "\\1", test) 
c2 <- sub("([A-Za-z1-9])(\\|)(m.\\d+)", "\\2\\3", test) 

I was able to get c1 to work but not c2. Can somebody help me?

Thanks, Upendra

Upvotes: 0

Views: 82

Answers (2)

Sabuj Hassan

Reputation: 39365

Try using the equivalent of a split("|") function from the language you are working in.
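Since the question's code is R, a minimal sketch of the split approach could use base R's strsplit(). The names parts, c1 and c2 are illustrative, and it assumes the string contains exactly one pipe:

```r
test <- "comp99810_c0_seq1|m.8409"

# fixed = TRUE treats "|" as a literal character, not as regex alternation
parts <- strsplit(test, "|", fixed = TRUE)[[1]]

c1 <- parts[1]               # text before the pipe
c2 <- paste0("|", parts[2])  # text after the pipe, with the pipe put back

print(c1)  # "comp99810_c0_seq1"
print(c2)  # "|m.8409"
```

strsplit() drops the delimiter, which is why the pipe is pasted back onto the second piece.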

Alternatively, change [A-Za-z1-9] to \\w+ and it will work for you.

Currently your character class matches only one character, whereas \\w+ matches one or more characters from a-z, A-Z, 0-9 and _.
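A short sketch of the difference, reusing the question's test string. The variable names broken and pattern are only for illustration:

```r
test <- "comp99810_c0_seq1|m.8409"

# Original pattern: the class matches a single character, so the match is
# only "1|m.8409" and the replacement swallows the "1" before the pipe
broken <- sub("([A-Za-z1-9])(\\|)(m.\\d+)", "\\2\\3", test)
print(broken)  # "comp99810_c0_seq|m.8409"

# With \\w+ the first group spans the whole prefix, so the match covers
# the entire string and each replacement keeps only the requested groups
pattern <- "(\\w+)(\\|)(m.\\d+)"
c1 <- sub(pattern, "\\1", test)
c2 <- sub(pattern, "\\2\\3", test)
print(c1)  # "comp99810_c0_seq1"
print(c2)  # "|m.8409"
```

This also shows why c1 appeared to work in the question: the single captured "1" is put back by "\\1", so the result looks correct by coincidence.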

Upvotes: 2

Phoenix

Reputation: 632

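If you don't want to split on "\|", the issue is that the first group is missing a repetition quantifier, e.g. ([A-Za-z0-9_]+) or ([A-Za-z0-9_]*). The quantifier goes inside the parentheses so the group captures the whole run, and the class needs 0 and _ added to cover your string. As written, the pattern matches only a single character from that set and then tries to find the pipe.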
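A minimal sketch of the quantifier fix in R. Two details here are assumptions beyond what the answer states. First, the sample string contains 0 and _, which [A-Za-z1-9] does not cover, so the class is widened. Second, the + sits inside the group, because a quantifier placed after the closing parenthesis makes the group keep only its last repetition. The ^ and $ anchors and the escaped dot are optional tightening:

```r
test <- "comp99810_c0_seq1|m.8409"

# Quantifier inside the group; class widened to include 0 and _
pattern <- "^([A-Za-z0-9_]+)(\\|)(m\\.\\d+)$"

c1 <- sub(pattern, "\\1", test)      # keep only the prefix
c2 <- sub(pattern, "\\2\\3", test)   # keep the pipe and the m.<digits> part

print(c1)  # "comp99810_c0_seq1"
print(c2)  # "|m.8409"
```

Because the pattern now matches the whole string, sub() replaces everything with just the referenced groups.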

Upvotes: 0
