user2427306

Reputation: 39

How to find a specific pattern using a regular expression?

Given this input:

(3 [97 98] 100 110 [116 117] 200)

I want to extract the numbers as follows: when numbers are enclosed in [ ], keep only the first number after the [; when numbers are not enclosed in [ ], keep all of them. The expected result is:

3 97 100 110 116 200

How can I do this?

Upvotes: 2

Views: 97

Answers (2)

Sven Hohenstein

Reputation: 81683

You can use gsub:

s <- "(3 [97 98] 100 110 [116 117] 200)"

gsub("\\[(\\d+).*?\\]|[()]", "\\1", s)
# [1] "3 97 100 110 116 200"

How does it work?

The regex used in gsub is

\\[(\\d+).*?\\]|[()]

It consists of two parts, connected by logical or (|).

The first part,

\\[(\\d+).*?\\]

matches everything between square brackets (including the brackets), provided a digit follows the opening bracket. The regex \\[ matches a literal [, and \\] matches a literal ]. Furthermore, \\d+ matches one or more digits. The .*? matches any number of any characters; the ? makes the match non-greedy, so it stops at the next ]. The parentheses define a capture group. Here, the first capture group is the first run of digits after [.

The second part,

[()]

matches parentheses.

Every match is replaced by \\1, i.e., the contents of the first capture group. Hence, each string between square brackets is replaced by the first number inside these brackets. Parentheses are replaced by nothing (the empty string) because the capture group takes no part in that alternative, so \\1 is empty.
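
To see what each alternative contributes, the two parts can also be applied one at a time. This is only an illustrative sketch: the names step1, step2 and nums are made up here, and the final conversion to a numeric vector is an optional extra that the question did not ask for.

```r
s <- "(3 [97 98] 100 110 [116 117] 200)"

# Alternative 1 alone: each [...] collapses to the first number it contains.
step1 <- gsub("\\[(\\d+).*?\\]", "\\1", s)
# step1 is "(3 97 100 110 116 200)"

# Alternative 2 alone: remove the outer parentheses.
step2 <- gsub("[()]", "", step1)
# step2 is "3 97 100 110 116 200"

# Optional: split on single spaces to get a numeric vector.
nums <- as.numeric(strsplit(step2, " ", fixed = TRUE)[[1]])
```

Combining both alternatives with | in a single gsub call, as in the answer, gives the same string as step2 in one pass.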

Upvotes: 5

aquemini

Reputation: 960

This might be what you're looking for.

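s <- "(3 [97 98] [116 117] 200)"
# "[0-9]+" matches whole numbers; a bare "[0-9]" would return single digits
regmatches(s, gregexpr("[0-9]+", s))

This returns every number in the string. I don't understand your edit exactly, but you would just need to replace "[0-9]+" with the updated regular expression.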
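
One pattern that might serve as the updated regular expression in the regmatches call is sketched below. It is an assumption rather than part of the original answer: it relies on perl = TRUE for lookbehind and lookahead, and it assumes the square brackets are balanced and never nested. The names pattern and picked are illustrative.

```r
s <- "(3 [97 98] 100 110 [116 117] 200)"

# Either a number directly after "[" (the first one inside brackets),
# or a number that is not followed by a "]" before the next "[",
# i.e. a number that sits outside any brackets.
pattern <- "(?<=\\[)\\d+|\\d+(?![^\\[]*\\])"

picked <- regmatches(s, gregexpr(pattern, s, perl = TRUE))[[1]]
picked
# [1] "3"   "97"  "100" "110" "116" "200"
```

The result is a character vector; as.integer(picked) would convert it to numbers if needed.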

Upvotes: 0
