Reputation: 4452
I have a long text; here is part of it:
C: state name of the Company in Russian: [03_SNYuLOOO IC "Story Group".]
). - [04_MNMestablishment of the Company: 107S64, Russian Federation, Moscow,
ul. Krasnobogatyrskaya, 2, is built.
2, floor 3. com. 11. Office B].
I need to find all substrings like these:
[03_SNYuLOOO IC "Story Group".]
[04_MNMestablishment of the Company: 107S64, Russian Federation, Moscow,
ul. Krasnobogatyrskaya, 2, is built.
2, floor 3. com. 11. Office B]
I tried to use
re.findall(r'^\[\d{2}_[\s\S]+\]$', text)
but it returns an empty list. What am I doing wrong?
Upvotes: 1
Views: 109
Reputation: 627507
The ^ and $ anchors require the whole string to match the pattern, and [\s\S]+ matches any one or more chars, as many as possible, grabbing any [ and ] on its way to the end of the string, so the final ] will match the rightmost ] in the string.
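To see both effects separately, here is a minimal sketch. The sample string is made up for illustration (it is not the text from the question): the anchored pattern finds nothing because the string does not start with [, and once the anchors are dropped the greedy [\s\S]+ runs to the last ] and returns a single oversized match.

```python
import re

# Hypothetical sample: two bracketed items, the second spanning two lines.
text = 'a [03_first] b\n[04_second\nline] c'

# Without re.MULTILINE, ^ only matches at the very start of the string,
# and the string starts with 'a', not '[', so nothing is found.
anchored = re.findall(r'^\[\d{2}_[\s\S]+\]$', text)
print(anchored)  # => []

# Dropping the anchors is not enough: [\s\S]+ is greedy, so it runs to the
# end of the string and backtracks only as far as the rightmost ].
greedy = re.findall(r'\[\d{2}_[\s\S]+\]', text)
print(greedy)  # => ['[03_first] b\n[04_second\nline]']
```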
You may use the following regex:
r'\[\d{2}_[^]]+]'
See the regex demo
Details
\[ - a literal [
\d{2} - two digits
_ - an underscore
[^]]+ - one or more chars other than ]
] - a literal ]
See the Python demo:
import re
s='''C: state name of the Company in Russian: [03_SNYuLOOO IC "Story Group".]
). - [04_MNMestablishment of the Company: 107S64, Russian Federation, Moscow,
ul. Krasnobogatyrskaya, 2, is built.
2, floor 3. com. 11. Office B].'''
print(re.findall(r'\[\d{2}_[^]]+]', s))
# => ['[03_SNYuLOOO IC "Story Group".]', '[04_MNMestablishment of the Company: 107S64, Russian Federation, Moscow, \nul. Krasnobogatyrskaya, 2, is built.\n2, floor 3. com. 11. Office B]']
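If the pattern is going to be reused, the same regex can be written with re.VERBOSE so that each piece carries its own comment. This is only a sketch: the name BRACKETED and the sample string are made up for illustration.

```python
import re

# Same pattern as above, spelled out piece by piece.
# BRACKETED is a hypothetical name chosen for this example.
BRACKETED = re.compile(r"""
    \[        # a literal [
    \d{2}     # two digits
    _         # an underscore
    [^]]+     # one or more chars other than ]
    ]         # a literal ]
""", re.VERBOSE)

# Hypothetical sample: '[4_no]' has only one digit, so it is skipped.
sample = 'x [03_abc] y [4_no] [05_d\ne].'
matches = BRACKETED.findall(sample)
print(matches)  # => ['[03_abc]', '[05_d\ne]']
```

Since [^]] also matches newlines, no re.DOTALL flag is needed for items that span several lines.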
Upvotes: 2