timebandit

Reputation: 838

Regular expressions matching across multiple lines in Sublime Text

My data follows a repeated pattern in a text file: the same type of data structure, with unique values, is printed until the end of the file.

{'AuthorSite': None,
 'FirstText': None,
 'Image': None,
 'SrcDate': None,
 'Title': None,
 'Url': None}
...
..
.

I am trying to match each block, one at a time, using a regular expression in Sublime Text. I have tried a variety of forms with no success. The latest one was:

\{(.|\s)\}

I wanted to hoover up everything between each pair of braces. Please advise. I will eventually implement this in Python.

Upvotes: 1

Views: 1862

Answers (2)

vks

Reputation: 67978

\{([^}]+)\}

You can try this demo:

http://regex101.com/r/hQ9xT1/32

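import re
# Raw string; the Python 2 ur'' prefix is a syntax error in Python 3
p = re.compile(r'\{([^}]+)\}')
test_str = "{'AuthorSite': None,\n 'FirstText': None,\n 'Image': None,\n 'SrcDate': None,\n 'Title': None,\n 'Url': None}"

# Returns a list with the text captured between each pair of braces
p.findall(test_str)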

Your regex \{(.|\s)\} didn't work because the group is not quantified, so it matches exactly one character between the braces. Use \{(?:.|\s)+\}.
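
To make the difference concrete, here is a minimal sketch comparing the two patterns. The sample strings and variable names are made up for illustration. It also shows one caveat: when several blocks sit in one string, the greedy + runs on to the last closing brace, so a lazy +? (or the [^}]+ form above) is likely what you want.

```python
import re

block = "{'AuthorSite': None,\n 'Url': None}"

# Without a quantifier the group matches exactly one character,
# so only a three-character string such as "{x}" could match.
unquantified = re.compile(r'\{(.|\s)\}')
# With + the alternation repeats; (?:...) avoids capturing each character.
quantified = re.compile(r'\{(?:.|\s)+\}')

no_match = unquantified.search(block)
full_match = quantified.search(block)
print(no_match)                      # None
print(full_match.group(0) == block)  # True

# Two blocks in one string: greedy + spans both, lazy +? splits them.
two_blocks = "{a}\n{b}"
greedy = re.findall(r'\{(?:.|\s)+\}', two_blocks)
lazy = re.findall(r'\{(?:.|\s)+?\}', two_blocks)
print(greedy)  # ['{a}\n{b}']
print(lazy)    # ['{a}', '{b}']
```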

Upvotes: 2

Gabriel

Reputation: 21

Assuming you want to retrieve the values, I would use the following regular expression:

\{([^\}]+)\}

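The key here is the [^}] character class, which matches any character that isn't a literal }: whitespace (including newlines), word-boundary characters, letters, digits, and so on. Inside a character class the } does not need escaping, so [^\}] and [^}] are equivalent.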
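
As a quick illustration of why the negated class works across lines while a plain dot does not, here is a small sketch. The sample text is shortened and the variable names are only for illustration.

```python
import re

text = "{'Title': None,\n 'Url': None}"

# "." does not match a newline by default, so .+ cannot get
# from the opening brace to the closing one on another line.
dot_only = re.search(r'\{(.+)\}', text)
# A negated class has no such restriction: "\n" is simply "not }".
negated = re.search(r'\{([^}]+)\}', text)

print(dot_only)          # None
print(negated.group(1))  # the two lines between the braces
```

Passing re.DOTALL to the first pattern would also let the dot cross line breaks, but the negated class needs no flag and stops at the first closing brace.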

Here is the Python code:

import re
hoover_exp = re.compile(r'\{([^\}]+)\}')
with open('data.txt', 'r') as infile:
    text = infile.read()
matches = hoover_exp.findall(text)

matches will be a list holding the captured group from each non-overlapping match in text, e.g.:

["'AuthorSite': None,\n 'FirstText': None,\n 'Image': None,\n 'SrcDate': None,\n 'Title': None,\n 'Url': None", "'AuthorSite': None,\n 'FirstText': None,\n 'Image': None,\n 'SrcDate': None,\n 'Title': None,\n 'Url': None"]

That being said, if your input text is nothing but these dicts, you might be better off using something like json to pull them directly into Python dicts.
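
One caveat on the direct-parsing idea: the sample uses single quotes and None, which json.loads would reject. A possible alternative is the standard library's ast.literal_eval, which accepts Python literals. The following is only a sketch under that assumption. The sample text is invented, and it assumes no value contains a } character.

```python
import ast
import re

text = "{'Title': None,\n 'Url': None}\n{'Title': 'x',\n 'Url': None}"

# Match each brace-delimited block whole (braces included), then let
# ast.literal_eval turn the Python-literal text into a real dict.
records = [
    ast.literal_eval(m.group(0))
    for m in re.finditer(r'\{[^}]+\}', text)
]

print(records[0]['Title'])  # None
print(records[1]['Title'])  # x
```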

Upvotes: 1
