How to get a value from a string using a regular expression?

I want to extract values from the string below using a regular expression:

"a:4:{i:0;s:24:\"hello \"tejo krishna\"!!!`\";i:1;s:11:\"hello \"xyz\"\";i:2;s:6:\"defeat\";i:3;s:7:\"pattern\";}"

From the above string I want to extract the text shown in italic format. Any help is appreciated.

Thanks,

Upvotes: 1

Views: 42

Answers (1)

deeenes

Reputation: 4576

The exact constraints on the acceptable characters are not clear, and you don't mention the language. In Python, with your example, the regex below works. If you expect other kinds of characters in the input, just extend the character classes:

import re

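# Capture the text between an opening \" and a closing \".
# The classes use A-Z; the range A-z would also admit [ \ ] ^ _ and `.
myre = re.compile(r'\\"([\sa-zA-Z0-9]+\\?"?[\sa-zA-Z0-9]+\\?"?[!`]*)\\"')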
s = r'"a:4:{i:0;s:24:\"hello \"tejo krishna\"!!!`\";'\
    r'i:1;s:11:\"hello \"xyz\"\";i:2;s:6:\"defeat\";i:3;'\
    r's:7:\"pattern\";}"'
match = myre.findall(s)
# results
# ['hello \\"tejo krishna\\"!!!`', 'hello \\"xyz\\"', 
#  'defeat', 'pattern']
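
The matches still carry the backslash in front of each inner quote. If plain text is wanted, a minimal sketch of a post-processing step could look like the following. The helper name unescape_quotes is made up for illustration. The length check at the end rests on an observation from the sample only (the number after each s: appears to equal the length of the unescaped value); the question does not state that this holds in general.

```python
import re

# Same pattern and sample input as in the answer (classes written as A-Z).
myre = re.compile(r'\\"([\sa-zA-Z0-9]+\\?"?[\sa-zA-Z0-9]+\\?"?[!`]*)\\"')
s = (r'"a:4:{i:0;s:24:\"hello \"tejo krishna\"!!!`\";'
     r'i:1;s:11:\"hello \"xyz\"\";i:2;s:6:\"defeat\";i:3;'
     r's:7:\"pattern\";}"')


def unescape_quotes(value):
    # Hypothetical helper: turn every backslash-quote pair into a plain quote.
    return value.replace('\\"', '"')


values = [unescape_quotes(m) for m in myre.findall(s)]

# Assumption drawn from the sample: s:N declares the unescaped length.
declared = [int(n) for n in re.findall(r's:(\d+):', s)]

print(values)
print([len(v) for v in values] == declared)
```

If the lengths ever disagree, that is a hint that the character classes are too narrow for the input and need extending.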

Note: in Python string literals the backslash (\) is an escape character, so a literal backslash is displayed escaped; that is why the output shows double backslashes. In a regex the backslash is also an escape character, hence the double backslashes in the pattern. Because the pattern is defined as a raw string (note the r in front: r'...'), Python itself needs no escaping and we only escape for the regex engine. In a normal string you would need four backslashes: '\\\\"([\\sa-zA-Z0-9]+\\\\?"?[\\sa-zA-Z0-9]+\\\\?"?[!`]*)\\\\"'. You have to do this if your programming language has no raw strings.
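
To make the escaping point concrete, here is a small sketch that writes the pattern both ways and compares them. The variable names raw_pattern and plain_pattern and the short sample string are only for illustration.

```python
import re

# Raw string: backslashes reach the regex engine exactly as typed.
raw_pattern = r'\\"([\sa-zA-Z0-9]+\\?"?[\sa-zA-Z0-9]+\\?"?[!`]*)\\"'

# Normal string: every backslash meant for the regex engine is doubled again.
plain_pattern = '\\\\"([\\sa-zA-Z0-9]+\\\\?"?[\\sa-zA-Z0-9]+\\\\?"?[!`]*)\\\\"'

# Both literals produce the same sequence of characters.
print(raw_pattern == plain_pattern)

# The normal-string version therefore matches the same input.
sample = r's:6:\"defeat\";'
print(re.findall(plain_pattern, sample))
```

The raw form is usually easier to read and to get right, which is why it is the common choice for regex patterns in Python.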

Upvotes: 1
