Usi Usi

Reputation: 2997

Python extract 3 words before and 3 words after a specific list of words with a regex

I need to use Python to extract the 3 words before and the 3 words after each word in a specific list. Sample input:

Nokia Lumia 930 Smartphone, Display 5 pollici, Fotocamera 20 MP, 2GB RAM, Processore Quad-Core 2,2GHz, Memoria 32GB, Windows Phone 8.1, Bianco [Germania]

At the moment I'm using this regex, without success:

((?:[\S,]+\s+){0,3})ram\s+((?:[\S,]+\s*){0,3})

https://regex101.com/r/yN6iI0/1

My list of words that I need is:

Upvotes: 3

Views: 1066

Answers (2)

vks

Reputation: 67968

((?:[\S,]+\s+){0,3})ram,?\s+((?:[\S,]+\s*){0,3})

                       ^^

Just add an optional comma (,?) after ram. See demo.

https://regex101.com/r/yN6iI0/4

For the full list of words, you can use this:

((?:[\S,]+\s+){0,3})(?:ram|Display|Fotocamera|RAM|Processore|Memoria),?\s+((?:[\S,]+\s*){0,3})
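As a rough sketch of how the combined pattern above could be applied from Python (the variable names text, pattern and contexts are illustrative, not from the answer), re.finditer yields each non-overlapping match with its leading and trailing words. Because matches cannot overlap, a keyword that falls inside another match's context, such as Fotocamera right after Display, is consumed as context and not reported on its own.

```python
import re

# Sample string from the question.
text = ("Nokia Lumia 930 Smartphone, Display 5 pollici, Fotocamera 20 MP, "
        "2GB RAM, Processore Quad-Core 2,2GHz, Memoria 32GB, "
        "Windows Phone 8.1, Bianco [Germania]")

# The combined pattern from the answer, split across lines for readability.
pattern = re.compile(
    r"((?:[\S,]+\s+){0,3})"
    r"(?:ram|Display|Fotocamera|RAM|Processore|Memoria),?\s+"
    r"((?:[\S,]+\s*){0,3})"
)

# group(1) holds up to 3 words before the keyword, group(2) up to 3 after.
contexts = [(m.group(1).strip(), m.group(2).strip())
            for m in pattern.finditer(text)]

for before, after in contexts:
    print(repr(before), "|", repr(after))
```

If every keyword needs its own context, running one search per keyword avoids the overlap problem.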

Upvotes: 1

Wiktor Stribiżew

Reputation: 626936

Your regex did not work because \s+ requires at least 1 whitespace character, but between RAM and , there is none. Either replace the + with a * quantifier or remove the \s+ altogether. With \s*, the pattern becomes:

(?i)((?:\S+\s+){0,3})\bRAM\b\s*((?:\S+\s+){0,3})

See demo

I added \b (word boundary) to make sure we match RAM, not RAMBUS.
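The two points above can be checked quickly in Python. This is only a small sketch: the shortened spec string and the variable names are made up for illustration, and re.I is passed to the question's pattern on the assumption that it was tested case-insensitively.

```python
import re

# Shortened excerpt of the sample string (illustrative).
spec = "20 MP, 2GB RAM, Processore Quad-Core 2,2GHz, Memoria 32GB"

# Pattern from the question: \s+ after the keyword needs whitespace,
# but in "RAM," the next character is a comma, so nothing matches.
original = re.search(
    r"((?:[\S,]+\s+){0,3})ram\s+((?:[\S,]+\s*){0,3})", spec, re.I)

# Pattern from this answer: \s* also accepts zero whitespace characters.
fixed = re.search(
    r"((?:\S+\s+){0,3})\bRAM\b\s*((?:\S+\s+){0,3})", spec, re.I)

# \b keeps the keyword from matching inside a longer word.
inside_word = re.search(r"\bRAM\b", "RAMBUS memory", re.I)

print(original)               # None
print(repr(fixed.group(1)))   # words before RAM
print(repr(fixed.group(2)))   # tokens after RAM
print(inside_word)            # None
```

Note that with \s* the comma right after RAM is picked up by \S+ as the first of the three following tokens.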

Mind the re.I modifier (or use an inline version (?i) at the beginning of the pattern).

Other patterns can be formed in a similar way: just replace RAM with the words from your list.
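One way to do that, sketched here under some assumptions: the keywords list is taken from the alternation in the other answer (the question's own list is not shown), and context_for is a hypothetical helper, not part of the re module. It builds one pattern per word with re.escape and returns the context of the first occurrence only.

```python
import re

# Sample string from the question.
text = ("Nokia Lumia 930 Smartphone, Display 5 pollici, Fotocamera 20 MP, "
        "2GB RAM, Processore Quad-Core 2,2GHz, Memoria 32GB, "
        "Windows Phone 8.1, Bianco [Germania]")

# Assumed word list (borrowed from the other answer's alternation).
keywords = ["Display", "Fotocamera", "RAM", "Processore", "Memoria"]


def context_for(word, text):
    """Return (before, after) for the first occurrence of word, or None."""
    # re.escape guards against regex metacharacters in the keyword.
    pattern = (r"((?:\S+\s+){0,3})\b" + re.escape(word)
               + r"\b\s*((?:\S+\s+){0,3})")
    match = re.search(pattern, text, re.I)
    if match is None:
        return None
    return match.group(1).strip(), match.group(2).strip()


# One independent search per keyword, so contexts may overlap freely.
results = {word: context_for(word, text) for word in keywords}

for word, context in results.items():
    print(word, context)
```

Searching once per keyword means a word such as Fotocamera still gets its own result even though it also appears in the context of Display.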

Upvotes: 1
