HeelMega

Reputation: 508

Python Get text between keywords

I would like to get the text between the keywords [en] and [ja].

So for the following example:

[en]
Text
- Example
- Example
- Example

[ja]
Text
 - 例
 - 例
 - 例

I need it to return only:

Text
- Example
- Example
- Example

I have tried using regex:

([en])(.|\n)+?([ja])

But it only grabs the first two characters of the first line. What am I doing wrong here?

Upvotes: 1

Views: 83

Answers (2)

anubhava

Reputation: 785128

You may use this regex for capturing text between [en] and [ja]:

\[en]\n((?:.*\n)*?)\n\[ja]

RegEx Demo

RegEx Details:

  • \[en]\n: Match [en] followed by a line break
  • ((?:.*\n)*?): Match any line followed by a line break. Repeat this group 0 or more times (lazy matching) and capture the matched text in group #1
  • \n\[ja]: Match line break followed by [ja]
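
For reference, here is a minimal Python sketch of this pattern applied to the sample text from the question. The variable names (`sample`, `pattern`, `between`, `char_class_hit`) are illustrative and not part of the original post. It also shows why the unescaped `[en]` in the question misbehaves: without the backslash it is a character class that matches a single `e` or `n`, not the literal tag.

```python
import re

# Sample input from the question (variable names here are illustrative).
sample = (
    "[en]\n"
    "Text\n"
    "- Example\n"
    "- Example\n"
    "- Example\n"
    "\n"
    "[ja]\n"
    "Text\n"
    " - 例\n"
    " - 例\n"
    " - 例\n"
)

# Unescaped, [en] is a character class: it matches one 'e' or one 'n'.
char_class_hit = re.search(r"[en]", sample).group(0)

# Escaping the opening bracket makes the regex match the literal tag.
pattern = re.compile(r"\[en]\n((?:.*\n)*?)\n\[ja]")
match = pattern.search(sample)

# Group 1 keeps the final line break, so trim it before use.
between = match.group(1).rstrip("\n") if match else None

print(between)
```

Group 1 ends with the line break of the last captured line, which is why the sketch trims it with `rstrip("\n")`.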

Upvotes: 1

adumred

Reputation: 164

This captures all the text between [en] and [ja]:

(?<=\[en\]\n)(?:(?:.*\n)+?)(?=\n\[ja\])

Regex working link
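
A short Python sketch of the lookaround version, assuming the same sample layout as in the question (the names `text`, `lookaround`, and `result` are illustrative). Because the tags sit inside a lookbehind and a lookahead, the whole match is the wanted text and no capture group is needed. Python's `re` module only accepts fixed-width lookbehinds, which `\[en\]\n` satisfies.

```python
import re

# Shortened version of the question's sample (illustrative variable names).
text = "[en]\nText\n- Example\n- Example\n- Example\n\n[ja]\nText\n - 例\n"

# Lookbehind and lookahead assert the tags without consuming them.
lookaround = re.compile(r"(?<=\[en\]\n)(?:(?:.*\n)+?)(?=\n\[ja\])")
m = lookaround.search(text)

# The full match is the text between the tags; trim the final line break.
result = m.group(0).rstrip("\n") if m else None

print(result)
```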

Upvotes: 1
