user459

Reputation: 111

How to extract relevant text between two lines using regex

Berat: 0.25kg
Rp 115.000
Jumlah:
Beli
Ke Dafta

Here I want to extract Rp 115.000. Note that the weight (0.25 kg) is variable.

I am trying

\b.*\n\K.*(?=\n*\n)

but it's giving me both "Rp 115.000" and "Jumlah:". There are multiple Rp entries in the text (e.g. Rp 10, Rp 400), but I only want to extract the one between "Berat" and "Jumlah". These numbers are also variable.

PS: I am looking for solutions with regex.

Upvotes: 1

Views: 201

Answers (1)

Wiktor Stribiżew

Reputation: 626936

Assuming

I only want to extract Rp 115.000

You can use gsub with the (?s).*(Rp\\s+\\d+\\.\\d+).* regex to extract it from the text:

# s holds the input text as a single string
gsub("(?s).*(Rp\\s+\\d+\\.\\d+).*", "\\1", s, perl=TRUE)
## [1] "Rp 115.000"

See demo

The .* matches any characters (including newlines, thanks to the (?s) modifier), and Rp\\s+\\d+\\.\\d+ matches the pattern Rp + whitespace + digits + . + digits.
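One caveat: the leading .* is greedy, so when the text contains several Rp values this pattern captures the last one, not necessarily the one between "Berat" and "Jumlah". A minimal sketch of an anchored variant follows. The input string s here is hypothetical (the question's sample plus two made-up Rp lines), and the sketch assumes lines are separated by a single \n and that the price sits on the line directly between the "Berat:" line and "Jumlah".

```r
# Hypothetical input: the question's sample plus two made-up "Rp" lines
# standing in for the other prices the asker mentions.
s <- "Rp 10.000\nBerat: 0.25kg\nRp 115.000\nJumlah:\nBeli\nRp 400.000"

# Greedy pattern from above: captures the last "Rp" value in the text.
last_rp <- gsub("(?s).*(Rp\\s+\\d+\\.\\d+).*", "\\1", s, perl = TRUE)

# Anchored variant: the "Rp" line must sit between a "Berat:" line and "Jumlah".
anchored <- "(?s).*Berat:[^\\n]*\\n(Rp\\s+\\d+(?:\\.\\d+)*)\\nJumlah.*"
price <- sub(anchored, "\\1", s, perl = TRUE)

# Same idea with \K and a lookahead, returning only the match itself.
m <- regexpr("Berat:[^\\n]*\\n\\KRp\\s+\\d+(?:\\.\\d+)*(?=\\nJumlah)", s, perl = TRUE)
price_k <- regmatches(s, m)

print(last_rp)  # [1] "Rp 400.000"
print(price)    # [1] "Rp 115.000"
print(price_k)  # [1] "Rp 115.000"
```

If nothing matches, sub returns the input unchanged, whereas regmatches returns an empty character vector, so the \K version is probably easier to check for a failed match.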

Upvotes: 2
