Reputation: 5310
Hi, I'm trying to extract a substring with PHP's preg_match().
The input strings look like this:
25k8cp1gl6-Mein Herze im Blut, BWV 199: Recitative: Ich Wunden_SVD1329578_14691639_unified :CPN_trans:
Here I want to extract Mein Herze im Blut, BWV 199: Recitative: Ich Wunden
25k8cp1gl6-La Puesta Del Sol_SVD1133599_12537702_unified :CPN_trans:
Here I want to extract La Puesta Del Sol
La Puesta Del Sol_SVD1133599_12537702_unified :CPN_trans:
Here I want to extract La Puesta Del Sol
25k8cp1gl6-La Puesta Del Sol_MNA1133599_12537702_unified :CPN_trans:
Here I want to extract La Puesta Del Sol
25k8cp1gl6-La Puesta Del Sol_IMC1133599_12537702_unified :CPN_trans:
Here I want to extract La Puesta Del Sol
So basically I want to extract the part of the string that comes before _SVD, _MNA, or _IMC, excluding the leading 25k8cp1gl6- prefix when it is present.
Thanks in advance.
Upvotes: 0
Views: 118
Reputation: 20486
Here is an expression for ya:
(?<=25k8cp1gl6-).*?(?=_(?:SVD|MNA|IMC))
Explanation:
(?<=...) is the syntax for a lookbehind, meaning we start by finding (but not including in our match) "25k8cp1gl6-". Then we lazily match as few characters as possible with .*? so that the match stops at the first marker. Finally, (?=...) is the syntax for a lookahead. It looks for "_" followed by "SVD", "MNA", or "IMC" (separated with | in the non-capturing group (?:...)), again without including it in the match.
PHP:
$strings = array(
    '25k8cp1gl6-Mein Herze im Blut, BWV 199: Recitative: Ich Wunden_SVD1329578_14691639_unified :CPN_trans:',
    '25k8cp1gl6-La Puesta Del Sol_SVD1133599_12537702_unified :CPN_trans:',
    '25k8cp1gl6-La Puesta Del Sol_MNA1133599_12537702_unified :CPN_trans:',
    '25k8cp1gl6-La Puesta Del Sol_IMC1133599_12537702_unified :CPN_trans:',
);
foreach ($strings as $string) {
    if (preg_match('/(?<=25k8cp1gl6-).*?(?=_(?:SVD|MNA|IMC))/', $string, $matches)) {
        $substring = reset($matches);
        var_dump($substring);
    }
}
Another option, which uses preg_replace(), is this expression:
^\w+-(.*?)_(?:SVD|MNA|IMC).*
Explanation:
This one matches the entire string, but captures the part we want to keep so that we can reference it in our replacement as $1. Also note that I began with ^\w+- instead of 25k8cp1gl6-. This just looks for one or more "word characters" ([A-Za-z0-9_]) followed by a hyphen at the beginning of the string. If it needs to be exactly "25k8cp1gl6-", you can replace this; I just wanted to show another option.
PHP:
$strings = array(
    '25k8cp1gl6-Mein Herze im Blut, BWV 199: Recitative: Ich Wunden_SVD1329578_14691639_unified :CPN_trans:',
    '25k8cp1gl6-La Puesta Del Sol_SVD1133599_12537702_unified :CPN_trans:',
    '25k8cp1gl6-La Puesta Del Sol_MNA1133599_12537702_unified :CPN_trans:',
    '25k8cp1gl6-La Puesta Del Sol_IMC1133599_12537702_unified :CPN_trans:',
);
foreach ($strings as $string) {
    $substring = preg_replace('/^\w+-(.*?)_(?:SVD|MNA|IMC).*/', '$1', $string);
    var_dump($substring);
}
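One of the sample inputs in the question has no "25k8cp1gl6-" prefix at all, and neither expression above matches it: the lookbehind requires the prefix, and ^\w+- requires a leading run of word characters followed by a hyphen (when the pattern does not match, preg_replace() returns the string unchanged). Below is a minimal sketch of a variant that treats the prefix as optional. The function name extractTitle is hypothetical, the nullable return type needs PHP 7.1 or later, and the pattern assumes a real title never begins with word characters immediately followed by a hyphen.

```php
<?php
// Hypothetical helper, not part of the original answer.
// Returns the title, or null when no _SVD/_MNA/_IMC marker is found.
function extractTitle(string $input): ?string
{
    // (?:\w+-)?  optional prefix such as "25k8cp1gl6-"
    // (.*?)      the title, matched lazily so it stops at the first marker
    // Assumption: titles do not start with word characters followed by "-".
    if (preg_match('/^(?:\w+-)?(.*?)_(?:SVD|MNA|IMC)/', $input, $m)) {
        return $m[1];
    }
    return null;
}

var_dump(extractTitle('La Puesta Del Sol_SVD1133599_12537702_unified :CPN_trans:'));
var_dump(extractTitle('25k8cp1gl6-La Puesta Del Sol_MNA1133599_12537702_unified :CPN_trans:'));
```

Using preg_match() with a capture group here, rather than preg_replace(), makes the "no match" case explicit, so the caller can tell a failed extraction apart from a title.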
Upvotes: 2