capsula

Reputation: 498

preg_match issue (not parsing as I expected)

I'm parsing a mail header, and I'm looking for the "boundary=..." parameter.

$content = '..Content-Type: multipart/alternative;
    boundary="----=_NextPart_000_10CD_01CD3CB2.7C22E7C0"
X-Mailer: Microsoft CDO for Windows 2000..';

I tried the following two calls, but neither of them works:

    $boundary = preg_replace('#(.*)boundary="(.*)"(.*)#is',"$2",$content);

    $boundary = preg_replace('#boundary="(.*)"#i',"$1",$content);

The first line returns:

NextPart_000_10CD_01CD3CB2.7C22E7C0"
X-Mailer: Microsoft CDO for Windows 2000

While the second one:

Content-Type: multipart/alternative;
    ----=_NextPart_000_10CD_01CD3CB2.7C22E7C0
X-Mailer: Microsoft CDO for Windows 2000

I understand what the second line does, and it does it correctly. But I don't get why the first line doesn't stop at the second double quote. Any ideas?

Upvotes: 0

Views: 115

Answers (3)

damianb

Reputation: 1224

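In response to the question in your self-answer:

.*, when used with the s modifier, also matches newlines. Because .* is greedy, the second group does not stop at the first closing double quote; it runs on across lines to the last double quote in the whole string.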

https://www.php.net/manual/en/reference.pcre.pattern.modifiers.php

s (PCRE_DOTALL)

If this modifier is set, a dot metacharacter in the pattern matches all characters, including newlines. Without it, newlines are excluded. This modifier is equivalent to Perl's /s modifier. A negative class such as [^a] always matches a newline character, independent of the setting of this modifier.
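
A minimal sketch of the difference, assuming a shortened sample header. The X-Example line is hypothetical and is there only so that a later double quote exists for the greedy match to run on to:

```php
<?php
// Hypothetical sample header. The X-Example line is invented for this demo
// so the string contains a later double quote.
$content = "Content-Type: multipart/alternative;\n"
         . "    boundary=\"----=_NextPart_000_10CD_01CD3CB2.7C22E7C0\"\n"
         . "X-Mailer: Microsoft CDO for Windows 2000\n"
         . "X-Example: \"later quoted text\"\n";

// Greedy .* with the s modifier: the dot crosses newlines, so the group
// extends to the LAST double quote in the string.
preg_match('#boundary="(.*)"#is', $content, $greedy);

// Lazy .*? stops at the first double quote that lets the pattern match.
preg_match('#boundary="(.*?)"#is', $content, $lazy);

// A negated class can never consume a double quote, so it also stops there.
preg_match('#boundary="([^"]*)"#i', $content, $negated);

echo "greedy:  ", $greedy[1], "\n";
echo "lazy:    ", $lazy[1], "\n";
echo "negated: ", $negated[1], "\n";
```

The lazy and negated-class variants should both capture only the boundary value, while the greedy capture also swallows the following header lines.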

Upvotes: 0

flowfree

Reputation: 16462

preg_match('/boundary="([^"]+)"/m', $content, $m);
echo $m[1]; // ----=_NextPart_000_10CD_01CD3CB2.7C22E7C0
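
One thing to watch for: if the header has no quoted boundary, $m[1] is not set, and preg_replace would simply return the input unchanged. A minimal sketch of a guarded version follows; the extract_boundary helper name is hypothetical and not part of the answer above:

```php
<?php
// Hypothetical helper: returns the boundary value, or null when the
// header contains no quoted boundary parameter.
function extract_boundary(string $header): ?string
{
    // preg_match() returns 1 on a match, 0 on no match, false on error.
    if (preg_match('/boundary="([^"]+)"/i', $header, $m) === 1) {
        return $m[1];
    }
    return null;
}

$found   = extract_boundary('Content-Type: multipart/alternative; boundary="abc123"');
$missing = extract_boundary('Content-Type: text/plain');

var_dump($found, $missing);
```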

Upvotes: 1

capsula

Reputation: 498

I finally solved it using the negated character class [^"]*:

$boundary = preg_replace('#(.*)boundary="([^"]*)(.*)#is',"$2",$content);

But if anyone knows the answer to my question, I'd still appreciate it. I don't fully understand the behavior of (.*) when used with the s modifier.

Upvotes: 0
