Reputation: 8836

Modification to regex to match time

I want to extract the 2012-07-16T21:00:00 from the following markup:

 <abbr title="2012-07-16T21:00:00" class="dtstart">Monday, July 16th, 2012</abbr>

but I am having some difficulties. This is what I tried:

preg_match('/<abbr title="(.*)" \/>/i', $file_string, $time);
$time_out = $time[1];

Upvotes: 0

Answers (4)

Madara's Ghost

Reputation: 174957

The best way would be to use an HTML parser, like PHP's DOM.

<?php

    $html = <<<HTML
<abbr title="2012-07-16T21:00:00" class="dtstart">Monday, July 16th, 2012</abbr>
HTML;

    $dom = new DOMDocument();
    $dom->loadHTML($html);
    $abbr  = $dom->getElementsByTagName("abbr")->item(0);
    $title = $abbr->getAttribute("title");

    echo $title;

That will work even if your data doesn't look exactly like that:

If there are other attributes before or after title.
If there are trailing spaces or other invisible characters.
Regardless of quote type (", ', or none).
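
To illustrate the three cases above, here is a minimal sketch. The helper name abbrTitle and the sample variants are made up for this example and are not part of the original code; it assumes the first abbr element in the input is the one you want.

```php
<?php
// Hypothetical helper (not from the answer): returns the title attribute
// of the first <abbr> element, or null if there is no <abbr> at all.
function abbrTitle(string $html): ?string
{
    // Collect libxml warnings internally instead of printing them.
    libxml_use_internal_errors(true);

    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();

    $abbr = $dom->getElementsByTagName('abbr')->item(0);
    if ($abbr === null) {
        return null;
    }
    return $abbr->getAttribute('title');
}

// Illustrative variants: different attribute order, extra spaces, quote styles.
$variants = [
    '<abbr title="2012-07-16T21:00:00" class="dtstart">Monday, July 16th, 2012</abbr>',
    "<abbr class='dtstart'   title='2012-07-16T21:00:00' >Monday, July 16th, 2012</abbr>",
    '<abbr class=dtstart title=2012-07-16T21:00:00>Monday, July 16th, 2012</abbr>',
];

foreach ($variants as $variant) {
    echo abbrTitle($variant), PHP_EOL;
}
```

A regex written for the first variant would have to be reworked for the other two, while the parser code stays the same.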

So please, don't use RegEx, as it will eventually cause you to lose your mind to Cthulhu. The <center> cannot hold it is too late.

Upvotes: 0

Stegrex

Reputation: 4024

Try it this way instead of regex:

$dom = new DOMDocument;
$dom->loadXML($file_string);

$abbr = simplexml_import_dom($dom);

// Initialise $time so the echo below is safe even if no title attribute is found.
$time = null;
foreach ($abbr[0]->attributes() as $key => $value)
{
    if ($key == 'title')
    {
        $time = $value;
        break;
    }
}
echo $time;

Regex can be a pain for dealing with this sort of thing. Better to use a parser.
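
As a shorter variant of the same idea, SimpleXML also lets you read an attribute with array syntax, so the loop is not strictly needed. This is only a sketch: it assumes $file_string holds just the abbr fragment and that the fragment is well-formed XML. A full HTML page will often not parse as XML, in which case an HTML parser is the safer choice.

```php
<?php
// Assumption: $file_string contains only the <abbr> fragment from the question.
$file_string = '<abbr title="2012-07-16T21:00:00" class="dtstart">Monday, July 16th, 2012</abbr>';

// simplexml_load_string() returns false when the input is not well-formed XML.
$abbr = simplexml_load_string($file_string);

// Attributes are exposed through array access; cast to get a plain string.
$time = ($abbr !== false && isset($abbr['title']))
    ? (string) $abbr['title']
    : null;

echo $time;
```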

Upvotes: 0

Joseph Silber

Reputation: 219930

While I don't think using a regex for this is the best approach, it might be OK in some circumstances.

If you're using a regex, this is what you need:

preg_match('/<abbr title="([^"]*)"/i', $file_string, $time);

See it here in action: http://viper-7.com/qZu9tj
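
If you go this route, it is probably worth checking the return value of preg_match before reading $time[1], since the index will not exist when nothing matches. A minimal sketch, reusing the $time and $time_out names from the question and a hard-coded sample string as an assumption:

```php
<?php
// Assumption: sample input copied from the question.
$file_string = '<abbr title="2012-07-16T21:00:00" class="dtstart">Monday, July 16th, 2012</abbr>';

// preg_match() returns 1 on a match, 0 on no match, and false on error.
if (preg_match('/<abbr title="([^"]*)"/i', $file_string, $time) === 1) {
    $time_out = $time[1]; // first capture group: the attribute value
} else {
    $time_out = null;     // nothing matched
}

echo $time_out;
```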

Upvotes: 0

Arcadien

Reputation: 2278

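Use

preg_match('/<abbr title="([^"]*)"/i', $file_string, $time);

so that the match stops at the first " ([^"] means any character except "). The trailing \/> from your pattern has to go as well, because in your markup the title attribute is followed by class="dtstart">, not by />.

preg_match('/<abbr title="([0-9T:-]*)"/i', $file_string, $time);

To be more precise, use a character class that contains only what you need to capture (note that " is excluded).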
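
To see why the greedy (.*) is a problem, here is a small comparison on the sample line from the question. It is a sketch under one assumption: the trailing \/> is left out of both patterns so that each of them can match this markup at all. The variable names $greedy and $bounded are made up for the example.

```php
<?php
// Assumption: sample input copied from the question.
$file_string = '<abbr title="2012-07-16T21:00:00" class="dtstart">Monday, July 16th, 2012</abbr>';

// Greedy: .* runs to the end of the line, then backtracks to the LAST quote.
preg_match('/<abbr title="(.*)"/i', $file_string, $greedy);

// Negated class: [^"]* cannot cross a quote, so it stops at the FIRST one.
preg_match('/<abbr title="([^"]*)"/i', $file_string, $bounded);

echo $greedy[1], PHP_EOL;  // 2012-07-16T21:00:00" class="dtstart
echo $bounded[1], PHP_EOL; // 2012-07-16T21:00:00
```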

Upvotes: 1
