EnexoOnoma

Reputation: 8836

Modification to regex to match time

I want to take the 2012-07-16T21:00:00 out of the

 <abbr title="2012-07-16T21:00:00" class="dtstart">Monday, July 16th, 2012</abbr>

but I am having some difficulties. This is what I tried:

preg_match('/<abbr title="(.*)" \/>/i', $file_string, $time);
$time_out = $time[1];

Upvotes: 0

Views: 84

Answers (4)

Madara's Ghost

Reputation: 174957

The best way would be to use an HTML parser, like PHP's DOM.

<?php

    $html = <<<HTML
<abbr title="2012-07-16T21:00:00" class="dtstart">Monday, July 16th, 2012</abbr>
HTML;

    $dom = new DOMDocument();
    $dom->loadHTML($html);
    $abbr  = $dom->getElementsByTagName("abbr")->item(0);
    $title = $abbr->getAttribute("title");

    echo $title;

That will work even if your data doesn't look exactly like that:

  • If there are other attributes before or after title.
  • If there are trailing spaces or other invisible characters.
  • Regardless of quote type (", ', or none).
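As a rough sketch of the first and third points, the snippet below runs the same DOM lookup against a few variations of the markup. The helper name first_abbr_title and the variant strings are made up for illustration and are not part of the original answer; the libxml_use_internal_errors() calls are only there to silence parser warnings.

```php
<?php
// Hypothetical helper (not from the answer): returns the title attribute of
// the first <abbr> element in $html, or null if there is none.
function first_abbr_title(string $html): ?string
{
    $dom = new DOMDocument();

    // Collect libxml warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $abbr = $dom->getElementsByTagName('abbr')->item(0);
    if ($abbr === null || !$abbr->hasAttribute('title')) {
        return null;
    }

    return $abbr->getAttribute('title');
}

// Illustrative variants: original order, single quotes with title last,
// and extra whitespace between attributes.
$variants = [
    '<abbr title="2012-07-16T21:00:00" class="dtstart">Monday, July 16th, 2012</abbr>',
    "<abbr class='dtstart' title='2012-07-16T21:00:00'>Monday, July 16th, 2012</abbr>",
    '<abbr   class="dtstart"   title="2012-07-16T21:00:00"  >Monday, July 16th, 2012</abbr>',
];

foreach ($variants as $variant) {
    echo first_abbr_title($variant), "\n"; // prints 2012-07-16T21:00:00 each time
}
```

A regex written for one of these layouts would need changes for the others, while the parser-based lookup stays the same.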

So please, don't use RegEx, as it will eventually cause you to lose your mind to Cthulhu. The <center> cannot hold it is too late.

Upvotes: 0

Stegrex

Reputation: 4024

Try it this way instead of regex:

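$dom = new DOMDocument;
// loadXML() needs well-formed XML; this assumes the root element of $file_string is the <abbr>
$dom->loadXML($file_string);

$abbr = simplexml_import_dom($dom);

$time = null; // stays null if there is no title attribute
foreach ($abbr[0]->attributes() as $key => $value)
{
    if ($key == 'title')
    {
        $time = (string) $value; // cast the SimpleXMLElement to a plain string
        break;
    }
}
echo $time;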

Regex can be a pain for dealing with this sort of thing. Better to use a parser.

Upvotes: 0

Joseph Silber

Reputation: 219930

While I don't think using a regex for this is the best approach, it might be OK in some circumstances.

If you're using a regex, this is what you need:

preg_match('/<abbr title="([^"]*)"/i', $file_string, $time);

See it here in action: http://viper-7.com/qZu9tj
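For a quick comparison, here is a small sketch that runs the pattern from the question and the pattern above against the sample markup. The variable names $m1, $m2 and $m3 and the third "greedy" pattern are just for illustration; the sample string is the one from the question.

```php
<?php
$file_string = '<abbr title="2012-07-16T21:00:00" class="dtstart">Monday, July 16th, 2012</abbr>';

// Pattern from the question: it requires `" />` right after the value,
// which never occurs in the sample, so nothing matches.
$original = preg_match('/<abbr title="(.*)" \/>/i', $file_string, $m1);

// Pattern from this answer: [^"]* stops at the first double quote.
$fixed = preg_match('/<abbr title="([^"]*)"/i', $file_string, $m2);

// Illustration only: dropping ` />` but keeping the greedy .* runs on
// to the LAST double quote in the string.
$greedy = preg_match('/<abbr title="(.*)"/i', $file_string, $m3);

var_dump($original);  // int(0)
echo $m2[1], "\n";    // 2012-07-16T21:00:00
echo $m3[1], "\n";    // 2012-07-16T21:00:00" class="dtstart
```

So there are two separate problems in the original attempt: the trailing ` />` that the markup doesn't have, and the greedy `.*` that would overshoot the attribute value once the first problem is removed.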

Upvotes: 0

Arcadien

Reputation: 2278

Use

preg_match('/<abbr title="([^"]*)"/i', $file_string, $time);

so the match stops at the first " ([^"] means any character except ").

Or

preg_match('/<abbr title="([0-9T:-]*)"/i', $file_string, $time);

which is more precise: the group contains only the characters you need to capture (note that " is excluded).
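To show what the narrower character class buys you, here is a minimal sketch. It assumes the pattern ends right after the closing quote of the attribute, so that it can match the sample markup, which has no ` />`. The helper name strict_title and the second sample string are invented for the example.

```php
<?php
// Hypothetical helper: returns the captured timestamp, or null when the
// title attribute contains anything other than digits, T, : or -.
function strict_title(string $html): ?string
{
    if (preg_match('/<abbr title="([0-9T:-]*)"/i', $html, $m)) {
        return $m[1];
    }
    return null;
}

$samples = [
    // Sample from the question.
    '<abbr title="2012-07-16T21:00:00" class="dtstart">Monday, July 16th, 2012</abbr>',
    // Made-up abbreviation whose title is not a timestamp.
    '<abbr title="World Health Organization">WHO</abbr>',
];

foreach ($samples as $sample) {
    echo strict_title($sample) ?? '(no match)', "\n";
}
// prints:
// 2012-07-16T21:00:00
// (no match)
```

With [^"]* the second sample would match too, so the tighter class is useful when the page has other abbr elements you want to skip.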

Upvotes: 1
