userlite

Reputation: 135

Are wildcards allowed in a sitemap.xml file?

I have a website with a directory that contains 100+ html files. I want crawlers to crawl all the html files in that directory. I have already added the following line to my robots.txt:

Allow: /DirName/*.html$

Is there any way to include the files in that directory in the sitemap.xml file so that all of its html files get crawled? Something like this:

<url>
    <loc>MyWebsiteName/DirName/*.html</loc>
</url>

Upvotes: 3

Views: 4238

Answers (2)

eQ19

Reputation: 10711

Wildcards are not allowed in a sitemap. If you run PHP on your server, you could list all the files in the directory and generate sitemap.xml automatically using DirectoryIterator.

// This assumes you already have a Sitemap class (and a Sitemap_URL class) defined.
$sitemap = new Sitemap;

// iterate the directory
foreach(new DirectoryIterator('/MyWebsiteName/DirName') as $directoryItem)
{
    // Skip anything that is not a regular .html file.
    if (!$directoryItem->isFile() || $directoryItem->getExtension() !== 'html') continue;

    // New basic sitemap.
    $url = new Sitemap_URL;

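    // Set arguments. Note that the sitemap protocol expects <loc> to be a fully
    // qualified URL, so prepend your scheme and host if your class does not do so.
    $url->set_loc(sprintf('/DirName/%1$s', $directoryItem->getBasename()))
        ->set_last_mod($directoryItem->getMTime()) // the file's last modification time
        ->set_change_frequency('daily')
        ->set_priority(1);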

    // Add it to sitemap.
    $sitemap->add($url);
}

// Render the output.
$response = $sitemap->render();

// Cache the output for 24 hours (assumes a $cache object is already available).
$cache->set('sitemap', $response, 86400);

// Output the sitemap.
echo $response;

Upvotes: 0

methode

Reputation: 5438

The sitemap protocol neither restricts nor allows the use of wildcards; to be honest, this is the first time I have heard of this. Also, I'm pretty sure that search engines can't make use of wildcards in sitemaps.

Please take a look at Google's recommendations for sitemap generators. There are tons of tools that can create a sitemap in the blink of an eye.
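
If you would rather not depend on a third-party tool (or PHP), a generator for this case can be very small: expand the wildcard yourself on disk and write one <url> entry per file. Below is a minimal sketch in Python, not a definitive implementation. The function name build_sitemap and the base URL are placeholders I made up for illustration, and it assumes the html files sit directly inside one directory.

```python
from datetime import datetime, timezone
from pathlib import Path
from urllib.parse import quote
from xml.sax.saxutils import escape


def build_sitemap(directory, base_url):
    """Return sitemap XML with one <url> entry per .html file in `directory`.

    `base_url` is the public URL that maps to `directory` (an assumption:
    adjust it to however your server exposes the files).
    """
    entries = []
    # The wildcard is expanded here, on disk, instead of inside the sitemap.
    for path in sorted(Path(directory).glob("*.html")):
        if not path.is_file():
            continue
        # <lastmod> in W3C date format, taken from the file's modification time.
        mtime = datetime.fromtimestamp(path.stat().st_mtime, tz=timezone.utc)
        # Percent-encode the file name, then XML-escape the whole URL.
        loc = escape(base_url.rstrip("/") + "/" + quote(path.name))
        entries.append(
            "  <url>\n"
            f"    <loc>{loc}</loc>\n"
            f"    <lastmod>{mtime:%Y-%m-%d}</lastmod>\n"
            "  </url>"
        )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
        + "\n".join(entries)
        + "\n</urlset>\n"
    )
```

You would call it with something like build_sitemap("DirName", "http://MyWebsiteName/DirName"), write the returned string to sitemap.xml, and re-run it whenever files are added or removed.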

Upvotes: 1
