Hoetmaaiers

Reputation: 3503

Docsearch on subpath '/docs' not scraping side navigation

A Docusaurus documentation website, https://slovakia-atmo-plan.marvintest.vito.be/docs/, is rendered in docs-only mode.

The Algolia DocSearch scraper is not scraping the root-level pages; instead it logs "Ignored: from start url". The issue only seems to arise when the Docusaurus build is nested under {baseUrl}/docs.

Why is this being ignored? This is my docsearch config:

{
  "index_name": "atmoplan-documentation",
  "start_urls": ["https://slovakia-atmo-plan.marvintest.vito.be/docs"],
  "sitemap_urls": ["https://slovakia-atmo-plan.marvintest.vito.be/docs/sitemap.xml"],
  "sitemap_alternate_links": true,
  "stop_urls": ["/tests"],
  "selectors": {
    "lvl0": {
      "selector": "(//ul[contains(@class,'menu__list')]//a[contains(@class, 'menu__link menu__link--sublist menu__link--active')]/text() | //nav[contains(@class, 'navbar')]//a[contains(@class, 'navbar__link--active')]/text())[last()]",
      "type": "xpath",
      "global": true,
      "default_value": "Documentation"
    },
    "lvl1": "header h1",
    "lvl2": "article h2",
    "lvl3": "article h3",
    "lvl4": "article h4",
    "lvl5": "article h5, article td:first-child",
    "lvl6": "article h6",
    "text": "article p, article li, article td:last-child"
  },
  "strip_chars": " .,;:#",
  "custom_settings": {
    "separatorsToIndex": "_",
    "attributesForFaceting": ["language", "version", "type", "docusaurus_tag"],
    "attributesToRetrieve": ["hierarchy", "content", "anchor", "url", "url_without_anchor", "type"]
  },
  "conversation_id": ["833762294"],
  "nb_hits": 46250
}

Upvotes: 0

Views: 592

Answers (1)

D.Kastier

Reputation: 3025

In your docusaurus.config.js, set the url parameter to the actual address where your docs will be hosted. Something like:

module.exports = {
  url: 'https://slovakia-atmo-plan.marvintest.vito.be/docs',
  // […]
};

Docusaurus uses this value to generate sitemap.xml, which Algolia in turn uses to locate your pages.
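
As a hedged aside: the Docusaurus configuration reference describes url as the top-level hostname and baseUrl as the path after the host, so the same address can also be split across the two fields. A minimal sketch, assuming the site is served under /docs/ (the baseUrl value and the `config` variable name are assumptions, not taken from the question):

```javascript
// docusaurus.config.js (sketch; only the two fields discussed here are shown)
const config = {
  // Top-level hostname, without the sub-path
  url: 'https://slovakia-atmo-plan.marvintest.vito.be',
  // Assumed sub-path the site is served from, written with leading and trailing slashes
  baseUrl: '/docs/',
};

module.exports = config;
```

With these values, url and baseUrl together give https://slovakia-atmo-plan.marvintest.vito.be/docs/, which lines up with the start_urls entry in the DocSearch config.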


REFERENCE: https://docusaurus.io/docs/docusaurus.config.js/#url


DISCLAIMER

I noticed something strange in your sitemap.xml. For example, the first link was https://www.vito.be/docs/markdown-page, but the URL defined for Algolia is https://slovakia-atmo-plan.marvintest.vito.be/docs.
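
One way to check for this kind of mismatch is to list the sitemap entries that do not start with the start URL. The sketch below is illustrative only: findForeignSitemapUrls is a hypothetical helper, not part of DocSearch or Docusaurus; the sample XML (including the /docs/intro page) is made up; and a regular expression stands in for a real XML parser.

```javascript
// Return every <loc> value in a sitemap that does not start with the given prefix.
function findForeignSitemapUrls(sitemapXml, startUrl) {
  const locPattern = /<loc>\s*([^<\s]+)\s*<\/loc>/g;
  const foreign = [];
  for (const match of sitemapXml.matchAll(locPattern)) {
    const loc = match[1];
    if (!loc.startsWith(startUrl)) {
      foreign.push(loc);
    }
  }
  return foreign;
}

// Hypothetical sample, modelled on the mismatch described above
const sampleSitemap = `<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.vito.be/docs/markdown-page</loc></url>
  <url><loc>https://slovakia-atmo-plan.marvintest.vito.be/docs/intro</loc></url>
</urlset>`;

const foreignUrls = findForeignSitemapUrls(
  sampleSitemap,
  'https://slovakia-atmo-plan.marvintest.vito.be/docs'
);
console.log(foreignUrls); // [ 'https://www.vito.be/docs/markdown-page' ]
```

Any URL this reports is hosted somewhere other than the address the scraper was told to start from, which is worth ruling out before looking further into the scraper configuration.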

Upvotes: 1
