Brian Porter

Reputation: 529

How can I create blog entries for Gatsby with migrated HTML content

I am trying to migrate a blog, and can extract the posts in HTML format as well as the title, keywords, date, meta description, etc.

How can I use them to create the blog posts in GatsbyJS? I can only find instructions for using Markdown. It is not really feasible to convert several hundred of these posts to Markdown by hand, because of the complex formatting along with some inline CSS styles.

Is there some way of adding the HTML in a separate JavaScript file so that it gets included (via the template?) while the metadata lives in the Markdown file?

Upvotes: 2

Views: 532

Answers (1)

Derek Nguyen

Reputation: 11577

Edit: Here's an example repo


I think you can point gatsby-source-filesystem to your HTML folder and create a node for each file in there. Once you have that, you can query them in your template just like other markdown nodes.

Say you have the HTML files in a content folder:

root
 |--content
 |   `--htmls
 |       |--post1.html
 |       `--post2.html
 |  
 |--src
 |   `--templates
 |        `--blog.js
 |
 |--gatsby-config.js
 `--gatsby-node.js

Point gatsby-source-filesystem to your HTML folder (this entry goes in the plugins array):

// gatsby-config.js
{
  resolve: `gatsby-source-filesystem`,
  options: {
    path: `${__dirname}/content/htmls`,
    name: `html`,
  },
},

Then in gatsby-node.js, you can use loadNodeContent to read the raw HTML. From then on it's pretty straightforward; just follow the example in Gatsby's docs about creating nodes.

const { createContentDigest } = require("gatsby-core-utils");

exports.onCreateNode = async ({
  node, loadNodeContent, createNodeId, actions
}) => {

  // only care about HTML files picked up by gatsby-source-filesystem
  if (node.internal.type !== 'File' || node.internal.mediaType !== 'text/html') return;

  const { createNode } = actions;

  // read the raw html content
  const nodeContent = await loadNodeContent(node);

  // set up the new node
  const htmlNodeContent = {
    id: createNodeId(node.relativePath), // required
    content: nodeContent,
    name: node.name, // take the file's name as identifier
    internal: {
      type: 'HTMLContent',
      contentDigest: createContentDigest(nodeContent), // required
    },
    // spread any other metadata props you need here, e.g.
    // ...otherNecessaryMetaDataProps
  };

  createNode(htmlNodeContent);
};
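One way to fill in otherNecessaryMetaDataProps, sketched under the assumption that the migrated metadata has been exported to an object keyed by file name. The metadataByName object, its field names, and the metadataFor helper are all hypothetical and not part of Gatsby:

```javascript
// Hypothetical metadata export, keyed by the HTML file name without extension.
const metadataByName = {
  post1: {
    title: "First post",
    keywords: ["gatsby", "migration"],
    description: "Migrated from the old blog",
  },
};

// Returns the metadata fields for one file, with fallbacks so that
// every HTMLContent node ends up with the same set of fields.
function metadataFor(name, metadata) {
  const { title = name, keywords = [], description = "" } = metadata[name] || {};
  return { title, keywords, description };
}
```

Inside onCreateNode this could be spread into the node object as ...metadataFor(node.name, metadataByName), after which title, keywords and description should be queryable next to name and content.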

Once you create the nodes, you can query them with

{
  allHtmlContent {
    edges {
      node {
        name
        content
      }
    }
  }
}

From then on, you can treat them much like markdown nodes. It gets more complex if you need to parse the content, for example to locate image files, in which case I think you'd need to look into something like rehype.
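To turn those nodes into pages, the query result can be mapped to createPage arguments. This is a minimal sketch rather than a definitive setup: the toPageDefinitions helper, the /blog/<name>/ URL scheme and the name context key are assumptions chosen for illustration.

```javascript
// Maps the data returned by the allHtmlContent query to createPage() arguments.
// The URL scheme and the `name` context key are assumptions, adjust as needed.
function toPageDefinitions(queryData, templatePath) {
  return queryData.allHtmlContent.edges.map(({ node }) => ({
    path: `/blog/${node.name}/`,
    component: templatePath,
    context: { name: node.name },
  }));
}

// Possible use in gatsby-node.js (shown as a comment, not executed here):
//
// exports.createPages = async ({ graphql, actions }) => {
//   const result = await graphql(`{ allHtmlContent { edges { node { name } } } }`);
//   if (result.errors) throw result.errors;
//   const template = require("path").resolve("src/templates/blog.js");
//   toPageDefinitions(result.data, template)
//     .forEach(page => actions.createPage(page));
// };
```

In src/templates/blog.js the page query can then look up the node by the name passed in the context, and the component can render the raw markup with React's dangerouslySetInnerHTML={{ __html: content }}, which keeps the original formatting and inline styles.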

Upvotes: 6
