Generating a Blog Sitemap with Node.js using the Express Framework

In this article I will talk briefly about what sitemaps are and how to generate one with an existing Node.js package. Although the practical examples focus on blog applications, they can easily be extended to a more general website. The sitemap will be cached for 24 hours (the duration is easy to change).

I chose the sitemap package, which is not specific to Express but is very easy to use and has a convenient cache system. You may also take a look at the express-sitemap package, which is specific to Express, but the only advantage I found in it is that it can automatically add the static pages of your app object to the sitemap. If you have lots of static pages, this may be useful for you. For a blog, I haven’t found any reason to use it instead of the more general sitemap package.

If you know the basics of sitemaps, you may skip the first section of this article.

An Introduction to Sitemaps

Sitemaps is a protocol that helps search engines index your pages. In its simplest form, a sitemap is a single text file in which each line is the URL of a page you want indexed.
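To make the text format concrete, here is a minimal sketch that writes one URL per line. The renderTextSitemap helper and the /about page are names I made up for illustration; they are not part of any package.

```javascript
// Render a plain-text sitemap: one absolute URL per line.
// renderTextSitemap is a hypothetical helper, not a library function.
function renderTextSitemap(urls) {
  return urls.join('\n') + '\n';
}

var textSitemap = renderTextSitemap([
  'http://www.example.com/',
  'http://www.example.com/about' // hypothetical page
]);

console.log(textSitemap);
```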

Naturally, if you leave a URL out of your sitemap, that page may still be indexed. If you don’t want a page to be indexed, read about robots.txt. Likewise, adding a page to the sitemap does not guarantee that it will be displayed on the first page of any search engine.

Sitemaps are just tools that help search engines find every page of your site. Optionally, you may assign a relative priority to each page if you think some pages are more important than others. This priority is relative only to the pages of your own site and has no effect on how your site ranks against other sites.

The most common sitemap format is XML.

In an XML sitemap, each page is described by its own url entry. For example, an entry may state that http://www.example.com/ was last updated on January 1, 2015, is expected to be updated monthly, and has a priority of 0.8 (on a scale of 0 to 1). The default priority is 0.5.
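The entry described above can be sketched by building the XML by hand. This is only an illustration of the format defined by the Sitemaps protocol (the urlset, url, loc, lastmod, changefreq and priority elements); the helper names escapeXml, renderUrlEntry and renderXmlSitemap are my own, and this is not how the sitemap package works internally.

```javascript
// Escape the characters that are special in XML text content.
function escapeXml(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&apos;');
}

// Render one <url> entry. Only <loc> is always written; the other
// tags are optional and are skipped when the field is undefined.
function renderUrlEntry(entry) {
  var lines = ['  <url>', '    <loc>' + escapeXml(entry.loc) + '</loc>'];
  ['lastmod', 'changefreq', 'priority'].forEach(function (tag) {
    if (entry[tag] !== undefined) {
      lines.push('    <' + tag + '>' + escapeXml(entry[tag]) + '</' + tag + '>');
    }
  });
  lines.push('  </url>');
  return lines.join('\n');
}

// Wrap all entries in the <urlset> root element.
function renderXmlSitemap(entries) {
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    entries.map(renderUrlEntry).join('\n'),
    '</urlset>'
  ].join('\n');
}

var exampleXml = renderXmlSitemap([{
  loc: 'http://www.example.com/',
  lastmod: '2015-01-01',
  changefreq: 'monthly',
  priority: 0.8
}]);

console.log(exampleXml);
```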

Installing and Configuring the Sitemap Package

To install the package, add sitemap to the dependencies in your package.json file, or install it with npm (npm install --save sitemap).

Now create a routes/sitemap.js file and add it to your app object.
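A minimal sketch of what such a route file could contain is shown below. I am assuming the callback-style API of the 1.x releases of the sitemap package, where the sitemap object exposes toXML(callback) and the callback receives (err, xml); newer major versions of the package use a different API. The createSitemapHandler name is hypothetical, and the sitemap object is passed in as a parameter so the handler does not depend on how it was created.

```javascript
// routes/sitemap.js (sketch)
// `sitemap` is expected to expose toXML(callback), where the callback
// receives (err, xml). This matches the sitemap package's 1.x API,
// which is an assumption here.
function createSitemapHandler(sitemap) {
  return function (req, res) {
    sitemap.toXML(function (err, xml) {
      if (err) {
        return res.status(500).end();
      }
      res.header('Content-Type', 'application/xml');
      res.send(xml);
    });
  };
}

// Hypothetical wiring in app.js, given an Express `app` and a
// `sitemap` object created with the sitemap package:
//   app.get('/sitemap.xml', createSitemapHandler(sitemap));
```

In a real project you would export the function with module.exports and require it from app.js.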

Now you can generate the sitemap.
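As a sketch of the configuration, the options below set the 24-hour cache mentioned at the start of the article. The hostname, cacheTime (in milliseconds) and urls keys are the ones accepted by createSitemap in the 1.x releases of the sitemap package, which I am assuming here; buildSitemapOptions and the yoursite.com hostname are placeholders.

```javascript
var ONE_DAY_MS = 24 * 60 * 60 * 1000;

// Build the options object for the sitemap (hypothetical helper).
function buildSitemapOptions(hostname, urls) {
  return {
    hostname: hostname,
    cacheTime: ONE_DAY_MS, // keep the generated XML for 24 hours
    urls: urls
  };
}

var sitemapOptions = buildSitemapOptions('http://yoursite.com', [
  { url: '/', changefreq: 'daily', priority: 1.0 }
]);

console.log(sitemapOptions);

// With the sitemap package installed (1.x API assumed):
//   var sitemap = require('sitemap').createSitemap(sitemapOptions);
```

To change the cache duration, only ONE_DAY_MS needs to be edited.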

Visit yoursite.com/sitemap.xml, and there it is!

Including Your Posts in the Sitemap

Let’s get the URLs of your posts so we can include them in the sitemap. You don’t need to add every page of your site to the sitemap, only the ones that offer new content. Pagination and search pages usually don’t need to be included.

For this reason, we will add to the sitemap only the URLs of the posts.
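Here is a minimal sketch of turning posts into sitemap entries. The post objects, their slug field and the /posts/ prefix are assumptions about your blog, so adapt them to your own model and routes; the resulting array has the shape used for the urls option in the 1.x releases of the sitemap package.

```javascript
// Map blog posts to sitemap URL entries.
// `slug` and the '/posts/' prefix are hypothetical; use your own fields.
function postsToUrls(posts) {
  return posts.map(function (post) {
    return {
      url: '/posts/' + encodeURIComponent(post.slug),
      changefreq: 'monthly',
      priority: 0.8
    };
  });
}

// In a real app the posts would come from your database.
var postUrls = postsToUrls([
  { slug: 'hello-world' },
  { slug: 'node sitemaps' }
]);

console.log(postUrls);
```

Pagination and search pages are simply never passed to postsToUrls, so they stay out of the sitemap.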

The sm.toXML() method checks whether the XML needs to be generated again or whether the cache is still valid. If your sitemap has lots of pages, this results in a significant performance gain.

If you prefer a package specific to Express, you can use express-sitemap with only a few modifications. The most important difference is that express-sitemap doesn’t have any kind of cache. To avoid fetching every post from the database and rebuilding the XML every time the sitemap is requested, you can add a cron job that generates the XML at regular intervals, or store the time the sitemap was last generated in a variable and check for expiration yourself.
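The second option, checking the expiration yourself, could be sketched as follows. Nothing here comes from express-sitemap: createCachedGenerator is a hypothetical helper, generate stands for whatever function fetches your posts and builds the XML, and the optional now parameter exists only to make the expiration logic easy to test.

```javascript
// Wrap an expensive generate(callback) function with a time-based cache.
// generate must call back with (err, xml).
function createCachedGenerator(generate, ttlMs, now) {
  now = now || Date.now;
  var cachedXml = null;
  var generatedAt = 0;

  return function (callback) {
    // Serve the cached copy while it is younger than ttlMs.
    if (cachedXml !== null && now() - generatedAt < ttlMs) {
      return callback(null, cachedXml);
    }
    generate(function (err, xml) {
      if (err) {
        return callback(err);
      }
      cachedXml = xml;
      generatedAt = now();
      callback(null, xml);
    });
  };
}

// Usage: regenerate at most once every 24 hours.
var getSitemapXml = createCachedGenerator(function (done) {
  // Placeholder: query the database and build the XML here.
  done(null, '<urlset></urlset>');
}, 24 * 60 * 60 * 1000);

getSitemapXml(function (err, xml) {
  if (!err) {
    console.log(xml);
  }
});
```

Errors are not cached, so a failed generation is retried on the next request.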
