Google isn't indexing my older Ghost posts!
Older posts not getting indexed? Here's why.
For several months, I've noticed that Google will crawl my older posts but not actually index them. They're in the sitemap, but Googlebot just doesn't think they're worth indexing.
I considered two possibilities:
1) They were too 'deep' in the site, so that Google never got there, or didn't have enough domain reputation when it did. I added these buttons (below) to help get some of my tag content closer to the top of the page hierarchy.
That seemed to help with some posts, but my oldest content was still missing, and Google Search Console said there were zero links to it. That didn't make sense, because it was linked right there on the blog index page!
Then it clicked. Googlebot doesn't scroll. The Ruby theme (at least the version I was using) uses infinite scroll, with no fallback pagination links. Together, those two facts meant that Google had no way to reach my older content: there was no link to /page/2/, and since Googlebot doesn't scroll, those posts never loaded. So I made a quick tweak to serve Google some links it could follow, without changing how the site looks for everyone else. Here's how:
1) Add the {{pagination}} helper to the bottom of index.hbs (and tag.hbs and author.hbs if desired). I put it just before </main>.
2) Add the script below at the end of these files. (Note: this works with the default pagination helper, but you may need to adjust it if your theme uses a custom partial, found in /partials/pagination.hbs.)
<script>
  // Hide the pagination links as soon as the visitor scrolls.
  // Googlebot never scrolls, so it still sees (and can follow) them.
  document.addEventListener('scroll', function () {
    var pagination = document.querySelectorAll('.older-posts, .page-number, .newer-posts');
    pagination.forEach(function (el) { el.style.display = 'none'; });
  });
</script>
The script above hides the pagination elements as soon as the user starts scrolling. They're invisible (display: none) for normal visitors, who scroll and get more posts via infinite scroll, but visible to Googlebot, which never scrolls.
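If you'd rather not run the handler on every single scroll event, here is a minimal sketch of a variant that hides the links once and then removes its own listener. The function name hidePaginationOnFirstScroll and its doc parameter are my own inventions for illustration, not part of Ghost or the theme. It also assumes the pagination elements aren't re-inserted later by the infinite scroll code; if they are, stick with the original script.

```javascript
// Hypothetical helper (not part of Ghost or the Ruby theme).
// `doc` defaults to the browser's document; it is a parameter only so the
// function can be exercised with a stand-in object outside a browser.
function hidePaginationOnFirstScroll(doc = document) {
  // Class names used by Ghost's default pagination helper, as in the script above.
  const selector = '.older-posts, .page-number, .newer-posts';

  const hide = () => {
    doc.querySelectorAll(selector).forEach((el) => {
      el.style.display = 'none';
    });
  };

  // once: the listener is removed automatically after it fires the first time.
  // passive: signals that the handler never calls preventDefault().
  doc.addEventListener('scroll', hide, { once: true, passive: true });
}
```

To use it, put the function in the same script tag at the end of index.hbs and call hidePaginationOnFirstScroll() with no arguments.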
And now we wait to see if Googlebot can find my older content! I've used Search Console to ask for a reindex of the /blog page, and will be watching to see those posts show up as having links.
And yes, I filed a bug:
Update: further research
This theme (and likely several other infinite-scrolling themes) assumes that a <link rel="next"> tag in the page's <head> will help search engines get to the second (and third, and so on) page. Unfortunately, Googlebot doesn't use that tag. So if your theme uses infinite scroll, you may need the fix above!
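To check which situation your own theme is in, here's a rough sketch that looks at a page's raw HTML and reports whether it contains a rel="next" link tag, an ordinary anchor pointing at a /page/N/ URL, or both. The function name paginationHints is hypothetical, and it uses regular expressions rather than a real HTML parser, so treat the result as a hint, not a verdict.

```javascript
// Hypothetical diagnostic, not an official tool: reports which kinds of
// "next page" hints appear in a page's raw HTML.
function paginationHints(html) {
  // Collect <link ...> tags and look for rel="next" (the hint the theme relies on).
  const linkTags = html.match(/<link\b[^>]*>/gi) || [];
  const hasRelNext = linkTags.some((tag) => /\brel\s*=\s*["']?next\b/i.test(tag));

  // Collect <a ...> tags and look for an href ending in /page/<number>/.
  const anchorTags = html.match(/<a\b[^>]*>/gi) || [];
  const hasPageAnchor = anchorTags.some((tag) =>
    /\bhref\s*=\s*["'][^"']*\/page\/\d+\/?["']/i.test(tag)
  );

  return { hasRelNext, hasPageAnchor };
}
```

If hasRelNext is true but hasPageAnchor is false on your blog index, your theme is probably in the same spot mine was: there's a hint in the head, but no actual link for a crawler to follow.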