What is a sitemap, and why yours might be broken

30 June 2026·12 min read

If you've never heard of a sitemap, you're not alone. Most small business owners have never seen the word, let alone checked whether their own site has one. But a broken or missing sitemap is one of the most common reasons a website's pages never appear on Google, even when the content itself is good.

This guide covers what a sitemap actually is, why it matters, the different types, how to check your own, and the most common ways sitemaps break in the wild. We have audited dozens of South African business websites and found broken or misconfigured sitemaps on a significant number of them, often without the business owner having any idea.

What is a sitemap

A sitemap is a file on your website that lists every important page, post, and piece of content you have, along with optional details about each one: when it was last updated, any alternate-language versions of the page, and sometimes the images or videos associated with it.

Think of your website as a city and Google's crawler as a visitor who has never been there before. Without a map, the visitor has to wander street by street, hoping to stumble onto every building. A sitemap is the map you hand over: a direct list of every address that exists, so nothing gets missed.

There are two main types worth knowing about.

XML sitemaps

This is the version search engines read. It's a machine-readable file, usually found at yoursite.co.za/sitemap.xml, written in a structured format. It is not designed for humans to look at; it exists purely so Google, Bing, and other search engines can efficiently discover every page on your site.

A basic XML sitemap entry is a <url> block containing a page address tag (written as <loc>) and, optionally, a last-modified date tag (<lastmod>). The page's full web address goes inside the <loc> tag, and the date it was last updated goes inside the <lastmod> tag. Each block represents one page on your site, and a sitemap simply contains one block for every page you want search engines to find.
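To make that structure concrete, here is a minimal sketch in Python that reads a tiny two-page sitemap and prints each address with its last-modified date. The domain, the dates, and the function name read_entries are placeholders made up for illustration; only the tag names and the namespace come from the sitemap format itself.

```python
import xml.etree.ElementTree as ET

# Namespace declared on the root <urlset> element of a standard sitemap.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical two-page sitemap; the domain and dates are placeholders.
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://yoursite.co.za/</loc>
    <lastmod>2026-06-01</lastmod>
  </url>
  <url>
    <loc>https://yoursite.co.za/services</loc>
    <lastmod>2026-05-14</lastmod>
  </url>
</urlset>
"""


def read_entries(xml_text):
    """Return a list of (loc, lastmod) pairs; lastmod is None when absent."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    entries = []
    for url in root.findall("sm:url", NS):
        loc = url.findtext("sm:loc", default="", namespaces=NS).strip()
        lastmod = url.findtext("sm:lastmod", default=None, namespaces=NS)
        entries.append((loc, lastmod.strip() if lastmod else None))
    return entries


if __name__ == "__main__":
    for loc, lastmod in read_entries(SAMPLE):
        print(loc, lastmod)
```

Each <url> block becomes one (address, date) pair, which is all a search engine needs from a basic sitemap.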

HTML sitemaps

This is a visible page on your website, usually linked from the footer, that lists your site's pages in a readable format for human visitors. It is less common today since most websites have clear navigation menus, but it still has value for very large sites or for users who land on your site and want a full overview of what's available.

For most small business websites, the XML sitemap is the one that actually matters for search visibility.

Why a sitemap matters

Google's official guidance is direct about this: a sitemap doesn't guarantee every page will be crawled and indexed, but in most cases your site benefits from having one. There are a few situations where a sitemap becomes especially important.

Your site is new. A brand new website has no history and few, if any, other sites linking to it. Google's crawler discovers new pages mainly by following links from pages it already knows about. If nothing links to your new site yet, a sitemap is often the only way Google finds it quickly.

Your site has orphaned pages. An orphaned page is one that exists on your website but has no internal links pointing to it from any other page. These pages can be completely invisible to Google's crawler unless they appear in your sitemap.

Your internal linking is weak. If your navigation menu doesn't link to every page, or if some service pages are buried several clicks deep, a sitemap acts as a safety net that ensures those pages are still discoverable.

Your site has a lot of pages. Larger sites with many service pages, blog posts, or product listings benefit significantly from a sitemap, since it's harder to ensure every page is linked from somewhere else on the site.

For most South African small business websites (service pages, a handful of blog posts, a contact page), a sitemap takes a few minutes to set up correctly and provides real insurance against pages silently going unindexed.

How to check if your sitemap exists and works

This takes less than two minutes. Type your website address followed by /sitemap.xml into your browser, for example yoursite.co.za/sitemap.xml.

If a sitemap exists, you'll see either a structured list of URLs, or a sitemap index file that points to several other sitemap files (common for WordPress sites using Yoast SEO, which often splits pages, posts, and categories into separate files).

If instead you see a 404 error or "page not found", your site does not have a sitemap at this location. Either it doesn't exist, or it's been placed somewhere non-standard.
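If you would rather script this check than do it by hand, the outcomes described above can be sketched as a small Python function. It assumes you have already fetched the address with whatever HTTP tool you prefer and have the status code and response body in hand. The function name and the category labels are our own invention, not part of any standard.

```python
import xml.etree.ElementTree as ET


def classify_sitemap_response(status_code, body):
    """Classify what came back from /sitemap.xml.

    Returns one of: "missing", "invalid", "index", "urlset", "unknown".
    """
    if status_code == 404:
        return "missing"      # no sitemap at this address
    if status_code != 200:
        return "unknown"      # redirects, server errors, etc. need a closer look
    try:
        root = ET.fromstring(body)
    except ET.ParseError:
        return "invalid"      # something loaded, but it is not well-formed XML
    # Tags look like "{namespace}urlset"; keep only the local name.
    local_name = root.tag.rsplit("}", 1)[-1]
    if local_name == "sitemapindex":
        return "index"        # points to several other sitemap files
    if local_name == "urlset":
        return "urlset"       # a plain list of page URLs
    return "unknown"          # e.g. an HTML "not found" page served with status 200
```

A result of "index" means there are further sitemap files to open and check, as with the Yoast SEO setup mentioned above.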

The five most common ways sitemaps break

We have come across all five of these issues repeatedly while auditing South African business websites. None of them are obvious unless you know to look.

1. The sitemap points to URLs that don't exist on the site

This is one of the more serious issues, because it actively misleads Google rather than simply being incomplete. We recently audited a Cape Town renovation business whose sitemap listed pages like /roof-repairs and /flooring, while the actual live pages on the site were /roof-install-and-repair and /epoxy-floor-coatings-cape-town. None of the sitemap entries matched a real page. Google was being directed to dead ends on every single URL in the sitemap, while the actual content, built up over months of blog posts and service pages, sat completely undiscovered.

This typically happens when a site is restructured or URLs are renamed, but the sitemap is never regenerated to match.
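A mismatch like this is easy to spot once the two lists sit side by side. The sketch below assumes you already have the URLs from the sitemap and a list of the paths that are actually live (from your CMS or hosting dashboard, for example). The function names are hypothetical, and the comparison deliberately ignores the domain and any trailing slash.

```python
from urllib.parse import urlparse


def normalise_path(url_or_path):
    """Reduce a full URL or a bare path to a comparable path without a trailing slash."""
    return urlparse(url_or_path).path.rstrip("/") or "/"


def dead_sitemap_entries(sitemap_urls, live_paths):
    """Sitemap URLs whose path does not match any live page."""
    live = {normalise_path(p) for p in live_paths}
    return [u for u in sitemap_urls if normalise_path(u) not in live]


def missing_from_sitemap(sitemap_urls, live_paths):
    """Live pages that the sitemap never mentions."""
    listed = {normalise_path(u) for u in sitemap_urls}
    return [p for p in live_paths if normalise_path(p) not in listed]


if __name__ == "__main__":
    # Paths taken from the renovation-site example above; the domain is a placeholder.
    sitemap = ["https://yoursite.co.za/roof-repairs", "https://yoursite.co.za/flooring"]
    live = ["/roof-install-and-repair", "/epoxy-floor-coatings-cape-town"]
    print("Dead entries:", dead_sitemap_entries(sitemap, live))
    print("Not in sitemap:", missing_from_sitemap(sitemap, live))
```

In the renovation example, both lists come back full: every sitemap entry is dead, and every real page is missing from the sitemap.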

2. The sitemap is years out of date

We found a sitemap with a last-modified date from 2015 and a change frequency set to "never", belonging to a business that was still actively operating. The sitemap pointed to an entirely different, defunct domain. This is common with older websites built on now-abandoned platforms, where the sitemap was generated once at launch and never touched again.
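Both symptoms, old dates and the wrong domain, can be checked mechanically. The following is a rough sketch that assumes the entries have already been extracted as (address, last-modified) pairs. The one-year threshold is an arbitrary choice of ours, and the function names are hypothetical.

```python
from datetime import date
from urllib.parse import urlparse


def stale_entries(entries, today, max_age_days=365):
    """Return the URLs whose <lastmod> date is older than max_age_days."""
    stale = []
    for loc, lastmod in entries:
        if not lastmod:
            continue  # <lastmod> is optional, so a missing date is not an error
        try:
            # Keep only the YYYY-MM-DD part; a time and timezone may follow it.
            modified = date.fromisoformat(lastmod[:10])
        except ValueError:
            continue  # partial dates such as "2015" are skipped in this sketch
        if (today - modified).days > max_age_days:
            stale.append(loc)
    return stale


def off_domain_entries(entries, expected_host):
    """Return the URLs that point at a different host than the site itself."""
    return [loc for loc, _ in entries if urlparse(loc).hostname != expected_host]
```

The host comparison is exact, so a site served on www.yoursite.co.za should pass that form as expected_host.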

3. The sitemap only contains the homepage

We've seen multiple sites (a CRM platform, a construction company, a Wix-built contractor site) where the sitemap or the actual indexed pages on Google consisted of the homepage only, despite the business having anywhere from 6 to 20 additional pages. In most of these cases, the underlying issue was either an invalid sitemap, suspicious URL patterns (more on that below), or the site simply never having submitted its sitemap to Google Search Console at all.

4. The sitemap is technically invalid

Google's crawler needs to be able to parse your sitemap file without errors. We have seen Lighthouse audits flag an invalid robots.txt file or sitemap files that fail to download entirely. A sitemap with malformed XML, broken characters, or incorrect formatting can be silently ignored by Google, meaning none of the pages listed in it benefit from the sitemap at all.
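A basic validity check does not need special tooling. The sketch below, using only Python's standard library, reports whether the file parses as XML, whether the root element is one of the two expected sitemap types, and whether any entry is missing its address. It is a minimal illustration, not a full validator against the sitemap schema, and the function name is our own.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def validate_sitemap_xml(body):
    """Return a list of human-readable problems; an empty list means none were found."""
    try:
        root = ET.fromstring(body)
    except ET.ParseError as err:
        line, column = err.position
        return [f"XML does not parse (line {line}, column {column}): {err}"]

    expected_roots = (f"{{{SITEMAP_NS}}}urlset", f"{{{SITEMAP_NS}}}sitemapindex")
    if root.tag not in expected_roots:
        return [f"unexpected root element: {root.tag}"]

    # A <urlset> holds <url> entries; a <sitemapindex> holds <sitemap> entries.
    child = "url" if root.tag.endswith("urlset") else "sitemap"
    problems = []
    for i, entry in enumerate(root.findall(f"{{{SITEMAP_NS}}}{child}"), start=1):
        loc = entry.findtext(f"{{{SITEMAP_NS}}}loc")
        if not loc or not loc.strip():
            problems.append(f"entry {i} has no <loc>")
    return problems
```

One frequent cause of a parse failure is a raw "&" inside a URL. In XML it has to be written as "&amp;", and a single unescaped ampersand makes the whole file unreadable.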

5. URL structures that look auto-generated or duplicated

This one is subtle. We audited a Wix-built business site where service page URLs carried number suffixes such as "-1" or "-7"; a kitchen page and a contact page, for example, both ended in trailing numbers. The sitemap technically listed all the correct URLs. But suffixes like these typically appear when a page has been duplicated in a site builder, and they can make pages look like duplicate or low-quality auto-generated content to Google rather than genuine pages. In this case, despite a "correct" sitemap, only the homepage ended up indexed.
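Spotting this pattern across a whole sitemap is a one-line regular expression. The sketch below flags any URL whose last path segment ends in a hyphen followed by digits. The example addresses are placeholders, and the rule is a blunt heuristic of ours: it will also flag legitimate slugs such as a post ending in "-2026", so treat the output as a list to review rather than a verdict.

```python
import re
from urllib.parse import urlparse

# Matches a slug that ends in a hyphen followed by one or more digits, e.g. "kitchen-1".
NUMERIC_SUFFIX = re.compile(r"-\d+$")


def numbered_slugs(urls):
    """Return the URLs whose final path segment ends in a numeric suffix."""
    flagged = []
    for url in urls:
        slug = urlparse(url).path.rstrip("/").rsplit("/", 1)[-1]
        if NUMERIC_SUFFIX.search(slug):
            flagged.append(url)
    return flagged


if __name__ == "__main__":
    # Hypothetical URLs modelled on the audit described above.
    sample = [
        "https://yoursite.co.za/",
        "https://yoursite.co.za/kitchen-1",
        "https://yoursite.co.za/contact-7",
        "https://yoursite.co.za/services",
    ]
    for url in numbered_slugs(sample):
        print("Review:", url)
```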

How to submit your sitemap to Google

Once you've confirmed your sitemap exists and is accurate, submitting it takes a few minutes inside Google Search Console.

  • Go to search.google.com/search-console and select your property
  • In the left-hand menu, click Indexing, then Sitemaps
  • Under "Add a new sitemap", enter the path to your sitemap, usually just sitemap.xml
  • Click Submit
You'll see a status update shortly after. A green "Success" message means Google was able to read and process your sitemap. A red "Couldn't fetch" or "Has errors" status means something is wrong, and it's worth investigating immediately rather than assuming it will resolve itself.

How sitemaps are usually generated

You rarely need to write a sitemap by hand. Most platforms generate one automatically:

WordPress - plugins like Yoast SEO or Rank Math automatically generate and update your sitemap whenever you publish or edit content. This is usually reliable, but issues can still creep in after a site restructure, theme change, or plugin conflict.

Wix - Wix automatically generates a sitemap, typically at the standard sitemap address, listing your published pages. As noted above, the URLs themselves can sometimes cause indexing issues even when the sitemap file is technically correct.

Custom-built sites (like Next.js) - sitemaps are usually generated programmatically as part of the build process, pulling directly from the site's actual routes. When built correctly, this approach tends to be the most reliable: because the sitemap is generated from the same source of truth as the pages themselves, it cannot drift out of sync with them.
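The idea behind programmatic generation is simple enough to sketch in a few lines. The example below is written in Python purely for illustration; a Next.js site would do the equivalent inside its own build. The route list, the domain, and the function name build_sitemap are all assumptions made for the sketch.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(base_url, routes):
    """Build sitemap XML (as bytes) from (path, lastmod) pairs.

    lastmod may be None, in which case the <lastmod> tag is left out.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for path, lastmod in routes:
        url = ET.SubElement(urlset, "url")
        # Special characters such as "&" are escaped automatically on output.
        ET.SubElement(url, "loc").text = base_url.rstrip("/") + path
        if lastmod:
            ET.SubElement(url, "lastmod").text = lastmod
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)


if __name__ == "__main__":
    # Hypothetical routes; in a real build these come from the site's own page list.
    routes = [("/", "2026-06-01"), ("/services", "2026-05-14"), ("/contact", None)]
    print(build_sitemap("https://yoursite.co.za", routes).decode("utf-8"))
```

Because the list of routes is the only input, renaming or removing a page changes the sitemap on the next build without anyone having to remember to update it.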

A quick checklist

Run through these five checks on your own website today:

  • Visit your sitemap address directly. Does it load without an error?
  • Click through 3-4 of the URLs listed. Do they all lead to real, live pages?
  • Check the last-modified dates. Are they recent, or years old?
  • Search site:yoursite.co.za on Google. Does the number of indexed pages roughly match what's in your sitemap?
  • Confirm your sitemap is submitted in Google Search Console under Indexing → Sitemaps, with a "Success" status.

If any of these checks fail, it's worth investigating before assuming your content simply "isn't ranking yet". Sometimes the content was never given a fair chance to be found in the first place.

The bottom line

A sitemap is one of the simplest, most overlooked pieces of technical SEO. It costs nothing to set up correctly and takes only a few minutes to check. But when it's broken (pointing to dead URLs, years out of date, or missing entirely), it can quietly undermine months or years of content work, leaving genuinely good pages invisible to the people searching for exactly what you offer.

If you've never checked your sitemap before, do it today. It's one of the fastest ways to find out whether Google can actually see your full website, or just the homepage.

Want us to do this for your business?

Get a free audit. We'll show you exactly where your site stands.
