How to Create an XML Sitemap (2026 Guide)

TL;DR

An XML sitemap is a file (usually sitemap.xml) that lists the canonical URLs you want search engines to crawl and index. Create one in 3 steps: generate the file from your CMS or a crawler, host it at yourdomain.com/sitemap.xml, then submit it in Google Search Console and Bing Webmaster Tools.

What an XML sitemap is

An XML sitemap is a machine-readable file, almost always named sitemap.xml, that lists the canonical URLs on a site you want search engines to discover, crawl, and index. It is not a ranking signal and it does not guarantee indexing. It is a discovery aid that tells Google and Bing *here are the pages that matter and when they last changed*.

Concretely, a sitemap is a list of <url> entries, each wrapping a <loc> (the absolute URL) and optionally a <lastmod> timestamp. Search engines read the file to find pages that internal links or crawl budget might otherwise miss — deep pages, newly published posts, or sections with weak internal linking.

A minimal valid sitemap looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/</loc>
    <lastmod>2026-06-10</lastmod>
  </url>
  <url>
    <loc>https://example.com/blog/how-to-create-an-xml-sitemap</loc>
    <lastmod>2026-06-10</lastmod>
  </url>
</urlset>
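
For sites without a CMS plugin, a file like the one above can be produced with a few lines of standard-library Python. This is a minimal sketch, not a reference implementation: the function name build_sitemap and the (url, lastmod) input shape are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Return sitemap XML as bytes.

    `entries` is an iterable of (absolute_url, lastmod_or_None) pairs.
    This input shape is a hypothetical choice for the example.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        if lastmod:
            # Only emit <lastmod> when a real date is known.
            ET.SubElement(url, "lastmod").text = lastmod
    return ET.tostring(urlset, encoding="UTF-8", xml_declaration=True)


if __name__ == "__main__":
    xml_bytes = build_sitemap([
        ("https://example.com/", "2026-06-10"),
        ("https://example.com/blog/how-to-create-an-xml-sitemap", "2026-06-10"),
    ])
    print(xml_bytes.decode("utf-8"))
```

ElementTree escapes characters such as & in URLs automatically, which hand-built string templates often get wrong.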

Two limits define the format. A single sitemap file can hold at most 50,000 URLs and must be under 50 MB uncompressed. Past either limit, you split into multiple sitemaps and tie them together with a sitemap *index* file. A sitemap is one piece of technical SEO — the layer that governs how machines crawl and parse a site rather than how humans read it.
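
The split-and-index pattern can be sketched as follows. The helper names and the sitemap-N.xml file naming are hypothetical, and the sketch only enforces the 50,000-URL limit; a real generator would also need to check the 50 MB size limit on each file.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_FILE = 50_000  # per-file URL limit from the sitemap protocol


def chunk_urls(urls, size=MAX_URLS_PER_FILE):
    """Split a list of URLs into lists of at most `size` items."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_sitemap_index(sitemap_urls):
    """Return a sitemap index (bytes) that points at each child sitemap."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for loc in sitemap_urls:
        sitemap = ET.SubElement(index, "sitemap")
        ET.SubElement(sitemap, "loc").text = loc
    return ET.tostring(index, encoding="UTF-8", xml_declaration=True)


if __name__ == "__main__":
    urls = [f"https://example.com/p/{i}" for i in range(120_001)]
    chunks = chunk_urls(urls)
    # Hypothetical naming scheme for the child files.
    children = [
        f"https://example.com/sitemap-{n}.xml"
        for n in range(1, len(chunks) + 1)
    ]
    print(len(chunks), "child sitemaps")
    print(build_sitemap_index(children).decode("utf-8"))
```

The index file is then the one URL you submit; the child sitemaps are discovered through it.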

What to include and what to exclude

An XML sitemap should list only the URLs you genuinely want indexed: one canonical, self-canonicalizing, 200 OK version of each page. The fastest way to confuse a crawler is to list pages you are simultaneously blocking from crawling or marking noindex. Listing robots.txt-blocked URLs, for example, invites the "Sitemap contains URLs which are blocked by robots.txt" warning.

Include in a sitemap:

  • Important content — published posts, products, category pages, key landing pages.
  • Pages with weak internal linking that crawlers might otherwise miss.

Exclude from a sitemap:

  • Non-canonical duplicates — only the canonical URL belongs here. If you are unsure which version is canonical, see what is a canonical tag.
  • robots.txt-blocked URLs, paginated infinite-scroll junk, faceted-search parameter URLs, and login or admin pages.

The governing rule: the sitemap is a statement of intent, not an inventory. Every URL in it should be one you would be happy to see ranking tomorrow. Mixing in noindex URLs, redirect chains, or canonicalized duplicates is the most common reason Search Console reports a sitemap full of warnings.
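
The include and exclude rules above can be expressed as a single filter applied to crawl results before the file is written. This is a sketch under stated assumptions: the Page record and its field names are hypothetical, and a real crawler export will use different ones.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Page:
    """Hypothetical crawl record; field names are assumptions."""
    url: str
    status: int                # final HTTP status, without following redirects
    canonical: Optional[str]   # href of rel="canonical", if present
    noindex: bool              # robots meta tag or X-Robots-Tag says noindex
    blocked_by_robots: bool    # disallowed by robots.txt


def belongs_in_sitemap(page: Page) -> bool:
    """Keep only canonical, self-canonicalizing, indexable 200 pages."""
    return (
        page.status == 200
        and not page.noindex
        and not page.blocked_by_robots
        and page.canonical == page.url
    )


if __name__ == "__main__":
    pages = [
        Page("https://example.com/", 200, "https://example.com/", False, False),
        Page("https://example.com/?ref=x", 200, "https://example.com/", False, False),
        Page("https://example.com/old", 301, None, False, False),
        Page("https://example.com/admin", 200, "https://example.com/admin", True, True),
    ]
    for page in pages:
        print(page.url, belongs_in_sitemap(page))
```

The check is deliberately strict: a page with no canonical tag is excluded here, which you may want to relax if your site does not emit canonical tags everywhere.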

How to generate, submit, and verify a sitemap

Creating an XML sitemap is a four-stage loop: generate the file, host it at a stable URL, submit it to search engines, then verify it processed cleanly. The checklist below expands these stages into six steps. Most sites never reach the verify stage, which is exactly where the silent failures hide.

Create, submit, and verify an XML sitemap

  1. Generate the file. Produce sitemap.xml from your CMS, framework, or a crawler, listing only canonical 200 OK URLs.
  2. Host at a stable URL. Serve it at https://yourdomain.com/sitemap.xml with the correct XML content type.
  3. Reference in robots.txt. Add a Sitemap: line so any crawler discovers the file automatically.
  4. Submit to search engines. Enter the sitemap path in Google Search Console and the full URL in Bing Webmaster Tools.
  5. Verify processing. Return a day later and confirm the status reads Success with the expected discovered-URL count.
  6. Monitor and resubmit. Re-check after structural changes; resubmit only after migrations or large batches of new pages.

Generating the file depends on your stack. Most modern frameworks and CMS platforms produce one automatically: WordPress via Yoast or Rank Math, Next.js via a sitemap.ts route or next-sitemap, Shopify and Squarespace out of the box. For static or custom sites, a crawler like Screaming Frog can export a sitemap from a live crawl. Whatever the method, the output must be served at a stable, absolute URL — conventionally https://yourdomain.com/sitemap.xml.

Submitting means two things. First, reference the sitemap in robots.txt with a Sitemap: line so any crawler discovers it automatically:

Sitemap: https://example.com/sitemap.xml

Second, submit the URL explicitly in Google Search Console and Bing Webmaster Tools (covered in the next section). Verifying is the step people skip: open Search Console a day later, confirm the status reads *Success*, and check that the discovered URL count matches what you expected. To confirm the file is reachable and well-formed before you submit, run the live URL through our free SEO + GEO audit (/), which checks the sitemap alongside 40+ other signals (/check).
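
Before submitting, it can help to confirm that robots.txt really declares the sitemap. The sketch below uses Python's standard urllib.robotparser, whose site_maps() method (Python 3.8+) returns the declared Sitemap: URLs. The function name sitemaps_declared is an assumption for illustration, and the robots.txt text is passed in as a string so the example does not depend on network access.

```python
from urllib.robotparser import RobotFileParser


def sitemaps_declared(robots_txt: str) -> list:
    """Return every URL listed on a Sitemap: line in the robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap: line is present.
    return parser.site_maps() or []


if __name__ == "__main__":
    robots = (
        "User-agent: *\n"
        "Disallow: /admin/\n"
        "\n"
        "Sitemap: https://example.com/sitemap.xml\n"
    )
    print(sitemaps_declared(robots))
```

An empty list means crawlers that never see your webmaster accounts have no pointer to the file.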

How to submit a sitemap to Google and Bing

Submitting a sitemap to Google means pasting its URL into Google Search Console; submitting to Bing means doing the same in Bing Webmaster Tools. Both processes take under a minute and both require a verified property for the domain first.

Google Search Console. Open the property for your domain, click Sitemaps in the left navigation, enter the path (e.g. sitemap.xml) in the *Add a new sitemap* field, and click Submit. Google fetches the file, parses it, and within hours to a few days reports a status of *Success*, *Has errors*, or *Couldn't fetch*, along with the number of discovered URLs.

Bing Webmaster Tools. Open your verified site, go to Sitemaps, click Submit sitemap, paste the full absolute URL (https://example.com/sitemap.xml), and submit. Bing also imports sitemaps automatically if you connect a verified Search Console account, which is the fastest path if Google is already set up.

Submitting a sitemap: Google Search Console vs. Bing Webmaster Tools

| Step | Google Search Console | Bing Webmaster Tools |
| --- | --- | --- |
| Where to go | Sitemaps section in the left nav | Sitemaps section, then Submit sitemap |
| What to enter | The path, e.g. sitemap.xml | The full absolute URL |
| Auto-import | No native import from Bing | Can import directly from a verified GSC account |
| Status reported | Success / Has errors / Couldn't fetch | Submitted with last-read date and URL count |

You do not need to resubmit a sitemap every time content changes. Once submitted, Google and Bing re-fetch it on their own schedule. Resubmit manually only after a major structural change — a domain migration, a new sitemap index, or a large batch of new pages you want crawled quickly. The Sitemap: line in robots.txt also keeps the file discoverable to crawlers that never touch your webmaster accounts.

Common sitemap errors and how to fix them

Most sitemap problems reduce to a handful of repeatable errors, and the two that waste the most time are the "Couldn't fetch" stale status and lastmod abuse. Both look like the sitemap is broken when the real cause is elsewhere.

"Couldn't fetch" rarely means the file is broken. It usually means Google could not reach it at the moment it tried.

"Couldn't fetch" (the stale-status trap). Search Console shows *Couldn't fetch* when Google's request for the file failed or timed out. The status is sticky — it can linger for days even after you fix the cause, which makes people re-edit a sitemap that was fine all along. Real culprits, in order of frequency: the sitemap URL returns a redirect or non-200 status; a slow or cold-start server timed out the fetch (common on serverless and free-tier hosting that sleeps); a robots.txt rule blocks the sitemap path; or the file is served with the wrong content type. Fix the root cause, confirm the file loads instantly in a browser, then resubmit — and ignore the stale status for a day or two while it clears.
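
The four culprits above can be checked mechanically once you have fetched the sitemap yourself. The following is a rough diagnostic sketch, not a reproduction of how Google evaluates a fetch: the function name, the 5-second slow_after threshold, and the accepted content types (which assume an uncompressed file) are all assumptions chosen for illustration.

```python
from urllib.robotparser import RobotFileParser

# Assumes an uncompressed sitemap; gzipped files use other media types.
XML_CONTENT_TYPES = {"application/xml", "text/xml"}


def fetch_problems(status, content_type, seconds, robots_txt, sitemap_url,
                   slow_after=5.0):
    """Return human-readable problems for one observed sitemap fetch.

    `status`, `content_type`, and `seconds` describe a request you made
    yourself WITHOUT following redirects. `slow_after` is an arbitrary
    threshold, not a documented crawler timeout.
    """
    problems = []
    if status != 200:
        problems.append(f"status {status}: the sitemap URL should return 200 directly")
    media_type = content_type.split(";")[0].strip().lower()
    if media_type not in XML_CONTENT_TYPES:
        problems.append(f"content type {media_type!r} is not an XML type")
    if seconds > slow_after:
        problems.append(f"response took {seconds:.1f}s: possible cold start or slow host")
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    if not parser.can_fetch("Googlebot", sitemap_url):
        problems.append("robots.txt blocks the sitemap path")
    return problems


if __name__ == "__main__":
    robots = "User-agent: *\nDisallow: /sitemap.xml\n"
    for line in fetch_problems(301, "text/html", 9.0, robots,
                               "https://example.com/sitemap.xml"):
        print("-", line)
```

An empty result does not prove Google can fetch the file, but a non-empty one points at a root cause worth fixing before you resubmit.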

lastmod abuse. The <lastmod> tag should reflect the genuine last meaningful change to a page. Setting every URL's lastmod to today's date on every build — a default in some generators — trains Google to distrust the field entirely. Google has stated it ignores lastmod when the values are obviously unreliable. Either populate lastmod accurately or omit it; a wrong timestamp is worse than none.
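
One way to keep lastmod honest is to carry the previous date forward unless the page content actually changed. The sketch below does this with a content hash. The record shape, the function name next_lastmod, and the idea of storing hashes between builds are assumptions for illustration, not a feature of any particular generator.

```python
import hashlib
from typing import Optional


def next_lastmod(content: str, previous: Optional[dict], today: str) -> dict:
    """Return a {"hash", "lastmod"} record for a page.

    `previous` is the record saved by the last build, or None for a new page.
    The date only moves to `today` when the content hash differs.
    """
    digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
    if previous is not None and previous["hash"] == digest:
        return previous  # unchanged content keeps its old lastmod
    return {"hash": digest, "lastmod": today}


if __name__ == "__main__":
    first = next_lastmod("Hello", None, "2026-06-01")
    rebuilt = next_lastmod("Hello", first, "2026-06-10")
    edited = next_lastmod("Hello, world", first, "2026-06-10")
    print(first["lastmod"], rebuilt["lastmod"], edited["lastmod"])
```

Hash the main content of the page rather than the full rendered HTML; otherwise a footer or template change would bump every date at once, which is the failure this section warns about.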

Other frequent errors:

  • Non-200 / redirect / noindex URLs in the sitemap — strip anything that is not a canonical, indexable 200.
  • Relative or wrong-protocol URLs — every <loc> must be a full absolute URL on the canonical host and protocol.
  • Over the 50,000-URL or 50 MB limit — split into multiple files and reference them from a sitemap index.
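
The relative-URL and wrong-protocol checks in this list are easy to automate against a generated file. This is a minimal sketch that assumes a plain urlset sitemap (not an index) and a single canonical scheme and host; the function name loc_problems is hypothetical.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
MAX_URLS = 50_000


def loc_problems(sitemap_xml: bytes, scheme: str, host: str) -> list:
    """Return <loc> values that are not absolute URLs on the canonical
    scheme and host, plus a note if the file exceeds the URL limit."""
    root = ET.fromstring(sitemap_xml)
    locs = [(el.text or "").strip() for el in root.findall("sm:url/sm:loc", NS)]
    problems = []
    if len(locs) > MAX_URLS:
        problems.append(f"{len(locs)} URLs exceeds the {MAX_URLS} limit")
    for loc in locs:
        parts = urlsplit(loc)
        # Relative URLs have an empty scheme and host, so they fail here too.
        if parts.scheme != scheme or parts.netloc != host:
            problems.append(loc)
    return problems


if __name__ == "__main__":
    sample = (
        b'<?xml version="1.0" encoding="UTF-8"?>'
        b'<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
        b"<url><loc>https://example.com/</loc></url>"
        b"<url><loc>/about</loc></url>"
        b"<url><loc>http://example.com/old</loc></url>"
        b"</urlset>"
    )
    print(loc_problems(sample, "https", "example.com"))
```

Checking status codes and noindex for each URL requires fetching the pages, which is left out here to keep the sketch offline.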

When the file looks correct but pages still are not indexed, the issue is often crawl access rather than the sitemap itself. A blocked AI or search crawler never reads your perfect file — confirm crawler access at /check/geo.aibots.blocked.


People also ask

What is an XML sitemap?

An XML sitemap is a machine-readable file, usually named sitemap.xml, that lists the canonical URLs a site wants search engines to crawl and index. Each entry contains the page URL in a loc tag and optionally a lastmod timestamp marking the last meaningful change. The file is a discovery aid, not a ranking signal, and a single sitemap can hold up to 50,000 URLs.

Do I need a sitemap for a small site?

A small site with clean internal linking does not strictly need an XML sitemap, because Google can crawl every page by following links. A sitemap still helps in three cases: when pages are deep or poorly linked, when content is brand new and not yet linked, or when you want clear indexing data in Search Console. Since most CMS platforms generate one automatically, the cost is near zero, so there is little reason to skip it.

How do I submit a sitemap to Google?

Submit a sitemap to Google by opening your verified property in Google Search Console, clicking Sitemaps in the left navigation, entering the sitemap path such as sitemap.xml, and clicking Submit. Google fetches and parses the file, then reports a status of Success, Has errors, or Couldn't fetch within hours to a few days. You should also add a Sitemap: line to robots.txt so crawlers discover the file without the manual submission.

Why does my sitemap say couldn't fetch?

A sitemap shows Couldn't fetch when Google's request for the file failed or timed out, not usually because the file is malformed. Common causes are a sitemap URL that returns a redirect or non-200 status, a slow or cold-start server that timed out, a robots.txt rule blocking the path, or the wrong content type. The status is sticky and can linger for days after you fix the root cause, so resolve the cause, confirm the file loads instantly in a browser, then wait for the status to clear.

How many URLs can an XML sitemap contain?

A single XML sitemap file can contain at most 50,000 URLs and must be under 50 MB uncompressed. Sites larger than that split their URLs across multiple sitemap files and tie them together with a sitemap index file, which itself can reference up to 50,000 sitemaps. Staying within these limits prevents truncation and processing errors in Search Console.

Frequently asked questions

Should every URL in my sitemap be indexable?

Yes. Every URL in an XML sitemap should be a canonical, self-canonicalizing page that returns 200 OK and is not blocked by robots.txt or marked noindex. Listing redirects, 404s, noindexed pages, or non-canonical duplicates sends mixed signals and triggers warnings in Search Console. The sitemap is a statement of which pages you want indexed, not a full inventory of every URL on the site.

Do I need to resubmit my sitemap after every change?

No. Once a sitemap is submitted, Google and Bing re-fetch it on their own schedule, so routine content edits need no action. Resubmit manually only after a major structural change such as a domain migration, a new sitemap index, or a large batch of new pages you want crawled quickly. Keeping a Sitemap: line in robots.txt also ensures crawlers keep finding the file on their own.

Does a lastmod date affect rankings?

A lastmod date does not affect rankings directly; it only hints when a page last changed so crawlers can prioritize re-crawling. Setting every URL's lastmod to the build date on every deploy makes the field unreliable, and Google ignores lastmod values it cannot trust. Populate lastmod with the genuine last meaningful edit, or omit it entirely rather than faking it.
