An XML sitemap is one of those website assets you only notice when something goes wrong: a launch stalls in search, a redesign leaves old URLs behind, or new pages are not discovered as quickly as expected. This guide gives you a reusable checklist for creating, submitting, validating, and troubleshooting XML sitemaps across WordPress, static sites, and custom builds. The goal is simple: help you make a clean sitemap, point search engines to it, and avoid the common setup errors that waste time during launches and maintenance.
Overview
An XML sitemap is a machine-readable file that lists important URLs on your site. It is not a ranking shortcut and it does not replace good internal linking, but it can make site discovery and maintenance easier, especially for larger sites, recently launched projects, websites with frequent updates, or sites that have gone through migrations and URL changes.
In practical terms, a good sitemap should do four things:
- List canonical, indexable URLs you actually want crawled.
- Exclude pages that should not appear in search, such as admin areas, duplicate archives, filtered URLs, or staging environments.
- Return a successful response and be reachable at a stable URL.
- Stay reasonably current as content is added, removed, or redirected.
For most sites, the process is straightforward:
- Create the sitemap automatically through your CMS, plugin, framework, or build process.
- Check that the file opens correctly and includes the right URLs.
- Reference it in robots.txt where appropriate.
- Submit the sitemap in Google Search Console and any other search engine webmaster tools you use.
- Monitor for indexing issues, crawl errors, and stale URLs after launches or structural changes.
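The robots.txt step in the list above can be checked with a short script. This sketch assumes you already have the robots.txt contents as a string; `sitemaps_declared` is a hypothetical helper name, while `RobotFileParser.site_maps()` is part of Python's standard library (3.8 and later).

```python
from urllib.robotparser import RobotFileParser


def sitemaps_declared(robots_txt):
    """Return the sitemap URLs listed in 'Sitemap:' lines of a robots.txt body."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap line is present.
    return parser.site_maps() or []


# Example usage with a placeholder robots.txt body:
# sitemaps_declared("User-agent: *\nDisallow: /wp-admin/\nSitemap: https://www.example.com/sitemap.xml\n")
```

An empty result is not an error by itself, because submitting the sitemap in webmaster tools also works, but it is a quick signal when the reference was expected and is missing.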
If you are also reviewing crawl directives, pair this work with a robots.txt guide for beginners. A sitemap and robots rules should support each other, not conflict.
Before you start, keep one principle in mind: the sitemap should reflect your best public version of the site. If a URL redirects, is blocked, returns an error, or carries a noindex instruction, it usually does not belong in the sitemap.
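That principle can be written down as a simple inclusion rule. The sketch below is one possible encoding, not a standard: the `PageState` record and its field names are assumptions about what your crawl or CMS data might expose.

```python
from dataclasses import dataclass


@dataclass
class PageState:
    """Hypothetical record describing what a URL does when requested."""
    url: str
    status: int              # HTTP status returned for the URL itself
    redirects: bool          # True if the URL redirects elsewhere
    noindex: bool            # True if the page carries a noindex instruction
    blocked_by_robots: bool  # True if robots rules disallow crawling it


def belongs_in_sitemap(page):
    """Keep only URLs that load successfully and are meant to be indexed."""
    return (
        page.status == 200
        and not page.redirects
        and not page.noindex
        and not page.blocked_by_robots
    )
```

The rule is deliberately strict. The text says such URLs "usually" do not belong, so treat any exception as a conscious decision rather than a default.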
Checklist by scenario
Use the checklist that matches your setup. The steps are similar, but the way the sitemap is generated depends on how the site is built.
Scenario 1: WordPress sitemap setup
WordPress can generate sitemaps natively, and many SEO plugins can also manage them. The exact interface varies, but the checklist stays consistent.
- Confirm whether WordPress core or a plugin is generating the sitemap. Do not assume both should run at once. Having multiple sitemap systems can create duplicate or conflicting files.
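- Locate the sitemap URL. WordPress core (version 5.5 and later) serves a sitemap index at /wp-sitemap.xml, while SEO plugins usually publish their own index at a different path. Either way, expect a sitemap index that points to individual section sitemaps for posts, pages, categories, or media.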
- Open the sitemap in a browser. Make sure it loads over the preferred protocol and host, such as HTTPS and your canonical domain.
- Review included content types. Posts and pages are usually appropriate. Tag archives, author archives, attachment pages, thin taxonomy pages, or search result URLs may not be.
- Check indexability settings. If a content type is set to noindex, it should generally be excluded from the sitemap as well.
- Inspect sample URLs. Click a few URLs from each sitemap section and confirm they return a live page, not a redirect or 404.
- Submit the sitemap in Search Console. Submit the main sitemap index if one exists, rather than every child sitemap individually.
- Recheck after permalink or plugin changes. Sitemap behavior can shift after theme updates, permalink resets, or SEO plugin replacements.
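When reviewing which section sitemaps exist, it can help to list the children of the sitemap index programmatically. The following is a minimal sketch that assumes you have already downloaded the index as bytes; `child_sitemaps` is a hypothetical helper, and the file names in the usage comment are only examples of what a WordPress index may contain.

```python
import xml.etree.ElementTree as ET

# Namespace used by both sitemap files and sitemap index files.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def child_sitemaps(index_xml):
    """Return the child sitemap URLs referenced by a sitemap index document."""
    root = ET.fromstring(index_xml)
    locs = root.findall("sm:sitemap/sm:loc", NS)
    return [loc.text.strip() for loc in locs if loc.text]


# Example usage (index_bytes would come from the live sitemap index URL):
# for child in child_sitemaps(index_bytes):
#     print(child)  # e.g. .../wp-sitemap-posts-post-1.xml
```

If the list contains content types you have set to noindex, or two different naming schemes, that usually points to more than one generator being active.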
If your WordPress site is acting strangely after structural changes, also review related maintenance topics like a WordPress 404 fix guide or a staging site workflow before changing production settings.
Scenario 2: Static website or Jamstack project
Static sites often need a build-time sitemap generator or a manually maintained XML file. This is common with documentation sites, portfolio sites, and landing-page-heavy projects.
- Choose the generation method. Use your framework's plugin, a build script, or a static sitemap file checked into the project.
- Set the correct base URL. This is one of the most common mistakes. If the site domain, subdomain, or protocol is wrong, every entry in the sitemap can be wrong.
- Generate the sitemap during deployment. Ideally, sitemap creation should be part of your build or publish process, not a separate manual task that gets forgotten.
- Exclude utility routes. Skip preview pages, private dashboards, thank-you pages, internal test routes, and query-based variations.
- Deploy and test the sitemap path. Visit the final sitemap URL on the live site, not just the local development version.
- Confirm headers and accessibility. The file should be publicly reachable and not blocked by authentication or misconfigured redirects.
- Submit after launch or major rebuilds. This helps search engines discover structural changes more efficiently.
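The base URL and exclusion steps are where build scripts most often go wrong, so here is a hedged sketch of that part only. The function name, the route format, and the excluded prefixes are all assumptions; adapt them to whatever your framework's route list looks like.

```python
from urllib.parse import urljoin

# Hypothetical utility routes that should never reach the sitemap.
EXCLUDED_PREFIXES = ("/preview/", "/dashboard/", "/thank-you/", "/test/")


def sitemap_urls(base_url, routes, excluded_prefixes=EXCLUDED_PREFIXES):
    """Turn site-relative routes into absolute URLs, skipping utility routes."""
    base = base_url.rstrip("/") + "/"
    urls = []
    for route in routes:
        if "?" in route:
            continue  # skip query-based variations
        path = "/" + route.lstrip("/")
        if path.startswith(tuple(excluded_prefixes)):
            continue  # skip preview, dashboard, and similar routes
        urls.append(urljoin(base, path.lstrip("/")))
    return urls


# Example usage inside a build step, with the production origin passed in:
# sitemap_urls("https://www.example.com", ["/", "/docs/intro/", "/preview/draft-1/"])
```

Passing the base URL in explicitly, for example from a deployment environment variable, keeps a local or staging origin from leaking into the production file.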
If you are deploying through a modern hosting platform, this pairs well with a broader deployment workflow such as deploying a static website.
Scenario 3: Custom CMS or manually maintained sitemap
Some teams generate sitemaps from a database query, an API, or a server-side route. Others maintain a small sitemap manually for simple brochure sites.
- Define the inclusion rules. Decide what counts as a public, indexable URL in your application.
- Use canonical URLs only. If multiple routes resolve to the same content, include only the preferred public version.
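- Split large sitemaps when needed. The sitemap protocol limits each file to 50,000 URLs and 50 MB uncompressed, so organize entries by section, language, or content type before a file approaches those limits or simply becomes unwieldy to review.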
- Create a sitemap index if you use multiple files. This gives you one stable submission target.
- Automate updates where possible. A manually updated sitemap is easy to neglect after content changes or feature launches.
- Test against real page states. Confirm the sitemap excludes redirects, deleted records, draft states, and blocked content.
- Document ownership. Someone on the team should know where the sitemap logic lives and when it should be revised.
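For the splitting and index steps, a custom generator might look roughly like this. It is a sketch under stated assumptions: the helper names and the child file naming are hypothetical, and only the 50,000-URL limit and the `sitemapindex` format come from the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_FILE = 50_000  # per-file URL limit in the sitemaps.org protocol


def split_urls(urls, size=MAX_URLS_PER_FILE):
    """Split a list of URLs into chunks small enough for one sitemap file each."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_sitemap_index(child_urls):
    """Return a sitemap index document (as a string) pointing at child sitemaps."""
    root = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for child in child_urls:
        entry = ET.SubElement(root, "sitemap")
        ET.SubElement(entry, "loc").text = child
    body = ET.tostring(root, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Example usage with a hypothetical naming scheme for the child files:
# chunks = split_urls(all_urls)
# children = [f"https://www.example.com/sitemap-{n}.xml" for n in range(1, len(chunks) + 1)]
# index_xml = build_sitemap_index(children)
```

Splitting by position is the simplest option. Splitting by section or content type, as suggested above, makes later debugging easier because a problem in Search Console maps to one part of the site.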
Scenario 4: New site launch or redesign
Launches and redesigns are when sitemap mistakes become expensive. Use this short preflight checklist:
- Make sure the sitemap is on the live domain, not a staging subdomain.
- Confirm robots rules do not block important sections.
- Check that HTTP redirects to HTTPS consistently, and the sitemap uses the final HTTPS URLs.
- Verify old URLs either still exist or redirect cleanly where appropriate.
- Remove test pages, placeholder content, and duplicate environments from the sitemap.
- Submit the sitemap after launch and monitor errors for the next few days.
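Several items in this preflight list come down to one question: does every sitemap URL use the live HTTPS origin? A minimal sketch of that check follows; `off_canonical` is a hypothetical helper, and the origins shown are placeholders.

```python
from urllib.parse import urlsplit


def off_canonical(urls, preferred_origin):
    """Return sitemap URLs whose scheme or host differs from the preferred origin."""
    want = urlsplit(preferred_origin)
    mismatches = []
    for url in urls:
        got = urlsplit(url)
        if (got.scheme, got.netloc.lower()) != (want.scheme, want.netloc.lower()):
            mismatches.append(url)
    return mismatches


# Example usage:
# off_canonical(urls_from_sitemap, "https://www.example.com")
```

Staging subdomains, localhost references, and plain HTTP entries all surface in the same list, which makes this a cheap check to run right before submitting.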
If the redesign included protocol changes, mixed URL patterns, or asset issues, you may also need to review mixed content errors after enabling HTTPS.
Scenario 5: Site migration, host change, or structure change
Migrations often break sitemap assumptions because domains, paths, and redirects all change at once.
- Regenerate the sitemap after the move. Do not carry over an old file without checking the URLs.
- Verify canonical host choices. Pick one preferred domain format and reflect it consistently in the sitemap.
- Test migrated URLs. Sample old and new URLs to ensure redirects and final destinations behave as expected.
- Check section changes. If you moved content from subdomain to subdirectory or changed category structures, make sure the sitemap reflects the new architecture.
- Resubmit in Search Console. Treat a major migration as a fresh validation point.
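One way to test migrated URLs without hitting the network is to compare the old URL list against your redirect map and the new sitemap. This is a sketch only: it assumes the redirect rules can be exported as a simple old-to-new dictionary, which may not match how your server or CMS stores them.

```python
def unresolved_old_urls(old_urls, redirects, new_sitemap_urls, max_hops=5):
    """Return (old_url, final_target) pairs that do not end at a new sitemap URL."""
    live = set(new_sitemap_urls)
    problems = []
    for url in old_urls:
        target, hops = url, 0
        # Follow the redirect map, with a hop limit to survive loops.
        while target in redirects and hops < max_hops:
            target = redirects[target]
            hops += 1
        if target not in live:
            problems.append((url, target))
    return problems


# Example usage:
# unresolved_old_urls(old_sitemap_urls, redirect_map, new_sitemap_urls)
```

Not every reported URL is a bug, since some old pages are retired on purpose, but each one should be an explicit decision. A live request check is still worth doing afterwards, because the map describes intended behavior, not what the server actually returns.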
Related reading: migrating a WordPress site and subdomain vs subdirectory for SEO and site structure.
What to double-check
Once the sitemap exists, spend a few minutes on quality control. This is where most avoidable indexing problems are caught.
1. URL quality
- The URLs should be canonical and publicly accessible.
- They should not redirect unnecessarily.
- They should not return 404, 410, or 5xx responses, or soft 404s (pages that return a success status but show an error or empty state).
- They should match your preferred host and protocol.
2. Indexability alignment
- Do not include pages blocked by robots rules unless you have a very specific reason and understand the tradeoff.
- Do not include pages with noindex if the intent is to keep them out of search.
- Do not include duplicates generated by filters, parameters, session IDs, or internal search.
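The robots alignment check can be approximated with Python's standard robots.txt parser. Treat this as a rough sketch: `blocked_sitemap_urls` is a hypothetical helper, and the standard library parser does simple prefix matching, so it may not interpret wildcard rules the way a particular search engine does.

```python
from urllib.robotparser import RobotFileParser


def blocked_sitemap_urls(robots_txt, urls, user_agent="*"):
    """Return sitemap URLs that the given robots.txt rules disallow."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


# Example usage:
# blocked_sitemap_urls(robots_body, urls_from_sitemap)
```

Any URL in the result is being advertised in the sitemap and blocked in robots.txt at the same time, which is the conflict described above.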
3. Scope and relevance
- Include the content that matters: core pages, articles, products, documentation, and landing pages meant for discovery.
- Exclude thin utility pages, admin areas, carts, account pages, and temporary promo URLs unless they truly need indexing.
- Review taxonomies and archive pages carefully; many sites publish too many low-value archive URLs.
4. Technical accessibility
- The sitemap URL should return successfully and be publicly reachable.
- Gzip compression (for example, a sitemap.xml.gz file) is fine if the server delivers it correctly, but always test the live result.
- If you use multiple sitemap files, make sure the sitemap index references the correct child files.
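To test a live result that may or may not be compressed, a reader can detect gzip by its leading bytes before parsing. This is a minimal sketch that assumes the response body has already been downloaded as bytes; `read_sitemap_locs` is a hypothetical helper name.

```python
import gzip
import xml.etree.ElementTree as ET

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
GZIP_MAGIC = b"\x1f\x8b"  # the first two bytes of any gzip stream


def read_sitemap_locs(raw):
    """Return every <loc> value from a sitemap or sitemap index, gzipped or not."""
    if raw[:2] == GZIP_MAGIC:
        raw = gzip.decompress(raw)
    root = ET.fromstring(raw)
    return [loc.text.strip() for loc in root.findall(".//sm:loc", NS) if loc.text]


# Example usage (body would be the bytes fetched from the sitemap URL):
# urls = read_sitemap_locs(body)
```

Because it collects every `loc` element, the same function works for a sitemap index, where the values are child sitemap URLs that you can then fetch and check in turn.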
5. Freshness
- Recently published pages should appear without a long delay if your site updates frequently.
- Deleted or redirected pages should not linger in the sitemap indefinitely.
- After major content pruning, regenerate and resubmit.
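Freshness is easiest to judge by comparing the sitemap against the list of pages that are actually published. The sketch below assumes you can export that list from your CMS or build output; the function name and the result keys are illustrative.

```python
def sitemap_drift(sitemap_urls, published_urls):
    """Compare sitemap entries with published pages and report the differences."""
    listed, live = set(sitemap_urls), set(published_urls)
    return {
        "stale": sorted(listed - live),    # in the sitemap, no longer published
        "missing": sorted(live - listed),  # published, not yet in the sitemap
    }


# Example usage:
# report = sitemap_drift(urls_from_sitemap, urls_from_cms)
```

Both lists being empty is the healthy state. A growing "stale" list after content pruning is the cue to regenerate and resubmit.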
A useful habit is to test five to ten random URLs from the sitemap after any major change. This small sampling often reveals larger pattern problems quickly.
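That spot check can be scripted along these lines. It is a sketch, not a monitoring tool: the helper names are hypothetical, the status check uses a HEAD request that some servers handle differently from GET, and network failures are left to raise so they are not mistaken for a status code.

```python
import random
import urllib.error
import urllib.request


class _NoRedirect(urllib.request.HTTPRedirectHandler):
    """Surface 3xx responses instead of silently following them."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None


def fetch_status(url, timeout=10):
    """Return the HTTP status code for a HEAD request to the URL."""
    opener = urllib.request.build_opener(_NoRedirect)
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener.open(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code  # includes redirects, 404s, and server errors


def spot_check(urls, sample_size=10, fetch=fetch_status, seed=None):
    """Sample URLs at random and return (url, status) pairs that are not 200."""
    pool = list(urls)
    rng = random.Random(seed)
    sample = rng.sample(pool, min(sample_size, len(pool)))
    problems = []
    for url in sample:
        status = fetch(url)
        if status != 200:
            problems.append((url, status))
    return problems


# Example usage against a live site:
# spot_check(urls_from_sitemap, sample_size=10)
```

The `fetch` parameter exists so the sampling logic can be tried with a stand-in function before pointing it at a production site.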
Common mistakes
Most sitemap issues are not caused by XML syntax. They come from mismatched site settings, overlooked migrations, or content governance gaps.
Including the wrong URLs
The most common error is treating the sitemap as a full inventory of everything on the site. It should be a curated list of URLs worth crawling and indexing. If your sitemap includes search pages, tag clutter, old campaign URLs, preview links, or duplicate parameter versions, it becomes less useful.
Submitting a sitemap while blocking the same content
This usually happens when robots rules, plugin settings, and CMS visibility settings are handled by different people. The sitemap says “please discover this,” while robots or noindex says “do not process this.” Review both together whenever indexing looks inconsistent.
Leaving staging or development URLs in place
After a rushed launch, it is not unusual to find staging domains, localhost references, or password-protected preview URLs in the sitemap logic. This is a high-priority fix because it can confuse both crawling and internal QA.
Forgetting protocol and hostname consistency
If your preferred site is https://www.example.com, the sitemap should not mix in http://example.com or other variants unless you have a deliberate architecture reason. Mixed host formats often point to wider canonicalization issues.
Using multiple sitemap generators at once
WordPress sites are especially prone to this. Core, theme functions, SEO plugins, cache plugins, or server tools may all touch sitemap behavior. Choose one clear source of truth.
Not updating the sitemap after a redesign
Teams often remember redirects but forget sitemap cleanup. A redesign can change templates, archive behavior, URL patterns, and content visibility. If the sitemap still reflects the old structure, it can keep surfacing retired URLs long after launch.
Expecting the sitemap to solve deeper site issues
A clean sitemap helps discovery, but it will not fix weak internal linking, poor content quality, duplicate pages, or server instability. If your site is slow or unreliable, address those issues too. For WordPress projects, performance work such as a reusable speed checklist can support broader crawl health.
When to revisit
Treat your sitemap as a maintenance item, not a one-time SEO task. Revisit it whenever the underlying inputs change.
At minimum, review your sitemap in these situations:
- Before a launch: confirm production URLs, HTTPS, canonical host, and live accessibility.
- After a redesign: validate new templates, content types, archives, and redirects.
- After a migration or host move: regenerate and resubmit the sitemap.
- When changing plugins, frameworks, or routing rules: verify that sitemap generation still works as expected.
- When pruning content: remove retired URLs and check redirect logic.
- During seasonal planning cycles: review whether temporary campaign pages, old categories, or outdated sections are still included.
- When indexing drops or discovery slows: sample sitemap URLs, compare them with actual indexable pages, and look for conflicts.
Use this quick repeatable action list whenever you revisit the sitemap:
- Open the live sitemap and confirm it loads.
- Check that it uses the preferred HTTPS domain.
- Sample URLs from key sections and verify status codes.
- Remove noindex, blocked, redirected, or broken URLs from the sitemap source.
- Submit or resubmit the sitemap in Search Console after major changes.
- Monitor coverage and crawl feedback over the following days or weeks.
If you keep a launch checklist or website maintenance checklist, add sitemap review as a standard step. It takes only a few minutes when things are healthy, and it can save hours when a launch, redesign, or migration introduces indexing problems.
The durable takeaway is simple: your XML sitemap should mirror the public site you want search engines to understand. Keep it clean, keep it current, and revisit it whenever your structure, tooling, or publishing workflow changes.