How hreflang Tags Tell Google Which Language a Page Targets

A practical guide to hreflang tags for multilingual and multi-region sites: return-tag rules, x-default, avoiding duplicate content, and a worked example.

Published By Li Lei
#hreflang #international seo #multilingual #duplicate content #technical seo

If you run the same content in more than one language, you have probably watched Google show the wrong version in search results. A searcher in Mexico lands on your English page. A reader in France gets served Spanish. The translations are sitting right there, indexed and ready, but the search engine keeps picking the wrong door. That mismatch is exactly the problem hreflang was built to solve.

An hreflang annotation is a small signal you attach to each page that says, in effect, "this URL is the English-for-the-US version, and here are the other language and region versions." Google reads the full set, then matches each searcher to the variant that fits their language and country. Done right, every locale earns its own placement. Done wrong, the whole cluster collapses and Google falls back to guessing.

What hreflang actually communicates

The tag carries two facts: which language a page is written in, and optionally which region it targets. The value follows a fixed shape: an ISO 639-1 language code, optionally followed by a hyphen and an ISO 3166-1 alpha-2 region code. So en is English everywhere, en-US is English aimed at the United States, and en-GB is English aimed at the United Kingdom. Where the writing system matters, an ISO 15924 script subtag goes between the two, as in zh-Hant for Traditional Chinese or zh-Hant-TW for Traditional Chinese in Taiwan.

The order matters: language first, region second. A bare region code is never valid. US on its own is rejected, and UK is a trap twice over: the ISO country code for the United Kingdom is GB, not UK, and a bare uk is the language code for Ukrainian. These are the errors that quietly break clusters in production, which is why running your codes through the hreflang tag generator before you ship is worth the thirty seconds. It validates each code against that shape and flags the ones that will not survive a crawl.
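If you want to see what that kind of shape check involves, here is a minimal sketch in Python. It is my own illustration, not the generator's actual code: the function name and the two small trap lists are assumptions, and it only tests the shape of a value rather than looking codes up in the ISO registries.

```python
import re

# Shape check only: a 2-letter language, an optional 4-letter script,
# and an optional 2-letter region. Nothing here consults the ISO
# registries, so a well-shaped but unassigned code would still pass.
HREFLANG_SHAPE = re.compile(
    r"^(?P<lang>[a-z]{2})(?:-(?P<script>[a-z]{4}))?(?:-(?P<region>[a-z]{2}))?$",
    re.IGNORECASE,
)

# Hypothetical, deliberately tiny trap lists; extend them for your own audit.
BARE_REGION_LOOKALIKES = {"us", "gb"}  # country codes, not ISO 639-1 languages
REGION_FIXES = {"uk": "GB"}            # the United Kingdom is GB in ISO 3166-1


def check_hreflang(value: str) -> list[str]:
    """Return a list of problems; an empty list means the shape looks fine."""
    if value.lower() == "x-default":
        return []
    match = HREFLANG_SHAPE.match(value)
    if match is None:
        return [f"{value!r} is not language[-Script][-REGION]"]
    problems = []
    lang = match.group("lang").lower()
    region = (match.group("region") or "").lower()
    # A lone two-letter value is always read as a language, never a region.
    if not region and not match.group("script") and lang in BARE_REGION_LOOKALIKES:
        problems.append(f"{value!r} looks like a bare region code")
    if region in REGION_FIXES:
        problems.append(f"region {region.upper()!r} should be {REGION_FIXES[region]!r}")
    return problems
```

Calling check_hreflang("en-UK") or check_hreflang("US") returns one problem each, while en-US, zh-Hant and x-default come back clean. A bare uk also passes, because it is a valid language code for Ukrainian, which is exactly why that mistake is hard to spot.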

The return-tag rule that breaks most clusters

Here is the single most important rule, and the one that costs sites the most rankings: hreflang must be bidirectional. Every page in a language set has to list all versions, including a self-reference back to itself.

Concretely, each language version links to all the others with rel="alternate" hreflang="lang-REGION", every page references back, and an x-default entry covers users who do not match any listed locale. If your English page points to the French page but the French page does not point back, Google treats the annotation as unconfirmed and ignores it. You will see this in Search Console as "no return tags," and it is almost always literal — the references simply do not point both ways.

That means the same block of link tags appears on every localized URL. The English page lists itself and the French page. The French page lists itself and the English page. They are identical sets, repeated on each member of the cluster. There is no shortcut here; skipping the self-reference on one page is enough to drop that page from the set.
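The bidirectional rule is mechanical enough to check in code. Here is a minimal sketch in Python, assuming you have already crawled each page and collected its annotations into a dictionary. The function name and data shape are my own, and any URLs you feed it are up to you.

```python
def find_hreflang_gaps(cluster: dict[str, dict[str, str]]) -> list[str]:
    """Check self-references and return tags.

    `cluster` maps each crawled page URL to the {hreflang value: href}
    annotations found on that page.
    """
    problems = []
    for page_url, annotations in cluster.items():
        targets = set(annotations.values())
        if page_url not in targets:
            problems.append(f"{page_url} has no self-reference")
        for target in sorted(targets - {page_url}):
            other = cluster.get(target)
            if other is None:
                # Often the x-default URL, if it was left out of the crawl.
                problems.append(f"{page_url} points to {target}, which is not in the crawl")
            elif page_url not in other.values():
                problems.append(f"{target} has no return tag for {page_url}")
    return problems
```

Feed it an English page that lists both versions and a French page that lists only itself, and it reports a single missing return tag on the French URL. An empty list means every reference in the crawl points both ways.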

What x-default does

x-default names the page to serve when no other entry matches the visitor. Think of a language picker, or a global English home page. If you target en-US, fr-FR and ja-JP, but the searcher's language is German, none of those three matches, and x-default catches them instead of leaving Google to guess.

It is optional, but I recommend it on nearly every multi-region setup. It is the clean fallback that keeps unmatched users from landing on a locale they cannot read. Point it at whatever page makes the best generic landing spot: a chooser, your most widely understood language, or a region-neutral home page.
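To make the fallback concrete, here is a deliberately simplified model in Python. It is not Google's matching logic, which is not public; it only illustrates the order "exact locale, then language, then x-default". The function name and parameters are hypothetical.

```python
from typing import Optional


def pick_variant(
    alternates: dict[str, str], language: str, region: Optional[str] = None
) -> Optional[str]:
    """Return the href of the best-matching entry, else the x-default href."""
    # hreflang values are case-insensitive, so compare in lowercase.
    lookup = {code.lower(): href for code, href in alternates.items()}
    if region:
        exact = lookup.get(f"{language}-{region}".lower())
        if exact:
            return exact
    language_only = lookup.get(language.lower())
    if language_only:
        return language_only
    # Nothing matched: fall back to x-default, or None if it is missing.
    return lookup.get("x-default")
```

With en-US, fr-FR, ja-JP and an x-default entry, a German visitor resolves to the x-default URL. Remove the x-default entry and the same call returns None, which is the "left to guess" case.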

A worked example: en, zh, and x-default

Say you have two versions of a product page plus a language picker as the fallback. The block below goes in the <head> of both localized pages, unchanged:

<link rel="alternate" hreflang="en-US" href="https://example.com/en/product" />
<link rel="alternate" hreflang="zh-CN" href="https://example.com/zh/product" />
<link rel="alternate" hreflang="x-default" href="https://example.com/product" />

Read it back against the rules. Three entries: English for the US, Chinese for mainland China, and a default. The English page includes its own en-US line (the self-reference), and the Chinese page includes its own zh-CN line, because the block is identical on both. The x-default row points at a neutral /product URL that detects the visitor's language and redirects. Use fully qualified URLs with the protocol and host, never relative paths; Google's guidelines call for the full URL in every alternate.
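Because the block is identical on every page, it is safer to generate it from one mapping than to hand-edit each template. A minimal sketch in Python, using the three URLs from the example above; the function name is my own, and the absolute-URL check is a simple prefix test rather than full URL validation.

```python
from html import escape

ALTERNATES = {
    "en-US": "https://example.com/en/product",
    "zh-CN": "https://example.com/zh/product",
    "x-default": "https://example.com/product",
}


def render_hreflang_block(alternates: dict[str, str]) -> str:
    """Render one <link> line per alternate; paste the result on every page."""
    lines = []
    for code, href in alternates.items():
        # Reject relative paths early: every href needs protocol and host.
        if not href.startswith(("https://", "http://")):
            raise ValueError(f"{code}: {href!r} is not an absolute URL")
        lines.append(
            f'<link rel="alternate" hreflang="{escape(code)}" href="{escape(href)}" />'
        )
    return "\n".join(lines)
```

Calling render_hreflang_block(ALTERNATES) reproduces the three lines shown above, and the same string goes into the head of both localized pages.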

If you serve non-HTML files such as PDFs, you cannot add a <link> tag — there is no <head> to put it in. For those, the same relationships go into an HTTP Link: response header set by your server or CDN. And for very large sites, the cleaner path is to declare all alternates inside your XML sitemap with <xhtml:link> elements, so the entire language map lives in one file instead of being scattered across thousands of page heads.
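Both alternatives carry the same language-to-URL pairs, just in a different wrapper. The sketch below, in Python, builds a Link header value and a sitemap from one mapping. The function names are my own and the page list is whatever you choose to pass in; how you attach the header depends on your server or CDN, which I am not assuming anything about.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
XHTML_NS = "http://www.w3.org/1999/xhtml"
ET.register_namespace("", SITEMAP_NS)
ET.register_namespace("xhtml", XHTML_NS)


def build_link_header(alternates: dict[str, str]) -> str:
    """Return the value for a single HTTP `Link:` response header."""
    return ", ".join(
        f'<{href}>; rel="alternate"; hreflang="{code}"'
        for code, href in alternates.items()
    )


def build_sitemap(pages: list[str], alternates: dict[str, str]) -> str:
    """Return sitemap XML where every <url> repeats the full alternate set."""
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for page_url in pages:
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = page_url
        for code, href in alternates.items():
            ET.SubElement(
                url,
                f"{{{XHTML_NS}}}link",
                {"rel": "alternate", "hreflang": code, "href": href},
            )
    return ET.tostring(urlset, encoding="unicode")
```

The same rule from the head-tag version carries over: each <url> entry in the sitemap lists every alternate, including the page itself, so the set stays bidirectional.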

How this prevents duplicate-content problems

Translations are not duplicate content in the spammy sense, but a search engine that cannot tell them apart may behave as if they were — indexing only one variant, or consolidating signals onto a single URL and dropping the rest. hreflang is what disambiguates them. It tells Google the pages are deliberate language alternates of one another, so each is indexed on its own terms and surfaced to the right audience.

This is also where placement discipline pays off. Pick one method per page — head tags, HTTP header, or sitemap — and do not mix them with conflicting values, or you send Google contradictory signals. Keep hreflang consistent with your canonical tags too: a page should be its own canonical, not point its canonical at a different-language version, or you tell Google to collapse exactly the pages you just told it to keep separate. While you are auditing the technical layer, the meta tag generator is handy for keeping the canonical and language meta clean alongside the alternate links.
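The canonical rule can be audited the same way as return tags. Here is a minimal sketch in Python, assuming a crawl has given you each page's canonical URL and its hreflang set; the function name and the dictionary layout are hypothetical.

```python
def canonical_conflicts(pages: dict[str, dict]) -> list[str]:
    """Flag pages whose canonical tag disagrees with their hreflang set.

    `pages` maps URL -> {"canonical": str, "alternates": {hreflang: href}}.
    """
    problems = []
    for url, page in pages.items():
        canonical = page["canonical"]
        if canonical == url:
            continue  # self-canonical: nothing to report
        if canonical in page["alternates"].values():
            # The conflict described above: canonical collapses two locales.
            problems.append(f"{url} canonicalizes to its alternate {canonical}")
        else:
            problems.append(f"{url} canonicalizes to {canonical}, outside its hreflang set")
    return problems
```

The first message is the serious one. The second is not always a bug, for example when a tracking-parameter URL points at its clean version, but it does mean the hreflang tags belong on the canonical URL instead.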

The first time I shipped hreflang on a real site, I did everything right except the self-reference — I listed every other locale on each page but never the page itself. Search Console flagged "no return tags" within a week, and rankings for the new locales stayed flat for a month while I chased phantom server bugs. The fix was one extra line per page. I now treat the self-reference as the first thing I check, not the last, and I paste the generated block into every URL in the set rather than hand-editing each one.

A short checklist before you ship

Run through this before pushing localized pages live:

  • Every page in the set lists all versions, including itself.
  • Language codes are well-formed: language first, optional region second, no bare region codes.
  • One placement method per page, with absolute URLs.
  • An x-default entry covers visitors who match nothing.
  • Canonical tags agree with hreflang rather than fighting it.

Get those five right and Google can route each searcher to the version meant for them, your translations will stop competing with each other, and each locale gets a fair chance to rank with its own audience.


Made by Toolora · Updated 2026-06-13