Valid lang attribute language codes
HTML lang attribute must use a valid language code
Set the page’s primary language with a valid code on the html lang attribute. If the code is missing or invalid, assistive technologies may fall back to the wrong language. This affects multilingual users and anyone relying on screen readers or Braille output.
Why It Matters
Screen readers pick pronunciation, voice, and accent from the page language. An incorrect or missing code can cause text to be read with the wrong pronunciation rules, making speech hard or impossible to understand.
Braille translation and hyphenation depend on language tables. The wrong tag can yield incorrect contractions or character mapping.
Users with cognitive or language processing disabilities rely on consistent pronunciation and predictable reading. Right-to-left languages (Arabic, Hebrew) also require the correct text direction for reading order and caret movement.
Common Causes
- No lang attribute on the root html element.
- Nonstandard values (eng, en_US, english) instead of BCP 47 tags (en, en-US).
- Putting lang on body only, or conflicting values on multiple root-level elements.
- Client-side rendering that removes or replaces the html element without preserving lang.
- Not marking inline language changes for foreign words or quotes.
- Missing dir for right-to-left languages or using CSS alone to fake direction.
How to Fix
- Identify the page’s primary language.
- Use the language of the main content users read, not the user’s locale or geolocation.
- Set a valid BCP 47 tag on the html element.
- Use hyphens, not underscores. Example: en, en-US, es, fr-CA, ar.
- Recommendation: use conventional casing (language lowercase, region uppercase), though tags are case-insensitive.
- Set lang once, on the html element. Avoid conflicting values on other root-level elements such as body.
- Mark language changes within content.
- Wrap foreign words, phrases, or passages with an element that includes a lang attribute.
- Keep the scope tight to just the text that changes language.
- Set text direction when needed.
- For RTL languages (e.g., ar, he, fa), use dir="rtl". Put it on html if the whole page is RTL, or on the specific element for isolated runs.
- Use dir="ltr" for LTR runs inside RTL contexts when needed. dir="auto" can help with mixed input.
- Validate your tags.
- Verify language tags against the IANA Language Subtag Registry that BCP 47 relies on. Avoid made-up or deprecated values.
Standards alignment: WCAG 2.2 — 3.1.1 Language of Page, 3.1.2 Language of Parts.
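The hyphen and casing rules above can be sketched in a few lines of Python. This is a minimal illustration, not a full BCP 47 implementation: the names `normalize_tag` and `looks_well_formed` are hypothetical, and the pattern covers only the common language, script, region, and variant shape. It checks form only and does not consult the IANA registry, so a well-shaped but unregistered value such as eng still passes and needs a manual lookup.

```python
import re

# Simplified shape: language[-script][-region][-variant...].
# Assumption: extensions, private-use, and grandfathered tags are out of scope.
_TAG_SHAPE = re.compile(
    r"[a-z]{2,3}"                           # primary language subtag
    r"(-[a-z]{4})?"                         # optional script, e.g. Hant
    r"(-([a-z]{2}|[0-9]{3}))?"              # optional region, e.g. US or 419
    r"(-([a-z0-9]{5,8}|[0-9][a-z0-9]{3}))*",  # optional variants
    re.IGNORECASE,                          # tags are case-insensitive
)


def normalize_tag(value: str) -> str:
    """Replace underscores with hyphens and apply conventional casing."""
    parts = value.strip().replace("_", "-").split("-")
    out = [parts[0].lower()]
    for part in parts[1:]:
        if len(part) == 4 and part.isalpha():
            out.append(part.title())   # script: first letter uppercase
        elif len(part) == 2 and part.isalpha():
            out.append(part.upper())   # region: uppercase
        else:
            out.append(part.lower())   # numeric regions, variants
    return "-".join(out)


def looks_well_formed(value: str) -> bool:
    """Check the shape of the raw value; this does not prove it is registered."""
    return _TAG_SHAPE.fullmatch(value) is not None
```

With these helpers, `looks_well_formed("en_US")` is false because of the underscore, while `normalize_tag("en_US")` yields a hyphenated candidate that can then be checked against the registry.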
How to Test
- Source/DevTools check: Inspect the root <html>. Confirm lang is present, formatted with hyphens (e.g., en-GB), and matches the primary content.
- Static analysis: Use automated tooling (e.g., linters or accessibility scanners) to flag missing or malformed lang. Manually confirm the value is a valid BCP 47 tag.
- Screen reader check (desktop): With NVDA/JAWS/VoiceOver, read the page and verify correct voice and pronunciation. Move through any inline language changes and confirm voice switches appropriately.
- Mobile screen reader check: With TalkBack/VoiceOver on a phone, ensure pronunciation and voice switching match the tagged languages.
- Directionality check (if applicable): For RTL pages or segments, confirm reading order, punctuation placement, and caret navigation are correct.
- Keyboard: No keyboard-specific checks apply to this issue, but confirm that focus order and caret movement still behave as expected across language and direction changes.
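As a rough illustration of the source check, the sketch below uses Python's standard `html.parser` module to read the lang attribute from the root element of a page's markup. The class and function names (`RootLangAudit`, `audit_root_lang`) are hypothetical, and the checks are deliberately narrow: a missing root lang, an underscore in the value, and lang set on body only. It inspects static markup, so it would not see a lang value that client-side code sets later, and it does not replace a real accessibility scanner.

```python
from html.parser import HTMLParser


class RootLangAudit(HTMLParser):
    """Record the lang attribute found on <html> and on <body>."""

    def __init__(self) -> None:
        super().__init__()
        self.html_lang = None
        self.body_lang = None

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        attr_map = dict(attrs)
        if tag == "html" and self.html_lang is None:
            self.html_lang = attr_map.get("lang")
        elif tag == "body":
            self.body_lang = attr_map.get("lang")


def audit_root_lang(markup: str) -> list[str]:
    """Return a list of human-readable problems; empty means none found."""
    parser = RootLangAudit()
    parser.feed(markup)
    problems = []
    if not parser.html_lang:
        problems.append("missing lang on html")
    elif "_" in parser.html_lang:
        problems.append("underscore in html lang")
    if parser.body_lang and not parser.html_lang:
        problems.append("lang set on body only")
    return problems
```

Calling `audit_root_lang` on a page's source returns an empty list when none of these three problems is present; any value it accepts should still be confirmed manually as a valid BCP 47 tag.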
Good Example
<!doctype html>
<html lang="en-GB">
<head>
<meta charset="utf-8">
<title>Store</title>
</head>
<body>
<p>Order summary</p>
<p>Offer: <span lang="es">envío gratuito en todos los pedidos</span>.</p>
<p lang="ar" dir="rtl">هذا مثال قصير.</p>
</body>
</html>
Why it’s good:
- Valid BCP 47 tag on html (en-GB).
- Inline Spanish phrase marked with lang="es".
- Arabic text tagged with lang and dir="rtl" for correct direction.
Bad Example
<!doctype html>
<html>
<head>
<meta charset="utf-8">
<title>Store</title>
</head>
<body lang="en_US">
<p>Offer: envío gratuito en todos los pedidos.</p>
<p lang="ar">نص عربي بدون اتجاه</p>
</body>
</html>
What’s wrong:
- Missing lang on html.
- Invalid value with an underscore (en_US), set on body. The correct form is en-US, set on html.
- Spanish phrase not marked as a language change.
- Arabic paragraph lacks dir="rtl", so it inherits the left-to-right base direction, which can misplace punctuation and alignment.
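The missing-direction problem in the bad example can also be caught mechanically. The sketch below is one possible approach using Python's standard `html.parser`; the names `MissingDirFinder`, `find_missing_dir`, and `RTL_LANGS` are hypothetical. It assumes a short, non-exhaustive set of right-to-left language subtags taken from this guide, and it only looks at each element's own attributes, so an element that legitimately inherits dir="rtl" from an ancestor would be reported as a false positive.

```python
from html.parser import HTMLParser

# Assumption: a small, non-exhaustive set of primary subtags for RTL languages.
RTL_LANGS = {"ar", "he", "fa"}


class MissingDirFinder(HTMLParser):
    """Collect elements whose lang is an RTL language but that set no dir."""

    def __init__(self) -> None:
        super().__init__()
        self.flagged = []

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        lang = (attr_map.get("lang") or "").lower()
        # Take the primary subtag, tolerating underscore-separated values.
        primary = lang.replace("_", "-").split("-")[0]
        if primary in RTL_LANGS and not attr_map.get("dir"):
            self.flagged.append((tag, lang))


def find_missing_dir(markup: str) -> list:
    """Return (tag, lang) pairs for RTL-language elements without their own dir."""
    finder = MissingDirFinder()
    finder.feed(markup)
    return finder.flagged
```

Run against the bad example's markup, this would flag the Arabic paragraph; run against the good example, it would flag nothing, because that paragraph carries its own dir attribute.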
Quick Checklist
- html element has a single, valid BCP 47 lang value (e.g., en, en-US, fr-CA).
- Use hyphens, not underscores; avoid nonstandard codes.
- Mark foreign words/phrases with a lang attribute on the smallest containing element.
- Apply dir="rtl" for RTL languages (and dir="ltr" or dir="auto" when needed).
- Confirm correct pronunciation and voice switching in screen readers.
- Validate tags with automated tools, then verify manually.
- Keep templates/SPA frameworks from stripping or overriding the root lang.
- Align with WCAG 2.2: 3.1.1 (page) and 3.1.2 (parts).