Comparing Page Language Declaration Setups in Screen Readers
Published on September 28, 2021 (updated October 3, 2023), filed under Development.
One best practice in web development is to declare the document language via the lang attribute on the html start tag. That is useful, as it aims to ensure that user agents can present, including read, each document correctly. It’s also controversial, because using the Content-Language HTTP header is more efficient and because language detection software has become more and more effective, perhaps more effective than authors and editors are in marking up language.
To survey our situation with respect to language declaration, I set up a test page, in German so as not to be supported accidentally by all the screen readers developed in English-speaking countries. That test page covers five conditions: no language declared; language declared correctly through the lang attribute; language declared correctly through the Content-Language HTTP header; language declared correctly through both; and language declared incorrectly through both, with conflicting values.
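As a rough illustration, the five conditions could be generated along these lines in Python. This is a minimal sketch and not the actual test setup: the names `Condition`, `CONDITIONS`, and `build_response` are hypothetical, and so is the header value in the conflicting case ("ja"). Only the Russian lang value is suggested by the results table.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Condition:
    name: str
    lang_attr: Optional[str]         # value for <html lang="...">, None to omit it
    content_language: Optional[str]  # Content-Language header value, None to omit it


# The five setups described above. In the conflicting case, "ru" follows the
# results table; "ja" is a placeholder, as the original header value is not given.
CONDITIONS = [
    Condition("none", None, None),
    Condition("lang-only", "de", None),
    Condition("header-only", None, "de"),
    Condition("both", "de", "de"),
    Condition("conflict", "ru", "ja"),
]


def build_response(cond: Condition, text: str = "Guten Tag") -> Tuple[Dict[str, str], str]:
    """Return (headers, html) for one test condition."""
    headers = {"Content-Type": "text/html; charset=utf-8"}
    if cond.content_language is not None:
        headers["Content-Language"] = cond.content_language
    lang = f' lang="{cond.lang_attr}"' if cond.lang_attr is not None else ""
    html = f"<!DOCTYPE html>\n<html{lang}>\n<title>Test</title>\n<p>{text}</p>\n</html>\n"
    return headers, html
```

Calling `build_response` for each entry in `CONDITIONS` yields the header set and markup a server would need to send for that condition.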
This page I tested in four of the most popular screen readers; I ran the VoiceOver test on my own machine, and the NVDA, JAWS, and Narrator tests with Assistiv Labs. Big thanks to Assistiv Labs here, not only for their generally great product, but also for their kind support after I struggled to find help with JAWS testing.
Here are the findings:
Test | NVDA | JAWS | VoiceOver | Narrator |
---|---|---|---|---|
No language declared | English ❌ * | English ❌ * | English ❌ * | English ❌ * |
lang attribute | German ✅ | German ✅ | German ✅ | German ✅ |
Content-Language HTTP header | German ✅ | German ✅ | German ✅ | German ✅ |
lang and HTTP header | German ✅ | German ✅ | German ✅ | German ✅ |
Conflicting lang and HTTP header values | English? * | Russian (following lang) | Russian (following lang) | Russian (following lang), English * |
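One way to read the table is as a precedence order: the lang attribute first, then the Content-Language header, then the screen reader’s own language (see the footnote marked with *). The following Python sketch models that reading only. The function name `resolve_language` is hypothetical, it does not describe how any screen reader is actually implemented, and it matches the JAWS and VoiceOver columns more clearly than the NVDA and Narrator results in the conflicting case.

```python
from typing import Optional


def resolve_language(lang_attr: Optional[str],
                     content_language: Optional[str],
                     screen_reader_language: str = "en") -> str:
    """Model the precedence suggested by the findings (an assumption, not a spec)."""
    if lang_attr:
        # A lang attribute won in the conflicting case for JAWS and VoiceOver.
        return lang_attr.strip()
    if content_language:
        # Without lang, the header alone was enough in all four screen readers.
        return content_language.strip()
    # Nothing declared: fall back to the screen reader's language.
    return screen_reader_language
```

With the test conditions, `resolve_language("ru", "ja")` gives "ru", and `resolve_language(None, None)` gives the fallback, "en" for an English installation.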
What does this mean?
First, I’m still cautious about the findings, as I don’t regularly test with screen readers and as all the software was recent. Maybe I missed something an experienced accessibility tester would know, and perhaps older tooling would produce different results.
But then, what do you think the results mean? An HTML minimalist, I’ve already been vocal sharing my take on the topic. Here I’d simply like to provide a few data points to validate. The topic of whether and how to declare page language is going to stay with us for a while, so we’ll probably see it covered again.
Many thanks to Thomas Steiner for reviewing this post.
Update (November 30, 2021)
My concerns about requiring lang to be set on the html start tag had first been based on an insufficient differentiation between (and missing reconciliation of) text-processing language and language(s) of the intended audience. While not meant to be the same, in reality, they end up being used the same way. Under that premise, the argument made should be more understandable.
A clarification, rather than an update. The W3C I18N Activity’s language Q&A and RFC 2616 differentiate between a document’s language and the language of its intended audience. Based on that differentiation, there’s an argument against using the Content-Language HTTP header, and for @lang in every document.
I don’t think this is useful. In practice, there doesn’t seem to be a difference between a document’s language and the language of its intended audience. When you write a document in English, you expect your audience to be people who speak English. Accordingly, there doesn’t seem to be any website actually working like this, either; instead, languages declared through Content-Language and html@lang usually match. (If both meant entirely different things, it wouldn’t make sense, either, to use the HTTP header as a fallback to determine a document’s language.)
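Whether the two declarations match for a given page can be checked mechanically. Here is a minimal sketch using Python’s standard `html.parser` module. The names `HtmlLangParser` and `declarations_match` are hypothetical, and the comparison is a plain case-insensitive string match, so it would treat "de" and "de-AT" as different.

```python
from html.parser import HTMLParser
from typing import Optional


class HtmlLangParser(HTMLParser):
    """Record the lang attribute of the first html start tag, if any."""

    def __init__(self) -> None:
        super().__init__()
        self.lang: Optional[str] = None
        self._seen_html = False

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag and attribute names in lowercase.
        if tag == "html" and not self._seen_html:
            self._seen_html = True
            self.lang = dict(attrs).get("lang")


def declarations_match(content_language: Optional[str], html: str) -> Optional[bool]:
    """Return True or False when both declarations exist, None when one is missing."""
    parser = HtmlLangParser()
    parser.feed(html)
    if content_language is None or parser.lang is None:
        return None
    # Language tags are case-insensitive, so compare in lowercase.
    return content_language.strip().lower() == parser.lang.strip().lower()
```

For the conflicting test condition, such a check would report a mismatch; for a page that sets only one of the two, it reports nothing to compare.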
Therefore, unless you share your web pages on DVDs, the argument is not a good one. It seems weak, even, as it impacts (but ignores) code economy and maintainability. When advocating against the use of a single Content-Language header on the server-side, and instead asking to add @lang in every document, the result is poor economy and maintainability.
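To make the economy point concrete: on the server side, the header can be set in one place for all responses. The following is a minimal sketch as WSGI middleware in Python. The function name `with_content_language` and the default value "de" are assumptions for illustration, and real setups would more likely use a single line of web server configuration.

```python
def with_content_language(app, language="de"):
    """Wrap a WSGI app so that every response carries a Content-Language header."""

    def wrapped(environ, start_response):
        def start_with_language(status, headers, exc_info=None):
            # Keep a header the wrapped app set itself; otherwise add the default.
            if not any(name.lower() == "content-language" for name, _ in headers):
                headers = list(headers) + [("Content-Language", language)]
            return start_response(status, headers, exc_info)

        return app(environ, start_with_language)

    return wrapped
```

Wrapping an application once, as in `app = with_content_language(app, "de")`, then covers every document it serves, while individual responses can still override the value.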
No matter where you look (okay, where I look), the argument made for html@lang is not strong. (If I don’t get something, tell me, but don’t quote the very resources I’m already considering.)
* As all screen reader installations were in English, this is likely to mean that the fallback is the language of the screen reader, rather than that the fallback is always going to be English.
About Me
I’m Jens (long: Jens Oliver Meiert), and I’m a frontend engineering leader and tech author/publisher. I’ve worked as a technical lead for companies like Google and as an engineering manager for companies like Miro, I’m a contributor to several web standards, and I write and review books for O’Reilly and Frontend Dogma.
I love trying things, not only in web development (and engineering management), but also in other areas like philosophy. Here on meiert.com I share some of my experiences and views. (Please be critical, interpret charitably, and give feedback.)