HTML 2022: 20 Additional Observations From Analyzing the Web Almanac Data
Published on Oct 10, 2022 (updated Oct 18, 2024), filed under development, html.
You saw the release of the HTTP Archive’s 2022 Web Almanac? Yes, it’s live, with enough chapters to make me inform Frontend Dogma readers about a large number of articles coming up. (If you’re a developer and don’t know the Web Almanac yet, you’ll probably like it!)
The Web Almanac is turning into an institution, one of those publications to look forward to each year, much like the State of JavaScript and the State of CSS.
This year, I again had the pleasure of analyzing and documenting the data for HTML, in the Web Almanac’s Markup chapter. So while I’d feel honored if you checked out that chapter, let me honor you by sharing 20 things that I didn’t get to call out in it.
20 More Observations from 30 Sheets of Data
The no-doctype regression: The chapter mentions it in passing, but the share of pages not using a doctype rose from 2.5% to 2.7% on mobile (2.7% → 3.0% on desktop), which is another worrying indicator of a decaying craft. It’s a small and perhaps temporary dent, but the trend is negative.
Conditional Comments zombies: The mobile data set revealed 2,885,132 “conditional comments.” All past loathing aside, why are so many of these still around?
SVG use on the rise: In 2021, 46.4% of (unspecified?) pages used at least one SVG. In 2022, this is up 8.3 percentage points (to 54.7%).
Bring out the elements trash: Get yourself a nice cup of specialty coffee and scan the list of elements in use. (Pause.) Let’s please always validate our sites’ HTML output. Doing so contributes to a higher-quality Web and a greater career (and a shorter Web Almanac elements list).
Long live isindex: One element can still be found in the Web Almanac data. And don’t ask me why, I still love it. (It was deprecated with HTML 4.01.)
Mind the “embed” elements: object, embed, and param are still alive.
Pornhub uses custom elements: The HTML analysis included a sheet about “top pages with custom elements.” Pornhub is one of them, though only using one nineteenth of what Mercari uses (2 vs. 38).
Someone uses 108 custom elements, and 7 other (desktop) pages use more than 100 custom elements, too. I suppose these don’t have to be 100+ unique elements, but I didn’t dig into that.
65.7% of pages contain a form: Rick called that out in the data, but I didn’t get to review and discuss forms in the Markup chapter. The number seems big to me, though likely related to the data set still relying largely on homepages. How does the number look to you?
18.5% of all inputs are of type “text,” counting both those input elements explicitly setting type=text, and those that omit it because it’s the default.
There are almost as many verbose instances of defining a submit button as there are concise ones: On 41% of pages we find buttons with no type specified, on 32% we find buttons of type “submit.” But buttons without a type are submit buttons, too; i.e., <button> suffices.
The median form contains 4 input elements; the 10th and 25th percentile contain 2, the 75th percentile 7, and the 90th percentile 14 inputs.
We’re using too many classes (4,300,024,711 on 7,940,685 pages, which is roughly 540 per page on average). (Just as we’re using too many divs.)
We’re dealing with too much metadata cruft. 107 different metadata directives, each one added with the idea it was relevant, even important? We’re adding metadata too easily. (Upgrade Your HTML IV, coming out in November, will have a chapter about “metadata madness.”)
It’s great to see strong use of data-*. data-* attributes allow embedding “custom non-visible data,” and websites are making ample use of them. The reason blossoming use of data-* is so much better than blooming use of meta elements is that data-* use is usually driven by site owner and developer needs (which they typically know), while meta elements are typically dictated by third parties (who know their own needs, too, but which may or may not fit those of site owners and their developers).
7.3% of “mobile pages” and 11.53% of “desktop pages” set no viewport information: Unsurprising (this information is more useful on mobile) and fascinating (no regard or awareness for mobile on some sites, at all?) at the same time.
PNG is the most popular favicon format, and it’s becoming more popular: In 2021, 35.3% of favicons were PNGs; in 2022, it’s 37.7%. (On 10,035 pages in the mobile set, it’s spelled “pnj.”) But SVG is on the rise, too: 2021, 0.4%; 2022, 1%.
80% of links use target=_blank? (Did I read this right; why!)
There are too many javascript: links: 25ish% of all links are of this type. (You probably thought, too, that these died in the early 2000s.) mailto: (0.3%) and tel: (0.5%), more useful schemes, are far less popular.
However, going by what you can find per page, mailto: (29.5%) and tel: (26.6%) are more popular than javascript: (22.2%). That is, on 3 of 10 pages you find a mailto: reference, on 1 of 4 a tel: one, and on 1 of 5 a javascript: one. Next? whatsapp: and viber:, on about 1 of 200 pages.
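If you’d like to run similar tallies against your own pages, here is a minimal sketch in Python. It is not how the Web Almanac data was produced; it only illustrates the counting rules mentioned above, namely that a button without a type is a submit button, that an input without a type is a text input, and that links can be grouped by URL scheme. The class name MarkupTally and the sample markup are made up for illustration.

```python
from collections import Counter
from html.parser import HTMLParser
from urllib.parse import urlsplit


class MarkupTally(HTMLParser):
    """Hypothetical helper: tally button types, input types, and link schemes."""

    def __init__(self):
        super().__init__()
        self.buttons = Counter()
        self.inputs = Counter()
        self.schemes = Counter()

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "button":
            # Keep "no type" separate from an explicit "submit" so that
            # concise and verbose submit buttons can be compared.
            declared = attributes.get("type")
            key = "(no type)" if declared is None else declared.lower()
            self.buttons[key] += 1
        elif tag == "input":
            # A missing type attribute means a text input, so count it as such.
            self.inputs[(attributes.get("type") or "text").lower()] += 1
        elif tag == "a":
            # Relative URLs carry no scheme of their own.
            scheme = urlsplit(attributes.get("href") or "").scheme
            self.schemes[scheme or "(relative)"] += 1


# Made-up sample markup, only to exercise the tally.
sample = """
<form>
  <input name="q">
  <input type=text name="city">
  <input type="email" name="mail">
  <button>Send</button>
  <button type="submit">Send, verbosely</button>
</form>
<a href="javascript:void(0)">Menu</a>
<a href="mailto:hi@example.com">Mail</a>
<a href="tel:+15551234567">Call</a>
<a href="/about/">About</a>
"""

tally = MarkupTally()
tally.feed(sample)
tally.close()

print(tally.buttons)
print(tally.inputs)
print(tally.schemes)
```

In this sample, both buttons submit the form; the tally just makes visible which one spelled that out. The sketch ignores invalid type values, which a more careful version would also map to the respective default.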
This is it! This is what I found combing through the data once more. Did I make a mistake? Did I miss something else that’s worth highlighting (I’m sure I did)? Have you shared your own highlights? Respond to the tweet for this post, and let’s start adding life to the #htmlalmanac tag. And yes! If you’re into minimal, quality HTML, maybe you’ll enjoy my HTML book series.
About Me
I’m Jens (long: Jens Oliver Meiert), and I’m a web developer, manager, and author. I’ve been working as a technical lead and engineering manager for companies you’ve never heard of and companies you use every day; I’m an occasional contributor to web standards (like HTML, CSS, WCAG); and I write and review books for O’Reilly and Frontend Dogma.
I love trying things, not only in web development and engineering management, but also in other areas like philosophy. Here on meiert.com I share some of my experiences and views. (I value you being critical, interpreting charitably, and giving feedback.)