On Semantics in HTML
Published on Oct 26, 2011 (updated Feb 5, 2024), filed under development, html, semantics.
This and many other posts are also available as a pretty, well-behaved ebook: On Web Development.
As web developers we like to talk about “semantic markup,” a somewhat inaccurate short form for “markup that is meaningful and used how it’s supposed to be used.” We also like discussions around what markup is appropriate when, and to ramble on about markup that is “meaningless.” In many cases markup decisions and discussions don’t stop at HTML elements but also cover ID and class names.
Now when we’re talking about “semantic markup,” where is all that meaning actually coming from?
In essence, semantics in HTML is all about who agrees on the meaning, and how many do, both for elements and for ID and class names.
More thoughts.
In general you can say that initially, HTML and especially XML elements have no meaning.
In a way, however, we grant standards bodies like the W3C the authority (or accept their mandate) to tell us what HTML elements mean and what their purpose is. That is, only through these bodies’ directive do we agree on p elements representing paragraphs, ul and ol elements representing lists, and also div elements carrying little semantic weight. It would well be possible either to not assign these elements any meaning (that’s what authors involuntarily did when using tables for layout) or to assign them a different meaning (why would p not be great for parentheses?).
Similarly, we accept certain communities’ interpretation of what markup may mean. Think microformats. Their markup constructs don’t have any meaning per se, either, but with a lot of people agreeing on their purpose they do become meaningful.
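To make these first tiers concrete, here is a small sketch. The elements get their meaning from the HTML specification, while the class names vcard, fn, org, and url only mean something because the microformats community agreed on hCard. The person, organization, and URL are made up for illustration.

```html
<!-- Meaning assigned by the HTML specification -->
<p>A paragraph, because the spec says that p represents one.</p>
<ul>
  <li>An item in an unordered list</li>
</ul>
<ol>
  <li>An item in an ordered list</li>
</ol>
<div>A generic container that carries little meaning of its own.</div>

<!-- Meaning assigned by community agreement (hCard microformat) -->
<div class="vcard">
  <span class="fn">Jane Doe</span>,
  <span class="org">Example Inc.</span>,
  <a class="url" href="https://example.com/">example.com</a>
</div>
```

Without the hCard convention, the second block would just be a div with a few arbitrarily named hooks.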
Next in line is common sense. A class like “error” or an ID “author” has meaning because it defines a purpose that can be understood and also agreed on. With these names also being advisable for maintainability reasons we now know why functional ID and class names are most useful.
Then the terrain gets a bit rougher with generic names like “aux” or “alt[ernative]”. Here we’re leaving the semantics trail as generic names are harder to grasp, yet their purpose is less to add meaning than to avoid pseudo-meaning and serve as helper constructs.
Last are obfuscated, random, or presentational names. They are meaningless and should be avoided, presentational names especially, as they pose the biggest threat to maintainability.
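The difference between functional, generic, presentational, and obfuscated names is easiest to see side by side. In this sketch the text content and the obfuscated class “x7f” are hypothetical; the other names are the ones used as examples above.

```html
<!-- Functional names: the purpose is clear and survives a redesign -->
<p class="error">Please enter a valid email address.</p>
<p id="author">Written by Jane Doe</p>

<!-- Generic name: adds no meaning, but also claims none -->
<div class="aux">
  <p>Secondary content that only needs a styling hook.</p>
</div>

<!-- Presentational name: becomes misleading once errors are styled orange -->
<p class="red">Please enter a valid email address.</p>

<!-- Obfuscated name: meaningless to anyone reading the markup -->
<p class="x7f">Please enter a valid email address.</p>
```

Note how only the presentational name forces a markup change (or leaves a lie behind) when the design changes, which is the maintainability threat in practice.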
As this list goes from “most meaning” to “least meaning,” you can see why blockquote is more readily accepted to mean a quote than “vcard” to mean an hCard container, “vcard” more readily than “login” for a sign-in field, “login” more readily than “aux” for a helper class, and “aux” more readily than “red” for I don’t know what. It also shows why you don’t need a class “list-item” on an li element: it is already defined as a list item on a higher level, namely that of the spec.
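As a quick illustration of that last point, the following sketch contrasts a redundant class with markup that relies on the spec-level meaning. The list contents are invented for the example.

```html
<!-- Redundant: the class repeats what the spec already says about li -->
<ul>
  <li class="list-item">Apples</li>
  <li class="list-item">Oranges</li>
</ul>

<!-- Sufficient: li is a list item by definition -->
<ul>
  <li>Apples</li>
  <li>Oranges</li>
</ul>

<!-- A class earns its place when it adds something the element lacks -->
<ul>
  <li class="error">Oranges are out of stock</li>
</ul>
```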
As I love disclaimers, this may not be all there is to say about the topic but it was good enough for a write-down-everything-that-comes-to-mind-on-semantics-now post.
Update (August 5, 2014)
In the meantime, in 2012, I also wrote an article about semantics for Google. I believe it adds detail and value to the points made here.
Update (November 3, 2014)
Web Components and custom elements rank the same as IDs or class names in terms of meaning, and the same naming best practices apply.
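A minimal sketch of what that means for naming, assuming two hypothetical custom elements (neither name comes from a real library; the hyphen is what the HTML specification requires of custom element names):

```html
<!-- Functional name: describes a purpose, like a class "author" would -->
<author-bio>
  <p>Jane Doe writes about web development.</p>
</author-bio>

<!-- Presentational name: has the same problem as a class "red" -->
<red-box>
  <p>Jane Doe writes about web development.</p>
</red-box>
```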
About Me
I’m Jens (long: Jens Oliver Meiert), and I’m a web developer, manager, and author. I’ve been working as a technical lead and engineering manager for companies you’ve never heard of and companies you use every day, I’m an occasional contributor to web standards (like HTML, CSS, WCAG), and I write and review books for O’Reilly and Frontend Dogma.
I love trying things, not only in web development and engineering management, but also in other areas like philosophy. Here on meiert.com I share some of my experiences and views. (I value you being critical, interpreting charitably, and giving feedback.)