70% Repetition in Style Sheets: Data on How We Fail at CSS Optimization
Published on May 31, 2017 (updated September 17, 2024), filed under Development.
If the optimization of CSS is of particular import to you, I've collected several concepts in a brief book: CSS Optimization Basics.
Teaser: Check on how many declarations you use in your style sheets, how many of those declarations are unique, and what that means.
Background. In 2008 I argued that using declarations just once makes for a key method to DRY up our style sheets (this works because avoiding declaration repetition is usually more effective than avoiding selector repetition, as declarations are often longer). I later raised the suspicion that the demand for variables would have been more informed and civilized had we nailed style sheet optimization. What I haven't done so far is gather data to show how big the problem really is. That's what I've done in the last few weeks. (You can tell that the topic is dear to me.)
In a Google spreadsheet I've collected the top 200 content sites in the Moz Top 500, and taken another 20 sites related to web development for comparison. (I've also added 3 of my sites out of curiosity.) I then used the extremely useful CSS Stats to determine the total number of CSS declarations, as well as the number of unique declarations, to calculate ratios as well as averages. You get the idea as soon as you check out said spreadsheet.
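To make the counting concrete, here is a minimal Python sketch of how total declarations, unique declarations, and their ratio could be derived from a style sheet. It is not how CSS Stats works internally; the function names, the regex-based parsing, and the sample CSS are assumptions for illustration, and a real analysis should rely on a proper CSS parser.

```python
import re


def extract_declarations(css: str) -> list[str]:
    """Return every declaration in `css` as a normalized 'property: value' string."""
    # Remove comments so that commented-out declarations are not counted.
    css = re.sub(r"/\*.*?\*/", "", css, flags=re.DOTALL)
    declarations = []
    # Only innermost {...} blocks hold declarations; this also covers rules in @media.
    for block in re.findall(r"\{([^{}]*)\}", css):
        # Naive split: a semicolon inside a string or url() would break this.
        for part in block.split(";"):
            if ":" not in part:
                continue
            prop, value = part.split(":", 1)
            prop = prop.strip().lower()
            value = " ".join(value.split())  # collapse whitespace
            if prop and value:
                declarations.append(f"{prop}: {value}")
    return declarations


def declaration_stats(css: str) -> tuple[int, int, float]:
    """Return (total declarations, unique declarations, unique/total ratio)."""
    declarations = extract_declarations(css)
    total = len(declarations)
    unique = len(set(declarations))
    ratio = unique / total if total else 0.0
    return total, unique, ratio


# Hypothetical sample style sheet.
SAMPLE_CSS = """
h1 { color: #333; margin: 0; }
h2 { color: #333; margin: 0; font-weight: bold; }
@media (min-width: 600px) {
  p { margin: 0; }
}
"""

total, unique, ratio = declaration_stats(SAMPLE_CSS)
print(total, unique, round(ratio, 2))  # 6 3 0.5
```

In the sample, three of six declarations are unique, so the ratio is 0.5 and half of the declarations are repetition.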
Figure: A spreadsheet full of data.
A note on the sample: I removed those Moz top sites that were duplicates, one-pagers, redirectors, and placeholders (like international Google homepages, sedoparking.com, blogspot.com, youtu.be, goo.gl, bit.ly, etc.), as well as those that derailed CSS Stats because of validation issues (of which there were so many that I stopped counting the respective sites; validate, it's part of our profession). I topped off using the next popular sites available, and given that I went as far as to include website #321 in the Moz list, you can see that quite a few sites were invalid either way. Although I've taken samples to verify that CSS Stats counts correctly, CSS validation issues might still distort the numbers (they did with Technorati, which I then removed); I accept this risk for workload reasons and because lower declaration counts favor, not penalize, the respective sites (less repetition reported than actually present).
A side note: About 37% of sites in the sample are still on http, including many commercial sites. (Compiling the list, I noticed that, interestingly, none of the handful of popular Chinese sites were on https.) Enabling https has become so easy that there's not even an excuse for those of us with small sites not to switch, let alone for those who run some of the most frequented websites on the planet. Let's encrypt.
Some Theory
Before we dissect the numbers, let's first quickly establish what one can consider good CSS development practice:
- As with all code, don't repeat yourself (DRY).
- In CSS, an effective way to not repeat yourself is to use declarations just once. The resulting repetition of selectors is less of a problem because declarations are usually longer than selectors; yet variable selector length makes using declarations once a soft rule that still requires thinking. (We should also really think of selector groups here, as we only deal with selector repetition once all the selectors for a given rule are identical. That we repeat `h1` in another rule for `h1, h2` is fine then, because `h1` and `h1, h2` are different selector groups, and hence they're not a problem in terms of DRY CSS.)
- No repetition of declarations is theoretically attainable, but in practice, two things interfere: The cascade may not permit grouping of all selectors (and not just in cases when selector order is mandated, another oft-forgotten subject, though there are detailed proposals on how to standardize selector sorting); and when strict separation of modules (CSS sections) is important, one may also tolerate some repetition. These two points deserve more elaboration, but relevant here is the note that strict separation of modules should not be used as a blanket refusal to curb declaration repetition; as the data show, that would be foolish.
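The idea of using declarations just once can be sketched as a small regrouping step. The following Python is a simplified illustration, not a production optimizer: the `Rule` type, the function name, and the sample rules are made up for this example, selector groups are split naively on commas, and the cascade (which, as noted, may forbid some regrouping) is ignored entirely.

```python
from collections import defaultdict

# A rule is a (selector group, declarations) pair, e.g. ("h1, h2", ["margin: 0"]).
Rule = tuple[str, list[str]]


def use_declarations_once(rules: list[Rule]) -> list[Rule]:
    """Regroup rules so that every declaration is written exactly once.

    Caveat: regrouping changes rule order, so this ignores the cascade.
    """
    first_seen: dict[str, int] = {}  # selector -> order of first appearance
    selectors_by_declaration: dict[str, list[str]] = defaultdict(list)

    for selector_group, declarations in rules:
        # Naive split; selectors that contain commas, like :is(a, b), would break.
        for selector in (s.strip() for s in selector_group.split(",")):
            first_seen.setdefault(selector, len(first_seen))
            for declaration in declarations:
                if selector not in selectors_by_declaration[declaration]:
                    selectors_by_declaration[declaration].append(selector)

    # Declarations used by the same set of selectors share one selector group.
    declarations_by_group: dict[str, list[str]] = defaultdict(list)
    for declaration, selectors in selectors_by_declaration.items():
        group = ", ".join(sorted(selectors, key=first_seen.__getitem__))
        declarations_by_group[group].append(declaration)

    return list(declarations_by_group.items())


# Hypothetical input: "margin: 0" and "line-height: 1.2" are declared twice.
rules = [
    ("h1", ["font-size: 2em", "margin: 0", "line-height: 1.2"]),
    ("h2", ["margin: 0", "line-height: 1.2"]),
]

for selector_group, declarations in use_declarations_once(rules):
    print(f"{selector_group} {{ {'; '.join(declarations)}; }}")
# h1 { font-size: 2em; }
# h1, h2 { margin: 0; line-height: 1.2; }
```

The result has three declarations instead of five. The selector `h1` appears twice, but in two different selector groups (`h1` and `h1, h2`), which is the kind of selector repetition described above as unproblematic.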
Now, what is a reasonable amount of repetition? From my experience, 10–20%; in other words, 80–90% of a style sheet should consist of unique declarations. Reality, however, looks vastly different, and here we get to the 70%.
Some Numbers
The most important observation first: Per our sample of 220 websites, the average number of declarations is 6,121; the average number of unique declarations is 1,698; and the average ratio of unique to total declarations is 0.28 (median 0.34), meaning that 72% of the average style sheet's declarations are repetition (I rounded down in the post title). That is, per what we just established as feasible, 3.5–7 times more repetition than we really want (and need) to see.
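The arithmetic behind these figures can be retraced in a few lines of Python. This is a sketch under one assumption: the ratio is recomputed here from the two reported averages, which matches the reported 0.28; whether the spreadsheet instead averages the per-site ratios is not stated. The variable names are illustrative.

```python
# Figures as reported for the 220-site sample.
average_total = 6121
average_unique = 1698
median_ratio = 0.34
acceptable_repetition = (0.10, 0.20)  # the 10-20% suggested as feasible

# Assumption: ratio of the averages, rounded like the reported value.
average_ratio = average_unique / average_total
average_repetition = 1 - round(average_ratio, 2)
median_repetition = 1 - median_ratio

print(f"average ratio: {average_ratio:.2f}")            # 0.28
print(f"average repetition: {average_repetition:.0%}")  # 72%
print(f"median repetition: {median_repetition:.0%}")    # 66%
for limit in acceptable_repetition:
    print(f"{average_repetition / limit:.1f} times the {limit:.0%} limit")
# 7.2 times the 10% limit
# 3.6 times the 20% limit
```

The factors of 3.6 and 7.2 correspond to the "3.5–7 times" stated above, allowing for rounding.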
Additional numbers and some interpretation before we talk more about DRY in CSS.
- 33 of the 220 websites use more than 10,000 declarations. Here the point we're trying to make, that most websites repeat too many declarations, is particularly important: When we look at the data from this angle, no website tested and, given the high profile of the websites here, perhaps barely any website at all, should need more than a total of 5,000 declarations. The exceptions from the sample (the ones with the most unique declarations) rather prove the rule, and so this is an important observation, as it may provide a great red flag for teams working on large-scale sites.
- The website with the most declarations, Kickstarter, uses 33,938 declarations; the next biggest website in terms of total declarations, Engadget, follows with 28,909 declarations. 8 more websites also use more than 20,000 declarations.
- Kickstarter is not the website with the most unique declarations, however, though it comes in second with 6,932 unique declarations; first here is Booking.com with 7,006 unique declarations. These high unique declaration counts probably help both avoid being the sites with the most repetition (with ratio 0.2 and thus 80% repetition of declarations for Kickstarter, and 0.26/74% for Booking.com).
- The most repetition we instead find on Engadget's website, where only 2,382 of 28,909 declarations are unique (ratio 0.08/92%; allow this to sink in), meaning that quite possibly, more than 20,000 declarations could have been skipped. (Numbers like these should give us much to think about.)
- TED (0.09/91%), the Sydney Morning Herald (0.11/89%), Bloomberg (0.13/87%), the New Yorker (0.14/86%), the Guardian, and the Telegraph (both 0.15/85%) like to repeat themselves very much, too.
- Very little repetition we find on my own websites (coderesponsibly.org, meiert.com, uitest.com); however, I excluded those from the sample. Google.com should be removed, too, because the site is so specialized and thus code-wise rather small (it comes in at 75 total/63 unique/0.84 ratio). That makes Yahoo the first website of moderate complexity with relatively little repetition (1,406/952/0.68), followed by Wikimedia (261/167/0.64), Chris Heilmann (269/170/0.63), and, also more complex, WordPress.org (1,636/996/0.61). I believe that Yahoo and WordPress.org (entirely ignoring sites that I may have DRYed, which once included Blogger) provide very compelling cases that DRY CSS works.
- The sample of 20 web development sites does fare better than the sample of 200 Moz sites: 7 out of 20 attain a medium DRY score (>0.5, or less than 50% repetition), compared to 11 out of 200. The web dev sites are, on average, significantly less complex than the ones in the 200 Moz sample, but one may suspect, and hope, that web developers (on their own sites) write better, and more DRY, CSS. Personally, I'm happy we see such a difference, for our own websites can be great showcases.
- There are certainly more things to note; please review the spreadsheet and share your own observations.
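For the sites where both total and unique counts are quoted above, the ratios and repetition shares can be rechecked with a short Python sketch. The counts are taken from the list; the function name and output layout are illustrative only.

```python
def repetition_summary(total: int, unique: int) -> tuple[float, float, int]:
    """Return (unique/total ratio, repetition share, repeated declarations)."""
    ratio = unique / total
    return ratio, 1 - ratio, total - unique


# (total declarations, unique declarations) as quoted in the list above.
sites = {
    "Kickstarter": (33938, 6932),
    "Engadget": (28909, 2382),
    "Google.com": (75, 63),
    "Yahoo": (1406, 952),
    "Wikimedia": (261, 167),
    "Chris Heilmann": (269, 170),
    "WordPress.org": (1636, 996),
}

for name, (total, unique) in sites.items():
    ratio, repetition, repeated = repetition_summary(total, unique)
    print(f"{name:15} ratio {ratio:.2f}  repetition {repetition:.0%}  repeated {repeated:,}")
```

For Engadget this yields a ratio of 0.08 and 26,527 declarations beyond the unique ones, in line with the "more than 20,000" mentioned above. The count of repeated declarations is an upper bound on what could be skipped, since the cascade and module separation may justify some of them.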
Some Conclusion
The conclusion is the one you probably arrived at, too: In CSS, we repeat ourselves too much.
While it's absolutely, practically possible to limit declaration repetition to 10–20%, reality averages 72% (median 66%).
That is bad news.
That is bad news, primarily, because this excess of repetition is the definition of bad code. CSS with this much repetition, this much bloat, is slow. CSS with this much repetition is also expensive: It's expensive to create, and it's particularly expensive to maintain (not even variables will save those who, on average, repeat each declaration nine times, like Engadget and TED).
That is bad news, secondarily, because it underscores how little attention we pay to optimizing style sheets for production. It seems to indicate that some of our sources fail us, that they talk the wrong talk, or walk the wrong walk (this is something the "anti-pattern here, anti-pattern there" folk may actually want to look at). It seems to have lured us to prioritize differently what we most need in the CSS specs (cf., again, variables and constants that never had to be in the spec).
That, then, suggests at least two things we can do now:
- Individually and collectively, let's focus more on how to optimize, notably DRY, our style sheets. That most definitely includes all final production code. Don't repeat yourself; look into using declarations just once, as long as you don't begin to repeat selectors excessively (the rule has limitations; see notes).

  Now if there's one thing missing, something I've only scratched in the original post on DRY CSS, it is that using declarations just once means a different way of working with style sheets. I haven't cared much about this here because this post is about what can make CSS DRY, not how the process is best done, but it's something we'll all notice when taking the points made to heart. I'm working on a follow-up post on what using declarations only once looks like in practice, offering a closer look at all the implications, and I encourage the community to explore the issue together.

  One thing our analysis here prompts us to look at, then, is how different exactly those style sheets are, and how much better they perform, that use 10–20% as opposed to 70% declaration repetition, because some of the improvements are lost with increased selector repetition. (The loss is often comparatively small, but not always, and so it deserves more scrutiny.)
- Let's give more stage time to those who work on the really big sites and manage to achieve good code quality. The work of research coders has always looked glamorous and will continue to draw attention, but we can learn little from it when it comes to high quality production code. (Here is one real caveat we observe around DevRel teams. They are rarely the ones who keep their businesses' infrastructure running.) High quality production code on complex, large-scale sites is more useful to base decisions on than much that just came out of a laboratory.
There's much more to elaborate on and to qualify (the "answer" is somewhere in the middle), but I think that the basic ideas are clear. If you wish to look at other aspects of high quality web development, there are plenty more articles in the archives (regularly reviewed), and if you wish to read one of my brief books on web development, I believe that The Little Book of HTML/CSS Frameworks (updated) gives a great perspective on tailored web development and on why, if you want something excellent, your best bet, granted the expertise and resources, is to do it yourself. "You" includes your company and your team, for you, not the third parties, know best what you need.
Oh. Also, please sort declarations alphabetically.
Appendix
Example
I wish to not just point to my projects' style sheets for how DRY works there, but also pick a rather complex case from the sample to perform post-optimization on that one. I selected Yandex (some time in April; the numbers have changed a little since).
Here are files and data showing what I did, which was pretty much roughly formatting the style sheets and then DRYing them up (but not reviewing and optimizing them beyond that). The results are interesting for they illustrate a limitation of exclusive focus on declaration avoidance:
File | Total declarations | Unique declarations | Ratio | Size (bytes) |
---|---|---|---|---|
Yandex Original | 5,443 | 1,921 | 0.35 | 201,410 |
Yandex Original, Formatted | 5,443 | 1,921 | 0.35 | 234,792 |
Yandex Semi-DRY, Formatted | ? | 1,914 | ? | ? |
Yandex DRY, Formatted | 2,083 | 1,914 | 0.92 | 237,033 |
Yandex DRY | 2,083 | 1,914 | 0.92 | 220,513 |
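The relative changes implied by the table can be worked out with a short Python sketch. The figures are copied from the table; the helper name and the tuple layout are illustrative assumptions.

```python
def percent_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100


# (total declarations, unique declarations, size in bytes), from the table above.
files = {
    "Yandex Original": (5443, 1921, 201410),
    "Yandex Original, Formatted": (5443, 1921, 234792),
    "Yandex DRY, Formatted": (2083, 1914, 237033),
    "Yandex DRY": (2083, 1914, 220513),
}

original, dry = files["Yandex Original"], files["Yandex DRY"]
formatted = files["Yandex Original, Formatted"]
dry_formatted = files["Yandex DRY, Formatted"]

print(f"declarations: {percent_change(original[0], dry[0]):+.1f}%")        # -61.7%
print(f"size (unformatted): {percent_change(original[2], dry[2]):+.1f}%")  # +9.5%
print(f"size (formatted): {percent_change(formatted[2], dry_formatted[2]):+.1f}%")  # +1.0%
```

So the DRY version uses about 62% fewer declarations while the file grows by roughly 1% (formatted) to 9.5% (unformatted).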
Feel free to poke around and verify (and ponder what else you can tell looking at these and your own development styles); meanwhile, here are some things to note:
- I wanted to demonstrate both moderate ("Semi-DRY") and aggressive optimization ("DRY") but had to go all in for aggressive optimization (for those who've never seen a declaration-DRY style sheet, that's how they can turn out; with additional optimization, the style sheet wouldn't look nearly as "scary"). The reason is actually trivial: I didn't have access to the raw, likely structured and commented production files, which would have made it easier to go for gentler module-based optimization (as opposed to the file-based optimization done) to show moderate editing.

  That's not a problem in terms of achieving maximum effect; however, aggressive optimization made the style sheet actually larger (because of lengthy selectors being repeated "too often", which is the main reason why using declarations just once is not a hard rule) and more likely to lead to side effects. So when you (or the Yandex team) are working with the optimized files, make sure to take the results with a grain of salt. Moderate optimization would have been easier to digest and easier to just plug in. When doing heuristic and aggressive post-optimization, turbulence is to be expected, and here we're looking at somewhat of a paradigm shift for writing CSS.
- Yandex, likely reflecting a modular instead of holistic development approach, ended up presenting a sub-problem of the declaration repetition problem extensively discussed here: great repetition of media queries. The production files I had pulled included 56 media queries; these could be brought down to 36, which only then allowed me to DRY up their contents.
- Why fewer unique declarations? Because there was some optimization I couldn't resist. If I see that there's both `border: none;` and `border: 0;` (the latter to be preferred), then I consolidate declarations, like we all should.
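Why a declaration-DRY style sheet can end up larger is easy to reproduce on a toy scale. The Python sketch below uses made-up selectors and declarations (none of them are from the Yandex files) and compares the byte size of two equivalent minified snippets: one that repeats a declaration and one that repeats selectors instead.

```python
def repeated(selector_a: str, selector_b: str) -> str:
    """Two rules that both declare color:red (the declaration is repeated)."""
    return (
        f"{selector_a}{{color:red;margin:0}}"
        f"{selector_b}{{color:red;padding:0}}"
    )


def declared_once(selector_a: str, selector_b: str) -> str:
    """The same styling with every declaration written once (selectors repeat)."""
    return (
        f"{selector_a},{selector_b}{{color:red}}"
        f"{selector_a}{{margin:0}}"
        f"{selector_b}{{padding:0}}"
    )


def size_difference(selector_a: str, selector_b: str) -> int:
    """Bytes gained (+) or saved (-) by writing each declaration only once."""
    once = declared_once(selector_a, selector_b).encode("utf-8")
    twice = repeated(selector_a, selector_b).encode("utf-8")
    return len(once) - len(twice)


# Hypothetical selectors: short ones benefit, long ones do not.
short_selectors = ("h1", "h2")
long_selectors = (
    ".site-header .navigation-primary a:hover",
    ".site-footer .navigation-secondary a:hover",
)

print(size_difference(*short_selectors))  # negative: declaring once saves bytes
print(size_difference(*long_selectors))   # positive: declaring once costs bytes
```

With short selectors the DRY variant is smaller; with long selectors, repeating them outweighs the saved declaration. That is the trade-off that makes using declarations just once a soft rule.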
FAQ
A few questions came up when asking for feedback on the first version of this article; as these touched on important points, I'm feeling free to include them as a complement. I might add more Q&A depending on what additional feedback comes in.
Does this scale for larger sites, especially when styles are split across multiple files?
The larger the site, the more repetition there'll be; that looks like a fair statement to me. Yet this should be addressed by suggesting 10–20% repetition to be fine. On small sites, 0–5% may be attainable. As for whether the process actually works for large sites, yes, absolutely. The Yandex example above should serve as a proof of concept; as for separate files, one can at the very least aim to DRY the respective sub style sheets.
What about mixins or other techniques which prefer repetition of declarations?
Mixins are a special issue because they are certainly convenient, but the very automation (compilation) that makes them so convenient stands in the way of avoiding repetition. It's a problem I don't have an answer ready for.
Does it still make sense from a maintenance perspective to avoid repetition when declarations are only coincidentally the same?
Yes, because with the idea of tailoring, everything that matters is the present. So when they're the same, they should, ideally, not be duplicated. Also, in my mind there's only practical, not coincidental, sameness, in the sense that if declarations are the same, one should aim to consolidate them.
How does e.g. moving a width declaration away from the height declaration affect maintainability and readability?
That can get a bit messy, so both could take a hit, but there are two things that soften the impact: 1) the freedom to limit avoiding repetition to sections (which is reflected by 10–20% "permissible" overall repetition) and 2) the observation that this is more of a restriction around general declarations, whereas specific page elements would be unique enough to keep their declarations bundled. I'm inclined to add 3), that with this approach, one has a great opportunity to consolidate elements and class names, too, so that what at first seems to be inconvenient separation leads to welcome structural improvements elsewhere.
What about repetition because of vendor-specific extensions?
Vendor-specific extensions are relevant for style sheet management (extensions that aren't needed anymore should regularly be removed), but not for keeping CSS DRY, because the respective declarations are, de facto, different. `-webkit-transition` is not `transition`.
Why is Atomic CSS (or x) not being mentioned?
Why should it? (Please comment or email.) Atomic CSS, for instance, has not been mentioned because it's not the solution to the different issues here. Perhaps it's not even a solution: Atomic CSS style sheets do indeed appear to be more declaration-DRY than others, but not to an extent where we should be convinced that that's all there is to say about DRY CSS, or about CSS optimization in general. This would stop the discussion right before it started, too. Furthermore, Atomic CSS violates some of the most fundamental principles for maintainability as well as for writing HTML; even though these principles reflect a traditional paradigm in web development, they represent a valid view of how to code websites that should likewise be addressed and handled properly.
The three main points of this long article summarized, then: We repeat ourselves too much in CSS; using declarations just once is often one solid avenue to avoid repetition; together, we need to put more focus on style sheet optimization. That focus is, indeed, most important: There are some tough problems in here for which I make suggestions, but we, plural, will still need to wrap our heads around them.
Thanks Tony Ruscoe and Kevin Khaw for reviewing and helping to improve this article.
About Me
I'm Jens (long: Jens Oliver Meiert), and I'm a frontend engineering leader and tech author/publisher. I've worked as a technical lead for companies like Google and as an engineering manager for companies like Miro, I'm somewhat close to W3C and WHATWG, and I write and review books for O'Reilly and Frontend Dogma.
I love trying things, not only in web development (and engineering management), but also in other areas like philosophy. Here on meiert.com I share some of my views and experiences.
If you'd like to do me a favor, interpret charitably (I speak three languages, and they do collide), yet be critical and give feedback, so that I can make improvements. Thank you!
Comments (Closed)
- On June 9, 2017, 19:16 CEST, Daniel J Dominguez said:

  I feel this isn't a good way of handling CSS. This method can fix a few inconsistencies in the declarations, like using a slightly off color, but that can be fixed using variables. I think that CSS repetition is a necessity, simply to make it easier to maintain. I mean if you have a button, you don't necessarily want that linked to a content style simply because they have the same padding and margin. I feel that CSS should be separated by intent and context of the element, not by its styles. By this and leveraging variables, you can have more maintainable code. You may argue that this would lead to bigger file sizes, and it does, in certain contexts. Let's take yours. So I loaded up your files in three separate tabs and compared them. Original, O-Formatted, and DRY. File sizes were as you said, but that isn't interesting. What is, is how they were sent. You see, GZIP is a very nice compression format that browsers can handle. Turns out you are serving GZIP, so that makes it easier to test. Original was served in 38.8KB, O-Formatted was served in 40.4KB, and DRY was served in 41.1KB. Interesting how even though O-Formatted was a bigger file size, it had a smaller download than DRY. That is because GZIP really likes repetition. It is able to find these sorts of patterns and cut the download to what is necessary to recreate it. So why make it harder for maintenance, if a computer can just make use of these repetitions and optimize there.
- On June 11, 2017, 9:12 CEST, Vergil Penkov said:

  Declaration DRYness (let's say via mixins) seems okay considering GZIP based on Harry's article here. Any thoughts on that?
- On June 12, 2017, 17:25 CEST, Ray Estrada said:

  What about mixins or other techniques which prefer repetition of declarations?

  To reduce declarations we could use @extend with placeholders rather than @include mixins. This achieves the goal of this article. However there are two real downsides:

  1. Code inheritance. When Sass compiles, all selectors are put together in the original declaration location. This pulls the selector out of the existing inheritance structure into another location, which sometimes leads to styles losing top inheritance. A possible solution to fix this is specificity of the selector, but then that also leads to a challenge of reusability of code.
  2. @extend placeholders cannot be used in media queries. This is a large one. The nature of the way the selectors are strung together to reduce the redundancy of code means that you can't use media queries. A possible solution is to use media query changes within the placeholder itself rather than in the unique selector, however this is not always ideal or possible.

  I think moving forward it would be wise to think about how we can leverage placeholders & mixins and discover a pattern that works to get us closer to DRY code.
- On June 14, 2017, 11:05 CEST, Ben Frain said:

  Hello Jens, thanks for writing this. I probably need to re-read this a couple of times but after a first pass, my overriding question is, why?

  I don't think anyone would argue that unneeded repetition is a 'bad' thing but what is your ultimate goal(s)? Reduced file size? UA parse and paint speed? Maintainability?

  Gzip will largely negate repetitive strings from a file size perspective and that seems to be proved by your Appendix example (am I right in reading that the Yandex DRY is a larger file than the Yandex original?).

  I would love to see further thoughts from you on what the actual gains are? Developer ergonomics? Time to first paint improvements? Size?

  Thanks again for the post.
- On June 14, 2017, 11:08 CEST, Ben Frain said:

  Sorry, forgot to add in the previous post, can't many of the optimisations you hope for be automated by tools like cssnano (http://cssnano.co/)? I feel like many of the problems you see could be greatly diminished by tools rather than authoring practices?
- On June 14, 2017, 18:51 CEST, Lea Verou said:

  DRY is not a goal in itself, the goal is maintainable stylesheets. DRY is just a tool to help us get there. Extreme, rigid adherence to DRY can actually hinder readability and maintainability, both in CSS code and programming code. Trying to deduplicate tiny amounts of code (like 1 declaration) can very easily hinder readability. I would be more interested in duplication of sets of > 3 declarations and how common that is. Also keep in mind that some of the duplication could be generated by preprocessors and not present in the source code.