70% Repetition in Style Sheets: Data on How We Fail at CSS Optimization

Published on May 31, 2017 (↻ September 17, 2024).

If the optimization of CSS is of particular import to you, I’ve collected several concepts in a brief book: CSS Optimization Basics.

Teaser: Check on how many declarations you use in your style sheets, how many of those declarations are unique, and what that means.

Background. In 2008 I argued that using declarations just once makes for a key method to DRY up our style sheets (this works because avoiding declaration repetition is usually more effective than avoiding selector repetition, as declarations are often longer). I later raised the suspicion that the demand for variables would have been more informed and civilized had we nailed style sheet optimization. What I hadn’t done so far was gather data to show how big the problem really is. That’s what I’ve done in the last few weeks. (You can tell that the topic is dear to me.)

In a Google spreadsheet I’ve collected the top 200 content sites in the Moz Top 500, and added another 20 sites related to web development for comparison. (I’ve also added 3 of my own sites out of curiosity.) I’ve then used the extremely useful CSS Stats to determine the total number of CSS declarations as well as the number of unique declarations, and calculated ratios and averages from these. You get the idea as soon as you check out said spreadsheet.
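To make the two counts concrete, here is a minimal sketch of how total and unique declarations could be tallied for a single style sheet. It is not how CSS Stats works internally (that tool uses a real parser); the function names `declarations` and `unique_ratio` and the sample CSS are made up for illustration, and the regex approach assumes there are no semicolons or braces inside strings or data URIs.

```python
import re


def declarations(css: str) -> list[str]:
    """Return every declaration in `css` as a normalized 'property:value' string."""
    css = re.sub(r"/\*.*?\*/", "", css, flags=re.S)  # drop comments
    found = []
    # Only innermost {...} blocks hold declarations, so rules nested
    # inside @media are reached while the outer @media block is skipped.
    for block in re.findall(r"\{([^{}]*)\}", css):
        for part in block.split(";"):
            if ":" not in part:
                continue
            prop, value = part.split(":", 1)
            found.append(f"{prop.strip().lower()}:{' '.join(value.split())}")
    return found


def unique_ratio(css: str) -> tuple[int, int, float]:
    """Return (total declarations, unique declarations, unique/total)."""
    decls = declarations(css)
    total, unique = len(decls), len(set(decls))
    return total, unique, (unique / total if total else 0.0)


# Hypothetical sample, not taken from any site in the spreadsheet.
sample = """
h1 { margin: 0; font-weight: bold }
h2 { margin: 0; font-weight: bold; color: #333 }
@media print { p { margin: 0 } }
"""

print(unique_ratio(sample))  # (6, 3, 0.5)
```

In this toy sample, half of the declarations are repeats of one another, which is the kind of ratio the spreadsheet records per site.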

Figure: A spreadsheet full of data.

A note on the sample: I removed those Moz top sites that were duplicates, one-pagers, redirectors, and placeholders (like international Google homepages, sedoparking.com, blogspot.com, youtu.be, goo.gl, bit.ly, &c. pp.), as well as those that derailed CSS Stats because of validation issues (of which there were so many that I stopped counting the respective sites. Validate! It’s part of our profession). I topped off the sample using the next most popular sites available, and given that I went as far as to include website #321 in the Moz list, you can see that quite a few sites were invalid either way. Although I’ve taken samples to verify that CSS Stats counts correctly, CSS validation issues might still distort the numbers (they did with Technorati, which I then removed); I accept this risk for workload reasons and because lower declaration counts favor, not penalize, the respective sites (less repetition is reported than is actually present).

A side note: About 37% of sites in the sample are still on http, including many commercial sites. (While compiling the list I noticed that, interestingly, none of the handful of popular Chinese sites appeared to be on https.) Enabling https has become so easy that there’s not even an excuse for those of us with small sites not to switch, let alone for those who run some of the most frequented websites on the planet. Let’s encrypt.

Contents

  1. Some Theory
  2. Some Numbers
  3. Some Conclusion
  4. Appendix
    1. Example
    2. FAQ

Some Theory

Before we dissect the numbers, let’s first quickly establish what one can consider good CSS development practice:

Now, what is a reasonable amount of repetition? From my experience, 10–20%; in other words, 80–90% of a style sheet should consist of unique declarations. Reality, however, looks vastly different, and here we get to the 70%.

Some Numbers

The most important observation first: Per our sample of 220 websites, the average number of declarations is 6,121; the average number of unique declarations is 1,698; and the average ratio of unique to total declarations is 0.28 (median 0.34), meaning that 72% of the average style sheet’s declarations are repetition (I rounded down in the post title). That is, per what we just established as feasible, about 3.5–7 times more repetition than we really want (and need) to see.
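The arithmetic behind these figures can be retraced in a few lines. This is only a sketch: the helper names `repetition_share` and `excess_factor` are made up for illustration, and the 10–20% target is the experience-based range stated above, not a measured value.

```python
def repetition_share(unique_ratio: float) -> float:
    """Share of declarations that are repeats, given unique/total."""
    return 1.0 - unique_ratio


def excess_factor(observed: float, target: float) -> float:
    """How many times the observed repetition exceeds a target share."""
    return observed / target


# Ratio of the two averages; it rounds to the same 0.28 as the
# average of the per-site ratios reported in the article.
print(round(1698 / 6121, 2))  # 0.28

mean_repetition = repetition_share(0.28)
median_repetition = repetition_share(0.34)
print(round(mean_repetition, 2), round(median_repetition, 2))  # 0.72 0.66

# Compared with the 20% and 10% targets:
print(round(excess_factor(mean_repetition, 0.20), 1))  # 3.6
print(round(excess_factor(mean_repetition, 0.10), 1))  # 7.2
```

The factors 3.6 and 7.2 are what the text rounds to “3.5–7 times.”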

Additional numbers and some interpretation before we talk more about DRY in CSS.

Some Conclusion

The conclusion is the one you probably arrived at, too: In CSS, we repeat ourselves too much.

While it’s absolutely, practically possible to limit declaration repetition to 10–20%, reality averages 72% (median 66%).

That is bad news.

That is bad news, primarily, because this excess of repetition is the definition of bad code. CSS with this much repetition, this much bloat, is slow. CSS with this much repetition is also expensive: It’s expensive to create, and it’s particularly expensive to maintain (not even variables will save those who, on average, repeat each declaration nine times, like Engadget and TED).

That is bad news, secondarily, because it underscores how little attention we pay to optimizing style sheets for production. It seems to indicate that some of our sources fail us, that they talk the wrong talk, or walk the wrong walk (this is something the “anti-pattern here, anti-pattern there” folk may actually want to look at). It seems to have lured us into prioritizing differently what we most need in the CSS specs (cf., again, variables and constants that never had to be in the spec).

That, then, suggests at least two things we can do now:

  1. Individually and collectively, let’s focus more on how to optimize, notably DRY, our style sheets. That most definitely includes all final production code. Don’t repeat yourself; look into using declarations just once, as long as you don’t begin to repeat selectors excessively (the rule has limitations; see notes).

    Now if there’s one thing missing, something I’ve only scratched the surface of in the original post on DRY CSS, it’s that using declarations just once means a different way of working with style sheets. I haven’t cared much about this here because this post is about what can make CSS DRY, not how the process is best done, but it’s something we’ll all notice when taking the points made to heart. I’m working on a follow-up post on what using declarations only once looks like in practice, offering a closer look at all the implications, and I encourage the community to explore the issue together.

    One thing our analysis here prompts us to look at, then, is how different exactly those style sheets are, and how much better they perform, that use 10–20% as opposed to 70% declaration repetition, because some of the improvements are lost with increased selector repetition. (The loss is often comparatively small, but not always, and so it deserves more scrutiny.)

  2. Let’s give more stage time to those who work on the really big sites and manage to achieve good code quality. The work of research coders has always looked glamorous and will continue to draw attention, but we can learn little from it when it comes to high quality production code. (Here is one real caveat we observe around DevRel teams: They are rarely the ones who keep their businesses’ infrastructure running.) High quality production code on complex, large-scale sites is more useful to base decisions on than much that just came out of a laboratory.
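To illustrate the first point, here is a minimal sketch of what “using declarations just once” amounts to mechanically: invert a selector-to-declarations mapping so that each declaration appears once, with all of its selectors grouped. The function names and the sample rules are hypothetical, and the sketch deliberately ignores the cascade (reordering rules can change which declaration wins), so it is an illustration of the idea rather than a safe optimizer.

```python
def use_declarations_once(rules: dict[str, list[str]]) -> dict[tuple[str, ...], list[str]]:
    """Regroup selector -> declarations so each declaration is written once."""
    # Step 1: for every declaration, collect the selectors that use it.
    selectors_for: dict[str, list[str]] = {}
    for selector, decls in rules.items():
        for decl in decls:
            users = selectors_for.setdefault(decl, [])
            if selector not in users:
                users.append(selector)
    # Step 2: merge declarations that share the exact same selector list,
    # which keeps selector repetition from growing more than necessary.
    merged: dict[tuple[str, ...], list[str]] = {}
    for decl, selectors in selectors_for.items():
        merged.setdefault(tuple(selectors), []).append(decl)
    return merged


def to_css(grouped: dict[tuple[str, ...], list[str]]) -> str:
    """Serialize the regrouped rules as CSS text."""
    return "\n".join(
        f"{', '.join(selectors)} {{ {'; '.join(decls)} }}"
        for selectors, decls in grouped.items()
    )


# Hypothetical input: 5 declarations, 3 of them unique.
rules = {
    "h1": ["font-weight: bold", "margin: 0"],
    "h2": ["font-weight: bold", "margin: 0", "color: #333"],
}

print(to_css(use_declarations_once(rules)))
# h1, h2 { font-weight: bold; margin: 0 }
# h2 { color: #333 }
```

The output writes 3 declarations instead of 5, but the selector `h2` now appears twice. That is the trade-off noted above: avoiding declaration repetition can increase selector repetition.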

There’s much more to elaborate on and to qualify—the “answer” is somewhere in the middle—but I think that the basic ideas are clear. If you wish to look at other aspects of high quality web development, there are plenty more articles in the archives (regularly reviewed), and if you wish to read one of my brief books on web development, I believe that The Little Book of HTML/CSS Frameworks (updated) gives a great perspective on tailored web development and why if you want something excellent, your best bet, granted the expertise and resources, is to do it yourself; you, that includes your company and your team, for you, not the third parties, know best what you need.

Oh. Also, please sort declarations alphabetically.

Appendix

Example

I wish to not just point to my projects’ style sheets for how DRY works there, but also pick a rather complex case from the sample to perform post-optimization on that one. I selected Yandex (some time in April—the numbers changed a little since).

Here are files and data showing what I did, which was pretty much roughly formatting the style sheets and then DRYing them up (but not reviewing and optimizing them beyond that). The results are interesting for they illustrate a limitation of exclusive focus on declaration avoidance:

| File | Total declarations | Unique declarations | Ratio | Size (bytes) |
| --- | --- | --- | --- | --- |
| Yandex Original | 5,443 | 1,921 | 0.35 | 201,410 |
| Yandex Original, Formatted | 5,443 | 1,921 | 0.35 | 234,792 |
| Yandex Semi-DRY, Formatted | ? | 1,914 | ? | ? |
| Yandex DRY, Formatted | 2,083 | 1,914 | 0.92 | 237,033 |
| Yandex DRY | 2,083 | 1,914 | 0.92 | 220,513 |
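The ratios and size differences in the table can be recomputed as follows. This is a small sketch using only the figures from the table; the helper names `unique_share` and `percent_change` are made up for illustration, and the Semi-DRY row is left out because its values are unknown.

```python
# (total declarations, unique declarations, size in bytes), from the table
yandex = {
    "Yandex Original": (5443, 1921, 201410),
    "Yandex Original, Formatted": (5443, 1921, 234792),
    "Yandex DRY, Formatted": (2083, 1914, 237033),
    "Yandex DRY": (2083, 1914, 220513),
}


def unique_share(total: int, unique: int) -> float:
    """Ratio of unique to total declarations."""
    return unique / total


def percent_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100


for name, (total, unique, size) in yandex.items():
    print(f"{name}: ratio {unique_share(total, unique):.2f}, {size:,} bytes")

original, dry = yandex["Yandex Original"], yandex["Yandex DRY"]
print(f"declarations: {percent_change(original[0], dry[0]):+.1f}%")  # -61.7%
print(f"size: {percent_change(original[2], dry[2]):+.1f}%")          # +9.5%
```

So in this example the declaration count drops by about 62% while the uncompressed file grows by about 9.5%, which is the limitation the table is meant to show.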

Feel free to poke around and verify (and ponder what else you can tell looking at these and your own development styles); meanwhile, here are some things to note:

FAQ

A few questions came up when I asked for feedback on the first version of this article; as these touched on important points, I’m taking the liberty of including them as a complement. I might add more Q&A depending on what additional feedback comes in.

Does this scale for larger sites, especially when styles are split across multiple files?

The larger the site, the more repetition there’ll be; that looks like a fair statement to me. Yet this should be addressed by suggesting that 10–20% repetition is fine. On small sites, 0–5% may be attainable. As for whether the process actually works for large sites: yes, absolutely. The Yandex example above should serve as a proof of concept; as for separate files, one can at the very least aim to DRY the respective sub style sheets.

What about mixins or other techniques which prefer repetition of declarations?

Mixins are a special issue because they are certainly convenient, but the very automation (compilation) that makes them so convenient stands in the way of avoiding repetition. It’s a problem I don’t have an answer ready for.

Does it still make sense from a maintenance perspective to avoid repetition when declarations are only coincidentally the same?

Yes, because with the idea of tailoring, all that matters is the present. So when declarations are the same, they should, ideally, not be duplicated. Also, in my mind there’s only the idea of practical, not coincidental, sameness, in the sense that if declarations are the same, one should aim to consolidate them.

How does e.g. moving a width declaration away from the height declaration affect maintainability and readability?

That can get a bit messy, so both could take a hit, but there are two things that soften the impact: 1) the freedom to limit avoiding repetition to sections (which is reflected by 10–20% “permissible” overall repetition) and 2) the observation that this is more of a restriction around general declarations, whereas specific page elements would be unique enough to actually keep their declarations bundled. I’m inclined to add 3), that with this approach, one has a great opportunity to consolidate elements and class names, too, so that what at first seems to be inconvenient separation leads to welcome structural improvements elsewhere.

What about repetition because of vendor-specific extensions?

Vendor-specific extensions are relevant for style sheet management—extensions that aren’t needed anymore should regularly be removed—but not for the DRY keeping of CSS, because respective declarations are, de facto, different. -webkit-transition is not transition.

Why is Atomic CSS (or x) not being mentioned?

Why should it be? (Please comment or email.) Atomic CSS, for instance, has not been mentioned because it’s not the solution to the different issues here. Perhaps it’s not even a solution: Atomic CSS style sheets do indeed appear to be more declaration-DRY than others, but not to an extent where we should be convinced that that’s all there is to say about DRY CSS, or about CSS optimization in general. This would stop the discussion right before it started, too. Furthermore, Atomic CSS violates some of the most fundamental principles for maintainability as well as for writing HTML; even though these principles reflect a traditional paradigm in web development, they represent a valid view of how to code websites that should likewise be addressed and handled properly.

❧ The three main points of this long article summarized, then: We repeat ourselves too much in CSS; using declarations just once is often one solid avenue to avoid repetition; together, we need to put more focus on style sheet optimization. That focus is, indeed, most important: There are some tough problems in here for which I make suggestions, but we, plural, will still need to wrap our heads around them.

Thanks Tony Ruscoe and Kevin Khaw for reviewing and helping to improve this article.


About Me

Jens Oliver Meiert, on November 9, 2024.

I’m Jens (long: Jens Oliver Meiert), and I’m a frontend engineering leader and tech author/publisher. I’ve worked as a technical lead for companies like Google and as an engineering manager for companies like Miro, I’m a contributor to several web standards, and I write and review books for O’Reilly and Frontend Dogma.

I love trying things, not only in web development (and engineering management), but also in other areas like philosophy. Here on meiert.com I share some of my experiences and views. (Please be critical, interpret charitably, and give feedback.)

Comments (Closed)

  1. On June 9, 2017, 19:16 CEST, Daniel J Dominguez said:

    I feel this isn’t a good way of handling CSS. This method can fix a few inconsistencies in the declarations, like using a slightly off color, but that can be fixed using variables. I think that CSS repetition is a necessity, simply to make it easier to maintain. I mean if you have a button, you don’t necessarily want that linked to a content style simply because they have the same padding and margin. I feel that CSS should be separated by intent and context of the element, not by its styles. By this and leveraging variables, you can have more maintainable code. You may argue that this would lead to bigger file sizes, and it does, in certain contexts. Let’s take yours. So I loaded up your files in three separate tabs and compared them. Original, O-Formatted, and DRY. File sizes were as you said, but that isn’t interesting. What is, is how they were sent. You see, GZIP is a very nice compression format that browsers can handle. Turns out you are serving GZIP, so that makes it easier to test. Original was served in 38.8KB, O-Formatted was served in 40.4KB, and DRY was served in 41.1KB. Interesting how even though O-Formatted was a bigger file size, it had a smaller download than DRY. That is because GZIP really likes repetition. It is able to find these sorts of patterns and cut the download to what is necessary to recreate it. So why make it harder for maintenance, if a computer can just make use of these repetitions and optimize there?

  2. On June 11, 2017, 9:12 CEST, Vergil Penkov said:

    Declaration DRYness (let’s say via mixins) seems okay considering GZIP based on Harry’s article here. Any thoughts on that?

  3. On June 12, 2017, 17:25 CEST, Ray Estrada said:

    What about mixins or other techniques which prefer repetition of declarations?

    To reduce declarations we could use @extend with placeholders rather than @include mixins. This achieves the goal of this article. However there are two real downsides:

    1. Code inheritance.
    When Sass compiles all selectors are put together in the original declaration location. This pulls the selector out of the existing inheritance structure into another location which sometimes leads to styles losing top inheritance. A possible solution to fix this is specificity of the selector, but then that also leads to a challenge of reusability of code.

    2. @extend placeholders cannot be used in media queries.
    This is a large one. The nature of the way the selectors are strung together to reduce the redundancy of code means that you can’t use media queries. A possible solution is to use media query changes within the placeholder itself rather than in the unique selector, however this is not always ideal or possible.

    I think moving forward it would be wise to think about how we can leverage placeholders & mixins and discover a pattern that works to get us closer to DRY code.

  4. On June 14, 2017, 11:05 CEST, Ben Frain said:

    Hello Jens, thanks for writing this. I probably need to re-read this a couple of times but after a first pass, my overriding question is, why?
    I don’t think anyone would argue that unneeded repetition is a ‘bad’ thing but what is your ultimate goal(s)? Reduced file size? UA parse and paint speed? Maintainability?
    Gzip will largely negate repetitive strings from a file size perspective and that seems to be proved by your Appendix example (am I right in reading that the Yandex DRY is a larger file than the Yandex original?).
    I would love to see further thoughts from you on what the actual gains are? Developer ergonomics? Time to first paint improvements? Size?
    Thanks again for the post.

  5. On June 14, 2017, 11:08 CEST, Ben Frain said:

    Sorry, forgot to add in the previous post, can’t many of the optimisations you hope for be automated by tools like cssnano (http://cssnano.co/)? I feel like much of problems you see could be greatly diminished by tools rather than authoring practices?

  6. On June 14, 2017, 18:51 CEST, Lea Verou said:

    DRY is not a goal in itself, the goal is maintainable stylesheets. DRY is just a tool to help us get there. Extreme, rigid adherence to DRY can actually hinder readability and maintainability, both in CSS code and programming code. Trying to deduplicate tiny amounts of code (like 1 declaration) can very easily hinder readability. I would be more interested in duplication of sets of > 3 declarations and how common that is. Also keep in mind that some of the duplication could be generated by preprocessors and not present in the source code.