Website Optimization Measures, Part II
Jens O. Meiert, February 15, 2008 / October 6, 2009.
This entry is filed under Web Development, Design.
Now that we’ve talked about blog clean-ups, structure and element revisions, and search engine verification in part I, here are some additional suggestions: small measures for improvement covering .htaccess housekeeping, SEO, and consistency checks.
-
Sorting .htaccess directives and adding standardized comments. Quick and dirty: I love to be organized, and I discovered some potential within my projects’ .htaccess files. I didn’t really add anything new, as many useful directives were already in place, but I sorted the directives alphabetically within certain sections, and I labeled the sections themselves quite “metaphorically”:
# Authentication
## Authentication directives

# Startup Routine
## Various alphabetically sorted directives, e.g.
AddCharset utf-8 .css
AddDefaultCharset utf-8
CheckSpelling On
ContentDigest On
DefaultLanguage en

# Course Correction
## URL rewrite directives

# Course Correction: P1-P3
## Redirect and RedirectMatch directives

# Emergency
## ErrorDocument directives
-
Getting additional assistance with SEO. Sure, this involves actual optimization as well, but first I need to thank John Britsios for helping me with a few severe issues. The main measure I needed to take was a robots.txt improvement, made necessary by the apparently lousy archive and pagination handling of WordPress: for the English part of this site, I had about 74% of my pages in the supplemental index (self-promotion: see more of these tools over at the recently face-lifted UITest.com). That is way too much, and it was caused by a lot of automatically generated duplicate content. So John analyzed this site and came up with a few solutions, and I’m both confident in and curious about the real outcome over the coming weeks and months. The thoughts I had about dates in URLs didn’t really matter yet.
-
Checking and improving UI and code consistency. There have been many other improvements, but I’ll file them under “consistency efforts”. The lesson I learned from my QA initiative (with quite a few people pointing out mistakes) is the same one I learned when checking code: no matter how hard you try, some mistakes always slip through. Checking both CSS and HTML files revealed a few issues, minor though they were, such as unnecessary references or, in one project, even support for IE 5 (whose extra code I just don’t tolerate anymore).
-
Considering but dropping hidden file extensions. No wonder I’m basically skipping this one: I wasted too much time on mod_rewrite experiments. Okay, it wasn’t really wasted since I learned a lot, but what I finally noticed was that hiding file extensions (and the implications for my personal projects) wasn’t really worth the effort, and I stopped changing things as soon as I suspected this would become a maintenance issue. Just because you can doesn’t mean you should.
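For readers wondering what hiding file extensions with mod_rewrite involves, the following is a minimal sketch, not the configuration tried on this site (which the article does not show). It assumes static .html files, an enabled mod_rewrite, a root-level .htaccess, and a host that allows Options and FileInfo overrides; all of these are assumptions for illustration.

```apache
# Hypothetical sketch: serve /about from about.html without exposing ".html".
# Assumes mod_rewrite is loaded and AllowOverride permits Options and FileInfo.

# Turn off content negotiation so it cannot interfere with the rules below
Options -MultiViews

RewriteEngine On

# 1. Externally redirect explicit requests for /page.html to /page.
#    THE_REQUEST holds the original request line, so the internal rewrite
#    from rule 2 cannot trigger this redirect again (no loop).
RewriteCond %{THE_REQUEST} \s/([^?\s]+)\.html[?\s]
RewriteRule ^ /%1 [R=301,L]

# 2. Internally map /page to page.html, but only if the request is not
#    a directory and the corresponding .html file actually exists.
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME}.html -f
RewriteRule ^(.+)$ $1.html [L]
```

Even this small variant needs two coordinated rules plus a guard against content negotiation, and every internal link would have to be switched to the extensionless form, which hints at the maintenance cost mentioned above.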
That should have been quite a few specific refactoring measures. I hope you enjoyed them anyway. I might write about other optimization efforts again soon, since there are still many things to improve. Of course.
This has been the second part of an open article series. There are six additional articles on website optimization, part I, part III, part IV, part V, part VI, and part VII.
Comments
-
On February 18, 2008, 6:17 CET, Lazar said:
Regarding the supplemental index, the way to check it is to compare the number of Google search results for:
site:meiert.com/
with
site:meiert.com/*
That seems to be what mapelli.info is doing. I personally have my doubts about using * to detect non-supplemental results, as I got some strange results a few times. Since you work for Google now, and I’ve heard it has amazing transparency among employees of all departments, you can give us a hint about the meaning of * ;o)
Thanks for mentioning UITest.com, it has a really nice collection of links.
-
On February 19, 2008, 10:34 CET, Jens Meiert said:
“site:meiert.com/ with site:meiert.com/* That seems to be what mapelli.info is doing.”
Right, it appears to do nothing else.

“Thanks for mentioning UITest.com, it has a really nice collection of links.”
Thank you!
-
On February 20, 2008, 1:16 CET, Bennett said:
I would love to hear about the robots.txt improvements to avoid indexing of automatically generated duplicate content. I suppose the obvious thing is to block all archive pages (categories, months, etc.) so only individual posts are crawlable. Is this what you did?
-
On February 20, 2008, 8:42 CET, Web Designer Group said:
New to SEO. Your suggestions are useful and helpful. Sorting directives and adding comments makes code easier to read. Thanks for the introduction to UITest.
-
On February 20, 2008, 10:53 CET, Jens Meiert said:
Bennett, yes, basically. The most important thing was to get rid of all duplicate content generated by WordPress (mostly caused by all the archives) while still leaving a “path” open (we left the “Categories” way open), then to look for other instances of duplicate content; for example, a few files were available in different formats. It took some time but appears to be paying off already. (Not blaming WordPress, not now …)
-
On March 1, 2008, 17:45 CET, Robert said:
I would suggest completely removing from .htaccess any Apache directives that are neither directory-specific nor rather volatile over time, and dropping them into an appropriate httpd.conf include. These were some of my candidates:
AddCharset utf-8 .css
AddDefaultCharset utf-8
CheckSpelling On
ContentDigest On
DefaultLanguage en
.htaccess parsing costs performance, so why would you add to the cost with settings that fit into a startup configuration item just as well?
-
On March 2, 2008, 17:24 CET, Jens Meiert said:
Robert – you’re absolutely right, it’s even advisable to disable .htaccess altogether. However, I have no access to my server’s httpd.conf file, as will be the case with most private websites, so I need to use .htaccess files instead …
-
On November 4, 2009, 11:46 CET, SEO Tips said:
Yea, this is a wonderful piece of information. Your points regarding code consistency and robots.txt are very useful. I will post my comments regarding .htaccess after doing my analysis. Many thanks for the useful information.
-
On February 24, 2010, 20:08 CET, SEO Process said:
Okay, it took a while to analyze .htaccess sorting and stuff. I tried implementing it on 3 different sites with different natures, architectures, and rewriting techniques. In my experience, coming up with generic algorithms using wildcards (e.g. ‘*’) could be more helpful. Using wildcards, you can implement almost the same algorithm on as many sites as you want, and every time you come back for administration, you don’t need to recall the page structures.
So in my case, being generic in the use of .htaccess directives could be more helpful in optimizing a website, whether for SEO or for webmaster activities.
Makes sense?
-
On March 12, 2010, 12:47 CET, Linda Jobs said:
Could I ask for assistance in preventing duplicate content from being indexed using the .htaccess method you explained above? I feel that’s the only thing not explained well in your article; otherwise it’s great stuff.
Many thanks in advance for your help!