Bye, bye Friendly URL's

Posted on 2/8/06 by Felix Geisendörfer

Deprecated post

The authors of this post have marked it as deprecated. This means the information displayed is most likely outdated, inaccurate, boring or a combination of all three.

Policy: We never delete deprecated posts, but they are not listed in our categories or show up in the search anymore.

Comments: You can continue to leave comments on this post, but please consult Google or our search first if you want to get an answer ; ).

Important: A while after publishing this post I came up with a solution to the problem, which I've documented in the post "Dessert #11 - Welcome back, Friendly URL’s". So if you're a first-time reader of this article, please make sure to read the follow-up post as well and leave your comments there.

Usually, I don't really consume the stuff that is being put out by the Rails community in terms of blogs, screencasts, or conferences. But yesterday I stumbled across David Heinemeier Hansson's RailsConf 2006 Keynote Address (the slides are available as well), which I found quite impressive and which discussed some of the things I've recently struggled with.

One of the things that caught my attention the most was DHH's opinion on friendly URLs. Toward the end of Part 7 he argues that identifying objects via friendly URLs is "just not worth it", and that auto-incrementing IDs should be used instead. I've been a long-time advocate of the opposite point of view, but those words, along with the explanation of how this approach is stupid in the database world, made me change my mind. But let me explain my personal take on it a little more:

In the past couple of weeks I've used a model called UrlAlias that hooks into routes.php and allows the completely free assignment of any public URL to any internal CakePHP URL. So, for example, http://www.my-domain.com/about could be connected to /sites/view/1 internally, based on the database table url_aliases. That gave me enough power to assign friendly URLs to just about any content of the web site I've been working on. I even had an option for redirectional routing, which means that somebody going to /sites/view/1 would be redirected to /about instead. But lord, maintaining this complex setup has been difficult. Because what happens if you take the title of a blog as the URL, and the blog title gets edited by the user afterwards? The URL changes and previous links break. So I thought, well, if the title changes, remember the old one and redirect requests for it to the new one. Nice, that would even allow me to maintain legacy URLs! Hm, but now that I'm doing this, I need to build an interface for it, so the user knows how the internal URL system works and can actively use it to improve the friendliness of his URLs. Otherwise he'd soon get lost in a world of multiple aliases for the same content.
But now, one day you want to make an international site, so all the URLs should be internationalized as well, right? No problem, just convert all the special characters of the foreign languages into URL-compatible ones, and make the URL internationalization an additional layer of abstraction. Sweet ... or is it? Am I the only one getting a headache at this point? I hope not.
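
To make the moving parts concrete, here is a minimal sketch of the alias bookkeeping described above. It is written in Python rather than CakePHP and is not the original UrlAlias model: the class name `UrlAliases`, its two dictionaries, and the `slugify` helper are all assumptions made for illustration, with an in-memory structure standing in for the url_aliases table.

```python
import re
import unicodedata


def slugify(title):
    """Turn a title into a lowercase, ASCII-only, hyphen-separated slug."""
    # Decompose accented characters, then drop everything that is not ASCII.
    ascii_title = (
        unicodedata.normalize("NFKD", title)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Collapse every run of non-alphanumeric characters into one hyphen.
    return re.sub(r"[^a-z0-9]+", "-", ascii_title.lower()).strip("-")


class UrlAliases:
    """In-memory stand-in for a url_aliases table (hypothetical layout)."""

    def __init__(self):
        self.current = {}  # public path -> internal path
        self.retired = {}  # old public path -> current public path

    def assign(self, internal, title):
        """Give an internal URL a public alias derived from its title."""
        public = "/" + slugify(title)
        for old, target in list(self.current.items()):
            if target == internal and old != public:
                del self.current[old]
                # Repoint older retired aliases so redirects never chain.
                for key in list(self.retired):
                    if self.retired[key] == old:
                        self.retired[key] = public
                self.retired[old] = public
        self.retired.pop(public, None)  # a reused path is live again
        self.current[public] = internal
        return public

    def resolve(self, path):
        """Return (status, location) for a requested public path."""
        if path in self.current:
            return 200, self.current[path]
        if path in self.retired:
            return 301, self.retired[path]
        return 404, None
```

Even this toy version needs rename history and chain handling, and its transliteration quietly drops characters that have no ASCII decomposition (ß, for example), which hints at where the headache comes from.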

I mean, all of this is really nice, and could improve your URLs for humans as well as for search engines. But what you are really doing is breaking with a convention! And not just a little "alphabetical order for join table parts" convention, but one of the conventions most crucial to the RAD'ness of CakePHP. Breaking with the routing system adds an incredible amount of organizational overhead to any application you might build. I've decided that it's "not really worth it" for my current projects. If I need some /about or /new-product page for public advertising, I can use CakePHP's built-in routing system, but for the rest I'll stick to the convention and save myself hours and hours of programming, debugging and worrying. For search engine optimization, friendly URLs would be nice, but looking at the SERPs (search engine result pages) for the keywords important to me, I realized that this is not the main factor for Google & Co. My time seems to be better invested in making my HTML as semantic and accessible as possible, instead of wasting it on those stupid URLs and getting lost in a crazy world where I don't know which URL belongs to which controller/action/etc.

What do you guys think about this? Is anyone still willing to go through the trouble of "friendly URLs", or would you rather go for the RAD'ness and simplicity approach instead?

--Felix Geisendörfer aka the_undefined

PS: Sorry for the long time I've not posted on here. I'm in the United States right now, visiting my host family in Atlanta, so I've been a little busy the last 2 weeks with catching up with old friends and interacting with real-world-models ; ). Expect more good stuff on here again from now on.

 

Nate K said on Aug 02, 2006:

Felix, this is a great post. It's a topic I battled with for a while as well. I ran into the same problem you discussed with user input. If they were to change a title, it would change the URL-friendly link - and the previous one would then be broken. In that sense, I took the same approach WordPress seems to take with its system - don't update the URL if they change the post title. So the URL-friendly title would stay the same. Unfortunately, this brings in an array of other issues. What happens if the user misspelled something? They obviously don't want that in the SEF URL. BUT - since it is serving as a 'unique key' to access the content - it shouldn't be easily editable by the user.

So, it is confusing, and there are many aspects you have to think about. Right now, I am not using CakePHP as a framework. I have played with it for a few other sites, but haven't implemented it at www.barbourbooks.com. I manage our URLs with mod_rewrite. For our needs, this fits better, without the need for a router file or a URL parser. However, I like to make sure all URLs WORK. So, when we moved to a new system, I mapped the majority of the previous URLs that were indexed to their new location on the new website. So, we still have support for the majority of our legacy URLs.

Overall - I would take the extra work to make a URL SEF - versus ugly query strings or the like. I prefer the clean structure - and a structure that has keywords and meaning. Simplicity is nice, but I think it is worth giving meaning to ALL of your content - and this includes URLs. I also give convenience 'short' URLs for those looking for specific books. They can type in the actual ISBN instead of the title, and this is then validated and routed with a 301 redirect to the SEF URL. This way only one of them is indexed by Google, and it helps keep our statistics clean of duplicate pages being accounted for.

Ok - now I'm rambling - but my overall response is - YES - it is worth it to have SEF URLs.
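
The ISBN shortcut can be sketched roughly as follows. This is an illustration in Python, not the actual barbourbooks.com setup (which uses mod_rewrite): the `BOOKS` lookup table, its sample path, and the function names are hypothetical, and the validation is the standard ISBN-10 / ISBN-13 checksum.

```python
import re

# Hypothetical catalogue: normalized ISBN -> search-engine-friendly path.
BOOKS = {"0306406152": "/books/sample-title"}


def is_valid_isbn(isbn):
    """Check the ISBN-10 or ISBN-13 checksum of a hyphen-free string."""
    if re.fullmatch(r"[0-9]{9}[0-9X]", isbn):
        # ISBN-10: weights 10..1, a trailing X counts as 10, sum divisible by 11.
        values = [10 if c == "X" else int(c) for c in isbn]
        return sum(w * v for w, v in zip(range(10, 0, -1), values)) % 11 == 0
    if re.fullmatch(r"[0-9]{13}", isbn):
        # ISBN-13: alternating weights 1 and 3, sum divisible by 10.
        return sum(int(c) * (3 if i % 2 else 1) for i, c in enumerate(isbn)) % 10 == 0
    return False


def route_isbn(path):
    """Map a short /<isbn> URL to (status, location)."""
    isbn = path.strip("/").replace("-", "").upper()
    if not is_valid_isbn(isbn):
        return 404, None
    target = BOOKS.get(isbn)
    # A permanent redirect keeps only the friendly URL in the index.
    return (301, target) if target else (404, None)
```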

And, I hope you are enjoying your time in the states!

Peace,
Nate

Felix Geisendörfer said on Aug 02, 2006:

Hi Nate, thanks for your in-depth comment on my little rant on consistent URLs. Let me explain a little further:

Having SEF URLs is great, I love them, and for the last month I've started to dislike sites just for not having them. But if you check out DHH's keynote a little further, up to the point where he talks about ActiveResource (which will be implemented in CakePHP soon as well, afaik), you will realize what advantages you are giving up for them.

He talks about how you can have models based on external resources of other Rails web sites that follow the conventions, and use them just as easily as normal models when using auto-incrementing IDs. Now, he also says you might be able to work with non-auto-incrementing IDs as well, but it will come at a high price. Just as it probably will in CakePHP.

Now, you could still have those "conventional" URLs alongside their aliased buddies, but you know search engines don't like duplicated pages. You could probably even work around that. But again, what you are doing is working AGAINST the conventions and the framework instead of working WITH it. It is possible, but it will come at a high price and you have to decide whether you want to pay it - and I don't! Now, you are not using CakePHP or Rails, so none of that applies to you to the same extent.

Recently I've come to realize that the best framework is not the one that is the easiest to hack and customize, but the one that comes with the SMARTEST set of conventions and approaches to your daily problems. Initially I wrote a lot about how to make CakePHP do what I imagine is best, instead of trying to understand what CakePHP thinks is best. There will always be exceptions to the conventions, but the goal should be to avoid them as much as possible! Doing this will get you higher maintainability for your apps, easier access to support, more simplicity, and just a lot more speed!

Back to SEF URLs: As I said, I like them. But looking at it realistically, I rarely use those nice URLs for anything other than Wikipedia, where I know that freely entering a URL will often be faster than starting off from the main page. How many people will make use of them on the small-business websites most of us are building? 0.5%, 1%, 2%? Not a lot. And search engines? Yes, they like SEF URLs, but consistent URLs are worth more to them. So unless you plan on legacy support after legacy support after legacy support for your website's URLs, auto-incrementing IDs seem to be a much more future-proof concept. Because they follow a very easy pattern, mapping them to whatever new / underlying system will be a matter of 3 lines of code (or less).

For the projects I'm working on with CakePHP right now, I'll stick with auto-incrementing IDs for content pages. Things like categories or tags are a different story, but SEF URLs for content - "It's just not worth it" to me.

ben hirsch said on Aug 03, 2006:

I've been struggling with this for some time as well. It's really nice to hear somebody else is willing to ditch the friendly URLs. Nice article.

Madarco said on Aug 03, 2006:

Choosing a URL can be a difficult task, but the most important thing is to keep it stable in the future: "Cool URIs don't change".
Here is a good style guide: http://www.w3.org/Provider/Style/URI

Nate K said on Aug 03, 2006:

Felix:
I understand completely where you are coming from, and if I were using CakePHP - I would be in the same boat you are. You are correct, search engines care about CONSISTENCY - not so much the name. And, as the example I gave last time with the barbourbooks.com website - I HIGHLY doubt people will know to directly type in a SEF URL - especially because the word order might be different from the title of the book. So - I don't see a big usage overall, but it is a convenience.

And - that was the pattern I decided on in the beginning - and I don't want to change ours now - so I will learn to adapt things from this point forward. Not giving the search engine duplicate pages, but feeding it a 301 permanently moved so that it's not compared as duplicate content and the index will eventually be updated with its new location.

On a side note, I have played with several different PHP frameworks, and I have found CakePHP to be the best to suit my needs, and, as you said, to simplify your day-to-day tools. I am actually going to be using it in some upcoming sites to watch it run in a production environment. I am very impressed with the documentation and ease of use (and a small learning curve).

Anyway, I really enjoy this topic - you've given me several things to think about....

tomo said on Aug 03, 2006:

Interesting conversation. We've had the same problem in the office recently. We've been using Cake for several months now. I was all for custom URLs, and I opted for the exact approach you were using before, Felix, and that still seems like an acceptable solution, except for the part where the user changes the URL keyword. But if you look at it realistically, that happens very, very rarely, and even then it will most likely happen before the SEs index the content and folks link to it. But hey, it's a possibility, so I've been thinking about a compromise:
you could use this instead:

domain.com/category/article_title/article_id

This way, the keyword is still at an acceptable place for the SEs, and if it changes, it doesn't really matter, since you can still find the right content by the article_id.
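
A minimal sketch of this lookup, in Python rather than CakePHP: only the trailing article_id is used to find the content, and the title segment is decorative. The `ARTICLES` dictionary and the `find_article` name are hypothetical stand-ins for a real database query.

```python
import re

# Hypothetical stand-in for the articles table, keyed by primary key.
ARTICLES = {42: {"category": "cakephp", "title": "Bye, bye Friendly URLs"}}


def find_article(path):
    """Resolve /category/article_title/article_id by the id alone."""
    parts = path.strip("/").split("/")
    # Expect exactly three segments, the last one a plain decimal id.
    if len(parts) != 3 or not re.fullmatch(r"[0-9]+", parts[2]):
        return None
    # The category and title segments are ignored for the lookup.
    return ARTICLES.get(int(parts[2]))
```

An old link such as /cakephp/some-old-title/42 still resolves to the same article, because the title segment never takes part in the lookup.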

Felix Geisendörfer said on Aug 03, 2006:

tomo: Ah that's interesting, so you suggest to place the title in the url without actually using it for anything, but rather for SEF'ness only. So only the article_id would be considered for identifying the item.

That actually is a very nice idea. A bit dirty, but it takes away the complexity of having to identify things by the title, keeps URLs intact even if the title changes, and still leaves the juicy buzzwords in for the SEs. Pretty cool!

I just thought of another trick for preventing users from changing the article_title after a while: Just use the created field to see how much time has passed since the item was created, and present a warning / "yes-no" dialog to the user after a certain number of hours if he wants to change the article_title for the URL.
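
As a rough sketch (Python, with an assumed 24-hour threshold and a hypothetical function name), the check on the created field could look like this:

```python
from datetime import datetime, timedelta

# Assumed grace period; the right number of hours is a judgment call.
GRACE_PERIOD = timedelta(hours=24)


def needs_confirmation(created, now, grace=GRACE_PERIOD):
    """Return True if a title change should trigger the yes/no warning."""
    return now - created > grace
```

For example, `needs_confirmation(datetime(2006, 8, 1, 12), datetime(2006, 8, 3, 12))` is true, because two days have passed since the item was created.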

Hm, the only problem with your solution is that search engines might believe you've got duplicated content on your page if they somehow index several pages with the same article_id but different article_titles. To prevent that, you could check whether the article_title in the URL matches the current article_title of the item with that article_id, and if not, as Nate K suggested, redirect with a 301 to the right URL. Even though this might not always get your backlink power redirected, it should at least get you out of the trouble of having "duplicated content".

Thanks for this idea, tomo. I like it quite a bit and am going to play around with it. If it works out well, I'll publish my source for others to reuse (and give you some credit for it of course ^^). Interesting conversations like this (including all the other commenters as well) are one of the major benefits of maintaining a PHP/web-dev blog like this!
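
The title check and 301 redirect described above can be sketched like this. It is a Python illustration using tomo's /category/article_title/article_id layout, not CakePHP code; `ARTICLES`, `slugify`, `canonical_path`, and `respond` are hypothetical names, and the slug rule is a deliberately simple assumption.

```python
import re

# Hypothetical stand-in for the articles table, keyed by primary key.
ARTICLES = {42: {"category": "cakephp", "title": "Bye, bye Friendly URLs"}}


def slugify(title):
    """Lowercase the title and join its alphanumeric runs with hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def canonical_path(article_id):
    """Build the one URL under which an article should be indexed."""
    article = ARTICLES[article_id]
    return "/{}/{}/{}".format(
        article["category"], slugify(article["title"]), article_id
    )


def respond(path):
    """Return (status, location) for /category/article_title/article_id."""
    match = re.fullmatch(r"/[^/]+/[^/]+/([0-9]+)", path)
    if match is None or int(match.group(1)) not in ARTICLES:
        return 404, None
    canonical = canonical_path(int(match.group(1)))
    if path != canonical:
        # Stale or mistyped title: redirect permanently to the current URL.
        return 301, canonical
    return 200, canonical
```

With this in place, every variant of the URL for article 42 collapses onto a single address, so only one version should end up indexed.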

One thing I would probably do differently is to use a controller/action/article_id/article_title format, since this would be closer to the CakePHP way of doing things.

Peter Goodman said on Aug 03, 2006:

For some reason, I don't know why, a lot of people are skipping over a very obvious solution:

blah.com/news/1/some_test_post

where 'some_test_post' is the friendly version of the name but serves no actual purpose and where '1' is the primary key id in the database for that news post. Simple and elegant.

Peter Goodman said on Aug 03, 2006:

Damn, seems someone beat me to it :P

tomo said on Aug 03, 2006:

Thanks for the kind reply. I'm always glad to start a productive conversation, and if you really play around with it and release the source, then it's obvious why. :)

As for the order of variables in the URL, controller/action/article_id/article_title is probably more human/Cake usable, but the other version is much better for the SEs: the further left something is in the URI, the more value the SEs put on it. That's why I reversed it. But hey, it doesn't change much anyway.

Keep up the good work with the site.

@Peter Goodman:
;)

Felix Geisendörfer said on Aug 03, 2006:

Hey tomo: Yes, search engines like things to be as far left as possible, but humans/conventions are more important to me than an officially unconfirmed and probably very minimal factor for the rankings of certain search engines.

Steve said on Aug 04, 2006:

I do often wonder how much importance major search engines really put on the url string, however I agree with Felix, it's probably not worth the sacrifice.

[...] Some things take me a little longer, it must be my age. Be that as it may, various blogs (e.g. Felix and Andreas) have already written about “friendly” URLs, that is, more visually appealing URLs for blog entries. [...]

[...] Not too long ago you've all heard me saying Bye, bye Friendly URL’s. Now, after a lot of feedback and some more thoughts on it I came to what I find a nice compromise between REST'ness, SEO'ness, and FRIENDLY'ness. [...]

Jason said on Jan 11, 2007:

Hi there,

Friendly URLs are more or less crucial for serious sites and SEO work, however, the process you described here is definitely something not worth working on, since it is overcomplicating the matter.

I find that a normal procedural approach to PHP plus mod_rewrite works quite nicely for my sites, while CakePHP ought to be routed with its own routing system, but I haven't yet worked out how to properly make the URLs work the way I want them to.

Cheers,

Jason
www.flexewebs.com

Felix Geisendörfer said on Jan 11, 2007:

Jason: I've found a way that solves all friendly URL problems I used to have, see my post above: http://www.thinkingphp.org/2006/09/18/dessert-11-welcome-back-friendly-urls/.

PS: When I wrote this I typed your name and saw I wrote down "JSON" ... Too much DOM/Ajax scripting over the past days, I'm telling you ... ; ).

[...] ThinkingPHP » Bye, bye Friendly URL’s. Does it have to be like this? [...]

Jason said on Jan 31, 2007:

Hi Felix,

Like I said, I find that classic PHP plus a bit of mod_rewrite does the job very nicely.

If you look at what I have put together at www.flexewebs.com it all works with mod_rewrite and classic PHP4 with scripts that feature IDs and lots of other parameters, but are hidden via the mod_rewrite.

I am increasingly finding it to be the case that whatever approach one takes to programming web pages or anything else for that matter, the time consumption ends up being very similar.

I will write more about this on my blog at some point and share my own feelings on the matter. I think it is very important for people to realise and understand it.

Cheers,

Jason (JSON) :-)

Felix Geisendörfer said on Jan 31, 2007:

Jason: No offense, but session IDs in every URL? Come on, that's a real SEO killer and destroys whatever advantages mod_rewrite can offer to somebody ; ). Other than that, I consider myself to be the obnoxious guy who is looking for the "perfect solution", so I always try to shoot for something a little more than "does the job" ; ). Take a look at the other article I gave you the link to above and let me know what you think about it.

Baris Balic  said on Jul 03, 2007:

I think my main quarrel with this whole scheme is that we have a system that uses an intuitive, obvious approach to handling unique objects... with unique IDs. Now, if we are going to present it to a user, then reshaping those identifiers into something more palatable seems both sensible and simple, for what's effectively a one-way transaction.
The problem is that we're taking this highly appropriate method of identifying objects and then hiding it so that another system (usually a search engine in my case) may index it in a particular way, applying its own set of identifying rules internally. For inter-system transactions, this should not be necessary, it seems fallic.

3-bids said on Jan 23, 2008:

How are people redirecting http://domain.com to http://www.domain.com in Cake? I've tried several different redirects, but I get a 404 at some level. Thanks.

Chris said on May 01, 2008:

@3-bids

Do that via your .htaccess. Try something like this:

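RewriteEngine On
RewriteBase /
# Match only the bare domain (dot escaped, pattern anchored at both ends)
RewriteCond %{HTTP_HOST} ^yourdomain\.com$ [NC]
# Permanently redirect to the www host, keeping the requested path
RewriteRule ^(.*)$ http://www.yourdomain.com/$1 [L,R=301]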

Silver Knight  said on Aug 07, 2008:

Firstly, I'd like to comment to the blog owners Felix and Tim to thank them for this fantastic blog. It's been a valuable source of learning for me about many things Cake-related. Thank you for your time and effort. It is highly appreciated.

Secondly, I'd like to comment to the idiots who TRY to post spam links in the comments. Are you people morons, or what? Read a little bit about rel="nofollow" some time. You are NOT improving your search engine rankings one little bit with this slimy practice. You are only polluting the comments threads with worthless garbage that not only does not help anyone ELSE, but it ALSO doesn't even help YOU at all. The search engine robots will not index your lame links to acne-cream or futures-trading or whatever pathetic product you are trying to hawk.

(If you wonder what I am talking about, mouse over some of the links a few comments back to see where they lead, and then examine the source for the page. Notice that these links are all rel="nofollow"?)

I hate to hate, but this practice of polluting comment threads in blogs is one thing I truly DO hate. People who do this sort of thing should not be allowed to own a computer.

Well, enough of my rant. The real reason for this comment still is to thank the owners of this blog for all they do to help the CakePHP community. The world needs more people like you. Thank you again for all the good you do in the world of web design and PHP coding.

Tim Koschützki said on Aug 07, 2008:

Silver Knight: Thanks a lot for your comment. Comments like yours are what keep us going here. :)

As for the spam guys: Listen to Silver Knight - you aren't making anybody happy, not even yourselves. : /

Silver Knight: Yeah the html parsing needs some more work. I hope we will get to that soon. Thanks again for your kind comment!
