Aggregating feeds isn’t all that simple

As I mentioned a few times, one of my first start-up ideas was an RSS aggregator.  It was back in 2005 or so, before Google Reader was even alive.  Bloglines was the coolest tool, if I remember correctly, and it sucked badly.  I got together with a few friends of mine and we started coding.  It was an interesting challenge both technically and aesthetically.  But we got it to the point where it actually worked and wasn’t all too bad.  It was a weird mixture of Python, Perl, and PHP though.

Eventually, it became too much work.  We couldn’t figure out how to monetize the thing.  And Google Reader was announced.  That sort of killed the project.

A few month back, when the announcement of Google Reader’s end of life came out, I looked at the alternatives and wasn’t pleased.  I thought with all the technical advances in the last few years, and with my own improved knowledge, I could attempt the task again.  Yes, I know, I am hopeless optimist in a lot of matters.

At least this time it took just a few days to convince me not to pursue the goal.  Alternatives are plentiful.  Each and every one of them is light years ahead.  I still don’t enjoy front-end development.  And I still have no clue as to how to monetize it.  So, the Subs Reader got frozen.  At least I got it all in frameworks, and left it in the Open Source state.  If I ever will have another try, I can pick up from here.

One of the biggest mistakes I’ve done the last time, was not documenting the project’s process at all.  I vaguely remember that I didn’t sleep for a few nights, trying to figure out all kinds of problems.  But what were they, I don’t remember.

Today, I came across a blog post which lists similar problems that I had to solve, but in greater number and variety.  Even if you aren’t thinking about writing your own RSS reader any time soon (or ever), you should still read through the Brian’s stupid feed tricks.  First of all, they clearly illustrate how much complexity is hiding in the details.  Secondly, they show non-standard is the web in general and RSS in particular.  If you do any kind of web crawling, you’d probably see half of the same issues in your application.  Thirdly, even if you aren’t crawling the web at all, but just code a web application or an API to one, you’ll many places where you can go wrong without noticing it.  All in all, it’s a great list of problems that everybody involved in web development can learn from.