Archiving web sites

LWN has an interesting article covering different ways of archiving a website.  It sounds trivial, but it’s not.  Even the simplest approach – wget – will probably take you a few dozen attempts before you arrive at something like the following:

$ wget --mirror --execute robots=off --no-verbose --convert-links \
       --backup-converted --page-requisites --adjust-extension \
       --base=./ --directory-prefix=./ --span-hosts \
       --domains=www.example.com,example.com http://www.example.com/

There are a few other interesting tools (like pywb) mentioned in the article as well.

Killed by Google

Killed by Google is a long list of Google projects which are no more.  It looks sad and depressing, yet very impressive.  Google killed way more projects than most companies would even start.

And in all that long list, the one that still pains me the most is Google Reader.

And if you want to see this in a different design, have a look at Google Cemetery.

Stack Overflow Buddy

An innocent joke on Facebook brought in something really golden – Stack Overflow Buddy.  It is a fun PHP library for all those of you who search for code examples on Stack Overflow and then copy-paste those into your projects.

Wow, how’s it work?

If you’re impressed, you should probably stop reading here.

  1. Split the camelCased function call into words
  2. Grab the top scoring PHP tagged questions with those words in the title from StackOverflow’s API
  3. Grab the top scoring answers for those questions
  4. Pull any and all code blocks from those answers
  5. Find the first code block that:
    1. Interprets without error
    2. Contains one or more functions
    3. One of the functions has the same number of arguments as were passed by the user
  6. Then we throw caution to the wind, eval, and call the new method!

This is absolutely brilliant!
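
To make steps 1 and 5 of the list above a bit more concrete, here is a rough sketch.  It is in Python rather than PHP, and it is not the library’s actual code: the function names (split_camel_case, find_matching_function) are made up for illustration, and Python’s ast.parse stands in for “interprets without error”, which only checks that the snippet is syntactically valid.  The Stack Overflow API calls (steps 2 to 4) and the eval (step 6) are deliberately left out.

```python
import ast
import re


def split_camel_case(name):
    """Step 1: split a camelCased name into lowercase words."""
    # Acronym runs (HTML), capitalised or lowercase words, and digit runs.
    words = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", name)
    return [w.lower() for w in words]


def find_matching_function(code_blocks, arg_count):
    """Step 5: return (block, function_name) for the first usable block.

    A block is usable if it parses, defines at least one top-level
    function, and one of those functions takes arg_count positional
    parameters.  Returns None if no block qualifies.
    """
    for block in code_blocks:
        try:
            tree = ast.parse(block)
        except (SyntaxError, ValueError):
            continue  # 5.1: the block is not valid code, skip it
        # 5.2: collect top-level function definitions (may be empty)
        functions = [n for n in tree.body if isinstance(n, ast.FunctionDef)]
        for func in functions:
            # 5.3: compare the parameter count with the caller's arguments
            if len(func.args.args) == arg_count:
                return block, func.name
    return None


if __name__ == "__main__":
    # Hypothetical snippets standing in for code pulled from answers.
    blocks = [
        "this is not code (",
        "x = 1",
        "def add(a, b):\n    return a + b",
    ]
    print(split_camel_case("sortArrayByKey"))
    print(find_matching_function(blocks, 2))
```

Only the selection logic is shown here.  Actually executing whatever block comes back is the part that throws caution to the wind, so it is best kept to jokes like this one.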

Your Site on Google

Here’s something I’ve never seen before.  When searching for something on Google, I got a new widget “Your Site on Google” right above the search results.  Erm … what?

It took me a second to figure it out.  Google, of course, knows who I am, since I am logged into Google Chrome and into all of my Google Apps.  It also knows that I manage the “mamchenkov.net” website, via Google Search Console.  So when I search for something on Google and the first page of results includes a page from my own blog, it must assume that I’m there to monitor, test and improve my SEO.  It then provides me with some metrics and handy links to do so.  It also mentions that these are only visible to me, not to the rest of the people searching.

I don’t think there’s anything wrong with it, but it did feel weird for a second.