The Grand Theory of Amazon

As a heavy user of Amazon Web Services, I often find myself in deep discussions about Amazon as a company, its broad portfolio of brands, the way it makes money, and its strategy going forward.

Admittedly, that’s not an easy area to understand, let alone explain or argue about. That’s why I really enjoyed this video. It oversimplifies a lot of things, but it does a nice job of shedding some light on what is going on now, where it is heading, and how Amazon is similar to and different from some other companies.

Most of What You Read on the Internet is Written by Insane People

“Most of What You Read on the Internet is Written by Insane People” is a nice little roundup of statistics from several large sites like Wikipedia, Amazon, YouTube, Reddit, etc. These stats support the viewpoint that on these huge sites, most of the content is generated by a very small number of users.

Inequalities are also found on Wikipedia, where more than 99% of users are lurkers. According to Wikipedia’s “about” page, it has only 68,000 active contributors, which is 0.2% of the 32 million unique visitors it has in the U.S. alone.
Wikipedia’s most active 1,000 people — 0.003% of its users — contribute about two-thirds of the site’s edits. Wikipedia is thus even more skewed than blogs, with a 99.8–0.2–0.003 rule.
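The percentages in that quote are easy to sanity check. Here is a minimal Python sketch that uses only the three figures quoted above; the variable and function names are mine, and the lurker share is simply whatever is left after subtracting the active contributors.

```python
# Figures taken from the quote above (Wikipedia's "about" page, as cited there).
us_unique_visitors = 32_000_000
active_contributors = 68_000
most_active = 1_000


def share_percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


contributors_pct = share_percent(active_contributors, us_unique_visitors)
most_active_pct = share_percent(most_active, us_unique_visitors)
# Everyone who is not an active contributor is counted as a lurker here.
lurkers_pct = 100.0 - contributors_pct

print(f"active contributors: {contributors_pct:.1f}%")
print(f"most active 1,000:   {most_active_pct:.3f}%")
print(f"lurkers:             {lurkers_pct:.1f}%")
```

Rounded, these come out as 0.2%, 0.003%, and 99.8%, which is where the 99.8–0.2–0.003 rule comes from.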

Some of these numbers are staggering. And the people who do the work are indeed – insane. Not medically, but in how far they deviate from the rest of the user base, or even the population, in how much they do and for how long.

By the way, pretty much all posts in this very blog have been written by one person. Me. Almost 10,000 posts over 19 years. So yes, I’m also probably a little bit insane.

The Millions Silicon Valley Spends on Security for Execs

There’s plenty of talk about security when it comes to giant technology companies, like Google, Facebook, Amazon, and Apple. But that’s usually from the perspective of software security and end-user privacy. Here’s a different perspective on the subject – “The Millions Silicon Valley Spends on Security for Execs“.

Apple’s most recent proxy statement, filed earlier this month, shows the company spent $310,000 on personal security for CEO Tim Cook. But that’s a fraction of other tech giants’ expenditures.
Amazon and Oracle spent about $1.6 million each in their most recent fiscal years to protect Jeff Bezos and Larry Ellison, respectively, according to documents filed with the US Securities and Exchange Commission. And Google’s parent company, Alphabet, laid out more than $600,000 protecting CEO Sundar Pichai and almost $300,000 on security for former executive chair Eric Schmidt. In 2017, Intel spent $1.2 million to protect former CEO Brian Krzanich. Apple, Google, Intel, and Oracle declined to comment; Amazon did not respond to a request for comment.
Facebook CEO Mark Zuckerberg was the costliest executive to protect; Facebook spent $7.3 million on his security in 2017, and last summer the company told investors that it anticipated spending $10 million annually.

Well, that’s pretty impressive in terms of money! But do they really need it? They do, at least to some degree:

While Silicon Valley firms haven’t disclosed many threats to the safety of their executives or offices, they have good reason to take precautions. In December, Facebook evacuated its headquarters after the company received a bomb threat. Last year an unhappy YouTube user entered the company’s San Bruno, California, headquarters and shot three employees before killing herself. And in 1992 the president of Adobe, Charles Geschke, was kidnapped at gunpoint and rescued by the FBI.

Do you still dream of being an executive in a large company?

AWSome Day Athens 2018


Last week I attended the AWSome Day Athens 2018 (huge thanks to Qobo for the opportunity).  There aren’t that many technology events in Cyprus, so I’m constantly on the lookout for events in Europe.

AWSome Day Athens is part of Amazon’s AWSome Day Global Series, a set of one-day events organized all over the world.  The events feature speakers from both the Amazon AWS team and some of their prominent clients from the area.  AWSome Day Athens 2018 was done in partnership with Beat.

Continue reading “AWSome Day Athens 2018”

Amazon Snowmobile – a truck with up to 100 Petabytes of storage


Back in my college days, I had a professor who frequently used Andrew Tanenbaum‘s quote in the networking class:

Never underestimate the bandwidth of a station wagon full of tapes hurtling down the highway.

I guess he wasn’t the only one, as during this year’s Amazon re:Invent 2016 conference, the company announced, among other things, an AWS Snowmobile:

Moving large amounts of on-premises data to the cloud as part of a migration effort is still more challenging than it should be! Even with high-end connections, moving petabytes or exabytes of film vaults, financial records, satellite imagery, or scientific data across the Internet can take years or decades. On the business side, adding new networking or better connectivity to data centers that are scheduled to be decommissioned after a migration is expensive and hard to justify.

[…]

In order to meet the needs of these customers, we are launching Snowmobile today. This secure data truck stores up to 100 PB of data and can help you to move exabytes to AWS in a matter of weeks (you can get more than one if necessary). Designed to meet the needs of our customers in the financial services, media & entertainment, scientific, and other industries, Snowmobile attaches to your network and appears as a local, NFS-mounted volume. You can use your existing backup and archiving tools to fill it up with data destined for Amazon Simple Storage Service (S3) or Amazon Glacier.

Thanks to this VentureBeat page, we even have a picture of the monster:

[Image: aws-snowmobile]

100 Petabytes on wheels!

I know, I know, it looks like a regular truck with a shipping container on it.  But I’m pretty sure it’s VERY different on the inside.  With all that storage, networking, power, and cooling needed, it would be awesome to take a peek into this thing.
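To put Tanenbaum’s quote into numbers, here is a rough back-of-the-envelope sketch in Python.  The assumptions are all mine, not Amazon’s: a decimal petabyte, a dedicated 10 Gbit/s link running at full speed around the clock, and a hypothetical two weeks for the truck to be filled, driven, and unloaded.

```python
PETABYTE = 10**15  # bytes; decimal petabyte is an assumption, the quote does not say
SECONDS_PER_DAY = 86_400


def transfer_days(size_bytes, link_gbps, utilization=1.0):
    """Days needed to push size_bytes through a link of link_gbps gigabits per second."""
    bits = size_bytes * 8
    bits_per_second = link_gbps * 10**9 * utilization
    return bits / bits_per_second / SECONDS_PER_DAY


def effective_gbps(size_bytes, days):
    """Average throughput, in Gbit/s, of moving size_bytes in the given number of days."""
    return size_bytes * 8 / (days * SECONDS_PER_DAY) / 10**9


snowmobile = 100 * PETABYTE

# Assumed: a dedicated 10 Gbit/s link, fully utilized.
days_over_wire = transfer_days(snowmobile, link_gbps=10)

# Assumed: two weeks door to door for one truck (hypothetical figure).
truck_gbps = effective_gbps(snowmobile, days=14)

print(f"100 PB over 10 Gbit/s: {days_over_wire:.0f} days")
print(f"100 PB by truck in 14 days: {truck_gbps:.0f} Gbit/s on average")
```

Under those assumptions the network transfer takes about 926 days, roughly two and a half years, while the truck averages about 660 Gbit/s.  The station wagon still wins.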

 

 

Top 29 books on Amazon from Hacker News comments


[Image: hacker-news-books]

I came across this nice visualization of “Top 29 books ranked by unique users linking to Amazon in Hacker News comments“.

Amazon product links were extracted and counted from 8.3M comments posted on Hacker News from Oct 2006 to Oct 2015.

Most of these are, not surprisingly, on programming and design.  A few are on startups and business.  Some are on how to have a good life.  Which is a bit weird.
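I don’t know how the author of the visualization actually did the extraction, but the idea of ranking by unique users is simple enough to sketch.  The Python below is my own guess at it: the URL pattern (a 10-character product ID after /dp/ or /gp/product/), the function name, and the sample comments are all assumptions for illustration.

```python
import re
from collections import defaultdict

# Assumed URL shapes: amazon.com/.../dp/<ID> and amazon.com/gp/product/<ID>,
# where <ID> is a 10-character product identifier.
AMAZON_LINK = re.compile(
    r"amazon\.com/(?:[^\s<>]*?/)?(?:dp|gp/product)/([A-Z0-9]{10})"
)


def rank_products(comments):
    """Rank product IDs by the number of distinct users who linked to them.

    comments is an iterable of (username, comment_text) pairs.
    """
    users_by_product = defaultdict(set)
    for user, text in comments:
        for product_id in AMAZON_LINK.findall(text):
            users_by_product[product_id].add(user)
    counts = {pid: len(users) for pid, users in users_by_product.items()}
    # Most unique users first; ties are broken by product ID for stable output.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Made-up sample comments, only to exercise the function.
sample_comments = [
    ("alice", "See https://www.amazon.com/dp/0131103628 for details"),
    ("bob", "http://www.amazon.com/C-Programming-Language/dp/0131103628/ref=sr_1_1"),
    ("alice", "Again: https://www.amazon.com/dp/0131103628"),
    ("carol", "https://www.amazon.com/gp/product/0262033844"),
]

ranking = rank_products(sample_comments)
print(ranking)
```

Counting users in a set rather than counting links means that one enthusiastic commenter posting the same book ten times still counts once.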

Support lesson to learn from Amazon AWS


I’ve said a million times how happy I am with Amazon AWS.  Today I also want to share a positive lesson to learn from their technical support.  It’s the second time I’ve contacted them over the last year and a half, and it’s the second time I am amazed at how well it works.

In my experience, technical support departments usually rely on one primary communication channel – be that a telephone, an email, a ticketing system, or a live chat.  The other channels are often just routed or converted into the main one, or even completely ignored.  But each one of those has its benefits and side effects.

Telephone provides the most immediate connectivity, and the much valued option of human interaction.  But the communication is verbal, often without a paper trail.  It makes it difficult to carbon copy (CC) people on the conversation or review exactly what has been said.  It is also very free form and unstructured.

Live chat is also free form and unstructured, but it’s written, so transcripts are easily available.  It also helps with the carbon copy, but only on the receiving end – supervisors or field experts can often be included in the conversation, but adding somebody from the requesting side is rarely supported.

Email makes it easy to carbon copy people on both ends.  It provides the paper trail, but often lacks the immediate response factor.  And it’s still unstructured, making it difficult to figure out what was requested, what has been discussed and whether or not there was any resolution.  (Have you ever been a part of a lengthy multi-lingual conversation about, what turned out to be, multiple issues in the same thread?)

Ticketing/support systems help to structure the conversation and make it follow a certain workflow.  But they often lack humanity and, much like emails, the immediate response.

Now, what Amazon AWS support has done is a beautiful combination of a ticketing system and a phone.  You start off with the ticketing system – log in, create a new support case, provide all the necessary information, and optionally CC other people from a single short form.  The moment you submit it, the web page asks for your phone number.  Once entered, a phone call is placed immediately by the system, connecting you to the support engineer.  The engineer confirms a few case details and lets you know that the case is in progress and what the expected resolution time is (I was asking to raise the limit of the Elastic IP addresses on the Virtual Private Cloud, and I was told it would be done in the next 15 to 30 minutes.  And it was done in 10!).  I also received two emails – one confirming the opening of the case, with all the requested details, and another one notifying me that the work had been done, providing quick information on how to follow up, in case I needed to.

The overall experience was very smooth, fast, to the point, and very effective.  I never got lost.  I never had to figure anything out.  And my problem was attended to and resolved immediately.

I only wish more companies provided this level of support.  I’ll sure try to – but it’s a bar set high.

 

 

Top level domain nonsense and how it can break your stuff


Call me old school, but I really (I mean REALLY) don’t like the recent explosion of top level domains.  I understand that most good names are taken in the .com, .org, and .net zones, but do we really need all those .blue, .parts, and .yoga TLDs?

Why am I whining about all this all of a sudden?  I’ll tell you why.  Because a new top level domain – .aws – is about to be introduced, and it already broke something for me in a non-obvious manner.

[Image: aws]

I manage a few Virtual Private Clouds on Amazon AWS.  Many of these use and rely on some hostname naming convention (yeah, I’m familiar with the pets vs. cattle idea).  Imagine you have a few servers, which are separated into generic infrastructure and client segments, like so:

  • bastion.aws.example.com
  • firewall.aws.example.com
  • lb.aws.example.com
  • web.client1.example.com
  • db.client1.example.com
  • web.client2.example.com
  • db.client2.example.com
  • … and so on.

Working with such long FQDNs (fully qualified domain names) isn’t very convenient.  So add “search example.com” to your /etc/resolv.conf file and now you can use short hostnames like firewall.aws and web.client1.  And life is beautiful …

… until one day, when you see the following:

user@bastion.aws$> ssh firewall.aws
Permission denied (publickey).

And that’s when your heart misses a beat, the world freezes, and you go: “WTF?”.  All kinds of thoughts are rushing through your head.  Is it a typo?  Am I in the right place? Did the server get compromised?  How’s that for a little panic …

Trying a few things here and there, you manage to get into the server from somewhere else.  You are very careful.  You are looking around for any traces of the break-in, but you see nothing.  You dig through the logs both on the server and off it.  Still nothing.  You dive into all those logwatch and cron messages in your Trash, which you were automatically deleting because things had been working fine for so long.  There!  You find that cron was complaining that the backup script couldn’t get into this machine.  Uh-oh.  This has been happening for a few days now.  A black cloud of combined worry about the compromised machine and the outdated backups kills the sunlight in your life.  Dammit!

Take a break to calm down.  Try to think clearly.  Don’t panic.  Stop assuming things, and start troubleshooting.

A few minutes later, you establish that the problem is not limited to that particular machine.  All your .aws hosts share this headache.  A few more minutes later, you learn that ‘ssh firewall.aws.example.com’ works fine, while ‘ssh firewall.aws’ still doesn’t.

That points toward a hostname resolution issue.  With that, it takes only a few more moments to see the following:

user@bastion.aws$> host firewall.aws
firewall.aws has address 127.0.53.53
firewall.aws mail is handled by 10 your-dns-needs-immediate-attention.aws.

Say what?  That’s not at all what I expected.  And what is it that I need to fix with my DNS?  A Google search brings up this beauty:

This is problably because the .dev and .local are now valid top level extensions.

Really? Who’s the genius behind that?  I thought people chose those specifically to make them internal.  So is there an .aws top level extension now too?  You bet there is!

Solution?  Well, as far as I am concerned, from this day onward, I don’t trust the brief hostnames anymore.  It’s FQDN or nothing.
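For the curious, here is my understanding of why the short names broke, as a small Python simulation rather than real DNS.  It assumes glibc-style resolver rules: a name with at least “ndots” dots (the default is 1) is tried as-is first, and the search list is only applied if that fails.  The function names, the dict standing in for DNS, and the 10.0.0.5 address are all made up for illustration.  The 127.0.53.53 address is the one from the output above; as far as I know, it is the special “controlled interruption” address that ICANN uses to flag name collisions with newly delegated TLDs.

```python
COLLISION_ADDRESS = "127.0.53.53"  # address seen in the `host` output above


def lookup_order(name, search_domains, ndots=1):
    """Return candidate FQDNs in the order a glibc-style stub resolver tries them."""
    if name.endswith("."):
        return [name]  # already absolute, the search list is not applied
    as_is = name + "."
    with_search = [f"{name}.{domain}." for domain in search_domains]
    if name.count(".") >= ndots:
        return [as_is] + with_search  # enough dots: try the name as-is first
    return with_search + [as_is]


def resolve(name, search_domains, zone, ndots=1):
    """Resolve against a dict standing in for DNS; return (fqdn, address) or None."""
    for candidate in lookup_order(name, search_domains, ndots):
        if candidate in zone:
            return candidate, zone[candidate]
    return None


search = ["example.com"]

# Before .aws existed as a TLD: only the internal record resolves (address is hypothetical).
dns_before = {"firewall.aws.example.com.": "10.0.0.5"}
# After .aws was delegated: the bare name now resolves too, to the collision address.
dns_after = dict(dns_before, **{"firewall.aws.": COLLISION_ADDRESS})

before = resolve("firewall.aws", search, dns_before)
after = resolve("firewall.aws", search, dns_after)
print("before:", before)
print("after: ", after)
```

Since firewall.aws contains a dot, the as-is lookup goes first, and once .aws answers, the search list never gets a chance.  127.0.53.53 is in the loopback range, so ssh was presumably talking to the local machine, which would explain the “Permission denied (publickey)” instead of a timeout.  Single-label names like firewall would still go through the search list first under these rules.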

CPU Steal Time. Now on Amazon EC2


Yesterday I wrote a blog post, trying to figure out what CPU steal time is and why it occurs.  The problem with that post was that I didn’t go deep enough.

I was looking at this issue from the point of view of a generic virtual machine.  The case that I had to deal with wasn’t exactly like that.  I saw the CPU steal time on an Amazon EC2 instance.  Assuming that these were just my neighbors acting up or Amazon having a temporary hardware issue was the wrong conclusion.

That’s because I didn’t know enough about Amazon EC2.  Well, I’ve learned a bunch since then, so here’s what I found.

Continue reading “CPU Steal Time. Now on Amazon EC2”