Amazon Rekognition – Image Detection and Recognition Powered by Deep Learning

I know, I know, this blog is turning into an Amazon marketing bullhorn, but what can I do? The Amazon re:Invent 2016 conference turned into an exciting stream of news for the regular Joe, like yours truly.

This time, Amazon announced Rekognition, an image detection and recognition service powered by deep learning.  This is yet another area that has traditionally been difficult for computers.

As with the other AWS services, I was eager to try it out.  So I grabbed a few images from my Instagram stream and uploaded them to the Rekognition Console.  I don’t think Rekognition actually uses Instagram to learn about the tags and such (but it is possible).  Just to make it a bit more difficult, I used generic image names like q1.jpg, q2.jpg, etc.

Here are the results.  Firstly, the burger.

rekognition-burger

This was spot on, with burger, food, and seasoning identified as labels.  The confidence for burger and food was almost 99%, which is correct.

Then, the beer can with a laptop in the background.

rekognition-beer

Can and tin labels are at 98% confidence. Beverage, drink, computer and electronics are at 69%, which is not bad at all.

Then I decided to try something with people.  Here goes my son Maxim, in a very grainy, low-light picture.

rekognition-maxim

People, person, human at 99%, which is correct.  Portrait and selfie at 58%, which is accurate enough.  And then female at 53%, which is not exactly the case.  But with him still being a kid, that’s not too terrible.

Let’s see what it thinks of me then.

rekognition-leonid

Human, people, person at 99% – yup. 98% for beard and hair is not bad.  But it completely missed out on the duck! :)  I guess it returns a limited number of labels, and while the duck is pretty obvious, the size of it, compared to how much of the picture is occupied by my ugly mug, is insignificant.

Overall, these are quite good results.  This blog post covers a few other cases, like figuring out the breed of a dog and the emotional state of people in the picture, which is even cooler than my tests.
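For the curious, here is a minimal sketch of how the same kind of check might be scripted instead of clicked through in the console.  It assumes the boto3 library and configured AWS credentials, and uses the DetectLabels call with its MaxLabels and MinConfidence parameters, which is one way the number of returned labels gets capped.  The helper names and the sample response values are made up for illustration; only the response shape (a Labels list with Name and Confidence) follows the API.

```python
def summarize_labels(response, min_confidence=50.0):
    """Return (name, confidence) pairs, highest confidence first."""
    labels = [
        (label["Name"], round(label["Confidence"], 1))
        for label in response.get("Labels", [])
        if label["Confidence"] >= min_confidence
    ]
    return sorted(labels, key=lambda pair: pair[1], reverse=True)


def detect_labels_in_file(path, max_labels=10, min_confidence=50.0):
    """Send a local image to Rekognition (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so summarize_labels works without it

    client = boto3.client("rekognition")
    with open(path, "rb") as image_file:
        response = client.detect_labels(
            Image={"Bytes": image_file.read()},
            MaxLabels=max_labels,
            MinConfidence=min_confidence,
        )
    return summarize_labels(response, min_confidence)


# Hypothetical response in the DetectLabels shape; the values are invented.
sample_response = {
    "Labels": [
        {"Name": "Can", "Confidence": 98.2},
        {"Name": "Beverage", "Confidence": 69.4},
        {"Name": "Plant", "Confidence": 41.0},
    ]
}
print(summarize_labels(sample_response))
```

Calling detect_labels_in_file("q1.jpg") would then return the same kind of list for a real image, with anything below the confidence threshold dropped.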

Pricing-wise, I think the service is quite affordable as well:

rekognition-pricing

$1 USD per 1,000 images is very reasonable.  The traditional Free Tier allows for 5,000 images per month.  And API calls that accept more than one image per call are still counted as a single image.
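To get a feel for the numbers, here is a rough back-of-the-envelope calculator.  It assumes a flat $1 per 1,000 images with the 5,000 free images subtracted first, and ignores any volume discounts or free tier expiry, so treat it as an estimate and not a billing tool.  The function name is my own.

```python
PRICE_PER_1000_IMAGES = 1.00  # USD, as quoted above
FREE_TIER_IMAGES = 5000       # free images per month, as quoted above


def monthly_cost(images, free_tier=True):
    """Estimate the monthly bill in USD for a given number of images."""
    free = FREE_TIER_IMAGES if free_tier else 0
    billable = max(0, images - free)
    return billable * PRICE_PER_1000_IMAGES / 1000


print(monthly_cost(5_000))    # 0.0, fully covered by the free tier
print(monthly_cost(20_000))   # 15.0
```

So even 20,000 images a month would come to about $15 under these assumptions.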

All I need now is a project where I can apply this awesomeness…

Amazon Polly – Text to Speech in 47 Voices and 24 Languages

Amazon announced a new service – Amazon Polly – text to speech in 47 voices and 24 languages.  This part got me intrigued:

Polly was designed to address many of the more challenging aspects of speech generation. For example, consider the difference in pronunciation of the word “live” in the phrases “I live in Seattle” and “Live from New York.” Polly knows that this pair of homographs are spelled the same but are pronounced quite differently. Or, what about the “St.” Depending on the language and the context, this could mean (and should be pronounced) as either “street” or “saint.” Again, Polly knows what to do here. Polly can also deal with units, fractions, abbreviations, currencies, dates, times, and other speech components in sophisticated, language-specific fashion.

I am not much involved with text to speech these days, but I did experiments in this area a few years ago.  Basic text to speech for plain English has been around for a long time.  But support for other languages was always limited, and even with English, the voices always sounded very robotic and often failed to handle the simplest of native language constructs.

I tried Amazon Polly and was blown away by the quality of the synthesis.  Here are the English samples of the text from this blog post:

US English, Kendra, female:

British English, Brian, male:

Welsh English, Geraint, male:

With that, I wanted to see what happens with other languages.  The only other language I speak is Russian, so I pasted the Russian category description into the converter, selected the Russian language, and got this:

Russian, Maxim, male:

That is pretty good!  Going further, I pasted the content of this blog post, which is a quoted story that somebody else wrote.  It has a very informal flow to it and some weird punctuation.  Listen to what it turned into:

Russian, Maxim, male:

You can still make out that it’s a robot and not a human, but it’s way better than anything else I’ve heard so far.  By far!
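I did my tests in the web console, but the same thing can be scripted.  Below is a minimal sketch, assuming the boto3 library and configured AWS credentials.  It uses the SynthesizeSpeech call with the Text, VoiceId and OutputFormat parameters; the helper names and the output file name are my own inventions.

```python
def build_polly_request(text, voice_id="Kendra", output_format="mp3"):
    """Build the keyword arguments for Polly's synthesize_speech call."""
    return {"Text": text, "VoiceId": voice_id, "OutputFormat": output_format}


def synthesize_to_file(text, path, voice_id="Kendra"):
    """Synthesize text and save the audio (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the request builder works without it

    polly = boto3.client("polly")
    response = polly.synthesize_speech(**build_polly_request(text, voice_id))
    with open(path, "wb") as audio_file:
        audio_file.write(response["AudioStream"].read())


print(build_polly_request("Live from New York", voice_id="Maxim"))
```

Something like synthesize_to_file("Hello from the blog", "hello.mp3") would then write an MP3 file, and switching voice_id is all it takes to try another voice or language.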

So, how affordable is this technology now?  The pricing page answer is very simple:

Pay-as-you-go $4.00 per 1 million characters (when outside the free tier).

It also provides some examples of how this pricing converts to real-life scenarios:

polly-pricing-examples
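The math is simple enough to check yourself.  Here is a small sketch that assumes the flat pay-as-you-go rate quoted above and ignores the free tier.  The function name and the 600,000 character text length are just examples I picked.

```python
PRICE_PER_MILLION_CHARS = 4.00  # USD, pay-as-you-go rate quoted above


def polly_cost(characters):
    """Estimate the cost in USD of synthesizing the given number of characters."""
    return characters * PRICE_PER_MILLION_CHARS / 1_000_000


# A hypothetical book-length text of 600,000 characters
print(f"${polly_cost(600_000):.2f}")  # $2.40
```

In other words, having a whole book read aloud costs less than a cup of coffee.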

I don’t know about you, but my mind is blown…

Amazon Lightsail – virtual private servers made easy

Amazon announced a new service – Amazon Lightsail – virtual private servers made easy, starting at $5 per month.

pricing

This is basically a much simplified setup of a few of their services, such as Amazon EC2, Amazon EIP, Amazon IAM, Amazon EBS, Amazon Route 53, and a few others.  Those who don’t want to figure out all the intricacies of the infrastructure setup can just pick a VPS, click a few buttons, and be ready to go, whether they want a plain operating system or an application (like WordPress) already installed.

It’s an interesting move into the lower-level web and VPS hosting.  I don’t think all the hosting companies will survive this, and for those that do, changes are coming, I think.

Amazon Snowmobile – a truck with up to 100 Petabytes of storage

Back in my college days, I had a professor who frequently used Andrew Tanenbaum’s quote in the networking class:

Never underestimate the bandwidth of a station wagon full of tapes hurtling down the highway.

I guess he wasn’t the only one, as during this year’s Amazon re:Invent conference, the company announced, among other things, AWS Snowmobile:

Moving large amounts of on-premises data to the cloud as part of a migration effort is still more challenging than it should be! Even with high-end connections, moving petabytes or exabytes of film vaults, financial records, satellite imagery, or scientific data across the Internet can take years or decades. On the business side, adding new networking or better connectivity to data centers that are scheduled to be decommissioned after a migration is expensive and hard to justify.

[…]

In order to meet the needs of these customers, we are launching Snowmobile today. This secure data truck stores up to 100 PB of data and can help you to move exabytes to AWS in a matter of weeks (you can get more than one if necessary). Designed to meet the needs of our customers in the financial services, media & entertainment, scientific, and other industries, Snowmobile attaches to your network and appears as a local, NFS-mounted volume. You can use your existing backup and archiving tools to fill it up with data destined for Amazon Simple Storage Service (S3) or Amazon Glacier.

Thanks to this VentureBeat page, we even have a picture of the monster:

aws-snowmobile

100 Petabytes on wheels!

I know, I know, it looks like a regular truck with a shipping container on it.  But I’m pretty sure it’s VERY different on the inside.  With all that storage, networking, power, and cooling needed, it would be awesome to take a peek inside this thing.
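To put the “years or decades” claim from the announcement in perspective, here is a rough back-of-the-envelope calculation.  It assumes decimal petabytes (10^15 bytes) and a link that is fully saturated around the clock, which is optimistic.  The 1 Gbps and 10 Gbps link speeds are my own picks, not figures from Amazon.

```python
def transfer_days(petabytes, gbps, utilization=1.0):
    """Days needed to push the given amount of data over a network link."""
    bits = petabytes * 1e15 * 8                       # decimal petabytes to bits
    seconds = bits / (gbps * 1e9 * utilization)       # link speed in bits/second
    return seconds / 86400                            # seconds per day


print(round(transfer_days(100, 10)))  # 926 days, about two and a half years
print(round(transfer_days(100, 1)))   # 9259 days, about 25 years
```

Suddenly a truck driving down the highway for a few days does not look so silly.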

S3 static site with SSL

s3-static-site

“S3 static site with SSL and automatic deploys using Travis” is a goldmine of all those simple technologies tied into a single knot for an impressive result.  It has a bit of everything:

  • Jekyll – simple, blog-aware, static site engine, for managing content.
  • GitHub – for version control of the site’s content and for triggering the deployment chain.
  • Travis CI – for testing changes, building and deploying a new version.
  • Amazon S3 – simple, cheap, web-enabled storage of static content.
  • Amazon CloudFront – simple, cheap, geographically-distributed content delivery network (CDN).
  • Amazon Route 53 – simple and cheap DNS hosting and domain management.
  • Amazon IAM – identity and access management for Amazon Web Services (AWS).
  • Let’s Encrypt – free SSL/TLS certificate provider.

When put all together, these bits allow one to have a fast (static content combined with HTTP/2 and top-level networking) and cheap (Jekyll, GitHub, Travis and Let’s Encrypt are free, with the rest of the services costing a few cents here and there) static website with SSL.
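Just to illustrate the deployment end of that chain, here is a hypothetical sketch of pushing a built site to S3.  This is not the method from the linked article, which relies on Travis for the deploy.  It assumes boto3, configured AWS credentials, and Jekyll’s default _site output directory; the function names and the bucket name are placeholders of mine.

```python
import mimetypes
from pathlib import Path


def plan_uploads(site_dir):
    """Yield (local_path, s3_key, content_type) for every file under site_dir."""
    root = Path(site_dir)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            key = path.relative_to(root).as_posix()
            guessed = mimetypes.guess_type(path.name)[0]
            yield path, key, guessed or "application/octet-stream"


def deploy(site_dir, bucket):
    """Upload the built site to an S3 bucket (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so plan_uploads works without it

    s3 = boto3.client("s3")
    for path, key, content_type in plan_uploads(site_dir):
        # Setting ContentType lets browsers render pages instead of downloading them
        s3.upload_file(str(path), bucket, key, ExtraArgs={"ContentType": content_type})
```

Running deploy("_site", "my-example-bucket") after a Jekyll build would copy everything over, with the content types set so that pages and stylesheets are served properly.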

This is a classic example of how accessible and available modern technology is, if (and only if) you know what you are doing.