An AWS outage – learning the same lessons all over again…

screenshot-2017-02-28-11-49-04Today AWS S3 is down in the US-EAST-1 region – taking many other services down along with it.

Large swaths of the internet are impacted including 1 or 2 of my non-critical sites/applications. However, my mission critical sites/applications are fine. Why? Because if they are mission critical they should never be susceptible to a failure of a single physical location… period.

You could have stayed up today – here are some examples of how:

  1. S3 Static Site Hosting – Distribute the site with Cloudfront with (at least) the standard 24 hour TTL.
  2. S3 Storage – either distribute static content with Cloudfront or replicate your bucket(s) to another region.
  3. S3 Streaming Data – write your streams to either buffer when the region is not available or fail to another region. Data processing (lambda, etc) should continue from either region in the fail over scenario.
  4. EC2 & Autoscaling – Replicate your snapshots and AMIs across regions. Run services in multiple regions or be prepared to fail over to an alternate region.
  5. Big Data workloads:
    1. Be prepared to buffer all writes locally until the services comes back up.
    2. This is equally relevant to network partitions.

Last, but not least, test these regularly. Make sure you know what happens and how your code behaves.

This list is – by no means – comprehensive, but it is a great starting point for most services.

My meta-recommendation is to always be prepared for service interruptions at every level of your architecture. You may choose not to have a 100% redundancy – but do that intentionally and transparently (with business stakeholders). You’ll save yourself a lot of panic on a day like today.

Person Recognition in Images with OpenCV & Neo4j

Time for an update on my ongoing person identification in images project; for all the background you can check out these previous posts:

Analyzing AWS Rekognition Accuracy with Neo4j

AWS Rekognition Graph Analysis – Person Label Accuracy

Person Recognition: OpenCV vs. AWS Rekognition

In my earlier serverless series I discussed and provided code for getting images into S3 and processed by AWS Rekognition – including storing the Rekognition label data in DynamoDB.

This post builds on all of those concepts.

In short – I’ve been collecting comparative data on person recognition using AWS Rekognition and OpenCV and storing that data in Neo4j for analysis.

Continue reading “Person Recognition in Images with OpenCV & Neo4j”