An AWS outage – learning the same lessons all over again…

Today AWS S3 is down in the US-EAST-1 region – taking many other services down along with it.

Large swaths of the internet are impacted, including one or two of my non-critical sites/applications. However, my mission-critical sites/applications are fine. Why? Because if they are mission critical they should never be susceptible to a failure of a single physical location… period.

You could have stayed up today – here are some examples of how:

  1. S3 Static Site Hosting – Distribute the site with CloudFront with (at least) the default 24-hour TTL.
  2. S3 Storage – either distribute static content with CloudFront or replicate your bucket(s) to another region.
  3. S3 Streaming Data – write your streams to either buffer when the region is not available or fail over to another region. Data processing (Lambda, etc.) should continue from either region in the failover scenario.
  4. EC2 & Autoscaling – Replicate your snapshots and AMIs across regions. Run services in multiple regions or be prepared to fail over to an alternate region.
  5. Big Data workloads:
    1. Be prepared to buffer all writes locally until the service comes back up.
    2. This is equally relevant to network partitions.
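
The "fail over or buffer" pattern from items 3 and 5 can be sketched in a few lines. This is a minimal illustration, not production code: `FailoverWriter` and `RegionWriter` are hypothetical names, and each writer is assumed to be a function that sends one record to one region and raises an exception when that region is unreachable (for example, a thin wrapper around your S3 client).

```python
from collections import deque
from typing import Callable, Deque, Sequence

# Hypothetical type: writes one record to one region and raises
# an exception when that region is down or partitioned.
RegionWriter = Callable[[bytes], None]


class FailoverWriter:
    """Try each region in order; buffer locally when all of them fail."""

    def __init__(self, writers: Sequence[RegionWriter], max_buffered: int = 10_000):
        self.writers = list(writers)
        # Bounded local buffer; the oldest records are dropped once it is full.
        self.buffer: Deque[bytes] = deque(maxlen=max_buffered)

    def _try_regions(self, record: bytes) -> bool:
        for write in self.writers:
            try:
                write(record)
                return True
            except Exception:
                continue  # this region failed; try the next one
        return False

    def flush(self) -> bool:
        """Replay buffered records in order; stop at the first failure."""
        while self.buffer:
            if not self._try_regions(self.buffer[0]):
                return False
            self.buffer.popleft()
        return True

    def write(self, record: bytes) -> bool:
        """Return True if the record reached a region, False if it was buffered."""
        # Drain older records first so ordering is preserved.
        if self.flush() and self._try_regions(record):
            return True
        self.buffer.append(record)
        return False
```

The same object covers a regional outage and a network partition: in both cases every writer raises, records accumulate locally, and the next successful `write` or `flush` replays them in order. The bounded buffer is a deliberate trade-off; whether to drop old records or block is a decision to make with your stakeholders.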

Last, but not least, test these regularly. Make sure you know what happens and how your code behaves.

This list is – by no means – comprehensive, but it is a great starting point for most services.

My meta-recommendation is to always be prepared for service interruptions at every level of your architecture. You may choose not to have 100% redundancy – but do that intentionally and transparently (with business stakeholders). You’ll save yourself a lot of panic on a day like today.

Person Recognition in Images with OpenCV & Neo4j

Time for an update on my ongoing person identification in images project; for all the background you can check out these previous posts:

Analyzing AWS Rekognition Accuracy with Neo4j

AWS Rekognition Graph Analysis – Person Label Accuracy

Person Recognition: OpenCV vs. AWS Rekognition

In my earlier serverless series I discussed and provided code for getting images into S3 and processed by AWS Rekognition – including storing the Rekognition label data in DynamoDB.

This post builds on all of those concepts.

In short – I’ve been collecting comparative data on person recognition using AWS Rekognition and OpenCV and storing that data in Neo4j for analysis.


Person Recognition: OpenCV vs. AWS Rekognition

If you’ve been following along – I’ve been working with AWS Rekognition to detect people in security camera footage.

I have previous posts that discuss the results.

I’m now running the images through OpenCV using the pre-trained HOG + Linear SVM model. The picture in this post is an example of the output from OpenCV with a person detected and a bounding box drawn.

Over the next day or two I’ll start processing all the images with both Rekognition and OpenCV. I’ll also be capturing the results in Neo4j (where I’m already capturing the Rekognition object labels) to allow for comparative analysis.

Stay tuned…

AWS Rekognition Graph Analysis – Person Label Accuracy

Last week I wrote a post evaluating AWS Rekognition accuracy in finding people in images. The analysis was performed using the Neo4j graph database.

As I noted in the original post – Rekognition is either very confident it has identified a person or not confident at all. This leads to an enormous number of false negatives. Today I looked at the distribution of confidence for the Person label over the last 48 hours.

You be the judge:

[Figure: Rekognition Person label confidence distribution]

Check out the original post to see how the graph is created and constantly updated as images are created in the serverless IoT processing system.
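
For readers who want to compute a similar distribution from their own label data, here is a minimal sketch. It assumes you have already pulled the Person label confidence scores (Rekognition reports them on a 0 to 100 scale) into a plain list; `confidence_histogram` is a hypothetical helper, not part of the pipeline described in these posts.

```python
from collections import Counter
from typing import Dict, Iterable


def confidence_histogram(confidences: Iterable[float], bin_width: int = 10) -> Dict[int, int]:
    """Count confidence scores (0-100) per bin, keyed by each bin's lower edge."""
    counts: Counter = Counter()
    for score in confidences:
        # Clamp a score of exactly 100 into the top bin.
        lower = min(int(score // bin_width) * bin_width, 100 - bin_width)
        counts[lower] += 1
    # Include empty bins so gaps in the distribution stay visible.
    return {edge: counts.get(edge, 0) for edge in range(0, 100, bin_width)}


if __name__ == "__main__":
    # Invented sample values, for illustration only.
    sample = [99.1, 98.5, 100.0, 51.2, 50.3, 12.0]
    for edge, count in confidence_histogram(sample).items():
        print(f"{edge:>3}-{edge + 10:<3} {'#' * count}")
```

A "very confident or not confident at all" pattern would show up here as counts piled into the top bin and the low bins, with little in between.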

Analyzing AWS Rekognition Accuracy with Neo4j

As an extension of my series of posts on handling IoT security camera images with a Serverless architecture, I’ve added an integration with AWS Rekognition.

Amazon Rekognition is a service that makes it easy to add image analysis to your applications. With Rekognition, you can detect objects, scenes, and faces in images. You can also search and compare faces. Rekognition’s API enables you to quickly add sophisticated deep learning-based visual search and image classification to your applications.

My goal is to identify images that have a person in them to limit the number of images someone has to browse when reviewing the security camera alarms (security cameras detect motion – so often you get images that are just wind motion in bushes, or headlights on a wall).


Data is the currency of your Digital Transformation

This is a scary time for a company. But the state of play creates the potential for mass and creative disruption.
— $1 Billion for Dollar Shave Club: Why Every Company Should Worry @ NYTimes

Every company is a digital company. No longer is it a question of if your product will become digital – as was the case with music, newspapers, TV, movies, etc. – it is a question of how the experience of your product (and your company) changes even if your product isn’t digitized.

eCommerce, digital marketing, social, CRM, and content technology and strategies are critical. You will need to invest in those technologies – but underpinning all of those technologies is data – that data is the currency of your digital transformation.


Big Data – Storage Isn’t Enough

We should have seen it coming. When we stopped even thinking about how we store data for our applications, when we just assumed some DBA would give us a database – and some SysAdmin would give us a file system. Sure, we can talk about W-SAN (what WLAN was to the LAN, but for storage) solutions like Amazon S3 and Rackspace Cloud – but they didn’t fundamentally change anything.

Big Data forces us to re-think storage completely. Not just structured/unstructured, relational/non-relational, ACID compliance or not. It forces us – at the application level – to rethink the current model exemplified by:

I’m storing this because I may need it again in the future.

Where storage means physical, state aware object persistence and future means anywhere between now and the end of time.

Data Persistence – A Systemic Approach to Big Data for Applications

What Big Data applications require is a systemic approach to data. Instead of approaching data only as a set of if/then operations that determine which (if any) CRUD operations to perform, applications (or their supporting Data Persistence Layers) must understand the nature of the data persistence required.

This is a level of complexity developers have been trained to ignore. The CRUD model itself explicitly excludes any dimensionality or meta information about the persistence. It is all or nothing.

Data Persistence is primarily the idea that data isn’t just stored – it is stored for a specific purpose which is relevant within a specific time slice. These time slices are entirely analogous to those discussed in Preemption. Essentially, any sufficiently large real time Big Data system is simply a loosely aggregated computer system in which any data object may generate multiple tasks, each of which has a specific priority.

For example, in a geo location game (Foursquare) the appearance of a new checkin requires multiple tasks which are prioritized based on their purpose:

  1. Store the checkin to distribute to “friends” (real-time)
  2. Store the checkin association with the venue (real-time)
  3. Analyze nearby “friends” (real-time)
  4. Determine any game mechanics, badges, awards, etc
  5. Store the checkin on the user’s activity
  6. Store the checkin object

NOTE: Many developers will look at the list above and ask: “Why not a database?” While a traditional database may suffice for a relatively low volume system (5k users, 20k checkins per day) it would not be sufficient at Big Data scale (as discussed here).
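
To make the "one data object, many prioritized tasks" idea concrete, here is a minimal in-memory sketch using a heap. The task names mirror the checkin list; the two priority levels, and the choice to treat items 4 through 6 as deferred, are my assumptions for illustration rather than a description of any real system.

```python
import heapq
import itertools
from dataclasses import dataclass, field
from typing import Any, List

# Hypothetical priority levels: a lower number is more urgent.
REAL_TIME, DEFERRED = 0, 1

# One entry per task in the checkin list; priorities for the
# last three are assumed, since the list does not mark them real-time.
CHECKIN_TASKS = [
    ("distribute_to_friends", REAL_TIME),
    ("associate_with_venue", REAL_TIME),
    ("analyze_nearby_friends", REAL_TIME),
    ("apply_game_mechanics", DEFERRED),
    ("append_to_user_activity", DEFERRED),
    ("store_checkin_object", DEFERRED),
]


@dataclass(order=True)
class Task:
    priority: int
    sequence: int  # tie-breaker: keeps arrival order within a priority
    name: str = field(compare=False)
    payload: Any = field(compare=False)


class TaskQueue:
    def __init__(self) -> None:
        self._heap: List[Task] = []
        self._counter = itertools.count()

    def fan_out(self, checkin: dict) -> None:
        """Turn one incoming checkin into its full set of prioritized tasks."""
        for name, priority in CHECKIN_TASKS:
            heapq.heappush(self._heap, Task(priority, next(self._counter), name, checkin))

    def pop(self) -> Task:
        """Return the most urgent task; oldest first within a priority."""
        return heapq.heappop(self._heap)

    def __len__(self) -> int:
        return len(self._heap)
```

With two checkins queued, all six real-time tasks come out before any deferred task, which is the behavior a single "write it to the database" step cannot express.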

This Data Persistence solution is composed of four vertical persistence types:

Big Data, Real Time Data Persistence

Transitory

Transitory persistence is for data persisted only long enough to perform some specific unit of work. Once the unit of work is completed the data is no longer required and can be expunged. For example: Notifying my friends (that want to be notified) that I’m at home.

Generally speaking (and this can vary widely by use case) Transitory persistence must be atomic, extremely fast and fault tolerant.
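
A minimal sketch of the Transitory lifecycle follows: data is held until a unit of work completes, then expunged, and returned for retry if the work fails. `TransitoryStore` is a hypothetical, single-process illustration; a real deployment would need a replicated queue or similar to be fault tolerant in the sense described above.

```python
import uuid
from typing import Any, Dict, Optional, Tuple


class TransitoryStore:
    """Hold data only until its unit of work has completed."""

    def __init__(self) -> None:
        self._pending: Dict[str, Any] = {}
        self._in_flight: Dict[str, Any] = {}

    def put(self, data: Any) -> str:
        key = uuid.uuid4().hex
        self._pending[key] = data
        return key

    def claim(self) -> Optional[Tuple[str, Any]]:
        """Hand the oldest pending item to a worker, or None if there is none."""
        if not self._pending:
            return None
        key = next(iter(self._pending))  # dicts preserve insertion order
        self._in_flight[key] = self._pending.pop(key)
        return key, self._in_flight[key]

    def complete(self, key: str) -> None:
        """Work succeeded: the data is no longer required, so expunge it."""
        self._in_flight.pop(key, None)

    def fail(self, key: str) -> None:
        """Work failed: make the data available for another attempt."""
        if key in self._in_flight:
            self._pending[key] = self._in_flight.pop(key)

    def __len__(self) -> int:
        return len(self._pending) + len(self._in_flight)
```

The point of the sketch is the lifecycle, not the storage: nothing survives a successful `complete`, and nothing is lost on `fail`.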

Volatile

Volatile persistence is for data that is useful but can be lost and rebuilt at any time. Object caching (how memcached is predominantly used) is one type of Volatile persistence, but does not describe the entire domain. Other examples of volatile data include process orchestration data, data used to calculate decay for API Rate Limits, data arrival patterns (x/second over the last 30 seconds), etc.

The most important factor for Volatile data persistence is that the data can be rebuilt from normal operations or from long term data storage if it is not found in the Volatile dataset.

Generally speaking, data is stored in Volatile persistence because it offers superior performance, at the cost of limited dataset size.
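
The "can be rebuilt at any time" property is easiest to see in a small cache-aside sketch. `VolatileCache` and its `rebuild` callback are hypothetical names; the callback stands in for whatever long term storage or normal operation can regenerate the value.

```python
from collections import OrderedDict
from typing import Any, Callable, Hashable


class VolatileCache:
    """Small LRU cache whose entries can always be rebuilt from a slower source."""

    def __init__(self, rebuild: Callable[[Hashable], Any], max_items: int = 1024) -> None:
        self._rebuild = rebuild
        self._max_items = max_items  # limited dataset size is the trade-off
        self._items: "OrderedDict[Hashable, Any]" = OrderedDict()

    def get(self, key: Hashable) -> Any:
        if key in self._items:
            self._items.move_to_end(key)  # mark as most recently used
            return self._items[key]
        # Miss: rebuild from long term storage, then remember the result.
        value = self._rebuild(key)
        self._items[key] = value
        if len(self._items) > self._max_items:
            self._items.popitem(last=False)  # evict the least recently used
        return value

    def lose_everything(self) -> None:
        """Simulate a restart or node loss; callers should not notice."""
        self._items.clear()
```

Losing the whole cache only costs latency: the next `get` for each key falls through to `rebuild`, which is what separates Volatile data from data that must be durable.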

ACID

Relational databases and atomicity, consistency, isolation and durability (ACID) are not obsolete. They are important for specific types of operations, done for specific purposes, that must maintain transactional compliance and ensure the entire transaction either succeeds in an atomic way, or fails. Examples of this include eCommerce transactions, Account Signup, Billing Information updates, etc.

Generally speaking, this data complies with the old rules of data. It is created/updated slowly over any given time slice, it is read periodically, there is little need to publish the information across a large group of subscribers, etc.
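
As a small illustration of "succeeds in an atomic way, or fails", here is a sketch using Python's built-in `sqlite3` module. The `accounts` and `orders` tables and the `place_order` function are hypothetical; the point is only that the order row and the balance change are committed together or not at all.

```python
import sqlite3


def create_schema(conn: sqlite3.Connection) -> None:
    with conn:
        conn.execute(
            "CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance_cents INTEGER NOT NULL)"
        )
        conn.execute(
            "CREATE TABLE orders (id INTEGER PRIMARY KEY, "
            "account_id INTEGER NOT NULL, amount_cents INTEGER NOT NULL)"
        )


def place_order(conn: sqlite3.Connection, account_id: int, amount_cents: int) -> None:
    """Record an order and debit the account as one transaction."""
    # Using the connection as a context manager commits on success
    # and rolls back if the block raises.
    with conn:
        conn.execute(
            "INSERT INTO orders (account_id, amount_cents) VALUES (?, ?)",
            (account_id, amount_cents),
        )
        cursor = conn.execute(
            "UPDATE accounts SET balance_cents = balance_cents - ? "
            "WHERE id = ? AND balance_cents >= ?",
            (amount_cents, account_id, amount_cents),
        )
        if cursor.rowcount != 1:
            # Raising here rolls back the INSERT above as well.
            raise ValueError("insufficient funds or unknown account")
```

If the debit cannot be applied, the order insert is undone too, so no reader ever sees an order without its matching balance change.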

Amorphous

Amorphous persistence is the new kid on the block. NoSQL solutions fit nicely here. This non-volatile storage is amorphous in that the content (think property, not property value) of any object can change at any time. There is no schema, table structure or enforced relationship model. I think of this data persistence model as raw object storage, derived object storage and the transformed data that forms the basis of what Jeff Jones refers to as Counting Systems. Additionally, these systems store data in application consumable objects – with those objects being created on the way in.

Systems in this layer are generally highly scalable, fault tolerant, distributed systems with enhanced write efficiency. They offer the ability to perform the high volume writes required in real time Big Data systems without significant loss of read performance.

What Does All This Mean?

Most notably, it means that after years of obfuscating the underlying data storage from developers, we now need to re-engage application developers in the data storage conversation. No longer can a DBA define the most elegant data model based on the “I’m storing this because I may need it again in the future.” model and expect it to function in the context of a real time Big Data application.

We will hear a chorus of voices who will attack these disaggregated data persistence models based on complexity or the CAP Theorem or on the standard “the old way is the best way” defense of ACID and the RDBMS for everything. But all of this strikes me as a perfect illustration of what Henry Ford said:

If I had asked customers what they wanted, they would have told me they wanted a faster horse.