Person Recognition in Images with OpenCV & Neo4j

Time for an update on my ongoing person-identification-in-images project. For the full background, check out these previous posts:

Analyzing AWS Rekognition Accuracy with Neo4j

AWS Rekognition Graph Analysis – Person Label Accuracy

Person Recognition: OpenCV vs. AWS Rekognition

In my earlier serverless series I discussed, and provided code for, getting images into S3 and processing them with AWS Rekognition, including storing the Rekognition label data in DynamoDB.
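For reference, the Rekognition call at the heart of that pipeline is a single API request. Here's a minimal sketch using boto3; the bucket and key names are placeholders, and in the original series the resulting labels were written to DynamoDB rather than printed:

```python
import boto3

rekognition = boto3.client("rekognition")

# Placeholder bucket/key; in the real pipeline these come from S3 event notifications.
response = rekognition.detect_labels(
    Image={"S3Object": {"Bucket": "my-camera-images", "Name": "front-door/capture.jpg"}},
    MinConfidence=0,  # keep even low-confidence labels so nothing is filtered out
)

for label in response["Labels"]:
    print(f"{label['Name']}: {label['Confidence']:.1f}%")
```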

This post builds on all of those concepts.

In short: I've been collecting comparative data on person recognition from AWS Rekognition and OpenCV, and storing that data in Neo4j for analysis.
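As a sketch of what "comparative data" can look like in practice, the snippet below runs OpenCV's stock HOG person detector on an image and records each detector's verdict as a relationship in Neo4j. The graph model (Image and Label nodes, a DETECTED relationship) and all the names here are my illustrative assumptions, not necessarily the project's actual schema:

```python
import cv2
from neo4j import GraphDatabase

# OpenCV's built-in HOG + linear SVM people detector
hog = cv2.HOGDescriptor()
hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())

def opencv_found_person(path):
    """Return True if the HOG detector finds at least one person."""
    rects, _weights = hog.detectMultiScale(cv2.imread(path), winStride=(8, 8))
    return len(rects) > 0

# Hypothetical model: (:Image)-[:DETECTED {source, confidence}]->(:Label)
STORE = """
MERGE (i:Image {key: $key})
MERGE (l:Label {name: 'Person'})
MERGE (i)-[r:DETECTED {source: $source}]->(l)
SET r.confidence = $confidence, r.at = timestamp()
"""

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))
with driver.session() as session:
    session.run(STORE, key="front-door/capture.jpg", source="opencv",
                confidence=100.0 if opencv_found_person("capture.jpg") else 0.0)
    # A second call with source="rekognition" and its Confidence score
    # completes the side-by-side record for this image.
```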


AWS Rekognition Graph Analysis – Person Label Accuracy

Last week I wrote a post evaluating AWS Rekognition's accuracy in finding people in images. The analysis was performed using the Neo4j graph database.

As I noted in the original post, Rekognition is either very confident it has identified a person or not confident at all, which leads to an enormous number of false negatives. Today I looked at the distribution of confidence for the Person label over the last 48 hours.
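The distribution itself falls out of a simple grouping query. A sketch, assuming the hypothetical Image/Label model from the sketch above (the property names and millisecond timestamps are assumptions), with confidence bucketed into 10-point bins:

```python
from neo4j import GraphDatabase

# Bucket 'Person' label confidence into 10-point bins over the last 48 hours.
QUERY = """
MATCH (:Image)-[d:DETECTED]->(:Label {name: 'Person'})
WHERE d.at > timestamp() - 48 * 60 * 60 * 1000
RETURN toInteger(d.confidence / 10) * 10 AS bucket, count(*) AS images
ORDER BY bucket
"""

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))
with driver.session() as session:
    for record in session.run(QUERY):
        print(f"{record['bucket']:3d}-{record['bucket'] + 9}%: {record['images']}")
```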

You be the judge:

[Chart: distribution of Rekognition "Person" label confidence]

Check out the original post to see how the graph is created and continuously updated as new images flow through the serverless IoT processing system.

Analyzing AWS Rekognition Accuracy with Neo4j

Building on my series of posts on handling IoT security camera images with a serverless architecture, I've extended the pipeline to integrate AWS Rekognition:

Amazon Rekognition is a service that makes it easy to add image analysis to your applications. With Rekognition, you can detect objects, scenes, and faces in images. You can also search and compare faces. Rekognition’s API enables you to quickly add sophisticated deep learning-based visual search and image classification to your applications.

My goal is to identify images that have a person in them, to limit the number of images someone has to browse when reviewing security camera alarms (the cameras detect motion, so you often get images that are just wind moving the bushes or headlights on a wall).
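With the labels in the graph, narrowing the review queue to likely-person images is a single query. A sketch, again assuming the same hypothetical model as above and an arbitrary 90% confidence cutoff:

```python
from neo4j import GraphDatabase

# Return only images Rekognition is highly confident contain a person.
QUERY = """
MATCH (i:Image)-[d:DETECTED {source: 'rekognition'}]->(:Label {name: 'Person'})
WHERE d.confidence >= 90
RETURN i.key AS image, d.confidence AS confidence
ORDER BY d.confidence DESC
"""

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))
with driver.session() as session:
    for record in session.run(QUERY):
        print(record["image"], record["confidence"])
```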


Upping your data game with Graph Databases

Photo credit: hjl on Flickr

Since the late 2000s there has been an explosion of non-relational (NoSQL, if you must) data-persistence technology. The industry buzz has focused on derivatives of the seminal work done at Google (BigTable) and Amazon (Dynamo).

We've seen massive adoption of simple document stores and key-value stores, which prioritize availability and partition tolerance and thereby enable storage and processing of schema-less (or semi-structured) data at velocities and volumes previously considered entirely impractical.

There were, however, compromises. These systems are abysmal at dealing with connections in the data, or more precisely, at connecting the entities in the data sets with one another in a variety of contexts.

Many of our platforms, systems, applications and services are intended to deal with exactly these kinds of connections, yet most engineering teams fall back to relational databases to model them. The problem with this approach is that relational databases are inherently inefficient at the complex set operations such queries require:

The true value of the graph approach becomes evident when one performs searches that are more than one level deep. For instance, consider a search for users who have “subscribers” (a table linking users to other users) in the “311” area code. In this case a relational database has to first look for all the users with an area code in “311”, then look in the subscribers table for any of those users, and then finally look in the users table to retrieve the matching users. In comparison, a graph database would look for all the users in “311”, then follow the back-links through the subscriber relationship to find the subscriber users. This avoids several searches, lookups and the memory involved in holding all of the temporary data from multiple records needed to construct the output. Technically, this sort of lookup is completed in O(log(n)) + O(1) time, that is, roughly relative to the logarithm of the size of the data. In comparison, the relational version would be multiple O(log(n)) lookups plus additional time to join all the data.[3]

Via Wikipedia
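In Cypher, Neo4j's query language, the whole subscriber lookup from that example collapses into one pattern match. A sketch, with the node labels and relationship type invented for illustration:

```python
from neo4j import GraphDatabase

# One traversal replaces the three-step lookup-and-join described above:
# find users in area code 311, then walk SUBSCRIBES_TO edges back to their subscribers.
QUERY = """
MATCH (u:User {areaCode: '311'})<-[:SUBSCRIBES_TO]-(subscriber:User)
RETURN DISTINCT subscriber.name AS name
"""

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))
with driver.session() as session:
    for record in session.run(QUERY):
        print(record["name"])
```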

The opportunity to build highly efficient systems by leveraging natural graph models (those that exist in the real world) is massive and dramatically underutilized.

Imagine a CRM system that isn't tightly coupled to a rigid account/contact/entitlement hierarchy. Imagine a student roster system that directly models the complexity of sections across teachers, schools, districts and states without a myriad of join tables.

Imagine a performant entity-attribute-value (EAV) implementation that supports efficient queries over arbitrary attribute types and combinations.
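As a sketch of what that can look like in a graph, attributes become nodes rather than rows in one giant EAV table, and querying an arbitrary combination of them is just a multi-pattern match. All the names here are illustrative assumptions:

```python
from neo4j import GraphDatabase

# Hypothetical EAV-as-graph model: (:Entity)-[:HAS {value}]->(:Attribute {name}).
# Find entities matching an arbitrary combination of attribute values.
QUERY = """
MATCH (e:Entity)-[c:HAS]->(:Attribute {name: 'color'}),
      (e)-[s:HAS]->(:Attribute {name: 'size'})
WHERE c.value = 'red' AND s.value = 'large'
RETURN e.id AS entity
"""

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))
with driver.session() as session:
    for record in session.run(QUERY):
        print(record["entity"])
```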

The only barrier is investing in the education required to understand the efficiencies of graphs and the graph databases available today. I highly recommend you up your game: download Neo4j today and start learning the advantages graph databases can provide in your polyglot persistence architecture. If you need any help, let me know; I'm happy to help you and your team model your data and leverage a graph to simplify your system and increase your feature velocity.