
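Biometrics based on fingerprint or iris scanning is nothing new. Thanks to advances in high-resolution cameras and 3D face recognition algorithms, facial recognition has become quite popular as a biometric technique over the past few years. I first came across this technology around 2012-13, when Google first released Face Unlock, a feature of their Android operating system that unlocks the phone by recognizing the owner's face.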

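So facial recognition is not a new technology. What is new is the access that open source libraries and AWS intelligent services (a.k.a. Rekognition) give developers to some of the most advanced facial recognition algorithms. We will discuss these libraries and services in this blog.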


The Doorbell Project

In my previous rudimentary attempts at a doorbell, I used an Arduino with motion and range sensors. Motion would trigger the range sensor, which then measured the distance every few milliseconds to determine whether the object (person) was approaching my door or just passing by, and thus whether someone was really at the door.

What if the doorbell could tell me not only when someone was at the door, but also who they were, by name, if I already knew the person? I would also love a picture of my visitors to be sent to my phone so I could be notified even when I was not at home. Finally, in the ideal state, the visitor simply has to walk up to the door … no buttons to press!


Components

To conceptualize such a doorbell, I would need a (1) camera connected to a (2) computer that would constantly look for and (3) identify objects, or visitors in this case. Upon detection of a body-like feature (or a face), the computer would capture the image, (4) compare it with known faces and if found amongst the known faces, it would (6) notify me (on my phone) about who was at the door. Obviously, if it was a new face, I would like my computer to (4) learn the face and (5) remember it for next time. For kicks, I also thought it would be a cool idea for my computer to (7) greet my visitor if it was a known face.


This was going to be a lot of moving parts! They got mapped out as follows:

  1. A USB webcam

  2. Raspberry Pi 2 Model B as my portable computer with Python 3 scripting

  3. OpenCV. OpenCV (Open Source Computer Vision Library) is an open source computer vision and machine learning software library. [opencv.org]

  4. AWS Rekognition. Amazon Rekognition is an Amazon AI service that makes it easy to detect, index and search faces in images. [AWS Rekognition]

  5. AWS DynamoDB. Amazon DynamoDB is a fast and flexible NoSQL database service. [AWS DynamoDB]

  6. AWS Simple Notification Service. Amazon Simple Notification Service (Amazon SNS) is a fast, flexible, fully managed push notification service that lets you send messages to mobile device users and email recipients, or even send text messages. [AWS SNS]

  7. AWS Polly. Amazon Polly turns text into lifelike speech. It is an Amazon AI service that synthesizes speech that sounds like a human voice. [AWS Polly]

  8. AWS Lambda. AWS Lambda runs code without requiring you to provision or manage servers (i.e. no EC2 instances). In this case, Lambda was used for the overall orchestration of all the AWS services listed above.


The Blueprint

With the above components mapped to the high-level vision, the execution went as described below.

Please note: the purpose of this blog is primarily to discuss the concept and idea, so as much as possible I am going to stay away from the code and installation instructions. At the same time, I will try my best to provide all my references so you can refer to them, too.

Let’s walk through some of the key components/modules of this project.

1: Face Tracking

The Python script on the Raspberry Pi utilizes the OpenCV libraries and the Haar feature-based cascade classifiers to constantly track faces. Read more about the Haar cascade classifiers on opencv.org.
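The tracking loop described above might look roughly like the following. This is a minimal sketch, not the author's script: it assumes the `opencv-python` package (which ships the stock Haar cascade files under `cv2.data.haarcascades`), and the function names, the `on_face` callback, and the detection parameters are all hypothetical choices made for illustration.

```python
def detect_faces(frame, cascade):
    """Return a list of (x, y, w, h) boxes for faces found in a BGR frame."""
    import cv2  # imported lazily so largest_face() works without OpenCV

    # Haar cascades operate on grayscale images.
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # Parameter values are illustrative assumptions, not tuned settings.
    boxes = cascade.detectMultiScale(
        gray, scaleFactor=1.1, minNeighbors=5, minSize=(60, 60)
    )
    return [tuple(int(v) for v in box) for box in boxes]


def largest_face(boxes):
    """Pick the box with the largest area, or None when no face was found."""
    if not boxes:
        return None
    return max(boxes, key=lambda b: b[2] * b[3])


def watch_camera(on_face, camera_index=0):
    """Read frames from the webcam and call on_face(frame, box) per detection."""
    import cv2

    # Assumption: the frontal-face cascade bundled with opencv-python.
    cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
    )
    capture = cv2.VideoCapture(camera_index)
    try:
        while True:
            ok, frame = capture.read()
            if not ok:
                break  # camera unplugged or stream ended
            face = largest_face(detect_faces(frame, cascade))
            if face is not None:
                on_face(frame, face)
    finally:
        capture.release()
```

Calling `watch_camera(handler)` would then hand each frame containing a face to `handler`, which could save the image for the recognition step.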


2: Face Recognition

After detecting the face(s), the Python script uses the AWS CLI to upload the images to S3. A Lambda function is then invoked that uses the AWS Rekognition service to search an indexed collection of faces. If no match is found, we index the image as a new face.

The AWS facial identification algorithm stores faces in a collection as feature vectors. This makes it possible to search the indexed collection with any image, as well as to save any face as a feature vector. An example of image search, as well as results of indexing an image, is shown below.
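As an illustration of the search-then-index flow, here is a hedged Python sketch. It is not the author's original Lambda code (which used the JavaScript SDK); it assumes a boto3 Rekognition client is passed in, and the function name, the collection, bucket, and key arguments, and the 90% threshold are hypothetical.

```python
def find_or_index_face(rekognition, collection_id, bucket, key, threshold=90.0):
    """Return (face_id, is_new) for the face in the image at s3://bucket/key.

    `rekognition` is expected to behave like boto3.client("rekognition").
    """
    image = {"S3Object": {"Bucket": bucket, "Name": key}}

    # Search the collection for the face in the uploaded image.
    response = rekognition.search_faces_by_image(
        CollectionId=collection_id,
        Image=image,
        FaceMatchThreshold=threshold,  # assumed similarity cutoff
        MaxFaces=1,
    )
    matches = response.get("FaceMatches", [])
    if matches:
        return matches[0]["Face"]["FaceId"], False

    # No match: store the face as a new feature vector in the collection.
    indexed = rekognition.index_faces(CollectionId=collection_id, Image=image)
    records = indexed.get("FaceRecords", [])
    if not records:
        return None, False  # nothing could be indexed from this image
    return records[0]["Face"]["FaceId"], True
```

With boto3 installed, a call could look like `find_or_index_face(boto3.client("rekognition"), "visitors", "my-doorbell-bucket", "capture.jpg")`, where the collection and bucket names are placeholders.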


I have intentionally omitted a step from the code snippet above: reading from and writing to the database. The database is a simple table that maps each FaceId to a name, so I can return the name of the visitor for any matched FaceId obtained from the rekognition.searchFacesByImage method above.

An example of the database table is shown below:

FaceId                                | Name
ff43d742-0c13-5d16-a3e8-03d3f58e980b  | Mangesh
ff499e32-0c13-3d26-a6e5-0553e5fd9e0e  | John
ff493332-0c13-34d6-a6f5-0f53ehf494ee  | Someone
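The lookup against this table could be sketched as below. This is an assumption-laden illustration rather than the author's code: it presumes a boto3 DynamoDB `Table` resource with `FaceId` as the partition key, and the helper names and the "Someone" placeholder for unnamed faces are hypothetical.

```python
def names_for_faces(table, face_ids, default="Someone"):
    """Map each matched FaceId to a visitor name using the lookup table.

    `table` is expected to behave like boto3.resource("dynamodb").Table(...).
    """
    names = []
    for face_id in face_ids:
        # get_item returns a dict that contains "Item" only when the key exists.
        item = table.get_item(Key={"FaceId": face_id}).get("Item")
        names.append(item["Name"] if item and item.get("Name") else default)
    return names


def remember_face(table, face_id, name="Someone"):
    """Store a newly indexed face so it can be named later."""
    table.put_item(Item={"FaceId": face_id, "Name": name})
```

A newly indexed face would first be stored with the placeholder name and could be renamed by hand once the visitor is known.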



3: Notification

Within the same Lambda function above, when a matched face is found and the name of the visitor is retrieved from the database table, I use SNS to send an SMS notification. Please note the assumption here that an SNS topic has already been set up and the required phone numbers are subscribed to that topic.

...
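// Publish the notification to the SNS topic that the phone numbers subscribe to.
// db.visitors and fullimage come from the elided code above.
var sns = new AWS.SNS();
sns.publish({
  // Produces e.g. "Mangesh, John are at the door. " followed by fullimage
  Message: db.visitors.join(", ") + ' ' + (db.visitors.length > 1 ? 'are' : 'is') + ' at the door. ' + fullimage,
  TopicArn: 'arn:aws:sns:us-east-1:949700099995:topic'
}, function (err, data) {
  callback(err, db.visitors.join(",")); // exit out of Lambda
});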
...


4: Greeting Visitors by Name

Essentially, we want to greet the visitor by name. This is by far the simplest of all the steps, as we only need our greeting text to be converted to audio. There are multiple services that could have been used for this text-to-speech conversion, but I decided to stick with AWS, both because it is easier to call AWS services from Lambda (same ecosystem) and because Polly is a new offering that claims to sound more realistic (and it does indeed feel that way!).


In this case, we again use the AWS CLI from the Raspberry Pi Python script. The AWS Polly service, in response, returns an MP3 file, which can then be played using any available audio player on the Raspberry Pi itself (make sure a speaker is connected via the HDMI or 3.5mm port).
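A rough sketch of that step, calling the AWS CLI from Python, is shown here. The `aws polly synthesize-speech` command is the real CLI entry point, but the greeting wording, the `Joanna` voice, the output file name, and the `mpg123` player are assumptions for illustration, not details from the original project.

```python
import subprocess


def greeting_text(names):
    """Build the sentence to be spoken for a list of visitor names."""
    if not names:
        return "Hello! Welcome."
    if len(names) == 1:
        who = names[0]
    else:
        who = ", ".join(names[:-1]) + " and " + names[-1]
    return "Hello " + who + "! Welcome."


def polly_command(text, out_path, voice_id="Joanna"):
    """Return the AWS CLI invocation that writes the speech to out_path."""
    return [
        "aws", "polly", "synthesize-speech",
        "--output-format", "mp3",
        "--voice-id", voice_id,  # assumed voice; any Polly voice would do
        "--text", text,
        out_path,
    ]


def greet(names, out_path="greeting.mp3", player="mpg123"):
    """Synthesize the greeting and play it on the Pi's speaker."""
    subprocess.run(polly_command(greeting_text(names), out_path), check=True)
    # Assumption: mpg123 is installed; any MP3-capable player works here.
    subprocess.run([player, out_path], check=True)
```

For example, `greet(["Mangesh", "John"])` would request the audio for "Hello Mangesh and John! Welcome." and play it, provided the AWS CLI is configured with credentials on the Pi.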

Conclusion

Artificial intelligence and machine learning no longer feel out of reach, now that open source libraries and services like those of Amazon Web Services are more accessible to developers. One can easily imagine the breadth and depth of applications that can be developed using these ready-to-use machine learning algorithms (services). The doorbell is just one example, and I cannot stop imagining applications of AI in a variety of domains, such as social networking, security, traffic, medicine, education, etc.


There is still uncharted territory in this space beyond facial recognition, such as assessing facial emotions, recognizing moving objects and vehicles, color recognition, etc. I will continue to pursue these and keep sharing updates. If you have any experience to share in this space, I’m sure we’re all ears!
