In the Multi-Application Containers blog, we discussed our design for centralizing the location of our face recognition training data server in the public cloud, while keeping the latency-sensitive face recognition servers running on Edge cloudlets.
What’s the Problem?
As the number of face training subjects has increased, the trained data set has grown as well. There are currently 30 subjects with a total of 442 images. The Android app that uploads training images resizes them to 180x240 before sending them to the centralized Face Training server in the public cloud. Altogether, these images add up to about 4 MB. When training is performed, the resulting database (a YAML file) is 42 MB. When this file is downloaded, Gzip compression is used, so the transfer is approximately 2 MB. From the logs of a Face Detection server instance, we can see that the transfer takes about 7.4 seconds:
```
[facial_detection.face_recognizer:424] Checking with FaceTrainingServer
[facial_detection.face_recognizer:458] is_training_data_file_current() url=http://opencv.facetraining.mobiledgex.net:8009/trainer/lastupdate/
[facial_detection.face_recognizer:461] 177.856 ms to get centralized training data timestamp: 1588979555. local=1582146029
[facial_detection.face_recognizer:465] Newer training data available.
[facial_detection.face_recognizer:478] Downloading from http://opencv.facetraining.mobiledgex.net:8009/trainer/download...
[facial_detection.face_recognizer:486] 7452.165 ms to download centralized training data. training_data_timestamp=1588979555
```
There’s also the issue of when to perform this download. Every instance of the Face Recognition Server has to check with the Face Training Server to see whether new training data is available. It does this at startup and at the start of each new face recognition session. This polling model is not ideal; we would prefer that the centralized server have a means of alerting the distributed Face Recognition Server instances.
These problems aren’t terrible, and the system works as-is. Some might say, “if it ain’t broke, don’t fix it,” but really, we just want an excuse to try out Redis.
What is Redis, and what are we trying to do with it?
From the Redis web site:
Redis is an open-source (BSD licensed), in-memory data structure store, used as a database, cache, and message broker.
We are most interested in using it as a database to store the face training images, but will also be using the message broker capabilities to alert Face Recognition instances of changes to the training set.
What we want to accomplish is to use Redis to store the face training images that get uploaded to the centralized Face Training Server, and make them available to the Face Recognition process on each instance of a Computer Vision Server. When this happens, a notification will be sent out to all Computer Vision Servers, and they will request the new images from Redis. Once the images are received from the database, the training procedure will take place on each Computer Vision Server. Subsequently, face recognition will be performed there for images sent by the client (e.g. Android app, Unity app, etc.).
First Attempt with Redis
This is the first project in which I have attempted to use Redis. After figuring out the basics, I was ready to try a simple, unsophisticated design to prove the concept was feasible. Redis allows you to store anything as a string of bytes, so I elected to store each JPEG face image submitted for training in this way. I used a scheme where each subject had an index value (0001, 0002, etc.): the key “subject_name:index” stored the subject name via a set command, and the key “subject_images:index” stored a list of the subject’s images via a series of lpush commands. Finally, I used the ltrim command to limit the image list to the 15 most recent images for each subject. We currently do not limit the number of users that may store training data.
This is implemented in a custom Django-admin command called images_to_redis. Here is the handle method that does the actual work:
```python
def handle(self, *args, **options):
    logger.info('Connecting to redis server %s' % self.training_data_hostname)
    self.connect()
    r = self.redis
    r.set(SUBJECT_KEY_INDEX, 0)
    valid_extensions = ('jpg', 'jpeg', 'png')
    dirs = os.listdir(TRAINING_DATA_DIR)
    for dir in dirs:
        r.incr(SUBJECT_KEY_INDEX, 1)  # If key doesn't exist it will get created
        index = r.get(SUBJECT_KEY_INDEX).decode('utf-8')  # Decode from bytes to string
        index = index.zfill(4)
        subject_images_key = 'subject_images:%s' % index
        subject_name_key = 'subject_name:%s' % index
        r.set(subject_name_key, dir)
        files = os.listdir(TRAINING_DATA_DIR + '/' + dir)
        for file in files:
            if file.endswith(valid_extensions):
                f = open("%s/%s/%s" % (TRAINING_DATA_DIR, dir, file), "rb")
                image = f.read()
                r.lpush(subject_images_key, image)
        r.ltrim(subject_images_key, 0, MAX_IMAGE_COUNT-1)  # Limit to the MAX_IMAGE_COUNT most recent images.
    r.publish(PUB_CHANNEL_ADDED, "")  # Blank means all
    logger.info("Published notification to channel: %s" % PUB_CHANNEL_ADDED)
```
Full source code is available in our edge-cloud-sampleapps public Github repository in the FaceTrainingServer directory. (This code of interest was later moved to the FaceRecognizer class so that it could be called both by the images_to_redis handle method and by the /trainer/train code as well. Full path: FaceTrainingServer/facerec/facerec/FaceRecognizer.py, redis_save_subject_images)
The Actual Data
In this screenshot, you can see the list of “subject_images:*” keys. You can see that there are 30 keys, corresponding to the 30 subjects, and that 15 is the maximum size of any list. I have selected “subject_images:0004” which displays the data bytes for the first image. Note the “JFIF” bytes signifying that this is a JPEG image.
Below is the code that reads the images from Redis. It gets the length of the list with the llen command, then in a loop, pulls each image from the database with the lindex command.
```python
num_images = redis.llen(subject_images_key)
for i in range(0, num_images):
    image_bytes = redis.lindex(subject_images_key, i)
    total_bytes += len(image_bytes)
    image = imageio.imread(io.BytesIO(image_bytes))  # convert to numpy array
```
And here is the result:
```
[facial_detection.face_recognizer:227] 33281.337 ms to download 360 images totaling 2616371 bytes from redis
```
Over 30 seconds! That’s pretty terrible. A little research showed that the way we were looping and using lindex meant that every image was retrieved from the database in a separate transaction: 360 separate round trips, each with its own ACKs. Switching the loop to use lrange instead retrieves the entire list in a single command:

```python
for image_bytes in redis.lrange(subject_images_key, 0, -1):
    total_bytes += len(image_bytes)
    total_images += 1
    image = imageio.imread(io.BytesIO(image_bytes))  # convert to numpy array
```

```
[facial_detection.face_recognizer:227] 12000.340 ms to download 360 images totaling 2616371 bytes from redis
```
Over twice as fast. Maybe this Redis idea isn’t a bust after all. Let’s see what else we can do to optimize things.
Redis has a feature called pipelining that allows queries to be queued together and executed all at once in a single transaction. Let’s implement this to pull all the images at once, and then process them in a separate loop.
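As a rough illustration of the idea, here is a minimal sketch of batched retrieval with a redis-py pipeline. The helper function name and the fake subject keys are hypothetical, not the actual code from the repository; it assumes the “subject_images:*” list schema described above.

```python
# Hypothetical sketch: queue one lrange per subject, then execute all
# queued commands in a single round trip to the Redis server.
def fetch_all_subject_images(r, subject_keys):
    """Return {subject_key: [image_bytes, ...]} using one pipelined transaction."""
    pipe = r.pipeline()
    for key in subject_keys:
        pipe.lrange(key, 0, -1)   # queued locally, not yet sent
    results = pipe.execute()      # one network round trip for all queries
    # results[i] holds the list of image byte strings for subject_keys[i]
    return dict(zip(subject_keys, results))
```

With a real client this would be called as `fetch_all_subject_images(redis.Redis(host=...), keys)`; the savings come from eliminating the per-command network round trip.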
```
[facial_detection.face_recognizer:210] 2089.997 ms to download images for 30 subjects in a batch from redis
```
Just over 2 seconds. That’s more like it. However, as a fair comparison, we need to include the time it takes to do face detection on each image, and add the faces to the face recognizer instance before performing the training. Adding faces to the face recognizer was previously completed within the download loops, but now it is a separate loop executed after the download. When adding that time to the total, we see the entire process takes a little over 5 seconds.
```
[facial_detection.face_recognizer:234] 5421.897 ms to download 360 images totaling 2616371 bytes from redis
```
A nice improvement over where we started, and faster than the old way of downloading the pre-trained database (old way=7.4 seconds, new way=5.4 seconds).
After working with the data for a while, it became evident that the database schema we initially chose was needlessly complicated. We iterated over an integer value and used that value for two separate keys to get the complete data for a subject. For example, subject_name:0001 and subject_images:0001 were used as keys for the first subject. Why not use the subject name itself as the key so that all information is available in one place? After making this change, our database now looks like this:
Originally, our code retrieved the number of subjects (stored in yet another separate key) and iterated over a range of integers:
```python
num_subjects = int(self.redis.get('subject_key_index'))
for i in range(0, num_subjects):
    # Process subject number i
```
And the “i” value was used to create subject_images:000X keys to retrieve the data.
With our new schema, we don’t need to track the number of subjects. We now iterate over all the images of the subjects with this loop:
```python
for key in self.redis.scan_iter("subject_images:*"):
    # Process subject referenced by this key
```
The subject name is right in the key, and the key allows us to retrieve the image data. We need not only the ability to iterate over all subjects, but also the ability to access a single subject’s data directly. That is easily done by appending the subject name, so the key looks like this: “subject_images:Holly”.
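To make the direct-access pattern concrete, here is a hypothetical one-call lookup under the new schema. The function name is illustrative; it assumes `r` is a connected redis-py client and that each subject’s images are stored as a Redis list under “subject_images:&lt;name&gt;”.

```python
# Hypothetical sketch: fetch all stored images for one subject directly,
# building the key from the subject name instead of a numeric index.
def get_subject_images(r, subject_name):
    """Return the list of raw JPEG byte strings for this subject."""
    key = 'subject_images:%s' % subject_name
    return r.lrange(key, 0, -1)   # the whole list in a single command
```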
When to Retrieve the Data?
In our old implementation, we used a polling model, described in the “What’s the Problem” section above. Now we use a PubSub model: Any time the training data is updated on the Face Training Server running in the public cloud, the server publishes a notification for this to a 'training.added' or 'training.removed' channel.
```python
redis.publish('training.added', pubsub_added_name)  # Blank name means "all"
```
Each Face Recognition Server instance (running within the ComputerVisionServer process on an Edge cloudlet) is subscribed to these channels and will “wake up” and download new data immediately. Here’s the subscription code:
```python
self.channel = 'training.*'
self.pubsub = self.redis.pubsub()
self.pubsub.psubscribe(self.channel)
logger.info('Pattern subscribed to channel %s' % self.channel)
```
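The consuming side is not shown above, so here is a minimal sketch of how a pattern-subscribed redis-py client might poll for notifications. The helper name is hypothetical; it relies on the redis-py behavior that pattern matches arrive as messages of type 'pmessage', while subscription confirmations have other types and should be skipped.

```python
# Hypothetical sketch: non-blocking check for a training notification.
def check_for_update(pubsub):
    """Return the channel name if a training notification arrived, else None."""
    message = pubsub.get_message()   # returns None when nothing is queued
    if message and message['type'] == 'pmessage':
        # e.g. 'training.added' or 'training.removed'
        return message['channel'].decode('utf-8')
    return None
```

A server loop would call this periodically (or use `pubsub.listen()` in a dedicated thread) and trigger a Redis download whenever a channel name comes back.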
Full source code for the image retrieval functionality is available in our edge-cloud-sampleapps public Github repository in the ComputerVisionServer directory. The full path is ComputerVisionServer/moedx/facial_detection/face_recognizer.py. Search for “redis” to see relevant code.
One more Redis feature that we plan to take advantage of in the future is “Time To Live” (TTL). We can use the expire command to determine how long a particular subject’s face images should be retained. If someone performs training, and uses the face recognition app for a single day, never to return, there is no reason to keep their data in the database.
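A possible sketch of that future cleanup, using the expire command on a per-subject key. The helper name and the one-day retention period are illustrative choices, not a decided policy; `r` is assumed to be a connected redis-py client.

```python
# Hypothetical sketch: ask Redis to delete a subject's images after a TTL.
ONE_DAY_SECONDS = 24 * 60 * 60

def set_subject_ttl(r, subject_name, ttl=ONE_DAY_SECONDS):
    """Start a countdown after which Redis discards this subject's data."""
    key = 'subject_images:%s' % subject_name
    r.expire(key, ttl)   # calling again (e.g. on a new upload) resets the countdown
```

Because writing to a key does not remove its TTL in Redis, the countdown would need to be refreshed whenever the subject uploads new training images.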