Effective Deployment and Model Improvement Methods for Deep Learning Models on UAV Device

Date

2020-11-12

Abstract

In recent years, deep neural networks have outperformed traditional methods on many computer vision tasks, such as image classification, object detection, and semantic segmentation. However, as deep models become more accurate they also become more complex, which makes them hard to deploy on resource-constrained devices such as UAVs or the Raspberry Pi. This work addresses computer vision deep learning tasks on UAVs from two directions, model performance improvement and effective deployment, and assigns one task to each: (1) detection of small objects in rooftop images taken by UAVs, and (2) an Optical Character Recognition (OCR) system for videos taken by UAVs.
For model performance improvement, we study small-object detection in rooftop images taken by UAVs. We propose several data augmentation strategies and apply model ensemble methods, which together improve model performance by approximately 15%.
For effective deployment, we build an OCR system for videos taken by UAVs, with the Raspberry Pi as the target deployment device. We propose a two-stage OCR system for low-resolution images that uses EAST for text detection and CRNN for text recognition, and we design an early-exit module to speed up detection. We then compress the system with pruning and quantization. For pruning, we run experiments on the CRNN model to measure how much the model's original weight initialization affects the pruning results, and conclude that the original weights matter little for the pruned model; we then manually export the pruned model. For quantization, we use PyTorch's built-in static, dynamic, and quantization-aware training (QAT) methods to quantize both stages from float32 to int8. Finally, we deploy the OCR system on the Raspberry Pi under the PyTorch framework and reduce its latency from about 120 s to about 4-6 s with little accuracy loss.
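As a rough illustration of what quantizing from float32 to int8 involves, the sketch below shows the affine (scale and zero-point) mapping that int8 quantization schemes are generally built on. It is not the code used in this work and not PyTorch's implementation; the function names `quantize_params`, `quantize`, and `dequantize` are hypothetical, and the example weights are made up for demonstration.

```python
def quantize_params(values, qmin=-128, qmax=127):
    """Derive a scale and zero point that map the value range onto [qmin, qmax]."""
    # Widen the range to include 0.0 so that zero is exactly representable.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    if hi == lo:
        return 1.0, 0  # all values are zero; any scale works
    scale = (hi - lo) / (qmax - qmin)
    zero_point = int(round(qmin - lo / scale))
    zero_point = max(qmin, min(qmax, zero_point))
    return scale, zero_point


def quantize(values, scale, zero_point, qmin=-128, qmax=127):
    """Map floats to int8 codes: round to the nearest step, shift, then clamp."""
    return [max(qmin, min(qmax, int(round(v / scale)) + zero_point)) for v in values]


def dequantize(codes, scale, zero_point):
    """Recover approximate floats from int8 codes."""
    return [scale * (q - zero_point) for q in codes]


# Hypothetical weights, chosen only to exercise the functions.
weights = [-1.0, -0.5, 0.0, 0.25, 1.0]
scale, zero_point = quantize_params(weights)
codes = quantize(weights, scale, zero_point)
restored = dequantize(codes, scale, zero_point)
```

Each restored value differs from the original by at most one quantization step (`scale`), which gives some intuition for why an int8 model can lose little accuracy while using a quarter of the storage per weight.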

Keywords

Deep learning, Machine learning, Effective deployment, OCR system