Integrating Deep Learning and Embedded Computing into the Design of Next-Generation Electromyogram-based (EMG) Human-Machine Interfacing (HMI)

ENGR 697/845 - Senior Design Project · ICELab Research

Description

In recent years, the field of machine learning has seen a resurgence in popularity and application. This has led to classifiers with state-of-the-art performance across various related fields, such as facial expression analysis, skeleton-based human action recognition, and sign language gesture recognition. In conjunction, advances in GPUs have made it possible to vastly increase data throughput, which is required to train AI models on large quantities of data in a shorter period of time and to run inference in real time.
Machine learning models have been applied to EMG signals in the past, most recently deep learning architectures. While state-of-the-art models show great performance, inference is done on high-end GPUs, which is not ideal for prosthetics or assistive-robot applications. This project takes a step further by replicating and optimizing a neural network architecture from a research paper and successfully implementing it on a GPU-based embedded platform, the Jetson TX2. This requires designing the architecture and its different layers, deploying the model onto the system, and testing how the data is processed and whether the requirements are met. Ultimately, this is a test of whether the inference ability of a neural network algorithm on an embedded platform can meet the real-time requirements and the high accuracy levels of other machine learning algorithms. The motivation behind this project is to advance pattern recognition for EMG-based systems.

Architecture

The data pipeline design, training, and testing are done on a host computer, an ICE lab machine with an Nvidia GTX 2080 graphics card, using the csl-hdemg dataset along with data acquired in the lab. Once the AI model reaches viable accuracy levels, it is deployed onto the Jetson platform to test the real-time requirement. This is done via TensorRT, Nvidia's software for deep learning hardware acceleration. The data pipeline is as follows:

  • Data Acquisition: csl-hdemg dataset and data acquired via the OT Bioelettronica Quattrocento
    • The electromyogram signals are composed of 192 channels. The input matrix is 1920x6144, where the rows are the 192 channels and the columns are samples over time. There are multiple trials per gesture, and all the trials' matrices are stacked along the row axis (hence 192 channels x 10 trials = 1920 rows).
    • Sampling rate: 2048Hz
    • 5 subjects, 5 sessions per subject, 10 trials per session
    • The dataset contains 27 gestures divided into 3 sets depending on complexity.
    • The electrode array is composed of three 8x8 grids, totaling 192 electrodes.
    • Bipolar recordings: the voltage difference between adjacent electrodes is recorded, and every 8th channel is discarded.
  • Data pre-processing (see the code sketch after this list)
    1. 4th Order Butterworth Bandpass filter of 20-400Hz.
    2. Non-overlapping windowing scheme of 150 samples.
    3. Find root-mean-square of each window.
    4. Apply baseline normalization: subtract the RMS of the "rest" gesture.
    5. Spatial order-3 one-dimensional median filter: for every 3 samples, select the median.
    6. Adaptive threshold: if the element sum of the RMS of the ith window is greater than the element sum of the average RMS over all windows, mark the window as active.
    7. Gap filling: if windows i-1 and i+1 are active, mark window i as active.
    8. Image conversion using a heatmap Python function.
  • Neural Net (see the diagram in the Description section; a layer sketch follows this list)
    • 2 Convolutional layers: 64 (3x3) kernels.
    • Each is followed by max-pooling layers.
    • 3 Fully connected layers: 512/512/128 outputs respectively
    • Softmax output layer, with the number of gesture classes depending on the set used.
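
The sketch below illustrates the pre-processing steps listed above (a minimal sketch, not the project's actual code). It assumes NumPy/SciPy, a trial matrix shaped (channels, samples) sampled at 2048 Hz, and a precomputed rest_rms vector holding the RMS of the "rest" gesture; all names are illustrative.

    # Minimal sketch of the pre-processing chain (hypothetical helper function).
    import numpy as np
    from scipy.signal import butter, sosfiltfilt, medfilt

    FS = 2048   # sampling rate in Hz
    WIN = 150   # non-overlapping window length in samples

    def preprocess(emg, rest_rms):
        # 1. 4th-order Butterworth band-pass filter, 20-400 Hz
        sos = butter(4, [20, 400], btype="bandpass", fs=FS, output="sos")
        filtered = sosfiltfilt(sos, emg, axis=1)

        # 2-3. Non-overlapping 150-sample windows, RMS per window
        n_win = filtered.shape[1] // WIN
        windows = filtered[:, :n_win * WIN].reshape(filtered.shape[0], n_win, WIN)
        rms = np.sqrt(np.mean(windows ** 2, axis=2))       # shape: (channels, n_win)

        # 4. Baseline normalization: subtract the RMS of the "rest" gesture
        rms = rms - rest_rms[:, None]

        # 5. Order-3 one-dimensional median filter along the channel (spatial) axis
        rms = medfilt(rms, kernel_size=(3, 1))

        # 6. Adaptive threshold: a window is active if its summed RMS exceeds
        #    the summed RMS of the average window
        energy = rms.sum(axis=0)
        active = energy > energy.mean()

        # 7. Gap filling: window i becomes active if windows i-1 and i+1 are active
        active[1:-1] |= active[:-2] & active[2:]

        # 8. (Conversion of the active windows to a heatmap image happens downstream.)
        return rms[:, active]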
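The next sketch shows the layer stack from the Neural Net item, written in PyTorch as an assumption (the project does not state which framework was used); the class name and input image dimensions are placeholders.

    # Hypothetical PyTorch version of the described architecture.
    import torch.nn as nn

    class EMGNet(nn.Module):
        def __init__(self, num_gestures, in_h=24, in_w=8):   # placeholder image size
            super().__init__()
            # 2 convolutional layers, 64 (3x3) kernels each, each followed by max pooling
            self.features = nn.Sequential(
                nn.Conv2d(1, 64, kernel_size=3, padding=1), nn.ReLU(),
                nn.MaxPool2d(2),
                nn.Conv2d(64, 64, kernel_size=3, padding=1), nn.ReLU(),
                nn.MaxPool2d(2),
            )
            flat = 64 * (in_h // 4) * (in_w // 4)
            # 3 fully connected layers with 512 / 512 / 128 outputs
            self.classifier = nn.Sequential(
                nn.Flatten(),
                nn.Linear(flat, 512), nn.ReLU(),
                nn.Linear(512, 512), nn.ReLU(),
                nn.Linear(512, 128), nn.ReLU(),
                # Final layer sized by the gesture set; softmax is applied by the loss
                nn.Linear(128, num_gestures),
            )

        def forward(self, x):
            return self.classifier(self.features(x))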

Critical Aspects

There are certain constraints for this project.

  • Real-time Pipeline: From the moment data is acquired to when it is fed through the neural net, the whole pipeline must remain below 300 ms (the threshold at which a human notices a delay), according to the research paper this project is based on. This in turn leads to a trade-off in terms of the complexity of the net, as well as the number of computations required in the preprocessing stage, especially since we are dealing with large matrices. This is where the GPU-based embedded system comes into play, with its ability to process data in parallel. A timing sketch follows this list.
  • Accuracy: Like any other system dealing with prosthetics, gesture recognition accuracy needs to be in the upper 90s. This reduces errors in interpreting the user's intent, as well as jitter in future mechanical implementations. The trade-off in depth and layer size of the neural net also affects accuracy, so there needs to be an equilibrium between the two. Another aspect of accuracy is overfitting of the model: due to the small size of the dataset, many epochs (training iterations over the entire data) are needed to train the model well. Hyperparameter tuning should be implemented if time allows.
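
As a rough illustration of checking the 300 ms budget mentioned above, a hypothetical timing harness could look like the following, with acquire(), preprocess_fn(), and infer() standing in for the real acquisition, pre-processing, and inference stages:

    # Hypothetical end-to-end latency check against the 300 ms budget.
    import time

    def meets_budget(acquire, preprocess_fn, infer, n_runs=100, budget_ms=300.0):
        total = 0.0
        for _ in range(n_runs):
            start = time.perf_counter()
            raw = acquire()             # grab one window of raw EMG samples
            image = preprocess_fn(raw)  # filtering, RMS, thresholding, heatmap
            infer(image)                # forward pass through the deployed network
            total += (time.perf_counter() - start) * 1000.0
        avg = total / n_runs
        print(f"average latency: {avg:.1f} ms / budget {budget_ms:.0f} ms")
        return avg < budget_ms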

Experiments and Results

Experiments and tests were run at different stages of this project:

  • Architecture accuracy: The csl-hdemg "Tap" dataset and the acquired data were each fed to the neural network separately. Leave-one-out cross-validation was used for the training/test split (1 trial per session was held out for testing; see the sketch after this list). For cross-session accuracy on csl-hdemg, the result was 99% on training and 96% on testing; for the acquired data, 84%/80% respectively.
  • Real-time: Hardware acceleration was used for inference, reducing the average time from 200 ms to 25 ms. In addition, the pre-processing pipeline was parallelized using CUDA, reducing it from ~180 ms to ~50 ms. A deployment sketch follows this list.
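
The split used for the architecture-accuracy test can be sketched as follows, under an assumed data layout of data[subject][session][trial] (names hypothetical):

    # Leave-one-trial-out split: one trial index per session is held out for testing.
    def leave_one_trial_out(data, n_trials=10):
        for held_out in range(n_trials):
            train, test = [], []
            for subject in data:
                for session in subject:
                    for i, trial in enumerate(session):
                        (test if i == held_out else train).append(trial)
            yield train, test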
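The hardware-accelerated inference path can be sketched as an ONNX export followed by a TensorRT engine build on the Jetson (assuming the PyTorch model sketched in the Architecture section; file and tensor names are placeholders):

    # Export the trained model to ONNX, then build a TensorRT engine with trtexec.
    import torch

    model = EMGNet(num_gestures=27).eval()   # EMGNet from the Architecture sketch
    dummy = torch.randn(1, 1, 24, 8)         # placeholder input shape
    torch.onnx.export(model, dummy, "emg_net.onnx",
                      input_names=["emg_image"], output_names=["gesture_logits"])

    # On the Jetson TX2 (shell command, not Python), something along the lines of
    #   trtexec --onnx=emg_net.onnx --saveEngine=emg_net.trt --fp16
    # produces a serialized engine that the TensorRT runtime loads for inference.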

Results: Overall, a successful implementation was achieved: both the real-time constraint (staying below the threshold of noticeable delay) and high accuracy (comparable to current papers) were met. Total time from data acquisition to gesture inference: ~75 ms. In addition, parallelization of the data preprocessing reduced training time from approximately 1.5-2 hours to 30 minutes.

References

  • Hu, Y., Wong, Y., Wei, W., Du, Y., Kankanhalli, M., & Geng, W. (2018). A novel attention-based hybrid CNN-RNN architecture for sEMG-based gesture recognition. PLOS ONE, 13(10), e0206049. doi:10.1371/journal.pone.0206049
  • Amma, C., Krings, T., Böer, J., & Schultz, T. (2015). Advancing Muscle-Computer Interfaces with High-Density Electromyography. Proceedings of the ACM Conference on Human Factors in Computing Systems (CHI '15), 929-938. doi:10.1145/2702123.2702501