
  • Open access
  • Published: 12 December 2022

Exploiting domain transformation and deep learning for hand gesture recognition using a low-cost dataglove

  • Md. Ahasan Atick Faisal,
  • Farhan Fuad Abir,
  • Mosabber Uddin Ahmed &
  • Md Atiqur Rahman Ahad

Scientific Reports volume  12 , Article number:  21446 ( 2022 ) Cite this article


  • Biomedical engineering
  • Electrical and electronic engineering
  • Information technology

Hand gesture recognition is one of the most widely explored areas in the human–computer interaction domain. Although various modalities of hand gesture recognition have been explored over the last three decades, in recent years, due to the availability of hardware and deep learning algorithms, hand gesture recognition research has attained renewed momentum. In this paper, we evaluate the effectiveness of a low-cost dataglove for classifying hand gestures using deep learning. We developed a cost-effective dataglove using five flex sensors, an inertial measurement unit, and a powerful microcontroller for onboard processing and wireless connectivity. To validate our system, we collected data from 25 subjects for 24 static and 16 dynamic American Sign Language gestures. Moreover, we propose a novel Spatial Projection Image-based technique for dynamic hand gesture recognition and explore a parallel-path neural network architecture for handling multimodal data more effectively. Our method produced an F1-score of 82.19% for static gestures and 97.35% for dynamic gestures under a leave-one-out cross-validation approach. Overall, this study demonstrates the promising performance of a generalized, low-cost hand gesture recognition system. The dataset used in this work has been made publicly available.


Introduction

From the dawn of human civilization, communication between humans has been the single most important trait for our survival. It also created the social bonds that, refined over the centuries, transformed us into civilized beings. However, the first mode of communication was not a structured vocal language but involved gestures, often using hands. Later, with the progress of civilization, people adopted structured languages and reserved hand gesture-based communication for special cases. Recent surveys have identified around 6700 spoken languages 1 and several hundred sign languages 2 , although a good number of them are no longer in use. Despite every country having structured vocal languages, sign languages are still the primary means of communication with the deaf and hard-of-hearing community. However, since non-signers are generally unfamiliar with these sign languages, deaf and hard-of-hearing people face communication barriers. On the other hand, with technological developments in sensor technologies, embedded systems, camera technologies, and efficient learning systems, hand gesture recognition research has found efficient and pragmatic solutions to this communication problem.

Hand gestures and their use vary greatly depending on the field of application. Apart from sign-language communication, several other tasks, namely military coordination, interaction with digital devices, and virtual gaming consoles, involve hand gestures. Based on hand motion, hand gestures can be divided into two types—static hand gestures and dynamic ones. Moreover, based on the application, one or both hands can be involved in completing a sign. Over the years, researchers have been developing technologies that use human hand gestures to communicate with the cyber world. Hence, hand gesture recognition has been one of the most widely explored areas in the research domain of Human–Computer Interaction (HCI). The detection systems can be classified into two categories—contactless detection systems, where the detecting device is kept at a distance from the hand and never touches it, and wearable detection systems, which are often implemented with several sensors in close contact with the hand 3 .

Researchers have explored contactless hand gesture recognition systems using several modalities, namely Radio Frequency (RF), ultrasound, and Computer Vision (CV). Google’s Project Soli is a 60 GHz millimeter-wave radar on a chip that can detect fine-grained hand gestures along with micro-finger movements 4 . Wang et al. used this chip and employed a deep learning algorithm to detect 11 dynamic hand gestures from 10 subjects 5 . However, this chip cannot detect gestures at meter-scale distances. In this regard, WiFi has been used as a ubiquitous modality for detecting hand gestures from a greater distance than the Soli chip 6 , 7 , 8 ; however, it falls short in detection precision 3 . On the other hand, several studies have demonstrated the potential of ultrasound for detecting hand gestures with a clear line of sight 9 , 10 , 11 . Although in recent years these RF-based and sound-based modalities have improved in performance and reliability for particular applications, they are still not dependable in regular use cases where environmental parameters vary frequently 3 .

Owing to the tremendous development in Artificial Intelligence (AI) and camera technology, computer vision-based gesture detection systems have been the most widely explored field of research in recent years. Although fundamentally employing the computer vision modality, hand tracking and gesture recognition can be achieved through a variety of techniques, namely skin color detection 12 , 13 , 14 , appearance detection 15 , 16 , 17 , 18 , motion-based detection 19 , 20 , skeleton-based detection 21 , 22 , and depth detection 23 , 24 , 25 , 26 . Apart from the conventional RGB camera, the IR-based leap motion controller 27 , 28 and the Microsoft Kinect depth camera 26 , 29 , 30 , 31 are two of the most widely used hardware platforms for depth and skeletal information detection. However, the primary shortcomings of these methods stem from environmental dynamics, namely lighting conditions, line of sight, and detector proximity. Although depth-based and skeleton-based approaches have become more robust over the years, such as MediaPipe by Google 32 , they still have not overcome those shortcomings completely. Moreover, due to the cost of high-quality depth sensors, the usability of such systems remains limited.

On the other hand, sensor-based wearable datagloves are one of the most widely used contact-based hand gesture recognition systems and overcome most of the shortcomings of contactless detection methods. VPL Inc. first introduced a commercial sensor-based dataglove back in 1987 33 . The developers of this dataglove invented optical flex sensors, which enabled them to track finger flexion. Despite the technology being invented at such an early stage of HCI, these datagloves were not widely adopted due to their high cost and impracticality for regular use. In the last decade, owing to the development of low-cost, high-performance sensors, processing devices, connectivity, and algorithms, researchers have explored new avenues of sensor-based hand gesture recognition.

Over the years, a wide range of both commercially available and custom-made sensors have been used on datagloves to accurately capture hand gesture dynamics. Several studies have explored surface electromyography (sEMG) sensors to capture the electrical activity inside the hand muscles during gesture performance 34 , 35 , 36 , 37 , 38 , 39 , 40 . Moreover, Wang et al. 41 , Abreu et al. 42 , and Su et al. 43 used variants of the Myo band, a commercial sEMG armband specifically designed to track hand gestures. Although sEMG shows reliable performance across a wide range of gestures including sign languages, detection performance was sensitive to sensor placement and to the individual signer. Moreover, several studies have used resistive flex sensors and their variants for actively tracking finger flexion 44 , 45 , 46 and an Inertial Measurement Unit (IMU) to detect hand movements 40 , 44 , 45 , 47 , 48 . Wen et al. developed a smart glove with 15 triboelectric nanogenerator (TENG)-based sensors for tracking 50 American Sign Language (ASL) words and 20 sentences 49 . Furthermore, in several studies, fusing multiple sensors has shown greater performance than single-sensor methods 40 , 41 , 43 , 45 , 46 , 50 .

In our previous study, we presented a dataglove with five flex sensors and one IMU and evaluated its performance on a limited number of gestures and subjects 44 . In this work, we adopted the same sensor-fusion configuration and combined it with state-of-the-art deep learning techniques. We propose a Spatial Projection Image-based deep learning technique for dynamic hand gesture recognition and a parallel-path neural network architecture for multimodal sensor data analysis. The system successfully recognizes 40 words from the ASL dictionary, including 24 static and 16 dynamic signs collected from 25 subjects. In a nutshell, the key contributions of this work are as follows:

We constructed a low-cost wireless capable dataglove combining flex sensors and IMU and explored state-of-the-art deep learning techniques on it.

We provided a large dataset of 40 ASL letters and words collected from 25 subjects using the proposed dataglove.

We introduced Spatial Projection to convert the 1D time-series signals into 2D images for dynamic gesture recognition, which outperformed 1D CNN and classical machine learning-based approaches.

The proposed parallel-path neural network architecture showed superior feature extraction capability from multimodal data over conventional architectures.

Methods and materials

Hardware configuration

The primary hardware is a dataglove consisting of three units, namely the sensing, processing, and onboard power regulation units. The sensing unit comprises five 2.2" flex sensors (SEN-10264) and an IMU (MPU-6050), which contains a triaxial accelerometer and a triaxial gyroscope. The overall hardware configuration is illustrated in Fig.  1 .

figure 1

The dataglove architecture: On the left, we have the glove with all the mounted sensors and electronics. A flex sensor is shown in the top right corner. The components of the main controller board are shown in the bottom right corner. It consists of an ESP32 microcontroller, an MPU-6050 IMU, and some complementary electronics.

Sensing unit

The flex sensors are, in fact, variable resistors with a flat resistance of 25 kΩ (±30%), placed above the five fingers of the dataglove in fabric pockets to sense finger flexion. A voltage divider was formed with each flex sensor and a 0.25 W, 100 kΩ (±5%) resistor, converting the change in resistance during finger flexion into a voltage change that the processing unit samples 51 .
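
As a quick sanity check of this sensing principle, the sketch below evaluates the divider output for the flat and bent sensor resistances; the exact wiring (which leg of the divider the ADC reads) is not stated in the text and is therefore an assumption.

```python
# Hypothetical wiring: flex sensor on the high side, the fixed 100 kOhm
# resistor to ground, and the ADC sampling the voltage across the resistor.
V_SUPPLY = 3.3           # V, regulated supply
R_FIXED = 100_000.0      # ohm, 0.25 W, +/-5 % resistor

def divider_voltage(r_flex_ohm: float) -> float:
    """ADC input voltage for a given flex-sensor resistance."""
    return V_SUPPLY * R_FIXED / (R_FIXED + r_flex_ohm)

print(divider_voltage(25_000))   # flat sensor (~25 kOhm): ~2.64 V
print(divider_voltage(50_000))   # bent sensor (resistance roughly doubled): ~2.20 V
```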

The accelerometer and gyroscope of the IMU are configured to track linear acceleration within ±19.6 m s −2 and angular velocity within ±4.36 rad s −1 , respectively, which is well within the range of any human hand motion. Moreover, the IMU contains a Digital Motion Processor (DMP) which can derive the quaternions in-chip from the accelerometer and gyroscope data and thus provides the hand orientation along with the motion information 52 .

Processing unit

The processing unit is a WiFi-enabled development module called DOIT ESP32 Devkit V1, which has a Tensilica Xtensa LX microprocessor with a maximum clock frequency of 240 MHz. Its 12-bit analog-to-digital converter (ADC), with a maximum sampling rate of 200 kilosamples per second, samples the flex sensors’ analog data with sufficient resolution. Moreover, the module can communicate with external computers via USB, which enables wired data communication 53 .

Onboard power regulation

The ESP32 module and the IMU have an operating voltage of 3.3 V 52 , 53 . On the other hand, the flex sensors do not have a strict operating voltage 51 . Hence, we used an LM1117 low-dropout (LDO) 3.3 V linear voltage regulator to regulate the supply voltage from the 3.7 V single-cell LiPo battery. Moreover, we used 10 μF and 100 μF filtering capacitors to filter out the supply noise.

We explored 40 signs from the standard ASL dictionary, including 26 letters and 14 words. Among these signs, 24 require only a certain finger flexion and no hand motion; hence, they are referred to as static signs or gestures. Conversely, the remaining 16 signs require hand motion alongside finger flexion to portray a meaningful expression according to the ASL dictionary. Moreover, we collected the signs from 25 subjects (19 male and 6 female) in separate data recording sessions with a consistent protocol. Overall, three acceleration channels in each of the body and earth axes, three channels for angular velocity, four for the quaternion, and five for the flex sensors were recorded in the dataset.

The data was recorded by the dataglove processing unit, which was connected to a laptop for data storage via USB. The sampling frequency was set to 100 Hz, and each gesture was repeated 10 times to record the performance variability of each subject. However, during a few sessions, denoted in the dataset supplementary information, the laptop charger was connected, which introduced AC-induced noise into those recordings.

Data recording protocol

Before starting the recording process, each subject signed an approval form for the usage of their data in this research and was briefed about the data recording steps. As the subjects were not familiar with the signs before the study, they were taught each sign before the data recording via online video materials 54 . The data was recorded by the dataglove and stored on the laptop at the same time. Hence, a Python script was used on the laptop to handle the handshake between the two devices and to store the data in separate folders per sign and subject.

At the beginning of each data recording session, the subjects were prompted to declare their subject id and the gesture name. Afterward, a five-second countdown was displayed on the laptop screen for preparation. Each instance of the gesture data was recorded in a 1.5 s window, within which the subjects could comfortably perform the gesture once. In a single gesture recording session, this process was repeated 10 times. The gesture recording flow for each session is shown in Fig.  2 . All methods were carried out following the relevant guidelines, and the correctness of gestures was evaluated by visual inspection. All experimental protocols were approved by the University of Dhaka, Dhaka, Bangladesh. Note that informed consent was obtained from all subjects.
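
A minimal sketch of the laptop-side recording loop described above is given below; the serial port name, baud rate, packet format, and file layout are assumptions, since only the countdown, the 1.5 s window, the 100 Hz rate, and the 10 repetitions are stated in the text.

```python
# Hypothetical sketch of the laptop-side recording script described above.
import csv, os, time
import serial  # pyserial

FS, WINDOW_S, REPS = 100, 1.5, 10          # 100 Hz, 1.5 s window, 10 repetitions
SAMPLES = int(FS * WINDOW_S)

subject_id = input("Subject id: ")
gesture = input("Gesture name: ")
out_dir = os.path.join("data", subject_id, gesture)
os.makedirs(out_dir, exist_ok=True)

with serial.Serial("/dev/ttyUSB0", 115200, timeout=1) as port:   # assumed port/baud
    for rep in range(REPS):
        for remaining in range(5, 0, -1):                        # five-second countdown
            print(f"Recording {rep + 1}/{REPS} in {remaining}...")
            time.sleep(1)
        with open(os.path.join(out_dir, f"rep_{rep:02d}.csv"), "w", newline="") as f:
            writer = csv.writer(f)
            for _ in range(SAMPLES):                             # one 1.5 s window
                line = port.readline().decode(errors="ignore").strip()
                writer.writerow(line.split(","))                 # comma-separated channels
```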

figure 2

The flowchart showing the data collection protocol. The diagram shows all the different steps of the data collection process. This protocol was followed during the data collection for all the subjects.

Data preprocessing

Gravity compensation

The triaxial accelerometer of the IMU records body-frame acceleration, which is subject to gravity. Hence, the gravity component has to be removed from the recorded raw acceleration to interpret the actual motion of the dataglove. The gravity vector can be derived from the orientation of the dataglove. Quaternions express the 3D orientation of an object and are a robust alternative to Euler angles, which are often affected by gimbal lock 55 . The Digital Motion Processor (DMP) of the MPU-6050 processes the raw acceleration and angular velocity internally and produces quaternions, which can be expressed by Eq. ( 1 ).

where Q stands for a quaternion that contains a scalar part, q_w, and a vector part, q(q_x, q_y, q_z). The overall gravity compensation process is described in Eqs. ( 2 ) and ( 3 ) 56 .

where g(g_x, g_y, g_z), Q(q_w, q_x, q_y, q_z), la(la_x, la_y, la_z), and a(a_x, a_y, a_z) denote the gravity vector, quaternion, linear acceleration vector, and raw acceleration vector, respectively. The resultant linear acceleration (la) represents the body-axis acceleration compensated for the gravity offset. This step was performed in the processing unit of the dataglove.
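
A minimal sketch of this gravity-compensation step is shown below, assuming the quaternion convention commonly used with the MPU-6050 DMP (the standard form of Eqs. (2) and (3)); the exact axis and sign conventions depend on how the DMP is configured and are therefore assumptions.

```python
import numpy as np

def gravity_vector(q):
    """Gravity direction in the sensor (body) frame from a unit quaternion
    q = (qw, qx, qy, qz); standard DMP-style form of Eq. (2)."""
    qw, qx, qy, qz = q
    return np.array([
        2.0 * (qx * qz - qw * qy),
        2.0 * (qw * qx + qy * qz),
        qw * qw - qx * qx - qy * qy + qz * qz,
    ])

def linear_acceleration(a_raw, q, g0=9.81):
    """Body-axis linear acceleration, Eq. (3): subtract the gravity component
    from the raw accelerometer reading (a_raw in m/s^2)."""
    return np.asarray(a_raw) - g0 * gravity_vector(q)
```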

Axis rotation

Both the recorded raw acceleration and the gravity-compensated linear acceleration were expressed in the body axes of the dataglove, which depend on the orientation of the dataglove when it powers up. This dependence on the initial orientation is problematic for real-world applications. Hence, we converted the triaxial acceleration vector from the body axes to the North-East-Down (NED) coordinate system, which is fixed with respect to the earth 57 . At first, a rotation matrix was calculated using the quaternions. Afterward, the NED linear acceleration was derived by matrix multiplication between the rotation matrix and the body-axis linear acceleration. Equations ( 4 ) and ( 5 ) show this axis transformation process using quaternions 58 .

where R, Q(q_w, q_x, q_y, q_z), LA(LA_x, LA_y, LA_z), and la(la_x, la_y, la_z) denote the rotation matrix, quaternion, NED linear acceleration, and body-axis linear acceleration, respectively. Similar to the previous step, this axis transformation is also done in the processing unit of the dataglove. Figure  3 illustrates the axial diagram of the dataglove and the axis rotation.
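
The sketch below illustrates the transformation of Eqs. (4) and (5) using the standard quaternion-to-rotation-matrix formula; whether this is the body-to-earth or earth-to-body direction depends on the DMP's quaternion convention, so the orientation of the matrix here is an assumption.

```python
import numpy as np

def rotation_matrix(q):
    """Rotation matrix from a unit quaternion (qw, qx, qy, qz), using the
    standard quaternion-to-DCM formula (Eq. 4)."""
    qw, qx, qy, qz = q
    return np.array([
        [1 - 2 * (qy**2 + qz**2), 2 * (qx * qy - qw * qz), 2 * (qx * qz + qw * qy)],
        [2 * (qx * qy + qw * qz), 1 - 2 * (qx**2 + qz**2), 2 * (qy * qz - qw * qx)],
        [2 * (qx * qz - qw * qy), 2 * (qy * qz + qw * qx), 1 - 2 * (qx**2 + qy**2)],
    ])

def to_earth_axes(la_body, q):
    """Rotate body-axis linear acceleration into the earth (NED) frame, Eq. (5)."""
    return rotation_matrix(q) @ np.asarray(la_body)
```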

figure 3

The IMU orientation diagram: On the left, we have the X, Y, and Z coordinates of the MPU-6050; the accelerometer and gyroscope values are recorded along these 3 axes. The figure on the right shows the body-axis to earth-axis conversion diagram.

Rolling filters

After closer inspection, we found a few random spikes in the IMU data. Hence, we first removed them using a 10-point rolling median filter. Second, after the spike removal, we applied an additional moving average filter only to the specific sessions whose recordings were subject to AC-induced noise, which resulted in comparable waveforms across all data recordings. The implementation of the moving average filter is shown in Eq. ( 6 ) 59 :

where x[n] is the input signal, N stands for the number of data points, and y[n] denotes the output signal. After applying the rolling average, there were a few null values at the end of each signal frame, which were replaced by the nearest values in that signal. According to the data recording protocol, the gestures were performed in the middle of each 1.5-s window; hence, replacing the few terminal data points with the nearest available valid data point does not change the signal morphology. Lastly, we applied another 10-point rolling average filter, this time to the whole dataset, to further smooth the signal, and again replaced the terminal null values with the nearest valid data point in each frame.
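
A compact sketch of this filtering pipeline using pandas rolling windows is shown below; the window alignment and the exact order of the edge-filling step are assumptions.

```python
import pandas as pd

def clean_channel(x, median_win=10, avg_win=10, apply_extra_avg=False):
    """Spike removal and smoothing as described above: a 10-point rolling
    median, an optional extra moving average for the AC-noise-affected
    sessions, then a final 10-point moving average. NaNs created at the frame
    edges by the rolling windows are filled with the nearest valid value."""
    s = pd.Series(x)
    s = s.rolling(median_win).median()
    if apply_extra_avg:                       # only for the noisy sessions
        s = s.rolling(avg_win).mean()
    s = s.rolling(avg_win).mean()
    return s.ffill().bfill().to_numpy()       # nearest-value fill at the edges
```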

Normalization

The processed acceleration and flex sensor data are not in the same range. Hence, before employing the AI-based classification technique, we normalized the data, a widely practiced step for better convergence of the loss function 60 . We used min–max scaling as the normalization technique with a range of [0, 1], as shown in Eq. ( 7 ) 61 :

where x is the input and x_normalized is the normalized output. x_max and x_min denote the maximum and minimum values of the input, respectively.
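
Eq. (7) corresponds to the usual per-channel min-max scaling, sketched below.

```python
import numpy as np

def min_max_scale(x):
    """Min-max normalization to [0, 1] (Eq. 7), applied per channel."""
    x = np.asarray(x, dtype=float)
    x_min, x_max = x.min(), x.max()
    return (x - x_min) / (x_max - x_min)
```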

Spatial projection images generation

There are several challenges associated with dynamic sign language recognition. In our case, the temporal dependency and the size of the hand were the most challenging issues. A signer can perform a sign at many different speeds, and the speed varies from signer to signer. To successfully recognize signs from all the subjects, this temporal dependency first needs to be removed from the signals. The second challenge was the hand size of the signer, which introduced variability in the gestures performed by different signers. In the proposed method, we address these two issues by utilizing Spatial Projection Images of the dynamic gestures. However, the static gestures do not generate a meaningful pattern in the projections due to their stationary nature; hence, this step is omitted for static signs.

When interpreting a sign, the speed at which it is performed and the signer's hand size do not matter. The spatial pattern created by the motion of the signer’s hand defines the sign; as long as the pattern is correct, the sign is considered valid regardless of its temporal and spatial scale. To capture this pattern of sign language gestures, we utilized the accelerometer data from our device. Using Eqs. ( 8 – 9 ), we converted the 3D acceleration into 3D displacement vectors. These vectors represent the path followed by the hand in 3D space during the performance of the gesture.
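
A minimal sketch of this acceleration-to-displacement conversion (Eqs. 8–9) via double numerical integration is shown below; the paper does not state the exact integration scheme or any drift correction, so cumulative trapezoidal integration without detrending is an assumption.

```python
import numpy as np
from scipy.integrate import cumulative_trapezoid

def displacement_from_acceleration(acc, fs=100.0):
    """Double-integrate earth-axis acceleration (T x 3) into a displacement
    path (T x 3), following Eqs. (8)-(9). Drift handling is omitted."""
    dt = 1.0 / fs
    vel = cumulative_trapezoid(acc, dx=dt, axis=0, initial=0.0)   # acceleration -> velocity
    disp = cumulative_trapezoid(vel, dx=dt, axis=0, initial=0.0)  # velocity -> displacement
    return disp
```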

These 3D displacement vectors were then projected onto the XY, YZ, and ZX 2D planes. If the vectors are projected onto these planes for the entire timeframe of the sign, the projections form a 2D path that captures the pattern of the sign in the 3 planes, as shown in Fig.  4 . No matter at which speed the gesture is performed, these 2D projections always provide similar patterns; hence, the temporal dependency is eliminated in this process.

figure 4

Spatial projection generation process. We start with the 3-axis acceleration and then convert them into 3-axis displacement vectors. These vectors are projected onto the 2D spatial planes to generate the projection images.

After capturing the pattern of a particular gesture, we normalize the projections using the maximum and minimum values along each axis. In this way, the projections from different signers result in similar patterns regardless of their hand size.

The projections were generated using the Python Matplotlib 62 library: the displacement components along the 3 axes were plotted two at a time for the three axis planes (XY, YZ, and ZX). We used a line plot with the “linewidth” parameter set to 7 and the line color set to black. This resulted in 3 grayscale images, one per projection plane, for each gesture. The images were then resized to 224 × 224 pixels and used as the input to our proposed model.
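
The sketch below reproduces this rendering step under stated assumptions: the figure size and DPI are chosen so that the rendered canvas is already 224 × 224, and axis limits are left to Matplotlib's defaults.

```python
import matplotlib
matplotlib.use("Agg")
import matplotlib.pyplot as plt
import numpy as np

def projection_image(u, v):
    """Render one 2D projection of the displacement path as a grayscale
    image: black line, linewidth 7, 224 x 224 pixels (2.24 in at 100 dpi)."""
    fig, ax = plt.subplots(figsize=(2.24, 2.24), dpi=100)
    ax.plot(u, v, color="black", linewidth=7)
    ax.axis("off")
    fig.canvas.draw()
    rgba = np.asarray(fig.canvas.buffer_rgba())       # (224, 224, 4) uint8
    plt.close(fig)
    return rgba[..., :3].mean(axis=-1)                # grayscale array

# Three planes per gesture: XY, YZ, ZX
# imgs = [projection_image(disp[:, a], disp[:, b]) for a, b in [(0, 1), (1, 2), (2, 0)]]
```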

The proposed architecture

In this section, we present the network architecture of our proposed framework (Fig.  5 ). We have used two variations of the architecture for static and dynamic signs.

figure 5

Proposed network architectures: ( a ) Overall diagram of the proposed architecture. For static gestures, the sensor channels are processed by parallel 1D ConvNet blocks. For dynamic gestures, the accelerations are first converted into spatial projection images and features are extracted from them using the pre-trained MobileNetV2 network, ( b ) the architecture of the 1D ConvNet Blocks, and ( c ) the architecture of MobileNetV2.

Architecture for static gestures

As mentioned in the Data Preprocessing subsection, Spatial Projection Images are not used for static gestures. The normalized time-series channels are passed to separate 1D ConvNet blocks to produce embeddings. These embeddings are then concatenated and fed to a fully connected layer, which in turn makes the prediction. Figure  5 a shows the stacked 1D ConvNet block architecture for static gesture detection.

Architecture for dynamic gestures

We have utilized two different types of input signals for our model. First, we have the 3 spatial projection images generated from the acceleration data. Second, we have the 1D time-series signals from the flex sensors. So, in total, we have 8 channels of input data: 3 image channels and 5 time-series signal channels. Each of these channels was processed using a separate ConvNet block to produce the embedding for that particular channel. For the static gestures, the 8 time-series signals were processed using the parallel-path ConvNet architecture shown in Fig.  5 b. On the other hand, the projection images were processed by a 2D ConvNet architecture (MobileNetV2 63 ), as shown in Fig.  5 c. The architectural details of these two ConvNet blocks are discussed below.
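
A sketch of this two-branch, parallel-path model for dynamic gestures is given below. The shared MobileNetV2 backbone and the concatenation of per-channel embeddings follow the description above; the simplified stand-in flex-channel block, the sequence length, the way grayscale projections are expanded to three channels, and the omission of MobileNetV2 input preprocessing are assumptions (the full 1D ConvNet block is detailed in the next subsection).

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

def flex_embedding(inp):
    # Simplified stand-in for the per-channel 1D ConvNet block described in
    # the next subsection.
    x = layers.Conv1D(16, 3, padding="same", activation="relu")(inp)
    x = layers.GlobalAveragePooling1D()(x)
    return layers.Dense(50, activation="relu")(x)

def build_dynamic_model(n_classes=16, seq_len=150):
    """Three projection images -> shared MobileNetV2 features; five flex
    channels -> per-channel 1D embeddings; all embeddings concatenated into
    a softmax classification head."""
    backbone = tf.keras.applications.MobileNetV2(
        input_shape=(224, 224, 3), include_top=False, pooling="avg")

    img_inputs, flex_inputs, embeddings = [], [], []
    for _ in range(3):                               # XY, YZ, ZX projections
        inp = layers.Input((224, 224, 1))
        rgb = layers.Concatenate()([inp, inp, inp])  # grayscale -> 3 channels
        embeddings.append(backbone(rgb))
        img_inputs.append(inp)
    for _ in range(5):                               # flex sensor channels
        inp = layers.Input((seq_len, 1))
        embeddings.append(flex_embedding(inp))
        flex_inputs.append(inp)

    merged = layers.Concatenate()(embeddings)
    out = layers.Dense(n_classes, activation="softmax")(merged)
    return Model(img_inputs + flex_inputs, out)
```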

1D ConvNet block

The 1D ConvNet blocks are composed of 4 convolution layers. Each pair of convolution layers is followed by a BatchNormalization layer and a MaxPooling layer. The kernel size used in the convolution layers was set to 3, the stride to 1, and the padding to 1. The MaxPooling kernel size was set to 2, and the ReLU activation function was used. After the 4 convolution layers, a fully connected layer with 50 neurons was used to extract the embeddings.
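
A sketch of this 1D ConvNet block and the parallel-path assembly for static gestures is shown below; the filter counts, the embedding activation, the channel count (assumed to be the five flex plus three acceleration channels), and the sequence length (150 samples for a 1.5 s window at 100 Hz) are assumptions not stated explicitly in the text.

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

def conv1d_block(inp, filters=(32, 32, 64, 64)):
    """Four Conv1D layers (kernel 3, stride 1, same padding, ReLU) with
    BatchNorm + MaxPool (size 2) after each pair, then a 50-unit embedding."""
    x = inp
    for i, f in enumerate(filters):
        x = layers.Conv1D(f, 3, strides=1, padding="same", activation="relu")(x)
        if i % 2 == 1:                          # after every pair of conv layers
            x = layers.BatchNormalization()(x)
            x = layers.MaxPooling1D(2)(x)
    x = layers.Flatten()(x)
    return layers.Dense(50, activation="relu")(x)

def build_static_model(n_channels=8, seq_len=150, n_classes=24):
    """Parallel-path model for static gestures: one block per sensor channel,
    embeddings concatenated into a fully connected classification head."""
    inputs = [layers.Input((seq_len, 1)) for _ in range(n_channels)]
    merged = layers.Concatenate()([conv1d_block(i) for i in inputs])
    out = layers.Dense(n_classes, activation="softmax")(merged)
    return Model(inputs, out)
```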

2D ConvNet block

The 2D ConvNet blocks are constructed using the MobileNetV2 63 architecture. MobileNet 64 is an efficient architecture for mobile and embedded vision applications. It utilizes depthwise separable convolutions 65 to significantly reduce the computational burden compared to regular convolution. In a depthwise separable convolution, each channel is processed with its own convolution filter and the results are combined using a 1 × 1 pointwise convolution. This is known as factorization, and it drastically reduces the computation and model size.
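
The factorization can be illustrated by comparing weight counts for a single layer, as in the sketch below (the input shape is for illustration only; counts ignore biases).

```python
from tensorflow.keras import layers

inp = layers.Input((112, 112, 32))

# Regular 3x3 convolution, 32 -> 64 channels: 3*3*32*64 = 18,432 weights.
regular = layers.Conv2D(64, 3, padding="same")(inp)

# Depthwise separable factorization of the same layer:
depthwise = layers.DepthwiseConv2D(3, padding="same")(inp)   # 3*3*32 = 288 weights
pointwise = layers.Conv2D(64, 1)(depthwise)                  # 1*1*32*64 = 2,048 weights
# The factorized form uses roughly 8x fewer weights (and multiply-accumulates)
# for the same output shape.
```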

MobileNetV2 63 is the result of improvements made to the original MobileNet architecture. It uses an inverted residual structure 66 where the skip connections are between the thin bottleneck layers, which improves performance compared to the classical structure. The MobileNetV2 architecture starts with a regular convolution layer with 32 filters, followed by 19 residual bottleneck layers. The kernel size was set to 3 × 3, and ReLU6 64 was used as the activation function.

We used the Tensorflow 67 Python library to implement the proposed network. For the loss function, we used sparse categorical cross-entropy. The loss was minimized using the Adam 68 optimizer with a learning rate of 0.0001. The network was trained for a maximum of 300 epochs with an early stopping criterion on the validation loss with a patience of 30 epochs.
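
This training configuration translates to the following Keras calls; `model`, `train_ds`, and `val_ds` are placeholders, and restoring the best weights after early stopping is an added assumption not stated in the paper.

```python
import tensorflow as tf

def compile_and_train(model, train_ds, val_ds):
    """Adam (lr 1e-4), sparse categorical cross-entropy, up to 300 epochs,
    early stopping on the validation loss with a patience of 30 epochs."""
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    early_stop = tf.keras.callbacks.EarlyStopping(
        monitor="val_loss", patience=30, restore_best_weights=True)
    return model.fit(train_ds, validation_data=val_ds,
                     epochs=300, callbacks=[early_stop])
```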

Ethical approval

We took written consent from all the subjects participating in the data collection process. The consent form stated that the data will only be used for research purposes. Moreover, the dataset does not contain any personal information about the subjects other than their sex and age.

Evaluation criteria

Evaluation metrics

To evaluate our architecture for the static and dynamic gestures, we adopted four evaluation criteria, namely macro-averaged precision, macro-averaged recall, macro-averaged F1, and accuracy, which are described in Eqs. ( 10 – 16 ).

where TP, FP, and FN denote true positive, false positive, and false negative, respectively. Moreover, i indexes the particular gesture or subject, and N stands for the total number of gestures or subjects. For evaluating per-gesture performance, we used the per-class precision, recall, and F1-score, and for overall reporting, we adopted the macro-average method.
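
A minimal implementation of these per-class and macro-averaged metrics (Eqs. 10–16) is sketched below.

```python
import numpy as np

def macro_metrics(y_true, y_pred, n_classes):
    """Per-class precision, recall, and F1 from TP/FP/FN counts, then their
    macro averages and the overall accuracy."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    precision, recall, f1 = [], [], []
    for c in range(n_classes):
        tp = np.sum((y_pred == c) & (y_true == c))
        fp = np.sum((y_pred == c) & (y_true != c))
        fn = np.sum((y_pred != c) & (y_true == c))
        p = tp / (tp + fp) if tp + fp else 0.0
        r = tp / (tp + fn) if tp + fn else 0.0
        precision.append(p)
        recall.append(r)
        f1.append(2 * p * r / (p + r) if p + r else 0.0)
    accuracy = np.mean(y_true == y_pred)
    return np.mean(precision), np.mean(recall), np.mean(f1), accuracy
```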

Validation method

There are several validation techniques used for evaluating a machine-learning (ML) model. Among these techniques, we used the leave-one-out cross-validation (LOOCV) method to determine the performance of the architecture. LOOCV is regarded as one of the most challenging validation techniques because, in each training and evaluation session, the model is evaluated on a single unseen subject’s data. Hence, if that particular subject’s data deviates significantly from the other subjects in the training set, the resulting metrics are heavily penalized. Increasing the number of subjects also increases the chance that the training set contains data representative of the held-out test subject.

However, our rationale for using the LOOCV technique is to challenge the generalization of our trained model and to test its capability on unseen subjects’ data. Here, we separated one subject from the dataset as the test set and used the remaining subjects’ data as the training set. We repeated this process for all 25 subjects and evaluated the overall results at the end.
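
A sketch of this leave-one-subject-out loop using scikit-learn's LeaveOneGroupOut splitter is given below; `X`, `y`, `subject_ids`, and `build_model` are placeholders for the preprocessed data, labels, per-sample subject identifiers, and the network constructor.

```python
import numpy as np
from sklearn.model_selection import LeaveOneGroupOut

def loocv_evaluate(X, y, subject_ids, build_model):
    """Leave-one-subject-out cross-validation: each of the 25 subjects is held
    out once as the test set and the model is retrained on the rest."""
    logo = LeaveOneGroupOut()
    fold_scores = []
    for train_idx, test_idx in logo.split(X, y, groups=subject_ids):
        model = build_model()
        model.fit(X[train_idx], y[train_idx], epochs=300, verbose=0)
        fold_scores.append(model.evaluate(X[test_idx], y[test_idx], verbose=0))
    return np.mean(fold_scores, axis=0)   # aggregate over the 25 folds
```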

Experiments

Baseline methods

Since we used a custom-made dataglove for this study and our dataset has not been benchmarked before, two classical ML models and one deep learning model were employed to establish baseline results. The two classical ML algorithms provided the top performance in our previous study with the same dataglove. Moreover, 1D CNN is one of the most widely used deep learning algorithms for time-series data; Wen et al. 49 used this architecture as the AI algorithm in their study. Hence, we chose these methods as the baselines. Table 1 shows the results of these baseline methods for both static and dynamic gestures.

Performance evaluation of the proposed method

We evaluated the proposed architecture for static and dynamic gestures separately. The confusion matrices illustrated in Fig.  6 present the performance for each class. Moreover, Table 2 presents the evaluation metrics for each gesture within each gesture category, and Table 3 shows the overall metrics for static and dynamic gestures.

figure 6

Confusion matrices: ( a ) confusion matrix for the static signs; ( b ) confusion matrix for the dynamic signs.

Static gestures

In the proposed architecture, we used individual 1D ConvNet blocks for each channel of the flex sensors and the IMU to produce embeddings. The flex sensors capture the finger movements, whereas the orientation can be interpreted from the acceleration. The confusion matrix in Fig.  6 a shows the majority of detections on the diagonal with a few misclassifications. Among the 24 static gestures, 14 were classified with F1-scores over 0.8, two (k, x) had F1-scores between 0.7 and 0.8, and the F1-scores dropped below 0.7 for seven static gestures (c, e, o, s, t, u, v).

According to Fig.  7 , c and o are very similar to each other in hand shape and orientation 69 . The only difference is the position of the thumb with respect to the other four fingers, which touch each other during o but remain separate during c. The use of a contact sensor on the tip of the thumb might improve this classification.

figure 7

The standard ASL signs for letters and numbers 69 .

Moreover, u and v have similar finger flexion and orientation. The only subtle difference between these two gestures is that the index touches the middle finger during u but does not do so during v . A contact sensor between these two fingers might improve the detection ability of the model.

Based on Fig.  7 , we found similarities between e and s as well. While the thumb is kept below the other fingertips during e , it remains on top of the fingers, like a fist, during s . Although the flexion of the four fingers is slightly different, the subtle differences in the flex sensor data were not learned by the model.

Lastly, t is one of the most complex gestures to detect with a dataglove, as it is performed with the thumb kept between the index and middle fingers. The finger flexion is similar in x as well. Moreover, for some subjects, the index finger was not bent enough, which resulted in a flexion similar to d . Therefore, the model sometimes misclassified t as x and d .

Within the 0.7–0.8 F1-score range, the model falsely predicted x as t and k as p in a few cases. This is also due to the similarities between these gestures.

Dynamic gestures

Compared to the static gestures, our model performed considerably better on the dynamic gestures, with F1-scores ranging from a perfect 1 for please to 0.9295 for hello . Although the gesture hello is significantly different from sorry or yes , the confusion matrix shows some misclassifications among these classes (Fig.  8 demonstrates the differences among these 3 classes). However, since we used the LOOCV technique to generate these results, the subject-induced bias in one gesture might affect the validation of a different gesture performed by another subject.

figure 8

Differences among ( a ) ‘hello’, ( b ) ‘sorry’, and ( c ) ‘yes’ gestures.

Comparison with previous works

Based on our literature review, Table 4 summarizes sensor-based gesture recognition works from 2016 onward for ease of comparison.

According to the comparison, several studies report better accuracy than this work. However, the number of volunteers, number of gestures, and validation method are not the same across these studies, so a direct comparison of our method with these systems is not possible. For example, among these works, Wen et al. 49 , Lee et al. 45 , and Abhishek et al. 72 did not provide enough information in their manuscripts regarding the number of volunteers in their datasets. Although other works have mentioned the number of users, most of them, for example Su et al. 43 , did not consider user-independent performance. In practice, AI-based models perform inconsistently on new subjects, making user-specific metrics unreliable.

However, Wen et al. 49 , Lee et al. 45 , and Saquib et al. 70 customized their datagloves with sensors placed at specific points to detect touches at the fingertips. Such sensor placements improved the detection of some specific ASL letters. In this work, we proposed a generalized hand gesture recognition system and used ASL signs only for validation. On the other hand, such ASL-specific systems might not show similar performance in other application domains.

Moreover, the number of gestures, the number of subjects, and the gesture type are three significant parameters for performance comparison. For example, in our previous work 44 , we used K-nearest neighbors (KNN) with the same dataglove, which resulted in an accuracy of 99.53% for static and 98.64% for dynamic gestures. However, that study included only 14 static and 3 dynamic gestures collected from a total of 35 volunteers, and the gestures chosen for that study were far more distinct from each other than the ones used here.

The comparison among systems cannot be based on accuracy alone. Considering the gesture type, number of gestures, number of volunteers, application, and validation method, this study presents a more robust and economical hand gesture recognition solution than other recent works.

Limitations

Domain-specific improvement

Each application of hand gesture recognition is different. Hence, some domain-dependent limitations are encountered in the model’s performance for a few classes, and these might vary across different sign language dictionaries. In this particular application, contact sensors at the tip of the thumb and between the index and middle fingers would be required to improve performance.

Limitation in everyday use

Although made using low-cost, commercially available sensors and modules, the dataglove is not feasible for everyday outdoor use, which limits the use of such systems to particular domains.

Applications

Video conference

Due to the COVID-19 pandemic, the use of video conferencing has increased sharply. However, for the deaf and hard-of-hearing community, access to these video conferences is a challenge, since some platforms might not have a real-time computer vision-based sign interpreter. In this case, accessibility software using our dataglove and the proposed AI-based gesture detection system might open new avenues for the deaf and hard-of-hearing community.

Robot control

One of the primary applications of hand gesture recognition is controlling a remote cyber-physical body, such as a robot, using hand commands. Given the performance of our dataglove and detection algorithm, the system can be a promising low-cost solution for a wide range of robot control applications.

Virtual reality

Nowadays, virtual reality (VR) devices are within our reach, and with the announcement of the Metaverse, new avenues of VR technology have opened up. In this regard, communicating with the cyber world is still fundamentally done using wearable dataglove-based hand gestures. Our proposed dataglove can be used in conjunction with a VR headset as well.

In this paper, we developed a dataglove to detect static and dynamic hand gestures and presented a novel deep learning-based approach to recognize them. To validate the system, we constructed a dataset of 40 ASL signs, including 24 static signs and 16 dynamic ones, from 25 subjects. For static gestures, after data filtering, we compensated for gravity in the acceleration and converted it from the body axes to the earth axes. In the case of dynamic gestures, we generated Spatial Projection Images from the 1D time-series acceleration data. We also introduced a parallel-path neural network architecture to extract features from different sensor channels more efficiently. Our method produced better results than classical ML and CNN-based methods for both static and dynamic gestures. The achieved results are highly promising for various applications.

In future work, we will apply our method to several applications and create a larger dataset for further exploration. Moreover, by employing a multimodal technique, we can include video alongside the sensor data to accumulate additional features.

Data availability

The datasets analyzed during the current study are available in Figshare 73 ( https://figshare.com/articles/dataset/ASL-Sensor-Dataglove-Dataset_zip/20031017 ).

Comrie, B. Languages of the world. In The Handbook of Linguistics (eds Aronoff, M. & Rees-Miller, J.) 21–38 (Wiley, 2017).


Zeshan, U. & Palfreyman, N. Typology of sign languages. Camb. Handb. Linguist. Typology 1–33 (2017).

Abir, F. F., Faisal, M. A. A., Shahid, O. & Ahmed, M. U. Contactless human activity analysis: An overview of different modalities. In Contactless Human Activity Analysis (eds Ahad, M. A. R. et al. ) 83–112 (Springer International Publishing, 2021).

Lien, J. et al. Soli: Ubiquitous gesture sensing with millimeter wave radar. ACM Trans. Graph. TOG 35 , 1–19 (2016).


Wang, S., Song, J., Lien, J., Poupyrev, I. & Hilliges, O. Interacting with soli: Exploring fine-grained dynamic gesture recognition in the radio-frequency spectrum 851–860 (2016).

Pu, Q., Gupta, S., Gollakota, S. & Patel, S. Whole-home gesture recognition using wireless signals 27–38 (2013).

He, W., Wu, K., Zou, Y. & Ming, Z. WiG: WiFi-based gesture recognition system 1–7 (IEEE, 2015).

Ma, Y., Zhou, G., Wang, S., Zhao, H. & Jung, W. Signfi: Sign language recognition using wifi. Proc. ACM Interact. Mob. Wearable Ubiquitous Technol. 2 , 1–21 (2018).


Wang, W., Liu, A. X. & Sun, K. Device-free gesture tracking using acoustic signals 82–94 (2016).

Nandakumar, R., Iyer, V., Tan, D. & Gollakota, S. Fingerio: Using active sonar for fine-grained finger tracking 1515–1525 (2016).

Gupta, S., Morris, D., Patel, S. & Tan, D. Soundwave: Using the doppler effect to sense gestures 1911–1914 (2012).

Pansare, J. R., Gawande, S. H. & Ingle, M. Real-time static hand gesture recognition for American Sign Language (ASL) in complex background (2012).

Choudhury, A., Talukdar, A. K. & Sarma, K. K. A novel hand segmentation method for multiple-hand gesture recognition system under complex background 136–140 (IEEE, 2014).

Stergiopoulou, E., Sgouropoulos, K., Nikolaou, N., Papamarkos, N. & Mitianoudis, N. Real time hand detection in a complex background. Eng. Appl. Artif. Intell. 35 , 54–70 (2014).

Chen, Q., Georganas, N. D. & Petriu, E. M. Real-time vision-based hand gesture recognition using haar-like features 1–6 (IEEE, 2007).

Kulkarni, V. S. & Lokhande, S. Appearance based recognition of american sign language using gesture segmentation. Int. J. Comput. Sci. Eng. 2 , 560–565 (2010).


Zhou, Y., Jiang, G. & Lin, Y. A novel finger and hand pose estimation technique for real-time hand gesture recognition. Pattern Recognit. 49 , 102–114 (2016).


Wadhawan, A. & Kumar, P. Deep learning-based sign language recognition system for static signs. Neural Comput. Appl. 32 , 7957–7968 (2020).

Pun, C.-M., Zhu, H.-M. & Feng, W. Real-time hand gesture recognition using motion tracking. Int. J. Comput. Intell. Syst. 4 , 277–286 (2011).

Molina, J., Pajuelo, J. A. & Martínez, J. M. Real-time motion-based hand gestures recognition from time-of-flight video. J. Signal Process. Syst. 86 , 17–25 (2017).

Devineau, G., Moutarde, F., Xi, W. & Yang, J. Deep learning for hand gesture recognition on skeletal data 106–113 (IEEE, 2018).

Chen, Y., Luo, B., Chen, Y.-L., Liang, G. & Wu, X. A real-time dynamic hand gesture recognition system using kinect sensor 2026–2030 (IEEE, 2015).

Ren, Z., Meng, J. & Yuan, J. Depth camera based hand gesture recognition and its applications in human-computer-interaction 1–5 (IEEE, 2011).

Ma, X. & Peng, J. Kinect sensor-based long-distance hand gesture recognition and fingertip detection with depth information. J. Sens. https://doi.org/10.1155/2018/5809769 (2018).

Song, L., Hu, R. M., Zhang, H., Xiao, Y. L. & Gong, L. Y. Real-Time 3d Hand Gesture Detection from Depth Images Vol. 756, 4138–4142 (Trans Tech Publ, 2013).

Aly, W., Aly, S. & Almotairi, S. User-independent American sign language alphabet recognition based on depth image and PCANet features. IEEE Access 7 , 123138–123150 (2019).

Potter, L. E., Araullo, J. & Carter, L. The leap motion controller: A view on sign language 175–178 (2013).

Mittal, A., Kumar, P., Roy, P. P., Balasubramanian, R. & Chaudhuri, B. B. A modified LSTM model for continuous sign language recognition using leap motion. IEEE Sens. J. 19 , 7056–7063 (2019).

Zhang, Z. Microsoft kinect sensor and its effect. IEEE Multimed. 19 , 4–10 (2012).

Xiao, Q., Zhao, Y. & Huan, W. Multi-sensor data fusion for sign language recognition based on dynamic Bayesian network and convolutional neural network. Multimed. Tools Appl. 78 , 15335–15352 (2019).

Kumar, P., Saini, R., Roy, P. P. & Dogra, D. P. A position and rotation invariant framework for sign language recognition (SLR) using Kinect. Multimed. Tools Appl. 77 , 8823–8846 (2018).

Lugaresi, C. et al. Mediapipe: A framework for perceiving and processing reality (2019).

Burdea, G. C. & Coiffet, P. Virtual Reality Technology (John Wiley & Sons, 2003).


Ding, Z. et al. sEMG-based gesture recognition with convolution neural networks. Sustainability 10 , 1865 (2018).

Hu, Y. et al. A novel attention-based hybrid CNN-RNN architecture for sEMG-based gesture recognition. PLoS ONE 13 , e0206049 (2018).

Ovur, S. E. et al. A novel autonomous learning framework to enhance sEMG-based hand gesture recognition using depth information. Biomed. Signal Process. Control 66 , 102444 (2021).

Pomboza-Junez, G. & Terriza, J. H. Hand gesture recognition based on sEMG signals using Support Vector Machines 174–178 (IEEE, 2016).

Tsinganos, P., Cornelis, B., Cornelis, J., Jansen, B. & Skodras, A. Improved gesture recognition based on sEMG signals and TCN 1169–1173 (IEEE, 2019).

Savur, C. & Sahin, F. American sign language recognition system by using surface EMG signal 002872–002877 (IEEE, 2016).

Wu, J., Sun, L. & Jafari, R. A wearable system for recognizing American sign language in real-time using IMU and surface EMG sensors. IEEE J. Biomed. Health Inform. 20 , 1281–1290 (2016).

Wang, Z. et al. Hear sign language: A real-time end-to-end sign language recognition system. IEEE Trans. Mob. Comput. https://doi.org/10.1109/TMC.2020.3038303 (2020).

Abreu, J. G., Teixeira, J. M., Figueiredo, L. S. & Teichrieb, V. Evaluating sign language recognition using the Myo Armband. In 2016 XVIII Symposium on Virtual and Augmented Reality (SVR) 64–70. https://doi.org/10.1109/SVR.2016.21 (2016).

Su, R., Chen, X., Cao, S. & Zhang, X. Random forest-based recognition of isolated sign language subwords using data from accelerometers and surface electromyographic sensors. Sensors 16 , 100 (2016).

Faisal, M. A. A., Abir, F. F. & Ahmed, M. U. Sensor dataglove for real-time static and dynamic hand gesture recognition. In 2021 Joint 10th International Conference on Informatics, Electronics Vision (ICIEV) and 2021 5th International Conference on Imaging, Vision Pattern Recognition (icIVPR) 1–7. https://doi.org/10.1109/ICIEVicIVPR52578.2021.9564226 (2021).

Lee, B. G. & Lee, S. M. Smart wearable hand device for sign language interpretation system with sensors fusion. IEEE Sens. J. 18 , 1224–1232 (2018).

Jani, A. B., Kotak, N. A. & Roy, A. K. Sensor based hand gesture recognition system for English alphabets used in sign language of deaf-mute people. In 2018 IEEE SENSORS 1–4. https://doi.org/10.1109/ICSENS.2018.8589574 (2018).

Chong, T.-W. & Kim, B.-J. American sign language recognition system using wearable sensors with deep learning approach. J. Korea Inst. Electron. Commun. Sci. 15 , 291–298 (2020).

Gałka, J., Mąsior, M., Zaborski, M. & Barczewska, K. Inertial motion sensing glove for sign language gesture acquisition and recognition. IEEE Sens. J. 16 , 6310–6316 (2016).

Wen, F., Zhang, Z., He, T. & Lee, C. AI enabled sign language recognition and VR space bidirectional communication using triboelectric smart glove. Nat. Commun. 12 , 1–13 (2021).

Yu, Y., Chen, X., Cao, S., Zhang, X. & Chen, X. Exploration of Chinese sign language recognition using wearable sensors based on deep belief net. IEEE J. Biomed. Health Inform. 24 , 1310–1320 (2020).

SparkFun. Flex Sensor 2.2—SEN-10264—SparkFun Electronics. SparkFun https://www.sparkfun.com/products/10264 .

TDK. MPU-6050—TDK, InvenSense Corporation. https://invensense.tdk.com/products/motion-tracking/6-axis/mpu-6050/ .

Espressif. ESP32 Wi-Fi & bluetooth MCU—Espressif systems. ESPRESSIF-ESP32 https://www.espressif.com/en/products/socs/esp32 .

Lapiak, J. American sign language dictionary—HandSpeak. https://www.handspeak.com/ .

Canuto, E., Novara, C., Massotti, L., Carlucci, D. & Montenegro, C. P. Chapter 2—Attitude representation. In Spacecraft Dynamics and Control (eds Canuto, E. et al. ) 17–83 (Butterworth-Heinemann, 2018).

Kim, A. & Golnaraghi, M. A quaternion-based orientation estimation algorithm using an inertial measurement unit 268–272 (IEEE, 2004).

Cai, G., Chen, B. M. & Lee, T. H. Coordinate systems and transformations. In Unmanned Rotorcraft Systems 23–34 (Springer, 2011).


Ahmed, M., Antar, A. D., Hossain, T., Inoue, S. & Ahad, M. A. R. Poiden: Position and orientation independent deep ensemble network for the classification of locomotion and transportation modes 674–679 (2019).

Smith, S. W. Chapter 15—Moving average filters. In Digital Signal Processing (ed. Smith, S. W.) 277–284 (Newnes, USA, 2003).

Bhanja, S. & Das, A. Impact of data normalization on deep neural network for time series forecasting. https://arxiv.org/abs/1812.05519 (2019).

Patro, S. G. K. & Sahu, K. K. Normalization: A preprocessing stage. https://arxiv.org/abs/1503.06462 Cs (2015).

Hunter, J. D. Matplotlib: A 2D graphics environment. Comput. Sci. Eng. 9 , 90–95 (2007).

Sandler, M., Howard, A., Zhu, M., Zhmoginov, A. & Chen, L.-C. Mobilenetv2: Inverted residuals and linear bottlenecks 4510–4520 (2018).

Howard, A. G. et al. Mobilenets: Efficient convolutional neural networks for mobile vision applications. ArXiv Prepr. https://arxiv.org/abs/1704.04861 (2017).

Sifre, L. & Mallat, S. Rigid-motion scattering for texture classification. https://arxiv.org/abs/1403.1687 (2014).

He, K., Zhang, X., Ren, S. & Sun, J. Deep residual learning for image recognition 770–778 (2016).

Abadi, M. et al. Tensorflow: Large-scale machine learning on heterogeneous distributed systems. ArXiv Prepr. https://arxiv.org/abs/1603.04467 (2016).

Kingma, D. P. & Ba, J. Adam: A method for stochastic optimization. ArXiv Prepr. https://arxiv.org/abs/1412.6980 (2014).

Chong, T. W. & Lee, B. G. American sign language recognition using leap motion controller with machine learning approach. Sensors 18 , 3554 (2018).

Saquib, N. & Rahman, A. Application of machine learning techniques for real-time sign language detection using wearable sensors. In Proceedings of the 11th ACM Multimedia Systems Conference 178–189. (Association for Computing Machinery, 2020). https://doi.org/10.1145/3339825.3391869 .

Zhang, Y. et al. Static and dynamic human arm/hand gesture capturing and recognition via multiinformation fusion of flexible strain sensors. IEEE Sens. J. 20 , 6450–6459 (2020).

Abhishek, K. S., Qubeley, L. C. F. & Ho, D. Glove-based hand gesture recognition sign language translator using capacitive touch sensor. In 2016 IEEE International Conference on Electron Devices and Solid-State Circuits (EDSSC) 334–337. https://doi.org/10.1109/EDSSC.2016.7785276 (2016).

ASL-Sensor-Dataglove-Dataset.zip. 10.6084/m9.figshare.20031017.v1 (2022).


Acknowledgements

This work was supported by the Centennial Research Grant, University of Dhaka, Bangladesh, and the APC was sponsored by the University of East London, UK.

Author information

These authors contributed equally: Md. Ahasan Atick Faisal and Farhan Fuad Abir.

Authors and Affiliations

Department of Electrical and Electronic Engineering, University of Dhaka, Dhaka, 1000, Bangladesh

Md. Ahasan Atick Faisal, Farhan Fuad Abir & Mosabber Uddin Ahmed

Department of Computer Science and Digital Technologies, University of East London, London, UK

Md Atiqur Rahman Ahad


Contributions

M.A.A.F. and F.F.A. performed the experiments, wrote the main manuscript text, and prepared the figures; both contributed equally. All authors formulated the methods and design and reviewed the manuscript. M.U.A. and M.A.R.A. are the corresponding authors.

Corresponding authors

Correspondence to Mosabber Uddin Ahmed or Md Atiqur Rahman Ahad .

Ethics declarations

Competing interests.

The authors declare no competing interests.

Additional information

Publisher's note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .


About this article

Cite this article.

Faisal, M.A.A., Abir, F.F., Ahmed, M.U. et al. Exploiting domain transformation and deep learning for hand gesture recognition using a low-cost dataglove. Sci Rep 12 , 21446 (2022). https://doi.org/10.1038/s41598-022-25108-2


Received : 04 June 2022

Accepted : 24 November 2022

Published : 12 December 2022

DOI : https://doi.org/10.1038/s41598-022-25108-2





Real-time continuous gesture recognition for natural multimodal interaction





Wearable pressure sensing for intelligent gesture recognition

Liu, Yuchi (2023) Wearable pressure sensing for intelligent gesture recognition. PhD thesis, University of Glasgow.

The development of wearable sensors has become a major area of interest due to their wide range of promising applications, including health monitoring, human motion detection, human-machine interfaces, electronic skin and soft robotics. In particular, pressure sensors have attracted considerable attention in wearable applications. However, traditional pressure sensing systems use rigid sensors to detect human motions; lightweight and flexible pressure sensors are required to improve the comfort of such devices. Furthermore, in comparison with conventional sensing techniques without smart algorithms, machine learning-assisted wearable systems are capable of intelligently analysing data for classification or prediction purposes, making the system ‘smarter’ for more demanding tasks. Therefore, combining flexible pressure sensors and machine learning is a promising approach to human motion recognition.

This thesis focuses on fabricating flexible pressure sensors and developing wearable applications to recognize human gestures. Firstly, a comprehensive literature review was conducted, including current state-of-the-art on pressure sensing techniques and machine learning algorithms. Secondly, a piezoelectric smart wristband was developed to distinguish finger typing movements. Three machine learning algorithms, K Nearest Neighbour (KNN), Decision Tree (DT) and Support Vector Machine (SVM), were used to classify the movement of different fingers. The SVM algorithm outperformed other classifiers with an overall accuracy of 98.67% and 100% when processing raw data and extracted features.

Thirdly, a piezoresistive wristband was fabricated based on a flake-sphere composite configuration in which reduced graphene oxide fragments are doped with polystyrene spheres to achieve both high sensitivity and flexibility. The flexible wristband measured the pressure distribution around the wrist for accurate and comfortable hand gesture classification. The intelligent wristband was able to classify 12 hand gestures with 96.33% accuracy for five participants using a machine learning algorithm. Moreover, to demonstrate the practical applications of the proposed method, a real-time system was developed to control a robotic hand according to the classification results.

Finally, this thesis also demonstrates an intelligent piezoresistive sensor for recognizing different throat movements during pronunciation. The piezoresistive sensor was fabricated using two polydimethylsiloxane (PDMS) layers coated with silver nanowires and reduced graphene oxide films, with microstructures formed by polystyrene spheres between the layers. The highly sensitive sensor was able to distinguish the throat vibrations of five different spoken words with 96% accuracy using an artificial neural network.



Hand gesture recognition in uncontrolled environments.


Yao, Yi (2014) Hand gesture recognition in uncontrolled environments. PhD thesis, University of Warwick.


Human Computer Interaction has long relied on mechanical devices to feed information into computers with low efficiency. With recent developments in image processing and machine learning methods, the computer vision community is ready to develop the next generation of Human Computer Interaction methods, including Hand Gesture Recognition. This thesis proposes a comprehensive Hand Gesture Recognition based, semantic-level Human Computer Interaction framework for uncontrolled environments. The framework contains novel methods for Hand Posture Recognition, Hand Gesture Recognition and Hand Gesture Spotting.

The Hand Posture Recognition method in the proposed framework is capable of recognising predefined still hand postures from cluttered backgrounds. Texture features are used in conjunction with Adaptive Boosting to form a novel feature selection scheme, which can effectively detect and select discriminative texture features from the training samples of the posture classes.

A novel Hand Tracking method called Adaptive SURF Tracking is proposed in this thesis. Texture key points are used to track multiple hand candidates in the scene. This tracking method matches texture key points of hand candidates within adjacent frames to calculate the movement directions of hand candidates.
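
The frame-to-frame key-point matching idea can be illustrated as follows. This is only a sketch of the general technique, not the thesis implementation: ORB is used in place of SURF (SURF needs an opencv-contrib build), and prev_gray/curr_gray are assumed to be grayscale crops around a hand candidate in two adjacent frames.

```python
# Sketch: match key points between adjacent frames and average their
# displacement to estimate the hand candidate's movement direction.
import cv2
import numpy as np

def movement_direction(prev_gray, curr_gray):
    orb = cv2.ORB_create(nfeatures=500)
    kp1, des1 = orb.detectAndCompute(prev_gray, None)
    kp2, des2 = orb.detectAndCompute(curr_gray, None)
    if des1 is None or des2 is None:
        return None
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = matcher.match(des1, des2)
    if not matches:
        return None
    # Displacement vectors of matched key points between the two frames.
    shifts = np.array([np.array(kp2[m.trainIdx].pt) - np.array(kp1[m.queryIdx].pt)
                       for m in matches])
    mean_shift = shifts.mean(axis=0)
    angle = np.degrees(np.arctan2(mean_shift[1], mean_shift[0]))
    return mean_shift, angle
```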

With the gesture trajectories provided by the Adaptive SURF Tracking method, a novel classifier called Partition Matrix is introduced to perform gesture classification for uncontrolled environments with multiple hand candidates. The trajectories of all hand candidates extracted from the original video under different frame rates are used to analyse the movements of hand candidates. An alternative gesture classifier based on a Convolutional Neural Network is also proposed. The input images of the neural network are approximate trajectory images reconstructed from the tracking results of the Adaptive SURF Tracking method.
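
A minimal sketch of a CNN over rasterised trajectory images is shown below; the architecture, input size and class count are assumptions for illustration, not the network described in the thesis.

```python
# Sketch: a small CNN that classifies 64x64 single-channel trajectory images
# in which the tracked hand path has been rasterised.
import torch
import torch.nn as nn

class TrajectoryCNN(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(32 * 16 * 16, num_classes)

    def forward(self, x):                      # x: (batch, 1, 64, 64)
        x = self.features(x)
        return self.classifier(x.flatten(1))

model = TrajectoryCNN()
logits = model(torch.randn(4, 1, 64, 64))      # dummy batch of trajectory images
print(logits.shape)                            # torch.Size([4, 10])
```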

For Hand Gesture Spotting, a forward spotting scheme is introduced to detect the starting and ending points of the predefined gestures in the continuously signed gesture videos. A Non-Sign Model is also proposed to simulate meaningless hand movements between the meaningful gestures.

The proposed framework can perform well with unconstrained scene settings, including frontal occlusions, background distractions and changing lighting conditions. Moreover, it is invariant to changing scales, speed and locations of the gesture trajectories.


Welcome to Ying Yin's website

I obtained my PhD in Computer Science from MIT in 2014. I did my research in the Multimodal Understanding Group at the Computer Science and Artificial Intelligence Laboratory (CSAIL) under the guidance of Professor Randall Davis. My research focused on building the next generation of intelligent multimodal user interfaces using advanced techniques in machine learning and machine vision.

I obtained my Bachelor's degree in Computer Engineering (Software Engineering Option) from the University of British Columbia in Canada, with a minor in Commerce. I attended high school at Raffles Junior College in Singapore.

Research interests

  • Machine Learning, Computer Vision
  • Multimodal Interaction / Large Display Interaction / Human Computer Interaction
  • Parallel and Distributed Computer Systems

Publications

  • Yin, Y.. Real-Time Continuous Gesture Recognition for Natural Multimodal Interaction . PhD thesis for Massachusetts Institute of Technology. Cambridge, MA. May, 2014. [ BibTex ] [ PDF ]
  • Yin, Y., and Davis, R.. Real-Time Continuous Gesture Recognition for Natural Human-Computer Interaction . 2014 IEEE Symposium on Visual Languages and Human-Centric Computing (VL/HCC) (to appear) . Melbourne, Australia. Jul, 2014. [ BibTex ] [ PDF ]
  • Yin, Y., and Davis, R.. Gesture Spotting and Recognition Using Salience Detection and Concatenated Hidden Markov Models . Proceedings of the 15th ACM International Conference on Multimodal Interaction (ICMI) . ChaLearn Workshop on Multimodal Gesture Recognition. Sydney, Australia. Dec, 2013. [ BibTex ] [ PDF ] [ Scoreboard ]
  • Yin, Y., Ouyang, T., Partridge, K., and Zhai, S.. Making Touchscreen Keyboards Adaptive to Keys, Hand Postures, and Individuals – A Hierarchical Spatial Backoff Model Approach. Proceedings of the ACM SIGCHI Conference on Human Factors in Computing Systems . Paris, France. Apr, 2013. [ BibTex ] [ PDF ] [ Video ]
  • Yin, Y.. A Hierarchical Approach to Continuous Gesture Analysis for Natural Multi-modal Interaction. Proceedings of the 14th ACM International Conference on Multimodal Interaction . pp 357--360. Santa Monica, CA. Oct, 2012. [ BibTex ] [ PDF ] [ Poster ]
  • Yin, Y.. Toward an Intelligent Multimodal Interface for Natural Interaction. Master's Thesis for Massachusetts Institute of Technology. Cambridge, MA. May, 2010. [ BibTeX ] [ PDF ]
  • Yin, Y. and Davis, R.. Toward an Intelligent Multimodal Interface for Natural Interaction. ACM Conference on Interactive Tabletops and Surfaces (ITS 2009) . Banff, Canada. Nov, 2009. [ BibTex ] [ PDF ] [ Poster ]
  • Yin, Y., Ouyang, T., Partridge, K., and Zhai, S.. Posture-Adaptive Selection. Google, submitted. Application No. 61/702,678.
  • The Third EITA Young Investigator Conference . Cambridge, MA, USA. Aug, 2013. [ Proceedings ]
  • Reviewer for 2012 International Conference on Intelligent User Interfaces (IUI).
  • Reviewer for 2013 ChaLearn Workshop on Multimodal Gesture Recognition, ICMI 2013.

Courses Taken

  • 6.854 Advanced Algorithms (Fall 2009 by Prof. David Karger ) [ Term paper ] [ Code ]
  • 6.824 Distributed Computer Systems Engineering (Spring 2009 by Prof. Frans Kaashoek ) [ Term project ]

  • 6.867 Machine Learning (Fall 2008 by Prof. Tommi Jaakkola and Prof. Michael Collins ) [ Term project ]
  • 6.866 Machine Vision (Fall 2008 by Prof. Berthold Horn ) [ Term paper ]
  • 15.279 Management Communication
  • 15.390 New Enterprises
  • EECE 478 Computer Graphics (Spring 2008)
  • EECE 494 Real-Time Digital System Design (Spring 2008 by Prof. Sathish Gopalakrishnan )
  • EECE 411 Design of Distributed Software Applications (Fall 2007 by Prof. Matei Ripeanu )
  • EECE 456 Computer Communications (Fall 2007 by Prof. Vincent Wong )
  • EECE 476 Computer Architecture (Fall 2007 by Prof. Tor Aamodt )
  • CPSC 304 Introduction to Relational Database (Spring 2007 by Prof. Rachel Pottinger )
  • EECE 315 Operating and File Systems (Fall 2006 by Dr. L.R. Linares)
  • EECE 321 Compilers (Fall 2006)
  • CPSC 260 Object-Oriented Program Design (Fall 2005 by Prof. Kellogg Booth )

A Review of the Hand Gesture Recognition System: Current Progress and Future Directions


Personalized Hand Pose and Gesture Recognition System for the Elderly


  • Mahsa Teimourikia 16 ,
  • Hassan Saidinejad 16 ,
  • Sara Comai 16 &
  • Fabio Salice 16  

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 8515)

Included in the following conference series:

  • International Conference on Universal Access in Human-Computer Interaction

3506 Accesses

1 Citation

The elderly population is growing all over the globe. Novel human-computer interaction systems and techniques are required to bridge the gap between the reduced physical and cognitive capabilities of the elderly and the smooth use of the technological artefacts densely populating our environments. Gesture-based interfaces are potentially more natural, intuitive, and direct. In this paper, we propose a personalized hand pose and gesture recognition system (called HANDY) supporting personalized gestures, and we report the results of two experiments with both younger and older participants. Our results show that, with sufficient training, our system achieves similar accuracies for both younger and older users. This means that our gesture recognition system can accommodate the limitations of an ageing hand even in the presence of hand issues such as arthritis or hand tremor.


Similar content being viewed by others

Gesture-Based Applications for Elderly People

Gestural Interfaces for Mobile and Ubiquitous Applications

Control with Hand Gestures by Older Users: A Review

  • gestural interaction
  • gesture recognition system


Author information

Authors and Affiliations

Department of Electronics, Information and Bioengineering, Politecnico di Milano, via Ponzio 34/5, 22100, Milan, Italy

Mahsa Teimourikia, Hassan Saidinejad, Sara Comai & Fabio Salice

Editor information

Editors and Affiliations

Foundation for Research and Technology - Hellas (FORTH), Institute of Computer Science, N. Plastira 100, Vassilika Vouton, 70013, Heraklion, Crete, Greece

Constantine Stephanidis & Margherita Antona


Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Teimourikia, M., Saidinejad, H., Comai, S., Salice, F. (2014). Personalized Hand Pose and Gesture Recognition System for the Elderly. In: Stephanidis, C., Antona, M. (eds) Universal Access in Human-Computer Interaction. Aging and Assistive Environments. UAHCI 2014. Lecture Notes in Computer Science, vol 8515. Springer, Cham. https://doi.org/10.1007/978-3-319-07446-7_19




Final Report: Hand Gesture Recognition Using Neural Networks. Thesis Supervisor: Terry Windeatt, Centre for Vision, Speech and Signal Processing


Using orientation histograms, a simple and fast algorithm will be developed to run on a workstation. It will recognize static hand gestures, namely a subset of American Sign Language (ASL). Previous systems have used datagloves or markers for input. The pattern recognition system will use a transform that converts an image into a feature vector, which will then be compared with the feature vectors of a training set of gestures. The final system will be implemented with a Perceptron network.
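
A minimal sketch of that pipeline, assuming grayscale hand images and scikit-learn's Perceptron (the bin count and normalisation are illustrative choices, not the report's settings), might look like this:

```python
# Sketch: convert an image to an orientation-histogram feature vector,
# then train a Perceptron on the resulting vectors.
import numpy as np
import cv2
from sklearn.linear_model import Perceptron

def orientation_histogram(gray, bins=36):
    gx = cv2.Sobel(gray, cv2.CV_32F, 1, 0)
    gy = cv2.Sobel(gray, cv2.CV_32F, 0, 1)
    mag = np.hypot(gx, gy)
    ang = np.arctan2(gy, gx)                          # gradient orientations, -pi..pi
    hist, _ = np.histogram(ang, bins=bins, range=(-np.pi, np.pi), weights=mag)
    return hist / (hist.sum() + 1e-8)                 # normalise for scale invariance

def train(images, labels):
    # images: list of grayscale hand images; labels: their ASL classes (assumed)
    X = np.array([orientation_histogram(img) for img in images])
    clf = Perceptron(max_iter=1000)
    clf.fit(X, labels)
    return clf
```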

Related Papers

Roumyashree Subudhi

Ernest Popardowski

This paper presents the development and implementation of an application that recognizes American Sign Language signs using deep learning algorithms based on convolutional neural network architectures. The project includes the development of a training set, the preparation of a module that converts photos into a form readable by the neural network, the selection of an appropriate network architecture and the development of the model. The network is trained, and its results are verified accordingly. A web application that recognizes a sign from any photo taken by the user is implemented, and its results are analyzed. The network's accuracy reaches 99% on the training set. Nevertheless, conclusions and recommendations are formulated to improve the operation of the application.
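
The photo-conversion step described above could look roughly like the following sketch, which assumes a trained PyTorch model expecting 64x64 RGB inputs and a placeholder label list; it is not the authors' implementation.

```python
# Sketch: convert an uploaded photo into a tensor the network can read,
# then return the predicted sign label.
import torch
from PIL import Image
from torchvision import transforms

ASL_CLASSES = [chr(c) for c in range(ord("A"), ord("Z") + 1)]   # placeholder labels

to_tensor = transforms.Compose([
    transforms.Resize((64, 64)),
    transforms.ToTensor(),            # HWC uint8 -> CHW float in [0, 1]
])

def predict_sign(photo_path, model):
    model.eval()
    img = Image.open(photo_path).convert("RGB")
    x = to_tensor(img).unsqueeze(0)   # add batch dimension
    with torch.no_grad():
        probs = model(x).softmax(dim=1)[0]
    return ASL_CLASSES[int(probs.argmax())]
```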

International Journal of Machine Learning and Computing

Maysam Abbod

IRJET Journal

IAEME Publication

With the development of information technology in our society, one can expect computer systems to be embedded into our daily life to an ever larger extent. These environments lead to new types of human-computer interaction (HCI). The use of hand gestures provides an attractive alternative to cumbersome interface devices for HCI, whereas existing HCI techniques may become a bottleneck in the effective utilization of the available information flow. Gestures are expressive, meaningful body motions, and gesture recognition is the interpretation of human gestures, such as hand movements or facial expressions, using mathematical algorithms. Gesture recognition is also important for developing alternative human-computer interaction modalities. This research tested the proposed algorithm on 100 ASL sign images. The simulation shows that the true match rate increased from 77.7% to 84% while the false match rate decreased from 8.33% to 7.4%.

shrutika suri

Communication between two people can take place through various media, which may be linguistic or gestural. Gesture recognition means the identification and recognition of gestures originating from any type of body motion, most commonly from the face or hands. It is a process by which gestures made by users are used to convey information, and it underpins important aspects of human interaction, both interpersonally and in the context of human-computer interfaces. Several approaches are available for recognizing gestures, including MATLAB-based methods, artificial neural networks, etc. This paper is a comprehensive evaluation of how gestures can be recognized in a more natural way using neural networks. The process consists of three stages: image acquisition, feature extraction and recognition (a skeleton of this pipeline is sketched below). In the first stage, the image is captured using a webcam or digital camera at an appropriate frame rate. In the second stage, features are extracted from the input image. The features may be the angle made between finge...
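
A skeleton of such a three-stage pipeline (acquisition, feature extraction, recognition) is sketched below with OpenCV; the flattened-pixel features and the scikit-learn-style classifier interface are assumptions for illustration.

```python
# Sketch: three-stage gesture pipeline driven by a webcam.
import cv2
import numpy as np

def extract_features(frame):
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    gray = cv2.resize(gray, (64, 64))
    return gray.flatten().astype(np.float32) / 255.0     # simplest possible features

def run(classifier, camera_index=0):
    cap = cv2.VideoCapture(camera_index)                  # stage 1: acquisition
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        feats = extract_features(frame)                   # stage 2: feature extraction
        label = classifier.predict(feats[None, :])[0]     # stage 3: recognition
        cv2.putText(frame, str(label), (10, 30),
                    cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
        cv2.imshow("gesture", frame)
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break
    cap.release()
    cv2.destroyAllWindows()
```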

Himanshu Tiwari

Hand gesture recognition systems can be used in different areas, for example HCI, remote control, robot control, virtual reality and so forth. A hand gesture recognition system is, for the most part, the study of the detection and recognition of various hand gestures, such as American Sign Language hand gestures, Danish Sign Language hand gestures and so on, by a computer system. This work is centered on three fundamental issues in building a gesture recognition framework. Human-computer interaction requires using various modalities (for example body posture, speech, hand gestures, lip movement, facial expressions, and so on) and coordinating them together for a more immersive user experience. Hand gestures are a natural yet powerful communication modality which has not been fully explored for human-computer interaction. The most rece...

rama chawla

Understanding human motions can be posed as a pattern recognition problem. In order to convey visual messages to a receiver, a human expresses motion patterns. Loosely called gestures, these patterns are variable but distinct and have an associated meaning. Pattern recognition by a computer or machine can be implemented via various methods such as Hidden Markov Models, Linear Programming and Neural Networks. Each method has its own advantages and disadvantages, which will be studied separately later on. This paper reviews why ANNs in particular are better suited for analyzing human...

International Journal for Research in Applied Science & Engineering Technology (IJRASET)

IJRASET Publication

This work presents a computer-vision-based application for recognizing hand gestures. A live video feed is captured by a camera, and a still image is extracted from that feed with the aid of an interface. The system is trained at least once on each count hand gesture (one, two, three, four, and five) and is then given a test gesture to see if it can identify it. Several algorithms capable of distinguishing hand gestures were studied, and the highest accuracy was achieved by the convolutional neural network known as AlexNet. Traditionally, systems have used data gloves or markers as a means of input; here the system is unconstrained, and the user can make natural hand gestures in front of the camera. The implemented system serves as an extendable basis for future work toward a fully robust hand gesture recognition system, which is still the subject of intensive research and development.
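
One common way to reproduce this kind of setup is transfer learning with torchvision's AlexNet, as in the hedged sketch below; the frozen feature extractor and five-class output layer are assumptions for illustration, not the authors' exact configuration.

```python
# Sketch: adapt torchvision's AlexNet to the five count gestures by
# replacing the final fully connected layer (requires torchvision >= 0.13
# for the weights enum).
import torch
import torch.nn as nn
from torchvision import models

def build_count_gesture_model(num_classes=5):
    model = models.alexnet(weights=models.AlexNet_Weights.DEFAULT)
    for p in model.features.parameters():
        p.requires_grad = False                          # keep convolutional features
    model.classifier[6] = nn.Linear(4096, num_classes)   # new output layer
    return model

model = build_count_gesture_model()
logits = model(torch.randn(1, 3, 224, 224))              # dummy 224x224 RGB input
print(logits.shape)                                      # torch.Size([1, 5])
```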

Engineering Applications of Artificial Intelligence

Nizar Dahir




COMMENTS

  1. PDF Hand Gesture Recognition using Deep Learning Neural Networks

    Hand Gesture Recognition using Deep Learning 1 ... Norah Meshari Alnaim A thesis submitted for the degree of Doctor of Philosophy Department of Electronic & Computer Engineering School of Engineering and Design and Physical Sciences Brunel University London December 2019. ... of my PhD research. I am literally thankful to my parents who were ...

  2. PDF Hand Gesture Recognition using a Low-Cost Sensor with Digital Signal

    Hussein Walugembe PhD Thesis Page 4 of 145 Abstract Our research concerns a hand gesture recognition framework that makes use of a low-cost 'off-the-shelf' device. The device is a visual markerless sensor system called the Leap Motion controller (LM). However, before deploying the LM, we investigate its accuracy

  3. PDF Robust human computer interaction using dynamic hand gesture recognition

    1.2.1 Basic Hand Gesture Recognition Process. The process through which a computer vision-based system recognises hand gestures can be divided into four stages [13]. In the first stage, one or multiple cameras obtain image data, then according to the data model, check for hand gesture input data in the stream.

  4. Exploiting domain transformation and deep learning for hand gesture

    Hand gesture recognition is one of the most widely explored areas under the human-computer interaction domain. Although various modalities of hand gesture recognition have been explored in the ...

  5. Vision based hand gesture recognition for human computer interaction: a

    It further discusses the advances that are needed to further improvise the present hand gesture recognition systems for future perspective that can be widely used for efficient human computer interaction. ... Kanniche MB (2009) Gesture recognition from video sequences. PhD Thesis, University of Nice. Kanungo T, Mount DM, Netanyahu NS, Piatko CD ...

  6. Research Approach of Hand Gesture Recognition based on Improved YOLOV3

    Liu Chunhua. 2017. Research on gesture Recognition Technology based on Wearable Controller. Master's Thesis, Harbin University of Science and Technology. Google Scholar; Liang Zhijie. 2019. Research on the Key Technology of Deaf-Mute Sign Language recognition. PhD Thesis, Central China Normal University. Google Scholar; Tian Yi. 2006.

  7. PDF Piezoresistive sensor for hand gesture recognition

    The flexible wrist-worn device with a five-sensing array is used to measure pressure distribution around the wrist for accurate and comfortable hand gesture recognition. The intelligent wristband is able to classify 12 hand gestures with 96.33% accuracy for five participants using an ML algorithm.

  8. Real-time continuous gesture recognition for natural multimodal interaction

    The novel approaches in this thesis include: a probabilistic recognition framework based on a flattened hierarchical hidden Markov model (HHMM) that unifies the recognition of path and pose gestures; and a method of using information from the hidden states in the HMM to identify different gesture phases (the pre-stroke, the nucleus and the post ...

  9. A Multichannel Convolutional Neural Network for Hand Posture Recognition

    Karam, M.: PhD Thesis: A framework for research and design of gesture-based human-computer interactions. PhD thesis, University of Southampton (October 2006) ... Vision based hand gesture recognition for human computer interaction: a survey. Artificial Intelligence Review, 1-54 (November 2012) Google Scholar

  10. Brunel University Research Archive: Hand gesture recognition using deep

    All awarded PhD theses are also archived on BURA. Brunel University Research Archive; College of Engineering, Design and Physical Sciences; Dept of Electronic and Electrical Engineering; ... Gesture recognition concerns non-verbal motions used as a means of communication in HCI. A system may be utilised to identify human gestures to convey ...

  11. (PDF) PhD Thesis: A framework for research and design of gesture-based

    1.1 Gestures and Human-Computer Interactions. Gestures have long been considered a promising approach to enabling a natural and intuitive method for human-computer interactions for myriad ...

  12. Wearable pressure sensing for intelligent gesture recognition

    This thesis focuses on fabricating flexible pressure sensors and developing wearable applications to recognize human gestures. Firstly, a comprehensive literature review was conducted, including current state-of-the-art on pressure sensing techniques and machine learning algorithms. Secondly, a piezoelectric smart wristband was developed to ...

  13. PDF Thesis Overview: Dynamic Gesture Recognition and its Application to

    PhD Thesis in Computer Science. 1. Advisors: Laura Lanzarini, Alejandro Rosete ... First, a state of the art study about the gesture recognition was carried out. Intelligent techniques for image and

  14. Hand gesture recognition in uncontrolled environments

    A comprehensive Hand Gesture Recognition based semantic level Human Computer Interaction framework for uncontrolled environments is proposed in this thesis. The framework contains novel methods for Hand Posture Recognition, Hand Gesture Recognition and Hand Gesture Spotting. ... Thesis Type: PhD: Publication Status: Unpublished: Supervisor(s ...

  15. PhD Thesis. Vision-based gesture recognition in a robot learning by

    View PhD Thesis. Vision-based gesture recognition in a robot learning by imitation framework Research Papers on Academia.edu for free.

  16. Ying Yin's Home Page

    PhD thesis for Massachusetts Institute of Technology. Cambridge, MA. May, 2014. Yin, Y., and Davis, R.. Real-Time Continuous Gesture Recognition for Natural Human-Computer Interaction. 2014 IEEE Symposium on Visual Languages and Human-Centric Computing (VL/HCC) (to appear) . Melbourne, Australia. Jul, 2014. ...

  17. Hand Gesture Recognition Methods and Applications: A Literature Survey

    Tiny hand gesture recognition without localization via a deep convolutional network. IEEE Trans. Consum. Electron, no.63, pp.251-257. Google Scholar Digital Library; Gongfa Li, Heng Tang, Ying Sun, Jianyi Kong, Guozhang Jiang, Du Jiang, Bo Tao, Shuang Xu, and Honghai Liu. 2019. Hand gesture recognition based on convolution neural network.

  18. PDF Real-time Finger Spelling American Sign Language Recognition

    1.3. Hand Gesture Recognition. Hand gesture recognition is the ability of computing devices to understand human hand gestures with the help of sophisticated mathematical algorithms. Computer-vision-based hand gesture recognition is a popular human-computer interaction technique, as it is very close to natural human interaction.

  19. A Review of the Hand Gesture Recognition System: Current Progress and

    This paper reviewed the sign language research in the vision-based hand gesture recognition system from 2014 to 2020. Its objective is to identify the progress and what needs more attention. We have extracted a total of 98 articles from well-known online databases using selected keywords. The review shows that the vision-based hand gesture recognition research is an active field of research ...

  20. Personalized Hand Pose and Gesture Recognition System for ...

    This means that our gesture recognition system can accommodate the limitations of an ageing-hand even in presence of hand issues like arthritis or hand tremor. ... PhD thesis, University of Southampton (2006) Google Scholar Wachs, J.P., Kölsch, M., Stern, H., Edan, Y.: Vision-based hand-gesture applications. Communications of the ACM 54(2), 60 ...

  21. PDF Gesture Recognition in Tennis Biomechanics

    GESTURE RECOGNITION IN TENNIS BIOMECHANICS. A Thesis Submitted to the Temple University Graduate Board in Partial Fulfillment of the Requirements for the Degree Master of Science of Electrical Engineering, by Victor C. Espinoza Bernal. Diploma Date December 2018. Thesis Approvals: Iyad Obeid, PhD.

  22. Final Report -Hand Gesture Recognition using Neural Networks Hand

    Gestures are expressive, meaningful body motions. Interpretation of human gestures such as hand movements or facial expressions, using mathematical algorithms is done using gesture recognition. Gesture recognition is also important for developing alternative human-computer interaction modalities.

  23. PDF Bachelor Thesis

    8. Conclusion. Within the framework of this bachelor's thesis, I created an application that is capable of real-time static hand gesture recognition. Along the way, I described the whole process of its creation using the background segmentation, contour finding and Hu moments recognition techniques.
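
The contour and Hu-moment features mentioned in this last entry can be sketched as follows, assuming a binary hand mask produced by background segmentation; the largest-contour heuristic and log-scaling are illustrative choices rather than the thesis code.

```python
# Sketch: extract a 7-dimensional Hu-moment descriptor from a binary hand mask,
# suitable as input to any simple classifier.
import cv2
import numpy as np

def hu_features(mask):
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None
    hand = max(contours, key=cv2.contourArea)            # assume largest contour is the hand
    hu = cv2.HuMoments(cv2.moments(hand)).flatten()
    # Log-scale the moments, which otherwise span many orders of magnitude.
    return -np.sign(hu) * np.log10(np.abs(hu) + 1e-30)
```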