Synthetic Sensor Data for Human Activity Recognition

Alharbi, F; Ouarbya, L and Ward, J A. 2020. 'Synthetic Sensor Data for Human Activity Recognition'. In: The International Joint Conference on Neural Networks (IJCNN). Glasgow, United Kingdom. [Conference or Workshop Item] (Forthcoming)

Available under License Creative Commons Attribution Non-commercial.


Abstract or Description

Human activity recognition (HAR) based on wearable sensors has emerged as an active topic of research in machine learning and human behavior analysis because of its applications in several fields, including health, security and surveillance, and remote monitoring. Machine learning algorithms are frequently applied in HAR systems to learn from labeled sensor data. The effectiveness of these algorithms generally relies on access to large amounts of accurately labeled training data. However, labeled data for HAR is hard to come by and is often heavily imbalanced in favor of one or a few dominant classes, which in turn leads to poor recognition performance.
In this study we introduce a generative adversarial network (GAN)-based approach for HAR that we use to automatically synthesize balanced and realistic sensor data. GANs are robust generative networks, typically used to create synthetic images that cannot be distinguished from real images. Here we explore and construct a model for generating several types of human activity sensor data using a Wasserstein GAN (WGAN). We assess the synthetic data using two commonly used classifier models, a Convolutional Neural Network (CNN) and a Long Short-Term Memory (LSTM) network. We evaluate the quality and diversity of the synthetic data by training on synthetic data and testing on real sensor data, and vice versa. We then use synthetic sensor data to oversample the imbalanced training set. We demonstrate the efficacy of the proposed method on two publicly available human activity datasets, the Sussex-Huawei Locomotion (SHL) dataset and the Smoking Activity Dataset (SAD). With a CNN activity classifier, WGAN-augmented training data improves the F1-score over the imbalanced case for both SHL (0.85 to 0.95) and SAD (0.70 to 0.77).
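
The oversampling step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the window shape (128 samples by 3 channels), and the `toy_generator` placeholder are assumptions made for illustration. In practice, `generate` would wrap sampling from a trained per-class generator such as a WGAN.

```python
import numpy as np


def balance_with_synthetic(X, y, generate):
    """Top up every minority class to the majority-class count.

    X: array of sensor windows, shape (n, window, channels).
    y: integer class labels, shape (n,).
    generate(label, n): hypothetical callable returning n synthetic
        windows for `label` (for example, samples from a trained generator).
    """
    labels, counts = np.unique(y, return_counts=True)
    target = counts.max()
    X_parts, y_parts = [X], [y]
    for label, count in zip(labels, counts):
        deficit = int(target - count)
        if deficit > 0:
            # Only the missing number of windows is synthesized per class.
            X_parts.append(generate(label, deficit))
            y_parts.append(np.full(deficit, label, dtype=y.dtype))
    return np.concatenate(X_parts), np.concatenate(y_parts)


def toy_generator(label, n, window=128, channels=3):
    # Placeholder only: Gaussian noise centred on the label value.
    # It stands in for a trained generator and produces no realistic data.
    rng = np.random.default_rng(int(label))
    return rng.normal(loc=float(label), size=(n, window, channels))
```

Calling `balance_with_synthetic(X, y, toy_generator)` on an imbalanced set returns the real windows followed by enough synthetic windows that every class has the same count. The balanced set would then be used to train the activity classifier, with evaluation kept on real data only.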

Item Type:

Conference or Workshop Item (Paper)

Departments, Centres and Research Units:



20 March 2020 (Accepted)

Event Location:

Glasgow, United Kingdom

Item ID:


Date Deposited:

01 May 2020 14:24

Last Modified:

29 Jan 2021 17:00

