Seminar: Superior Performance of Machine Learning Algorithms in Metabolic Equivalent (MET) Predictio
Supervisor: Dr. Daniel Fuller
Superior Performance of Machine Learning Algorithms in Metabolic Equivalent (MET) Prediction Using Cell Phone Accelerometer Data
Department of Computer Science
Friday, November 1, 2019, 1:30 p.m., Room EN 2022
Introduction: Physical activity has a significant impact on public health. Accelerometry, using the human body’s acceleration data, can assist researchers in measuring physical activity at the population level. In this study, we use machine learning algorithms and feature engineering techniques to classify physical activity from smartphone accelerometer data.
Methods: We recruited forty-eight participants to complete a series of activities while carrying a Samsung phone in their right pocket. They were asked to sit, lie down, walk, and run (running at paces corresponding to 3, 5, and 7 metabolic equivalents of task (METs)). Their physical activity intensity levels were measured by indirect calorimetry using a metabolic cart. Ethica Data, a smartphone app, collected the raw accelerometer data. We wrote, deployed, and used an R package, activityCounts, to calculate activity counts from the raw accelerometer data. We expanded the feature set by generating new features from the raw accelerometer data and the activity counts. R (version 3.6) was used for data processing, data cleaning, and modelling. Using the caret package, we compared the performance of several machine learning algorithms: Random Forest, Support Vector Machines, Naïve Bayes, Linear Discriminant Analysis, and k-Nearest Neighbours.
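As an illustration of the feature generation step, the following is a minimal sketch of window-based features computed from triaxial accelerometer samples. It is not the study's R code and it does not reproduce the activityCounts algorithm; the function name `window_features`, the choice of non-overlapping windows, and the four summary statistics are assumptions made for this example, written in Python using only the standard library.

```python
import math
from statistics import mean, pstdev


def window_features(samples, window_size):
    """Summarise raw (x, y, z) accelerometer samples per window.

    Hypothetical helper for illustration only: splits the samples into
    non-overlapping windows of `window_size` and computes simple statistics
    of the acceleration vector magnitude. Trailing samples that do not
    fill a whole window are dropped.
    """
    if window_size <= 0:
        raise ValueError("window_size must be positive")

    features = []
    for start in range(0, len(samples) - window_size + 1, window_size):
        window = samples[start:start + window_size]
        # Vector magnitude combines the three axes into one signal.
        magnitudes = [math.sqrt(x * x + y * y + z * z) for x, y, z in window]
        features.append({
            "mean_magnitude": mean(magnitudes),
            "sd_magnitude": pstdev(magnitudes),
            "min_magnitude": min(magnitudes),
            "max_magnitude": max(magnitudes),
        })
    return features


# Toy input (invented values, not study data): two windows of four samples.
toy_samples = [(3.0, 4.0, 0.0)] * 4 + [(0.0, 0.0, 1.0)] * 4
toy_features = window_features(toy_samples, window_size=4)
```

Each resulting dictionary would become one row of a feature table, paired with the intensity label for that window, before being passed to a classifier.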
Results: Using the raw accelerometer data and the features derived from them leads to high accuracy and validity. Random Forest models performed best among the machine learning algorithms compared, achieving an accuracy of 92.9% and an area under the receiver operating characteristic (ROC) curve of 0.99.
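For readers unfamiliar with the two reported metrics, the sketch below shows how accuracy and the area under the ROC curve can be computed from predictions. It is a hedged illustration, not the study's evaluation code: the function names are hypothetical, the AUC is shown for the two-class case only (a multi-class result such as the one above is typically obtained by averaging over classes, which is an assumption here), and the toy values are invented.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def binary_auc(labels, scores):
    """Area under the ROC curve for labels in {0, 1}.

    Uses the rank interpretation of AUC: the probability that a randomly
    chosen positive example receives a higher score than a randomly chosen
    negative one, with ties counted as one half.
    """
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy values (invented, not study data).
toy_accuracy = accuracy(["sit", "walk", "run", "run"],
                        ["sit", "walk", "walk", "run"])   # 0.75
toy_auc = binary_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])  # 0.75
```

An AUC of 0.99 therefore means that, in nearly every positive-negative pair, the model scores the correct class higher.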
Conclusion: Our results suggest that smartphones can measure physical activity accurately and reliably. Machine learning algorithms such as Random Forest, combined with feature generation techniques, can accurately classify physical activity intensity levels in laboratory settings.