WP1: Uncertainty Quantification for ML models for PPG signals

The aim of this work package is to develop methods for quantifying the uncertainty in supervised machine learning and deep learning models, so that at least 3 classification and 3 regression tasks involving photoplethysmography (PPG) data can be carried out while accounting for the effects of both aleatoric (data) and epistemic (model) uncertainty on model predictions. Conventionally, the total predictive uncertainty is divided into these two components. Aleatoric uncertainty represents the noise inherent in the observed data, whereas epistemic uncertainty is the uncertainty in the model itself, i.e., uncertainty that can usually be reduced by collecting more data. A common framework will be developed to evaluate the performance of trained models and to quantify and validate the uncertainties obtained for those models. This will allow the accuracy and uncertainty of different models to be compared, and models with high accuracy and low uncertainty to be identified.
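As an illustration only, the split of total predictive uncertainty into aleatoric and epistemic parts can be sketched for a regression task using the law of total variance. The sketch assumes an ensemble of models in which each member predicts a mean and a noise variance for the same input. This is one common approach, not necessarily the method adopted in the project, and the function name `decompose_uncertainty` and the blood pressure values are hypothetical.

```python
from statistics import fmean, pvariance


def decompose_uncertainty(member_means, member_variances):
    """Split the predictive variance for one input into two parts.

    member_means: mean predicted by each ensemble member (assumed setup)
    member_variances: noise variance predicted by each ensemble member
    """
    if not member_means or len(member_means) != len(member_variances):
        raise ValueError("need one mean and one variance per ensemble member")

    # Aleatoric part: average of the noise variances predicted by the members.
    aleatoric = fmean(member_variances)
    # Epistemic part: spread of the member means, i.e. model disagreement.
    epistemic = pvariance(member_means)

    return {
        "prediction": fmean(member_means),
        "aleatoric": aleatoric,
        "epistemic": epistemic,
        "total": aleatoric + epistemic,
    }


# Hypothetical systolic blood pressure predictions (mmHg) and predicted
# noise variances (mmHg^2) from a 4-member ensemble for one PPG segment.
result = decompose_uncertainty(
    [118.0, 122.0, 120.0, 120.0],
    [9.0, 11.0, 10.0, 10.0],
)
print(result)
```

In this sketch, adding training data would be expected to shrink the epistemic term as the ensemble members come to agree, while the aleatoric term reflects noise in the signal that more data does not remove.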


WP2: Creation of benchmark datasets involving PPG signals and community uptake

The aim of this work package is to generate 5 benchmark problems containing datasets of clinical interest on which the performance and uncertainty metrics from A1.1.6 and A1.2.6 can be evaluated, and to promote the uptake and implementation of the uncertainty quantification methods developed during the project among the scientific, medical device, digital and healthcare communities. The chosen problems will include both classification and regression problems and will be selected according to their clinical interest and the availability of public open-source data. Appropriate metadata will be included with the datasets. The datasets will be varied: they may contain real, synthetic, or phantom data, data of varying quality, signals from different sites on the body, and data from different demographic groups. Examples of benchmark problems for which publicly available datasets exist include, for regression, determining (i) systolic and diastolic blood pressure, (ii) blood glucose, and (iii) vascular age, and, for classification, detecting (iv) atrial fibrillation and other heart rhythm conditions, (v) diabetes, and (vi) aneurysms, arterial stiffening or stenoses. Other problems may also be considered. In addition, a good practice guide and an accompanying code repository for the independent review of machine learning models will be developed.