Machine Learning Technologies for Advancing Digital Biomarkers for Alzheimer's Disease

Embedded AI and IoT Lab

The Chinese University of Hong Kong


Overview

Alzheimer’s Disease (AD) and related dementias are a growing global health challenge due to the aging population. A major barrier to the treatment of AD is that many patients are either never diagnosed or are diagnosed only at late stages of the disease. A recent major advance in early AD diagnosis and intervention is to leverage AI and sensor devices to capture physiological, behavioral, and lifestyle symptoms of AD (e.g., activities of daily living and social interactions) in natural home environments, referred to as digital biomarkers. In this project, we propose the first end-to-end system that integrates multi-modal sensors and federated learning algorithms for detecting multidimensional AD digital biomarkers in natural living environments. We develop a compact multi-modality hardware system that can operate for months in home environments to detect digital biomarkers of AD. On top of the hardware system, we design a multi-modal federated learning system that can accurately detect more than 20 digital biomarkers in a real-time and privacy-preserving manner. Our approach collectively addresses several major real-world challenges, such as limited data labels, data heterogeneity, and limited computing resources.

Patient Recruitment and Results: To date, our system has been deployed in a four-week clinical trial involving 91 elderly participants (43 female and 48 male, aged 61–93). The participants came from three groups: 31 with Alzheimer’s Disease, 30 with mild cognitive impairment (MCI), and 30 cognitively normal. The results indicate that our system can accurately detect a comprehensive set of digital biomarkers with up to 93.8% accuracy and identify AD with an average accuracy of 88.9%. Our system offers a new platform that allows AD clinicians to characterize and track, in a longitudinal manner, the complex correlation between multidimensional interpretable digital biomarkers, patients’ demographic factors, and AD diagnosis.

Ethics: All data collection in this study was approved by the Institutional Review Board of CUHK and the Clinical Research Ethics Committee of Joint CUHK and Hong Kong Hospital Authority (New Territories East Cluster).


People

  • Professors: Guoliang Xing (PI, Professor, Department of Information Engineering, CUHK), Timothy CY Kwok (co-PI, Professor, Department of Medicine & Therapeutics, CUHK), Doris Sau Fung Yu (co-PI, Professor, School of Nursing, HKU), Allen Ting Chun Lee (co-PI, Assistant Professor, Department of Psychiatry, CUHK), Rosanna Yuen-Yan Chan (Adjunct Associate Professor, Department of Information Engineering, CUHK), Bolei Zhou (Assistant Professor, Computer Science Department, UCLA), Zhenyu Yan (Research Assistant Professor, Department of Information Engineering, CUHK)
  • Students and Postdocs: Xiaomin Ouyang (Ph.D. 2023, CUHK, Team Leader), Xian Shuai (Ph.D. 2022, CUHK), Yang Li (Ph.D. student, CUHK), Li Pan (Research Assistant, CUHK), Xifan Zhang (Ph.D. student, CUHK), Heming Fu (Research Assistant, CUHK), Sitong Cheng (Research Assistant, CUHK), Xinyan Wang (Undergraduate Student Helper, CUHK), Jiang Xin (Visiting Ph.D. student from Central South University), Shihua Cao (Postdoc Researcher, CUHK), Hazel Mok (Postdoc Researcher, CUHK)

Research Thrusts

    1. System Design. We develop a compact multi-modality hardware system that can operate for months in home environments to detect digital biomarkers of AD. It incorporates three privacy-preserving sensors (a depth camera, a mmWave radar, and a microphone), an NVIDIA single-board edge computer, and a 4G cellular interface for communication with the server.

    The goal of the hardware design is to capture a wide range of digital biomarkers in a privacy-preserving manner while ensuring the durability and scalability of the system. To this end, we choose three privacy-preserving sensor modalities: a depth camera, a mmWave radar, and a microphone. In particular, the Time-of-Flight (ToF) depth camera cannot reveal sensitive personal information such as faces; the mmWave radar can only detect the motions of the subjects; and the ambient microphone runs real-time algorithms to extract acoustic features without recording any raw acoustic data. Collectively, the three sensor modalities capture a wide range of biomarkers, such as having meals, holding conversations, and watching TV. Moreover, each hardware node incorporates a cellular interface that communicates with the server located in our lab over 4G LTE through a Virtual Private Network (VPN). A major challenge of continuously training on all of the collected sensor data in online FL is the significant training delay; we therefore apply several data reduction strategies that reduce the model training delay in continuous multi-modal FL without significantly sacrificing data quality. To capture the main living area of a home, we place the node at a height of 1.5–1.8 m (typically on a shelf or cabinet near the sofa) and use a tripod to adjust the height and angle of the box. The installation process typically takes about ten minutes per home.
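
    To make the microphone's privacy guarantee concrete, below is a minimal sketch of an on-device acoustic pipeline of this kind: features are computed frame by frame and only the features leave the function, while the raw waveform is discarded. The sample rate, frame size, feature set, and the `mic_stream` source are illustrative assumptions, not our deployed implementation.

    ```python
    # Hypothetical on-device feature extraction: raw audio is never stored.
    import numpy as np

    SR = 16000    # sample rate in Hz (assumed)
    FRAME = 2048  # samples per analysis frame (assumed)

    def spectral_features(frame: np.ndarray) -> np.ndarray:
        """Summarize one audio frame as a small feature vector."""
        spec = np.abs(np.fft.rfft(frame * np.hanning(len(frame))))
        freqs = np.fft.rfftfreq(len(frame), d=1.0 / SR)
        energy = float(np.sum(spec ** 2))                               # loudness proxy
        centroid = float(np.sum(freqs * spec) / (np.sum(spec) + 1e-9))  # brightness
        zcr = float(np.mean(np.abs(np.diff(np.sign(frame))) > 0))      # noisiness
        return np.array([energy, centroid, zcr])

    def stream_features(mic_stream):
        """Yield features frame by frame; each raw frame is dropped afterwards."""
        for frame in mic_stream:             # frames from a sound-card API
            yield spectral_features(frame)   # only derived features leave the node
    ```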

    2. Contrastive Fusion Learning for Multi-Modal Activity Recognition with Small Data. Multi-modal sensing systems are essential for capturing complex and dynamic human activities such as conversations and family meals, which are important digital biomarkers for Alzheimer’s disease. However, fusing multiple sensor modalities in human activity recognition (HAR) applications presents several major challenges. First, labeled data is usually very limited, as it is difficult to label multi-modal data in real-world settings. Second, different types of sensors usually produce highly heterogeneous information about the same events/activities, making it challenging to extract useful information for efficient fusion. Third, the sensor data in HAR applications is often privacy-sensitive and changes over time, which requires on-device training on continuous multi-modal data.

    To address these challenges, we propose Cosmo, a new system for contrastive fusion learning with small data in multi-modal HAR applications. Cosmo features a novel two-stage training strategy that leverages both unlabeled data on the cloud and limited labeled data on the edge. In the first stage, Cosmo employs a novel fusion-based contrastive learning approach to train the feature encoders on unlabeled multi-modal data. As a result, Cosmo can extract consistent information that represents the common knowledge shared among different modalities. In the second stage, a new quality-guided attention mechanism allows the classifier to capture the strengths of different modalities from only limited labeled data, exploiting the complementary information of different modalities. By integrating a novel iterative fusion learning algorithm, Cosmo can effectively combine both consistent and complementary information across modalities for efficient fusion. Our evaluation on a cloud-edge testbed using three real-world multi-modal HAR datasets shows that Cosmo significantly improves over state-of-the-art baselines in both recognition accuracy and convergence delay. For example, Cosmo achieves about 90% recognition accuracy with only 400 labeled data samples.
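
    The first-stage idea can be illustrated with a short PyTorch sketch: time-aligned windows from two modalities act as positive pairs for an InfoNCE-style contrastive loss, so the encoders learn the information shared across modalities without any labels. The encoder shapes, input dimensions, and pairing rule below are assumptions for illustration, not Cosmo's exact design.

    ```python
    import torch
    import torch.nn.functional as F

    def info_nce(z_a: torch.Tensor, z_b: torch.Tensor, tau: float = 0.1):
        """z_a, z_b: (N, D) embeddings of the same N events from two modalities."""
        z_a, z_b = F.normalize(z_a, dim=1), F.normalize(z_b, dim=1)
        logits = z_a @ z_b.t() / tau            # (N, N) cross-modal similarities
        labels = torch.arange(z_a.size(0))      # i-th depth window <-> i-th radar window
        return F.cross_entropy(logits, labels)  # pull matched pairs together

    # Toy encoders mapping raw windows into a shared embedding space (assumed sizes).
    enc_depth = torch.nn.Sequential(torch.nn.Linear(128, 64), torch.nn.ReLU(),
                                    torch.nn.Linear(64, 32))
    enc_radar = torch.nn.Sequential(torch.nn.Linear(64, 64), torch.nn.ReLU(),
                                    torch.nn.Linear(64, 32))
    x_depth, x_radar = torch.randn(16, 128), torch.randn(16, 64)  # unlabeled batch
    loss = info_nce(enc_depth(x_depth), enc_radar(x_radar))
    loss.backward()  # trains both encoders with no labels at all
    ```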

    3. Federated Learning Systems for Privacy-Preserving Activity Recognition. Most previous activity recognition studies adopt a centralized learning approach, in which a model is trained centrally on all the data collected from users. However, collecting sensor data centrally raises significant privacy concerns for applications such as longitudinal chronic condition monitoring. Federated learning (FL) is a distributed machine learning approach in which nodes only upload model weights, avoiding exposure of users’ raw data during the learning process.
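
    As a concrete illustration of this privacy property, the sketch below runs one FedAvg-style round in which only weight vectors travel between the nodes and the server; the model and the local training step are placeholders, not the algorithms described in this thrust.

    ```python
    # Minimal FedAvg-style round: raw data stays on the node, weights are shared.
    import numpy as np

    def local_train(weights: np.ndarray, private_data: np.ndarray) -> np.ndarray:
        """Runs on a node; `private_data` never leaves this function."""
        # Placeholder update: a real node would take gradient steps here.
        return weights - 0.01 * np.zeros_like(weights)

    def federated_round(global_w: np.ndarray, node_datasets) -> np.ndarray:
        """Server-side aggregation, weighted by each node's data size."""
        updates = [(local_train(global_w.copy(), d), len(d)) for d in node_datasets]
        total = sum(n for _, n in updates)
        return sum(w * (n / total) for w, n in updates)  # averages weights only

    # Toy usage: three nodes with differently sized private datasets.
    w = np.zeros(10)
    nodes = [np.random.randn(100, 5), np.random.randn(40, 5), np.random.randn(10, 5)]
    w = federated_round(w, nodes)
    ```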

    Existing FL paradigms yield unsatisfactory performance for real-world activity recognition in Alzheimer’s patient monitoring. First, different subjects usually have highly heterogeneous data distributions. For example, AD patients and cognitively normal subjects exhibit very different behavior patterns, resulting in non-i.i.d. data distributions among nodes. We find that, despite this heterogeneity, the data distributions of different subjects’ activities may share significant spatial-temporal similarity. Motivated by this key observation, we propose ClusterFL, a similarity-aware federated learning system that provides high model accuracy and low communication overhead. ClusterFL features a novel clustered multi-task federated learning framework that maximizes the training accuracy of multiple learned models while automatically capturing the intrinsic clustering relationships among the data of different nodes. Second, in both local and global views, the distribution of different activities usually exhibits a long-tail effect, where some activities, such as “sitting”, occur frequently, while others, like “writing”, appear rarely. We propose BalanceFL, a federated learning framework that can robustly learn both common and rare classes from long-tailed real-world data. Instead of letting nodes upload biased local models trained on imbalanced private data, we design a new local self-balancing scheme that forces the uploaded local model to behave as if it were trained on a uniformly distributed dataset, with the help of the aggregated global model. Extensive experiments on an NVIDIA edge FL testbed using real-world HAR datasets show that our approach achieves high model accuracy and low communication latency.
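
    One ingredient of this self-balancing idea can be sketched as an inverse-frequency reweighted local loss, which makes a long-tailed private dataset contribute to training as if its classes were uniform. BalanceFL's full scheme is richer (it also leverages the aggregated global model, e.g., for classes missing locally), so treat this only as the reweighting intuition, with assumed class counts.

    ```python
    import torch
    import torch.nn.functional as F

    def balanced_ce(logits: torch.Tensor, targets: torch.Tensor,
                    class_counts: torch.Tensor) -> torch.Tensor:
        """Cross-entropy with inverse-frequency class weights."""
        weights = class_counts.sum() / (len(class_counts) * class_counts.clamp(min=1))
        return F.cross_entropy(logits, targets, weight=weights.float())

    # Toy long-tailed local dataset: "sitting" dominates, "writing" is rare.
    counts = torch.tensor([900.0, 80.0, 20.0])       # per-class sample counts
    logits = torch.randn(32, 3, requires_grad=True)  # local model outputs
    targets = torch.randint(0, 3, (32,))
    loss = balanced_ce(logits, targets, counts)
    loss.backward()  # rare-class errors now carry proportionally more weight
    ```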

Major Publications

    1. Xiaomin Ouyang, Xian Shuai, Jiayu Zhou, Ivy Wang Shi, Zhiyuan Xie, Guoliang Xing, Jianwei Huang, "Cosmo: Contrastive Fusion Learning with Small Data for Multimodal Human Activity Recognition", The 28th Annual International Conference on Mobile Computing and Networking (MobiCom), 2022, acceptance ratio: 56/317=17.7%.

    2. Xiaomin Ouyang, Zhiyuan Xie, Jiayu Zhou, Jianwei Huang, Guoliang Xing, "ClusterFL: A Similarity-Aware Federated Learning System for Human Activity Recognition", The 19th Annual International Conference on Mobile Systems, Applications, and Services (MobiSys), 2021, acceptance ratio: 36/166=21.6%.

    3. Linlin Tu, Xiaomin Ouyang, Jiayu Zhou, Yuze He, Guoliang Xing, "FedDL: Federated Learning via Dynamic Layer Sharing for Human Activity Recognition", The 19th ACM Conference on Embedded Networked Sensor Systems (SenSys), 2021, acceptance ratio: 25/139=17.98%.

    4. Xian Shuai, Yulin Shen, Siyang Jiang, Zhihe Zhao, Zhenyu Yan, Guoliang Xing, "BalanceFL: Addressing Class Imbalance in Long-Tail Federated Learning", The 21st ACM/IEEE International Conference on Information Processing in Sensor Networks (IPSN), 2022, acceptance ratio: 38/126=30.2%.


Funding

    1. Alzheimer’s Drug Discovery Foundation (ADDF), “Machine Learning Technologies for Advanced Digital Biomarkers for Alzheimer's Disease”, PI, HKD $5,560,354, 2021-2023.

    2. Collaborative Research Fund, "Small Data Learning for Alzheimer's Disease: From Digital Biomarker to Personalized Intervention", PI, HKD $8,230,720, 2022-2025.

    3. RGC Research Grant General Research Fund, "HomeSense: A Pervasive System for Home Activity Recognition via Federated Learning", PI, HKD $1,045,055, 2021-2023.