Migraine Prediction Engine

Overview

The Migraine Prediction Engine is a data-driven research project aimed at leveraging advanced machine learning models to predict migraine episodes. Conducted across two distinct phases, this project integrates diverse data streams, including digital biomarkers from wearables, weather variables, and patient-reported digital diaries. The ultimate goal is to reduce the uncertainty and burden of migraines, enabling patients to take preventive measures effectively.

Motivation

Migraines represent one of the most debilitating neurological disorders, disproportionately affecting women during their peak productive years. The unpredictability of migraine episodes exacerbates anxiety and significantly impacts quality of life. Existing solutions, both prophylactic and abortive, are often insufficient and can lead to adverse side effects. This project seeks to address these limitations by developing a predictive system that combines physiological, environmental, and personal data to provide actionable insights for migraine sufferers.

Situation

Traditional methods for migraine management rely heavily on patients identifying and avoiding triggers through manual logging, which is prone to recall bias. While some digital solutions exist, they often require exhaustive user input and fail to accommodate the unique migraine patterns of individuals. Furthermore, single-stream models lack the predictive accuracy needed to mitigate migraine episodes effectively. Our study aims to build a more comprehensive, personalized approach using state-of-the-art machine learning models.

Task

Our research was structured in two phases:

Phase 1: Understanding the migraine landscape in a Pakistani context, where environmental and socio-cultural factors pose additional challenges. This phase involved qualitative research through semi-structured interviews with migraine patients and healthcare professionals to identify common triggers, symptoms, and the feasibility of tech-based interventions.
Phase 2: A more data-intensive in-lab study conducted in Massachusetts, USA, focusing on developing and validating predictive models using multi-stream data. This phase included the deployment of wearable devices, the development of a digital diary app, and the integration of weather data.

Action

Phase 1: Qualitative Analysis

Data Collection: Conducted 13 interviews with migraine patients and 2 with doctors to explore migraine management practices, identify common triggers, and assess the potential acceptance of wearable technology in Pakistan.
Thematic Analysis: Coded interview transcripts to extract themes related to migraine perceptions, common symptoms, and barriers to effective management. Key findings indicated a preference for non-pharmacological treatments and a reluctance to adopt wearable technology due to cultural and economic factors.

Phase 2: Quantitative Analysis and Model Development

Data Streams: Collected over 1,252 hours of wearable data, 733 days of diary entries, and two years of weather information. The data were processed and modeled in Python with libraries such as scikit-learn, PyTorch, and TensorFlow.
Feature Engineering: Converted qualitative inputs into machine-readable formats using one-hot encoding for categorical data, and filled gaps in the time-series data using moving averages.
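
The two feature-engineering steps named above can be illustrated with a minimal pandas sketch. The column names (stress_level, heart_rate), the sample values, and the three-sample window are hypothetical and chosen only for illustration; the project's actual schema and window size are not stated here.

```python
import pandas as pd

# Hypothetical sample: one categorical diary field and one wearable signal
# with a missing reading. Names and values are illustrative assumptions.
df = pd.DataFrame({
    "stress_level": ["low", "high", "medium", "high"],
    "heart_rate": [62.0, None, 71.0, 75.0],
})

# One-hot encode the categorical field into one indicator column per category.
encoded = pd.get_dummies(df, columns=["stress_level"], prefix="stress")

# Trailing moving average over up to 3 samples; missing values are skipped
# when the mean is computed, so a gap is filled from the readings before it.
rolling_mean = encoded["heart_rate"].rolling(window=3, min_periods=1).mean()
encoded["heart_rate"] = encoded["heart_rate"].fillna(rolling_mean)

print(encoded)
```

A trailing window is used here so that imputed values depend only on earlier readings, which avoids leaking future information into a prediction task.
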
Model Training:
- Employed K-Nearest Neighbors (KNN), Random Forests, and Multilayer Perceptrons (MLP) for supervised learning.
- Trained Long Short-Term Memory (LSTM) networks on sequential data to account for temporal dependencies, implemented in PyTorch, with scikit-learn used for cross-validation.
- Integrated weather variables to understand their impact on migraine onset, using data from the Weatherbit API.
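
As a rough illustration of how the supervised models listed above could be compared under cross-validation, the following scikit-learn sketch uses a synthetic dataset as a stand-in for the engineered features and migraine labels. The hyperparameters, the 5-fold setup, and the feature scaling step are assumptions, not the project's actual configuration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the feature matrix X and binary migraine labels y.
X, y = make_classification(n_samples=200, n_features=8, random_state=0)

# Hyperparameters below are illustrative assumptions.
# KNN and MLP are scale-sensitive, so they are wrapped with a scaler.
models = {
    "knn": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "mlp": make_pipeline(
        StandardScaler(),
        MLPClassifier(hidden_layer_sizes=(16,), max_iter=1000, random_state=0),
    ),
}

# Mean accuracy over 5 cross-validation folds for each model.
scores = {
    name: cross_val_score(model, X, y, cv=5, scoring="accuracy").mean()
    for name, model in models.items()
}

for name, score in scores.items():
    print(f"{name}: {score:.2f}")
```

For time-ordered recordings, a time-aware splitter such as scikit-learn's TimeSeriesSplit may be a better fit than the default folds, since it keeps training data earlier than test data.
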
Voting Mechanism: Developed a novel voting algorithm to aggregate predictions from wearable and diary data streams, enhancing model robustness.
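
The exact voting rule is not specified here, but the Results section notes that a harmonic mean was used to combine diary and sensor predictions. A minimal sketch of that idea follows; the function name combine_streams, the use of per-stream probabilities, and the 0.5 decision threshold are hypothetical.

```python
from statistics import harmonic_mean


def combine_streams(p_wearable, p_diary, threshold=0.5):
    """Combine two per-stream migraine probabilities (hypothetical helper).

    The harmonic mean is pulled toward the smaller input, so an alert is
    raised only when both streams assign a reasonably high probability.
    The threshold value is an assumption for illustration.
    """
    combined = harmonic_mean([p_wearable, p_diary])
    return combined, combined >= threshold


# Both streams moderately confident: combined score stays above threshold.
print(combine_streams(0.8, 0.4))

# Streams disagree strongly: the low score dominates and no alert is raised.
print(combine_streams(0.9, 0.1))
```

Compared with an arithmetic mean, this choice penalizes disagreement between the streams, which is one plausible reason such a combination could improve precision.
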

Results

Phase 1 Insights: Identified unique triggers prevalent in Pakistan, such as fasting and extreme heat, and highlighted the cultural barriers to tech adoption. The thematic analysis provided foundational knowledge to tailor future interventions.
Phase 2 Model Performance: Achieved prediction accuracies between 60% and 73%, with lead times of 2 to 8 hours before migraine onset. The LSTM model predicted migraine episodes by detecting patterns in physiological data, stress levels, and weather fluctuations. Combining the diary and sensor predictions with a harmonic mean improved the system's overall precision.
Impact: Demonstrated the potential of personalized predictive models in reducing migraine-related distress. Emphasized the need for tailored solutions that address both individual and cultural factors.

Conclusion

The Migraine Prediction Engine exemplifies the integration of health informatics, machine learning, and user-centered design. By combining diverse data streams and employing robust statistical models, this project paves the way for innovative migraine management strategies. Our research underscores the importance of personalized healthcare solutions and the role of interdisciplinary approaches in addressing complex medical challenges.