PRISM

PRISM is an open-source, multi-sensor health platform for ecological momentary assessment, just-in-time interventions, and passive sensing. It enables researchers to collect real-world self-report, audio-context, physiological, behavioural, and environmental data from smartphones, smartwatches, and connected devices.

The platform was initially developed for hearing and communication research, where understanding listening difficulty in everyday environments requires real-time, context-rich measurement. However, the architecture is designed to support broader health and behavioural research applications.

PRISM might stand for:

PRedicting Interaction difficulty using Sensors and Micro-EMA
Personalized Real-time Interaction difficulty for Sound Measurement
Platform for Research In Situational Momentary assessment.

Modern hearing aids often struggle to adapt to rapidly changing sound environments, like moving from a quiet room to a busy street or a noisy restaurant. PRISM and related technologies help bridge the gap between raw audio processing and intelligent environmental awareness, leading to a more seamless and personalised hearing experience for users.

Get started

Learn how to host your own instance of the PRISM Research Portal: Setup Guide documentation.
Learn how to design and deploy your own PRISM experiments from the Research Portal: Researcher Guide documentation.
Learn how to use the PRISM App to send data back to an experiment: Participant Guide documentation.
Dive deeper into the Australian Future Hearing Initiative and the live hearing tests you can take: Other Tools documentation.
Read about the motivation and science behind the PRISM platform below!

Capabilities

A web portal for researchers to run experiments and view results.
An Android phone app and a Wear OS watch app for participant data collection.
A backend server for data storage and processing.
A set of machine learning models for auditory scene analysis.

The Problem: Hearing in Complex Environments

Hearing loss affects over 1.5 billion people worldwide and is one of the leading contributors to reduced quality of life. Modern hearing aids are increasingly sophisticated, yet they still struggle to adapt to rapidly changing sound environments, such as moving from a quiet office to a busy street, a noisy restaurant, or a reverberant lecture hall.

Traditional hearing-aid fitting is performed in a clinical setting with a controlled acoustic environment, but real-world listening situations are far more varied and unpredictable. There is a well-documented gap between how a hearing device performs in the clinic and how it performs in the wild. Bridging this gap requires understanding when, where, and why users experience difficulty — and that understanding can only come from data collected in the user’s everyday life.

What is PRISM?

Ecological Momentary Assessment (EMA)

PRISM is built around the methodology of Ecological Momentary Assessment (EMA), a research technique that captures data from participants in their natural environments, in real time, rather than relying on retrospective recall in a lab.

In a PRISM experiment, participants carry a smartphone (and optionally a smartwatch) that:

Captures a brief audio snapshot: a short recording of the acoustic environment.
Prompts a self-report: the participant rates their current listening experience via simple feedback buttons (e.g., Good, Okay, Bad).
Collects wearable sensor data: physiological and inertial signals that provide additional context about the listener’s state and activity.

This approach provides ecologically valid, time-stamped, multi-modal data that links subjective hearing difficulty to objective environmental and physiological measurements.
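
As a rough illustration of what one such sample might contain, the sketch below models a single EMA record in Python. The class and field names (`EmaSample`, `Rating`, `audio_path`, `sensors`) and the example values are hypothetical and do not reflect PRISM's actual schema; only the three components (audio snapshot, self-report, wearable sensor data) and the example ratings come from the description above.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum


class Rating(Enum):
    """Self-report options, mirroring the example feedback buttons."""
    GOOD = "Good"
    OKAY = "Okay"
    BAD = "Bad"


@dataclass
class EmaSample:
    """One time-stamped, multi-modal EMA record (hypothetical layout)."""
    participant_id: str
    timestamp: datetime                  # when the sample was captured
    audio_path: str                      # location of the short audio snapshot
    rating: Rating                       # participant's self-report
    sensors: dict[str, float] = field(default_factory=dict)  # wearable readings


# Example record with made-up values
sample = EmaSample(
    participant_id="P001",
    timestamp=datetime(2026, 1, 15, 12, 30, tzinfo=timezone.utc),
    audio_path="snapshots/P001_0001.wav",
    rating=Rating.BAD,
    sensors={"heart_rate_bpm": 92.0},
)
print(sample.rating.value, sample.timestamp.isoformat())
```

Keeping the self-report, audio reference, and sensor readings in one time-stamped record is what lets a later analysis link subjective difficulty to the objective measurements taken at the same moment.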

Why PRISM?

PRISM addresses a key bottleneck in hearing science: the lack of large-scale, ecologically valid, multi-modal datasets linking real-world acoustic environments to user-reported hearing difficulty. By providing an end-to-end platform, from experiment design and participant recruitment through data collection and ML-assisted analysis, PRISM enables researchers to:

Study hearing difficulty where it actually happens, not just in the lab.
Combine subjective self-reports with objective sensor and ML-derived features.
Build and validate models that can eventually drive real-time, adaptive hearing-aid algorithms.
Rapidly iterate on experiment designs through a configurable web interface.

The long-term vision, as part of the Australian Future Hearing Initiative, is a future where hearing devices automatically recognise difficult listening situations and adjust their processing in real time, giving users seamless hearing across every environment they encounter.

Features

Multi-Modal Sensor Data Collection

A key design principle of PRISM is that hearing difficulty cannot be understood from audio alone. The platform collects data from multiple sensor modalities across phone and watch devices:

Sensor	Device	Purpose
Microphone	Phone, Watch	Captures environmental audio for acoustic scene classification
Heart Rate	Watch	Physiological indicator of stress or cognitive effort
Body Temperature	Watch	Additional physiological context
Accelerometer	Phone, Watch	Activity recognition (walking, sitting, commuting)
Gyroscope	Phone, Watch	Complements accelerometer for motion and posture estimation

Each sensor is independently configurable per experiment: researchers can set sampling rates, enable or disable raw data upload, and define thresholds that control when data collection or model inference is triggered.
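
The per-sensor options described above could be represented along the following lines. This is a minimal sketch, not PRISM's configuration format: `SensorConfig`, its field names, and the sampling rates and threshold shown are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SensorConfig:
    """Per-experiment settings for one sensor (hypothetical field names)."""
    sensor: str                         # e.g. "heart_rate"
    device: str                         # "phone" or "watch"
    sampling_rate_hz: float
    upload_raw: bool = False            # whether raw samples are uploaded
    threshold: Optional[float] = None   # value above which collection/inference starts

    def __post_init__(self):
        # Reject obviously invalid settings before an experiment is deployed
        if self.sampling_rate_hz <= 0:
            raise ValueError("sampling_rate_hz must be positive")
        if self.device not in ("phone", "watch"):
            raise ValueError(f"unknown device: {self.device!r}")


# Illustrative values only; real experiments would choose their own
experiment_sensors = [
    SensorConfig("microphone", "phone", sampling_rate_hz=16000, upload_raw=True),
    SensorConfig("heart_rate", "watch", sampling_rate_hz=1, threshold=100),
    SensorConfig("accelerometer", "watch", sampling_rate_hz=50),
]

# Sensors whose raw data would be uploaded in this example experiment
raw_upload = [c.sensor for c in experiment_sensors if c.upload_raw]
print(raw_upload)
```

Defaulting `upload_raw` to `False` in this sketch reflects a privacy-leaning choice: raw data leaves the device only when a researcher explicitly enables it.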

On-Device Machine Learning

PRISM integrates machine learning models that run alongside the data collection pipeline. These models serve two primary roles defined in the platform:

YAMNet+ - Acoustic Scene Classification

YAMNet+ is an environment analysis model that classifies the acoustic scene surrounding the participant (e.g., open space, restaurant, traffic, quiet room). It takes normalised float32 audio as input and produces scene labels, enriching each data sample with an automatic environmental context tag.

This builds on the YAMNet architecture and is described in:

Zhong, H., Buchholz, J. M., Maclaren, J., Carlile, S., & Lyon, R. (2026). A dataset and model for auditory scene recognition for hearing devices: AHEAD-DS and OpenYAMNet. arXiv:2508.10360
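
To make "normalised float32 audio" concrete, the sketch below converts 16-bit PCM samples into float32 values. It assumes the source audio is little-endian 16-bit PCM and that "normalised" means scaled into the range [-1.0, 1.0), which is the convention of the original YAMNet; whether YAMNet+ uses exactly this convention is an assumption. The function name `pcm16_to_float32` is hypothetical.

```python
import sys
from array import array


def pcm16_to_float32(pcm_bytes: bytes) -> array:
    """Convert little-endian 16-bit PCM bytes to float32 samples in [-1.0, 1.0)."""
    samples = array("h")              # signed 16-bit integers, native byte order
    samples.frombytes(pcm_bytes)      # raises ValueError on an odd byte count
    if sys.byteorder == "big":
        samples.byteswap()            # input is little-endian by assumption
    # Dividing by 32768 maps the int16 range [-32768, 32767] onto [-1.0, 1.0)
    return array("f", (s / 32768.0 for s in samples))


# Three samples: 0, 16384 and -32768 encoded as little-endian int16
snapshot = pcm16_to_float32(b"\x00\x00\x00\x40\x00\x80")
print(list(snapshot))
```

Only the standard library is used here; in practice the same scaling is usually done on a NumPy array before the samples are passed to a model.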

WONU - Communication Breakdown Detection

The Words of Not Understanding (WONU) model detects linguistic cues in speech that indicate a communication breakdown. Phrases such as “pardon?”, “sorry, what?”, or “can you repeat that?” are examples of such cues. These cues are a direct, in-the-moment signal that a hearing-aid user is struggling to follow conversation, making WONU a complementary signal to the participant’s self-report.
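
The kind of cue WONU targets can be illustrated with a deliberately naive phrase matcher. This is not the WONU model and makes no claim about how it works: it is a plain keyword baseline that assumes a text transcript is already available, and the names `CUE_PHRASES` and `find_cues` are hypothetical. The three phrases are the examples given above.

```python
import re

# Example cue phrases from the text; a real system would need far broader coverage
CUE_PHRASES = ("pardon", "sorry, what", "can you repeat that")


def _normalise(text: str) -> str:
    """Lower-case, replace punctuation with spaces, and collapse whitespace."""
    words = re.sub(r"[^a-z'\s]", " ", text.lower()).split()
    return " ".join(words)


def find_cues(transcript: str) -> list[str]:
    """Return the cue phrases that occur as whole words in the transcript."""
    padded = f" {_normalise(transcript)} "
    # Padding with spaces means "pardon" does not match inside "pardoner"
    return [p for p in CUE_PHRASES if f" {_normalise(p)} " in padded]


print(find_cues("Sorry, what? I missed that."))
print(find_cues("The pardoner spoke first."))
```

A keyword list like this misses paraphrases and flags polite uses of "pardon" that signal no breakdown, which is part of why a learned model is attractive for this task.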

ML Models List

Rule-Based Triggers

Researchers can configure sensor-threshold triggers that automate data collection and model activation. A trigger is defined by:

A sensor source (e.g., phone microphone, heart rate monitor).
A metric and threshold (e.g., sound level > 70 dB, heart rate > 100 BPM, temperature > 37 °C).
An action - what happens when the threshold is crossed:
- Activate an ML model - wake YAMNet+ or WONU to classify the current moment.
- Send a notification - prompt the participant to provide a self-report.
- Launch a form - open an in-app or Google Form survey for richer contextual data.

This trigger system allows PRISM experiments to be event-driven rather than purely scheduled, capturing data precisely when listening conditions are most challenging or interesting.
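
The trigger definition above (source, metric and threshold, action) can be sketched as a small data structure plus an evaluation function. The names `Trigger`, `Action`, and `fired_actions`, and the source and metric identifiers, are hypothetical; the thresholds are the examples given in the list above. This is one possible reading of the design, not the platform's implementation.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    ACTIVATE_MODEL = "activate_model"        # wake YAMNet+ or WONU
    SEND_NOTIFICATION = "send_notification"  # prompt a self-report
    LAUNCH_FORM = "launch_form"              # open a survey


@dataclass(frozen=True)
class Trigger:
    source: str        # e.g. "phone_microphone"
    metric: str        # e.g. "sound_level_db"
    threshold: float
    action: Action


def fired_actions(triggers, readings):
    """Return the actions of triggers whose latest reading exceeds the threshold.

    `readings` maps (source, metric) pairs to the most recent value;
    a trigger with no reading never fires.
    """
    fired = []
    for t in triggers:
        value = readings.get((t.source, t.metric))
        if value is not None and value > t.threshold:
            fired.append(t.action)
    return fired


triggers = [
    Trigger("phone_microphone", "sound_level_db", 70.0, Action.ACTIVATE_MODEL),
    Trigger("watch_heart_rate", "bpm", 100.0, Action.SEND_NOTIFICATION),
    Trigger("watch_temperature", "celsius", 37.0, Action.LAUNCH_FORM),
]
readings = {
    ("phone_microphone", "sound_level_db"): 76.5,
    ("watch_heart_rate", "bpm"): 88.0,
}
actions = fired_actions(triggers, readings)
print(actions)
```

Note that this sketch fires for as long as a reading stays above its threshold. A deployed system would more likely fire once when the threshold is crossed and apply a cool-down, so that participants are not prompted repeatedly in the same noisy environment.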

PRISM Study/Experiment Workflow Overview

Running a PRISM study requires three main components:

flowchart TD
    subgraph s1["Firebase"]
        direction LR
        n1@{ shape: "hex", label: "Create Firebase Project" }
    end

    subgraph s2["Research Portal"]
        direction TB
        n2@{ label: "Backend" }
        n3@{ label: "Frontend" }
        n2 <--> n3
        n4@{ shape: "hex", label: "Install Research Portal" }
        n5@{ shape: "hex", label: "Configure and deploy experiments" }
        n6@{ shape: "hex", label: "Experimenter monitors study" }
        n4 --> n5
        n5 --> n6
    end

    subgraph s3["App"]
        direction TB
        n7@{ label: "Android" }
        n8@{ label: "WearOS" }
        n7 --> n8
        n9@{ shape: "hex", label: "Build and Distribute App" }
        n10@{ shape: "hex", label: "Users interact with App" }
        n9 --> n10
    end
    
    s1 <--> s2
    s1 <--> s3

    style n1 fill:#FAD2CF
    style n2 fill:#FEEFC3
    style n3 fill:#FEEFC3
    style n4 fill:#FAD2CF
    style n5 fill:#FAD2CF
    style n6 fill:#FAD2CF
    style n7 fill:#FEEFC3
    style n8 fill:#FEEFC3
    style n9 fill:#FAD2CF
    style n10 fill:#FAD2CF
    style s1 fill:#D2E3FC
    style s2 fill:#CEEAD6
    style s3 fill:#CEEAD6

Firebase project: you must create a Firebase project to provide authentication and secure storage for a PRISM experiment.
Research Portal: you must install the PRISM Research Portal, from which you can configure experiments, add and communicate with participants, and monitor the progress of the data collection study.
App: participants must install the PRISM App so they can interact with the PRISM experiment.

Citation

If you use the code, models, or datasets from this project in your research, please cite:

@misc{zhong2026datasetmodelauditoryscene,
      title={A dataset and model for auditory scene recognition for hearing devices: AHEAD-DS and OpenYAMNet}, 
      author={Henry Zhong and Jörg M. Buchholz and Julian Maclaren and Simon Carlile and Richard Lyon},
      year={2026},
      eprint={2508.10360},
      archivePrefix={arXiv},
      primaryClass={cs.SD},
      url={https://arxiv.org/abs/2508.10360}, 
}

License

Code is licensed under the MIT License or Apache 2.0 depending on the component. Please check individual LICENCE files in each repository.

Contact

For questions, feedback, or collaboration opportunities, please contact:

Name: Romaric Bouveret
Email: romaric.bouveret@mq.edu.au