
SLIDE 1

Exploratory Application of AI/ML in Clinical Development

Jane Tiller, FRCPsych


SLIDE 2

Disclosures
Full time employee of BlackThorn Therapeutics
Own stock in Bristol Myers Squibb


SLIDE 3

Clinical Development: The Challenge

Significant unmet need
Limited brain-based understanding of behavior
Lack of novel targets/MOAs
High failure rate in clinical trials
Precision medicine: elusive in neuropsychiatric disorders

Can advances in computational and clinical neuroscience help address these challenges?

SLIDE 4

Can Explainable AI/ML Enable Precision Psychiatry?

[Figure: Measurement, Computation, Rx Application. Panel labels: Behavioral Symptoms, Facial/Voice Data, Functional Biomarkers, Brain Imaging; Neurotype 1, Neurotype 2, Neurotype 3; Identify patient subgroups; Identify patients most likely to respond to a specific treatment]

We believe we can use the power of AI/ML to identify patient subgroups that may be more likely to benefit from Rx

SLIDE 5

Use of AI/ML in Clinical Development

We have applied explainable AI (XAI)/ML approaches to three independent DBPC studies in major depression

XAI can:
Identify patients who are predicted to respond to a (specific) Rx
Generate insights from studies regardless of trial success
Offer an approach to select patients for clinical trials
Patient enrichment strategy
Targeted indication in later phase development

SLIDE 6

Example of exploratory AI/ML

Applied to a negative study for hypothesis generation

SLIDE 7

NEP-MDD-201: A Phase 2a Study of a Nociceptin Antagonist for Major Depressive Disorder (MDD): 2 Key Aims

1. BTRX-246040 (NEP-MDD-201)
Efficacy
Tolerability
Onset of Action

2. Dimensional Understanding of Symptoms
Key symptom domains relevant to the mechanism of action
Affect, Motivation, Cognition

Qualitative and Quantitative Assessments
Traditional Clinical Scales (MADRS)
Exploratory Vocal Biomarkers
Domain-Specific Clinical Scales (SHAPS, DARS)
Quantitative Behavioral Assessments (PRT, EEfRT)
Behavioral Fingerprinting (Mindstrong)

SLIDE 8

NEP-MDD-201 design and patient disposition

DBPC, 1:1 randomization
104 MDD patients
1:1 stratification, SHAPS ≤ 4 : SHAPS > 4
Dose: 80 mg
8-week treatment phase
MADRS primary outcome measure (change from baseline to week 8)

SLIDE 9

BTRX-246040: Well tolerated
No significant effect on the primary outcome measure

                 BTRX-246040   Placebo
Baseline MADRS   35.2          35.0
Week 8 MADRS     20.6          20.3
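
As a quick check on the "no significant effect" statement, the mean change from baseline can be worked out from the four MADRS values reported on this slide. This is only arithmetic on the reported group means, not the trial's statistical analysis, and the variable names are illustrative.

```python
# Mean MADRS scores as reported on the slide.
baseline = {"BTRX-246040": 35.2, "Placebo": 35.0}
week8 = {"BTRX-246040": 20.6, "Placebo": 20.3}

# Change from baseline to week 8 (negative = improvement).
change = {arm: round(week8[arm] - baseline[arm], 1) for arm in baseline}

# Drug-minus-placebo difference in mean change.
difference = round(change["BTRX-246040"] - change["Placebo"], 1)

print(change)      # {'BTRX-246040': -14.6, 'Placebo': -14.7}
print(difference)  # 0.1
```

Both arms improved by roughly 14.6 to 14.7 points, so the arms differ by about 0.1 point in mean change.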

SLIDE 10

Explainable AI applied to NEP-MDD-201

The rich phenotyping was an advantage for exploratory XAI analyses
The model is built using baseline features only
Objective was to predict change in MADRS score (baseline to week 8)

Baseline features (variables): demographics, scales and tasks

Age
Sex
MADRS (Montgomery-Asberg Depression Rating Scale)
HAMA (Hamilton Anxiety Rating Scale)
HADS (Hospital Anxiety and Depression Scale)
SHAPS (Snaith-Hamilton Pleasure Scale)
DARS (Dimensional Anhedonia Rating Scale)
PRT (Probabilistic Reward Task)
EEfRT (Effort Expenditure for Rewards Task)
FERT (Facial Expression Recognition Task)

SLIDE 11

Analytical Approaches

Personalized Advantage Index (PAI), Webb et al.

Assigns a score indexing the likelihood of responding to drug or placebo, based on baseline features alone

Forward Feature Selection model, Mellem et al.

Data reduction method based on the importance of the features in the predictive model

Multiple Correspondence Analysis (MCA)-based rule mining, Gao et al.

Generates a rule list to explain how to apply the features identified from forward feature selection
Example: If age < X and MADRS > Y then drug responder

Webb, C., et al. (2019). Personalized prediction of antidepressant v. placebo response: Evidence from the EMBARC study. Psychological Medicine, 49(7), 1118-1127.
Mellem MS, Liu Y, Gonzalez H, Kollada M, Martin WJ, Ahammad P (2019): Machine learning models identify multimodal measurements highly predictive of transdiagnostic symptom severity for mood, anhedonia, and anxiety. Biological Psychiatry: Cognitive Neuroscience and Neuroimaging. https://doi.org/10.1016/j.bpsc.2019.07.007
Gao, Gonzalez, Ahammad, “MCA-based Rule Mining Enables Interpretable Inference in Clinical Psychiatry.” arXiv:1810.11558. (AAAI 2019)

[Figure labels: Placebo indicated, Drug indicated, Top features]
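
The forward feature selection idea named on this slide can be illustrated with a generic greedy sketch: start with no features, repeatedly add the one that most reduces held-out error, and stop when nothing helps. This is not the Mellem et al. pipeline; the linear model, the train/validation split, the stopping tolerance and the toy data are all assumptions made for illustration.

```python
def solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting.
    Assumes the system is non-singular (true for the toy data below)."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit(rows, y, cols):
    """Least-squares coefficients (intercept first) using only the chosen columns."""
    design = [[1.0] + [row[c] for c in cols] for row in rows]
    k = len(cols) + 1
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * t for d, t in zip(design, y)) for i in range(k)]
    return solve(xtx, xty)

def mse(coef, rows, y, cols):
    """Mean squared prediction error of a fitted model on the given rows."""
    preds = [coef[0] + sum(c * row[j] for c, j in zip(coef[1:], cols)) for row in rows]
    return sum((p - t) ** 2 for p, t in zip(preds, y)) / len(y)

def forward_select(train, y_train, valid, y_valid, n_features, tol=1e-6):
    """Greedily add the feature that most lowers validation error; stop when none helps."""
    selected = []
    best = mse(fit(train, y_train, []), valid, y_valid, [])
    while len(selected) < n_features:
        scores = {}
        for c in range(n_features):
            if c not in selected:
                cols = selected + [c]
                scores[c] = mse(fit(train, y_train, cols), valid, y_valid, cols)
        candidate = min(scores, key=scores.get)
        if best - scores[candidate] <= tol:
            break
        selected.append(candidate)
        best = scores[candidate]
    return selected

# Hypothetical toy data: three features on a 3x3x3 grid. The outcome depends
# on features 0 and 2 only; feature 1 is irrelevant.
rows = [(float(a), float(b), float(c)) for a in range(3) for b in range(3) for c in range(3)]
y = [3 * a - 2 * c for a, b, c in rows]
train, y_train = rows[0::2], y[0::2]
valid, y_valid = rows[1::2], y[1::2]

selected = forward_select(train, y_train, valid, y_valid, n_features=3)
print(selected)  # [0, 2]
```

The order of selection doubles as a rough importance ranking, which is what makes the method usable for data reduction before rule mining.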

SLIDE 12

Personalized Advantage Index (PAI)

Indexes the likelihood of responding to drug or placebo
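
A minimal sketch of the idea behind a Personalized Advantage Index: fit one outcome model per arm on baseline features, then score each patient by the difference between the two predicted outcomes. The single baseline feature, the simple linear model and the toy numbers are assumptions for illustration only; Webb et al. describe the published approach, and a real analysis would predict on held-out patients.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x with one predictor."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

def predict(model, x):
    intercept, slope = model
    return intercept + slope * x

# Hypothetical toy data: (baseline feature, arm, change in MADRS).
# More negative change = larger improvement. These are not trial data.
patients = [
    (1.0, "drug", -8.0), (2.0, "drug", -12.0), (3.0, "drug", -16.0), (4.0, "drug", -20.0),
    (1.0, "placebo", -14.0), (2.0, "placebo", -14.0), (3.0, "placebo", -14.0), (4.0, "placebo", -14.0),
]

# One outcome model per treatment arm, built from baseline information only.
models = {}
for arm in ("drug", "placebo"):
    xs = [x for x, a, _ in patients if a == arm]
    ys = [y for _, a, y in patients if a == arm]
    models[arm] = fit_line(xs, ys)

def advantage_index(x):
    """Predicted change on placebo minus predicted change on drug.
    Positive values mean the drug is predicted to help more."""
    return predict(models["placebo"], x) - predict(models["drug"], x)

def indication(x):
    return "drug indicated" if advantage_index(x) > 0 else "placebo indicated"

for x in (1.0, 4.0):
    print(x, round(advantage_index(x), 2), indication(x))
```

In this toy setup a patient with a baseline value of 4.0 scores about +6 (drug indicated) and one with 1.0 scores about -6 (placebo indicated).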

SLIDE 13

Predicted Response Vs Actual Response

SLIDE 14

Separation Was Seen At All Time Points

SLIDE 15

Turning PAI into Actionable Insights

PAI can predict who will respond but does not tell you why
Clinicians (and clinical developers) want interpretable results
XAI algorithm generates a “rule list” to classify individuals which can be interpreted by experts
In the form: if <literal 1> and … and <literal k> then <emission>
Fully transparent

Gao, Gonzalez, Ahammad, “MCA-based Rule Mining Enables Interpretable Inference in Clinical Psychiatry.” arXiv:1810.11558. (AAAI 2019)

SLIDE 16

BTRX-246040 Indicated vs Rest: an example

1. IF FERT Response Bias - Angry smaller than X AND HADS-A larger than Y
   THEN BTRX-246040 Indicated → P=0.789, CI=(0.586, 0.936)

2. ELSE IF HAMA smaller than P AND HADS-D larger than Q AND PRT Hit Rate Lean – Block 3 smaller than R
   THEN Rest → P=0.969, CI=(0.888, 0.999)

3. ELSE Rest → P=0.545, CI=(0.340, 0.743)
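
A rule list like the one above is evaluated top to bottom, and the first rule whose conditions all hold decides the label. The sketch below shows that mechanic. The cut-offs standing in for X, Y, P, Q and R are hypothetical, since the slide does not report them, and the field names are illustrative; only the rule structure and the P values come from the slide.

```python
from typing import Callable, Dict, List, Tuple

Patient = Dict[str, float]
Rule = Tuple[Callable[[Patient], bool], str, float]

# Hypothetical cut-offs: the slide gives only the placeholders X, Y, P, Q, R.
X, Y, P, Q, R = 0.0, 10.0, 15.0, 8.0, 0.5

# Ordered rules: (condition, label, probability reported on the slide).
RULES: List[Rule] = [
    (lambda p: p["fert_bias_angry"] < X and p["hads_a"] > Y,
     "BTRX-246040 indicated", 0.789),
    (lambda p: p["hama"] < P and p["hads_d"] > Q and p["prt_hit_rate_lean_b3"] < R,
     "Rest", 0.969),
    (lambda p: True, "Rest", 0.545),  # default rule, always matches
]

def classify(patient: Patient) -> Tuple[int, str, float]:
    """Return (rule number, label, probability) for the first matching rule."""
    for number, (condition, label, probability) in enumerate(RULES, start=1):
        if condition(patient):
            return number, label, probability
    raise ValueError("rule list has no default rule")

# Made-up patients, one per rule.
example_1 = {"fert_bias_angry": -0.2, "hads_a": 12, "hama": 20, "hads_d": 5, "prt_hit_rate_lean_b3": 0.7}
example_2 = {"fert_bias_angry": 0.3, "hads_a": 12, "hama": 10, "hads_d": 9, "prt_hit_rate_lean_b3": 0.4}
example_3 = {"fert_bias_angry": 0.3, "hads_a": 12, "hama": 20, "hads_d": 9, "prt_hit_rate_lean_b3": 0.4}

for patient in (example_1, example_2, example_3):
    print(classify(patient))
```

Because each decision traces back to a single named rule, a clinician can see which baseline measures drove the classification.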

SLIDE 17

Potential Uses for Clinical Development

Hypothesis generation
Trial enrichment
Enroll based on the rules
Tailor the rule list
Omit features from the model inputs, to allow trade-offs between effect size, operational ease and addressable population to be interrogated

SLIDE 18

Potential for Clinical Development

Retrospective analyses of 3 DBPC MDD trials show an increased effect size

[Figure: effect size (axis 0.2 to 1.4) for Study 1, Study 2 and Study 3, comparing study effect size with rule list effect size, with a reference line for the approximate mean effect size of approved antidepressants. Values as extracted: 0.02, 0.12, 0.56, 0.68, 0.40, 0.82]

[Figure: tailored rule lists, Option A, Option B and Option C, with effect sizes of ~0.6, ~0.8 and ~1.2. Labels as extracted: very high precision, small population; high precision, moderate population; lower precision, higher population]

Needs prospective testing
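
To make the comparison of study effect size and rule list effect size concrete, the sketch below computes a standardized mean difference for a whole sample and for the subgroup a rule list would select. It assumes the effect size is Cohen's d with a pooled standard deviation, which the slide does not state, and all numbers are made up for illustration.

```python
from math import sqrt
from statistics import mean, stdev

def cohens_d(drug, placebo):
    """Standardized mean difference using a pooled standard deviation."""
    n1, n2 = len(drug), len(placebo)
    pooled_var = ((n1 - 1) * stdev(drug) ** 2 + (n2 - 1) * stdev(placebo) ** 2) / (n1 + n2 - 2)
    return (mean(drug) - mean(placebo)) / sqrt(pooled_var)

# Hypothetical records: (selected by rule list, arm, MADRS improvement in points).
records = [
    (True, "drug", 16), (True, "drug", 18), (True, "drug", 20),
    (True, "placebo", 12), (True, "placebo", 14), (True, "placebo", 16),
    (False, "drug", 10), (False, "drug", 12), (False, "drug", 14),
    (False, "placebo", 14), (False, "placebo", 16), (False, "placebo", 18),
]

def improvements(arm, selected_only=False):
    return [v for sel, a, v in records if a == arm and (sel or not selected_only)]

# Effect size in everyone versus in the rule-selected subgroup only.
d_study = cohens_d(improvements("drug"), improvements("placebo"))
d_rule_list = cohens_d(improvements("drug", True), improvements("placebo", True))

# Share of the sample the rule list would enroll (addressable population).
share_selected = sum(1 for sel, _, _ in records if sel) / len(records)

print(round(d_study, 2), round(d_rule_list, 2), share_selected)
```

In this toy sample the overall effect size is zero while the selected half shows a large effect, which is the trade-off between effect size and addressable population that the tailored rule list options describe.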

SLIDE 19

What Have We Learned So Far?

No one analytical approach is sufficient
Even “off-the-shelf” tools need modification
Combination of analytical approaches required to generate explainable models
We can build explainable models to predict patient response – prospective testing needed
XAI offers an approach to hypothesis generation and, potentially, enrichment for clinical trials

SLIDE 20

Digital Phenotyping

NEP-MDD-201

SLIDE 21

Smartphone Digital Phenotyping

Gestures used (taps, swipes)
Orientation and acceleration of the phone
Keystroke patterns
Word histograms (“word clouds”)
Number of phone calls, time and date
Number of emails, time and date
Number of text messages, time and date
Location

Mindstrong Digital Biomarker   Validated Assessment
Mood                           HAMD
Processing Speed               Symbol Digit Modality
Working Memory                 Digits Forward
Visual Memory                  Brief Visual Memory Test
Cognitive Control              Go-No-Go

Paul Dagum. Digital Biomarkers of Cognitive Function. npj Digital Medicine(2018)1:10 ; doi:10.1038/s41746-018-0018-4

Machine learning, pattern identification and feature extraction

SLIDE 22

Smartphone Digital Phenotyping

No effect on biomarkers for:
HAMD
Processing speed
Working memory
Some effects on age-adjusted biomarkers for:

Visual memory
Cognitive control

Digital Biomarkers of Cognitive Function, Paul Dagum npj Digital Medicine(2018)1:10 ; doi:10.1038/s41746-018-0018-4

SLIDE 23

Practical Learnings from NEP-MDD-201

Heavy burden of instrumentation in this trial, 6-8 hours per site visit
Slow enrollment (<1 pt/site/month) and placebo response
Complexity
Education for sites

Privacy concerns
App store warning vs ICF
Some subjects chose not to participate
BYOD (notifications turned off)
Terminating data collection for patients LTFU
Vocal data needed to be listened to for AEs

SLIDE 24

Multimodal biomarker development: depression, anxiety and wellness

Evolve subjective scales to quantitative behavioral scales for higher resolution brain disorder models
Development of multimodal measures for flexibility and higher specificity/selectivity across subsegments

Feasibility and Phase 0 study

4x more participants enrolled than study design
Enrollment completed in < 30 days via social media ads
Captured vocal and facial data for mood and anxiety research
Fully integrated with pathfinder™ platform for data capture, data analysis and machine learning

SLIDE 25

Conclusions and Future Direction

XAI methodologies can be successfully applied to psychiatric data sets
Rule lists can be tailored and offer the potential for patient selection
Requires randomized placebo-controlled data
Data will iteratively increase precision
Results need prospective evaluation
Performance in other diagnoses than MDD is not yet known
Digital and quantitative biomarkers in clinical trials need to be low burden
Multimodal assessments appear to confer advantage but need integrated

SLIDE 26

References

MADRS: Montgomery and Asberg Depression Rating Scale. Montgomery and Asberg. Brit. J. Psychiat. 1979; 134: 382-389.
HAMA: The assessment of anxiety states by rating. Br J Med Psychol 1959; 32: 50–55.
HADS-A & HADS-D: Hospital Anxiety and Depression Rating Scale. Zigmond AS, Snaith RP. Acta Psychiatr Scand. 1983; 67: 361–370.
SHAPS: Snaith et al. A scale for the assessment of hedonic tone. The Snaith-Hamilton Pleasure Scale. The British Journal of Psychiatry (1995) 167, 99-103.
DARS: Development and validation of the Dimensional Anhedonia Rating Scale (DARS) in a community sample and individuals with major depression. Psychiatry Res. 2015 Sep 30; 229(1-2): 109-19.
PRT: Probabilistic Reward Task. Pizzagalli, D. A., Jahn, A. L., O’Shea, J. P. (2005). Toward an objective characterization of an anhedonic phenotype: A signal-detection approach. Biological Psychiatry, 57, 319-327.
EEfRT: Effort Expenditure for Rewards Task. Treadway et al. 2009. “Worth the EEfRT? The Effort Expenditure for Rewards Task as an Objective Measure of Motivation and Anhedonia.” PLOS ONE 4(8): e6598. doi:10.1371/journal.pone.0006598
FERT: Facial Expression Recognition Task. Psychol Assess. 2018 Nov; 30(11): 1479-1490. doi: 10.1037/pas0000595. Epub 2018 Jul 19.
Mellem MS, Liu Y, Gonzalez H, Kollada M, Martin WJ, Ahammad P (2019): Machine learning models identify multimodal measurements highly predictive of transdiagnostic symptom severity for mood, anhedonia, and anxiety. Biological Psychiatry: Cognitive Neuroscience and Neuroimaging. https://doi.org/10.1016/j.bpsc.2019.07.007
PAI modified from Webb et al.: Webb, C., et al. (2019). Personalized prediction of antidepressant v. placebo response: Evidence from the EMBARC study. Psychological Medicine, 49(7), 1118-1127.
Gao, Q., Gonzalez, H., & Ahammad, P. (2019). MCA-based Rule Mining Enables Interpretable Inference in Clinical Psychiatry. In: Precision Health and Medicine. W3PHAI 2019. Studies in Computational Intelligence, vol 843. Springer.
Paul Dagum. Digital Biomarkers of Cognitive Function. npj Digital Medicine (2018) 1:10; doi:10.1038/s41746-018-0018-4