Problem: Genome Data Held in Silos, Unshared, not Standardized for - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Problem: Genome Data Held in Silos, Unshared, not Standardized for Exchange

No one institute has enough on its own to make progress. Every researcher and clinician should be able to compare their genome data to others.

slide-2
SLIDE 2

We need a public ledger for sharing

GATTTATCTGCTCTCGTTG GAAGTACAAAATTCATTAAT GCTATGCACAAAATCTGTAG TAGTGTCCCATCTATTT C

slide-3
SLIDE 3

Alliance for data sharing

UC Santa Cruz Genomics Institute

slide-4
SLIDE 4

Enabling Responsible Sharing of Genomic and Clinical Data

  • GA4GH Founded on June 5, 2013

– More than 400 institutional members
– From more than 40 countries
– Approx. 1/3 are companies

  • Mission: to enable rapid progress in biomedicine
  • Strategy:

– support major driver projects
– create and maintain interoperability of technology platform standards
– develop guidelines and harmonizing procedures for privacy and ethics in the international regulatory context
– engage stakeholders across sectors to encourage the responsible and voluntary sharing of data and of methods

slide-5
SLIDE 5

genomicsandhealth.org

Cancer Global Data Sharing

Cancer is driven by mutations in DNA. Precision treatment of cancer depends on knowledge of these mutations.

Start with a Pilot:

  • build a mechanism for recording cancer DNA mutations and clinical information from millions of cancer patient participants across the world
  • Initially called the Actionable Cancer Genome Initiative
  • Cancer is the right place to start. Once this is working, similar technology could be used to share DNA information for other diseases

slide-6
SLIDE 6

Example proposed public record

gene: BRAF
variant: V600E
Patient ID: 163a0083-26fa-4705-bc\-d264c4cff796
Gender: Male
Ethnicity: White Caucasian
Age at Diagnosis: 57
Tumor Classification: non-small-cell lung carcinoma (MeSH D002289)
Tissue or organ of origin: Lung
Tumor morphology: Squamous (epidermoid)
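A record like the one above could be modeled as a small typed structure. The following is a minimal sketch, not the pilot's actual schema: the class name `PublicRecord`, the field names, the helpers `new_patient_id` and `to_json`, and the choice of a version-4 UUID for the random participant ID are all assumptions made for illustration.

```python
import json
import uuid
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class PublicRecord:
    """Hypothetical shape of one public ledger record (field names are assumed)."""
    gene: str
    variant: str
    patient_id: str
    gender: str
    ethnicity: str
    age_at_diagnosis: int
    tumor_classification: str
    mesh_id: str
    tissue_of_origin: str
    tumor_morphology: str


def new_patient_id() -> str:
    """Return a random identifier that carries no personal information."""
    return str(uuid.uuid4())


def to_json(record: PublicRecord) -> str:
    """Serialize a record with stable key order so that copies compare equal."""
    return json.dumps(asdict(record), sort_keys=True)


example = PublicRecord(
    gene="BRAF",
    variant="V600E",
    patient_id=new_patient_id(),
    gender="Male",
    ethnicity="White Caucasian",
    age_at_diagnosis=57,
    tumor_classification="non-small-cell lung carcinoma",
    mesh_id="D002289",
    tissue_of_origin="Lung",
    tumor_morphology="Squamous (epidermoid)",
)
```

Keeping the record immutable (`frozen=True`) mirrors the idea that published entries are never edited in place.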

slide-7
SLIDE 7

Data Sharing Specifications

  • Data open and available to all
  • Ubiquitously accessible on the Internet
  • Can scale to accept donations from 1000s of sources
  • Not maintained by any central authority or tied to any single country, location or institution
  • Not corruptible
  • Protects participant privacy
  • Stable design so that it may be used by many 3rd party application programs (“apps”)

slide-8
SLIDE 8

This is accomplished with a Shared Public Ledger

Ethereum: https://www.ethereum.org/foundation
Ripple: https://ripple.com/
Hyperledger: https://github.com/hyperledger/hyperledger
IBM Open BlockChain: www.ibm.com/blockchain/
MIT Enigma project: enigma.media.mit.edu
AirBnB (proposed): www.coindesk.com/airbnb-exec-use-blockchain/

slide-9
SLIDE 9

Simplest Shared Public Ledger

  • Record of transactions over time
  • Transaction is adding information to the database
– Special case: New information marks previous information as out-of-date
  • It is only possible to add more transactions; no transaction is ever erased or altered
  • 1000s of copies of the ledger all over the world are kept in sync by miners while additional transactions come in from multiple sources
  • A shared public ledger keeps track of data provenance, i.e. when and how data was entered and updated, so users have the reputation/reliability information they need to filter out data they don’t want
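The append-only behavior described on this slide can be sketched in a few lines. This is an illustrative single-copy toy, assuming a SHA-256 hash chain for tamper evidence; it does not model mining or the synchronization of many copies, and the names `Transaction`, `Ledger`, `append`, `current` and `verify` are hypothetical, not taken from Ethereum or the pilot code.

```python
import hashlib
import json
import time
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Transaction:
    index: int
    timestamp: float           # provenance: when the data was entered
    steward: str               # provenance: who entered it
    payload: dict
    supersedes: Optional[int]  # index of an earlier transaction now out-of-date
    prev_hash: str             # digest of the previous transaction

    def digest(self) -> str:
        body = json.dumps(
            [self.index, self.timestamp, self.steward,
             self.payload, self.supersedes, self.prev_hash],
            sort_keys=True,
        )
        return hashlib.sha256(body.encode("utf-8")).hexdigest()


class Ledger:
    """Toy ledger: transactions can be added but never erased or altered."""

    def __init__(self) -> None:
        self._entries: list = []

    def append(self, steward: str, payload: dict,
               supersedes: Optional[int] = None) -> int:
        if supersedes is not None and not 0 <= supersedes < len(self._entries):
            raise ValueError("supersedes must refer to an existing transaction")
        prev_hash = self._entries[-1].digest() if self._entries else "0" * 64
        tx = Transaction(len(self._entries), time.time(), steward,
                         dict(payload), supersedes, prev_hash)
        self._entries.append(tx)
        return tx.index

    def current(self) -> list:
        """Return transactions that no later transaction marked out-of-date."""
        outdated = {tx.supersedes for tx in self._entries
                    if tx.supersedes is not None}
        return [tx for tx in self._entries if tx.index not in outdated]

    def verify(self) -> bool:
        """Check that each transaction still points at its predecessor's digest."""
        prev = "0" * 64
        for tx in self._entries:
            if tx.prev_hash != prev:
                return False
            prev = tx.digest()
        return True
```

A correction is recorded by appending a new transaction with `supersedes` set to the index of the old one, so the old entry stays in the history while `current()` stops returning it. In this sketch, altering any entry except the newest one breaks the chain and `verify()` returns `False`.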

slide-10
SLIDE 10

Ethereum Cancer KnowLedger Pilot

Website at findpubs.org
See: https://github.com/maximilianh/acgi

slide-11
SLIDE 11

Who will use the shared public ledger?

Data Users are:

  • Professional Researchers
  • Citizen Scientists
  • Clinicians
  • Developers of molecular dx, drugs, decision analysis tools
  • Payers
  • Patients/participants
  • Regulatory agencies and treatment guideline organizations

[Diagram: Shared Public Ledger]

slide-12
SLIDE 12

Where do the data come from?

  • Ultimately, all data comes from individual participants who wish to share their genetic and clinical information for research or improvement of medicine

slide-13
SLIDE 13

How do data enter the ledger?

[Diagram: participants, a question mark, the Shared Public Ledger, and data users (researchers, clinicians, individuals, developers)]

slide-14
SLIDE 14

A participant works with a Trusted Steward to add their information to the ledger

slide-15
SLIDE 15

Possible Trusted Stewards

  • Medical research institutions (e.g. GENIE institutions)
  • Hospitals and clinics
  • Patient registry services and clinical trial recruitment organizations
  • Patient agency advocate groups, possibly providing service to allow patients to maintain agency over data AND participate in clinical trials (e.g. Sage Trust and Genetic Alliance)
  • Genetic testing companies

All trusted stewards use the same software (provided by GA4GH) to add information to the ledger.
System is designed to support thousands of stewards globally.

slide-16
SLIDE 16

[Diagram: participants; Trusted Stewards (medical clinics, patient advocacy groups, patient registries, testing companies); Shared Public Ledger; Data Users (researchers, clinicians, developers, individuals)]

slide-17
SLIDE 17

What goes into the public ledger and what stays with the steward?

Steward

  • Participant personal identifying information and staged consent
  • Participant’s extended clinical and genetic info
– Possibly identifying
  • Participant’s instructions for sharing addl info with qualified researchers w/o recontact
  • Participant’s instructions for recontact

Public Ledger

  • Participant’s genetic variants in selected genes
  • ~1 dozen broad, non-identifying clinical features
  • Steward’s identity and contact info
  • Random numerical ID for participant
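The division of information between steward and public ledger can be sketched as an allow-list: only explicitly named fields are copied into the public record, and everything else stays with the steward. This is a hypothetical illustration; the function `split_intake`, the field names in `PUBLIC_FIELDS`, and the use of a UUID as the random participant ID are assumptions, not part of the GA4GH software.

```python
import uuid

# Assumed names for the fields that may be published; anything not listed
# here never leaves the steward.
PUBLIC_FIELDS = ("variants", "clinical_features")


def split_intake(intake: dict, steward_contact: str) -> tuple:
    """Return (steward_record, public_record) for one participant.

    The steward record keeps everything, including identifying details and
    consent instructions. The public record holds only allow-listed fields,
    the steward's contact, and a random ID that only the steward can link
    back to the person.
    """
    participant_id = str(uuid.uuid4())
    steward_record = dict(intake, participant_id=participant_id)
    public_record = {key: intake[key] for key in PUBLIC_FIELDS if key in intake}
    public_record["participant_id"] = participant_id
    public_record["steward"] = steward_contact
    return steward_record, public_record


# Usage with made-up values:
intake = {
    "name": "Jane Doe",
    "consent": "staged",
    "recontact": "ask steward first",
    "variants": [{"gene": "BRAF", "variant": "V600E"}],
    "clinical_features": {"age_at_diagnosis": 57, "tissue": "Lung"},
}
steward_record, public_record = split_intake(intake, "steward@example.org")
```

An allow-list is used rather than a deny-list so that a newly added private field is withheld by default.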

slide-18
SLIDE 18

Example: participant -> public ledger

  • Participant visits doctor at a medical clinic
  • Doctor orders genetic test
  • Doctor suggests data donation through steward (possibly her own institution)
  • Test results come back
slide-19
SLIDE 19

Test Results

slide-20
SLIDE 20
  • Participant visits steward
  • Steward records from the participant:
– personal information
– test results
– additional genetic and clinical data
– consent to donate to database
– additional sharing and recontact preferences
  • Steward appends participant’s publicly sharable genetic/clinical data to public ledger

slide-21
SLIDE 21

[Diagram: participants; Trusted Stewards (medical clinics, patient advocacy groups, patient registries, testing companies); Shared Public Ledger; Data Users (researchers, clinicians, developers, individuals)]

slide-22
SLIDE 22

Recontact

  • Data user discovers mutations of interest by using a 3rd party app on the public ledger and wants more information about the participant than is provided in the ledger
  • Data user contacts participant’s steward
  • Steward has info about the circumstances under which the participant will share additional data or agrees to be recontacted
  • As appropriate, steward will
– supply additional information to data user or
– set up contact between user and participant

slide-23
SLIDE 23

[Diagram: participants; Trusted Stewards (medical clinics, patient advocacy groups, patient registries, testing companies); Shared Public Ledger; Third-Party App; Data Users (researchers, clinicians, developers, individuals)]

slide-24
SLIDE 24

How do two stewards know if they have a participant in common?

  • On the public ledger, a participant is identified with a random number
  • Each steward securely stores personally identifiable information for their participants; only they can associate a participant with a public random number
  • Personally identifiable information is compared between the two stewards by a cryptographic trick (secure multiparty computation) that reveals nothing except which participants they have in common

To make it comparable, personal information could be collected according to the NIH NDAR GUID standard
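The comparison step can be illustrated with a toy sketch. It does not reproduce the demo implementation: it swaps in a Diffie-Hellman-style commutative blinding scheme, which is a different and simpler technique than homomorphic encryption. Both stewards are simulated inside one function, the modulus is a small toy prime that is not secure, and the names `hash_to_group`, `new_secret`, `blind` and `common_participants` are hypothetical. The identifier strings stand in for whatever normalized personal key the stewards agree on.

```python
import hashlib
import math
import secrets

P = 2**127 - 1  # toy prime modulus (assumption); far too small for real use


def hash_to_group(identifier: str) -> int:
    """Map an identifier to a nonzero element modulo P."""
    digest = hashlib.sha256(identifier.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (P - 1) + 1


def new_secret() -> int:
    """Pick a private exponent coprime to P - 1 so that blinding is one-to-one."""
    while True:
        key = secrets.randbelow(P - 3) + 2
        if math.gcd(key, P - 1) == 1:
            return key


def blind(values, key):
    """Raise every value to the secret exponent modulo P."""
    return [pow(value, key, P) for value in values]


def common_participants(ids_a, ids_b):
    """Return the identifiers in ids_a that also occur in ids_b.

    Only blinded values would cross the network: each side blinds its own
    hashes, the other side blinds them a second time, and equal identifiers
    end up with equal doubly blinded values because (h^a)^b == (h^b)^a.
    """
    key_a, key_b = new_secret(), new_secret()
    a_once = blind([hash_to_group(x) for x in ids_a], key_a)  # sent A -> B
    b_once = blind([hash_to_group(y) for y in ids_b], key_b)  # sent B -> A
    a_twice = blind(a_once, key_b)       # computed by B, returned in order
    b_twice = set(blind(b_once, key_a))  # computed by A
    return [x for x, t in zip(ids_a, a_twice) if t in b_twice]
```

In this sketch steward A learns which of its own participants are shared and nothing about B's other participants beyond how many there are. A production system would need a vetted protocol and parameters.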

slide-25
SLIDE 25

Demo: secure multiparty computation to compute private set intersection

[Diagram: Steward A (100,000 participants, 10 overlap with B) and Steward B (100,000 participants, 10 overlap with A), connected over the Internet using homomorphic encryption]

Neither server sees the personal information on the other. Only the checksum identifying the participants in common is visible, nothing else.
Runtime: 10 seconds over a transatlantic link, single CPU
Implementation and experiments by Max Haeussler

slide-26
SLIDE 26

Who can be a steward?

  • Any entity that:
– Has a legitimate permanent contact
– Follows the rules
– Has enough participants to prevent “participant reidentification by steward” (small stewards can be anonymously pooled)
  • All stewards will have Internet ratings; these can be available on a ledger (e.g. like AirBnB); users can filter out unreliable steward data

slide-27
SLIDE 27

Who pays for all this?

  • System can be designed and implemented for a few million dollars; long term maintenance is the only issue
  • Governments or philanthropies (possibly associated with hospitals, patient advocacy groups, etc.) could supply general funding
  • “Taxes” on genetic tests could provide revenue
  • Stewards can be motivated to secure data donations either by altruism or commissions from data users
  • 3rd party app developers can charge for use of their tools or sell advertising to support their efforts and to support the public ledger

slide-28
SLIDE 28

Summary

  • A completely decentralized, public database is possible, while still protecting privacy
  • We need this because “trust is local”
  • No single state government or private organization can/should own or control all the world’s genetic data
  • Once launched, a shared public ledger grows and is maintained organically by the global community because it benefits them, much like the Internet itself