Coding for Optimized Writing Rate in DNA Storage
Siddharth Jain, Farzad Farnoud, Moshe Schwartz, Shuki Bruck
IEEE ISIT 2020
DNA Storage
Information → DNA Synthesis (Writing) → Storage Medium (Multiple Strands of DNA) → DNA Sequencing (Reading) → Reconstruction
In this talk: DNA Synthesis (Writing)
(H. H. Lee, R. Kalhor, N. Goela, J. Bolot, and G. M. Church, “Terminator-free template-independent enzymatic DNA synthesis for digital information storage,” Nature Communications, vol. 10, no. 2383, pp. 1–12, 2019.)
Sequence to be synthesized: ACTAG
$D_{A\to C}(t_1)$, $D_{C\to T}(t_2)$, $D_{T\to A}(t_3)$, $D_{A\to G}(t_4)$
Length of run in each round is given by a distribution $D$ which depends on
(M. Schwartz and J. Bruck, “On the capacity of the precision-resolution system,” IEEE Trans. Inform. Theory, vol. 56, no. 3, pp. 1028–1037, 2010.)
erroneous measurement of run lengths.
can be recovered without any error.
[Figure: transition graph on the nucleotides T, A, C, G. Each directed edge $b \to a$ between distinct nucleotides is labeled with its set of allowed round times $t_{b\to a}$. In this example six edges carry {1, 2}, four carry {1, 3}, and two carry {1, 4}.]
Add auxiliary vertices: an edge A → T of length 3 becomes the path A → d1 → d2 → T, whose three edges each have length 1.
$A_{G'}$: adjacency matrix of $G'$
$\lambda(A_{G'})$: maximum eigenvalue of $A_{G'}$
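The auxiliary-vertex construction and the eigenvalue computation can be sketched as follows. This is a minimal illustration, not the authors' implementation. It assumes the standard result for constrained systems that the capacity equals $\log_2$ of the maximum eigenvalue of the adjacency matrix. It also assumes a hypothetical example graph in which every transition between distinct nucleotides allows round times {1, 2}. The function names `expand_to_ordinary` and `max_eigenvalue` are made up for this sketch, and the power iteration assumes the graph is primitive.

```python
import math


def expand_to_ordinary(edges):
    """Replace each edge of integer length L by a path of L unit-length edges.

    edges: list of (src, dst, length) tuples; parallel edges are allowed.
    Returns (vertices, adj), where adj[i][j] counts the unit edges from
    vertices[i] to vertices[j].
    """
    vertices = []
    index = {}

    def vid(name):
        # Assign a fresh integer id the first time a vertex name is seen.
        if name not in index:
            index[name] = len(vertices)
            vertices.append(name)
        return index[name]

    unit_edges = []
    for n, (src, dst, length) in enumerate(edges):
        prev = vid(src)
        for step in range(1, length):
            aux = vid(("aux", n, step))  # auxiliary vertex private to edge n
            unit_edges.append((prev, aux))
            prev = aux
        unit_edges.append((prev, vid(dst)))

    size = len(vertices)
    adj = [[0] * size for _ in range(size)]
    for i, j in unit_edges:
        adj[i][j] += 1
    return vertices, adj


def max_eigenvalue(adj, iterations=500):
    """Estimate the largest eigenvalue by power iteration (primitive adj assumed)."""
    size = len(adj)
    x = [1.0] * size
    lam = 0.0
    for _ in range(iterations):
        y = [sum(adj[i][j] * x[j] for j in range(size)) for i in range(size)]
        total = sum(y)
        lam = total / sum(x)
        x = [v / total for v in y]
    return lam


# Hypothetical example: every transition a -> b (a != b) allows times 1 and 2.
nucleotides = "ACGT"
edges = [(a, b, t) for a in nucleotides for b in nucleotides if a != b for t in (1, 2)]
vertices, adj = expand_to_ordinary(edges)
lam = max_eigenvalue(adj)
capacity_bits_per_time_unit = math.log2(lam)
```

For this symmetric example the eigenvalue satisfies $\lambda^2 = 3\lambda + 3$, which gives a closed form to check the numerical result against.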
$1 \le t_{b\to a}^{(1)} < t_{b\to a}^{(2)} < \cdots < t_{b\to a}^{(\ell)} \le M$
Receiver: $r_1, r_2, \ldots, r_N$ s.t. $r_j \sim D_{b\to a}\big(t_{b\to a}^{(i)}\big)$
$\mathbb{Q}(r_1, r_2, \ldots, r_N) = \hat{i}$ s.t. $\Pr[\hat{i} = i \mid i] \ge 1 - \delta$.
CGGG CGGGGG CGGGG CGGGGGG CGGGGG
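One way to picture the quantizing function ℚ on the copies above is a nearest-mean rule: measure the run length in each copy, average, and pick the round-time index whose expected run length is closest. This is only an illustrative sketch. The nearest-mean rule and the values in `expected_means` are assumptions made for this example, not the quantizers designed in the talk, and `run_length` and `quantize` are hypothetical helper names.

```python
def run_length(read, base):
    """Length of the trailing run of `base` at the end of `read`."""
    count = 0
    for ch in reversed(read):
        if ch != base:
            break
        count += 1
    return count


def quantize(run_lengths, expected_means):
    """Return the 1-based index i whose expected run length is nearest the sample mean.

    expected_means[i - 1] is the assumed mean run length for round time t^(i).
    """
    mean = sum(run_lengths) / len(run_lengths)
    best = min(range(len(expected_means)), key=lambda i: abs(mean - expected_means[i]))
    return best + 1


# The N = 5 copies shown above, each ending in a run of G.
reads = ["CGGG", "CGGGGG", "CGGGG", "CGGGGGG", "CGGGGG"]
lengths = [run_length(read, "G") for read in reads]

# Hypothetical expected run lengths for the two candidate round times.
expected_means = [2.0, 5.0]
estimated_index = quantize(lengths, expected_means)
```

With these assumed means, the sample mean of 4.6 lies closer to 5.0, so the second round time is selected.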
$m \in \{0,1\}^k$ → State Splitting encoder → $d \in [\ell]^s$ → Error Correction Coding → Terminator Free DNA Synthesis Channel ($N$ copies) → Multiple Sequence Alignment + Quantizing Function ℚ → Decoding for ECC → $d \in [\ell]^s$ → State Splitting decoder
$s$ rounds, where for each round there are $\ell$ possible round times
Error Correction Coding: because of the $\delta$ error introduced by the channel
Let $G'$ be the ordinary version of $G$. Further assume the $k$ user information bits are i.i.d. uniform random bits. Then for all large enough $k$, the user information bits may be encoded into a sequence using at most
$\frac{k}{\mathrm{cap}(S(G))}\left(1 + \alpha\left(\frac{1}{C_{\delta,\ell}} - 1\right)\log_{[\ldots]}(\ell)\right)$
synthesis time, and be decoded correctly with high probability.
Here $\alpha$ is the sum of probabilities of non-auxiliary vertices in the stationary distribution of the max-entropic Markov chain and
$C_{\delta,\ell} \triangleq 1 + \delta\log_\ell\frac{\delta}{\ell - 1} + (1 - \delta)\log_\ell(1 - \delta)$
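The quantity $C_{\delta,\ell}$ has the form of the capacity of an $\ell$-ary symmetric channel with error probability $\delta$, measured in $\ell$-ary symbols per channel use. A short sketch of how it and the factor $1/C_{\delta,\ell} - 1$ behave numerically is given below. The function names are hypothetical, and the convention $0 \log 0 = 0$ is assumed at the endpoints.

```python
import math


def symmetric_channel_capacity(delta, ell):
    """C_{delta,ell} = 1 + delta*log_ell(delta/(ell-1)) + (1-delta)*log_ell(1-delta)."""
    if ell < 2 or not 0.0 <= delta <= 1.0:
        raise ValueError("need ell >= 2 and 0 <= delta <= 1")
    capacity = 1.0
    if delta > 0.0:  # 0 * log(0) is taken to be 0
        capacity += delta * math.log(delta / (ell - 1), ell)
    if delta < 1.0:
        capacity += (1.0 - delta) * math.log(1.0 - delta, ell)
    return capacity


def redundancy_factor(delta, ell):
    """The term (1 / C_{delta,ell} - 1) that multiplies alpha in the bound."""
    return 1.0 / symmetric_channel_capacity(delta, ell) - 1.0


# Example with ell = 2 possible round times and a 1% quantization error.
c = symmetric_channel_capacity(0.01, 2)
extra = redundancy_factor(0.01, 2)
```

For $\ell = 2$ the expression reduces to one minus the binary entropy of $\delta$, so a noiseless quantizer ($\delta = 0$) gives capacity 1 and a redundancy factor of 0.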
framework and terminator free DNA synthesis method.
run lengths.
deletion of runs.