SLIDE 1
Few Matches or Almost Periodicity: Faster Pattern Matching with Mismatches in Compressed Texts
Karl Bringmann, Marvin Künnemann, and Philip Wellnitz
Max Planck Institute for Informatics, Saarland Informatics Campus (SIC), Saarbrücken, Germany
SLIDE 2
SLIDE 3
Basic Definitions and General Overview New Structural Insights Faster Algorithm
Pattern Matching with Mismatches
Given a text t, a pattern p, and an integer k, does t have a length-|p| substring with Hamming distance at most k to p?
Example: finding p = ANPAN in t = PANCAKE with k = 2.
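The problem statement can be made concrete with a naive scan. The following is a minimal Python sketch, not the algorithm of the talk; the function names `hamming` and `k_matches` are illustrative assumptions, and positions are 0-based.

```python
def hamming(a: str, b: str) -> int:
    """Number of positions where the equal-length strings a and b differ."""
    assert len(a) == len(b)
    return sum(x != y for x, y in zip(a, b))


def k_matches(t: str, p: str, k: int) -> list[int]:
    """0-based start positions of all k-matches of p in t (naive scan)."""
    m = len(p)
    return [i for i in range(len(t) - m + 1) if hamming(t[i:i + m], p) <= k]


# ANCAK (start 1) differs from ANPAN in exactly 2 positions.
print(k_matches("PANCAKE", "ANPAN", 2))  # [1]
```

This scan takes time proportional to |t| · |p| and reads the whole uncompressed text, which is exactly what the compressed setting tries to avoid.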
Karl Bringmann, Marvin Künnemann, and Philip Wellnitz Faster Pattern Matching with Mismatches in Compressed Texts
SLIDE 4
Pattern Matching with Mismatches
Given a text t, a pattern p, and an integer k, does t have a length-|p| substring with Hamming distance at most k to p?
Thm. [Gawrychowski, Uznanski’18]
Pattern matching with k mismatches on a text of length n and a pattern of length m can be solved in time Õ((m + k√m) · n/m).
SLIDE 5
Pattern Matching with Mismatches
Given a text t, a pattern p, and an integer k, does t have a length-|p| substring with Hamming distance at most k to p?
Thm. [Gawrychowski, Uznanski’18]
Pattern matching with k mismatches on a text of length n and a pattern of length m can be solved in time Õ((m + k√m) · n/m).
Matching (conditional) lower bound [GU’18].
SLIDE 6
What if the text is much larger than the pattern?
ANPANISAJAPANESESWEETROLLMOSTCOMMONLYFILLEDWITHREDBEANPASTEANPANCANALSOBEPREPAREDWITHOTHERFILLINGSINCLUDINGWHITEBEANSGRE
ANPAN
SLIDE 7
What if the text is much larger than the pattern?
ANPANISAJAPANESESWEETROLLMOSTCOMMONLYFILLEDWITHREDBEANPASTEANPANCANALSOBEPREPAREDWITHOTHERFILLINGSINCLUDINGWHITEBEANSGREENBEANSSESAMEANDCHESTNUT
ANPAN
SLIDE 8
What if the text is much larger than the pattern and given in a compressed representation?
ANPANISAJAPANESESWEETROLLMOSTCOMMONLYFILLEDWITHREDBEANPASTEANPANCANALSOBEPREPAREDWITHOTHERFILLINGSINCLUDINGWHITEBEANSGREENBEANSSESAMEANDCHESTNUT
ANPAN
SLIDE 9
Grammar Compression
Straight-Line Program (SLP)
A Straight-Line Program (SLP) T is a context-free grammar that generates exactly one string, eval(T).
SLIDE 10
Grammar Compression
Straight-Line Program (SLP)
An SLP T is a set of non-terminals {T1, ..., Tn} and productions of the form Ti → σ or Ti → TℓTr, where ℓ, r < i. We write eval(T) = eval(Tn) for the generated string.
SLIDE 11
Grammar Compression
Straight-Line Program (SLP)
An SLP T is a set of non-terminals {T1, ..., Tn} and productions of the form Ti → σ or Ti → TℓTr, where ℓ, r < i. We write eval(T) = eval(Tn) for the generated string.
Example: T1 → A; T2 → N; T3 → P
[Figure: derivation tree of T3]
SLIDE 12
Grammar Compression
Straight-Line Program (SLP)
An SLP T is a set of non-terminals {T1, ..., Tn} and productions of the form Ti → σ or Ti → TℓTr, where ℓ, r < i. We write eval(T) = eval(Tn) for the generated string.
Example: T1 → A; T2 → N; T3 → P; T4 → T1T2; T5 → T4T3
[Figure: derivation tree of T5]
SLIDE 13
Grammar Compression
Straight-Line Program (SLP)
An SLP T is a set of non-terminals {T1, ..., Tn} and productions of the form Ti → σ or Ti → TℓTr, where ℓ, r < i. We write eval(T) = eval(Tn) for the generated string.
Example: T1 → A; T2 → N; T3 → P; T4 → T1T2; T5 → T4T3; T6 → T5T4
[Figure: derivation tree of T6]
SLIDE 14
Grammar Compression
Straight-Line Program (SLP)
An SLP T is a set of non-terminals {T1, ..., Tn} and productions of the form Ti → σ or Ti → TℓTr, where ℓ, r < i. We write eval(T) = eval(Tn) for the generated string.
Example: T1 → A; T2 → N; T3 → P; T4 → T1T2; T5 → T4T3; T6 → T5T4; T7 → T6T4
[Figure: derivation tree of T7]
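The example grammar can be evaluated directly. The sketch below is a minimal Python illustration of the definition, assuming a dictionary encoding of the productions (the names `RULES`, `slp_eval`, and `slp_lengths` are hypothetical, not from the talk). It also shows that the length of eval(Ti) can be computed bottom-up without expanding the string.

```python
# Productions from the slide: a terminal is a 1-character string,
# a pair (l, r) stands for T_i -> T_l T_r with l, r < i.
RULES = {
    1: "A", 2: "N", 3: "P",
    4: (1, 2), 5: (4, 3), 6: (5, 4), 7: (6, 4),
}


def slp_eval(rules, i):
    """Expand non-terminal T_i; the result may be exponentially longer than the grammar."""
    rule = rules[i]
    if isinstance(rule, str):
        return rule
    left, right = rule
    return slp_eval(rules, left) + slp_eval(rules, right)


def slp_lengths(rules):
    """Length of eval(T_i) for every i, computed bottom-up without expanding."""
    lengths = {}
    for i in sorted(rules):
        rule = rules[i]
        lengths[i] = 1 if isinstance(rule, str) else lengths[rule[0]] + lengths[rule[1]]
    return lengths


print(slp_eval(RULES, 6))     # ANPAN
print(slp_eval(RULES, 7))     # ANPANAN
print(slp_lengths(RULES)[7])  # 7
```

Algorithms on SLPs generally work like `slp_lengths`: one pass over the n productions, never materializing eval(T).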
SLIDE 15
Known Results
Problem | uncompressed | LZW/LZ78 text, n = Ω(√N) | SLP text, n = Ω(log N)
Pattern Matching | O(N + m) [KMP’77] | O(n + m) ※ [G’12] | Õ(n + m) ※ [J’15]
PM with k Mismatches | Õ((N/m) (m + k√m)) [GU’18] | O(n √m k²) [GS’13] | Õ(n m poly(k)) [T’14, BLRS’15]
N: length of uncompressed text; m: length of pattern; n: length of compressed text; k: number of mismatches; ※: allows compressed pattern
SLIDE 16
SLIDE 17
Known Results
Problem | uncompressed | LZW/LZ78 text, n = Ω(√N) | SLP text, n = Ω(log N)
Pattern Matching | O(N + m) [KMP’77] | O(n + m) ※ [G’12] | Õ(n + m) ※ [J’15]
PM with k Mismatches | Õ((N/m) (m + k√m)) [GU’18] | O(n √m k²) [GS’13] | Õ(n m poly(k)) [T’14, BLRS’15]
New result for PM with k Mismatches on compressed text: Õ(n k⁴ + m k)
N: length of uncompressed text; m: length of pattern; n: length of compressed text; k: number of mismatches; ※: allows compressed pattern
SLIDE 18
Known Results
Problem | uncompressed | LZW/LZ78 text, n = Ω(√N) | SLP text, n = Ω(log N)
Pattern Matching | O(N + m) [KMP’77] | O(n + m) ※ [G’12] | Õ(n + m) ※ [J’15]
PM with k Mismatches | Õ((N/m) (m + k√m)) [GU’18] | O(n √m k²) [GS’13] | Õ(n m poly(k)) [T’14, BLRS’15]
New result for PM with k Mismatches on compressed text: Õ(n k⁴ + m k)
N: length of uncompressed text; m: length of pattern; n: length of compressed text; k: number of mismatches; ※: allows compressed pattern
Improvement obtained via new structural insight in solution structure.
SLIDE 19
Solution Structure of Pattern Matching
Fact (Folklore)
Let text t and pattern p, |t| ≤ (3/2)|p|, be given such that there are ≥ 2 matches of p in t that together match t completely. Then, both p and t are periodic with some period x and every match of p in t starts at a position 1 + i · |x|.
SLIDE 20
Solution Structure of Pattern Matching
Fact (Folklore)
Let text t and pattern p, |t| ≤ (3/2)|p|, be given such that there are ≥ 2 matches of p in t that together match t completely. Then, both p and t are periodic with some period x and every match of p in t starts at a position 1 + i · |x|.
[Figure: text t with two overlapping occurrences of p]
SLIDE 21
Solution Structure of Pattern Matching
Fact (Folklore)
Let text t and pattern p, |t| ≤ (3/2)|p|, be given such that there are ≥ 2 matches of p in t that together match t completely. Then, both p and t are periodic with some period x and every match of p in t starts at a position 1 + i · |x|.
[Figure: p and t, with the first copies of the period x marked]
SLIDE 22
Solution Structure of Pattern Matching
Fact (Folklore)
Let text t and pattern p, |t| ≤ (3/2)|p|, be given such that there are ≥ 2 matches of p in t that together match t completely. Then, both p and t are periodic with some period x and every match of p in t starts at a position 1 + i · |x|.
[Figure: p and t, with further copies of the period x marked]
SLIDE 23
Solution Structure of Pattern Matching
Fact (Folklore)
Let text t and pattern p, |t| ≤ (3/2)|p|, be given such that there are ≥ 2 matches of p in t that together match t completely. Then, both p and t are periodic with some period x and every match of p in t starts at a position 1 + i · |x|.
[Figure: p and t, with further copies of the period x marked]
SLIDE 24
Solution Structure of Pattern Matching
Fact (Folklore)
Let text t and pattern p, |t| ≤ (3/2)|p|, be given such that there are ≥ 2 matches of p in t that together match t completely. Then, both p and t are periodic with some period x and every match of p in t starts at a position 1 + i · |x|.
[Figure: p and t both completely tiled by copies of the period x]
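The folklore fact can be checked on a toy instance. The following is a small Python sketch under assumed inputs (the strings and the helper names `exact_matches` and `has_period` are illustrative only); positions are 0-based, so the slide's 1 + i · |x| becomes i · |x|.

```python
def exact_matches(t: str, p: str) -> list[int]:
    """0-based start positions of exact occurrences of p in t."""
    m = len(p)
    return [i for i in range(len(t) - m + 1) if t[i:i + m] == p]


def has_period(s: str, d: int) -> bool:
    """True if s[i] == s[i + d] for every valid i, i.e. d is a period of s."""
    return all(s[i] == s[i + d] for i in range(len(s) - d))


t, p = "ABABABAB", "ABABAB"      # toy input with |t| = 8 <= 3/2 * |p| = 9
starts = exact_matches(t, p)     # the two occurrences together cover all of t
d = starts[1] - starts[0]        # candidate period length |x|, here x = "AB"

print(starts, d, has_period(t, d), has_period(p, d))  # [0, 2] 2 True True
print(all(s % d == 0 for s in starts))                # True
```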
SLIDE 25
What is the solution structure of Pattern Matching with Mismatches?
SLIDE 26
Solution Structure of Pattern Matching with Mismatches
If there are at least 2 k-matches of p in t, then p and t are periodic and every k-match of p starts at a position 1 + i|x|?
SLIDE 27
Solution Structure of Pattern Matching with Mismatches
If there are at least two k-matches of p in t, then p and t are periodic and every k-match of p starts at a position 1 + i|x|?
Example: t = A^m B^m, p = A^(m/2) B^(m/2).
SLIDE 28
Solution Structure of Pattern Matching with Mismatches
If there are at least two k-matches of p in t, then p and t are periodic and every k-match of p starts at a position 1 + i|x|?
Example: t = A^m B^m, p = A^(m/2) B^(m/2).
p and t are not periodic, but there are 2k k-matches of p in t.
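The counterexample can be verified by brute force. This Python sketch assumes concrete values m = 16 and k = 3 (chosen only for illustration) and a hypothetical helper `count_k_matches`. Shifting p by s positions away from the exact occurrence costs |s| mismatches, so the count is 2k + 1, in line with the roughly 2k matches stated on the slide.

```python
def count_k_matches(t: str, p: str, k: int) -> int:
    """Number of alignments of p in t with Hamming distance at most k."""
    m = len(p)
    return sum(
        sum(a != b for a, b in zip(t[i:i + m], p)) <= k
        for i in range(len(t) - m + 1)
    )


m, k = 16, 3                          # assumed toy parameters
t = "A" * m + "B" * m                 # t = A^m B^m
p = "A" * (m // 2) + "B" * (m // 2)   # p = A^(m/2) B^(m/2)

print(count_k_matches(t, p, k))  # 7
```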
SLIDE 29
Solution Structure of Pattern Matching with Mismatches
If there are at least Ω(poly(k)) k-matches of p in t, then p and t are periodic and every k-match of p starts at a position 1 + i|x|?
Example: t = A^m B^m, p = A^(m/2) B^(m/2).
Insight 1
Periodicity only if number of k-matches of p in t is Ω(poly(k))
SLIDE 30
Solution Structure of Pattern Matching with Mismatches
If there are at least Ω(poly(k)) k-matches of p in t, then p and t are periodic and every k-match of p starts at a position 1 + i|x|?
Example: t = A^(2m), p = A^m.
SLIDE 31
Solution Structure of Pattern Matching with Mismatches
If there are at least Ω(poly(k)) k-matches of p in t, then p and t are periodic and every k-match of p starts at a position 1 + i|x|?
Example: t = A^(2m), p = A^m, each with B at k/2 random positions.
SLIDE 32
Solution Structure of Pattern Matching with Mismatches
If there are at least Ω(poly(k)) k-matches of p in t, then p and t are periodic and every k-match of p starts at a position 1 + i|x|?
Example: t = A^(2m), p = A^m, each with B at k/2 random positions.
O(m) k-matches of p in t, but p and t are not perfectly periodic.
SLIDE 33
Solution Structure of Pattern Matching with Mismatches
If there are at least Ω(poly(k)) k-matches of p in t, then p and t are periodic up to O(k) mismatches and every k-match of p starts at a position 1 + i|x|?
Example: t = A^(2m), p = A^m, each with B at k/2 random positions.
Insight 2
Periodicity only up to O(k) mismatches
SLIDE 34
Solution Structure of Pattern Matching with Mismatches
If there are at least Ω(poly(k)) k-matches of p in t, then p and t are periodic up to O(k) mismatches and every k-match of p starts at a position 1 + i|x|?
SLIDE 35
Main Result
Theorem (Structural Insight)
For pattern p and text t, |t| ≤ 2|p|, at least one of the following holds:
- The number of k-matches of p in t is at most O(k²), or
- Both t′ and p have Hamming distance O(k) to the same periodic string x and all k-matches of p in t′ start at a position 1 + i · |x|.
t′: shortest substring of t such that any k-match of p in t is also a k-match in t′.
SLIDE 36
SLIDE 37
Main Result, Proof Overview
Theorem (Structural Insight)
For pattern p and text t, |t| ≤ 2|p|, at least one of the following holds:
- The number of k-matches of p in t is at most 1000k², or
- Both t′ and p have Hamming distance < 20k to a periodic x; all k-matches start at position 1 + i · |x|.
[Figure: pattern p and text t]
SLIDE 38
Main Result, Proof Overview
Theorem (Structural Insight)
For pattern p and text t, |t| ≤ 2|p|, at least one of the following holds:
- The number of k-matches of p in t is at most 1000k², or
- Both t′ and p have Hamming distance < 20k to a periodic x; all k-matches start at position 1 + i · |x|.
[Figure: pattern p and the substring t′ of t]
Consider t′: the shortest substring of t that contains all k-matches.
SLIDE 39
Main Result, Proof Overview
Theorem (Structural Insight)
For pattern p and text t, |t| ≤ 2|p|, at least one of the following holds:
- The number of k-matches of p in t is at most 1000k², or
- Both t′ and p have Hamming distance < 20k to a periodic x; all k-matches start at position 1 + i · |x|.
[Figure: p divided into parts p_1, p_2, ..., p_i, ..., p_16k, above t′]
Split p into 16k parts p_i of equal length.
SLIDE 40
Main Result, Proof Overview
Theorem (Structural Insight)
For pattern p and text t, |t| ≤ 2|p|, at least one of the following holds:
- The number of k-matches of p in t is at most 1000k², or
- Both t′ and p have Hamming distance < 20k to a periodic x; all k-matches start at position 1 + i · |x|.
[Figure: one part p_i highlighted in p, above t′]
Fix a p_i.
SLIDE 41
Main Result, Proof Overview
Theorem (Structural Insight)
For pattern p and text t, |t| ≤ 2|p|, at least one of the following holds:
- The number of k-matches of p in t is at most 1000k², or
- Both t′ and p have Hamming distance < 20k to a periodic x; all k-matches start at position 1 + i · |x|.
[Figure: part p_i of p covered by copies of its prefix x_i]
Consider the prefix x_i of p_i that is also a period of p_i.
SLIDE 42
Main Result, Proof Overview
Theorem (Structural Insight)
For pattern p and text t, |t| ≤ 2|p|, at least one of the following holds:
- The number of k-matches of p in t is at most 1000k², or
- Both t′ and p have Hamming distance < 20k to a periodic x; all k-matches start at position 1 + i · |x|.
[Figure: p aligned with the row x_i^* = x_i x_i x_i · · ·, extending the copies of x_i in p_i to both sides]
Find the first 3k mismatches between p and x_i^* before and after p_i.
SLIDE 43
Main Result, Proof Overview
Theorem (Structural Insight)
For pattern p and text t, |t| ≤ 2|p|, at least one of the following holds:
- The number of k-matches of p in t is at most 1000k², or
- Both t′ and p have Hamming distance < 20k to a periodic x; all k-matches start at position 1 + i · |x|.
[Figure: p aligned with the row x_i^* = x_i x_i x_i · · ·; the regions before and after p_i are each labeled "≤ 3k mism."]
Find the first 3k mismatches between p and x_i^* before and after p_i.
SLIDE 44
Main Result, Proof Overview
Theorem (Structural Insight)
For pattern p and text t, |t| ≤ 2|p|, at least one of the following holds:
- The number of k-matches of p in t is at most 1000k², or
- Both t′ and p have Hamming distance < 20k to a periodic x; all k-matches start at position 1 + i · |x|.
[Figure: p aligned with the row x_i^* = x_i x_i x_i · · ·, labeled "< 6k mism."]
SLIDE 45
Main Result, Proof Overview
Theorem (Structural Insight)
For pattern p and text t, |t| ≤ 2|p|, at least one of the following holds:
- The number of k-matches of p in t is at most 1000k², or
- Both t′ and p have Hamming distance < 20k to a periodic x; all k-matches start at position 1 + i · |x|.
[Figure: p aligned with the row x_i^* = x_i x_i x_i · · ·, labeled "< 6k mism."; a second aligned row of copies of x_i is labeled "< 2 · (6 + 1)k = 14k mism."]
SLIDE 46
Main Result, Proof Overview
Theorem (Structural Insight)
For pattern p and text t, |t| ≤ 2|p|, at least one of the following holds:
- The number of k-matches of p in t is at most 1000k², or
- Both t′ and p have Hamming distance < 20k to a periodic x; all k-matches start at position 1 + i · |x|.
[Figure: p aligned with the row x_i^* = x_i x_i x_i · · ·; the regions before and after p_i are each labeled "≤ 3k mism."]
SLIDE 47
Main Result, Proof Overview
Theorem (Structural Insight)
For pattern p and text t, |t| ≤ 2|p|, at least one of the following holds:
- The number of k-matches of p in t is at most 1000k², or
- Both t′ and p have Hamming distance < 20k to a periodic x; all k-matches start at position 1 + i · |x|.
[Figure: p aligned with the row x_i^* = x_i x_i x_i · · ·; one region next to p_i is labeled "= 3k mism."]
SLIDE 48
Main Result, Proof Overview
Theorem (Structural Insight)
For pattern p and text t, |t| ≤ 2|p|, at least one of the following holds:
- The number of k-matches of p in t is at most 1000k², or
- Both t′ and p have Hamming distance < 20k to a periodic x; all k-matches start at position 1 + i · |x|.
[Figure: p aligned with the row x_i^* = x_i x_i x_i · · ·; one region next to p_i is labeled "= 3k mism."]
Insight
Any k-match of p in t′ must match at least one p_i exactly.
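This insight is a pigeonhole argument: a k-match has at most k mismatches, which can touch at most k of the 16k parts, so the remaining parts occur exactly. A minimal Python sketch on assumed toy strings (the helper names `split_equal` and `exactly_matched_parts` are hypothetical, not from the talk):

```python
def split_equal(p: str, parts: int) -> list[tuple[int, str]]:
    """Split p into `parts` consecutive pieces of equal length, with their offsets."""
    assert len(p) % parts == 0
    size = len(p) // parts
    return [(j * size, p[j * size:(j + 1) * size]) for j in range(parts)]


def exactly_matched_parts(t: str, p: str, start: int, parts: int) -> list[int]:
    """Indices of the parts of p that occur without mismatch when p is aligned at t[start:]."""
    return [
        j for j, (off, piece) in enumerate(split_equal(p, parts))
        if t[start + off:start + off + len(piece)] == piece
    ]


k = 1
p = "ABCD" * 8                    # length 32, split into 16k = 16 parts of length 2
t = p[:10] + "X" + p[11:]         # p with one substituted character: a 1-match at position 0
matched = exactly_matched_parts(t, p, 0, 16 * k)

print(len(matched))  # 15
```

Only the part containing the substituted position (index 5) fails to match exactly.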
SLIDE 49
Main Result, Proof Overview
Theorem (Structural Insight)
For pattern p and text t, |t| ≤ 2|p|, at least one of the following holds:
- The number of k-matches of p in t is at most 1000k², or
- Both t′ and p have Hamming distance < 20k to a periodic x; all k-matches start at position 1 + i · |x|.
[Figure: p aligned with the row x_i^* = x_i x_i x_i · · ·; one region next to p_i is labeled "= 3k mism."]
Fix a p_i.
SLIDE 50
Main Result, Proof Overview
Theorem (Structural Insight)
For pattern p and text t, |t| ≤ 2|p|, at least one of the following holds:
- The number of k-matches of p in t is at most 1000k², or
- Both t′ and p have Hamming distance < 20k to a periodic x; all k-matches start at position 1 + i · |x|.
[Figure: p aligned with the row x_i^* = x_i x_i x_i · · ·; one region next to p_i is labeled "= 3k mism."]
Fix a p_i; count the k-matches in which p_i is matched exactly.
SLIDE 51
Main Result, Proof Overview
Theorem (Structural Insight)
For pattern p and text t, |t| ≤ 2|p|, at least one of the following holds:
- The number of k-matches of p in t is at most 1000k², or
- Both t′ and p have Hamming distance < 20k to a periodic x; all k-matches start at position 1 + i · |x|.
[Figure: as before, with the occurrences of x_i marked in t′]
Consider occurrences of x_i in t′.
SLIDE 52
Main Result, Proof Overview
Theorem (Structural Insight)
For pattern p and text t, |t| ≤ 2|p|, at least one of the following holds:
- The number of k-matches of p in t is at most 1000k², or
- Both t′ and p have Hamming distance < 20k to a periodic x; all k-matches start at position 1 + i · |x|.
[Figure: as before, with the occurrences of x_i marked in t′]
Problem
Up to O(m) exact matches of x_i in t′.
SLIDE 53
Main Result, Proof Overview
Theorem (Structural Insight)
For pattern p and text t, |t| ≤ 2|p|, at least one of the following holds:
- The number of k-matches of p in t is at most 1000k², or
- Both t′ and p have Hamming distance < 20k to a periodic x; all k-matches start at position 1 + i · |x|.
[Figure: as before, with only the long runs of consecutive copies of x_i marked in t′]
Consider power stretches of x_i in t′ of length ≥ |p_i|.
SLIDE 54
Main Result, Proof Overview
Theorem (Structural Insight)
For pattern p and text t, |t| ≤ 2|p|, at least one of the following holds:
- The number of k-matches of p in t is at most 1000k², or
- Both t′ and p have Hamming distance < 20k to a periodic x; all k-matches start at position 1 + i · |x|.
[Figure: as before, with only the long runs of consecutive copies of x_i marked in t′]
Consider power stretches of x_i in t′ of length ≥ |p_i|: there are at most 150k different power stretches.
SLIDE 55
Main Result, Proof Overview
Theorem (Structural Insight)
For pattern p and text t, |t| ≤ 2|p|, at least one of the following holds:
- The number of k-matches of p in t is at most 1000k², or
- Both t′ and p have Hamming distance < 20k to a periodic x; all k-matches start at position 1 + i · |x|.
[Figure: as before, with one power stretch t_j highlighted in t′]
Fix a power stretch t_j of x_i in t′.
SLIDE 56
Main Result, Proof Overview
Theorem (Structural Insight)
For pattern p and text t, |t| ≤ 2|p|, at least one of the following holds:
- The number of k-matches of p in t is at most 1000k², or
- Both t′ and p have Hamming distance < 20k to a periodic x; all k-matches start at position 1 + i · |x|.
[Figure: as before, with one power stretch t_j highlighted in t′ and a region labeled "≥ 2k mism."]
Fix a power stretch t_j of x_i in t′.
SLIDE 57
Main Result, Proof Overview
Theorem (Structural Insight)
For pattern p and text t, |t| ≤ 2|p|, at least one of the following holds:
- The number of k-matches of p in t is at most 1000k², or
- Both t′ and p have Hamming distance < 20k to a periodic x; all k-matches start at position 1 + i · |x|.
[Figure: as before, with one power stretch t_j highlighted in t′ and a region labeled "≥ 2k mism."]
Insight
Must align at least one mismatch.
SLIDE 58
Main Result, Proof Overview
Theorem (Structural Insight)
For pattern p and text t, |t| ≤ 2|p|, at least one of the following holds:
- The number of k-matches of p in t is at most 1000k², or
- Both t′ and p have Hamming distance < 20k to a periodic x; all k-matches start at position 1 + i · |x|.
[Figure: as before, with one power stretch t_j highlighted in t′ and a region labeled "≥ 2k mism."]
Insight
At most O(k⁴) matches: O(k) parts in p, O(k) stretches, O(k²) matches per combination.
SLIDE 59
Main Result
Theorem (Structural Insight)
For pattern p and text t, |t| ≤ 2|p|, at least one of the following holds:
- The number of k-matches of p in t is at most O(k²), or
- Both t′ and p have Hamming distance O(k) to the same periodic string x and all k-matches of p in t′ start at a position 1 + i · |x|.
t′: shortest substring of t such that any k-match of p in t is also a k-match in t′.
SLIDE 60
Faster Algorithm
Theorem (Algorithm)
Pattern matching with k mismatches on a text t given by an SLP of size n and a pattern p of length m can be solved in time O(n k³ (k log k + log m) + k m).
SLIDE 61
Faster Algorithm
Theorem (Algorithm)
Pattern matching with k mismatches on a text t given by an SLP of size n and a pattern p of length m can be solved in time O(n k³ (k log k + log m) + km).
Pattern-Compressed String [GS’13]
Let p be a string of length m. We call a string f = v_1 … v_q with ∑_{i=1}^{q} |v_i| ≤ 2m a p-pattern-compressed string (pc-string) if every v_i is a substring of p. We call the v_i's factors of f.
SLIDE 62
Faster Algorithm for Pattern-Compressed Strings
Pattern-Compressed String [GS’13]
Let p be a string of length m. We call a string f = v_1 … v_q with ∑_{i=1}^{q} |v_i| ≤ 2m a p-pattern-compressed string (pc-string) if every v_i is a substring of p. We call the v_i's factors of f.
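One way to picture this definition is to store every factor as a pointer into p. The sketch below assumes a hypothetical encoding of a factor as a pair (0-based start in p, length); the names `is_pc_string` and `expand_pc_string` are made up for illustration and do not come from the talk.

```python
from typing import List, Tuple

# Hypothetical encoding of a factor v_i: (0-based start in p, length).
Factor = Tuple[int, int]


def is_pc_string(factors: List[Factor], p: str) -> bool:
    """Check that every factor is a substring of p and the total length is at most 2m."""
    m = len(p)
    in_range = all(start >= 0 and length >= 1 and start + length <= m
                   for start, length in factors)
    return in_range and sum(length for _, length in factors) <= 2 * m


def expand_pc_string(factors: List[Factor], p: str) -> str:
    """Write out the string f = v_1 ... v_q that the factors describe."""
    return "".join(p[start:start + length] for start, length in factors)


p = "ANPAN"
factors = [(2, 3), (1, 1), (0, 2)]       # "PAN", "N", "AN"
print(is_pc_string(factors, p))           # True: total length 6 <= 2 * 5
print(expand_pc_string(factors, p))       # PANNAN
```

With this encoding a pc-string with q factors takes O(q) space regardless of how many characters it represents.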
SLIDE 63
Faster Algorithm for Pattern-Compressed Strings
Pattern-Compressed String [GS’13]
Let p be a string of length m. We call a string f = v_1 … v_q with ∑_{i=1}^{q} |v_i| ≤ 2m a p-pattern-compressed string (pc-string) if every v_i is a substring of p. We call the v_i's factors of f.
[Figure: reduction from an SLP instance I (SLP T = T_1, …, T_n; k; p of length m) to n pc-string instances J_1, …, J_n (k, p, f_i with O(k) factors each). A T(m, k)-time algorithm for pc-strings yields an O(n (log m + T(m, k)) + km)-time algorithm for SLPs; with T(m, k) = O(k³ (k log k + log m)) this is O(n k³ (k log k + log m) + km).]
SLIDE 64
Faster Algorithm for Pattern-Compressed Strings
Pattern-Compressed String [GS’13]
Let p be a string of length m. We call a string f = v_1 … v_q with ∑_{i=1}^{q} |v_i| ≤ 2m a p-pattern-compressed string (pc-string) if every v_i is a substring of p. We call the v_i's factors of f.
[Figure: reduction from an SLP instance I (SLP T = T_1, …, T_n; k; p of length m) to n pc-string instances J_1, …, J_n (k, p, f_i with O(k) factors each). An O(k³ (k log k + log m))-time algorithm for pc-strings yields an O(n k³ (k log k + log m) + km)-time algorithm for SLPs.]
SLIDE 66
Faster Algorithm for Pattern-Compressed Strings
Theorem (Algorithm for pc-strings)
Pattern matching with k mismatches on a pattern p of length m and a p-pc-string f of size O(k) representing at most 2m characters can be solved in time O(k³ (k log k + log m)).
(With O(km) preprocessing on p.)
Implementation of structural insight
Need, e.g., tools for finding the first O(k) mismatches to a periodic string or for finding all power stretches of a given string in a pc-string.
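To illustrate what the first of these tools computes, here is a naive sketch on an uncompressed string. It is only a specification by example: it scans character by character, whereas the tool needed for the stated running time has to work on the pc-string representation. The name `first_mismatches` is a hypothetical helper, and positions are 0-based.

```python
from typing import List


def first_mismatches(s: str, x: str, limit: int) -> List[int]:
    """0-based positions of the first `limit` mismatches between s and x x x ..."""
    positions: List[int] = []
    if limit <= 0:
        return positions
    for i, c in enumerate(s):
        if c != x[i % len(x)]:          # character of the periodic string at position i
            positions.append(i)
            if len(positions) == limit:
                break
    return positions


# The text of the periodic example in this deck, compared with PAN PAN PAN ...
print(first_mismatches("PUNRANPAMPAN", "PAN", 10))  # [1, 3, 8]
```

Stopping after `limit` mismatches is the important part of the interface: once more than O(k) mismatches have been seen, the string is known to be far from the periodic candidate.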
SLIDE 69
Faster Algorithm
Theorem (Algorithm)
Pattern matching with k mismatches on a text t given by an SLP of size n and a pattern p of length m can be solved in time O(n k³ (k log k + log m) + km).
SLIDE 70
Open Problems
SLIDE 71
Open Problems
Improve insight to O(k) k-matches in the aperiodic case
Theorem (Structural Insight′) [KW’19+]
For pattern p and text t, |t| ≤ 2|p|, at least one of the following holds:
The number of k-matches of p in t is at most O(k), or
both t′ and p have Hamming distance O(k) to the same periodic string x, and all k-matches of p in t′ start at a position 1 + i · |x|.
Here t′ is the shortest substring of t such that any k-match of p in t is also a k-match in t′.
SLIDE 72
Open Problems
Improve insight to O(k) k-matches in the aperiodic case
Improve dependence on k in the algorithm
Theorem (Algorithm)
Pattern matching with k mismatches on a text t given by an SLP of size n and a pattern p of length m can be solved in time O(n k³ (k log k + log m) + km).
SLIDE 73
Open Problems
Improve insight to O(k) k-matches in the aperiodic case
Improve dependence on k in the algorithm
Fully-compressed setting (p also given as an SLP)
Pattern matching with errors (edit distance instead of Hamming distance)
SLIDE 74
SLIDE 75
Solution Structure of Pattern Matching with Mismatches
[Figure: text t = A^{2m} and pattern p = A^m.]
SLIDE 76
Solution Structure of Pattern Matching with Mismatches
[Figure: t = A^{2m/3−1} B A^{2m/3−1} B A^{2m/3} and p = A^{(m−k−1)/2} B^{k+1} A^{(m−k−1)/2}.]
B at positions 2m/3 and 4m/3 in t and at the middle k + 1 positions in p.
SLIDE 77
Solution Structure of Pattern Matching with Mismatches
[Figure: t = A^{2m/3−1} B A^{2m/3−1} B A^{2m/3} and p = A^{(m−k−1)/2} B^{k+1} A^{(m−k−1)/2}.]
B at positions 2m/3 and 4m/3 in t and at the middle k + 1 positions in p.
All matches start in the union of two intervals.
SLIDE 78
Solution Structure of Pattern Matching with Mismatches
[Figure: t = A^{2m/3−1} B A^{2m/3−1} B A^{2m/3} and p = A^{(m−k−1)/2} B^{k+1} A^{(m−k−1)/2}.]
B at positions 2m/3 and 4m/3 in t and at the middle k + 1 positions in p.
Insight 3
Arithmetic progression only approximates all matches
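The construction on this slide can be checked by brute force for small parameters. The sketch below uses m = 21 and k = 2, which are arbitrary values chosen here so that all exponents are integers; `hamming` and `k_matches` are illustrative helper names, and starting positions are 0-based.

```python
def hamming(a: str, b: str) -> int:
    """Hamming distance of two equal-length strings."""
    return sum(c != d for c, d in zip(a, b))


def k_matches(p: str, t: str, k: int) -> list:
    """All 0-based starting positions where p occurs in t with at most k mismatches."""
    m = len(p)
    return [s for s in range(len(t) - m + 1) if hamming(p, t[s:s + m]) <= k]


m, k = 21, 2                                   # assumed example values
t = "A" * (2 * m // 3 - 1) + "B" + "A" * (2 * m // 3 - 1) + "B" + "A" * (2 * m // 3)
p = "A" * ((m - k - 1) // 2) + "B" * (k + 1) + "A" * ((m - k - 1) // 2)

print(k_matches(p, t, k))  # [2, 3, 4, 16, 17, 18]
```

Both strings are close to the periodic string A A A …, whose period has length 1, so the progression 1 + i · |x| contains every position. Only the two short intervals actually yield k-matches, which is why the progression is a superset of the matches and not an exact description.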
SLIDE 79
Main Result
Theorem (Structural Insight)
Given strings p of length m and t of length at most 2m, at least one of the following holds:
The number of k-matches of p in t is at most O(k²).
There is a substring x of p, with |x| = O(m/k), such that δH(p, x∗[1, m]) ≤ O(k) and δH(t′, x∗[1, |t′|]) ≤ O(k). Moreover, any k-match of p in t′ starts at a position of the form 1 + i · |x| with 0 ≤ i ≤ (|t′| − |p|)/|x| (but not every starting position 1 + i · |x| necessarily yields a k-match).
Here t′ is the shortest substring of t such that any k-match of p in t is also a k-match in t′.
SLIDE 83
Main Result
Theorem (Structural Insight)
For pattern p and text t, |t| ≤ 2|p|, at least one of the following holds: the number of k-matches of p in t is at most O(k²), or both t and p have HD O(k) to the same periodic string.
SLIDE 84
Main Result
Theorem (Structural Insight)
For pattern p and text t, |t| ≤ 2|p|, at least one of the following holds: the number of k-matches of p in t is at most O(k²), or both t and p have HD O(k) to the same periodic string.
Non-periodic case: finding ANPAN, k = 2, in t = PANCAKEPAN.
Periodic case: finding PANPAN, k = 2, in t = PUNRANPAMPAN.
SLIDE 85
Main Result
Theorem (Structural Insight)
For pattern p and text t, |t| ≤ 2|p|, at least one of the following holds: the number of k-matches of p in t is at most O(k²), or both t and p have HD O(k) to the same periodic string.
Non-periodic case: finding ANPAN, k = 2, in t = PANCAKEPAN (p shown at its two k-matches).
Periodic case: finding PANPAN, k = 2, in t = PUNRANPAMPAN.
SLIDE 87
Main Result
Theorem (Structural Insight)
For pattern p and text t, |t| ≤ 2|p|, at least one of the following holds: the number of k-matches of p in t is at most O(k²), or both t and p have HD O(k) to the same periodic string.
Non-periodic case: finding ANPAN, k = 2, in t = PANCAKEPAN (p shown at its two k-matches).
Periodic case: finding PANPAN, k = 2, in t = PUNRANPAMPAN (p shown at its three k-matches).
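The two examples on this slide are small enough to verify by brute force. The following sketch simply tries every alignment; `hamming_distance` and `find_k_matches` are illustrative names, and the reported starting positions are 0-based.

```python
def hamming_distance(a: str, b: str) -> int:
    """Number of positions where two equal-length strings differ."""
    return sum(c != d for c, d in zip(a, b))


def find_k_matches(p: str, t: str, k: int) -> list:
    """0-based starts of all length-|p| substrings of t within Hamming distance k of p."""
    m = len(p)
    return [s for s in range(len(t) - m + 1)
            if hamming_distance(p, t[s:s + m]) <= k]


print(find_k_matches("ANPAN", "PANCAKEPAN", 2))     # [1, 5]
print(find_k_matches("PANPAN", "PUNRANPAMPAN", 2))  # [0, 3, 6]
```

In the periodic example the starting positions differ by multiples of 3 = |PAN|, which matches the arithmetic-progression behaviour described in the theorem.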
SLIDE 88
Main Result, Proof Overview
Theorem (Structural Insight)
For pattern p and text t, |t| ≤ 2|p|, at least one of the following holds: the number of k-matches of p in t is at most O(k²), or both t and p have HD O(k) to the same periodic string.
SLIDE 89
Main Result, Proof Overview
Theorem (Structural Insight)
Fix a pattern p of length m and a text t of length at most 2m. If the number of k-matches of p in t is at least 1000k², then both t and p have HD < 20k to the same periodic string.
SLIDE 90
Main Result, Proof Overview
Theorem (Structural Insight)
Fix a pattern p of length m and a text t of length at most 2m. If the number of k-matches of p in t is at least 1000k², then both t and p have HD < 20k to the same periodic string.
Main Steps:
Step 1: At least 1000k² k-matches of p in t, and p has HD < 6k to a specific periodic string x ∈ x(p) ⟹ t has HD < 20k to x.
Step 2: p has HD ≥ 6k to every specific periodic string x ∈ x(p) ⟹ fewer than 1000k² k-matches of p in t.
SLIDE 92
Main Result, Proof Overview
Lemma (Step 1)
Fix a pattern p of length m and a text t of length at most 2m. If the number of k-matches of p in t is at least 1000k², and p has HD < 6k to a periodic string x ∈ x(p), then t has HD < 20k to x.
SLIDE 93
Main Result, Proof Overview
Lemma (Step 1)
Fix a pattern p of length m and a text t of length at most 2m. If the number of k-matches of p in t is at least 1000k², and p has HD < 6k to a periodic string x ∈ x(p), then t has HD < 20k to x.
[Figure: the text t and the pattern p.]
SLIDE 94
Main Result, Proof Overview
Lemma (Step 1)
Fix a pattern p of length m and a text t of length at most 2m. If the number of k-matches of p in t is at least 1000k², and p has HD < 6k to a periodic string x ∈ x(p), then t has HD < 20k to x.
[Figure: the pattern p with its first part p_1 marked.]
Split p into 16k parts p_i of equal length.
SLIDE 95
Main Result, Proof Overview
Lemma (Step 1)
Fix a pattern p of length m and a text t of length at most 2m. If the number of k-matches of p in t is at least 1000k², and p has HD < 6k to a periodic string x ∈ x(p), then t has HD < 20k to x.
[Figure: p split into parts p_1, p_2, …, p_i, …, p_16k.]
Split p into 16k parts p_i of equal length.
SLIDE 96
Main Result, Proof Overview
Lemma (Step 1)
Fix a pattern p of length m and a text t of length at most 2m. If the number of k-matches of p in t is at least 1000k², and p has HD < 6k to a periodic string x ∈ x(p), then t has HD < 20k to x.
[Figure: the part p_i marked inside p.]
Fix a part p_i.
SLIDE 97
Main Result, Proof Overview
Lemma (Step 1)
Fix a pattern p of length m and a text t of length at most 2m. If the number of k-matches of p in t is at least 1000k², and p has HD < 6k to a periodic string x ∈ x(p), then t has HD < 20k to x.
[Figure: the part p_i inside p, shown with period x_i.]
Consider a prefix x_i of p_i that is also a period of p_i.
SLIDE 98
Main Result, Proof Overview
Lemma (Step 1)
Fix a pattern p of length m and a text t of length at most 2m. If the number of k-matches of p in t is at least 1000k², and p has HD < 6k to a periodic string x ∈ x(p), then t has HD < 20k to x.
[Figure: p, with p_i shown with period x_i, aligned with the periodic string x_i∗ = x_i x_i x_i ….]
Find the first 3k mismatches between p and x_i∗ before and after p_i.
SLIDE 99
Main Result, Proof Overview
Lemma (Step 1)
Fix a pattern p of length m and a text t of length at most 2m. If the number of k-matches of p in t is at least 1000k², and p has HD < 6k to a periodic string x ∈ x(p), then t has HD < 20k to x.
[Figure: p aligned with x_i∗ = x_i x_i x_i …; labels: ≤ 3k mism. before p_i, ≤ 3k mism. after p_i.]
Find the first 3k mismatches between p and x_i∗ before and after p_i.
SLIDE 100
Main Result, Proof Overview
Lemma (Step 1)
Fix a pattern p of length m and a text t of length at most 2m. If the number of k-matches of p in t is at least 1000k², and p has HD < 6k to some x_i∗, 1 ≤ i ≤ 16k, then t has HD < 20k to x_i∗.
[Figure: p aligned with x_i∗ = x_i x_i x_i …; label: < 6k mism.]
SLIDE 101
Main Result, Proof Overview
Lemma (Step 1)
Fix a pattern p of length m and a text t of length at most 2m. If the number of k-matches of p in t is at least 1000k², and p has HD < 6k to some x_i∗, 1 ≤ i ≤ 16k, then t has HD < 20k to x_i∗.
Claim (Proof omitted)
If there are at least 2 + 16k k-matches of p in t, all starting positions of k-matches differ by (integer) multiples of |x_i|.
SLIDE 102
Main Result, Proof Overview
Lemma (Step 1)
Fix a pattern p of length m and a text t of length at most 2m. If the number of k-matches of p in t is at least 1000k², and p has HD < 6k to some x_i∗, 1 ≤ i ≤ 16k, then all starting positions of k-matches differ by multiples of |x_i| and t has HD < 20k to x_i∗.
SLIDE 103
Main Result, Proof Overview
Lemma (Step 1)
Fix a pattern p of length m and a text t of length at most 2m. If the number of k-matches of p in t is at least 1000k², and p has HD < 6k to some x_i∗, 1 ≤ i ≤ 16k, then all starting positions of k-matches differ by multiples of |x_i| and t has HD < 20k to x_i∗.
[Figure: the text t, the pattern p with the part p_i marked, and the periodic string x_i∗.]
SLIDE 104
Main Result, Proof Overview
Lemma (Step 1)
Fix a pattern p of length m and a text t of length at most 2m. If the number of k-matches of p in t is at least 1000k², and p has HD < 6k to some x_i∗, 1 ≤ i ≤ 16k, then all starting positions of k-matches differ by multiples of |x_i| and t has HD < 20k to x_i∗.
[Figure: t and p (with p_i marked), each aligned with x_i∗ = x_i x_i x_i ….]
SLIDE 105
Main Result, Proof Overview
Lemma (Step 1)
Fix a pattern p of length m and a text t of length at most 2m. If the number of k-matches of p in t is at least 1000k², and p has HD < 6k to some x_i∗, 1 ≤ i ≤ 16k, then all starting positions of k-matches differ by multiples of |x_i| and t has HD < 20k to x_i∗.
[Figure: t and p (with p_i marked), each aligned with x_i∗ = x_i x_i x_i …; labels: < 6k mism. to x_i∗ (twice).]
SLIDE 106
Main Result, Proof Overview
Lemma (Step 1)
Fix a pattern p of length m and a text t of length at most 2m. If the number of k-matches of p in t is at least 1000k², and p has HD < 6k to some x_i∗, 1 ≤ i ≤ 16k, then all starting positions of k-matches differ by multiples of |x_i| and t has HD < 20k to x_i∗.
[Figure: t and p (with p_i marked), each aligned with x_i∗ = x_i x_i x_i …; labels: < 6k mism. to x_i∗ (twice), < 7k mism. to x_i∗ (twice).]
SLIDE 107
Main Result, Proof Overview
Lemma (Step 1)
Fix a pattern p of length m and a text t of length at most 2m. If the number of k-matches of p in t is at least 1000k², and p has HD < 6k to some x_i∗, 1 ≤ i ≤ 16k, then all starting positions of k-matches differ by multiples of |x_i| and t has HD < 20k to x_i∗.
Lemma (Step 2)
Fix a pattern p of length m and a text t of length at most 2m. If p has HD ≥ 6k to all strings x_i∗, 1 ≤ i ≤ 16k, then there are fewer than 1000k² k-matches of p in t.
SLIDE 108
Main Result, Proof Overview
Lemma (Step 2)
Fix a pattern p of length m and a text t of length at most 2m. If p has HD ≥ 6k to all strings x_i∗, 1 ≤ i ≤ 16k, then there are fewer than 1000k² k-matches of p in t.
SLIDE 109
Main Result, Proof Overview
Lemma (Step 2)
Fix a pattern p of length m and a text t of length at most 2m. If p has HD ≥ 6k to all strings x_i∗, 1 ≤ i ≤ 16k, then there are fewer than 1000k² k-matches of p in t.
[Figure: p split into parts p_1, p_2, …, p_i, …, p_16k.]
Recall: Split p into 16k parts p_i of equal length.
SLIDE 110
Main Result, Proof Overview
Lemma (Step 2)
Fix a pattern p of length m and a text t of length at most 2m. If p has HD ≥ 6k to all strings x_i∗, 1 ≤ i ≤ 16k, then there are fewer than 1000k² k-matches of p in t.
[Figure: p split into parts p_1, p_2, …, p_i, …, p_16k.]
Insight
Any k-match of p in t must match at least 15k of the parts p_i exactly.
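The reason is a pigeonhole argument: each of the at most k mismatches falls into exactly one of the 16k parts, so at least 16k − k = 15k parts contain no mismatch. The sketch below checks this on a toy instance; it assumes that 16k divides |p|, and the strings and the helper name `exactly_matched_parts` are made up for illustration.

```python
def exactly_matched_parts(p: str, t: str, start: int, parts: int) -> int:
    """Count the parts of p (equal length, assuming parts divides len(p))
    that match t exactly when p is aligned at 0-based position `start`."""
    size = len(p) // parts
    window = t[start:start + len(p)]
    return sum(p[i * size:(i + 1) * size] == window[i * size:(i + 1) * size]
               for i in range(parts))


k = 1
p = "ABCDEFGHIJKLMNOPQRSTUVWXYZABCDEF"          # length 32 = 16k parts of length 2
t = "xx" + p[:10] + "?" + p[11:] + "yy"          # p occurs at position 2 with 1 mismatch

print(exactly_matched_parts(p, t, 2, 16 * k))    # 15
```

Here the single mismatch spoils one part, and the remaining 15 = 15k parts are matched exactly.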
SLIDE 111
Main Result, Proof Overview
Lemma (Step 2)
Fix a pattern p of length m and a text t of length at most 2m. If p has HD ≥ 6k to all strings x_i∗, 1 ≤ i ≤ 16k, then there are fewer than 1000k² k-matches of p in t.
[Figure: the part p_i marked inside p.]
Fix a part p_i.
SLIDE 112
Main Result, Proof Overview
Lemma (Step 2)
Fix a pattern p of length m and a text t of length at most 2m. If p has HD ≥ 6k to all strings x_i∗, 1 ≤ i ≤ 16k, then there are fewer than 1000k² k-matches of p in t.
[Figure: the part p_i marked inside p.]
Fix a part p_i; count the k-matches in which p_i is matched exactly.
SLIDE 113
Main Result, Proof Overview
Lemma (Step 2)
Fix a pattern p of length m and a text t of length at most 2m. If p has HD ≥ 6k to all strings x_i∗, 1 ≤ i ≤ 16k, then there are fewer than 1000k² k-matches of p in t.
[Figure: the part p_i of p, shown with period x_i; occurrences of x_i marked in t.]
Search for x_i in t.
SLIDE 114
Main Result, Proof Overview
Lemma (Step 2)
Fix a pattern p of length m and a text t of length at most 2m. If p has HD ≥ 6k to all strings x_i∗, 1 ≤ i ≤ 16k, then there are fewer than 1000k² k-matches of p in t.
[Figure: the part p_i of p, shown with period x_i; occurrences of x_i marked in t.]
Problem
Up to O(m) exact matches of x_i in t.
SLIDE 115
Main Result, Proof Overview
Lemma (Step 2)
Fix a pattern p of length m and a text t of length at most 2m. If p has HD ≥ 6k to all strings x_i∗, 1 ≤ i ≤ 16k, then there are fewer than 1000k² k-matches of p in t.
[Figure: the part p_i of p, shown with period x_i; power stretches of x_i marked in t.]
Search for power stretches of x_i in t of length ≥ |p_i|.
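The slides do not spell out the definition of a power stretch, so the following sketch makes an assumption: a power stretch of x in t is taken to be a maximal run of consecutive exact copies of x. It further assumes that x is primitive (not itself a repetition of a shorter string), so that distinct runs do not interleave. The function name `power_stretches` and the half-open (start, end) output format are illustrative choices, and the scan is naive on an uncompressed text.

```python
from typing import List, Tuple


def power_stretches(t: str, x: str, min_len: int) -> List[Tuple[int, int]]:
    """Maximal runs x x ... x in t of length >= min_len, as 0-based half-open (start, end)."""
    step = len(x)
    # All positions where an exact copy of x starts.
    starts = {i for i in range(len(t) - step + 1) if t.startswith(x, i)}
    stretches: List[Tuple[int, int]] = []
    for i in sorted(starts):
        if i - step in starts:
            continue                      # i continues a run that began earlier
        j = i
        while j in starts:                # extend the run copy by copy
            j += step
        if j - i >= min_len:
            stretches.append((i, j))
    return stretches


print(power_stretches("abababxxabab", "ab", 4))  # [(0, 6), (8, 12)]
```

Grouping occurrences into runs is what turns the up to O(m) exact occurrences of x_i into a much smaller number of objects to reason about.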
SLIDE 116
Main Result, Proof Overview
Lemma (Step 2)
Fix a pattern p of length m and a text t of length at most 2m. If p has HD ≥ 6k to all strings x_i∗, 1 ≤ i ≤ 16k, then there are fewer than 1000k² k-matches of p in t.
[Figure: the part p_i of p, shown with period x_i; power stretches of x_i marked in t.]
Insight
At most 150k different power stretches of x_i in t.
SLIDE 117
Main Result, Proof Overview
Lemma (Step 2)
Fix a pattern p of length m and a text t of length at most 2m. If p has HD ≥ 6k to all strings x_i∗, 1 ≤ i ≤ 16k, then there are fewer than 1000k² k-matches of p in t.
[Figure: the part p_i of p, shown with period x_i; power stretches of x_i marked in t.]
Fix a power stretch t_j of x_i in t.
SLIDE 118
Main Result, Proof Overview
Lemma (Step 2)
Fix a pattern p of length m and a text t of length at most 2m. If p has HD ≥ 6k to all strings x_i∗, 1 ≤ i ≤ 16k, then there are fewer than 1000k² k-matches of p in t.
[Figure: the part p_i of p, shown with period x_i; one power stretch t_j of x_i marked in t.]
Fix a power stretch t_j of x_i in t.
SLIDE 119
Main Result, Proof Overview
Lemma (Step 2)
Fix a pattern p of length m and a text t of length at most 2m. If p has HD ≥ 6k to all strings x_i∗, 1 ≤ i ≤ 16k, then there are fewer than 1000k² k-matches of p in t.
[Figure: the part p_i of p, shown with period x_i; one power stretch t_j of x_i marked in t.]
SLIDE 120
Main Result, Proof Overview
Lemma (Step 2)
Fix a pattern p of length m and a text t of length at most 2m. If p has HD ≥ 6k to all strings x_i∗, 1 ≤ i ≤ 16k, then there are fewer than 1000k² k-matches of p in t.
[Figure: the part p_i of p and the power stretch t_j in t, compared with x_i∗ = x_i x_i x_i …; label: ≥ 3k mism.]
SLIDE 121
Main Result, Proof Overview
[Figure: the part p_i of p and the power stretch t_j in t, compared with x_i∗ = x_i x_i x_i …; labels: ≥ 3k mism., ≥ 2k mism.]
SLIDE 122
Main Result, Proof Overview
[Figure: the part p_i of p and the power stretch t_j in t, compared with x_i∗ = x_i x_i x_i …; labels: ≥ 3k mism., ≥ 2k mism.]
Insight
Must align at least k mismatches.
SLIDE 123
Main Result, Proof Overview
[Figure: the part p_i of p and the power stretch t_j in t, compared with x_i∗ = x_i x_i x_i …; labels: ≥ 3k mism., ≥ 2k mism.]
Insight
At most O(k⁴) matches: O(k) parts in p, O(k) stretches, O(k²) matches per combination.
SLIDE 124
Main Result, Proof Overview
Lemma (Step 1)
Fix a pattern p of length m and a text t of length at most 2m. If the number of k-matches of p in t is at least 1000k², and p has HD < 6k to some x_i∗, 1 ≤ i ≤ 16k, then all starting positions of k-matches differ by multiples of |x_i| and t has HD < 20k to x_i∗.
Lemma (Step 2)
Fix a pattern p of length m and a text t of length at most 2m. If p has HD ≥ 6k to all strings x_i∗, 1 ≤ i ≤ 16k, then there are fewer than 1000k² k-matches of p in t.
SLIDE 125
Main Result, Proof Overview
Theorem (Structural Insight)
Given strings p of length m and t of length at most 2m, at least one of the following holds:
The number of k-matches of p in t is at most O(k²).
There is a substring x of p, with |x| = O(m/k), such that δH(p, x∗[1, m]) ≤ O(k) and δH(t′, x∗[1, |t′|]) ≤ O(k). Moreover, any k-match of p in t′ starts at a position of the form 1 + i · |x| with 0 ≤ i ≤ (|t′| − |p|)/|x| (but not every starting position 1 + i · |x| necessarily yields a k-match).
Here t′ is the shortest substring of t such that any k-match of p in t is also a k-match in t′.
SLIDE 126
Faster Algorithm for Pattern-Compressed Strings
Pattern-Compressed String [GS’13]
Let p be a string of length m. We call a string f = v1 . . . vq, q
i=1 |vi| ≤ 2m a p-pattern-
compressed string (pc-string) if every vi is a substring of p. We call the vi’s factors of f.
Karl Bringmann, Marvin Künnemann, and Philip Wellnitz Faster Pattern Matching with Mismatches in Compressed Texts
SLIDE 127
Faster Algorithm for Pattern-Compressed Strings
Pattern-Compressed String [GS’13]
Let p be a string of length m. We call a string f = v1 . . . vq, q
i=1 |vi| ≤ 2m a p-pattern-
compressed string (pc-string) if every vi is a substring of p. We call the vi’s factors of f.
[Figure: reduction from an SLP instance I (SLP T = T_1, . . . , T_n; k; p of length m) to n pc-string instances J_1, . . . , J_n, where J_i = (k, p, f_i) and each f_i has O(k) factors.]
A T(m, k) algorithm for pc-string instances yields an O(n (log m + T(m, k)) + km) algorithm for the SLP instance.
With the O(k³(k log k + log m)) algorithm for pc-strings, this gives an O(n k³ (k log k + log m) + km) algorithm.
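The step from the generic bound to the concrete one is a direct substitution. The following worked calculation spells it out, assuming k ≥ 1 so that the additive n log m term is absorbed:

```latex
\begin{align*}
n\bigl(\log m + T(m,k)\bigr) + km
  &= n \log m + n\,k^{3}\,(k \log k + \log m) + km
     && \text{with } T(m,k) = k^{3}(k \log k + \log m) \\
  &\le 2\,n\,k^{3}\,(k \log k + \log m) + km
     && \text{since } \log m \le k^{3}(k \log k + \log m) \text{ for } k \ge 1 \\
  &= O\bigl(n\,k^{3}\,(k \log k + \log m) + km\bigr).
\end{align*}
```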
SLIDE 128
SLIDE 129
SLIDE 130
Faster Algorithm for Pattern-Compressed Strings
Theorem (Algorithm for pc-strings)
Pattern matching with k mismatches on a pattern p of length m and a p-pc-string f of size O(k) representing at most 2m characters can be solved in time O(k³(k log k + log m)) (with O(km) preprocessing on p).
Implementation of structural insight
Needs, e.g., tools for finding the first O(k) mismatches to a periodic string, or for finding all power stretches of a given string in a pc-string.
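As an illustration of the first tool mentioned above, here is a naive sketch that finds the first mismatches between an explicit string and a periodic string x∗. It scans the uncompressed string in linear time, so it does not achieve the bounds of the theorem, which require working on the pc-string directly. The function name `first_mismatches` and its parameters are hypothetical.

```python
def first_mismatches(s: str, x: str, limit: int) -> list[int]:
    """Return the 1-based positions of the first `limit` mismatches between
    s and x*[1, |s|], scanning left to right (naive, O(|s|) time)."""
    if not x:
        raise ValueError("the period x must be non-empty")
    out: list[int] = []
    for i, c in enumerate(s):
        if len(out) >= limit:
            break  # enough mismatches collected; stop early
        if c != x[i % len(x)]:
            out.append(i + 1)
    return out


# "abababXbabYb" deviates from (ab)* at positions 7 and 11.
print(first_mismatches("abababXbabYb", "ab", 5))
```

Stopping after `limit` mismatches mirrors how the structural insight is used: once more than O(k) mismatches to x∗ are found, the string is known not to be close to periodic, and the remaining positions need not be inspected.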
SLIDE 131
SLIDE 132
SLIDE 133
SLIDE 134