Irregular Languages CSCI 3130 Formal Languages and Automata Theory

slide-1
SLIDE 1

1/18

Irregular Languages

CSCI 3130 Formal Languages and Automata Theory Siu On CHAN

Chinese University of Hong Kong

Fall 2015

slide-2
SLIDE 2

2/18

Non-regular languages

Are there irregular languages? Candidate from last lecture:

L = {0^n 1 0^n 1 | n ≥ 0}

(the "duplicate" of the language of 0∗1 = {1, 01, 001, 0001, . . . })

Why do we believe it is irregular?

Seems to require a “DFA” with infinitely many states
After reading the first half, need to remember the number of zeros so far
11, 0101, 001001, 00010001, …
Infinitely many possibilities

Let’s formally prove this intuition

slide-4
SLIDE 4

3/18

Distinct states for 01 and 0001

Claim

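If a deterministic automaton accepts L = {0^n 1 0^n 1 | n ≥ 0}, the state q it reaches upon reading 01 must be different from the state q′ it reaches upon reading 0001

[Figure: the two possibilities. Either 01 and 0001 lead to different states q and q′, or both lead to the same state q (= q′)]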

slide-5
SLIDE 5

4/18

Distinct states for 01 and 0001

Claim

If a deterministic automaton accepts L = {0^n 1 0^n 1 | n ≥ 0}, the state q it reaches upon reading 01 must be different from the state q′ it reaches upon reading 0001

Why can't both 01 and 0001 lead to the same state q (= q′)?

Reason: after reaching q, suppose the automaton reads 01 and reaches state r

[Figure: 01 and 0001 both lead to q (= q′); reading 01 from q leads to r]

If r is not accepting, it rejects 0101, which is in L

If r is accepting, it accepts 000101, which is not in L

slide-6
SLIDE 6

5/18

General case: distinguishable strings

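If a deterministic automaton accepts L and there are strings x, y and z such that xz ∈ L but yz ∉ L, then the automaton must be in two different states upon reading x and y

[Figure: x and y lead to different states q and q′, not to a single state q (= q′)]

Reason: suppose x and y both lead to q (= q′), and reading z from there leads to a state r

If r is not accepting, it rejects xz, which is in L

If r is accepting, it accepts yz, which is not in L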

slide-7
SLIDE 7

6/18

Distinguishable strings

x and y are distinguishable by L if for some string z, we have xz ∈ L and yz ∉ L (or the other way round)

If x and y are distinguishable by L, any deterministic automaton accepting L must reach different states upon reading x and y

[Figure: x leads to q and y leads to q′, not to a single state q (= q′)]
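The definition can be explored by brute force. The following is a minimal sketch, not part of the lecture: the names find_distinguishing_suffix and in_L and the bound max_len are assumptions made for illustration. It searches all suffixes z up to a given length for one that puts exactly one of xz, yz in the language.

```python
from itertools import product

def in_L(s):
    """Membership in L = {0^n 1 0^n 1 | n >= 0}."""
    n = s.find("1")  # number of leading 0s, or -1 if there is no 1
    return n >= 0 and s == ("0" * n + "1") * 2

def find_distinguishing_suffix(x, y, in_language, alphabet="01", max_len=6):
    """Return some z with exactly one of xz, yz in the language, else None."""
    for length in range(max_len + 1):
        for letters in product(alphabet, repeat=length):
            z = "".join(letters)
            if in_language(x + z) != in_language(y + z):
                return z
    return None

z = find_distinguishing_suffix("01", "0001", in_L)
print(repr(z))  # a suffix that separates 01 from 0001
```

A result of None only says that no suffix of length at most max_len separates the two strings; it does not prove that they are indistinguishable.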

slide-8
SLIDE 8

7/18

Requires many states

Strings x1, . . . , xn are called pairwise distinguishable by L if every pair xi and xj are distinguishable by L, for any i ≠ j.

If strings x1, . . . , xn are pairwise distinguishable by L, any deterministic automaton accepting L must have at least n states

[Figure: x1, x2, . . . , xn lead to n distinct states q1, q2, . . . , qn; with only n − 1 states q1, . . . , qn−1, two of the strings would have to share a state]

slide-9
SLIDE 9

8/18

Pigeonhole principle

If you put 5 balls into 4 bins, then (at least) two balls end up in the same bin

More generally

If you put n balls into (at most) n − 1 bins, then (at least) two balls end up in the same bin

slide-11
SLIDE 11

10/18

Requires many states

If strings x1, . . . , xn are pairwise distinguishable by L, any deterministic automaton accepting L must have at least n states

Otherwise:

[Figure: the n strings x1, . . . , xn lead into at most n − 1 states q1, . . . , qn−1, so some xi and xj share a state]

If there are (at most) n − 1 states, by the pigeonhole principle, two different strings xi and xj must end up at the same state. But xi and xj are distinguishable by L, so any deterministic automaton accepting L must reach different states upon reading xi and xj. Contradiction ✗
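The pigeonhole step can be tried out on a concrete machine. The sketch below assumes a hypothetical 3-state DFA that tracks the number of 0s modulo 3; the DFA and the helper names run_dfa and find_collision are illustrative and not from the lecture. Feeding it four strings forces two of them into the same state.

```python
def run_dfa(delta, start, s):
    """Follow the transition table delta from start on input s."""
    state = start
    for ch in s:
        state = delta[(state, ch)]
    return state

def find_collision(delta, start, strings):
    """Return (earlier, later, state) for two strings reaching the same state."""
    seen = {}  # state -> first string that reached it
    for s in strings:
        q = run_dfa(delta, start, s)
        if q in seen:
            return seen[q], s, q
        seen[q] = s
    return None

# Hypothetical DFA over {0, 1}: state = (number of 0s read) mod 3.
delta = {(q, "0"): (q + 1) % 3 for q in range(3)}
delta.update({(q, "1"): q for q in range(3)})

# Four strings, three states: two strings must collide.
collision = find_collision(delta, 0, ["", "0", "00", "000"])
print(collision)
```

Here ε and 000 collide. They are distinguishable by the language of strings with equally many 0s and 1s (take z = ε), so no choice of accepting states lets this particular machine accept that language.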

slide-13
SLIDE 13

11/18

0^n 1 0^n 1 is not regular

It suffices to find an infinite sequence of strings that are pairwise distinguishable by L = {0^n 1 0^n 1 | n ≥ 0}: for every n, a DFA accepting L would then need at least n states

After reading the first half, need to remember the number of zeros so far
11, 0101, 001001, 00010001, …

1, 01, 001, 0001, . . . are pairwise distinguishable by L

Why are 0^i 1 and 0^j 1 distinguishable by L? (i ≠ j)

Take z = 0^i 1
0^i 1 0^i 1 ∈ L
0^j 1 0^i 1 ∉ L
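The choice z = 0^i 1 can be checked mechanically for small i and j. This is only a finite sanity check under an assumed bound, not a proof; the function names in_L and suffix_separates are illustrative.

```python
def in_L(s):
    """Membership in L = {0^n 1 0^n 1 | n >= 0}."""
    n = s.find("1")  # number of leading 0s, or -1 if there is no 1
    return n >= 0 and s == ("0" * n + "1") * 2

def suffix_separates(i, j):
    """Check that z = 0^i 1 puts 0^i 1 z in L and 0^j 1 z outside L."""
    z = "0" * i + "1"
    x = "0" * i + "1"
    y = "0" * j + "1"
    return in_L(x + z) and not in_L(y + z)

bound = 8  # assumed bound for the check
all_separated = all(
    suffix_separates(i, j)
    for i in range(bound)
    for j in range(bound)
    if i != j
)
print(all_separated)
```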

slide-14
SLIDE 14

12/18

Which of these are (ir)regular?

L1 = {x | x has the same number of 0s and 1s}
L2 = {0^n 1^m | n > m ≥ 0}
L3 = {x | x has the same number of patterns 01 and 11}
L4 = {x | x has the same number of patterns 01 and 10}
L5 = {x | x has a different number of 0s and 1s}

slide-17
SLIDE 17

13/18

L1 = Same number of 0s and 1s

Why does it require infinitely many states to accept?

Need to remember the number of 0s (or 1s) read so far

ε, 0, 00, 000, . . . are pairwise distinguishable by L1

Why are 0^i and 0^j distinguishable by L1? (i ≠ j)

Take z = 1^i
0^i 1^i ∈ L1
0^j 1^i ∉ L1

slide-19
SLIDE 19

14/18

L2 = {0^n 1^m | n > m ≥ 0}

Like L1, need to remember the number of 0s read so far

ε, 0, 00, 000, . . . are pairwise distinguishable by L2

Why are 0^i and 0^j distinguishable by L2? (i > j)

Take z = 1^(i−1)
0^i 1^(i−1) ∈ L2
0^j 1^(i−1) ∉ L2, since j ≤ i − 1
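The suffix here is 1^(i−1) rather than 1^i, because membership in L2 needs strictly more 0s than 1s. A small sketch, with an assumed bound and illustrative names in_L2 and suffix_separates, checks this choice for small i > j.

```python
import re

def in_L2(s):
    """Membership in L2 = {0^n 1^m | n > m >= 0}."""
    m = re.fullmatch(r"(0*)(1*)", s)
    return m is not None and len(m.group(1)) > len(m.group(2))

def suffix_separates(i, j):
    """For i > j, check that z = 1^(i-1) keeps 0^i z in L2 and 0^j z out."""
    z = "1" * (i - 1)
    return in_L2("0" * i + z) and not in_L2("0" * j + z)

bound = 8  # assumed bound for the check
all_separated = all(
    suffix_separates(i, j) for i in range(1, bound) for j in range(i)
)
print(all_separated)
```

With z = 1^i instead, the string 0^i 1^i would already fall outside L2, so that suffix would not separate the pair.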

slide-21
SLIDE 21

15/18

L3 = same number of 01s and 11s

Need to remember the number of 01s read so far

ε, 01, 0101, 010101, . . . are pairwise distinguishable by L3

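Why are (01)^i and (01)^j distinguishable by L3? (i > j)

Take z = 1^i
(01)^i 1^i ∈ L3
(01)^j 1^i ∉ L3

Example: 010101111 (i = 3)

Exception: for i = 1, j = 0 this z gives the string 1, which is in L3; take z = ε instead (ε ∈ L3 but 01 ∉ L3)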
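Counting patterns is easy to get wrong by hand, since occurrences of 11 may overlap (the example 010101111 has three). The sketch below assumes that overlapping occurrences are counted, which matches the example, and lists the pairs i > j below an assumed bound for which the suffix z = 1^i fails to separate (01)^i from (01)^j. The names count_pattern, in_L3 and failures are illustrative.

```python
def count_pattern(s, pattern):
    """Count occurrences of pattern in s, overlapping ones included."""
    return sum(s.startswith(pattern, k) for k in range(len(s)))

def in_L3(s):
    """Membership in L3: same number of patterns 01 and 11."""
    return count_pattern(s, "01") == count_pattern(s, "11")

bound = 6  # assumed bound for the check
failures = [
    (i, j)
    for i in range(1, bound)
    for j in range(i)
    if not (in_L3("01" * i + "1" * i) and not in_L3("01" * j + "1" * i))
]
print(failures)
```

Within this bound the only pair reported is i = 1, j = 0, where the empty suffix separates ε from 01 instead.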

slide-23
SLIDE 23

16/18

L4 = same number of 01s and 10s

Attempted argument: ε, 01, 0101, 010101, . . . are pairwise distinguishable by L4

Why are (01)^i and (01)^j distinguishable by L4? (i > j)

Take z = (10)^i
(01)^i (10)^i ∈ L4
(01)^j (10)^i ∉ L4 ?

Example: 010101101010 (i = 3)

In fact, (01)^j (10)^i ∈ L4 for j ≥ 1, because there are as many 01s as 10s, so this choice of z does not work

In fact, L4 is regular (see Week 2 tutorial)
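One way to see why L4 needs only finite memory: in a binary string the patterns 01 and 10 alternate, so their counts are equal exactly when the string is empty or starts and ends with the same symbol. This characterization is a standard observation and is not taken from the slides. The sketch below checks it exhaustively up to an assumed length bound; in_L4 and same_ends are illustrative names.

```python
from itertools import product

def count_pattern(s, pattern):
    """Count occurrences of pattern in s, overlapping ones included."""
    return sum(s.startswith(pattern, k) for k in range(len(s)))

def in_L4(s):
    """Membership in L4: same number of patterns 01 and 10."""
    return count_pattern(s, "01") == count_pattern(s, "10")

def same_ends(s):
    """Finite-memory test: empty, or first symbol equals last symbol."""
    return s == "" or s[0] == s[-1]

max_len = 10  # assumed bound for the exhaustive check
agree = all(
    in_L4("".join(w)) == same_ends("".join(w))
    for n in range(max_len + 1)
    for w in product("01", repeat=n)
)
print(agree)
```

A DFA only has to remember the first symbol and the most recent one, which takes a constant number of states.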

slide-25
SLIDE 25

17/18

L5 = different number of 0s and 1s

Is L5 irregular? Yes

If L5 were regular, then so would be its complement (regular languages are closed under complement):

complement of L5 = L1 = {x | x has the same number of 0s and 1s}

But we saw that L1 is irregular, therefore so is L5
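The closure step relies on a standard construction: in a DFA whose transition function is defined for every state and symbol, swapping accepting and non-accepting states yields a DFA for the complement. The sketch below illustrates this on a hypothetical two-state DFA for "even number of 1s"; the DFA and the names accepts and complement_accepting are assumptions for illustration, not from the lecture.

```python
from itertools import product

def accepts(delta, start, accepting, s):
    """Run a complete DFA on s and report whether it ends in an accepting state."""
    state = start
    for ch in s:
        state = delta[(state, ch)]
    return state in accepting

def complement_accepting(states, accepting):
    """Accepting set of the complement DFA: all the other states."""
    return set(states) - set(accepting)

# Hypothetical DFA over {0, 1}: accepts strings with an even number of 1s.
states = {"even", "odd"}
delta = {
    ("even", "0"): "even", ("even", "1"): "odd",
    ("odd", "0"): "odd", ("odd", "1"): "even",
}
accepting = {"even"}
co_accepting = complement_accepting(states, accepting)

# On every short string, exactly one of the two machines accepts.
flipped_everywhere = all(
    accepts(delta, "even", accepting, "".join(w))
    != accepts(delta, "even", co_accepting, "".join(w))
    for n in range(6)
    for w in product("01", repeat=n)
)
print(flipped_everywhere)
```

Applied to a supposed DFA for L5, the same swap would give a DFA for L1, which cannot exist.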

slide-27
SLIDE 27

18/18

An exercise

L6 = properly nested strings of parentheses, Σ = {(, )}

(), (()), ()() are in L6
(, ), )( are not

Exercise: show that L6 is irregular

What does it mean?

Language = computational problem
DFA = machine with finite memory

L6 is irregular ⇒ checking whether (arbitrarily long) strings are properly nested requires an unbounded amount of memory
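The memory requirement shows up directly in the usual way of checking nesting with a counter. The following is a minimal sketch, with nesting_profile as an illustrative name; it reports whether a string is properly nested and how large the counter became along the way.

```python
def nesting_profile(s):
    """Return (is_properly_nested, max_depth) for a string over '(' and ')'."""
    depth = 0
    max_depth = 0
    for ch in s:
        if ch == "(":
            depth += 1
            max_depth = max(max_depth, depth)
        elif ch == ")":
            depth -= 1
            if depth < 0:  # a ')' with no matching '(' before it
                return False, max_depth
        else:
            raise ValueError("expected only '(' and ')'")
    return depth == 0, max_depth

for s in ["()", "(())", "()()", "(", ")", ")("]:
    print(repr(s), nesting_profile(s))
```

On the input consisting of n opening parentheses followed by n closing ones, the counter reaches n, so no fixed bound on its value covers all inputs.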