

SLIDE 1

LEARNING AND INFERENCE WITH CONSTRAINTS

Marco Gori, University of Siena (Italy)
ILP 2018

SLIDE 2

Outline

  • Environment and constraints
  • Bridging logic and real-valued constraints
  • Representational issues
  • Learning, Reasoning and Inference with constraints (the LYRICS s/w environment)

SLIDE 3

ENVIRONMENTS AND CONSTRAINTS

SLIDE 4

Supervised Learning

The XOR training set can be expressed as constraints:

L = {((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)}

y((0, 1)), y((1, 0)), ¬y((0, 0)), ¬y((1, 1)).

"Hard" architectural constraints (Lagrangian framework) for a 2-2-1 network with inputs 1, 2, hidden units 3, 4, and output 5:

xκ5 − σ(w53 xκ3 + w54 xκ4 + b5) = 0
xκ4 − σ(w41 xκ1 + w42 xκ2 + b4) = 0
xκ3 − σ(w31 xκ1 + w32 xκ2 + b3) = 0
κ = 1, 2, 3, 4

together with the training set constraints

x15 = 1, x25 = 1, x35 = 0, x45 = 0.
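As a concrete rendering of these architectural and supervision constraints, here is a minimal NumPy sketch that evaluates the residuals for the 2-2-1 network above; the weight values and the ordering of the training set are illustrative assumptions, not taken from the slides.

import numpy as np

def sigma(t):
    return 1.0 / (1.0 + np.exp(-t))  # logistic activation

# XOR training set, ordered so that x15 = 1, x25 = 1, x35 = 0, x45 = 0
X = np.array([[0., 1.], [1., 0.], [0., 0.], [1., 1.]])
y = np.array([1., 1., 0., 0.])

# Illustrative parameters (w31, w32, b3), (w41, w42, b4), (w53, w54, b5)
w3, b3 = np.array([4.0, 4.0]), -2.0    # hidden unit 3
w4, b4 = np.array([-4.0, -4.0]), 6.0   # hidden unit 4
w5, b5 = np.array([4.0, 4.0]), -6.0    # output unit 5

x3 = sigma(X @ w3 + b3)                # architectural constraints are
x4 = sigma(X @ w4 + b4)                # satisfied here by construction
x5 = sigma(np.stack([x3, x4], 1) @ w5 + b5)

# Residuals of the supervision constraints xκ5 = yκ
print(np.round(x5, 2), "residuals:", np.round(x5 - y, 2))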

SLIDE 5

Enforcing Consistencies

Consider three linear tasks between the spaces H, A, W (e.g., height, age, weight):

fωh : H → W : h ↦ ω(h),
fah : H → A : h ↦ a(h),
fωa : A → W : a ↦ ω(a).

Coherence requires that predicting ω from h directly agrees with composing the two intermediate maps:

fωh(h) = (fωa ◦ fah)(h).

This functional equation imposes the circulation of coherence. Since the functions are linear, the constraint can be converted to wωh h + bωh = wωa wah h + (wωa bah + bωa). The equivalence ∀h ∈ R+ yields

wωa wah − wωh = 0,
wωa bah + bωa − bωh = 0.
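A soft version of these two conditions can be enforced as a penalty on the parameters; a minimal sketch with arbitrary illustrative values:

# Parameters of the three linear tasks: f(t) = w*t + b (toy values)
w_ah, b_ah = 0.4, 2.0    # height -> age
w_wa, b_wa = 1.5, -10.0  # age -> weight
w_wh, b_wh = 0.6, -7.0   # height -> weight

# Residuals of the two coherence constraints derived above
r1 = w_wa * w_ah - w_wh
r2 = w_wa * b_ah + b_wa - b_wh

# Soft enforcement: add the squared residuals to the training loss
penalty = r1**2 + r2**2
print(penalty)  # zero iff direct and composed predictions coincide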

SLIDE 6

Diagnosis and Prognosis in Medicine

Pima Indian Diabetes Dataset (MASS = body mass index, PLASMA = blood glucose):

(MASS ≥ 30) ∧ (PLASMA ≥ 126) ⇒ positive
(MASS ≤ 25) ∧ (PLASMA ≤ 100) ⇒ negative.

Wisconsin Breast Cancer Prognosis (SIZE = diameter of the tumor, NODES = number of metastasized lymph nodes):

(SIZE ≥ 4) ∧ (NODES ≥ 5) ⇒ recurrent
(SIZE ≤ 1.9) ∧ (NODES = 0) ⇒ non-recurrent.

SLIDE 7

Reconstruction of overwritten chars (MNIST)

Recognize the foreground and background digits. I was told that the foreground char is less than or equal to the background char.

SLIDE 8

Reconstruction of overwritten chars (MNIST)

SLIDE 9

Patterns, labels, and individuals

An individual pairs a label X with a pattern x, e.g. (X, x) = (Giuseppe, (178, 70, 45)). What about learning and inference with individuals?

SLIDE 10

Inference in formal logic

Domain(label="People")
Individual(label="Marco", "People")
Individual(label="Giuseppe", "People")
Individual(label="Michelangelo", "People")
Individual(label="Francesco", "People")
Individual(label="Franco", "People")
Individual(label="Andrea", "People")
Predicate(label="fatherOf", ("People", "People"))
Predicate(label="grandFatherOf", ("People", "People"))
Predicate(label="eq", ("People", "People"), function=eq)
Constraint("fatherOf(Marco, Giuseppe)")
Constraint("fatherOf(Giuseppe, Michelangelo)")
Constraint("fatherOf(Giuseppe, Francesco)")
Constraint("fatherOf(Franco, Andrea)")
Constraint("forall x: not fatherOf(x,x)")
Constraint("forall x: not grandFatherOf(x,x)")

Only labels are involved!
SLIDE 11

Inference in formal logic

Constraint("forall x: forall y: fatherOf(x,y) -> not fatherOf(y,x)")
Constraint("forall x: forall y: grandFatherOf(x,y) -> not grandFatherOf(y,x)")
Constraint("forall x: forall y: fatherOf(x,y) -> not grandFatherOf(x,y)")
Constraint("forall x: forall y: grandFatherOf(x,y) -> not fatherOf(x,y)")
Constraint("forall x: forall y: forall z: fatherOf(x,z) and fatherOf(z,y) -> grandFatherOf(x,y)")
Constraint("forall x: forall y: forall z: (fatherOf(x,y) and not eq(x,z)) -> not fatherOf(z,y)")

SLIDE 12

Inference in formal logic

Constraint("forall x: forall y: forall z: grandFatherOf(x,z) and fatherOf(y,z) -> fatherOf(x,y)")

Inferred true: grandFatherOf("Marco", "Michelangelo"), grandFatherOf("Marco", "Francesco").

SLIDE 13

Full inference on individuals

Evidence comes both from formal logic (on the labels X) and from neural nets (on the patterns x), linked by the pair (X, x) and by features such as (agex, weightx, heightx, agey, weighty, heighty), under consistency constraints. Complexity issues: inference in the environment avoids massive exploration of the Boolean hypercube.

SLIDE 14

BRIDGING LOGIC AND REAL-VALUED CONSTRAINTS

Learning relations and logic.

"There are finer fish in the sea than have ever been caught." (Irish proverb)

SLIDE 15

Two Schools of Thought

(Formal) logic on one side, optimization and statistics on the other. Any break through the wall?

SLIDE 16

Logic by Real Numbers

General form of a constraint:

∀x: Φ(x, f(x)) = 0   (or, with no direct dependence on x, Φ(f(x)) = 0),

whose violation can be measured by a p-norm.

SLIDE 17

Logic by Real Numbers (con't)

Gödel t-norm.
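As a concrete reference for the t-norm translations used in the next slides, here is a minimal Python sketch of the three classical t-norms and the Łukasiewicz residual implication; the function names are mine, not part of LYRICS.

# t-norms map [0,1]-valued truth degrees to the degree of a conjunction
def t_product(a, b):        # product t-norm: a AND b
    return a * b

def t_goedel(a, b):         # Gödel t-norm: a AND b
    return min(a, b)

def t_lukasiewicz(a, b):    # Łukasiewicz t-norm: a AND b
    return max(0.0, a + b - 1.0)

def imp_lukasiewicz(a, b):  # Łukasiewicz residuum: a IMPLIES b
    return min(1.0, 1.0 - a + b)

print(t_goedel(0.7, 0.4), t_lukasiewicz(0.7, 0.4), imp_lukasiewicz(0.7, 0.4))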

slide-18
SLIDE 18

: t f1(x1)(1 − f2(x2)) = 0 e = 0 holds true. Of

− also f2(x2)(1 − f1(x1)) = 0 h f1(x1) + f2(x2) − 2f1(x1)f2(x2) = 0,

f 2

1 (x1) + f 2 2 (x2) − 2f1(x1)f2(x2)

= (f1(x1) − f2(x2))2 = 0

Tricky Issues

1 ⇒ 2 2 ⇒ 1 2 ⇔ 1

f1(x1) = f2(x2)

?

Petr Hájek on Mathematical Fuzzy Logic, Springer 2016

SLIDE 19

Supervised Learning

The discovery of loss functions by t-norms: a supervised pair is the constraint

f(xκ) ⇔ yκ,  κ = 1, . . . , ℓ,

of general form Φ(x, f(x)) = 0. With Łukasiewicz logic,

yκ ⇒ f(xκ): min{1 − yκ + f(xκ), 1}
f(xκ) ⇒ yκ: min{1 − f(xκ) + yκ, 1}

and the conjunction (f(xκ) ⇒ yκ) ∧ (yκ ⇒ f(xκ)) becomes

max{min{1 − f(xκ) + yκ, 1} + min{1 − yκ + f(xκ), 1} − 1, 0} = 1 − |yκ − f(xκ)|.
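The derivation says that, under Łukasiewicz semantics, supervising f(xκ) ⇔ yκ amounts to an L1-style loss. A minimal NumPy sketch (names mine) verifying the identity numerically:

import numpy as np

def luk_and(a, b):           # Łukasiewicz strong conjunction
    return np.maximum(a + b - 1.0, 0.0)

def luk_imp(a, b):           # Łukasiewicz implication
    return np.minimum(1.0 - a + b, 1.0)

f = np.random.rand(1000)     # predictions in [0, 1]
y = np.random.rand(1000)     # targets in [0, 1]

biimp = luk_and(luk_imp(y, f), luk_imp(f, y))
assert np.allclose(biimp, 1.0 - np.abs(y - f))  # loss = 1 - truth degree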

SLIDE 20

Unsupervised Learning

Two groups with inclusive or exclusive properties, where all data belong to a certain domain:

∀x (A(x) ∨ B(x)) ∧ D(x)   (inclusive)
∀x (A(x) ⊕ B(x)) ∧ D(x)   (exclusive)

SLIDE 21

REPRESENTATIONAL ISSUES

We use the Lagrangian optimization framework to find "the simplest solution" compatible with the constraints.
SLIDE 22

A New Communication Protocol

Data + constraints: from constraints to loss functions. The constraint

∀x: Φ(x, f(x)) = 0

is enforced on the unsupervised sample U via the penalty

Σκ∈U φ²(xκ, f(xκ)).
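For example, the disjunctive property of the previous slide can be compiled into exactly such a penalty; a minimal NumPy sketch with toy sigmoidal predicates (all names and models are illustrative assumptions):

import numpy as np

rng = np.random.default_rng(0)
X_u = rng.normal(size=(100, 2))                 # unsupervised sample U

def f_A(x): return 1 / (1 + np.exp(-x[:, 0]))   # toy predicate A
def f_B(x): return 1 / (1 + np.exp(-x[:, 1]))   # toy predicate B

# forall x: A(x) or B(x), translated (product t-norm) into the residual
# phi(x, f(x)) = (1 - f_A(x)) * (1 - f_B(x)), which must vanish.
phi = (1 - f_A(X_u)) * (1 - f_B(X_u))

loss = np.sum(phi ** 2)                         # the penalty of the slide
print(loss)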

SLIDE 23

A New Communication Protocol

  • Supervised
  • Unsupervised
  • Semi-supervised

All of these are instances of constraints on the tasks f over the perceptual space, expressing cognitive laws:

∀x ∈ Xi ⊆ X: φi(x, f(x)) = 0.

The learning problem is stated as data + constraints; conversely, one can also consider the learning of constraints.

SLIDE 24

The New Role of Learning Data

Cognitive laws over the perceptual space become penalty functions on the tasks. For instance,

hair(x) ⇒ mammal(x)
mammal(x) ∧ hoofs(x) ⇒ ungulate(x)
ungulate(x) ∧ white(x) ∧ blackstripes(x) ⇒ zebra(x)

translate (product t-norm) into

fhair(x)(1 − fmammal(x)) = 0
fmammal(x)fhoofs(x)(1 − fungulate(x)) = 0
fungulate(x)fwhite(x)fblackstripes(x)(1 − fzebra(x)) = 0.

SLIDE 25

The Marriage of Parsimony Principle and Constraints

Constraints turn out to be loss functions: keep

fhair(x)(1 − fmammal(x)) = 0
fmammal(x)fhoofs(x)(1 − fungulate(x)) = 0
fungulate(x)fwhite(x)fblackstripes(x)(1 − fzebra(x)) = 0

as small as possible while, by the Parsimony Principle, minimizing the norm ∥f∥P.
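A minimal sketch of the resulting objective for the zebra rules, with toy sigmoidal predicates and weight decay standing in for ∥f∥P (all model details are illustrative assumptions):

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))        # toy perceptual space
W = 0.1 * rng.normal(size=(8, 7))    # one linear score per predicate

# Predicate outputs in [0,1]: columns are hair, mammal, hoofs,
# ungulate, white, blackstripes, zebra.
p = 1 / (1 + np.exp(-X @ W))
hair, mammal, hoofs, ungulate, white, stripes, zebra = p.T

# Product t-norm penalties of the three rules (each should vanish):
pen = (hair * (1 - mammal)) ** 2 \
    + (mammal * hoofs * (1 - ungulate)) ** 2 \
    + (ungulate * white * stripes * (1 - zebra)) ** 2

# Parsimony term ||f||_P approximated here by weight decay on W.
loss = pen.mean() + 1e-3 * np.sum(W ** 2)
print(loss)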

SLIDE 26

Kernel Machines

How to represent the tasks? Primal space vs. dual space.

SLIDE 27

Semi-norm in Sobolev Spaces

Under proper boundary conditions ...

SLIDE 28

Parsimony Principle

A solution is admissible w.r.t. the collection of constraints either strictly (hard) or partially (soft). The check of a "new" constraint amounts to inference in the environment!

SLIDE 29

Inference

Check of a new constraint: does the given collection entail it, C ⊨ φ? Facing the intractability coming from formal logic.

SLIDE 30

Representer Theorem (single constraint)

For a single constraint ψ̃(x, f(x)) = 0, the Euler-Lagrange equation

L f⋆ + (1/µ) p ∇f ψ̃ = 0

is solved by the convolution

f⋆ = g ∗ ωψ̃,  with  ωψ̃(x) = −(1/µ) p(x) ∇f ψ̃(x, f⋆(x)),

where g is the Green function of L and ωψ̃ is the constraint reaction. In the Fourier domain,

f̂⋆(ξ) = ĝ(ξ) · ω̂ψ̃(ξ).

Gnecco et al. (2015)

SLIDE 31

Representation of the solution (hard constraints)

Lagrangian approach: given the constraints

∀x ∈ Xi ⊆ X: φi(x, f(x)) = 0,  i ∈ ℕm,

consider

L(f) = ∥f∥²P,γ + Σi=1..m ∫X λi(x) · φi(x, f(x)) dx.

The Euler-Lagrange equations read

Lf(x) + Σi=1..m λi(x) · ∇f φi(x, f(x)) = 0,

a Fredholm equation (II kind) whose solution is expressed via the Green function and the reactions of the constraints: the "merging of two ideas ..." that leads to support constraints. Dependence among the constraints is captured by the Jacobian D(φ1, . . . , φm)/D(f1, . . . , fm) = 0.

SLIDE 32

Lagrange Multipliers and Probability Density

Hard constraints vs. soft constraints:

∀x ∈ Xi ⊆ X: φi(x, f(x)) = 0,  i ∈ ℕm.

SLIDE 33

Parsimony and architectural constraints

minimize   (1/2) Σi∈O Σj∈Ho wij² + Σκ=1..ℓ Σj∈H θj |xκj|

subject to  xκi − σ( Σj∈pa(i) wij xκj ) = 0,   i ∈ H ∪ O,  κ = 1, . . . , ℓ
            1 − xκi yκi ≤ 0,   i ∈ O,  κ = 1, . . . , ℓ

L(w, x, α, λ) = (1/2) Σi∈O Σj∈Ho wij²
  + Σκ=1..ℓ Σm ( θm |xκm| [m ∈ H] + λκm ( xκm − σ( Σr∈pa(m) wmr xκr ) ) [m ∈ H ∪ O] )
  + Σκ=1..ℓ Σi∈O ακi ( 1 − xκi yκi )+

SLIDE 34

Gradient descent/ascent

Learning searches for saddle points of the Lagrangian, with the architectural residuals

gκi = xκi − σ( Σj∈pa(i) wij xκj ) = 0

and the updates

wij ← wij − ηw ∂wij L
xκi ← xκi − ηx ∂xκi L
λκi ← λκi + ηλ ∂λκi L

(descent in the weights and activations, ascent in the multipliers). Lagrange multipliers, straw and support neurons! A more biologically plausible solution than Backpropagation: learning (gradient descent) and focus of attention (gradient ascent). Network growing and constraint selection ...
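A minimal NumPy sketch of this descent/ascent scheme on a single neuron, treating the activations as free variables tied to their inputs by multipliers; sizes, rates, and the soft supervision term are illustrative assumptions, and convergence is not tuned.

import numpy as np

rng = np.random.default_rng(0)
x_in = rng.normal(size=(4, 2))         # inputs of 4 toy examples
y = np.array([1.0, 1.0, 0.0, 0.0])     # toy supervisions
w = 0.1 * rng.normal(size=2)
x_out = np.full(4, 0.5)                # free activation variables
lam = np.zeros(4)                      # one multiplier per example
eta_w, eta_x, eta_l = 0.1, 0.1, 0.1
sigma = lambda t: 1 / (1 + np.exp(-t))

for _ in range(1000):
    pre = x_in @ w
    g = x_out - sigma(pre)             # architectural residual g = 0
    # L = 0.5*||x_out - y||^2 + lam . g  (supervision kept soft here)
    dL_dx = (x_out - y) + lam
    dL_dw = -(lam * sigma(pre) * (1 - sigma(pre))) @ x_in
    w -= eta_w * dL_dw                 # descent in the weights
    x_out -= eta_x * dL_dx             # descent in the activations
    lam += eta_l * g                   # ascent in the multipliers

print(np.round(x_out, 2), np.round(g, 3))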

SLIDE 35

LYRICS

SLIDE 36

Semi-supervised Learning

# Definition of the domain of the data points.
Domain(label="Points", data=X)
# Given predicate stating whether two patterns are "close".
Predicate("Close", ("Points", "Points"), function=f_close)
# Approximating the predicate A via a NN.
Predicate("A", ("Points"), function=NN_A)
# The constraint implementing manifold regularization.
Constraint("forall p: forall q: Close(p,q) -> (A(p) <-> A(q))")
# Fit the supervisions.
PointwiseConstraint(A, y_s, X_s)
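Outside of LYRICS, the same constraint corresponds to a simple penalty over close pairs; a hedged NumPy sketch in which the closeness rule, the predicate, and all names are illustrative assumptions:

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))                     # all data points
A = lambda x: 1 / (1 + np.exp(-x @ np.array([1.0, -0.5])))  # toy predicate

# Close(p, q): a given predicate, here a Gaussian similarity threshold
d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
close = np.exp(-d2) > 0.5

# forall p, q: Close(p,q) -> (A(p) <-> A(q)), as a squared penalty
a = A(X)
penalty = (close * (a[:, None] - a[None, :]) ** 2).sum()
print(penalty)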

SLIDE 37

Semi-supervised Learning (con't)

[Figure: panels (a)-(d) showing the effect of the Close predicate]

SLIDE 38

Bridging Perception and Logic

"Knowledge Base":

a1(x) ∧ a2(x) ⇒ a3(x)
a3(x) ⇒ a4(x)
a1(x) ∨ a2(x) ∨ a3(x)

What can I deduce (C ⊨ φ)? How can data help deduction?

SLIDE 39

Checking (logic) constraints

[Figure: samples of Class 1 and Class 2 in the (x1, x2) plane]

SLIDE 40

Checking (logic) constraints

[Figure: samples of Class 3 and Class 4 in the (x1, x2) plane]

SLIDE 41

points only vs. points and "logic rules"

[Figure: f1(x) learned from points only vs. from points and logic rules]

SLIDE 42

points only vs. points and "logic rules"

[Figure: f2(x) learned from points only vs. from points and logic rules]

SLIDE 43

points only vs. points and "logic rules"

[Figure: f3(x) learned from points only vs. from points and logic rules]

SLIDE 44

points only vs. points and "logic rules"

[Figure: f4(x) learned from points only vs. from points and logic rules]

SLIDE 45

Checking Constraints

FOL clause                          Category    Average Truth Value
a1(x) ∧ a2(x) ⇒ a3(x)               KB          98.26% (1.778)
a3(x) ⇒ a4(x)                       KB          98.11% (2.11)
a1(x) ∨ a2(x) ∨ a3(x)               KB          96.2% (3.34)
a1(x) ∧ a2(x) ⇒ a4(x)               LD          96.48% (3.76)
a1(x) ∧ a3(x) ⇒ a2(x)               ENV         91.32% (5.67)
a3(x) ∧ a2(x) ⇒ a1(x)               ENV         91.7% (4.57)
a2(x) ∧ a3(x) ⇒ a4(x)               LD          96.58% (4.13)
a3(x) ⇒ a1(x) ∨ a2(x) ∨ a4(x)       LD          99.7% (0.54)
a1(x) ∧ a4(x)                       ENV         45.26% (5.2)
a2(x) ∨ a3(x)                       ENV         78.26% (6.13)
a1(x) ∨ a2(x) ⇒ a3(x)               ENV         68.28% (5.86)
a1(x) ∧ a2(x) ⇒ ¬a4(x)              ENV         3.51% (3.76)
a1(x) ∧ ¬a2(x) ⇒ a3(x)              ENV         27.74% (18.96)
a2(x) ∧ ¬a3(x) ⇒ a1(x)              ENV         5.71% (5.76)

Search reduced to manifolds instead of the Boolean hypercube!

SLIDE 46

Checking Constraints in the Environment

Knowledge base:

a1(x) ∧ a2(x) ⇒ a3(x)
a3(x) ⇒ a4(x)
a1(x) ∨ a2(x) ∨ a3(x)

Checked clauses:

a1(x) ∧ a3(x) ⇒ a2(x)
a3(x) ∧ a2(x) ⇒ a1(x)

Formally false, but true in this environment!

SLIDE 47

Patterns, labels, and individuals

An individual pairs a label X with a pattern x, e.g. (X, x) = (Giuseppe, (178, 70, 45)). What about learning and inference with individuals?

SLIDE 48

Inference in formal logic

Domain(label="People")
Individual(label="Marco", "People")
Individual(label="Giuseppe", "People")
Individual(label="Michelangelo", "People")
Individual(label="Francesco", "People")
Individual(label="Franco", "People")
Individual(label="Andrea", "People")
Predicate(label="fatherOf", ("People", "People"))
Predicate(label="grandFatherOf", ("People", "People"))
Predicate(label="eq", ("People", "People"), function=eq)
Constraint("fatherOf(Marco, Giuseppe)")
Constraint("fatherOf(Giuseppe, Michelangelo)")
Constraint("fatherOf(Giuseppe, Francesco)")
Constraint("fatherOf(Franco, Andrea)")
Constraint("forall x: not fatherOf(x,x)")
Constraint("forall x: not grandFatherOf(x,x)")

Only labels are involved!
SLIDE 49

Inference in formal logic

Constraint("forall x: forall y: fatherOf(x,y) -> not fatherOf(y,x)")
Constraint("forall x: forall y: grandFatherOf(x,y) -> not grandFatherOf(y,x)")
Constraint("forall x: forall y: fatherOf(x,y) -> not grandFatherOf(x,y)")
Constraint("forall x: forall y: grandFatherOf(x,y) -> not fatherOf(x,y)")
Constraint("forall x: forall y: forall z: fatherOf(x,z) and fatherOf(z,y) -> grandFatherOf(x,y)")
Constraint("forall x: forall y: forall z: (fatherOf(x,y) and not eq(x,z)) -> not fatherOf(z,y)")

SLIDE 50

Inference in formal logic

Constraint("forall x: forall y: forall z: grandFatherOf(x,z) and fatherOf(y,z) -> fatherOf(x,y)")

Inferred true: grandFatherOf("Marco", "Michelangelo"), grandFatherOf("Marco", "Francesco").

SLIDE 51

How does it work?

Every grounded pair of individuals, from (Marco, Giuseppe) to (Marco, Francesco), carries one truth variable per predicate, e.g. wf(Mar, Giu), wgf(Mar, Giu), wf(Mar, Fra), wgf(Mar, Fra) for father and grandfather. The given facts fix

wf(Mar, Giu) = 1, wf(Giu, Mic) = 1, wf(Giu, Fra) = 1, wf(Fra, And) = 1.
SLIDE 52

How does it work? (con't)

Given the facts

wf(Mar, Giu) = 1, wf(Giu, Mic) = 1, wf(Giu, Fra) = 1, wf(Fra, And) = 1,

the grandfather definition ...

Constraint("forall x: forall y: forall z: fatherOf(x,z) and fatherOf(z,y) -> grandFatherOf(x,y)")
Constraint("forall x: forall y: forall z: (fatherOf(x,y) and not eq(x,z)) -> not fatherOf(z,y)")

... is grounded in Łukasiewicz logic, where

T(x, y) = max{0, x + y − 1}   (conjunction)
x ⇒ y: min{1, 1 − x + y}   (implication),

so that the first constraint contributes the term

ΣX,Y,Z min{1 − max{wf(X, Z) + wf(Z, Y) − 1, 0} + wgf(X, Y), 1}.
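A minimal Python sketch of this grounding, assuming truth variables stored in dictionaries (all names are mine, not LYRICS internals):

import itertools

people = ["Mar", "Giu", "Mic", "Fra", "And"]
wf = {p: {q: 0.0 for q in people} for p in people}   # fatherOf degrees
wgf = {p: {q: 0.0 for q in people} for p in people}  # grandFatherOf degrees
for a, b in [("Mar", "Giu"), ("Giu", "Mic"), ("Giu", "Fra"), ("Fra", "And")]:
    wf[a][b] = 1.0                                   # given facts

def t_and(x, y):    # Łukasiewicz conjunction
    return max(0.0, x + y - 1.0)

def implies(x, y):  # Łukasiewicz implication
    return min(1.0, 1.0 - x + y)

# Degree of: forall x,y,z: fatherOf(x,z) and fatherOf(z,y) -> grandFatherOf(x,y)
sat = sum(implies(t_and(wf[x][z], wf[z][y]), wgf[x][y])
          for x, y, z in itertools.product(people, repeat=3))
print(sat)  # maximized over the free wgf entries during inference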

SLIDE 53

Full inference on individuals

The truth variables wf(X, Y), wgf(X, Y) come from formal logic on the labels, while the functions ωf(x, y), ωgf(x, y) come from neural nets over the pattern features (agex, weightx, heightx, agey, weighty, heighty); consistency constraints tie each pair (X, x). Complexity issues: inference in the environment avoids massive exploration of the Boolean hypercube.

SLIDE 54

Poly Check
slide-55
SLIDE 55

ILP 2018

Learning and inference in the environment

left, below

  • inside/contains

Learning and inference in the world of rectangles

(p1, p2) (q1, q2)

SLIDE 56

The "world of rectangles"

Each rectangle is encoded as x ∼ ((p1, p2), (q1, q2)).

Supervision:
∀x, y in S: below(x, y) ⇒ SB(x, y)
∀x, y in S: left(x, y) ⇒ SL(x, y)
∀x, y in S: inside(x, y) ⇒ SI(x, y)

Consistency of the opposite:
∀x, y left(x, y) ⇔ right(y, x)
∀x, y below(x, y) ⇔ above(y, x)
∀x, y inside(x, y) ⇔ contains(y, x)

Asymmetry consistency:
∀x, y left(x, y) ⇔ ¬left(y, x)
∀x, y below(x, y) ⇔ ¬below(y, x)
∀x, y inside(x, y) ⇔ ¬inside(y, x)

Topologic consistency:
∀x, y left(x, y) ⇔ ¬inside(x, y)
∀x, y below(x, y) ⇔ ¬inside(x, y)
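Such relational constraints can be compiled into penalties on the predicted truth degrees; a minimal NumPy sketch for two of them (the random degrees here are placeholders for a network's outputs):

import numpy as np

rng = np.random.default_rng(0)
n = 50                                   # number of rectangles
left = rng.random((n, n))                # predicted degrees left(x, y)
right = rng.random((n, n))               # predicted degrees right(x, y)

# forall x, y: left(x, y) <-> right(y, x), as a squared penalty
pen_opposite = ((left - right.T) ** 2).mean()

# forall x, y: left(x, y) <-> not left(y, x)
pen_asym = ((left - (1.0 - left.T)) ** 2).mean()

print(pen_opposite, pen_asym)  # driven to zero jointly with supervision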

SLIDE 57

Inference in the "world of rectangles"

With 50 rectangles, 15 supervisions, and a 4-20-6 neural net, the checked formulas get these truth degrees:

∀x, y, z: inside(x, y) ∧ right(y, z) ⇒ right(x, z)   0.99
∀x, y: left(x, y) ⇒ above(x, y)                      0.55
∀x: left(x, x)                                       0.02

SLIDE 58

Generating the next char

∀x IsZero(x) ⇒ zero(x)
∀x IsOne(x) ⇒ one(x)
∀x IsTwo(x) ⇒ two(x)
∀x IsZero(x) ⇒ one(next(x)) ∧ two(previous(x))
∀x IsOne(x) ⇒ two(next(x)) ∧ zero(previous(x))
∀x IsTwo(x) ⇒ zero(next(x)) ∧ one(previous(x))
∀x next(previous(x)) = x
∀x previous(next(x)) = x

SLIDE 59

Generating the next char (con't)

Domain("Images", data=X)
Predicate("zero", ("Images"), function=Slice(NN, 0))
Predicate("one", ("Images"), function=Slice(NN, 1))
Predicate("two", ("Images"), function=Slice(NN, 2))
PointwiseConstraint(NN, y, X)
Predicate("eq", ("Images", "Images"), function=eq)
Function("next", ("Images"), function=NN_next)
Function("previous", ("Images"), function=NN_prev)
Constraint("forall x: zero(x) -> one(next(x))")
Constraint("forall x: one(x) -> two(next(x))")
Constraint("forall x: two(x) -> zero(next(x))")
Constraint("forall x: zero(x) -> two(previous(x))")
Constraint("forall x: one(x) -> zero(previous(x))")
Constraint("forall x: two(x) -> one(previous(x))")
Constraint("forall x: eq(previous(next(x)),x)")
Constraint("forall x: eq(next(previous(x)),x)")

SLIDE 60

Generating the next char ... (con't)

[Figure: digits generated by the learned next and previous functions]

Notice that this is NOT based on GAN!

SLIDE 61

Reconstruction of overwritten chars (MNIST)

Recognize the foreground and background digits. I was told that the foreground char is less than or equal to the background char.

SLIDE 62

Conclusions

  • A framework for computational laws of nature
  • Probability distributions and Lagrange multipliers, biological plausibility and focus of attention
  • Bridging symbols and sub-symbols (logic representations & learning)
  • Inference in the environment, full inference (searching in manifolds instead of the Boolean hypercube)
  • Time and developmental issues (Piaget foundation of Developmental Psychology)

SLIDE 63

OPEN ISSUES

  • Learning loss functions by generators
  • Learning of constraints
  • Interactive environments
  • Stage-based processing
SLIDE 64

LYRICS: Learning Yourself Reasoning and Inference with COnstraints

A development environment on top of TensorFlow: https://github.com/GiuseppeMarra/lyrics

SLIDE 65

Acknowledgments

Stefano Melacci, Marco Maggini, Claudio Saccà, Francesco Giannini, Giuseppe Marra (UNISI); Giorgio Gnecco (IMT Lucca); Marcello Sanguineti (Genova); Michelangelo Diligenti (UNISI and Google Zurich); Luciano Serafini (FBK, Trento); Artur Garcez (City University, London); Michael Spranger (Sony); Luis Lamb, Tran Son (Institute of Informatics, UFRGS); Andrea Passerini (Univ. of Trento)

Tutorials and international schools: tutorials at IJCNN 2018 and IJCAI 2018; international schools ACDL 2018 (Siena) and DeepLearn 2018 (Genova).

Publications:
Diligenti et al., Semantic-based regularization, AIJ 2017
Gnecco et al., Foundations of support constraint machines, Neural Computation 2015

SLIDE 66

Machine Learning: A Constraint-Based Approach
Marco Gori

SLIDE 67

Machine Learning: A Constraint-Based Approach

Marco Gori, Department of Information Engineering and Mathematics, University of Siena, Italy

A focused approach that covers the deep ideas of machine learning through a variety of specific techniques.

KEY FEATURES

  • It is an introductory book for all readers who love in-depth explanations of fundamental concepts.
  • It is intended to stimulate questions and help a gradual conquering of basic methods, more than offering "recipes for cooking."
  • It proposes the adoption of the notion of constraint as a truly unified treatment of nowadays most common machine learning approaches, while combining the strength of logic formalisms dominating in the AI community.
  • It contains a lot of exercises along with the answers, according to a slight modification of Donald Knuth's difficulty ranking.
  • It comes with a companion Web site to assist more on practical issues.

QUOTES

"A fairly comprehensive and original book on machine learning, including deep learning, written from a constraint-based perspective where Marco Gori shares his passion for the topic with his reader. The book comes also with a set of useful problems, exercises, solutions, as well as a companion web site." Pierre Baldi, University of California Irvine

"This very interesting book brings a fresh look at machine learning and deep learning from the broad point of view in which learning corresponds to satisfying constraints, encompassing the perceptual as well as the symbolic, soft as well as hard constraints." Yoshua Bengio, Université de Montréal

"A real tour-de-force across the landscape of a field -- machine learning -- which is developing very rapidly and is transforming a large swath of today's science and engineering of intelligence." Tomaso Poggio, MIT

ISBN: 978-0-08-100659-7; PUB DATE: November 2017; LIST PRICE: £59.99/€70.95/$99.95; FORMAT: Paperback; PAGES: c. 580

AUDIENCE: Upper level undergraduate and graduate students taking a machine learning course in computer science departments, and professionals involved in relevant areas of artificial intelligence.

SLIDE 68

From parsimonious inference to induction

Learning and the active role of inference: evaluate ψ(x, f⋆(x)) on the learned f⋆(x) and induce new constraints by MMI clustering, maximizing the sensitivity. A cyclic process: learning from, and of, constraints!