Computational Optimization: Mathematical Programming Fundamentals


SLIDE 1

Computational Optimization

Mathematical Programming Fundamentals 1/25 (revised)

SLIDE 2

If you don’t know where you are going, you probably won’t get there.

  • from some book I read in eighth grade

If you do get there, you won’t know it.

  • Dr. Bennett’s amendment

Mathematical Programming Theory tells us:

  • How to formulate a model.
  • Strategies for solving the model.
  • How to know when we have found an optimal solution.
  • How hard it is to solve the model.

Let’s start with the basics…
SLIDE 3

Line Segment

Let x ∈ Rn and y ∈ Rn. The points on the line segment joining x and y are { z | z = λx + (1 − λ)y, 0 ≤ λ ≤ 1 }.

(figure: the segment joining points x and y)

SLIDE 4

Convex Sets

A set S is convex if the line segment joining any two points in the set is also in the set; i.e., for any x, y ∈ S, λx + (1 − λ)y ∈ S for all 0 ≤ λ ≤ 1.

(figure: two convex sets and three non-convex sets)

SLIDE 5

Favorite Convex Sets

Circle (disk) with center c and radius r:

{ x | ‖x − c‖ ≤ r }

Linear equalities (a plane), with matrix A ∈ Rmxn, b ∈ Rm, x ∈ Rn:

{ x | Ax = b }

Linear inequalities (a polyhedron):

{ x | Ax ≤ b }
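These sets can be spot-checked numerically. The sketch below (plain Python; the helper name `in_polyhedron` is illustrative, not from the slides) samples points on the segment joining two feasible points of a polyhedron { x | Ax ≤ b } and confirms every sampled convex combination stays feasible:

```python
# Sampled convexity check for the polyhedron {x | Ax <= b}: every convex
# combination of two feasible points should again be feasible.

def in_polyhedron(A, b, x, tol=1e-9):
    """True if A x <= b holds componentwise (within tol)."""
    return all(sum(a_ij * x_j for a_ij, x_j in zip(row, x)) <= b_i + tol
               for row, b_i in zip(A, b))

# Unit square 0 <= x1 <= 1, 0 <= x2 <= 1, written as A x <= b.
A = [[1, 0], [-1, 0], [0, 1], [0, -1]]
b = [1, 0, 1, 0]

x, y = [0.2, 0.9], [1.0, 0.1]          # two feasible points
assert in_polyhedron(A, b, x) and in_polyhedron(A, b, y)

# Each sampled z = lam*x + (1-lam)*y on the segment stays feasible.
for k in range(11):
    lam = k / 10
    z = [lam * xi + (1 - lam) * yi for xi, yi in zip(x, y)]
    assert in_polyhedron(A, b, z)
```

This is a sampled illustration of the definition, not a proof of convexity.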

SLIDE 6

Convex Sets

Is the intersection of two convex sets convex? Yes. Is the union of two convex sets convex? No: for two disjoint intervals, the segment joining a point in each leaves the union.

SLIDE 7

Convex Functions

A function f is (strictly) convex on a convex set S if and only if for any x, y ∈ S, f(λx + (1 − λ)y) ≤ λf(x) + (1 − λ)f(y) for all 0 ≤ λ ≤ 1 (with strict inequality, for strict convexity, whenever x ≠ y and 0 < λ < 1).

(figure: the chord from (x, f(x)) to (y, f(y)) lies on or above the graph at λx + (1 − λ)y)
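The defining inequality can be spot-checked numerically. A minimal sketch, assuming the check is done by sampling λ on a grid (names like `satisfies_convexity` are illustrative):

```python
# Sampled check of f(lam*x + (1-lam)*y) <= lam*f(x) + (1-lam)*f(y)
# for f(t) = t**2, which is convex.

def f(t):
    return t * t

def satisfies_convexity(f, x, y, samples=101, tol=1e-12):
    """Test the convexity inequality on a grid of lambda values."""
    for k in range(samples):
        lam = k / (samples - 1)
        lhs = f(lam * x + (1 - lam) * y)
        rhs = lam * f(x) + (1 - lam) * f(y)
        if lhs > rhs + tol:
            return False
    return True

assert satisfies_convexity(f, -3.0, 2.0)                      # x**2 passes
assert not satisfies_convexity(lambda t: -t * t, -3.0, 2.0)   # -x**2 fails
```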

SLIDE 8

Concave Functions

A function f is (strictly) concave on a convex set S if and only if −f is (strictly) convex on S.

(figure: f and −f)
SLIDE 9

(Strictly) Convex, Concave, or none of the above?

(answers for the five pictured functions: none of the above; concave; convex; concave; strictly convex)

SLIDE 10

Favorite Convex Functions

Linear functions:

f(x) = w'x = Σi wi xi, where x ∈ Rn; e.g., f(x1, x2) = 2x1 + x2.

Certain quadratic functions, depending on the choice of Q (the Hessian matrix):

f(x) = x'Qx + w'x + c; e.g., f(x1, x2) = 2x1^2 + x2^2.
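Whether a quadratic is convex can be read off Q: f is convex iff Q (equivalently, the Hessian) is positive semidefinite. A small sketch, assuming a symmetric 2×2 Q, using the trace/determinant criterion for positive semidefiniteness (the helper name is illustrative):

```python
# A symmetric 2x2 matrix is positive semidefinite iff its trace and
# determinant are both nonnegative (both eigenvalues >= 0).

def is_psd_2x2(q11, q12, q22):
    trace = q11 + q22
    det = q11 * q22 - q12 * q12
    return trace >= 0 and det >= 0

# f(x1,x2) = 2*x1**2 + x2**2  ->  Q = [[2, 0], [0, 1]]: convex
assert is_psd_2x2(2, 0, 1)
# f(x1,x2) = x1**2 - x2**2    ->  Q = [[1, 0], [0, -1]]: not convex
assert not is_psd_2x2(1, 0, -1)
```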

SLIDE 11

Convexity of the objective function affects the optimization algorithm.
SLIDE 12

Convexity of the constraint set affects the optimization algorithm.

min f(x) subject to x ∈ S

(figure: the direction of steepest descent shown for a convex S and a non-convex S)

SLIDE 13

Convex Program

min f(x) subject to x ∈ S, where f and S are convex. Convexity makes optimization nice. Many practical problems are convex problems. Convex programs are also used as subproblems for nonconvex programs.

SLIDE 14

Theorem: Global solution of a convex program

If x* is a local minimizer of a convex programming problem, then x* is also a global minimizer. Furthermore, if the objective is strictly convex, then x* is the unique global minimizer.

Proof: by contradiction.

(figure: a point y with f(y) < f(x*))

SLIDE 15

Proof by contradiction

Suppose x* is a local but not global minimizer, i.e., there exists y s.t. f(y) < f(x*). Then for all 0 < ε < 1, f(εx* + (1 − ε)y) ≤ εf(x*) + (1 − ε)f(y) < εf(x*) + (1 − ε)f(x*) = f(x*). As ε → 1, these points come arbitrarily close to x* while keeping a smaller objective value, so x* is not a local min. Contradiction. You try the uniqueness argument in the strict case.

SLIDE 16

Problems with nonconvex objective

min f(x) subject to x ∈ [a, b]

If f is strictly convex, the problem has a unique global minimum x*.

If f is not convex, the problem can have two local minima x* and x′.

(figures: a strictly convex f on [a, b] with minimizer x*; a nonconvex f on [a, b] with local minima x* and x′)

SLIDE 17

Problems with nonconvex set

min f(x) subject to x ∈ [a, b] ∪ [c, d]

(figure: local minima x* and x′ arising from the two disjoint intervals)

SLIDE 18

Multivariate Calculus

For x ∈ Rn, f(x) = f(x1, x2, …, xn).

The gradient of f:

∇f(x) = ( ∂f(x)/∂x1, ∂f(x)/∂x2, …, ∂f(x)/∂xn )′

The Hessian of f:

∇²f(x) = the n×n matrix whose (i, j) entry is ∂²f(x)/∂xi∂xj.
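The gradient definition can be verified with central finite differences. A sketch with a hypothetical helper `num_gradient` and an illustrative test function (neither is from the slides):

```python
# Central-difference approximation of each partial derivative
# del f / del x_i, compared against a hand-computed gradient.
import math

def num_gradient(f, x, h=1e-6):
    g = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        g.append((f(xp) - f(xm)) / (2 * h))
    return g

# Example: f(x1,x2) = x1**2 + 3*x1*x2, exact gradient [2*x1 + 3*x2, 3*x1].
f = lambda x: x[0] ** 2 + 3 * x[0] * x[1]
x = [1.0, 2.0]
approx = num_gradient(f, x)
exact = [2 * x[0] + 3 * x[1], 3 * x[0]]          # [8.0, 3.0]
assert all(math.isclose(a, e, abs_tol=1e-4) for a, e in zip(approx, exact))
```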

SLIDE 19

For example

Example: a function f(x1, x2) built from polynomial terms and e^{x1}; compute ∇f(x) and ∇²f(x) symbolically, then evaluate both at x = [0, 1]′.

SLIDE 20

Quadratic Functions

Form (x ∈ Rn, Q ∈ Rnxn, b ∈ Rn):

f(x) = (1/2) x'Qx − b'x = (1/2) Σi Σj Qij xi xj − Σj bj xj

Component-wise (assuming Q symmetric),

∂f(x)/∂xk = Qkk xk + (1/2) Σ_{i≠k} Qik xi + (1/2) Σ_{j≠k} Qkj xj − bk = Σj Qkj xj − bk

Gradient and Hessian:

∇f(x) = Qx − b,  ∇²f(x) = Q
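The identity ∇f(x) = Qx − b can be spot-checked with finite differences on a small symmetric example (Q, b, and the helper names below are illustrative assumptions):

```python
# Verify the gradient of f(x) = 0.5*x'Qx - b'x for a symmetric 2x2 Q
# against central finite differences.
import math

Q = [[4.0, 1.0], [1.0, 3.0]]   # symmetric
b = [1.0, 2.0]

def f(x):
    quad = sum(Q[i][j] * x[i] * x[j] for i in range(2) for j in range(2))
    return 0.5 * quad - sum(b[j] * x[j] for j in range(2))

def grad_exact(x):
    """Qx - b, the gradient derived on the slide."""
    return [sum(Q[k][j] * x[j] for j in range(2)) - b[k] for k in range(2)]

x, h = [0.5, -1.0], 1e-6
for k in range(2):
    xp, xm = list(x), list(x)
    xp[k] += h
    xm[k] -= h
    fd = (f(xp) - f(xm)) / (2 * h)
    assert math.isclose(fd, grad_exact(x)[k], abs_tol=1e-4)
```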

SLIDE 21

Taylor Series Expansion about x* - 1D Case

Let x = x* + p. Then

f(x) = f(x* + p) = f(x*) + p f'(x*) + (1/2) p^2 f''(x*) + (1/3!) p^3 f'''(x*) + … + (1/n!) p^n f^(n)(x*) + …

Equivalently,

f(x) = f(x*) + (x − x*) f'(x*) + (1/2)(x − x*)^2 f''(x*) + (1/3!)(x − x*)^3 f'''(x*) + … + (1/n!)(x − x*)^n f^(n)(x*) + …

SLIDE 22

Taylor Series Example

Let f(x) = exp(-x), compute Taylor Series Expansion about x*=0

Substituting x* = 0 into

f(x) = f(x*) + (x − x*) f'(x*) + (1/2)(x − x*)^2 f''(x*) + … + (1/n!)(x − x*)^n f^(n)(x*) + …

with f^(n)(x) = (−1)^n e^{−x}, so f^(n)(0) = (−1)^n, gives

e^{−x} = 1 − x + x^2/2 − x^3/3! + … + (−1)^n x^n/n! + …

SLIDE 23

First Order Taylor Series Approximation

Let x = x* + p. A first order expansion says that a linear approximation of a function works well locally:

f(x) = f(x* + p) = f(x*) + p'∇f(x*) + ‖p‖ α(x*, p), where lim_{p→0} α(x*, p) = 0

Dropping the remainder term gives the approximation

f(x) ≈ f(x* + p) = f(x*) + p'∇f(x*), i.e., f(x) ≈ f(x*) + (x − x*)'∇f(x*)

(figure: f and its linear approximation, tangent at x*)

SLIDE 24

Second Order Taylor Series Approximation

Let x = x* + p. A quadratic approximation of a function works even better locally:

f(x) = f(x* + p) = f(x*) + p'∇f(x*) + (1/2) p'∇²f(x*)p + ‖p‖² α(x*, p), where lim_{p→0} α(x*, p) = 0

Dropping the remainder term gives

f(x) ≈ f(x*) + (x − x*)'∇f(x*) + (1/2)(x − x*)'∇²f(x*)(x − x*)

(figure: f and its quadratic approximation at x*)
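The claim that the quadratic model works better locally can be checked on a simple function. A sketch using f(t) = e^t about x* = 0, where every derivative equals 1 (an illustrative choice, not from the slides):

```python
# Compare first- and second-order Taylor approximations of exp(t) at 0:
# the quadratic model tracks f more closely for small steps p.
import math

f = lambda t: math.exp(t)
x_star = 0.0
fx, d1, d2 = 1.0, 1.0, 1.0       # f(0), f'(0), f''(0)

for p in (0.5, 0.1, 0.01):
    first = fx + p * d1                       # linear model
    second = fx + p * d1 + 0.5 * p * p * d2   # quadratic model
    e1 = abs(f(x_star + p) - first)
    e2 = abs(f(x_star + p) - second)
    assert e2 < e1                            # quadratic is more accurate
```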

SLIDE 25

Theorem 2.1 –Taylor’s Theorem version 2

Suppose f is continuously differentiable. Then

f(x + p) = f(x) + ∇f(x + tp)'p for some t ∈ [0, 1].

If f is twice continuously differentiable, then

f(x + p) = f(x) + ∇f(x)'p + (1/2) p'∇²f(x + tp)p for some t ∈ [0, 1].

The first form is also called the Mean Value Theorem.

SLIDE 26

Taylor Series Approximation Exercise

Consider the function

f(x1, x2) = x1^3 + 5x1^2 x2 + 7x1 + 2x2^2

and x* = [−2, 3]′.

  • Compute the gradient and Hessian.
  • What is the first order TSA about x*?
  • What is the second order TSA about x*?
  • Evaluate both TSAs at y = [−1.9, 3.2]′ and compare with f(y).

SLIDE 27

Exercise

f(x1, x2) = x1^3 + 5x1^2 x2 + 7x1 + 2x2^2 (function)

∇f(x) = ___, ∇f(x*) = ___ (gradient)

∇²f(x) = ___, ∇²f(x*) = ___ (Hessian)

First order TSA: g(x) = f(x*) + (x − x*)'∇f(x*) = ___

Second order TSA: h(x) = f(x*) + (x − x*)'∇f(x*) + (1/2)(x − x*)'∇²f(x*)(x − x*) = ___

|f(y) − g(y)| = ___, |f(y) − h(y)| = ___

SLIDE 28

Exercise

f(x1, x2) = x1^3 + 5x1^2 x2 + 7x1 + 2x2^2 (function), so f(x*) = 56.

∇f(x) = [3x1^2 + 10x1x2 + 7, 5x1^2 + 4x2]′ (gradient), so ∇f(x*) = [−41, 32]′.

∇²f(x) = [ 6x1 + 10x2, 10x1 ; 10x1, 4 ] (Hessian), so ∇²f(x*) = [ 18, −20 ; −20, 4 ].

SLIDE 29

Exercise

First order TSA: g(x) = f(x*) + (x − x*)'∇f(x*)

Second order TSA: h(x) = f(x*) + (x − x*)'∇f(x*) + (1/2)(x − x*)'∇²f(x*)(x − x*)

At y = [−1.9, 3.2]′: f(y) = 58.081, g(y) = 58.3, h(y) = 58.07, so

|f(y) − g(y)| = 0.219, |f(y) − h(y)| = 0.011

The second order TSA is noticeably closer to f(y) than the first order TSA.
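The exercise numbers can be reproduced mechanically. The sketch below assumes the function as reconstructed above, f(x1, x2) = x1^3 + 5x1^2 x2 + 7x1 + 2x2^2; the helper names `grad` and `hess` are illustrative:

```python
# Compute both Taylor approximations of the exercise function about
# x* = (-2, 3) and evaluate them at y = (-1.9, 3.2).

def f(x1, x2):
    return x1 ** 3 + 5 * x1 ** 2 * x2 + 7 * x1 + 2 * x2 ** 2

def grad(x1, x2):
    return (3 * x1 ** 2 + 10 * x1 * x2 + 7, 5 * x1 ** 2 + 4 * x2)

def hess(x1, x2):
    return ((6 * x1 + 10 * x2, 10 * x1), (10 * x1, 4))

xs, y = (-2.0, 3.0), (-1.9, 3.2)
p = (y[0] - xs[0], y[1] - xs[1])                 # step p = y - x*
g0, H = grad(*xs), hess(*xs)

first = f(*xs) + g0[0] * p[0] + g0[1] * p[1]     # first order TSA g(y)
quad = H[0][0] * p[0] ** 2 + 2 * H[0][1] * p[0] * p[1] + H[1][1] * p[1] ** 2
second = first + 0.5 * quad                      # second order TSA h(y)

assert f(*xs) == 56                              # value at x*
assert abs(f(*y) - second) < abs(f(*y) - first)  # quadratic TSA is closer
```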

SLIDE 30

General Optimization algorithm

Specify some initial guess x0. For k = 0, 1, …:

  • If xk is optimal, then stop.
  • Determine a descent direction pk.
  • Determine an improved estimate of the solution: xk+1 = xk + λk pk.

The last step is a one-dimensional search problem called a line search.
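The scheme above can be sketched as steepest descent with a crude backtracking line search; the stopping test, step rule, and test function below are illustrative assumptions, not the course's algorithm:

```python
# A minimal instance of the general algorithm: p_k = -grad f(x_k),
# lambda_k found by halving until f decreases, stop on small gradient.
# Test problem: f(x1,x2) = (x1-1)^2 + 4*(x2+2)^2, minimized at (1, -2).

def f(x):
    return (x[0] - 1) ** 2 + 4 * (x[1] + 2) ** 2

def grad(x):
    return [2 * (x[0] - 1), 8 * (x[1] + 2)]

x = [5.0, 5.0]                                  # initial guess x0
for k in range(500):
    g = grad(x)
    if sum(gi * gi for gi in g) ** 0.5 < 1e-6:  # "if xk is optimal then stop"
        break
    p = [-gi for gi in g]                       # descent direction pk
    lam = 1.0
    while f([xi + lam * pi for xi, pi in zip(x, p)]) >= f(x):
        lam *= 0.5                              # backtrack until f decreases
    x = [xi + lam * pi for xi, pi in zip(x, p)] # x_{k+1} = x_k + lam_k * p_k

assert f(x) < 1e-8
```

Production codes use a sufficient-decrease (Armijo) condition rather than plain decrease, but the skeleton is the same.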

SLIDE 31

Descent Directions

If the directional derivative of f at x along d is negative, i.e., ∇f(x)'d < 0, then a line search along d will lead to a decrease in the function.

(figure: at a point with ∇f(x) = [8, 2]′, the direction d = [0, −1]′ satisfies ∇f(x)'d = −2 < 0; −∇f(x) is also shown)

SLIDE 32

Descent directions create decrease

Proof

Let ∇f(x)'d < 0. Then there exists λ̄ > 0 such that f(x + λd) < f(x) for 0 < λ ≤ λ̄.

f(x + λd) = f(x) + λ∇f(x)'d + ‖λd‖ α(x, λd)

⇒ (f(x + λd) − f(x)) / λ = ∇f(x)'d + ‖d‖ α(x, λd)

⇒ f(x + λd) − f(x) < 0 for λ sufficiently small, since ∇f(x)'d < 0 and α(x, λd) → 0.

SLIDE 33

Negative Gradient

An important fact to know is that the negative gradient always points downhill.

Proof: Let d = −∇f(x), with ∇f(x) ≠ 0, so that ∇f(x)'d = −‖∇f(x)‖² < 0. Then there exists λ̄ > 0 such that f(x + λd) < f(x) for 0 < λ ≤ λ̄:

f(x + λd) = f(x) + λ∇f(x)'d + ‖λd‖ α(x, λd)

⇒ (f(x + λd) − f(x)) / λ = ∇f(x)'d + ‖d‖ α(x, λd)

⇒ f(x + λd) − f(x) < 0 for λ sufficiently small, since ∇f(x)'d < 0 and α(x, λd) → 0.

SLIDE 35

Notes on negative gradient

If the gradient is nonzero, then the negative gradient defines a descent direction:

d = −∇f(x) ⇒ ∇f(x)'d = −∇f(x)'∇f(x) = −‖∇f(x)‖² < 0 if ∇f(x) ≠ 0.

SLIDE 36

Directional Derivative

The directional derivative of f at x in direction d always exists when the function is convex:

f′(x; d) = lim_{λ→0⁺} (f(x + λd) − f(x)) / λ

When f is differentiable, f′(x; d) = ∇f(x)'d.
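The limit definition can be compared against ∇f(x)'d numerically for a differentiable f; the one-sided finite-difference sketch below uses illustrative names:

```python
# Check that the one-sided difference quotient defining f'(x; d)
# agrees with grad f(x)' d for a smooth f.
import math

f = lambda x: x[0] ** 2 + math.sin(x[1])
grad_f = lambda x: [2 * x[0], math.cos(x[1])]

x, d = [1.0, 0.5], [3.0, -2.0]
lam = 1e-7
one_sided = (f([x[0] + lam * d[0], x[1] + lam * d[1]]) - f(x)) / lam
exact = sum(g * di for g, di in zip(grad_f(x), d))   # grad f(x)' d
assert math.isclose(one_sided, exact, abs_tol=1e-4)
```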

SLIDE 37

Assignment

Read chapter 3 in NW (Nocedal and Wright, Numerical Optimization).