If the parameters denoted pure translation, then one way to classify y is to determine the index i for which the following quantity is maximized:
$$\int s_i(t)\, y(t)\, dt$$
where t = (x,y) denotes spatial coordinates. This is the classical matched filter.
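Consider the following compressive acquisition model:
$$y = \Phi x + \eta, \quad x \in \mathbb{R}^n,\ y \in \mathbb{R}^m,\ \Phi \in \mathbb{R}^{m \times n},\ m \ll n,\ \eta \sim \mathcal{N}(0, \sigma^2 I)$$
The MLC for this case is now:
$$\hat{j} = \arg\max_{i \in \{1,2,\dots,P\}} p(y \mid s_i), \qquad p(y \mid s_i) = \frac{1}{(2\pi\sigma^2)^{m/2}} \exp\left(-\frac{\|y - \Phi s_i\|_2^2}{2\sigma^2}\right)$$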
But the compressive measurement y could have been acquired from an image taken in a different pose than any of the images in the database.
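The GMLC is now given as:
$$\tilde{\theta}_i = \arg\max_{\theta}\, p(y \mid s_i, \theta), \qquad \hat{j} = \arg\max_{i \in \{1,2,\dots,P\}} p(y \mid s_i, \tilde{\theta}_i)$$
$$p(y \mid s_i, \tilde{\theta}_i) = \frac{1}{(2\pi\sigma^2)^{m/2}} \exp\left(-\frac{\|y - \Phi s_i(\tilde{\theta}_i)\|_2^2}{2\sigma^2}\right)$$
where $s_i(\theta)$ denotes the template of class i under the transformation parameters θ.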
This has an interesting name – the smashed filter (derived from the name “matched filter”), taking into account the compressive nature of the measurements.
These are all essentially nearest neighbour
classifiers with Euclidean distance.
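The nearest-neighbour view of these classifiers can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the implementation from Davenport et al: the function names, the Gaussian sensing matrix, the random templates, and the use of circular shifts (`np.roll`) as the transformation family are all assumptions made for the example.

```python
import numpy as np

def mlc_classify(y, Phi, templates):
    """Nearest neighbour in the compressed domain: argmin_i ||y - Phi s_i||_2."""
    dists = [np.linalg.norm(y - Phi @ s) for s in templates]
    return int(np.argmin(dists))

def smashed_filter_classify(y, Phi, templates, shifts):
    """GMLC sketch: also search over a transformation parameter.

    Assumption for illustration: the transformation is a circular shift.
    """
    best_dist, best_class, best_shift = np.inf, -1, 0
    for i, s in enumerate(templates):
        for t in shifts:
            d = np.linalg.norm(y - Phi @ np.roll(s, t))
            if d < best_dist:
                best_dist, best_class, best_shift = d, i, t
    return best_class, best_shift

# Synthetic data (hypothetical): P random templates of length n, m measurements.
rng = np.random.default_rng(0)
n, m, P = 64, 20, 3
templates = rng.standard_normal((P, n))
Phi = rng.standard_normal((m, n)) / np.sqrt(m)

# Measure class 1 in an unseen "pose" (shifted by 5 samples) with mild noise.
y = Phi @ np.roll(templates[1], 5) + 0.01 * rng.standard_normal(m)
label, shift = smashed_filter_classify(y, Phi, templates, range(n))
```

The exhaustive search over shifts is the simplest possible choice; any other parameter search could be substituted without changing the nearest-neighbour structure.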
What is special about these classifiers? Let us assume that the sensing matrix Φ (and hence the matrix ΦU for any orthonormal U) obeys the restricted isometry property (RIP).
Then for k-sparse signals s1 and s2 and restricted isometry constant (RIC) δ2k, we have:
$$(1-\delta_{2k})\,\|s_1 - s_2\|_2^2 \;\le\; \|\Phi(s_1 - s_2)\|_2^2 \;\le\; (1+\delta_{2k})\,\|s_1 - s_2\|_2^2$$
Image source: Davenport et al, “The smashed filter for compressive classification and target recognition”. Image size: 128 x 128. Compressive measurements were taken by a Rice single pixel camera. Though the sensing matrix of the camera does not obey RIP since it contains values that are 0 or 1, it can effectively be converted into a matrix with entries that are either -1 or +1. This is done by taking two measurements of the same scene, where the second measurement uses the sensing matrix obtained by flipping the 0 and 1 values of the first. Subtracting the second measurement from the first is equivalent to measuring with the ±1 matrix.
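The 0/1 to ±1 conversion can be checked numerically. The sketch below assumes a random binary pattern and a random scene vector; `bipolar_measurement` is a hypothetical helper name, not part of any camera API.

```python
import numpy as np

def bipolar_measurement(Phi01, x):
    """Combine two 0/1 measurements into one equivalent +/-1 measurement."""
    y_first = Phi01 @ x           # measurement with the 0/1 pattern
    y_second = (1 - Phi01) @ x    # measurement with the 0 and 1 values flipped
    return y_first - y_second     # equals (2 * Phi01 - 1) @ x

rng = np.random.default_rng(0)
Phi01 = rng.integers(0, 2, size=(6, 16))   # hypothetical binary sensing patterns
x = rng.random(16)                         # hypothetical scene
y_pm = bipolar_measurement(Phi01, x)
```

The cost of this trick is twice as many physical measurements for the same number of effective ±1 measurements.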
The procedure is summarized below:
Randomly pick an m by n matrix Φ.
Repeat until convergence { $\Phi \leftarrow \Phi - \alpha \nabla_{\Phi}\, \mu(\Phi\Psi)$ }
Pick the step-size adaptively so that you actually descend on the mutual coherence.
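The mutual coherence is:
$$\mu(\Phi\Psi) = \max_{i \ne j} \frac{|(\Phi\Psi)_i^T (\Phi\Psi)_j|}{\|(\Phi\Psi)_i\|_2\, \|(\Phi\Psi)_j\|_2}$$
where $(\Phi\Psi)_i$ denotes the i-th column of ΦΨ.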
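As a minimal sketch, the mutual coherence of any matrix D (for instance D = ΦΨ) can be computed from its normalized Gram matrix. The function name `mutual_coherence` is an assumption for this example.

```python
import numpy as np

def mutual_coherence(D):
    """Largest absolute normalized dot product between two distinct columns of D."""
    D = np.asarray(D, dtype=float)
    Dn = D / np.linalg.norm(D, axis=0, keepdims=True)  # unit-normalize every column
    G = np.abs(Dn.T @ Dn)                              # absolute Gram matrix
    np.fill_diagonal(G, 0.0)                           # ignore the i == j terms
    return float(G.max())
```

For example, `mutual_coherence(Phi @ Psi)` evaluates the quantity that the descent procedure tries to reduce.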
The main problem is how to find a derivative of the “max” function, which is non-differentiable!
Use the softmax (log-sum-exp) function, which is differentiable:
$$\max\{x_i\}_{i=1}^{n} = \lim_{\beta \to \infty} \frac{1}{\beta} \log \sum_{i=1}^{n} \exp(\beta x_i)$$
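A small sketch of this smooth maximum and its gradient is given below. The symbol β and the function names are assumptions for illustration; subtracting the true maximum before exponentiating is a standard trick to avoid overflow and does not change the value.

```python
import numpy as np

def smooth_max(x, beta):
    """(1/beta) * log(sum_i exp(beta * x_i)), evaluated in an overflow-safe way."""
    x = np.asarray(x, dtype=float)
    x_max = x.max()
    return x_max + np.log(np.sum(np.exp(beta * (x - x_max)))) / beta

def smooth_max_grad(x, beta):
    """Gradient of smooth_max w.r.t. x: exp(beta x_i) / sum_k exp(beta x_k)."""
    x = np.asarray(x, dtype=float)
    w = np.exp(beta * (x - x.max()))
    return w / w.sum()

x = np.array([0.2, 0.9, 0.5])
approx = smooth_max(x, beta=1000.0)   # close to max(x) for large beta
grad = smooth_max_grad(x, beta=5.0)   # non-negative weights that sum to one
```

For finite β the smooth maximum overestimates the true maximum by at most log(n)/β, which is why larger β gives a tighter but less smooth surrogate.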
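This method does not directly target the mutual coherence μ but instead considers the Gram matrix $D^T D$, where D = ΦΨ with all columns unit-normalized.
The aim is to design Φ in such a way that the Gram matrix resembles the identity matrix as much as possible, in other words we want:
$$\Psi^T \Phi^T \Phi \Psi \approx I$$
Multiplying by Ψ on the left and by $\Psi^T$ on the right, and using the decomposition $\Psi\Psi^T = V \Lambda V^T$ (SVD):
$$\Psi\Psi^T \Phi^T \Phi \Psi\Psi^T \approx \Psi\Psi^T$$
$$V \Lambda V^T \Phi^T \Phi V \Lambda V^T \approx V \Lambda V^T$$
$$\Lambda V^T \Phi^T \Phi V \Lambda \approx \Lambda$$
Hence we minimize, w.r.t. Γ = ΦV:
$$\|\Lambda - \Lambda \Gamma^T \Gamma \Lambda\|_F^2$$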
Note: Ψ need not be orthonormal and hence $\Psi\Psi^T$ need not be identity.
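$$\text{minimize } \|\Lambda - \Lambda \Gamma^T \Gamma \Lambda\|_F^2 \ \text{ w.r.t. } \Gamma = \Phi V$$
Consider $\Lambda = \mathrm{diag}(\lambda_1, \lambda_2, \dots, \lambda_n)$ and $\Gamma = (\tau_1 \,|\, \tau_2 \,|\, \dots \,|\, \tau_m)^T$. Then
$$\|\Lambda - \Lambda\Gamma^T\Gamma\Lambda\|_F^2 = \Big\|\Lambda - \sum_{i=1}^{m} \nu_i \nu_i^t\Big\|_F^2 = \|E_j - \nu_j \nu_j^t\|_F^2, \qquad E_j = \Lambda - \sum_{i \ne j} \nu_i \nu_i^t$$
where $\nu_i = (\lambda_1 \tau_{i,1}, \lambda_2 \tau_{i,2}, \dots, \lambda_n \tau_{i,n})^t$.
The term $\nu_j \nu_j^t$ is a rank one matrix. We want a rank one matrix that approximates Ej as closely as possible in the Frobenius sense. The solution lies in SVD!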
SVD gives us the best possible rank-r approximation to any matrix (it may or may not be a natural image matrix).
In other words, the solution to the following optimization problem:
$$\min_{\hat{A}} \|A - \hat{A}\|_F^2 \quad \text{where } \mathrm{rank}(\hat{A}) = r,\ r \le \min(m, n)$$
is given using the SVD of A as follows:
$$\hat{A} = \sum_{i=1}^{r} S_{ii}\, u_i v_i^t, \quad \text{where } A = U S V^T$$
Note: We are using the singular vectors corresponding to the r largest singular values. This property of the SVD is called the Eckart-Young Theorem.
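The truncated-SVD construction can be sketched with `np.linalg.svd`. The helper name `best_rank_r` and the random test matrix are assumptions for this example. The squared Frobenius error of the truncation equals the sum of the squared discarded singular values, which gives a convenient sanity check.

```python
import numpy as np

def best_rank_r(A, r):
    """Best rank-r approximation of A in the Frobenius sense (truncated SVD)."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)  # s is sorted in descending order
    return (U[:, :r] * s[:r]) @ Vt[:r, :]             # sum_{i<=r} s_i * u_i * v_i^t

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 4))
A_hat = best_rank_r(A, 2)

# Squared Frobenius error versus the discarded singular values.
err = np.linalg.norm(A - A_hat, "fro") ** 2
discarded = np.sum(np.linalg.svd(A, compute_uv=False)[2:] ** 2)
```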
$$E_j = U S U^T = \sum_{k} S_{kk}\, u_k u_k^t, \qquad \nu_j = \sqrt{S_{11}}\, u_1$$
assuming that S11 is the largest singular value.
Initialize Φ to a random matrix.
By SVD, we have $\Psi\Psi^T = V \Lambda V^T$.
We have Γ = ΦV.
For each j = 1 to m:
- Compute Ej
- Find the largest singular value and corresponding singular vector of Ej
- Use these to update Γ (via τj).
- Compute the optimal Φ = ΓV^T.
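The steps above can be sketched as follows. This is a rough sketch under several assumptions, not the reference implementation of Duarte-Carvajalino and Sapiro: the function names are hypothetical, Ψ is assumed to have full row rank so that every λ is positive, the symmetric matrix Ej is decomposed with `np.linalg.eigh` (with a negative top eigenvalue clipped to zero) in place of a general SVD, the loop over j is repeated for a few sweeps, and column normalization of ΦΨ is ignored.

```python
import numpy as np

def gram_objective(Phi, V, lam):
    """||Lam - Lam Gamma^T Gamma Lam||_F^2 with Gamma = Phi V and Lam = diag(lam)."""
    Nu = (Phi @ V) * lam                       # row i is nu_i = Lam tau_i
    return np.linalg.norm(np.diag(lam) - Nu.T @ Nu, "fro") ** 2

def optimize_sensing_matrix(Psi, Phi0, n_sweeps=3):
    """Row-by-row rank-one updates of Gamma = Phi V, starting from Phi0."""
    lam, V = np.linalg.eigh(Psi @ Psi.T)       # Psi Psi^T = V diag(lam) V^T
    Lam = np.diag(lam)
    Gamma = Phi0 @ V                           # rows of Gamma are the tau_j
    for _ in range(n_sweeps):                  # assumption: a few passes over all rows
        for j in range(Gamma.shape[0]):
            Nu = Gamma * lam
            # E_j = Lam - sum_{i != j} nu_i nu_i^t
            E_j = Lam - Nu.T @ Nu + np.outer(Nu[j], Nu[j])
            w, U = np.linalg.eigh(E_j)         # eigenvalues in ascending order
            nu_j = np.sqrt(max(w[-1], 0.0)) * U[:, -1]
            Gamma[j] = nu_j / lam              # tau_j = Lam^{-1} nu_j
    return Gamma @ V.T                         # Phi = Gamma V^T

# Hypothetical sizes: n-dimensional signals, K dictionary atoms, m measurements.
rng = np.random.default_rng(0)
n, K, m = 16, 32, 6
Psi = rng.standard_normal((n, K))
Phi0 = rng.standard_normal((m, n)) / np.sqrt(m)
Phi = optimize_sensing_matrix(Psi, Phi0)

lam, V = np.linalg.eigh(Psi @ Psi.T)
before, after = gram_objective(Phi0, V, lam), gram_objective(Phi, V, lam)
```

Each row update minimizes the objective exactly with the other rows held fixed, so the objective cannot increase from one update to the next.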
Ajit Rajwade
The optimized Φ leads to smaller values of the column-column dot products and to better reconstruction errors. For more details, refer to Figure 1 and Table 1 of “Learning to Sense Sparse Signals: Simultaneous Sensing Matrix and Sparsifying Dictionary Optimization” by Duarte-Carvajalino and Sapiro.
Tomography is the task of reconstructing a 2D image (object) from its 1D projections, or a 3D image (object) from its 2D projections.
What is a projection (also called tomographic projection)?
It is defined as the Radon transform of the image in a particular direction (see next slide).
Imagine a line drawn through the 2D image in a certain direction α, and integrate the intensity values along that line. Now repeat this for lines parallel to the original one but at different offsets.
Each such summation produces a bin of the tomographic projection. The collection of bins forms a 1D array, which is called the tomographic projection or the Radon transform of the object in the direction α. https://en.wikipedia.org/wiki/Radon_transform
The image is a simplification of a set of real biological tissues: for example, an organ/tumor surrounded by a background consisting of soft, uniform tissue. The set of tissues is bombarded with an X-Ray beam. The tumor has a higher rate of absorption than the surrounding tissue.
The degree of absorption of X-Rays at each point is measured by an X-Ray absorption detector. This detector produces a 1D signal whose amplitude/intensity is directly proportional to the extent of absorption. Any point in the signal = sum of the absorptivity values across the path of the single ray in the X-Ray beam that spatially maps onto that point.
Figure labels: X-ray beam, detector.
Figure: sum-total of the two back-projections.
Given the 1D signal (called a projection signal), we try to reconstruct the original 2D image by smearing it backwards along the direction of projection. This is called back-projection. The 1D signal that was measured is duplicated along the columns of the image to be estimated (see the directions marked in yellow).
Given projections in K different directions, we
can hope to reconstruct the original image by performing back-projection along all these directions, and adding up the results.
Back-projection refers to smearing the 1D
projection back across the 2D image, i.e. duplicating the 1D signal across the image in a direction perpendicular to the direction of projection.
The shape of the object will be approximated
better and better as K increases.
Even with many (32) back-projections, there is a blur artifact in the reconstruction. This is called the “halo effect”.
A 3D object is illuminated with a large cone-shaped X-Ray beam. This will produce a projection which is a 2D image.
Changing the direction of the X-ray beam will produce
another image. This set of images when back-projected will yield the 3D volume/object.
However, in conventional Computed Tomography (CT), one slice of the volume is measured at a time. A slice is a 2D entity obtained by cutting the 3D volume transversely through a plane parallel to the XY plane.
This allows for the employment of a smaller number of
detectors at a time, for the same resolution of the measurement.
Figure labels: I0, X-Ray, Image, axes X, Y, Z.
$$I = I_0 \exp\left(-\int_L f(x,y)\, dL\right), \qquad g(\rho,\theta) = \log\frac{I_0}{I} = \int_L f(x,y)\, dL$$
I0 = intensity of the X-ray beam from the source
I = intensity of the X-ray beam as measured by the detector, given by Beer’s law
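A tiny numerical sketch of Beer's law for a single ray follows. The attenuation values, the step length `dL`, and the function names are made up for illustration. The line integral is approximated by a Riemann sum over samples along the ray.

```python
import numpy as np

def detected_intensity(I0, f_along_ray, dL):
    """Beer's law: I = I0 * exp(-integral of f along the ray), as a Riemann sum."""
    return I0 * np.exp(-np.sum(f_along_ray) * dL)

def projection_value(I0, I):
    """Recover the line integral g = log(I0 / I) from the measured intensity."""
    return np.log(I0 / I)

# Hypothetical ray: soft tissue (0.1) with a more absorbing region (0.8) in the middle.
f_along_ray = np.array([0.1, 0.1, 0.8, 0.8, 0.1])
I = detected_intensity(100.0, f_along_ray, dL=0.5)
g = projection_value(100.0, I)   # 0.5 * (0.1 + 0.1 + 0.8 + 0.8 + 0.1) = 0.95
```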
Tomography is useful in medical imaging, because different tissues (e.g. tissues in the bone, lungs, etc.) attenuate the X-rays to different extents.
This allows for good medical diagnosis.
Given a direction ϴ, the source and detector pair move along that direction in fixed steps (i.e. variation in ρ). The distance between source and detector is constant. At each step, the source sends out an X-Ray beam onto the subject, and the projection value is recorded on the detector and stored in a computer. This process is repeated for several values of ϴ. In the end, we record a single 2D slice. Now, the subject is moved in a direction perpendicular to the plane of the source-detector pair, and another 2D slice is recorded. This is called 1st generation CT. Source of image: Book by Gonzalez, 3rd edition
To estimate the full 3D structure of an object
from its projections.
The projections are directly measured, the 3D
structure is estimated.
Applications: medical imaging, industrial applications such as fault detection in machines, observation of plant roots, remote sensing (observation of underground objects or phenomena).
The direction of projection is denoted L, and dL is an infinitesimally small element along L. L is parameterized as follows (the “normal representation of the line”):
$$x\cos\theta + y\sin\theta = \rho$$
The complete set of projections for several different values of the parameters ρ and ϴ gives:
$$R(f) = g(\rho,\theta) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x,y)\, \delta(x\cos\theta + y\sin\theta - \rho)\, dx\, dy$$
This is called the Radon Transform of f. Here δ is the Dirac delta function. Its discrete version is:
$$R(f) = g(\rho,\theta) = \sum_{x=1}^{M}\sum_{y=1}^{N} f(x,y)\, \delta(x\cos\theta + y\sin\theta - \rho)$$
where δ is now the Kronecker delta function.
One single projection vector is obtained with a fixed value of ϴ, but varying ρ.
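A crude sketch of the discrete Radon transform is given below. Several choices here are assumptions made for illustration only: pixel coordinates are measured from the image centre, the Kronecker delta is implemented by rounding ρ to the nearest integer bin, and the image is stored as `f[y, x]`. A practical implementation would interpolate instead of rounding.

```python
import numpy as np

def radon_nearest_bin(f, thetas_deg):
    """Sinogram g[rho_bin, k]: sum of f over pixels whose rho rounds to that bin."""
    n_rows, n_cols = f.shape
    ys, xs = np.mgrid[0:n_rows, 0:n_cols].astype(float)
    xs -= (n_cols - 1) / 2.0                  # centre the coordinate system
    ys -= (n_rows - 1) / 2.0
    half = int(np.ceil(np.hypot(n_rows, n_cols) / 2.0))
    n_bins = 2 * half + 1                     # rho ranges over -half .. +half
    sinogram = np.zeros((n_bins, len(thetas_deg)))
    for k, theta in enumerate(np.deg2rad(thetas_deg)):
        rho = xs * np.cos(theta) + ys * np.sin(theta)
        idx = np.rint(rho).astype(int) + half  # nearest rho bin for every pixel
        sinogram[:, k] = np.bincount(idx.ravel(), weights=f.ravel(),
                                     minlength=n_bins)
    return sinogram

rng = np.random.default_rng(0)
f = rng.random((5, 7))                        # hypothetical small image
sino = radon_nearest_bin(f, [0.0, 90.0])
half = (sino.shape[0] - 1) // 2
```

Each column of `sino` is one projection vector: ϴ is fixed and ρ varies down the column. At ϴ = 0 the projection reduces to the column sums of the image, and at ϴ = 90 degrees to the row sums.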
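Example: consider a disk of radius r and constant intensity A:
$$f(x,y) = \begin{cases} A, & \text{if } x^2 + y^2 \le r^2 \\ 0, & \text{else} \end{cases}$$
Since f is circularly symmetric, its projection is the same for every θ, so we may take θ = 0:
$$R(f) = g(\rho,\theta) = \iint f(x,y)\,\delta(x\cos\theta + y\sin\theta - \rho)\,dx\,dy = \iint f(x,y)\,\delta(x - \rho)\,dx\,dy = \int f(\rho,y)\,dy$$
$$= \int_{-\sqrt{r^2-\rho^2}}^{\sqrt{r^2-\rho^2}} f(\rho, y)\,dy = \int_{-\sqrt{r^2-\rho^2}}^{\sqrt{r^2-\rho^2}} A\,dy = 2A\sqrt{r^2 - \rho^2}$$
$$g(\rho,\theta) = \begin{cases} 2A\sqrt{r^2-\rho^2}, & \text{if } |\rho| \le r \\ 0, & \text{if } |\rho| > r \end{cases}$$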
Sinogram (a Radon transform plotted as an image on a (ρ,ϴ) grid). https://en.wikipedia.org/wiki/Radon_transform
Radon transform:
$$R(f) = g(\rho,\theta) = \iint f(x,y)\,\delta(x\cos\theta + y\sin\theta - \rho)\,dx\,dy$$
Back-projection along a single angle ϴk:
$$\hat{f}_{\theta_k}(x,y) = g(\rho,\theta_k) = g(x\cos\theta_k + y\sin\theta_k,\ \theta_k)$$
Back-projection over all angles, and its approximation obtained by sampling several different angles:
$$\hat{f}(x,y) = \int_0^{\pi} g(x\cos\theta + y\sin\theta,\ \theta)\,d\theta = \int_0^{\pi} \hat{f}_{\theta}(x,y)\,d\theta \approx \sum_{k=1}^{K} \hat{f}_{\theta_k}(x,y)$$
Fix the angle ϴk and for all x and y, compute the value of ρ. Copy g(ρ, ϴk) to $\hat{f}_{\theta_k}(x,y)$, which is the image obtained when you back-project along angle ϴk.
The back-projection operator is NOT the same as the inverse of the Radon transform! So this does not yield back the true signal f(x,y), but the signal f(x,y) blurred with the kernel $(x^2+y^2)^{-0.5}$. More on this a few slides down, when we do filtered back-projection.
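The copy step described above can be sketched as follows. This is an unfiltered back-projection under assumed conventions: the sinogram has an odd number of ρ bins with ρ = 0 in the middle, pixel coordinates are measured from the image centre, ρ is rounded to the nearest bin, and the image is indexed as `[y, x]`. The function name is hypothetical.

```python
import numpy as np

def back_project(sinogram, thetas_deg, shape):
    """Unfiltered back-projection: f_hat(x, y) = sum_k g(x cos t_k + y sin t_k, t_k)."""
    n_rows, n_cols = shape
    n_bins = sinogram.shape[0]
    half = (n_bins - 1) // 2                    # bin index of rho = 0
    ys, xs = np.mgrid[0:n_rows, 0:n_cols].astype(float)
    xs -= (n_cols - 1) / 2.0
    ys -= (n_rows - 1) / 2.0
    f_hat = np.zeros(shape)
    for k, theta in enumerate(np.deg2rad(thetas_deg)):
        rho = xs * np.cos(theta) + ys * np.sin(theta)   # rho for every pixel
        idx = np.clip(np.rint(rho).astype(int) + half, 0, n_bins - 1)
        f_hat += sinogram[idx, k]                       # smear projection k over the image
    return f_hat

# A single projection at 0 degrees whose value equals its bin index.
sino_single = np.arange(11, dtype=float).reshape(11, 1)
smear = back_project(sino_single, [0.0], (5, 7))

# Two constant projections (0 and 90 degrees) simply add up.
flat = back_project(np.ones((11, 2)), [0.0, 90.0], (5, 7))
```

In `smear`, every row is identical: the 1D signal has been duplicated across the image, which is exactly the smearing that causes the blur discussed in the text.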
The blur is a painful consequence of (1) discretization of the angle ϴ, and (2) the inherent blurring with the kernel $(x^2+y^2)^{-0.5}$. These images are reconstructed at 0.5 degree changes in ϴ. How do we get rid of this blur? Wait for a few slides!
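The Radon transform is given as:
$$R(f) = g(\rho,\theta) = \iint f(x,y)\,\delta(x\cos\theta + y\sin\theta - \rho)\,dx\,dy$$
Its 1D Fourier transform w.r.t. ρ (keeping ϴ fixed to some value) is given by:
$$G(\mu,\theta) = \int_{-\infty}^{\infty} g(\rho,\theta)\,\exp(-j 2\pi\mu\rho)\,d\rho$$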
G(μ, ϴ) is the Fourier transform of the projection of f(x,y) along some direction ϴ.
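Substituting the Radon transform into G(μ, ϴ):
$$G(\mu,\theta) = \iiint f(x,y)\,\delta(x\cos\theta + y\sin\theta - \rho)\,\exp(-j2\pi\mu\rho)\,dx\,dy\,d\rho$$
$$= \iint f(x,y)\left[\int \delta(x\cos\theta + y\sin\theta - \rho)\,\exp(-j2\pi\mu\rho)\,d\rho\right]dx\,dy = \iint f(x,y)\,\exp(-j2\pi\mu(x\cos\theta + y\sin\theta))\,dx\,dy$$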
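Its 1D Fourier transform w.r.t. ρ (keeping ϴ fixed) is given by:
$$G(\mu,\theta) = \iint f(x,y)\,\exp(-j2\pi\mu(x\cos\theta + y\sin\theta))\,dx\,dy = \iint f(x,y)\,\exp(-j2\pi(xu + yv))\,dx\,dy$$
where we define $u = \mu\cos\theta$, $v = \mu\sin\theta$. Hence
$$G(\mu,\theta) = \left[F(u,v)\right]_{u=\mu\cos\theta,\ v=\mu\sin\theta} = F(\mu\cos\theta,\ \mu\sin\theta)$$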
The RHS of this equation is a slice of the 2D Fourier transform of f(x,y), i.e. F(u,v), along the angle ϴ in the frequency plane, passing through the origin. The equation above is called the Projection Slice Theorem or the Fourier Slice Theorem. It states that the Fourier transform of a projection of the 2D object along some direction ϴ (i.e. G(μ, ϴ)) is equal to a slice of the 2D Fourier transform of the object along the same direction ϴ (in the frequency plane), passing through the origin.
The Projection Slice Theorem or the Fourier Slice Theorem states that the following two are equivalent: (1) Project a 2D object along a certain direction d. Take the 1D Fourier transform of the projection and call it F1. (2) Compute the 2D Fourier transform of the same object. Take a slice of this Fourier transform, passing through the origin, along a direction parallel to d (but in the frequency plane). Call this slice F2. Then F1 = F2. Source of image: https://en.wikipedia.org/wiki/Projection-slice_theorem
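The theorem can be verified numerically for the two axis-aligned angles, where no interpolation is needed. The sketch below uses the discrete Fourier transform from `np.fft` on a random test image stored as `f[y, x]`; both are assumptions for illustration. For ϴ = 0 we have ρ = x, so the projection is obtained by summing over y, and the matching slice is F(μ, 0), i.e. the v = 0 line of the 2D transform.

```python
import numpy as np

rng = np.random.default_rng(0)
f = rng.random((8, 8))                 # hypothetical test image, indexed f[y, x]

# Theta = 0: rho = x, so g(rho, 0) is the sum of f over y.
proj_0 = f.sum(axis=0)
F1_0 = np.fft.fft(proj_0)              # 1D Fourier transform of the projection
F2_0 = np.fft.fft2(f)[0, :]            # slice v = 0 of the 2D Fourier transform

# Theta = 90 degrees: rho = y, so g(rho, 90) is the sum of f over x.
proj_90 = f.sum(axis=1)
F1_90 = np.fft.fft(proj_90)
F2_90 = np.fft.fft2(f)[:, 0]           # slice u = 0 of the 2D Fourier transform
```

For other angles the slice does not fall on the discrete frequency grid, so a numerical check would additionally need interpolation in the frequency plane.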