Machine Learning for Signal Processing
Lecture 1: Introduction Representing sound and images
Class 1. 1 Sep 2015 Instructor: Bhiksha Raj
11-755/18-797 1
What is a signal
A mechanism for conveying information
– Semaphores, gestures, traffic lights..
– from a source to a destination – about a real world phenomenon
– Or sets of numbers (for color images)
– 0 is minimum / black, 1 is maximum / white – Position / order is important
Pixel = 0.5
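The idea that an image is an ordered collection of numbers can be sketched in a few lines. This is only an illustration: the 3x4 grid, its values, and the helper names `pixel` and `flip_horizontal` are made up for the example.

```python
# A tiny 3x4 grayscale image stored as rows of numbers in [0, 1].
# 0.0 is black, 1.0 is white, 0.5 is mid-grey. The values are illustrative.
image = [
    [0.0, 0.0, 0.5, 1.0],
    [0.0, 0.5, 1.0, 1.0],
    [0.5, 1.0, 1.0, 1.0],
]


def pixel(img, row, col):
    """Return the intensity stored at (row, col)."""
    return img[row][col]


def flip_horizontal(img):
    """Reverse every row: same numbers, different positions."""
    return [list(reversed(row)) for row in img]


flipped = flip_horizontal(image)

print(pixel(image, 0, 2))                                  # 0.5
print(sorted(sum(image, [])) == sorted(sum(flipped, [])))  # True: same numbers
print(image == flipped)                                    # False: different picture
```

The last two lines show why position / order matters: the flipped image holds exactly the same numbers but is a different image.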
– MRI: “k-space” 3D Fourier transform
– EEG: Many channels of brain electrical activity
– ECG: Cardiac activity
– OCT, Ultrasound, Echocardiogram: Echo-based imaging
– Others..
classification..
[Figure: example signals from MRI, EEG, ECG and Optical Coherence Tomography]
[Diagram labels: sensor, Signal Capture, Channel, Feature Extraction, Modeling / Regression]
– Learning patterns in data
analysis
– Learning to classify between different kinds of data
– Learning to predict data
sensing
– Markov models and Hidden Markov models – Linear and non-linear dynamical systems
– Binary classification. Meta-classifiers – Neural networks
– Privacy in signal processing – Extreme value theory – Dependence and significance
– Fourier transforms, linear systems, basic statistical signal processing
– Definitions, vectors, matrices, operations, properties
– Basics: what is a random variable, probability distributions, functions of a random variable
– Learning, modelling and classification techniques
– Mini projects – Will be assigned during course – Minimum 4 – You will not catch up if you slack on any homework
– Attendance counts..
– Will be assigned early in course – Dec 3: Poster presentation for all projects, with demos (if possible)
– Room 6705 Hillman Building – bhiksha@cs.cmu.edu – 412 268 9826
– Zhiding Yu
– Bing Liu
– TBD
[Map labels: Hillman, Windows, My office, Forbes]
moving through the air
– Essentially by producing puff after puff of air – Any sound producing mechanism actually produces pressure waves
– Highs push it in, lows suck it out – We sense these motions of our eardrum as “sound”
[Figure: arcs show pressure highs; spaces between arcs show pressure lows]
– On the microphone
– Many ways to do this
computer have anything to do with the recorded sound really?
– Recreate the sense of sound
signal
produce a pressure wave
– That we sense as sound
sinusoids with frequency
many sinusoids of different frequencies
– Frequency is a physically motivated unit – Each hair cell in our inner ear is tuned to a specific frequency
components
– We can hear frequencies up to 16000Hz
be heard by children and some young adults
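A small sketch of the "many sinusoids of different frequencies" view: build a signal from two sinusoids and recover their frequencies with a naive discrete Fourier transform. The sample count, rate, frequencies and amplitudes are arbitrary choices for the illustration, and the names `tone` and `dft_magnitude` are hypothetical.

```python
import cmath
import math

N = 64     # number of samples (illustrative)
RATE = 64  # samples per second, so DFT bin k corresponds to k Hz


def tone(freq_hz, amplitude):
    """Samples of a single sinusoid."""
    return [amplitude * math.sin(2 * math.pi * freq_hz * n / RATE) for n in range(N)]


# A "sound" made of two sinusoids: 5 Hz at amplitude 1.0 and 12 Hz at amplitude 0.5.
signal = [a + b for a, b in zip(tone(5, 1.0), tone(12, 0.5))]


def dft_magnitude(x, k):
    """Magnitude of DFT bin k, scaled so a unit-amplitude sinusoid gives 1.0."""
    acc = sum(x[n] * cmath.exp(-2j * math.pi * k * n / len(x)) for n in range(len(x)))
    return abs(acc) / (len(x) / 2)


magnitudes = [dft_magnitude(signal, k) for k in range(N // 2)]
strong = [k for k, m in enumerate(magnitudes) if m > 0.1]
print(strong)  # [5, 12]
```

Only the bins at the two component frequencies carry energy, and their magnitudes match the amplitudes used to build the signal.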
[Figure: a sinusoid; vertical axis: pressure]
– We need a sample rate more than twice the highest frequency we want to represent (the Nyquist rate)
– Because we hear up to 20kHz
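The arithmetic behind the sample-rate requirement, as a rough sketch. The function names are hypothetical. The second half shows why the rate should be strictly above twice the frequency: sampling a sine at exactly twice its frequency can land on every zero crossing.

```python
import math


def min_sample_rate(highest_freq_hz):
    """Lower bound on the sample rate: it must exceed this value."""
    return 2 * highest_freq_hz


def sample_sinusoid(freq_hz, sample_rate_hz, num_samples):
    """Sample sin(2*pi*f*t) at t = n / sample_rate."""
    return [
        math.sin(2 * math.pi * freq_hz * n / sample_rate_hz)
        for n in range(num_samples)
    ]


print(min_sample_rate(20_000))  # 40000

# Degenerate case: a 1 kHz sine sampled at exactly 2 kHz hits every zero crossing.
at_limit = sample_sinusoid(1_000, 2_000, 8)
print(all(abs(s) < 1e-9 for s in at_limit))  # True

# Sampled comfortably above the bound, the same tone is clearly visible.
above_limit = sample_sinusoid(1_000, 44_100, 8)
print(max(abs(s) for s in above_limit) > 0.5)  # True
```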
[Figure: three spectrograms (time vs. frequency) of the same sound. Panels: 44.1kHz SR, is ok; 22kHz SR, aliasing!; 11kHz SR, double aliasing!]
at 44kHz at 22kHz at 11kHz at 5kHz at 4kHz at 3kHz
– And then some – Cannot control the rate of variation of pressure waves in nature
– Cut off all frequencies above half the sampling frequency – E.g., to sample at 44.1 kHz, filter the signal to eliminate all frequencies above 22050 Hz
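To see what goes wrong without the filter, here is a minimal sketch of frequency folding for a pure tone. The function names and the 15 kHz test tone are hypothetical, and a real anti-aliasing filter is not implemented here; the sketch only computes where an unfiltered tone would end up.

```python
import math


def cutoff_hz(sample_rate_hz):
    """Highest frequency that survives sampling without aliasing."""
    return sample_rate_hz / 2


def apparent_frequency(freq_hz, sample_rate_hz):
    """Frequency at which a pure tone appears after sampling with no filter."""
    folded = freq_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)


def samples_match(f1_hz, f2_hz, sample_rate_hz, num_samples=16):
    """True if two cosines are indistinguishable once sampled."""
    return all(
        abs(
            math.cos(2 * math.pi * f1_hz * n / sample_rate_hz)
            - math.cos(2 * math.pi * f2_hz * n / sample_rate_hz)
        )
        < 1e-9
        for n in range(num_samples)
    )


print(cutoff_hz(44_100))                   # 22050.0
print(apparent_frequency(15_000, 44_100))  # 15000 (below the cutoff, unchanged)
print(apparent_frequency(15_000, 22_050))  # 7050 (folded back: aliasing)
print(samples_match(15_000, 7_050, 22_050))  # True
```

At a 22.05 kHz sample rate, the 15 kHz tone and a 7.05 kHz tone produce the same samples, which is why frequencies above the cutoff have to be removed before sampling.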
[Diagram: Analog signal -> Antialiasing Filter -> Sampling -> Digital signal]
– The pressure wave can take any value (within limits) – The diaphragm can also move continuously – The electrical signal from the diaphragm has continuous variations
– Numbers can only be stored to finite resolution – E.g. a 16-bit number can store only 65536 values, while a 4-bit number can store only 16 values – To store the sound wave on the computer, the continuous variation must be “mapped” on to the discrete set of numbers we can store
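A minimal sketch of this mapping, assuming a uniform quantizer over a fixed range. The range [-1, 1], the sample value 0.3 and the function names are illustrative choices, not part of the lecture.

```python
def num_levels(bits):
    """How many distinct values a number with `bits` bits can store."""
    return 2 ** bits


def quantize(value, bits, lo, hi):
    """Map a continuous value in [lo, hi] to an integer code 0 .. 2**bits - 1."""
    levels = num_levels(bits)
    clipped = min(max(value, lo), hi)
    step = (hi - lo) / levels
    return min(int((clipped - lo) / step), levels - 1)


def reconstruct(code, bits, lo, hi):
    """Return the centre of the interval that `code` stands for."""
    step = (hi - lo) / num_levels(bits)
    return lo + (code + 0.5) * step


print(num_levels(16), num_levels(4))  # 65536 16

code = quantize(0.3, 4, -1.0, 1.0)
approx = reconstruct(code, 4, -1.0, 1.0)
print(code, approx)  # 10 0.3125
```

With 4 bits the stored value differs from the original by 0.0125; more bits shrink the step and therefore the error.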
Signal Value    Bit sequence    Mapped to
S > 2.5v        1               1 * const
S <= 2.5v
[Figure: original signal and its quantized approximation]
Signal Value          Bit sequence    Mapped to
S >= 3.75v            11              3 * const
3.75v > S >= 2.5v     10              2 * const
2.5v > S >= 1.25v     01              1 * const
1.25v > S >= 0v
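The threshold tables can be written as lookup functions. This is a sketch only: the codes for the lowest band ("0" and "00") are an assumption that continues the pattern of the rows shown, and the test voltages are made up.

```python
def quantize_1bit(volts):
    """1-bit table: above 2.5 V maps to '1'; the '0' code is assumed."""
    return "1" if volts > 2.5 else "0"


def quantize_2bit(volts):
    """2-bit table with thresholds at 1.25 V, 2.5 V and 3.75 V."""
    if volts >= 3.75:
        return "11"
    if volts >= 2.5:
        return "10"
    if volts >= 1.25:
        return "01"
    return "00"  # assumed code for the lowest band


def multiplier(bit_sequence):
    """The integer k in 'k * const' for a given bit sequence."""
    return int(bit_sequence, 2)


readings = [0.4, 1.3, 2.6, 4.9]  # illustrative voltages
print([quantize_1bit(v) for v in readings])              # ['0', '0', '1', '1']
print([quantize_2bit(v) for v in readings])              # ['00', '01', '10', '11']
print([multiplier(quantize_2bit(v)) for v in readings])  # [0, 1, 2, 3]
```

One bit cannot tell 0.4 V from 1.3 V, while two bits separate all four readings.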
frequency
Basic Neuroscience: Anatomy and Physiology Arthur C. Guyton, M.D. 1987 W.B.Saunders Co.
Retina
http://www.brad.ac.uk/acad/lifesci/optometry/resources/modules/stage1/pvp1/Retina.html
Rods
– Fast
– Sensitive
– Grey scale
– Predominate in the periphery
Cones
– Slow
– Not so sensitive
– Fovea / Macula
– COLOR!
Basic Neuroscience: Anatomy and Physiology Arthur C. Guyton, M.D. 1987 W.B.Saunders Co.
– The region immediately surrounding the fovea is the macula
(From Foundations of Vision, by Brian Wandell, Sinauer Assoc.)
[Figure: normalized response of the cones vs. wavelength in nm]
– Sufficient to trigger each of the three cone types in a manner that produces the sensation of the desired color
– Some new-world monkeys are tetrachromatic
– By appropriate combinations of these colors, the cones can be excited to produce a very large set of colours
– How many colours? …
Wright and John Guild
– Subjects adjusted x, y, and z on the right of a circular screen to match a colour on the left
sensors
– X + Y + Z is 1.0
– The outer curve represents monochromatic light
– The lower line is the line of purples
– The newer charts are less popular
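A short sketch of how three matching amounts can be turned into coordinates that sum to 1.0. It assumes the chart coordinates are the three values divided by their sum; the function name and the sample values are hypothetical.

```python
def chromaticity(X, Y, Z):
    """Normalise three matching amounts so that they sum to 1.0."""
    total = X + Y + Z
    if total == 0:
        raise ValueError("at least one component must be non-zero")
    return (X / total, Y / total, Z / total)


x, y, z = chromaticity(2.0, 3.0, 5.0)
print(x, y, z)  # 0.2 0.3 0.5

# Scaling all three amounts (a brighter version of the same colour)
# leaves the normalised coordinates unchanged.
print(chromaticity(8.0, 12.0, 20.0) == (x, y, z))  # True
```

Because the three coordinates always sum to 1.0, two of them are enough to place a colour on a flat chart.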
International Commission on Illumination (CIE), 1931
– Colours outside this area cannot be matched by additively combining only 3 colours
would have a differently restricted area
coordinate of one of the three “primary” colours used in images
fraction of our visual acuity
– Also affected by the quantization of levels
– Each number represents the intensity of the image at a specific location in the image – Implicitly, R = G = B at all locations
– The matrices represent different things in different representations
– RGB Colorspace: Matrices represent intensity of Red, Green and Blue
– CMYK Colorspace: Cyan, Magenta, Yellow and Black (Key)
– YIQ Colorspace..
– HSV Colorspace..
R = G = B. Only a single number need be stored per pixel
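The storage saving can be sketched directly. The 2x2 image and the helper names are made up for the illustration.

```python
def gray_to_rgb(gray):
    """Expand one number per pixel into an (R, G, B) triple with R = G = B."""
    return [[(v, v, v) for v in row] for row in gray]


def is_gray(rgb):
    """True if every pixel has R = G = B."""
    return all(r == g == b for row in rgb for (r, g, b) in row)


def rgb_to_gray(rgb):
    """Collapse an R = G = B image back to a single number per pixel."""
    if not is_gray(rgb):
        raise ValueError("image has pixels where R, G and B differ")
    return [[r for (r, _g, _b) in row] for row in rgb]


gray = [[0.0, 0.5], [0.5, 1.0]]  # illustrative 2x2 image
rgb = gray_to_rgb(gray)

print(rgb[0][1])                 # (0.5, 0.5, 0.5)
print(rgb_to_gray(rgb) == gray)  # True: nothing is lost
print(is_gray([[(1.0, 0.0, 0.0)]]))  # False: a pure red pixel needs all three numbers
```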
– Adding equal parts of red, green and blue creates white
– Clue – paint colouring is subtractive..
– The base is white – Masking it with equal parts of C, M and Y creates Black – Masking it with C and Y creates Green
– Masking it with M and Y creates Red
– Masking it with M and C creates Blue
– Designed specifically for printing
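The additive and subtractive rules above can be checked with a few lines. This is an idealised sketch: each ink is assumed to mask its complementary primary completely, amounts are 0 or 1, and the function names are hypothetical.

```python
def add_lights(*lights):
    """Additive mixing: sum (R, G, B) lights channel by channel, capped at 1."""
    return tuple(min(1, sum(channel)) for channel in zip(*lights))


def cmy_to_rgb(c, m, y):
    """Subtractive mixing on a white base: each ink masks one primary."""
    return (1 - c, 1 - m, 1 - y)


RED, GREEN, BLUE = (1, 0, 0), (0, 1, 0), (0, 0, 1)

print(add_lights(RED, GREEN, BLUE))  # (1, 1, 1)  white
print(cmy_to_rgb(0, 0, 0))           # (1, 1, 1)  the white base
print(cmy_to_rgb(1, 1, 1))           # (0, 0, 0)  black
print(cmy_to_rgb(1, 0, 1))           # (0, 1, 0)  C and Y give green
print(cmy_to_rgb(0, 1, 1))           # (1, 0, 0)  M and Y give red
print(cmy_to_rgb(1, 1, 0))           # (0, 0, 1)  M and C give blue
```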
– Each paint masks out some colours – Mixing paint subtracts combinations of colors – Paintings represent subtractive colour masks
– How do you think he did it?
– j may be time, position, etc.. – Usually continuously valued
– Q is the space of all j – K(j) is a measurement kernel – Ideally a delta (which takes a non-zero value only at the desired j)
– But in reality not
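A numerical sketch of what a non-ideal kernel does. It assumes the measurement is a kernel-weighted integral of the signal over Q, approximated here by a Riemann sum. The signal s(j) = j^2, the box-shaped kernel, the grid over [0, 2] and all function names are assumptions made for the illustration.

```python
def signal(j):
    """A made-up continuous-valued signal s(j)."""
    return j * j


def box_kernel(width):
    """Unit-area box of the given width: a crude stand-in for a real sensor."""
    def kernel(offset):
        return 1.0 / width if abs(offset) <= width / 2 else 0.0
    return kernel


def measure(s, kernel, centre, grid, dj):
    """Riemann-sum approximation of the integral of K(j - centre) * s(j) over Q."""
    return sum(kernel(j - centre) * s(j) for j in grid) * dj


DJ = 0.001
GRID = [(i + 0.5) * DJ for i in range(2000)]  # midpoints covering Q = [0, 2]

narrow = measure(signal, box_kernel(0.01), 1.0, GRID, DJ)
wide = measure(signal, box_kernel(1.0), 1.0, GRID, DJ)

print(abs(narrow - signal(1.0)) < 1e-3)  # True: a near-delta kernel reads s(1) = 1
print(abs(wide - (1 + 1 / 12)) < 1e-3)   # True: a wide kernel reads a blurred value
```

The narrower the kernel, the closer the measurement is to the true value at the desired j; a wide kernel averages over its neighbourhood instead.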