Matching the Analysis Scheme to the Signal, Fritz Menzer


SLIDE 1

Time-Frequency Analysis for Audio Workshop

Matching the Analysis Scheme to the Signal

Fritz Menzer (fritz.menzer@epfl.ch)

Communication Systems, 5th year
Ecole Polytechnique Fédérale de Lausanne
15th April, 2004

SLIDE 2

Overview

1 Introduction (slide 3)
2 Perfect Reconstruction - who cares? (slide 4)
  2.1 Definition of perfect reconstruction (slide 4)
  2.2 Do we need perfect reconstruction? (slide 5)
3 Harmonic Band Wavelet Transform (slide 7)
  3.1 Coefficient modeling (slide 10)
  3.2 Advantages / Drawbacks (slide 11)
4 From HBWT to inharmonic sound modeling (slide 12)
  4.1 Taking filters from different PR filterbanks (slide 13)
  4.2 Why aliasing is not a problem (slide 14)
  4.3 Method Overview (slide 17)
  4.4 Sounds (slide 18)
5 Time-Frequency Analysis and Granular Synthesis (slide 19)
  5.1 Time-domain effects (slide 25)
  5.2 Scale of all grains in a 1024-band full-tree wavelet decomposition (slide 26)
A References (slide 27)

SLIDE 3

1 Introduction

  • If you know what you’re looking at, you can examine it more precisely.

SLIDE 4

2 Perfect Reconstruction - who cares?

2.1 Definition of perfect reconstruction

  • Definition: a Perfect Reconstruction (PR) method is a method providing direct and inverse transforms $T$ and $T^{-1}$ such that, for any signal $s$, $T^{-1}(T(s)) = s$.
  • FFT-based methods, Cosine Modulated Filterbanks and Wavelet transforms are usually PR methods.
  • Simple operations like filtering or distortion do not necessarily allow PR (i.e. it may be impossible to find $T^{-1}$). Example: quantisation obviously does not allow the original signal to be reconstructed perfectly.
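
The PR property can be checked numerically. The sketch below is only an illustration and is not taken from the slides: it uses a one-level orthonormal Haar transform as a stand-in for $T$, and the helper names (haar_forward, haar_inverse, quantise) and the test signal are made up for this example. It assumes an even-length input.

```python
import math

def haar_forward(signal):
    """One-level orthonormal Haar analysis (assumes even length)."""
    s = 1 / math.sqrt(2)
    approx = [s * (signal[i] + signal[i + 1]) for i in range(0, len(signal), 2)]
    detail = [s * (signal[i] - signal[i + 1]) for i in range(0, len(signal), 2)]
    return approx, detail

def haar_inverse(approx, detail):
    """Inverse of haar_forward: interleaves the reconstructed sample pairs."""
    s = 1 / math.sqrt(2)
    out = []
    for a, d in zip(approx, detail):
        out.append(s * (a + d))
        out.append(s * (a - d))
    return out

def quantise(values, step):
    """Uniform quantiser: a many-to-one map, so it has no inverse."""
    return [step * round(v / step) for v in values]

# Hypothetical test signal, chosen only for illustration.
signal = [0.3, -1.2, 0.7, 0.1, 2.4, -0.5, 0.0, 1.1]

# T^{-1}(T(s)) = s up to floating-point rounding.
approx, detail = haar_forward(signal)
reconstructed = haar_inverse(approx, detail)
pr_error = max(abs(x - y) for x, y in zip(signal, reconstructed))

# Quantising the coefficients breaks perfect reconstruction.
lossy = haar_inverse(quantise(approx, 0.5), quantise(detail, 0.5))
lossy_error = max(abs(x - y) for x, y in zip(signal, lossy))
```

Here pr_error stays at the level of rounding noise, while lossy_error is clearly nonzero because the quantiser discards information.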

SLIDE 5

2.2 Do we need perfect reconstruction?

[Figure: time-domain samples and spectra (frequency in kHz) of two signals, labelled "Noise" and "Noise, down- and upsampled by 4"]

SLIDE 6

Do we need perfect reconstruction?

  • Not needed for:
    – Modifying a signal
    – Handling noise
    – If the nature of the signal is known
  • Why use PR methods for compression?
    – Generality (ideally any signal can be treated)
    – Localising the source of errors!

SLIDE 7

3 Harmonic Band Wavelet Transform (Polotti and Evangelista, 2000)

[Figure: HBWT block diagram. Labels: input x(n); filters g0(k), g1(k), ..., gP−1(k); blocks labelled P; cascades of φ(k) and ψ(k) stages with blocks labelled 2; outputs labelled "DC Comp." and "Sinusoidal Part"]

SLIDE 8

[Figure: time-frequency plot; time axis up to 1 sec, frequency axis up to 10000 Hz]

SLIDE 9

[Figure: time-frequency plot; time axis up to 1 sec, frequency axis up to 10000 Hz]

SLIDE 10

3.1 Coefficient modeling

  • Wavelet Transform
  • Model the scale residual sinusoidally
  • Model the wavelet coefficients using LPC
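
As an illustration of the LPC step, the sketch below fits linear-prediction coefficients to a sequence with the autocorrelation method and the Levinson-Durbin recursion. This is an assumption made for the example: the slides do not say which LPC variant is used, and the function names and the decaying test sequence (a stand-in for one band of wavelet coefficients) are hypothetical.

```python
def autocorrelation(x, max_lag):
    """Autocorrelation r[0..max_lag] of a finite sequence."""
    n = len(x)
    return [sum(x[i] * x[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Solve the LPC normal equations.

    Returns (a, error) where x[n] is predicted as
    sum(a[k] * x[n - 1 - k] for k in range(order)).
    """
    a = [0.0] * order
    error = r[0]
    for i in range(order):
        # Reflection coefficient for model order i + 1.
        acc = r[i + 1] - sum(a[j] * r[i - j] for j in range(i))
        k = acc / error
        new_a = a[:]
        new_a[i] = k
        for j in range(i):
            new_a[j] = a[j] - k * a[i - 1 - j]
        a = new_a
        error *= (1 - k * k)
    return a, error

# Hypothetical coefficient sequence obeying x[n] = 0.9 * x[n - 1].
x = [0.9 ** n for n in range(200)]
lpc, residual_energy = levinson_durbin(autocorrelation(x, 2), 2)
```

For this sequence a second-order fit recovers a first coefficient close to 0.9 and a second coefficient close to 0, which is what a one-pole decay should give.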

[Figure: magnitude responses over ω of the wavelets |Ψ1,0(ω)|, |Ψ2,0(ω)|, |Ψ3,0(ω)|, |Ψ4,0(ω)| (scales n=1, n=2, n=3, N=4) and of the scale residual |Φ4,0(ω)|]

SLIDE 11

3.2 Advantages / Drawbacks

+ Meaningful adaptation of frequency and time resolution ⇒ visually better resolution
+ Reasonable model for the coefficients
− Works only for monophonic, harmonic sounds
− No model for the transients

SLIDE 12

4 From HBWT to inharmonic sound modeling

[Figure: time-frequency plot; time axis up to 7 sec, frequency axis up to 1800 Hz]

SLIDE 13

4.1 Taking filters from different PR filterbanks

[Figure: two diagrams along the ω axis, each marking the 1st partial, 2nd partial, 3rd partial, ...]

SLIDE 14

4.2 Why aliasing is not a problem

If a sinusoid of the form $\sin\left(\frac{\hat{k}\pi}{P} t + \varphi\right)$ is the input to a P-channel cosine modulated filterbank, only two bands will output nonzero coefficients:

$$\left|H_k\left(e^{j\hat{k}\pi/P}\right)\right| \neq 0 \iff k \in \{\hat{k}-1,\ \hat{k}\}$$

[Figure: diagram along the ω axis with the partial’s frequency marked]

⇒ There is no aliasing of the sinusoidal part, but only of the part that we model as noise!

SLIDE 15
SLIDE 16
SLIDE 17

4.3 Method Overview

Analysis

analyse signal → find N partials → determine filterbank
↓
calculate 2N sets of filterbank coefficients + residual
↓
calculate wavelet transform (WT) of filterbank coefficients
↓
model the WT coefficients sinusoidally and with LPC

Synthesis

reconstruct WT coefficients
↓
perform inverse wavelet transform → get filterbank coefficients
↓
inverse filterbank
↓
add residual (or not)

SLIDE 18

4.4 Sounds

  • Original Gong
  • Reconstructed from the Filterbank Coefficients
  • Synthesized from model parameters
  • 1 octave pitch-shifted Gong
  • Time-stretched Gong
  • Sinusoidal-only Gong
  • First wavelet scale only
  • Harmonic Gong

SLIDE 19

5 Time-Frequency Analysis and Granular Synthesis

  • Any Time-Frequency Transform implements a sort of Granular Synthesis.
  • Each coefficient corresponds to a grain
  • Grains are played at precise instants (instead of randomly)
  • To produce a grain, set all coefficients to zero, except one that will be set to one. Then perform the inverse transform.
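
The grain recipe above can be sketched in a few lines. As an assumption for this example only, a single-block orthonormal inverse DCT stands in for the inverse transform; the STFT, cosine modulated filterbank and wavelet grains on the following slides come from their own inverse transforms. The names inverse_dct and make_grain are hypothetical.

```python
import math

def inverse_dct(coeffs):
    """Inverse of the orthonormal DCT-II over one block (a stand-in transform)."""
    n = len(coeffs)
    out = []
    for t in range(n):
        value = math.sqrt(1 / n) * coeffs[0]
        for k in range(1, n):
            value += (math.sqrt(2 / n) * coeffs[k]
                      * math.cos(math.pi * (2 * t + 1) * k / (2 * n)))
        out.append(value)
    return out

def make_grain(inverse_transform, size, index):
    """Set every coefficient to zero except one set to one, then invert."""
    coeffs = [0.0] * size
    coeffs[index] = 1.0
    return inverse_transform(coeffs)

# Grain belonging to coefficient 2 of an 8-point block.
grain = make_grain(inverse_dct, 8, 2)
grain_energy = sum(g * g for g in grain)
```

Because the stand-in transform is orthonormal, each grain has unit energy; passing a different inverse transform to make_grain yields the grain shape of that analysis scheme.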

SLIDE 20

Windowed FFT (STFT) grain

[Figure: grain waveform; time axis in msec (up to 12), amplitude scale ×10^−3]

SLIDE 21

Cosine Modulated Filterbank grain

[Figure: grain waveform; time axis in msec (up to 5)]

SLIDE 22

Full-tree wavelet “grain”

[Figure: grain waveform; time axis in msec (up to 60)]

SLIDE 23

HBWT grain (noise part)

[Figure: grain waveform; time axis in msec (up to 16)]

SLIDE 24

HBWT grain (sinusoidal part)

[Figure: grain waveform; time axis in msec (up to 160)]

SLIDE 25

5.1 Time-domain effects

[Figure: two waveform panels; time axis in msec (up to 7)]

Channel 8: one grain played continuously
Channel 9: one grain played continuously

SLIDE 26

5.2 Scale of all grains in a 1024-band full-tree wavelet decomposition

[Figure: time-frequency plot; time axis up to 5.5 sec, frequency axis up to 2.2 × 10^4 Hz]

SLIDE 27

A References

  • Article on Harmonic Band Wavelet Transform by Polotti and Evangelista: http://lcavwww.epfl.ch/publications/publications/2000/PolottiE00b.pdf
  • DAFx 2002 paper on adaptation to inharmonic sounds: http://lcavwww.epfl.ch/publications/publications/2002/PolottiE02.pdf
  • Some material (presentation slides, Matlab functions and Pure Data objects): http://www.xsmusic.ch/