Probabilistic modeling of non-stationary signals in … · Probabilistic modeling of non-stationary...

64
Page 1 / 56 Roland Badeau Journée "Temps-Fréquence et Non-Stationnarité" June 19, 2015 Probabilistic modeling of non-stationary signals in time-frequency domain Application to music signals Roland Badeau Télécom ParisTech / CNRS LTCI Currently visiting the CIRMMT McGill University, Montreal

Transcript of Probabilistic modeling of non-stationary signals in … · Probabilistic modeling of non-stationary...

Page 1: Probabilistic modeling of non-stationary signals in … · Probabilistic modeling of non-stationary signals in time-frequency domain Application to music signals ... Round Midnight

Page 1 / 56 Roland Badeau Journée "Temps-Fréquence et Non-Stationnarité"

June 19, 2015

Probabilistic modeling

of non-stationary signals

in time-frequency domain

Application to music signals

Roland BadeauTélécom ParisTech / CNRS LTCI

Currently visiting the CIRMMTMcGill University, Montreal

Page 2: Probabilistic modeling of non-stationary signals in … · Probabilistic modeling of non-stationary signals in time-frequency domain Application to music signals ... Round Midnight

Page 2 / 56 Roland Badeau Journée "Temps-Fréquence et Non-Stationnarité"

June 19, 2015

Part I

Introduction

Page 3: Probabilistic modeling of non-stationary signals in … · Probabilistic modeling of non-stationary signals in time-frequency domain Application to music signals ... Round Midnight

Page 3 / 56 Roland Badeau Journée "Temps-Fréquence et Non-Stationnarité"

June 19, 2015

Non-negative Matrix Factorization (NMF)

Musical score

Page 4: Probabilistic modeling of non-stationary signals in … · Probabilistic modeling of non-stationary signals in time-frequency domain Application to music signals ... Round Midnight

Page 3 / 56 Roland Badeau Journée "Temps-Fréquence et Non-Stationnarité"

June 19, 2015

Non-negative Matrix Factorization (NMF)

Musical score

Spectrogram V

Page 5: Probabilistic modeling of non-stationary signals in … · Probabilistic modeling of non-stationary signals in time-frequency domain Application to music signals ... Round Midnight

Page 3 / 56 Roland Badeau Journée "Temps-Fréquence et Non-Stationnarité"

June 19, 2015

Non-negative Matrix Factorization (NMF)

Musical score

0 2 4 6 8 100

2040

Time (s)

0 2 4 6 8 100

2040

0 2 4 6 8 100

2040

Temporal activations H

0

2000

4000

6000

8000

10000

0 0.5 1

Fre

quen

cy (

Hz)

0

2000

4000

6000

8000

10000

0 0.5 10

2000

4000

6000

8000

10000

0 0.5 1

Spectral templates W Spectrogram V

Page 6: Probabilistic modeling of non-stationary signals in … · Probabilistic modeling of non-stationary signals in time-frequency domain Application to music signals ... Round Midnight

Page 4 / 56 Roland Badeau Journée "Temps-Fréquence et Non-Stationnarité"

June 19, 2015

Non-negative Matrix Factorization

Factorization of a matrix V ∈ RF×T+ as a product V ≈ W H

Rank reduction: W ∈ RF×S+ and H ∈ R

S×T+ where S < min(F ,T )

Usual applications:

Image analysis, data mining, spectroscopy, finance, etc.Audio signal processing:

– Multi-pitch estimation, onset detection

– Automatic music transcription

– Musical instrument recognition

– Source separation

– Audio inpainting

Page 7: Probabilistic modeling of non-stationary signals in … · Probabilistic modeling of non-stationary signals in time-frequency domain Application to music signals ... Round Midnight

Page 5 / 56 Roland Badeau Journée "Temps-Fréquence et Non-Stationnarité"

June 19, 2015

NMF-based automatic transcription

Algorithm

Time-frequency NonnegativeInput MIDI

signal representationTranscription

filedecomposition

Estimation of

MIDI pitch

Detection of the attacks

and ends of notes

W

H

Demo

Original signal (Liszt):

Transcribed signal:

N. Bertin, R. Badeau, and E. Vincent. "Enforcing Harmonicity and Smoothness in Bayesian Non-negative Matrix Factorization Applied to Polyphonic Music Transcription". IEEE Trans. on Audio,Speech, and Lang. Proc., 18(3): 538–549, Mar 2010.

Page 8: Probabilistic modeling of non-stationary signals in … · Probabilistic modeling of non-stationary signals in time-frequency domain Application to music signals ... Round Midnight

Page 6 / 56 Roland Badeau Journée "Temps-Fréquence et Non-Stationnarité"

June 19, 2015

Score-based informed source separation

Algorithm

Round Midnight (Thelonious Monk):

R. Hennequin, B. David, and R. Badeau. "Score informed audio source separation using a para-metric model of non-negative spectrogram". In ICASSP, May 2011.

Page 9: Probabilistic modeling of non-stationary signals in … · Probabilistic modeling of non-stationary signals in time-frequency domain Application to music signals ... Round Midnight

Page 7 / 56 Roland Badeau Journée "Temps-Fréquence et Non-Stationnarité"

June 19, 2015

Watermarking-informed source separation

Algorithm

Coder

1) Coding 2) Decoding

M Mixtures

S Sourcesignals

Decoder

Sideinfo

M MixturessignalsS Source

Sideinfo

Mix Tape (Jim’s Big Ego) :

A. Liutkus, R. Badeau, and G. Richard. "Informed source separation using multichannel NMF".In LVA/ICA, Sep 2010.

Page 10: Probabilistic modeling of non-stationary signals in … · Probabilistic modeling of non-stationary signals in time-frequency domain Application to music signals ... Round Midnight

Page 8 / 56 Roland Badeau Journée "Temps-Fréquence et Non-Stationnarité"

June 19, 2015

NMF probabilistic models

Mixture models with (hidden) latent variables

+ can exploit a priori knowledge

+ can use well-known statistical inference techniques

Probabilistic models of time-frequency distributions:Magnitude-only models (phase is ignored)

– Additive Gaussian noise [Schmidt, 2008],

– Probabilistic Latent Component Analysis [Smaragdis, 2006],

– Mixture of Poisson components [Virtanen, 2008],

Phase-aware models (theoretical ground for Wiener filtering)

– Mixture of Gaussian components [Févotte, 2009],

– Mixture of alpha-stable components [Liutkus & Badeau, 2015]

[1] A. Liutkus, R. Badeau, "Generalized Wiener filtering with fractional power spectrograms," inICASSP, Apr 2015, pp. 266–270.

Page 11: Probabilistic modeling of non-stationary signals in … · Probabilistic modeling of non-stationary signals in time-frequency domain Application to music signals ... Round Midnight

Page 9 / 56 Roland Badeau Journée "Temps-Fréquence et Non-Stationnarité"

June 19, 2015

Gaussian model [Févotte, 2009]

xs(f , t)

X s

wfs

hst

σ2xs(f , t) = wfs hst

V s

all time-frequency

bins are independent∼ N 0,

Page 12: Probabilistic modeling of non-stationary signals in … · Probabilistic modeling of non-stationary signals in time-frequency domain Application to music signals ... Round Midnight

Page 9 / 56 Roland Badeau Journée "Temps-Fréquence et Non-Stationnarité"

June 19, 2015

Gaussian model [Févotte, 2009]

y(f , t) =S−1∑

s=0

xs(f , t)

Y W

H

vft =S−1∑

s=0

σ2xs(f , t)

V = WH

xs(f , t)

X s

wfs

hst

σ2xs(f , t) = wfs hst

V s

⇓hst

wfs

all time-frequency

bins are independent∼ N

∼ N

0,

0,

Page 13: Probabilistic modeling of non-stationary signals in … · Probabilistic modeling of non-stationary signals in time-frequency domain Application to music signals ... Round Midnight

Page 9 / 56 Roland Badeau Journée "Temps-Fréquence et Non-Stationnarité"

June 19, 2015

Gaussian model [Févotte, 2009]

y(f , t) =S−1∑

s=0

xs(f , t)

Y W

H

vft =S−1∑

s=0

σ2xs(f , t)

V = WH

xs(f , t)

X s

wfs

hst

σ2xs(f , t) = wfs hst

V s

∼ N

⇓hst

wfs

all time-frequency

bins are independent

max L(Y )

m

min DIS(V = |Y |2|V )∼ N

0,

0,

Page 14: Probabilistic modeling of non-stationary signals in … · Probabilistic modeling of non-stationary signals in time-frequency domain Application to music signals ... Round Midnight

Page 10 / 56 Roland Badeau Journée "Temps-Fréquence et Non-Stationnarité"

June 19, 2015

Review of Itakura-Saito NMF (IS-NMF)

Estimation of W and H:

The maximum likelihood estimate is obtained by minimizing the IS

divergence between the spectrogram V = |Y |2 and V

Methods: multiplicative update rules or SAGE algorithm

Advantages of IS-NMF:

The MMSE estimation of xs(f , t) leads to Wiener filtering

The existence of phases is taken into account

Drawbacks of IS-NMF:

xs(f , t) for all s, f , t are assumed uncorrelated

The values of phases in the STFT matrix Y are ignored

Page 15: Probabilistic modeling of non-stationary signals in … · Probabilistic modeling of non-stationary signals in time-frequency domain Application to music signals ... Round Midnight

Page 11 / 56 Roland Badeau Journée "Temps-Fréquence et Non-Stationnarité"

June 19, 2015

Questions

Can we design time-frequency (TF) transforms such that the

assumption of uncorrelated TF bins is best satisfied?

For which class of stochastic processes can this assumption be

satisfied? (TF bins of sinusoidal and impulse signals will always

be correlated anyway)

For stochastic processes whose TF correlations cannot be

withdrawn, is it possible to extend the IS-NMF model in order to

best take these correlations into account?

What kind of improvement can we expect from modeling these

correlations in applications such as source separation and audio

inpainting?

Page 16: Probabilistic modeling of non-stationary signals in … · Probabilistic modeling of non-stationary signals in time-frequency domain Application to music signals ... Round Midnight

Page 12 / 56 Roland Badeau Journée "Temps-Fréquence et Non-Stationnarité"

June 19, 2015

Part II

Designing appropriate TF transforms

Page 17: Probabilistic modeling of non-stationary signals in … · Probabilistic modeling of non-stationary signals in time-frequency domain Application to music signals ... Round Midnight

Page 13 / 56 Roland Badeau Journée "Temps-Fréquence et Non-Stationnarité"

June 19, 2015

Preservation of whiteness (PW)

Filter

· · ·

· · ·

· · ·

...

Frequency

Time

· · ·

· · ·

bank

...

White noise

Figure: TF transform of a (proper complex) white noise

Page 18: Probabilistic modeling of non-stationary signals in … · Probabilistic modeling of non-stationary signals in time-frequency domain Application to music signals ... Round Midnight

Page 13 / 56 Roland Badeau Journée "Temps-Fréquence et Non-Stationnarité"

June 19, 2015

Preservation of whiteness (PW)

Filter

· · ·

· · ·

· · ·

...

Frequency

Time

· · ·

· · ·

bank

...

White noise

2D white noise

Figure: TF transform of a (proper complex) white noise

Page 19: Probabilistic modeling of non-stationary signals in … · Probabilistic modeling of non-stationary signals in time-frequency domain Application to music signals ... Round Midnight

Page 14 / 56 Roland Badeau Journée "Temps-Fréquence et Non-Stationnarité"

June 19, 2015

Preservation of whiteness (PW)

· · ·

· · ·

· · ·

Frequency

Time

· · ·

· · · STFT ...

Real white noise

2D proper complex

white noise

· · ·

· · ·0

Figure: Complex TF transform of a real white noise

Page 20: Probabilistic modeling of non-stationary signals in … · Probabilistic modeling of non-stationary signals in time-frequency domain Application to music signals ... Round Midnight

Page 15 / 56 Roland Badeau Journée "Temps-Fréquence et Non-Stationnarité"

June 19, 2015

Perfect reconstruction (PR)

Filter

· · ·

· · ·

· · ·

...

Frequency

Time

· · ·

· · ·

bank

Analysis Synthesis

...Signal InputInput signal

TF transform

Figure: TF transform of a time series

Page 21: Probabilistic modeling of non-stationary signals in … · Probabilistic modeling of non-stationary signals in time-frequency domain Application to music signals ... Round Midnight

Page 15 / 56 Roland Badeau Journée "Temps-Fréquence et Non-Stationnarité"

June 19, 2015

Perfect reconstruction (PR)

Filter

· · ·

· · ·

· · ·

· · ·...

Frequency

Time

· · ·

· · ·

bank

Analysis

Filter

bank

Synthesis

......

Input Inputsignal signal

TF transform

Figure: Perfect reconstruction filter bank

Page 22: Probabilistic modeling of non-stationary signals in … · Probabilistic modeling of non-stationary signals in time-frequency domain Application to music signals ... Round Midnight

Page 16 / 56 Roland Badeau Journée "Temps-Fréquence et Non-Stationnarité"

June 19, 2015

Solution of (PW) + (PR)

E(z)...

z−1

↓ L

R(z)...

z−1

+

+

↑ L

↓ L

↓ L

↑ L

↑ L

· · ·

· · ·

· · ·

...

Frequency

Time

· · ·

......

TF transform

Figure: Critically sampled paraunitary filter banks: R(z) = E(z)

Page 23: Probabilistic modeling of non-stationary signals in … · Probabilistic modeling of non-stationary signals in time-frequency domain Application to music signals ... Round Midnight

Page 17 / 56 Roland Badeau Journée "Temps-Fréquence et Non-Stationnarité"

June 19, 2015

Examples of solutions

Real TF transform of real signals: MDCT filter banks

Xf ,t =∑

n∈Z

wn xFt−n cos

F

(f +

1

2

)(n +

F + 1

2

))

Complex TF transform of complex signals: PR critically decimated

GDFT filter banks with matched analysis and synthesis filters:

Xf ,t =∑

n∈Z

wn xFt−n exp

(+

i2π

F(f + φ)(n + τ)

)

Complex TF transform of real signals: same GDFT filter banks,

with F even and φ = 12

Page 24: Probabilistic modeling of non-stationary signals in … · Probabilistic modeling of non-stationary signals in time-frequency domain Application to music signals ... Round Midnight

Page 18 / 56 Roland Badeau Journée "Temps-Fréquence et Non-Stationnarité"

June 19, 2015

TF transform of uncorrelated time samples

Time

Fre

qu

en

cy

Time× ×

Time

TFtransform:

Window:

Uncorrelated time samples

Non-overlapping time frames

Non-adjacent columns

are uncorrelated

series:

Figure: TF transform of uncorrelated time samples

Page 25: Probabilistic modeling of non-stationary signals in … · Probabilistic modeling of non-stationary signals in time-frequency domain Application to music signals ... Round Midnight

Page 19 / 56 Roland Badeau Journée "Temps-Fréquence et Non-Stationnarité"

June 19, 2015

TF transform of uncorrelated time samples

Time

Fre

qu

en

cy

Time× ×

Time

TFtransform:

Window:

Uncorrelated time samples

Preservation of whiteness

Adjacent columns

are approx. uncorrelated

Slowly varying powerseries:

Figure: TF transform of uncorrelated time samples with slowly varying power

Page 26: Probabilistic modeling of non-stationary signals in … · Probabilistic modeling of non-stationary signals in time-frequency domain Application to music signals ... Round Midnight

Page 20 / 56 Roland Badeau Journée "Temps-Fréquence et Non-Stationnarité"

June 19, 2015

TF transform of a WSS process

Fre

qu

en

cy

Time

Fre

qu

en

cy

×

×

Spectral TFtransform:

Window:

WSS process

Non-overlapping sub-bands

Non-adjacent rows

are approx. uncorrelated

representation:

with high stop-band rejection

Figure: TF transform of a WSS process with high stop-band rejection

Page 27: Probabilistic modeling of non-stationary signals in … · Probabilistic modeling of non-stationary signals in time-frequency domain Application to music signals ... Round Midnight

Page 21 / 56 Roland Badeau Journée "Temps-Fréquence et Non-Stationnarité"

June 19, 2015

TF transform of a WSS process

Fre

qu

en

cy

Time

Fre

qu

en

cy

×

×

Spectral TFtransform:

Window:

WSS process

Preservation of whiteness

Adjacent rows

are approx. uncorrelated

representation:

Smooth PSD

Figure: TF transform of a WSS process with smooth PSD

Page 28: Probabilistic modeling of non-stationary signals in … · Probabilistic modeling of non-stationary signals in time-frequency domain Application to music signals ... Round Midnight

Page 22 / 56 Roland Badeau Journée "Temps-Fréquence et Non-Stationnarité"

June 19, 2015

TF transform of a nonstationary process

Non-stationary process

Preservation of whiteness

All TF bins

are approx. uncorrelated

Smooth power TF density

High stop-band rejection

Time

Fre

qu

en

cy

× ×

×

×

Figure: TF transform of nonstationary signal with smooth TF density

Page 29: Probabilistic modeling of non-stationary signals in … · Probabilistic modeling of non-stationary signals in time-frequency domain Application to music signals ... Round Midnight

Page 23 / 56 Roland Badeau Journée "Temps-Fréquence et Non-Stationnarité"

June 19, 2015

Take-home message

Advantages of whiteness-preserving TF transforms:

The assumption of uncorrelated TF bins holds approximatelyfor a wide range of nonstationary signals with smooth TF density.

No need to care for the consistency of the TF transform, since

it is bijective (no redundancy in the TF domain).

Preliminary results, in a source separation application involving

NMF modeling and Wiener filtering, showed no performance losswhen using an MDCT instead of an STFT with 75% overlap

Drawbacks:

Designing paraunitary STFT filter banks is constrained: solutions

involve non-overlapping rectangular windows or recursive filters.

The assumption of uncorrelated TF bins does not hold for sinusoids

and impulses: such correlations still need to be properly modeled.

Page 30: Probabilistic modeling of non-stationary signals in … · Probabilistic modeling of non-stationary signals in time-frequency domain Application to music signals ... Round Midnight

Page 24 / 56 Roland Badeau Journée "Temps-Fréquence et Non-Stationnarité"

June 19, 2015

Part III

Modeling correlations in the TF domain

Page 31: Probabilistic modeling of non-stationary signals in … · Probabilistic modeling of non-stationary signals in time-frequency domain Application to music signals ... Round Midnight

Page 25 / 56 Roland Badeau Journée "Temps-Fréquence et Non-Stationnarité"

June 19, 2015

Linear convolutive mixtures modeling

Page 32: Probabilistic modeling of non-stationary signals in … · Probabilistic modeling of non-stationary signals in time-frequency domain Application to music signals ... Round Midnight

Page 26 / 56 Roland Badeau Journée "Temps-Fréquence et Non-Stationnarité"

June 19, 2015

Convolution in TF domain

Purpose: implement y(n) = (g ∗ x)(n) in TF domain

Standard approach: column-wise multiplication of the STFT

x(f , t) by the frequency response cg(f ) of filter g(n)

f

tcg(f ) x(f , t)

y(f , t)

× × ×

Advantage: y(f , t) are uncorrelated if x(f , t) are uncorrelated

Drawbacks: Approximation, holds if g(n) is much shorter than

time frames (unrealistic). Approach restricted to the STFT.

Page 33: Probabilistic modeling of non-stationary signals in … · Probabilistic modeling of non-stationary signals in time-frequency domain Application to music signals ... Round Midnight

Page 27 / 56 Roland Badeau Journée "Temps-Fréquence et Non-Stationnarité"

June 19, 2015

Convolution in TF domain

Purpose: implement y(n) = (g ∗ x)(n) in TF domain

Problem: find transformation TTF in Figure 1 such that the output

is y(n) when the input is x(n)

↓ F

↓ F

h0

hF-1

......

↑ F

↑ F

h0

hF-1

......TTF

Fig. 1: Applying a TF transformation to a TD signal

Page 34: Probabilistic modeling of non-stationary signals in … · Probabilistic modeling of non-stationary signals in time-frequency domain Application to music signals ... Round Midnight

Page 28 / 56 Roland Badeau Journée "Temps-Fréquence et Non-Stationnarité"

June 19, 2015

Convolution in TF domain

Solution: TTF is represented in the larger frame in Figure 2, where the

input is x(f , t), the output is y(f , t), and TTD is the convolution by g(n)

↓ F

↓ F

h0

hF-1

......

↑ F

↑ F

h0

hF-1

...... TTD

Fig. 2: Applying a TD transformation to TF data

Page 35: Probabilistic modeling of non-stationary signals in … · Probabilistic modeling of non-stationary signals in time-frequency domain Application to music signals ... Round Midnight

Page 29 / 56 Roland Badeau Journée "Temps-Fréquence et Non-Stationnarité"

June 19, 2015

Convolution in TF domainThis solution can be implemented as a 2D filter:

f

t

τ

ϕ

cg(f , ϕ, τ)x(f , t)

∗ y(f , t)

ARMA parametrisation: if g(n) is a causal and stable recursive

(ARMA) filter then cg(f , ϕ, τ) can be parametrised as

∀ϕ, τ, f , ag(f − ϕ, τ) ∗τ

cg(f , ϕ, τ) = bg(f , ϕ, τ)

[2] R. Badeau and M.D. Plumbley, "Probabilistic Time-Frequency Source-Filter Decomposition ofNon-Stationary Signals," in EUSIPCO, Sep 2013.

Page 36: Probabilistic modeling of non-stationary signals in … · Probabilistic modeling of non-stationary signals in time-frequency domain Application to music signals ... Round Midnight

Page 30 / 56 Roland Badeau Journée "Temps-Fréquence et Non-Stationnarité"

June 19, 2015

Time-domain mixing model

x0g0,0

g1,0

xS−1g0,S−1

g1,S−1

+y0,0

y0,S−1

+y1,S−1

y0

y1

n0

n1

y1,0

MixtureSourcefilters images mixingLinear Source

signals signalsNoisy

......

...

xs(n) gms(n) yms(n) ym(n)nm(n)

Example of a stereophonic setting (M = 2)

Page 37: Probabilistic modeling of non-stationary signals in … · Probabilistic modeling of non-stationary signals in time-frequency domain Application to music signals ... Round Midnight

Page 31 / 56 Roland Badeau Journée "Temps-Fréquence et Non-Stationnarité"

June 19, 2015

Multichannel HR-NMF model

The TF transforms of the source signals xs(f , t) follow a regular

IS-NMF model: xs(f , t) ∼ N (0,∑

k wsfkhs

kt)

ARMA filtering is implemented via a state-space representation:

zs(f , t) = xs(f , t)−Qa∑τ=1

as(f , τ)zs(f , t − τ)

yms(f , t) =Pb∑

ϕ=−Pb

Qb∑τ=0

bms(f , ϕ, τ) zs(f − ϕ, t − τ)

Output: ym(f , t) = nm(f , t) +S−1∑s=0

yms(f , t) with nm(f , t) ∼ N (0, σ2n)

[3] R. Badeau and M.D. Plumbley, "Multichannel high resolution NMF for modelling convolutivemixtures of non-stationary signals in the time-frequency domain" in IEEE Trans. Audio, Speech,Lang. Proc., vol. 22, no. 11, Nov 2014, pp. 1670–1680.

Page 38: Probabilistic modeling of non-stationary signals in … · Probabilistic modeling of non-stationary signals in time-frequency domain Application to music signals ... Round Midnight

Page 32 / 56 Roland Badeau Journée "Temps-Fréquence et Non-Stationnarité"

June 19, 2015

Dependency graph

t t t

f f f

xs(f , t) zs(f , t) yms(f , t)

TF innovation TF state TF observation

Dependency graph in the TF domain

Page 39: Probabilistic modeling of non-stationary signals in … · Probabilistic modeling of non-stationary signals in time-frequency domain Application to music signals ... Round Midnight

Page 33 / 56 Roland Badeau Journée "Temps-Fréquence et Non-Stationnarité"

June 19, 2015

Particular cases

The HRNMF model encompasses:

Multichanel NMF [Ozerov & Févotte, 2010] (if Qa = Qb = Pb = 0)

ARMA processes (if K = 1 and hskt is flat)

t

hskt

Mixtures of damped sinusoids (if K = 1 and hskt is an impulse)

t

hskt

Page 40: Probabilistic modeling of non-stationary signals in … · Probabilistic modeling of non-stationary signals in time-frequency domain Application to music signals ... Round Midnight

Page 34 / 56 Roland Badeau Journée "Temps-Fréquence et Non-Stationnarité"

June 19, 2015

Estimation of HR-NMF

Various approaches (initially developed for the mono M = 1 case)

EM algorithm with Kalman filtering [Badeau, 2011]: slow

convergence, high computational complexity

Multiplicative updates [Badeau & Ozerov, 2013]: fast

convergence but numerical stability issues

Variational EM algorithm [Badeau & Drémeau, 2013] : low

computational complexity

[4] R. Badeau, "Gaussian modeling of mixtures of non-stationary signals in the time-frequencydomain (HR-NMF)," in WASPAA, Oct 2011, pp. 253–256.[5] R. Badeau and A. Ozerov, "Multiplicative updates for modeling mixtures of non-stationary sig-nals in the time-frequency domain," in EUSIPCO, Sep 2013.[6] R. Badeau and A. Drémeau, "Variational Bayesian EM algorithm for modeling mixtures of non-stationary signals in the time-frequency domain (HR-NMF)," in ICASSP, May 2013.

Page 41: Probabilistic modeling of non-stationary signals in … · Probabilistic modeling of non-stationary signals in time-frequency domain Application to music signals ... Round Midnight

Page 35 / 56 Roland Badeau Journée "Temps-Fréquence et Non-Stationnarité"

June 19, 2015

Variational EM algorithm

Goal: estimate the parameter θ of a probabilistic model involving

observations y and latent variables z

Idea: p(z|y ; θ) is approximated by a distribution q

Decomposition of log-likelihood L(θ) = ln(p(y ; θ)):

L(θ) = DKL(q||p(z|y ; θ)) + L(q; θ), where

DKL(q||p(z|y ; θ)) =⟨

ln(

q(z)p(z|y ;θ)

)⟩

q(KL divergence)

L(q; θ) =⟨

ln(

p(y,z;θ)q(z)

)⟩

q(variational free energy)

Since DKL ≥ 0, L(q; θ) is a lower bound of L(θ)

Method: maximize L(q; θ): at each iteration i ,

E-step (update q): q⋆ = argmaxq∈F

L(q; θi−1)

M-step (update θ): θi = argmaxθ

L(q⋆; θ)

Page 42: Probabilistic modeling of non-stationary signals in … · Probabilistic modeling of non-stationary signals in time-frequency domain Application to music signals ... Round Midnight

Page 36 / 56 Roland Badeau Journée "Temps-Fréquence et Non-Stationnarité"

June 19, 2015

Variational EM for multichannel HR-NMF

Parameters: θ = {as(f , τ), bms(f , ϕ, τ), σ2n,w

sfk , h

skt)}

Mean field approximation: q(z) =∏

s,f ,t

qsft(zs(f , t))

Complexity: 4MSFT (1 + 2Pb)(1 + max(Qb,Qa)) (linear w.r.t. all

model dimensions)

Parallel implementation

Application to real audio data:

Always converges to a relevant solution when S = 1

Needs proper initialization or semi-supervised learning when S > 1

[3] R. Badeau and M.D. Plumbley, "Multichannel high resolution NMF for modelling convolutivemixtures of non-stationary signals in the time-frequency domain" in IEEE Trans. Audio, Speech,Lang. Proc., vol. 22, no. 11, Nov 2014, pp. 1670–1680.

Page 43: Probabilistic modeling of non-stationary signals in … · Probabilistic modeling of non-stationary signals in time-frequency domain Application to music signals ... Round Midnight

Page 37 / 56 Roland Badeau Journée "Temps-Fréquence et Non-Stationnarité"

June 19, 2015

Part IV

Application to piano sounds

Page 44: Probabilistic modeling of non-stationary signals in … · Probabilistic modeling of non-stationary signals in time-frequency domain Application to music signals ... Round Midnight

Page 38 / 56 Roland Badeau Journée "Temps-Fréquence et Non-Stationnarité"

June 19, 2015

Application to piano tones

0 0.2 0.4 0.6 0.8 1 1.20

500

1000

1500

2000

2500

3000

3500

4000

Time (s)

Spectrogram of the original mixture signal

Fre

quen

cy (

Hz)

Spectrogram of the input piano sound (C4 + C3)

(F = C, S = 2, M = 1, Fs = 8600kHz)

Page 45: Probabilistic modeling of non-stationary signals in … · Probabilistic modeling of non-stationary signals in time-frequency domain Application to music signals ... Round Midnight

Page 39 / 56 Roland Badeau Journée "Temps-Fréquence et Non-Stationnarité"

June 19, 2015

Source separation

0 0.2 0.4 0.6 0.8 1 1.2-40

-20

0

20

40

Time (s)

2nd

harm

onic

(54

0 H

z)

(a) First component (C4)

0 0.2 0.4 0.6 0.8 1 1.2-60

-40

-20

0

20

40

Time (s)

4th

harm

onic

(54

0 H

z)

(b) Second component (C3)

Separation of two sinusoidal components

(real parts of STFT subband signals)

IS-NMF:

HR-NMF:

IS-NMF:

HR-NMF:

Page 46: Probabilistic modeling of non-stationary signals in … · Probabilistic modeling of non-stationary signals in time-frequency domain Application to music signals ... Round Midnight

Page 40 / 56 Roland Badeau Journée "Temps-Fréquence et Non-Stationnarité"

June 19, 2015

Audio inpainting (mono)

0 0.2 0.4 0.6 0.8 1 1.20

500

1000

1500

2000

2500

3000

3500

4000

Time (s)

Spectrogram of the original mixture signal

Fre

quen

cy (

Hz)

C4+C3:

C4 alone:

IS-NMF:

HR-NMF:

Spectrogram of the input piano sound (C4 + C3)

Page 47: Probabilistic modeling of non-stationary signals in … · Probabilistic modeling of non-stationary signals in time-frequency domain Application to music signals ... Round Midnight

Page 40 / 56 Roland Badeau Journée "Temps-Fréquence et Non-Stationnarité"

June 19, 2015

Audio inpainting (mono)

0 0.2 0.4 0.6 0.8 1 1.20

500

1000

1500

2000

2500

3000

3500

4000

Time (s)

Masked spectrogram

Fre

quen

cy (

Hz)

C4+C3:

C4 alone:

IS-NMF:

HR-NMF:

Masked spectrogram of the input piano sound

Page 48: Probabilistic modeling of non-stationary signals in … · Probabilistic modeling of non-stationary signals in time-frequency domain Application to music signals ... Round Midnight

Page 40 / 56 Roland Badeau Journée "Temps-Fréquence et Non-Stationnarité"

June 19, 2015

Audio inpainting (mono)

0 0.2 0.4 0.6 0.8 1 1.20

500

1000

1500

2000

2500

3000

3500

4000

Time (s)

Restored spectrogram

Fre

quen

cy (

Hz)

C4+C3:

C4 alone:

IS-NMF:

HR-NMF:

Recovery of the full C4 piano tone

Page 49: Probabilistic modeling of non-stationary signals in … · Probabilistic modeling of non-stationary signals in time-frequency domain Application to music signals ... Round Midnight

Page 41 / 56 Roland Badeau Journée "Temps-Fréquence et Non-Stationnarité"

June 19, 2015

Audio inpainting (stereo)

Time (s)

Fre

quen

cy (

Hz)

Left channel (m = 0)

0 0.2 0.4 0.6 0.8 1 1.20

1000

2000

3000

4000

5000

Time (s)

Fre

quen

cy (

Hz)

Right channel (m = 1)

0 0.2 0.4 0.6 0.8 1 1.20

1000

2000

3000

4000

5000

Input stereo piano MDCT ym(f , t) (F = R, S = 1, M = 2, Fs = 11kHz)

Page 50: Probabilistic modeling of non-stationary signals in … · Probabilistic modeling of non-stationary signals in time-frequency domain Application to music signals ... Round Midnight

Page 42 / 56 Roland Badeau Journée "Temps-Fréquence et Non-Stationnarité"

June 19, 2015

Audio inpainting (stereo)

Time (s)

Fre

quen

cy (

Hz)

Left channel (m = 0)

0 0.2 0.4 0.6 0.8 1 1.20

1000

2000

3000

4000

5000

Time (s)

Fre

quen

cy (

Hz)

Right channel (m = 1)

0 0.2 0.4 0.6 0.8 1 1.20

1000

2000

3000

4000

5000

Stereo image yms(f , t) estimated with Qa = Qb = Pb = 0

Page 51: Probabilistic modeling of non-stationary signals in … · Probabilistic modeling of non-stationary signals in time-frequency domain Application to music signals ... Round Midnight

Page 43 / 56 Roland Badeau Journée "Temps-Fréquence et Non-Stationnarité"

June 19, 2015

Audio inpainting (stereo)

Time (s)

Fre

quen

cy (

Hz)

Left channel (m = 0)

0 0.2 0.4 0.6 0.8 1 1.20

1000

2000

3000

4000

5000

Time (s)

Fre

quen

cy (

Hz)

Right channel (m = 1)

0 0.2 0.4 0.6 0.8 1 1.20

1000

2000

3000

4000

5000

Stereo image yms(f , t) estimated with Qa = 2, Qb = 3, Pb = 1

Page 52: Probabilistic modeling of non-stationary signals in … · Probabilistic modeling of non-stationary signals in time-frequency domain Application to music signals ... Round Midnight

Page 44 / 56 Roland Badeau Journée "Temps-Fréquence et Non-Stationnarité"

June 19, 2015

Overview of the HR-NMF model

Able to accurately represent multichannel, underdetermined

mixtures of sound sources in presence of reverberation

Achieved via an accurate TF implementation of ARMA filtering

Compatible with any filter bank (either real or complex)

Accounts for phases and correlations over time and frequency

Able to separate overlapping sinusoids within the same frequency

band (high spectral resolution)

Able to restore missing observations (synthesis capability)

Page 53: Probabilistic modeling of non-stationary signals in … · Probabilistic modeling of non-stationary signals in time-frequency domain Application to music signals ... Round Midnight

Page 45 / 56 Roland Badeau Journée "Temps-Fréquence et Non-Stationnarité"

June 19, 2015

Part V

Source separation benchmark

Page 54: Probabilistic modeling of non-stationary signals in … · Probabilistic modeling of non-stationary signals in time-frequency domain Application to music signals ... Round Midnight

Page 46 / 56 Roland Badeau Journée "Temps-Fréquence et Non-Stationnarité"

June 19, 2015

Source separation benchmark

Benchmark of several NMF-based methods involving phase recovery:

NMF-Wiener: Wiener filtering with NMF models of spectrograms

Phase reconstruction based on spectrogram consistency:

NMF-GL: NMF models with GL algorithm [Griffin & Lim, 1984]

NMF-LR: NMF models with LR algorithm [Leroux, 2008]

Complex NMF (CNMF) estimation of the STFTs of the sources:

CNMF: without any phase constraint [Kameoka, 2009]

CNMF-LR: with consistency phase constraints [Leroux, 2009]

HR-NMF (with a reduced frequency resolution in order to

compensate for the extra ARMA parameters)

[7] P. Magron, R. Badeau, B. David, "Phase recovery in NMF for audio source separation: aninsightful benchmark," in ICASSP, Apr 2015, pp. 81–85.

Page 55: Probabilistic modeling of non-stationary signals in … · Probabilistic modeling of non-stationary signals in time-frequency domain Application to music signals ... Round Midnight

Page 47 / 56 Roland Badeau Journée "Temps-Fréquence et Non-Stationnarité"

June 19, 2015

Source separation benchmark

Datasets:

Synthetic mixtures of two harmonic signals with additive white noise

Piano notes mixtures from the MAPS database [Emiya, 2010]

MIDI audio excerpt (bass and guitar)

Blind vs. Oracle approaches:

Blind: model parameters are estimated from the mixtures

Oracle: model parameters are learned from the isolated sources

Evaluation criteria: BSS EVAL Toolbox [Vincent, 2006]

SDR: Source to Distortion Ratio

SIR: Source to Interference Ratio

SAR: Source to Artifact Ratio

Page 56: Probabilistic modeling of non-stationary signals in … · Probabilistic modeling of non-stationary signals in time-frequency domain Application to music signals ... Round Midnight

Page 48 / 56 Roland Badeau Journée "Temps-Fréquence et Non-Stationnarité"

June 19, 2015

Synthetic mixtures of two harmonic signals

Page 57: Probabilistic modeling of non-stationary signals in … · Probabilistic modeling of non-stationary signals in time-frequency domain Application to music signals ... Round Midnight

Page 49 / 56 Roland Badeau Journée "Temps-Fréquence et Non-Stationnarité"

June 19, 2015

Piano notes mixtures

Page 58: Probabilistic modeling of non-stationary signals in … · Probabilistic modeling of non-stationary signals in time-frequency domain Application to music signals ... Round Midnight

Page 50 / 56 Roland Badeau Journée "Temps-Fréquence et Non-Stationnarité"

June 19, 2015

MIDI audio excerpt

Page 59: Probabilistic modeling of non-stationary signals in … · Probabilistic modeling of non-stationary signals in time-frequency domain Application to music signals ... Round Midnight

Page 51 / 56 Roland Badeau Journée "Temps-Fréquence et Non-Stationnarité"

June 19, 2015

Oracle separation of a MIDI audio excerpt

Mix NMF-Wiener HRNMF

Bass

Guitar

Keyboard

Page 60: Probabilistic modeling of non-stationary signals in … · Probabilistic modeling of non-stationary signals in time-frequency domain Application to music signals ... Round Midnight

Page 52 / 56 Roland Badeau Journée "Temps-Fréquence et Non-Stationnarité"

June 19, 2015

Conclusions of the benchmark

Spectrogram consistency may not be relevant for audio quality

Oracle results show the potential of the HR-NMF model in source

separation applications

Blind results show the difficulty of estimating this model without a

proper initialization

Solutions could involve:

Semi-supervised learning,

A priori information (harmonicity, smoothness, sparsity...),

New estimation methods (MCMC, belief propagation, high

resolution methods,...)

Page 61: Probabilistic modeling of non-stationary signals in … · Probabilistic modeling of non-stationary signals in time-frequency domain Application to music signals ... Round Midnight

Page 53 / 56 Roland Badeau Journée "Temps-Fréquence et Non-Stationnarité"

June 19, 2015

Part VI

Conclusion

Page 62: Probabilistic modeling of non-stationary signals in … · Probabilistic modeling of non-stationary signals in time-frequency domain Application to music signals ... Round Midnight

Page 54 / 56 Roland Badeau Journée "Temps-Fréquence et Non-Stationnarité"

June 19, 2015

Conclusions

Take-home message:

Possibility of designing TF transforms that better fit the assumption

of uncorrelated TF bins

Importance of modeling phases and correlations in the TF domain

Outlooks of the HRNMF model:

Introduce high temporal resolution (to model sharp transients)

2D linear prediction of TF state zs(f , t)) (to model vibrato, chirps)

Correlations between components (to model sympathetic vibration)

Non-stationary filters (to model attack-decay-sustain-release)

Applications:

Source coding, source separation, audio inpainting...

Page 63: Probabilistic modeling of non-stationary signals in … · Probabilistic modeling of non-stationary signals in time-frequency domain Application to music signals ... Round Midnight

Page 55 / 56 Roland Badeau Journée "Temps-Fréquence et Non-Stationnarité"

June 19, 2015

Thank you!

Page 64: Probabilistic modeling of non-stationary signals in … · Probabilistic modeling of non-stationary signals in time-frequency domain Application to music signals ... Round Midnight

Page 56 / 56 Roland Badeau Journée "Temps-Fréquence et Non-Stationnarité"

June 19, 2015

Bibliography

A. Liutkus and R. Badeau, “Generalized wiener filtering with fractional powerspectrograms,” in ICASSP, Apr. 2015, pp. 266–270.

R. Badeau and M. D. Plumbley, “Probabilistic time-frequency source-filter decomposition ofnon-stationary signals,” in EUSIPCO, Sep. 2013.

——, “Multichannel high resolution NMF for modelling convolutive mixtures ofnon-stationary signals in the time-frequency domain,” IEEE Trans. Audio, Speech, Lang.Process., vol. 22, no. 11, Nov. 2014, 1670–1680.

R. Badeau, “Gaussian modeling of mixtures of non-stationary signals in the time-frequencydomain (HR-NMF),” in WASPAA, Oct. 2011, pp. 253–256.

R. Badeau and A. Ozerov, “Multiplicative updates for modeling mixtures of non-stationarysignals in the time-frequency domain,” in EUSIPCO, Sep. 2013.

R. Badeau and A. Drémeau, “Variational Bayesian EM algorithm for modeling mixtures ofnon-stationary signals in the time-frequency domain (HR-NMF),” in ICASSP, May 2013, pp.6171–6175.

P. Magron, R. Badeau, and B. David, “Phase recovery in NMF for audio source separation:an insightful benchmark,” in ICASSP, Apr. 2015, pp. 81–85.