
HMY 636 / ECE 636: System identification

Instructor: Georgios Mitsis
Office: GP401
Office hours: Anytime (please contact me first)
Tel: 22892239
Email: [email protected]

Lecture 1: Introduction

Bibliography

Identification of linear systems
• Class textbook: Torsten Soderstrom and Petre Stoica, System identification, Prentice Hall
– free download: http://user.it.uu.se/~ts/personal02.html#Ordering_of_the_book_System_Identification
• Lennart Ljung, System identification: Theory for the user, Prentice Hall
• Peter J. Brockwell and Richard A. Davis, Time Series: Theory and Methods, Springer Series in Statistics

Identification of nonlinear systems
• Vasilis Marmarelis, Nonlinear Dynamic Modeling of Physiological Systems, Wiley‐IEEE
• David Westwick and Robert Kearney, Identification of Nonlinear Physiological Systems, IEEE Series in Biomedical Engineering
• Oliver Nelles, Nonlinear system identification: From classical approaches to neural networks and fuzzy models, Springer

Signals and Systems
• Alan V. Oppenheim and Alan S. Willsky, Signals and Systems, Prentice Hall Series in Signal Processing
• Julius S. Bendat and Alan G. Piersol, Random Data: Analysis and Measurement Procedures, Wiley Series in Probability and Statistics

Assessment and grading
• Midterm exam 30%
• Final exam 40%
• Homeworks (including Matlab exercises) 30%

Website: http://www.eng.ucy.ac.cy/gmitsis/ece636/

• Class outline
– Introduction
– Overview of random variables and random signals
– Random signals and linear systems
– Input signals, nonparametric identification methods, coherence
– Linear regression and parameter estimation
– Linear system models
– Prediction error methods
– Instrumental variable methods
– Recursive identification
– Identification of closed‐loop systems
– Model order selection and validation
– Practical issues
– Nonlinear system identification

Introduction

• System: Any entity which transforms an input signal into an output signal
• Mathematical representation: The mapping S which transforms the input signal x(t) into the output signal y(t):
$$y(t) = S[x(t)]$$
• This mapping (S) may assume different forms, such as:
– (Linear/nonlinear) differential equations
– Impulse response
– Nonlinear mappings (e.g., Volterra series)
• When S, x are known: systems analysis/simulation
• When S is unknown and x, y are known (measured): system identification

Systems ‐ examples

• Electrical circuits
• Image transformation algorithms (e.g. edge detection)
• Telecommunication systems
• Physiological/biological systems
– Blood pressure regulation system
• Control systems
– Regulation of signal values within a desired range (industrial, medical applications)
• …

• System identification typically includes the following steps:
– If we have control of the I/O experiment, selection of a suitable protocol, input signal etc.
– Choice of the candidate model type (model set selection)
– Selection of a member of this set and determination of the values of its parameters (model estimation)
– Assessment of the quality of the model (model validation)
• Moreover, it combines several scientific disciplines: signals and systems theory, probability/statistics, parameter estimation etc.

Systems and models

• System: the “true” phenomenon which we are trying to approximate with a model (important to keep in mind!)
• Typically we consider that the output data contain some kind of noise ε:
$$y = z + \varepsilon$$
• The system output may depend on variables that we cannot control and/or even measure (unobservable variables)
• υ: Disturbance

[Figure: block diagram of the system S with input x and disturbance υ; the noise ε is added to the system output z to give the measured output y]

Basic system categories

• Static / Dynamic
• Causal / Noncausal
• Bounded‐input bounded‐output (BIBO) stable / unstable
• Time invariant / Time varying
• Linear / Nonlinear
• Linear time invariant (LTI) systems: if $x(t) \to y(t)$, then

Time invariance:
$$x(t - t_0) \to y(t - t_0)$$

Linearity: if $x_1(t) \to y_1(t)$ and $x_2(t) \to y_2(t)$, then
$$a_1 x_1(t) + a_2 x_2(t) \to a_1 y_1(t) + a_2 y_2(t)$$
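The two LTI properties above can be probed numerically. The sketch below is only an illustration: the two systems (a causal 3‐point moving average and a squarer) and the test signals are hypothetical choices, not taken from the course material, and passing the check on a few signals is a necessary rather than a sufficient condition for linearity or time invariance.

```python
def moving_average(x):
    """Hypothetical LTI system: causal 3-point moving average, zero initial conditions."""
    padded = [0.0, 0.0] + list(x)
    return [(padded[n] + padded[n + 1] + padded[n + 2]) / 3.0 for n in range(len(x))]


def squarer(x):
    """Hypothetical static nonlinear system: y(n) = x(n)^2."""
    return [v * v for v in x]


def is_linear(system, x1, x2, a1, a2, tol=1e-12):
    """Superposition test: S[a1*x1 + a2*x2] == a1*S[x1] + a2*S[x2] on the given signals."""
    combined_in = [a1 * u + a2 * v for u, v in zip(x1, x2)]
    lhs = system(combined_in)
    rhs = [a1 * u + a2 * v for u, v in zip(system(x1), system(x2))]
    return all(abs(p - q) <= tol for p, q in zip(lhs, rhs))


def is_time_invariant(system, x, shift, tol=1e-12):
    """Delaying the input by `shift` samples should delay the output by the same amount."""
    delayed_in = [0.0] * shift + list(x)
    lhs = system(delayed_in)
    rhs = [0.0] * shift + system(x)
    return all(abs(p - q) <= tol for p, q in zip(lhs, rhs))


x1 = [1.0, 2.0, -1.0, 0.5, 3.0]
x2 = [0.0, -1.0, 4.0, 2.0, 1.0]
print(is_linear(moving_average, x1, x2, 2.0, -3.0))  # True
print(is_linear(squarer, x1, x2, 2.0, -3.0))         # False
print(is_time_invariant(moving_average, x1, 2))      # True
```

The squarer is time invariant but fails superposition, so it is not LTI.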

Basic model types

• «Black box» / Nonparametric / Data‐driven
We do not assume any a priori structure for the system, i.e. we view it as a “black box”
Linear systems: time domain ‐ impulse response; frequency domain ‐ frequency response
Nonlinear systems: Volterra‐Wiener series

• «Grey box» / Parametric
We presume some a priori structure (possibly from empirical knowledge), e.g. differential equations stemming from laws of physics, block diagrams etc.

Typically, black‐box models are more difficult to estimate and interpret but are also more flexible

Black box models

Linear systems
h(n): Impulse response
M: System memory
$$y(n) = \sum_{m=0}^{M} h(m)\, x(n-m) \;\Rightarrow\; Y(e^{j\omega}) = H(e^{j\omega})\, X(e^{j\omega})$$

Nonlinear systems
k_q: Volterra kernel of order q
Q: System order
$$y(n) = \sum_{q=0}^{Q} \left\{ \sum_{m_1=0}^{M} \cdots \sum_{m_q=0}^{M} k_q(m_1,\ldots,m_q)\, x(n-m_1) \cdots x(n-m_q) \right\}$$

[Figure: example first‐order kernel k1(m) and second‐order kernel k2(m1,m2), with the system memory indicated]
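As a minimal sketch of how these two black‐box models produce an output, the code below evaluates the convolution sum and a Volterra series truncated at second order. The function names, the kernel values, the test input and the convention x(n) = 0 for n < 0 are illustrative assumptions, not part of the lecture.

```python
def fir_output(h, x):
    """Linear model: y(n) = sum_{m=0}^{M} h(m) x(n-m), assuming x(n) = 0 for n < 0."""
    y = []
    for n in range(len(x)):
        acc = 0.0
        for m in range(len(h)):
            if n - m >= 0:
                acc += h[m] * x[n - m]
        y.append(acc)
    return y


def volterra2_output(k0, k1, k2, x):
    """Volterra model truncated at order 2, with memory M = len(k1) - 1.

    k0 is a constant, k1 a list of length M+1, k2 an (M+1) x (M+1) nested list.
    """
    size = len(k1)
    y = []
    for n in range(len(x)):
        # Past input samples x(n), x(n-1), ..., x(n-M), zero before the record starts
        past = [x[n - m] if n - m >= 0 else 0.0 for m in range(size)]
        acc = k0
        acc += sum(k1[m] * past[m] for m in range(size))
        acc += sum(k2[m1][m2] * past[m1] * past[m2]
                   for m1 in range(size) for m2 in range(size))
        y.append(acc)
    return y


h = [1.0, 0.5, 0.25]            # hypothetical impulse response, M = 2
x = [1.0, 0.0, 0.0, 2.0]        # hypothetical input
print(fir_output(h, x))         # [1.0, 0.5, 0.25, 2.0]

zeros = [[0.0] * 3 for _ in range(3)]
quad = [[0.1, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
print(volterra2_output(0.0, h, zeros, x))  # reduces to the linear model
print(volterra2_output(0.0, h, quad, x))   # adds a 0.1 * x(n)^2 term
```

With a zero second‐order kernel and k0 = 0, the Volterra model collapses to the convolution sum, which is why the linear model is the Q = 1 special case.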

Impulse response and system properties

• Causal systems:
$$h(n) = 0, \quad n < 0$$
• BIBO stable systems:
$$\sum_{n=-\infty}^{\infty} |h(n)| < \infty$$
• Causal & stable:
$$\sum_{n=0}^{\infty} |h(n)| < \infty$$

Grey box / Parametric models

• Linear systems: Linear differential (continuous time) / difference (discrete time) equations
$$a_2 \frac{d^2 y(t)}{dt^2} + a_1 \frac{dy(t)}{dt} + y(t) = b_1 \frac{du(t)}{dt} + b_0 u(t)$$
$$a_2 y(n-2) + a_1 y(n-1) + y(n) = b_1 u(n-1) + b_0 u(n)$$

• Nonlinear systems: Nonlinear differential/difference equations
$$y(n) + a_1 y(n-1) + a_2 y(n-2) + a_{11} y^2(n-1) = b_1 u(n-1) + b_0 u(n)$$

Block models, e.g. L‐N cascade (linear dynamic block L with input u and output z, followed by a static nonlinearity N with output y):
$$\frac{dz(t)}{dt} + a_1 z(t) = b_1 \frac{du(t)}{dt}$$
$$y(t) = c_1 z(t) + c_2 z^2(t)$$
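A difference‐equation model of this kind can be simulated by solving for y(n) and iterating forward in time. The sketch below does this for the second‐order difference equation, with an optional quadratic term a11·y²(n−1) for the nonlinear variant; the function name, the parameter values and the zero initial conditions are illustrative assumptions.

```python
def simulate(u, a1, a2, b0, b1, a11=0.0):
    """Iterate y(n) = b0 u(n) + b1 u(n-1) - a1 y(n-1) - a2 y(n-2) - a11 y(n-1)^2.

    Zero initial conditions are assumed (u and y are zero before n = 0).
    Setting a11 = 0 gives the linear difference equation.
    """
    y = []
    for n in range(len(u)):
        u_prev = u[n - 1] if n >= 1 else 0.0
        y_prev1 = y[n - 1] if n >= 1 else 0.0
        y_prev2 = y[n - 2] if n >= 2 else 0.0
        y.append(b0 * u[n] + b1 * u_prev
                 - a1 * y_prev1 - a2 * y_prev2 - a11 * y_prev1 * y_prev1)
    return y


# Impulse response of the hypothetical first-order case y(n) - 0.5 y(n-1) = u(n)
impulse = [1.0] + [0.0] * 5
h = simulate(impulse, a1=-0.5, a2=0.0, b0=1.0, b1=0.0)
print(h)  # [1.0, 0.5, 0.25, 0.125, 0.0625, 0.03125]

# Same system with a small quadratic feedback term (nonlinear model)
print(simulate(impulse, a1=-0.5, a2=0.0, b0=1.0, b1=0.0, a11=0.1))
```

Only a handful of parameters describe the whole response here, which is the practical appeal of parametric models compared with estimating every sample of h(n).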

Curve fitting

Several of the basic principles of system identification can be qualitatively understood with the simple problem of curve fitting, i.e.:

‐ Given a sequence of N observed values {x1,x2,…,xN} of an independent variable (regressor) x and the corresponding observed values {t1,t2,…,tN} of the dependent (output) variable, where tk=g(xk)+εk and εk is noise, find an estimate $\hat{g}_N(x)$ of the true function g(x)
‐ In this case assume that g(x)=sin(2πx)
‐ Assume also that we use a model that describes the data of the form:

Surface fitting

This could also be done for a higher‐dimensional problem, e.g. if we have two independent variables (regressors) we have a problem of surface fitting

Curve fitting

- We want to estimate $\hat{g}_N(x)$ based on the observations {x1,t1, x2,t2, …, xN,tN}
- Step 1: Data observations √
‐ Step 2: Choice of model set
In this case we are looking for polynomial models of the type:
$$g(x, \mathbf{w}) = \sum_{j=0}^{M} w_j x^j$$
Note: In general we could have (parametrization):
$$g(x) = \sum_{k=1}^{n} \alpha_k f_k(x)$$
where f_k are basis functions, and where we would have to select:
‐ The type of the basis functions
‐ The number of the basis functions n
- Step 3: Choice of the criterion of fit. A common choice is the least‐squares criterion
‐ Step 4: Calculate the model coefficients, i.e., wj or αk
‐ Step 5: Model validation

Curve fitting

- Least squares criterion / cost function:
$$\hat{\mathbf{w}}_N = \arg\min_{\mathbf{w}} E_N(\mathbf{w})$$
$$E_N(\mathbf{w}) = \frac{1}{N} \sum_{k=1}^{N} \left( t_k - g(x_k, \mathbf{w}) \right)^2$$
or alternative versions of this (e.g. weighted least squares)
‐ As we will see (linear regression), the unknown parameter values which minimize this criterion can be analytically determined
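The least‐squares fit above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the model is taken to be a polynomial g(x, w) = w0 + w1·x + … + wM·x^M, the coefficients are obtained from the normal equations (ΦᵀΦ)w = Φᵀt, and the helper names, noise level and random seed are hypothetical. A plain Gaussian elimination is used only to keep the sketch dependency‐free; it is not a numerically robust choice for high polynomial orders.

```python
import math
import random


def solve(A, b):
    """Solve A w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    w = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (M[i][n] - s) / M[i][i]
    return w


def polyfit_ls(x, t, order):
    """Least-squares coefficients w_0..w_order via the normal equations."""
    Phi = [[xk ** j for j in range(order + 1)] for xk in x]
    n = order + 1
    A = [[sum(row[i] * row[j] for row in Phi) for j in range(n)] for i in range(n)]
    b = [sum(row[i] * tk for row, tk in zip(Phi, t)) for i in range(n)]
    return solve(A, b)


def mse(x, t, w):
    """Cost (1/N) * sum_k (t_k - g(x_k, w))^2 for the polynomial model."""
    return sum((tk - sum(wj * xk ** j for j, wj in enumerate(w))) ** 2
               for xk, tk in zip(x, t)) / len(x)


random.seed(0)  # arbitrary seed, for repeatability only
N = 10
x = [k / (N - 1) for k in range(N)]
t = [math.sin(2 * math.pi * xk) + random.gauss(0.0, 0.2) for xk in x]
for order in (1, 3):
    w = polyfit_ls(x, t, order)
    print(order, round(mse(x, t, w), 4))
```

The training error can only decrease as the order grows, because each lower‐order polynomial is a special case of the higher‐order one; this is why the training error alone cannot be used to pick the order.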

Curve fitting

• How do we select the order of the polynomial fitting function?
• What happens when we use a different data set (testing data) for model validation?
• More data observations give (much) better results!

Regularization

• Penalty on coefficients with large magnitudes (shrinkage)
• Equivalent to maximum a posteriori estimation (more to follow)
• For a squared norm: ridge regression
• For M=9:

[Figure: regularized polynomial fits for M=9]
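A minimal sketch of ridge regression for the polynomial curve‐fitting problem follows. It assumes the penalized cost Σk (tk − g(xk, w))² + λ Σj wj², whose minimizer satisfies (ΦᵀΦ + λI)w = Φᵀt; the exact scaling of the penalty used in the lecture is not recoverable from the slide, and the function names, λ values, noise level and seed are hypothetical. All coefficients, including w0, are penalized here for simplicity.

```python
import math
import random


def solve(A, b):
    """Solve A w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    w = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (M[i][n] - s) / M[i][i]
    return w


def ridge_polyfit(x, t, order, lam):
    """Minimise sum_k (t_k - g(x_k, w))^2 + lam * sum_j w_j^2 for a polynomial g."""
    Phi = [[xk ** j for j in range(order + 1)] for xk in x]
    n = order + 1
    A = [[sum(row[i] * row[j] for row in Phi) + (lam if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    b = [sum(row[i] * tk for row, tk in zip(Phi, t)) for i in range(n)]
    return solve(A, b)


def norm(w):
    """Euclidean norm of the coefficient vector."""
    return math.sqrt(sum(wj * wj for wj in w))


random.seed(1)  # arbitrary seed, for repeatability only
N = 10
x = [k / (N - 1) for k in range(N)]
t = [math.sin(2 * math.pi * xk) + random.gauss(0.0, 0.2) for xk in x]
for lam in (1e-6, 1e-3, 1.0):
    w = ridge_polyfit(x, t, order=9, lam=lam)
    print(lam, round(norm(w), 3))
```

Larger λ shrinks the coefficient vector, trading a slightly worse fit on the training data for a smoother estimate.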

Nonlinear model of action potential encoding

[Figure: block diagram of the model: the input x(n) feeds three linear filters μ1, μ2, μ3 with outputs u1, u2, u3, followed by the static nonlinearity f(u1,u2,u3) and the threshold function φ(u1,u2,u3,p); v(n) is an intermediate signal and ŷ(n) the predicted output]

Principal Dynamic Mode model
• Wiener/Bose model, minimum set of linear filters (dynamic modes)
• Eigenvalue analysis of Volterra kernel matrix (modes = eigenvectors):
$$R = \begin{bmatrix} k_0 & \tfrac{1}{2} k_1^T \\ \tfrac{1}{2} k_1 & k_2 \end{bmatrix}$$
• Three significant eigenvalues
• Static nonlinearity f(u1,u2,u3): Firing probability
• Threshold function φ(u1,u2,u3,p)

Mechanoreceptor dynamic modes and firing probability functions

• Encoding of various displacement parameters
– μ1: HP (high‐pass) ‐ velocity
– μ2: BP/HP (band‐pass/high‐pass) ‐ position (displacement magnitude), selective velocity
– μ3: LP (low‐pass) ‐ cumulative position
• Directional selectivity in encoding

[Figure: probability of firing as a function of u1, u2 and u3 for Type A and Type B neurons]

Mechanoreceptor model predictions – ROC curves

Model         Type A                Type B
              Neuron 1   Neuron 2   Neuron 1   Neuron 2
Linear        0.989      0.953      0.919      0.956
μ1, μ2        0.988      0.936      0.938      0.958
μ1, μ2, μ3    0.989      0.969      0.960      0.965

Cerebral blood flow regulation

• Autoregulation: homeostatic regulation of own blood flow
• Brain: extremely effective autoregulation
– 2% of body mass
– 15% of total cardiac output, 20% of O2 consumption

Nonlinear model of cerebral hemodynamics

• Nonlinear, two‐input model of cerebral hemodynamics
• Inputs: ABP, PETCO2 variations
• Output: CBFV variations
• Simultaneous assessment of dynamic pressure autoregulation and CO2 reactivity
• Includes MABP‐CO2 interactions

[Figure: two‐input model structure: the ABP input (dynamic pressure autoregulation) and the CO2 input (dynamic CO2 reactivity) each feed a bank of filters b_j^(1) and b_j^(2); the filter outputs pass through f1, …, fK and are summed to give CBFV]

Cerebral hemodynamics under resting conditions

Experimental data
• 14 subjects
• Resting conditions, 45 mins
• Training data: 6 min (360 points)
• Validation data: 1 min

NMSE [%]
System order       MABP         CO2          MABP & CO2
Linear (Q=1)       32.8±13.2    71.6±12.1    24.8±11.1
Nonlinear (Q=3)    20.0±9.2     51.5±10.4    14.5±6.9

Cerebral hemodynamics under resting conditions – Linear components

• MABP: HP characteristic
– Slow MABP changes regulated more effectively
• PETCO2: LP characteristic
– Slow CO2 changes have more effect on CBF

Cerebral hemodynamics under resting conditions – Nonlinear components

• Second‐order kernels
– Most power in VLF, LF
• Relative contribution of NL terms more significant for PETCO2

Nonlinear to linear terms power ratio
MABP: 0.31±0.13
PETCO2: 1.18±0.45

Lecture 2: Random variables and random signals

Deterministic and stochastic/random signals

• Deterministic variables/signals: Values are exactly known
• Stochastic/Random variables/signals: Values follow some probabilistic distribution
• In system identification, we typically assume that the input is a deterministic signal and the noise and disturbance signals are stochastic signals

[Figure: block diagram of the system S with input x and disturbance υ; the noise ε is added to the system output z to give the measured output y]

Random variables: basics

• For a random variable X, the probability distribution function is defined as:
$$F(x) = \mathrm{Prob}\{X \le x\}$$
• If the random variable is continuous, the probability density function (pdf) is defined as:
$$p(x) = \frac{dF(x)}{dx}$$
and:
$$\mathrm{Prob}\{x \le X < x + dx\} = p(x)\,dx, \quad p(x) \ge 0$$
$$\int_{-\infty}^{\infty} p(x)\,dx = 1, \quad \lim_{x \to \infty} p(x) = 0$$

Random variables: basics

• Expected value: $E\{X\} = \int_{-\infty}^{\infty} x\, p(x)\,dx$
• For any function f of the r.v. X: $E\{f(X)\} = \int_{-\infty}^{\infty} f(x)\, p(x)\,dx$
• For discrete r.v.'s: $E\{X\} = \sum_i x_i\, \mathrm{Prob}\{X = x_i\}$
• Mean value: $\mu = E\{X\}$
• Moment of order k: $E\{X^k\}$
• Mean square value (moment of order 2): $E\{X^2\}$
• Central moment of order k: $E\{(X - \mu)^k\}$
• Variance (central moment of order 2): $\sigma^2 = E\{(X - \mu)^2\}$
• Standard deviation: $\sigma = \sqrt{\sigma^2}$
• The joint probability distribution function between two random variables X, Y is defined as: $F(x, y) = \mathrm{Prob}\{X \le x, Y \le y\}$
• Independent random variables: $p(x, y) = p(x)\, p(y)$
• Uncorrelated random variables X, Y: $E\{XY\} = E\{X\}\, E\{Y\}$
• Two independent random variables are also uncorrelated, but the converse is not necessarily true

The normal (Gaussian) distribution

– Many random variables in practice approximate the normal distribution
– Central limit theorem
If Χ1,Χ2,…,ΧΝ are independent random variables with almost any pdf (as long as the moment $M_k < \infty$ for some k>2), with means and standard deviations {(μ1,σ1), (μ2,σ2), …, (μΝ,σΝ)}, then their sum Χ=ΣαiΧi approximately follows a normal distribution (for large N) with
$$\mu = \sum_{i=1}^{N} \alpha_i \mu_i, \quad \sigma^2 = \sum_{i=1}^{N} \alpha_i^2 \sigma_i^2$$
– Very powerful result
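The central limit theorem statement above can be checked with a short simulation. The sketch below is illustrative only: it assumes independent Uniform(0, 1) variables (mean 1/2, variance 1/12), a hypothetical set of weights αi, and an arbitrary seed and sample count. It compares the sample mean and variance of the weighted sum with the two formulas, and reports the fraction of samples within one standard deviation of the mean, which is about 0.68 for a normal distribution.

```python
import random
import statistics


def weighted_sum_samples(alphas, n_samples, seed=0):
    """Draw samples of X = sum_i alpha_i * X_i with independent X_i ~ Uniform(0, 1)."""
    rng = random.Random(seed)
    return [sum(a * rng.random() for a in alphas) for _ in range(n_samples)]


alphas = [1.0, 2.0, 0.5, 1.5] * 3   # 12 hypothetical weights
mu_i, var_i = 0.5, 1.0 / 12.0       # mean and variance of Uniform(0, 1)

# Theoretical mean and variance of the weighted sum
mu_theory = sum(a * mu_i for a in alphas)
var_theory = sum(a * a * var_i for a in alphas)

samples = weighted_sum_samples(alphas, 20000)
print(mu_theory, statistics.mean(samples))
print(var_theory, statistics.variance(samples))

# Fraction of samples within one standard deviation of the mean
sd = var_theory ** 0.5
within = sum(abs(s - mu_theory) <= sd for s in samples) / len(samples)
print(within)
```

Even though each term is uniformly distributed, the sum of only twelve of them is already close to bell‐shaped.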

Random (stochastic) signals

• Stochastic/random signal: A signal whose value at each time point n results from a corresponding random variable x[n]. The signal consists of a set of these random variables (one for each n). Generally such a set of random variables is termed a stochastic/random process. A stochastic signal may be viewed as a particular sequence of samples that results from the corresponding random process; in other words, it is a realization of this random process.
• In order to fully describe a random process, we need both the individual probability distribution functions for each r.v. and the joint probability distributions between the underlying r.v.'s at different times.
• The random process, which is the set of these random variables, is essentially a probabilistic model which describes the random signal. In the general case, the distributions that describe the process depend on the time index n

Stochastic signals: basics

• Random signals are sequences of random variables. The autocorrelation function of a random signal (or random process) x[n] is defined as:
$$r_{xx}[n, m] = E\{x[n]\, x[m]\}$$
while the autocovariance function is defined as:
$$c_{xx}[n, m] = E\{(x[n] - \mu_x[n])(x[m] - \mu_x[m])\}$$
• The cross‐correlation function between two random processes/signals x[n], y[n] is defined as:
$$r_{xy}[n, m] = E\{x[n]\, y[m]\}$$
• Similarly we define the cross‐covariance function:
$$c_{xy}[n, m] = E\{(x[n] - \mu_x[n])(y[m] - \mu_y[m])\}$$
• A stationary random signal exhibits statistical properties that are independent of the time index n, i.e.:
$$\mu_x[n] = \mu_x, \quad \sigma_x^2[n] = \sigma_x^2$$
• Moreover, the autocorrelation function of a stationary random signal depends only on the difference between n and m, i.e.:
$$r_{xx}[n, m] = r_{xx}[m - n]$$
• A random signal/process is defined as strictly stationary when all its moments are independent of n. If only the moments up to second order are independent of time (as defined above), we have a weakly/wide‐sense stationary random process. Strict stationarity implies weak stationarity, but the converse does not hold in general.

Stochastic signals: basics

• In practice, statistical quantities are estimated from finite samples of the random signal
• Ensemble of signals: A set of realizations of the random process that corresponds to the signal
• If the statistical properties computed along each realization (time averages) are the same as those computed across the ensemble, the signal is termed ergodic
• For every n0 the values {x1[n0],x2[n0],…,xK[n0]} are samples of the random variable x[n0].
• How do we estimate statistical quantities in practice? We usually have a finite number of realizations of the random process (quite often only one), each one with a finite length L. If our signal is stationary and ergodic we can estimate these quantities from this single realization
• The estimate of the mean signal value is given by the sample mean:
$$\hat{\mu} = \frac{1}{L} \sum_{n=0}^{L-1} x[n]$$
• The estimate of the signal variance is given by the sample variance:
$$\hat{\sigma}^2 = \frac{1}{L-1} \sum_{n=0}^{L-1} (x[n] - \hat{\mu})^2$$
Matlab: mean, var, std
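As a small illustration of estimating the mean and variance from a single finite realization, the sketch below implements both estimators in plain Python. The function names, the toy data and the simulated white Gaussian noise (true mean 1, true standard deviation 2, arbitrary seed) are assumptions made for the example; the `unbiased` switch selects between the 1/(L−1) and 1/L normalizations, since the slide does not say which one is intended.

```python
import random


def sample_mean(x):
    """Sample mean: (1/L) * sum of x[n]."""
    return sum(x) / len(x)


def sample_variance(x, unbiased=True):
    """Sample variance, normalized by L-1 (unbiased) or by L."""
    mu = sample_mean(x)
    ss = sum((v - mu) ** 2 for v in x)
    return ss / (len(x) - 1 if unbiased else len(x))


x = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
print(sample_mean(x))                      # 5.0
print(sample_variance(x, unbiased=False))  # 4.0
print(sample_variance(x))                  # 32/7, about 4.571

# One realization of a stationary, ergodic process (white Gaussian noise):
# the estimates approach the true mean (1.0) and variance (4.0) as L grows.
rng = random.Random(0)
for L in (100, 10000):
    noise = [rng.gauss(1.0, 2.0) for _ in range(L)]
    print(L, round(sample_mean(noise), 3), round(sample_variance(noise), 3))
```

The standard deviation estimate is simply the square root of the sample variance.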


Stochastic signals: basics

• The autocorrelation is estimated by (Matlab: xcorr)
• The above sums converge to the true values for L → ∞
• Properties of correlation/covariance functions:
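The autocorrelation estimate mentioned above can be sketched as follows. The slide's own formula is not recoverable, so this assumes the common biased estimator r̂[m] = (1/L) Σ x[n]·x[n+m] for non‐negative lags of a real‐valued signal, which corresponds to the 'biased' scaling option of Matlab's xcorr; the function name, test signal and seed are hypothetical.

```python
import random


def autocorr_biased(x, max_lag):
    """Biased estimate r[m] = (1/L) * sum_{n=0}^{L-1-m} x[n] x[n+m], for m = 0..max_lag."""
    L = len(x)
    return [sum(x[n] * x[n + m] for n in range(L - m)) / L
            for m in range(max_lag + 1)]


x = [1.0, 2.0, 3.0, 4.0]
print(autocorr_biased(x, 2))  # [7.5, 5.0, 2.75]

# Zero-mean, unit-variance white noise: r[0] should be close to 1 and the
# estimates at nonzero lags close to 0, increasingly so as L grows.
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(5000)]
print([round(r, 3) for r in autocorr_biased(noise, 3)])
```

Dividing by L rather than by the number of terms L − m makes the estimate biased at large lags, but it keeps the variance of the estimate small there.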