Advanced Statistical Methods
lectures for doctoral students
(work-in-progress lecture notes)
Contents
1 Introductory lecture
2 Model and simulation
2.1 Regression model
2.2 Categorical model
2.3 Logistic model
2.4 State-space model
2.5 Remark
3 Estimation
3.1 Bayes' formula
3.2 Exponential model
3.3 Regression model
3.4 Categorical model
4 Prediction
4.1 Regression model
4.2 Categorical model
5 Control
5.1 Regression model
5.2 Categorical model
6 Adaptation
7 Mixture model and its estimation
7.1 Model
7.2 Estimation
8 Classification: classical and with a mixture model
9 State-space model and filtering
9.1 Known model parameters
9.2 Unknown model parameters
10 Hypothesis tests
11 Applications
11.1 Prediction of traffic flow intensity (regression model)
11.2 Estimation of queue length (discrete model)
11.3 Classification of road elements safety (logistic model)
11.4 Testing of safety of road elements in urban traffic network
11.5 Estimation of error in a guess about the length of a time interval
12 Writing in LyX
1 Introductory lecture
Web: http://www.fd.cvut.cz/personal/nagyivan/ under "Doktorské studium" (Doctoral studies)

Agreements
1. I do not mind whether you attend the lectures or study at home. In the end you must know the material, both some of the theory and its use in a concrete case.
2. During the lectures you can ask for whatever you like: what you want and how you want it, more theory or more applications, walking through the programs ...
3. You will be graded on a completed application, either your own (the topic of your dissertation) or an invented one, preferably with real data or at least with simulated data. The work must be handed in on paper at a professional level (this is not harassment but preparation for the dissertation). It should demonstrate one of the tasks we will deal with.
I recommend LyX: http://www.lyx.org/
Programs
Program files are named UxyCo.., where x = model, y = task and Co = type of task:

x  model          y  task
1  continuous     1  simulation
2  discrete       2  estimation
3  logistic       3  prediction
4  state-space    4  control
5  mixture

List of programs
progEL.pdf - description of the tasks
ScIntro.sce - initialization of the program in Scilab
U11sim.sce - simulation with a regression model
U11simN.sce - simulation with a MIMO regression model
U12est.sce - estimation of a regression model from statistics
U12estB.sce - estimation of a regression model by least squares (LS)
U12estN.sce - estimation of a MIMO regression model
U13pre.sce - prediction with a regression model
U14reg.sce - control with a regression model
U21sim.sce - simulation with a categorical model
U22est.sce - estimation with a categorical model
U23pre.sce - prediction with a categorical model
U24reg.sce - control with a categorical model
U31sim.sce - simulation with a logistic model
U41sim.sce - simulation with a state-space model
U42est.sce - state estimation
U51simM.sce - simulation with a mixture model
U52estM.sce - estimation of a mixture model
2 Model and simulation
A model is the conditional probability density function (pdf) of the modeled variable, given the explanatory variables and the parameters,
\[ f(y_t \mid \psi_t, \Theta) \tag{2.1} \]
where $y_t$ is the modeled variable, $\psi_t$ is the regression vector containing the variables on which $y_t$ depends (it often includes delayed variables), and $\Theta$ are the model parameters.
A model can be either static or dynamic ($\psi_t$ contains delayed values of the output).
Static model - represents a probabilistic hill (the density of realizations) centered at the mean value, with a width given by the covariance matrix (the meaning of its elements). It suits systems that do not evolve in time.
Dynamic model - the centers of the hills depend on delayed data; the hills move in the data space.
We also call "model" everything that defines the density (2.1). It can be an equation or a table.
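2.1 Regression model
It is defined by the equation
\[ y_t = \theta' \psi_t + e_t \]
where ···.

Multivariate model
\[
\begin{bmatrix} y_{1;t} \\ y_{2;t} \end{bmatrix}
= \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}
  \begin{bmatrix} y_{1;t-1} \\ y_{2;t-1} \end{bmatrix}
+ \begin{bmatrix} b_1 \\ b_2 \end{bmatrix} u_t
+ \begin{bmatrix} k_1 \\ k_2 \end{bmatrix}
+ \begin{bmatrix} e_{1;t} \\ e_{2;t} \end{bmatrix}
= \begin{bmatrix} a_{11} & a_{12} & b_1 & k_1 \\ a_{21} & a_{22} & b_2 & k_2 \end{bmatrix}
  \begin{bmatrix} y_{1;t-1} \\ y_{2;t-1} \\ u_t \\ 1 \end{bmatrix}
+ \begin{bmatrix} e_{1;t} \\ e_{2;t} \end{bmatrix}
\]
\[
\rightarrow \quad
- \begin{bmatrix} -1 & 0 & a_{11} & a_{12} & b_1 & k_1 \\ 0 & -1 & a_{21} & a_{22} & b_2 & k_2 \end{bmatrix}
  \begin{bmatrix} y_{1;t} \\ y_{2;t} \\ y_{1;t-1} \\ y_{2;t-1} \\ u_t \\ 1 \end{bmatrix}
= \begin{bmatrix} e_{1;t} \\ e_{2;t} \end{bmatrix}
\]
where the data vector is called the extended regression vector $\Psi_t = [y_t', \psi_t']'$.

Static model
\[
\begin{bmatrix} y_{1;t} \\ y_{2;t} \\ y_{3;t} \end{bmatrix}
= \begin{bmatrix} k_1 \\ k_2 \\ k_3 \end{bmatrix}
+ \begin{bmatrix} e_{1;t} \\ e_{2;t} \\ e_{3;t} \end{bmatrix}
\quad \rightarrow \quad
- \begin{bmatrix} -1 & 0 & 0 & k_1 \\ 0 & -1 & 0 & k_2 \\ 0 & 0 & -1 & k_3 \end{bmatrix}
  \underbrace{\begin{bmatrix} y_{1;t} \\ y_{2;t} \\ y_{3;t} \\ 1 \end{bmatrix}}_{\Psi_t}
= \begin{bmatrix} e_{1;t} \\ e_{2;t} \\ e_{3;t} \end{bmatrix}
\]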
Example
Model of car fuel consumption
On a test car we measure the fuel consumption, the speed, the engine torque and the increments of altitude. We want to build a model of the consumption as a function of the remaining variables.
We expect this system to be dynamic: the current consumption will depend on the previous consumption and on the other variables. With $y_t$ the consumption, $v_t$ the speed, $m_t$ the torque and $n_t$ the altitude increment, the model is
\[ y_t = a_1 y_{t-1} + a_2 v_t + a_3 m_t + a_4 n_t + k + e_t \]
A simulation with a univariate regression model is shown in the following program
// Simulation of scalar regression model of order n
clc, clear, close,              // clear all
[u,t,n]=file();                 // find working directory
chdir(dirname(n(1)));           // set working directory

// generation of regression vector
deff('ps=genpsi(t,n,y,u)','ps=[y(t-(1:n)), u(t-(0:n)) 1]','c')

nd=100;                         // number of data

// model setup
ord=2;                          // model order
tha=[.6 -.2 .1 -.5 .2];         // reg. coef. a
thb=[1 .5 -.3 -.1 .1 -.3];      // reg. coef. b
thk=-1;                         // constant
cv=.1;                          // noise variance

// input generation
ut(1)=1; j=1;
for i=2:nd
  if rand(1,1,'u')>.85, j=rand(1,1,'n'); end
  ut=[ut j];
end                             // generated input

yt=zeros(1,nd);
th=[tha(1:ord) thb(1:(ord+1)) thk];   // regression coefficients

// time loop ============================================
for t=(ord+1):nd
  ps=genpsi(t,ord,yt,ut);             // regression vector
  yt(t)=ps*th'+cv*rand(1,1,'n');      // model simulation
end
// end of time loop ====================================

// Results
s=1:nd;
plot(s,yt(s),s,ut(s))
legend('output','input');
title('Simulation with regression model')
set(gca(),'data_bounds',[1 nd min([yt,ut])-.1 max([yt,ut])+.1])
2.2 Categorical model
\[ f(y_t \mid \psi_t, \Theta) = \Theta_{y_t \mid \psi_t} \]
which is a dynamic table.

Example
Classification of accidents into light and severe.
In a certain area we record accidents $y_t$ (0=light, 1=severe) and some variables that accompany them, namely the temperature $x_{1;t}$ (1=below zero, 2=above zero) and the light $x_{2;t}$ (1=good, 2=dusk, 3=dark).
All of these are discrete variables. The model is

[x1, x2]   y=0       y=1
1,1        θ0|11     θ1|11
1,2        θ0|12     θ1|12
1,3        θ0|13     θ1|13
2,1        θ0|21     θ1|21
2,2        θ0|22     θ1|22
2,3        θ0|23     θ1|23

A simulation with a simple model with y ∈ {1, 2}, u ∈ {1, 2} is given in the following program

// Simulation of categorical model f(y(t)|u(t),y(t-1)) with y,u=1,2
clc, clear, close,              // clear all
[u,t,n]=file();                 // find working directory
chdir(dirname(n(1)));           // set working directory
getd();                         // load functions from current dir

nd=200;                         // number of data
th=[.9 .1; .8 .2; .3 .7; .1 .9];    // model parameter
ut=fix(2*rand(1,nd,'u'))+1;     // gen. of random input

zt=ones(1,nd);
for t=2:nd                      // time loop ====================================
  j=psi2row([ut(t),zt(t-1)],[2,2]);             // row in parameter table
  zt(t)=sum(rand(1,1,'u')>cumsum(th(j,:)))+1;   // generation of y
end                             // end of time loop =============================

// Results
s=1:nd;
plot(s,zt(s),'.',s,ut(s),'.:')
legend('output','input');
title("Simulation with categorical model")
set(gca(),'data_bounds',[1 nd .9 2.1])
2.3 Logistic model
Denote $p_0 = P(y_t = 0 \mid \psi_t)$ and $p_1 = P(y_t = 1 \mid \psi_t)$ (so that $p_0 = 1 - p_1$).
For $y_t \in \{0, 1\}$
\[ \ln \frac{p_1}{p_0} = \theta' \psi_t + e_t \]
For $y_t \in \{0, 1, \cdots, n\}$ similarly
\[ \ln \frac{p_1}{p_0} = \theta_1' \psi_t + e_t \]
\[ \ln \frac{p_2}{p_0} = \theta_2' \psi_t + e_t \]
\[ \cdots \]
\[ \ln \frac{p_n}{p_0} = \theta_n' \psi_t + e_t \]
where $p_i = P(y_t = i \mid \psi_t, \theta), \ \forall i$.

Inverted model
For $\theta_i' \psi_t = z_{i;t}$ we have
\[ p_i = \frac{\exp\{z_{i;t}\}}{1 + \sum_j \exp\{z_{j;t}\}}, \quad i = 1, 2, \cdots, n \]
\[ p_0 = \frac{1}{1 + \sum_j \exp\{z_{j;t}\}} \]

Example
Classification of accidents into light and severe.
In a certain area we record accidents $y_t$ (0=light, 1=severe) and some variables that accompany them, namely the temperature $x_{1;t}$ (in °C) and the light $x_{2;t}$ (1=good, 2=dusk, 3=dark).
For the model, the output is $y_t$ and the regression vector is
\[ x_t = [1, x_{1;t}, x_{2;t}] = [1, \text{temperature}, \text{light}]_t \]
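The inverted model can be tried out numerically. The following Python sketch (independent of the Scilab course programs) computes the category probabilities from the values z_i = θ_i'ψ_t. The function name and the coefficient values are illustrative assumptions, not estimates from any data.

```python
import math

def category_probabilities(z):
    """Return [p0, p1, ..., pn] for z = [z1, ..., zn], where z_i = theta_i' psi_t."""
    denom = 1.0 + sum(math.exp(zi) for zi in z)
    return [1.0 / denom] + [math.exp(zi) / denom for zi in z]

# Hypothetical coefficients for the regression vector [1, temperature, light]
theta = [-1.0, -0.05, 0.6]
psi = [1.0, -3.0, 3.0]          # constant, -3 deg C, light = 3 (dark)

z1 = sum(th * x for th, x in zip(theta, psi))
p = category_probabilities([z1])   # p[0] = P(light accident), p[1] = P(severe accident)
print(p)
```

The probabilities always sum to one, and taking ln(p1/p0) returns z1, which is the noise-free form of the logistic model above.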
2.4 State-space model
For an unmeasurable variable (state) $x_t$ the model is
\[ x_t = M x_{t-1} + N u_t + F + w_t \]
\[ y_t = A x_t + B u_t + G + v_t \]

Example
Filtering of a variable measured with noise.
We measure a noisy variable $y_t = x_t + v_t$ and want to extract the clean signal $x_t$ from it:
\[ x_t = x_{t-1} + w_t \]
\[ y_t = x_t + v_t \]
The state equation is a random walk: $x_t$ can change arbitrarily (we know nothing about the dynamics of its evolution), but its changes are small. They cannot be larger than the amplitude of the noise $w_t$, which is given by its variance.
The output equation says that the measured signal is the clean signal plus noise, and the size of this noise is given by the variance of $v_t$.
→ The variances of $w_t$ and $v_t$ (which we must specify) determine what counts as a change of the signal $x_t$ and what is due to the noise with which the signal is measured.
A simple simulation with a state-space model is given in the following program
// Simulation with state-space model
//   x(t+1) = Mx(t) + Nu(t) + w(t)
//   y(t)   = Ax(t)
clc, clear, close,              // clear all
[u,t,n]=file();                 // find working directory
chdir(dirname(n(1)));           // set working directory

nd=200;                         // number of steps
M=[.8 .2; .1 .6];               // model parameters
N=[.2; 1];
A=[0 1];
rw=[.1 .02; .01 .1];            // state noise covariance
x=[5; -3];                      // initial state
for t=2:nd                      // time loop ====================================
  u=rand(1,1,'n');              // input
  y=A*x;                        // output
  x=M*x+N*u+rw*rand(2,1,'n');   // state
  ut(t)=u;                      // store
  yt(t)=y;                      // the
  xt(:,t)=x;                    // variables
end                             // end of time loop =============================

// Results
set(scf(1),'position',[200 100 1000 800])
subplot(211), plot(xt')
title("Simulated state")
subplot(212), plot([yt ut])
title("Simulated output and input")
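The filtering example from this section (a random walk measured with noise) can also be simulated in a few lines. This is a minimal Python sketch, separate from the Scilab program above. The function name, the seed and the chosen variances are assumptions made for illustration.

```python
import random

def simulate_random_walk(nd, var_w, var_v, x0=0.0, seed=1):
    """Simulate x_t = x_{t-1} + w_t and y_t = x_t + v_t for nd steps."""
    rng = random.Random(seed)
    x = x0
    xs, ys = [], []
    for _ in range(nd):
        x = x + rng.gauss(0.0, var_w ** 0.5)   # state: random walk
        y = x + rng.gauss(0.0, var_v ** 0.5)   # output: state plus measurement noise
        xs.append(x)
        ys.append(y)
    return xs, ys

# Small state noise and large measurement noise: a slowly drifting signal
# hidden in a noisy measurement.
xs, ys = simulate_random_walk(200, var_w=0.01, var_v=1.0)
```

Changing the ratio of var_w to var_v shows directly how the two variances decide which part of the measured signal counts as signal and which as noise.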
2.5 Remark
Discrete variables should be coded without gaps, e.g. 1, 2, 3, ...
Continuous variables can be used as they are, or they can be scaled, i.e. the mean is subtracted and the result is divided by the standard deviation
\[ \tilde{x} = \frac{x - \bar{x}}{s}. \]
The point of scaling is that the region in which the variables lie is then approximately known (this helps e.g. with the initial placement of the components of a mixture model).
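A short Python sketch of the scaling follows. It assumes that s is the sample standard deviation (the text does not say which variant is meant), and the data values are made up.

```python
import statistics

def scale(values):
    """Subtract the mean and divide by the sample standard deviation."""
    mean = statistics.fmean(values)
    s = statistics.stdev(values)     # assumption: sample (n-1) standard deviation
    return [(v - mean) / s for v in values]

speeds = [50.0, 60.0, 70.0, 80.0, 90.0]   # illustrative data
scaled = scale(speeds)
print(scaled)
```

After scaling, the data have zero mean and unit standard deviation, so most values lie roughly between -3 and 3 regardless of the original units.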
3 Estimation
Building a model consists of two parts:
1. Design of the model structure (with unknown parameters).
2. Estimation of the unknown parameters from data.
Estimation itself also has several parts:
1. Construction of the prior statistics.
2. Recursive estimation from measured data.
3. Validation of the model (mostly by the prediction error).

Remark
1. The prior statistics (the so-called initialization of the estimation) are best constructed from prior data, i.e. data measured before the estimation starts. Such data are usually plentiful, because the system already existed and operated in some way. If no such data exist, or if we want to bring new expert knowledge into the initialization, we can use the so-called method of fictitious regression vectors. The expert imagines a system in which the property in question dominates and decides which data express it best. From these data several independent regression vectors are built and used as prior data (footnote 1).
2. It is also important to realize that prior knowledge can be expressed with different strength. The strength corresponds to the number of regression vectors (real or fictitious) from which the knowledge was obtained. This is best seen on a coin: tails 1, heads 1 versus tails 100, heads 100.
3. The prior information is usually weak and the data dominate. With poor data or only a few of them it can be the other way round: the prior (expert) knowledge dominates and the data merely confirm or correct it.
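The effect of prior strength in the coin illustration can be checked with a small Python sketch. The coding of the data (1 = tails, 0 = heads), the function name and the ten observations are assumptions chosen for the example.

```python
def coin_estimate(prior_tails, prior_heads, data):
    """Point estimate of P(tails) from prior counts plus observed counts.

    data is a list with 1 = tails and 0 = heads (coding assumed for this sketch).
    """
    v_tails = prior_tails + sum(1 for y in data if y == 1)
    v_heads = prior_heads + sum(1 for y in data if y == 0)
    return v_tails / (v_tails + v_heads)

data = [1] * 8 + [0] * 2                 # 8 tails and 2 heads observed

weak = coin_estimate(1, 1, data)         # prior "tails 1, heads 1"
strong = coin_estimate(100, 100, data)   # prior "tails 100, heads 100"
print(weak, strong)
```

Both priors say the coin is fair, but the weak one is overruled by ten observations (the estimate moves to 0.75), while the strong one barely moves away from 0.5.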
3.1 Bayes' formula
Recursive estimation from data follows Bayes' formula
\[ f(\Theta \mid d(t)) \propto f(y_t \mid \psi_t, \Theta) \, f(\Theta \mid d(t-1)) \]
where the natural conditions of control are assumed to hold.
In practice this formula operates on statistics, for a particular distribution of the model and a pdf of the parameters that is conjugate to it.

3.2 Exponential model
We show how an estimation algorithm is constructed on a model with the exponential distribution.
Model
\[ f(y_t \mid a) = a \exp\{-a y_t\} \]
Product
\[ \prod_{\tau=1}^{t} f(y_\tau \mid a) = a^{t} \exp\left\{ -a \sum_{\tau=1}^{t} y_\tau \right\} \]
Posterior
\[ f(a \mid y(t)) \propto a^{\kappa_t} \exp\{-a S_t\} \]
Update of the statistics
\[ \kappa_t = \kappa_{t-1} + 1 \]
\[ S_t = S_{t-1} + y_t \]
where $\kappa_0, S_0$ are the prior statistics.
Point estimate
\[ \hat{a}_t = \frac{\kappa_t}{S_t} \]
which we obtain e.g. as the maximum likelihood estimate.

Footnote 1: There must be enough regression vectors for the estimation process to be regular.
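The recursion for the exponential model fits in a few lines of Python. The sketch below simulates data with a known parameter and updates the statistics one observation at a time. The true value a = 2, the prior statistics κ0 = S0 = 1 and the seed are assumptions made for the illustration.

```python
import random

def update(kappa, S, y):
    """One step of the statistics update: kappa_t = kappa_{t-1} + 1, S_t = S_{t-1} + y_t."""
    return kappa + 1, S + y

rng = random.Random(0)
a_true = 2.0                 # parameter used to simulate the data (assumed)
kappa, S = 1.0, 1.0          # prior statistics kappa_0, S_0 (assumed, weak prior)

for _ in range(5000):
    y = rng.expovariate(a_true)      # y_t from f(y|a) = a exp(-a y)
    kappa, S = update(kappa, S, y)

a_hat = kappa / S            # point estimate
print(a_hat)
```

Only the two numbers κ and S are stored between steps; the data themselves are not needed again, which is what makes the estimation recursive.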
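3.3 Regression model
According to Bayes
Model (with $\Theta = \{\theta, r\}$, where $r$ is the noise variance)
\[ f(y_t \mid \psi_t, \Theta) = \frac{1}{\sqrt{2\pi r}} \exp\left\{ -\frac{1}{2r} [-1, \theta'] \, \Psi_t \Psi_t' \begin{bmatrix} -1 \\ \theta \end{bmatrix} \right\} \]
Posterior
\[ f(\Theta \mid d(t)) \propto \left( \frac{1}{\sqrt{r}} \right)^{\kappa_t} \exp\left\{ -\frac{1}{2r} [-1, \theta'] \, V_t \begin{bmatrix} -1 \\ \theta \end{bmatrix} \right\} \]
Update of the statistics
\[ V_t = V_{t-1} + \Psi_t \Psi_t' \]
\[ \kappa_t = \kappa_{t-1} + 1 \]
Point estimates
With the statistic partitioned in the same way as $\Psi_t = [y_t, \psi_t']'$,
\[ V_t = \begin{bmatrix} V_y & V_{y\psi}' \\ V_{y\psi} & V_\psi \end{bmatrix}, \]
the point estimates are
\[ \hat{\theta}_t = V_\psi^{-1} V_{y\psi} \quad \text{and} \quad \hat{r}_t = \frac{V_y - V_{y\psi}' V_\psi^{-1} V_{y\psi}}{\kappa_t} \]

Remark
Computing the point estimates requires the inverse of the matrix $V_\psi$, which need not be regular at the beginning of the estimation. It is therefore recommended to initialize $V_0$ not as a zero matrix but as a diagonal matrix with a very small diagonal ($10^{-8}$).
The task is demonstrated by the following program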
// Estimation of scalar regression model of order n
clc, clear, close,              // clear all
[u,t,n]=file();                 // find working directory
chdir(dirname(n(1)));           // set working directory
deff('ps=genpsi(t,n,y,u)','ps=[y(t-(1:n)), u(t-(0:n)) 1]','c')

load dataT11.dat Sim
yt=Sim.Cy.yt;
ut=Sim.Cy.ut;
ord=Sim.Cy.ord;
th=Sim.Cy.th;
nd=length(yt);

nPsi=2*ord+3;
V=1e-8*eye(nPsi,nPsi);
for t=(ord+1):nd
  Ps=[yt(t) genpsi(t,ord,yt,ut)]';
  V=V+Ps*Ps';
end
thE=inv(V(2:$,2:$))*V(2:$,1);

disp('Simulated parameters')
disp(th)
disp('Estimated parameters')
disp(thE')
if length(th)==length(thE)
  bar([th' thE])
  title('Simulated and estimated parameters')
end
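The same computation can be followed step by step on a first-order model without an input, y_t = a y_{t-1} + k + e_t, where V_ψ is only a 2×2 matrix. The Python sketch below is a simplified illustration under that assumption and not a port of the Scilab program. The function name, the simulated parameter values and the seed are assumptions.

```python
import random

def estimate(ys, v0=1e-8):
    """Estimate (a, k, r) in y_t = a*y_{t-1} + k + e_t from the statistic V."""
    # V accumulates Psi Psi' for the extended regression vector Psi = [y_t, y_{t-1}, 1]
    V = [[v0 if i == j else 0.0 for j in range(3)] for i in range(3)]
    kappa = 0
    for t in range(1, len(ys)):
        Psi = [ys[t], ys[t - 1], 1.0]
        for i in range(3):
            for j in range(3):
                V[i][j] += Psi[i] * Psi[j]
        kappa += 1
    # Partition of V: V_y (scalar), V_ypsi (2-vector), V_psi (2x2 matrix)
    Vy = V[0][0]
    Vyp = [V[1][0], V[2][0]]
    p11, p12, p21, p22 = V[1][1], V[1][2], V[2][1], V[2][2]
    det = p11 * p22 - p12 * p21
    # theta = inv(V_psi) * V_ypsi, written out for the 2x2 case
    a_hat = (p22 * Vyp[0] - p12 * Vyp[1]) / det
    k_hat = (p11 * Vyp[1] - p21 * Vyp[0]) / det
    # r = (V_y - V_ypsi' inv(V_psi) V_ypsi) / kappa
    r_hat = (Vy - Vyp[0] * a_hat - Vyp[1] * k_hat) / kappa
    return a_hat, k_hat, r_hat

# Simulated data with assumed parameters a = 0.6, k = -1, noise std 0.1
rng = random.Random(3)
ys = [0.0]
for _ in range(2000):
    ys.append(0.6 * ys[-1] - 1.0 + rng.gauss(0.0, 0.1))

a_hat, k_hat, r_hat = estimate(ys)
print(a_hat, k_hat, r_hat)
```

The small diagonal v0 plays the role of the regularizing initialization of V_0 mentioned in the remark above.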
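Least squares
\[ y_1 = a_1 y_0 + b u_1 + k + e_1 \]
\[ y_2 = a_1 y_1 + b u_2 + k + e_2 \]
\[ \cdots \]
\[ y_t = a_1 y_{t-1} + b u_t + k + e_t \]
\[ \rightarrow \quad Y = X\theta + E \]
where
\[
Y = \begin{bmatrix} y_1 \\ y_2 \\ \cdots \\ y_t \end{bmatrix}, \qquad
X = \begin{bmatrix} y_0 & u_1 & 1 \\ y_1 & u_2 & 1 \\ & \cdots & \\ y_{t-1} & u_t & 1 \end{bmatrix}
  = \begin{bmatrix} \psi_1' \\ \psi_2' \\ \cdots \\ \psi_t' \end{bmatrix}
\]
Then
\[ \hat{\theta}_t = (X'X)^{-1} X'Y \]
\[ \hat{Y} = X \hat{\theta}_t \]
\[ \hat{r}_t = \operatorname{var}(Y - \hat{Y}) \]
where $Y - \hat{Y}$ is the prediction error.
The procedure is illustrated in the following example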
// Estimation of scalar regression model of order 2 by LS
clc, clear, close,              // clear all
[u,t,n]=file();                 // find working directory
chdir(dirname(n(1)));           // set working directory

// how the regression vector is organized in the simulation, specifically
//   ps = [y(t-1) y(t-2) u(t) u(t-1) u(t-2) 1]
deff('ps=genpsi(t,n,y,u)','ps=[y(t-(1:n)), u(t-(0:n)) 1]','c')

load dataT11.dat Sim            // load simulation
yt=Sim.Cy.yt;                   // output
ut=Sim.Cy.ut;                   // input
ord=Sim.Cy.ord;                 // order
th=Sim.Cy.th;                   // simulated parameters
nd=length(yt);                  // data length

s=3:nd;                         // actual time
s1=2:nd-1;                      // once delayed time
s2=1:nd-2;                      // twice delayed time
Y=yt(s)';                       // output
X=[yt(s1); yt(s2); ut(s); ut(s1); ut(s2); ones(1,length(s))]';   // regression vectors
thE=inv(X'*X)*X'*Y;             // least squares

// Results
disp('Simulated parameters')
disp(th)
disp('Estimated parameters')
disp(thE')
if length(th)==length(thE)
  bar([th' thE])
  title('Simulated and estimated parameters')
end

3.4 Categorical model
Coin
$y_t \in \{0, 1\}$.
Model
\[ f(y_t \mid p) = p_1^{y_t} \, p_2^{1-y_t}, \]
where $p_1 = p$, $p_2 = 1 - p$.
Posterior
\[ f(p \mid d(t)) \propto p_1^{V_{1;t}} \, p_2^{V_{2;t}} \]
where $V_t = [V_{1;t}, V_{2;t}]$ is the statistic.
Update of the statistic
\[ V_{1;t} = V_{1;t-1} + 1 \quad \text{for } y_t = 1 \text{ (tails)} \]
\[ V_{2;t} = V_{2;t-1} + 1 \quad \text{for } y_t = 0 \text{ (heads)} \]
Estimate
\[ \hat{p}_t = \frac{[V_{1;t}, V_{2;t}]'}{V_{1;t} + V_{2;t}} \quad \text{(normalization)} \]

In general
In the general case of a dynamic model in the form of a table, the update of the statistic is
\[ V_t = V_{t-1} + \Delta_{y_t \mid \psi_t} \]
where $V_t$ is a statistic of the same form as the parameter and $\Delta_{y \mid \psi}$ is a matrix of zeros with a one at the position $y \mid \psi$.
The estimate $\hat{\theta}_t$ is the statistic normalized so that the elements in each row sum to one.
The procedure is demonstrated in the following example
// Estimation of categorical model f(y(t)|u(t),y(t-1)) with y,u=1,2
clc, clear, close,              // clear all
[u,t,n]=file();                 // find working directory
chdir(dirname(n(1)));           // set working directory
getd();                         // load functions from current dir

load dataT12.dat Sim            // load simulation
zt=Sim.Cz.zt;                   // extraction of vars
ut=Sim.Cz.ut;
th=Sim.Cz.th;
nd=length(zt);                  // number of data
V=zeros(4,2);                   // statistics

for t=2:nd                      // time loop
  j=psi2row([ut(t),zt(t-1)],[2,2]);   // row in parameter table
  V(j,zt(t))=V(j,zt(t))+1;            // update of statistics
end
thE=fnorm(V,2);                 // point estimates (normalization of rows of V)

// Results
bar([th(:,1) thE(:,1)])
title('Simulated and estimated parameters (first column)')
legend('simulated','estimated');
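The counting and row normalization can be reproduced without the course helper functions. The Python sketch below simulates the categorical model f(y_t | u_t, y_{t-1}) and estimates the table by counting. The pairing of the table rows with the combinations (u_t, y_{t-1}) is an assumption, since the row ordering produced by psi2row is not given in the text.

```python
import random

# Model table: key = (u_t, y_{t-1}), value = [P(y_t = 1), P(y_t = 2)].
# The assignment of rows to keys is an assumption made for this sketch.
TH = {(1, 1): [0.9, 0.1], (1, 2): [0.8, 0.2], (2, 1): [0.3, 0.7], (2, 2): [0.1, 0.9]}

def simulate(th, nd, seed=0):
    """Generate a random input u and the output y of the categorical model."""
    rng = random.Random(seed)
    u = [rng.choice([1, 2]) for _ in range(nd)]
    y = [1] * nd
    for t in range(1, nd):
        p1 = th[(u[t], y[t - 1])][0]
        y[t] = 1 if rng.random() < p1 else 2
    return u, y

def estimate(u, y):
    """Count occurrences (V_t = V_{t-1} + Delta) and normalize every row."""
    V = {(uu, yy): [0, 0] for uu in (1, 2) for yy in (1, 2)}
    for t in range(1, len(y)):
        V[(u[t], y[t - 1])][y[t] - 1] += 1
    return {key: [c / sum(row) for c in row] for key, row in V.items()}

u, y = simulate(TH, 20000)
theta_hat = estimate(u, y)
for key in sorted(TH):
    print(key, TH[key], theta_hat[key])
```

Starting the counts from zero corresponds to using no prior information; nonzero starting counts would act as prior statistics.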
4 Prediction
The basic tasks solved with an estimated model are prediction and control. Prediction of the output is an estimate of a future output given the data up to the present. Because the future output is unknown, it is described by the predictive pdf
\[ f(y_{t+k} \mid d(t)) \]
where $k \geq 1$.
For $k = 1$ we speak of a one-step prediction, which is provided by the model itself if the parameters are known. If the parameters are unknown, the prediction task includes estimation as well
\[ f(y_{t+1} \mid d(t)) = \int f(y_{t+1} \mid \psi_{t+1}, \Theta) \, f(\Theta \mid d(t)) \, d\Theta. \tag{4.1} \]
For $k > 1$ (e.g. $k = 3$) we speak of a multi-step prediction. For known parameters
\[ f(y_{t+3} \mid d(t)) = \iint f(y_{t+3} \mid \psi_{t+3}) \, f(y_{t+2} \mid \psi_{t+2}) \, f(y_{t+1} \mid \psi_{t+1}) \, dy_{t+2} \, dy_{t+1}. \]
If the parameters are unknown, we get
\[ f(y_{t+3} \mid d(t)) = \int \underbrace{\iint f(y_{t+3} \mid \psi_{t+3}, \Theta) \, f(y_{t+2} \mid \psi_{t+2}, \Theta) \, f(y_{t+1} \mid \psi_{t+1}, \Theta) \, dy_{t+2} \, dy_{t+1}}_{f(y_{t+3} \mid d(t), \Theta) \text{ (three-step predictor)}} \, f(\Theta \mid d(t)) \, d\Theta. \tag{4.2} \]
Compare (4.1) and (4.2).
In practice we mostly predict with point estimates.
4.1 Regression model
For known parameters, or their point estimates, and for a point prediction we use successive substitution into the equation of the regression model, e.g. for
\[ y_t = a y_{t-1} + b u_t + k + e_t, \]
\[ \hat{y}_t = a y_{t-1} + b u_t + k \]
\[ \hat{y}_{t+1} = a \hat{y}_t + b u_{t+1} + k \]
\[ \hat{y}_{t+2} = a \hat{y}_{t+1} + b u_{t+2} + k \]
\[ \cdots \]
where $e_t = 0$ and the hat denotes the point prediction.
The task is illustrated by the following program
// Prediction with scalar regression model of order n
clc, clear, close,              // clear all
[u,t,n]=file();                 // find working directory
chdir(dirname(n(1)));           // set working directory
deff('ps=genpsi(t,n,y,u)','ps=[y(t-(1:n)), u(t-(0:n)) 1]','c')

load dataT11.dat Sim            // load simulation
// extraction of variables from structure Sim
yt=Sim.Cy.yt;                   // output
ut=Sim.Cy.ut;                   // input
th=Sim.Cy.th;                   // reg. coefficients
cv=Sim.Cy.cv;                   // noise covariance
ord=Sim.Cy.ord;                 // model order
nd=length(yt);

np=5;                           // # of prediction steps (1=prediction from the model)
yp=zeros(1,nd);

for t=(ord+1):(nd-np+1)         // time loop ======================
  yi=yt(1:(t-1));               // old data (at time t)
  for j=0:(np-1)                // loop of prediction
    ps=genpsi(t+j,ord,yi,ut);   // regression vector
    yi(t+j)=ps*th';             // auxiliary prediction
  end                           // end of prediction within the interval
  yp(t+np-1)=yi(t+np-1);        // final prediction, np steps past the last data y(t-1)
end                             // end of time loop ===================================

// Results
s=(ord+1):(nd-np);
plot(s,yt(s),s,yp(s))
legend('output','prediction');
title("Prediction with regression model")
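The successive substitution for a first-order model can be written as a short Python sketch. The function name and the numerical values are illustrative assumptions; the future inputs are assumed to be known.

```python
def predict(a, b, k, y_last, future_u):
    """Point predictions for y = a*y_prev + b*u + k, one per future input."""
    preds = []
    y = y_last
    for u in future_u:
        y = a * y + b * u + k    # noise e_t replaced by its mean value 0
        preds.append(y)          # the prediction is fed back in the next step
    return preds

# Last measured output 4.0 and three known future inputs
preds = predict(a=0.5, b=1.0, k=1.0, y_last=4.0, future_u=[1.0, 0.0, 2.0])
print(preds)
```

Only the first prediction uses a measured output; every later step uses the previous prediction in its place.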
4.2 Categorical model
A one-step prediction with known parameters (or their point estimates) is given by the model, e.g.

f(yt|yt−1)    yt = 1    yt = 2    yt = 3
yt−1 = 1      0.3       0.5       0.2
yt−1 = 2      0.1       0.2       0.7
yt−1 = 3      0.2       0.2       0.6

As a pdf, $f(y_t \mid y_{t-1} = 1)$ is given by the first row, i.e.

yt         1      2      3
f(yt|1)    0.3    0.5    0.2

and the point estimate is the $y_t$ with the largest probability, i.e. $\hat{y}_t = 2$.
A multi-step predictive pdf can be computed easily only for a square model (such as the one just shown). If we denote the matrix in the model table by $T$, the predictive table for a $k$-step prediction is $T^k$. Again, for a particular value in the condition we pick the corresponding row and its maximal element. It gives the $k$-step point prediction.
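The k-step table can be computed by repeated matrix multiplication. The Python sketch below uses the table from the example; the function names are assumptions introduced for the illustration.

```python
# Transition table from the example: row = y_{t-1}, column = y_t
T = [[0.3, 0.5, 0.2],
     [0.1, 0.2, 0.7],
     [0.2, 0.2, 0.6]]

def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][m] * B[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]

def k_step_table(T, k):
    """Predictive table T^k for a k-step prediction (k >= 1)."""
    P = T
    for _ in range(k - 1):
        P = matmul(P, T)
    return P

def point_prediction(T, k, y_now):
    """Value (1-based) with the largest probability in the row for y_now."""
    row = k_step_table(T, k)[y_now - 1]
    return row.index(max(row)) + 1

print(k_step_table(T, 2)[0])                                   # two-step pdf for y = 1
print(point_prediction(T, 1, 1), point_prediction(T, 2, 1))
```

For the current value 1 the one-step point prediction is 2, while the two-step point prediction is 3, because the two-step row is approximately [0.18, 0.29, 0.53].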
5 Control
Optimal control computes the control variable so that a given criterion is minimized
\[ J = \sum \left( y_t^2 + \omega u_t^2 \right) \]
or
\[ J = \sum \left( (y_t - s_t)^2 + \omega (u_t - u_{t-1})^2 \right) \]
or ···
In the first case we control to zero and penalize the whole $u_t$. If the steady-state value $y_\infty$ is not zero, a control error remains. In the second case we control to a setpoint and penalize the increments of $u_t$. Here no control error remains.
For control we assume a model with known parameters or with point estimates substituted from an external identification.
The rule for the control variables on the control interval is obtained by successive minimization of the expected value of the criterion backwards, against the direction of time. When we reach the beginning of the interval, we have the data needed to evaluate the first control. We apply it and measure the first output of the system. This gives us the data for evaluating the next control, and so we proceed forward in time and control.
Algorithms have been derived both for computing the control law (backwards) and for applying it (forward in time).
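The backward computation and forward application can be followed on the simplest case. The Python sketch below assumes a scalar model y_t = a y_{t-1} + b u_t + e_t and the first criterion, J = Σ (y_t² + ω u_t²); it is an illustration of the principle, not the multivariate algorithm used in the course programs. If the remaining loss after time t is R y_t², then minimizing (1 + R)(a y_{t-1} + b u_t)² + ω u_t² over u_t gives u_t = -S y_{t-1}, and the loss seen from time t-1 is again quadratic in y_{t-1}. Zero-mean noise only adds a constant to the expected loss, so it does not change the control law.

```python
def control_law(a, b, omega, horizon):
    """Backward recursion for the gains S in u_t = -S * y_{t-1}.

    Returns the gains ordered forward in time (first element = first step).
    """
    R = 0.0                      # no loss beyond the end of the interval
    gains = []
    for _ in range(horizon):     # runs against the direction of time
        T = R + 1.0              # weight on y_t^2: own penalty plus remaining loss
        A = b * T * b + omega    # coefficient of u_t^2
        B = b * T * a            # coefficient of the cross term u_t * y_{t-1}
        C = a * T * a            # coefficient of y_{t-1}^2
        S = B / A
        R = C - S * A * S        # remaining loss coefficient after minimization
        gains.append(S)
    gains.reverse()
    return gains

def simulate_closed_loop(a, b, omega, horizon, y0, steps):
    """Apply only the first control of each computed law (noise-free simulation)."""
    ys = [y0]
    for _ in range(steps):
        S = control_law(a, b, omega, horizon)[0]
        u = -S * ys[-1]
        ys.append(a * ys[-1] + b * u)
    return ys

ys = simulate_closed_loop(a=0.9, b=0.5, omega=0.1, horizon=3, y0=10.0, steps=20)
print(ys[:5])
```

With ω = 0 the gain equals a/b and the output is driven to zero in one step; a larger ω makes the control more cautious and the decay slower.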
5.1 Regression model
First we express the regression model in state-space form. Then we compute the control law on the whole control interval, and finally we apply the control step by step. The situation is demonstrated by the following program
// ADAPTIVE CONTROL WITH DYNAMIC REG. MODEL WITH CONSTANT
//   multivariate input and output
//   control with setpoint
//   penalization of input increments
//   receding horizon in control
// ------------------------------------------------
clc, clear, close,                      // clear all
[u,t,n]=file();                         // find working directory
chdir(dirname(n(1)));                   // set working directory
getd();                                 // load functions from current dir
rand('seed',2);                         // set seed for random generator

// Setting the task
ni=150;                                 // length of prior estimation
nd=200;                                 // length of control

// Setting the simulation
Sim.Cy.ord=2;                           // model order
Sim.Cy.thb0=[1 -.4; 1.3 -.4];           // b0
Sim.Cy.th(1).a=[.3 -.2; -.05 -.3];      // a1
Sim.Cy.th(1).b=[.1 0;0 .2];             // b1
Sim.Cy.th(2).a=[.1 -.1; .02 -.1];       // a2
Sim.Cy.th(2).b=[.1 0;0 .1];             // b2
Sim.Cy.thk=[1; -1];                     // k (const)
Sim.Cy.sd=.5;                           // noise stdev

ord=Sim.Cy.ord;
ny=size(Sim.Cy.th(1).a,1);              // dimension of y
nu=size(Sim.Cy.th(1).b,2);              // and u

th=Sim.Cy.thb0;                         // parameters -> vector th
for i=1:ord
  th=[th Sim.Cy.th(i).a Sim.Cy.th(i).b];
end
th=[th Sim.Cy.thk];
k=Sim.Cy.thk;
sd=Sim.Cy.sd;

// Setting the control
Con.Cy.nh=3;                            // length of control interval
Con.Cy.om=1;                            // penalty: y(t)^2
Con.Cy.la=.01;                          // penalty: (u(t)-u(t-1))^2
nh=Con.Cy.nh;

// conversion to state model
[XX,XX,XX,nx,ny,nu]=reg2st(th,ord);

// penalization matrices
Om=zeros(nx,nx);
Om(1:ny,1:ny)=Con.Cy.om*eye(ny,ny);     // output
nn=ny+nu;
for i=1:nu                              // input increment (only for ord>1 !!)
  Om((ny+i),(ny+i))=Con.Cy.la;
  Om((ny+i+nn),(ny+i+nn))=Con.Cy.la;
  Om((ny+i),(ny+i+nn))=-Con.Cy.la;
  Om((ny+i+nn),(ny+i))=-Con.Cy.la;
end

// setpoint generation
stp=genstp(ny,nd,[3;5],[.95;.92],[-5;10]);

// TIME LOOP for adaptive control =========================
S=list();
y=zeros(ny,nd);
y(:,1:2)=[10 10; -10 -10];
u=zeros(nu,nd);
R=.00001*eye(nx,nx);                    // regularization for inversion
for t=(ord+2):(nd-nh)
  // STATE CONSTRUCTION
  [M,N,A,nx,ny,nu]=reg2st(th,ord);
  // GENERATION OF CONTROL LAW ON INTERVAL
  for i=nh:-1:1
    M(1:ny,$)=-stp(:,t+i)+k;
    T=R+Om;
    A=N'*T*N;
    B=N'*T*M;
    C=M'*T*M;
    S(t)=inv(A)*B;
    R=C-S(t)'*A*S(t);
  end
  // GENERATION OF OPTIMAL CONTROL (only one time step)
  x=genph(ord,t,y,u);                           // construction of state
  u(:,t)=-S(t)*x;                               // input evaluation
  y(:,t)=th*[u(:,t); x]+sd*rand(ny,1,'norm');   // new output
end
// end of time loop ========================================

// Results
// set(gcf(),'position',[... 100 600 500])
s=1:(nd-nh);
// first output
subplot(211)
plot(s,y(1,s),s,u(1,s),'--',s,stp(1,s),'m.','markersize',2)
legend('output','input','setpoint',4);
title("Adaptive control with regression model")
// second output
subplot(212)
plot(s,y(2,s),s,u(2,s),'--',s,stp(2,s),'c.','markersize',2)
// set(gcf(),'position',[500 50 600 500])
5.2 Categorical model
The procedure can be found in the text for STS.
6 Adaptation
Both in prediction and in control we assumed a model with known parameters. In the adaptive version, i.e. for unknown parameters, these tasks have no computable solution in their optimal form. They are therefore solved suboptimally, most often by running the estimation in parallel and using the point estimates of the parameters in the task as if they were the known parameters.
The procedure is as follows
1. We take the existing point estimates of the parameters and carry out the task with them on the whole future interval:
(a) we predict the required number of steps ahead,
(b) we perform the whole control synthesis on the given interval.
2. In prediction we store the predicted value; in control we apply only the single control action for the current time instant.
3. We measure new data.
4. We update the statistics and compute new point estimates of the parameters.
The procedure for prediction is contained in the program
// Prediction with scalar regression model of order n
clc, clear, close,              // clear all
[u,t,n]=file();                 // find working directory
chdir(dirname(n(1)));           // set working directory
deff('ps=genpsi(t,n,y,u)','ps=[y(t-(1:n)), u(t-(0:n)) 1]','c')

// Data from simulation
load dataT11.dat Sim
yt=Sim.Cy.yt;
ut=Sim.Cy.ut;
cv=Sim.Cy.cv;
nd=length(yt);

// Setting the task
np=5;                           // # of prediction steps (1=prediction from the model)
ordE=2;
yp=zeros(1,nd);
V=1e-5*eye(7,7);
ka=0;

// Time loop ===============================================
for t=(ordE+1):(nd-np+1)
  // ESTIMATION
  ps=genpsi(t,ordE,yt,ut);      // regression vector
  Ps=[yt(t) ps]';               // extended reg. vec.
  V=V+Ps*Ps';                   // update of V
  Vp=V(2:$,2:$);                // partitioning of V
  Vyp=V(2:$,1);
  thE=inv(Vp)*Vyp;              // point estimates

  // Prediction
  yi=yt(1:(t-1));               // old data (at time t)
  for j=0:(np-1)                // loop of prediction
    ps=genpsi(t+j,ordE,yi,ut);
    yi(t+j)=ps*thE;             // auxiliary prediction
  end
  yp(t+np-1)=yi(t+np-1);        // final prediction at t
end
// end of time loop ====================================

// Results
s=(ordE+1):(nd-np);
plot(s,yt(s),s,yp(s))
legend('output','prediction');
The procedure for control is illustrated by the following program (the same as for control with known parameters, but extended with adaptivity)
// ADAPTIVE CONTROL WITH DYNAMIC REG. MODEL WITH CONSTANT
//   multivariate input and output
//   control with setpoint
//   penalization of input increments
//   receding horizon in control
clc, clear, close,                      // clear all
[u,t,n]=file();                         // find working directory
chdir(dirname(n(1)));                   // set working directory
getd();                                 // load functions from current dir
rand('seed',2);                         // set seed for random generator

// Setting the task
ni=150;                                 // length of prior estimation
nd=200;                                 // length of control
I_typU=2;                               // type of input 1=noise, 2=sin, 3=jumps
I_adapt=1;                              // adaptive control 1=yes, 0=no,

// Setting the simulation
Sim.Cy.ord=2;                           // model order
Sim.Cy.thb0=[1 -.4; 1.3 -.4];           // b0
Sim.Cy.th(1).a=[.3 -.2; -.05 -.3];      // a1
Sim.Cy.th(1).b=[.1 0;0 .2];             // b1
Sim.Cy.th(2).a=[.1 -.1; .02 -.1];       // a2
Sim.Cy.th(2).b=[.1 0;0 .1];             // b2
Sim.Cy.thk=[1; -1];                     // k (const)
Sim.Cy.sd=.5;                           // noise stdev

ord=Sim.Cy.ord;
ny=size(Sim.Cy.th(1).a,1);              // dimension of y
nu=size(Sim.Cy.th(1).b,2);              // and u

th=Sim.Cy.thb0;                         // parameters -> vector th
for i=1:ord
  th=[th Sim.Cy.th(i).a Sim.Cy.th(i).b];
end
th=[th Sim.Cy.thk];
k=Sim.Cy.thk;
sd=Sim.Cy.sd;

// Setting the control
Con.Cy.nh=3;                            // length of control interval
Con.Cy.om=1;                            // penalty: y(t)^2
Con.Cy.la=.01;                          // penalty: (u(t)-u(t-1))^2
nh=Con.Cy.nh;

// conversion to state model
[XX,XX,XX,nx,ny,nu]=reg2st(th,ord);

// penalization matrices
Om=zeros(nx,nx);
Om(1:ny,1:ny)=Con.Cy.om*eye(ny,ny);     // output
nn=ny+nu;
for i=1:nu                              // input increment (only for ord>1 !!)
  Om((ny+i),(ny+i))=Con.Cy.la;
  Om((ny+i+nn),(ny+i+nn))=Con.Cy.la;
  Om((ny+i),(ny+i+nn))=-Con.Cy.la;
  Om((ny+i+nn),(ny+i))=-Con.Cy.la;
end

// setpoint generation
stp=genstp(ny,nd,[3;5],[.95;.92],[-5;10]);

// Prior estimation with selected input signal
yi=zeros(ny,ni);
select I_typU
case 1, ui=rand(nu,ni,'norm');
case 2, ui=[sin(20*(1:ni)/ni);cos(20*(1:ni)/ni)];
case 3, ui=sign([sin(20*(1:ni)/ni);cos(20*(1:ni)/ni)]);
end
V=zeros(nx+ny+nu,nx+ny+nu);
for i=(ord+1):ni
  ps=genps(ord,i,yi,ui);
  yi(:,i)=th*ps+sd*rand(ny,1,'norm');
  Ps=[yi(:,i); ps];
  V=V+Ps*Ps';                           // prior information matrix
end
theta=v2thN(V/ni,ny);                   // prior parameters

// TIME LOOP for adaptive control =========================
S=list();
y=zeros(ny,nd);
y(:,1:2)=[10 10; -10 -10];
u=zeros(nu,nd);
R=.00001*eye(nx,nx);                    // regularization for inversion
for t=(ord+2):(nd-nh)
  // ESTIMATION
  if I_adapt==1
    ps=genps(ord,t-1,y,u);
    Ps=[y(:,t-1); ps];
    V=V+Ps*Ps';
    theta=v2thN(V/(t+ni),ny);           // current pt. estimates
  elseif I_adapt==0
    theta=th';                          // known parameters
  else
    // prior pt. estimates are used
  end
  [M,N,A,nx,ny,nu]=reg2st(theta',ord);
  // GENERATION OF CONTROL LAW ON INTERVAL
  for i=nh:-1:1
    M(1:ny,$)=-stp(:,t+i)+k;
    T=R+Om;
    A=N'*T*N;
    B=N'*T*M;
    C=M'*T*M;
    S(t)=inv(A)*B;
    R=C-S(t)'*A*S(t);
  end
  // GENERATION OF OPTIMAL CONTROL (only one time step)
  x=genph(ord,t,y,u);                           // construction of state
  u(:,t)=-S(t)*x;                               // input evaluation
  y(:,t)=th*[u(:,t); x]+sd*rand(ny,1,'norm');   // new output
end
// end of time loop ========================================

// Results
// set(gcf(),'position',[... 100 600 500])
s=1:(nd-nh);
// first output
subplot(211)
plot(s,y(1,s),s,u(1,s),'--',s,stp(1,s),'m.','markersize',2)
title("Adaptive control with regression model")
// second output
subplot(212)
plot(s,y(2,s),s,u(2,s),'--',s,stp(2,s),'c.','markersize',2)
// set(gcf(),'position',[500 50 600 500])
7 Mixture model and its estimation
7.1 Model
A mixture model consists of a set of components (regression or other models) and a pointer model, whose output is a discrete variable indicating the active component at each time instant.
Example
We consider a mixture with two static components, a static pointer model (i.e. the pointer value does not depend on its previous value) and a two-dimensional output.
Component 1, $f_1(y_t|k)$:
$$\begin{bmatrix} y_{1;t}\\ y_{2;t}\end{bmatrix} = \begin{bmatrix} 8\\ 1\end{bmatrix} + \begin{bmatrix} e_{1;t}\\ e_{2;t}\end{bmatrix}$$
Component 2, $f_2(y_t|k)$:
$$\begin{bmatrix} y_{1;t}\\ y_{2;t}\end{bmatrix} = \begin{bmatrix} 0\\ 5\end{bmatrix} + \begin{bmatrix} e_{1;t}\\ e_{2;t}\end{bmatrix}$$
Pointer model
$$f(c_t|\alpha) = \alpha_{c_t}$$
c_t            1     2
alpha_{c_t}    0.2   0.8
What this means (e.g. in a simulation)
One static component simulates a single cluster of points. Two components therefore simulate two clusters, centred at $[8, 1]'$ and $[0, 5]'$. Their widths are equal; both have a unit covariance matrix. The number of points in each cluster is given by the probability with which the pointer model generates 1 or 2. Here these probabilities are 0.2 and 0.8.
A program and a figure follow.
// Simulation of a simple mixture
// ------------------------------
clc, clear, close                      // clear all
[u,t,n]=file();                        // find working directory
chdir(dirname(n(1)));                  // set working directory
nd=500;                                // number of steps
th=[8 1;0 5]';                         // pars of components (in columns)
al=[.2 .8];                            // pars of pointer model
for t=1:nd                             // time loop ================================
  c=sum(rand(1,1,'u')>cumsum(al))+1;   // pointer generation
  y=th(:,c)+rand(2,1,'n');             // output
  ct(t)=c;                             // storing variables
  yt(:,t)=y;
end                                    // end of time loop ===================================
// Results
scf(1);
plot(yt(1,:),yt(2,:),'.')
title("Simulation with a simple mixture")
save simple.dat ct yt
7.2 Estimation
In a mixture model we estimate
1. the parameters of all components ($\theta$, $r$, or $\beta$),
2. the parameters of the pointer model ($\alpha$),
3. the values of the pointer.
Without an estimate of the active component (the pointer value) the parameters could not be estimated. Data coming from a given working mode of the system (a system component) must go to the corresponding component of the model and not to all components; otherwise the components would only pull against each other back and forth.
The resulting estimation algorithm is therefore as follows.
1. Measure new data.
2. Estimate the pointer:
(a) determine the proximity of the data to each component,
(b) compute the weight vector $w$ as the product of the proximity and the component probability.
3. Update the statistics of all parameters with the weighted data.
4. Compute the point estimates of the parameters.
Commentary
1. Clear.
2. The proximity is computed by substituting the existing point estimates of the parameters (i.e. the estimates from the previous step) and the newly measured data into the component model. The resulting value is called the proximity of the data to the given component.
The component probabilities are given by the entries of the parameter $\alpha$. Each probability is proportional to the number of cases in which the given component was active in the past.
The product of proximity and probability, after normalization, gives the component weights, i.e. the probabilities that the individual components are active.
Computing the weights $w$ requires the point estimates of the parameters from the previous step (or the prior ones).
3. The statistics are updated in the same way as in the estimation of a single model, except that the data entering the update are multiplied by the weight of the component.
4. The point estimates of the components and of the pointer model are computed in the same way as for single models. The point estimates are then ready for the computations in the next step.
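Step 2 of the algorithm can be sketched in a few lines. The fragment below is only an illustration in Python (not the Scilab used in these notes); the helper names `log_gauss` and `component_weights` are hypothetical, and the components are assumed to be two-dimensional Gaussians with unit covariance, as in the example of Section 7.1.

```python
import math

def log_gauss(y, mean):
    """Log-density of a 2-D Gaussian with unit covariance (an assumption)."""
    d2 = sum((yi - mi) ** 2 for yi, mi in zip(y, mean))
    return -math.log(2 * math.pi) - 0.5 * d2

def component_weights(y, means, alpha):
    """Weights w_i proportional to proximity_i * alpha_i, normalized to sum to 1."""
    logs = [log_gauss(y, m) for m in means]
    top = max(logs)                      # subtract the maximum before exp (stability)
    prox = [math.exp(l - top) for l in logs]
    w = [p * a for p, a in zip(prox, alpha)]
    s = sum(w)
    return [wi / s for wi in w]

means = [(8.0, 1.0), (0.0, 5.0)]         # centres from the simulation example
alpha = [0.2, 0.8]                       # pointer probabilities
w_near = component_weights((7.5, 1.2), means, alpha)   # point close to centre 1
w_mid = component_weights((4.0, 3.0), means, alpha)    # point equally far from both
print(w_near, w_mid)
```

For a point equally distant from both centres the proximities cancel and the weights reduce to the pointer probabilities; near one centre the proximity dominates.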
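For our example:
The components are Gaussian with the regression vector
$$\Psi_t = [y_{1;t}, y_{2;t}, 1]'$$
so each of them has a statistic of the form
$$V_{i;t} = \begin{bmatrix} V_{11} & V_{12} & V_{13}\\ V_{21} & V_{22} & V_{23}\\ V_{31} & V_{32} & V_{33}\end{bmatrix} = \begin{bmatrix} V_y\,[2\times 2] & V_{y\psi}'\,[2\times 1]\\ V_{y\psi}\,[1\times 2] & V_\psi \end{bmatrix}, \qquad \kappa_{i;t}, \quad i = 1, 2,$$
where the counters are collected in $\kappa_t = [\kappa_1, \kappa_2]$ and $[i \times j]$ denotes the dimension of a sub-matrix.
The update of the statistics is
$$V_{i;t} = V_{i;t-1} + w_i \Psi_t \Psi_t', \qquad \kappa_{i;t} = \kappa_{i;t-1} + w_i, \qquad i = 1, 2.$$
Point estimates
$$\hat\theta_i' = V_\psi^{-1} V_{y\psi}, \qquad \hat r_i = \frac{V_y - \hat\theta_i V_{y\psi}}{\kappa_i}, \qquad i = 1, 2.$$
The pointer $c_t$ is a discrete random process with two values, $c_t \in \{1, 2\}$.
The pointer model is a static categorical model for the binary variable $c_t$
c_t              1         2
f(c_t|alpha)     alpha_1   alpha_2
The statistic has the same form as the model
c_t     1      2
nu_t    nu_1   nu_2
Update of the statistic
$$\nu_{i;t} = \nu_{i;t-1} + w_i, \qquad i = 1, 2.$$
Point estimate
$$\hat\alpha_t = \frac{[\nu_{1;t}, \nu_{2;t}]}{\sum_i \nu_{i;t}}$$
The pointer is estimated by substituting into the Gaussian densities (the computation is done in logarithms) and multiplying by the estimate of $\alpha$.
The estimation procedure is illustrated by the following program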
// Estimation of a simple mixture
// ------------------------------
clc, clear, close, close               // clear all
[u,t,n]=file();                        // find working directory
chdir(dirname(n(1)));                  // set working directory
getd();
load simple.dat ct yt                  // load data (from SimpleMixSim)
nd=max(size(yt));                      // length of data
nu=rand(2,1,'u');                      // pointer-model statistics
al=nu/sum(nu);                         // pointer-model parameter
V=list();
thE=list();
cvE=list();
// initial parameters
thE(1)=[6;2];                          // in simulation was 8,1
thE(2)=[1;3];                          // in simulation was 0,5
for j=1:2
  V(j)=[thE(j);1]*[thE(j);1]';         // initial data inf. matrix
  cvE(j)=.1*eye(2,2);                  // initial noise covariance
end
ka=rand(1,2,'u');                      // initial data counter
for t=1:nd                             // time loop ===================================
  // computation of weights
  for j=1:2
    [xxx,mL(j)]=GaussN(yt(:,t),thE(j),cvE(j));   // proximity
  end
  mp=mL-max(mL);                       // normalization
  m=exp(mp);                           // exponent
  w=m.*al;                             // component weights
  w=w/sum(w);
  wt(:,t)=w;                           // store weights
  // recomputation of statistics
  Ps=[yt(:,t);1];
  for j=1:2
    V(j)=V(j)+w(j)*Ps*Ps';
    ka(j)=ka(j)+w(j);
    // point estimates
    thE(j)=(inv(V(j)(3,3))*V(j)(3,1:2))';
    if t>50                            // in the beginning, cvE is fixed
      cvE(j)=(V(j)(1:2,1:2)-thE(j)*V(j)(3,1:2))/ka(j);
    end
  end
  nu=nu+w;                             // pointer statistics
  al=nu/sum(nu);                       // and parameter
  th1(:,t)=thE(1);                     // store
  th2(:,t)=thE(2);                     // for plot
end
// Results
scf(1);
plot(wt')
// set(gcf(),'position',[... 100 800 600])
subplot(211),plot(th1')
subplot(212),plot(th2')
save estim.dat
8 Classification: classical and with a mixture model
K-means
DB-scan
Hierarchical
Logistic
Mixture-based
Learning
Estimation of a mixture of components is performed in two steps: (i) classification, (ii) estimation. Classification is thus a part of estimation. It can proceed in the standard way: for the learning phase we use a learning data set that works with a teacher (i.e. we know the correct current classes, which are the pointer values). During this phase we perform both parts of the estimation, classification as well as parameter estimation.
Testing
Once the learning data set is exhausted, we estimate the current classes without knowing them. Here we perform only classification (construction of the weight vector $w_t$ without the subsequent update of the statistics and point estimates). To estimate the current class at time $t$ we use the weight vector $w_t$: the estimated class is given by the index of its largest entry.
Further learning
If we still learn, even only occasionally, which class was really the current one, we can train the classification algorithm further: for the known current class we also update the statistics and compute new point estimates of the parameters.
The procedure is illustrated by the following program
// Testing of a simple mixture
// connected with simpleMixSim.sce and simpleMixEst.sce
// data for testing are also from simpleMixSim
// ------------------------------
clc, clear, close, close               // clear all
[u,t,n]=file();                        // find working directory
chdir(dirname(n(1)));                  // set working directory
getd();
mode(0)
load estim.dat                         // load results of learning
load simple2.dat ct yt                 // load data for testing
for t=1:nd                             // time loop ===================================
  // computation of weights
  for j=1:2
    [xxx,mL(j)]=GaussN(yt(:,t),thE(j),cvE(j));   // proximity
  end
  mp=mL-max(mL);                       // normalization
  m=exp(mp);                           // exponent
  w=m.*al;                             // component weights
  w=w/sum(w);
  wt(:,t)=w;                           // store weights
end
// Results
[xxx,ce]=max(wt,'r');                  // point estimates of pointer
ce=ce';
wrong=sum(ct~=ce)                      // number of wrong classifications
s=1:100;
// set(gcf(),'position',[... 100 800 600])
plot(s,ct(s),'or','markersize',14)
plot(s,ce(s),'.b','markersize',10)
title('Testing of simple mixture estimation')
// axis setting, beginning of the call lost: ... max(s) min(ct)-.2 max(ct)+.2])
9 State-space model and filtering
The state-space model has the form
$$f(x_t|x_{t-1}, u_t) \rightarrow x_t = M x_{t-1} + N u_t + F + w_t$$
$$f(y_t|x_t, u_t) \rightarrow y_t = A x_t + B u_t + G + v_t$$
State estimation
$$f(x_{t-1}|d(t-1)) \;\xrightarrow[\text{filtering}]{d_t = \{y_t, u_t\}}\; f(x_t|d(t)) \;\xrightarrow{\text{prediction}}\; f(x_{t+1}|d(t))$$
Prediction of $x$:
$$f(x_t|d(t-1)) = \int f(x_t|x_{t-1}, u_t)\, f(x_{t-1}|d(t-1))\, dx_{t-1}$$
Filtering of $x$:
$$f(x_t|d(t)) \propto f(y_t|x_t, u_t)\, f(x_t|d(t-1))$$
Prediction of $y$:
$$f(y_t|u_t, d(t-1)) = \int f(y_t|x_t, u_t)\, f(x_t|d(t-1))\, dx_t$$
For the normal distribution we obtain the Kalman filter, i.e. a recursion for the statistics.
Usage in Scilab
[xt,Rx,yp]=Kalman(xt,yt,ut,M,N,F,A,B,G,Rw,Rv,Rx)
Rx   covariance of the state estimate (initial value 10^3*eye(nx,nx))
Rv   covariance of the noise of the output model, $v_t = y_t - A x_t - B u_t - G$
Rw   covariance of the noise of the state model, $w_t = x_{t+1} - M x_t - N u_t - F$
The initial values of Rv and Rw matter a great deal, and it is not at all easy to estimate them correctly.
9.1 Known model parameters
Example: noise filtering
We measure a noisy signal $y_t$ and want to recover the clean signal $x_t$.
- the measurement is the clean signal plus noise
$$y_t = x_t + v_t$$
- for the evolution of the clean signal we admit only the changes $w_t$; anything beyond that is attributed to the noise
$$x_{t+1} = x_t + w_t$$
The covariances must be chosen skilfully so that they allow the state to evolve but separate the noise from it.
Program
// Filtering of noise by Kalman filter
// -----------------------------------
clc, clear, close                      // clear all
[u,t,n]=file();                        // find working directory
chdir(dirname(n(1)));                  // set working directory
mode(0)                                // write values if no ;
getd()
// Simulation
i=0;
for t=0:.1:2*%pi
  i=i+1;
  x(1,i)=5*cos(t);                     // pure signal
  x(2,i)=2*sin(t);                     // (ellipse)
end
x=[1 .4; 0 1]*x;                       // rotated ellipse
nd=size(x,2);                          // length of the signal
yt=x+.5*rand(2,nd,'n');                // noisy signal
// Filtration
xt(:,1)=zeros(2,1);                    // initial state estimate
Rw=.08*eye(2,2);                       // state model noise
Rv=1*eye(2,2);                         // output model noise
Rx=1000*eye(2,2);                      // covariance of state estimate
A=eye(2,2);                            // matrices
C=eye(2,2);                            // of
F=[0;0];                               // model
for t=1:nd
  // Kalman filtration
  [xt(:,t+1),yp(:,t),Rx]=...
  KFilt(xt(:,t),yt(:,t),A,C,F,Rw,Rv,Rx);
end
// Results
scf(1);
plot(x(1,:),x(2,:),'g.')
plot(yt(1,:),yt(2,:),'r.')
plot(xt(1,:),xt(2,:),'b')
and the Kalman filter
function [xt,yp,Rx,ey]=KFilt(xt,yt,A,C,F,Rw,Rv,Rx)
// [xt,yp,Rx,ey]=KFilt(xt,yt,A,C,F,Rw,Rv,Rx)
// Kalman filter for the model in the following form
//    xt = A*xt + F
//    yp = C*xt
// ... first compute prediction t-1|t-1 --> t|t-1
//     and then filtration, i.e. t|t-1 --> t|t
// xt     state estimate
// yp     predicted output
// ey     prediction error
// yt     data sample
// A,C,F  model parameters
// Rw     state covariance
// Rv     output covariance
// Rx     state estimate covariance matrix
// Ry     output prediction covariance
// indicator ie: ie=0 => only state prediction is computed
//
// Prediction
xt=A*xt+F;                             // time updt of state
Rx=Rw+A*Rx*A';                         // time updt of state covariance
yp=C*xt;                               // data prediction
Ry=Rv+C*Rx*C';                         // output prediction covariance
// Filtration
Ryy=Ry+1e-8*eye(Ry);
Rx=Rx-Rx*C'*inv(Ryy)*C*Rx;             // data updt of state covariance
ey=yt-yp;                              // prediction error
KG=Rx*C'*inv(Rv);                      // Kalman gain
xt=xt+KG*ey;                           // data updt of state
endfunction
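The same predict-then-filter recursion can be followed by hand in the scalar case. The fragment below is a Python sketch (not the Scilab of these notes) for the random-walk example above; the function name `kalman_step` and the test data are hypothetical, while the default values of `rw` and `rv` are taken from the program.

```python
def kalman_step(x, P, y, a=1.0, c=1.0, f=0.0, rw=0.08, rv=1.0):
    """One prediction + filtration step for the scalar model
    x_t = a*x_{t-1} + f + w_t,  y_t = c*x_t + v_t."""
    # prediction (time update)
    x_pred = a * x + f
    P_pred = rw + a * P * a
    y_pred = c * x_pred
    Ry = rv + c * P_pred * c
    # filtration (data update)
    P_new = P_pred - P_pred * c / Ry * c * P_pred
    gain = P_new * c / rv                # same form as KG=Rx*C'*inv(Rv)
    x_new = x_pred + gain * (y - y_pred)
    return x_new, P_new, y_pred, gain

# the gain computed from the updated covariance equals the textbook form
x1, P1, yp1, k1 = kalman_step(0.0, 1000.0, 5.0)
k_textbook = (0.08 + 1000.0) / (0.08 + 1000.0 + 1.0)

# constant, noise-free measurements: the estimate settles on the measured value
x, P = 0.0, 1000.0
for y in [5.0] * 50:
    x, P, yp, k = kalman_step(x, P, y)
print(x, P, k)
```

The large initial covariance makes the first gain almost one, so the first measurement is nearly copied into the state; afterwards the gain settles at a value determined by the ratio of `rw` to `rv`.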
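9.2 With unknown model parameters
State augmentation
If some parameter of the model is not known, we regard it as an unknown quantity and add it to the other unknown quantities, i.e. to the state. For example, we have a model for the two-dimensional state $x_t = [x_{1;t}, x_{2;t}]'$ and the scalar output $y_t$
$$x_t = \begin{bmatrix} a & 0\\ 0 & 1-a \end{bmatrix} x_{t-1} + \begin{bmatrix} 1\\ 1 \end{bmatrix} u_t + w_t$$
$$y_t = [0.2, 0.8]\, x_t$$
where $a$ is unknown. We include it among the unknown quantities $x_t$. We define the new state $z_t$
$$z_{1;t} = x_{1;t}, \qquad z_{2;t} = x_{2;t}, \qquad z_{3;t} = a$$
The model then has the form (without the noise, which is of no interest at this moment)
$$z_t = \begin{bmatrix} z_{3;t-1} & 0 & 0\\ 0 & 1 - z_{3;t-1} & 0\\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} z_{1;t-1}\\ z_{2;t-1}\\ z_{3;t-1} \end{bmatrix} + \begin{bmatrix} 1\\ 1\\ 0 \end{bmatrix} u_t$$
$$y_t = [0.2, 0.8, 0]\, z_t$$
We see that the resulting model is nonlinear: the states appear in products. Therefore we have to linearize. The model has the form
$$z_t = g(z_{t-1}) + N u_t$$
$$y_t = A z_t$$
Linearization
We linearize by replacing the nonlinear parts of the model with the first two terms of the Taylor expansion [2]. The expansion is made at the point of the last estimate $\hat z_{t-1}$.
[2] We must realize that the model is a relation between the variables $x_{t-1}$ and $x_t$. The Taylor expansion is made for the nonlinear function $g(x)$ at a certain fixed point $\hat x$. The value of the function is then expressed in a neighbourhood of this point as $g(x) = g(\hat x) + g'(\hat x)(x - \hat x)$. As the fixed point we always choose the last estimate $\hat x_{t-1}$ of the state $x_{t-1}$. Note that we are still dealing with the right-hand side of the model only! No $x_t$ enters the expansion.
For the expansion we need the value of $g$ at the last estimate $\hat z_{t-1}$
$$g(\hat z_{t-1}) = \begin{bmatrix} \hat z_{3;t-1} \hat z_{1;t-1}\\ (1 - \hat z_{3;t-1}) \hat z_{2;t-1}\\ \hat z_{3;t-1} \end{bmatrix}$$
and the derivative (Jacobian matrix) of the function $g$ at the same point $\hat z_{t-1}$
$$g'(\hat z_{t-1}) = \begin{bmatrix} \hat z_{3;t-1} & 0 & \hat z_{1;t-1}\\ 0 & 1 - \hat z_{3;t-1} & -\hat z_{2;t-1}\\ 0 & 0 & 1 \end{bmatrix}$$
The linearized state model is
$$z_t = g(\hat z_{t-1}) + g'(\hat z_{t-1})(z_{t-1} - \hat z_{t-1}) + N u_t$$
and after rearrangement
$$z_t = \underbrace{g'(\hat z_{t-1})}_{\tilde M} z_{t-1} + N u_t + \underbrace{g(\hat z_{t-1}) - g'(\hat z_{t-1}) \hat z_{t-1}}_{\tilde F}$$
where
$$\tilde M = \begin{bmatrix} \hat z_{3;t-1} & 0 & \hat z_{1;t-1}\\ 0 & 1 - \hat z_{3;t-1} & -\hat z_{2;t-1}\\ 0 & 0 & 1 \end{bmatrix}$$
and
$$\tilde F = \begin{bmatrix} \hat z_{3;t-1} \hat z_{1;t-1}\\ (1 - \hat z_{3;t-1}) \hat z_{2;t-1}\\ \hat z_{3;t-1} \end{bmatrix} - \begin{bmatrix} \hat z_{3;t-1} & 0 & \hat z_{1;t-1}\\ 0 & 1 - \hat z_{3;t-1} & -\hat z_{2;t-1}\\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} \hat z_{1;t-1}\\ \hat z_{2;t-1}\\ \hat z_{3;t-1} \end{bmatrix}$$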
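The linearization can be checked numerically. The Python sketch below (helper names and the value of the last estimate are hypothetical) builds the matrix of first derivatives and the constant term for the augmented model and compares the linearized right-hand side with the nonlinear one. Multiplying out the constant term by hand gives $\tilde F = [-\hat z_1 \hat z_3,\ \hat z_2 \hat z_3,\ 0]'$, which the code reproduces.

```python
def g(z):
    """Nonlinear part of the augmented state model (no input, no noise)."""
    z1, z2, z3 = z
    return [z3 * z1, (1.0 - z3) * z2, z3]

def jacobian(z):
    """Matrix of first derivatives of g evaluated at z."""
    z1, z2, z3 = z
    return [[z3, 0.0, z1],
            [0.0, 1.0 - z3, -z2],
            [0.0, 0.0, 1.0]]

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def linearize(z_hat):
    """Return (M_tilde, F_tilde) of z_t ~ M_tilde*z_{t-1} + N*u_t + F_tilde."""
    M = jacobian(z_hat)
    F = [gi - mi for gi, mi in zip(g(z_hat), matvec(M, z_hat))]
    return M, F

z_hat = [2.0, -1.0, 0.3]                 # hypothetical last state estimate
M, F = linearize(z_hat)

# compare the nonlinear and the linearized right-hand side close to z_hat
z_near = [zi + 1e-3 for zi in z_hat]
lin = [mi + fi for mi, fi in zip(matvec(M, z_near), F)]
err = max(abs(a - b) for a, b in zip(lin, g(z_near)))
print(M, F, err)
```

At the expansion point the two right-hand sides coincide; a small step away, the error is of second order in the step.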
10 Hypothesis testing
Bayesian testing
We test hypotheses about which model best describes a given data sample. We have $n$ hypotheses $H_i$, $i = 1, 2, \cdots, n$, and each of them specifies its own model as a candidate for the best description of the data. The models may
1. differ even in structure,
2. have the same structure but differ in
(a) the values of the parameters,
(b) the sets of admissible parameters.
This general approach was proposed by Dr. Peterka and has the following form:
We have $n$ hypotheses $H_i$ and a data sample $d(t)$. We compute the probabilities $f(H_i|d(t))$ for $i = 1, 2, \cdots, n$. As the winning hypothesis we take the one for which this probability is maximal.
The probabilities are computed as follows
$$f(H_i|d(t)) \propto f(d(t)|H_i) \underbrace{f(H_i)}_{\text{prior}}$$
The first pdf on the right-hand side is the pdf of the data sample (under the assumption $H_i$), which we want to describe by the model according to $H_i$; hence
$$f(d(t)|H_i) = \int f(d(t), \Theta|H_i)\, d\Theta = \int f_i(d(t)|\Theta_i) \underbrace{f(\Theta_i)}_{\text{prior}}\, d\Theta_i$$
where $f_i$ denotes the model according to the hypothesis $H_i$ with its parameters $\Theta_i$, and $f(\Theta_i)$ is the prior pdf for $\Theta_i$, which specifies the $i$-th model either by a value $\hat\Theta$, in which case $f(\Theta_i) = \delta(\Theta - \hat\Theta)$, or by an interval in which the parameter lies. In the first case we simply substitute the value of the parameter; in the second case we must carry out the indicated integration.
Remark
By specifying the model and its parameters the hypothesis is exhausted and no longer has to be written in the condition.
The first pdf on the right is the likelihood $L_t(\Theta_i)$ for the parameter $\Theta_i$, which is computed as a product of models
$$L_t(\Theta_i) = \prod_{\tau=1}^{t} f_i(d_\tau|\psi_\tau, \Theta_i)$$
The probabilities we are looking for are therefore
$$f(H_i|d(t)) \propto \int L_t(\Theta_i) f(\Theta_i)\, d\Theta_i\; f(H_i) \quad \text{in general} \qquad (10.1)$$
$$f(H_i|d(t)) \propto L_t(\hat\Theta_i)\, f(H_i) \quad \text{for fixed } \Theta_i \qquad (10.2)$$
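For hypotheses given by fixed parameter values, the posterior probabilities are just normalized products of likelihood and prior. The following Python sketch (the function name and the numbers are hypothetical) shows this for a model with likelihood $a^{n}(1-a)^{m}$, where $n$ and $m$ are the counts of the two possible data values; the computation is done in logarithms to avoid underflow for long data samples.

```python
import math

def hypothesis_probabilities(n_ones, n_twos, a_values, prior):
    """Posterior probabilities of the hypotheses H_i: a = a_values[i],
    for the likelihood a**n_ones * (1 - a)**n_twos."""
    logp = [n_ones * math.log(a) + n_twos * math.log(1.0 - a) + math.log(p)
            for a, p in zip(a_values, prior)]
    top = max(logp)                      # pre-normalization in logarithms
    unnorm = [math.exp(l - top) for l in logp]
    s = sum(unnorm)
    return [u / s for u in unnorm]

# 10 ones and 10 twos, three candidate proportions, uniform prior on hypotheses
post = hypothesis_probabilities(10, 10, [0.3, 0.5, 0.7], [1 / 3, 1 / 3, 1 / 3])
print(post)
```

With equal counts the hypotheses a = 0.3 and a = 0.7 are equally probable and a = 0.5 wins.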
For the alternative model $a^{n}(1-a)^{n_d - n}$ with $a \in (0, 1)$, where $a$ is the proportion, $n$ is the number of ones and $n_d - n$ is the number of twos, the testing procedure is illustrated by the following program
// Testing of a proportion model
// -----------------------------
clc, clear, close, close               // clear all
[u,t,n]=file();                        // find working directory
chdir(dirname(n(1)));                  // set working directory
nd=200;                                // number of data
a=.5;                                  // true proportion
a0=.3;                                 // tested proportion
dt=(rand(1,nd)>a)+1;                   // data
na=sum(dt==1);                         // No of data equal to 1
// the integrand a^na*(1-a)^(nd-na) corresponds to beta arguments na+1, nd-na+1
B0=beta(na+1,nd-na+1);                 // beta function
Bi=distfun_betacdf(a0,na+1,nd-na+1);   // inc. beta fc
b1=B0*Bi                               // for < int_0:a0 (Lik)
b3=B0*(1-Bi);                          // for > int_a0:1 (Lik)
P(1)=b1/(3*a0);                        // prob. for <
P(2)=a0^na*(1-a0)^(nd-na)/3;           // prob. for =
P(3)=b3/(3*(1-a0));                    // prob. for >
Ph=P/sum(P);                           // normalizing
// RESULTS
disp(Ph','Probabilities of hypotheses')
Hypothesis testing by means of mixtures
Looking at relations (10.1) and (10.2), we find that for models given by fixed parameters they strikingly resemble the classification step in mixture estimation. Unlike in mixture estimation, however, the parameters here stay constant; they are not estimated. What is estimated recursively are the hypotheses (in the role of the pointer) and the pointer parameter $\alpha$, which is the vector of stationary probabilities of the individual hypotheses, i.e. $f(H_i|d(\tau))$ for $\tau = 1, 2, \cdots, t$. The final $\alpha$ is thus the pdf $f(H_i|d(t))$ according to which we test. The algorithm is best seen from the program
// Hypotheses with three static normal models
// ------------------------------------------
clc, clear, close, close               // clear all
[u,t,n]=file();                        // find working directory
chdir(dirname(n(1)));                  // set working directory
nd=100;
// Simulation
for t=1:nd
  y(:,t)=[3;5]+1*rand(2,1,'n');        // normal noise, generated as in the other programs
end
// Inicialization
H=fnorm([1 1 1]);                      // f(H) hypotheses
th=list();
th(1)=[2;3];
th(2)=[3;4];
th(3)=[3;1];
sd=.5;
// Proximities
lp=ones(1,3);
for t=1:nd
  for i=1:3
    // the log-density is taken as the second output of GaussN,
    // as in the mixture estimation program
    [xxx,l]=GaussN(y(:,t),th(i),sd);
    lp(i)=lp(i)+l;                     // probabilities in logarithm
  end
end
lp=lp-max(lp);                         // pre-normalization
p=exp(lp);                             // taking exponent
pH=p.*H;                               // proxim * f(H)
fH=pH/sum(pH);                         // final normalization
// Results
disp(fH,"Probabilities of hypotheses")
11 Applications
We present some applications that have already been completed. Further ones should be added here over time.
11.1 Prediction of traffic flow intensity (regression model)
At a specific point of an urban traffic network we measure intensities $I_t$ on magnetic detectors and green proportions $G_t$ on a traffic light. Our goal is to construct a model describing the future values of the intensity in dependence on the other measured variables.
Solution
From what has been said it follows that we are looking for the conditional pdf (model)
$$f(I_{t+n}|I_{1:t}, G_{1:t+n-1}, P)$$
where $n$ is the number of prediction steps and $P$ denotes prior information (if any exists).
Notice that in the model we suppose the control strategy $G_{1:t+n-1}$ to be known for the whole prediction interval. If it were not, we would need to know a model according to which the control works.
As a model representation, we stick to the regression model (for better clarity with the indexes shifted one step ahead)
$$I_{t+1} = \psi_t' \theta + e_t \qquad (11.1)$$
with a carefully defined regression vector $\psi_t$ and normally distributed noise $e_t$ with zero expectation $\mu = 0$ and constant (let us suppose known) variance $\sigma^2$.
Now for the regression vector. If we decide that the nature of the situation is static, meaning that values of the variables older than the present ones do not affect the next ones, we can choose the regression vector as
$$\psi_t = [I_t, G_t, 1]'$$
where 1 stands for the absolute term of the model.
Thus, the model (11.1) at time $t$ has the form
$$I_{t+1} = \theta_1 I_t + \theta_2 G_t + \theta_3 + e_t$$
which corresponds to the pdf form of the model
$$f(I_{t+1}|\psi_t, \theta)$$
Now as for the prediction. For $n = 1$ we get the one-step prediction, which is of no exceptional importance as it is performed just by the model itself. So we will look at the case $n = 2$, the two-step prediction.
In pdfs, we get
$$f(I_{t+2}|I_{1:t}, G_{1:t+1}, P) = \int\!\!\int f(I_{t+2}, I_{t+1}, \theta|I_{1:t}, G_{1:t+1}, P)\, dI_{t+1}\, d\theta =$$
$$= \int\!\!\int f(I_{t+2}|I_{t+1}, I_{1:t}, G_{1:t+1}, P, \theta)\, f(I_{t+1}|I_{1:t}, G_{1:t+1}, P, \theta)\, f(\theta|I_{1:t}, G_{1:t+1}, P)\, dI_{t+1}\, d\theta =$$
$$= \int \left[ \int f(I_{t+2}|\psi_{t+1}, \theta)\, f(I_{t+1}|\psi_t, \theta)\, dI_{t+1} \right] f(\theta|I_{1:t}, G_{1:t+1}, P)\, d\theta$$
where the first two pdfs from the left are the models at times $t+1$ and $t$, and the last one represents the description of the estimated parameter $\theta$. This one has to be developed in the process of estimation.
The way described is optimal in the Bayesian sense, but computationally it is very difficult. The integrals are rarely solvable analytically and we have to look for numerical solutions. Monte Carlo solutions are strongly recommended as robust and reliable.
If we decide to separate the estimation and the prediction, in the sense that at each time $t$ we compute the point estimates of $\theta$ and substitute them into the model, we obtain a very simple way of prediction. It consists in substituting into the model repeatedly as time goes ahead. If the real measured values are missing, we use the predictions from the previous steps.
Let us indicate the computation for a single time instant $t$. For the prediction $\hat I_{t+1}$ of the intensity $I_{t+1}$ it holds
$$\hat I_{t+1} = \theta_1 I_t + \theta_2 G_t + \theta_3$$
which is the conditional expectation of the intensity $I_{t+1}$ (see (??)).
Now, for the prediction of the intensity $I_{t+2}$ we would like to use the same formula shifted in time by one. Here, the intensity $I_{t+1}$ occurs in the regression vector. However, its value is not known to us (as we are actually at time $t$). So we replace it by the prediction gained in the last prediction step, i.e.
$$\hat I_{t+2} = \theta_1 \hat I_{t+1} + \theta_2 G_{t+1} + \theta_3$$
If we substitute for $\hat I_{t+1}$ we obtain
Result
The point two-step prediction is
$$\hat I_{t+2} = \theta_1^2 I_t + \theta_2 G_{t+1} + \theta_1 \theta_2 G_t + \theta_3 (\theta_1 + 1)$$
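The substitution can be verified numerically. The Python sketch below (the parameter values and data are made up for illustration) applies the one-step predictor twice and compares the result with the closed form obtained by substituting $\hat I_{t+1}$ into the formula for $\hat I_{t+2}$, which gives $\theta_1^2 I_t + \theta_2 G_{t+1} + \theta_1 \theta_2 G_t + \theta_3(\theta_1 + 1)$.

```python
def one_step(theta, I, G):
    """Point prediction I_hat(t+1) = th1*I_t + th2*G_t + th3."""
    th1, th2, th3 = theta
    return th1 * I + th2 * G + th3

def two_step_iterated(theta, I_t, G_t, G_t1):
    """Apply the one-step predictor twice, feeding the prediction back."""
    return one_step(theta, one_step(theta, I_t, G_t), G_t1)

def two_step_closed(theta, I_t, G_t, G_t1):
    """Closed form after substituting I_hat(t+1) into I_hat(t+2)."""
    th1, th2, th3 = theta
    return th1 ** 2 * I_t + th2 * G_t1 + th1 * th2 * G_t + th3 * (th1 + 1.0)

theta = (0.6, 12.0, 3.0)                 # hypothetical point estimates
iterated = two_step_iterated(theta, 40.0, 0.5, 0.4)
closed = two_step_closed(theta, 40.0, 0.5, 0.4)
print(iterated, closed)
```

Both routes give the same number, so in practice the prediction for any horizon can be computed by simply iterating the one-step predictor.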
11.2 Estimation of queue length (discrete model)
Let us monitor a queue in a single arm of a controlled crossroads and model its length in dependence on its last value and on the green length of the signal light. Let us distinguish only discrete values of the mentioned variables. Thus, we set the queue $Q_t = 1, 2, 3$ (no queue, short, long) and the green $G_t = 1, 2$ (short and long green).
Solution
Within this setup, the model according to (??) has the structure
[G_t, Q_{t-1}]    Q_t = 1        Q_t = 2        Q_t = 3
[1, 1]            Theta_{1|11}   Theta_{2|11}   Theta_{3|11}
[1, 2]            Theta_{1|12}   Theta_{2|12}   Theta_{3|12}
[1, 3]            Theta_{1|13}   Theta_{2|13}   Theta_{3|13}
[2, 1]            Theta_{1|21}   Theta_{2|21}   Theta_{3|21}
[2, 2]            Theta_{1|22}   Theta_{2|22}   Theta_{3|22}
[2, 3]            Theta_{1|23}   Theta_{2|23}   Theta_{3|23}
The statistics has the same structure, too. Its update according to (??) is
$$V_{i|jk;t} = V_{i|jk;t-1} + 1 \quad \text{for } Q_t = i \text{ and } G_t = j,\ Q_{t-1} = k.$$
All the other items of the statistics stay unchanged.
Result
The estimation of the parameter $\Theta$ lies in a plain normalization of the statistics $V$, so that the sum of its items along each row is one. I.e. for the row $[j, k]$ we have
$$\hat\Theta_{\cdot|jk} = \frac{V_{\cdot|jk}}{\sum_{i=1}^{3} V_{i|jk}}$$
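The update and the normalization amount to counting and dividing. A Python sketch follows; the function name and the short data record are made up for illustration, and zero initial statistics are assumed, so rows that never occur in the data are simply absent from the result.

```python
from collections import defaultdict

def estimate_queue_model(Q, G):
    """Count transitions and normalize each row [j,k]:
    Theta[i|jk] = V[i|jk] / sum_i V[i|jk].
    Q[t] is in {1,2,3}, G[t] in {1,2}; the regression vector is [G_t, Q_{t-1}]."""
    V = defaultdict(lambda: [0, 0, 0])          # zero initial statistics (assumed)
    for t in range(1, len(Q)):
        V[(G[t], Q[t - 1])][Q[t] - 1] += 1      # V_{i|jk} <- V_{i|jk} + 1
    theta = {}
    for jk, counts in V.items():
        total = sum(counts)
        theta[jk] = [c / total for c in counts]
    return theta

Q = [1, 2, 3, 3, 2, 1, 2, 3, 2, 2]              # made-up queue record
G = [1, 1, 1, 2, 2, 2, 1, 1, 2, 2]              # made-up green record
theta = estimate_queue_model(Q, G)
for jk in sorted(theta):
    print(jk, theta[jk])
```

In a real application a small positive initial statistic (prior information) would usually be used so that every row of the table is defined.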
11.3 Classification of road elements safety (logistic model)
By road elements we mainly mean parts of crossroads or traffic circles in an urban road network. However, an attempt to include rural zones has also been made and will be presented here.
The result of an accident is taken as the modeled variable $y$. It takes two possible values: $y = 0$ means an accident accompanied by injury or death, and $y = 1$ means an accident without injury.
We are going to investigate the influence of other variables on the type of accident. Namely they are: daytime ($x_1 = 1, 2, 3$: day, dusk/dawn, night), visibility ($x_2 = 1, 2, 3, 4$: clear, mist, rain, snow), speed ($x_3 = 1, 2$: normal, high), cause ($x_4 = 1, 2, 3, 4$: speed, way of driving, overtaking, other), type of accident ($x_5 = 1, 2, 3, 4$: danger, crash, fixed barrier, animal).
In this way we obtain a purely discrete model
$$\mathrm{logit}(p_k) = x_k \theta + e_k \qquad (11.2)$$
$$p_k = f(y_k|x_k, \theta)$$
where
• $y_k$ is the $k$-th measurement of the output (result of an accident)
• $x_k$ is the regression vector corresponding to $y_k$; it has the form
$$x_k = [1, x_1, x_2, x_3, x_4, x_5]$$
where the initial 1 stands for the model constant
• $\theta$ is the vector of regression coefficients $\theta = [\theta_0, \theta_1, \cdots, \theta_5]'$.
A special feature of this task is a severe lack of data. Each data item corresponds to an accident which costs money or even lives. No artificial augmentation of the data is possible.
This fact implies that
• the structure of the model must be chosen carefully (choice of variables for the regression vector)
• the estimation must use all available means that bring information (prior expert information)
• a thorough validation of the estimated model must be performed (check of the prediction error).
Solution
A sample of 63 data items, i.e. couples $[y_k, x_k]$, $k = 1, 2, \cdots, 63$, was at disposal. It was assembled by the police of ČR on the highway II/114 in the north of the Czech Republic. The measurements were collected over the period 2000-2006.
The structure of the regression model was optimized using the model (11.2) for all possible combinations of the variables at disposal. The result is
quality coefficient    variables for regression vector
0.65857                1 2 3
0.65857                1 2 3 5
0.65857                1 2 3 4
0.65857                1 2 3 4 5
0.71576                [ ]
where the numbers coincide with the indexes of the variables in $x$. The symbol [ ] denotes the regression vector reduced to a constant only. The table shows that even the best model structure is not much better than the plain constant.
A standard logistic estimation with the best regression vector configuration, using the collected data sample, gives the parameters
$$\theta = [0.37, 0.80, -0.52, 0.63]$$
So, in the estimated model, the higher the values of the first and third variables and the lower the value of the second variable, the higher the logit of the probability that $y = 1$ (accident without injury), and vice versa. This follows directly from the definition of the model in (11.2).
The result in graph is in the Figure ??.
Figure is missing
From the picture, it can be seen that only one zero has been caught. And no wonder. In the data sample there is a big dominance of ones (fortunately, as $y = 1$ means an accident without injury). However, we would like the prediction to be more precise. The only way to achieve this is to combine the information brought by the data with that provided by traffic experts.
First of all, let us inspect what the data sample looks like. The following table lists all the different regression vectors and the counts of zeros and ones observed for each of them
No.   x1   x2   x3   y=0   y=1
1     1    1    1    6     23
2     1    1    2    1     1
3     1    2    1    0     1
4     1    2    2    0     1
5     1    3    1    0     1
6     1    3    2    2     1
7     1    4    1    2     0
8     1    4    2    0     3
9     2    1    1    0     2
10    3    1    1    0     11
11    3    3    1    2     0
12    3    4    1    0     3
13    3    4    2    0     3
The table shows that in reality (which means full discrete estimation), the regression vectors No. 6, 7 and 11 point to $y = 0$, and even for them the probabilities are not pronounced. The approximation made by the regression results in only one zero being predicted.
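The comparison between the regression approximation and the full discrete estimate can be reproduced from the numbers given above. The Python sketch below rests on two assumptions that the text does not state explicitly: the coefficients are ordered as [constant, x1, x2, x3], and a regression vector is taken to predict $y = 0$ when the modeled probability of $y = 1$ falls below 0.5. Since the coefficients are rounded, the output is indicative only.

```python
import math

THETA = [0.37, 0.80, -0.52, 0.63]        # assumed order: [theta0, theta1, theta2, theta3]

# (x1, x2, x3, count of y=0, count of y=1), rows No. 1..13 of the table
ROWS = [(1, 1, 1, 6, 23), (1, 1, 2, 1, 1), (1, 2, 1, 0, 1), (1, 2, 2, 0, 1),
        (1, 3, 1, 0, 1), (1, 3, 2, 2, 1), (1, 4, 1, 2, 0), (1, 4, 2, 0, 3),
        (2, 1, 1, 0, 2), (3, 1, 1, 0, 11), (3, 3, 1, 2, 0), (3, 4, 1, 0, 3),
        (3, 4, 2, 0, 3)]

def prob_y1(x1, x2, x3, theta=THETA):
    """P(y=1|x) from logit(p) = theta0 + theta1*x1 + theta2*x2 + theta3*x3."""
    logit = theta[0] + theta[1] * x1 + theta[2] * x2 + theta[3] * x3
    return 1.0 / (1.0 + math.exp(-logit))

for no, (x1, x2, x3, n0, n1) in enumerate(ROWS, start=1):
    p_model = prob_y1(x1, x2, x3)        # logistic approximation
    p_data = n1 / (n0 + n1)              # full discrete estimate from the counts
    print(no, round(p_model, 3), round(p_data, 3))

# regression vectors for which the logistic model predicts y = 0
predicted_zero = [no for no, r in enumerate(ROWS, start=1)
                  if prob_y1(r[0], r[1], r[2]) < 0.5]
print(predicted_zero)
```

The discrete estimate follows the counts row by row, whereas the logistic model has to express all thirteen probabilities through four coefficients, which is where the approximation error comes from.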
One way to trim the probabilities is to add some prior regression vectors and corresponding output values which support our belief. However, due to the approximation, not only the desired predictions change but also some others, which could then become wrong. How to apply the fictitious regression vectors so as to correct what is needed and spoil as little as possible of the rest is still under investigation.
Result
As a possible result, we can generate predictions for all possible regression vectors and thus predict their dangerousness.
11.4 Testing of safety of road elements in urban traffic network
This is a variant of the previous task. We shall demonstrate it on traffic circles; however, it can be applied to other traffic elements like crossroads, overpasses, underpasses, grade crossings etc.
The main notion here is the semi-accident. It is a situation in which the passing vehicle gets into a dangerous situation (more or less a collision position). It can be e.g. a deviation from the regular way of passing, a necessity of a sudden change of direction, unexpected braking and so on. From this viewpoint, the modeled variable $y$ will be the way a vehicle passes. There are two basic values of the output: $y = 0$ means a normal pass, $y = 1$ means a pass with a semi-accident. Possibly, various kinds of semi-accidents can be distinguished, bringing more values of the output.
The variables belonging to the regression vector are again those selected factors of the circle that influence the output. They are e.g. $x_1$ the height of the central circle, $x_2$ the angle of the approach in which the vehicle enters the circle, $x_3$ the range of vision at the entrance of the monitored arm etc.
The task is to determine whether the traffic circle is safe, which means whether semi-accidents occur only in some small admissible ratio $p_0$ of the whole amount of vehicle passes. If not, we should determine which factors have the major influence on the high ratio of semi-accidents.
The model of this situation is a discrete one, describing the vehicle pass
$$f(y_k|x_k, p) = \prod_{y,x \in \{y,x\}^*} p_{y,x}^{\delta([y,x];[y_k,x_k])} \qquad (11.3)$$
where $p$ is the model parameter, $\delta$ is the Kronecker function ($\delta(u, v) = 1$ for $u = v$ and zero otherwise) and $\{y, x\}^*$ stands for all possible values of the couple $[y, x]$.
Solution
The solution follows the Bayesian testing of hypotheses about the proportion $p$ of semi-accidents from the model (11.3). A general solution to the Bayesian testing of hypotheses is briefly described in the Section ??. Here, a slightly extended version covering the conditional form of the model (the probability $p$ depends on the regression vector $x$) will be considered.
We have a data sample with $N$ items: records about passes of vehicles in the monitored traffic circle together with the actual values of the factors (variables from the regression vector).
Hereafter, we will accept some border proportion of semi-accidents $p_0$ and distinguish two possible models
1. $H_1$: safe situation, the model has the parameter $p < p_0$, $S_1 = (0, p_0)$,
2. $H_2$: unsafe situation, the model has the parameter $p > p_0$, $S_2 = (p_0, 1)$.
Through the Bayesian testing we learn which hypothesis has the bigger probability and thus can be accepted as valid.
The integral (??), see the Section ??, has the form
$$I_N^{(i)} = \int_{S_i} \left[ \prod_{k=1}^{N} \prod_{y,x \in \{y,x\}^*} p_{y,x}^{\delta([y,x];[y_k,x_k])} \right] \prod_{y,x \in \{y,x\}^*} p_{y,x}^{\nu_{y,x;0} - 1}\, dp = \int_{S_i} \prod_{y,x \in \{y,x\}^*} p_{y,x}^{\nu_{y,x;N} - 1}\, dp \qquad (11.4)$$
with $\nu_{y,x;N} = \nu_{y,x;0} + \sum_{k=1}^{N} \delta([y, x]; [y_k, x_k])$.
The integral (11.4) defines the generalized beta function $B(\nu_N)$ with $\nu_N$ being a tensor (multi-indexed variable). Its definition through the gamma function is provided in the Section ??.
Result
The probabilities of the hypotheses $H_1$ and $H_2$ are, see (??),
$$f(H_1|d(N)) \propto B_{(0,p_0)}(\nu_{y,x;N})\, f(H_1)$$
$$f(H_2|d(N)) \propto B_{(p_0,1)}(\nu_{y,x;N})\, f(H_2)$$
where $B_{(\cdot,\cdot)}$ is the incomplete beta function, see Section ??.
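For the simplest case without a regression vector (a single proportion $p$ of semi-accidents), the two probabilities can be sketched as follows. This Python fragment is an illustration under stated assumptions: the incomplete beta function is replaced by a plain midpoint-rule integration, the initial statistics are set to one (flat prior on $p$), both hypotheses have prior probability 0.5, and the counts are made up.

```python
def interval_integral(a, b, lo, hi, steps=20000):
    """Midpoint-rule integral of p**(a-1) * (1-p)**(b-1) over (lo, hi);
    a numerical stand-in for the incomplete beta function (a, b >= 1 assumed)."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        p = lo + (i + 0.5) * h
        total += p ** (a - 1) * (1.0 - p) ** (b - 1)
    return total * h

def safety_test(n_semi, n_normal, p0, prior=(0.5, 0.5), nu0=(1.0, 1.0)):
    """Probabilities of H1: p < p0 (safe) and H2: p > p0 (unsafe)."""
    a = nu0[0] + n_semi                  # statistics for y = 1 (semi-accident)
    b = nu0[1] + n_normal                # statistics for y = 0 (normal pass)
    h1 = interval_integral(a, b, 0.0, p0) * prior[0]
    h2 = interval_integral(a, b, p0, 1.0) * prior[1]
    return h1 / (h1 + h2), h2 / (h1 + h2)

safe_case = safety_test(2, 98, 0.1)      # 2 semi-accidents in 100 passes
unsafe_case = safety_test(30, 70, 0.1)   # 30 semi-accidents in 100 passes
print(safe_case, unsafe_case)
```

With a regression vector, the same computation would be carried out with the multi-indexed statistics $\nu_{y,x;N}$ in place of the two counts.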
11.5 Estimation of error in a guess about length of time a interval
This problem is connected with testimony of witnesses about some accident. The real length of
certain events is very important with respect to the guilt or innocence of accident participants.
At the same time, errors in testimonies of dierent persons can be large.
A set of testimonies from various witnesses is at disposal as a data sample. The items of this
sample consist in the guessed length (d - continuous) and values of several monitored features of
the witness, namely: real length of the estimated interval (x1 ), sex (x2 ), age (x3 ) and possibly
other, all discrete variables. For the data sample we know the real lengths of intervals, so we
can determine the errors (y ).
The task is to describe evolution of the error when the amount of witness features wary typically grows. E.g. I have a testimony from a man 43 years old. How will my belief into this
testimony change when I learn that the man is university graduated? Thus, we want to know
all dependencies between the error and the features as well as between the features themselves.
44
Solution
The model of what is demanded is the joint pdf $f(y, x)$, where $x = [x_1, x_2, \cdots, x_n]$ and $n$ is the number of monitored features.
This joint pdf can be constructed in the following way
$f(y, x) = f(y|x) f(x)$
where $f(y|x)$ is the conditional pdf describing the influence of the features on the error and $f(x)$ is the joint pdf determining the relations between the features. This joint pdf of the features is fully in our hands as we determine them.
Construction of $f(y|x)$
Linear regression is used. The model is
$y = x\theta + e = \theta_0 + \theta_1 x_1 + \cdots + \theta_n x_n + e$ (11.5)
where $e$ is a zero-mean random variable with constant and unknown variance $r$. This variance $r$ and the parameters $\theta$ are estimated.
The desired pdf $f(y|x)$ is constructed as a transformation of $e$ to $y$ according to (11.5). Thus, if the noise $e$ is $N(0, r)$, the pdf is $f(y|x) = N(x\theta, r)$.
Construction of $f(x)$
It is an estimation of a discrete model, which amounts to counting how many times the individual states occurred in the data sample. Dividing by the sample length gives the probabilities of the states, which form the joint pdf. Let us demonstrate the procedure on an example with two features $x_1$ and $x_2$, both with values 1, 2.
Example:
The discrete joint pdf of $x_1, x_2$ can be written in the form of a table

f(x1, x2)   f(1, 1)   f(2, 1)   f(1, 2)   f(2, 2)
x1          1         2         1         2
x2          1         1         2         2

where
$f(i, j) = \frac{1}{N} \sum_{k=1}^{N} \delta([i, j], [x_1, x_2]_k)$,
$\delta(u, v)$ is the Kronecker delta, which is one for $u = v$ and zero otherwise, $[x_1, x_2]_k$ is the $k$-th item of the data sample and $N$ is the number of items in the data sample.
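The counting formula for $f(i, j)$ can be sketched as follows. The function name `joint_pmf` and the data sample are hypothetical; the code only illustrates the relative-frequency estimate described in the example.

```python
from collections import Counter

def joint_pmf(samples):
    """Estimate f(i, j) as the relative frequency of each state [i, j]."""
    n = len(samples)
    counts = Counter(samples)    # counts[(i, j)] = sum_k delta([i, j], [x1, x2]_k)
    return {state: c / n for state, c in counts.items()}

# Hypothetical data sample of (x1, x2) pairs, both features with values 1, 2
samples = [(1, 1), (2, 1), (1, 1), (2, 2), (1, 2), (1, 1), (2, 1), (2, 2)]
f_x = joint_pmf(samples)
```

States that never occur in the sample are simply missing from the returned dictionary, which corresponds to an estimated probability of zero.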
Having constructed both partial pdfs, we can write the joint pdf $f(y, x)$ as their element-by-element product. We obtain (still within the example) the following table

f(y, x1, x2)   f(1, 1) LR(1, 1)   f(2, 1) LR(2, 1)   f(1, 2) LR(1, 2)   f(2, 2) LR(2, 2)
x1             1                  2                  1                  2
x2             1                  1                  2                  2

where $LR(i, j)$ represents the pdf $f(y|x)$ obtained from linear regression and corresponding to the regression vector $[i, j]$. Mostly, it is a normal pdf $N(x\hat{\theta}, \hat{r})$ with the estimated parameters $\hat{\theta}$ and $\hat{r}$. It is a mixed pdf, which means it consists of continuous functions (Gaussian pdfs) which are indexed by all possible configurations of values of the regression vector.
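The estimates $\hat{\theta}$ and $\hat{r}$ entering $LR(i, j)$ could be obtained, for instance, by ordinary least squares. The sketch below uses this substitute technique, which may differ from the estimation procedure used elsewhere in these notes; the helper names `solve` and `fit_regression` and the data are hypothetical.

```python
def solve(a, b):
    """Solve a z = b by Gaussian elimination with partial pivoting.
    Assumes the matrix a is square and non-singular."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (m[i][n] - s) / m[i][i]
    return z

def fit_regression(xs, ys):
    """Least-squares estimates of theta = [theta_0, ..., theta_n] and of the variance r."""
    rows = [[1.0] + list(x) for x in xs]               # leading 1 carries theta_0
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(p)]
    theta = solve(xtx, xty)                            # normal equations
    residuals = [y - sum(t * v for t, v in zip(theta, r)) for r, y in zip(rows, ys)]
    r_hat = sum(e * e for e in residuals) / (len(ys) - p)   # residual variance
    return theta, r_hat

# Hypothetical sample: features (x1, x2) with values 1, 2 and observed errors y
xs = [(1, 1), (2, 1), (1, 2), (2, 2), (1, 1), (2, 2)]
ys = [1.4, 3.6, 0.6, 2.5, 1.6, 2.4]
theta_hat, r_hat = fit_regression(xs, ys)

# Mean of LR(2, 1), i.e. of the normal pdf for the regression vector [2, 1]
mean_21 = theta_hat[0] + theta_hat[1] * 2 + theta_hat[2] * 1
```

With the estimates in hand, each cell of the table is the product of the frequency $f(i, j)$ and a normal pdf with mean `theta_hat[0] + theta_hat[1] * i + theta_hat[2] * j` and variance `r_hat`.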
Now, we have at our disposal just a subvector $\tilde{x}$ of the regression vector $x$. What we want is the partial description of the error
$f(y|\tilde{x})$
It can be obtained through marginalization of the joint pdf and a final conditioning in the following way
$f(y, x) \to f(y, \tilde{x}) \to f(y|\tilde{x})$
where the first step is done by summation over those variables in the regression vector which are not included in $\tilde{x}$, and the second step employs the definition of the conditional pdf
$f(y|\tilde{x}) = f(y, \tilde{x}) / f(\tilde{x})$
with $f(\tilde{x})$ derived again by summation from $f(x)$ as its marginal.
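For the two-feature example, the marginalization and conditioning steps can be sketched as follows, assuming that only $x_1$ is known (so $\tilde{x} = x_1$) and that $f(y|x)$ is normal with a common variance $r$. The function names and all numbers are hypothetical. The result is a Gaussian mixture whose weights are $f(x_2|x_1)$.

```python
import math

def normal_pdf(y, mean, var):
    """Value of the normal pdf N(mean, var) at the point y."""
    return math.exp(-0.5 * (y - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def conditional_mixture(f_x, theta, x1):
    """Return the mixture components (weight, mean) of f(y | x1).

    f_x   : dict mapping (x1, x2) to probabilities f(x1, x2)
    theta : [theta_0, theta_1, theta_2] of y = theta_0 + theta_1 x1 + theta_2 x2 + e
    """
    # marginal f(x1): sum f(x1, x2) over the unknown feature x2
    f_x1 = sum(p for (a, _), p in f_x.items() if a == x1)
    components = []
    for (a, b), p in f_x.items():
        if a == x1:
            weight = p / f_x1                          # f(x2 | x1)
            mean = theta[0] + theta[1] * a + theta[2] * b
            components.append((weight, mean))
    return components

def conditional_pdf(y, components, r):
    """Evaluate f(y | x1) = sum over x2 of f(x2 | x1) N(y; mean(x1, x2), r)."""
    return sum(w * normal_pdf(y, m, r) for w, m in components)

# Hypothetical frequencies and regression estimates
f_x = {(1, 1): 0.375, (2, 1): 0.25, (1, 2): 0.125, (2, 2): 0.25}
theta, r = [0.5, 2.0, -1.0], 0.04
comps = conditional_mixture(f_x, theta, x1=1)
density_at_mode = conditional_pdf(1.5, comps, r)
```

When further features become known, the same function is applied with fewer summed-out variables, and the mixture collapses towards a single Gaussian once the whole regression vector is known.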
Result
The interval in which the error can lie can be constructed as a conditional confidence interval using the pdf $f(y|\tilde{x})$.
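One possible way to compute such an interval is sketched below, assuming the conditional pdf is a Gaussian mixture with a common variance $r$ and that an equal-tailed interval is wanted. Both the equal-tailed choice and the bisection search are assumptions made for illustration, and the component weights and means are hypothetical.

```python
import math

def mixture_cdf(y, components, r):
    """CDF of a Gaussian mixture with common variance r; components = [(weight, mean), ...]."""
    s = math.sqrt(r)
    return sum(w * 0.5 * (1.0 + math.erf((y - m) / (s * math.sqrt(2.0))))
               for w, m in components)

def mixture_quantile(q, components, r, tol=1e-10):
    """Find y with CDF(y) = q by bisection on a bracket around all component means."""
    s = math.sqrt(r)
    lo = min(m for _, m in components) - 10.0 * s
    hi = max(m for _, m in components) + 10.0 * s
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mixture_cdf(mid, components, r) < q:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def central_interval(components, r, level=0.95):
    """Equal-tailed interval containing the error y with probability `level`."""
    alpha = 1.0 - level
    return (mixture_quantile(alpha / 2.0, components, r),
            mixture_quantile(1.0 - alpha / 2.0, components, r))

# Hypothetical mixture: weights f(x2 | x1) and regression means
components = [(0.75, 1.5), (0.25, 0.5)]
r = 0.04
low, high = central_interval(components, r)
```

As more features of the witness become known, the mixture has fewer components and the interval typically changes, which is the evolution of the error the task asks about.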
12 Writing in LyX
Keyboard shortcuts
Tools/Preferences/Editing/Shortcuts - shortcuts can be searched by function or by key and changed.
Program files/Lyx/Resources/bind - hidden keyboard shortcuts
AppData/Roaming/Lyx/bind - custom keyboard shortcuts (user.bind): copy the file over, inspect it, modify it. Note that M = Alt.
Controlling the editor
Tools/Preferences
Controlling the page
Document/Settings
Writing a document