ICSC 2006 – European Polytechnical Institute, s.r.o.
EUROPEAN POLYTECHNICAL INSTITUTE KUNOVICE
PROCEEDINGS
FOURTH INTERNATIONAL CONFERENCE ON SOFT
COMPUTING APPLIED IN COMPUTER AND
ECONOMIC ENVIRONMENT
ICSC 2006
January 27, Kunovice, Czech Republic
Edited by:
Prof. Ing. Imrich Rukovanský, CSc., and Doc. Ing. Pavel Ošmera, CSc.
Prepared for print by:
Bc. Andrea Šimonová, DiS., Bc. Pavel Kubala, DiS. and Ing. Petr Matušík
Printed by:
© European Polytechnical Institute Kunovice, 2006
ISBN: 80-7314-084-5
FOURTH INTERNATIONAL CONFERENCE ON SOFT COMPUTING
APPLIED IN COMPUTER AND ECONOMIC ENVIRONMENT
ICSC 2006
Organized by
THE EUROPEAN POLYTECHNICAL INSTITUTE, KUNOVICE
THE CZECH REPUBLIC
Conference Chairman
Ing. Oldřich Kratochvíl, Dr.h.c., rector
Conference Co-Chairmen
Prof. Ing. Imrich Rukovanský, CSc.
Assoc. Prof. Ing. Pavel Ošmera, CSc.
INTERNATIONAL PROGRAMME COMMITTEE
O. Kratochvíl – Chairman (CZ)
M. Baraňski (Poland)
J. Baštinec (Czech Republic)
J. Diblík (Czech Republic)
P. Dostál (Czech Republic)
U. K. Chakraborthy (USA)
B. Kulcsár (Hungary)
V. Mikula (Czech Republic)
P. Ošmera (Czech Republic)
J. Petrucha (Czech Republic)
I. Rukovanský (Czech Republic)
G. Vértesy (Hungary)
W. Zamojski (Poland)
J. Zapletal (Czech Republic)
T. Walkowiak (Poland)
ORGANIZING COMMITTEE
I. Rukovanský (Chairman)
P. Ošmera
A. Šimonová
P. Kubala
J. Kavka
P. Matušík
M. Balus
I. Polášková
Session 1: ICSC
Chairman: Doc. RNDr. Josef Zapletal, CSc.
J. Šáchová
T. Chmela
J. Míšek
Š. Mikuláš
R. Jurča
M. Zálešák
CONTENTS
A MESSAGE FROM THE GENERAL CHAIRMAN OF THE CONFERENCE ...................................................7
A NOTE ON DECISION-MAKING UNDER RISK AND UNCERTAINTY
Zapletal Josef...............................................................................................................................................................9
THE USE OF FUZZY LOGIC FOR SUPPORT OF DIRECT MAILING
Dostál Petr .................................................................................................................................................................21
THE COLLATION OF VARIOUS METHODS FOR THE SOLUTION OF TRANSPORTATION PROBLEMS
Abdurrzzag Tamtam ..................................................................................................................................................27
SOLUTION OF STRUCTURAL INTERBRANCH SYSTEM OF A DYNAMIC MODEL
Baštinec Jaromír, Diblík Josef ....................................................................................................................................35
PROBABILITY THEORY AND STATISTICS IN THE COMBINED FORM OF STUDY OF THE
BACHELOR STUDENT PROGRAMMES AT FEEC BUT
Novák Michal ............................................................................................................................................................43
APPLICATION OF FUZZY SYSTEMS FOR DECISION-MAKING AND CONTROL SUPPORT
Mikula Vladimír, Petrucha Jindřich ............................................................................................................................45
EXAMPLES OF USING CONCEPTS OF PROBABILITY THEORY IN MANAGEMENT DECISION
MAKING
Novák Michal, Fajmon Břetislav................................................................................................................................51
OPTIMIZATION OF MATERIAL CHARACTERIZATION BY ADAPTIVE TESTING
Vértesy Gábor, Tomáš Ivan, Mészáros István .............................................................................................................57
USING A COMPLETE GENETIC ALGORITHM TO OPTIMIZE A PRODUCTION PROCESS FOR
PROFIT MAXIMIZATION
Kostiha Jiří ................................................................................................................................................................65
REVITALIZING COMPANY INFORMATION SYSTEMS AND COMPETITIVE ADVANTAGES
Lacko Branislav.........................................................................................................................................................73
MODEL LEARNING AND INFERENCE THROUGH ANFIS
Amalka Al Khatib......................................................................................................................................................81
GRAMMATICAL EVOLUTION WITH BACKWARD PROCESSING
Ošmera Pavel, Popelka Ondřej, Rukovanský Imrich ...................................................................................................89
OBJECT RECOGNITION BY MEANS OF NEW AL
Šťastný Jiří, Minařík Martin .......................................................................................................................................99
APPLICATION OF GRAPH THEORY IN AN INTELLIGENT TRANSPORT SYSTEM
Klieštik Tomáš.........................................................................................................................................................105
THE VORTEX-FRACTAL THEORY OF THE UNIVERSE STRUCTURES
Ošmera Pavel...........................................................................................................................................................109
VORTEX-FRACTAL PHYSICS
Ošmera Pavel...........................................................................................................................................................123
THE IMPORTANCE OF COMPUTER NETWORK MONITORING
Rukovanský Imrich..................................................................................................................................................131
DATA ANALYSIS USING NEURAL NETWORKS AND CONTINGENCY TABLES
Petrucha Jindřich......................................................................................................................................................137
DETECTION OF INITIAL DATA GENERATING BOUNDED SOLUTIONS OF LINEAR DISCRETE
EQUATIONS
Baštinec Jaromír, Diblík Josef ..................................................................................................................................143
ON SOME PROPERTIES OF FRACTIONAL CALCULUS
Krupková Vlasta, Šmarda Zdeněk ............................................................................................................................157
„ICSC– FOURTH INTERNATIONAL CONFERENCE ON SOFT COMPUTING APPLIED IN COMPUTER AND ECONOMIC ENVIRONMENT”
EPI Kunovice, Czech Republic, January 27, 2006
ON THE STABILITY OF LINEAR INTEGRODIFFERENTIAL EQUATIONS
Šmarda Zdeněk ........................................................................................................................................................163
EXISTENCE OF POSITIVE SOLUTIONS FOR RETARDED FUNCTIONAL DIFFERENTIAL EQUATIONS
WITH UNBOUNDED DELAY AND FINITE MEMORY
Diblík Josef, Svoboda Zdeněk ..................................................................................................................................169
APPLICATION OF NON SIMPLEX METHOD FOR LINEAR PROGRAMMING
Tomšová Marie........................................................................................................................................................173
AUTHOR INDEX ..................................................................................................................................................179
A MESSAGE FROM THE GENERAL CHAIRMAN OF THE CONFERENCE
Dear guests and participants of this conference,
You are receiving these proceedings of the 4th scientific conference ICSC 2006 –
International Conference on Soft Computing Applied in Computer and Economic
Environment.
Ing. Oldřich Kratochvíl, Dr.h.c.
Prof. Ing. Imrich Rukovanský, CSc.
It is my pleasure to thank Prof. Ing. Imrich Rukovanský, CSc., for the
preparation of the conference.
Fuzzy logic and neural networks have become an important part of the work at our
University over the last four years. The conference participants presented
papers on their scientific work and the results obtained during the past year.
Some academics (Doc. Pavel Ošmera, CSc., and Ing. Dostál), whose papers form
part of these proceedings, will submit their results to well-known journals
abroad, noting that the results were first published in these proceedings.
I am pleased that academics from Czech, Slovak, Hungarian, Polish and Russian
universities took part in the conference.
This conference was not only an important scientific but also a social event.
Allow me to wish all participants of the conference every success, and I kindly invite them to join us at the 5th
conference, ICSC 2007.
Kunovice, January 27, 2006
Dipl. Ing. Oldřich Kratochvíl, Dr.h.c.
rector
A NOTE ON DECISION-MAKING UNDER RISK AND UNCERTAINTY
Josef Zapletal
European Polytechnical Institute, s.r.o.
Abstract: This paper describes methods of decision analysis, whose features differ somewhat from the
decision-making methods of operations research, which are above all optimization tools suited to solving
simpler, well-structured decision problems. A characteristic feature of decision analysis, by contrast, is
that it seeks to combine exact procedures and model-based tools with the knowledge and experience of those
who solve these problems. Heuristic methods significantly influence both the procedures and the resulting
solutions. We present the basic concepts, methods and tools of decision analysis, i.e. of decision-making
under risk and uncertainty. These include the concept of subjective probability, the utility function under
risk, and some graphical tools supporting the solution of decision problems under risk and uncertainty.
Keywords: decision analysis, operations research, deterministic methods, stochastic methods, subjective
probability, betting odds, utility function under risk, decision-maker's attitude to risk, risk aversion,
risk proneness, concave, linear and convex utility functions, certainty equivalent.
1 Introduction
This paper is intended as a methodological guide for EPI students whose projects deal with decision-making,
particularly decisions about the control of non-deterministic processes. My starting material was the textbook
Manažérské rozhodování by Jiří Fotr and Jiří Dědina, together with the works [2], [4], [5], [7], [13]. The example
in the subsection Method of relative magnitudes is taken from [6].
2 Subjective probabilities
2.1 Objective and subjective probability
An important part of preparing a decision is clarifying the possible future situations, especially the expected
economic and political influences that affect the consequences of the decision alternatives under consideration.
Some of these circumstances may be unfavourable (a bumper harvest across large geographical areas, overproduction
of certain goods in large, economically strong countries, etc.). Others may be favourable (a market boom, the
withdrawal of a competitor from the market following modernization of production, the gaining of new outlets for
an established product, etc.).
For further evaluation it is therefore necessary somehow to quantify the degree of danger of the unfavourable
influences and the degree of promise of the favourable ones. This measure is usually a probability. Risk
situations and their placement in time, however, are usually not fully and regularly repeatable. We are rarely in
the position of a manager who has to decide how many spare parts to stock for an expensive machine while knowing
how many of these parts have failed and with what probability, i.e. where some objective system of probabilities
exists to start from. When deciding about general possible situations, the manager has no past statistical data
available and can apply only so-called subjective probabilities to assess the risk situations. These rest on the
assumption that every subject, which need not be a single individual, holds a certain expectation of future
development and a belief in it. Subjective probability is then an expression of the degree of "personal
conviction" of the subject (the whole team of forecasters and managers) that a certain phenomenon or event will
occur.
2.2 Numerical expression of subjective probability
Subjective probability can be expressed either numerically or verbally. Numerical expression can take two forms.
The first form uses numbers from 0 to 1, or from 0% to 100%. A probability of zero expresses that the given
situation or phenomenon will certainly not occur; a probability of 1 (100%) indicates that the given situation or
phenomenon will occur with certainty.
The second form of numerical expression of subjective probability is either a ratio giving the number of
realizations of the given phenomenon out of the total number of possible cases (e.g. a failure of a certain
device occurs on average once in 100 days), or the so-called betting odds. In the latter case the manager
expresses his belief in the occurrence of the phenomenon (which may be, for example, the market success of a
newly developed product) by a statement such as: "I would bet 3:1 that the product will succeed on the market."
The probability of the product's market success is then

3 / (3 + 1) = 0.75.
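The odds-to-probability conversion above is a one-line computation; a minimal sketch in Python (the function name is my own, not from the paper):

```python
def odds_to_probability(stake_for, stake_against):
    """Convert betting odds 'stake_for : stake_against' into a probability."""
    return stake_for / (stake_for + stake_against)

# "I would bet 3:1 that the product will succeed on the market."
print(odds_to_probability(3, 1))  # 0.75
```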
Numerical subjective probabilities are usually determined in cooperation with specialists in the given field.
When assessing the chances of success of a product's development this may be the head of the development team;
similarly, when estimating the level of sales, a marketing expert. To illustrate the procedures for determining
subjective probabilities, we present two methods: the method of relative magnitudes and the quantile method.
Method of relative magnitudes
This method is suitable for determining the subjective probabilities of phenomena when only a limited number of
them exist. First the phenomenon (situation) that the expert considers most probable is identified. Its
probability then serves as the basis for determining the probabilities of the other phenomena (situations). The
following illustrative example shows the idea.
Suppose a company buys new production equipment, and with the purchase it must order a certain number of units of
an important and usually very expensive spare part that fails randomly. As a basis for this order, the
probability of each possible number of failures of the part (these are, in our case, the phenomena or situations
that can occur) over the service life of the purchased equipment must be determined. If equipment of this kind
has already been operated repeatedly and failure statistics have been kept, the task belongs to operations
research, specifically to inventory theory. Usually, however, sufficiently extensive statistical records are not
available, and the subjective probabilities of the individual numbers of failures are determined from information
obtained in a discussion between the analyst and an expert. Such a discussion yielded the following:
• the maximum expected number of failures is five (the number of failures can therefore range from zero to five);
• the most probable number of failures is two;
• the probabilities of one and of three failures are equal, each about half the probability of two failures;
• the probabilities of no failures and of five failures are equal, each about ten times smaller than the
probability of two failures;
• the probability of four failures is about five times smaller than the probability of two failures.
If we now denote the probability of two failures (i.e. of the most probable number of failures) by P, and the
probability of i failures by pi, the above implies

p2 = P,
p1 = p3 = P / 2,
p0 = p5 = P / 10,
p4 = P / 5.

The second equation expresses the statement that the probability of one failure is as large as the probability of
three failures, both being about half the probability of two failures, which we denoted P.
Since these six probabilities together exhaust the whole probability space, it must hold that

p0 + p1 + p2 + p3 + p4 + p5 = 1.
If we now substitute p0 = P/10 etc. into this equation, we obtain

P/10 + P/2 + P + P/2 + P/5 + P/10 = 1.

Solving this equation gives the value P = 0.42 (rounded), from which we determine the probabilities of the
individual numbers of failures.
p0 = P/10 = 0.42/10 ≈ 0.04,
p1 = P/2 = 0.42/2 = 0.21,
p2 = P = 0.42,
p3 = P/2 = 0.42/2 = 0.21,
p4 = P/5 = 0.42/5 ≈ 0.08,
p5 = P/10 = 0.42/10 ≈ 0.04.
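The normalization just carried out by hand can be reproduced mechanically. The sketch below is my own illustration of the method: it recovers P and the individual probabilities from the elicited relative weights.

```python
from fractions import Fraction

# Relative weights of p0..p5 as multiples of P (the probability of the
# most likely count, two failures), as elicited from the expert.
weights = [Fraction(1, 10), Fraction(1, 2), Fraction(1),
           Fraction(1, 2), Fraction(1, 5), Fraction(1, 10)]

# The six probabilities must sum to one: P * sum(weights) = 1.
P = 1 / sum(weights)
probabilities = [float(w * P) for w in weights]

print(float(P))   # 0.4166..., rounded to 0.42 in the text
print(probabilities)
```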
The subjective probabilities thus determined form the probability distribution of the number of failures. This
distribution can be written either as a table (see the first row of Table 1), where each number of failures has a
corresponding probability, or graphically as a histogram, with the individual numbers of failures on the x-axis
and the corresponding probabilities on the y-axis (see Fig. 1).
Table 1 Probability distribution
Number of failures        0     1     2     3     4     5
Probability               0.04  0.21  0.42  0.21  0.08  0.04
Cumulative probability    0.04  0.25  0.67  0.88  0.96  1.00
Fig. 1 Probability distribution of the number of failures (histogram not reproduced)
Remark
The probability distribution of the number of failures can also be expressed using so-called cumulative
probabilities (see the third row of Table 1). These cumulative probabilities express the probability that the
number of failures will be less than or equal to a given count. For example, the probability that the number of
failures is at most two (i.e. that two, one, or no failures occur) is 0.67. This probability is obtained by
summing (accumulating) the probabilities of zero, one, and two failures: 0.67 = 0.04 + 0.21 + 0.42. The
cumulative probability of zero to five failures is one: it can be stated with certainty that during the given
operating period the number of failures will not exceed five. Taken together, the cumulative probabilities given
in Table 1 define the distribution function of the random variable expressing the number of failures.
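The accumulation described in the remark is a running sum over the first row of Table 1; a minimal sketch (mine, not the paper's):

```python
from itertools import accumulate

# Rounded subjective probabilities of 0..5 failures (first row of Table 1).
p = [0.04, 0.21, 0.42, 0.21, 0.08, 0.04]

# Cumulative probabilities: P(number of failures <= k) for k = 0..5.
cumulative = [round(c, 2) for c in accumulate(p)]
print(cumulative)  # [0.04, 0.25, 0.67, 0.88, 0.96, 1.0]
```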
Quantile method
This method is suitable for establishing a subjective probability distribution when the number of possible
phenomena (situations) is large, or even infinite. Most risk factors have this character, e.g. the purchase and
selling prices of certain products and raw materials, the level of demand, exchange rates, etc.
The essence of the quantile method is best seen from an example: establishing the probability distribution of
future demand for a product newly introduced to the market.
Suppose that a discussion between the analyst and a marketing expert concluded that annual demand may range from
five thousand to ten thousand units (the pessimistic estimate of five thousand and the optimistic estimate of ten
thousand define the interval within which demand can lie).
One can then proceed in two ways. In the first, the marketing expert states the demand levels that, in his
opinion, correspond to certain fixed probability values, e.g. 0.25, 0.5 and 0.75. In the second, the expert
states the probability values that, in his judgement, correspond to certain chosen demand levels, e.g. six
thousand, seven thousand, eight thousand and nine thousand units.
If, for example, in the first case the discussion between the analyst and the marketing expert concluded that the
probability 0.25 corresponds to a demand of seven thousand units, and the probabilities 0.5 and 0.75 to demands
of eight thousand and eight thousand five hundred units, then these pairs, together with the pairs (0; 5,000) and
(1; 10,000), represent the subjective probability distribution of demand (see Fig. 2). Each pair is to be
understood as follows: for instance, the probability that annual demand for the product will not exceed seven
thousand units is 0.25 (i.e. it is the probability that demand will be at most equal to seven thousand units).
The probability that demand will not exceed eight thousand units is 0.5, the probability of not exceeding eight
thousand five hundred units is 0.75, and finally the probability of 1 assigned to ten thousand units means that
the marketing expert considers it completely certain that annual demand for the product will not exceed ten
thousand units.
Probability    Demand (units/year)
0              5,000
0.25           7,000
0.5            8,000
0.75           8,500
1              10,000
Fig. 2 Demand values determined for the given probability values
The subjective probability distribution of demand for the product can now again be displayed graphically, with
the demand values on the x-axis and the corresponding probabilities on the y-axis. This yields the graph of the
distribution function.
The distribution function of demand shown in Fig. 3 corresponds to the cumulative probabilities of the previous
example. For demand, too (and for other risk factors with a large number of possible values), it is possible to
determine something that corresponds to the (non-cumulative) probabilities: the so-called probability density,
whose graph for our demand example is shown in Fig. 4. The x-axis again carries the demand values, but the y-axis
carries not the corresponding probabilities but the probability density. The density graph can be interpreted as
follows: the probabilities of particular demand values are given by the sizes of the corresponding areas under
the density curve in Fig. 4. The whole area under this curve is normalized and equals one (this expresses the
certainty that annual demand for the product will be no lower than five thousand units and at the same time will
not exceed ten thousand units). From Figs. 2 and 3 we see that the probability that demand will not exceed seven
thousand units is 0.25. The same probability (i.e. that demand will lie between five thousand and seven thousand
units) is expressed by the area under the curve in Fig. 4 from the origin at five thousand to the vertical at
seven thousand.
Fig. 3 Distribution function of demand (graph not reproduced)
Likewise, for example, the probability of demand in the interval from seven thousand to eight thousand five
hundred units is given by the hatched area under the curve in Fig. 4, bounded by the verticals at demand values
of seven thousand and eight thousand five hundred units. This probability can also be determined from the
corresponding values in Figs. 3 and 2 as the difference between the probability that demand will be at most eight
thousand five hundred units and the probability that demand will not exceed seven thousand units, i.e.
0.75 − 0.25 = 0.5. The distribution function (see Fig. 3) and the probability density (see Fig. 4) are thus in a
one-to-one mutual relationship: knowing one of the curves, the other can be determined, and vice versa.
Fig. 4 Probability density of demand (graph not reproduced)
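Treating the elicited pairs as knots of a piecewise-linear distribution function makes the interval probabilities above easy to reproduce. The sketch below is my own illustration under that linear-interpolation assumption; the paper itself does not prescribe an interpolation rule:

```python
from bisect import bisect_right

# (demand, cumulative probability) pairs elicited by the quantile method.
points = [(5000, 0.0), (7000, 0.25), (8000, 0.5), (8500, 0.75), (10000, 1.0)]
xs = [x for x, _ in points]

def cdf(demand):
    """Piecewise-linear interpolation of the subjective distribution function."""
    if demand <= xs[0]:
        return 0.0
    if demand >= xs[-1]:
        return 1.0
    i = bisect_right(xs, demand)
    (x0, p0), (x1, p1) = points[i - 1], points[i]
    return p0 + (p1 - p0) * (demand - x0) / (x1 - x0)

# Probability that annual demand falls between 7,000 and 8,500 units:
print(cdf(8500) - cdf(7000))  # 0.5
```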
2.3 Verbal expression of subjective probabilities
The advantage of numerical expression of subjective probabilities is its unambiguity. If these probabilities are
to be used in building and solving models supporting managerial decision-making, no expression other than a
numerical one can be used.
A certain drawback is that managers often avoid quantitative expression and prefer to work with verbal
descriptions of subjective probabilities, which are generally understandable and acceptable. Between the
numerical values and the verbal descriptions of subjective probabilities there is a certain relationship, which
can be expressed, for example, by Table 2 (Tepper – Kápl, 1991).
Verbal expression of subjective probabilities, however, also has considerable disadvantages. It cannot be used in
building mathematical models supporting the preparation of managerial decisions. Moreover, practical experience
shows that the one-to-one correspondence between numerical and verbal expressions given in Table 2 is not a
binding norm: different people interpret the verbal descriptions differently and attach different meanings to
them (see Moore, 1983). The ambiguity of verbal expressions of subjective probabilities is therefore a
significant shortcoming that can hamper communication in team problem-solving. For these reasons, verbal
expression of subjective probabilities may serve as a first stage, followed by the application of some method of
numerical determination of these probabilities.
Table 2 Numerical and verbal expression of subjective probabilities
Numerical      Verbal
0              Completely impossible
0.1            Extremely improbable
0.2 – 0.3      Rather improbable
0.4            Improbable
0.6            Probable
0.7 – 0.8      Rather probable
0.9            Highly probable
1              Completely certain
3 Utility function
3.1 The decision-maker's attitude to risk
In decision-making under risk and uncertainty, especially in the phase of evaluating the alternatives and
selecting the one to be implemented, the decision-maker's attitude to risk plays an important role. The
decision-maker (manager, entrepreneur) may be risk-averse, risk-prone, or risk-neutral.
A risk-averse decision-maker tries to avoid choosing highly risky alternatives and seeks low-risk alternatives
that with considerable certainty guarantee results acceptable to him. A risk-prone decision-maker, by contrast,
seeks out highly risky alternatives (which offer a chance of particularly good results but also carry a higher
danger of poor results or losses) and prefers them to less risky ones. For a risk-neutral decision-maker,
aversion and proneness to risk are in mutual balance.
The decision-maker's attitude to risk is one of the basic concepts of the theory of decision-making under risk
and uncertainty. Its definition is based on the decision-maker's behaviour in a situation where he can choose
between two alternatives, one risky and the other riskless. Suppose, for example, that the risky alternative
leads with probability p1 to result x1 and with probability 1 − p1 to result x2. Let the riskless alternative
guarantee with certainty a result equal to the expected (mean) result of the first alternative, i.e. it
guarantees the result x1·p1 + x2·(1 − p1).
By definition, the decision-maker is risk-averse precisely when, in every situation of the above type, he prefers
the second (riskless) alternative to the first (risky) one. If the decision-maker always prefers the first,
risk-laden alternative to the second, riskless one, he is risk-prone. For a risk-neutral decision-maker the two
alternatives are indifferent (i.e. he values them equally).
Suppose, for example, that the decision-maker chooses between two alternatives. The first leads with probability
0.5 to a profit of CZK 10 million and with the same probability to zero profit. The second guarantees with
certainty a profit of CZK 5 million (which exactly equals the expected profit of the first alternative, since
0.5 × CZK 10 million + 0.5 × 0 = CZK 5 million). In this case a risk-averse decision-maker chooses the second
alternative, with which he achieves a profit of CZK 5 million with certainty (i.e. he tries to avoid the
situation that could arise with the first alternative and lead to zero profit). A risk-prone decision-maker
chooses the first alternative, in which he values the considerable chance (a 50% probability) of achieving a
profit of CZK 10 million (CZK 5 million more than the second alternative guarantees). For a risk-neutral
decision-maker the two alternatives are equally advantageous.
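The expected-value check in parentheses above takes one line of arithmetic; a minimal sketch (my illustration, not from the paper):

```python
# Risky alternative: profit of CZK 10 million or 0, each with probability 0.5.
risky = [(0.5, 10.0), (0.5, 0.0)]
riskless = 5.0  # CZK million, guaranteed

# Expected profit of the risky alternative.
expected_profit = sum(p * x for p, x in risky)
print(expected_profit)  # 5.0 -- equal to the riskless alternative
```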
The decision-maker's attitude to risk is influenced by several factors. The most significant are his personal
disposition, his past experience (i.e. the success or failure of previous decisions), and the environment in
which the choice among risky alternatives takes place.
3.2 Construction of the utility function
The utility function under risk (terminology is not uniform; besides "utility function under risk", the terms
"risk utility function" and similar are also used) serves as a tool by which the decision-maker's attitude to
risk can be expressed quantitatively. It can be proved (Keeney – Raiffa, 1976) that for a risk-averse
decision-maker the utility function is concave, while a risk-prone decision-maker has a convex utility function.
The utility function of a risk-neutral decision-maker is linear. (A graphical representation of the utility
function, depending on the decision-maker's attitude to risk, for criteria of revenue and cost type is given in
Figs. 5 and 6.)
Fig. 5 Increasing utility functions for a criterion of revenue type (graph not reproduced)
Legend:
1 . . . Concave utility function of a risk-averse decision-maker.
2 . . . Linear utility function of a risk-neutral decision-maker.
3 . . . Convex utility function of a risk-prone decision-maker.
For a proper understanding of the utility function it must be stressed that this function does not express the
decision-maker's overall attitude to risk, i.e. with respect to the whole set of criteria for evaluating the
alternatives, but his attitude to risk from the viewpoint of a single evaluation criterion. For this reason the
function is also called a partial, or one-dimensional, utility function. For a given decision-maker its shape may
be (and as a rule is) partly different for the individual criteria (e.g. for some criteria the corresponding
utility functions are concave, i.e. they express the decision-maker's risk aversion, while with respect to other
criteria the decision-maker is risk-neutral and the corresponding utility functions are linear).
Before showing the procedure for constructing the utility function for a chosen evaluation criterion, we must
become acquainted with one more basic concept on which this construction rests: the so-called certainty
equivalent.
The certainty equivalent (for a given evaluation criterion) of an alternative that leads (with respect to this
criterion) to consequences of magnitudes x1, x2, . . . , xn with probabilities p1, p2, . . . , pn is the value of
the consequence whose utility equals exactly the expected (mean) utility of the alternative with respect to this
criterion. (The decision-maker thus values the consequence equal to the certainty equivalent, or the alternative
that leads with certainty to a consequence equal to the certainty equivalent, exactly as highly as the above
risk-laden alternative.)
It thus holds that

u(x*) = p1 u(x1) + p2 u(x2) + . . . + pn u(xn),    (1)

where
x* . . . the certainty equivalent,
u(x*) . . . the utility of the certainty equivalent,
u(xi) . . . the utility of the consequence xi.
Fig. 6 Decreasing utility functions for a criterion of cost type (graph not reproduced)
Legend:
1 . . . Concave utility function of a risk-averse decision-maker.
2 . . . Linear utility function of a risk-neutral decision-maker.
3 . . . Convex utility function of a risk-prone decision-maker.
If we return to the earlier example of two alternatives (the first was risky and led with probability 0.5 to a
profit of CZK 10 million and with the same probability to zero profit; the second guaranteed with certainty a
profit of CZK 5 million), by means of which we demonstrated the decision-maker's attitude to risk, then for the
first, risky alternative we have x1 = 10 million, x2 = 0 and p1 = p2 = 0.5. If the decision-maker now values this
risky alternative exactly as highly as an alternative that guarantees with certainty a profit of, say, CZK 3
million (i.e. the utility of a certain profit of CZK 3 million equals the expected utility of the risky
alternative, so that by relation (1) u(3) = 0.5 u(10) + 0.5 u(0)), then the certainty equivalent of this risky
alternative is exactly CZK 3 million.
The certainty equivalent can also be interpreted somewhat differently. If we regard the risky alternative under
consideration as a lottery with prizes of CZK 10 million and CZK 0 (each won with the same probability 0.5), then
the certainty equivalent equals the minimum amount for which the subject is willing to sell this lottery. In our
case this selling price would thus be CZK 3 million.
Na základě vztahu jistotního ekvivalentu dané rizikové varianty a jejího očekávaného důsledku je rovněž možné
vymezit postoj rozhodovatele k riziku. Jestliže v předchozím příkladě byl pro daného rozhodovatele jistotní ekvivalent
rizikové varianty (jejíž očekávaný zisk je 0,5 . 10 + 0,5 . 0 = 5 mil. Kč) menší než tento očekávaný zisk (platí 3 <5), má
rozhodovatel vzhledem ke kritériu tvořenému ziskem averzi k riziku. Je-li jistotní ekvivalent dané rizikové varianty
vyšší než její očekávaný zisk , tj. vyšší než 5 mil. Kč, má rozhodovatel vzhledem k zisku klon k riziku. Jistotní
ekvivalent dané rizikové varianty rozhodovatele s neutrálním postojem k riziku je v našem případě roven právě 5 mil.
Kč. Rozdíl mezi očekávaným důsledkem rizikové varianty a jejím jistotním ekvivalentem se někdy označuje jako
riziková prémie (podrobněji viz Fotr – Píšek, 1986)
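The certainty equivalent and risk premium defined above can be computed numerically once a utility function is fixed. The sketch below is an illustration only: the concave utility u(x) = √(x/10) is an assumed example, not the decision-maker's elicited function from the text, so the resulting certainty equivalent differs from the CZK 3 million of the example.

```python
# Illustrative computation of expected utility, certainty equivalent and
# risk premium for the risky variant from the text (profit of 10 or 0
# million CZK, each with probability 0.5), under an ASSUMED concave
# (risk-averse) utility u(x) = sqrt(x / 10).
import math

def expected_utility(outcomes, probs, u):
    """Mean utility of a risky variant: sum of p_i * u(x_i), relation (1)."""
    return sum(p * u(x) for x, p in zip(outcomes, probs))

def certainty_equivalent(outcomes, probs, u, lo, hi, tol=1e-9):
    """Value x* with u(x*) equal to the expected utility, found by
    bisection (u is assumed increasing on [lo, hi])."""
    target = expected_utility(outcomes, probs, u)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if u(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

u = lambda x: math.sqrt(x / 10)      # concave, hence risk aversion
ce = certainty_equivalent([10, 0], [0.5, 0.5], u, 0, 10)
ev = 0.5 * 10 + 0.5 * 0              # expected profit = 5 mil. CZK
risk_premium = ev - ce               # positive for a risk-averse subject
```

With this assumed utility the certainty equivalent comes out at 2.5 million CZK, so the risk premium is positive, which is exactly the qualitative behaviour the text describes for a risk-averse decision-maker.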
3.3 Example of determining a utility function
We illustrate the construction of a utility function on the example of its determination for an evaluation criterion formed by profit. Suppose that in solving a certain decision problem of an investment nature several variants have been formulated that are risky to different degrees and whose possible profit ranges from CZK 0.25 million to CZK 9.8 million. Our task is to help the decision-maker, who is responsible for choosing the variant to be realised, establish a utility function of profit.
The first step in establishing the utility function for the given criterion is to delimit its domain. As the end points of the domain we can take either the smallest and the largest criterion value in the given set of variants (in our case CZK 0.25 million and CZK 9.8 million) or values obtained by rounding them suitably. We opt for rounding and will establish the utility function of profit on the interval whose lower bound is CZK 0 million and upper bound CZK 10 million.
Having delimited the domain of the utility function, we can set the utility values at the end points of this domain. We exploit here the fact that the utility function under risk (like the utility function under certainty) does not express an absolute valuation (advantageousness to the decision-maker) of the possible criterion values but a relative one, so the choice of the utility values at the end points of the domain is arbitrary. (This arbitrariness follows from the fact that the utility function is unique only up to a positive linear transformation, i.e. the originally established utility function u(x) may be replaced by any other function v(x) satisfying v(x) = a · u(x) + b with a > 0.) It is customary, however, for criteria of the benefit type to set the utility of the lower bound of the domain equal to zero and the utility of the upper bound equal to one (sometimes one hundred), while for criteria of the cost type it is exactly the opposite. In our case we therefore choose the utility of zero profit as zero and the utility of a profit of CZK 10 million as one (so u(0) = 0, u(10) = 1), which gives us the first two points of the sought utility function of profit.
Next we determine several further points of the sought utility function by means of certainty equivalents. Certainty equivalents are determined in a dialogue between the analyst (consultant) and the decision-maker: the analyst puts questions to the decision-maker one by one, the first of which can be answered easily, while the later ones (as the dialogue closes in on the certainty equivalent) become increasingly demanding.
In our case the dialogue between analyst and decision-maker might run roughly as follows. The analyst first asks: "Do you prefer a variant that guarantees you a certain profit of CZK 1 million, or a risky variant that leads with probability 0.5 to a profit of CZK 10 million and with the same probability to zero profit?" The decision-maker (unless his risk aversion is particularly pronounced) will probably answer that he prefers the given risky variant. In that case the analyst's next question is, for example: "Do you prefer a variant with a certain profit of CZK 8 million, or the given risky variant?" The decision-maker (unless he has a highly pronounced propensity to risk) will probably reply that he values the riskless variant with a certain profit of CZK 8 million more.
The analyst's further questions to the decision-maker are of a similar kind. In the second step he finds out whether the decision-maker prefers a variant with a certain profit of CZK 1.5 million (and then CZK 6 million) to the given risky variant. If our decision-maker represents the prevailing type of decision-maker with a certain aversion to risk, he will probably still prefer the risky variant to the variant with a certain profit of CZK 1.5 million, and he will quite certainly value the variant with a certain profit of CZK 6 million more than the risky one.
Through further questions of the same kind we arrive, in subsequent steps, at a situation where the decision-maker values the given risky variant slightly more than a variant with a certain profit of CZK 2.5 million, and likewise considers a variant with a certain profit of CZK 3.5 million slightly better than the risky variant. If the analyst then finds that the decision-maker values a variant with a certain profit of CZK 3 million exactly as highly as the given risky variant, we have determined the certainty equivalent of the risky variant that yields a profit of CZK 10 million with probability 0.5 and zero profit with the same probability. This certainty equivalent is CZK 3 million.
This indirect way of establishing the certainty equivalent of the given risky variant in a dialogue between analyst and decision-maker is less demanding for the decision-maker than the direct way, i.e. asking: "What amount of certain profit do you value exactly as highly as the risky variant leading with equal probability 0.5 to a profit of CZK 10 million and to zero profit?" Besides being more demanding, the direct approach also gives less reliable results. (For more on determining certainty equivalents see Fotr – Píšek, 1986.)
The certainty equivalent just established allows us to determine a third point of the utility function of profit (besides the two end points characterised above). By relation (1) we must have

u(3) = 0.5 · u(0) + 0.5 · u(10)          (2)

and after substituting u(0) = 0 and u(10) = 1 into relation (2) we obtain

u(3) = 0.5 · 0 + 0.5 · 1 = 0.5          (3)

We have thus determined the utility of the certainty equivalent of the given risky variant, which is 0.5, and we therefore have another (the third) point of the sought utility function of profit, with coordinates 3 and 0.5.
Two further points of the utility function are determined in the same way, using the certainty equivalents of further risky variants constructed with the help of the known certainty equivalent of CZK 3 million. We establish (again in a dialogue between analyst and decision-maker) the certainty equivalents of two risky variants, the first of which leads with probability 0.5 to zero profit and with the same probability to a profit of CZK 3 million, and the second with probability 0.5 to a profit of CZK 3 million and with the same probability to CZK 10 million.
If the certainty equivalent of the first risky variant is CZK 1.3 million and the certainty equivalent of the second CZK 5.5 million, relation (1) again requires:

u(1.3) = 0.5 · u(0) + 0.5 · u(3)          (4)
u(5.5) = 0.5 · u(3) + 0.5 · u(10)          (5)

Since we already know that u(3) = 0.5, substituting u(0) = 0, u(3) = 0.5 and u(10) = 1 into relations (4) and (5) yields

u(1.3) = 0.5 · 0 + 0.5 · 0.5 = 0.25          (6)
u(5.5) = 0.5 · 0.5 + 0.5 · 1 = 0.75          (7)

We now know the coordinates of two more points of the utility function of profit, which are (1.3; 0.25) and (5.5; 0.75), so five points of this function are known in total. We plot the obtained points in a graph (profit on the x axis, the corresponding utilities on the y axis; see fig. 7). Fitting a suitable curve through these points gives an approximation of the graph of the utility function of profit for the given decision-maker (and the given decision problem). From fig. 7 we see that the utility function of profit of this decision-maker is concave; he is therefore risk-averse with respect to the given criterion.
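The concavity claim can be checked directly from the five elicited points: for a concave piecewise-linear function the chord slopes must decrease from left to right. A minimal sketch:

```python
# The five points of the profit utility function derived in the text:
# (profit in mil. CZK, utility). Successive chord slopes are computed and
# checked to be strictly decreasing, i.e. the piecewise-linear function
# through the points is concave (indicating risk aversion).
points = [(0, 0.0), (1.3, 0.25), (3, 0.5), (5.5, 0.75), (10, 1.0)]

slopes = [(u2 - u1) / (x2 - x1)
          for (x1, u1), (x2, u2) in zip(points, points[1:])]
is_concave = all(s1 > s2 for s1, s2 in zip(slopes, slopes[1:]))
```

The four slopes decrease (roughly 0.19, 0.15, 0.10, 0.06), so `is_concave` is true, in agreement with the reading of fig. 7.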
For the practical use of the utility function in evaluating and selecting risky variants it may be convenient to determine a functional form of the utility function and to estimate its parameters.
If the decision-maker is risk-neutral, the utility function is linear and has the form

u(x) = (x − x0) / (x* − x0)          (8)

where x0 is the worst and x* the best value of the given criterion (i.e. the end points of the domain). Its quantification requires no estimation of parameters.
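The linear form of relation (8) translates directly into code; a minimal sketch:

```python
# Linear (risk-neutral) utility of relation (8):
# u(x) = (x - x0) / (xstar - x0), where x0 is the worst and xstar the best
# value of the criterion (the end points of the domain).
def linear_utility(x, x0, xstar):
    return (x - x0) / (xstar - x0)
```

On the domain of the example (x0 = 0, x* = 10) this maps a profit of 5 mil. CZK to utility 0.5, exactly the expected profit of the risky variant, as a risk-neutral valuation should.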
[Figure: utility (vertical axis: 0, 0.25, 0.5, 0.75, 1) plotted against profit (horizontal axis: 1.3, 3, 5.5, 10)]
Fig. 7 Utility function of profit
If the decision-maker is risk-averse or has a propensity to risk, the corresponding concave or convex utility function can often be represented by an exponential function of the form

u(x) = e^(a·x + b)          (9)

To determine the parameters a and b of this function we use all the values of the utility function at the established points, formed by the end points of its domain and the certainty equivalents. The coefficients are determined by least-squares approximation after taking logarithms of equation (9).
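The log-linearisation step described above can be sketched as follows. Note one practical detail the text does not discuss: a point with zero utility (such as u(0) = 0 in the example) has no logarithm and must be excluded from the fit; handling it that way is an assumption of this sketch.

```python
# Fitting u(x) = exp(a*x + b) by ordinary least squares after taking
# logarithms: ln u = a*x + b is linear in x. Points with zero utility are
# dropped, since their logarithm is undefined.
import math

def fit_exponential(points):
    pts = [(x, math.log(u)) for x, u in points if u > 0]
    n = len(pts)
    sx = sum(x for x, _ in pts)
    sy = sum(y for _, y in pts)
    sxx = sum(x * x for x, _ in pts)
    sxy = sum(x * y for x, y in pts)
    a = (n * sxy - sx * sy) / (n * sxx - sx * sx)   # slope of ln u vs x
    b = (sy - a * sx) / n                            # intercept
    return a, b

# Example: the five elicited points of the text (the u(0) = 0 point is
# dropped automatically by the filter above).
a, b = fit_exponential([(0, 0.0), (1.3, 0.25), (3, 0.5), (5.5, 0.75), (10, 1.0)])
```

On synthetic data generated exactly from u(x) = exp(a·x + b) the routine recovers a and b to machine precision; on elicited points, as here, it returns the least-squares compromise.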
Empirical research into the behaviour of decision-makers under risk and into their utility functions shows that risk aversion strongly prevails. At the same time, however, it turns out that a subject's attitude to risk often differs depending on whether profits or losses are at stake. While in the region of profits risk aversion prevails, in the region of losses a propensity to risk dominates (this holds rather for smaller losses; in the case of substantial or even catastrophic losses leading to the ruin of the subject, risk aversion again strongly prevails). The attitude of a decision-maker who is risk-averse in the region of profits and risk-seeking in the region of losses can then be represented by a utility function with an inflection point, which separates its concave part for positive values of the profit criterion from its convex part for negative values of this criterion (see fig. 8).
[Figure: utility (vertical axis, 0 to 1) plotted against the criterion; the convex part (propensity to risk) lies in the region of losses left of zero, the concave part (risk aversion) in the region of profits right of zero, separated by the inflection point]
Fig. 8 Utility function with an inflection point
Regarding the utility function and its construction it should further be noted that this function always expresses the subjective attitude of the decision-maker to risk with respect to the given criterion. No objective utility function (for a given criterion) exists. The utility functions of different decision-makers may differ (and, as the results of empirical studies show, usually do). Even the utility function of the same decision-maker for a given criterion, elicited in different periods, need not be the same. Empirical research also shows that the construction of a utility function is generally a difficult matter (in terms of the information that must be obtained from the subject for its quantification). In many cases decision-makers are therefore unwilling to work with it.
References
[1] BAŠTA, A. Plánové rozhodovací procesy a jejich systém. Praha : Academia, 1977.
[2] ČERNÝ, J.; GLÜCKAUFOVÁ, D. Vícekriteriální vyhodnocování v praxi. Praha : SNTL, 1982.
[3] EDEN, C.; JONES, S.; SIMS, D. Messing About in Problems. Oxford : Pergamon Press, 1983.
[4] FOTR, J. Příprava a hodnocení podnikatelských projektů. Praha : VŠE, 1993.
[5] FOTR, J. Manažerská rozhodovací analýza. Praha : VŠE, 1992.
[6] FOTR, J.; DĚDINA, J. Manažerské rozhodování. Praha : VŠE.
[7] FOTR, J.; PÍŠEK, M. Exaktní metody ekonomického rozhodování. Praha : Academia, 1986.
[8] IVANCEVICH, J. M.; DONNELLY, J. H.; GIBSON, J. L. Management. Principles and Functions. Homewood : R. D. Irwin, 1989.
[9] MOORE, P. G. The Business of Risk. Cambridge : Cambridge University Press, 1983.
[10] NOVÁK, M. Examples of using concepts of probability theory in management decision making. Mezinárodní konference. Kunovice : EPI, s.r.o., 2006.
[11] NOVÁK, M. Probability theory in combined form of study at FEEC BUT. Mezinárodní konference. Kunovice : EPI, s.r.o., 2006.
[12] PÍŠEK, M.; VOBOŘIL, J. Vybrané metody dlouhodobého prognózování a jejich využití. Praha : Ekonomický ústav ČSAV, 1981.
[13] STAEHLE, W. H. Management. München : Verlag Franz Vahlen, 1989.
[14] VLČEK, R. Hodnotový management. Praha : Management Press, 1992.
[15] VLČEK, R. Příručka hodnotové analýzy. Praha : SNTL, 1983.
[16] WATSON, S. R.; BUEDE, D. M. Decision Synthesis. Cambridge : Cambridge University Press, 1987.
[17] ZAPLETAL, J. Operační analýza. Kunovice : Skriptorium VOŠ, 1995.
[18] ZÁRUBA, P. et al. Základy podnikového managementu. Praha : Aleko, 1991.
Address:
Doc. RNDr. Josef Zapletal, CSc.
Evropský polytechnický institut, s.r.o.
Osvobození 699
686 04 Kunovice
Czech Republic
Tel./fax.: +420 572 549 018/ +420 572 548 788
e-mail: [email protected]
THE USE OF FUZZY LOGIC FOR SUPPORT OF DIRECT MAILING
Petr Dostál
Brno University of Technology
Abstract: The article deals with the use of fuzzy logic for the support of direct mailing. A brief
description of fuzzy logic, the process of calculation, the scheme of the models, the rule blocks, the attributes and their
membership functions are presented. The use of fuzzy logic is advantageous especially for the support of
direct mailing, where evaluation is very complicated.
Keywords: fuzzy logic, direct mailing, model, rule block, membership function, attributes
1. INTRODUCTION
The use of fuzzy logic is advantageous especially for the support of direct mailing, where evaluation is very complicated.
The advantage is that linguistic variables are used. Fuzzy logic measures the certainty or uncertainty of the membership
of an element in a set, analogously to the way a person makes decisions during mental and physical activities. The
solution of a particular case is found by applying rules that were defined in fuzzy logic for similar cases. Fuzzy logic is
among the methods used in the area of direct mailing.
2. FUZZY PROCESSING
The calculation with fuzzy logic consists of three steps: fuzzification, fuzzy inference and defuzzification.
•
Fuzzification means that the real variables are transformed into linguistic variables. The definition of a linguistic
variable starts from basic linguistic values; for example, for the variable risk the following attributes may be set up:
none, very low, low, medium, high, very high. Usually three to seven attributes of a variable are used. The attributes
are defined by so-called membership functions, such as Λ, π, Z, S and some others. The membership functions are
set up for the input and output variables.
•
Fuzzy inference defines the behaviour of the system by means of rules of the type <When>, <Then> on the
linguistic level. The conditional clauses evaluate the state of the input variables by the rules. They have the form

<When> Inputa <And> Inputb ….. Inputx <Or> Inputy …….. <Then> Output1,

which means: when (the state occurs) Inputa and Inputb, ….., Inputx or Inputy, …… , then (the situation is) Output1.
Fuzzy logic represents an expert system. Each combination of the attributes of the variables entering the system and
occurring in the condition <When>, <Then> presents one rule. Every condition behind <When> has a corresponding
result behind <Then>. It is necessary to determine every rule and its degree of support (the weight of the rule in the
system). The rules are created by the expert himself.
•
Defuzzification transfers the results of the fuzzy inference to the output variables, which describe the results
verbally (for example, whether the risk exists or not).
The system with fuzzy logic can work as an automatic system once the input data are entered. The input data can be
represented by many variables.
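The three steps can be illustrated with a deliberately tiny stand-in model. This is not the FuzzyTech model of the paper: the single input `risk`, its two triangular (Λ-type) membership functions and the two rules are invented for illustration only.

```python
# Minimal illustration of fuzzification, fuzzy inference and
# defuzzification for ONE assumed input variable "risk" in [0, 1].

def tri(x, a, b, c):
    """Triangular (lambda-type) membership function with peak at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def evaluate(risk):
    # 1. fuzzification: real input -> degrees of the linguistic attributes
    low = tri(risk, -1.0, 0.0, 1.0)
    high = tri(risk, 0.0, 1.0, 2.0)
    # 2. fuzzy inference, two rules with degree of support 1:
    #    <When> risk low  <Then> contact
    #    <When> risk high <Then> do not contact
    contact, no_contact = low, high
    # 3. defuzzification: weighted average of the crisp rule outputs
    #    (contact = 1, no contact = 0)
    return (contact * 1.0 + no_contact * 0.0) / (contact + no_contact)
```

For an input risk of 0.25 the memberships are 0.75 (low) and 0.25 (high), so the defuzzified output is 0.75, leaning towards contacting the client; a risk of 0.75 symmetrically gives 0.25.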
3. DIRECT MAILING
This case presents the use of fuzzy logic for direct mailing: whether the client is to be visited personally, sent a letter,
or not contacted at all. See the model in fig. 1.
Fig. 1 Project chart
The input variables and their attributes are Loan (fig. 2) (none, small, medium, high), Salary (fig. 3) (low, medium,
high), Age (fig. 4) (young, medium, old, very old), Children (no, a few, many), State (single, married, divorced) and
Place (big city, city, village).
Fig. 2. The attributes and membership functions of variable Loan
Fig. 3. The attributes and membership functions of variable Salary
Fig. 4. The attributes and membership functions of variable Age
The rule blocks produce the variables Finance (excellent, good, bad) and Personality (unsuitable, suitable, good,
excellent). Fig. 5 shows the attributes and membership functions of the variable Finance.
Fig. 5. The attributes and membership functions of variable Finance
The output variable Mailing with its attributes evaluates whether the client will be visited, a letter will be sent to him,
or he will not be contacted. See fig. 6.
Fig. 6 The attributes and membership functions of variable Mailing
Fig. 7 shows one of the two rule blocks, Finance, with its rules and degrees of support that set up the relation between
input and output variables.
Fig. 7 Rule block
When the model is built, it is necessary to tune it (to set the inputs to known values, evaluate the results, and change
the rules or weights if necessary). Once the system is tuned, it can be used in practice.
Fig.8 The attributes and membership functions of output variable Mailing
The set-up of the rule blocks distinguishes individual cases. For example, the result of decision-making is no contact
with the client when the person has a low salary and many loans, lives in a village, is of old age, is single and has no
children. Fig. 8 shows this result, where the mailing is evaluated as "do not contact the client".
The aim is for the investment in marketing to bring profit in the future. Whether the marketing project is profitable or
loss-making can be evaluated only after a certain time.
4. CONCLUSION
The mentioned case is only a fraction of the possible variants of the use of fuzzy logic in various areas of decision-
making. The theory of fuzzy logic contributes to the quality of decision-making, which is an important activity of
firms. It can be said that successful decision-making makes a firm successful.
It is necessary to emphasize that these methods support decision-making and that the responsibility for the choice of
the optimal variant or variants lies with those who make the decision.
Fuzzy logic, like artificial neural networks and genetic algorithms, belongs among the relatively strong methods of
artificial intelligence for the support of decision-making.
Address:
Ing. Petr Dostál, CSc.
Brno University of Technology
Kolejní 4
612 00 Brno, Czech Republic
Tel. +420 541 143 714, Fax. +420 541 142 692
e-mail: [email protected], [email protected]
http://www.iqnet.cz/dostal
THE COLLATION OF VARIOUS METHODS FOR THE SOLUTION OF TRANSPORTATION
PROBLEMS
Abdurrzzag Tamtam
Brno University of Technology
Abstract: The paper examines the effectiveness of various methods with respect to problem size. We
study and compare the North West Corner method, the index method and the Vogel method. It is obvious that the
first allocation of the goods required by the customers will be most expensive when the North West Corner
method is used, since it is based on a geographical principle without any economic considerations. We show that the
index method gives a better result than the North West Corner method, but the Vogel method yields a better
solution as the size grows. We modify the Vogel method and show by example that for larger
matrices of suppliers and customers the modified method gives the optimum solution.
Keywords: supplier, customer, cell, free cell – water, occupied cell – stone, North West Corner method,
index method, Vogel method, modified Vogel method.
1) Introduction
A special class of linear programming problems are the so-called distributive problems. Such a distributive problem can
be formulated in full generality as follows:
Minimise the objective function

z = Σ (i = 1..m) Σ (j = 1..n) cij · xij

under the conditions

Σ (j = 1..n) xij = ai ,   i = 1, 2, . . . , m,
Σ (i = 1..m) kij · xij = bj ,   j = 1, 2, . . . , n,

and the non-negativity of all xij.
These problems can also be solved with the simplex method, but such a way of solving is usually very lengthy and
laborious. Special properties of distributive problems make it possible to use special methods such as the transportation
problem.
Here we suppose m suppliers Si (i = 1, 2, . . . , m) with inventories si of units of an identical commodity and n
customers Kj (j = 1, 2, . . . , n) with requirements of kj units of the same commodity.
We suppose that

Σ (i = 1..m) si = Σ (j = 1..n) kj

and we then say that a balanced transportation problem is given. Further, the expenses cij for transporting one unit from
the supplier Si to the customer Kj are known. The number of transported units of the commodity from the i-th supplier
to the j-th customer will be denoted by xij. The problem is to find the m·n-dimensional vector
[ x11, x12, . . . , x1n, x21, x22, . . . , x2n, . . . , xm1, xm2, . . . , xmn ] satisfying the following restrictions:

x11 + x12 + . . . + x1n = s1
x21 + x22 + . . . + x2n = s2
. . .
xm1 + xm2 + . . . + xmn = sm
x11 + x21 + . . . + xm1 = k1
x12 + x22 + . . . + xm2 = k2
. . .
x1n + x2n + . . . + xmn = kn

and minimising the objective function

z = c11·x11 + c12·x12 + . . . + c1n·x1n + c21·x21 + c22·x22 + . . . + c2n·x2n + . . . + cm1·xm1 + cm2·xm2 + . . . + cmn·xmn
The coefficients standing at the variables are equal to one or zero. The number of zero coefficients is n · m · (m + n − 2),
the number of unity coefficients is only 2 · m · n. The system of equations is dependent: after summing the first m
equations we obtain the same result as after summing the last n equations. Therefore the solution has at most
(m + n − 1) non-zero components.
If the number of non-zero components is exactly (m + n − 1), the solution is called non-degenerate; in the case of a
smaller number we say that the solution is degenerate. That is the reason why we do not solve the transportation
problem by the simplex method and use simpler methods instead.
2 Solving of the Transportation Problem
Hereafter we study the optimisation process on a concrete example. The comparison of the methods for solving the
transportation problem will be done on one example containing three suppliers and four customers. The following
example with a small matrix of 3×4 cells shows the advantages of the index method, the Vogel method and the
modified Vogel method over the North West Corner method; it also shows that the Vogel method exceeds the index
method, although this small example does not reveal any difference between the Vogel method and our modified
method.
Example: Let us suppose that we have three suppliers S1, S2 and S3 with inventories of 310, 200 and 190 units of the
commodity and four customers K1, K2, K3, K4 with demands for 250, 100, 150 and 200 units. The transport costs cij
from the i-th supplier to the j-th customer are given in table 1.
Table 1

Suppliers     K1    K2    K3    K4    Inventories
S1            20    14    11    12    310
S2             6    15    18    15    200
S3            17    12    19    23    190
Demands      250   100   150   200    700
North West Corner Method
The name arose from the geographical point of view: the NW method is a geographical method of occupying the fields
(cells) of the table and has nothing in common with the economic point of view. We begin with the cell P11 determined
by the row S1 and the column K1 (north-west on a map); the value c11 plays no role in the construction. This cell is
occupied by the maximum possible part of the commodity required by K1, which is 250. The supplier S1 still has
310 − 250 = 60 units. We continue occupying cells in the eastward direction until the inventories of the first supplier
are spent, so the cell P12 determined by the pair S1, K2 obtains 60 units. The inventories of the first supplier are now
zero, but the second customer is not yet satisfied. We pass on to the second supplier S2: the customer K2 still requires
40 units, hence the cell P22 is occupied by 40 units. The supplier S2 still has 160 units of the commodity. It is possible
to satisfy the whole requirement of the customer K3, and we put 150 units into the cell P23. The last 10 units from S2
are given to the fourth customer in the cell P24. The customer K4 claims 200 units of the commodity and has received
only 10 units; the remaining 190 units are transported from the third supplier into the cell P34. Six cells are occupied
out of the total number of 12. In this example (m + n − 1) is just equal to 6, hence the solution is not degenerate.
The value of the objective function is z = 250·20 + 60·14 + 40·15 + 150·18 + 10·15 + 190·23 = 13 660 financial units.
We have obtained a first solution. It is, however, far from the best solution of the transportation problem.
This fact can be shown by calculation using table 2, which contains the first solution obtained by the NW method. The
occupied cells are called stones and the empty cells are called waters.
Table 2 (first solution by the NW Corner method; allocations with unit costs in parentheses, waters marked "–")

Suppliers     K1          K2         K3          K4          Inventories
S1            250 (20)    60 (14)    – (11)      – (12)      310
S2            – (6)       40 (15)    150 (18)    10 (15)     200
S3            – (17)      – (12)     – (19)      190 (23)    190
Demands       250         100        150         200         700
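The stepwise allocation described above can be written as a short routine. The following is an illustrative sketch, not code from the paper, run on the data of table 1:

```python
# North West Corner method: start in the north-west cell and move east or
# south as customers are satisfied or suppliers exhausted.
def north_west_corner(supply, demand):
    s, d = supply[:], demand[:]              # work on copies
    x = [[0] * len(d) for _ in s]
    i = j = 0
    while i < len(s) and j < len(d):
        q = min(s[i], d[j])                  # ship as much as the cell allows
        x[i][j] = q
        s[i] -= q
        d[j] -= q
        if s[i] == 0:
            i += 1                           # supplier exhausted: move south
        else:
            j += 1                           # customer satisfied: move east
    return x

cost = [[20, 14, 11, 12],
        [6, 15, 18, 15],
        [17, 12, 19, 23]]
plan = north_west_corner([310, 200, 190], [250, 100, 150, 200])
z = sum(cost[i][j] * plan[i][j] for i in range(3) for j in range(4))  # 13660
```

The routine reproduces the six stones of table 2 (P11 = 250, P12 = 60, P22 = 40, P23 = 150, P24 = 10, P34 = 190) and the objective value 13 660.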
Index Method
It is obvious that the economic point of view has an important position here. This method works with the costs cij: it
begins by occupying the cell with the smallest cost and continues over greater and greater costs up to the maximum
one. At the same time the sum of the values in the stones of every row must equal the initial inventory of the relevant
supplier, and similarly the sums of the values in the stones of every column must equal the requirements of the relevant
customers. We explain this method on our example.
We begin with the cell P21, which has the smallest cost c21 = 6. We occupy it with the maximum amount that can be
delivered from the supplier S2, i.e. 200 units of the commodity. This does not satisfy the demand of K1. The next
smallest cost is c13 = 11, so we occupy the cell P13, again with the maximum possible amount, here given by the
demand of K3, i.e. 150 units. The next smallest cost, equal to 12, occurs in two cells, namely P14 and P32. They are in
different rows and also in different columns, so it is not necessary to choose the order in which these cells are occupied.
We put 160 units into the cell P14, as there is no more at S1; the 100 units needed by the customer K2 are transported
from S3 into the cell P32. The cells with the next higher costs c12 = 14 and c22 = c24 = 15 are not occupied, because the
inventories of the suppliers S1 and S2 are already exhausted. We come to the cost c31 = 17. The customer K1 asks for a
total of 250 units and already has 200 units in the stone P21, so he receives 50 units of the commodity in the cell P31.
For analogous reasons (the inventories of the supplier are exhausted or the demands of the customer are satisfied) we
omit the costs 18, 19 and 20. We finish the allocation in the cell P34 with 40 units. The overall picture is given in
table 3.
The value of the objective function for the solution obtained by the index method, written in table 3, equals

z = 200·6 + 150·11 + (100 + 160)·12 + 50·17 + 40·23 = 7740
Table 3 (solution by the index method)

Suppliers     K1          K2          K3          K4          Inventories
S1            – (20)      – (14)      150 (11)    160 (12)    310
S2            200 (6)     – (15)      – (18)      – (15)      200
S3            50 (17)     100 (12)    – (19)      40 (23)     190
Demands       250         100         150         200         700
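The least-cost scan of the index method can likewise be sketched in a few lines; an illustrative implementation on the same data (for this data the order of cells with the equal cost 12 does not affect the result):

```python
# Index (least-cost) method: visit cells from the cheapest to the most
# expensive and give each the largest feasible amount.
def index_method(supply, demand, cost):
    s, d = supply[:], demand[:]
    m, n = len(s), len(d)
    x = [[0] * n for _ in range(m)]
    for _, i, j in sorted((cost[i][j], i, j) for i in range(m) for j in range(n)):
        q = min(s[i], d[j])     # zero once the row or column is exhausted
        x[i][j] = q
        s[i] -= q
        d[j] -= q
    return x

cost = [[20, 14, 11, 12],
        [6, 15, 18, 15],
        [17, 12, 19, 23]]
plan = index_method([310, 200, 190], [250, 100, 150, 200], cost)
z = sum(cost[i][j] * plan[i][j] for i in range(3) for j in range(4))  # 7740
```

The routine reproduces the stones of table 3 and the objective value 7 740, confirming the improvement over the 13 660 of the NW Corner method.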
We see that the index method brings a better result than the North West corner method. The index method
demonstrative applies the economical specifications and it requires speculation during all of the allocation process. On
the other hand the North – West corner method can be called mechanical we use a given algorithm without extra
cogitation. However nor even the solution obtained by index method need be optimal. Especially for larger problems we
come to results which have very far from the optimal solution. Better results can be received using a method of
approach called the Vogel method (VAM method). We illustrate this one on the same example and such we will be able
to compare all these methods. Using the Vogel method we even receive the optimal solution for our example.
Vogel Approach Method (VAM Method)
In the course of the index method we occupy, as early as possible, the cell with the smallest cost lying in a concrete row and a concrete column. It may happen that, after the exhaustion of inventories in that row or after satisfying the demands of the customer in that column, we must occupy a cell with a very high cost. For this reason the costs of the whole solution grow abnormally. The VAM method uses a process which excludes such situations, unfortunately not completely.
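The difference of a row or column is the gap between its two smallest costs. A short Python sketch (cost matrix as in the worked example) reproduces the initial differences shown in Table 4:

```python
# Row/column difference: second-smallest cost minus smallest cost.
# Cost matrix of the worked 3 x 4 example (rows S1..S3, columns K1..K4).
costs = [
    [20, 14, 11, 12],
    [ 6, 15, 18, 15],
    [17, 12, 19, 23],
]

def difference(line):
    a, b = sorted(line)[:2]
    return b - a

row_diffs = [difference(row) for row in costs]        # suppliers S1..S3
col_diffs = [difference(col) for col in zip(*costs)]  # customers K1..K4
print(row_diffs)  # [1, 9, 5]
print(col_diffs)  # [11, 2, 7, 3]
```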
Table 4 (unit costs and initial differences)

Suppliers  |  K1 |  K2 |  K3 |  K4 | Inventories | Difference
S1         |  20 |  14 |  11 |  12 |     310     |     1
S2         |   6 |  15 |  18 |  15 |     200     |     9
S3         |  17 |  12 |  19 |  23 |     190     |     5
Demands    | 250 | 100 | 150 | 200 |     700     |
Difference |  11 |   2 |   7 |   3 |             |
The basic idea of the VAM method is to prevent such a pattern of cell occupation, i.e. to prevent the situation that, towards the end of dispatching the inventories to the customers, the differences among the costs grow inappropriately. This unwelcome situation manifests itself in rows or columns where there are great differences between the minimal cost and the nearest higher one.
Table 5 (situation after the first assignment P21 = 200; each cell: unit cost / allocated amount; the second entry of each difference sequence is the re-counted value, × marks an exhausted row or satisfied column)

Suppliers   |   K1     |  K2   |  K3   |  K4   | Inventories | Differences
S1          |   20     |  14   |  11   |  12   |    310      |  1, 1
S2          |  6 / 200 |  15   |  18   |  15   |    200      |  9, ×
S3          |   17     |  12   |  19   |  23   |    190      |  5, 5
Demands     |   250    |  100  |  150  |  200  |    700      |
Differences |  11, 3   | 2, 2  | 7, 8  | 3, 11 |             |
We avoid such situations by preferring, at the beginning of the allocation of inventories to customers, those rows and columns where there is the maximum difference between the smallest cost and the nearest greater one. After finding such an array (row or column) we place the maximum possible amount into its cell with minimum cost. We apply this method to our example again. We calculate the differences in all rows and columns; the results are in the last row and last column of Table 4. The greatest difference is in the first column. We find the cell with minimum cost in that column: it is the cell P21. We occupy it with the maximal possible amount, i.e. by 200. The inventories of the supplier S2 are thereby exhausted. This fact is denoted by a cross (×) at the differences in the second row, where the characteristics of the second supplier are described. We draw an analogous conclusion when the demands of a customer are satisfied: we write the cross into the difference cell lying in the column of the corresponding customer. After every operation by which either a supplier is emptied or a customer is satisfied we omit that row or column; hence it is necessary to re-count all the remaining differences. The situation after the first assignment into the cell P21 is given in Table 5, with the differences already re-counted. The highest difference is now in the fourth column. The lowest cost there is in the cell P14; this cell will be occupied by the demand of the customer K4, i.e. by 200 units. The demand of the fourth customer is satisfied, which we record by writing the cross at the differences in the fourth column. After the next re-counting of differences the maximal difference (8) is in the third column (the capacities of the second supplier S2 have run out, therefore the costs c2j, j = 1, 2, 3, 4, are not used in the computation of the new differences). The lowest cost is in the cell P13. We occupy this cell with the remaining maximum amount from the first supplier S1, i.e. 110 units. Now both suppliers S1 and S2 are exhausted, and the costs from the first two rows are no longer considered in the calculation of the differences; as the column differences we simply take over the single remaining costs. We complete the desired amounts of commodity for the customers in the row of S3: K2 receives 100 units, K1 receives 50 units, and K3 receives 40 units.
The whole process of dispatching the commodity and the sequences of differences for the individual rows and columns are given in Table 6.
The value of the objective function obtained by the Vogel method, as described in Table 6, is equal to
z = 200·6 + 110·11 + (200 + 100)·12 + 50·17 + 40·19 = 7620.
Simultaneously it is the minimum value: every other solution produces an objective function value at least as large, and hence larger costs for transportation. In the end we can say that the Vogel method usually brings the best results of all the methods given above.
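For completeness, the whole VAM procedure is easy to automate. The following Python sketch is an illustrative implementation (not from the paper; ties are broken arbitrarily); on the cost matrix of this example it reproduces the optimal value 7620:

```python
def vam(costs, supply, demand):
    """Vogel approximation: repeatedly take the live row or column with
    the largest difference between its two smallest remaining costs and
    fill its cheapest cell with the largest feasible amount."""
    supply, demand = supply[:], demand[:]
    rows = set(range(len(supply)))
    cols = set(range(len(demand)))
    alloc = [[0] * len(demand) for _ in supply]

    def diff(line):
        s = sorted(line)
        # With a single survivor, the "difference" is the cost itself,
        # mirroring the paper's rule of taking over the remaining cost.
        return s[1] - s[0] if len(s) > 1 else s[0]

    while rows and cols:
        cand = [(diff([costs[i][j] for j in cols]), 'row', i) for i in rows]
        cand += [(diff([costs[i][j] for i in rows]), 'col', j) for j in cols]
        d, kind, k = max(cand)
        if kind == 'row':
            i, j = k, min(cols, key=lambda j: costs[k][j])
        else:
            i, j = min(rows, key=lambda i: costs[i][k]), k
        q = min(supply[i], demand[j])
        alloc[i][j] += q
        supply[i] -= q
        demand[j] -= q
        if supply[i] == 0:
            rows.discard(i)
        if demand[j] == 0:
            cols.discard(j)
    return alloc

costs = [[20, 14, 11, 12], [6, 15, 18, 15], [17, 12, 19, 23]]
alloc = vam(costs, [310, 200, 190], [250, 100, 150, 200])
z = sum(c * x for cr, xr in zip(costs, alloc) for c, x in zip(cr, xr))
print(z)  # 7620
```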
Table 6 (each cell: unit cost / allocated amount; difference sequences as re-counted step by step, × marks an exhausted row or satisfied column)

Suppliers   |   K1            |   K2         |   K3            |   K4     | Inventories | Differences
S1          | 20 / -          | 14 / -       | 11 / 110        | 12 / 200 |    310      | 1, 1, 3, ×
S2          |  6 / 200        | 15 / -       | 18 / -          | 15 / -   |    200      | 9, ×
S3          | 17 / 50         | 12 / 100     | 19 / 40         | 23 / -   |    190      | 5, 5, 5, 5
Demands     |    250          |    100       |    150          |   200    |    700      |
Differences | 11, 3, 3, 17, 17, × | 2, 2, 2, 12, × | 7, 8, 8, 19, 19, × | 3, 11, × |     |
Very often the Vogel method gives the optimal solution directly; this is usual for problems that are not extensive, as in our case. For more complicated problems the solution received by the Vogel method is near the optimum.
Modified Vogel Method
This method can be used when the matrix of cells for suppliers and customers is of the type m × n, where m ≥ 3 and n ≥ 3. In the case when the numbers m, n are near to three we receive the same results as for the unmodified Vogel method. Now to the modified Vogel method.
Table 7 (each cell: unit cost / allocated amount; sequences of modified differences, × marks an exhausted row or satisfied column)

Suppliers   |   K1            |   K2         |   K3            |   K4      | Inventories | Differences
S1          | 20 / -          | 14 / -       | 11 / 110        | 12 / 200  |    310      | 3, 9, 9, ×
S2          |  6 / 200        | 15 / -       | 18 / -          | 15 / -    |    200      | 9, ×
S3          | 17 / 50         | 12 / 100     | 19 / 40         | 23 / -    |    190      | 7, 7, 7, 7
Demands     |    250          |    100       |    150          |   200     |    700      |
Differences | 14, 3, 3, 17, 17, × | 3, 2, 2, 12, × | 8, 8, 8, 19, 19, × | 11, 11, × |     |
We take the three smallest costs cij in every row and column and order them by magnitude. Suppose that we are working in the i0-th row and that the three smallest numbers are c(i0,j1) ≤ c(i0,j2) ≤ c(i0,j3). We form the differences d1 = c(i0,j2) − c(i0,j1) and d2 = c(i0,j3) − c(i0,j2), and we define the modified difference as their sum, d = d1 + d2 = c(i0,j3) − c(i0,j1). Columns are treated analogously. We apply this modified method to our example; the final table is Table 7. We see that the modified method gives the same allocation as VAM. We can say that the modified method did not bring anything new or better here, but it is also necessary to state that it is not worse. The modified differences can be more favourable for examples with bigger matrices. We show this on the following example.
Example. Let us suppose that we have seven suppliers S1, S2, . . . , S7 with inventories of 100, 200, 300, 400, 500, 600 and 700 units of commodity and four customers K1, K2, K3, K4 with demands of 960, 740, 650 and 450 units. The transport costs cij from the i-th supplier to the j-th customer are given in Table 8:
Table 8 (the detailed 7 × 4 layout of this table was lost in extraction; the recoverable data follow)
Inventories: S1 = 100, S2 = 200, S3 = 300, S4 = 400, S5 = 500, S6 = 600, S7 = 700.
Demands: K1 = 960, K2 = 740, K3 = 650, K4 = 450 (total 2800).
The body of the table lists the unit costs cij, the allocation obtained with the modified differences, and the sequences of max differences for the individual rows and columns.
The Vogel method accomplishes a starting allocation whose cost equals 11 580 financial units. Our modified method brings a starting allocation with the cost of 9 890 financial units (see Table 8). It is obvious that the modified Vogel method presented here brings better results than the classic Vogel method. But it is necessary to admit that with another choice of a column or row having the same modified differences we could arrive at a different and worse result.
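For reference, the modified difference is just the gap between the largest and the smallest of the three smallest costs of a row or column. A short sketch on the small 3 × 4 example (costs as in Tables 4 to 7) reproduces the first entries of the difference sequences in Table 7:

```python
# Modified difference: with the three smallest costs c1 <= c2 <= c3 of a
# row/column, d = (c2 - c1) + (c3 - c2) = c3 - c1.
def modified_difference(line):
    s = sorted(line)[:3]
    return s[-1] - s[0]

costs = [
    [20, 14, 11, 12],
    [ 6, 15, 18, 15],
    [17, 12, 19, 23],
]
row_md = [modified_difference(r) for r in costs]        # S1..S3
col_md = [modified_difference(c) for c in zip(*costs)]  # K1..K4
print(row_md)  # [3, 9, 7]
print(col_md)  # [14, 3, 8, 11]
```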
Address:
Abdurrzzag Tamtam
Faculty of Electrical Engineering and Communication
Brno University of Technology
Purkynova 118,
612 00 Brno, Czech Republic
e-mail: [email protected],
SOLUTION OF STRUCTURAL INTERBRANCH SYSTEM OF A DYNAMIC MODEL
Jaromír Baštinec, Josef Diblík
Brno University of Technology
Abstrakt: The whole range of problems concerning mutual deliveries among the various manufacturing branches of industry, and the quantities placed on the market, is determined by the crude productions of the individual branches. In this paper it will be shown that this dynamic problem can be solved with special mathematical methods.
Klíčová slova: branches of industry, crude production, technical coefficient, dynamic model.
Let us suppose that the economic system is divided into n manufacturing branches. We denote by xi the whole amount produced by the i-th branch of industry. Further we denote by Xij the amount of production of the i-th branch supplied to the j-th branch. Finally we denote by yi the amount of products of the i-th branch intended for final usage (i.e. market, export). The whole set of relations among producers and their customers can be written as follows:

x1 = X11 + X12 + . . . + X1n + y1
x2 = X21 + X22 + . . . + X2n + y2
. . . . . . .
xn = Xn1 + Xn2 + . . . + Xnn + yn          (1)
Common experience allows us to make the following assumption: the supply Xij from the i-th branch to the j-th one is directly proportional to the crude production xj of the j-th branch. Then we have

Xij = aij xj          (2)

where the coefficient aij of the direct proportion is called the technical coefficient.
If we know the supplies among all branches of industry and the amounts for final usage from the previous seasons, then we can calculate all xi, i = 1, 2, . . . , n, from the system (1), and hence the technical coefficients can be obtained from the equation

aij = Xij / xj          (3)
Substituting Xij from (2) for i = 1, 2, . . . , n and j = 1, 2, . . . , n into (1), the following system of linear equations is received:

x1 = a11 x1 + a12 x2 + . . . + a1n xn + y1
x2 = a21 x1 + a22 x2 + . . . + a2n xn + y2
. . . . . . . . . . . .
xn = an1 x1 + an2 x2 + . . . + ann xn + yn          (4)

The system (4) can be rewritten in the matrix form:

X = A·X + Y          (5)
Vector X is called the vector of crude production of the branches, vector Y is called the vector of final usage, and the matrix A is called the matrix of technical coefficients. The system of equations can be rewritten in the form

(E − A)·X = Y          (6)

where E is the unit matrix of the type n × n, i.e. with n rows and n columns.
The matrix E − A can be regarded as an operator of transformation which to the input vector X (the vector of crude production of the branches) assigns the output vector Y (the vector of final usage). This result is not of great importance for the economy. More interesting and more important for the economy is to find the operator which to a given Y assigns the vector X. This operator can be obtained by multiplying the equation (6) by the inverse matrix (E − A)^(−1), and we receive

X = (E − A)^(−1)·Y          (7)
The matrix (E − A)^(−1) is denoted by B and it is called the matrix of the full material burden. The matrix equations above are called the fundamental form of the open static model of inter-branch relations. The elements of the matrix

B = [bij]          (8)

are called the coefficients of the full material costs. The coefficient bij indicates the full consumption of the production of the i-th branch which is necessary for the delivery of the production of the j-th branch for final usage. Between aij and bij the following relation holds:

aij ≤ bij for i, j = 1, 2, . . . , n.
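A tiny numerical illustration of the models (5) and (7), with made-up technical coefficients for two branches (the numbers are assumptions for the sketch, not from the paper); the 2 × 2 inverse is written out by the adjugate formula:

```python
# Hypothetical two-branch illustration of model (7): X = (E - A)^(-1) . Y.
a11, a12 = 0.2, 0.3        # technical coefficients a_ij (made up)
a21, a22 = 0.1, 0.4
y1, y2 = 100.0, 200.0      # final usage

# Invert E - A for the 2 x 2 case by the adjugate formula.
m11, m12 = 1 - a11, -a12
m21, m22 = -a21, 1 - a22
det = m11 * m22 - m12 * m21

b11, b12 = m22 / det, -m12 / det   # B = (E - A)^(-1), the matrix of
b21, b22 = -m21 / det, m11 / det   # full material costs b_ij

x1 = b11 * y1 + b12 * y2           # crude production, as in equation (9)
x2 = b21 * y1 + b22 * y2

# Balance check against model (5): x_i = sum_j a_ij x_j + y_i
assert abs(x1 - (a11 * x1 + a12 * x2 + y1)) < 1e-9
assert abs(x2 - (a21 * x1 + a22 * x2 + y2)) < 1e-9
# The claimed inequality a_ij <= b_ij holds here as well.
assert b11 >= a11 and b12 >= a12 and b21 >= a21 and b22 >= a22
```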
For the economic interpretation of the coefficients bij it is useful to write the model (7) as a system of equations again:

xi = bi1 y1 + bi2 y2 + . . . + bin yn,   i = 1, 2, . . . , n          (9)

For a new vector of usage Y′ = (y1′, y2′, . . . , yn′) the system (9) takes the form:

xi′ = bi1 y1′ + bi2 y2′ + . . . + bin yn′,   i = 1, 2, . . . , n          (10)
Let us denote

Δxi = xi′ − xi,  Δyi = yi′ − yi,  i = 1, 2, . . . , n,

and subtract the left and right sides of the corresponding equations of the systems (9) and (10). We receive the system

Δxi = bi1 Δy1 + bi2 Δy2 + . . . + bin Δyn,   i = 1, 2, . . . , n.          (11)
The magnitude Δxi gives the change of the whole crude production of the i-th branch if the changes of the components of the final usage are Δy1, Δy2, . . . , Δyn. It is possible to prove that the coefficients bij are non-negative numbers and hence Δxi ≥ 0. If we put Δyj = 1 and Δyk = 0 for all k ≠ j, then from (11)

Δxi = bij · 1,   i = 1, 2, . . . , n.          (12)

The coefficient bij of the full material costs thus sets the value by which the production of the i-th branch must be increased so that the j-th branch can increase its supply for final usage by one unit.
The coefficient bij of the full material costs includes, firstly, the value of the direct supply from the i-th branch to the j-th branch which is necessary for the production of one unit in the j-th branch and, secondly, the values of the supplies of the i-th branch which reach the j-th branch indirectly, through the mediation of the other branches.
In the model (6) we supposed that we know the vector of the crude production X and we obtain the vector of final usage Y. Conversely, in the model (7) we know the vector of final usage Y and with its aid we look for the vector of the crude production X. As the third type of problem we solve the following situation: for some branches the crude production is given and for the others the amount of final usage. In this treatment of the model the required vector contains k elements formed by crude production and n − k elements formed by final trade outlets from the system (n > k). The branches of the model can be reordered so that we obtain two families:
The first family contains the branches for which the capacities of crude production are given.
The second family contains the branches for which we know the amount of the full final usage.
We compose our model in such a way that we calculate the full final usage for the first family and the crude production for the complementary branches of the second one. The first family will be denoted by the index 1, the second by the index 2. We look for such a matrix R for which the following equality holds.
X   Y 
R.  1  =  1 
 Y2   X 2 
It is obvious that we must construct four sub-matrixes with the following property:
 R11



R
 21
where R11 is the matrix of the type
k
k
n−k
n−k
(13)
R1 2 
 X   Y 
. 1  =  1 
  Y2   X 2 
R2 2 
, R12 is of the type
n−k
k
(14)
, R21 is of the type
k
n−k
and R2 2 is of the type
.
From (14) there follows:

R11 X1 + R12 Y2 = Y1
R21 X1 + R22 Y2 = X2          (15)
We derive the matrices R11, R12, R21, R22 as follows. In agreement with the allocation of the branches of industry into the two families we divide the matrix A of technical coefficients into blocks:

| A11  A12 |   | X1 |   | Y1 |   | X1 |
|          | · |    | + |    | = |    |          (16)
| A21  A22 |   | X2 |   | Y2 |   | X2 |

We expand the equation (16) into two equations with respect to the rules for the multiplication of matrices and we have:

A11 X1 + A12 X2 + Y1 = X1
A21 X1 + A22 X2 + Y2 = X2          (17)
After the rearrangement of the equations (17) we obtain

Y1 = (E − A11) X1 − A12 X2
Y2 = −A21 X1 + (E − A22) X2          (18)

When we express X2 from the second equation of (18) we get

X2 = (E − A22)^(−1) Y2 + (E − A22)^(−1) A21 X1

and hence

Y1 = (E − A11) X1 − A12 [ (E − A22)^(−1) Y2 + (E − A22)^(−1) A21 X1 ]          (19)
After comparing with the first equation from (15) we get

R11 = (E − A11) − A12 (E − A22)^(−1) A21
R12 = −A12 (E − A22)^(−1)          (20)

Similarly as (20) we get from the second equation of (15)

R21 = (E − A22)^(−1) A21
R22 = (E − A22)^(−1)          (21)
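The block formulas (20) and (21) can be sanity-checked in the simplest case k = 1, n − k = 1, where every sub-matrix collapses to a scalar; the coefficients below are made up for the sketch:

```python
# Scalar (k = 1, n - k = 1) check of the block formulas (20) and (21).
a11, a12, a21, a22 = 0.2, 0.3, 0.1, 0.4   # made-up technical coefficients

r11 = (1 - a11) - a12 * a21 / (1 - a22)   # R11 = (E-A11) - A12 (E-A22)^-1 A21
r12 = -a12 / (1 - a22)                    # R12 = -A12 (E-A22)^-1
r21 = a21 / (1 - a22)                     # R21 = (E-A22)^-1 A21
r22 = 1 / (1 - a22)                       # R22 = (E-A22)^-1

# Pick a crude production x1 and a final usage y2, map them through R ...
x1, y2 = 100.0, 50.0
y1 = r11 * x1 + r12 * y2
x2 = r21 * x1 + r22 * y2

# ... and confirm the original balance equations (17) are satisfied.
assert abs(x1 - (a11 * x1 + a12 * x2 + y1)) < 1e-9
assert abs(x2 - (a21 * x1 + a22 * x2 + y2)) < 1e-9
```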
While the interpretation of the elements of the matrices (E − A) and (E − A)^(−1) in the models (6) and (7) is simple, the interpretation of the elements of the matrices Rij, i, j = 1, 2, is a more complicated task. For the understanding of the content of the elements of the sub-matrices Rij, i, j = 1, 2, we study fractional products.
Dynamic systems
Production in a certain interval depends on the accumulation in the previous interval. The static models do not express this dependence.
Accumulation is a component of the final product. In the balance of the relations between the branches we denoted the final products of the individual branches y1, y2, . . . , yn. This final product basically consists of two parts:
yi(1) . . . the consumed part, and
yi(2) . . . the accumulated part, i.e. yi = yi(1) + yi(2).
For distinguishing the individual periods of time, we shall denote the gross production of the i-th branch in year t by the symbol xi(t) and the final product in year t by the symbol yi(t).
The accumulated product of the i-th branch becomes a part of the means of other branches. We denote the part of the accumulated product of the i-th branch in year t which is invested into the j-th branch as Δyij(t). Then

yi(2)(t) = Σ_{j=1}^{n} Δyij(t).

If the accumulated product itself consists only of floating means that are consumed in the following year (t+1), the following relation obviously holds between the increment of production in the j-th branch [xj(t+1) − xj(t)] and the investment into the products of the i-th branch Δyij(t):

Δyij(t) = aij [xj(t+1) − xj(t)] = Xij(t+1) − Xij(t).
However, part of the accumulated product takes the form of basic funds that are not consumed within a single year. Suppose the consumption of the investment Δyij(t) is spread over Tij years. This means that only a 1/Tij-th part of the investment Δyij(t) is consumed within the following year. The reality is thus better characterised by the relation

Δyij(t) / Tij = aij (xj(t+1) − xj(t)),

which after multiplication gives

Δyij(t) = aij Tij (xj(t+1) − xj(t)).
The relation between the increment in the year (t+1) and the accumulation in the preceding year is therefore given by a
system of technical coefficients aij and by the system of average periods of usability of Tij, that are also of technical
nature. Therefore we substitute them with the so-called "investment coefficient", denoted by cij:
cij = aijTij
The system of investment coefficients, and similarly the system of technical coefficients, forms square matrix C.
Using the investment coefficients, the relation between the increment of production in the j-th branch and the extent of investments into the production of the j-th branch may be written in the form

Δyij(t) = cij (xj(t+1) − xj(t))
The whole of the accumulated production is thus equal to

yi(2)(t) = Σ_{j=1}^{n} Δyij(t) = Σ_{j=1}^{n} cij (xj(t+1) − xj(t)).
This equation connects the accumulation of the i-th branch with the increment of the production in the individual
branches. Similar equations may be obtained for all the branches
c11 (x1(t+1) − x1(t)) + c12 (x2(t+1) − x2(t)) + . . . + c1n (xn(t+1) − xn(t)) = y1(2)(t),
c21 (x1(t+1) − x1(t)) + c22 (x2(t+1) − x2(t)) + . . . + c2n (xn(t+1) − xn(t)) = y2(2)(t),
. . .
cn1 (x1(t+1) − x1(t)) + cn2 (x2(t+1) − x2(t)) + . . . + cnn (xn(t+1) − xn(t)) = yn(2)(t).
From this system of equations, we may directly determine how much we need to accumulate from the production of the
individual branches in the given year to reach the planned increment of the production in the individual branches. When
we write the system of equations in the matrix form
C·ΔX = Y(2),

and after rearrangement

ΔX = C^(−1)·Y(2),

we may determine the increment of the individual branches in the following year, when the level and structure of accumulation are given.
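A small numerical sketch of the relations C·ΔX = Y(2) and ΔX = C^(−1)·Y(2) for two branches; the investment coefficients and planned increments below are made-up numbers:

```python
# Forward: accumulation needed for planned increments, Y2 = C . dX.
c11, c12 = 0.5, 0.2        # investment coefficients c_ij (made up)
c21, c22 = 0.1, 0.8

dx1, dx2 = 30.0, 40.0      # planned increments x_j(t+1) - x_j(t)
y1_acc = c11 * dx1 + c12 * dx2   # accumulation needed in branch 1
y2_acc = c21 * dx1 + c22 * dx2   # accumulation needed in branch 2

# Backward: recover the increments from the accumulation, dX = C^(-1) Y2.
det = c11 * c22 - c12 * c21
rx1 = ( c22 * y1_acc - c12 * y2_acc) / det
rx2 = (-c21 * y1_acc + c11 * y2_acc) / det
assert abs(rx1 - dx1) < 1e-9 and abs(rx2 - dx2) < 1e-9
```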
Now we take into account the periods t = 1, 2, . . . , T and adopt the following notation:

xi(t) = Σ_{j=1}^{n} (Xij(t) + zij(t)) + yi(t),   i = 1, 2, . . . , n,          (22)
where Xij(t) is the supply of the i-th branch to the j-th branch for consumption in the t-th period of time, zij(t) is the
supply of the i-th branch to the j-th branch for investment during the t-th period of time, yi(t) is the final product of the
i-th branch in the t-th period of time. Suppose, similarly as with the static model, that

Xij(t) = aij xj(t),
where aij is the technical coefficient that does not change in time.
Further, suppose that the supply of the i-th branch to the j-th branch for investment during the period of time t is
proportional to the increment of the production of the j-th branch in one period of time:
zij (t ) = cij ( x j (t + 1) − x j (t )) = cij ∆x j (t ),
where Δxj(t) is the increment of the total production of the j-th branch within one period and cij is the investment coefficient, again independent of time.
By substituting into the balance equation (22), we obtain, in the matrix form,
CX (t + 1) + ( A − C − E ) X (t ) = −Y (t ),
We obtained a system of difference equations

C X(t+1) = (E − A + C) X(t) − Y(t),
X(t+1) = C^(−1)(E − A + C) X(t) − C^(−1) Y(t).

From this system, we may determine how the inter-branch relationships should look in order to obtain the required growth.
Denote M := C^(−1)(E − A + C) and N := −C^(−1). We receive

X(t+1) = M X(t) + N Y(t).          (23)
We obtained a system of difference equations defined for all t (provided that |C| ≠ 0, so that C^(−1) exists). If |M| ≠ 0, then a unique solution of the system exists.
By subsequent substitution, we transform the system (23) into a single difference equation of the n-th order, which may
be solved e.g. with the aid of eigenvalues of the characteristic equation.
If M is of small order, then the solution was described in [17].
Moreover, we can use the following theorem (for the proof see [3], p. 124).
Theorem:
The unique solution of the initial value problem

X(n+1) = M(n) X(n) + G(n),   X(n0) = X0,

is given by

X(n, n0, X0) = [ Π_{i=n0}^{n−1} M(i) ] X0 + Σ_{r=n0}^{n−1} [ Π_{i=r+1}^{n−1} M(i) ] G(r).

If M is a constant matrix, then the solution is given by

X(n, n0, X0) = M^(n−n0) X0 + Σ_{r=n0}^{n−1} M^(n−r−1) G(r).
Example:
We consider the system

x(n+1) = 2x(n) + y(n) + n,
y(n+1) = 2y(n) + 1,
x(0) = 1,  y(0) = 0.

Solution:
In this case we have

M = | 2  1 |,   G(n) = | n |,   X(0) = | 1 |.
    | 0  2 |           | 1 |           | 0 |

Then

M^n = | 2^n  n·2^(n−1) |.
      |  0      2^n    |
Hence

X(n) = M^n X(0) + Σ_{r=0}^{n−1} M^(n−r−1) G(r)
     = | 2^n | + Σ_{r=0}^{n−1} | r·2^(n−r−1) + (n−r−1)·2^(n−r−2) |.
       |  0  |                 |            2^(n−r−1)            |

We use the formula

Σ_{r=1}^{n−1} r·a^r = [ a(1 − a^n) − n·a^n (1 − a) ] / (1 − a)^2.

For the second component we get directly Σ_{r=0}^{n−1} 2^(n−r−1) = 2^n − 1. For the first component, Σ_{r=0}^{n−1} r·2^(n−r−1) = 2^(n−1) Σ_{r=1}^{n−1} r (1/2)^r = 2^n − n − 1 and Σ_{r=0}^{n−1} (n−r−1)·2^(n−r−2) = (n−2)·2^(n−1) + 1.
So we have

x(n) = 2^n + n·2^(n−1) − n,
y(n) = 2^n − 1.
Acknowledgement
This research has been supported by the Czech Ministry of Education in the frame of MSM002160503
Research Intention MIKROSYN New Trends in Microelectronic Systems and Nanotechnologies.
REFERENCES
[1] ACKOFF, R. L. Progress in Operations Research. New York : John Wiley & Sons, Inc., 1961.
[2] CHURCHMAN, Ch. W.; ACKOFF, R. L.; ARNOFF, L. Introduction to Operations Research. New York : John Wiley & Sons, Inc., 1957.
[3] ELAYDI, S. N. An Introduction to Difference Equations. Second Edition. Springer, 1999.
[4] HABR, J.; VEPŘEK, J. Systémová analýza a syntéza. Praha : SNTL, 1972.
[5] BECK, J.; LAGOVÁ, M.; ZELINKA, J. Lineární modely v ekonomii. Praha : SNTL, 1982.
[6] KLAPKA, J.; DVOŘÁK, J.; POPELKA, P. Metody operačního výzkumu. Brno : VUTIUM, 2001.
[7] PRÁGEROVÁ, A. Diferenční rovnice. Praha : SNTL, 1971.
[8] RAIS, K. Vybrané kapitoly z operační analýzy. Brno : PGS, 1985.
[9] ROCCAFERRERA, G. M. F. Operations Research Models for Business and Industry. Chicago, New York : S.W. Publishing Company, 1964.
[10] TER-MANUELIANC, A. Modelování problémů řízení. Praha : Institut řízení, 1977.
[11] VACULÍK, J.; ZAPLETAL, J. Podpůrné metody rozhodovacích procesů. Brno : Masarykova univerzita, 1998.
[12] WALTER, J. a kol. Operační výzkum. Praha : SNTL, 1973.
[13] WALTER, J. Stochastické modely v ekonomii. Praha : SNTL, 1970.
[14] ZAPLETAL, J.; ZÁSTĚRA, B. Vybrané kapitoly z operačního výzkumu. Zlín : VUT-FT, 1983.
[15] ZAPLETAL, J. Operační analýza. Kunovice : Skriptorium VOŠ, 1995.
[16] ZAPLETAL, J. Structural Interbranch System of Static Model. International conference of EPI Kunovice, 2005, 303–307. ISBN 80-7314-052-7.
[17] BAŠTINEC, J. Structural Interbranch System of Dynamic Model. International conference of EPI Kunovice, 2005, 317–322. ISBN 80-7314-052-7.
Address:
Doc. RNDr. Jaromír Baštinec, CSc.
Department of Mathematics
Faculty of Electrical Engineering and Communication
Brno University of Technology
Technická 8,
616 00 Brno,
[email protected]
Address:
Prof. RNDr. Josef Diblík, DrSc.
Department of Mathematics
Faculty of Electrical Engineering and Communication
Brno University of Technology
Technická 8,
616 00 Brno,
[email protected]
PROBABILITY THEORY AND STATISTICS IN THE COMBINED FORM OF STUDY OF THE
BACHELOR STUDENT PROGRAMMES AT FEEC BUT
Michal Novák
Brno University of Technology
Abstrakt: At FEEC BUT the teaching of bachelor student programmes in the combined form of study
began in the academic year 2004/2005. This contribution discusses how the basics of probability theory
and statistics are included in the programmes. Some general information on the course as well as first
experience from it are also given.
Klíčová slova: distance learning, combined form of study, probability, statistics, teaching mathematics
Introductory information – context and prerequisites
Teaching in the combined form of study began at FEEC BUT in the academic year 2004/2005. The courses as well as
their outlines and requirements are the same as in the attended form of study (with the exception of one subject offered
in the attended form but not in the combined one). Mathematics in the bachelor student programmes is therefore taught
in four subjects: Mathematical seminar, Mathematics 1, Mathematics 2 and Mathematics 3.
Mathematical seminar is meant to revise secondary school knowledge of mathematics necessary for further studies. In
the combined form of study it is a long weekend course at the beginning of the term; the students can either attend it or
submit exercises only.
Mathematics 1 (in the first term) includes basics of linear algebra, basics of differential and integral calculus of
functions of one variable and basics of differential calculus of more variables. Mathematics 2 (in the second term)
includes solving differential equations and basics of theory of complex functions and integral transformations. Both
courses consist of five tutorials.
The course on probability and statistics
Basic concepts of probability and statistics are taught during the third term in Mathematics 3. Its outline, however,
includes basic numerical methods as well. Out of the total number of six tutorials in the third term, four tutorials (3 lessons each) are devoted to Mathematics 3. The outline of the first Mathematics 3 course was as follows:
• Tutorial 1: introduction, revision, classical and geometrical probabilities, discrete and continuous random variables, issues of expected value and dispersion
• Tutorial 2: some basic distributions of probability (binomial, Poisson, exponential, uniform, normal), the idea of statistical testing, basic statistical tests (sign test, z-test & mean expected value test), queuing theory (time permitting)
• Tutorial 3: numerical methods (not to be discussed here)
• Tutorial 4: final tutorial, revision, sample tests
The tutorials introduce students to the basic concepts of probability theory and statistics. Students learn that a relatively
small number of formulas and theorems have profound effects in a number of situations. The choice of tasks emphasises
the way of decoding the respective word problems and finding the way of applying the mathematical means to solve the
problems. The part on statistical testing stresses the general concept of testing and its applicability in various contexts.
„ICSC– FOURTH INTERNATIONAL CONFERENCE ON SOFT COMPUTING APPLIED IN COMPUTER AND ECONOMIC ENVIRONMENT”
EPI Kunovice, Czech Republic, January 27, 2006
43
Students are required to submit exercises for each tutorial; these are assessed by a maximum of 30 points in total. The exam at the end of the term is written only and is assessed by a maximum of 70 points. In order to pass the subject, students are required to obtain a minimum of 50 points. Students have access to sample tests, which can be downloaded from the teacher's homepage. The final tutorial deals with the exam as well.
Students can study the subject matter from [2], which respects the needs of distance form of study and was targeted at
this subject. The text, which is also used by students in the attended form of study, is available as a PDF file from a
number of links including the faculty website and the teacher's homepage. Special office hours at convenient times are set for the combined-form students only; consulting the subject matter by telephone is widely used.
Another contribution in the proceedings of this conference, [3], gives examples of tasks solved throughout the course.
Applications of these problems as well as the necessary mathematical knowledge are included there as well.
Probability and statistics in the master study programmes
Since teaching in the combined form of study started only as late as 2004/2005, there are no students in the master study
programmes yet. The master study programmes in the attended form of study, however, offer a course Probability,
statistics, operations research, which deepens the knowledge of probability theory and statistics. The topics dealt with
in the course include: basic statistical tests – t-test, F-test; confidence intervals; linear regression; post-hoc tests;
goodness of fit test; nonparametric tests; mathematical methods in economics - linear programming, the transport
problem; dynamic programming, recursive algorithm, inventory models.
Conclusion
Knowledge of probability theory and statistics is necessary for dealing with a great number of situations which can
occur in various contexts. The combined form of study, as a means designed to provide access to this knowledge to
students who are already employed, can help to improve the position and status not only of such students but also of
their employers.
References
[1] BAŠTINEC, J. Výuka matematiky na FEKT VUT Brno (v bakalářském i magisterském studiu). In 37. konferencia slovenských matematikov. Žilina : Slovenská matematická společnost, 2005.
[2] FAJMON, B.; RŮŽIČKOVÁ, I. Matematika 3. Brno : UMAT FEKT VUT, 2003, available from https://www.feec.vutbr.cz/et/skripta/umat/Matematika_3_S.pdf.
[3] NOVÁK, M. Examples of using concepts of probability theory in management decision making. This proceedings.
Address:
Mgr. Michal Novák, Ph.D.
Ústav matematiky,
FEKT VUT v Brně
Technická 8
616 00 Brno
tel.: +420-541143135
e-mail: [email protected]
APPLICATION OF FUZZY SYSTEMS FOR DECISION AND CONTROL SUPPORT
Vladimír Mikula, Jindřich Petrucha
Evropský polytechnický institut, s.r.o.
Abstract: This paper deals with the use of fuzzy logic for decision-process support. Basic knowledge of fuzzy logic has been published in numerous sources – see the list of references at the end of this article. A brief overview of the theory of fuzzy systems can be found in the tutorial article published by the author in the Proceedings of the International Conference of EPI, s.r.o. Kunovice, January 2005 [1]. The structure of a fuzzy system is described, with an explanation of the purpose and function of the individual blocks of the system and a brief description of its use in decision and control processes.
Keywords: fuzzy sets, fuzzy systems, universe of fuzzy sets, degree of membership, fuzzification of input and output variables, fuzzy rules, fuzzy associative memory FAM (bank of rules), fuzzy inference, MAXMIN and MAXPROD methods (Mamdani's and Larsen's methods), centroid, defuzzification of the centroid, crisp value of the output variable.
Introduction
In the real world, phenomena and activities are described as exactly as possible on the basis of idealized mathematical models (common in the natural and technical sciences). In some areas, however – e.g. in economics or in the organization of systems of a social character – exact mathematical models are not always available, or they are difficult to formulate, or, even if such a model could be formulated, it would be very complicated, unwieldy and possibly unusable in practice. In these situations, methods describing systems and phenomena on the basis of expert knowledge of the studied problem are increasingly applied. Solutions employ approaches and methods from the field of artificial intelligence, which includes artificial neural networks (imitating thinking) and fuzzy systems based on fuzzy logic (and on so-called approximate reasoning) and using so-called linguistic variables, i.e. expression by means of vague linguistic terms that grade the magnitude of some quantity over a certain range using apt words (labels), e.g. very small, small, medium, large, very large, etc. Although these notions are not sharply delimited – they are "fuzzy" – they usually express, in an acceptable and comprehensible form, the domain of validity of some statement, and they can be modelled by fuzzy sets. The use of "crisp" (binary, Boolean) logic, which admits only the truth (expressed by logical one) or the falsity (expressed by logical zero) of a given statement, is sometimes too coarse and unacceptable; a grading of the degree of truth by values distributed continuously over a defined interval (e.g. from zero to one) is more suitable. Fuzzy logic therefore introduces the notion of the degree of membership of an element in a given fuzzy set, for which μ ∈ ⟨0, 1⟩. The use of vague notions for describing facts by linguistic means has always been common, is often very convenient, and is part of everyday life. A control or decision process is then described by expert rules of the type
IF <antecedent> THEN <consequent>.
The set of these rules forms the so-called knowledge bank (fuzzy associative memory, FAM) and, by means of suitably quantified linguistic terms, captures the control or decision process under consideration – it is an analogue of the program of a sequential digital computer, which operates predominantly on the basis of crisp Boolean logic. Fuzzy logic, which is more general than crisp binary logic (the latter being a special case of fuzzy logic), is described in very extensive literature – see the references at the end of this article.
Structure of a fuzzy system for control and decision support
The structure is shown in Fig. 1. The algorithm and the function of the individual blocks of the fuzzy system can be briefly described as follows:
[Fig. 1 (block diagram): the input variables x1 … xn from the controlled object or decision subject enter (1) the input-variable adjustment block and (2, 3) the fuzzification block; together with (4) the bank of IF-THEN rules (FAM) they feed (5) the fuzzy inference block; (6) the defuzzification block produces crisp values and (7) the output-variable adjustment block delivers the adjusted output variables y1 … ym.]
Fig. 1. Structure of a fuzzy system
1. Acquisition of the values of the input variables x1 to xn (e.g. by means of sensors on the controlled object, or from a suitable data bank).
2. Adjustment of these values (normalization, conversion to dimensionless numerical values).
3. Fuzzification of the input and output variables, i.e. their division into partial fuzzy subsets in the respective universes and the assignment of names (labels) to these subsets in the sense of the above considerations on linguistic variables. The question is how to determine the optimal number of these subsets. The more of them we choose, the finer the control or decision making we achieve, but the computation time increases. A reasonable number of subsets turns out to be 3 to 9; an odd number is usually chosen (symmetric distribution around the mean value). Adjacent subsets must partially overlap so that a smooth transition through the universe is achieved. The degree of overlap requires deeper analysis, but approximately the overlap coefficient can be chosen as the ratio kp = A / B, where A is the width of the overlap interval of two adjacent subsets on the x axis and B is the total interval on the x axis occupied by these subsets. The overlap is often chosen so that the intersection of the two overlapping subsets lies at μ = 0.3 to 0.5.
4. Construction of the bank of rules (FAM, fuzzy associative memory) on the basis of expert knowledge of the problem being solved. The rules are of the already mentioned type IF <antecedent> THEN <consequent>. The dimension of the rule bank equals the number of input variables n, and the number of rules P equals the product of the numbers of subsets pi into which the respective universes of the input variables are divided, i.e. P = p1 · p2 · … · pn. Consider, for example, a system with two input variables, i.e. n = 2 (denote them x1 and x2), and one output variable y. The rule bank will then be two-dimensional (rectangular). Let the universe of x1 be divided into five subsets, linguistically labelled very low (VN), low (N), medium (S), high (V) and very high (VV); let the universe of x2 have three grades: low (N), medium (S) and high (V); and let the output variable y also have three grades: small (M), medium (S) and large (V). The number of rules is P = p1 · p2 = 5 · 3 = 15. Let the expert-defined rules, represented by the recommended entries in the individual cells of the knowledge bank, be distributed according to Fig. 2a. If the number of input variables is three, the rule bank becomes a three-dimensional prism; in general, for n input variables it is an n-dimensional body (Fig. 2b). Furthermore, a separate rule bank must be created for each output variable.
5. Fuzzy inference. The fuzzified input and output variables are fed into the inference block, where operations called fuzzy inferences are performed on the basis of the rules stored in the knowledge bank. The result of these operations are the so-called centroids of the output variables, which are usually subnormal fuzzy sets (not reaching the level μ = 1). The inferences are of the MAXMIN type (Mamdani's method) or of the MAXPROD type (Larsen's method). To illustrate the inference algorithm, consider the specific fuzzified variables shown in Fig. 3.
Rule bank of Fig. 2a (output y for combinations of x1 and x2):

         x1:  VN   N    S    V    VV
  x2 = N:     V    V    V    V    V
  x2 = S:     M    S    S    S    S
  x2 = V:     M    M    M    M    M

[Fig. 2b shows the corresponding three-dimensional rule bank for three input variables x1, x2, x3.]
Fig. 2. Example of a rule bank with two (a) and with three (b) input variables.
For MAXMIN inference we proceed as follows:
for defined values of the input variables, e.g. x1A and simultaneously x2A, we determine which fuzzy subsets these values belong to and with what degrees of membership μ(x1A), μ(x2A). For example, x1A lies in the set V with μ = 0.7 and also in the neighbouring set S, where it reaches μ = 0.35. The value x2A lies in the set N, where it reaches μ = 0.45, and also in the set S with μ = 0.15. On the basis of the rule bank (we use the FAM bank given above), we write down the following rules for this situation:
R1: IF x1 = x1A → V (0.7) AND x2 = x2A → N (0.45) THEN y → V (0.45)
OR
R2: IF x1 = x1A → S (0.35) AND x2 = x2A → S (0.15) THEN y → S (0.15)
OR next rule.
Because the connective AND, denoting intersection, is used in the antecedent, we apply the rule for the intersection of fuzzy sets [1]: μ = min(μV, μN). Rules R1 and R2 hold simultaneously, so we join them with the connective OR, i.e. union. The remaining rules also hold, but they do not manifest themselves in this situation.
The resulting fuzzy set of the output variable (the resulting centroid) is obtained by clipping the set V of the output variable y at the height μ = 0.45 and clipping the set S at the height μ = 0.15. The partial centroids thus obtained, denoted C1 and C2, are joined according to the rule for the union of fuzzy sets:
μC = max(μC1, μC2), as shown in Fig. 3a.
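The MAXMIN evaluation above can be sketched directly; the membership degrees (0.7/0.35 for x1A, 0.45/0.15 for x2A) come from the worked example, and the rule table is the relevant part of the FAM bank of Fig. 2a:

```python
# MAXMIN (Mamdani) inference for the worked example.
mu_x1 = {"S": 0.35, "V": 0.7}        # x1A hits subsets S and V
mu_x2 = {"N": 0.45, "S": 0.15}       # x2A hits subsets N and S

# FAM bank of Fig. 2a: (label of x1, label of x2) -> label of y
fam = {("V", "N"): "V", ("S", "N"): "V",
       ("V", "S"): "S", ("S", "S"): "S"}

# AND in the antecedent -> min; rules combined by OR -> max per output label
clip = {}
for (l1, l2), ly in fam.items():
    if l1 in mu_x1 and l2 in mu_x2:
        strength = min(mu_x1[l1], mu_x2[l2])   # firing strength of the rule
        clip[ly] = max(clip.get(ly, 0.0), strength)

# Clipping heights of the output sets, as in the text: V at 0.45, S at 0.15
print(clip)   # {'V': 0.45, 'S': 0.15}
```

Note that the full bank fires four rules here, not only R1 and R2, but the OR (max) aggregation yields the same clipping heights as in the text.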
[Fig. 3: membership functions of the fuzzified variables – the universe of x1 with subsets VN, N, S, V, VV (x1A hits V at μ = 0.7 and S at μ = 0.35); the universe of x2 with subsets N, S, V (x2A hits N at μ = 0.45 and S at μ = 0.15); the universe of y with subsets M, S, V and the partial centroids C1, C2 obtained by MAXMIN (clipping) and by MAXPROD (scaling).]
Fig. 3. Distribution of the fuzzified variables of the considered fuzzy system and an illustration of MAXMIN and MAXPROD inference.
In the MAXPROD method (Larsen's method), when obtaining the partial centroids we do not clip the respective subsets of the output variable; instead, we scale them down by multiplying them (product, hence PROD) by the minimum value obtained by applying the individual rules. The partial centroids thus obtained are then joined by the union operation (MAX), yielding the resulting centroid (Fig. 3).
6. Defuzzification of the centroid of the output variable is the operation in which we evaluate the centroid in order to obtain a crisp value of the output variable y. There are several defuzzification methods, but the most widely used is finding the coordinate of the centre of gravity (COG) of the centroid, i.e. yCOG, according to

    yCOG = ( ∫ μ(y) y dy ) / ( ∫ μ(y) dy ) ≈ ( Σ_{i=1}^{n} μ(yi) yi ) / ( Σ_{i=1}^{n} μ(yi) ),

where the integrals are taken over the whole universe (from −∞ to ∞). The second part of this formula replaces integration by sums of the terms μ(yi) yi and μ(yi) obtained by sampling the centroid (Fig. 4).
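The sampled form of the COG formula can be sketched as follows. The output sets, their clipping heights and the sampling grid below are hypothetical illustration values; only the max aggregation and the weighted-average formula follow the text:

```python
def tri(y, a, b, c):
    """Triangular membership function with feet a, c and peak b."""
    if y <= a or y >= c:
        return 0.0
    return (y - a) / (b - a) if y <= b else (c - y) / (c - b)

# Resulting centroid: union (max) of two output sets on a hypothetical
# universe [0, 15], clipped at 0.15 and 0.45 as in the MAXMIN example.
def centroid_mu(y):
    return max(min(tri(y, 0, 5, 10), 0.15),    # partial centroid C2 (set S)
               min(tri(y, 5, 10, 15), 0.45))   # partial centroid C1 (set V)

# Sampled COG: y_COG = sum(mu(yi)*yi) / sum(mu(yi))
ys = [i * 0.1 for i in range(151)]             # sampling grid over the universe
num = sum(centroid_mu(y) * y for y in ys)
den = sum(centroid_mu(y) for y in ys)
y_cog = num / den
print(round(y_cog, 2))   # lies between the two set centres, pulled towards V
```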
[Fig. 4: the centroid of the output variable sampled at the points y1, y2, …, yi, …, yn, with the centre of gravity at yCOG.]
Fig. 4. Centroid of the output variable expressed by means of sampling.
If we determine the values yCOG for all values of the input variables and plot them in the coordinate system yCOG = f(x1, x2, …, xn), we obtain the so-called decision surface (for a control process, the control surface) of the considered fuzzy system in (n + 1)-dimensional space.
7. Output-value adjustment block. The crisp output values yCOG obtained at the output of the defuzzifier may be mere dimensionless numbers. For a real system it may be necessary to give them a dimension and adapt them to the required ranges of the quantities for which the system is built (e.g. in real control systems the quantities y may be electric voltages set within defined ranges, or the length of a time interval during which some function of the system is to run, etc.). This is the purpose of the output-variable adjustment block.
Application of the approach
The above procedures can be applied in various fields. Let us give a simplified example from the field of management decision making on financing a certain project to be implemented by an entity S whose quality parameters P1, P2, …, Pn are stored in a database DB. On the basis of these data, the information needed for deciding whether or not to finance the project at the requested cost level can be evaluated by expert means.
Deciding only between the two extreme values, YES and NO, may, however, be too coarse. A finer and perhaps more acceptable approach is to use two additional grades between them, e.g. RATHER YES and RATHER NO. Using a further grade in the middle could lead to an indefinite state, so we settle on the four considered grades of recommendation on whether to implement the project: YES, RATHER YES, RATHER NO and NO.
The cost level of the project can also be quantified by suitable linguistic grades, e.g. HIGH, MEDIUM and LOW. The quality parameter of the implementing entity (for clarity, consider only one significant parameter, P1, although this methodology can be extended to more parameters) is likewise divided by expert judgment into the desired number of linguistically graded values, e.g. VERY SMALL, SMALL, MEDIUM, LARGE, VERY LARGE. The algorithm of this process is shown in Fig. 5.
[Fig. 5 (block diagram): a database providing the quality parameters of the implementing entity, and the expert definition of the fuzzy subsets of the input and output variables and of the rule bank, both feed the fuzzy decision-support system of Fig. 1, which produces the final decision. The figure also shows an example IF-THEN rule bank over x1 (VM, M, S, V, VV) and x2 (N, S, V), and the fuzzified variables, where x1 is the qualitative parameter P1 of the implementing entity, x2 is the cost level, and y is the management decision (N, SN, SA, A) whether to finance the project at the considered cost level.]
Fig. 5. Algorithm of the decision process, an example of the IF-THEN rule bank, and the layout of the fuzzified input and output variables.
Conclusion
This article gives a brief overview of the method of using fuzzy logic for decision-process support. It follows up on the previous tutorial article [1], published in the proceedings of the international conference of EPI Kunovice held in January 2005. The use of fuzzy sets makes it possible to obtain suitable results even in cases where no exact mathematical model of the system is available, or where such a model is very complicated and unsuitable for real-time solution, but expert knowledge of the problem exists. The necessary theoretical foundations can be found in the cited references. The article briefly describes the individual blocks of a fuzzy system, their functions and the most important design principles. Finally, an example is shown of the concept of a fuzzy system for supporting management decision making on financing a certain project. The methodology can also be used for various other tasks in banking, economics, industry, etc. The output of such a system should be taken rather as a qualified recommendation for decision making; the final decision, of course, depends on the judgment of the deciding subject, i.e. the management.
References:
[1] MIKULA, V. Exploitation of fuzzy logic in control and decision processes. In Proceedings of the International Conference. Kunovice : EPI, 2005.
[2] ZADEH, L. A. Fuzzy Sets. Information and Control, 8, 1965, pp. 338–353.
[3] KOSKO, B. Neural Networks and Fuzzy Systems. Prentice Hall Inc., 1992.
[4] NOVÁK, V. Fuzzy množiny a jejich aplikace. Praha : Matematický seminář, SNTL, 1992.
[5] NOVÁK, V. Základy fuzzy modelování. Praha : BEN – Technická literatura, 2000.
[6] POKORNÝ, M. Umělá inteligence v modelování a řízení. Praha : BEN – Technická literatura, 1996.
[7] ZADEH, L. A.; LANGARI, R.; YEN, R.; COX, E. Fuzzy Logic Educational Program. Motorola Co., 1992.
[8] KAUFMANN, A. Initiation Élementaire aux Sous-ensembles Flous à l'Usage des Débutants. École Polytechnique Fédérale de Lausanne, 1992.
[9] KONEČNÝ, V.; PEZLAR, R.; REJNUŠ, O. Fuzzy expertní systémy a rozhodování. Brno : Acta Universitatis Agriculturae et Silviculturae Mendelianae Brunensis, 2001.
Address:
Prof. Ing. Vladimír Mikula, CSc.
Evropský polytechnický institut, s.r.o.
Osvobození 699,
686 04 Kunovice
te./fax.: +420 572 549 018, +420 572 548 788
e-mail: [email protected]
Address:
Ing. Jindřich Petrucha, Ph.D.
Evropský polytechnický institut, s.r.o.
Osvobození 699,
686 04 Kunovice
te./fax.: +420 572 549 018, +420 572 548 788
e-mail: [email protected]
EXAMPLES OF USING CONCEPTS OF PROBABILITY THEORY IN MANAGEMENT
DECISION MAKING
Michal Novák, Břetislav Fajmon
FEKT VUT v Brně
Abstract: Probability theory and statistics play an important part in everyday company and management
life and decision making. In the contribution we show that even very basic concepts can be used in solving
practical problems. The choice of tasks and ways of solving them conform to the curriculum of a course on
probability theory and statistics offered in a combined form of study of bachelor student programmes at
FEEC BUT, which is referred to elsewhere in this proceedings.
Keywords: probability theory, statistics, combined form of study, teaching mathematics
Introductory information
Elsewhere in the proceedings of this conference, [6], the curriculum of a course on probability theory and statistics
offered in the combined form of study at FEEC BUT is mentioned. This course is only a brief, introductory one: it
includes classical and geometrical probability, discrete and continuous random variables, basic terms of statistics, some
very basic probability distributions and an introduction to the issue of statistical testing. Yet even these concepts alone
can be used to solve some important practical tasks.
We are going to discuss various problems which can occur in a number of variations and contexts, and we are going to
show how knowledge of only the basic concepts can help in solving them. The variability of the choice is intentional:
we want to show that the basics of probability theory and statistics can be applied in a number of situations, at almost
any position in a company, without any special deep education, long training or specialised software.
The issue of guarantee period
Let us consider the following situation:
The operating life of a product can be described by a certain distribution of probability. We are ready to tolerate a
certain number of legitimate complaints in the guarantee period. What length of the guarantee period shall we set?
Once we know the distribution of probability describing the operating life of our product, the task can be solved in a
simple way. Let us denote the continuous random variable describing the operating life of our product by X, the
expected value of the operating life by EX, the relative number of legitimate complaints we are ready to tolerate by α,
and the length of the guarantee period by G. Then in fact we need to find such G that P(X < G) = α. For the sake of
simplicity, let us consider the exponential distribution¹. Then we have 1 − e^(−G/EX) = α, which results in
G = −EX ln(1 − α).
Designing a special offer
Let us imagine the following special offer:
We are going to offer something for free. Every eligible person is entitled to do something (cast a die, turn a lottery-wheel, etc.) with a relatively small number of possible outcomes. If certain states are reached, the person is given the advertised thing and entitled to another chance. This keeps repeating as long as the desired states are reached.
¹ The choice follows from the fact that the exponential distribution is one of those taught in the course on probability and statistics referred to in [6]. It is to be noted, however, that the choice of the exponential distribution in this context may be rather special.
We may consider the following specific example:
Beers for free! If you can squeeze in, turn our lottery-wheel! Six numbers only! 5 and 6 mean a beer for free and
another chance! As long as 5 and 6 keep falling, you keep drinking! The only way to stop drinking for free is to pray for
1 to 4!
The nature of this offer can be revealed using the concept of expected value. Let X be a discrete random variable
denoting the number of beers drunk for free by one person and p(x) the probability mass function of X. Let us exclude
the cases of stopping drinking for reasons other than turning out the wrong numbers. We get that p(x) = 2/3^(x+1) for
x ∈ {0, 1, 2, …} and p(x) = 0 otherwise. The expected number of beers drunk for free by one person in such a special
offer can be computed as EX = Σ_{x=0}^{∞} x · 2/3^(x+1), and it turns out that EX = 0.5, which is definitely less than
the offer suggests.
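The series converges quickly, so the value EX = 0.5 can be verified numerically; truncating at x = 60 is an arbitrary cutoff, far beyond what double precision needs:

```python
# p(x) = 2 / 3**(x+1): a free beer is won with probability 2/6 = 1/3 per turn,
# so P(X = x) = (1/3)**x * (2/3), i.e. x wins followed by one loss.
p = lambda x: 2 / 3 ** (x + 1)

total = sum(p(x) for x in range(60))     # the probabilities sum to 1
ex = sum(x * p(x) for x in range(60))    # expected number of free beers
print(round(total, 6), round(ex, 6))     # 1.0 0.5
```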
Designing board games
Many board games contain fields known as "function fields", i.e. fields which require some action or direct the game.
The flow of the game can be controlled or directed by a suitable choice of the positions of these fields, or rather of the
number of fields between them. This becomes apparent if the game is played with more than one die and the length of
each player's move is the sum of the numbers on the dice. The graph shows the values of the probability mass function
of a discrete random variable denoting the possible sums on two six-sided dice.
[Graph: Distribution of probability of the sum s on two dice; possible sums 2 to 12 on the horizontal axis, probability from 0.000 to 0.200 on the vertical axis.]
Seven is the most likely sum on two dice – it is three times more probable than three or eleven and six times more
probable than two or twelve.
The only mathematical concept used in this example is the notorious formula of classical probability
P(A) = |A| / |Ω|, where |A| is the number of positive outcomes of an experiment and |Ω| is the number of all possible
outcomes.
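The whole distribution follows from the classical formula P(A) = |A|/|Ω| with |Ω| = 36 equally likely ordered pairs:

```python
from collections import Counter

# Count favourable outcomes for each possible sum of two six-sided dice.
counts = Counter(d1 + d2 for d1 in range(1, 7) for d2 in range(1, 7))
pmf = {s: counts[s] / 36 for s in range(2, 13)}

print(round(pmf[7], 3))            # 0.167 - seven is the most likely sum
print(round(pmf[7] / pmf[3], 9))   # 3.0 - seven vs. three (or eleven)
print(round(pmf[7] / pmf[2], 9))   # 6.0 - seven vs. two (or twelve)
```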
The issue of guessing in multiple choice tests
There are many objections to multiple choice tests. It has often been suggested that the results may be influenced by
simple guessing. Let us consider the following conditions:
There is a given number of questions (let us denote it by n), each of which offers the same number of options (let us
denote it by k), out of which r options are always correct. Let us suppose that the respondent does not know anything
about the subject matter of the test, yet knows how many options are correct and guesses accordingly. What is the
expected number of correctly answered questions if a question is regarded as answered correctly only if exactly all the
correct options are marked?
The probability that a question is answered correctly under these circumstances is the same for every question and
equals p = 1 / C(k, r), where C(k, r) denotes the binomial coefficient. The table shows the respective probabilities for
given k and r. The expected number of correctly answered questions is EX = n · p. The example of 100 questions with
six options, out of which three are always correct (which can be considered a reasonable compromise), immediately
negates the objection concerning guessing.

  number of correct          number of options (k)
  options (r)           3      4      5      6      7      8
  1                   0.33   0.25   0.20   0.17   0.14   0.13
  2                   0.33   0.17   0.10   0.07   0.05   0.04
  3                    –     0.25   0.10   0.05   0.03   0.02
  4                    –      –     0.20   0.07   0.03   0.01
  5                    –      –      –     0.17   0.05   0.02
  6                    –      –      –      –     0.14   0.04
  7                    –      –      –      –      –     0.13
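The table values and the closing claim can be reproduced with the binomial coefficient; n = 100, k = 6, r = 3 is the example from the text:

```python
from math import comb

def p_guess(k, r):
    """Probability of marking exactly all r correct options out of k by guessing."""
    return 1 / comb(k, r)

# Expected number of correctly guessed questions: EX = n * p
n, k, r = 100, 6, 3
ex = n * p_guess(k, r)
print(round(p_guess(k, r), 2))   # 0.05, as in the table (k = 6, r = 3)
print(ex)                        # 5.0 correct answers out of 100 by pure guessing
```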
Reading specifications
Reading and understanding specifications of products is an important part of everyday company life. Let us e.g.
consider a test with the following specifications:
Point span: 0 – 100; minimum points to pass the test: 50; the random variable X, which describes the results of the test,
has a normal distribution with expected value µ = 62 and dispersion σ² = 25.
It is obvious that such a test is useless: if X ~ No(µ, σ²), we have P(µ − 3σ < X < µ + 3σ) = 0.9973, which means
that failing the proposed test is almost impossible, because in our case P(47 < X < 77) = 0.9973.
Let us consider another simple example:
The random variable X describing the time before a problem with a machine occurs has an exponential distribution of
probability. The problem occurs once in H hours. We denote by T the operating time before the problem occurs for the
first time and by p the probability that the machine works without a problem for longer than T. For H = 2000 and
p = 0.99 we get that T is as short as 20 hours.
This could make us reject such a machine, since 20 hours of problem-free operating time is indeed not acceptable.
Yet with some background knowledge of probability theory we can easily object to such reasoning, since as p
decreases, T rapidly increases. The general formula for computing T is, in our case of the exponential distribution,
T = −H ln p, which follows from P(X ≥ T) = p, where X, exponentially distributed with mean H, describes the time
before the problem occurs, given that the problem occurs once in H hours². The following table gives values of T for
some values of p and H.
H.
  p          0.99   0.98   0.97   0.96   0.95   0.94   0.93   0.92   0.91   0.90   0.85   0.80
  H=2000  T  20.1   40.4   60.9   81.6  102.6  123.8  145.1  166.8  188.6  210.7  325.0  446.3
  H=3000  T  30.2   60.6   91.4  122.5  153.9  185.6  217.7  250.1  282.9  316.1  487.6  669.4
  H=4000  T  40.2   80.8  121.8  163.3  205.2  247.5  290.3  333.5  377.2  421.4  650.1  892.6
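Each row of the table follows from T = −H ln p; the check below regenerates a selection of the H = 2000 row:

```python
import math

def min_uptime(H, p):
    """T such that P(X >= T) = p for exponential failures once per H hours."""
    return -H * math.log(p)

row = [round(min_uptime(2000, p), 1)
       for p in (0.99, 0.98, 0.97, 0.96, 0.95, 0.90, 0.85, 0.80)]
print(row)   # [20.1, 40.4, 60.9, 81.6, 102.6, 210.7, 325.0, 446.3]
```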
Comparing results
Comparing data is a necessity in a number of situations. However, wrong conclusions are often drawn from “comparing
the incomparable”. Let us consider the following data sets with the same span of possible results, where
xi ∈ {10,11,...,19,20} :
Set A: xi : 12, 12, 12, 13, 14, 14, 15, 15, 15, 16, 18, 19, 20
² Naturally, another piece of knowledge would be necessary to support or question the fact that the exponential distribution is used in this respect.
Set B: xi : 10, 10, 11, 11, 11, 13, 16, 17, 18, 19, 19, 20, 20
Value xi = 18 occurs in both data sets. The average value of both data sets is x̄ = 15, therefore in both data sets
xi = 18 is better than the average. Yet after we employ the standardized z-values, which take into account both the
average and the dispersion, we get that in set A the respective z-value is z18 = 1.22 while for set B we have
z18 = 0.77, which means that the value xi = 18 is relatively much better in set A even though its distance from the
(same) average is the same in both sets.
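The z-values can be recomputed from the two data sets. With the population formula for the standard deviation, set B indeed gives z ≈ 0.77 for x = 18, while set A gives roughly 1.18 rather than the quoted 1.22 (presumably a slightly different variance estimate); the qualitative conclusion is unchanged:

```python
def z_value(data, x):
    """Standardized z-score of x using the population standard deviation."""
    n = len(data)
    mean = sum(data) / n
    var = sum((v - mean) ** 2 for v in data) / n
    return (x - mean) / var ** 0.5

set_a = [12, 12, 12, 13, 14, 14, 15, 15, 15, 16, 18, 19, 20]
set_b = [10, 10, 11, 11, 11, 13, 16, 17, 18, 19, 19, 20, 20]

print(round(z_value(set_a, 18), 2))   # 1.18: 18 is relatively far above average
print(round(z_value(set_b, 18), 2))   # 0.77: same distance, but less remarkable
```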
The issue of statistical testing
Since only z-tests are dealt with throughout the course in the combined form of study at FEEC BUT, which is referred
to in [6], let us consider only this type of statistical test. For the sake of simplicity, let us deal with the test
µ ≠ constant. The mathematical background of this test is very simple: once we set the hypotheses, we are looking for
such xk that P(X ≥ xk) = α/2, where X ~ No(µ, σ²) and α is the given significance level. The xk can easily be
obtained with the help of the respective z-value, since the parameters µ and σ² (thus also σ) are known.
This simple background can be applied in a number of various situations. Typically:
We know that the operating time of a machine (random variable X) can be described by the normal distribution of
probability as X~No( µ , σ 2 ). We are going to test a new technique designed to increase the operating time. Some
machines have been enhanced with the technique and their operating time measured. Given the significance level
α what conclusion on the quality of the technique can be drawn from the values?
Statistical testing can easily be abused in order to manipulate the recipient into accepting wrong conclusions. This is
especially true for the choice of the significance level and the misinterpretation of the µ ≠ constant versus
µ > constant or µ < constant tests. Given a suitable significance level, and disregarding the nature of the test, almost
any hypothesis may seem acceptable.
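A minimal sketch of the two-sided z-test described above; the sample of "enhanced machine" measurements is hypothetical, while µ = 62 and σ = 5 are reused from the specification example:

```python
from statistics import NormalDist, mean

def z_test_two_sided(sample, mu0, sigma, alpha=0.05):
    """Two-sided z-test of H0: mu = mu0 with known sigma.
    Returns (z statistic, critical value, reject H0?)."""
    n = len(sample)
    z = (mean(sample) - mu0) / (sigma / n ** 0.5)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # P(Z >= z_crit) = alpha/2
    return z, z_crit, abs(z) > z_crit

# Hypothetical operating times of machines enhanced with the new technique.
sample = [66, 69, 63, 71, 68, 65, 70, 64]
z, z_crit, reject = z_test_two_sided(sample, mu0=62, sigma=5)
print(round(z, 2), round(z_crit, 2), reject)   # 2.83 1.96 True
```

Here |z| exceeds the critical value, so at α = 0.05 the hypothesis µ = 62 would be rejected in favour of a changed mean.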
Conclusion
Knowledge of probability theory and statistics is an integral part of responsible decision making in many aspects of
company routine. It is often believed that these theories are too abstract or too complex to be used by non-trained staff
or without specialised software. The above contribution does not challenge this assumption, which is naturally valid in a
great many contexts. It rather complements the idea by suggesting that there exist real-life situations which can be
solved by a surprisingly modest amount of knowledge of probability theory.
References:
[1] BAŠTINEC, J. Matematika pro bakaláře na FEKT VUT. In Matematika na vysokých školách. Praha : 2003.
[2] BAŠTINEC, J.; DIBLÍK, J. Výuka matematiky v magisterském studiu na FEKT VUT. In XXIII International Colloquium on the Acquisition Process Management, Sborník abstraktů a elektronických verzí příspěvků na CD-ROM. Brno : Univerzita obrany, 2005.
[3] CASELLA, G.; BERGER, R. L. Statistical Inference, 2nd ed. Duxbury Thompson Learning, 2002.
[4] FAJMON, B.; RŮŽIČKOVÁ, I. Matematika 3. Brno : UMAT FEKT VUT, 2003, available from https://www.feec.vutbr.cz/et/skripta/umat/Matematika_3_S.pdf.
[5] FOTR, J.; DĚDINA, J. Manažerské rozhodování. Praha : Vysoká škola ekonomická, Fakulta podnikohospodářská, 1994.
[6] NOVÁK, M. Probability theory and statistics in the combined form of study of the bachelor student programmes at FEEC BUT. This proceedings.
[7] ZAPLETAL, J. Poznámka k rozhodování za rizika a nejistoty. This proceedings.
[8] ZAPLETAL, J. Základy počtu pravděpodobnosti a matematické statistiky. Brno : PC-DIR, 1995.
Address:
Mgr. Michal Novák, Ph.D.
Ústav matematiky,
FEKT VUT v Brně
Technická 8
616 00 Brno
tel.: +420-541143135
e-mail: [email protected]
Address:
RNDr. Mgr. Břetislav Fajmon, PhD.
Ústav matematiky,
FEKT VUT v Brně
Technická 8
616 00 Brno
tel.: +420-541143135
e-mail: [email protected]
OPTIMIZATION OF MATERIAL CHARACTERIZATION BY ADAPTIVE TESTING
Gábor Vértesy¹, Ivan Tomáš², István Mészáros³
¹ Research Institute for Technical Physics and Materials Science, Hungarian Academy of Sciences
² Institute of Physics, Academy of Sciences of the Czech Republic
³ Department of Materials Science and Engineering, Budapest University of Technology and Economics
Abstract: A new procedure, called Adaptive Testing, was applied for non-destructive characterization of
cold-rolled austenitic stainless steel samples. The flat samples were magnetized by an attached yoke, and
sensitive, reliable descriptors of their plastic deformation strain were obtained from the proper evaluation,
based on the measurements of series of magnetic minor hysteresis loops, without magnetic saturation of
the samples. The results were compared with the results of conventional, reference measurements.
Significant increase of sensitivity was found if Adaptive Testing was applied.
Keywords: Adaptive testing, nondestructive material evaluation, material parameter optimization
1. Introduction
Magnetic measurements are frequently used for characterizing changes in the structure of ferromagnetic materials,
because their magnetization processes are closely related to the microstructure of the materials. This fact also makes
magnetic measurements an evident candidate for non-destructive testing, for the detection and characterization of any
modifications and/or defects in materials and in products manufactured from such materials [1,2]. The majority of
traditional magnetic investigations of variations in structural material properties simply make use of several parameters
of the saturation-to-saturation major hysteresis loop (coercive force, remanent induction, saturation magnetization,
permeability), see e.g. [3,4]. These traditional parameters are well established for a general account of the magnetic
properties of ferromagnetic samples, but they were never optimized for the magnetic reflection of the various structural
properties of the measured specimens and of their current alterations.
An alternative, sensitive and experimentally friendly approach to this topic, the Adaptive Testing (AT) method,
suggests a procedure of accumulating data on a selected physical process, whose parameters are systematically varied
over ranges broad enough to obtain the most complete picture of the behavior. Subsequent analysis of the recorded
data leads to a large family of "degradation curves", i.e. of potential calibration curves, the most satisfactory of which is
then picked by a software algorithm as the optimally adapted calibration curve for subsequent tests of unknown samples
of the inspected material altered in the expected way. The method of Magnetic Adaptive Testing (MAT), based on the
measurement of magnetic minor loops, was considered recently in [5] and [6]. MAT introduced general magnetic
descriptors of diverse variations of non-magnetic properties of ferromagnetic materials, optimally adapted to the
investigated property and material. According to this method, sets of minor hysteresis loops are scrutinized, and
sensitive descriptors of the variation of the material property are identified.
In this work an example of the application of AT is given. We describe an experimental search for the calibration
curve(s) best adapted to the magnetic examination of a particular steel material subjected to cold rolling. Subsequent
testing of any unknown sample of the same kind of steel would then be expected to indicate the level of the applied
plastic strain through the magnitude of the chosen magnetic feature of the material.
2. ADAPTIVE TESTING OF COLD-ROLLED AUSTENITIC STEEL
Titanium-stabilized austenitic stainless steel of 18/8 type was studied. Stripe-shaped specimens were annealed at 1100 °C
for 1 hour. They were then quenched in water in order to prevent any carbide precipitation and to achieve a
homogeneous austenitic structure as the starting material structure. The as-prepared stainless steel specimens were
cold-rolled at room temperature to different strains (from 33 to 63% strain for the investigated specimens). For the
reference measurement, a major (saturation-to-saturation) magnetic hysteresis loop was taken from each sample.
A specially designed Permeameter [6] with a magnetizing yoke was applied for the measurement of families of minor loops
of the differential permeability of the magnetic circuit. The magnetizing coil wound on the ferrite yoke is driven by a
triangular waveform current with step-wise increasing amplitudes and a fixed slope magnitude in all the triangles. This
produces a time variation of the effective field, ha(t), in the magnetizing circuit, and a signal is induced in the pick-up coil.
As long as ha(t) sweeps linearly with time, the voltage signal, U(ha,hb), in the pick-up coil is proportional to the
differential permeability, µ(ha,hb), of the magnetic circuit:
µ(ha, hb) = const · U(ha, hb), where U(ha, hb) ∝ ∂B(ha, hb)/∂ha · dha/dt
The Permeameter works under the full control of a PC, which sends the steering information to the function generator
and collects the measured data; an input-output data acquisition card accomplishes the measurement. The computer
records a data file for each measured family of minor "permeability" loops, corresponding to each measured
sample.
The experimental raw data are processed by a data-evaluation program, which divides the originally continuous data of
each measured sample into a family of individual permeability half-loops. Then the family of either the top half-loops,
the bottom half-loops, or their average is chosen for further processing. The program filters the experimental noise and
interpolates the experimental data into a regular square grid of elements, µij ≡ µ(hai, hbj), of a "µ-matrix" with a
pre-selected field step. The consecutive series of µ-matrices, each taken for one sample with strain, ε, of the consecutive
series of the more-and-more deformed material, describes the magnetic reflection of the material's plastic deformation.
The matrices are processed by a matrix-evaluation program, which normalizes them by a chosen reference matrix and
arranges all the mutually corresponding elements µij of all the evaluated µ-matrices into a µij(ε) table. Each column
of the table numerically represents one µij(ε)-degradation function of the material. The matrix-evaluation
program calculates the sensitivity of each degradation function and draws a "sensitivity map" in the plane of the field
coordinates (hai, hbj) ≡ (i, j). This map shows the relative sensitivity of each µij(ε)-degradation function with respect to the
plastic deformation strain, ε, of the investigated material. The sensitivity of each degradation function is computed as the
slope of its linear regression and is expressed by a shade in the sensitivity map figure.
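The sensitivity computation described above can be sketched in a few lines of Python. This is an illustrative reconstruction, not the authors' actual matrix-evaluation program; the toy data, matrix size and field grid are invented:

```python
import numpy as np

def sensitivity_map(mu_matrices, strains, ref_index=0):
    """Normalize each mu-matrix by a chosen reference matrix, then, for every
    field coordinate (i, j), fit a line mu_ij(eps) ~ a*eps + b; the slope a
    is the sensitivity of that degradation function."""
    mats = np.asarray(mu_matrices, dtype=float)   # shape (n_samples, n_i, n_j)
    mats = mats / mats[ref_index]                 # normalization by the reference
    n_samples, rows, cols = mats.shape
    slopes = np.empty((rows, cols))
    for i in range(rows):
        for j in range(cols):
            slopes[i, j] = np.polyfit(strains, mats[:, i, j], 1)[0]
    return slopes

# invented toy data: 4 samples (strains 33-63 %), 3x3 mu-matrices
strains = np.array([33.0, 45.0, 55.0, 63.0])
mu = np.array([np.full((3, 3), 1.0 + 0.01 * s) for s in strains])
smap = sensitivity_map(mu, strains)
# the most sensitive field coordinates (i, j) are where |slope| is largest
best_ij = np.unravel_index(np.argmax(np.abs(smap)), smap.shape)
```

The sensitivity map of Fig. 1 then corresponds to plotting `smap` as a shaded image over the (ha, hb) grid.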
[Figure omitted: sensitivity map in the (ha, hb) plane; ha from -750 to 1250 A/m, hb from -250 to 1750 A/m.]
Fig. 1: Map of relative sensitivity of the µ-degradation functions, µij(ε) ≡ µ(hai, hbj)(ε). (The crossing point of the lines indicates the most sensitive µij(ε)-degradation function.)
Permeability matrices of all the samples were calculated from the measured data, and the matrix-evaluation process
was applied to compare the sensitivity of all the individual degradation functions, µij(ε), each corresponding to a pair of
field coordinates (hai, hbj). The sample with the lowest strain was used for the normalization. Fig. 1 shows the map of
the relative sensitivity of the µij(ε)-degradation functions. The elements depicted in the sensitivity map as the "whitest"
correspond to the most sensitive µij(ε)-degradation functions. The most sensitive element, characterized by ha = 700 A/m
and hb = 1200 A/m, corresponds to the top of the "white" area; its location is shown by the two crossing perpendicular
lines in Fig. 1. It can also be seen from the sensitivity map that within the "whitest" region the µij(ε)-degradation functions
vary only very slightly, so the neighbours of the chosen element (ha = 700 A/m, hb = 1200 A/m) provide practically the
same value. This makes the choice of the proper, sensitive descriptor very reliable. The most sensitive µij(ε)-degradation
function is shown in Fig. 2. The Bmax values, determined from the reference measurement on the major loops,
are also indicated in the figure.
[Figure omitted: µ-degradation functions (ha = 700 A/m, hb = 1200 A/m, and reference) plotted against plastic strain, 30-65 %.]
Fig. 2: The most sensitive µij(ε)-degradation function and the result of the reference measurement.
By integrating the permeability along the field ha, hysteresis loops and hysteresis-loop B-matrices can be obtained. The
B-matrices contain the same information as the µ-matrices; however, the presentation of the ε-dependences of the
corresponding Bij(ε)-degradation functions is different and sometimes advantageous. After the same matrix-evaluation
procedure and the corresponding normalization, all the Bij(ε)-degradation functions for 0 ≤ hbj ≤ 2000 A/m,
-hbj ≤ haj ≤ +hbj, are shown in Fig. 3. It is worth mentioning that all the Bij(ε)-degradation functions show the same
shape of dependence on the plastic deformation.
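Numerically, the integration of the permeability along ha amounts to a cumulative sum over the field grid. A minimal sketch, assuming a uniform field step and arbitrary units:

```python
import numpy as np

def b_matrix(mu_matrix, field_step):
    """Cumulative trapezoidal integration of the permeability along the ha
    axis (axis 0 here) yields hysteresis-loop values B(ha, hb); B is defined
    only up to an integration constant."""
    mu = np.asarray(mu_matrix, dtype=float)
    increments = 0.5 * (mu[1:] + mu[:-1]) * field_step  # trapezoid areas
    return np.concatenate([np.zeros((1, mu.shape[1])),
                           np.cumsum(increments, axis=0)])

# constant permeability -> B grows linearly with ha
b = b_matrix(np.full((4, 2), 2.0), field_step=1.0)
```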
[Figure omitted: B-degradation functions, curve labels a-f / A-F, values from -200 to 200, plotted against plastic strain, 30-65 %.]
Fig. 3: Bij(ε)-degradation functions, as functions of the plastic strain.
The sensitivity map for the Bij(ε)-degradation functions is presented in Fig. 4. In contrast to Fig. 1, here the white (and
also black) areas indicate the regions where the magnitudes of the matrix elements vary rapidly, jumping from one
element to its neighbour. For instance, the elements corresponding to the steepest slopes in Fig. 3 are located in the
black and white areas of Fig. 4. These descriptors are very sensitive, but their reliability is questionable, because
large jumps of values can occur when moving from one element to its neighbour (even from a large positive value
to a large negative one). For this reason, choosing descriptors from the homogeneously gray areas seems to
be the most reliable option. The optimal choice of the Bij(ε)-degradation functions is shown in Fig. 5.
[Figure omitted: sensitivity map in the (ha, hb) plane; ha from -600 to 600 A/m, hb from 600 to 1800 A/m.]
Fig. 4: Map of relative sensitivity of the B-degradation functions, Bij(ε) ≡ B(hai, hbj)(ε).
[Figure omitted: B-degradation function (ha = 100 A/m, hb = 1000 A/m) plotted against plastic strain, 30-65 %.]
Fig. 5: The optimal choice of the Bij(ε)-degradation functions.
A third type of matrix (the µ'-matrix) can also be obtained. This is the matrix of the derivative of the permeability with
respect to the field ha (the first derivative of the permeability, ∂µ/∂ha, or the second derivative of the magnetic induction,
∂²B/∂ha²). The sensitivity map of the µ'ij(ε)-degradation functions is shown in Fig. 6. The optimal µ'ij(ε)-degradation
function (a compromise between sensitivity and reliability, as explained above) is shown in Fig. 7.
[Figure omitted: sensitivity map in the (ha, hb) plane; ha from 0 to 1200 A/m, hb from -200 to 1600 A/m.]
Fig. 6: Map of relative sensitivity of the µ'ij(ε)-degradation functions.
[Figure omitted: µ'-degradation function (ha = 450 A/m, hb = 950 A/m) plotted against plastic strain, 30-65 %.]
Fig. 7: Optimal choice of µ'ij(ε)-degradation functions, as a function of plastic strain.
3. DISCUSSION
As already mentioned, the originally paramagnetic austenite specimens became more and more ferromagnetic as
a consequence of the applied cold rolling. All austenitic stainless steels are paramagnetic in the annealed, fully
austenitic condition, and the only magnetic phase that can be induced (e.g. by cold rolling) in low-carbon
austenitic stainless steels is the bcc α′-martensite, which is highly ferromagnetic. This process can easily be followed
by magnetic measurements.
By applying the adaptive testing method described above, the relatively small differences between the magnetic
characteristics of the investigated sample series can be determined much more sensitively than by conventional
methods. By using the µij(ε)-degradation functions derived from the permeability matrices, it is possible to increase the
sensitivity of the detection of the appearing ferromagnetic α′-martensite by about a factor of 3 compared with
the "classical" major-loop approach, if proper descriptors are used. (For comparison see Fig. 2.) The reliability of this
determination is very good, as illustrated by the sensitivity matrix: a wide plateau is seen, where the sensitivity
of the µij(ε)-degradation functions varies only very slightly if their field coordinates are mispositioned. In other
words, the sensitivity (and also the reliability) of the measurement is not influenced by the exact choice of the matrix
element.
The sensitivity with respect to the applied strain increases if the hysteresis-loop B-matrix parameters are considered
instead of the permeability matrix. The sensitivity map shows the area where the most sensitive and/or most reliable
descriptors are found. If the descriptors are chosen very carefully, a ratio of more than 1:7 can be obtained between the
least and the most ferromagnetic of the investigated samples. The most sensitive area in the Bij-sensitivity map is not as
large (there is no such wide plateau) as in the case of the sensitivity of the µij(ε)-degradation functions, but there is a
well-defined area where the most sensitive descriptors are positioned.
The scatter of the parameters is very low, as can be seen in Fig. 3. It is possible to increase the sensitivity further: a ratio
of more than 1:100 can be reached, but in that case the exact choice of the matrix elements becomes crucial. The shapes
of the dependences of all the Bij-matrix elements on the plastic deformation are very similar to each other. The
descriptors evaluated from the hysteresis-loop B-matrices therefore seem to be especially suitable for a very sensitive
characterization of the changes introduced by cold rolling in the austenitic material.
It is possible to increase the sensitivity even more substantially, and at the same time to avoid possible mistakes from
any mispositioning, by choosing the first derivative of the permeability (the µ'-matrix) as the source of descriptors. In the
case of the µ'-matrices, the largest sensitivity is obtained if the most reliable area is taken; for the presented series of
samples it is 1:14 (see Fig. 7). On the other hand, the scatter of the µ'-matrix elements is the largest if all the elements
are taken into account, but even in this case an area can be found from which sufficiently reliable elements can be taken.
4. CONCLUSIONS
This paper was dedicated to the indirect experimental measurement of variable material properties, in particular to the
question of how to determine, from all the available features of the chosen physical process (here the process of
magnetization of the samples by an external field) employed for the description of the investigated material variation,
the one feature that is best adapted to the particular material under inspection, to the particular variation/degradation of
that material, and to the particular demands declared by the examiner.
The introduced approach of Adaptive Testing of materials suggests a procedure of collecting data on the selected
physical process, whose parameters are systematically varied over ranges broad enough to obtain the most complete
picture of the behavior. Subsequent analysis of the recorded data leads to a large family of degradation functions, the
most satisfactory of which is then defined as the optimally adapted calibration curve for subsequent tests of unknown
samples of the investigated material altered in the expected way.
As shown in the presented experimental example, an optimum selection among the degradation functions has to take
into consideration not only the desired high sensitivity of the calibration curve, but must also demand a low
experimental error of the measured values constituting the curve and a low curvature of the sensitivity surface (referred
to as the stability of the calibration curve) around the selected AT-coordinate point.
The presented example demonstrates that AT, by focusing on the concrete explored material and the concrete explored
degradation, introduces a kind of trade-off involving the sensitivity, stability, smoothness, shape and experimental
friendliness of the optimum calibration curves. This focus usually leads to excellent results of AT, which are
substantially more advantageous for the description of the explored material variations than the plain use of the
traditional descriptors, which focus on the employed physical process itself.
ACKNOWLEDGEMENTS
The financial support of the Hungarian Scientific Research Fund (T-035264 and T-062466) and of the Academy of
Sciences of the Czech Republic (projects No. 1QS100100 and AVOZ 10100520) is appreciated.
REFERENCES
[1] JOHNSON, M. J.; LO, C. C. H.; ZHU, B.; CAO, H.; JILES, D. C. J. Nondestruct. Eval. 20 (2000) 11.
[2] JILES, D. C. Magnetic methods in nondestructive testing. In: K. H. J. Buschow et al. (Eds.), Encyclopedia of Materials Science and Technology. Oxford : Elsevier Press, 2001. p. 6021.
[3] JILES, D. C. NDT Int. 21 (1988) 311.
[4] DEVINE, M. K. Min. Met. Mater. (JOM) (1992) 24.
[5] TOMÁŠ, I. J. Magn. Magn. Mat. 268 (2004) 178.
[6] TOMÁŠ, I.; PEREVERTOV, O. In: T. Takagi and M. Ueasaka (Eds.), JSAEM Studies in Applied Electromagnetics and Mechanics, 9 (2001) 533.
Address:
Dr. Gábor Vértesy, Dr.Sc.
Research Institute for Technical Physics and Materials Science
Hungarian Academy of Sciences,
H-1525 Budapest,
P.O.B. 49, Hungary
phone:+3613922677
fax: +3613922226
e-mail: [email protected]
Address:
RNDr. Ivan Tomáš, CSc.
Institute of Physics, Academy of Sciences of the Czech Republic
Na Slovance 2,
18221 Praha, Czech Republic
phone: +420266052177
fax: +420286890527
e-mail: [email protected]
Address:
Dr. István Mészáros
Department of Materials Science and Engineering,
Budapest University of Technology and Economics,
H-1111 Budapest,
Goldmann ter 3, Hungary
phone: +3614632883
fax: +3614633250
e-mail: [email protected]
USING THE COMPACT GENETIC ALGORITHM TO OPTIMIZE A PRODUCTION PROCESS
FOR PROFIT MAXIMIZATION
Jiří Kostiha
VUT Brno
Abstract: The article describes the compact genetic algorithm, its specifics and its differences from the
classical genetic algorithm. It includes a demonstration of the algorithm's effectiveness on a standard test
function and its application to a concrete economic optimization task: maximizing the profit of a production process.
Keywords: compact genetic algorithm, evolutionary optimization, multidimensional problem, solution,
profit maximization, production process
Introduction
Evolutionary algorithms (EA) are, in principle, a technical interpretation of biological processes. The basic
idea, described by Darwin, is the survival of the fittest individuals. During their lives, individuals cross-breed and
mutate, producing new individuals who then enter the process of selection and further crossover. The individuals best
equipped for life have the greatest chance of reproducing, while weaker individuals have only a small one. This makes
the continuation of the genetic lines of strong individuals very likely, while the lines of weaker individuals die out.
From this principle, the common features of EAs can be identified:
• Inspired by natural evolutionary processes.
• They work with a population of individuals.
• An iterative approach to the solution; one iteration cycle corresponds to one population.
• A stochastic and heuristic character.
• They provide a solution that need not be optimal but is suitable, and they provide it in a short time.
Genetic algorithms (GA) are so far the most successful evolutionary algorithms of all. Thanks to their efficiency and
very good results, they are used in many branches of human activity: for example, in creating work schedules for
machines in factories, in game theory, in managerial economics, in solving optimization problems of multimodal
functions, in robot control, in recognition systems, and in artificial-life tasks. Genetic algorithms are also used for
solving so-called NP-complete problems, where almost all other algorithms fail, i.e. where the computation time
depends exponentially or factorially on the number of variables.
Over the time of their existence, many modifications of EAs have arisen, and new ones are still being developed. The
Compact Genetic Algorithm (CGA) is a special case. It belongs to the GA family, yet it is built on a completely
different basis; the main differences lie in the technique of generating the population and in the recombination operators.
[Figure omitted.]
Figure 1: Hierarchy of evolutionary algorithms (evolutionary algorithms ⊃ genetic algorithms ⊃ classical genetic algorithm, compact genetic algorithm)
Comparison of the CGA and the classical GA
The principle of both the classical GA and the CGA, as of all EAs, is to solve tasks that can be described by a so-called
fitness function and to search for its global extremum. Depending on the type of task, this may be minimization or
maximization. The fitness function consists of the objective function itself and of constraints; both are characteristic
of the given problem and have to be constructed from its specification.
The basis of a GA are chromosomes, which in the mathematical interpretation correspond to individuals, sometimes
also called agents. The individuals form a population, which changes with successive generations. The population
represents information about the parts of the search space that have been sampled so far. The standard operators of
crossover, mutation and selection then determine how this information is used when generating further potentially
promising solutions. These operators form the core of the classical GA.
The CGA uses a probabilistic description of the current state of the computation (a model describing the current
population). The crossover, mutation and selection operators are replaced by sampling the search space according to the
given probabilistic model. Its basis is the probabilistic chromosome (PCh), written as a single n-dimensional real
vector. The algorithm does not work with a real population but with this vector, from which a real chromosome (an
individual) is generated during the computation. Each position of the vector expresses the probability of the value 1
occurring at that position of the chromosome. The initial state is 0.5 at all positions of the PCh.
Figure 2 shows the PCh for a simple four-bit task with two unknown parameters represented by genes 1 and 2, together
with the most probable binary value of the generated individuals. Black digits represent settled values, which can no
longer change; grey digits are unsettled and may still change during the computation.
Probabilistic chromosome:     Gene 1: 0  0.8  0.6  1     Gene 2: 0.3  0.5  0.2  1
Most probable binary form:    Gene 1: 0  1    1    1     Gene 2: 0    1    0    1
Figure 2: Probabilistic chromosome
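Generating a concrete individual from the PCh can be sketched as follows (an illustrative snippet; the four-bit PCh is the toy example from Figure 2):

```python
import random

def sample_individual(pch, rng=None):
    """Draw each bit independently: position k of the PCh gives the
    probability that bit k of the generated individual equals 1."""
    rng = rng or random.Random()
    return [1 if rng.random() < p else 0 for p in pch]

# the four-bit, two-gene example; positions with p = 0 or p = 1 are settled
pch = [0.0, 0.8, 0.6, 1.0, 0.3, 0.5, 0.2, 1.0]
individual = sample_individual(pch, random.Random(0))
```

Positions with probability exactly 0 or 1 always produce the same bit, which is precisely the "settled" state described above.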
The computation evolves by modifying the PCh according to the generated population in some suitable way, so that the
values of the generated individuals approach the global extremum. The computation flow of the classical GA and of the
CGA is shown in Figure 3.
[Figure omitted.]
Figure 3: Comparison of a) the classical genetic algorithm (population → crossover → mutation → selection) and b) the compact genetic algorithm (probabilistic chromosome (PCh) → population → modification of the PCh)
The PCh is modified according to certain rules, of which there can be many. The rule used for the results presented
here is the following:
• select the best and the worst individual from the population;
• if their bits differ and the better individual has a 1, add 0.01 (1 %) to the PCh value at that position;
• if their bits differ and the better individual has a 0, subtract 0.01 (1 %) from the PCh value at that position;
• if the bits have the same value, do nothing.
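The update rule above can be written directly (a sketch; the 1 % step is the value stated in the text):

```python
def update_pch(pch, best, worst, step=0.01):
    """Wherever the best and worst individuals disagree, shift the PCh
    probability at that position toward the best individual's bit value."""
    updated = list(pch)
    for k, (b, w) in enumerate(zip(best, worst)):
        if b != w:
            updated[k] += step if b == 1 else -step
    return updated

new_pch = update_pch([0.5, 0.5, 0.5], best=[1, 0, 1], worst=[0, 1, 1])
```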
Elitism
Elitism is a technique used to remember the best individual, or several best individuals, of the previous population and
to insert them automatically into the following population. This guarantees the survival of the best individuals with
certainty and in many cases improves the efficiency of the algorithm. When applied in the CGA it makes sense, as
follows from the principle of the algorithm, to carry over only the single best individual. In the next generation, all
newly created individuals, i.e. the whole new population, are compared with the elite individual of the previous
generation; a new best individual is selected and marked as elite. The bits of the PCh are then modified according to the
best and the worst individual, the best individual in this case being the elite one.
Stabilization bounds
The stabilization condition (see the termination conditions below) can lead to premature settling in a local extremum,
or at some value caused by the premature settling of one or more bits. In such cases the algorithm can return incorrect
values not only when it gets stuck in a local extremum, but also when it gets stuck "somewhere" on the function because
of a settled bit that can no longer change its value. I therefore extended the algorithm with stabilization bounds, shown
in Figure 4. The interval b is the modification region in which the PCh values may move, whereas the intervals a at the
edges are forbidden; stabilization then occurs at these bounds. This preserves a non-zero probability of generating
individuals with the opposite bit value, and thus the possibility of a bit flipping and settling at the opposite value.
Premature, undesirable stabilization of PCh bits is prevented in this way.
[Figure omitted.]
Figure 4: Stabilization bounds (forbidden intervals a at the edges near 0 and 1, modification region b between them)
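The stabilization bounds can be enforced by simple clamping; the 2 % margin used here is an illustrative choice, in the spirit of the small value on the order of a few per cent recommended by the author:

```python
def clamp_pch(pch, margin=0.02):
    """Keep every PCh probability inside [margin, 1 - margin] (the interval b
    of Figure 4), so that no bit can settle at exactly 0 or 1."""
    return [min(max(p, margin), 1.0 - margin) for p in pch]

clamped = clamp_pch([0.0, 0.5, 1.0])
```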
Termination conditions
There can be many conditions for terminating the computation. The classical GA has two basic ones: termination after
a given number of generations, and termination after reaching a given computational accuracy; the latter mainly
concerns test functions, where the value of the sought extremum is known. The CGA adds one more important
condition: termination when the PCh values stabilize at 0 and 1. This condition is interesting in that the algorithm, in a
sense, recognizes by itself that it has reached an extremum and can no longer generate a different value. The computed
value need not be the global extremum itself (this depends on the complexity of the problem), but it is a suitable value
that can be considered a very good solution of the investigated problem. On stabilization it is desirable to terminate the
algorithm, because no better value can be found and all generated individuals are identical, corresponding to the
stabilized PCh. With stabilization bounds this is no longer strictly true, but the probability of finding a better solution is
very small and the algorithm becomes inefficient.
Test task
To test the CGA I used the standard Rastrigin test function, shown in Figure 6. The advantage of this function is that its
formula is the same for any number of unknown parameters, so testing is possible in any dimensionality without
changing its mathematical form. Another advantageous property is that the global extremum lies at zero on all axes in
all dimensionalities.
Parameters:
• interval -5.12 to +5.12 on each axis
• 16-bit encoding of the genes
• maximum possible accuracy 0.00015625 for the chosen encoding and interval size
• 5n - 1 local extrema, n = number of dimensions
• the reported numbers of generations needed to find the global minimum are arithmetic means of 10 runs
• hardware used: a computer with an AMD Athlon 1600+ processor
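For reference, the Rastrigin function in its standard form can be written in a few lines (the global minimum is 0 at the origin, in any dimensionality):

```python
import math

def rastrigin(x):
    """Standard Rastrigin test function for any number of parameters;
    global minimum f = 0 at x = (0, ..., 0)."""
    return 10 * len(x) + sum(xi * xi - 10 * math.cos(2 * math.pi * xi)
                             for xi in x)
```

Because the same expression works for any length of `x`, the dimensionality of the test can be changed without touching the function itself, exactly as noted above.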
Figure 5 shows the dependence of the number of generations on the population size. The thin upper line shows the run
that needed the largest number of generations to find the minimum, and the thin lower line the run that needed the
smallest number. The thick line is the arithmetic mean of 10 runs. The population size increases along the x-axis in
steps of 10. One run takes roughly one second.
Figure 5: Number of generations needed to find the global minimum of the Rastrigin function as a function of the
number of individuals in the population, for two unknown parameters
Influence of elitism
In the test task with two and three unknown parameters, a positive effect is visible for smaller populations, roughly up
to 40 individuals. As the dimensionality of the problem grows, i.e. as the number of unknown parameters increases,
elitism turns out to be unsuitable for making the algorithm more efficient: the average number of generations needed to
find the global solution is larger. This is caused by frequent convergence to a local extremum and getting stuck in it.
Influence of the stabilization bounds
For this problem, using stabilization bounds does not much improve the average number of generations needed to find
the global minimum. However, a better solution is reached more reliably, and the frequent sticking of the algorithm is
reduced. A small number on the order of a few per cent proves to be a suitable value of the stabilization bound.
Figure 6: The standard Rastrigin test function for two unknown parameters: a) cross-section by the xz plane at y = 0,
b) top view of the xy plane, c), d) 3D views
An example: maximizing the profit of a production process with the CGA
Problem statement
A company manufactures five kinds of products (A, B, C, D, E) from three kinds of material (S1, S2, S3). For the
planning period the material is available in limited quantities: 1500 kg of S1, 300 kg of S2 and 450 kg of S3. The
production processes of the individual products run independently, and no other constraints apply. The material
consumption per piece of each product (kg per piece) and the wholesale prices of the products are as follows:

Product                       A      B      C      D      E
S1 consumption (kg/piece)     0      0.4    0.3    0.6    0.6
S2 consumption (kg/piece)     0.05   0.2    0.1    0.1    0
S3 consumption (kg/piece)     0.1    0.2    0.2    0.1    0.2
Wholesale price (CZK/piece)   20     120    100    140    40

Table 1: Problem specification
Task: Compute how many pieces of each product must be produced, under the given constraints, so that the revenue for
the planning period is maximal.
Construction of the economic-mathematical model
Profit-maximization function:
zmax(x1, x2, x3, x4, x5) = 20 x1 + 120 x2 + 100 x3 + 140 x4 + 40 x5
Constraints:
S1: 0.4 x2 + 0.3 x3 + 0.6 x4 + 0.6 x5 ≤ 1500
S2: 0.05 x1 + 0.2 x2 + 0.1 x3 + 0.1 x4 ≤ 300
S3: 0.1 x1 + 0.2 x2 + 0.2 x3 + 0.1 x4 + 0.2 x5 ≤ 450
Solution
I used the 16-bit gene encoding from the test task, whose range of values (0 to 65535) per gene is more than sufficient.
The function zmax serves as the fitness function, into which the constraints are added as penalties for undesirable
solutions.
In ten runs, the algorithm marked the values in Table 2 as optimal and settled on them. Ninety per cent of the values
were marked within the first 1000 generations and within 10 seconds on a computer with an AMD Athlon 1600+
processor. The average value of the computed solution is 379,674 CZK; the optimal solution corresponds to 380,000 CZK.
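The fitness function with penalties for violated constraints can be sketched, for instance, like this (the penalty coefficient is an illustrative choice, not a value taken from the experiments):

```python
def fitness(x, penalty=1e6):
    """Revenue 20*x1 + 120*x2 + ... minus a large penalty proportional to the
    violation of each material constraint; infeasible solutions thus score
    badly and are driven out of the population."""
    x1, x2, x3, x4, x5 = x
    revenue = 20*x1 + 120*x2 + 100*x3 + 140*x4 + 40*x5
    s1 = 0.4*x2 + 0.3*x3 + 0.6*x4 + 0.6*x5    # kg of material S1 used
    s2 = 0.05*x1 + 0.2*x2 + 0.1*x3 + 0.1*x4   # kg of material S2 used
    s3 = 0.1*x1 + 0.2*x2 + 0.2*x3 + 0.1*x4 + 0.2*x5  # kg of material S3 used
    f = revenue
    for used, limit in ((s1, 1500), (s2, 300), (s3, 450)):
        if used > limit:
            f -= penalty * (used - limit)     # punish constraint violations
    return f
```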
Run no.                       1       2       3       4       5       6       7       8       9       10
Max. profit found [CZK]       379900  379400  379680  379560  379960  380000  379900  379880  378820  379640

Table 2: Values computed by the CGA
Conclusion
The CGA, like other EAs, provides in a comparatively short time a solution that is optimal or close to it. The algorithm is suitable where other algorithms fail or cannot deliver a solution in a sufficiently short time, and where knowing the very best solution is not strictly necessary and one of the very good solutions suffices. Economic problems typically have this character, which makes the algorithm well suited to such tasks.
Abbreviations and terms used
CGA: Compact Genetic Algorithm, a special case of the GA, based on the PCh
EA: Evolutionary Algorithm, mathematical algorithms inspired by nature
Elite individual: the best individual over the whole computation, or over a part of it
Fitness: the strength of an individual; expresses how suitable that individual's solution is
Fitness function: a function describing the given task, characteristic for it; its global extremum corresponds to the optimal solution of the task
GA: Genetic Algorithm, a family of algorithms belonging to the EAs
Gene: an element of the parameter vector, the basic building block of a chromosome; corresponds to an encoded variable
Generation: one step of the iteration cycle when searching for the optimal solution with a GA
Chromosome: genetic information in the form of a string, composed of genes; in the mathematical interpretation it corresponds to an individual
Individual (agent): a carrier of genetic information; individuals form the population; in the mathematical interpretation it corresponds to a chromosome
Crossover: construction of new individuals (offspring) from individuals selected from the generation (parents)
Mutation: a random change of bit values in a chromosome
PCh: Probability Chromosome, the distinguishing feature of the CGA; the individuals of the population are constructed from the PCh, which is in turn modified according to given rules
Population: the set of individuals in a given step of the iteration cycle, called a generation
Offspring: an individual resulting from the recombination of parent(s)
Recombination: the process of generating a new individual, usually corresponding to crossover and mutation
Parent: an individual entering recombination; the result is one or more new individuals
Selection: choosing individuals for the new population according to given rules
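The role of the PCh described above can be illustrated with a minimal compact GA on a toy one-max problem (a sketch after Harik et al.; the paper's 16-bit encoding and penalty terms are omitted, and all parameter values here are chosen only for the demonstration):

```python
import random

def cga(fitness, n_bits, pop_size, generations, seed=None):
    """Compact GA: evolve a probability chromosome (PCh) instead of a full
    population. Each step builds two individuals from the PCh, lets them
    compete, and nudges the PCh toward the winner by 1/pop_size per bit."""
    rng = random.Random(seed)
    p = [0.5] * n_bits                      # the probability chromosome
    best, best_f = None, float("-inf")
    for _ in range(generations):
        a = [1 if rng.random() < pi else 0 for pi in p]
        b = [1 if rng.random() < pi else 0 for pi in p]
        winner, loser = (a, b) if fitness(a) >= fitness(b) else (b, a)
        for i in range(n_bits):             # shift PCh toward the winner
            if winner[i] != loser[i]:
                step = 1.0 / pop_size
                p[i] = min(1.0, max(0.0, p[i] + (step if winner[i] else -step)))
        f = fitness(winner)
        if f > best_f:                      # keep the elite individual
            best, best_f = winner, f
    return best, p

# Demonstration on one-max (fitness = number of 1 bits):
best, p = cga(lambda bits: sum(bits), n_bits=16, pop_size=100,
              generations=2000, seed=1)
print(sum(best))
```

After enough generations the PCh entries drift toward 0 or 1, at which point every sampled individual is (nearly) identical, which is the convergence behaviour the paper relies on.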
Address:
Ing. Jiří Kostiha
VUT Brno
Kolejní 4
612 00 Brno
tel.: +420 775 303 543
e-mail: [email protected]
REVITALIZING COMPANY INFORMATION SYSTEMS AND COMPETITIVE ADVANTAGES
Branislav Lacko
VUT v Brně
Abstract: The contribution deals with the innovation of computer-based information systems.
Key words: information system, revitalization, innovation of information system, growth of company
1 Introduction
At present, there is hardly any company without a working information system. Nevertheless, our magazines and brochures still carry articles on information systems design, many of them reasoning extensively about the necessity of introducing information systems. Current problems, however, do not lie in the absence of information systems in Czech companies; the problem is the quality of the existing ones.
From this perspective, it is more suitable to talk about the necessity of information systems innovation. This paper's title uses the word "revitalizing" instead of the expression "innovating" for the following reasons:
• Revitalization of enterprises, which many of our companies have been attempting, must necessarily comprise revitalizing those enterprises' information systems as well
• The word "revitalization" covers more precisely the goal of the change which the information systems of our enterprises should go through
After having been designed and implemented, each information system was handed over to its users in a certain
condition, size, having characteristic features. Since the moment of being handed over for use, it can develop, stagnate
or decline. By the development of an information system we understand improving its qualitative and quantitative
parameters in the course of its use. If those parameters remain unchanged during the system's use, stagnation occurs. In
case of impairing the information system parameters, we call it decline - a degradation of the information system.
The quantitative parameters describe certain characteristic values of the information system, especially of technological, programme, system and operation type, e.g. the number of computers, external memory capacity, processor speed, etc. The qualitative parameters can be considered similarly, such as the way the database concept is used, the degree of task automation, and the like.
For each i-th parameter out of n parameters we can stipulate a so-called increase index r_i,j for moment t_j as follows. For each parameter we introduce its unit value h_i,0 at time t = 0, i.e. at the moment of its handover for operation, and set it equal to one. Then we can determine the value of the increase index for a later time period relative to the initial value of the parameter. Consequently, we can easily calculate the average increase index R_j for a certain moment of time t_j, and even graph the development of the information system in time by means of this average index.
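The computation just described can be sketched as follows (a minimal illustration; the example parameters and their values are invented):

```python
def increase_indices(history):
    """history[j][i] = value of parameter i at time t_j (j = 0 is handover).
    Returns (r, R): r[j][i] is the increase index r_i,j relative to t_0,
    and R[j] is the average increase index R_j across all n parameters."""
    base = history[0]
    r = [[v / b for v, b in zip(row, base)] for row in history]
    R = [sum(row) / len(row) for row in r]
    return r, R

# Example: 3 parameters (computers, disk capacity in GB, automated tasks)
history = [
    [10, 100, 20],    # t_0: handover (all indices are 1.0 by definition)
    [12, 150, 25],    # t_1
    [12, 200, 30],    # t_2
]
r, R = increase_indices(history)
print(R)   # R[0] == 1.0 by construction; later values show the growth
```

A strictly increasing sequence R_0 < R_1 < R_2 corresponds to development; a flat sequence to stagnation; a decreasing one to decline.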
What we will call information system revitalization is a situation when the average index of the information system development changes by a decisive leap, resulting mainly from the qualitative parameters, especially those oriented to the benefits of the information system and directly supporting the decisive processes in the enterprise.
2 Implication statement on the information system development
The causes of the information system development must be seen in the requirements the company staff place on the information system. Those are derived from the following facts:
• the company size increases
• the experience in using the information system grows, and with it the demands placed on the information system increase as well
• the requirements for information increase in consequence of the increasing information barrier in society
• technological and programming means of computing technology develop
• new information technologies develop
• the theory and practice of information systems develop
• the demands for quality increase generally, etc.
The development of the enterprise, however, is a factor to be identified as the most important one.
Under the term of development of the enterprise we will understand a situation, in which the size of the enterprise
extends, the production volume grows and profits increase.
We can declare the following statement:
If an enterprise develops, the information system of the enterprise develops as well.
The enterprise development represents an antecedent of implication here, while the information system development
represents a consequent of this implication.
Let us try to comment on the individual possible combinations of the implication presented. Zero here represents "false" and one represents "true" of the respective affirmation.
0⇒0
An enterprise which does not develop, or even stagnates, has no resources for the information system development. It is probably managed incorrectly; therefore its management is probably not interested in developing the information system.
1⇒1
A developing enterprise must support its successful development by developing its information system to meet the
information demands of its employees and ensure relevant and top-quality information to support decision-making
processes.
0⇒1
This true combination of the implication says that even a badly performing enterprise can develop its information system. It may happen in some of the following situations:
• An enterprise revealed the bad functioning of its information system as an obstruction to its development and decided to improve it to the necessary level. Unfortunately, this situation also comprises the alternative of investing into the information system merely as a "wonder" which is automatically expected to prevent any further stagnation of the company or even its decline.
• A stagnating enterprise makes use of the progress made in information technologies, improving its system by various, especially intensifying methods as a consequence of general progress in this sphere.
• The milieu of the enterprise (state administration bodies, state legislation, competitors, client demands, public opinion, etc.) drives the enterprise to develop its information system.
1⇒0
A false combination signals a situation which may have a negative impact on the enterprise's prosperity and further development: the stagnating information system declines and stops providing support to the management and employees of the company. It may turn out to be one of the factors which at first cause problems and later might even become a cause of the company's decline.
3 A sentence on the information system development pace
The fact itself that the information system develops is not enough if we want it to support the decision making activities
in the company really well.
We must verify the validity of the following sentence:
An information system must develop at a pace corresponding to the pace of the enterprise development.
Similarly as for the information system development, we can determine an average increase index of the company development R' and its parameters for certain moments of time t_j (equal moments give the best results).
The pace of development (the speed) can be expressed in both cases as the first derivative of the average increase index of the information system, or of the enterprise respectively, as a function of time.
If an information system of suitable size was handed over for use, supporting the management of the company within the necessary scope, the pace of the information system development should be equal to or faster than the economic development of the company. If it is slower, the information system development lags behind the company development, resulting in a disproportion which may again mean problems, and possibly even stagnation and decline of the company if the situation is not solved.
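The pace comparison above can be sketched numerically (a minimal illustration with invented index values; a discrete difference stands in for the first derivative):

```python
def pace(R, t):
    """Discrete first derivative of an average increase index R over times t."""
    return [(R[j + 1] - R[j]) / (t[j + 1] - t[j]) for j in range(len(R) - 1)]

def lags_behind(R_is, R_company, t):
    """True at step j where the information system's development pace falls
    below the company's pace (the disproportion warned about above)."""
    return [p_is < p_c for p_is, p_c in zip(pace(R_is, t), pace(R_company, t))]

t = [0, 1, 2, 3]
R_company = [1.0, 1.2, 1.5, 1.9]
R_is = [1.0, 1.25, 1.5, 1.6]       # keeps up at first, then falls behind
print(lags_behind(R_is, R_company, t))   # [False, True, True]
```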
4 A problem of information systems revolutionary development
In the previous contemplations, gradual growth of both information system and the enterprise parameters – i.e. so-called
evolutionary development was assumed. In practice, revolutionary development of the information system or the
enterprise often occurs, or revolutionary growth in both cases, such as an exchange of an existing outmoded computer
for a new and progressive one, two firms merger, increase in production through opening a new-built hall, etc.
Expressed in graphics, this will be manifested by a leap within the respective growth curve (see Fig.1).
Fig. 1 (average increase index R over time t, with a leap in the growth curve)
In a book by C. Gray [1], the author explains why the enterprise's growth is a necessary precondition for its successful
existence. It does not mean that the number of employees would have to increase permanently, but after all, the
company must develop still more and more perfect products, decrease its costs, increase profits, etc.
Company stagnation in today's dynamic world of market economy heads towards its breakdown very fast.
In the present era of global computerization, it is more important than ever for a company to develop its information system as well.
Current computer producers meet this demand by designing the individual models of their standard computer lines so that their efficiency covers a large scope of performance and external disc memory capacities. That means the user may extend his computer gradually and fluently according to his increasing requirements and financial capacities, which is very advantageous for him. At the start of his information system implementation he does not have to pay unnecessarily for computer performance or external memory capacity which he would not use. Moreover, he can schedule the increase in his computer's capacity as his gradually earned financial resources allow. And he constantly works with the same operational system, within the same database and communication setting, with the same standard and application programme equipment, valorizing and protecting the resources previously invested into the technological equipment and personnel training. If a computer line does not have a concept like that, the individual models increase their capacity with a certain delay. Figure No. 2 shows a solution where the user bought a new model of higher capacity immediately after the capacity of the old model was exhausted. He pays unnecessarily for this capacity improvement for the entire period B, before he can really use this potential. Fig. 3 shows a situation where the user is waiting and does not cover his increased information requirements until they reach the
possibilities of a new model.
Fig. 2 (demand and satisfaction curves of the index R over time t; periods A, B, C)
Fig. 3 (index R over time t; periods A, B, C)
There are losses in both cases, however: in the former case due to unexploited but paid capacity, in the latter in consequence of insufficient computer support of steering and decision making. Fig. 4 demonstrates the course of information system benefits in the individual periods A, B, C. Therefore the companies offering a limited number of computer models design their parameters to cover each other mutually, which enables using a more efficient model before the old one reaches the limits of its capacity. The ideal situation is one in which the user works with a supplier ready and willing to accept the old model back as a counter-value during the innovation. If the user does not have such an opportunity, the benefits decrease and losses occur, caused by the value of the discarded computer.
Fig. 4 (benefits in € over time t; periods A, B, C)
Another, much worse situation is that of a user who finds out that at the start of the information system implementation he decided for a computer with no further capacity extension possibilities, and that the new computer offered requires substantial financial resources to be invested into programme modifications. A similar situation may happen in case of a communication network exchange, which may comprise cable redesign to achieve higher transmission speeds and exchange of still-working network cards for more efficient ones, together with adaptation of the operational system, which must now communicate with the new network programme equipment. The whole event postpones the possibility of solving the entire problem, and due to the overhaul and modifications the parameters may even deteriorate for a certain amount of time. The increase curve may then run as shown in Fig. 5. Sections B and C are critical, as the users' requirements are not covered there. In addition, in section C technical complications often occur, requiring extra costs. Therefore the cost curve often looks as shown in Fig. 6. An irregular, immediate increase in costs is very unfavourable from the company finances point of view, as any financial expert can explain and confirm. Everybody thus tries to avoid such a situation; the users should design their concepts of the information system implementation in such a way that similar situations do not happen.
Fig. 5 (IS capacity over time t; periods A, B, C, D)
Fig. 6 (costs in € over time; periods A, B, C, D)
"Leap" changes in the information system development may of course occur for other than technological reasons (the financial situation of the company, changes in the company's information policy, etc.).
Revitalization always represents a leap change in efficiency; therefore the aforementioned curves must be considered very carefully when planning its implementation.
5 Discussion on the information system revitalization goals
It is a common phenomenon in the information technologies field, that modern information technologies and the
application thereof are a motor of innovations within the information systems sphere. In spite of that, it is necessary to
state in relation to the information system s revitalization, that main driving force should be in striving for improving
the information system quality of the with regard to company management support and company processes efficiency
improving so that the competitive abilities of the company can grow.
Implementation of information technologies by itself cannot be considered a competitive advantage anymore. The reality is that the inability to use modern information technologies effectively and to the maximum extent turns out to be a retarding factor in company development, leading to the impairment of the company's competitive abilities.
The goals of information system revitalization must be conceived in relation to overall improvement in company process effectiveness and efficiency (Business Process Reengineering). Information, especially of economic type, is essential for proper company management [8, 9]. Parallel economic data processing, captured by modern controlling, and the use of the information obtained from those data are connected with the possibility of good company development [7].
6 Conclusion
The knowledge quoted at the end of the preceding paragraph should be understood as an urgent stress on the importance of choosing the right information system revitalization strategy. Creating no information system strategy causes a lot of trouble to Czech companies. After all, it is one of the most frequent reasons why the contributions of most existing information systems are very moderate.
This paper would like to point out some important facts:
1. When revitalizing a company, its information system revitalization must be solved as well
2. The information system revitalization goals must directly support the company revitalization
3. A properly chosen strategy of the information system revitalization implementation, coherent with the company's planned development, enables correct setting of the useful information system development dynamics
Literature:
[1] GRAY, C. Růst podniku. Publikace edice Business Guide pro malé a střední podnikatele. Praha : Readers International Prague, 1993.
[2] STRASSMAN, P. A. Stages of Growth. Datamation, October 1976, pp. 46-50.
[3] LACKO, B. Analýza zkušeností z inovace počítačového systému v k. p. TOS KUŘIM. Kuřim : Interní publikace TOS KUŘIM, 1982.
[4] LACKO, B. Restrukturalizace báze dat. Sborník referátů semináře DATASEM 95, CS COMPEX Brno : 1995.
[5] LACKO, B. Vývojové trendy v informačních a řídicích systémech. Abstrakt habilitační přednášky. Brno : VUT FS, 1994, 16 pp.
[6] LACKO, B. Analýza dynamiky rozvoje informačního systému. Sborník mezinárodní konference Systémová integrace 95, Praha : VŠE KIT, 1995, pp. 133-144.
[7] FEDOROVÁ, A. Some connections between the accountant management and economic development in the Czech Republic. In: International Congress, Business and Economic Development in Central and Eastern Europe, Brno : TU of Brno, 2001.
[8] TVRDÍKOVÁ, M. Zavádění a inovace informačních systémů. Praha : Grada, 2000.
[9] MERUNKA, V.; POLÁK, J.; CARDA, A. Umění systémového návrhu. Praha : Grada, 2003.
Address:
Doc. Ing. Branislav Lacko, CSc.
VUT v Brně
Technická 2
CZ-616 69 Brno
e-mail: [email protected]
MODEL LEARNING AND INFERENCE THROUGH ANFIS
Amal Al Khatib
Brno University of Technology
Abstract: This paper discusses the map and architecture of a learning procedure called ANFIS (adaptive-network-based fuzzy inference system). ANFIS is a fuzzy inference system implemented in the framework of adaptive networks. By using a hybrid learning procedure, ANFIS can construct an input-output mapping based on both human knowledge (in the form of fuzzy if-then rules) and input-output data pairs.
1. INTRODUCTION
Intelligent systems have appeared in many technical areas, such as consumer electronics, robotics and industrial control systems. Many of these intelligent systems are based on fuzzy control strategies, which describe a complex system's mathematical model in terms of linguistic rules.
Fuzzy set theory derives from the fact that most natural classes and concepts are fuzzy rather than crisp in nature. On the other hand, people can approximate well enough to perform many desired tasks: they summarize massive information inputs and still function effectively. For complex systems, fuzzy logic is quite suitable because of its tolerance to some imprecision.
2. ANFIS
2.1. Model Learning and Inference Through ANFIS
Assume that we already have a collection of input/output data and would like to build a fuzzy inference system that approximates the data. This system would consist of a number of membership functions and rules with adjustable parameters, similarly to neural networks.
Rather than choosing the parameters associated with a given membership function arbitrarily, these parameters can be chosen so as to tailor the membership functions to the input/output data, in order to account for the variations in the data values.
2.2. What Is ANFIS?
ANFIS (adaptive-network-based fuzzy inference system) is an adaptive network very similar to neural networks. Using a given input/output data set, ANFIS constructs a fuzzy inference system (FIS) whose membership function parameters are adjusted using either a backpropagation algorithm alone, or in combination with a least squares method. This allows fuzzy systems to learn from the data they are modeling.
2.3. ANFIS Objective
The purpose of ANFIS is to integrate the best features of fuzzy systems and neural networks: from fuzzy systems, the representation of prior knowledge as a set of constraints that reduce the optimization search space; from neural networks, the adaptation of backpropagation to a structured network to automate the parametric tuning.
2.4. ANFIS Architecture
For simplicity, I'll assume the fuzzy inference system under consideration has two inputs x and y and one output z. Suppose that the rule base contains two fuzzy if-then rules of Takagi and Sugeno's type:
Rule 1: If x is A1 and y is B1, then f1 = p1 x + q1 y + r1
Rule 2: If x is A2 and y is B2, then f2 = p2 x + q2 y + r2
Then the fuzzy reasoning is illustrated in Fig. 2.4.1.(a), and the corresponding equivalent ANFIS architecture is shown
in Fig. 2.4.1.(b).
Fig. 2.4.1.(a)
Fig. 2.4.1.(b)
The node functions in the same layer are of the same function family as described below:
Layer 1: Every node i in this layer is a square node with a node function

O_i^1 = µAi(x)

where x is the input to node i, and Ai is the linguistic label associated with this node function. In other words, O_i^1 is the membership function of Ai and it specifies the degree to which the given x satisfies the quantifier Ai. Usually µAi(x) is chosen to be bell-shaped, such as:

µAi(x) = 1 / (1 + [((x − ci)/ai)^2]^bi)

where {ai, bi, ci} is the parameter set. As the values of these parameters change, the bell-shaped function varies accordingly, thus exhibiting various forms of membership functions for the linguistic label Ai.
Parameters in this layer are referred to as premise parameters.
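The generalized bell function above can be evaluated directly (a small sketch; the parameter values are chosen only for illustration):

```python
def bell_mf(x, a, b, c):
    """Generalized bell membership function: 1 / (1 + ((x - c)/a)^2)^b.
    c is the center, a the width, b controls the steepness of the flanks."""
    return 1.0 / (1.0 + (((x - c) / a) ** 2) ** b)

# The membership degree peaks at the center c and falls off with width a:
print(bell_mf(5.0, a=2.0, b=2.0, c=5.0))   # 1.0 at the center
print(bell_mf(7.0, a=2.0, b=2.0, c=5.0))   # 0.5 one half-width away
```

During learning, the premise parameters {a, b, c} of each such function are the quantities updated in the backward pass.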
Layer 2: Every node in this layer is a circle node labeled Π which applies the logic operation the user chooses (AND, OR), for example:

wi = µAi(x) AND µBi(y)

Each node output represents the firing strength of a rule.
Layer 3: Every node in this layer is a circle node labeled N. The i-th node calculates the ratio of the i-th rule's firing strength to the sum of all rules' firing strengths:

w̄i = wi / (w1 + w2)
Layer 4: Every node i in this layer is a square node with a node function

O_i^4 = w̄i fi = w̄i (pi x + qi y + ri)

where w̄i is the output of layer 3, and {pi, qi, ri} is the parameter set. Parameters in this layer are referred to as consequent parameters.
Layer 5: The single node in this layer is a circle node labeled Σ that computes the overall output as the summation of all incoming signals, i.e.,

O_1^5 = overall output = Σi w̄i fi = (Σi wi fi) / (Σi wi)
Fig. 2.4.2 shows a 2-input ANFIS with 2 rules. Two membership functions are associated with each input, so the input space is partitioned into four fuzzy subspaces, each of which is governed by a fuzzy if-then rule. The premise part of a rule delineates a fuzzy subspace, while the consequent part specifies the output within this fuzzy subspace.
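The five layers can be traced in a compact sketch of the forward pass (the product is used here as the AND operator, and all parameter values are invented for illustration):

```python
def bell(x, a, b, c):
    """Generalized bell membership function (layer 1 node function)."""
    return 1.0 / (1.0 + (((x - c) / a) ** 2) ** b)

def anfis_forward(x, y, premise, consequent):
    """Forward pass through the five layers of the 2-rule system above.
    premise[i] = ((a, b, c) for Ai, (a, b, c) for Bi);
    consequent[i] = (pi, qi, ri)."""
    # Layers 1-2: membership degrees combined into firing strengths
    w = [bell(x, *pa) * bell(y, *pb) for (pa, pb) in premise]
    # Layer 3: normalized firing strengths
    total = sum(w)
    wn = [wi / total for wi in w]
    # Layers 4-5: rule outputs f_i = p_i x + q_i y + r_i, weighted sum
    f = [p * x + q * y + r for (p, q, r) in consequent]
    return sum(wni * fi for wni, fi in zip(wn, f))

premise = [((2.0, 2.0, 0.0), (2.0, 2.0, 0.0)),   # rule 1: A1, B1 around 0
           ((2.0, 2.0, 5.0), (2.0, 2.0, 5.0))]   # rule 2: A2, B2 around 5
consequent = [(1.0, 1.0, 0.0),                   # f1 = x + y
              (0.0, 0.0, 10.0)]                  # f2 = 10
print(anfis_forward(0.0, 0.0, premise, consequent))   # near 0: rule 1 dominates
```

Near (0, 0) the first rule fires almost exclusively and the output follows f1; near (5, 5) the second rule takes over and the output approaches 10.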
Fig. 2.4.2. ANFIS map (layer 1: membership functions A1, A2 for x and B1, B2 for y, with premise parameters; layer 2: Π nodes producing firing strengths w1, w2; layer 3: N nodes producing normalized strengths w̄1, w̄2; layer 4: consequent parameters producing w̄i fi; layer 5: Σ node summing w̄i fi into the overall output)
2.5. Hybrid Learning Algorithm
From the ANFIS architecture (Fig. 2.4.2) it can be observed that, given the values of the premise parameters, the overall output can be expressed as a linear combination of the consequent parameters. The output f in layer 5 can be expressed as:

f = (w1 / (w1 + w2)) f1 + (w2 / (w1 + w2)) f2
  = w̄1 f1 + w̄2 f2
  = (w̄1 x) p1 + (w̄1 y) q1 + (w̄1) r1 + (w̄2 x) p2 + (w̄2 y) q2 + (w̄2) r2

which is linear in the consequent parameters (p1, q1, r1, p2, q2, r2). As a result we have:
S = set of total parameters
S1 = set of premise parameters
S2 = set of consequent parameters
After finding the initial parameters by the generation of the fuzzy inference system (FIS), we can directly apply the hybrid learning rule, which is discussed in the coming section.
More specifically, in the forward pass of the hybrid learning algorithm, functional signals go forward up to layer 4 and the consequent parameters are identified by the least squares estimate. In the backward pass, the error rates propagate backward and the premise parameters are updated by gradient descent. The following table (Table 2.5.1) summarizes the activities in each pass.
                 MF param. (premise)    Rule param. (consequent)
forward pass     fixed                  least-squares estimate
backward pass    back propagation       fixed

Table 2.5.1. The two passes of the hybrid learning procedure
The consequent parameters thus identified are optimal (in the consequent parameters space) under the condition that the
premise parameters are fixed. Accordingly the hybrid approach is much faster than the strict gradient descent and it is
worthwhile to look for the possibility of decomposing the parameter set.
However, it should be noted that the computation complexity of the least squares estimate is higher than that of the
gradient descent. In fact, there are four methods to update the parameters, as listed below according to their computation
complexities:
1. Gradient descent only: all parameters are updated by the gradient descent.
2. Gradient descent and one pass of least squares estimate (LSE): the LSE is applied only once at the very beginning
to get the initial values of the consequent parameters and then the gradient descent takes over to update all
parameters.
3. Gradient descent and LSE: this is the proposed hybrid learning rule.
4. Sequential (Approximate) LSE only: the ANFIS is linearized with respect to the premise parameters.
The choice of the above methods should be based on the trade-off between computation complexity and resulting
performance.
3.9. Hybrid Learning Rule (Forward pass)
Though we can apply the gradient method to identify the parameters in an adaptive network, the method is generally slow and likely to become trapped in local minima. That is why a hybrid learning rule is proposed, combining the gradient descent method and the least squares estimate (LSE) to identify the parameters.
For simplicity, it is assumed that the adaptive network under consideration has only one output:

output = F(I, S)

where I is the set of input variables and S is the set of parameters. If there exists a function H such that the composite function H∘F is linear in some of the elements of S, then these elements can be identified by the least squares method. More formally, if the parameter set S can be decomposed into two sets

S = S1 ⊕ S2

where ⊕ represents direct sum, such that H∘F is linear in the elements of S2, then upon applying H to the output equation we have

H(output) = H∘F(I, S)

which is linear in the elements of S2. Now, given values of the elements of S1, we can plug P training data pairs into the previous equation and obtain a matrix equation:

AX = B
where X is an unknown vector whose elements are parameters in S2. Let |S2| = M; then the dimensions of A, X and B are P×M, M×1 and P×1, respectively. Since P (the number of training data pairs) is usually greater than M (the number of linear parameters), this is an overdetermined problem and generally there is no exact solution to AX = B. Instead, a least squares estimate X* of X is sought that minimizes the squared error ‖AX − B‖^2.
This is a standard problem that forms the grounds for linear regression, adaptive filtering and signal processing. The most well-known formula for X* uses the pseudo-inverse of A:

X* = (A^T A)^-1 A^T B
where A^T is the transpose of A, and (A^T A)^-1 A^T is the pseudo-inverse of A if A^T A is non-singular. Sequential formulas are used to compute the LSE of X. This sequential method of LSE is more efficient (especially when M is small) and can easily be modified to an on-line version for systems with changing characteristics. Specifically, let the i-th row vector of the matrix A be a_i^T and the i-th element of B be b_i; then X can be calculated iteratively using the sequential formulas:

X_{i+1} = X_i + S_{i+1} a_{i+1} (b_{i+1} − a_{i+1}^T X_i)
S_{i+1} = S_i − (S_i a_{i+1} a_{i+1}^T S_i) / (1 + a_{i+1}^T S_i a_{i+1}),   i = 0, 1, …, P−1

where S_i is often called the covariance matrix and the least squares estimate X* is equal to X_P. The initial conditions are X_0 = 0 and S_0 = γI, where γ is a large positive number and I is the identity matrix of dimension M×M.
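The sequential formulas above can be sketched in plain Python (a minimal illustration; the example system and the value of γ are my choices):

```python
def sequential_lse(A, B, gamma=1e6):
    """Recursive least squares following the formulas above:
    X_{i+1} = X_i + S_{i+1} a_{i+1} (b_{i+1} - a_{i+1}^T X_i),
    with X_0 = 0 and S_0 = gamma * I."""
    M = len(A[0])
    X = [0.0] * M
    S = [[gamma * (i == j) for j in range(M)] for i in range(M)]
    for a, b in zip(A, B):
        Sa = [sum(S[i][j] * a[j] for j in range(M)) for i in range(M)]  # S a
        denom = 1.0 + sum(a[i] * Sa[i] for i in range(M))               # 1 + a^T S a
        # Covariance update: S <- S - (S a a^T S) / denom  (S stays symmetric)
        S = [[S[i][j] - Sa[i] * Sa[j] / denom for j in range(M)] for i in range(M)]
        err = b - sum(a[i] * X[i] for i in range(M))                    # b - a^T X
        SnewA = [sum(S[i][j] * a[j] for j in range(M)) for i in range(M)]
        X = [X[i] + SnewA[i] * err for i in range(M)]
    return X

# Overdetermined system: fit y = 2u + 3 from four exact samples
A = [[1.0, 1.0], [2.0, 1.0], [3.0, 1.0], [4.0, 1.0]]
B = [5.0, 7.0, 9.0, 11.0]
print(sequential_lse(A, B))   # close to [2.0, 3.0]
```

The result matches the batch pseudo-inverse solution up to a small bias of order 1/γ introduced by the S_0 = γI initialization.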
Now we can combine the gradient method and the least squares estimate to update the parameters in an ANFIS structure. Each epoch of this hybrid learning procedure is composed of a forward pass and a backward pass. In the forward pass, we supply input data and functional signals go forward to calculate each node output until the matrices A and B are obtained, and the parameters in S2 are identified by the sequential least squares formulas. After identifying the parameters in S2, the functional signals keep going forward until the error measure is calculated. In the backward pass, the error rates (the derivatives of the error measure with respect to each node output) propagate from the output end toward the input end, and the parameters in S1 are updated by the gradient method (as explained in the coming section).
For given fixed values of the parameters in S1, the parameters in S2 thus found are guaranteed to be the global optimum point in the S2 parameter space, thanks to the choice of the squared error measure. Not only can this hybrid learning rule decrease the dimension of the search space for the gradient method, but in general it will also cut down the convergence time substantially.
3.10. Gradient descent (Backward pass)
The objective of this step is to update the premise parameters (the membership functions' parameters). In this paper I use the back-propagation algorithm.
The basic idea of ANFIS back-propagation is based on the overall error measure E:

E = Σ_{p=1}^{P} E^(p) = Σ_{p=1}^{P} (1/2) (d^(p) − O5^(p))²

where:
P = number of training data pairs
d^(p) = pth component of the desired output vector
O5^(p) = actual output of layer 5 of the ANFIS structure (see Fig. 4)
Assuming that the symbol θ represents a membership function parameter to be updated, then for each parameter θ_i the update formula is:

Δθ_i = −η ∂E/∂θ_i = −η Σ_{p=1}^{P} ∂E^(p)/∂θ_i

where

η = k / Σ_i (∂E/∂θ_i)

is the learning rate, k is the step size and ∂E/∂θ_i is the derivative update.
The chain rule is used to calculate the partial derivatives. Based on the Sugeno inference mechanism, the error rate for the premise parameters can be calculated as follows:

∂E^(p)/∂θ_i = −Σ_{j=1}^{2} (d^(p) − O5^(p)) · (∂O5^(p)/∂O4,j^(p)) · (∂O4,j^(p)/∂O3,j^(p)) · (∂O3,j^(p)/∂O2,1^(p)) · (∂O2,1^(p)/∂O1,1^(p)) · (∂O1,1^(p)/∂θ_i)

The derivation of ∂O5/∂O4,j is as follows:

∂O5/∂O4,j = ∂(Σ_k O4,k)/∂O4,j = 1

For ∂O4,i/∂O3,j we have:

∂O4,i/∂O3,j = ∂(O3,i (p_i x + q_i + r_i))/∂O3,j = { p_i x + q_i + r_i   if i = j
                                                   { 0                  if i ≠ j

For ∂O3,i/∂O2,j we have:

∂O3,i/∂O2,j = ∂/∂O2,j [ O2,i / (O2,1 + O2,2) ] = { (O2,1 + O2,2 − O2,i) / (O2,1 + O2,2)²   if i = j
                                                  { −O2,i / (O2,1 + O2,2)²                  if i ≠ j

and the derivatives ∂O1,i/∂a_i, ∂O1,i/∂b_i and ∂O1,i/∂c_i of the generalized bell membership function O1,i = 1 / [1 + ((x − c_i)/a_i)^(2b_i)] with respect to its premise parameters are:

∂O1,i/∂a_i = 2b_i ((x − c_i)/a_i)^(2b_i) / { a_i [1 + ((x − c_i)/a_i)^(2b_i)]² }

∂O1,i/∂b_i = −((x − c_i)/a_i)^(2b_i) · ln(((x − c_i)/a_i)²) / [1 + ((x − c_i)/a_i)^(2b_i)]²

∂O1,i/∂c_i = 2b_i ((x − c_i)/a_i)^(2b_i) / { (x − c_i) [1 + ((x − c_i)/a_i)^(2b_i)]² }
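The bell-function derivatives above can be checked against numerical differentiation. The following sketch assumes the generalized bell membership function 1/[1 + |(x−c)/a|^(2b)]; the function names are ours, not the paper's:

```python
import math

def bell(x, a, b, c):
    """Generalized bell membership function 1 / (1 + |(x-c)/a|^(2b))."""
    return 1.0 / (1.0 + abs((x - c) / a) ** (2 * b))

def bell_grads(x, a, b, c):
    """Analytic partial derivatives of bell() w.r.t. a, b and c."""
    t = abs((x - c) / a) ** (2 * b)          # ((x-c)/a)^(2b)
    denom = (1.0 + t) ** 2
    d_a = 2 * b * t / (a * denom)
    d_b = -t * math.log(((x - c) / a) ** 2) / denom
    d_c = 2 * b * t / ((x - c) * denom)
    return d_a, d_b, d_c
```

Comparing these against central finite differences is a quick sanity check before plugging the derivatives into the gradient-descent update.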
4. CONCLUSION
The hybrid system ANFIS, with an inference mechanism based on an adaptive network, has been considered in this paper. An important property of this system is that it is possible to tune the parameters of the fuzzy system described with the help of ANFIS. The same methods that are used to tune the parameters are also used to tune the weights in neural networks. As ANFIS is a multilayer network, it can be concluded that the tuning of its parameters is a non-linear task (the parameters of the inner layers cannot be expressed linearly). Hence, new algorithms 'inherit' problems related to the training algorithms of multilayer neural networks. Moreover, new algorithms require more computational power; ANFIS is a learning method that is computationally more complex yet more effective than some of the other methods proposed in the neural network field.
References
[1] JANG, J.-S. R. ANFIS: Adaptive-Network-Based Fuzzy Inference System. IEEE Transactions on Systems, Man and Cybernetics, Vol. 23, No. 3, May/June 1993, pp. 665-683.
[2] The MathWorks, Inc. Neural Network Toolbox (MATLAB Toolbox). Handbook of The MathWorks, Inc., Boston, 1998.
[3] VALISHEVSKY, A. Comparative Analysis of Different Approaches towards Multilayer Perceptron Training. Scientific Proceedings of Riga Technical University, 2001.
[4] JANG, J.-S. R.; SUN, C.-T. Neuro-Fuzzy Modeling and Control. Proceedings of the IEEE, 83(3):378-406.
[5] JANG, J.-S. R.; GULLEY, N. The Fuzzy Logic Toolbox for Use with MATLAB. Natick, MA: The MathWorks, Inc., 1995.
Address:
Ing. Amal Al Khatib
VUT v Brně
Technická 2
CZ-616 69 Brno
e-mail: [email protected]
GRAMMATICAL EVOLUTION WITH BACKWARD PROCESSING
Pavel Ošmera 1, Ondřej Popelka 1, Imrich Rukovanský 2
1 Brno University of Technology
2 European Polytechnical Institute Kunovice
Abstract: This paper describes Parallel Grammatical Evolution (PGE) that can evolve complete programs
using a variable length linear genome to govern the mapping of a Backus Naur Form grammar definition.
To increase the efficiency of Grammatical Evolution (GE) the influence of backward processing was
tested. The significance of backward coding (BC) and the comparison with standard coding of GEs is
presented. BC can speed up Grammatical Evolution with high quality features. The adaptive significance
of Parallel Grammatical Evolution with male and female populations has been studied.
1 INTRODUCTION
Grammatical Evolution (GE) [1] can be considered a form of grammar-based genetic programming (GP). In particular, Koza's genetic programming has enjoyed considerable popularity and widespread use. Unlike a Koza-style approach, no distinction is made at this stage between what Koza describes as functions (operators in this case) and terminals (variables). Koza originally employed Lisp as his target language; this distinction is more of an implementation detail than a design issue. Grammatical evolution can be used to generate programs in any language, using Backus-Naur Form (BNF). BNF grammars consist of terminals, which are items that can appear in the language, e.g. +, -, sin, log etc., and non-terminals, which can be expanded into one or more terminals and non-terminals. A non-terminal symbol is any symbol that can be rewritten to another string; conversely, a terminal symbol is one that cannot be rewritten. The major strength of GE with respect to GP is its ability to generate multi-line functions in any language. Rather than representing the programs as parse trees, as in GP, a linear genome representation is used. A genotype-phenotype mapping is employed such that each individual's variable-length byte string contains the information to select production rules from a BNF grammar. The grammar allows the generation of programs in an arbitrary language that are guaranteed to be syntactically correct. The user can tailor the grammar to produce solutions that are purely syntactically constrained, or they may incorporate domain knowledge by biasing the grammar to produce very specific forms of sentences.
The GE system in [1-3] codes a set of pseudo-random numbers, which are used to decide which choice to take when a non-terminal has one or more outcomes. Because the GE mapping technique employs a BNF definition, the system is language independent and can theoretically generate arbitrarily complex functions. There is quite an unusual approach in GEs, as it is possible for certain genes to be used two or more times if the wrapping operator is used. BNF is a notation that represents a language in the form of production rules. It is possible to generate programs using the Grammatical Swarm Optimization (GSO) technique [2] with a performance similar to GE; this is notable given the relative simplicity of GSO, the small population sizes involved, and the complete absence of a crossover operator synonymous with program evolution in GP or GE. Grammatical evolution was one of the first approaches to distinguish between the genotype and the phenotype. GE evolves a sequence of rule numbers that are translated, using a predetermined grammar, into a phenotypic tree.
Our approach uses a parallel structure of GE (PGE). A population is divided into several subpopulations that are arranged in a hierarchical structure [4]. Every subpopulation has two separate parts: a male group and a female group. Each group uses quite a different type of selection. In the first group a classical type of GA selection is used; in the second group only different individuals can be included. It is a biologically inspired arrangement similar to a harem. This strategy increases the inner adaptation of PGE. The following text explains why we used this approach. Analogy leads us one step further, namely to the belief that GE can benefit from combination with sexual reproduction [5-6]. On the principle of sexual reproduction we can create a parallel GE with a hierarchical structure.
2 PARALLEL GRAMMATICAL EVOLUTION
The PGE is based on the grammatical evolution GE [1], where BNF grammars consist of terminals and non-terminals.
Terminals are items, which can appear in the language. Non-terminals can be expanded into one or more terminals and
non-terminals. Grammar is represented by the tuple {N,T,P,S}, where N is the set of non-terminals, T the set of
terminals, P a set of production rules which map the elements of N to T, and S is a start symbol which is a member of N.
For example, below is the BNF used for our problem:
N = {expr, fnc}
T = {sin, cos, +, -, /, *, X, 1, 2, 3, 4, 5, 6, 7, 8, 9}
S = <expr>
and P can be represented as 4 production rules:
1. <expr> := <fnc><expr> | <fnc><expr><expr> | <fnc><num><expr> | <var>
2. <fnc> := sin | cos | + | - | * | U-
3. <var> := X
4. <num> := 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
The production rules and the number of choices associated with each are given in Table 1. The symbol U- denotes a unary minus operation.
Table 1: The number of available choices for each production rule.
rule no | choices
   1    |    4
   2    |    6
   3    |    1
   4    |   10
There are notable differences when compared with [1]. We do not use the two elements <pre_op> and <op>, but only one element <fnc> for all functions with n arguments. There are no rules for parentheses; they are substituted by the tree representation of the function. The element <num> and the rule <fnc><num><expr> were added to cover the generation of numbers. The rule <fnc><num><expr> is derived from the rule <fnc><expr><expr>. Using this approach we can generate expressions more easily. For example, when one argument is a number, then +(4,x) can be produced, which is equivalent to (4 + x) in infix notation. The same result could be obtained if one of the <expr> in the rule <fnc><expr><expr> were substituted with <var> and then with a number, but it would need more genes.
There are no rules with parentheses because all structural information is included in the tree representation of an individual. Parentheses are automatically added during the creation of the text output.
If the GE is not restricted in any way, the search space can contain an infinite number of solutions. For example, the function cos(2x) can be expressed as cos(x+x); cos(x+x+1-1); cos(x+x+x-x); cos(x+x+0+0+0...) etc. It is therefore desirable to limit the number of elements in the expression and the number of repetitions of the same terminals and non-terminals.
3 BACKWARD PROCESSING OF THE GE
The chromosome is represented by a set of integers filled with random values in the initial population. Gene values are used during chromosome translation to decide which terminal or non-terminal to pick from the set. When selecting a production rule there are four possibilities, so we use gene_value mod 4 to select a rule. However, the list of variables has only one member (the variable X), and gene_value mod 1 always returns 0. A gene is always read, no matter whether a decision is to be made; this approach makes some genes in the chromosome somewhat redundant. The values of such genes can be randomly created, but the genes must be present.
Fig. 1 shows the genotype-phenotype translation scheme. The body of the individual is shown as a linear structure, but in fact it is stored as a one-way tree (child objects have no links to parent objects). In the diagram we use abbreviated notations for non-terminal symbols: f - <fnc>, e - <expr>, n - <num>, v - <var>.
The column description in Fig. 1:
A. Objects of the individual’s body (resulting trigonometric function),
B. Genes used to translate the chromosome into the phenotype,
C. Modulo operation, divisor is the number of possible choices determined by the gene context,
D. Result of the modulo operation,
E. State of the individual’s body after processing a gene on the corresponding line,
F. Blocks in the chromosome and corresponding production rules,
G. Block marks added to the chromosome.
Fig.1: Relations between genotype (column B) and phenotype (column A)
Since the modulo operation takes two operands, the resulting number is influenced by the gene value and by the gene context (Fig. 1C; i.e. Fig. 1, column C). The gene context is the number of choices determined by the currently used list (rules, functions, variables). Therefore genes with the same values might give different results of the modulo operation depending on which object they code. On the other hand, one terminal symbol can be coded by many different gene values as long as the result of the modulo operation is the same: (31 mod 3) = (34 mod 3) = 1. In the example (Fig. 1A) the variable set has only one member, X; therefore the modulo divisor is always 1, the result is always 0, and a gene which codes a variable is redundant in that context (Fig. 1D). If the system runs out of genes during genotype-phenotype translation, the chromosome is wrapped and genes at the beginning are reused.
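The mod-by-context decoding and the wrapping operator can be sketched as follows. The helper names and the literal rule lists are our own illustrative reconstruction of the grammar from section 2, not the authors' code:

```python
# Choice lists from the example grammar; each gene is taken modulo the
# length of whatever list it is currently selecting from (its context).
RULES = ["<fnc><expr>", "<fnc><expr><expr>", "<fnc><num><expr>", "<var>"]
FUNCS = ["sin", "cos", "+", "-", "*", "U-"]
VARS  = ["X"]

def read_gene(chromosome, cursor):
    """Return (gene value, next cursor); wrap to the start if genes run out."""
    return chromosome[cursor % len(chromosome)], cursor + 1

def choose(chromosome, cursor, options):
    """Pick one option using gene_value mod (number of choices)."""
    gene, cursor = read_gene(chromosome, cursor)
    return options[gene % len(options)], cursor
```

Note that choosing from VARS always yields X regardless of the gene value (mod 1 is always 0), which is exactly the redundancy discussed above, and that genes 31 and 34 select the same item from a three-element list.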
4 PROCESSING THE GRAMMAR
The processing of the production rules is done backwards – from the end of the rule to the beginning (Fig. 2). E.g.
production rule <fnc><expr1><expr2> is processed as <expr2><expr1><fnc>. We use <expr1> and <expr2> at this
point to denote which expression will be the first argument of <fnc>.
Fig. 2: Proposed backward notation of a function tree structure
The main difference between the <fnc> and <expr> non-terminals is in the number of real objects they produce in the individual's body. The non-terminal <fnc> always generates one and only one terminal; on the contrary, <expr> generates an unknown number of non-terminal and terminal symbols. If the phenotype is represented as a tree structure, then the product of the <fnc> non-terminal is the parent object for all objects generated by the <expr> non-terminals contained in the same rule. Therefore the rule <fnc><expr1><expr2> can be represented as a tree (Fig. 3).
Fig. 3: Production rule shown as a tree
To select a production rule (selection of a tree structure) only one gene is needed. To process the selected rule n genes are needed, and finally to select a specific non-terminal symbol again one gene is needed. If the processing is done backwards, the first processed terminals are leaves of the tree and the last processed terminal in a rule is the root of a subtree. The very last terminal is the root of the whole tree. Note that in forward processing (<fnc><expr1><expr2>) the first processed gene codes the rule, the second gene codes the root of the subtree, and the last genes code the leaves.
When using the forward processing and coding of the rules described in [1], it is not possible to easily recover the tree structure from the genotype. This is caused by <expr> non-terminals using an unknown number of successive genes, the last processed terminal being just a leaf of the tree. The proposed backward processing is shown in Fig. 1E.
4.1 PHENOTYPE TO GENOTYPE PROJECTION
Using the proposed backward processing, the translation to a phenotype subtree has a certain scheme. It begins with a production rule (selecting the type of the subtree) and ends with the root of the subtree (in our case with a function) (Fig. 1F). In the genotype this means that one gene used to select a production rule is followed by n genes with different contexts, which are followed by one gene used to translate <fnc>. Therefore a gene coding a production rule forms a pair with the gene coding the terminal symbol for <fnc> (the root of the rule). These genes can be marked when processing the individual. This is an example of a simple marking system:
BB - Begin block (a gene coding a production rule)
IB - Inside block
EB - End block (a gene coding a root of a subtree)
The EB and BB marks are pair marks, and in the chromosome they define a block (Fig. 1G). Such blocks can be nested, but they do not overlap (the same way as parentheses). The IB mark is not a pair mark, but it is always contained in a block (IB marks are presently generated by <num> non-terminals). Given a BB gene, the corresponding EB gene can be found using a simple LIFO method.
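The LIFO matching of a BB gene with its EB gene can be sketched with a depth counter (equivalent to pushing and popping a stack). The function name is ours; this is a minimal sketch of the idea, not the authors' implementation:

```python
def find_block(marks, start):
    """Given a list of 'BB'/'IB'/'EB' marks and the index of a BB gene,
    return the index of its matching EB gene (LIFO pairing, like
    matching parentheses)."""
    assert marks[start] == "BB"
    depth = 0
    for i in range(start, len(marks)):
        if marks[i] == "BB":
            depth += 1          # push
        elif marks[i] == "EB":
            depth -= 1          # pop
            if depth == 0:
                return i        # matching EB found
    raise ValueError("unbalanced block marks")
```

Nested blocks resolve correctly because an inner BB raises the depth before its own EB lowers it again.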
A block of the chromosome enclosed in a BB-EB gene pair then codes a subtree of the phenotype. Such a block is fully autonomous and can be exchanged with any other block, or it can serve as a completely new individual.
Only BB genes code the tree of the individual's body, while EB and IB genes code the terminal symbols in the resulting phenotype. The BB genes code the structure of the individual; changing their values can cause a change of the applied production rule. Therefore a change (e.g. by mutation) in the value of a structural gene may trigger a change of context of many, or all, following genes.
This simple marking system introduces a phenotype feedback to the genotype; however, it does not affect the universality of the algorithm. It is not dependent on the terminal or non-terminal symbols used; it only requires the result to be a tree structure. Using this system it is possible to introduce progressive crossover and mutation.
4.2 CROSSOVER
When using grammatical evolution, the resulting phenotype coded by one gene depends on the value of the gene and on its context. If a chromosome is crossed at a random point, it is very probable that the context of the genes in the second part will change. In this way crossover causes destruction of the phenotype, because the newly added parts code a different phenotype than in the original individual.
This behavior can be eliminated using the block marking system. Crossover is then performed as an exchange of blocks.
The crossover always uses an even number of crossover points, where each odd point must be a BB gene and each even point the corresponding EB gene. The starting BB gene is presently chosen randomly; the first gene is excluded because it encapsulates (together with the last used gene) the whole individual.
The operation takes two parent chromosomes and the result is always two child chromosomes. It is also possible to combine identical individuals, while the resulting child chromosomes can be entirely different.
Given the parents:
1) cos( x + 2 ) + sin( x * 3 )
2) cos( x + 2 ) + sin( x * 3 )
the operation can produce the children:
3) cos( sin( x * 3 ) + 2 ) + sin( x * 3 )
4) cos( x + 2 ) + x
This crossover method works similarly to the direct combining of phenotype trees; however, it works purely on the chromosome, so phenotype and genotype remain separated. The result is a chromosome which will generate an individual with a structure combined from its parents. In this way we obtain the encoding of an individual without backward analysis of its phenotype. To perform a crossover the phenotype has to be evaluated (to mark the genes), but it is neither used nor known in the crossover operation (indeed, it does not have to exist).
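The exchange of marked blocks can be sketched as follows. This is our own illustrative reconstruction under the assumption that each chromosome carries a parallel list of BB/IB/EB marks; the function names are hypothetical:

```python
import random

def match_eb(marks, bb):
    """Index of the EB gene matching the BB gene at index bb (LIFO pairing)."""
    depth = 0
    for i in range(bb, len(marks)):
        depth += {"BB": 1, "EB": -1}.get(marks[i], 0)
        if depth == 0:
            return i
    raise ValueError("unbalanced marks")

def block_crossover(genes1, marks1, genes2, marks2, rng=random):
    """Exchange one randomly chosen BB..EB block between two marked
    chromosomes; the first gene is excluded because its block
    encapsulates the whole individual."""
    bb1 = rng.choice([i for i, m in enumerate(marks1) if m == "BB" and i > 0])
    bb2 = rng.choice([i for i, m in enumerate(marks2) if m == "BB" and i > 0])
    eb1, eb2 = match_eb(marks1, bb1), match_eb(marks2, bb2)
    child1 = genes1[:bb1] + genes2[bb2:eb2 + 1] + genes1[eb1 + 1:]
    child2 = genes2[:bb2] + genes1[bb1:eb1 + 1] + genes2[eb2 + 1:]
    return child1, child2
```

Because whole BB..EB blocks are swapped, each transplanted gene keeps the context it had in its parent, so the exchanged subtrees survive intact in the children.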
4.3 MUTATION
Mutation can be divided into mutation of structural (BB) genes and mutation of other genes. Mutation of one structural gene can affect other genes by changing their context; therefore the amount of structural mutation should be very low. On the other hand, the amount of mutation of other genes can be set very high, and it can speed up the search for an approximate solution.
Given an individual:
sin( 2 + x ) + cos( 3 * x )
and using only mutation of non-structural genes, it is possible to get:
cos( 5 - x ) * sin( 1 * x )
The structure therefore does not change, but we can get a lot of new combinations of terminal symbols. The divided mutation allows using the benefits of high mutation while eliminating the risk of damaging the structure of an individual.
4.4 POPULATION MODEL
The system uses three populations forming a simple tree structure (Fig. 4). There is a Master population and two slave
populations, which simulate different genders. The links among the populations lead only one way - from bottom to top.
Fig. 4: The population model
4.5 FEMALE POPULATION
When a new individual is to be inserted in a population, a check is performed to determine whether it should be inserted. If the same or a similar individual already exists in the population, the new individual is not inserted. In the female population every genotype and phenotype occurs only once. The population maintains a very high diversity; therefore the mutation operation is not applied to this population. Removing individuals is based on two criteria. The first criterion is the age of an individual - the length of stay in the population. The second criterion is the fitness of an individual; using the second criterion a maximum population size is maintained. Parents are chosen using tournament selection.
4.6 MALE POPULATION
New individuals are not checked, so duplicate phenotypes and genotypes can occur; the mutation is also enabled for this population. The mutation rate can be safely set very high (30%) provided that the structural mutation is set very low (less than 2%). For a couple of the best individuals the mutations are non-destructive: if a protected individual is to be mutated, a clone is created and added to the population. If the system stagnates in a local solution, the mutation rate is raised using a linear function depending on the number of cycles for which the solution has not improved. Parents are chosen using a logarithmic function depending on the position of an individual in a population sorted by fitness. For every selected male parent a new selection of a female parent is made.
4.7 MASTER POPULATION
The master population is superior to the male and female populations. Periodically the subpopulations send over their best solutions. Moreover, the master population performs another evolution on its own. Parents are selected using tournament selection. The master population uses the same system of mutations as the male population, but for removing individuals from the population only the fitness criterion is used. The master population therefore also serves as an archive of the best solutions of the whole system.
4.8 FITNESS FUNCTION
An equidistant band of a given size is defined around the searched function. The fitness of an individual's phenotype is computed as the number of sample points inside this band divided by the number of all checked points (a value in [0, 1]). This fitness function forms a strong selection pressure; therefore the system finds an approximate solution very quickly.
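The band fitness described above can be sketched in a few lines. The function name and the band half-width parameter eps are our own choices; the paper does not state the exact band size:

```python
import math

def band_fitness(candidate, target, xs, eps=0.1):
    """Fraction of sample points where the candidate function lies within
    an equidistant band of half-width eps around the target function;
    the result is a value in [0, 1]."""
    hits = sum(abs(candidate(x) - target(x)) <= eps for x in xs)
    return hits / len(xs)
```

For example, with the target sin(2x)·cos(2+x) sampled at 100 points in [0, 2π], a perfect candidate scores 1.0 and any function leaving the band at some points scores proportionally less.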
4.9 RESULTS
Given a sample of 100 points in the interval [0, 2π] and using the block marking system described in section 4.1, PGE successfully found the searched function sin(2*x)*cos(2+x) on the majority of runs. The graph (Fig. 5) shows the maximum fitness in the system for ten runs and their average (bold). With the phenotype-to-genotype projection disabled (Fig. 6), on the other hand, the majority of runs did not find the searched function within 120 generations.
We have simplified the generation of numbers by adding a new production rule, thus allowing the generation of functions containing integer constants. The described parallel system, together with the phenotype-to-genotype projection, improved the speed of the system. The progressive crossover and mutation eliminate the destruction of partial results and allowed us to generate more complicated functions (e.g. sin(2 * x)*cos(2 + x)).
Fig. 5: Convergence of the PGE using backward processing - maximum fitness vs. generation for ten runs (average in bold)
We have described a parallel system, Parallel Grammatical Evolution (PGE), that can map an integer genotype onto a phenotype with backward coding. PGE has proved successful for creating trigonometric identities. Parallel GEs with sexual reproduction can increase the efficiency and robustness of the system, and thus they can better track optimal parameters in a changing environment. From the experimental session it can be concluded that modified GEs with two sub-populations perform considerably better than classical versions of GEs.
Fig. 6: Convergence of the PGE using forward processing - maximum fitness vs. generation for ten runs (average in bold)
Fig. 7: Convergence of the PGE with 5 PCs using backward processing - fitness vs. generation (average in bold)
Fig.8: The parallel structure of PGE with 6 computers
The PGE algorithm was tested with a group of 6 computers in a computer network (see Fig. 8). Five computers calculated in the structure of five subsystems MR1, MR2, MR3, MR4 and MR5, with one master MR. The male subpopulation M of MR at the higher level follows the convergence of the subsystems. Fig. 7 presents 10 runs of the PGE program. The shortest computation took only 10 generations, and all calculations finished before generation 40. This compares favourably with backward processing on one computer (see Fig. 5); the forward processing on one computer was the slowest (see Fig. 6).
5 CONCLUSIONS
The increased awareness from other scientific communities, such as biology and mathematics, promises new insights and new opportunities. There is much to accomplish and there are many open questions. Interest from diverse disciplines continues to increase, and simulated evolution is becoming more generally accepted as a paradigm for optimization in practical engineering problems.
Parallel grammatical evolution can be used for the automatic generation of programs. This can help us to find information as a part of complexity. We are far from supposing that all difficulties have been removed, but the first results with PGEs are very promising.
REFERENCES
[1] O'NEILL, M.; RYAN, C. Grammatical Evolution: Evolutionary Automatic Programming in an Arbitrary Language. Kluwer Academic Publishers, 2003.
[2] O'NEILL, M.; BRABAZON, A.; ADLEY, C. The Automatic Generation of Programs for Classification Problems with Grammatical Swarm. Proceedings of CEC 2004, Portland, Oregon (2004) 104-110.
[3] PIASÉCZNY, W.; SUZUKI, H.; SAWAI, H. Chemical Genetic Programming - Evolution of Amino Acid Rewriting Rules Used for Genotype-Phenotype Translation. Proceedings of CEC 2004, Portland, Oregon (2004) 1639-1646.
[4] OŠMERA, P.; ŠIMONÍK, I.; ROUPEC, J. Multilevel Distributed Genetic Algorithms. In Proceedings of the International IEE/IEEE Conference on Genetic Algorithms, Sheffield (1995) 505-510.
[5] OŠMERA, P.; ROUPEC, J. Limited Lifetime Genetic Algorithms in Comparison with Sexual Reproduction Based GAs. Proceedings of MENDEL 2000, Brno, Czech Republic (2000) 118-126.
[6] OŠMERA, P. Evolution of Systems with Unpredictable Behavior. Proceedings of MENDEL 2004, Brno, Czech Republic (2004) 1-6.
[7] OŠMERA, P. Genetic Algorithms and their Applications. Habilitation thesis (in Czech), 2002.
Address:
Doc. Ing. Pavel Ošmera, CSc.
Institute of Automation and Computer Science
Brno University of Technology
Technicka 2,
616 69 Brno, Czech Republic
Tel.: +420 541 142 294
Fax: +420 541 142 490
e-mail: [email protected]
Address:
Bc. Ondřej Popelka
Institute of Automation and Computer Science
Brno University of Technology
Technicka 2,
616 69 Brno, Czech Republic
Tel.: +420 541 142 294
Fax: +420 541 142 490
e-mail: [email protected]
Address:
Prof. Ing. Imrich Rukovanský, CSc.
European Polytechnical Institute, s.r.o.
Osvobození 699,
686 04 Kunovice, Czech Republic
e-mail: [email protected]
OBJECT RECOGNITION BY MEANS OF NEW ALGORITHMS
Jiří Štastný, Martin Minařík
Brno University of Technology
Abstract: This document provides an overview of algorithms for object recognition. Three basic algorithms are described - recognition with the aid of moments, recognition with the aid of grammar, and the back-propagation algorithm, i.e. recognition with the aid of a neural network. Finally, the speed and applicability of these algorithms are compared.
Key-Words: Back Propagation Algorithm, Momentum, Grammar
1 Introduction
Pattern recognition consists in sorting objects into classes. A class is a subset of objects whose elements have common features from the classification standpoint. An object has a physical character; in computer vision it is most frequently taken to mean a part of a segmented image.
Methods for the classification of objects constitute the last and uppermost step in computer vision theory.
The following methods were mutually compared:
• Recognition with the aid of moments
• Recognition with the aid of grammar describing the edges of the object
• Recognition with the aid of a neural network (back propagation)
A real technological scene for object classification was simulated by digitizing five selected objects (see Fig. 1). For this purpose, two-dimensional images of three-dimensional objects were prepared. The aim was to test objects that resemble two-dimensional images of real objects. The choice of objects of similar shape was also intentional.
Fig. 1
2 Recognition with the aid of the momentum method
The resultant moment characteristics for object detection, which are used in the program, have the form:

φ1 = θ20 + θ02                                                              (1)

φ2 = (θ20 − θ02)² + 4θ11²                                                   (2)

φ3 = (θ30 − 3θ12)² + (3θ21 − θ03)²                                          (3)

φ4 = (θ30 + θ12)² + (θ21 + θ03)²                                            (4)

φ5 = (θ30 − 3θ12)(θ30 + θ12)[(θ30 + θ12)² − 3(θ21 + θ03)²] +
     (3θ21 − θ03)(θ21 + θ03)[3(θ30 + θ12)² − (θ21 + θ03)²]                  (5)

φ6 = (θ20 − θ02)[(θ30 + θ12)² − (θ21 + θ03)²] + 4θ11(θ30 + θ12)(θ21 + θ03)  (6)

φ7 = (3θ21 − θ03)(θ30 + θ12)[(θ30 + θ12)² − 3(θ21 + θ03)²] −
     (θ30 − 3θ12)(θ21 + θ03)[3(θ30 + θ12)² − (θ21 + θ03)²]                  (7)
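The moment flags above are the classical Hu invariants built from normalized central moments θpq. The following is a minimal NumPy sketch of the first two flags for a binary image; the function name is ours, and the full program would compute all seven invariants (1)-(7):

```python
import numpy as np

def moment_flags(img):
    """First two moment flags (phi1, phi2) from the normalized central
    moments theta_pq of a binary image given as a 2-D 0/1 array."""
    ys, xs = np.nonzero(img)                  # pixel coordinates of the object
    m00 = len(xs)                             # zeroth moment (object area)
    xbar, ybar = xs.mean(), ys.mean()         # centroid
    def theta(p, q):
        # central moment mu_pq, normalized for scale invariance
        mu = ((xs - xbar) ** p * (ys - ybar) ** q).sum()
        return mu / m00 ** (1 + (p + q) / 2)
    phi1 = theta(2, 0) + theta(0, 2)
    phi2 = (theta(2, 0) - theta(0, 2)) ** 2 + 4 * theta(1, 1) ** 2
    return phi1, phi2
```

Because central moments are computed relative to the centroid and normalized by the area, the flags are unchanged when the object is repositioned in the image, which is the invariance property the method relies on.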
The moment description represents binary and grey-shade areas. This method (see [7]) is based on computing seven
moment flags of the object. These moments are invariant with respect to repositioning, rotation and size of the object.
Recognition with the aid of the moment method has yielded very good results. This method faultlessly classified most
objects already at the learning stage on a model etalon. The moment method can be used with a different edge
detector than the one for which the moments were calculated, since the moments are not too sensitive to changes in
object edges, e.g. the Canny detector (at higher sigma it rounds the edges) and the Sobel operator. This method is unfit
for applications requiring the recognition of minimally dissimilar shapes, because it is not sensitive to minor shape
changes. It is fit for applications requiring fast recognition of dissimilar objects in different rotations.
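As an illustration, a minimal sketch (not the authors' program) of the first invariant ϕ1 computed for a binary object given as a set of pixel coordinates; it checks that ϕ1 is unchanged by repositioning and rotation of the object:

```python
import math

def central_moment(points, p, q):
    # Central moment mu_pq of a set of (x, y) object pixels.
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    return sum((x - cx) ** p * (y - cy) ** q for x, y in points)

def normalized_moment(points, p, q):
    # Scale-normalized central moment theta_pq = mu_pq / mu_00^(1 + (p+q)/2).
    mu00 = central_moment(points, 0, 0)
    return central_moment(points, p, q) / mu00 ** (1 + (p + q) / 2)

def phi1(points):
    # First moment invariant (1): phi1 = theta20 + theta02.
    return normalized_moment(points, 2, 0) + normalized_moment(points, 0, 2)

def rotate(points, angle):
    # Rotate the point set about the origin by the given angle.
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y) for x, y in points]

# An L-shaped point set and a translated, rotated copy of it.
shape = [(x, y) for x in range(10) for y in range(3)] + \
        [(x, y) for x in range(3) for y in range(3, 10)]
moved = rotate([(x + 5.0, y - 2.0) for x, y in shape], 0.7)
print(abs(phi1(shape) - phi1(moved)))  # close to 0: phi1 is invariant
```

The same check could be repeated for ϕ2 through ϕ7; only ϕ1 is shown to keep the sketch short.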
3 Recognition with the aid of grammar
While flag methods of pattern recognition make use of a quantitative description of objects by numerical parameters,
the flag vector, in syntactical methods the input description is of a qualitative nature reflecting the structure of the
object. The elementary properties of syntactically described objects are referred to as primitives. Primitives are edge
parts of a certain shape or, in a graph or relation description of areas, sub-areas of a certain shape.
The task of syntactical pattern recognition of an image is to determine whether the image under analysis corresponds to
the images of a given grammar, i.e. whether this grammar can generate this image. The image is represented by a
string of the language given by the grammar.
The simplest way of pattern recognition is „comparison with a model“. A string representing the image is compared with
elements of the sentence set representing single model images. In the comparison, either complete agreement with the
model is required, or partial agreement based on a certain matching criterion. This method is simple and rapid. If a
complete image description is necessary for pattern recognition, syntactical analysis is required. In object analysis tasks,
the aim is to obtain a description that includes not only the listing of recognized objects and their mutual
arrangement (structural information) but also their dimensions and the distances between them (semantic
information). In the design of a syntactical analyzer we must expect random effects such as image distortion.
Primitives are the basic building elements of an image. When choosing them, their easy recognition must be taken into
consideration. For images that are characterized by an edge or skeleton it is suitable to have parts of lines as primitives.
For example, a line segment can be characterized by its beginning and end, its length or angle. The same holds for
curves. The choice of primitives depends on the application being solved.
It generally holds:
• Primitives must be easy to recognize, also by existing non-syntactical methods
• Primitives must provide a compact and sufficient description of images by means of specified relations
If primitives of greater complexity are used, we obtain a simpler structural description of objects, and this results in
a simpler grammar for the description of objects. But it also leads to greater complexity when seeking such
primitives in an image. On the contrary, simpler primitives lead to a more complex grammar but are easy to
identify in the image. After the choice of primitives, the next important step is to set up the transcription rules of the
grammar, based on experience and knowledge.
An example of setting up a grammar for an object and choosing primitives is shown in Fig. 2. If, from the point marked
in Fig. 2, we proceed anticlockwise, the string will be:
dfbcajbcag
Due to the poor quality of the input image it may happen that the short straight segments shown in Fig. 2 will not be
detected, and thus the object's string will be changed:
dfbcjbcag
dfbcajcag
dfbcjcag
For this case it is suitable to modify the grammar so that it also generates the above strings. This approach is
suitable for expected deformations. In the case of random deformations it is advantageous to use classification by
means of a distance (the Levenshtein distance).
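Distance-based classification of the damaged strings above can be sketched as follows (a hedged illustration, not the authors' program; the second model string is a made-up alternative class):

```python
def levenshtein(a, b):
    # Classic dynamic-programming edit distance
    # (insertion, deletion and substitution each cost 1).
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (ca != cb)))    # substitution
        prev = cur
    return prev[-1]

def classify(string, models):
    # Pick the model string with the minimal edit distance, trying all
    # cyclic rotations of the input so the starting point does not matter.
    rotations = [string[i:] + string[:i] for i in range(len(string))]
    return min(models, key=lambda m: min(levenshtein(r, m) for r in rotations))

model = "dfbcajbcag"
for damaged in ["dfbcjbcag", "dfbcajcag", "dfbcjcag"]:
    print(classify(damaged, [model, "aaaaaaaaaa"]))  # dfbcajbcag each time
```

Each damaged string is only one or two edits away from the model, so it is classified correctly even though no grammar rule covers it.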
Fig. 2
In the program generated, a syntactical analyzer is applied, which operates using the following algorithm:
1. If not all the strings have been analysed, read a new string and proceed with step 2, otherwise proceed with step 7.
2. Perform the bottom-to-top analysis for the class selected.
3. If the string belongs to the language of the grammar of the selected class, proceed with step 6.
4. If the number of string rotations is less than the string length, rotate the string and proceed with step 2, otherwise
proceed with step 5.
5. If the number of object rotations is less than (360 / angle step), rotate the object by the given angle step and
proceed with step 2.
6. Enter the result and proceed with step 1.
7. Write the message about pattern recognition.
The syntactical analyzer has been designed for a left linear grammar.
String rotation: this means shifting the last terminal symbol to the beginning (1st rotation: abcde → eabcd → …).
Object rotation: this means rotating the object by a given angle and thus obtaining a different string.
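The analyzer loop with string rotation can be sketched as follows. This is a hedged simplification: a grammar is modelled as the plain set of strings its language contains, and object rotation by an angle step (step 5) is omitted; the class name "classA" is illustrative:

```python
def accepts(grammar, string):
    # Stand-in for the bottom-to-top analysis of step 2: a grammar is
    # modelled here simply as the set of strings its language contains
    # (the real analyzer parses with a left linear grammar instead).
    return string in grammar

def recognize(string, grammars):
    # Steps 2-6 for a single input string: try every cyclic rotation of
    # the string against every class grammar.
    for name, grammar in grammars.items():
        for i in range(len(string)):
            rotated = string[i:] + string[:i]   # cyclic rotation of the string
            if accepts(grammar, rotated):
                return name
    return None  # step 7 would report that no grammar matched

grammars = {"classA": {"dfbcajbcag", "dfbcjbcag", "dfbcajcag", "dfbcjcag"}}
print(recognize("agdfbcajbc", grammars))  # classA (a rotated classA string)
```

Because every rotation is tried, the string may start at any point on the object's edge and still be matched.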
If we need to classify N objects, we must create N classes, N grammars for them, and the respective languages L(G1),
L(G2), ..., L(GN). For example, if grammar Gx generates words containing only one terminal symbol b, then all the
objects containing just this one symbol b will belong to class X pertaining to this grammar. Objects containing more
than one symbol b will be further analysed using the remaining grammars. In the case that no grammar is found that
corresponds to the given string, the object will be rejected.
In the case of primitives marking single edge segments, the grammar is very sensitive to small mistakes in edge
detection. It is necessary to tailor the grammar to a definite type of edge detector. For example, it is a mistake to set up
the grammar for objects to which the zero-crossing operator was applied, and then to recognize objects by means of the
Canny edge detector with a sizeable sigma (it modifies edges).
Grammars are suitable in applications which require recognizing differently rotated objects and where the
emphasis is on recognizing small changes in the edge segments. Setting up a grammar requires time and knowledge
of the grammar description of edges. Preparing the rules of the grammar must be done manually; it is not done
automatically as in the other methods.
4 Back Propagation Algorithm
The back-propagation algorithm is an iterative method by which the network gets from an initial non-learned state to the
fully learned one (see [10]). The algorithm can be described in the following way:
random initialization of weights;
repeat
repeat
choose_pattern_from_training_set;
put_chosen_pattern_in_input_of_network;
compute_outputs_of_network;
compare_outputs_with_required_values;
modify_weights;
until all_patterns_from_training_set_are_chosen;
until total_error < criterion;
The learning algorithm of back-propagation is essentially an optimization method that is able to find the weight
coefficients and thresholds for the given neural network and training set. The network is assumed to be made up of
neurons the behaviour of which is described by the formula:
y = S( Σ_{i=1..N} w_i x_i + Θ )    (17)
where the output nonlinear function S is defined by the formula:
S(ϕ) = 1 / (1 + e^(−γϕ))    (18)
where γ determines the curve steepness in the origin of coordinates. Input and output values are assumed to be in the
range < 0, 1 >.
In the following formulas the parameter o denotes the output layer, h the hidden layer, and i, j the indexes. Index i
indexes output neurons and index j their inputs. Then y_i^h means the output of the i-th neuron of the hidden layer,
and w_ij^o means the weight connecting the i-th neuron of the output layer with the j-th neuron of the previous hidden layer.
The corresponding back-propagation algorithm can be written in the following steps:
1. Initialization. You set at random all the weights in the network to values in the recommended range <−0.3, 0.3>.
2. Pattern submission. You choose a pattern from the training set and put it on the network inputs. Then you compute
the outputs of particular neurons by relations (17) and (18).
3. Comparison. First you compute the neural network energy (SSE) by relation (19):

E = (1/2) Σ_{i=1..n} (y_i − d_i)²    (19)

Then you compute the error for the output layer by the relation:

δ_i^o = (d_i − y_i^o) y_i^o γ (1 − y_i^o)    (20)

4. Back-propagation of the error and weight modification. You compute for all neurons in the layer:
Δw_ij^l(t) = η δ_i^l(t) y_j^{l−1}(t) + α Δw_ij^l(t−1)    (21)

ΔΘ_i^l(t) = η δ_i^l(t) + α ΔΘ_i^l(t−1)    (22)
By the relation:

δ_i^{h−1} = y_i^{h−1} (1 − y_i^{h−1}) Σ_{k=1..n} w_ki^h δ_k^h    (23)

you back-propagate the error to the layer nearer the inputs. Then you modify the weights:
w_ij^l(t+1) = w_ij^l(t) + Δw_ij^l(t)    (24)

Θ_i^l(t+1) = Θ_i^l(t) + ΔΘ_i^l(t)    (25)
You apply step 4 to all the layers of the network, starting with the output layer and continuing with the hidden layers.
5. Termination of pattern selection from the training set. If you have submitted all patterns from the training set to the
network, continue with step 6, else go back to step 2.
6. Termination of the learning process. If the neural network energy in the last computation has been less than the
selected criterion, terminate the learning process, else continue with step 2.
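The steps above can be sketched as a minimal pure-Python training loop. This is a hedged illustration, not the authors' program: a hypothetical 2-2-1 network trained on XOR, with update rules following (19)-(25), learning rate η (eta), momentum α (alpha) and sigmoid steepness γ (gamma):

```python
import math
import random

def train_xor(epochs=3000, eta=0.5, alpha=0.9, gamma=1.0, seed=1):
    # Hypothetical 2-2-1 network; weights start in <-0.3, 0.3> as in step 1.
    rnd = random.Random(seed)
    w1 = [[rnd.uniform(-0.3, 0.3) for _ in range(3)] for _ in range(2)]  # hidden weights + threshold
    w2 = [rnd.uniform(-0.3, 0.3) for _ in range(3)]                      # output weights + threshold
    dw1 = [[0.0] * 3 for _ in range(2)]                                  # previous weight deltas
    dw2 = [0.0] * 3
    S = lambda p: 1.0 / (1.0 + math.exp(-gamma * p))                     # sigmoid (18)
    data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
    history = []
    for _ in range(epochs):
        sse = 0.0
        for (x1, x2), d in data:
            h = [S(w[0] * x1 + w[1] * x2 + w[2]) for w in w1]            # hidden outputs (17)
            y = S(w2[0] * h[0] + w2[1] * h[1] + w2[2])                   # network output
            sse += 0.5 * (y - d) ** 2                                    # energy (19)
            do = (d - y) * y * gamma * (1 - y)                           # output error (20)
            dh = [h[i] * (1 - h[i]) * w2[i] * do for i in range(2)]      # back-propagated error (23)
            for i in range(2):                                           # updates (21), (24)
                dw2[i] = eta * do * h[i] + alpha * dw2[i]
                w2[i] += dw2[i]
            dw2[2] = eta * do + alpha * dw2[2]                           # thresholds (22), (25)
            w2[2] += dw2[2]
            for i in range(2):
                for j, v in enumerate((x1, x2, 1.0)):
                    dw1[i][j] = eta * dh[i] * v + alpha * dw1[i][j]
                    w1[i][j] += dw1[i][j]
        history.append(sse)
    return history

hist = train_xor()
print(hist[0], hist[-1])  # energy at the first and last epoch; it decreases
```

The loop implements the repeat-until structure of the pseudocode above; the stopping criterion is replaced here by a fixed number of epochs for simplicity.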
Flag vectors have been submitted to the network, with arm lengths transformed into values from the interval <0, 1>.
The number of vector components has been set to 140 in the implemented computer program.
This method is the fastest of all the methods under comparison. For the description of objects using this method, 70
symptomatic vectors were used that went from the centre of gravity to the object edges. This method can recognize objects
with considerably modified shapes, but it may misclassify objects of similar shape. This method recognizes
differently rotated objects. The error of the method increases with decreasing size of the objects.
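The 70 symptomatic vectors from the centre of gravity to the object edges can be sketched as a radial signature. This is a hedged reconstruction (the original program's exact sampling is not specified); the object is again a set of pixel coordinates:

```python
import math

def radial_signature(points, n_rays=70):
    # Feature vector: length from the centre of gravity to the farthest
    # object point in each of n_rays directions, scaled into <0, 1>.
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    rays = [0.0] * n_rays
    for x, y in points:
        r = math.hypot(x - cx, y - cy)
        angle = math.atan2(y - cy, x - cx) % (2 * math.pi)
        k = int(angle / (2 * math.pi) * n_rays) % n_rays
        rays[k] = max(rays[k], r)          # farthest edge point in this direction
    m = max(rays) or 1.0
    return [r / m for r in rays]           # normalized into <0, 1>

square = [(x, y) for x in range(21) for y in range(21)]
sig = radial_signature(square)
print(len(sig), max(sig))  # 70 components, maximum value 1.0
```

The resulting 70 values (scaled arm lengths) would form the network's input vector.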
5 Conclusion
Recommended applications of the pattern recognition methods:
Grammar recognition is suitable where the recognition of rotated objects is required, where single edge segments
need to be detected with high accuracy without the risk of significant errors, and where high-speed
classification is required. Regarded as significant is an error that cannot be covered by the rules.
Recognition with the aid of moments is suitable where the edge course is not very important and where a rough division
into single classes is sufficient. For example, it does not matter whether there is a sharp transition or a short curve
between two edges.
Recognition with the aid of a neural network is suitable where high-speed classification of randomly rotated objects
is required and where we need to tolerate some differences between the learned etalons and the classified objects.
The fastest methods for pattern recognition are recognition with the aid of a grammar and recognition with the aid of a
neural network. The moment method is the slowest.
Acknowledgement
This research was supported by the grants:
No 102/03/0434 Limits for broad-band signal transmission on the twisted pairs and other system co-existence. The
Grant Agency of the Czech Republic (GACR)
No 102/03/0260 Development of network communication application programming interface for new generation of
mobile and wireless terminals. The Grant Agency of the Czech Republic (GACR)
No 102/03/0560 New methods for location and verification of compliance of quality of service in new generation
networks. The Grant Agency of the Czech Republic (GACR)
No CEZ: J22/98: 261100009 Nontraditional methods for investigating complex and vague systems
No CZ 400011(CEZ 262200011) Research of communication systems and technologies (Research design)
Grant 1570 F1 New approach to the subject High-speed Communication Systems (grant of the Czech Ministry of
Education, Youth and Sports)
Grant 1563 F1 Restructure of telecommunication objects for third age university (grant of the Czech Ministry of
Education, Youth and Sports)
References:
[1] FISHER, R.: World and Scene Representations. [Online], 2002. <www.dai.ed.ac.uk/CVonline/repres.htm>
[2] HEALTH, M. and SARKAR, S.: Edge detection comparison. [Online], 1996. <marathon.csee.usf.edu/edge/edge_detection.html>
[3] GONZALES, R. C. and WOODS, R. E.: Digital Image Processing. Addison-Wesley Publishing Co., New York 1993.
[4] ŽÁRA, J. and BENEŠ, B.: Model Computer Graphic. Computer Press, Prague 1996.
[5] SMITH, M. W. and DAVIS, W. A.: A New Algorithm for Edge Detection. John Wiley, New York 1974.
[6] SVITÁK, R.: Edge Detection on Images. [Project-online]. ZU FAV, Plzeň 2001. <http://herakles.zcu.cz/~rsvitak/school/zvi_rsvitak.pdf>
[7] ŠONKA, M., HLAVÁČ, V. and BOYLE, R.: Image Processing, Analysis and Machine Vision. PWS, Boston 1998.
[8] JEŽEK, B.: Computer Graphics II. [Lectures-online]. UHK FIM, Hradec Králové 2002. <http://lide.uhk.cz/home/fim/ucitel/fujezeb1/www/>
[9] BULB, M.: Programming. [Online]. Prague 2000. <www.freesoft.cz/projekty/vyhen/clanky/prog/bres.html>
[10] ŠNOREK, M. and JIŘINA, M.: Neuronové sítě a neuropočítače. Prague 1998.
[11] ŠONKA, M.: Course Digital Image Processing. [Online]. <css.engineering.uiowa.edu/~DIP>, Prague 2002.
[12] ZAHN, C. T.: Fourier Descriptors for Plane Closed Curves. In: IEEE Trans. on Computers, vol. C-21, no. 3, 1972.
[13] LOPEZ-CAVIEDES, M. and SANCHEZ-DIAZ, G.: A New Clustering Criterion in Pattern Recognition. International Journal WSEAS Transactions on Computers, Issue 3, Volume 3, July 2004, ISSN 1109-2750.
[14] RODRIGUEZ, J. N. et al.: An Artificial Vision System for Identify and Classify Objects. International Journal WSEAS Transactions on Computers, Issue 2, Volume 3, April 2004, ISSN 1109-2750.
Address:
Ing. RNDr. Jiří Šťastný, CSc.
Institute of Automation and Computer Science
Brno University of Technology
Technicka 2,
616 69 Brno, Czech Republic
Tel.: +420 541 142 294
Fax: +420 541 142 490
e-mail: [email protected]
Address:
Ing. Martin Minařík
Institute of Automation and Computer Science
Brno University of Technology
Technicka 2,
616 69 Brno, Czech Republic
Tel.: +420 541 142 294
Fax: +420 541 142 490
e-mail: [email protected]
APPLICATION OF GRAPH THEORY IN AN INTELLIGENT TRANSPORT SYSTEM³
Tomáš Klieštik
Žilinská univerzita
Abstract: The article deals with the application of graph theory in an intelligent transport system, specifically
with the application of Hamiltonian cycles in a graph, i.e. it solves the travelling salesman problem. Using this
problem, a transport company can optimize (minimize) its transport costs by minimizing the
transport route. The problem is illustrated on a model example and solved with the optimization
software LINGO.
Keywords: graph, cycle in a graph, minimization, objective function, constraints, intelligent transport
system
We describe and examine real systems by means of ideal mathematical objects. One such ideal object, created by
mathematics and serving to express many situations that are often quite different in content, is the graph.
Graph theory is among the youngest mathematical disciplines; as a systematic science it took shape only in the 1930s.
Leonhard Euler is considered one of the first pioneers of graph theory; he became famous, among other things, for
solving the problem of the seven bridges of Königsberg. The task consisted in designing a circular route that crosses
all the bridges, but each of them only once. Euler proved that no such circular route exists. Graph theory was further
developed by G. R. Kirchhoff, K. Appel, W. Haken, W. R. Hamilton and others. In this paper I will discuss in more
detail the graph theory concepts first defined by William Rowan Hamilton, and apply them to the so-called travelling
salesman problem, or circuit problem.
This problem can be solved in two ways: by heuristic methods or by linear programming. I will solve the circuit
problem by linear programming methods. From the computational point of view it is an extremely time-consuming
task, and therefore we outline, on a model example, a possible solution using the optimization software LINGO.
In the travelling salesman problem we have to determine in what order the travelling salesman, starting from a certain
place, visits each of the other cities exactly once and returns to the starting city. It is assumed that the number of cities
in the network n is known, as well as the distances between the individual cities cij (i = 1, 2, ..., n; j = 1, 2, ..., n). The
problem can be written as a linear programming task where the variables are bivalent, i.e. if the trip on the respective
route is realized, the value of the variable is 1; if the trip does not take place, the value of the variable equals 0.
We consider that the trip between any two places takes place in the t-th step (t = 1, 2, ..., n). We introduce bivalent
variables xijt into the task. If the trip from place i to place j takes place in the t-th step, xijt = 1; if not, xijt = 0.
For the coefficients cij the following holds:

       | cij, if a route from place i to place j exists
cij =  | 0, if i = j                                                       (1)
       | M, if no route from place i to place j exists

³ This paper is an output of the scientific project CISKO, Š. et al.: Economic aspects of the intelligent transport system
(transport telematics) in road transport, project VEGA 1/12349/04, ŽU v Žiline, FPEDaS.
The problem can then be formulated mathematically as follows:

min Σ_{i=1..n} Σ_{j=1..n} Σ_{t=1..n} cij xijt    (2)

subject to

Σ_{i=1..n} Σ_{j=1..n} xijt = 1,  t = 1, 2, ..., n    (3)

Σ_{j=1..n} Σ_{t=1..n} xijt = 1,  i = 1, 2, ..., n    (4)

Σ_{i=1..n} Σ_{t=1..n} xijt = 1,  j = 1, 2, ..., n    (5)

Σ_{i=1..n} xijt − Σ_{k=1..n} xjk(t+1) = 0,  j = 1, 2, ..., n;  t = 1, 2, ..., n − 1    (6)

Σ_{i=1..n} xijn − Σ_{k=1..n} xjk1 = 0,  j = 1, 2, ..., n    (7)

xijt ∈ {0, 1}    (8)
Conditions (3) to (5) ensure that the travelling salesman arrives at each place only once and also departs from each
place only once, performing n trips in total. Conditions (6) and (7) ensure that if the travelling salesman makes a trip
to some place s in the k-th step, he can depart in the (k+1)-th step only from that same place s. Conditions (3) to (7)
ensure the continuity of the circular route and prevent the formation of subtours. Condition (8) is the bivalence condition.
The linear programming task then contains n·n·n variables and n+n+n+n+(n−1)+n constraints⁴. Although the dimensions
of the task are not a problem from the computational point of view and it can be solved quickly with one of the
optimization software packages, they bring considerable complications when setting up the task.
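A quick, hedged sketch (city and variable names are illustrative) confirming that the optimal route reported by LINGO later in the paper, A-E-C-D-B-A, encoded as the bivalent variables x_ijt, satisfies conditions (3) to (7):

```python
# Encoding of the route A-E-C-D-B-A as x[i, j, t] over n = 5 places.
cities = "ABCDE"
tour = ["A", "E", "C", "D", "B"]
n = len(cities)
x = {(i, j, t): 0 for i in cities for j in cities for t in range(1, n + 1)}
for t in range(n):
    x[(tour[t], tour[(t + 1) % n], t + 1)] = 1

# (3): exactly one trip is made in each step t.
assert all(sum(x[(i, j, t)] for i in cities for j in cities) == 1
           for t in range(1, n + 1))
# (4) and (5): each place is departed from once and arrived at once.
assert all(sum(x[(i, j, t)] for j in cities for t in range(1, n + 1)) == 1
           for i in cities)
assert all(sum(x[(i, j, t)] for i in cities for t in range(1, n + 1)) == 1
           for j in cities)
# (6): the trip in step t+1 starts where the trip in step t ended.
assert all(sum(x[(i, j, t)] for i in cities) == sum(x[(j, k, t + 1)] for k in cities)
           for j in cities for t in range(1, n))
# (7): the last trip ends where the first trip started.
assert all(sum(x[(i, j, n)] for i in cities) == sum(x[(j, k, 1)] for k in cities)
           for j in cities)
print("conditions (3)-(7) hold for the five-step tour")
```

Condition (8) holds trivially, since every x value is 0 or 1.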
The task can be converted to the so-called Tucker formulation of the travelling salesman problem. Bivalent variables
xij, denoting the trip between place i and place j, are introduced. To prevent the creation of subtours in the transport
network, additional conditions of the following form are introduced into the linear programming task:

ui − uj + n·xij ≤ n − 1,  i, j = 2, 3, ..., n;  i ≠ j

where the variables ui and uj can take arbitrary values (they are real numbers assigned to place i and place j,
respectively). The travelling salesman problem can then be formulated as a linear programming task in the following
way:
min Σ_{i=1..n} Σ_{j=1..n} cij xij    (9)

subject to

Σ_{i=1..n} xij = 1,  j = 1, 2, ..., n    (10)

Σ_{j=1..n} xij = 1,  i = 1, 2, ..., n    (11)
⁴ In this paper I will illustrate the problem on a model example with five places, i.e. the task would have 125 variables
and 29 constraints.
ui − uj + n·xij ≤ n − 1,  i, j = 2, 3, ..., n;  i ≠ j    (12)

xij ∈ {0, 1},  i, j = 1, 2, ..., n    (13)
In the task formulated this way, the number of variables and the number of constraints is already considerably
reduced⁵. Even so, the task is fairly extensive already with 5 places, and it is therefore advisable to use one of the
optimization software packages. I decided to solve the task in LINGO, which contains a special language that lets us
simplify the formulation of the task a little further. The disadvantage is that only a teaching demo version is freely
available; it works like the full version but has certain limitations. One of them is the number of bivalent variables,
only 30, i.e. we can optimize only five places.
The task is written as follows:
MODEL:
!Travelling salesman problem;
SETS:
MIESTO/A,B,C,D,E/:U;
MATICA(MIESTO,MIESTO):X,KM;
ENDSETS
!minimization of the number of kilometres travelled;
MIN= @SUM( MATICA:KM*X);
!row and column sums equal 1;
@FOR( MIESTO(I):@SUM(MIESTO(J):X(I,J))=1);
@FOR( MIESTO(J):@SUM(MIESTO(I):X(I,J))=1);
!variables U may be arbitrary;
@FOR( MIESTO:@FREE(U));
!Tucker subtour-elimination conditions;
@FOR( MIESTO(I)|I#GT#1:@FOR( MIESTO(J)|J#GT#1:
U(I)-U(J)+@SIZE(MIESTO)*X(I,J)<=@SIZE(MIESTO)-1));
!bivalence conditions;
@FOR(MATICA:@BIN(X));
DATA:
!distance matrix;
KM=0 25 60 87 42
25 0 68 12 58
60 68 0 33 40
87 12 33 0 71
42 58 40 71 0;
ENDDATA
END
⁵ In the model example it will be only 25 variables and 17 constraints.
After running the Solve command we obtain the following solution:

Global optimal solution found at iteration:        65
Objective value:                             152.0000

Variable        Value          Reduced Cost
U( A)           0.000000       0.000000
U( B)           4.000000       0.000000
U( C)           2.000000       0.000000
U( D)           3.000000       0.000000
U( E)           0.000000       0.000000
X( A, A)        0.000000       0.000000
X( A, B)        0.000000       25.00000
X( A, C)        0.000000       60.00000
X( A, D)        0.000000       87.00000
X( A, E)        1.000000       42.00000
X( B, A)        1.000000       25.00000
X( B, B)        0.000000       0.000000
X( B, C)        0.000000       68.00000
X( B, D)        0.000000       12.00000
X( B, E)        0.000000       58.00000
X( C, A)        0.000000       60.00000
X( C, B)        0.000000       68.00000
X( C, C)        0.000000       0.000000
X( C, D)        1.000000       33.00000
X( C, E)        0.000000       40.00000
X( D, A)        0.000000       87.00000
X( D, B)        1.000000       12.00000
X( D, C)        0.000000       33.00000
X( D, D)        0.000000       0.000000
X( D, E)        0.000000       71.00000
X( E, A)        0.000000       42.00000
X( E, B)        0.000000       58.00000
X( E, C)        1.000000       40.00000
X( E, D)        0.000000       71.00000
X( E, E)        0.000000       0.000000
It follows that the optimal route is A-E-C-D-B-A, the transport cost of this route is 152 units, and the solution is
reached in 65 iterations, i.e. under the given input conditions there is no other route leading to lower costs than the
route we found by means of the travelling salesman problem.
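Because the demo example has only five places, the LINGO result can be cross-checked by brute force: fixing A as the start and enumerating the 4! = 24 remaining orders confirms the minimum cost of 152 (reached by A-E-C-D-B-A and by its reverse, which has the same cost). A small sketch:

```python
from itertools import permutations

# Distance matrix from the DATA section of the LINGO model.
KM = {"A": {"A": 0, "B": 25, "C": 60, "D": 87, "E": 42},
      "B": {"A": 25, "B": 0, "C": 68, "D": 12, "E": 58},
      "C": {"A": 60, "B": 68, "C": 0, "D": 33, "E": 40},
      "D": {"A": 87, "B": 12, "C": 33, "D": 0, "E": 71},
      "E": {"A": 42, "B": 58, "C": 40, "D": 71, "E": 0}}

def tour_cost(order):
    # Cost of the closed tour that visits the cities in the given order.
    return sum(KM[a][b] for a, b in zip(order, order[1:] + order[:1]))

# Fix A as the start and enumerate the remaining 4! = 24 orders.
best = min((["A"] + list(p) for p in permutations("BCDE")), key=tour_cost)
print(tour_cost(best))  # 152, matching the LINGO objective value
```

This exhaustive check is feasible only for tiny instances; for larger n the linear programming formulation above (or a heuristic) is needed.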
References:
[1] UNČOVSKÝ, L.: Modely sieťovej analýzy. Bratislava: Alfa, 1991.
[2] BREZINA, I.; IVANIČOVÁ, Z.: Kvantitatívne metódy v logistike. Bratislava: Ekonóm, 1999.
[3] DADO, M. et al.: TASID – technológie a služby inteligentnej dopravy, vedecko-technický projekt č. AV/819/2002; CISKO, Š. et al.: čiastkový projekt – Ekonomické a mimoekonomické efekty a hodnotenie investícií, ŽU v Žiline, 2002-2005.
[4] GREGOVÁ, E.: Regionálne aspekty globalizácie: nová úloha konkurenčnej schopnosti regiónu. In: Globalizácia a jej sociálno-ekonomické dôsledky '05: Zborník z medzinárodnej vedeckej konferencie. Rajecké Teplice 2005. ISBN 80-8070-463-5.
Address:
Ing. Tomáš Klieštik, PhD.
Žilinská univerzita,
010 26 Žilina
Tel.: 041/5133221
e-mail: [email protected]
THE VORTEX-FRACTAL THEORY OF THE UNIVERSE STRUCTURES
Pavel Ošmera
Brno University of Technology
Abstract: The strength of physical science lies in its ability to explain phenomena as well as make
predictions based on observable and repeatable phenomena according to known laws. Science is
particularly weak in examining unique, nonrepeatable events. We try to piece together the knowledge of
evolution with the help of biology, informatics and physics to describe a complex vortex structure of the
universe. Evolution is a procedure where matter, energy, and information come together. We would like to
find plausible unifying mechanisms for an explanation of vortex systems. Investigators with
specialized training in overlapping disciplines can bring new insights to the area of study, enabling them
to make original contributions. This paper is an attempt to explain a vortex-fractal principle of universe
structures and vortex light rays, and to explain what gravitation is.
Keywords: evolution, universe, basic particle structure, light, gravitation
1. Introduction
Matter has an innate tendency to self-organize and generate complexity [1-9]. This tendency has been at work since
the birth of the universe, when a pinpoint of featureless matter budded from “nothing” at all [11]. Irreversibility and
nonlinearity characterize phenomena in every field of complexity. Nonlinearity causes small changes on one level of
organization to produce large effects (anomalies) at the same or higher levels. The smallest of events can lead to the
most massive consequences. We can see an emergent property, which manifests as the result of positive and negative
feedback. But the global features of the system cannot be understood only by analyzing the parts separately. Deterministic
chaos arises from the infinitely complex fractal structure (see Fig. 1). A fractal’s form is the same no matter what length
scale we use. By using the techniques of parallelism and massive parallelism in computer simulations we come a little
closer to explaining the basic principles of complex systems. Our attention is directed to the most efficient algorithms of
turbulence simulation, which can help us understand the behavior of very complex fractal objects such as a whirl. Chaotic
systems are exquisitely sensitive to initial conditions, and their future behavior can only be reliably predicted over a
short time period. Moreover, the more chaotic the system, the less compressible its algorithmic representation.
Turbulence is regarded as one of the “grand challenge” problems in contemporary high-performance computing.
Despite the astonishing progress during the fifty years since the visionary work of von Neumann, simulating turbulent
fluid flow in a realistic way is still largely beyond the capability of today's computers. In essence, the common underlying
theme linking the complexity of nature with computation is the emergence of complex organized
behavior from many simpler cooperative and conflicting interactions between the microscopic components, whether
they are spinning electrons, atoms, etc.
Fig. 1 A spiral structure as a fractal
Earthquakes, avalanches, and financial crashes have a common fingerprint: the distribution of events follows a
simple power law [3], [11]. This power law means that the physics of small avalanches is the same as that of large ones.
Self-organization is a natural consequence of the time evolution of vast aggregates of simple agents (particles). By making
these agents interact in a more complex way we could create an even greater variety of behavior, such as spiral
structures (see Fig. 2b) reminiscent of galaxies (see Fig. 3a,b), hurricanes (see Fig. 3c), tornadoes, and particles of matter.
Nonliving things, for instance crystals, are capable of self-reproduction during growth. Evolution on the edge of chaos
can be extended to nonliving systems [6-8]. The negative forces are caused by negative fluctuation, and positive
forces are caused by positive fluctuation and by selection as an influence of boundary conditions.
Fractals seem to be very powerful in describing natural objects on all scales. Fractal dimension and fractal measure are
crucial parameters for such a description [12-15]. Many natural objects have self-similarity or partial self-similarity between
the whole object and its parts. Different physical quantities describe properties of fractal objects in E-dimensional
Euclidean space with a fractal dimension D [12]. The fractal dimension D depends on the inter-relation between the number
of repetitions and the reduction of the individual object. There is a relationship between the dimensionality and the fractal
properties of matter, which contains the constant of the golden mean φ = (√5 − 1)/2 = 0.6180339887. The constant φ is a
special case of the fractal dimension D defined by the condition D (D − E + 2) = 1 for E = 3 [12]. Links between the inverse
coupling constants of various interactions (gravitational, electromagnetic, weak and strong) in three-dimensional Euclidean
space are discussed in [13]. Different properties of particles (and interactions between them) correspond to specific
values of the fractal dimension. The values D = 0, E − 2, E − 1, E play the most important role in such an analysis
[13].
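The quoted condition can be checked numerically: for E = 3 it reduces to D(D − 1) = 1, whose positive root is the golden ratio (1 + √5)/2, and the quoted φ = 0.618... is its reciprocal (equivalently, D = 1 + φ = 1/φ). A short verification:

```python
import math

E = 3
phi = (math.sqrt(5) - 1) / 2        # golden mean quoted in the text
D = (1 + math.sqrt(5)) / 2          # positive root of D * (D - E + 2) = 1

print(D * (D - E + 2))              # 1.0 up to rounding
print(abs(D - 1 / phi), abs(D - (1 + phi)))  # both ~0: D = 1/phi = 1 + phi
```

Algebraically D·φ = ((1 + √5)/2)·((√5 − 1)/2) = (5 − 1)/4 = 1 exactly, so the two constants are reciprocals.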
There exists a large body of knowledge about the process of natural evolution that can be used to guide simulations.
This process is well suited for solving problems with unusual constraints where heuristic solutions are not available or
generally lead to unsatisfactory results. A revolution often has an interdisciplinary character: its central discoveries often
come from people straying outside the normal bounds of their specialties.
Naturalistic explanations of the universe's origin are speculative [1,9,11]. But does this mean such inquiries are impotent or
without value? The same criticism can be made of any attempt to reconstruct unique events in the past. We cannot
complete our knowledge without answering some fundamental questions about nature. How did the universe begin?
What is turbulence? Above all, in a universe ruled by entropy, drawn inexorably toward greater and greater disorder,
how does order arise? Although the various speculative origin scenarios may be tested against data collected in
laboratory experiments, these models cannot be tested against the actual events in question, i.e., the origin of complex
structures. Such scenarios, then, must remain speculation, not knowledge. There is no way to know whether the
results of these experiments tell us anything about the way the universe itself evolved. In a strict sense, these speculative
reconstructions are not falsifiable; they may only be judged plausible or implausible. In the familiar Popperian sense of what
science is, a theory is deemed scientific if it can be checked or tested by experiment against observable, repeatable
phenomena. The behavior of complex nonlinear systems with unpredictable dynamics can be demonstrated by a relatively
simple and transparent system such as a magnetic pendulum [8]. The idea is to set the pendulum swinging and guess which
attractor will win. Even with just three magnets placed in a triangle, the pendulum's motion cannot be predicted. This
unexpected behavior extends to physiological and psychiatric medicine, economic forecasting, and perhaps the
evolution of society. A physicist could not truly understand turbulence or complexity without understanding pendulums.
Chaos began to unite the study of different systems. A simulation brings its own problem: the tiny imprecisions built
into each calculation rapidly take over, because such a system depends sensitively on its initial conditions. But
people have to know about disorder if they are going to deal with it. Classical scientists want to discover regularities. It
is not easy to find the grail of science, the Grand Unified Theory or "theory of everything". On the other hand, there is a
trend in science toward reductionism, the analysis of systems only in terms of their constituent parts: quarks,
chromosomes, or neurons. Other scientists believe they should be looking at the whole.
Magnetic fields are most easily understood in terms of magnetic field lines. These field lines define the direction and
strength of the magnetic field at any location in 3D nonlinear space. Field lines have both direction and
strength: the closer we are to a magnetic source, the stronger the field. Magnetic field lines always begin on
the north pole of a magnet and end on the south pole. The field of a magnetic dipole is approximately
proportional to the inverse cube of the distance from the dipole. Therefore, if we double the distance from the magnet,
the field strength is reduced by a factor of 8. The magnetic system of a magnetic pendulum is very
complex [8]. Even if we know the initial state, we cannot predict the final state; with just three magnets on the base
plate, we cannot predict the motion. Conversely, if we know the final state, we cannot derive the history back to the
initial state. The same problem arises with the universe's origin.
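The inverse-cube falloff just described can be checked numerically. A minimal sketch, with the reference field B0 and distances as arbitrary illustrative values:

```python
def dipole_field(b0, r0, r):
    """Far-field magnitude of a magnetic dipole: B falls off as 1/r^3.
    b0 is the field at a reference distance r0 (hypothetical units)."""
    return b0 * (r0 / r) ** 3

b_near = dipole_field(8.0, 1.0, 1.0)  # field at the reference distance
b_far = dipole_field(8.0, 1.0, 2.0)   # twice as far from the magnet
print(b_near / b_far)  # doubling the distance weakens the field 8-fold: prints 8.0
```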
„ICSC– FOURTH INTERNATIONAL CONFERENCE ON SOFT COMPUTING APPLIED IN COMPUTER AND ECONOMIC ENVIRONMENT”
EPI Kunovice, Czech Republic, January 27, 2006
110
2. Self-organization of complex systems
Complex systems share certain crucial properties (non-linearity, a complex mixture of positive and negative feedback,
nonlinear dynamics, emergence, collective behavior, spontaneous organization, etc.). In the natural world, such systems
include the universe, brains, immune systems, ecologies, cells, developing embryos, and ant colonies. In the human world,
they include cultural and social systems [5, 6, 8]. Each of these systems is a network of many "agents" acting in
parallel. In a brain, the agents are nerve cells; in an ecology, the agents are species; in a cell, the agents are organelles such
as the nucleus and the mitochondria; in an embryo, the agents are cells, and so on. Each agent finds itself in the
environment produced by its interactions with the other agents in the system; it is constantly acting and reacting to what
the other agents are doing. There are emergent properties, arising from the interaction of many parts: the kinds of things a
group of agents can do collectively that no individual agent can do alone. There is no master agent, for example no
master neuron in the brain. Complex systems have many levels of organization (hierarchical structures), with agents at
one level serving as building blocks for agents at a higher level. An example of a self-organized structure is a whirlpool
(see Fig. 2a). Nonlinearity in feedback processes serves to regulate and control. Evolution is chaos with feedback [17].
3. Vortex structures
Perhaps vortex structures with vortex lines, such as those created approximately in a whirlpool or a tornado, are a
plausible speculation about elementary particle structures. A whirlpool structure (a turbulent eddy) with a funnel shape
can be observed, for example, at the water outlet of a bath or in a PET bottle (see Fig. 2a). The streamlines are spirals (or circles)
about a vortex axis, similar to the lines of the magnetic field around a wire carrying a current. The velocity v of the flow
is inversely proportional to the distance from the vortex axis, as can be observed at the drain hole of a bath-tub [16,18].
The speed v depends on friction. In the bath-tub the core is replaced by air; in a hurricane, the core is called the eye. If two
or more vortex lines lie parallel side by side in the fluid, the core of each vortex line must move in the velocity field
arising from the other vortex lines. Thus two parallel vortex filaments with opposite rotation (spin) travel side by side
along straight courses (see Fig. 6a), whereas with the same spin they dance round each other (Fig. 6b). If three or more
vortices work together (see Fig. 6c), a more complex structure can be created. If one bends a vortex line into a
closed ring, the vortex ring moves with unchanging shape in a straight line: each part of the ring must move in the
velocity field of all the other parts. For example, experienced smokers can blow smoke-rings (see Fig. 4).
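The v ∝ 1/r relation for the flow around a vortex axis can be written down directly. A minimal sketch using the ideal line-vortex formula v = Γ/(2πr), where the circulation Γ is a hypothetical constant:

```python
import math

def vortex_speed(gamma, r):
    """Tangential flow speed around an ideal line vortex, v = gamma / (2*pi*r):
    inversely proportional to the distance r from the vortex axis.
    gamma (the circulation) is an illustrative constant here."""
    return gamma / (2.0 * math.pi * r)

v_near = vortex_speed(1.0, 0.1)  # speed close to the axis
v_far = vortex_speed(1.0, 0.2)   # speed twice as far out
print(v_near / v_far)  # doubling the distance halves the speed
```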
Fig. 2 a) The vortex in the PET bottle; b) The vortex model
Fig. 3 Examples of spiral structures: a), b) galaxies, c) a hurricane on Earth
Fig. 4 The annular structure that can be created from vortexes (for example the electron)
The presented theory is based on the following assumptions and hypotheses:
1. There is a hidden substance in the universe that contains very small sub-particles with unmeasurable mass.
2. We will call these sub-particles "osmerons" ("osmero" was the name of the ancient Egyptian deity group of 4
pairs of gods).
3. Vortices and annular structures (rotational structures) can be created from these sub-particles.
4. There are two types of vortices, VB and VT, with opposite flows of the energy E (see Fig. 5a,b).
5. A vortex pair with two VB, or two VT, can form two types of pair: the same rotation with parallel
rotational axes, or contra-rotation with parallel rotational axes (see Fig. 6a,b and 13b,c).
6. A vortex VB and a vortex VT can create a pair with head-to-head orientation on the same rotational axis.
7. Vortices VB and vortices VT can create a chain (string) structure (see Fig. 14).
8. Vortex or annular structures can change from one structure to another very quickly.
9. There is a sufficient amount of accessible energy E.
If a semi-fractal description of nature is plausible to us, we can imagine that many objects of the universe are
fractal vortices [19]. If we see vortex structures in the macro-world (as spiral galaxies) and in the everyday world (as the whirlpool
of a bath-tub shown in Fig. 5a, the tornado in Fig. 5b, or a hurricane), it is plausible that particles in the micro-world
have similar fractal-vortex structures. The flow of energy E in the tornado-vortex VT and in the bath-tub-vortex VB has
opposite direction (see Fig. 5a and Fig. 5b). The pressure p is higher at the bottom of the vortices. To create a vortex
structure we need a minimum value of energy – a quantum of energy – and a sufficient number of sub-particles
(osmerons). Perhaps there is a relation between Planck's constant and the minimum energy of the vortex structure VB or VT.
Fig. 5 Vortex structures: a) the vortex VB at the drain hole of a bath-tub; b) the tornado-vortex VT
The forces between two vortices and the motion of the two types of vortex pairs are shown in Fig. 6. The behavior of the
vortex pair shown in Fig. 6a can help us to explain the expansion of the universe (Hubble's law – with anti-gravitational
forces Fag). The behavior of the vortex pair shown in Fig. 6b can help us to explain the disc and spiral
shape of the Milky Way with vortex-gravitational forces Fg. But it cannot help us to explain the spherical shape of
universe bodies such as the Sun or the Earth. There are other forces that can occur in vortex structures (see Fig. 16a,
Fig. 17a). More than two vortices can form complex structures (for example the three vortices shown in Fig. 6c, or more
vortices in Fig. 16).
Fig. 6 The motion and gravitational forces Fg and anti-gravitational forces Fag of the vortex pair V1 and V2:
a) with the contra rotation of ω1, ω2
b) with the equal rotation of ω1, ω2
c) the motion of 3 vortex particles p1, p2, and p3 (all particles have the same direction of rotation).
We can build a chain structure from the vortices (see Fig. 10, 14). The forces and motion of two vortex pairs are shown
in Fig. 7a. There are two possible arrangements of lines between the pairs, which depend on the direction of rotation of P1
and P2. The lines between vortex pairs P1 and P2 (see Fig. 7a) with the same rotation are shown in Fig. 7b. The
arrangement of lines for contra-rotating pairs is shown in Fig. 7c. More vortices can create a vortex ray (see Fig. 8a).
Fig. 7 a) The forces and motion of two vortex pairs with opposite flow of the energy E; b) Lines of hidden sub-particles
between pairs with the same rotation of vortices; c) Lines of hidden sub-particles between pairs with the opposite
rotation of vortex pairs
The frequency of a vortex's vibrations along the rotational axis increases the accumulated energy in the
complex vortex structure, for example in photon rays (see Fig. 8). A photon-ray can be a vortex row (chain) with a
very small mass [1]. Every photon-ray can have the opposite rotation with regard to its neighboring photon-rays (see Fig. 8a). The
number of vortex-rays in the circular structure of the stream must be even (see Fig. 8a) to form divergent rays (see Fig. 8b).
Figures 8a,b,c can help us to explain the behavior of vortex-rays: as the energy flow of particle structures (vortices) and
as the wave transport of energy. Because the side rays in Fig. 8b have no neighboring rays, they are deviated by forces F from
the neighbor nearer the center of the ray-flow. This can explain the wave behavior of the light flow of photons as
vortex rays behind a hole. The laser beam structure in Fig. 8c can be explained by the particle motion shown in Fig. 6c.
A photon-vortex-structure V (or Vp) can be created from the annular vortex-electron structure e in two ways:
a) by a change of the shape of the vortex-electron structure (see Fig. 9a),
b) by cutting the closed electron structure (see Fig. 9b).
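The "physical law for the photon's energy" invoked here is E = h·f. A minimal numerical check, using the exact SI value of Planck's constant (the frequencies chosen are illustrative):

```python
PLANCK_H = 6.62607015e-34  # Planck's constant in J*s (exact SI value)

def photon_energy(f_hz):
    """Photon energy E = h * f, the relation the text associates with
    the vibration frequency of a vortex structure."""
    return PLANCK_H * f_hz

# At f = 1 Hz the energy numerically equals Planck's constant, as in Fig. 8a.
print(photon_energy(1.0))     # 6.62607015e-34 J
print(photon_energy(5.0e14))  # a visible-light photon, roughly 3.3e-19 J
```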
Fig. 8 Vortex rays of light: a) the spin example of vortex-rays (if f = f1 = 1 Hz, then E = Ep = h · f1); b) the spreading
of vortex-rays behind a hole (the wave refraction of light); c) the spreading of vortex-rays with the same rotation (the
laser beam)
Fig. 9 The release of the vortex pair Vp from the vortex-electron structure e: a) by a change of the shape of the
vortex-electron structure; b) by cutting the closed electron structure
4. Elementary particles
Macroscopic matter consists of molecules, which are built out of atoms; these atoms define the elements. From the four
elements of the ancient Greeks we moved to 92 natural and about 20 artificial elements, each of which may appear in
several isotopes [16]. We learned that each atom consists of a small nucleus and a large electron cloud around it. The
nucleus is in turn composed of P charged protons and N neutral neutrons. A neutral atom thus has P electrons
determining its chemical behavior. For the same P, the different atomic weights P + N describe the isotopes. In 1932 we
had just three basic particles, the proton, the neutron, and the electron, from which to build all known tangible matter. The
neutrino ("little neutron") was proposed in 1930 by Pauli to take away some of the energy, momentum, and spin arising
in the beta-decay of a neutron into a proton and an electron. Neutrinos show very little interest in any reactions and have
zero or very small mass; "zero" neutrino mass means smaller than measurable. Each particle seems to have an antiparticle of the
same mass but with a different sign of electric charge. For example, the positron balances the electron, and the
antiproton was found in a particle accelerator built particularly to produce such antiprotons according to E = mc2. Basic
particles consist of smaller parts. The proton consists of two up and one down quark, or in short: proton = (uud),
neutron = (udd). Quarks have fractional electric charges ±1/3, ±2/3, which explains the existence of doubly charged
particles consisting of three quarks with +2/3 charge. Quarks appear in six types (six "flavors"): u, d, c, s, t, b (= up, down, charm,
strange, top, bottom). These six flavors are grouped into three generations, which correspond to the three lepton generations.
The mesons with two quarks are bosons, and the baryons with three quarks are fermions. Each of the quarks and leptons
has its antiparticle; mesons are formed from one quark and one antiquark. Quarks, in contrast to leptons, appear in three
"colors". We have at present 36 different quarks of various colors and flavors, and 12 different leptons, or 48
fundamental particles all together. The masses of the three (up or down) quarks forming the proton or neutron are much
smaller than the mass of that nucleon; most of the mass is hidden in the interaction energy due to the enormous color
forces between quarks [16]. Perhaps this has something to do with a rotational vortex structure of nucleons. One very
speculative picture of the electron structure is presented in Fig. 4. Experimental investigations of possible types of
reactions show that certain particle numbers are conserved, in the sense that the number of incoming particles of a given
type must equal the number of particles of the same type after the reaction is over. Each of the three lepton generations
has its own conserved particle number, and so do the quarks for all generations together. Perhaps during lepton
generation a rotational structure is changed (see Fig. 9). Leptons do not consist of quarks. The electric charge is also
always conserved, whereas mass can be transformed into energy and back; charge conservation holds with
antiparticles carrying the opposite charge. However, antiparticles always count negative for the particle number and charge
(not for mass). Thus radiation energy can form an electron-positron pair, since the number of e-leptons then remains zero.
So normal matter needs only the electron, u, and d as constituents, a nice simplification.
Investigators with specialized training in overlapping disciplines can bring new insights to an area of study, enabling
them to make original contributions. This can suggest ways the universe's structure could have arisen. An open question
remains: what is gravity? [16]
With these conservation laws we now understand a crucial difference between mesons (quark + antiquark) and baryons
(three quarks). The meson number is not conserved, since the quark and antiquark can annihilate each other. The baryon
number is conserved, since the quark number is. For example, the free neutron decays after about 15 minutes into a proton,
an electron, and an anti-electron-neutrino. This is allowed since both the proton and the neutron have the same baryon
number of one. But a neutron star or pulsar does not decay into protons because of the strong forces between neutrons.
It seems the force between two quarks is quite strong and, for large distances, independent of distance. We cannot
observe isolated quarks, somewhat as we cannot isolate the north and south poles of a magnetic dipole. So if we try to pull
quarks apart, we need so much energy that we merely create new particles. Only "white" combinations of quarks, in which
the color forces cancel each other (such as quark-antiquark, or three quarks with the three fundamental colors), are
observed as isolated particles [16].
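The "about 15 minutes" quoted above for free-neutron decay is a mean lifetime, so the population decays exponentially. A minimal sketch (the 880 s value is the commonly quoted free-neutron mean lifetime, used here as an illustration):

```python
import math

MEAN_LIFETIME_S = 880.0  # free-neutron mean lifetime, roughly the "15 minutes" in the text

def surviving_fraction(t_seconds):
    """Fraction of a population of free neutrons not yet decayed after
    time t, following the exponential decay law N(t)/N0 = exp(-t/tau)."""
    return math.exp(-t_seconds / MEAN_LIFETIME_S)

print(surviving_fraction(880.0))   # ~0.37 (1/e) after one mean lifetime
print(surviving_fraction(3600.0))  # under 2% survive a full hour
```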
We still feel gravitation, since there are no negative masses, in contrast to positive and negative electric charges, which
cancel each other's force over long distances. Electric forces do not propagate with infinite velocity but only with
the large but finite light velocity c. Perhaps c is the maximum velocity of propagation in the hidden substance. Light waves
are called photons in quantum theory, and Coulomb forces are transmitted via the quasi-particles called photons.
Similarly, gravitational forces propagate with velocity c, perhaps with the help of quantized gravity waves called
"gravitons" (not yet detected as quantized quasi-particles) [16]. Quite generally, forces are at present supposed to come
from the exchange of intermediate bosons (virtual particles). Virtual particles are packets of energy ∆E = mc2 with a
short lifetime ∆t, such that the energy-time uncertainty relation ∆E ∆t ≤ h/2π allows their creation. The color forces
between quarks are transmitted by gluons (i.e., by particles gluing the quarks together) of zero mass. They bind three
quarks together as a nucleon (proton or neutron). At some distance from this nucleon some remnant of the color forces is
felt, since they have not canceled each other exactly. Coulomb forces and gravitation are felt over infinite distances
without exponential cut-off, and thus their carriers have "zero" mass. Color forces must also have infinite range, since
otherwise we could isolate single quarks; thus the gluons too are massless. The weak interaction covers only very short
distances because of the large mass of the corresponding intermediate bosons. The interaction energy remains the same if
all spins reverse their orientation.
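The uncertainty-relation bound ∆E ∆t ≤ h/2π quoted above can be turned into a rough lifetime estimate for a virtual particle of rest energy ∆E = mc². A minimal sketch using standard constant values (order-of-magnitude only, not a rigorous QED calculation):

```python
HBAR = 1.054571817e-34  # reduced Planck constant h/(2*pi), in J*s
C = 2.99792458e8        # speed of light, in m/s

def virtual_lifetime(mass_kg):
    """Longest lifetime dt ~ hbar / (m*c^2) that the energy-time
    uncertainty relation allows a virtual particle of mass m."""
    return HBAR / (mass_kg * C * C)

# A virtual electron-positron pair may exist only on the order of 1e-21 s.
print(virtual_lifetime(9.109e-31))
```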
What has been described here is the so-called standard model, which includes color forces. The Grand Unified Theory
(GUT) combines it with electromagnetic and weak forces, and the Theory of Everything would include gravity.
Magnetism we understand on the basis of suitable models. How can a spontaneous magnetization be formed? A very
speculative picture of how an electromagnetic field can be created, during a jump of an electron between two atoms,
is presented in [19]. Magnetic and electric lines are presented as a vortex flow of hidden substance (sub-particles with a
mass smaller than measurable).
Finally, I would like to present my very speculative origin scenario. It is possible that all very complex systems exist in
anomalous states, as vortex structures. These anomalous states have a hierarchical structure. Maybe 3D matter is the first
anomalous stage after a collision of "supervortex" spaces. At the second level is the origin of living systems. At the
third level is a brain with consciousness. There is no greater anomaly in nature than matter that can live and can have
consciousness.
Fig. 10 a), b) A speculative structure of a photon flow (electromagnetic lines)
If a fractal description with a fractal dimension is plausible to us, we can imagine that almost all objects of the
universe are fractal vortices with different fractal dimensions. If we see vortex structures in the macro-world (as spiral
galaxies) and in the everyday world (as tornados, whirlpools, and hurricanes), it is plausible that particles in the micro-world
have the same fractal-vortex structure [20]. To create a fractal vortex we need a minimum value of energy – a quantum of
energy. There can be a relation between Planck's constant h and the fractal dimension of the vortex. The frequency of
a vortex's vibrations (see Fig. 14) increases the accumulated energy of a vortex structure, in agreement
with the physical law for the photon's energy. A photon flow can be a vortex row with a very small mass (see Fig. 10, 14). It can
be an open structure created by cutting the closed "electron" structure in Fig. 10a. Maybe this is a better model than the
classical planetary one. Perhaps our universe is not a superstring space but a "supervortex" space. Vortex structures
can explain magnetism, and perhaps gravity, etc. Vortices can attract each other through their different polarities (see Fig. 6b).
Vortices, with their rotation, have inertia, which suggests what the mass of matter can be compared with a hidden
substance (sub-particles without mass). We can see the fractal vortex structure, for example, in Jupiter's weather or the
Earth's weather (see Fig. 3c). Perhaps vortex structures will prove a plausible speculation, but research is needed to test it.
The increased awareness from other scientific communities, such as biology and mathematics, promises new insights
and new opportunities. There is much to accomplish and there are many open questions. Interest from diverse
disciplines continues to increase, and the evolution of complex structures is becoming more generally accepted as a
paradigm for imagining the basic principles of nature. We can make our model more complex, and more faithful to
reality, or we can make it simpler and easier to handle (to generalize and abstract). Some patterns are fractal, exhibiting
structures self-similar in scale.
5. The annular structure of the basic particles
What are the shape and the structure of basic particles such as the electron, the proton, and the neutron? All these
particles have spin. We use the definition of the spin s as the ratio of the sum of the threads (coils) c1 and c2 to the
number of electron threads Ce (see Fig. 11 and Fig. 12). The proton and the neutron are described as groups of three
quarks [20]. Our attempt to use the quarks u and d to form the annular shape of the proton and the neutron is presented
in Fig. 11, 16 (not to scale – the electron is smaller, and the proton and the neutron are thicker and larger).
Fig. 11 The annular and closed energy structure of the basic particles with their spins s (fractional electric charges)
(quarks u and d are only abstract and open substructures – "building blocks" – of the proton and the neutron)
The spin structure of basic particles can be explained with the description in Fig. 12, where the circular structures are
opened so that they can be easily drawn (from Fig. 12a to Fig. 12d). The closed structure corresponding to Fig. 12a is in
Fig. 12e, and that from Fig. 12b in Fig. 12f. The zero spin of the neutron can be formed from nonzero threads (coils)
c1 and c2: c1 = -c2.
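The paper's spin definition, s = (c1 + c2)/Ce, can be written out directly. A minimal sketch; the thread counts below are hypothetical illustrations, chosen only to reproduce the zero-spin case c1 = -c2 mentioned above:

```python
def spin(c1, c2, ce):
    """Spin as the paper defines it: the ratio of the sum of the threads
    (coils) c1 and c2 to the number of electron threads Ce."""
    return (c1 + c2) / ce

# Zero neutron spin from nonzero threads, as in the text: c1 = -c2.
print(spin(3, -3, 6))  # 0.0
print(spin(2, 1, 6))   # 0.5, a half-integer (fermion-like) value
```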
Fig. 12 The spin structure of particles (Fig. 12e is the closed structure from Fig. 12a; Fig. 12f is the closed structure
from Fig. 12b)
Forces between two vortex pairs with different axes and directions of the energy flow are presented in Fig. 13.
Fig. 13 Forces between two vortex pairs in the different arrangement
Forces between vortices in the photon’s (or in the gluon’s) flow of the energy are presented in Fig. 14.
Fig. 14 Forces and oscillations of the photon flow
Fig. 15 Forces between two electrons
The forces between two electrons and their trajectories tr are shown in Fig. 15. Electrons with the same direction of
trajectory tr form electron rays. This occurs in two cases of electron-electron orientation (shown on the right in Fig. 15).
Two neighboring electrons in the electron-ray slightly attract each other (with the force Fa) due to the opposite
direction of their magnetic lines. All reaction forces in the electron-ray have the same direction (the same as the trajectory tr).
The behavior of electron rays is similar to the behavior of the photon-rays described in Fig. 8a. The strong nuclear forces
can be explained by the vortex bonds Vp1 and Vp2 between protons (see Fig. 16a) and neutrons.
Fig. 16 a) The strong nuclear forces via vortex bonds Vp1 and Vp2 between protons; b) The spin structure of the proton
or the neutron; c) The model of the proton's (or neutron's) structure (the same as in Fig. 16b)
Fig. 17 a) The vortex bonds V1a, V1b and V2a, V2b between two electrons that can transport energy; b) Gravitational
forces between two particles with masses m1 and m2
The forces Fe (Coulomb's law) between the electron and the proton depend on the line density in the area S, which is
inversely proportional to the square of the distance, d2 [20].
The light beam (the photon flow) is a complex structure that can transport energy by excitation of the vortex row.
There is a distance between two vortices at which the coupling force F has its maximum value (see Fig. 14). Around this
position every vortex can oscillate, as presented in the center of Fig. 14 (the wave theory of light). The vortices
(photons) in the light flow oscillate (vibrate) to transport energy (this is not the same as particle translation). But one vortex
pair (photon) can also move and transport energy separately. The strong nuclear forces can be explained by the vortex
bonds between vortex pairs Vp1 and Vp2 (see Fig. 16a).
The gravitational forces FG depend proportionally on the density of magnetic lines in the area S (see Fig. 17b), which
decreases inversely with the square of the distance d2 and increases proportionally with the masses m1 and m2 (Newton's
law). The higher the number of vortex bonds between protons in the nucleus (see Fig. 16a), the higher the density of the
magnetic field in the area S, and the higher the gravitational forces. The higher the energy of the nucleus, the higher the
number of vortex bonds created, and the stronger the forces between protons and neutrons (the strong interaction).
The main component of the gravitational force FG is the complex magnetic field with two-way magnetic lines (see Fig. 17b)
that alternate with each other. There is an analogy between electric forces (Coulomb's law) and magnetic forces for
gravitational influence (Newton's law). The whole universe is filled with magnetic lines. To explain the structure of the
magnetic lines we need "sub-subparticles" smaller than the basic particles (the electron, proton, and neutron). These
"sub-subparticles" can form flexible magnetic lines with a structure similar to photons, but they have to be smaller (perhaps
they can be gluons).
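The analogy drawn above between Coulomb's law and Newton's law rests on their shared inverse-square form. A minimal sketch with standard constant values (the charges and masses are arbitrary illustrative inputs):

```python
K_E = 8.9875517923e9  # Coulomb constant, N*m^2/C^2
G = 6.67430e-11       # gravitational constant, N*m^2/kg^2

def coulomb(q1, q2, d):
    """Coulomb's law: force inversely proportional to the square of d."""
    return K_E * q1 * q2 / (d * d)

def newton(m1, m2, d):
    """Newton's law of gravitation: the same inverse-square dependence."""
    return G * m1 * m2 / (d * d)

# Doubling the distance reduces either force by the same factor of 4.
print(coulomb(1e-6, 1e-6, 1.0) / coulomb(1e-6, 1e-6, 2.0))  # 4.0
print(newton(1.0, 1.0, 1.0) / newton(1.0, 1.0, 2.0))        # 4.0
```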
Fig. 18 Forces between the proton p and electron e
The position of the electron relative to the proton during levitation depends on two different types of forces. The magnetic
force Fm repels the electron from the proton (see Fig. 18a), and the charge-reaction force Fr of the electron attracts it.
The charge forces Fr work on the principle of action and reaction (the same principle as a rocket engine, but with the
hidden substance, as shown in Fig. 15). The magnetic repulsion forces are stronger when the particles are closer. The
electron hangs where this upward repulsion balances the downward force of the charges, that is, at the point of equilibrium
where the total force is zero. If the electron were not spinning, the magnetic torque would turn it over. When the
electron is spinning, the torque acts gyroscopically and the axis does not overturn but rotates about the direction of the
proton's magnetic field. This rotation is called precession. For the electron to remain suspended, equilibrium is not
enough. The equilibrium must also be stable, so that a slight horizontal or vertical displacement produces a force pushing
the electron back toward the equilibrium point. The reaction force Fr of the electron and the strength Fm of the
magnetization between the proton and the electron determine the equilibrium distance d where magnetism balances the
"rocket" force Fr. Slight changes of temperature alter the magnetization of the particles.
6. Conclusions
The annular-vortex model might be better than the classical planetary one. Our universe might be considered a
"supervortex" space. Vortex structures can explain the electromagnetic field, and perhaps gravitation too. Vortices can
attract each other through their different polarities (see Fig. 13a). Planck's constant h might be the energy Ep of one vortex
pair Vp (see Fig. 8a). Closed vortex structures (fermions), with their rotation, have inertia, which suggests what the
mass of matter can be compared with a hidden substance (the sub-particles "osmerons", with very small and
unmeasurable size, in the vortex structure). Radiation is an open vortex structure (bosons, for example light with
photons), and matter is a closed vortex structure with mass (for example electrons, protons, and neutrons, followed by
complex structures such as the nucleus {see Fig. 16a}, etc.). Electron structures rotate, and proton (neutron) structures need
not rotate to have a rotating magnetic field; both the electron and the proton have a rotating magnetic field. Vortex
structures might be a plausible speculation for computer models and calculations. Fractals seem to be very powerful in
describing natural objects on all scales.
References
[1] THAXTON, Ch. B.; BRADLEY, W. L.; OLSEN, R. L. The Mystery of Life's Origin: Reassessing Current Theories. New York : Philosophical Library, 1984.
[2] DAWKINS, R. The Selfish Gene. Oxford : Oxford University Press, 1976.
[3] KAUFFMAN, S. A. Investigations. New York : Oxford University Press, 2000.
[4] PRIGOGINE, I.; STENGERS, I. Order out of Chaos. Flamingo, 1985.
[5] OŠMERA, P. Complex Adaptive Systems. Proceedings of MENDEL'2001, Brno : Czech Republic (2001) 137-143.
[6] OŠMERA, P. Complex Evolutionary Structures. Proceedings of MENDEL'2002, Brno : Czech Republic (2002) 109-116.
[7] OŠMERA, P. Evolvable Controllers using Parallel Evolutionary Algorithms. Proceedings of MENDEL'2003, Brno : Czech Republic (2003) 126-132.
[8] OŠMERA, P. Evolution of System with Unpredictable Behavior. Proceedings of MENDEL'2004, Brno : Czech Republic (2004) 1-6; OŠMERA, P. Genetic Algorithms and their Applications, habilitation thesis, in Czech, 2002.
[9] WALDROP, M. M. Complexity – The Emerging Science at the Edge of Order and Chaos. Viking, 1993.
[10] OŠMERA, P.; POPELKA, O.; PANACEK, T. Parallel Grammatical Evolution. Proceedings of MENDEL'2005, Brno : Czech Republic (2005).
[11] COVENEY, P.; HIGHFIELD, R. Frontiers of Complexity. Faber and Faber, 1996.
[12] ZMEŠKAL, O.; NEZADAL, M.; BUCHNICEK, M. Fractal-Cantorian geometry, Hausdorff dimension and fundamental laws of physics. Chaos, Solitons and Fractals 17 (2003) 113-119.
[13] ZMEŠKAL, O.; NEZADAL, M.; BUCHNICEK, M. Coupling constants in fractal and cantorian physics. Chaos, Solitons and Fractals (2005), article in press.
[14] EL NASCHIE, M. S. On the exact mass spectrum of quarks. Chaos, Solitons & Fractals 2002;14:369-376.
[15] EL NASCHIE, M. S. Quantum gravity, Clifford algebras and fundamental constants of nature. Chaos, Solitons & Fractals 2002;14:437-450.
[16] STAUFFER, D.; STANLEY, H. E. From Newton to Mandelbrot: A Primer in Theoretical Physics with Fractals for the Personal Computer. Springer-Verlag, Berlin Heidelberg, 1996.
[17] GLEICK, J. Chaos – Making a New Science. Vintage, 1998.
[18] CAPRA, F. The Web of Life. HarperCollins Publishers, 1996.
[19] OŠMERA, P. Evolution of the universe structures. Proceedings of MENDEL 2005, Brno : Czech Republic (2005) 1-6.
[20] OŠMERA, P. The Vortex-fractal Theory of the Gravitation. Proceedings of MENDEL 2005, Brno : Czech Republic (2005) 7-14.
Address:
Doc. Ing. Pavel Ošmera, CSc.
Institute of Automation and Computer Science
Brno University of Technology
Technicka 2,
616 69 Brno, Czech Republic
Tel.: +420 541 142 294
Fax: +420 541 142 490
e-mail: osmera@fme.vutbr.cz
„ICSC– FOURTH INTERNATIONAL CONFERENCE ON SOFT COMPUTING APPLIED IN COMPUTER AND ECONOMIC ENVIRONMENT”
EPI Kunovice, Czech Republic, January 27, 2006
122
VORTEX-FRACTAL PHYSICS
Pavel Ošmera
Brno University of Technology
Abstract: We would like to find a plausible unifying mechanism that explains vortex systems. This paper is an
attempt to attain a new and profound understanding of nature's behavior through a vortex-fractal principle for
everything. We give a vortex explanation of polarization and of the diffraction grating, and we compare quantum
electrodynamics (QED) with the vortex-fractal description. This new approach can be called the physics of vortex
structures (FVS).
Keywords: vortex, polarization, diffraction grating, basic particle structure, light, gravitation.
1. Introduction
The electrical force, like the gravitational force, decreases inversely as the square of the distance between charges. This
relationship is called Coulomb's law. There are two kinds of "matter", which we call positive and negative. Like
kinds repel and unlike kinds attract, unlike gravity, where there is only attraction. But this is not precisely true when
charges are moving: the electrical force then also depends on the motion of the charges in a complicated way [2]. One part of
the force between moving charges we call the magnetic force. It is really one aspect of a vortex effect. That is why we
call the subject "electromagnetism". We find, from experiment, that the force that acts on a particular charge, no
matter how many other charges there are or how they are moving, depends only on the position of that particular
charge, on its velocity, and on the amount of charge [2]. We can write the force on a charge q moving
with a velocity v as
F = q (E + v x B).
(1.1)
We call E the electric field and B the magnetic field at the location of the charge. There is still "something" there when
the charge is removed. We consider the field to be a mathematical function of position and time. For an arbitrary closed
surface, the net outward flow, or flux, is the average outward normal component of the velocity times the area of the
surface:
Flux = (average normal component).(surface area).
(1.2)
In the case of an electric field, we can mathematically define something analogous to an outflow, and we again call it
flux, but of course it is not the flow of any substance, because the electric field is not the velocity of anything [2]. In the
vortex-fractal hypothesis of electron structure [6], however, it can be the velocity of osmerons [8]. Osmerons are sub-particles that
create, for example, the vortex structure of an electron [7]. The name osmeron was derived from the name of the Egyptian
deity with 4 pairs of gods for primary creative forces (from a chaos beginning). Osmerons are so small that their
size and mass are unmeasurable (see Fig. 1).
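The force law (1.1) is straightforward to evaluate numerically; a minimal Python sketch with illustrative field values (not tied to any experiment in the paper):

```python
# Lorentz force F = q (E + v x B) on a point charge.
# All values are illustrative; SI units assumed.

def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def lorentz_force(q, E, v, B):
    """F = q (E + v x B), componentwise."""
    vxB = cross(v, B)
    return tuple(q * (E[i] + vxB[i]) for i in range(3))

# A charge moving along x in a magnetic field along z feels a force along -y.
F = lorentz_force(1.0, (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 0.0, 1.0))
print(F)  # (0.0, -1.0, 0.0)
```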
Fig. 1 A vortex structure of light rays
Physics has a history of synthesizing many phenomena into a few theories. For example, it was discovered that heat
phenomena are easily understandable from the laws of motion. The theory of gravitation, on the other hand, was not
understandable from the other theories. Gravitation is, so far, not understandable in terms of other phenomena [4].
Quantum mechanics supplied the theory behind chemistry, so fundamental theoretical chemistry is really physics.
The theory of the interaction of light and matter is called "quantum electrodynamics" (QED). There are constants in quantum
electrodynamics that have been measured and calculated with very high accuracy, and QED theory is probably not too far
off these calculations. It is necessary to distinguish two questions: how Nature works and why Nature works that way.
There is another question: "What holds the nucleus together?" [2]. In a nucleus there are several protons, all of which are
positive. Why don't they push themselves apart? It turns out that in nuclei there are, in addition to electrical forces,
nonelectrical forces, called nuclear forces, which are greater than the electrical repulsion. The nuclear forces, however,
have a short range: their force falls off much more rapidly than 1/r2 [2]. It seems to me that this is one complex energy
structure created from protons and neutrons connected by vortex bonds [7, 8]. In this complex nucleus structure, energy
runs in one complex loop [7]. We may ask, finally, what holds a negatively charged electron together (since it has no
nuclear forces). If an electron is all made of one kind of substance, each part should repel the other parts [2]. If we
accept the vortex electron structure, it can be the vortex forces between the photons from which the electron is created [7, 8].
What is the charge? It can be something related to the flow of osmerons through the annular electron structure
(ring). Electrical force, like gravitational force, decreases inversely as the square of the distance between charges. There
must be the same principle. Perhaps there is a very small escape of energy from the nucleus' complex loop in vortex bonds
("gravitons", a small number of osmerons) that creates the gravitational field.
2. Diffraction grating
A particular color of light can be split one more time in a different way, according to its so-called "polarization". Thus
light is something like raindrops: each little lump of light is called a photon, and if the light is all one color, all the
"raindrops" are the same size and vortex structure (see Fig. 1). The human eye is a very good instrument: it takes only
about five or six photons to activate a nerve cell and send a message to the brain [4]. Light goes in straight lines; it bends
when it goes into water; when it is reflected from a surface like a mirror, the angle at which the light hits the surface is
equal to the angle at which it leaves the surface. Light can be separated into colors; you can see beautiful colors on a
mud puddle when there is a little bit of oil on it (because the oil film's thickness is not exactly uniform); a lens focuses
light; and so on [4]. When a photon comes down on the surface of glass, it interacts with electrons throughout the
glass, not just on the surface. The photon and electrons do some kind of a dance, but the net result is the same as if the
photon hit only the surface [4].
There is a relationship between the thickness of a sheet of glass and partial reflection [4]. It appears that partial
reflection can be "turned off" or "amplified" by the presence of an additional surface. This demonstrates a phenomenon
called "interference". As the thickness of the glass increases, partial reflection goes through a repeating cycle from zero to 16%, with no
signs of dying out [4]. This strange phenomenon of partial reflection by two surfaces can be explained for intense light
by a theory of waves, but the wave theory cannot explain how the detector makes equally loud clicks as the light gets
dimmer. Quantum electrodynamics "resolves" this wave/particle duality through the probability that a photon will hit a
detector. The grand principle of QED: the probability of an event is equal to the square of the length of an arrow called the
"probability amplitude". The general rule of QED: draw an arrow for each way the event can happen and then combine the arrows ("add" them)
by hooking the head of one to the tail of the next [4] (see Fig. 2c). Every phenomenon about light that has been
observed in detail can be explained by the theory of quantum electrodynamics (QED) [4]. In Fig. 2a,b the same
diffraction on a DVD surface is explained by the vortex structures, with the same result as in Fig. 2c. Some osmeron
trajectories are changed or absorbed, so the symmetry of the vortex structure changes to an asymmetric one (compare with the
symmetrical vortex structure in Fig. 1). In the blue rays the diameter D2 is greater than the D2 of the red rays. The diffraction
of red rays is greater than that of blue rays because the asymmetry of the vortex structure of red light is higher than that of blue
light (see Fig. 2b). A diffraction grating with grooves at the right distance for red light also works for other colors. If
you shine white light down onto the grating, red light comes out at one place, orange light comes out slightly above it,
followed by yellow, green, and blue light: all the colors of the rainbow. Where there is a series of grooves close together,
you can often see colors, for example when you hold a CD or, better, a DVD under bright light at the correct angles (see Fig.
2d and Fig. 3). What is interesting is that one light ray does not really travel only in a straight line; it "smells" the neighboring
paths around it and uses a small core of nearby space [4]. When the single slot b is made smaller, the detector D starts clicking
not only at positions on the straight line (photon 1 in Fig. 4). When we have two slots and the distance d between them is
decreasing (see Fig. 4), we can see interference between photon 1 and photon 2. This is an example of the "uncertainty
principle": there is a kind of "complementarity" between knowledge of where light goes through the two holes and where it
goes afterwards; precise knowledge of both is impossible. So the idea that light goes in a straight line is a convenient
approximation to describe what happens in the world that is familiar to us [4].
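The arrow rule quoted above is ordinary complex-number addition: each path contributes a unit arrow rotated by its phase, and the probability of the event is the squared length of the summed arrow. A minimal Python sketch (the path phases are illustrative, not taken from the paper):

```python
import cmath

def probability(phases):
    """QED arrow rule: add one unit arrow per path (head to tail),
    then square the length of the resulting arrow."""
    total = sum(cmath.exp(1j * p) for p in phases)
    return abs(total) ** 2

# Two paths in phase interfere constructively: |1 + 1|^2 = 4 (before normalization).
print(probability([0.0, 0.0]))        # 4.0
# Two paths out of phase by pi cancel: |1 - 1|^2 = 0.
print(probability([0.0, cmath.pi]))   # ~0.0
```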
Fig. 2 Diffraction on a DVD surface
Fig. 3 Example: how we can measure the wavelength λ of light (for example: a red laser)
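The setup in Fig. 3 is the standard grating measurement governed by d sin θ = mλ. A minimal sketch assuming a DVD groove spacing of about 740 nm and an illustrative first-order angle (neither value is taken from the paper):

```python
import math

def wavelength(d_m, theta_deg, m=1):
    """Grating equation d*sin(theta) = m*lambda, solved for lambda."""
    return d_m * math.sin(math.radians(theta_deg)) / m

# DVD track pitch ~740 nm; first-order maximum of a red laser at ~58.5 degrees.
lam = wavelength(740e-9, 58.5)
print(f"{lam * 1e9:.0f} nm")  # 631 nm, i.e. red light
```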
Fig. 4 Photons (or electrons) coming through a sheet with two holes
Sometimes our observations (measurements) involve conditions that are special and represent in fact a limited
experience with nature. It is only a small section of natural phenomena that one gets from direct experience, and it is only
through refined measurements and careful experimentation that we have a wider vision [4]. And then we see
unexpected things: we see things that are far from what we would guess, far from what we could have imagined, and we must try
just to comprehend those things which are there. For the two-slit problem in Fig. 4 there is at least one simplification.
Electrons behave in this respect in exactly the same way as photons; they are both screwy in exactly the same way [4].
3. Polarization
Polarization produces a large number of different possible couplings, but not all possible combinations of polarized electrons
and photons couple (see Fig. 6). So far, no fundamental spin-0 particles have been found. But we can see
vortex rings in water [9] and in the air as a travelling vortex ring [2]. It is clear from Fig. 5 that at points 1, 2
there is no phase change. But the osmerons at points 3, 4 have an opposite phase shift, which changes the symmetry of the
vortex structure in the light ray.
Fig. 5 Polarization of a light ray that has a vortex structure
Fig. 6 Forces between the electron and the proton (a, b) and their outer vortex structures (c, d)
Fig. 7 Vortex structure of the electron and the anti-electron with two face orientations
Fig. 8 Vortex structure of the proton and the antiproton
4. Conclusions
The vortex models might be better than the classical ones. Our universe might be considered a "supervortex" space.
Vortex structures can explain the electromagnetic field, and perhaps gravitation too. Vortices can attract each other using
their different polarities. Planck's constant h might be the energy Ep of one vortex pair Vp. Vacuum is a space full of
osmerons. There can be a non-homogeneous density of the osmerons from which vortex structures are created. Closed vortex
structures, through their rotation, have inertia, which explains what the mass of matter can be, in contrast to open structures,
which create radiation. Radiation is an open vortex structure (for example, light with photons) and matter is a closed
vortex structure with mass (for example: electrons, protons, and neutrons, followed by complex structures such as the
nucleus). All things are only complex vortex structures. Because the electron structure has a face and a back, we can
distinguish two states, "0" and "1". This could be used in the coding of "electron computers" and "electron memories"
(in analogy to quantum computers). Fractal dimensions seem to be very powerful in describing natural objects on all
scales. The behavior and creation of annular-vortex structures with zero spin was described in [2], [9]. Water forms a
spiraling, funnel-shaped vortex as it drains from a 1.5- or 2-liter PET soda bottle. A simple connector device made from the two
original lids, with a 1 cm hole, allows the water to drain into a second bottle. Fill only one of the soda bottles about two-thirds full
of water. Place the two bottles on a table with the filled bottle on top. Watch the water slowly drip down into the lower
bottle as air simultaneously bubbles up into the top bottle. The flow of water may come to a complete stop. To create a
vortex structure it is necessary to add chaotic movement (shaking) or, better, to rapidly rotate the top bottle in a circle a few
times. Notice the shape of the top vortex; there is also a second vortex in the lower bottle (in principle it is a tornado
structure destroyed by gravity). The whole assembly can then be inverted and the process repeated. This simple model
demonstrates the basic principle of vortex structures and how we can get from chaos to a self-organized structure (the
basic principle of evolution for nonliving systems). This knowledge can be used when we are trying to quickly get a liquid
out of a tank (canister).
References
[1]	FEYNMAN, R. P.; LEIGHTON, R. B.; SANDS, M. The Feynman Lectures on Physics, volume I. Addison-Wesley, 1977.
[2]	FEYNMAN, R. P.; LEIGHTON, R. B.; SANDS, M. The Feynman Lectures on Physics, volume II. Addison-Wesley, 1977.
[3]	FEYNMAN, R. P.; LEIGHTON, R. B.; SANDS, M. The Feynman Lectures on Physics, volume III. Addison-Wesley, 1977.
[4]	FEYNMAN, R. P. QED: The Strange Theory of Light and Matter. Princeton University Press, 1988.
[5]	FEYNMAN, R. P. The Character of Physical Law. Penguin Books, 1992.
[6]	OŠMERA, P. Evolution of universe structures. Proceedings of MENDEL 2005, Brno, Czech Republic (2005) 1-6.
[7]	OŠMERA, P. The Vortex-fractal Theory of the Gravitation. Proceedings of MENDEL 2005, Brno, Czech Republic (2005) 7-14.
[8]	OŠMERA, P. The Vortex-fractal Theory of Universe Structures. Proceedings of the 4th International Conference on Soft Computing ICSC 2006, January 27, 2006, Kunovice, Czech Republic.
[9]	LIM, T. T.; NICKELS, T. B. Instability and reconnection in the head-on collision of two vortex rings. Letter to Nature, vol. 357, May 1992.
[10]	WALLRAFF, A.; LUKASHENKO, A.; LISENFELD, J.; KEMP, A.; FISTUL, M. V.; KOVAL, Y.; USTINOV, A. V. Quantum dynamics of a single vortex. Letters to Nature, vol. 425, September 2003.
Address:
Doc. Ing. Pavel Ošmera, CSc.
Institute of Automation and Computer Science
Brno University of Technology
Technicka 2,
616 69 Brno, Czech Republic
Tel.: +420 541 142 294
Fax: +420 541 142 490
e-mail: osmera@fme.vutbr.cz
THE IMPORTANCE OF COMPUTER NETWORK MONITORING
Imrich Rukovanský
Evropský polytechnický institut, s.r.o.
Abstract: Whether a computer network works efficiently can only be demonstrably determined by observing its
activities, that is, by measurement. This can be done using monitors (network analyzers), which can track up to
dozens of different performance parameters (throughput, CPU utilization, utilization of the communication links
between nodes, bottlenecks) at various levels of the network. However, the character of the measurement differs
case by case and depends on which performance parameters we track, whether we monitor only a selected part of
the network or the network as a whole, and whether we use a hardware monitor, a software monitor, or a
combination of both. This paper documents the importance of computer network monitoring as well as the
diversity of practical measurements.
Keywords: computer network, monitor, measurement, wireless links, WiFi, switch, router, server, throughput, load, performance.
Introduction
The complexity, extent and diversity of computer networks keep growing. Transfer rates between network nodes increase
continuously, local networks are interconnected into geographically extensive units, and modern network operating
systems arise that ensure maximum throughput of the task flow through the network. With the aim of making the most of
the expensive hardware and software resources of the network, various philosophies appear: resource sharing, interleaved
operations and concurrent activities of different users, the construction of sophisticated database systems, up to the ability
to group network nodes according to the character of the tasks being solved (clustering).
However, the question of whether, and to what extent, a computer network works efficiently, to what extent the capacity
of the computers in the network nodes is used, how the communication resources between individual nodes are utilized,
whether the network topology was chosen efficiently, how the databases are used including the level of concurrently
working users, the activity at the level of input and output units, and a whole range of other facts cannot be answered
without monitoring (measuring) the network or some of its parts.
We monitor (measure) a computer network using hardware monitors, software monitors, or a combination of both forms.
Performing all kinds of measurements on a computer network at once is practically impossible. At present, several
hundred kinds of measurements are in use. Moreover, measurements are usually performed only in a selected part of the
network (switches, routers, servers, parts of a LAN, etc.) or on selected activities (packet flows, throughput, congestion, etc.).
Therefore, before any intended monitoring, a fundamental question must be answered: what do we want to achieve by
monitoring, what purpose will the measured values serve, and in which part of the network (component) do we expect an
increase in efficiency or performance.
The problems of optimizing the operation and measurement of computer networks are addressed within the research
tasks of the Department of Applied Informatics at EPI Kunovice. Some concrete results are given here for illustration [1], [2], [6].
1 The use of measurement when introducing a new technology into an existing network
In order to increase the throughput of the EPI Kunovice computer network, it was decided to incorporate wireless
access into it. A design of the network conceived in this way was carried out, including selected locations for access
points, the chosen WiFi technology, and other matters related to the hardware and software provisioning of the network.
In the process of implementing and subsequently commissioning the modified network, it was necessary to carry out
measurements and thus verify the real benefit of the chosen design concept.
Measurement of the access points
In the first phase of the WiFi implementation it was necessary to tune the access points so that they did not collide with
each other within range and, at the same time, so that their transmission power complied with the ČTÚ standards. To
determine the actual state, measurements were taken at all access points in the given locality.
The measurements showed that, in connection with the introduction of WiFi in the school buildings, it will be necessary
to check the frequencies of the access points very often and, if need be, retune them according to which channel shows
the best signal, in order to ensure the correct operation of the network.
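Checking that nearby access points do not collide, as described above, amounts to verifying that their 2.4 GHz channels are far enough apart (in 802.11b/g, channels closer than 5 apart overlap, hence the classic non-overlapping set 1, 6, 11). A toy sketch with made-up access-point names:

```python
from itertools import combinations

# 2.4 GHz 802.11b/g channels interfere unless they are at least 5 apart.
def overlapping_pairs(assignments):
    """Return pairs of access points whose channels overlap."""
    return [(a, b)
            for (a, ca), (b, cb) in combinations(assignments.items(), 2)
            if abs(ca - cb) < 5]

# Illustrative channel plan: the hallway AP collides with both buildings.
aps = {"AP-building-A": 1, "AP-building-B": 6, "AP-hallway": 4}
print(overlapping_pairs(aps))
# [('AP-building-A', 'AP-hallway'), ('AP-building-B', 'AP-hallway')]
```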
The measurement was performed with a monitor running on Linux (Debian distribution) using the Cacti software, which
is freely distributed under open-source licences. This software is routinely used at EPI, s.r.o. to watch the state of the
network, especially to monitor the load and data flow in the individual network nodes. Without the monitoring results,
ensuring the efficient operation of the network would be unrealistic, given its diversity and the intranet links between
nodes built with various technologies, from wireless microwave to optical and metallic ones [3].
Measurement of network throughput after the introduction of WiFi
To determine the benefit of the wireless WiFi technology for the school network, it was necessary to start from the
originally measured statistics provided by the network monitor mentioned above and compare them with the newly
measured values.
Testing a random user of the wireless connection showed that the throughput of the school intranet increased as
expected. A comparison of the statistical data from the earlier operation and the current regime revealed the transfer of a
larger volume of data, moreover in shorter intervals. The favourable effect of the increased throughput can be observed
not only in the intranet itself but also in the internet transfers of the EPI network. These facts are illustrated by the
attached figure (Fig. 1).
Fig. 1 Load of the transmission paths for a selected user.
The figure shows the download speed in kilobytes per second. From July onward, the data transfer into the intranet is
many times higher than in the previous month. We can also observe an increase in load both in the outgoing direction to
the internet (green curve) and in the incoming direction (blue curve); the increased transfer capacity is illustrated by the
purple curve (Total traffic).
The presented results obtained by measurement unambiguously confirmed the benefit of incorporating the new WiFi
technology into the school network [3].
2 Measurement of network elements and LAN throughput
The examined local computer network of the Branson company is part of the worldwide WAN of the Emerson company
and serves plastic welding: ultrasonic, heat and vibration welding. The company produces most of its products for the
automotive industry. The examined LAN is built exclusively on Cisco products, which guarantees efficient sharing and
use of all the services the hardware provides.
It consists of three main servers, which at the time of the first measurement ran Windows NT Server 4.0, with a
migration to the new operating system Microsoft Windows Server 2003 Standard Edition currently under way; one
Cisco 1700 Modular Access Router (dedicated to communication with the parent company and with the divisions in
Europe); one class 4000 switch; another six Catalyst class 3500 switches; and about 200 workstations and devices
(printers, CNC machines, etc.) that require an IP address.
Since the examined LAN was designed and put into operation in 2000, the problem arose of determining whether its
performance would satisfy the new needs and demands placed on it today. The network is undergoing a change in its
data-storage philosophy, from the earlier local storage to the central storage of all files and data. In this connection it is
necessary to determine the throughput of the individual network elements and whether they can handle the growing load
associated with the new functions of the given application environment. The measurement itself, as already indicated, is
performed at two levels: monitoring of the network elements (switches) and monitoring of the individual branches or
segments of the network.
Measurement of switch throughput
The measurement focuses on the switches because all traffic on the network passes through these devices, which gives
us a detailed overview of the total network traffic, since we obtain the data directly from the devices themselves.
Two tools are used for the measurement. Values from the central Catalyst 4006 switch are obtained using MRTG (The
Multi Router Traffic Grapher). This is a monitor intended for watching the transmission load on network links. It
generates HTML pages containing PNG images that provide a live visual representation of these flows (see
http://www.stat.ee.ethz.ch/mrtg/).
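MRTG builds its graphs from periodically sampled interface byte counters (such as the SNMP ifInOctets variable); the rate computation behind the graphs is simple. A sketch of the idea, with illustrative counter values and the usual handling of a 32-bit counter wraparound:

```python
# Throughput from two readings of an SNMP interface octet counter,
# the way MRTG-style monitors compute it. Counter values are illustrative.
COUNTER32_MAX = 2**32

def rate_bps(count_old, count_new, interval_s):
    """Average bytes/second between two samples, handling one
    32-bit counter wraparound."""
    delta = count_new - count_old
    if delta < 0:                      # counter wrapped around
        delta += COUNTER32_MAX
    return delta / interval_s

# Two samples 300 s apart (MRTG's default 5-minute polling interval).
print(rate_bps(1_000_000, 1_453_900, 300))  # 1513.0 B/s
```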
The smaller switches are measured with a tool called Cluster Management, which is in principle a monitor similar to the
previous one, but developed by Cisco itself for its own networks, which shows in the detail of the measured data: about
30 values are presented.
By measuring the elements (switches), the most varied information about the current activity of the network is available:
information about the ports themselves, detailed information concerning sent and received packets, percentage
expressions of the values, the number of packets that passed through the devices, information on bandwidth utilization,
and so on. Besides the "performance" parameters, they also provide information about network failures and thus a
complete, immediate overview of the state of the network.
Since Cluster Management makes it possible to see constantly what is happening in the network devices, it is possible to
react appropriately and adequately to arising situations.
The measurements took place in identical time slots during working hours, which gave a complete overview of how
efficiently the individual components of the network are used. It turned out that to obtain a more complete picture the
time period had to be reduced; it was set to 30 minutes.
The collection of information via MRTG ran automatically; the measurements via Cluster Management were performed
manually, with the information copied into Excel sheets and subsequently processed and evaluated.
The transfer activity on the Catalyst 4006 is illustrated in Fig. 2. It shows port no. 13 together with the measured values
that interest us most.
      Max                Average            Current
In    4560.0 B/s (0.0%)  1513.0 B/s (0.0%)  283.0 B/s (0.0%)
Out   5063.0 B/s (0.0%)  1924.0 B/s (0.0%)  486.0 B/s (0.0%)
Fig. 2 Measured transfer rates on a selected port.
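The percentages in Fig. 2 express each rate as a fraction of the port bandwidth, which is why even the peak values round to 0.0% on a fast link. A quick check, assuming a 100 Mbit/s port (the port speed is our assumption, not stated in the paper):

```python
def utilization_pct(rate_Bps, link_bit_s=100_000_000):
    """Rate in bytes/s as a percentage of the link capacity in bits/s."""
    return 100.0 * (rate_Bps * 8) / link_bit_s

# The Max Out value from Fig. 2: 5063 B/s is ~0.04% of a 100 Mbit/s port,
# which rounds to the 0.0% shown in the table.
print(round(utilization_pct(5063.0), 1))  # 0.0
```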
The course of the measurement on the Catalyst 3500 switches had a different character. Given the variety and quantity
of the data, the switch 129.115.4.38 and its Transit Rate parameter were selected for illustration.
Fig. 3 Current transfer rates at the given moments (Transit Rate chart: one curve per port, FastEthernet0/1-0/24 and GigabitEthernet0/1-0/2; y-axis 0-0.9 Mbps, x-axis 6:00-15:00).
Fig. 3 captures the current transfer rates (Mbps) at which data are forwarded from the given switch to other devices. In
our case it is the transfer from the main switch through this switch to the end devices.
The measurement results and the evaluation of the statistical data showed that the computer network will largely cover
the growing performance demands placed on it. However, it also turned out that in the mechanical and electrical design
department there is an urgent need to acquire one 24-port switch to separate out the workplaces that handle an increased
volume of data; the existing conditions no longer satisfy their capacity needs. In addition, it was found that to ensure the
efficient operation of the network it will be necessary to create a VLAN to separate a certain group of workers together
with a central server for data storage [4].
Measurement of the throughput of LAN segments and branches
To obtain a complete overview of the performance of the whole LAN, it was decided to supplement the conclusions
obtained by measuring the elements with the monitoring of selected network segments. These are the branches on which,
as a result of the modernization of our LAN, the most demanding application requirements are to be placed.
These are, above all, a series of upgrades of CAD applications such as EPLAN, SolidEdge, and others.
Measurement on the busiest branches is to ensure a smooth flow of packets in the whole LAN and thus a smooth
transition to the new working conditions of the network.
The monitored parameters include:
• monitoring of packet flows (workstations and LAN nodes), as well as losses, routing, exchange and more
• identification of the most intensively transmitting and the most intensively receiving stations
• monitoring of too-long and too-short frames, CRC errors, error statistics
• monitoring of various parameters at the individual OSI layers (pairing of MAC addresses, network addresses, IP addresses, etc.)
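Flagging too-long and too-short frames, as listed above, reduces to comparing captured frame lengths against the Ethernet limits (64-1518 bytes for classic Ethernet). A minimal sketch with illustrative frame lengths:

```python
# Classify Ethernet frames by length, as a LAN analyzer does:
# below 64 bytes = runt (too short), above 1518 bytes = giant (too long).
RUNT_LIMIT, GIANT_LIMIT = 64, 1518

def classify(frame_len):
    if frame_len < RUNT_LIMIT:
        return "runt"
    if frame_len > GIANT_LIMIT:
        return "giant"
    return "ok"

lengths = [60, 64, 512, 1518, 1600]
print([classify(n) for n in lengths])
# ['runt', 'ok', 'ok', 'ok', 'giant']
```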
The network monitoring will be performed with the proven network analyzers Sniffer Analyzer and NetXRay, together
with the network-management tools contained in the Windows Server 2003 SE operating system (Bandwidth test).
Sniffer Analyzer is a tool intended for analyzing activity in LANs and WANs, wherever sophisticated packet analysis is
required. It captures anomalies at all levels of the network, reveals problems related to network throughput and presents
suggestions for possible solutions. In addition, it is a proven tool for detecting network faults. The Bandwidth test is used
to verify the throughput of a network branch [5].
3 Monitoring server performance with Microsoft Operations Manager 2005
Besides users, the leading computer companies are also convinced of the importance of monitoring the performance of
complex computer structures, including computer networks. In the previous example we pointed out the Cisco monitors,
which are created specifically to watch the performance of computer networks built from that company's elements.
Microsoft comes to the market with a somewhat different solution: for monitoring the performance and capacity
parameters of Windows 2000 and 2003 servers it offers the monitor Microsoft Operations Manager 2005 (MOM 2005).
It enables a whole range of measurements, starting with CPU utilization, disk space and database parameters, including
the connectivity of concurrently working users, database space utilization, and so on.
In order to verify the real capabilities of the monitor and its usability in a concrete deployment, it was installed in a
computer network with 250 users running MS Windows in a W2k3 domain, with a file server, a domain controller, a
Windows 2000 Exchange server and an Ebi server.
After the installation of the monitor, no problem appeared on the company network. After a more thorough examination
of the options, stronger filtering was switched on, and the first monitoring results appeared. It recognized what the given
network consists of; it showed that there are switches and hubs, that 1 Gbit, 100 Mbit and 10 Mbit elements are in use,
and it immediately recommended replacing the 10 Mbit elements with at least 100 Mbit ones.
Another positive feature of MOM is that the computer network can be divided into various groups and, according to the
given groups, specific types of filtering and problem investigation can be applied.
For example, the examined network includes a DEMO department, which serves for demonstrating the company's
products (servers and custom-made programs built on them). This group uses static addresses, while everywhere else,
except the IT department, DHCP (automatic assignment of IP addresses from a server) is used. IP address collisions
often occur in this group. It turned out that MOM can map this situation well and inform the network administrator about
the problem. A user sets an IP address and MOM reports that the user is in collision. Shortly afterwards the user calls the
administrator saying that he cannot get onto the network, but the administrator already knows where the problem is.
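The collision detection described above amounts to spotting an address claimed both statically and by a DHCP lease. A toy sketch of the idea; the hostnames and addresses are made up, and a real monitor such as MOM works from live network data rather than static tables:

```python
# Find IP addresses claimed by more than one host -- the collision
# situation described above. Hostnames and addresses are illustrative.
static = {"demo-srv1": "192.168.10.5", "demo-srv2": "192.168.10.6"}
dhcp_leases = {"user-pc17": "192.168.10.5", "user-pc18": "192.168.10.40"}

claims = {}
for host, ip in {**static, **dhcp_leases}.items():
    claims.setdefault(ip, []).append(host)

conflicts = {ip: hosts for ip, hosts in claims.items() if len(hosts) > 1}
print(conflicts)  # {'192.168.10.5': ['demo-srv1', 'user-pc17']}
```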
A very useful function provided by MOM, which more than one user will welcome, is that for better orientation it can
create an assumed model of the computer network, which can be clearly visualized, with the individual groups described
with specific information, so that it is known exactly what is happening and what should not be happening.
The MOM monitor gives the user the possibility to customize error messages as well as the style of their display. The
degree of importance of specific error messages is determined by the user, who can specifically set the importance of the
situations the monitor should warn about. Of course, the individual levels can be coloured in various ways, the colours
can be changed, and so on.
I když zatím proběhla pouze první část experimentů s monitorem MOM (podzim 2005), lze potvrdit jeho užitečnost a
význam při sledování činnosti sítě. Na základě dosud získaných poznatků :
•
můžeme jednoznačně doporučit pořízení monitoru MOM všude tam, kde se jedná o rozsáhlejší topologii
počítačové sítě. Prokazatelně poskytuje cenné informace o skutečné činnosti sítě a navíc dokáže předcházet
problémům.Investice do MOM se vyplatí.
•
konstatujeme, že je určen pro sledování serverových systémů a což jsme předpokládali není příliš vhodný jako
monitorující nástroj pro problémy související s infrastrukturou směrovačů či přepínačů. [2, 6].
Závěr
O významu monitorování počítačových sítí již dnes není pochyb.Předně o něj mají zájem uživatelé sítí, kteří měřením
získávají aktuální informace o reálné činnosti sítě. Vedle získávání výkonnostních parametrů jako jsou propustnost,
využití, apod., mohou rovněž získávat různá hlášení o bezporuchovém stavu sítě a další potřebné informace.
„ICSC– FOURTH INTERNATIONAL CONFERENCE ON SOFT COMPUTING APPLIED IN COMPUTER AND ECONOMIC ENVIRONMENT”
EPI Kunovice, Czech Republic, January 27, 2006
135
Renowned companies engaged in producing various components or in building large computer networks therefore offer a whole range of monitors with which the observation (measurement) and the subsequent efficient operation of these complex computer structures can be ensured.
In closing, however, it must be kept in mind that monitors (network analyzers) are not universal: they depend on which parameters they are to measure and which network component they are to monitor (e.g. servers, switches, etc.), and they are usually developed for the elements of a particular vendor (Cisco, Microsoft).
REFERENCES
[1] RUKOVANSKÝ, I. Sledování výkonnosti počítačových sítí. Kunovice : research report, EPI, s.r.o., December 2005.
[2] HANCE, B. Microsoft Operations Manager 2005. Computerworld no. 8, 2005, p. 21.
[3] CHMELA, T. Využití WiFi ke zvýšení propustnosti sítě. Kunovice : bachelor's thesis, EPI, s.r.o., 2005.
[4] UNČÍK, M. Měření síťových prvků počítačové sítě a vyhodnocení měření. Kunovice : materials for a bachelor's thesis, EPI, s.r.o., 2005.
[5] SLOBODA, P. Měření propustnosti lokální počítačové sítě. Kunovice : materials for a bachelor's thesis, EPI, s.r.o., 2005.
[6] KADLEC, R. Ověření funkčních schopností MOM 2005 na konkrétní síti. Kunovice : presentation of the results of team 3-2-1, EPI, s.r.o., 2006.
Address:
Prof. Ing. Imrich Rukovanský, CSc.
Evropský polytechnický institut, s.r.o.
Osvobození 699
686 04 Kunovice
tel. / fax.: +420 572 549 018, +420 572 548 788
e-mail: [email protected]
DATA ANALYSIS USING NEURAL NETWORKS AND PIVOT TABLES
Jindřich Petrucha
Evropský polytechnický institut, s.r.o.
Abstract: The article describes the possibilities of using external sources to obtain input data about the European Union and of converting these data into a format suitable for OLAP analysis. The process of transforming and cleaning the data is described so that they can be converted into a pivot table, from which the data are passed to an external program performing the analysis of a selected time series. The external neural network simulator program then carries out the learning process on the selected data and the next stage of the analysis.

Key words: OLAP, neural network, time series, pivot table, data cleaning, ETL.
1. Introduction
The possibilities of using external sources for data analysis are constantly growing, because much data is published on the internet and continuously updated. The problem therefore lies not so much in obtaining the data as in analyzing them with modern information technology tools. Many specialized tools require a certain standardized format for data import, or the data must be drawn from data warehouses using a data pump technique that allows the required data to be selected. When working with data on the internet, most of it is in text format with various graphical embellishments that make the displayed data visually clearer but, on the other hand, make the format entirely unsuitable for automatic processing. These redundant elements must be removed with program tools, or manually with various editors, until the required format is reached. Data prepared in this way can then be analyzed with artificial intelligence tools, which improve the quality of the decision-making process.
2. Data extraction
2.1 General principles
An important step is the stage described in [1] as ETL (Extraction, Transformation, Loading): extracting data from a certain transactional source, transforming these data into the required structures, and then loading them into a data warehouse or directly into a program system. The goal of this stage is to centralize the data so that the condition of fast access to them is met. It is advisable to define the goals of this stage at the beginning so that the general criteria are satisfied. A prerequisite for these activities is our ability to process data according to the principles given in Oracle's company literature, where data are said to be transformed into information when:
• we have the data,
• we know that we have the data,
• we know where we have the data,
• we have access to them,
• we can trust the data source.
It is not always possible to satisfy all of these conditions, because in every system where the human factor is involved errors occur, whether deliberate or accidental, caused by people working at the transactional level of processing. The amount of data is in many cases so large that only a smaller part undergoes the analysis from which, in the business intelligence process, knowledge arises.
In the initial phase of ETL we need to obtain data stored in some external information source, which may be an enterprise information system in which most transactional operations are carried out, whether accounting, warehouse management, customer-supplier relations or similar systems. If these systems contain compatible database systems, we can obtain the data directly from relational tables or through data export into appropriate formats supported by most systems, for example CSV text files. Another possibility is to monitor the internet market environment and obtain the necessary data from there. Modern tools such as RSS feed readers can be used here to watch for new information sources as URLs change directly on a company's pages. Alternatively, we can use the data
sources of information and search servers. This approach is used mainly in the area of financial markets, when watching the indices of the individual stock markets and subsequently evaluating share values.
2.2 Extraction and cleaning of data on a concrete example
As a concrete case, let us take current statistical data about the European Union describing the inflation rate and the development of industrial production in the individual EU member states over a certain period. These data can be obtained from the link in [5] in PDF format, in overview tables as shown in Fig. 1. The table has a time axis on the left side, and the names of the individual states are placed along the top.

Fig. 1. Statistical data from the EUROSTAT information source

If we want to extract these data, we can copy the selected part of the table into a text file, which we then furnish with semicolon separators between the individual values and split into individual records. The difficulty of this step is illustrated by the text below, obtained by selecting the data from the PDF: the data form one record without any separating information. This poorly structured form has to be transformed so that the required data can be separated out into a spreadsheet.
EUEMUAustriaBelgiumDenmarkFinlandFranceGermanyGreeceIrelandItalyLuxembourgetherlandsPortugalSpainSwed
enUKCyprusCREstoniaHungaryLatviaLithuaniaMaltaPolandSlovakiaSloveniaI.99-0,2%-0,1%-0,1%0,4%0,2%-0,2%0,4%-0,1%-1,3%-0,8%0,1%-1,7%0,0%-0,4%0,3%-0,4%-0,6%0,7%0,9%1,1%2,6%1,0%1,0%na1,5%3,0%1,0%II.990,3%0,3%0,2%0,2%0,5%0,4%0,4%0,2%0,7%0,7%0,2%1,9%0,7%0,0%0,1%0,1%0,2%-1,8%-0,1%0,2%1,3%0,2%0,0%na0,5%0,8%0,4%
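A small script can mechanize this splitting step. The sketch below is our illustration, not the article's actual tooling: the regular expressions and the abridged sample string are assumptions. It cuts the fused record into monthly rows keyed by the Roman-numeral month tokens ("I.99", "II.99", ...) and emits semicolon-separated CSV lines:

```python
import re

# Abridged sample of the text copied from the PDF table: monthly records
# ("I.99", "II.99", ...) with decimal-comma percentages and no separators.
raw = "I.99-0,2%-0,1%-0,1%0,4%II.990,3%0,3%0,2%0,2%"

# Split into one record per month: a month token such as "II.99" always
# follows a percent sign (except at the very beginning of the string).
records = [r for r in re.split(r"(?<=%)(?=[IVX]+\.\d{2})", raw) if r]

rows = []
for rec in records:
    m = re.match(r"([IVX]+\.\d{2})(.*)", rec)
    month, body = m.group(1), m.group(2)
    # Percentages use a decimal comma; convert them to floats.
    values = [float(v.replace(",", ".")) for v in re.findall(r"-?\d+,\d+(?=%)", body)]
    rows.append((month, values))

# Emit semicolon-separated lines ready for import into Excel.
for month, values in rows:
    print(month + ";" + ";".join(str(v) for v in values))
```

In the real table the header row of fused country names would also have to be split, which is harder to automate; in practice part of this cleaning is done by hand.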
It is advisable to transform the data into the CSV format, which the Excel spreadsheet understands and can import from this text form. The next important step for the analysis is to create a data structure suitable for a pivot table, which represents an OLAP data cube in which various views of the analyzed data can be taken. Fig. 2 shows the pivot table with the inserted data. The left part keeps the time axis, divided into individual years and further into individual months by their numeric labels. In the pivot table various aggregations along the time axis can be performed, and the states to be watched for a given indicator can be selected. Under each year there is a row with the aggregated average of the data for the given column; this aggregation formula can also be changed as required. Fig. 2 shows the data selected for the years 2004 and 2005 for the EU states, with the CR (Czech Republic) highlighted. The inflation value for the CR is very low and is comparable with countries such as Austria, Germany or France.
Fig. 2. Pivot table with inflation data for selected EU countries.
3. Data analysis using neural networks
3.1 Time series analysis
For further analysis of the values in the pivot table we can use other artificial intelligence tools, which improve the analytical process by finding relationships and links that are not directly present in the data. One of these tools is neural networks realized by program simulators, which allow the architecture of the neural network to be specified, the learning stage to be performed according to a chosen criterion, and the processing of input and output patterns to be simulated. Various approaches to evaluation on the training set can be used as the assessment criterion. For our case we use the mean squared error (MSE) over the whole considered period, covering N predictions, where the value must not exceed a certain prescribed threshold:

MSE = (1/N) ∑ (prediction − actual)²

In the analysis we will use one-step prediction, which is built directly into the program simulator. The processing of data from the pivot table is described in the following paragraphs.
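The criterion is simple to compute; a minimal sketch, with made-up illustrative values rather than the article's data:

```python
# Mean squared error over N one-step predictions:
# MSE = (1/N) * sum((prediction - actual)**2).
def mse(predictions, actuals):
    n = len(predictions)
    return sum((p - a) ** 2 for p, a in zip(predictions, actuals)) / n

predicted = [0.017, 0.002, 0.000]
actual = [0.015, 0.004, 0.001]
error = mse(predicted, actual)  # (0.002**2 + 0.002**2 + 0.001**2) / 3
```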
3.2 Using the neural network simulator for time series analysis
The input data file is an ASCII file with a fixed structure that specifies the parameters of the neural network and the time series according to the structure of the input pattern.
A sample of the ASCII file:
Neuron - casova rada inflace CR od roku 2000
5 -pocet vstupu
1 -pocet vystupu 1-krok predikce
70 -pocet dat casove rady
10 -pocet neuronu ve skryte vrstve
**** data casove rady ****
0.0170
0.0020
0.0000
-0.0020
0.0030
……
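A reader for this header format can be sketched as follows; the field order is taken from the sample above, while the function and variable names are our own (the actual simulator is a separate Delphi program):

```python
# Minimal reader for the simulator's ASCII input format: a title line,
# four numeric parameter lines (the comment after each number is ignored),
# a separator line, then the time-series values, one per line.
def read_simulator_file(lines):
    title = lines[0]
    n_inputs = int(lines[1].split()[0])    # "5 -pocet vstupu" (number of inputs)
    n_outputs = int(lines[2].split()[0])   # number of outputs (one-step prediction)
    n_points = int(lines[3].split()[0])    # number of time-series points
    n_hidden = int(lines[4].split()[0])    # neurons in the hidden layer
    data = [float(x) for x in lines[6:6 + n_points]]
    return title, (n_inputs, n_outputs, n_points, n_hidden), data

sample = [
    "Neuron - casova rada inflace CR od roku 2000",
    "5 -pocet vstupu",
    "1 -pocet vystupu 1-krok predikce",
    "4 -pocet dat casove rady",
    "10 -pocet neuronu ve skryte vrstve",
    "**** data casove rady ****",
    "0.0170", "0.0020", "0.0000", "-0.0020",
]
title, params, series = read_simulator_file(sample)
```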
The data are inserted into this file either directly from the pivot table or from the data the pivot table draws on. This approach also makes it very easy to modify parameters such as the number of neurons in the hidden layer. The simulator program is written in Pascal Delphi and performs learning on a selected part of the time series. Fig. 3 shows the main window of the simulator displaying the CR inflation data normalized to the interval from zero to one. The fluctuation of the values, which the neural network will try to analyze and learn, is evident from the figure. The region from element 1 to element 50 was selected for learning, and the region from element 10 to element 55 for error evaluation. The learning process was set to 50,000 learning cycles so that the total error could be conveniently watched in the lower right window of the simulator.
After the learning process finished, a test was performed, shown in the upper left window of the simulator, where the green values are the time series data and the blue (dashed) values are the values predicted by the artificial neural network simulator. Fig. 4 shows in detail the region from element 50 to element 70; from element 55 onward the simulator worked with previously unseen data. The simulator captures the changes and rather tends to predict sharper fluctuations, as were evident in previous years. The largest difference is at element 61, where the simulator expects a rise in inflation that did not occur. For more accurate values it would be advisable to double the number of learning cycles and train the simulator on a longer time period.
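The one-step prediction scheme itself can be sketched independently of the Delphi simulator. The toy example below is our simplification: a linear predictor trained by gradient descent on the MSE criterion stands in for the hidden-layer network, but the sliding-window setup with 5 inputs and 1 output is the same idea:

```python
# One-step-ahead prediction with a sliding window of 5 inputs.
def windows(series, n_in):
    # Each training pattern: n_in consecutive values -> the next value.
    return [(series[i:i + n_in], series[i + n_in])
            for i in range(len(series) - n_in)]

def train(series, n_in=5, epochs=2000, lr=0.05):
    data = windows(series, n_in)
    w = [0.0] * n_in
    b = 0.0
    for _ in range(epochs):
        for x, y in data:
            pred = sum(wi * xi for wi, xi in zip(w, x)) + b
            err = pred - y
            # Gradient step on the squared one-step prediction error.
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b

def predict(w, b, window):
    return sum(wi * xi for wi, xi in zip(w, window)) + b

# Sanity check: a constant series must be learned almost exactly.
series = [0.5] * 20
w, b = train(series)
p = predict(w, b, series[-5:])
```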
Fig. 3. Artificial neural network simulator with data inserted from the pivot table for the CR.
Fig. 4. Detailed view of data from the artificial neural network simulator for the CR inflation data.

The use of this methodology allows great flexibility in analyzing data located in various information sources on the internet and in searching for relationships between the individual data. Pivot tables make it possible to analyze summary data for a selected time period very clearly, and their link to neural network simulators puts a high-quality tool into the hands of the user. This procedure is suitable for various systems that have a defined time axis in which one can orient oneself well.
4. Conclusion
From the example presented step by step in the preceding paragraphs it is clear that for a good-quality analysis it is very important to have data prepared, cleaned of various irregularities, on whose basis a decision-making process can be carried out. OLAP tools are ready for data processing, but they offer only very limited possibilities for transforming the data into the required format. Very often the available data are poorly structured and must be processed in the ETL stage, for which program tools have to be prepared according to the character of the input data. In my opinion this stage is the most complex and takes the most time in data preparation. Pivot tables are a suitable tool for data analysis and allow a time series to be selected and inserted into further program systems. Neural network simulators are best used where a certain trend can be observed in the data being analyzed.
References:
[1] LACKO, L. Datové sklady, analýza OLAP a dolování dat s příklady v Microsoft SQL Serveru a Oracle. 1st ed. Brno : Computer Press, 2003. 486 pp. ISBN 80-7226-969-0.
[2] DOSTÁL, P. Moderní metody ekonomických analýz – Finanční kybernetika. 1st ed. Zlín : Univerzita Tomáše Bati, 2002. 110 pp. ISBN 80-7318-075-8.
[3] PETRUCHA, J.; MIKULA, V. Application of Neural Networks for Time Series Prediction and Making the Adequate Program Simulator. In Proceedings of the First International Conference on Soft Computing Applied in Computer and Economic Environments, January 30-31. Kunovice : Evropský polytechnický institut, 2003. pp. 165-171. ISBN 80-7314-017-9.
[4] PETRUCHA, J. Technologie analýzy dat – OLAP systémy v prostředí DBPROVE. Acta Universitatis Agriculturae et Silviculturae Mendelianae Brunensis, 2000, vol. XLVIII, no. 2, pp. 149-155. ISSN 1211-8516.
[5] http://www.csas.cz/banka/application?pageid=downloads&dtree=cs&selnod=57/dataEU_public.pdf [online]. 2005 [cited 2006-01-18]. Available from: <http://www.csas.cz/banka>.
Address:
Ing. Jindřich Petrucha, Ph.D.
Evropský polytechnický institut, s.r.o.
Osvobození 699,
686 04 Kunovice
tel./fax: +420 572 549 018, +420 572 548 788
e-mail: [email protected]
DETECTION OF INITIAL DATA GENERATING BOUNDED SOLUTIONS OF LINEAR
DISCRETE EQUATIONS¹
Jaromír Baštinec, Josef Diblík
Brno University of Technology
Abstract: In the situation when graphs of solutions of discrete equations remain in a prescribed domain,
the problem concerning determination of their initial data is discussed. Special attention is paid to linear
discrete equations and initial data generating its solutions such that their graphs remain in a
prescribed domain are found. Illustrative examples are considered, too.
Key words: Linear discrete equation, bounded solutions, initial data.
AMS Subject Classification: 39A10, 39A11
1 Introduction and the problem considered
1.1 General suppositions
We consider a scalar discrete equation

∆u(k) = f(k, u(k))    (1.1)

where f : N(a) × R → R, N(a) := {a, a + 1, ...}, a ∈ N and N := {0, 1, ...}. Together with the discrete equation (1.1) we consider an initial problem. It is posed as follows: for a given s ∈ N we seek the solution u = u(k) of (1.1) satisfying the initial condition

u(a + s) = u_s ∈ R    (1.2)

with a prescribed constant u_s. Let us recall that the solution of the initial problem (1.1), (1.2) is defined as an infinite sequence of numbers {u_k}_{k=0}^∞ with u_k = u(a + s + k), i.e.

u_0 = u_s = u(a + s), u_1 = u(a + s + 1), ..., u_n = u(a + s + n), ...

such that for any k ∈ N(a + s) the equality (1.1) holds. Let us note that the existence and uniqueness of the solution of the initial problem (1.1), (1.2) is a consequence of the properties of the function f. If the function f depends continuously on its second argument, then the initial problem (1.1), (1.2) depends continuously on its initial data. Let b(k), c(k) be real functions defined on N(a) such that b(k) < c(k) for every k ∈ N(a). We define a set ω ⊂ N(a) × R as

ω := {(k, u) : k ∈ N(a), u ∈ ω(k)}

with

ω(k) := {u : b(k) < u < c(k)}

and the closure of the set ω as

ϖ := {(k, u) : k ∈ N(a), u ∈ ϖ(k)}

with

ϖ(k) := {u : b(k) ≤ u ≤ c(k)}.

Obviously, it holds:

ω = {(k, u) : k ∈ N(a), u ∈ ω(k)} = ⋃_{k ∈ N(a)} {(k, u) : u ∈ ω(k)}.

¹ Preliminary version.
Let us introduce the set B = B₁ ∪ B₂ with

B₁ := {(k, u) : k ∈ N(a), u = b(k)} ⊂ N(a) × R,
B₂ := {(k, u) : k ∈ N(a), u = c(k)} ⊂ N(a) × R

and define the boundary of ω as

∂ω := {(k, u) : k ∈ N(a), (u − b(k))(u − c(k)) = 0} = B.

Define, moreover,

∂ω(k) := {b(k), c(k)}

and (for (k, u) ∈ N(a) × R) the auxiliary functions

U₁(k, u) := u − b(k),
U₂(k, u) := u − c(k).

Definition 1.1. The full difference ∆U₁(k, u)|_{(k,u)∈B₁} of the function U₁(k, u) for a given (k, u) ∈ B₁ with respect to the discrete equation (1.1) and the set B₁ is defined as

∆U₁(k, u)|_{(k,u)∈B₁} := f(k, b(k)) − b(k + 1) + b(k).

The full difference ∆U₂(k, u)|_{(k,u)∈B₂} of the function U₂(k, u) for a given (k, u) ∈ B₂ with respect to the discrete equation (1.1) and the set B₂ is defined as

∆U₂(k, u)|_{(k,u)∈B₂} := f(k, c(k)) − c(k + 1) + c(k).

Definition 1.2. A point (k, u) ∈ B with k ∈ N(a) is called a point of the type of strict egress for the set ω with respect to the discrete equation (1.1) if

∆U₁(k, u)|_{(k,u)∈B₁} < 0

in the case when (k, u) ∈ B₁, and

∆U₂(k, u)|_{(k,u)∈B₂} > 0

in the case when (k, u) ∈ B₂.

The affirmation of the following lemma is based on Definitions 1.1 and 1.2 above and is an easy consequence of the formulated notions.

Lemma 1.3. The point (k, u) ∈ B with k ∈ N(a) is a point of the type of strict egress for the set ω with respect to the discrete equation (1.1) if and only if

f(k, b(k)) − b(k + 1) + b(k) < 0

in the case when (k, u) ∈ B₁, and

f(k, c(k)) − c(k + 1) + c(k) > 0

in the case when (k, u) ∈ B₂.
1.2 Nonlinear case and description of the problem considered
The following theorem, concerning the asymptotic behavior of solutions of equation (1.1), is a particular case of a more general result in [3, Theorem 2, p. 520] (see also [4]).

Theorem 1.4. Let us suppose that f is defined on ϖ with values in R and is continuous with respect to its second argument. If, moreover, each point (k, u) ∈ B is a point of the type of strict egress for the set ω with respect to the discrete equation (1.1), then there exists an initial problem

u*(a) = u* ∈ ω(a)    (1.3)
such that the corresponding solution u = u*(k) satisfies the relation

u*(k) ∈ ω(k)    (1.4)

for every k ∈ N(a).
Now we are able to describe the general version of the problem considered. Analysing the result given by Theorem 1.4, we conclude that the existence of (at least one) solution of the problem (1.1), (1.3) having the indicated (asymptotic) behavior characterized by relation (1.4) is stated without a concrete determination of the corresponding initial data u* itself. In this contribution we try to fill this gap in the linear case. Note that questions concerning the behavior of solutions of discrete equations are considered e.g. in [1, 2], [5]-[9]. Unfortunately, the problem concerning the determination of the corresponding initial data was not considered there.
1.3 Linear case and the problem considered
Let us put

f(k, u(k)) := ϕ(k)u(k) + δ(k)

in (1.1), with ϕ(k), δ(k) : N(a) → R, and consider the corresponding linear equation

∆u(k) = ϕ(k)u(k) + δ(k)    (1.5)

together with an initial problem

u(a) = u*.    (1.6)

It is easy to verify that in the linear case Theorem 1.4 takes the form:

Theorem 1.5. Let the inequalities

(1 + ϕ(k))b(k) + δ(k) − b(k + 1) < 0,    (1.7)
(1 + ϕ(k))c(k) + δ(k) − c(k + 1) > 0    (1.8)

hold for every k ∈ N(a). Then there exists an initial problem

u*(a) = u* ∈ ω(a)    (1.9)

such that the corresponding solution u = u*(k) of equation (1.5) satisfies for every k ∈ N(a) the inequalities

b(k) < u*(k) < c(k).    (1.10)

Now let us formulate the problem under consideration.

Problem 1.6. Determine at least one value u* such that the corresponding solution u = u*(k) of the linear problem (1.5), (1.6) satisfies the relation (k, u*(k)) ∈ ω for every k ∈ N(a), i.e. satisfies inequalities (1.10) for every k ∈ N(a).

We will show in the sequel that the conditions of Theorem 1.5 together with the condition ϕ(k) > −1 for every k ∈ N(a) are sufficient for determining at least one initial value u*.
2 Main Results
In the following we put ∏_{i=k₁}^{k₂} G(i) ≡ 1 and ∑_{i=k₁}^{k₂} G(i) ≡ 0 if k₁, k₂ ∈ N(a), k₁ > k₂, and G is a function well-defined on N(a).

2.1 Auxiliary Lemma
Lemma 2.1. Let ϕ(k) > −1 for every k ∈ N(a) and let the inequalities (1.7), (1.8) hold. Then the sequence {u_cs}_{s=0}^∞ with

u_cs := [ c(a + s) − ∑_{i=0}^{s−1} δ(a + i) ∏_{j=a+i+1}^{a+s−1} (1 + ϕ(j)) ] / ∏_{i=0}^{s−1} (1 + ϕ(a + i)),  s ∈ N,    (2.1)

is a decreasing convergent sequence and the sequence {u_bs}_{s=0}^∞ with

u_bs := [ b(a + s) − ∑_{i=0}^{s−1} δ(a + i) ∏_{j=a+i+1}^{a+s−1} (1 + ϕ(j)) ] / ∏_{i=0}^{s−1} (1 + ϕ(a + i)),  s ∈ N,    (2.2)

is an increasing convergent sequence. Moreover, u_cs > u_bs holds for every s ∈ N and, for the limits c*, b*, where

c* = lim_{s→∞} u_cs,  b* = lim_{s→∞} u_bs,    (2.3)

the inequality c* ≥ b* holds.
Proof. Let us divide the proof into several steps.

a) The property u_cs > u_bs, s ∈ N.
Let us show that u_cs > u_bs for every s ∈ N. This follows from (2.1) and (2.2), since c(a + s) > b(a + s) for every s ∈ N.

b) The sequence {u_cs}_{s=0}^∞ is a decreasing sequence.
Let us verify that u_cs > u_{c,s+1} for every s ∈ N. If s = 0 and s = 1, then (2.1) gives

u_c0 = c(a)  and  u_c1 = (c(a + 1) − δ(a)) / (1 + ϕ(a)).

The inequality u_c0 > u_c1 is, due to the property 1 + ϕ(a) > 0, a consequence of (1.8) with k = a, since

(1 + ϕ(a))c(a) + δ(a) − c(a + 1) > 0.

Let us consider the general case. For k = a + s, s ∈ N, inequality (1.8) gives

(1 + ϕ(a + s))c(a + s) + δ(a + s) − c(a + s + 1) > 0,

or (since 1 + ϕ(a + s) > 0)

c(a + s) > (c(a + s + 1) − δ(a + s)) / (1 + ϕ(a + s)).    (2.4)

Then, using (2.1), we estimate the general term u_cs, s ∈ N, of the sequence {u_cs}_{s=0}^∞:

u_cs = [ c(a + s) − ∑_{i=0}^{s−1} δ(a + i) ∏_{j=a+i+1}^{a+s−1} (1 + ϕ(j)) ] / ∏_{i=0}^{s−1} (1 + ϕ(a + i))

> [due to (2.4)]

> [ (c(a + s + 1) − δ(a + s)) / (1 + ϕ(a + s)) − ∑_{i=0}^{s−1} δ(a + i) ∏_{j=a+i+1}^{a+s−1} (1 + ϕ(j)) ] / ∏_{i=0}^{s−1} (1 + ϕ(a + i))

= [ c(a + s + 1) − δ(a + s) − ∑_{i=0}^{s−1} δ(a + i) ∏_{j=a+i+1}^{a+s} (1 + ϕ(j)) ] / ∏_{i=0}^{s} (1 + ϕ(a + i))

= [ c(a + s + 1) − ∑_{i=0}^{s} δ(a + i) ∏_{j=a+i+1}^{a+s} (1 + ϕ(j)) ] / ∏_{i=0}^{s} (1 + ϕ(a + i)) = u_{c,s+1}.

So the inequality u_cs > u_{c,s+1} holds for every s ∈ N.
c) The sequence {u_bs}_{s=0}^∞ is an increasing sequence.
Let us show that u_bs < u_{b,s+1} for every s ∈ N. If s = 0 and s = 1, then (2.2) gives

u_b0 = b(a)  and  u_b1 = (b(a + 1) − δ(a)) / (1 + ϕ(a)).

The inequality u_b0 < u_b1 is, due to the property 1 + ϕ(a) > 0, a consequence of (1.7) with k = a, since

(1 + ϕ(a))b(a) + δ(a) − b(a + 1) < 0.

Let us consider the general case. For k = a + s, s ∈ N, inequality (1.7) gives

(1 + ϕ(a + s))b(a + s) + δ(a + s) − b(a + s + 1) < 0,

or (since 1 + ϕ(a + s) > 0)

b(a + s) < (b(a + s + 1) − δ(a + s)) / (1 + ϕ(a + s)).    (2.5)

Then, using (2.2), we estimate the general term u_bs, s ∈ N, of the sequence {u_bs}_{s=0}^∞:

u_bs = [ b(a + s) − ∑_{i=0}^{s−1} δ(a + i) ∏_{j=a+i+1}^{a+s−1} (1 + ϕ(j)) ] / ∏_{i=0}^{s−1} (1 + ϕ(a + i))

< [due to (2.5)]

< [ (b(a + s + 1) − δ(a + s)) / (1 + ϕ(a + s)) − ∑_{i=0}^{s−1} δ(a + i) ∏_{j=a+i+1}^{a+s−1} (1 + ϕ(j)) ] / ∏_{i=0}^{s−1} (1 + ϕ(a + i))

= [ b(a + s + 1) − δ(a + s) − ∑_{i=0}^{s−1} δ(a + i) ∏_{j=a+i+1}^{a+s} (1 + ϕ(j)) ] / ∏_{i=0}^{s} (1 + ϕ(a + i))

= [ b(a + s + 1) − ∑_{i=0}^{s} δ(a + i) ∏_{j=a+i+1}^{a+s} (1 + ϕ(j)) ] / ∏_{i=0}^{s} (1 + ϕ(a + i)) = u_{b,s+1}.

Consequently, the inequality u_bs < u_{b,s+1} is verified for every s ∈ N.
The lemma is proved, since all remaining affirmations are elementary consequences of the theory of number sequences.
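Lemma 2.1 is easy to check numerically. The sketch below uses a concrete example of our own, not one from the paper: a = 0, ϕ(k) = −0.5, δ(k) = 0.5 and the strip b(k) = 1 − 2·0.3^k < u < c(k) = 1 + 2·0.3^k, for which ϕ(k) > −1 and the inequalities (1.7), (1.8) hold, since (1 + ϕ)b(k) + δ − b(k + 1) = −0.4·0.3^k < 0 and (1 + ϕ)c(k) + δ − c(k + 1) = 0.4·0.3^k > 0.

```python
# Evaluate u_cs (formula (2.1), bound = c) and u_bs (formula (2.2), bound = b)
# for the concrete data phi(k) = -0.5, delta(k) = 0.5, a = 0,
# b(k) = 1 - 2*0.3**k, c(k) = 1 + 2*0.3**k.
phi = lambda k: -0.5
delta = lambda k: 0.5
b = lambda k: 1 - 2 * 0.3 ** k
c = lambda k: 1 + 2 * 0.3 ** k

def u_s(bound, s, a=0):
    # numerator: bound(a+s) - sum_{i=0}^{s-1} delta(a+i) * prod_{j=a+i+1}^{a+s-1} (1 + phi(j))
    num = bound(a + s)
    for i in range(s):
        prod = 1.0
        for j in range(a + i + 1, a + s):
            prod *= 1 + phi(j)
        num -= delta(a + i) * prod
    # denominator: prod_{i=0}^{s-1} (1 + phi(a+i))
    den = 1.0
    for i in range(s):
        den *= 1 + phi(a + i)
    return num / den

ucs = [u_s(c, s) for s in range(12)]  # decreasing, tends to c* = 1
ubs = [u_s(b, s) for s in range(12)]  # increasing, tends to b* = 1
```

For this example the closed forms are u_cs = 1 + 2·0.6^s and u_bs = 1 − 2·0.6^s, so b* = c* = 1 and the interval [b*, c*] of Theorem 2.4 degenerates to a single admissible initial value.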
2.2 Main Results

Lemma 2.2. Let ϕ(k) > −1 for every k ∈ N(a). Then the solutions u(k), U(k), k ∈ N(a), of the two initial problems for the linear equation (1.5)

u(a) = α  and  U(a) = β

with α < β satisfy the inequalities

u(k) < U(k)    (2.6)

for every k ∈ N(a).
Proof. For k = a + 1 we have

u(a + 1) = (1 + ϕ(a))u(a) + δ(a) = (1 + ϕ(a))α + δ(a),
U(a + 1) = (1 + ϕ(a))U(a) + δ(a) = (1 + ϕ(a))β + δ(a),

and u(a + 1) < U(a + 1). Let inequality (2.6) hold for k = a, a + 1, ..., a + p. Then

u(a + p + 1) = (1 + ϕ(a + p))u(a + p) + δ(a + p),
U(a + p + 1) = (1 + ϕ(a + p))U(a + p) + δ(a + p),

and obviously u(a + p + 1) < U(a + p + 1). The proof is complete.
Lemma 2.3. Let ϕ(k) > −1 for every k ∈ N(a) and let the inequalities (1.7), (1.8) hold. Then:

a) The solution u = u*_cs(k), k ∈ N(a), of the problem

u*_cs(a) = u_cs,  s ∈ N,    (2.7)

for the linear equation (1.5) satisfies the relations

u*_cs(k) ∈ ω(k),  k = a, a + 1, ..., a + s − 1,    (2.8)

and

u*_cs(a + s) = c(a + s).    (2.9)

Moreover,

u*_{c,s+1}(k) < u*_cs(k),  k = a, a + 1, ..., a + s.    (2.10)

b) The solution u = u*_bs(k), k ∈ N(a), of the problem

u*_bs(a) = u_bs,  s ∈ N,    (2.11)

for the linear equation (1.5) satisfies the relations

u*_bs(k) ∈ ω(k),  k = a, a + 1, ..., a + s − 1,    (2.12)

and

u*_bs(a + s) = b(a + s).    (2.13)

Moreover,

u*_{b,s+1}(k) > u*_bs(k),  k = a, a + 1, ..., a + s.    (2.14)
Proof.
α) Consider the initial problem (2.7). Then

u*_cs(a + 1) = (1 + ϕ(a))u*_cs(a) + δ(a) = (1 + ϕ(a))u_cs + δ(a).    (2.15)

Suppose that the formula

u*_cs(a + k) = u_cs ∏_{i=0}^{k−1} (1 + ϕ(a + i)) + ∑_{i=0}^{k−1} δ(a + i) ∏_{j=i+1}^{k−1} (1 + ϕ(a + j))    (2.16)

holds for k = 0, 1, ..., s − 1. For k = 0 and k = 1 it holds, since we consequently get the relations (2.7) and (2.15). Moreover,

u*_cs(a + s) = (1 + ϕ(a + s − 1))u*_cs(a + s − 1) + δ(a + s − 1)

= (1 + ϕ(a + s − 1)) [ u_cs ∏_{i=0}^{s−2} (1 + ϕ(a + i)) + ∑_{i=0}^{s−2} δ(a + i) ∏_{j=i+1}^{s−2} (1 + ϕ(a + j)) ] + δ(a + s − 1)

= u_cs ∏_{i=0}^{s−1} (1 + ϕ(a + i)) + ∑_{i=0}^{s−2} δ(a + i) ∏_{j=i+1}^{s−1} (1 + ϕ(a + j)) + δ(a + s − 1)

= u_cs ∏_{i=0}^{s−1} (1 + ϕ(a + i)) + ∑_{i=0}^{s−1} δ(a + i) ∏_{j=i+1}^{s−1} (1 + ϕ(a + j)).

Comparing the last expression with (2.16), we conclude that (2.16) holds for k = s and, moreover, for every k ∈ N. Using the representation (2.1), we get

u*_cs(a + s) = [ ( c(a + s) − ∑_{i=0}^{s−1} δ(a + i) ∏_{j=i+1}^{s−1} (1 + ϕ(a + j)) ) / ∏_{i=0}^{s−1} (1 + ϕ(a + i)) ] · ∏_{i=0}^{s−1} (1 + ϕ(a + i)) + ∑_{i=0}^{s−1} δ(a + i) ∏_{j=i+1}^{s−1} (1 + ϕ(a + j)) = c(a + s).

So the formula (2.9) holds.
β) Consider the initial problem (2.11). Then

u*_bs(a + 1) = (1 + ϕ(a))u*_bs(a) + δ(a) = (1 + ϕ(a))u_bs + δ(a).    (2.17)

Suppose that the formula

u*_bs(a + k) = u_bs ∏_{i=0}^{k−1} (1 + ϕ(a + i)) + ∑_{i=0}^{k−1} δ(a + i) ∏_{j=i+1}^{k−1} (1 + ϕ(a + j))

holds for k = 0, 1, ..., s − 1 (for k = 0 and k = 1 we consequently get the relations (2.11) and (2.17)). Moreover,

u*_bs(a + s) = (1 + ϕ(a + s − 1))u*_bs(a + s − 1) + δ(a + s − 1)

= (1 + ϕ(a + s − 1)) [ u_bs ∏_{i=0}^{s−2} (1 + ϕ(a + i)) + ∑_{i=0}^{s−2} δ(a + i) ∏_{j=i+1}^{s−2} (1 + ϕ(a + j)) ] + δ(a + s − 1)

= u_bs ∏_{i=0}^{s−1} (1 + ϕ(a + i)) + ∑_{i=0}^{s−2} δ(a + i) ∏_{j=i+1}^{s−1} (1 + ϕ(a + j)) + δ(a + s − 1)

= u_bs ∏_{i=0}^{s−1} (1 + ϕ(a + i)) + ∑_{i=0}^{s−1} δ(a + i) ∏_{j=i+1}^{s−1} (1 + ϕ(a + j)).

Comparing the last expression with the formula above, we conclude that it holds for k = s and, consequently, for every k ∈ N. Using the representation (2.2), we get

u*_bs(a + s) = [ ( b(a + s) − ∑_{i=0}^{s−1} δ(a + i) ∏_{j=i+1}^{s−1} (1 + ϕ(a + j)) ) / ∏_{i=0}^{s−1} (1 + ϕ(a + i)) ] · ∏_{i=0}^{s−1} (1 + ϕ(a + i)) + ∑_{i=0}^{s−1} δ(a + i) ∏_{j=i+1}^{s−1} (1 + ϕ(a + j)) = b(a + s).

So the formula (2.13) is proved.
γ) By Lemma 2.1 we have u_cs > u_bs, s ∈ N. Then, by Lemma 2.2, for every k ∈ N(a) the inequality u*_cs(k) > u*_bs(k) holds. Since, by Lemma 2.1, u_cs > u_{c,s+1}, s ∈ N, and u_bs < u_{b,s+1}, s ∈ N, the properties (2.10) and (2.14) are a consequence of Lemma 2.1 and Lemma 2.2. Let us prove the relations (2.8), (2.12). Since u*_bs(a) < u*_cs(a) and u*_{b,s+1}(a) > u*_bs(a), u*_{c,s+1}(a) < u*_cs(a), Lemma 2.2 gives

u*_bs(k) < u*_{b,s+1}(k) < u*_{c,s+1}(k) < u*_cs(k).

For k = a + s we get

b(a + s) = u*_bs(a + s) < u*_{b,s+1}(a + s) < u*_{c,s+1}(a + s) < u*_cs(a + s) = c(a + s).

The last inequalities can be rewritten as

b(a + s̃) = u*_{bs̃}(a + s̃) < u*_{bs}(a + s̃) < u*_{cs}(a + s̃) < u*_{cs̃}(a + s̃) = c(a + s̃).

Putting here consequently s̃ = 0, 1, 2, ..., s − 1, we see that (2.8) and (2.12) hold, too.
Theorem 2.4. [Main Result] Let for every $k \in N(a)$: $\varphi(k) > -1$ and let the inequalities (1.7), (1.8) hold. Then every initial problem (1.9) with $u^{*} \in [b^{*}, c^{*}]$, where $b^{*}$ and $c^{*}$ are defined by (2.3), determines a solution satisfying the inequalities (1.10).
Proof. The proof is a straightforward consequence of Lemmas 2.1, 2.2, 2.3. For the solutions of the problems $u(a) = b^{*}$, $U(a) = c^{*}$ we have $b(k) < u(k) \le U(k) < c(k)$ for every $k = a, a+1, \ldots$, and $u(k) \le \tilde{u}(k) \le U(k)$ for every $k = a, a+1, \ldots$ if $\tilde{u}(a) \in [b^{*}, c^{*}]$.
Consequence 1. If Theorem 2.4 holds, then the expression

$$u(a+s) = u^{*}\prod_{i=0}^{s-1}(1+\varphi(a+i)) + \sum_{i=0}^{s-1}\delta(a+i)\prod_{p=i+1}^{s-1}(1+\varphi(a+p))$$

with $u^{*} \in [b^{*}, c^{*}]$ is a solution of the problem (1.5), (1.9) satisfying the inequalities (1.10).
Proof. The proof follows immediately from the statement of Theorem 2.4 and from the explicit form of the solution of the problem (1.5), (1.9), which can be derived from (1.5) directly.
Theorem 2.5. Let for every $k \in N(a)$: $\varphi(k) > -1$ and let the inequalities (1.7), (1.8) hold. The initial problem

$$u(a) = u^{\nabla}$$

with $u^{\nabla} \in [b(a), c(a)] \setminus [b^{*}, c^{*}]$ generates a solution $u = u^{\nabla}(k)$ of equation (1.5) not satisfying the inequalities (1.10) for all $k \in N(a)$.
Proof. Let us suppose that $u^{\nabla} \in (c^{*}, c(a)]$. Then there exists a number $s = s^{\nabla} \in N$ such that $u^{\nabla} > u_{cs^{\nabla}}$. By Lemma 2.2, the inequalities

$$u^{\nabla}(k) > u_{cs^{\nabla}}^{*}(k),$$

where $u_{cs^{\nabla}}^{*}(k)$ is the solution with $u_{cs^{\nabla}}^{*}(a) = u_{cs^{\nabla}}$, hold for every $k \in N(a)$. In accordance with Lemma 2.3,

$$u_{cs^{\nabla}}^{*}(k) \in \omega(k), \quad k = a, a+1, \ldots, a+s^{\nabla}-1,$$

and

$$u_{cs^{\nabla}}^{*}(a+s^{\nabla}) = c(a+s^{\nabla}).$$

Moreover, due to (1.5) and (1.8),

$$u_{cs^{\nabla}}^{*}(a+s^{\nabla}+1) = (1+\varphi(a+s^{\nabla}))u_{cs^{\nabla}}^{*}(a+s^{\nabla}) + \delta(a+s^{\nabla}) = (1+\varphi(a+s^{\nabla}))c(a+s^{\nabla}) + \delta(a+s^{\nabla}) > c(a+s^{\nabla}+1).$$

Consequently,

$$u^{\nabla}(a+s^{\nabla}+1) > c(a+s^{\nabla}+1).$$

So the inequalities (1.10) do not hold for $k = a+s^{\nabla}+1$. The case $u^{\nabla} \in [b(a), b^{*})$ can be considered similarly. The following two corollaries follow obviously from Theorem 2.4 and Theorem 2.5.
Corollary 1. Let for every $k \in N(a)$: $\varphi(k) > -1$ and let the inequalities (1.7), (1.8) hold. Then a solution $u = u(k)$ of equation (1.5) satisfies the inequalities (1.10) for every $k \in N(a)$ if and only if $u(a) \in [b^{*}, c^{*}]$.
Corollary 2. Let for every $k \in N(a)$: $\varphi(k) > -1$ and let the inequalities (1.7), (1.8) hold. Let, moreover, $b^{*} = c^{*}$. Then the equation (1.5) has a unique solution $u = u^{*}(k)$ satisfying for every $k \in N(a)$ the inequalities (1.10). This solution is determined by the initial data

$$u^{*}(a) = u^{*} = b^{*}.$$

It is interesting to find sufficient conditions for the case $b^{*} = c^{*}$. Let for every $k \in N(a)$: $\varphi(k) > -1$ and let the inequalities (1.7), (1.8) hold. Let us denote

$$\Delta(s) = u_{cs} - u_{bs}, \quad s = 0, 1, \ldots$$

Then the length of the interval $[b^{*}, c^{*}]$ can be estimated (due to the monotonicity of the sequences $\{u_{cs}\}_{s=0}^{\infty}$, $\{u_{bs}\}_{s=0}^{\infty}$) as

$$0 \le c^{*} - b^{*} < \Delta(s), \quad s = 0, 1, \ldots$$
From the definition of the expressions $u_{cs}$, $u_{bs}$ we see that [due to (2.1) and (2.2)]

$$\Delta(s) = u_{cs} - u_{bs} = \frac{c(a+s) - \sum_{i=0}^{s-1}\delta(a+i)\prod_{j=a+i+1}^{a+s-1}(1+\varphi(j))}{\prod_{i=0}^{s-1}(1+\varphi(a+i))} - \frac{b(a+s) - \sum_{i=0}^{s-1}\delta(a+i)\prod_{j=a+i+1}^{a+s-1}(1+\varphi(j))}{\prod_{i=0}^{s-1}(1+\varphi(a+i))},$$

i.e.

$$\Delta(s) = \frac{c(a+s) - b(a+s)}{\prod_{i=0}^{s-1}(1+\varphi(a+i))}.$$
The following corollary is obvious.
Corollary 3. Let for every $k \in N(a)$: $\varphi(k) > -1$ and let the inequalities (1.7), (1.8) hold. Then the following inequalities hold:

$$0 < u_{cs} - c^{*} < \Delta(s), \qquad 0 < b^{*} - u_{bs} < \Delta(s), \quad s \in N.$$
Theorem 2.6. Let for every $k \in N(a)$: $\varphi(k) > -1$ and let the inequalities (1.7), (1.8) hold. Then $b^{*} = c^{*}$ if

$$\lim_{s\to\infty}\Delta(s) = 0. \qquad (2.18)$$
Proof. From Theorem 1.5, the existence of a solution of problem (1.5), (1.9) follows. Then

$$c^{*} - b^{*} = \lim_{s\to\infty}\frac{c(a+s) - \sum_{i=0}^{s-1}\delta(a+i)\prod_{j=a+i+1}^{a+s-1}(1+\varphi(j))}{\prod_{i=0}^{s-1}(1+\varphi(a+i))} - \lim_{s\to\infty}\frac{b(a+s) - \sum_{i=0}^{s-1}\delta(a+i)\prod_{j=a+i+1}^{a+s-1}(1+\varphi(j))}{\prod_{i=0}^{s-1}(1+\varphi(a+i))}$$

$$= \lim_{s\to\infty}\frac{c(a+s) - b(a+s)}{\prod_{i=0}^{s-1}(1+\varphi(a+i))} = \lim_{s\to\infty}\Delta(s) = 0.$$

Then $c^{*} = b^{*}$.
Remark 2.7. The condition (2.18) is valid e.g. in the case when

$$\lim_{s\to\infty}[c(a+s) - b(a+s)] = 0, \qquad \lim_{s\to\infty}\prod_{i=0}^{s-1}(1+\varphi(a+i)) > 0,$$

or in the case when the functions $c(k)$, $b(k)$ are bounded on $N(a)$ and $\varphi(k) \ge \varepsilon > 0$, $\varepsilon = \mathrm{const}$, for every $k \in N(a)$.
2.3 Concluding remarks
Let us consider a particular case of equation (1.5) with $\varphi(k) = -1 + A$, $A \in R$, $A > 1$, and with a function $\delta(k)$:

$$u(k+1) = Au(k) + \delta(k). \qquad (2.19)$$

The following theorem is a consequence of the previous results:
Theorem 2.8. Let $A > 1$ and $|\delta(k)| < M$ on $N(a)$. Then the initial problem

$$u^{*}(a) = u^{*} = -\sum_{i=0}^{\infty}\frac{\delta(a+i)}{A^{i+1}} \qquad (2.20)$$

generates a unique bounded solution of equation (2.19) on $N(a)$.
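Numerically the dichotomy is striking: the orbit started at the series value (2.20) stays in the band, while any other start is amplified by the factor A at every step. A small sketch (not from the paper; A and δ below are arbitrary sample choices with |δ(k)| < M = 1/3):

```python
A = 2.0
delta = lambda k: (-1) ** k / (k + 3)    # any sequence with |delta(k)| < M
a = 0

# truncation of the series (2.20); the neglected tail is below M / A**200
ustar = -sum(delta(a + i) / A ** (i + 1) for i in range(200))

def orbit(u0, n):
    """n steps of (2.19): u(k+1) = A u(k) + delta(k)."""
    u = u0
    for k in range(a, a + n):
        u = A * u + delta(k)
    return u

print(abs(orbit(ustar, 40)))          # stays below M/(A-1): the bounded solution
print(abs(orbit(ustar + 1e-6, 40)))   # the 1e-6 perturbation is amplified by A**40
```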
Proof. The series (2.20) is obviously convergent, since it can be majorized by the convergent series

$$\frac{M}{A}\sum_{i=0}^{\infty}\frac{1}{A^{i}} = \frac{M}{A-1}.$$
Put $c(k) := \varepsilon M$, $b(k) := -\varepsilon M$ with $\varepsilon > 1/(A-1)$, $\varepsilon = \mathrm{const}$. Then the inequalities (1.7) and (1.8) hold, since

$$(1+\varphi(k))b(k) + \delta(k) - b(k+1) \le M(\varepsilon(1-A)+1) < 0$$

and

$$(1+\varphi(k))c(k) + \delta(k) - c(k+1) \ge M(\varepsilon(A-1)-1) > 0$$

for every $k \in N(a)$. All assumptions of Theorem 2.4 hold. Let us compute the limits $c^{*}$, $b^{*}$. In accordance with (2.3), (2.1), we get

$$c^{*} = \lim_{s\to\infty}\frac{c(a+s) - \sum_{i=0}^{s-1}\delta(a+i)\prod_{j=a+i+1}^{a+s-1}(1+\varphi(j))}{\prod_{i=0}^{s-1}(1+\varphi(a+i))} = \lim_{s\to\infty}\frac{\varepsilon M - \sum_{i=0}^{s-1}\delta(a+i)A^{s-i-1}}{A^{s}} = -\sum_{i=0}^{\infty}\frac{\delta(a+i)}{A^{i+1}} = u^{*}.$$
Since

$$\lim_{s\to\infty}\Delta(s) = \lim_{s\to\infty}\frac{c(a+s)-b(a+s)}{\prod_{i=0}^{s-1}(1+\varphi(a+i))} = \lim_{s\to\infty}\frac{2\varepsilon M}{A^{s}} = 0,$$

then in accordance with Theorem 2.6 we conclude

$$u^{*} = c^{*} = b^{*}.$$

This is in accordance with (2.20). Then by Corollary 2 there exists a unique solution $u = u^{*}(k)$ (generated just by the initial data (2.20)) satisfying for every $k \in N(a)$ the inequalities

$$-\varepsilon M < u^{*}(k) < \varepsilon M.$$

These inequalities express the boundedness of $u^{*}(k)$ on $N(a)$. Since the inequalities (1.7), (1.8) are valid for arbitrary positive $M$, we can conclude that the bounded solution of equation (2.19) is in fact unique.
Theorem 2.8 serves only as an illustration of the results obtained. Let us note that this result coincides with the result described in the book [5, p. 77, Exercise 20, part (c)].
2.4 Examples
In this section, two illustrative examples are considered. In the first one we determine initial data generating the unique bounded solution of an equation of the type (1.5). In the second one every solution is unbounded, but suitable initial data determine a particular solution with slightly different asymptotic behaviour.
Example 2.9. Let us consider the equation (1.5) with $\varphi(k) = 1/k$, $\delta(k) = -1/(k+2)$ and $a = 1$, i.e. the equation

$$\Delta u(k) = \frac{1}{k}u(k) - \frac{1}{k+2}, \quad k \in N(1). \qquad (2.21)$$

Define $b(k) \equiv 0$, $c(k) \equiv 1$, $k \in N(1)$. Then

$$(1+\varphi(k))b(k) + \delta(k) - b(k+1) = \delta(k) = -\frac{1}{k+2} < 0$$

for all $k \in N(1)$ and the inequalities (1.7) hold. Moreover,

$$(1+\varphi(k))c(k) + \delta(k) - c(k+1) = \frac{1}{k} - \frac{1}{k+2} = \frac{2}{k(k+2)} > 0$$

and the inequalities (1.8) hold for all $k \in N(1)$. All conditions of Theorem 2.4 hold and therefore there exists a solution $\tilde{u}(k)$ of equation (2.21) such that

$$0 < \tilde{u}(k) < 1 \qquad (2.22)$$
for every $k \in N(1)$. Moreover, using (2.3) and (2.1) we get

$$c^{*} = \lim_{s\to+\infty}u_{cs} = \lim_{s\to+\infty}\frac{c(a+s)-\sum_{i=0}^{s-1}\delta(a+i)\prod_{j=a+i+1}^{a+s-1}(1+\varphi(j))}{\prod_{i=0}^{s-1}(1+\varphi(a+i))} = \lim_{s\to+\infty}\frac{1+\sum_{i=0}^{s-1}\frac{1}{i+3}\prod_{j=i+2}^{s}\left(1+\frac{1}{j}\right)}{\prod_{i=0}^{s-1}\left(1+\frac{1}{1+i}\right)}.$$

Since $\prod_{i=0}^{s-1}\left(1+\frac{1}{1+i}\right) = \frac{2}{1}\cdot\frac{3}{2}\cdots\frac{s+1}{s} = s+1$ and $\prod_{j=i+2}^{s}\left(1+\frac{1}{j}\right) = \frac{s+1}{i+2}$, this equals

$$\lim_{s\to+\infty}\left[\frac{1}{s+1}+\sum_{i=0}^{s-1}\frac{1}{(i+2)(i+3)}\right] = \lim_{s\to+\infty}\left[\frac{1}{s+1}+\left(\frac{1}{2}-\frac{1}{3}\right)+\left(\frac{1}{3}-\frac{1}{4}\right)+\cdots+\left(\frac{1}{s+1}-\frac{1}{s+2}\right)\right] = \frac{1}{2}.$$
Since

$$\lim_{s\to\infty}\Delta(s) = \lim_{s\to\infty}\frac{c(a+s)-b(a+s)}{\prod_{i=0}^{s-1}(1+\varphi(a+i))} = \lim_{s\to\infty}\frac{1}{\prod_{i=0}^{s-1}\frac{2+i}{1+i}} = \lim_{s\to\infty}\frac{1}{s+1} = 0,$$
then, in accordance with Theorem 2.6, $b^{*} = c^{*} = 1/2$. Therefore the equation (2.21) has a unique solution $u = \tilde{u}(k)$, $k \in N(1)$, satisfying the inequalities (2.22), determined by the initial data $u(1) = 1/2$. Indeed, it is easy to verify that the function

$$\tilde{u}(k) = \frac{k}{k+1}, \quad k \in N(1),$$
is such a solution of equation (2.21). Moreover, the general solution of (2.21) is given by the formula

$$u(k) = \tilde{u}(k) + Ck,$$

where $C$ is an arbitrary constant. It follows that $\tilde{u}(k)$ is the unique bounded solution of equation (2.21).
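Both claims of this example are easy to check numerically; a short sketch (not from the paper — the perturbation size and the number of steps are ad-hoc choices):

```python
def step(u, k):
    """One step of (2.21): u(k+1) = u(k) + u(k)/k - 1/(k+2)."""
    return u + u / k - 1.0 / (k + 2)

u = 0.5                                   # the initial data u(1) = 1/2 = b* = c*
for k in range(1, 60):
    assert abs(u - k / (k + 1)) < 1e-9    # follows u~(k) = k/(k+1)
    assert 0 < u < 1                      # inequalities (2.22)
    u = step(u, k)

v = 0.5 + 1e-6                            # any other initial value: u = u~ + C k
for k in range(1, 20001):
    v = step(v, k)
print(v)                                  # the C*k term has pushed it out of (0, 1)
```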
Example 2.10. Let us consider the equation (1.5) with $\varphi(k) = 2/(k+1)$, $\delta(k) = 2$ and $a = 2$, i.e. the equation

$$\Delta u(k) = \frac{2}{k+1}u(k) + 2, \quad k \in N(2). \qquad (2.23)$$

Define $b(k) = k^{2}$, $c(k) = (k+1)^{2}$, $k \in N(2)$. Then

$$(1+\varphi(k))b(k) + \delta(k) - b(k+1) = \left(1+\frac{2}{k+1}\right)k^{2} + 2 - (k+1)^{2} = \frac{1-k}{k+1} < 0$$

for all $k \in N(2)$ and the inequalities (1.7) hold. Moreover,

$$(1+\varphi(k))c(k) + \delta(k) - c(k+1) = \left(1+\frac{2}{k+1}\right)(k+1)^{2} + 2 - (k+2)^{2} = 1 > 0$$

and the inequalities (1.8) hold for all $k \in N(2)$. All conditions of Theorem 2.4 hold and therefore there exists a solution $\tilde{u}(k)$ of equation (2.23) such that

$$k^{2} < \tilde{u}(k) < (k+1)^{2} \qquad (2.24)$$
for every $k \in N(2)$. Using (2.3) and (2.1) we get

$$c^{*} = \lim_{s\to+\infty}u_{cs} = \lim_{s\to+\infty}\frac{c(a+s)-\sum_{i=0}^{s-1}\delta(a+i)\prod_{j=a+i+1}^{a+s-1}(1+\varphi(j))}{\prod_{i=0}^{s-1}(1+\varphi(a+i))} = \lim_{s\to+\infty}\frac{(s+3)^{2}-2\sum_{i=0}^{s-1}\prod_{j=i+3}^{s+1}\frac{j+3}{j+1}}{\prod_{i=0}^{s-1}\frac{i+5}{i+3}}.$$

Here

$$\prod_{i=0}^{s-1}\frac{i+5}{i+3} = \frac{(s+3)(s+4)}{3\cdot 4} = \frac{1}{12}(s+3)(s+4), \qquad \prod_{j=i+3}^{s+1}\frac{j+3}{j+1} = \frac{(s+3)(s+4)}{(i+4)(i+5)},$$

so that

$$c^{*} = \lim_{s\to+\infty}\frac{(s+3)^{2}-2(s+3)(s+4)\left(\frac{1}{4\cdot 5}+\frac{1}{5\cdot 6}+\cdots+\frac{1}{(s+3)(s+4)}\right)}{\frac{1}{12}(s+3)(s+4)} = \lim_{s\to+\infty}\left[12\cdot\frac{s+3}{s+4} - 24\sum_{i=4}^{s+3}\left(\frac{1}{i}-\frac{1}{i+1}\right)\right] = 12 - \frac{24}{4} = 6.$$
Since

$$\lim_{s\to\infty}\Delta(s) = \lim_{s\to\infty}\frac{c(a+s)-b(a+s)}{\prod_{i=0}^{s-1}(1+\varphi(a+i))} = \lim_{s\to\infty}\frac{(s+3)^{2}-(s+2)^{2}}{\frac{1}{12}(s+3)(s+4)} = \lim_{s\to\infty}\frac{12(2s+5)}{(s+3)(s+4)} = 0,$$
then, in accordance with Theorem 2.6, $b^{*} = c^{*} = 6$. Therefore the equation (2.23) has a unique solution $u = \tilde{u}(k)$, $k \in N(2)$, satisfying the inequalities (2.24), determined by the initial data $\tilde{u}(2) = 6$. Indeed, it is easy to verify that the function

$$\tilde{u}(k) = k(k+1), \quad k \in N(2),$$

is such a solution of equation (2.23). It is easy to see that the general solution of (2.23) is given by the formula

$$u(k) = \tilde{u}(k) + C(k+1)(k+2),$$

where $C$ is an arbitrary constant.
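As in the previous example, the claimed solution and the strip (2.24) can be verified by direct iteration; a minimal sketch (not from the paper):

```python
def step(u, k):
    """One step of (2.23): u(k+1) = u(k) + 2u(k)/(k+1) + 2."""
    return u + 2.0 * u / (k + 1) + 2.0

u = 6.0                                    # the initial data u(2) = 6 = b* = c*
for k in range(2, 40):
    assert abs(u - k * (k + 1)) < 1e-6     # follows u~(k) = k(k+1)
    assert k ** 2 < u < (k + 1) ** 2       # inequalities (2.24)
    u = step(u, k)
print("stays in the strip k^2 < u < (k+1)^2")
```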
Acknowledgment
The first author was supported by Grant 201/04/0580 of the Czech Grant Agency (Prague); the second author was supported by the Council of the Czech Government, MSM 00216 30503.
References
[1] AGARWAL, R. P. Difference Equations and Inequalities, Marcel Dekker, Inc., 2nd ed., 2000.
[2] AGARWAL, R. P.; POPENDA, J. Periodic solutions of first order linear difference equations, Mathl. Comput. Modelling 22 (1995), 11-19.
[3] DIBLÍK, J. Discrete retract principle for systems of discrete equations, Comput. Math. Appl. 42 (2001), 515-528.
[4] DIBLÍK, J. Asymptotic behaviour of solutions of discrete equations, Funct. Differ. Equ. 11 (2004), 37-48; DIBLÍK, J. Retract principle for difference equations, in: Proceedings of the Fourth International Conference on Difference Equations, Poznan, Poland, August 27-31, 1998 (S. Elaydi, G. Ladas, J. Popenda, J. Rakowski, eds.), Gordon and Breach Science Publ., 2000, 107-114.
[5] ELAYDI, S. N. An Introduction to Difference Equations, 2nd ed., Springer, 1999.
[6] GOLDA, W.; WERBOWSKI, G. Oscillation of linear functional equations of the second order, Funkc. Ekvac. 37 (1994), 221-227.
[7] GYORI, I.; PITUK, M. Asymptotic formulae for the solutions of a linear delay difference equation, J. Math. Anal. Appl. 195 (1995), 376-392.
[8] GYORI, I.; PITUK, M. Comparison theorems and asymptotic equilibrium for delay differential and difference equations, Dyn. Systems Appl. 5 (1996), 277-302.
[9] MIGDA, M.; MIGDA, J. Asymptotic behaviour of solutions of difference equations of second order, Demonstr. Math. XXXII (1999), 767-773.
Address:
Doc. RNDr. Jaromír Baštinec, CSc.
Department of Mathematics
Faculty of Electrical Engineering and Communication
Brno University of Technology
Technická 8,
616 00 Brno,
[email protected]
Address:
Doc. RNDr. Josef Diblík, DrSc.
Department of Mathematics
Faculty of Electrical Engineering and Communication
Brno University of Technology
Technická 8,
616 00 Brno,
[email protected]
ON SOME PROPERTIES OF FRACTIONAL CALCULUS
Vlasta Krupková, Zdeněk Šmarda
Brno University of Technology
Abstract: This paper is devoted to some properties of the fractional integral and the fractional derivative. Applications of this calculus to some classes of fractional integral equations are shown.
Key words: Fractional integral, fractional derivative, Laplace transform.
1. Introduction
The fractional calculus is a generalization of integration and differentiation to non-integer order operators [4,7]. The idea of fractional calculus has been known since the development of the classical calculus, with the first reference probably being associated with Leibniz and L'Hospital in 1695. Fourier, Euler and Laplace are among the many who dabbled with fractional calculus and its mathematical consequences [6]. The most famous of the definitions that have been popularized in the world of fractional calculus are the Riemann-Liouville and Grünwald-Letnikov definitions [1]. In view of the requirements of physical reality, Caputo reformulated the more "classic" definition of the Riemann-Liouville fractional derivative in order to use integer order initial conditions when solving fractional order differential equations [7]. As recently as 1996, Kolwankar again reformulated the Riemann-Liouville fractional derivative in order to differentiate nowhere differentiable fractal functions [2].
In this paper we also consider other constructions of the fractional derivative, and we show some particularities of these constructions occurring in the solution of certain classes of integral equations.
2. The fractional integral
Understanding the definitions and the use of fractional calculus requires some relatively simple mathematical notions that arise in the study of these concepts.
Euler's Gamma function:

$$\Gamma(t) = \int_{0}^{\infty} x^{t-1}e^{-x}\,dx; \qquad (1)$$

in the special case when $t = n \in N$,

$$\Gamma(n) = (n-1)!. \qquad (2)$$
The Mittag-Leffler function in two parameters:

$$E_{\alpha,\beta}(t) = \sum_{k=0}^{\infty}\frac{t^{k}}{\Gamma(\alpha k + \beta)}, \quad \alpha > 0, \ \beta > 0. \qquad (3)$$
It is a generalization of the exponential function:

$$E_{1,1}(t) = \sum_{k=0}^{\infty}\frac{t^{k}}{\Gamma(k+1)} = \sum_{k=0}^{\infty}\frac{t^{k}}{k!} = e^{t}. \qquad (4)$$

More particular cases are

$$E_{2,1}(t) = \cosh(\sqrt{t}), \qquad E_{1/2,1}(\sqrt{t}) = e^{t}\,\mathrm{erfc}(-\sqrt{t}).$$
The Laplace transform:

$$L\{f(t)\} := \int_{0}^{\infty}e^{-pt}f(t)\,dt = F(p). \qquad (5)$$

Also commonly used is the convolution of two functions,

$$f(t) * g(t) := \int_{0}^{t}f(t-\tau)g(\tau)\,d\tau = g(t) * f(t), \qquad L\{f(t) * g(t)\} = F(p)G(p). \qquad (6)$$
One final important property of the Laplace transform that should be addressed is the Laplace transform of a derivative of integer order $n$ of the function $f(t)$:

$$L\{f^{(n)}(t)\} = p^{n}F(p) - \sum_{k=0}^{n-1}p^{n-k-1}f^{(k)}(0) = p^{n}F(p) - \sum_{k=0}^{n-1}p^{k}f^{(n-k-1)}(0). \qquad (7)$$
The Cauchy formula for evaluating the $n$th repeated integration of a function:

$$\int_{0}^{t}\cdots\int_{0}^{t}f(\tau)\,d\tau^{n} = \frac{1}{(n-1)!}\int_{0}^{t}(t-\tau)^{n-1}f(\tau)\,d\tau. \qquad (8)$$

To abbreviate this formula, we introduce the operator $I^{n}$:

$$I^{n}f(t) := f_{n}(t) = \frac{1}{(n-1)!}\int_{0}^{t}(t-\tau)^{n-1}f(\tau)\,d\tau. \qquad (9)$$
For direct use in (8), $n$ is restricted to be an integer. The primary restriction is the use of the factorial, which in essence has no meaning for non-integer values. The Gamma function is, however, an analytic expansion of the factorial to all reals, and thus can be used in place of the factorial, as in (2). Hence, by replacing the factorial by its Gamma function equivalent, we can generalize (9) to all $\alpha \in R^{+}$:
$$I^{\alpha}f(t) := f_{\alpha}(t) = \frac{1}{\Gamma(\alpha)}\int_{0}^{t}(t-\tau)^{\alpha-1}f(\tau)\,d\tau. \qquad (10)$$
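The integral (10) has an integrable endpoint singularity for α < 1, but the substitution u = (t − τ)^α removes it, after which plain quadrature works; a numerical sketch (not from the paper — function, α, and grid size are ad-hoc choices), checked against the known value I^α t = t^{1+α}/Γ(2+α):

```python
import math

def frac_integral(f, alpha, t, n=4000):
    """Riemann-Liouville I^alpha f(t), eq. (10), via the substitution u = (t-tau)^alpha
    that removes the endpoint singularity; simple midpoint rule."""
    if t == 0:
        return 0.0
    top = t ** alpha
    h = top / n
    s = sum(f(t - (h * (i + 0.5)) ** (1 / alpha)) for i in range(n))
    return s * h / (alpha * math.gamma(alpha))

alpha, t = 0.5, 2.0
approx = frac_integral(lambda x: x, alpha, t)
exact = t ** (1 + alpha) / math.gamma(2 + alpha)   # I^alpha t = t^{1+alpha}/Gamma(2+alpha)
print(approx, exact)
```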
This approach is commonly referred to as the Riemann-Liouville approach. This formulation of the fractional integral carries with it some very important properties that will later show their importance when solving equations involving integrals and derivatives of fractional order. First, we consider integration of order $\alpha = 0$ to be the identity operator, i.e.

$$I^{0}f(t) = f(t).$$

Also, given the nature of the integral's definition, and based on the principle from which it came (the Cauchy repeated integral formula), we can see that just as

$$I^{n}I^{m} = I^{m+n} = I^{m}I^{n}, \quad m, n \in N, \qquad (11)$$

so too

$$I^{\alpha}I^{\beta} = I^{\alpha+\beta} = I^{\beta}I^{\alpha}, \quad \alpha, \beta \in R^{+}.$$
The one presupposed condition placed upon the function $f(t)$ that needs to be satisfied for these and other similar properties to remain true is that $f(t)$ be a causal function, i.e. that it vanishes for $t \le 0$. Although this is a consequence of convention, the convenience of this condition is especially clear in the context of the property demonstrated in (11). The effect is such that $f(0) = f_{n}(0) = f_{\alpha}(0) = 0$. Using the Gamma function we define a function $\Phi_{\alpha}(t)$:

$$\Phi_{\alpha}(t) := \frac{t_{+}^{\alpha-1}}{\Gamma(\alpha)}.$$
From this we obtain

$$\Phi_{\alpha}(t) * f(t) = \int_{0}^{t}\frac{(t-\tau)_{+}^{\alpha-1}}{\Gamma(\alpha)}f(\tau)\,d\tau,$$

where the subscript $+$ denotes that the function vanishes for $t \le 0$. Now the formula of the fractional integral (10) can be written in the form

$$I^{\alpha}f(t) = \Phi_{\alpha}(t) * f(t) = \frac{1}{\Gamma(\alpha)}\int_{0}^{t}(t-\tau)^{\alpha-1}f(\tau)\,d\tau. \qquad (12)$$
We will find the Laplace transform of the Riemann-Liouville fractional integral. In (12) we showed that the fractional integral can be expressed as the convolution of the two functions $\Phi_{\alpha}(t)$ and $f(t)$. The Laplace transform of $t^{\alpha-1}$ is given by

$$L\{t^{\alpha-1}\} = \Gamma(\alpha)p^{-\alpha}.$$

Thus, the Laplace transform of the fractional integral is found to be

$$L\{I^{\alpha}f(t)\} = p^{-\alpha}F(p). \qquad (13)$$
3. The fractional derivative
Because the Riemann-Liouville approach to the fractional integral began with an expression for repeated integration of a function, one's first instinct may be to imitate a similar approach for the derivative. In the following we introduce two definitions of the fractional derivative in the sense of the Riemann-Liouville approach. Consider a differentiation of order $\alpha \in R^{+}$ and select an integer $m$ such that $m-1 < \alpha < m$. Given these numbers, we have two possible ways to define the derivative. Having found the integer $m$, the first process is to integrate the function $f(t)$ to order $m-\alpha$ and then to differentiate the resulting function $f_{m-\alpha}(t)$ to order $m$. This method will be called the Left Hand Definition (LHD) of the fractional derivative and is given by

$$D_{L}^{\alpha}f(t) := \begin{cases}\dfrac{d^{m}}{dt^{m}}\left[\dfrac{1}{\Gamma(m-\alpha)}\displaystyle\int_{0}^{t}\dfrac{f(\tau)}{(t-\tau)^{\alpha+1-m}}\,d\tau\right], & m-1 < \alpha < m,\\[2mm] \dfrac{d^{m}}{dt^{m}}f(t), & \alpha = m.\end{cases} \qquad (14)$$
The Right Hand Definition (RHD) attempts to arrive at the same result using the same operations, but in the reverse order. The mathematical result of this is the form

$$D_{R}^{\alpha}f(t) := \begin{cases}\dfrac{1}{\Gamma(m-\alpha)}\displaystyle\int_{0}^{t}\dfrac{f^{(m)}(\tau)}{(t-\tau)^{\alpha+1-m}}\,d\tau, & m-1 < \alpha < m,\\[2mm] \dfrac{d^{m}}{dt^{m}}f(t), & \alpha = m.\end{cases} \qquad (15)$$
This second definition, although referred to here as the Right Hand Definition, was originally formulated by Caputo, and is therefore commonly referred to as the Caputo fractional derivative. Demonstrating the practicality of the RHD over the LHD is conveniently simple. For example, the fractional derivative of a constant using the LHD is not zero; in fact

$$D_{L}^{\alpha}C = \frac{Ct^{-\alpha}}{\Gamma(1-\alpha)},$$
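This difference is easy to see numerically: for 0 < α < 1 the LHD of a constant C is the derivative of g(t) = I^{1−α}C = C t^{1−α}/Γ(2−α), while the RHD integrates C′ = 0 and so vanishes identically. A small sketch (not from the paper; C, α, t are arbitrary sample values):

```python
import math

C, alpha, t, h = 3.0, 0.4, 1.7, 1e-6
g = lambda t: C * t ** (1 - alpha) / math.gamma(2 - alpha)   # g = I^{1-alpha} C
lhd = (g(t + h) - g(t - h)) / (2 * h)                        # central difference for g'(t)
claimed = C * t ** (-alpha) / math.gamma(1 - alpha)          # D_L^alpha C per the formula above
print(lhd, claimed)                                          # the two values agree

rhd = 0.0   # RHD (Caputo): I^{1-alpha} applied to C' = 0 gives identically zero
```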
which is a substantial problem in the physical world. In Section 2 we demonstrated the Laplace transform of the fractional integral (13). Using this definition, we may similarly find the Laplace transform of the LHD fractional derivative. This derivative may be written in the form

$$D_{L}^{\alpha}f(t) = g^{(m)}(t), \quad \text{where } g(t) = I^{m-\alpha}f(t), \quad m-1 \le \alpha < m.$$

Using the formula (7) and the definition of the fractional integral Laplace transform, we obtain

$$L\{D_{L}^{\alpha}f(t)\} = p^{m}G(p) - \sum_{k=0}^{m-1}p^{k}g^{(m-k-1)}(0) = p^{\alpha}F(p) - \sum_{k=0}^{m-1}p^{k}D_{L}^{\alpha-k-1}f(0). \qquad (16)$$
It is obvious that the required initial conditions are, for $k = 0, \ldots, m-1$, fractional order derivatives of $f(t)$.
For the RHD, we will write the derivative in the form

$$D_{R}^{\alpha}f(t) = I^{m-\alpha}g(t), \quad \text{where } g(t) = f^{(m)}(t), \quad m-1 \le \alpha < m.$$

Using the formula (13), we get

$$L\{D_{R}^{\alpha}f(t)\} = p^{-(m-\alpha)}G(p) = p^{\alpha}F(p) - \sum_{k=0}^{m-1}p^{\alpha-k-1}f^{(k)}(0). \qquad (17)$$
In this formulation, the order $\alpha$ does not appear in the derivatives of $f(t)$. So, quite conveniently, integer order derivatives of $f(t)$ are used as the initial conditions, and are therefore easily interpreted from physical data and observations.
Consider the fractional integral equation of the first kind

$$\frac{1}{\Gamma(\alpha)}\int_{0}^{t}\frac{u(\tau)}{(t-\tau)^{1-\alpha}}\,d\tau = f(t), \quad 0 < \alpha < 1. \qquad (18)$$

This equation can be written in the form

$$I^{\alpha}u(t) = f(t).$$

Since $I^{\alpha}u(t) = \Phi_{\alpha}(t) * u(t)$, applying the Laplace transform yields

$$\frac{U(p)}{p^{\alpha}} = F(p). \qquad (19)$$
We can rearrange the result (19) into one of two forms:

$$U(p) = p^{\alpha}F(p) = p\left(\frac{F(p)}{p^{1-\alpha}}\right)$$

or

$$U(p) = p^{\alpha}F(p) = \frac{1}{p^{1-\alpha}}\bigl(pF(p) - f(0)\bigr) + \frac{f(0)}{p^{1-\alpha}}.$$
Inverting the first form into the time domain, we get

$$u(t) = \frac{1}{\Gamma(1-\alpha)}\frac{d}{dt}\int_{0}^{t}\frac{f(\tau)}{(t-\tau)^{\alpha}}\,d\tau = D_{L}^{\alpha}f(t),$$
which is equivalent to the solution of (18) with the LHD. The second form can be similarly inverted to yield

$$u(t) = \frac{1}{\Gamma(1-\alpha)}\int_{0}^{t}\frac{f'(\tau)}{(t-\tau)^{\alpha}}\,d\tau + f(0)\frac{t^{-\alpha}}{\Gamma(1-\alpha)},$$

which is equivalent to the solution of (18) with the RHD.
Now we consider the fractional integral equation of the second kind in the form

$$u(t) + \frac{\lambda}{\Gamma(\alpha)}\int_{0}^{t}\frac{u(\tau)}{(t-\tau)^{1-\alpha}}\,d\tau = f(t) \iff (1 + \lambda I^{\alpha})u(t) = f(t). \qquad (20)$$

Applying the Laplace transform to (20) we get

$$L\{(1+\lambda I^{\alpha})u(t)\} = L\{f(t)\} \;\Rightarrow\; \left(1 + \frac{\lambda}{p^{\alpha}}\right)U(p) = F(p). \qquad (21)$$
The equation (21) can be rearranged in many ways, but we will solve it using the LHD. We can rewrite the equation (21) as follows:

$$U(p) = \left(p\,\frac{p^{\alpha-1}}{p^{\alpha}+\lambda} - 1\right)F(p) + F(p). \qquad (22)$$
The equation (22) is next inverted back into the time domain. In order to do this one must know the Laplace transform of the Mittag-Leffler function,

$$L\{E_{\alpha,1}(-\lambda t^{\alpha})\} = \frac{p^{\alpha-1}}{p^{\alpha}+\lambda}. \qquad (23)$$

By the relationship given in (7), it is clear that the expression in brackets in (22) is the Laplace transform of the first derivative of the Mittag-Leffler function in (23), i.e.

$$L\left\{\frac{d}{dt}E_{\alpha,1}(-\lambda t^{\alpha})\right\} = p\,\frac{p^{\alpha-1}}{p^{\alpha}+\lambda} - 1.$$

From this, inverting (22) we obtain

$$u(t) = f(t) + E_{\alpha,1}^{(1)}(-\lambda t^{\alpha}) * f(t).$$
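For α = 1 the formula can be checked by elementary means, since E_{1,1}(−λt) = e^{−λt}, so the solution reduces to u = f − λ e^{−λt} ∗ f. A numerical sketch (not from the paper; λ, f, horizon, and grid sizes are ad-hoc choices) substitutes this u back into (20) and shows the residual vanishes:

```python
import math

lam, T, n = 0.7, 2.0, 400
f = lambda t: math.cos(t)

def u(t, m=400):
    """u = f + E'_{1,1}(-lam t) * f = f - lam * int_0^t e^{-lam(t-s)} f(s) ds (midpoint rule)."""
    if t == 0:
        return f(0)
    dt = t / m
    conv = sum(math.exp(-lam * (t - (j + 0.5) * dt)) * f((j + 0.5) * dt)
               for j in range(m)) * dt
    return f(t) - lam * conv

h = T / n
int_u = sum(u((j + 0.5) * h) for j in range(n)) * h   # int_0^T u(s) ds, midpoint rule
residual = u(T) + lam * int_u - f(T)                  # should vanish by (20) with alpha = 1
print(abs(residual))
```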
Acknowledgement
This research has been supported by the Czech Ministry of Education in the frame of MSM002160503 Research
Intention MIKROSYN New Trends in Microelectronic Systems and Nanotechnologies.
References
[1] CHEN, Y. Q. Fractional-order Calculus in Signal Processing and Control, CSOIS, ECE Dept. of Utah State University, 2003, 1-83.
[2] KOLWANKAR, K. M.; GANGAL, A. D. Fractional differentiability of nowhere differentiable functions and dimensions, Chaos, Vol. 6, No. 4, 1996, Amer. Inst. of Physics.
[3] NISHIMOTO, K. An Essence of Nishimoto's Fractional Calculus, Descartes Press Corp., 1991.
[4] OLDHAM, K. B.; SPANIER, J. The Fractional Calculus, Acad. Press, New York, 1974.
[5] PIRES, E. J. S. Fractional order dynamics in a GA planner, Signal Processing 83, 2003, 2377-2386.
[6] PODLUBNY, I. Fractional Differential Equations, Mathematics in Science and Engineering Vol. 198, Acad. Press, 1999.
[7] PODLUBNY, I. Fractional Differential Equations, Acad. Press, San Diego, 1999.
[8] RAYNAUD, H. F.; ZERGAINOH, A. State-space representation for fractional order controllers, Automatica 36, 2000, 1017-1021.
Address:
RNDr. Vlasta Krupková, CSc.
University of Technology
Technická 8
616 00 Brno
e-mail: [email protected]
Address:
Doc. RNDr. Zdeněk Šmarda, CSc.
University of Technology
Technická 8
616 00 Brno
e-mail: [email protected]
ON THE STABILITY OF LINEAR INTEGRODIFFERENTIAL EQUATIONS
Zdeněk Šmarda
University of Technology, Brno
Abstract: The stability of solutions of linear integrodifferential equations with respect to the asymptotic behaviour of base and regular ordinary differential equations is investigated.
Key words: Stability of solutions, integrodifferential equations, base and regular system.
1. Introduction
While studying integro-differential equations (IDE) we often meet two basic problems which have no analogy in the theory of ordinary differential equations:
• The system of IDE has singular points of the first and second kind, i.e. points through which more than one solution passes, or through which, in simple cases and presuming continuity, no solution passes at all (see [1,2]).
• An unknown function appears under the integration sign, so the right-hand side of the IDE cannot be evaluated at a given point. This means that we are actually unable to determine the direction field of the IDE, although the direction field exists. Therefore the known qualitative methods for the investigation of ordinary differential equations, e.g. Wazewski's topological method, cannot be applied to IDE (see [4,5,6,7]).
In this paper, we investigate the stability of solutions of linear systems of IDE and of the appropriate base systems. Some examples are given of base systems and of linear systems of IDE in which the integral terms change the asymptotic behaviour of the base system. The character of the asymptotic behaviour of a linear system of IDE with respect to its base system depends especially on the kernels of the integral terms. Sufficient conditions are given under which a linear system of IDE is stable or asymptotically stable under the assumption that the base system is unstable.
2. Examples and basic notions
Consider the linear system of IDE

$$u'(t) = A(t)u(t) + \int_{0}^{t}K(t,s)u(s)\,ds, \qquad (1)$$

where $u^{(i)}(t) = (u_{1}^{(i)}(t), \ldots, u_{n}^{(i)}(t))^{T}$, $i = 0, 1$; $A(t)$ is an $n \times n$ matrix function, $A(t) \in C^{1}(J)$, $\det A(t) \ne 0$ for $t \in J$; $K(t,s)$ is an $n \times n$ matrix function, $K(t,s) \in C^{1}(J \times J)$, $J = [0, \infty)$. The function $K(t,s)$ will be called the kernel of (1). We also consider the base system of (1),

$$u'(t) = A(t)u(t). \qquad (2)$$

Let $V(t)$, $V(0) = E$, be the fundamental matrix of (2); then the Cauchy problem for (2) with $u(0) = b$, where $b = (b_{1}, \ldots, b_{n})^{T}$ is a constant vector, has the unique solution

$$u(t) = V(t)b.$$

The zero solution of (2) will be called stable if for any constant vector $c$ and $t \to \infty$ there is valid

$$\|u(t)\| \le \|V(t)\|\,\|c\| \le M\|c\|, \quad M \in R^{+}.$$

If $\|u(t)\| \to 0$ as $t \to \infty$, then the zero solution of (2) will be called asymptotically stable. If $\|u(t)\| \to \infty$ as $t \to \infty$, then the zero solution of (2) will be called unstable; $\|\cdot\|$ is a usual norm in $R^{n}$ or $R^{2n}$.
The system (2) is linear; thus one of the following cases always holds:
(I) All solutions of (2) are stable.
(II) All solutions of (2) are asymptotically stable.
(III) All solutions of (2) are unstable.
Consequently, we will say that the system (2) is stable, asymptotically stable, or unstable. The same is valid in the case of (1).
Example 1.
Consider the system of IDE

$$u'(t) = \begin{pmatrix}-\dfrac{1}{t+1} & 0\\[1mm] 0 & -\dfrac{1}{t+1}\end{pmatrix}u(t) + \int_{0}^{t}\begin{pmatrix}0 & \dfrac{\tau+1}{t+1}\\[1mm] 0 & 0\end{pmatrix}u(\tau)\,d\tau \qquad (3)$$
with the initial conditions $u_{1}(0) = b_{1}$, $u_{2}(0) = b_{2}$. The particular solution of (3) is

$$u(t) = \begin{pmatrix}\dfrac{1}{t+1} & \dfrac{t^{2}}{2(t+1)}\\[1mm] 0 & \dfrac{1}{t+1}\end{pmatrix}b.$$

Thus, the system (3) is unstable. The appropriate base system has the form

$$u_{k}'(t) = -\frac{1}{t+1}u_{k}(t), \quad u_{k}(0) = b_{k}, \quad k = 1, 2, \qquad (4)$$

with the particular solution

$$u_{k}(t) = \frac{b_{k}}{t+1}.$$

From here the system (4) is asymptotically stable, so the integral term in (3) changed the asymptotically stable system (4) into the unstable system (3).
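A quick numerical sketch (not from the paper; forward Euler with a running left-rectangle approximation of the memory integral, step size and horizon chosen ad hoc) reproduces the two behaviours:

```python
# Euler scheme for system (3): u1' = -u1/(t+1) + (1/(t+1)) * int_0^t (tau+1) u2(tau) dtau,
#                              u2' = -u2/(t+1)
h, n = 0.01, 4000
u1, u2 = 1.0, 1.0
hist = 0.0                     # running value of int_0^t (tau+1) * u2(tau) dtau
for i in range(n):
    t = i * h
    du1 = -u1 / (t + 1) + hist / (t + 1)
    du2 = -u2 / (t + 1)
    hist += (t + 1) * u2 * h   # left-rectangle update of the memory integral
    u1 += h * du1
    u2 += h * du2

print(u2)   # behaves like 1/(t+1): the base-system component decays
print(u1)   # behaves like t^2/(2(t+1)): the integral term makes it grow
```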
Example 2. Consider the system of IDE

$$u'(t) = \begin{pmatrix}\dfrac{1}{(t+1)(t+2)} & 0\\[1mm] 0 & \dfrac{1}{t+1}\end{pmatrix}u(t) + \int_{0}^{t}\begin{pmatrix}0 & \dfrac{\tau+1}{(t+1)^{4}(t+2)^{3}}\\[1mm] 0 & -\dfrac{2(\tau+1)}{(t+1)^{3}}\end{pmatrix}u(\tau)\,d\tau \qquad (5)$$
with the initial conditions $u_{1}(0) = b_{1}$, $u_{2}(0) = b_{2}$. The particular solution of (5) is

$$u_{1}(t) = \frac{2(t+1)}{t+2}\,b_{1} + b_{2}\left(\ln\frac{2(t+1)}{t+2} + \frac{2}{t+2} - \frac{3}{t+1} + \frac{2}{(t+1)^{2}}\right), \qquad u_{2}(t) = \left(2 - \frac{1}{t+1}\right)b_{2}.$$
From here the system (5) is only stable. The appropriate base system has the form

$$u'(t) = \begin{pmatrix}\dfrac{1}{(t+1)(t+2)} & 0\\[1mm] 0 & \dfrac{1}{t+1}\end{pmatrix}u(t) \qquad (6)$$

with the particular solution

$$u_{1}(t) = \frac{2(t+1)}{t+2}\,b_{1}, \qquad u_{2}(t) = (t+1)b_{2}.$$

Thus, the system (6) is unstable. Now it is obvious that the integral term in (5) changed the unstable system (6) into the stable system (5).
3. Main results
Let $R(t,s) \in C^{1}(J \times J)$ be a resolvent of the matrix kernel $-A^{-1}(t)K(t,s)$. Put

$$u(t) = z(t) + \int_{0}^{t}R(t,s)z(s)\,ds. \qquad (7)$$

Substituting (7) into (1) we obtain

$$z'(t) = B(t)z(t) - \int_{0}^{t}R_{t}'(t,s)z(s)\,ds, \qquad (8)$$

where $B(t) = A(t) + A^{-1}(t)K(t,t)$. Consider the system

$$z'(t) = B(t)z(t), \qquad (9)$$

which will be called the regular system with respect to (1), and let $W(t)$ be the fundamental matrix of (9).
Theorem. Let the following assumptions hold:
(i) The base system (2) is unstable.
(ii) The regular system (9) is either stable or asymptotically stable.
(iii) $\int_{0}^{\infty}\int_{0}^{t}\|W^{-1}(s)R_{t}'(t,s)W(s)\|\,ds\,dt < N < \infty$.
(iv) $F(t) = \int_{0}^{t}\|R(t,s)W(s)\|\,ds < L < \infty$, and $F(t) \to 0$ as $t \to \infty$ in the case of asymptotic stability.
Then the system (1) is either stable or asymptotically stable.
Proof. The resolvent $R(t,s)$ of (7) satisfies the equation (see [3])

$$R(t,s) = Q(t,s) + \int_{0}^{t}Q(t,\mu)R(\mu,s)\,d\mu, \qquad (10)$$

where $Q(t,s) = -A^{-1}(t)K(t,s)$. From (8) we get

$$z(t) = W(t)b + \int_{0}^{t}W(t)W^{-1}(s)\int_{0}^{s}R_{t}'(s,\mu)z(\mu)\,d\mu\,ds. \qquad (11)$$
Multiplying both sides of (11) by W⁻¹(t) and putting α(t) ≡ W⁻¹(t)z(t), we obtain

$$\|\alpha(t)\|\le\|b\|\,N.$$

Thus

$$\|z(t)\|\le\|W(t)\|\,\|b\|\,N.$$
Now, from (7) it follows that

$$\|u(t)\|\le\left\|W(t)W^{-1}(t)z(t)\right\|+\int_0^t\left\|R(t,s)W(s)W^{-1}(s)z(s)\right\|ds$$
$$\le\|W(t)\|\left\|W^{-1}(t)z(t)\right\|+\int_0^t\left\|R(t,s)W(s)\right\|\left\|W^{-1}(s)z(s)\right\|ds$$
$$\le\left(\|W(t)\|+\int_0^t\left\|R(t,s)W(s)\right\|ds\right)N\|b\|\le\bigl(\|W(t)\|+L\bigr)N\|b\|.\qquad(12)$$
From here it is obvious that the system (1) is stable. In the case F(t) → 0 and ‖W(t)‖ → 0 as t → ∞, we obtain from (12) that the system (1) is asymptotically stable. The proof is complete. We now apply the results of this Theorem to the system of IDE from Example 2, i.e.
$$u'(t)=\begin{pmatrix}\dfrac{1}{(t+1)(t+2)}&0\\[1mm]0&\dfrac{1}{t+1}\end{pmatrix}u(t)+\int_0^t\begin{pmatrix}0&\dfrac{\tau+1}{(t+1)^4(t+2)^2}\\[1mm]0&-\dfrac{2(\tau+1)}{(t+1)^3}\end{pmatrix}u(\tau)\,d\tau$$
with the initial conditions u1(0) = b1, u2(0) = b2. We already know that the base system

$$u'(t)=\begin{pmatrix}\dfrac{1}{(t+1)(t+2)}&0\\[1mm]0&\dfrac{1}{t+1}\end{pmatrix}u(t)$$
is unstable. The matrix B(t) of the regular system z'(t) = B(t)z(t) has the form

$$B(t)=A(t)+A^{-1}(t)K(t,t)=\begin{pmatrix}\dfrac{1}{(t+1)(t+2)}&\dfrac{1}{(t+1)^2(t+2)}\\[1mm]0&-\dfrac{1}{t+1}\end{pmatrix}$$
and the resolvent is

$$R(t,\tau)=\begin{pmatrix}0&-\dfrac{1}{(t+1)(t+2)(\tau+1)}\\[1mm]0&\dfrac{2}{\tau+1}\end{pmatrix}.$$
The fundamental matrix of the regular system is

$$W(t)=\begin{pmatrix}\dfrac{2(t+1)}{t+2}&\dfrac{t+1}{3(t+2)}\left(1-\dfrac{1}{(t+1)^3}\right)\\[1mm]0&\dfrac{1}{t+1}\end{pmatrix}.$$
Thus, the regular system is stable. It remains to verify conditions (iii) and (iv). In this case we have
$$R(t,\tau)W(\tau)=\begin{pmatrix}0&-\dfrac{1}{(t+1)(t+2)(\tau+1)^2}\\[1mm]0&\dfrac{2}{(\tau+1)^2}\end{pmatrix},\qquad\int_0^t\left\|R(t,\tau)W(\tau)\right\|d\tau\le\frac{5}{6}-\ln 2=L<\infty,$$
$$W^{-1}(t)\,R_t'(t,\tau)\,W(\tau)=\frac{t+2}{2(t+1)(\tau+1)^2}\left(\frac{1}{(t+1)^2}-\frac{1}{(t+2)^2}\right)\begin{pmatrix}0&1\\0&0\end{pmatrix},$$
$$\int_0^\infty\int_0^t\left\|W^{-1}(t)\,R_t'(t,\tau)\,W(\tau)\right\|d\tau\,dt<3=N<\infty.$$
Now it is obvious that all assumptions of the Theorem are fulfilled, and without computing the particular solution of the system of IDE (5) we conclude that this system is also stable. The example is complete.
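A Volterra IDE of the type treated above can also be checked numerically by discretizing the memory integral with the trapezoidal rule. The scalar test equation below is not from the paper; it is chosen only because its exact solution u(t) = (1 + e^(-2t))/2 (for u(0) = 1) is known, so the scheme can be validated:

```python
import math

def solve_ide(a, k, u0, t_end, n):
    # forward-Euler time stepping; trapezoidal rule for the memory integral
    h = t_end / n
    ts = [i * h for i in range(n + 1)]
    us = [u0]
    for i in range(n):
        t = ts[i]
        # trapezoidal approximation of  ∫_0^t k(t, τ) u(τ) dτ
        if i == 0:
            integral = 0.0
        else:
            vals = [k(t, ts[j]) * us[j] for j in range(i + 1)]
            integral = h * (vals[0] / 2 + sum(vals[1:-1]) + vals[-1] / 2)
        us.append(us[i] + h * (a(t) * us[i] + integral))
    return ts, us

# test equation u'(t) = -u(t) + ∫_0^t e^{-(t-τ)} u(τ) dτ, exact u(t) = (1 + e^{-2t})/2
ts, us = solve_ide(lambda t: -1.0, lambda t, s: math.exp(-(t - s)), 1.0, 1.0, 2000)
exact = (1 + math.exp(-2.0)) / 2
print(abs(us[-1] - exact))  # small discretization error
```

The same scheme applies componentwise to matrix kernels such as the one in (5).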
Acknowledgement
This research has been supported by the Czech Ministry of Education in the frame of MSM002160503 Research
Intention MIKROSYN New Trends in Microelectronic Systems and Nanotechnologies.
References
[1] BYKOV, J. V. Theory of integrodifferential equations, Kirg. Univ., Frunze, 1957 (in Russian).
[2] IMANALIEV, M. Oscillation and stability of solutions of singular-perturbation integro-differential equations, Akad. Nauk, ILIM, Frunze, 1974 (in Russian).
[3] ŠKRÁŠEK, J. Základy aplikované matematiky III. Praha : SNTL, 1993.
[4] ŠMARDA, Z. On some particularities of integro-differential equations. Proceedings of APLIMAT 2003, p. 253-257.
[5] ŠMARDA, Z. On solutions of an implicit singular system of integro-differential equations depending on a parameter, Demonstratio Mathematica, Vol. XXXI, No 1, (1998), 125-130.
[6] ŠMARDA, Z. On an initial value problem for singular integro-differential equations, Demonstratio Mathematica, Vol. XXXV, No 4, (2002), 803-811.
[7] YANG, G. Minimal positive solutions to some singular second-order differential equations, J. Math. Anal. Appl. 266 (2002), 479-491.
Address:
Doc. RNDr. Zdeněk Šmarda, CSc.
University of Technology
Technická 8
616 00 Brno
e-mail: [email protected]
EXISTENCE OF POSITIVE SOLUTIONS FOR RETARDED FUNCTIONAL DIFFERENTIAL
EQUATIONS WITH UNBOUNDED DELAY AND FINITE MEMORY
Josef Diblík, Zdeněk Svoboda
University of Technology, Brno
Abstract: For systems of retarded functional differential equations with unbounded delay and with finite memory, sufficient and necessary conditions for the existence of positive solutions on an interval of the form [t*, ∞) are derived. A general criterion is given together with corresponding applications (including a linear case, too). Examples are inserted to illustrate the results.
Key words and phrases: Positive solution, delayed equation, p-function.
1 Introduction
One of the basic projects in the theory of regulation is to find a control of some process such that its parameters stay in the required area. The simplest type of such an area is often described by inequalities. Continuous processes are usually described by differential equations. In this paper a criterion is given for the existence of positive solutions (i.e. solutions with positive coordinates on a considered interval) for systems of retarded functional differential equations (RFDE's) with unbounded delay and with finite memory. At first let us give a short explanation of the terms emphasized above. Let us recall basic notions of RFDE's with unbounded delay but with finite memory. A function p ∈ C[R × [−1, 0], R] is called a p-function if it has the following properties [12, p. 8]:
(i) p(t, 0) = t;
(ii) p(t, −1) is a nondecreasing function of t;
(iii) there exists a σ ≥ −∞ such that p(t, ϑ) is an increasing function of ϑ for each t ∈ (σ, ∞). (Throughout the following text we suppose t ∈ (σ, ∞).)
In the theory of RFDE's the symbol y_t, which expresses "taking into account" the history of the process y(t) considered, is used. With the aid of p-functions the symbol y_t is defined as follows:
Definition 1 ([12, p. 8]) Let t0 ∈ R, A > 0 and y ∈ C([p(t0, −1), t0 + A), Rⁿ). For any t ∈ [t0, t0 + A), we define
y_t(ϑ) := y(p(t, ϑ)), −1 ≤ ϑ ≤ 0,
and write y_t ∈ C := C[[−1, 0], Rⁿ].
Note that the frequently used symbol "y_t" (e.g., y_t(s) := y(t + s), where −τ ≤ s ≤ 0, τ > 0, τ = const) in the theory of delayed functional differential equations for equations with bounded delays is a particular case of the above definition. Indeed, in this case we can put p(t, ϑ) := t + τϑ, ϑ ∈ [−1, 0].
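A minimal sketch of Definition 1 (the function names are illustrative, not from the paper): for the bounded-delay p-function p(t, ϑ) = t + τϑ, the segment y_t reproduces exactly the classical window y(t + s), s ∈ [−τ, 0]:

```python
def make_segment(y, p, t):
    # y_t(θ) := y(p(t, θ)) for θ in [-1, 0]  (Definition 1)
    return lambda theta: y(p(t, theta))

tau = 2.0
p = lambda t, theta: t + tau * theta      # bounded-delay special case of a p-function
y = lambda t: t ** 2                      # any continuous history function
yt = make_segment(y, p, 5.0)
print(yt(-1.0), yt(0.0))  # y(3.0) = 9.0 and y(5.0) = 25.0
```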
In this paper we investigate the existence of positive solutions of the system

$$\dot y(t)=f(t,y_t)\qquad(1)$$

where f ∈ C([t0, t0 + A) × C, Rⁿ), A > 0, and y_t is defined in accordance with Definition 1. This system is called a system of p-type retarded functional differential equations (p-RFDE's), or a system with unbounded delay and finite memory.
Definition 2 The function y ∈ C([p(t0, −1), t0 + A), Rⁿ) ∩ C¹([t0, t0 + A), Rⁿ) satisfying (1) on [t0, t0 + A) is called a solution of (1) on [p(t0, −1), t0 + A).
Suppose that Ω is an open subset of R × C and the function f : Ω → Rⁿ is continuous. If (t0, φ) ∈ Ω, then there
exists a solution y = y(t0, φ) of the system of p-RFDE's (1) through (t0, φ) (see [12, p. 25]). Moreover, this solution is unique if f(t, φ) is locally Lipschitzian with respect to the second argument φ ([12, p. 30]), and is continuable in the usual sense of extended existence if f is quasibounded ([12, p. 41]). Suppose that the solution y = y(t0, φ) of p-RFDE's (1) through (t0, φ) ∈ Ω, defined on [t0, A], is unique. Then the property of continuous dependence holds too (see [12, p. 33]), i.e. for every ε > 0 there exists a δ(ε) > 0 such that (s, ψ) ∈ Ω, |s − t0| < δ and ‖ψ − φ‖ < δ imply

‖y_t(s, ψ) − y_t(t0, φ)‖ < ε for all t ∈ [ζ, A],

where y(s, ψ) is the solution of the system of p-RFDE's (1) through (s, ψ), ζ = max{s, t0}, and ‖·‖ is the supremum norm in Rⁿ. Note that these results can be adapted easily for the case (which will be used in the sequel) when Ω has the form Ω = [t*, ∞) × C, where t* ∈ R.
1.1 Problem of existence of positive solutions
In this paper we are concerned with the problem of the existence of positive solutions (i.e. the problem of existence of solutions having all their coordinates positive on considered intervals) for nonlinear systems of RFDE's with unbounded delay but with finite memory. Let us cite some known results for retarded functional differential equations. For the scalar equation

$$\dot x(t)+p(t)\,x(t-\tau(t))=0\qquad(2)$$

with p, τ ∈ C([t0, ∞), R⁺), τ(t) ≤ t, lim_{t→∞}(t − τ(t)) = ∞ and R⁺ = [0, ∞), a criterion for the existence of a positive solution is given in the book [10]. Namely, (2) has a positive solution with respect to t1 if and only if there exists a continuous function λ(t) on [T1, ∞) with T1 = inf_{t ≥ t1}{t − τ(t)}, such that λ(t) > 0 for t ≥ t1 and

$$\lambda(t)\ge p(t)\,e^{\int_{t-\tau(t)}^{t}\lambda(s)\,ds},\qquad t\ge t_1.\qquad(3)$$

(A function x is called a solution of (2) with respect to an initial point t1 ≥ t0 if x is defined and continuous on [T1, ∞), differentiable on [t1, ∞), and satisfies (2) for t ≥ t1.) Results in this direction are formulated in the book [11] and in the papers [1, 2], too. Positive solutions of (2) in the critical case were studied e.g. in [4]-[10]. The cited criterion was generalized to nonlinear systems of RFDE's with bounded retardation in [3], and to nonlinear systems of RFDE's with unbounded delay and with finite memory in [6]. These generalizations are in a sense "direct" generalizations, since in their formulations the existence of a positive (vector) function playing a similar role as λ in (3) is supposed.
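Criterion (3) is easy to evaluate numerically. As an illustration (the concrete data are not from the paper): for constant p(t) ≡ 1/e and τ(t) ≡ 1, the choice λ(t) ≡ 1 satisfies (3) with equality, which corresponds to the positive solution x(t) = e^(-t):

```python
import math

def criterion_holds(p, tau, lam, t, n=1000):
    # inequality (3): λ(t) >= p(t) * exp(∫_{t-τ(t)}^t λ(s) ds); midpoint rule for the integral
    a, b = t - tau(t), t
    h = (b - a) / n
    integral = sum(lam(a + (i + 0.5) * h) for i in range(n)) * h
    return lam(t) >= p(t) * math.exp(integral) - 1e-9  # tolerance for the equality case

ok = all(criterion_holds(lambda t: 1 / math.e, lambda t: 1.0, lambda t: 1.0, t)
         for t in [1.0, 2.0, 5.0, 10.0])
print(ok)  # True: λ ≡ 1 certifies the positive solution x(t) = e^{-t}
```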
2 Sufficient conditions
Let a constant vector k ≫ 0 and a vector function λ(t) = (λ1(t), ..., λn(t)), defined and locally integrable on [p*, ∞), be given. Then the operator T is well defined by

$$T(k,\lambda)(t):=\left(k_1 e^{\int_{p^*}^{t}\lambda_1(s)\,ds},\;\dots,\;k_n e^{\int_{p^*}^{t}\lambda_n(s)\,ds}\right).$$

Define for every i ∈ {1, 2, ..., n} two types of subsets of the set C:

$$\tau^i:=\left\{\varphi\in C:\ 0\ll\varphi(\vartheta)\ll T(k,\lambda)_t(\vartheta),\ \vartheta\in[-1,0],\ \text{except for }\varphi_i(0)=k_i e^{\int_{p^*}^{t}\lambda_i(s)\,ds}\right\}$$

and

$$\tau_i:=\left\{\varphi\in C:\ 0\ll\varphi(\vartheta)\ll T(k,\lambda)_t(\vartheta),\ \vartheta\in[-1,0],\ \text{except for }\varphi_i(0)=0\right\}.$$
Theorem 1 Suppose f ∈ C(Ω, Rⁿ) is locally Lipschitzian with respect to the second argument and quasibounded. Let a constant vector k ≫ 0 and a vector function λ(t), defined and locally integrable on [p*, ∞), be given. If, moreover, the inequalities

$$\mu_i\,\lambda_i(t)\,k_i\,e^{\int_{p^*}^{t}\lambda_i(s)\,ds}>\mu_i\,f_i(t,\varphi)\qquad(4)$$

hold for every i ∈ {1, 2, ..., n}, (t, φ) ∈ [t*, ∞) × τ^i, and the inequalities

$$\mu_i\,f_i(t,\varphi)>0\qquad(5)$$

hold for every i ∈ {1, 2, ..., n}, (t, φ) ∈ [t*, ∞) × τ_i, where μ_i = −1 for i = 1, ..., p and μ_i = 1 for i = p+1, ..., n, then there exists a positive solution y = y(t) on [p*, ∞) of the system of p-RFDE's (1).
The proofs of this and the next theorems are based on the retract method and on the Lyapunov method. Analogous considerations can be found in [6].
3 Scalar linear application
Let us consider the scalar linear equation with delay

$$\dot y(t)=-\int_{\tau(t)}^{t}K(t,s)\,y(s)\,ds,\qquad(6)$$

where K : [t*, ∞) × [p*, ∞) → R⁺ is a continuous function and τ : [t*, ∞) → [p*, ∞) is a nondecreasing function with τ(t) < t.
Theorem 2 The equation (6) has a positive solution y = y(t) on [p*, ∞) if and only if there exists a function λ ∈ C([p*, ∞), R) such that λ(t) > 0 for t ≥ t* and

$$\lambda(t)\ge\int_{\tau(t)}^{t}K(t,s)\,e^{\int_{s}^{t}\lambda(u)\,du}\,ds\qquad(7)$$

on the interval [t*, ∞).
Inequality (7) can be used for finding sufficient conditions for the existence of a positive solution of Eq. (6). Let us give two of them.
In the case when τ(t) ≡ p* < t* and K(t, s) ≡ c(t) for every t ∈ [t*, ∞), Eq. (6) takes the form

$$\dot y(t)=-c(t)\int_{p^*}^{t}y(s)\,ds.\qquad(8)$$

Theorem 3 For the existence of a solution of Eq. (8), positive on [p*, ∞), the inequality

$$c(t)\le\delta^2\left(e^{\delta(t-p^*)}-1\right)^{-1},\qquad t\in[t^*,\infty),\qquad(9)$$

with a positive constant δ is a sufficient condition.
In the case when τ(t) ≡ t − l, l ∈ R⁺, and K(t, s) ≡ c(t) for every t ∈ [t*, ∞), Eq. (6) takes the form

$$\dot y(t)=-c(t)\int_{t-l}^{t}y(s)\,ds.\qquad(10)$$

Theorem 4 For the existence of a solution of Eq. (10), positive on [t* − l, ∞), the inequality

$$c(t)\le M,\qquad t\in[t^*,\infty),\qquad(11)$$

is sufficient, where M = α(2 − α)/l² = const with the constant α being the positive root of the equation 2 − α = 2e^(−α). (The approximate values are α ≈ 1.5936 and M ≈ 0.6476/l².)
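The constants quoted in Theorem 4 can be reproduced by root-finding on 2 − α = 2e^(−α); the bisection sketch below is only a cross-check of the approximate values:

```python
import math

def bisect(f, a, b, iters=60):
    # simple bisection for a root of f on [a, b] with a sign change
    for _ in range(iters):
        m = (a + b) / 2
        if f(a) * f(m) <= 0:
            b = m
        else:
            a = m
    return (a + b) / 2

alpha = bisect(lambda a: 2 - a - 2 * math.exp(-a), 0.5, 2.0)
M = alpha * (2 - alpha)  # this is M * l^2, i.e. M for l = 1
print(round(alpha, 4), round(M, 4))  # ≈ 1.5936 and ≈ 0.6476
```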
Acknowledgment
This research has been supported by the Czech Ministry of Education in the frame of MSM002160503 Research
Intention MIKROSYN New Trends in Microelectronic Systems and Nanotechnologies.
References
[1] BEREZANSKI, L.; BRAVERMAN, E. On oscillation of a logistic equation with several delays, J. Comput. Appl. Math. 113 (2000), 255-265.
[2] ČERMÁK, J. A change of variables in the asymptotic theory of differential equations with unbounded delay, J. Comput. Appl. Math. 143 (2002), 81-93.
[3] DIBLÍK, J. A criterion for existence of positive solutions of systems of retarded functional differential equations. Nonl. Anal., TMA 38 (1999), 327-339.
[4] DIBLÍK, J. Positive and oscillating solutions of differential equations with delay in critical case, J. Comput. Appl. Math. 88 (1998), 185-202.
[5] DIBLÍK, J.; KOKSCH, N. Positive solutions of the equation ẋ(t) = −c(t)x(t − τ) in the critical case. J. Math. Anal. Appl. 250 (2000), 635-659.
[6] DIBLÍK, J.; SVOBODA, Z. An existence criterion of positive solutions of p-type retarded functional differential equations. J. Comput. Appl. Math. 147 (2002), 315-331.
[7] DOMSHLAK, Y. On oscillation properties of delay differential equations with oscillating coefficients, Funct. Diff. Equat., Israel Seminar 2 (1996), 59-68.
[8] DOMSHLAK, Y.; STAVROULAKIS, I. P. Oscillation of first-order delay differential equations in a critical case, Appl. Anal. 61 (1996), 359-371.
[9] ELBERT, Á.; STAVROULAKIS, I. P. Oscillation and non-oscillation criteria for delay differential equations, Proc. Amer. Math. Soc. 123 (1995), 1503-1510.
[10] ERBE, L. H.; KONG, Q.; ZHANG, B. G. Oscillation Theory for Functional Differential Equations. New York : Marcel Dekker, 1995.
[11] GYÖRI, I.; LADAS, G. Oscillation Theory of Delay Differential Equations. Oxford : Clarendon Press, 1991.
[12] LAKSHMIKANTHAM, V.; WEN, L.; ZHANG, B. Theory of Differential Equations with Unbounded Delay, Kluwer Academic Publishers, 1994.
[13] HALE, J. K.; LUNEL, S. M. V. Introduction to Functional Differential Equations, New York : Springer-Verlag, 1993.
Address:
Doc. RNDr. Josef Diblík, DrSc.
Department of Mathematics
Faculty of Electrical Engineering and Communication
Brno University of Technology
Technická 8,
616 00 Brno,
[email protected]
Address:
RNDr. Zdeněk Svoboda, CSc.
Department of Mathematics
Faculty of Electrical Engineering and Communication
Brno University of Technology
Technická 8,
616 00 Brno,
[email protected]
APPLICATION OF NON SIMPLEX METHOD FOR LINEAR PROGRAMMING
Marie Tomšová
University of Technology Brno
Abstract: This paper shows a method for the solution of optimization problems which is different from the usually used simplex method. The simplex method is considered one of the basic models from which many linear programming techniques are directly or indirectly derived. The simplex method is an iterative process which approaches, step by step, an optimum solution in such a way that an objective function of maximization or minimization is fully reached. Each iteration in this process consists of shortening the distance (mathematically and also graphically) from the objective function to the intercepted vertex of a convex set determined by the inequalities which describe the problem. The simplex method is not the only technique known and used for solving linear programming problems. For pedagogical expediency other methods are also useful; see for example R. Dorfman, P. A. Samuelson, and R. M. Solow, Linear Programming and Economic Analysis, New York: McGraw-Hill Book Comp. Inc., 1958. I introduce a method other than the simplex method. It is based on the principle of the graphical method of optimization of linear problems for two variables, but generalized to n variables and an arbitrary finite number of inequalities describing the problem.
Key words: Linear programming, system of inequalities, disposal and slack variable, dummy variable, objective function, iteration, key row, key column, key element, polyhedron, hyper-plane.
1. The leading article
Remark. I reported this problem at the Fourth Mathematical Workshop in Brno, 2005. This contribution aims at spreading knowledge of the described method; I prepared a programme for solving concrete problems. This programme is given as an appendix.
Introduction. The general problem of linear programming is usually formulated as follows:
Let a_{ij}, b_i, c_j (i = 1, 2, ..., m; j = 1, 2, ..., n) be given real numbers and let us denote I1 ⊂ I = {1, 2, ..., m} and J1 ⊂ J = {1, 2, ..., n}. The problem of maximizing the function
$$\sum_{j=1}^{n}c_j x_j\qquad(1)$$

on the set given by

$$\sum_{j=1}^{n}a_{ij}x_j\le b_i\quad(i\in I_1),\qquad(2)$$
$$\sum_{j=1}^{n}a_{ij}x_j=b_i\quad(i\in I-I_1),\qquad(3)$$
$$x_j\ge 0\quad(j\in J_1),\qquad(4)$$

is called the maximizing problem of linear programming in mixed form if I1 ≠ ∅, I1 ≠ I, or J1 ≠ J.
The problem of linear programming given by (1)-(4) with I1 = I and J1 = J, that is, the problem of maximizing the function

$$\sum_{j=1}^{n}c_j x_j\qquad(5)$$
on the set of a linearly independent system of linear inequalities

$$\sum_{j=1}^{n}a_{ij}x_j\le b_i\quad(i\in I),\qquad(6)$$
$$x_j\ge 0\quad(j\in J),\qquad(7)$$

is called the maximizing problem of linear programming in the form of inequalities.
With respect to the fact that for an arbitrary set M ⊂ Rⁿ, where Rⁿ is the n-dimensional vector space, and for an arbitrary linear function z : M → R we have

$$\min z(x)=-\max(-z(x)),\qquad x\in M,$$

it follows that if one of the extrema exists we can also transform a minimizing problem into a problem with linear equations or linear inequalities. We do the rearrangement by multiplication by the number −1.
2. The solution of the general problem
We drop condition (4), and hence also (7), in the following considerations. We rewrite the system (6), adding the objective function as the last row, into the form

$$\begin{aligned}
a_{11}x_1+a_{12}x_2+\dots+a_{1n}x_n+b_1&\ge 0\\
a_{21}x_1+a_{22}x_2+\dots+a_{2n}x_n+b_2&\ge 0\\
&\dots\\
a_{m1}x_1+a_{m2}x_2+\dots+a_{mn}x_n+b_m&\ge 0\\
c_1x_1+c_2x_2+\dots+c_nx_n&=0
\end{aligned}\qquad(8)$$
We call the set of points x = (x1, x2, ..., xn) ∈ Rⁿ satisfying (8) a polyhedron. We call the polyhedron open if m ≤ n. We know that the objective function z(x), x ∈ Rⁿ, attains its optimal values at vertices or on whole faces of the polyhedron. In the first part of the computational procedure we find one of the vertices of the polyhedron and simultaneously transform it into the coordinate origin. At the same time we transform all hyperplanes of the polyhedron and the objective function with respect to the given transformation. We continue as follows:
We select an arbitrary hyperplane and denote it by the index i ∈ {1, 2, ..., m}. We divide the whole column at the variable x1 by the coefficient a_{i1} and simultaneously put

$$x_1=x_1'-a_{i2}x_2-a_{i3}x_3-\dots-a_{in}x_n-b_i.\qquad(9)$$

We introduce the transformation relation (9) into the system (8); for m > 1 the mathematical representation of the problem transforms into the following form:
After the substitution (9), the k-th row (k ≠ i) takes the form

$$\frac{a_{k1}}{a_{i1}}x_1'+\left(a_{k2}-\frac{a_{k1}a_{i2}}{a_{i1}}\right)x_2+\left(a_{k3}-\frac{a_{k1}a_{i3}}{a_{i1}}\right)x_3+\dots+\left(a_{kn}-\frac{a_{k1}a_{in}}{a_{i1}}\right)x_n+b_k-b_i\frac{a_{k1}}{a_{i1}}\ge 0,$$

while the i-th row reduces to

$$x_1'+0+0+\dots+0+0\ge 0.$$
In the following step we choose one of the rows in which the coefficient at the variable x2 is different from zero. The existence of such a row follows from the assumption that the m linear rows are independent. In what follows we suppose that this assumption is satisfied by the row s, s ≤ m. It is obvious that s ≠ i. We continue by dividing the whole second column by the expression

$$a_{s2}-\frac{a_{s1}a_{i2}}{a_{i1}}$$

and then we introduce the following transformation:

$$x_2=x_2'-\frac{a_{s1}}{a_{i1}}x_1'-\left(a_{s3}-\frac{a_{s1}a_{i3}}{a_{i1}}\right)x_3-\dots-\left(a_{sn}-\frac{a_{s1}a_{in}}{a_{i1}}\right)x_n-\left(b_s-b_i\frac{a_{s1}}{a_{i1}}\right).$$
After this transformation the s-th row will be of the form:
0
x2′ +
0
+
0 + ... +
0 +
0
≥0
We continue analogously until all m < n transformations are done. Thus we find one point of one edge of the polyhedron, which is transformed into the coordinate origin. In the case m ≥ n we find after n transformations one vertex of the polyhedron, which is transformed into the coordinate origin. The whole calculation is done on a computer, therefore we calculate only with the matrix of coefficients of the polyhedron. We apply all the steps of the transformation to the objective function

$$c_1x_1+c_2x_2+\dots+c_nx_n+0\ge 0$$

and after the first transformation we receive

$$\frac{c_1}{a_{i1}}x_1'+\left(c_2-\frac{c_1a_{i2}}{a_{i1}}\right)x_2+\left(c_3-\frac{c_1a_{i3}}{a_{i1}}\right)x_3+\dots+\left(c_n-\frac{c_1a_{in}}{a_{i1}}\right)x_n+0-b_i\frac{c_1}{a_{i1}}\ge 0.$$
Further adaptations of the coefficients of the objective function run simultaneously with the adaptations of the coefficients of the polyhedron, as described above for the transformation algorithm applied to the polyhedron. As the next step we extend the matrix of coefficients describing the system (8) with the objective function, which is of type (m+1) × (n+1), by adding a matrix of type n × (n+1) consisting of the unit matrix of type n × n with an appended column vector of zeros of length n. This step is necessary for the explicit expression of the point of an edge or a vertex of the polyhedron and the optimal value of the objective function. All the affine transformations described above are applied to this expanded matrix. After carrying out the described transformation algorithm, the last column of the appended n × (n+1) block contains the original coordinates of the point of the edge or the vertex of the polyhedron which was transformed into the coordinate origin. We show the expanded matrix of coefficients of type (m+1+n) × (n+1) before the transformation algorithm:
$$\begin{pmatrix}
a_{11}&a_{12}&a_{13}&\dots&a_{1n}&b_1\\
a_{21}&a_{22}&a_{23}&\dots&a_{2n}&b_2\\
\vdots&\vdots&\vdots&&\vdots&\vdots\\
a_{m1}&a_{m2}&a_{m3}&\dots&a_{mn}&b_m\\
c_1&c_2&c_3&\dots&c_n&0\\
1&0&0&\dots&0&0\\
0&1&0&\dots&0&0\\
\vdots&\vdots&\vdots&&\vdots&\vdots\\
0&0&0&\dots&1&0
\end{pmatrix}$$
3. Optimizing and decision making process
We suppose that m > n and that the transformation algorithm has transformed one of the vertices of the polyhedron into the coordinate origin. If after this transformation all the coefficients in the first m rows of the last, (n+1)-th, column are nonnegative numbers and simultaneously all transformed coefficients c'_k of the objective function in the (m+1)-th row are negative, the maximizing process of the objective function z(x) is finished. In the last n rows of the (n+1)-th column there are the original coordinates of the vertex of the polyhedron in which the objective function acquires its maximum, and the value of this maximum is at position [m+1, n+1] of the transformed matrix.
If the previous situation does not occur, the following analysis is necessary. We suppose again that the coefficients in the first m rows of the (n+1)-th column are nonnegative, but some transformed coefficient of the objective function in the (m+1)-th row is positive. Let this situation occur in the j0-th column. We look at all transformed coefficients in the j0-th column. If all transformed coefficients a'_{ij0} of the polyhedron are nonnegative, then the problem has no solution: it is possible to travel along the corresponding edge of the polyhedron to infinity. The polyhedron is not bounded and the solution does not exist.
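In matrix terms, the substitution (9) acts as a pivot step: the pivot row is scaled and multiples of it are subtracted from the other rows so that the pivot column becomes a unit vector. A minimal sketch (illustrative, independent of the author's programme):

```python
def pivot(A, i, j):
    # Gauss-Jordan pivot on A[i][j]: column j becomes the i-th unit vector
    m = len(A)
    p = A[i][j]
    A[i] = [a / p for a in A[i]]          # scale the pivot row
    for k in range(m):
        if k != i and A[k][j] != 0:
            f = A[k][j]                   # eliminate column j from row k
            A[k] = [a - f * b for a, b in zip(A[k], A[i])]
    return A

A = [[2.0, 1.0, 4.0],
     [1.0, 3.0, 5.0]]
pivot(A, 0, 0)
print(A)  # [[1.0, 0.5, 2.0], [0.0, 2.5, 3.0]]
```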
4. Example
4.1 Remark. I drafted a program for explanation of the given method; its address is on the server Pal of the Technical University in Brno: Q:\vyuka\matematm\Tomsova\Polyhedron\matice.exe. We show an application of this program. Data are denoted as CONCRETE.
4.2 Example. Maximize the objective function z = 3x + 5y in the area bounded by the following restrictions:
1. x ≥ 0
2. y ≥ 0
3. x ≤ 5
4. x + 2y ≤ 12
5. 2x + 3y ≤ 19
Solution: We draw a figure for a better graphical preview, where the area of the polyhedron is bounded by the lines corresponding to the restrictions, with the vertices O = [0, 0], A = [5, 0], B = [5, 3], C = [2, 5], D = [0, 6].
After startup of the program we browse the vertices in the sequence O, A, B, C, D. We can monitor the coefficients and values of the objective function in the green coloured row. The coordinates of the vertices are in the last column, in the last two rows. The program finishes at the moment when all coefficients of the objective function are nonpositive, and the result is: the problem has exactly one solution, at the point C, and the value of the objective function is 31.
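The reported optimum can be double-checked by hand, independently of the program: a linear objective over a bounded polygon attains its optimum at a vertex, so it suffices to intersect the boundary lines pairwise, keep the feasible intersection points, and evaluate z = 3x + 5y there:

```python
from itertools import combinations

# constraints in the form a*x + b*y <= c  (x >= 0 and y >= 0 rewritten accordingly)
cons = [(-1, 0, 0), (0, -1, 0), (1, 0, 5), (1, 2, 12), (2, 3, 19)]

def intersect(c1, c2):
    # intersection point of the two boundary lines a*x + b*y = c
    (a1, b1, r1), (a2, b2, r2) = c1, c2
    det = a1 * b2 - a2 * b1
    if det == 0:
        return None
    return ((r1 * b2 - r2 * b1) / det, (a1 * r2 - a2 * r1) / det)

feasible = [p for p in (intersect(c1, c2) for c1, c2 in combinations(cons, 2))
            if p and all(a * p[0] + b * p[1] <= c + 1e-9 for a, b, c in cons)]
best = max(feasible, key=lambda p: 3 * p[0] + 5 * p[1])
print(best, 3 * best[0] + 5 * best[1])  # (2.0, 5.0) and 31.0
```

This confirms the maximum z = 31 at the vertex C = [2, 5].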
References
[1] CHURCHMAN, C. W.; ACKOFF, R. L.; ARNOFF, L. Introduction to Operations Research. New York : John Wiley & Sons, Inc., 1957.
[2] KLAPKA, J.; DVOŘÁK, J.; POPELA, P. Metody operačního výzkumu. Brno : VUTIUM, 2001.
[3] RAIS, K. Vybrané kapitoly z operační analýzy. Brno : PGS, 1985.
[4] VACULÍK, J.; ZAPLETAL, J. Podpůrné metody rozhodovacích procesů. Brno : Masarykova univerzita, 1998.
[5] WALTER, J. a kol. Operační výzkum. Praha : SNTL, 1973.
[6] ZAPLETAL, J. Operační analýza. Kunovice : Skriptorium VOS, 1995.
Address:
Mgr. Marie Tomšová
University of Technology
Technická 8,
616 00 Brno,
e-mail: [email protected]
AUTHOR INDEX
ABDURRZZAG, T. 27
AMALKA, AL K. 81
BAŠTINEC, J. 35, 143
DIBLÍK, J. 35, 143, 169
DOSTÁL, P. 21
FAJMON, B. 51
KLIEŠTIK, T. 105
KOSTIHA, J. 65
KRUPKOVÁ, V. 157
LACKO, B. 73
LAŠŠÁK, V. 89
MÉSZÁROS, I. 57
MIKULA, V. 45
MINAŘÍK, M. 99
NOVÁK, M. 43, 51
OŠMERA, P. 89, 109, 123
PETRUCHA, J. 45, 137
POPELKA, O. 89
RUKOVANSKÝ, I. 89, 131
SVOBODA, Z. 169
ŠMARDA, Z. 157, 163
ŠŤASTNÝ, J. 99
TOMÁŠ, I. 57
TOMŠOVÁ, M. 173
VÉRTESY, G. 57
ZAPLETAL, J. 9
SPONSORS
Sverepec 365
Považská Bystrica
PSČ: 01701
tel.: 00421-42-4321110
fax: 00421-42-4379930
owners:
Milan Richtárik
Jozef Ďurajka
VS-mont, s. r. o.,
Lazy pod Makytou, SLOVENSKO
email: [email protected]
tel./fax: 00421 42 4681 965
tel.: 00421 42 4681 952
Udiča 366,
018 01 Považská Bystrica
tel.: 042/4260768, 042/4260769
fax: 042/4340269
e-mail: [email protected] - goods orders
[email protected] - economic department
[email protected] - secretariat
Title:
ICSC 2006 - Fourth International Conference on Soft Computing Applied in Computer and Economic Environment
Author:
Collective of authors
Publisher, copyright holder, printed by:
Evropský polytechnický institut, s.r.o.
Osvobození 699, 686 04 Kunovice
Print run: 200 copies
Number of pages: 182
Edition: first
Year of publication: February 2006
ISBN 80-7314-084-5