EUROPEAN POLYTECHNICAL INSTITUTE KUNOVICE
PROCEEDINGS
SECOND INTERNATIONAL CONFERENCE ON SOFT COMPUTING
APPLIED IN COMPUTER AND ECONOMIC ENVIRONMENTS
ICSC 2004
January 29 - 30, Kunovice, Czech Republic
Edited by:
Prof. Ing. Imrich Rukovanský, CSc. and Doc. Ing. Pavel Ošmera, CSc.
Papers authorized under the supervision of the committee:
Prof. PhDr. Karel Lacina, DrSc.
Prof. Ing. Vladimír Mikula, CSc.
Doc. Ing. Jaroslav Ďaďo, PhD.
Prepared for print by:
Bc. Andrea Šimonová, DiS., Bc. Pavel Kubala, DiS. and Bc. Evžen Němec, DiS.
Printed by:
© European Polytechnical Institute Kunovice, 2004
ISBN: 80-7314-025-X
SECOND INTERNATIONAL CONFERENCE ON SOFT COMPUTING
APPLIED IN COMPUTER AND ECONOMIC ENVIRONMENTS
ICSC 2004
Organized by
EUROPEAN POLYTECHNICAL INSTITUTE, KUNOVICE, CZECH REPUBLIC
and
PLEHANOV RUSSIAN ACADEMY OF ECONOMICS, MOSCOW, RUSSIA
in cooperation with
ASSOCIATION OF SMALL AND MEDIUM-SIZED ENTERPRISES AND CRAFTS CZ
Conference Chairman
Ing. Oldřich Kratochvíl, rector
Conference Co-Chairmen
Prof. Ing. Imrich Rukovanský, CSc, vice-rector
Assoc. Prof. Ing. Pavel Ošmera, CSc.
INTERNATIONAL PROGRAMME COMMITTEE
O. Kratochvíl – Chairman (CZ)
M. Baraňski (Poland)
D. Bartoněk (Czech Republic)
V. P. Beljanskij (Russia)
O. Biro (Austria)
J. Ďaďo (Slovak Republic)
R. Hampel (Germany)
T. Hlačina (Czech Republic)
J. M. Honzík (Czech Republic)
U. K. Chakraborthy (USA)
B. Katalic (Austria)
V. Kebo (Czech Republic)
V. G. Knjazev (Russia)
F. Koliba (Czech Republic)
E. Krikavskij (Ukraine)
B. Kulcsár (Hungary)
K. Lacina (Czech Republic)
I. Matušíková (Czech Republic)
V. Mikula (Czech Republic)
Y. A. Mylopoulos (Greece)
J. Nesvadba (Czech Republic)
P. Ošmera (Czech Republic)
B. Ošťádal (Czech Republic)
J. Pavo (Hungary)
J. Petrucha (Czech Republic)
J. Radová (Czech Republic)
K. Rais (Czech Republic)
I. Rukovanský (Czech Republic)
N. Sano (Japan)
G. N. Smirnov (Russia)
J. Strišš (Slovak Republic)
N. P. Tichomirov (Russia)
G. Vértesi (Hungary)
V. I. Vidjapin (Russia)
J. Wiktor (Poland)
A. M. S. Zalzala (U.K.)
W. Zamojski (Poland)
ORGANIZING COMMITTEE
I. Rukovanský (Chairman)
P. Ošmera
M. Vinklárik
A. Šimonová
P. Kubala
E. Němec
J. Kavka
L. Šimíček
I. Gosiorovský
P. A. Makarenko
P. Matušík
S. Amoutzas
M. Balus
M. Brázda
I. Polášková
M. Měska
J. Šáchová
T. Chmela
Session 1: Marketing and Globalization
Chairman: Prof. PhDr. Karel Lacina, DrSc.
Co-Chairman: Assoc. Prof. J. Radová, PhD.
Session 2: Financial Systems
Chairman: Assoc. Prof. Ing. J. Ďaďo, PhD.
Co-Chairman: Assoc. Prof. Ing. J. Strišš, CSc.
Session 3: Soft Computing in Computer Environments
Chairman: Prof. Ing. V. Mikula, CSc.
Co-Chairman: Assoc. Prof. Ing. P. Ošmera, CSc.
CONTENTS:
A Message from the General Chairman ICSC Conference...................................................... 9
MARKETING AND GLOBALIZATION
THE SOLUTION OF LOGISTICS AND TRANSPORT PROBLEMS BY MEANS OF
GENETIC ALGORITHM
Dostál Pavel, Rais Karel (Czech Rep.)................................................................................. 11
EVOLUTIONARY ALGORITHMS FOR MARKETING AND MANAGEMENT
Ošmera Pavel, Kratochvíl Oldřich ........................................................................................ 15
DIAGNOSTIC APPROACH TO THE HUMAN CAPITAL DEVELOPMENT
Kucharčíková Alžbeta, Vodák Jozef (Slovak Rep.).............................................................. 21
MARKETING ACTIVITIES OF SLOVAK BUSINESSES AFTER THE ENTRY TO EU
Strišš Jozef, Rešková Monika (Slovak Rep. )...................................................................... 27
THE DECISION - MAKING PROCESS MODELLING
Hittmár Štefan. (Slovak Rep.)............................................................................................... 31
MARKETING AND ITS GLOBAL ORIENTATION ON TURN OF THE
CENTURIES
Jedlička Milan (Slovak Rep.)............................................................................................... 35
METHODS USED IN MARKETING SEGMENTATION PROCESS
Rešková Monika ( Slovak Rep.) ......................................................................................... 41
ADVERTISING AND GLOBALISATION INFLUENCES
Achalova Larisa Vladislavovna (Russia) ............................................................................. 47
CUSTOMER BEHAVIOUR
Jankal Radoslav ( Slovak Rep. ) .......................................................................................... 53
CONSUMER BEHAVIOUR IN GLOBAL MARKET
Seifoullaeva Elvira (Russia) ................................................................................................ 57
SALES PROMOTION
Berchik Helen (Russia)........................................................................................................ 61
STRATEGIES OF AN INTERNATIONAL MARKET ENTRY
Ponomareva Maria Alexandrovna (Russia)........................................................................... 65
SCOPES OF MARKETING STRATEGY FOR 3G MOBILE SERVICES
Tokarčíková Emese ( Slovak Rep.)..................................................................................... 69
CULTURE IN INTERNATIONAL MARKETING AND BUYER BEHAVIOUR
Kalugin Evgeniy (Russia).................................................................................................... 73
EFFECTIVE COMMUNICATION AND PROMOTION STRATEGY
Oreshkin Alexey Gennadievich (Russia) ............................................................................. 77
MARKETING PREPAREDNESS OF SMALL AND MIDDLE-SIZE ENTERPRISES FOR
THE ENTRY INTO THE EU
Jaroslav Ďaďo ( Slovak Rep.) ............................................................................................... 83
INTERCULTURAL DIFFERENCES BETWEEN SLOVAK AND EU COUNTRIES
Vladimír Laššák, Jaroslav Ďaďo ( Slovak Rep.) ................................................................... 89
FINANCIAL SYSTEMS
FUZZY LOGIC AND FINANCIAL TIME SERIES
Dostál Pavel, Žák Libor. (Czech Rep.) .....................................................................................93
DETERMINATION OF TAX SHIELD AND ITS INFLUENCE IN FINANCIAL
DECISIONS
Radová Jarmila, Marek Petr, Hlačina Tibor (Czech Rep.) ........................................................99
SOFT COMPUTING IN COMPUTER
ENVIRONMENTS
MODELLING OF INFLATION FUZZY TIME SERIES USING THE COMPETITIVE
NEURAL NETWORK
Marček Dušan (Slovak Rep.)............................................................................................. 107
USE OF NEW METHOD FOR LEARNING OF NEURAL NETWORKS
Petrucha Jindřich, Mikula Vladimír (Czech Rep.)............................................................... 113
NEURAL NETWORK AND JOINT TRANSFORM CORRELATOR APPLIED FOR
THERMAL IMAGERY-BASED RECOGNITION
Kościuszkiewicz Krzysztof, Kobel Joanna, Mazurkiewicz Jacek, (Poland) ..................... 119
NEURAL NETWORK IN ADAPTIVE CONTROL
Veleba Václav, Petr Pivoňka (Czech Rep.) ........................................................................ 125
FUZZY INFERENCE SYSTEM AND PREDICTION
Žák Libor (Czech Rep.) ...................................................................................................... 131
EVOLVING CONTROLLERS
Ošmera Pavel, Chakraborthy K.Uday, Rukovanský Imrich (Czech Rep./USA)................... 137
AN IMPROVED ALGORITHM BASED ON STOCHASTIC HEURISTIC METHODS
FOR SOLVING STEINER TREE PROBLEM IN GRAPHS
Smutek Daniel (Czech Rep.) ............................................................................................. 145
DETECTING PARETO OPTIMAL SOLUTIONS WITH PSO
Baumgartner U., Magele Ch., Renhart W. (Austria) ............................................................ 155
THE RISK ANALYSIS OF SOFT COMPUTING PROJECTS
Lacko Branislav ( Czech Rep.) ........................................................................................... 163
NETWORK VIRTUAL LABORATORY – INDUSTRIAL ROBOT
Baranski Wlodzimierz M., Walkowiak Tomasz (Poland)................................................... 169
SIMD APPROACH TO RETRIEVING ALGORITHM OF MULTILAYER
PERCEPTRON
Mazurkiewicz Jacek (Poland) ............................................................................................ 177
ACOUSTIC LOCALIZATION OF MULTIPLE VEHICLES
Zamojski Wojciech, Walkowiak Tomasz (Poland) ............................................................ 185
THE FUZZY RATIONAL DATABASE SYSTEM FSEARCH 2.0
Bednář Josef (Czech Rep.) ................................................................................................. 193
A SPECIAL DATABASE FOR THE MEASURING IN FIELD
Bartoněk Dalibor (Czech Rep.).......................................................................................... 199
INTERNET – A MODERN E-LEARNING MEDIUM
Woda Marek, Walkowiak Tomasz (Poland)....................................................................... 205
USING THE RUP FOR SMALL WEB INFORMATION SYSTEMS
Švec Petr (Czech Rep.)....................................................................................................... 215
SHAPE OPTIMISATION OF THE FERROMAGNETIC CORE OF FLUXSET
MAGNETOMETER
Vértesy Gábor, Pávó,J ( Hungary) ..................................................................................... 221
RECONSTRUCTION AND REPRESENTATION OF EGYPTIAN MONUMENTS
FROM POINT CLOUDS USING NEURAL NETWORK BASED METHOD
Ashraf S.Hussein, Mohammed F.Tolba (Egypt) ............................................................... 229
EVALUATION OF DEFECT OF PIPELINE ISOLATION DETECTED BY PEARSON
METHOD
Bartoněk Dalibor,Rukovanský Imrich (Czech Rep.) .......................................................... 239
AN ENHANCED WWW-BASED SCIENTIFIC DATA VISUALIZATION SERVICE
USING VRML
Dina Reda Khattab, Amr Hassan Abdel Aziz, and Ashraf Saad Hussein (Egypt) ............. 245
SYSTEMS FOR EVALUATION OF ANTICORROSIVE PROTECTION OF PIPELINE
Bartoněk Dalibor, Nesvadba Jaroslav (Czech Republic) ................................................... 255
COURSE TIMETABLING BY AN ARTIFICIAL IMMUNE SYSTEM
Čechman J. (Czech Rep.) ................................................................................................... 261
MANAGING A HIGH SPEED LAN USING DISTRIBUTED ARTIFICIAL
INTELLIGENCE
Ibrahiem M. M-El Emary (Jordan) ................................................................................... 269
MARKOV CHAINS AND EXAMPLE OF THEIR USE
Ševčík Vítězslav (Czech Rep.) ........................................................................................... 277
APPLICATION OF THE TWO-DIMENSIONAL HELLINGER AND SHANNON
QUASI-NORM
Jurák Petr, Karpíšek Zdeněk (Czech Rep.)........................................................... 285
THE NEW MODEL OF SYSTEM FUZZY RELIABILITY
Jelínek Pavel, Karpíšek Zdeněk (Czech Rep.).................................................................... 291
ON THE APPLICATION OF INTELLIGENT AUTONOMOUS AGENTS FOR
MANAGING THE REAL TIME COMMUNICATION NETWORK
Ibrahiem M. M-El Emary, (Jordan) ................................................................................... 297
HOLOGRAPHIC REDUCED REPRESENTATION IN ARTIFICIAL INTELLIGENCE
AND COGNITIVE SCIENCE
Kvasnička Vladimír, Pospíchal Jiří (Slovak Rep.).............................................................. 305
A NOTE ABOUT NON-LINEAR FUNCTION APPROXIMATION WITH FUNCTION
BOUNDARY
Radomil Matoušek (Czech Rep.) ........................................................................................ 329
ANALYSIS METHODS FOR IMAGE PRE-PROCESSING AND 2-DIMENSIONAL
PATTERN RECOGNITION
Roman Weisser (Czech Rep.) ............................................................................................. 333
ON THE DISREGARDED FORMS OF COMPUTER CRIME
Pichl Karel, Pichlová Markéta (Czech Rep./ France) ........................................................ 343
A message from the General Chairman of the Conference
Dear guests and participants of this conference,
I would like to welcome you very warmly, on behalf of myself and my colleagues, to the second
scientific conference ICSC held at the European Polytechnical Institute. I would like to thank all
who took an active part in the preparation of the conference. The delivered papers of top quality
guarantee the high quality of the conference program. The personal attendance of the authors ensures
both the high professional level and the social level of the event.
The specialization of the conference, as in ICSC 2003, focuses on three study branches at our
college: Marketing and Globalization, Soft Computing in Financial Systems, and Soft Computing
in Computer Environments.
Thanks to you, the most up-to-date contemporary reports will be brought forward in all the
sections. I mean mainly the problems of using neural networks, fuzzy logic and genetic algorithms
presented in the particular articles, questions concerning the application of different optimization
methods to the modelling of management systems, marketing optimization using genetic
algorithms, as well as the role of the financial systems in the global economy.
The first international conference ICSC 2003, held by our school in January last year,
fulfilled our expectations of obtaining basic information about the current trends in the development
of these branches and the state of knowledge of various problems, and, not least, it gave us the
possibility to make a lot of personal contacts with local institutions as well as those abroad.
The papers of the conference ICSC 2004 follow up the themes of the last one, but with a larger
development of the principles and possibilities of using neural networks and genetic algorithms in
practice. In the future we expect that attention will be paid to the application of the conference
results in "business intelligence".
The conference ICSC 2004 has been organized by the European Polytechnical Institute in
cooperation with the Plehanov Russian Academy of Economics in Moscow and with the mutual
activities of the Association of Small and Medium-sized Enterprises and Crafts in the Czech Republic.
Let me close by wishing you all, on my own behalf and in the name of the college
management, a pleasant stay and the establishment of useful professional and personal contacts. I
believe that our conference will fulfil your expectations and that you will gain a lot of valuable
information and ideas here.
Kunovice, January 29, 2004
Dipl. Ing. Oldřich Kratochvíl
rector
THE SOLUTION OF LOGISTICS AND TRANSPORT PROBLEMS BY MEANS OF
GENETIC ALGORITHMS
Pavel Dostál1), Karel Rais2)
1)
Strakatého 15, 636 00 Brno, Czech Republic
E-mail: [email protected], http://www.iqnet.cz/dostal/
Phone: ++420 5 44211639, Fax: ++420 5 44234750
2)
Technical University of Brno, Faculty of Business and Management,
Technická 2, 616 69 Brno, Czech Republic
E-mail: [email protected], Tel: ++420-54114 2233, Fax: ++420-54114 2527
Abstract: The article presents the possible use of genetic algorithms for the solution of logistics and
transport problems. The logistics problem is demonstrated on the case of pick-up places and
collection centers. The transport problem is demonstrated on the case of a traffic route. The process
of optimization contributes to minimizing the costs and maximizing the profit.
Keywords: logistics problems, transport problems, optimization, genetic algorithms
1. INTRODUCTION
There are problems that must be solved in practice, and transport and logistics problems belong among
them. The correct optimization of such problems enables us to minimize the costs and time. Genetic
algorithms can help us with such problems.
2. LOGISTICS OPTIMIZATION
The solution of a logistics problem can be presented in the following case. We have defined the coordinates of
pick-up places (for example pick-up of mail, garbage, etc.) and we determine the number of collection centers
where the fabricator will be placed. The problem is to determine the coordinates of such collection centers and
the respective pick-up places. The method of clustering can be used, and genetic algorithms optimize the
solution. During the process of calculation the pick-up places are divided into clusters and then the coordinates
of the center of each cluster are calculated. The numbers of pick-up places and collection centers are not
restricted. Part of Tab.1. shows the numbers and coordinates of the pick-up places and their respective
collection centers. Tab.2. shows the calculated coordinates of collection centers A, B, C.
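The clustering-plus-GA procedure described above can be sketched in a few lines. The following fragment is only an illustrative sketch, not the authors' implementation: the function names (`centroids`, `cost`, `evolve`) and all GA parameters are our own assumptions, and the coordinates are the eight sample points from Tab.1.

```python
import random

# Sample pick-up places (coordinates taken from Tab.1, points 22-29)
PLACES = [(25.00, 4.19), (16.00, 2.62), (111.00, 16.79), (66.80, 10.10),
          (20.40, 4.10), (120.00, 17.01), (56.80, 5.69), (59.90, 10.80)]
K = 3  # number of collection centers

def centroids(assign):
    """Mean coordinates of each cluster - the collection centers."""
    cents = []
    for c in range(K):
        pts = [PLACES[i] for i, a in enumerate(assign) if a == c]
        if not pts:                       # empty cluster: penalize heavily
            cents.append((1e6, 1e6))
        else:
            cents.append((sum(p[0] for p in pts) / len(pts),
                          sum(p[1] for p in pts) / len(pts)))
    return cents

def cost(assign):
    """Total squared distance of every place to its collection center."""
    cents = centroids(assign)
    return sum((x - cents[a][0]) ** 2 + (y - cents[a][1]) ** 2
               for (x, y), a in zip(PLACES, assign))

def evolve(pop_size=40, generations=300, seed=0):
    """Elitist GA over cluster assignments (one gene per pick-up place)."""
    rnd = random.Random(seed)
    pop = [[rnd.randrange(K) for _ in PLACES] for _ in range(pop_size)]
    best = min(pop, key=cost)
    for _ in range(generations):
        nxt = [best[:]]                                   # elitism
        while len(nxt) < pop_size:
            p1 = min(rnd.sample(pop, 3), key=cost)        # tournament
            p2 = min(rnd.sample(pop, 3), key=cost)
            cut = rnd.randrange(1, len(PLACES))           # one-point crossover
            child = p1[:cut] + p2[cut:]
            if rnd.random() < 0.4:                        # mutation: reassign
                child[rnd.randrange(len(PLACES))] = rnd.randrange(K)
            nxt.append(child)
        pop = nxt
        best = min(pop + [best], key=cost)
    return best, centroids(best)
```

Each chromosome assigns one of the K collection centers to every pick-up place; elitism plus tournament selection drives the total squared distance down, and the centroids of the final clusters play the role of the collection centers. They differ slightly from Tab.2, because only a fragment of the paper's data set is reproduced here.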
Number   X        Y        Collection center
…        …        …        …
22       25.00    4.19     A
23       16.00    2.62     A
24       111.00   16.79    C
25       66.80    10.10    B
26       20.40    4.10     A
27       120.00   17.01    C
28       56.80    5.69     B
29       59.90    10.80    B
…        …        …        …
Tab.1. Pick-up places

Collection center   X       Y
A                   18.4    3.1
B                   59.8    9.0
C                   106.8   15.9
Tab.2. Collection centers
INTERNATIONAL CONFERENCE ON SOFTCOMPUTING APPLIED IN COMPUTER AND ECONOMIC ENVIRONMENTS ICSC 2004
EPI Kunovice, Czech Republic, January 29-30, 2004
The process of optimization determines the placement of collection centers and the respective pick-up places in
such a way as to minimize the distances among them. Fig.1. shows the pick-up places and collection centers
which solve the mentioned problem and thus minimize the costs and time.
Fig.1. The pick-up places and collection centers A, B, C
3. TRAVEL OPTIMIZATION
The solution of the travel problem can be presented in the following case. When we know the coordinates of
pick-up places and collection centers, we can determine the traffic distances among them for each group. The
problem is to determine the shortest traffic route through the places which we have to visit. Genetic algorithms
can help us to solve such a problem, especially when the number of visited places is high. During the process of
calculation the shortest traffic route is searched for.
Part of Tab.3. shows the traffic distances between collection center A (CCA) and the pick-up places in
group A (PPAx) in km. Tab.4. shows the order of visited places before and after optimization. The traffic
route measures 327 km after optimization (441 km before). Fig.2. shows these two traffic routes. The shortest
traffic route enables us to minimize the costs and time.
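The route search can be illustrated with a small permutation-encoded genetic algorithm. The sketch below is our own illustration (function names and parameters are hypothetical, not the authors' code); it uses only the 5x5 fragment of the distance matrix from Tab.3, so the tour it finds applies to that fragment, not to the full 441/327 km routes of Tab.4.

```python
import random

# Distance fragment from Tab.3 (km): CCA, PPA2, PPA3, PPA4, PPA5
NODES = ["CCA", "PPA2", "PPA3", "PPA4", "PPA5"]
D = [[0, 30, 24, 36, 24],
     [30, 0, 14, 30, 30],
     [24, 14, 0, 18, 17],
     [36, 30, 18, 0, 15],
     [24, 30, 17, 15, 0]]

def tour_length(order):
    """Length of the closed route CCA -> visited places -> CCA."""
    route = [0] + order + [0]
    return sum(D[a][b] for a, b in zip(route, route[1:]))

def evolve_route(pop_size=40, generations=150, seed=1):
    """Elitist GA over visiting orders with swap mutation."""
    rnd = random.Random(seed)
    base = [1, 2, 3, 4]                          # indices of PPA2..PPA5
    pop = [rnd.sample(base, len(base)) for _ in range(pop_size)]
    best = min(pop, key=tour_length)
    for _ in range(generations):
        nxt = [best[:]]                          # elitism: keep the best tour
        while len(nxt) < pop_size:
            parent = min(rnd.sample(pop, 3), key=tour_length)  # tournament
            child = parent[:]
            i, j = rnd.sample(range(len(child)), 2)            # swap mutation
            child[i], child[j] = child[j], child[i]
            nxt.append(child)
        pop = nxt
        best = min(pop + [best], key=tour_length)
    return best, tour_length(best)
```

On this fragment the optimum (checked by enumerating all 12 distinct tours) is CCA-PPA2-PPA3-PPA4-PPA5-CCA at 101 km; a realistic run would use the full distance matrix and an order-preserving crossover (e.g. OX) in addition to the swap mutation.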
        CCA   …   PPA2  PPA3  PPA4  PPA5  …
CCA       0   …     30    24    36    24   …
PPA2     30   …      0    14    30    30   …
PPA3     24   …     14     0    18    17   …
PPA4     36   …     30    18     0    15   …
PPA5     24   …     30    17    15     0   …
…         …   …      …     …     …     …   …
Tab.3. Travel distances (km)

Order   Before optimization   After optimization
1       CCA                   CCA
2       PPA1                  PPA13
3       PPA2                  PPA12
4       PPA3                  PPA10
5       PPA4                  PPA9
6       PPA5                  PPA8
7       PPA6                  PPA7
8       PPA7                  PPA11
9       PPA8                  PPA1
10      PPA9                  PPA4
11      PPA10                 PPA3
12      PPA11                 PPA5
13      PPA12                 PPA6
14      PPA13                 PPA2
15      PPA14                 PPA14
        CCA                   CCA
Total:  441 km                327 km
Tab.4. Traffic route

Fig.2a. Traffic route before optimization (441 km)
Fig.2b. Traffic route after optimization (327 km)
4. CONCLUSION
Genetic algorithms enable us to solve complicated logistics and traffic problems. The correct optimization
and application of the results in practice enables us to minimize the costs, increase the profit and save our
environment.
LITERATURE:
[1] DOSTÁL, P., RAIS, K.: Methods of Large Investment Unit Modelling. Trento 2001, Italy, Transformation of
CEEC Economies to EU Standards, Conference, pp. 53-57, ISBN 80-86510-27-1.
[2] DOSTÁL, P., RAIS, K.: Genetické algoritmy a jejich využití v modelování investic (Genetic Algorithms and
Their Use in Investment Modelling). Brno 2002, Manažerské mosty 2002, Conference, pp. 41-44, ISBN
80-7314-004-7.
[3] DOSTÁL, P.: Moderní metody ekonomických analýz (Modern Methods of Economic Analyses). UTB –
FAME – Zlín 2002, Study materials, 110 p., ISBN 80-7318-075-8.
[4] DOSTÁL, P., RAIS, K.: Production Schedule by Means of Genetic Algorithms. Kunovice 2003, International
Conference on Soft Computing, pp. 32-34, ISBN 80-7314-017-9.
EVOLUTIONARY ALGORITHMS FOR MARKETING AND MANAGEMENT
Pavel Ošmera1), Oldřich Kratochvíl2)
1)
Institute of Automation and Computer Science
Brno University of Technology, Technická 2, 616 69 Brno, Czech Republic
osmera@uai.fme.vutbr.cz
2)
European Polytechnic Institute, Osvobození 699, 686 04 Kunovice, Czech Republic
Tel. 572 548 035, Fax: 572 549 018, E-mail: [email protected]
Abstract: There has been a recent resurgence of interest in using simulated evolution as a method
for discovering solutions to economic optimization problems. These include prediction, modeling,
control, decision making, etc. Evolutionary algorithms can be used to solve management and
marketing problems. A few evolutionary algorithms have been applied to the Dynamic Economic
Problem (DEP). Hybrid and parallel genetic algorithms (GAs) for solving dynamic economic
problems can be implemented. The adaptive significance of parallel GAs and their comparison
with standard GAs are presented.
Keywords: parallel evolutionary algorithms, adaptive systems, economic optimization problems
1 INTRODUCTION
Business decisions almost always depend upon some forecast of the course of events. Forecasting involves
making the best possible judgment about some future event. In today’s rapidly changing business world such
judgments can mean the difference between success and failure. Demand forecasts are also valuable as an input
to sales and departmental budgets. Sales forecasts, e.g. in dollars, are provided to the finance department as an
input to the financial budgeting process. Managers use time-series analysis, smoothing, regression, and
judgmental methods in developing forecasts of sales. As forecasters, at one time or another, we have to ask
ourselves why we should try to forecast in the first place. There is a three-part response to this question: First, the
power of forces such as economics, competition, markets, social concerns and the ecological environment to
affect the individual firm is severe and continues to grow. Secondly, forecast assessment is a major input in
management’s evaluations of different strategies at business decision-making levels. Thirdly, the inference of no
forecasting is that the future either contains “no significant change” or there is ample time to react “after the
fact” [18].
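As a concrete instance of the smoothing methods mentioned above, simple exponential smoothing forecasts the next value as a weighted blend of the latest observation and the previous forecast. The snippet below is a generic textbook illustration of such a forecast, not code from the paper; the function name and sample data are our own.

```python
def exp_smooth_forecast(series, alpha=0.5):
    """Simple exponential smoothing: F(t+1) = alpha*y(t) + (1-alpha)*F(t)."""
    forecast = series[0]                 # initialize with the first observation
    for y in series[1:]:
        forecast = alpha * y + (1 - alpha) * forecast
    return forecast

# e.g. hypothetical monthly sales; the forecast smooths out the noise
sales = [100, 120, 110, 130]
print(exp_smooth_forecast(sales, alpha=0.5))  # 120.0
```

A smaller alpha reacts more slowly to recent changes; the judgmental methods mentioned in the text would then adjust this mechanical forecast.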
The fundamental approach to decision-making is to formulate a single metric which summarizes the
performance or value of a decision and iteratively improve this performance by selecting from among the
available alternatives. Unfortunately, many problems do not lend themselves to optimization by classic methods
(e.g., Newton-Gauss, steepest descent, and so forth). The mapping from each proposed solution to its
corresponding performance index may not be differentiable, or even continuous, and may possess multiple local
optima. In many modeling problems, the assumptions that are required to implement an estimation technique are
violated. Unknown systems may be nonstationary or even intelligently interactive. Classic optimization methods
often lead to unacceptably poor performance when applied to real-world circumstances. A more robust
optimization technique is required. Applying the logical aspects of the evolutionary process to optimization
offers several distinct advantages. There exists a large body of knowledge about the process of natural evolution
which can be used to guide simulations. This process is well-suited for solving problems with unusual constraints
where heuristic solutions are not available or generally lead to unsatisfactory results. A traditional science tended
to emphasize stability, order, uniformity, and equilibrium. It concerned itself mostly with closed systems and
linear relationships in which small inputs uniformly yield small results. Generally they center on the basic
conviction that - at some level - the world is simple and is governed by time-reversible fundamental laws. What
makes the new paradigm especially interesting is that it shifts attention to those aspects of reality that
characterize today’s accelerated social change: disorder, instability, diversity, disequilibrium, nonlinear
relationships (in which small inputs can trigger massive consequences), and temporality – a heightened
sensitivity to the flow of time. Most phenomena of interest to us
are, in fact, open systems, exchanging energy or matter (and, one might add, information) with their
environment. All systems contain subsystems, which are continually “fluctuating”. At times, single fluctuations
or a combination of them may become so powerful, as a result of positive feedback, that they shatter the pre-existing organization. Authors [6], [8] call it “a singular moment” or “a bifurcation point”. It is inherently
impossible to determine in advance which direction a change will take: whether the system will disintegrate into
“chaos” or leap to a new, more differentiated, higher level of “order” or organization which is called “dissipative
structure”.
Today we are going from deterministic, reversible processes to stochastic and irreversible ones. Certain
events go only one way - not because they cannot go the other way, but because it is extremely unlikely that they
go backward. Initial conditions corresponding to a single point in unstable systems correspond to infinite
information and are therefore impossible to find or observe. No equilibrium brings “order out of chaos”.
However, as it was already mentioned, the concept of order (or disorder) is more complex than it was thought to
be. Open systems evolve to higher and higher forms of complexity. Molecular biology shows that not everything
in a cell is alive in the same way. Some processes reach equilibrium; others are dominated by regulatory
enzymes far from equilibrium.
Speculation that artificial intelligence, or more properly “machine learning”, could be created through a
simulation of evolution goes back at least to Cannon (1932), who pictured natural evolution as a process that is
similar in consequence to that which an individual undergoes, proceeding by random trial-and-error. The first
attempt at the formal application of evolutionary theory to practical engineering problems appeared in two
somewhat disparate areas: statistical process control and artificial intelligence.
2 ADAPTATION OF COMPLEX ECONOMIC SYSTEMS
In dynamical systems, transition can be found: order, complexity, and chaos. Analogously, water can
exist in solid, transitional, and fluid phases. Kauffman was busy on the autocatalytic set simulation for the origin
of life [5], [6]. In nonlinear systems, a chaos theory tells us that the slightest uncertainty in our knowledge of
initial conditions will often grow inexorably, and our predictions are nonsense. Complex adaptive systems share
certain crucial properties (non - linearity, complex mixture of positive and negative feedback, nonlinear
dynamics, emergence, collective behavior, spontaneous organization, etc.). In the natural world, such systems
include brains, immune systems, ecology, cells, developing embryos, and ant colonies. In the human world, they
include cultural and social systems. Each of these systems is a network of a number of “agents” acting in
parallel: in a brain, the agents are nerve cells; in ecology, the agents are species; in a cell, the agents are
organelles such as the nucleus and the mitochondria; in an embryo, the agents are cells, and so on. Each agent
finds itself in the environment produced by its interactions with the other agents in the system. It is constantly
acting and reacting to what the other agents are doing. There are emergent properties, the interaction of a lot of
parts, the kinds of things that the group of agents can do collectively; something that the individual cannot. There
is no master agent - for example - a master neuron in the brain. Complex adaptive systems have a lot of levels of
organization (hierarchical structures), with agents at any level serving as building blocks for agents at a higher
level. It is no wonder that complex adaptive systems (with multiple agents, building blocks, internal models, and
perpetual novelty) are so hard to analyze with standard mathematics. We need mathematics and computer
simulation techniques (a whole new mathematical art: programming) that emphasize internal models, emergence
of new building blocks, and a rich web of interactions between multiple agents.
We now have a good understanding of chaos and fractals showing how simple systems with simple
parts can generate very complex behaviors. The edge of chaos is a special region onto itself, the place where you
can find systems with lifelike, complex behavior. Living systems are actually very close to this edge-of-chaos
phase transition, where things are much looser and more fluid. Natural selection is not an antagonist of self-organization. It is a force that constantly pushes emergent, self-organizing systems towards the edge of chaos.
Evolution always seemed to lead to the edge of chaos. The complex evolutionary structure described in [10],
[13] can be transformed to the structure of the computational intelligence (see Fig. 1).
A random genetic crossover or mutation may give a species the ability to run much faster than before.
The agent starts changing, then it induces changes in one of its neighbors, and finally you get an avalanche of
changes until everything again stops changing. Systems get to the edge of chaos through adaptation: each
individual (agent) tries to adapt to all the others. Co-evolution can also get them there; the whole system
co-evolves to the edge of chaos. In ecosystems or ecosystem models [10], three regimes can be found: ordered
regime, chaotic regime, and edge-of-chaos like a phase transition. When the system is at the phase transition,
then - of course - order and chaos are in balance. There is an evolutionary metadynamics, a process that would
tune the internal organization of each agent so that they all reside at the edge of chaos. The maximum fitness
occurs right at the phase transition. Life is clearly a chemical phenomenon, and only molecules can
spontaneously undergo a complex chemical reaction with one another. The first source of chemistry’s power is a
simple variety. The second source of power is reactivity: the structure A can manipulate the structure B to form
something new – a structure C. A growth of complexity really does have something to do with far-from-equilibrium systems building themselves up, cascading to higher and higher levels of organization. Once they
have accumulated a sufficient diversity of objects at the higher level, they go through a kind of autocatalytic
phase transition and get an enormous proliferation of things at that level. Life is a natural expression of complex
matter. It is a very deep property of chemistry and catalysis and being far from equilibrium.
Evolutionary computation is generally considered as a consortium of genetic algorithms (GAs), genetic
programming, evolutionary programming, and evolution strategies. While it has been studied for 40 years by
computer scientists, its application to economics and finance has a much shorter history. After a decade-long
development, we believe that it is high time to take a special interest in this subject. In 1988, when John
Holland and Brian Arthur established an economics program at the Santa Fe Institute, artificial stock markets
were chosen as the initial research project. Now many papers are directly related to agent-based artificial stock
markets and ACE (Agent-based Computational Economics).
There are other soft computing methods suitable for solving economic problems:
Particle Swarm Optimization,
Multi-Agent EA,
Agent-Based Multiobjective Optimization,
Ant Colony Optimization, Team Optimization, Cultural Algorithms,
Evolutionary Computation:
Evolution Strategies,
Genetic Programming,
Genetic Algorithms, Parallel GAs,
Fuzzy Logic, Neural Networks, Fuzzy-Rough Sets, Fuzzy-Neural Modeling,
Hybrid Learning, Intelligent Control, Cooperative Co-evolutionary Algorithms,
Parasitic Optimization, Bacterial EA,
Artificial Life Systems, Differential EA,
Parallel Hierarchical EA, Meta-Heuristics,
Evolutionary Multi-objective Optimization,
Evolvable Control, Embryonic Hardware,
Human-Computer Interaction, Molecular-Quantum Computing,
Data Mining, Chaotic Systems, Scheduling.
3 MARKETING OPTIMIZATION USING PARALLEL GENETIC ALGORITHMS
For artificial foreign exchange markets, GA learning can be implemented in two different ways,
namely, learning how to optimize and learning how to forecast. Many GAs are implemented on a population
consisting of haploid individuals. However, in nature, many living organisms have more than one chromosome,
and there are mechanisms used to determine dominant genes [9], [11]. Sexual recombination generates an
endless variation of genotype combinations that increases the evolutionary potential of the population [11]. Because it
increases the variation among the offspring produced by an individual, it improves the chance that some of them
will be successful in a varying environment, where they will often encounter unpredictable changes. Mutation,
genetic drift, migration, nonrandom mating, and natural selection cause changes in variation within the population. It
can be demonstrated that standard GAs with haploid chromosomes are unable to correctly locate optimal
solutions for time-dependent objective functions. The adaptation of GAs depends on the speed of landscape
changes through time. One application of GAs concerns the effectiveness of different policies on pollution
control. Pollution control is a very hot topic in microeconomics because it provides a perfect illustration of how
the optimal level of pollution can potentially be achieved by using the market mechanism. Cooperative
computation with a market mechanism provides another demonstration of how to use soft computing.
3.1 Algorithm description
The following GAs can be used to solve the DEP:
Standard GA - one population with a size of 40 individuals. Individuals are sorted by their fitness. The higher
the fitness, the higher the probability of selecting the individual to be a parent (a roulette wheel). The second
parent is selected in the same manner. Then crossover, mutation, and correction are applied.
Reproduction is repeated until the worse half of the population has been replaced.
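The roulette-wheel parent selection just described can be sketched as follows. This is an illustrative Python sketch, not the authors' implementation; the crossover, mutation and correction operators are omitted, and non-negative fitness values are assumed:

```python
import random

def roulette_select(population, fitnesses):
    """Pick one individual with probability proportional to its fitness.

    Assumes all fitness values are non-negative and at least one is positive.
    """
    total = sum(fitnesses)
    pick = random.uniform(0, total)          # spin the wheel
    acc = 0.0
    for individual, fit in zip(population, fitnesses):
        acc += fit                           # each slot's width equals its fitness
        if acc >= pick:
            return individual
    return population[-1]                    # fallback against rounding error

# One reproduction step: both parents are drawn by the same roulette wheel.
population = ["A", "B", "C", "D"]
fitnesses = [1.0, 2.0, 3.0, 10.0]
parents = (roulette_select(population, fitnesses),
           roulette_select(population, fitnesses))
```

Individuals with higher fitness occupy wider slots on the wheel, so they are chosen more often, which is the selection pressure the paper relies on.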
GA with two sub-populations (a sexual approach in which males and females are distinguished) - each
sub-population has a size of 20 individuals. Individuals are sorted by their fitness. The first parent is selected
from the male sub-population, while the second parent is selected from the female sub-population. The first parent
is selected with a uniform distribution, while the second parent is selected differently,
using a modified roulette-wheel approach that prefers better individuals more often.
Crossover, mutation, and correction operators are then applied. Reproduction is repeated until
the worse half of each sub-population has been replaced.
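The asymmetric parent selection of this sexual variant might be sketched like this (illustrative Python; the function and variable names are assumptions, and plain fitness-proportional selection stands in for the unspecified "modified roulette wheel"):

```python
import random

def pick_parents(males, females, female_fitness):
    """First parent: uniform draw from the male sub-population.
    Second parent: fitness-biased draw from the female sub-population
    (a stand-in for the paper's 'modified roulette wheel')."""
    father = random.choice(males)            # uniform distribution over males
    total = sum(female_fitness)
    pick = random.uniform(0, total)
    acc = 0.0
    mother = females[-1]                     # fallback against rounding error
    for individual, fit in zip(females, female_fitness):
        acc += fit                           # fitter females get wider slots
        if acc >= pick:
            mother = individual
            break
    return father, mother

males = ["m1", "m2", "m3"]
females = ["f1", "f2", "f3"]
father, mother = pick_parents(males, females, [1.0, 2.0, 4.0])
```

The asymmetry is the point: one sex is sampled without selection pressure (preserving diversity), the other with strong pressure toward fit individuals (driving convergence).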
The GA with two sub-populations is a special case of the parallel GA with two sub-populations and
sexual selection. Sexual recombination generates an endless variety of genotype combinations that
increases the evolutionary potential of the population [9-13]. Because it increases the variation among the offspring
produced by an individual, it improves the chance that some of them will be successful in the varying
environment. Mutation, genetic drift, migration, nonrandom mating, and natural selection cause changes in
variation within the population. It can be demonstrated that standard GAs with haploid chromosomes are
unable to correctly locate optimal solutions for time-dependent objective functions. The adaptation of GAs
depends on the speed of landscape changes through time.
The parallel GA has a two-level structure. The first level is formed by several populations with different
GAs [10], [14]. The best or random individuals from the first level are sent to the second level, where a
standard GA with elitism runs. This two-level structure allows us to find a better solution than that found by the
GAs in the first level; the best solution from the first level can never be lost, only overtaken, in the second
level.
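The two-level structure described above can be illustrated with a small runnable toy. Everything concrete here (the objective with its peak at x = 3, the population sizes, the mutation-only reproduction) is an assumption for illustration, not the authors' algorithm:

```python
import random

def fitness(x):
    """Toy objective to maximise: peak at x = 3 (an assumption for illustration)."""
    return -(x - 3.0) ** 2

class SimpleGA:
    """Minimal real-valued, mutation-only GA; elitist=True never loses the best."""
    def __init__(self, size=10, elitist=False):
        self.pop = [random.uniform(-10, 10) for _ in range(size)]
        self.elitist = elitist

    def step(self):
        self.pop.sort(key=fitness, reverse=True)
        parents = self.pop[: len(self.pop) // 2]       # better half breeds
        children = [p + random.gauss(0, 0.5) for p in parents for _ in (0, 1)]
        if self.elitist:
            children[0] = parents[0]                   # elitism: best survives unchanged
        self.pop = children

    def absorb(self, migrants):
        self.pop.sort(key=fitness)                     # worst first
        self.pop[: len(migrants)] = migrants           # migrants replace the worst

    def best(self):
        return max(self.pop, key=fitness)

def two_level_ga(n_islands=3, generations=40, migrate_every=5):
    """First level: independent GAs; second level: elitist GA fed by their bests."""
    first_level = [SimpleGA() for _ in range(n_islands)]
    second_level = SimpleGA(elitist=True)
    for gen in range(generations):
        for ga in first_level:
            ga.step()
        if gen % migrate_every == 0:
            second_level.absorb([ga.best() for ga in first_level])
        second_level.step()
    return second_level.best()
```

Because the second-level GA is elitist, a migrated solution can later be improved upon but never lost, which is exactly the property claimed for the two-level scheme.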
4 CONCLUSIONS
Evolutionary optimization can quickly achieve a near-optimum solution even when the space of possible solutions
is immense. Further, the nature of the algorithms indicates their potential utility for parallel processing machines.
Each parent can be mutated independently and each offspring can be evaluated independently (unless fitness is
an explicit function of the other members of the population, and even then the process can be accelerated through
parallelism). Competition and selection can also be parallelized. Thus the required execution time of
evolutionary algorithms can be decreased dramatically if a number of small processors are devoted to these
independent calculations. Meta-approaches can be applied to the continuum of evolutionary algorithms. This
may lead to an increased rate of optimization.
Natural evolution is the most robust yet efficient problem-solving technique. Evolutionary algorithms
can be made just as robust. The same procedure can be applied to diverse problems with relatively little
reprogramming. While such efforts will undoubtedly successfully address difficult real-world problems, the
focus of further development in simulated evolution should remain "top-down". The ultimate advancement of
the field will, as always, rely on careful observation and abstraction of the natural process of evolution (for example,
artificial immune systems to solve complex economic problems).
There is an optimal lifetime at which almost all individuals in the population have high fitness. This
strategy is used in nature by viruses and bacteria. A host-parasite arms race is one of the most basic and
unavoidable consequences of evolution. It begins to look as if parasites are inevitable in any system of life.
Computer viruses have since become a worldwide problem, too. Sex is about disease. It is used to combat the
threat from parasites. Organisms need sex to keep their genes one step ahead of their parasites. Females add
sperm to their eggs because, if they did not, the resulting offspring would be identically vulnerable to the first
parasite that picked their genetic locks [7]. Sex stores genes that may currently be bad but have promise for reuse.
Computer simulation helps identify the conditions under which the evolution of the living world runs
forever. Sexual reproduction is typical for complicated creatures. GAs with sexual reproduction can increase
efficiency and robustness, and thus they can better track optimal parameters in a changing environment
[4], [7].
Diploid chromosomes increase the efficiency and robustness of GAs, and they can better track optimal
parameters in a changing environment. Sex is distinguished by a two-bit value stored in the chromosome [4]. It
can be demonstrated that standard GAs with haploid chromosomes are unable to correctly locate optimal
solutions for time-dependent objective functions. An interesting fact was found, namely that sexual reproduction
(SR) and the immune system (IS) use two types of selection working in parallel (IS: clonal and negative selection
[11]; SR: female and male selection [2]). Each selection mechanism solves one task (for example, convergence
or adaptation). The system as a whole then has both desired features.
ACKNOWLEDGMENTS
This work has been supported by MŠMT grant No. MSM 260000013.
REFERENCES
[1] PURVES, W. K., ORIANS, G. H., HELLER, H. C.: Life, the Science of Biology, Sinauer Associates, Inc. 1992.
[2] BLACKMORE, S.: The Meme Machine, Oxford University Press 1999.
[3] DAWKINS, R.: The Selfish Gene, Oxford University Press, Oxford 1976.
[4] RYAN, C.: Shades: Polygenic Inheritance Scheme, MENDEL'97, Brno, Czech Republic (1997) 140-147.
[5] KAUFFMAN, S. A.: The Origins of Order, Oxford University Press, New York 1993.
[6] KAUFFMAN, S. A.: Investigations, Oxford University Press, New York 2000.
[7] RIDLEY, M.: The Red Queen: Sex and the Evolution of Human Nature, Penguin Books Ltd. 1993.
[8] PRIGOGINE, I., STENGERS, I.: Order out of Chaos, Flamingo 1985.
[9] OŠMERA, P., MASOPUST, P.: Schedule Optimization Using Genetic Algorithms, Proceedings of
MENDEL'2002, Brno, Czech Republic (2002) 132-138.
[10] OŠMERA, P.: Complex Evolutionary Structures, Proceedings of MENDEL'2002, Brno, Czech Republic
(2002) 109-116.
[11] OŠMERA, P., ROUPEC, J.: Limited Lifetime Genetic Algorithms in Comparison with Sexual
Reproduction Based GAs, Proceedings of MENDEL'2000, Brno, Czech Republic (2000) 118-126.
[12] ROUPEC, J., OŠMERA, P., MATOUŠEK, R.: The Behavior of Genetic Algorithms in Dynamic
Environment, Proceedings of MENDEL'2001, Brno, Czech Republic (2001) 84-90.
[13] OŠMERA, P.: Complex Adaptive Systems, Proceedings of MENDEL'2001, Brno, Czech Republic (2001)
137-143.
[14] OŠMERA, P.: Genetic Algorithms and Their Applications, habilitation thesis, in Czech (2002) 3-54.
[15] DAVIS, L.: Job Shop Scheduling with Genetic Algorithms, International Conference ICGA'85 (1985) 132-138.
[16] MORTON, T. E., PENTICO, D. W.: Heuristic Scheduling Systems, John Wiley and Sons 1993.
[17] WALDROP, M. M.: Complexity: The Emerging Science at the Edge of Order and Chaos, Viking 1993.
[18] FOGEL, D. B.: A Brief History of Simulated Evolution, First Annual Conference on Evolutionary
Programming, La Jolla, USA (1992) 1-16.
[19] OŠMERA, P.: Complex Adaptive Systems, Proceedings of MENDEL'2001, Brno, Czech Republic (2001)
137-143.
[20] OŠMERA, P.: Complex Evolutionary Structures, Proceedings of MENDEL'2002, Brno, Czech Republic
(2002) 109-116.
[21] OŠMERA, P.: Genetic Algorithms and Their Applications, habilitation thesis, in Czech (2002) 3-114.
[22] WALDROP, M. M.: Complexity: The Emerging Science at the Edge of Order and Chaos, Viking 1993.
[23] COELLO, C. A., CORTÉS, N. C.: A Parallel Implementation of an Artificial Immune System to Handle
Constraints in Genetic Algorithms, WCCI 2002, Hawaii (2002) 819-824.
[24] SANCHEZ-VELAZCO, J., BULLINARIA, J. A.: Sexual Selection with Competitive/Co-operative
Operators for Genetic Algorithms, Proceedings of NCI'2003, Cancun, Mexico (2003) 308-316.
DIAGNOSTIC APPROACH TO THE HUMAN CAPITAL DEVELOPMENT
Alžbeta Kucharčíková, Jozef Vodák
Department of Macro and Microeconomics
Department of Managerial Theories
Faculty of Management Science and Informatics
University of Žilina
[email protected], tel. 00421-041-565 10 12
[email protected], tel. 00421-041- 565 10 17
Abstract: This article describes the role of selected diagnostic methods, above all the identification
and analysis of the education and other needs of firms and the balanced scorecard method, in
improving enterprise processes and developing human capital.
Keywords: enterprise diagnostics, STEEP analysis of external factors, SWOT analysis,
benchmarking, cost-volume-profit analysis, human capital, identification and analysis of the education
and other needs of firms, balanced scorecard method.
1. INTRODUCTION
Firms now operate in a new global economic age. The rapid introduction of new technologies, the opening of
the world market and increasing competition are some characteristics of the present situation in which
companies must live. Firms must search for ways to survive in this highly turbulent environment.
That means they must find out how to increase the quality of their production, how to serve their
customers better, how to improve their internal processes, how to improve the potential of their human resources and how
to measure all of these things. Good diagnostics is very important for efficiently increasing the ability of
companies to compete successfully on global markets.
2. ENTERPRISE DIAGNOSTICS
Enterprise diagnostics is a scientific approach that uses methods and techniques for identifying
problems in firms and for solving them. The advantage of this approach is its complex view of the
company and its business processes.
"Basically the term diagnosis originates from the Greek basis 'dia' = through and 'gnosis' = knowledge,
denoting the thorough knowledge which might metaphorically mean the distinguishing of phenomena and
processes, respectively the determination and defining of the state of phenomena and processes." [1]
There are several methods which may be applied for the diagnostic process, for example:
General methods:
· mathematical methods
· comparative methods
· time investigation methods
· factor investigation methods
Methods for evaluation of specific business processes
- traditional methods:
· analysis of production
· cost - benefit analysis
· analysis of competition
· marketing research
· financial analysis
- modern methods
· benchmarking
· cost-volume-profit analysis
· STEEP analysis of external factors
· SWOT analysis
· identification and analysis of education and other needs of firm
· balanced scorecard method
Benchmarking is a "continuous and systematic comparing of own performance in productivity, quality
and production process with the enterprises and organisations presenting top achievements. Benchmarking
looks for the forms how to achieve competitive advantage. Enterprise production, services providing and
procedures have to be continuously compared with the achievements of the best competition and reputable market leaders in all sectors." [2]
Cost-volume-profit analysis is a "systematic method of examining the relationship between changes in
volume and changes in total sales revenue, expenses and net profit. As a model of these relationships, cost-volume-profit analysis simplifies the real world conditions which a firm will face. Like most models which are
abstractions from reality, cost-volume-profit analysis is subject to a number of underlying assumptions and
limitations, yet it is a powerful tool for decision making. The objective of cost-volume-profit analysis is to
establish what will happen to the financial results if a specified level of activity or volume fluctuates." [3]
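The quoted cost-volume-profit relationship reduces to simple arithmetic over the contribution margin; a minimal sketch with invented figures (not taken from the paper):

```python
def break_even_units(fixed_costs, price, variable_cost_per_unit):
    """Volume at which total revenue exactly covers total costs."""
    contribution = price - variable_cost_per_unit   # margin each unit contributes
    if contribution <= 0:
        raise ValueError("price must exceed the variable cost per unit")
    return fixed_costs / contribution

def net_profit(units, fixed_costs, price, variable_cost_per_unit):
    """Financial result at a specified level of activity (the CVP model)."""
    return units * (price - variable_cost_per_unit) - fixed_costs

# Hypothetical example: fixed costs 20,000; unit price 50; variable cost 30.
units_needed = break_even_units(20_000, 50, 30)    # 1,000 units to break even
result = net_profit(1_500, 20_000, 50, 30)         # profit at 1,500 units sold
```

This makes the "what will happen if volume fluctuates" question directly computable: evaluating net_profit at different activity levels shows how results move around the break-even point.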
STEEP analysis - external factors analysis
This analysis consists of analysing the forces in the social, technical, economic, environmental and
political areas which influence the company. When characterising each of the starting points, it is suitable to begin with an
analysis of the external environment. The performance of the company depends significantly on external forces and on the
interactions between the company and its external environment. Good management is able to achieve excellent
results even in a not very positive entrepreneurial environment. That means good management is able to eliminate
external forces unfavourable to the company's prosperity, or at least to eliminate their impacts.
SWOT analysis
This tool consists of two views inside the company and two views outside it. The look inside
consists of the strengths and weaknesses of the company. The most important strengths are usually skilled
employees, management, organisational culture and the ability to produce the company's products efficiently in
the expected quality. Opportunities and threats lie outside the company: the often-changing
entrepreneurial environment, changing competitive conditions, changing needs and expectations of customers,
and the economic situation in the territory. A good SWOT analysis is a very suitable diagnostic tool for the management of
the company.
The application of facility management makes it possible to use some diagnostic methods, too. "Facility
management, for example in the building trade, is a useful method for the future owners of buildings of various ages,
technical conditions and facilities. It should be of benefit by optimising all supportive processes, specifying costs
and reducing them through more effective area utilisation. Investors as well as developers' organisations can benefit
from facility management as well. The application of facility management helps them construct buildings of high
practical value at low operation costs, providing projects of high quality and all services for tenants." [4]
3. THE HUMAN CAPITAL DEVELOPMENT
In the present age of globalisation, extensive industry changes are taking place. Only
firms which invest in new technologies and in the people working with these technologies are successful. People are the major
business asset. Firms need highly skilled workers, and so human capital is one of the main factors in the
growth of firms' ability to compete.
Human capital is the sum of the inborn or acquired knowledge, competencies, skills and experience of
individuals. Investment in human capital is aimed at developing the knowledge and skills of workers. Firms
invest in human capital when they expect future benefits. Investments in human capital are realised as
investments in education and training. "Correctly organised education and the systematic development of employees'
competences is a continuous process. This process of employee education and development is connected with
the firm's mission, objectives, philosophy, corporate culture, strategy, and policy of managing and developing
human potential." [5]
There are several benefits of investment in education, for example growth of production, services,
quality and labour productivity, decreases in costs, other innovations, high-quality relationships with customers, and
growth of competitive ability on the market.
Application of the diagnostic approach to human capital development can be useful for growing the
quality of business processes and for effective business resource decisions. Using the modern diagnostic
method of identification and analysis of the education and other needs of the firm is very effective in this case.
4. IDENTIFICATION AND ANALYSIS OF THE EDUCATION AND OTHER NEEDS OF THE FIRM
This method should identify skills gaps as well as the other factors behind performance problems in the
firm. The identification process starts by answering questions such as:
· Is performance of that skill really essential?
· Is the employee rewarded for using that skill?
· Does management discourage that behaviour?
· What other barriers to performance exist?
The aim of this method is to identify the performance problems of an employee from their symptoms and to identify
the reasons for these problems. The reason can lie in changes in the motivation and behaviour of people or in
the firm's environment. The method's result is a list of people's education and other needs, a design of the education
programme and a design for solving the other problems. Evaluation of the obtained data also gives us the possibility to
identify other organisational problems.
[Fig. 1 (flow diagram): Data collection (information, standards) → Data analysis (identification of achievement problems; identification of problem causes) → Priorities of problem spheres → Identification of education needs / Identification of other needs and measures → Design of the education programme.]
Fig. 1: Diagnostic model of the identification and analysis of education and other needs process
5. USING THE BALANCED SCORECARD IN DEVELOPING HUMAN POTENTIAL
Companies are concerned with questions about their strategy (its creation and implementation) that will help
them create the prerequisites for success in highly competitive markets. A necessary part of the strategy is the
development of human resources, too. Therefore companies invest a great deal of their resources in this field. It is
necessary to invest these resources cost-effectively. This is why companies try to find suitable learning
methods for developing their employees' potential, and suitable methods for diagnosing and evaluating
these learning processes. We want to point to one possible view of this problem: developing human
potential in the management of a modern company via the Balanced Scorecard (BSC) approach. This model aims to
achieve high effectiveness by investing in the development of human resources in order to achieve the company vision and
strategy. The Balanced Scorecard seems a very useful tool for creating a company strategy and implementing it
in everyday life. There are other reasons why it is useful to use the Balanced Scorecard, too. One of them is the need to
look forward and create a learning organisation.
The learning organisation is characterised by developing, obtaining and carrying knowledge and by
modifying its behaviour in response to new knowledge and opinions. In other words, a learning organisation develops
its ability to react, adapt and profit from changes in its external and internal environment. The word "learning"
stresses a focus on knowledge and competencies and is tightly connected with intellectual capital. The learning of
individuals is important as a base for the collective learning of the organisation.
The Balanced Scorecard can be used as a management system for managing the organisation or as a very effective
measurement and diagnostic tool.
The Balanced Scorecard helps significantly to match strategy and its implementation to the company culture and
helps to recognise, understand and motivate employees. The next picture shows the basic cause-and-effect principles
running through the individual perspectives of the Balanced Scorecard. This picture conveys a clear understanding of the method
and of the causality towards high-level goals and the company vision. The strategic focus of an organisation begins with
a translation of the strategy into operational terms. The Balanced Scorecard is a very good method that helps translate
strategy into operational objectives that drive both the behaviour and the performance of employees.
VISION and STRATEGY
Financial perspective: "To satisfy our shareholders, what financial objectives must we accomplish?"
Customer perspective: "To achieve our financial objectives, what customer needs must we serve?"
Internal processes perspective: "To satisfy our customers and shareholders, in which internal processes must we excel?"
Learning and development perspective: "To achieve our goals, how must our organisation learn and innovate?"
Fig. 2: Perspectives in BSC and basic causality relations in the Balanced Scorecard methodology
A suitable system of evaluation together with a suitable reward system can very significantly motivate the
employees of the company. Measurement connected with strategy, and evaluation of performance with
feedback on strategy, form a system that realises organisational learning at the strategic level (double-loop learning).
Realisation of strategy implementation using the potential of the Balanced Scorecard method brings:
· Conformity of objectives at all levels of the organisation with its strategy.
· More effective measurement and control of the organisation's performance.
· Realisation of strategic feedback and creation of a clear communication platform.
· Maximisation of the effectiveness of investment in human resources and other investments, too.
The learning and development perspective of the Balanced Scorecard is the basic precondition for successfully
achieving strategic company goals. In this perspective the company develops goals and measures which support the
learning and development of individuals and of the whole organisation. Setting goals in the financial, customer and
internal-process perspectives determines where the organisation must achieve excellent results in order to
strongly increase its performance.
The goals in the learning and development perspective create the preconditions for successfully achieving the goals in the
other perspectives and move the organisation towards fulfilling its vision. An organisation which wants to achieve success and fulfil
stakeholders' expectations must invest in an infrastructure which consists of people, systems and procedures. Successful
implementation of the Balanced Scorecard methodology helps to keep these ideas in mind and to fulfil them.
Willingness to invest effort and money into this perspective is a precondition for a clear interconnection with goals
at all corporate levels and for understanding the causal relations among perspectives and goals. Very important is the
question of evaluating investment in employees.
Basic areas of this perspective are:
· Capability of systems (including the information system).
· Leadership, motivation, empowering, involvement.
· Capability and skills of employees.
Some examples of measurements which can be useful for the learning and development perspective with a
focus on human resources:
· Leadership index (Number /No/)
· Motivation index (No)
· Number of employees (No)
· Employee turnover (%)
· Average employee years of service with company (No)
· Average age of employees (No)
· Time in training (days/year) (No)
· Temporary employees / permanent employees (%)
· Share of employees with university degrees (%)
· Average absenteeism (No)
· Number of women managers (No)
· Number of applicants for employment at the company (No)
· Empowerment index (No), number of managers (No)
· Share of employees less than 40 years old (%)
· Per capita annual cost of training ($)
· Full time or permanent employees who spend less than 50% of work hours at a corporate
facility (No)
In the learning and development perspective companies often use some of the mentioned measurements. It is
important to know that every measurement must match the concrete conditions and needs of the
company. Probably only the ability of a company to learn more quickly than its competitors can be a
sustainable competitive advantage for the future. For employees, seeing their own growth means motivation and
a kind of satisfaction.
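Several of the listed measurements are simple ratios over personnel records. A hypothetical sketch of how a few of them might be computed (field names and data are assumptions, not from the paper):

```python
def employee_turnover_pct(leavers, average_headcount):
    """Employee turnover (%) over a reporting period."""
    return 100.0 * leavers / average_headcount

def share_with_degree_pct(employees):
    """Share of employees with university degrees (%)."""
    with_degree = sum(1 for e in employees if e["has_degree"])
    return 100.0 * with_degree / len(employees)

def average_years_of_service(employees):
    """Average employee years of service with the company."""
    return sum(e["years_of_service"] for e in employees) / len(employees)

# Invented sample personnel records for illustration.
staff = [
    {"has_degree": True,  "years_of_service": 10},
    {"has_degree": False, "years_of_service": 2},
    {"has_degree": True,  "years_of_service": 6},
    {"has_degree": False, "years_of_service": 2},
]
```

The point of the sketch is that once such metrics are defined as functions of the company's own records, they can be tracked period by period against the scorecard targets rather than estimated ad hoc.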
Training and learning activities are important for developing employee potential. Training means participating
in training programmes that develop knowledge, skills and attitudes. It is important to note that in training people first
of all obtain the skills which are important for their everyday work and for their work performance.
The basic ideas about managing employee development are based on the assumption that the human resources
strategy contributes to the corporate strategy. This idea depends on how far the organisation allows
employees to create added value; they are therefore a strategic resource of the organisation.
Through clear direction, clear goals and effort concentrated in one direction, it is possible to achieve
more with less energy; this is the effect of synergy. In connection with developing human resources it means that by
focusing on clear needs, which are derived from the goals and metrics of the organisation (by the BSC
methodology), a company can achieve its strategic goals more effectively than in the traditional way.
The next important dimension of human resources development is the evaluation of training projects. In this
model of developing potential in a modern company, a four-level model is therefore used for evaluating investment
in training projects.
6. CONCLUSION
When the organisation's budget is ratified, investment in developing human resources is often in
last place, but when restrictions are needed it is in the front line. Management knows that investments in
human resources are important, but the connection between development and training programmes and the achievement of
organisational goals is not clear enough. We believe that appropriate application of the above-mentioned diagnostic
approaches can help managers in our organisations change this situation and in that way help their organisations.
This work was supported by grant VEGA 1/0499/03.
REFERENCES:
[1] TOKARČÍKOVÁ, E., KUCHARČÍKOVÁ, A.: Marketing Diagnostics of 3G Mobile Communication
Possibilities. TRANSCOM 2003, University of Žilina, 23.-25.6.2003, pp. 369-373.
[2] STRIŠŠ, J., REŠKOVÁ, M.: Benchmarking in Service Enterprises. ICSC 2003, European Polytechnical
Institute Kunovice, Czech Republic, 30.-31.1.2003, pp. 45-50.
[3] CHROMJAKOVÁ, F., KNEŽNÍKOVÁ, M.: Cost-Volume-Profit Analysis as an Important Input for Financial
Analysis in the Industrial Enterprise. TRANSCOM 2001, University of Žilina, 25.-27.6.2001, pp. 231-235.
[4] CHODASOVÁ, Z.: Application of Facility Management in Slovakia. Ekonomické a riadiace procesy
v stavebníctve a v investičných projektoch, Slovak University of Technology in Bratislava, Faculty of Civil
Engineering, 16.-17.9.2003, pp. 133-136.
[5] BLAŠKOVÁ, M.: Riadenie a rozvoj ľudského potenciálu. EDIS Žilina, 2003.
MARKETING ACTIVITIES OF SLOVAK BUSINESSES
AFTER THE ENTRY TO EU
Jozef Strišš1), Monika Rešková2)
1) European Polytechnical Institute, Osvobození 699, 686 04 Kunovice,
Phone: +420 572 549018, e-mail: [email protected]
2) University of Žilina, Faculty of Management Science and Informatics,
Department of Management Theories,
Moyzesova 20, 010 26 Žilina, Slovak Republic, tel., fax: +421 41 5652 775,
e-mail: [email protected]
Abstract: With entry to the EU, new possibilities for operating on foreign markets will be created for
Slovak businesses; at the same time, they will have to confront foreign competition. The
paper deals with the marketing activities that Slovak businesses will have to adopt if they want to
be successful on foreign markets. The key marketing activities are: adaptation to global market
conditions, choice of target markets and orientation on concrete segments, product innovation,
development of services, use of information technologies in sales and communication, and new forms of
communication.
Keywords: marketing, marketing communication, global markets, information technologies, e-business
In May 2004, the Slovak Republic and other countries will be accepted into the EU. Joining the economically
advanced European countries will mean an outstanding change for Slovak business as well. New opportunities related
to new, free markets will arise, but competition conditions will also change. With simpler access to the markets of the
acceding states open to all EU countries, international companies with rich know-how and long experience of operating
under market-economy conditions will be at an advantage. Businesses from the acceding countries, to which Slovakia
also belongs, will have to become better and more effective in the fields of production quality, work productivity
and cost reduction, operate more assertively on the market, and enhance communication with the external environment.
The successful entry of Slovak businesses into European Union markets is determined by a marketing
approach and intensified marketing activities. Marketing itself can be the decisive factor in ensuring effectiveness and
prosperity.
Marketing activities have to be developed right now; after EU entry, their absence will have a negative
impact on market success. The most significant marketing activities are:
Adaptation to global market conditions. Global marketing is not only entering new markets and
selling a single product; it also requires the ability of the enterprise to understand individual markets and differentiate
them according to social, economic, cultural and national specifics. Using a global marketing strategy, the enterprise
tries, building on its own strengths, to exploit all opportunities available on world markets while respecting the
differences among individual regions. The enterprise should learn to sell globally, but above all to think and plan on a global scale.
Selection of target markets and orientation towards concrete segments. A basic marketing activity is
the selection of target markets, i.e. target customers. Every enterprise can offer its target market something special
and thereby differentiate itself from the competition. Selecting a target market should be done with special
care. When selecting the decisive market segments, it is not enough to find the target customer group; it is also
necessary to understand how the customer thinks, to focus the strategy on customer acquisition, to create a business
atmosphere (the ability to sell products), to orient advertising towards the target customer group and the area of sale,
to evaluate (explore) the customer, and to educate the customer in using new products. The activities of all the
enterprise's bodies, every process, have to be oriented towards the customer. This concerns not only marketing and
commercial departments but the whole enterprise, from management to executive employees. The enterprise has to
evaluate its own activities through the eyes of the customer, and its results should be focused on satisfying the
customers. The enterprise's objective should be a permanent relationship with customers and their attachment to the
enterprise. It is important to estimate the customer's value for the enterprise, that is, the profit that a long-term
(5 to 10 years) relationship will bring. On that basis, the enterprise determines the expediency of the contact and the
resources it will invest in the customer (developing new products, delivery channels). The basis of every business's
existence is customer need. It is necessary to understand that the customer does not look for the product itself, but is
searching for a solution to a problem and for a benefit.
Product innovation. Global marketing does not cover only entering new markets and
communicating and selling a single product, but also its modification on a global scale: modifying the product's
design and adapting it, in some respects jointly on all markets, while respecting national and
cultural specifics and norms. The product should reach the required standards on both foreign and domestic markets.
Global marketing also includes the enterprise's ability to understand the markets and to differentiate them in terms
of cultural and national differences.
A definite condition for success on the market is:
· permanent innovation and adaptation of products in line with the dynamics of demand, ensuring, apart from
big innovations, also smaller ones, in which experience with competitors' products can be used; such products
and services can then be improved further. When innovating, it is necessary not to forget customer
wishes and to develop and produce products directly according to individual customer needs;
· high product quality: complete functional perfection based on the newest scientific knowledge, adapted to
customer needs. An enterprise can reach outstanding results if its products go far beyond customer
expectations and have characteristics that will surprise the customer.
Services development. In the past few years, the services market has been growing rapidly. Not only
is the share of services as individual products growing, but services that form part of the main product are also
emerging. The basic principles that marketing has to respect are innovation and quality. Innovation can be based on
better organisation of services, or on providing complementary services that the competition does not offer. It is
suitable to use the enterprise's traditions when offering new services. Services marketing has to insist on the quality
of the services provided. High standards, quality control and strong motivation of employees oriented towards
quality objectives have to be a matter of course.
Using new information technologies when acquiring commercial partners. A quick reaction to
impulses in the enterprise's surroundings is a basic factor of business success. A fundamental element of success is
time. The enterprise has to react quickly to all changes related to demand, revenues, collision situations on
markets and other problems. Building a quality marketing information system has to be centred on information from
the enterprise's surroundings. Many businesses lack marketing information systems providing data about market
changes, technical development, competitors' programmes etc. New information technologies enable the
enterprise to create information and logistics networks and intra-enterprise relations that lead to the
optimisation of production, distribution, communication and market activities and to cost reduction.
New forms of communication with the market. Slovak enterprises will have to realise significant
changes in the field of marketing communication. In past years, the signals from the market have been clear: the
customer needs more than the classic forms of marketing communication aimed at a one-way information flow. The
market is becoming demanding and requires dialogue. This need not be a personal meeting; it can use modern
forms of dialogue through interactive Internet facilities, e-mail, advertisements, questionnaires etc. These modern
forms enable very fast product ordering via on-line systems, optimal stock keeping and permanent customer
information. The system of dialogue marketing is based on four fundamental phases:
· awareness: offering the customer basic information, with the possibility of feedback and a
feeling of convenience;
· interest: convincing the customer, through addressed dialogue or a personal meeting, of the
uniqueness of the offer and the product;
· preference: the dialogue has to be clear and straightforward, with the purpose of winning the customer
over for a purchase;
· sale: achieving a friendly and long-term relationship between the customer and the
enterprise, even after the purchase.
The use of new information technologies in recent years can also be seen in business conducted via the
Internet, i.e. in e-business. E-business includes the use of technologies, processes and management experience that
improve the competitiveness of the enterprise through the strategic use of electronic information. E-business is a new
way of doing business, one based on the global openness of economies brought about by the development of
information technologies, which integrates all parts of the world into one "global village". The term e-commerce
means the exchange of money, goods, services and information in electronic form. In connection with e-commerce,
there is also the term e-economy. E-economy means the environment in which e-commerce runs, and it easily calls
into question some basic macroeconomic theses. When defining e-business, we have to see e-commerce as
a factor that provides the possibility of development. It is built on the advantages and structure of traditional
commerce, but it uses the flexibility that electronic media and networks offer. An enterprise's e-commerce can
connect critical business processes directly with its key parties. The key parties can be customers, employees,
salesmen, suppliers, business partners and anyone else having an influence on the business.
21st-century technology smooths business and commerce. To be successful, enterprises have to
understand the basic elements of e-commerce. E-commerce is not an ultimate objective in itself, but a developing
process that enables the enterprise to work better. Enterprises transforming their business styles know this, and that
is why they re-examine their own strategies, techniques and tools in the light of new technologies. Companies come
in different shapes and sizes, so there is no single set of e-commerce technologies suitable for all of them. That is the
keystone of e-commerce: using technologies to build better relationships with customers, suppliers and employees.
Corporate identity is the philosophy of an enterprise's operation and survival in a given market space,
creating the corporate image of the enterprise, a common factor for all its activities. Corporate identity is the mix of
corporate culture (the enterprise's behaviour towards customers, shareholders, the public and employees), corporate
design (the external image of the enterprise: logo, symbols, colours, shapes) and corporate communication
(advertising, PR, information). In a competitive environment, it is often corporate identity, the form of the
enterprise's communication, its trademarks and its behaviour towards the public, that can persuade and win over the customer.
Creating relationships with customers. Lately, enterprise success has been based on creating
relationships with customers. The customer is increasingly becoming a partner with whom the enterprise creates not
only formal but very often informal, friendly relationships. A typical example is post-sale marketing.
The character of post-sale marketing can be summed up in the slogan: "The purchase of a product or service does not
mean the end of the relationship with the customer; it only begins it." The process of post-sale marketing includes a
wide complex of activities that can be grouped into the following areas:
· processing information files about customers,
· elaborating methods and forms of opening and maintaining dialogue with customers,
· elaborating arrangements for maintaining and improving customer satisfaction,
· elaborating arrangements for winning back lost customers.
Benchmarking – learning from the best. Benchmarking is a permanent and systematic process of
comparing and measuring the products, processes and methods of one's own organisation against those that have
been found suitable for such measurement, with the aim of defining objectives for improving one's own activities.
The practical meaning of benchmarking:
· it helps towards a better understanding of customer requirements,
· it enables managers to gain information that would otherwise be only the outcome of a random process,
· it is a way of finding unbiased indicators for measuring one's own performance and productivity,
· it is one of the most effective processes for gaining impulses for one's own improvement.
In conclusion, it can be said that entering the EU does not have to be a nightmare for our enterprises.
By applying the principles of marketing and understanding the customer, his needs and his satisfaction, Slovak
enterprises can also succeed on foreign markets and hold their own in competition with foreign
enterprises.
REFERENCES:
[1] HORÁKOVÁ, I. a kol.: Strategie firemní komunikace. Management Press. Praha. 2000.
[2] JEDLIČKA, M.: Propagačná komunikácia podniku. MAGMA. Trnava. 2002.
[3] KOTLER, P., Armstrong, G.: Marketing. GRADA. Praha. 2004.
[4] MATEIDES, A., ĎAĎO, J.: Služby. EPOS. Ružomberok. 2002.
[5] SMITH, P.: Moderní marketing. Computer Press. Praha. 2000.
THE DECISION-MAKING PROCESS MODELLING
Štefan Hittmár
University of Žilina, Moyzesova 20, 010 26 Žilina, Slovakia, tel.: +421 41 5652 775
e-mail: [email protected], www.fria.utc.sk
Abstract: Decision making is a fundamental part of management because it requires choosing among
variant courses of action. Decision making is a process of eight steps that include identifying a
problem, selecting a variant, and evaluating the decision's effectiveness. In the following, a model
of the decision-making process in management is presented.
Keywords: Management, Analyzing, Decision-making, Process, Information, Problem, Criteria,
Variant, Implementing, Evaluating
1. INTRODUCTION
Management styles vary. Some managers follow their instincts and deal with situations on a case-by-case basis. Other managers use a more systematic and structured approach to making decisions.
Decision making is part of every aspect of the manager's duties, which include planning, organizing,
staffing, leading, and controlling. For example, managers can formulate planning objectives only after making
decisions about the organization's basic mission. To accomplish the objectives within some time period,
decisions must be made on what resources are required. When pursuing company objectives, management must
make decisions on the division of labor and reporting relationships. Prospective staff members are identified
and selected according to what the established positions require. The organization pursues its objectives
through countless daily decisions required to make or sell the product or service. And decisions about whether
to take corrective action when performance does not measure up to standards affect operations.
Decision making is typically described as "choosing among variants". But this view is too simple.
Why? Because decision making is a comprehensive process, not just a simple act of choosing among variants.
2. MODEL OF DECISION-MAKING PROCESS
The decision-making process is illustrated in Figure 1. This process is a set of eight steps that begins
with identifying a problem and decision criteria, and allocating weights to those criteria; moves on to developing,
analyzing, and selecting a variant that can resolve the problem; implements the variant; and concludes with
evaluating the decision's effectiveness.
Step 1. Identification of a Problem
Step 2. Identification of Decision Criteria
Step 3. Allocation of Weights to Criteria
Step 4. Development of Variants
Step 5. Analysis of Variants
Step 6. Selection of a Variant
Step 7. Implementation of the Variant
Step 8. Evaluation of Decision Effectiveness
Figure 1. The Model of the Decision-Making Process
Step 1: Identifying a Problem
The decision-making process begins with the existence of a problem or, more specifically, a discrepancy
between an existing and a desired state of affairs.
Before something can be characterized as a problem, managers have to be aware of the discrepancy,
they have to be under pressure to take action, and they must have the resources necessary to take action. (See
Figure 2.)
How do managers become aware that they have a discrepancy? They obviously have to make a
comparison between their current state of affairs and some standard. What is that standard? It can be past
performance, previously set goals, or the performance of some other unit within the organization or in other
organizations.
Awareness of discrepancy + Pressure to act + Sufficient resources to do something → Problem
Problem: A discrepancy between an existing and a desired state of affairs.
Figure 2. The Characteristics of a Problem
Step 2: Identifying Decision Criteria
Once a manager has identified a problem that needs attention, the decision criteria important to
resolving the problem must be identified. That is, managers must determine what is relevant in making a
decision.
Whether explicitly stated or not, every decision maker has criteria that guide his or her decision. Note
that in this step of the decision-making process, what is not identified is as important as what is.
Step 3: Allocating Weights to the Criteria
The criteria listed in the previous step are not all equally important so the items must be weighted in
order to give them the correct priority in the decision.
How does the decision maker weight criteria? A simple approach is merely to give the most important
criterion a weight of 10 and then assign weights to the rest against this standard. Thus, compared with a criterion
that you gave a 5, the highest-weighted factor would be twice as important. Of course, you could use 100 or 1,000 or any
number you select as the highest weight. The idea is to use your personal preferences to assign a priority to the
relevant criteria in your decision, as well as to indicate their relative importance by assigning a weight to each.
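The weighting rule above can be sketched in a few lines of code. The criteria and weights below are invented for illustration; normalizing the raw weights makes each criterion's share of the total priority explicit.

```python
# Hypothetical criteria for a decision; the weights follow the rule above:
# the most important criterion gets 10, the rest are scaled against it.
raw_weights = {"price": 10, "reliability": 8, "delivery_time": 5, "service": 3}

# Normalizing the weights shows relative importance directly: a criterion
# weighted 5 counts half as much as the one weighted 10.
total = sum(raw_weights.values())
normalized = {name: w / total for name, w in raw_weights.items()}

for name, share in sorted(normalized.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {share:.2f}")
```

Whether the scale runs to 10, 100 or 1,000 does not matter; only the ratios between weights affect the final ranking.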
Step 4: Developing Variants
The fourth step requires the decision maker to list the viable variants that could resolve the problem.
No attempt is made in this step to evaluate these variants, only to list them.
Step 5: Analyzing Variants
Once the variants have been identified, the decision maker must critically analyze each one. The
strengths and weaknesses of each variant become evident as they are compared with the criteria and weights
established in steps 2 and 3. Each variant is evaluated by appraising it against the criteria.
Step 6: Selecting a Variant
The sixth step is the crucial act of choosing the best variant from among those listed and assessed.
Since we have determined all the pertinent factors in the decision, weighted them appropriately, and identified
the viable variants, we merely have to choose the variant that generated the highest score in step 5.
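Steps 3 to 6 together amount to a weighted-sum decision matrix, which can be sketched as follows. All criteria names, weights and scores here are invented for illustration only.

```python
# Step 3: weights for the decision criteria (10 = most important).
criteria_weights = {"cost": 10, "quality": 7, "speed": 4}

# Steps 4-5: each viable variant is appraised against every criterion,
# here on a 1-10 scale.
variants = {
    "variant_A": {"cost": 6, "quality": 9, "speed": 5},
    "variant_B": {"cost": 8, "quality": 6, "speed": 7},
    "variant_C": {"cost": 5, "quality": 8, "speed": 9},
}

def weighted_score(scores: dict, weights: dict) -> int:
    """Step 5: sum of (criterion weight x variant score) over all criteria."""
    return sum(weights[c] * scores[c] for c in weights)

# Step 6: choose the variant with the highest weighted score.
best = max(variants, key=lambda name: weighted_score(variants[name], criteria_weights))
print(best)  # variant_B scores 150, ahead of variant_A (143) and variant_C (142)
```

The sketch assumes criteria scores are already on a common scale; in practice, raw measurements (price in euros, delivery in days) would first have to be converted to comparable scores.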
Step 7: Implementing the Variant
While the choice process is completed in the previous step, the decision may still fail if it isn't
implemented properly. Therefore, step 7 is concerned with putting the decision into action.
Implementation includes conveying the decision to those affected and getting their commitment to it.
Groups or teams can help a manager achieve commitment. If the people who must carry out a decision
participate in the process, they are more likely to enthusiastically support the outcome.
Step 8: Evaluating Decision Effectiveness
The last step in the decision-making process appraises the result of the decision to see whether the
problem has been resolved. Did the variant chosen in step 6 and implemented in step 7 accomplish the desired
result?
What happens if, as a result of this evaluation, the problem is found to still exist? The manager then
needs to dissect carefully what went wrong. Was the problem incorrectly defined? Were errors made in
evaluating the various variants? Was the right variant selected but improperly implemented? Answers to such
questions might send the manager back to one of the earlier steps. It might even require starting the whole
decision process over.
CONCLUSION
The decision-making process is a very important part of managerial functions. It should be realised as a
systematic and structured approach in every management activity. The "eight-step model" is a tool for the
improvement of managerial work.
REFERENCES:
[1] ROBBINS, S. P., COULTER, M.: Management. Prentice Hall, Inc., 1999.
[2] HITTMÁR, Š.: The decision-making process. MOSMIC, VŠD Žilina, 1997.
MARKETING AND ITS GLOBAL ORIENTATION AT THE TURN OF THE CENTURY
Milan Jedlička
MtF STU Trnava, Paulínska 16, 917 26 Trnava, SLOVAKIA
phone: +42133551032, e-mail: [email protected]
Abstract: The exploitation of marketing in the global market has to have systemic fundamentals. The
expression of these systemic fundamentals particularly depends on the following decisive premises:
· the mutual synergy of analytic and synthetic actions,
· the continual and consecutive execution of marketing actions,
· the mutual synergy among the several managing elements that form part of marketing management,
· the creation of various decision variants which respond dynamically to the changing market
situation,
· the provision of feedback between plans and the final evaluations of marketing activities, so that
experience and knowledge can be exploited in further strategic decision making,
· marketing communication and propagation communication, their essence and significance in
business practice,
· the supply-to-demand and demand-to-supply philosophical orientations of propagation
communication,
· personal selling, sales promotion, public relations and advertising as the main techniques for
activating propagation communication,
· the hierarchy of business communication.
Keywords: market, marketing manager, customer, global marketing, marketing communication,
propagation communication
1. INTRODUCTION
Marketing, as a branch of and a particular kind of management activity, is developing rapidly at present. This
trend, which began in the post-war period (especially in the sixties), has become more and more dynamic.
It is bound up with several factors, of which the following can be identified as decisive:
· the increasing technological level, and thereby the level of product manufacturing, in individual commodities,
· the branching out of supply, not only through new material products but especially through services,
· the gradual territorial development and expansion of the world-wide market,
· the forming of new types of managers, with more complex training and approaches to the customer,
· the use of new theoretical and practical knowledge and skills,
· the huge development of informatics and the consequent increase in communication quantity and quality,
· the growth of purchasing power almost all around the world, etc.
Of course, this trend has been supported in recent years by the transformation of many states from "directive forms
of management" to market economies.
The most advanced states of the socialist type had very well developed industrial production, a high level of
education and a solidly built infrastructure. These were positive starting points for economic
transformation.
But to achieve effective change, there was, there is and there will be a lot to do. This is also not
possible without appreciating some of the most important trends in the development and use of marketing in real
management practice.
Aim
This paper therefore aims to call attention to some trends in marketing related to globalisation.
These are, however, generally formulated philosophical starting points, which every enterprise must re-transform
to its own conditions.
Problem Analysis
Although most experts very quickly understand the fundamental idea of the transformation process (if we
want to create a market and market relations, it is not possible without marketing), concrete practice already shows
something different.
Let me generalise some of the observations I have gained over a long period of activity in
economic practice, but also at the University, especially during special training courses
for managers. I have taken part in about eighty courses for low, middle and top management. Besides that, I have
acted as a consultant in about ten companies.
The decisive observations are the following:
· the necessity of special training for managers is understood by almost everyone, but it is always
others who need to be trained, not oneself,
· new organisational and decision-making approaches are accepted by everyone, but almost nobody uses
them,
· in many companies, the sign on the "commercial or sales department" was repainted to "marketing
department", and that was the end of the change,
· in the preparation of training courses, the most frequent requirement was: "teach us to do
marketing in one day", i.e. as quickly and cheaply as possible,
· there is minimal will to change well-running procedures, to think strategically and long-term, to awaken the
need for co-operation among all company departments, etc.
There are a lot of problems in the markets where the transformation of the economy is under way. We cannot
solve all of these problems, and therefore I will discuss the problem of marketing communication in this paper.
Marketing communication, and especially its specific field, propagation communication, is today one of
the determining activities of every company doing business on markets with strong competition.
Marketing communication has great significance in business activity at both the macro level and the micro level,
in the external and internal environment of the company. It provides the necessary communication
relations in the existing market reality, in supply-to-demand processes.
But it should not be understood only in this reduced sense, in the market environment, where for the most part it
came into being and branched out over several centuries, if we consider the documented propagation,
advertising or advertising artefacts. In a wider interpretation, we can speak of many millennia during which it
influenced not only economic but also social life.
In history these were rather singular cases in the more mature civilisations or economies. Today we can
observe that its significance has dramatically increased and its exploitation has a massive character. It is mass
communication at all levels of the company, engaged in by every business subject that wants to be successful
and strives to address and persuade almost every potential customer. Marketing communication mainly
influences and enlarges not only material social consumption but also immaterial consumption, represented by
many social and cultural manifestations. It has become a generally accepted accumulation of several activities and
professions, a connection of social, economic, political and artistic information. Its aim is not only to sell but also to
penetrate awareness, to influence thinking, or to amuse.
It creates a synergic unity of rational and irrational, pragmatic and emotional elements,
which in the final consequence most effectively influence the final decision of the demanding subjects.
As mentioned, marketing communication is also called promotion or propagation
communication. This is for the following reasons:
· in the European area the term "propagation" is typically used, as against the American area, where
the term "promotion" is used especially,
· the content of this word, even in its simplified form, very exactly represents its significance for this field
of communication,
· some authors use "advertising" as the covering term (e.g. J. Prachar), but most authors consider this
term only a subsystem of propagation (promotion), which also follows from its content.
Propagation can be considered the set of specific communication techniques and their active,
systematic, synergic and effective use for the benefit of company aims.
Sources both inside and outside the company (ideas, philosophies, techniques,
equipment and the like) are used to influence the market environment (the external as well as the internal
environment of the company), with the aim of spreading the desirable information and multiplying the total number
of customers preferring and using the company's supply.
It is decidedly a very arduous process, which needs to be performed by a very knowledgeable and
creative managerial team. To designate suitable aims, provide adequate procedures and gain the expected results, a
propagation strategy (a long-term plan of activities) must be worked out, together with its particular executable
plans for shorter time periods.
The philosophical starting points for the application of marketing communication
Making clear the substance of the philosophical orientation of marketing communication has great
significance. As soon as a company starts to consider its application, it must make sure of these main philosophical
starting points:
· the present, or expected, state of the market environment,
· the present, or expected, state of the company environment,
· the specification of the primary aim of propagation and the orientation of its influence.
Without the needful analysis of these main starting points it is not possible to create and use propagation
communication. It is therefore necessary to know the state of the market and of the company, and to define the
determining characteristics, strengths and weaknesses.
As a starting point for this analysis and for decision making, it can help to designate the determining factors
which influence social, or market, communication (see Figure 1).
(The figure places the communication environment at the centre, surrounded by six determining factors: need of
communication, motive of communication, ability of communication, social level, thought level and supply
assortment.)
Figure 1: Determining factors of the communication environment
On one side, it is necessary to consider factors connected with the customer and his individual state:
· need of communication (the basis of consumption behaviour, the substance of individual qualification and
quantification, the hierarchy of satisfaction of wants and their intensity, etc.),
· motive of communication (natural necessity, emotional reaction, motivation and involvement, etc.),
· ability of communication (individual disposition, qualities, preparation, skills, know-how, etc.).
On the other side, at the general social and market level, there are these determining factors:
· social level (the conditions a concrete society creates in the fields of law, economic tools, services, etc.),
· thought level (the state of the population as the demand-side communicator, which apart from
individual factors also has social characteristics, such as views of the political climate, the living standard,
social welfare, etc.),
· supply assortment (the quality and quantity of supply, communication techniques, tools, equipment and
forms, the competitive level of communication, etc.).
The result of such a primary analysis should be the designation of the core type of customer, his
immediate surroundings, the ability to influence and motivate others, the observation of social traditions, and
the designation of tendencies, trends, climate, etc.
From these starting points we can designate the main aims of the propagation impact and its primary orientation,
which can follow two primary philosophical tendencies:
· supply to demand,
· demand to supply.
The position of marketing communication in business practice
Marketing communication, or specifically propagation communication, has become the most important tool of the
marketing mix in marketing management, because if applied effectively it mainly
influences customers who are still undecided or indifferent. That was also the reason why the new management
method of "customer relationship management" came into existence.
While the product, price, distribution, place and other tools of the marketing mix have in their
activity more rationality and a more narrowly interpreted content, propagation is essentially less rigid, more
creative, more striking, and more reactive and dynamic. There is more of the human touch in it, of the propagator or
seller. A personal approach, individual care, a strong figure, something more than only selling goods for some price
and the money obtained: these are very important factors for the customer.
It is no accident that propagation has today become the first tool of the marketing mix. In present market practice it is the main anchoring point by which the company introduces its own supply and seeks to influence the market environment.
With ever-increasing competition and a growing number of offers on the market, information chaos arises, and even a certain disorientation of the customer. Without correctly addressing the customer and "winning him over to your own side", long-term market success can hardly be expected.
If the company wants to be heard and accepted, it has to change and creatively develop its communication approaches; it must have a clearly and cleverly conceived, and also executed, communication strategy. And it is precisely propagation, and consequently the propagation strategy, that forms the bearing columns of the company's communication activities in a market economy.
Just as the task of the marketing mix in its strategic expression is to use its individual tools properly, the same holds for the propagation strategy. In the prevailing part of the technical literature, the following four primary propagation tools are mentioned (2):
· personal selling,
· sales promotion,
· public relations,
· advertising.
INTERNATIONAL CONFERENCE ON SOFTCOMPUTING APPLIED IN COMPUTER AND ECONOMIC ENVIRONMENTS ICSC 2004
EPI Kunovice, Czech Republic, January 29-30, 2004
38
Their synergic combination, or multichannel effect on the customer, is sometimes also called the "communication mix" (4); for correct differentiation it is better to use the term "propagation mix" because, as already mentioned, the number of a company's communication tools is essentially wider than the four specific tools of propagation. The other tools of the marketing mix, that is product, price, distribution or place, also have an undeniable communication significance.
In a wider view, propagation communication is closely connected with, and derives from, the general company communication, or at a lower hierarchical level from marketing communication (see Figure 2).
[Figure 2 shows the hierarchy of company communication as nested information/communication systems: company information and communication system, marketing information and communication system, propagation information and communication system, and advertising information and communication.]
Figure 2: Hierarchy of company communication
Application of the CRM management method
There are several conceptions of applying the CRM method. If the conception is oriented on specific market segments, the company has to find a systemic approach to managing marketing communication with these segments, or with their main representatives (opinion leaders, etc.).
Following this conception (see Figure 3), the market is divided into four basic segments:
· general public (any member of the public),
· potential customers (who prefer the company),
· decided customers (an evident order),
· consumers (at least one purchase).
[Figure 3 shows the four market segments (general public, potential customers, decided customers, consumers) addressed by the propagation tools in active and reactive conceptions, with CRM applied to the consumer segment.]
Figure 3: The conception oriented on market segments
The main tasks in applying this conception approach are as follows:
· to determine better the content of the needs, motives and wishes of these segments,
· to determine the borders among the segments and also to specify the quantity and quality of their requirements,
· the tools of promotion or propagation communication (personal selling, sales promotion, public relations, advertising) are used, as well as CRM, with a further specification following the KCRM method (Key Customer Relationship Management), which concentrates especially on key customers.
Designed Solutions
The new trends, which should come in after overcoming the present failures, can be summarised into the five following suppositions:
· mutual synergy of analytic and synthetic actions,
· mutual synergy among several managing elements as a part of marketing management,
· continuity, flexibility and dynamics of the execution of marketing actions,
· creating various decision variants which will dynamically react to the changing market situation,
· providing feedback between plans and final evaluations of marketing activities, for the possible exploitation of experience and knowledge in further strategic decision making.
1. The supposition associated with the synergy of analytic and synthetic activities aims at a consistent systemic connection of market research, its segmentation into decisive parts, and the creation of marketing strategies following the specific requirements of its participants.
2. In advanced companies marketing is no longer only an individual activity, but a collective concept of marketing management. Harmony among the several managing elements, that is the methods, instruments, facilities and forms used for achieving market effects, is inevitable. This supposition is directly linked to the previous one, which it elaborates and deepens.
3. Continuity, flexibility and dynamics of the execution of marketing actions also belong to the systemic suppositions. The market, its elements and the relations among them are still complicated, and they have a mainly subjective and stochastic character. Therefore, if we want the information about the market to have a large telling ability, we must study the market for longer, watch its development, and consequently recognise the mutual relations among the various market subjects and objects. It is necessary to respond flexibly to the eternally changing conditions, following the supply-demand motion, and to use the various marketing instruments dynamically.
4. There must be the ambition to enforce strategic decision-making, but it is necessary to keep in mind that in this case it is typical to decide under risk and vagueness. Such a reality requires creating several decision-making variants for different situations in the market surroundings.
5. Feedback as a systemic condition results in the comparison of plans and results, and it gives the manager a realistic vision of the true state of the enterprise's activities. It is necessary to mutually connect and synchronise mainly regulation arrangements at the operative level and mainly inspection mechanisms at the strategic level.
CONCLUSION
Marketing, its orientation and its application at the turn of the centuries must become an immanent part of company management and must accept the latest trends of a market that is becoming more globalized.
It is not sufficient to discuss this problem, accept it in a formal way, and reduce it, for example, only to the problem of advertising or image creation.
At the pragmatic economic level, however, such a marketing orientation first of all creates outlays, while marketing should be used in a complex view to create market effects connected with an adequate profit.
Without understanding the basic content and substance of marketing communication, acquiring the necessary information, creating the philosophy and correctly assigning the aims oriented on customers, and without strategically oriented management, no serious result of entrepreneurial activities can be expected.
REFERENCES:
[1] DOHNAL, J.: Řízení vztahů se zákazníky. Grada Publishing, Praha, 2002.
[2] JEDLIČKA, M.: Propagačná komunikácia podniku. Magna, Trnava, 2000.
[3] JEDLIČKA, M.: Marketingový strategický manažment. Magna, Trnava, 2003.
[4] LABSKÁ, H.: Marketingová komunikácia. JUP, Nové Zámky, 1993.
METHODS USED IN MARKETING SEGMENTATION PROCESS
Monika Rešková
University of Žilina, Faculty of management science and informatics, Department of management theories
Moyzesova 20, 010 26 Žilina, Slovak Republic
tel: +421415134458, e-mail: [email protected]
Abstract: To compete successfully in today's volatile and competitive business markets, mass marketing is no longer a viable option for most companies. Marketers must attack niche markets that exhibit unique needs and wants. Market segmentation is the process of partitioning markets into groups of potential customers with similar needs or characteristics who are likely to exhibit similar purchase behaviour. Market segmentation is the foundation on which all other marketing actions can be based. It requires a major commitment by management to customer-oriented planning, research, implementation and control. The paper deals with the term market segmentation, describes the market segmentation process and focuses on the most frequent and useful techniques used in that process.
Keywords: Market segmentation, marketing strategy, positioning, targeting, variables
1. WHAT IS MARKET SEGMENTATION?
Market segmentation describes the division of a market into homogeneous groups which will respond
differently to promotions, communications, advertising and other marketing mix variables. Each group, or
“segment”, can be targeted by a different marketing mix because the segments are created to minimize inherent
differences between respondents within each segment and maximize differences between each segment.
Market segmentation was first described in the 1950s, when product differentiation was the primary marketing strategy used. In the 1970s and 1980s, market segmentation began to take off as a means of expanding sales and obtaining competitive advantages. Since the 1990s, target and direct marketers have used many sophisticated techniques, including market segmentation, to reach potential buyers with the most customized offering possible.
2. USING MARKET SEGMENTATION
There are many good reasons for dividing a market into smaller segments. The primary reasons are:
· Easier marketing. It is easier to address the needs of smaller groups of customers, particularly if they have many characteristics in common (e.g. they seek the same benefits, are of the same age or gender, etc.).
· Efficiency. Marketing resources are used more efficiently by focusing on the best segments for your offering: product, price, promotion, and place (distribution). Segmentation can help you avoid sending the wrong message or sending your message to the wrong people.
· Find niches. Identify under-served or un-served markets. Using "niche marketing", segmentation can allow a new company or new product to target less contested buyers and help a mature product seek new buyers.
3. CONDITIONS FOR USING MARKET SEGMENTATION
Any time you suspect there are significant, measurable differences in your market, you should consider
market segmentation. Identified segments must be:
· Big enough. The market must be large enough to warrant segmenting. Don't try to split a market that is already very small.
· Different. Differences must exist between members of the market, and these differences must be measurable through traditional data collection approaches (i.e., surveys).
· Responsive. Once the market is segmented, you must be able to design marketing communications that address the needs of the desired segments. If you can't develop promotions and advertising that speak to each segment, there is little value in knowing that those segments exist.
· Reachable. Each segment must be reachable through one or more media. You must be able to get your message in front of the right market segments for it to be effective. If one-eyed, green aliens are your best marketing opportunity, make certain there is a magazine, cable program or some other medium that targets these people (or be prepared to create one).
· Interested in different benefits. Segments must not only differ on demographic and psychographic characteristics, they must also differ on the benefits sought from the product. If everyone ultimately wants the same things from your product, there is no reason to segment buyers. However, this is seldom the case. Even commodities like sugar and paper plates can benefit from segmentation.
· Profitable. The expected profits from expanding your markets and more effectively reaching buyer segments must exceed the costs of developing multiple marketing programs, re-designing existing products and/or creating new products to reach those segments.
4. MARKET SEGMENTATION PROCESS
There are two basic ways to segment a market:
· A priori. A priori segmentation involves dividing a market into segments without the benefit of primary market research. Manager intuition, analysis of secondary data sources, analysis of internal customer databases or other methods are used to group people into various segments. Previous "post hoc" segmentation studies are considered to be "a priori" when applied to the same markets at some point in the future.
· Post hoc. Primary market research is used to collect classification and descriptor variables for members of the target market. Segments are not defined until after collection and analysis of all relevant information. Multivariate analytical techniques are used to define each segment and develop a scoring algorithm for placing all members of the target market into segments.
There are two types of information used in market segmentation:
Classification variables. Classification variables are used to classify survey respondents into market segments.
Almost any demographic, geographic, psychographic or behavioural variable can be used to classify people into
segments.
· Demographic variables: age, gender, income, ethnicity, marital status, education, occupation, household size, length of residence, type of residence, etc.
· Geographic variables: city, state, zip code, census tract, county, region, metropolitan or rural location, population density, climate, etc.
· Psychographic variables: attitudes, lifestyle, hobbies, risk aversion, personality traits, leadership traits, magazines read, television programs watched, PRIZM clusters, etc.
· Behavioural variables: brand loyalty, usage level, benefits sought, distribution channels used, reaction to marketing factors, etc.
Descriptor variables. Descriptors are used to describe each segment and distinguish one group from the
others. Descriptor variables must be easily obtainable measures or linkable to easily obtainable measures that
exist in or can be appended to customer files. Many of the classification variables can be considered descriptor
variables. However, only a small portion of those classification/descriptor variables are readily available from
secondary sources. The trick is to identify descriptor variables that effectively segment the market in the primary
research effort which are also available or can be appended to individual customer records in customer
databases.
The process of market segmentation, targeting and positioning is outlined in Figure 1. Phases 1 and 2 of
the process involve gathering and analysing market data to identify possible market segments. From this data a
range of possible market segments is identified which then requires further detailed analysing and planning.
These market segments are evaluated to determine which are worth targeting in terms of commercial viability
and potential growth. The selected segments will represent the target market that the organisation wishes to
reach, and the marketer moves on to phase 3. The product’s position in the market is determined and appropriate
marketing strategies developed and implemented.
Phase 1 – Segmentation
1. Market definition
2. Choosing the basis for segmentation
3. Dividing the market and profiling the segments
Phase 2 – Targeting
4. Evaluating the potential and commercial attractiveness of each segment
5. Choosing one or more segments
Phase 3 – Product positioning
6. Defining product positioning
7. Developing appropriate marketing strategies
8. Implementing marketing strategies
Figure 1: 8-step algorithm of market segmentation
5. METHODS OF MARKET SEGMENTATION
Most multivariate analytical techniques can be used and probably have been used in some way to create
post hoc market segments. There is no ideal methodology that works with every segmentation study. Each
methodology has advantages and disadvantages. Segmentation studies generally require the use of two or more
methodologies to produce the best results. In nearly every case, multiple techniques should be tested before
selecting the "best" solution. There are 3 categories of analytical techniques applied to market segmentation: data
preparation, data analysis, and classification.
Data Preparation
Numerous techniques can be used to aid the segmentation process.
Factor analysis.
Factor analysis can reduce the number of variables to a more manageable size while also removing correlations
between each variable.
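As a sketch of this data-preparation step, the following Python fragment reduces six correlated survey items to two latent factor scores. The data are synthetic, scikit-learn is assumed to be available, and all names are illustrative:

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

def reduce_survey_items(X, n_factors=2):
    """Compress correlated survey items into a few latent factor scores."""
    fa = FactorAnalysis(n_components=n_factors, random_state=0)
    return fa.fit_transform(X)

# Synthetic data: 200 respondents, 6 rating items driven by 2 latent attitudes.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
loadings = rng.normal(size=(2, 6))
X = latent @ loadings + 0.3 * rng.normal(size=(200, 6))

scores = reduce_survey_items(X, n_factors=2)
print(scores.shape)   # (200, 2): 6 items reduced to 2 factor scores
```

The resulting factor scores, rather than the raw items, would then feed the clustering step.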
Correspondence analysis.
The coordinates produced by correspondence analysis, when calculated at the individual or group level, can be
clustered to produce market segments. Correspondence analysis can also be used to convert nominal data (like
yes/no answers) to metric scales.
Conjoint analysis.
Utilities from conjoint analyses can be used in segmentation because they represent the relative value individuals place on all the key attributes that define a product or service. In fact, conjoint utilities represent the most effective basis variables because they are derived from respondent preferences between product options or from actual choices of preferred products.
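A minimal illustration of deriving part-worth utilities: assuming a hypothetical effects-coded 2x2 design (brand and price, two levels each) and one respondent's invented ratings, the utilities fall out of an ordinary least-squares fit:

```python
import numpy as np

# Hypothetical 2x2 design, effects-coded; each row is one product profile.
profiles = np.array([
    [1, 1],    # brand A, low price
    [1, -1],   # brand A, high price
    [-1, 1],   # brand B, low price
    [-1, -1],  # brand B, high price
])
ratings = np.array([9.0, 6.0, 7.0, 2.0])   # one respondent's preference ratings

# Part-worth utilities: least-squares fit of ratings on the coded attributes.
X = np.column_stack([np.ones(len(profiles)), profiles])
utilities, *_ = np.linalg.lstsq(X, ratings, rcond=None)
intercept, u_brand, u_price = utilities
print(round(u_brand, 2), round(u_price, 2))   # 1.5 2.0
```

The per-respondent utility vectors (here 1.5 for brand, 2.0 for price) would serve as the basis variables fed into a clustering method.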
Data Analysis
Cluster analysis.
Cluster analysis is the most frequently used method of segmenting a market. The underlying definitions
of cluster analysis procedures mimic the goals of market segmentation: to identify groups of respondents in a
manner that minimizes differences between members of each group while maximizing differences between
members of a group and those in all other groups. However, there is one key difference between clustering and
segmenting respondents -- clusters produce groups of respondents who have similar responses on key variables
while segmentation finds groups of respondents who have similar behaviours when purchasing and seeking
products in the market. Both hierarchical and iterative cluster analysis procedures can be used, but hierarchical
procedures are difficult to evaluate once you exceed 100 or 200 survey respondents. Among the various iterative
cluster analysis procedures, the K-Means method is most often used. K-Means cluster analysis can be found in
all of the most popular statistical programs (SAS, SPSS, BMDP, Statistica, SYSTAT).
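The K-means step described above can be sketched as follows; the respondent data are synthetic, and scikit-learn is assumed to be available:

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(42)
# Synthetic basis variables for 300 respondents around three behavioural profiles.
centers = np.array([[0, 0], [5, 5], [0, 5]])
X = np.vstack([c + rng.normal(scale=0.5, size=(100, 2)) for c in centers])

# Iterative K-means clustering into three candidate segments.
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
labels = km.labels_
print(np.bincount(labels))   # roughly 100 respondents per segment
```

In practice the number of clusters would be varied and each solution profiled before choosing the segmentation to retain.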
Chi-square Automatic Interaction Detection (CHAID) or Classification and Regression Trees (CART).
CHAID and CART are known as "Classification Tree Methods." These methods divide respondents
into groups and then further divide each group into subgroups based on relationships between segmenting basis
variables and some dependent variable. The dependent variable is usually a key indicator such as usage level,
purchase intent, etc. These procedures create tree diagrams, starting at the top with all respondents combined and
then branching into 2 or more groups at each new level of the tree. Subdivisions are determined by finding the
survey variable that produces the greatest difference in the dependent variable among individual response
categories or groups of response categories on that survey variable.
CHAID is the most commonly used classification tree method, but it cannot handle continuous dependent variables, so a combination of CHAID and CART is sometimes used. Both CHAID and CART are able to process non-metric and non-ordinal data. Unlike cluster analysis, classification tree methods create true segments when they divide respondents. However, these segments are based on only one dependent variable. Other methods, including cluster analysis, divide respondents based on tens or even hundreds of data elements.
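A CART-style tree of the kind described here can be sketched with scikit-learn's DecisionTreeClassifier (a generic CART implementation, not CHAID; the usage indicator and data below are invented for the example):

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(1)
n = 400
age = rng.integers(18, 70, size=n)
income = rng.integers(20, 120, size=n)        # thousands, hypothetical units
# Invented dependent variable: heavy product usage driven mainly by income.
heavy_user = (income > 60).astype(int)

# A shallow tree splits respondents into subgroups on the basis variables.
X = np.column_stack([age, income])
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, heavy_user)
print(tree.score(X, heavy_user))   # 1.0: the tree recovers the income split
```

Each leaf of the fitted tree corresponds to one candidate segment defined by ranges of the splitting variables.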
Artificial neural networks.
Artificial Neural Networks, or ANNs, offer another means to segment respondents. The Kohonen architecture is one self-organizing ANN that can be used for segmentation. It is called self-organizing because, like cluster analysis, there is no dependent variable specified in the model. The ANN attempts to group respondents based on their similarities. It differs from cluster analysis in its ability to ignore noisy data. Atypical individuals have less impact on the segmenting calculations, and each successive iteration makes ever smaller adjustments to the network weights, so the calculations quickly stabilize, ignoring infrequent respondent characteristics. The greater the variation or uncertainty in respondents' answers, the better ANNs perform compared to cluster analysis.
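A minimal one-dimensional Kohonen map can be sketched in plain NumPy. This is a toy illustration of the self-organizing idea, not a production SOM, and all parameters (learning-rate and neighbourhood schedules, unit count) are arbitrary choices:

```python
import numpy as np

def train_som(X, n_units=4, epochs=50, seed=0):
    """Minimal 1-D Kohonen map: each unit's weight vector drifts toward the
    inputs it wins, with a decaying learning rate and neighbourhood radius."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(n_units, X.shape[1]))
    for epoch in range(epochs):
        lr = 0.5 * (1 - epoch / epochs)                  # decaying learning rate
        radius = max(1.0 * (1 - epoch / epochs), 0.01)   # shrinking neighbourhood
        for x in X:
            bmu = np.argmin(np.linalg.norm(W - x, axis=1))   # best matching unit
            dist = np.abs(np.arange(n_units) - bmu)
            h = np.exp(-(dist ** 2) / (2 * radius ** 2))     # neighbourhood weights
            W += lr * h[:, None] * (x - W)
    return W

def assign_segments(X, W):
    return np.array([np.argmin(np.linalg.norm(W - x, axis=1)) for x in X])

rng = np.random.default_rng(3)
# Two synthetic respondent clouds; each map unit should capture one of them.
X = np.vstack([rng.normal(loc=m, scale=0.3, size=(50, 2)) for m in ([0, 0], [4, 4])])
W = train_som(X, n_units=2)
labels = assign_segments(X, W)
print(np.bincount(labels, minlength=2))   # respondents attracted by each unit
```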
Latent class structure models.
Latent class analysis is often described as "factor analysis for categorical variables." It is used to find
underlying constructs within sets of variables. However, latent class analysis can also be used to cluster
categorical variables into segments based on responses across a broad range of categorical variables. Latent
classes attempt to find the underlying constructs which motivate people to buy a particular product or to desire
certain features in that product.
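One hedged way to sketch the latent-class idea for binary (yes/no) survey answers is an EM algorithm for a mixture of independent Bernoulli distributions, a simplified stand-in for full latent class software; all data and names below are invented:

```python
import numpy as np

def bernoulli_mixture_em(X, n_classes=2, n_iter=100, seed=0):
    """EM for a mixture of independent Bernoullis over binary survey answers."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    pi = np.full(n_classes, 1.0 / n_classes)              # latent class sizes
    theta = rng.uniform(0.25, 0.75, size=(n_classes, d))  # P(yes | class, item)
    for _ in range(n_iter):
        # E-step: responsibility of each latent class for each respondent.
        log_p = (X @ np.log(theta).T + (1 - X) @ np.log(1 - theta).T
                 + np.log(pi))
        log_p -= log_p.max(axis=1, keepdims=True)
        resp = np.exp(log_p)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: update class sizes and per-item answer probabilities.
        nk = resp.sum(axis=0)
        pi = nk / n
        theta = np.clip((resp.T @ X) / nk[:, None], 1e-6, 1 - 1e-6)
    return pi, theta, resp.argmax(axis=1)

# Two hypothetical respondent types with opposite answer patterns.
rng = np.random.default_rng(7)
A = (rng.uniform(size=(100, 5)) < 0.9).astype(float)   # mostly "yes"
B = (rng.uniform(size=(100, 5)) < 0.1).astype(float)   # mostly "no"
X = np.vstack([A, B])
pi, theta, classes = bernoulli_mixture_em(X, n_classes=2)
print(np.round(pi, 2))   # roughly equal latent class sizes
```

The recovered theta rows play the role of the "underlying constructs": the answer profile that characterises each latent class.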
Classification
There are a number of classification algorithms or analytical methods which can be applied to market
segmentation.
Discriminant analysis.
Discriminant analysis can be used to classify respondents into predefined segments based on descriptor
variables like census data. The segmentation scheme determines which respondents belong in each market
segment. The classification or scoring program then creates the means of identifying potential members of each
segment based on limited information (usually data which can be obtained from secondary sources). When a limited set of information can be used to accurately predict which market segment each individual belongs to, you have a successful classification algorithm.
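A sketch of this classification step, assuming scikit-learn and invented descriptor data (age and income in arbitrary units, with segment labels taken as given from a prior post hoc study):

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(5)
# Hypothetical descriptors for two known segments: (age, income) clouds.
X_train = np.vstack([rng.normal(loc=m, scale=1.0, size=(100, 2))
                     for m in ([30, 40], [55, 90])])
y_train = np.repeat([0, 1], 100)

# LDA learns a scoring rule for placing new people into the segments.
lda = LinearDiscriminantAnalysis().fit(X_train, y_train)
new_person = np.array([[52, 85]])
print(lda.predict(new_person))   # [1]: the older, higher-income segment
```

Applied to a customer database, the same fitted rule scores every record into a segment using only the limited descriptor fields.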
Multiple regression.
Multiple regression and multinomial logit can be used in the same manner to create classification
schemes for your market segments.
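A multinomial logit classifier used as such a segment-scoring scheme might be sketched as follows (synthetic data, scikit-learn assumed):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(9)
# Three segments separated along two descriptor variables.
means = np.array([[0, 0], [4, 0], [2, 4]])
X = np.vstack([m + rng.normal(scale=0.6, size=(80, 2)) for m in means])
y = np.repeat([0, 1, 2], 80)

# Multinomial logit as a scoring scheme for assigning people to segments.
clf = LogisticRegression(max_iter=1000).fit(X, y)
print(clf.score(X, y))   # near-perfect on well-separated synthetic segments
```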
6. MARKET SEGMENTATION OUTCOME
Unfortunately, there is no definitive answer to the question how many segments should be the outcome
of market segmentation. Experience, intuition, statistical results and common sense all must be applied to decide
on the number of segments to retain. If you have several very small segments, you may need to change the
criteria for segmentation or remove some of these respondents as outliers. Too many segments can lead to
developing many different marketing programs for small, very similar, markets.
But there are a few rules of thumb for segmentation:
· Large enough. The majority of segments must be large enough to make targeted marketing and product design efforts economically feasible.
· Relevant. The segments must be relevant to your company's products/services.
· Reachable. Segments must be reachable through one or more marketing mix variables (price, promotion, features or distribution).
· Different. There must be clearly defined differences between market segments to make some segments more desirable than others. If many of the segments want essentially the same features and intend to buy at the same frequency or volume level, then these segments do not exhibit meaningful differences.
7. CONCLUSION
The paper has outlined the market segmentation process. This process lies at the heart of the overall philosophy of marketing. Success stories abound of companies that have successfully adopted and implemented market segmentation in their planning process. Equally, stories of failure are all too frequent, as companies poorly define their markets, treat all customers in that market the same, do not evaluate market segments rigorously or, finally, fail to position the product appropriately or communicate this position effectively.
Market segmentation techniques are becoming increasingly sophisticated, but the important principles that underlie this process will always remain essential for the development of effective market strategies.
ADVERTISING AND GLOBALIZATION INFLUENCES
Achalova Larisa Vladislavovna
Plehanov Russian Academy of Economics, the faculty of International Economic Relations, Moscow,
Stremiannii alley., 36, tel. (095) 236-40-94, (095) 237-95-52, [email protected]
Abstract: The article presents the topic of advertising and the influence of globalization. Globalization is today's reality. Advertising is an important part of the international marketing program of firms competing in the global marketplace. There is a chance for companies to use the advantages of globalization in creating new global advertising campaigns.
Keywords: advertising; advertising objective, budget, media, message; direct-response advertising,
advertising to the consumer market, to the business and professional markets; international
advertising
Advertising is an integral part of our social and economic systems. In the complex society and
economic system in which we live, advertising has evolved into a vital communication system for both
consumers and business. The ability of advertising and other promotional methods to deliver carefully prepared
messages to targeted audiences has given them a major role in the marketing programs of most organizations. Companies ranging from large multinational corporations to small retailers increasingly rely on advertising and promotion to help them market products and services.
rely on advertising and other forms of promotion to provide them with information they can use in making
purchase decisions.
Hence, today companies must do more than make goods; they must inform consumers about product benefits and carefully position products in consumers' minds. To do this, they must skillfully use the mass-promotion tools of advertising, sales promotion, and public relations.
We define advertising as any paid form of nonpersonal presentation and promotion of ideas, goods, or
services by an identified sponsor. Several aspects of this definition should be noted. First, the paid aspect of this
definition reflects the fact that the space or time for an advertising message generally must be bought. The
nonpersonal component indicates that advertising involves mass media (e.g., television, radio, magazines, newspapers) whereby a message can be transmitted to large groups of individuals, often at the same time. The nonpersonal nature of advertising means there is generally no opportunity for immediate feedback from the message recipient; before the message is sent, the advertiser must therefore attempt to understand how the audience will interpret and respond to the message.
Also, advertising can be defined as a method of delivering a message from a sponsor, through an
impersonal medium, to many people (the word ad comes from the Latin ad vertere, meaning “to turn the mind
toward”).
According to the Longman Business English Dictionary, advertising is a form of telling people publicly about a product or service in order to persuade them to buy it.
The roots of advertising can be traced back to early history. Although advertising is used mostly by private enterprise, it is also used by a wide range of other organizations and agencies, such as the postal service and branches of the armed services. Advertising is a good way to inform and persuade, whether the purpose is to sell
Coca-Cola worldwide or to get consumers in a developing nation to drink milk or use birth control.
Different organizations handle advertising in different ways. In small companies, advertising might be handled by someone in the sales department. Large companies set up advertising departments whose job is to set the advertising budget, work with the ad agency, and handle direct mail advertising, dealer displays, and other advertising not done by the agency. Most large companies use outside advertising agencies because they
offer several advantages.
ADVANTAGES OF ADVERTISING. There are several advantages to the use of advertising in the firm's promotional mix. Since the company pays for the advertising space, it can control what it wants to say, when it wants to say it, and, to some extent, to whom the message is sent. Advertising can also represent a cost-effective method for communicating with large audiences, and the cost per contact through advertising is often quite low.
Advertising also can be used to create images and symbolic appeals for products and services, a
capability that is very important to companies selling products and services that are difficult to differentiate.
Another advantage of advertising is its value in creating and maintaining brand equity. Brand equity can
be thought of as a type of intangible asset of added value or goodwill that results from the favorable image,
impressions of differentiation, and/or the strength of consumer attachment to a company name, brand name, or
trademark. The equity that results from a strong company or brand name is important because it allows a brand
to earn greater sales volume and/or higher margins than it could without the name and also provides the
company or brand with a competitive advantage.
Yet another advantage of advertising is its ability to strike a responsive chord with consumers when
other elements of the marketing program have not been successful.
DISADVANTAGES OF ADVERTISING. Advertising has some disadvantages. The cost of
producing and placing advertising can be very high.
Other problems with advertising include its credibility and the ease with which it can be ignored. Advertising is often treated with skepticism by consumers, many of whom perceive it to be very biased and are concerned by its intent to persuade. Not only are consumers skeptical about many of the advertising messages they see and hear, but it is also relatively easy for them to process selectively only those ads of interest to them.
Actually, with so many messages competing for our attention every day, it is out of necessity that we must
ignore the vast majority of them. The high level of “clutter” is a major problem in advertising. The numerous
commercials we see on television or hear on the radio, as well as the many ads that appear in most magazines
and newspapers, make it very difficult for advertisers to get their messages noticed and attended to by
consumers.
MAJOR DECISIONS IN ADVERTISING. A company must make five important decisions when
developing an advertising program.
1. SETTING OBJECTIVES. The first step in developing an advertising program is to set advertising
objectives. These objectives should be based on past decisions about the target market, positioning, and
marketing mix. The marketing positioning and mix strategy define the job that advertising must do in the total
marketing program.
An advertising objective is a specific communication task to be accomplished with a specific target
audience during a specific period of time. Advertising objectives can be classified by purpose - whether their aim
is to inform, persuade, or remind.
Objectives setting: communication objectives; sales objectives.
Budget decisions: affordable approach; percent of sales; competitive parity; objective and task.
Message decisions: message generation; message evaluation; message selection; message execution.
Media decisions: reach, frequency, and impact; major media types; specific media vehicles; media timing.
Campaign evaluation: communication impact; sales impact.
Figure 1 Major advertising decisions
Informative advertising is used heavily when introducing a new product category. In this case, the
objective is to build primary demand. Thus, producers of compact disc players first informed consumers of the
sound and convenience benefits of CDs.
Persuasive advertising becomes more important as competition increases. Here, the company's
INTERNATIONAL CONFERENCE ON SOFTCOMPUTING APPLIED IN COMPUTER AND ECONOMIC ENVIRONMENTS ICSC 2004
EPI Kunovice, Czech Republic, January 29-30, 2004
48
objective is to build selective demand. For example, when compact disc players became established and
accepted, Sony began trying to persuade consumers that its brand offers the best quality for their money.
Some persuasive advertising has become comparison advertising, in which a company directly or
indirectly compares its brand with one or more other brands.
Reminder advertising is important for mature products - it keeps consumers thinking about the product.
Expensive Coca-Cola ads on television are designed to remind people about Coca-Cola, not to inform or
persuade them.
2. SETTING THE ADVERTISING BUDGET. After determining its advertising objectives, the
company next sets its advertising budget for each product. The role of advertising is to affect demand for a
product. The company wants to spend the amount needed to achieve the sales goal.
3. CREATING THE ADVERTISING MESSAGE. A large advertising budget does not guarantee a
successful advertising campaign. Two advertisers can spend the same amount on advertising, yet have very
different results. Studies show that creative advertising messages can be more important to advertising success
than the amount of money spent. No matter how big the budget, advertising can succeed only if commercials
gain attention and communicate well. Therefore, the budget must be invested in effective advertising messages.
Good advertising messages are especially important in today’s costly and cluttered advertising
environment.
4. SELECTING ADVERTISING MEDIA. The advertiser next chooses advertising media to carry the
message. The major steps in media selection are deciding on reach, frequency, and impact; choosing among major
media types; selecting specific media vehicles; and deciding on media timing.
To select media, the advertiser must decide what reach and frequency are needed to achieve advertising
objectives. Reach is a measure of the percentage of people in the target market who are exposed to the ad
campaign during a given period of time. Frequency is a measure of how many times the average person in the
target market is exposed to the message. The advertiser also must decide on the desired media impact – the
qualitative value of a message exposure through a given medium.
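As an illustration of how reach and frequency combine, media planners conventionally multiply them into gross rating points (GRPs), a measure of total campaign weight. The figures below are hypothetical and not taken from the text; a minimal sketch:

```python
# Hypothetical illustration: reach (% of target market exposed) times
# average frequency gives gross rating points (GRPs).
def gross_rating_points(reach_pct, avg_frequency):
    """Total campaign weight as reach percentage x average frequency."""
    return reach_pct * avg_frequency

# e.g. a campaign reaching 60% of the target market an average of
# 3 times delivers 180 GRPs
print(gross_rating_points(60, 3))  # -> 180
```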
The major media types are newspapers, television, direct mail, radio, magazines, and outdoor. Each
medium has advantages and limitations.
The media planner then must choose the best media vehicles – specific media within each general media type.
Media planners also compute the cost per thousand persons reached by a vehicle; they must consider the
costs of producing ads for different media and must balance media cost measures against several media impact
factors.
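The cost-per-thousand computation mentioned above is simple arithmetic; a sketch with invented numbers (the ad cost and audience size are hypothetical, not from the text):

```python
# Cost per thousand (CPM) lets planners compare media vehicles with
# different prices and audience sizes on a common basis.
def cost_per_thousand(ad_cost, audience_size):
    """Cost of one ad placement per 1,000 persons reached."""
    return ad_cost / audience_size * 1000

# A $100,000 full-page ad in a magazine read by 5,000,000 people:
print(cost_per_thousand(100_000, 5_000_000))  # -> 20.0
```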
The advertiser also must decide how to schedule the advertising over the course of a year.
Finally, the advertiser has to choose the pattern of the ads. Continuity means scheduling ads evenly
within a given period. Pulsing means scheduling ads unevenly over a given time period.
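The two scheduling patterns can be sketched with toy numbers (a hypothetical 12-insertion, 12-week plan, not taken from the text):

```python
# Continuity: the same weekly weight throughout the period.
continuity = [1] * 12  # one insertion every week

# Pulsing: the same total weight delivered in bursts with gaps between.
pulsing = [3, 3, 0, 0, 3, 3, 0, 0, 0, 0, 0, 0]

# Both patterns buy the same 12 insertions; only the timing differs.
print(sum(continuity), sum(pulsing))  # -> 12 12
```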
5. ADVERTISING EVALUATION. The advertiser should regularly evaluate both the communication
effects and the sales effects of the advertising program. Measuring the communication effect of an ad – copy testing
– tells whether the ad is communicating well. One way to measure the sales effect is to compare past sales with past
advertising expenditures.
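One crude way to carry out such a comparison is to correlate past sales with past advertising expenditure over time. The data below are invented for illustration, and correlation of course does not establish that advertising caused the sales:

```python
# Minimal sketch: Pearson correlation between past ad spend and sales.
def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

ad_spend = [10, 12, 15, 11, 18]       # hypothetical quarterly ad budgets
sales = [200, 230, 270, 215, 300]     # hypothetical sales, same quarters
print(round(pearson_r(ad_spend, sales), 3))  # -> 0.997
```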
6. CLASSIFICATIONS OF ADVERTISING. The nature and purpose of advertising differ from one
industry to another and/or across situations. The target of an organization’s advertising efforts often varies, as
does its role and function in the marketing program. To better understand the nature and purpose of advertising
to the final buyer, it is useful to examine some classifications of the various types of advertising.
So, we can subdivide advertising into two main groups:
1. Advertising to the consumer market;
2. Advertising to the business and professional markets.
Advertising to the consumer market.
Advertising done by a company on a nationwide basis or in most regions of the country and targeted to
the ultimate consumer market is known as national advertising. The companies that sponsor these ads are
generally referred to as national advertisers. Most of the ads for well-known brands that we see on prime-time
television or in other major national and regional media are examples of national advertising. This form of
advertising is usually very general, as it rarely includes specific prices, directions for buying the product, or
special services associated with the purchase. This type of advertising makes consumers aware of the brand and
its features, benefits, advantages, and uses, or reminds them of its image, so that consumers will be predisposed
to purchase it wherever and whenever it is needed and convenient to do so.
National advertising is the best-known and most widely discussed form of promotion, probably because
of its pervasiveness.
Another prevalent type of advertising directed at the consumer market is classified as retail (local)
advertising. This type of advertising is done by retailers or local merchants to encourage consumers to shop at a
specific store or to use a local service such as a bank, fitness club, or restaurant. Retail advertising tends to
emphasize specific customer benefits such as store hours, credit policies, service, store atmosphere, merchandise
assortments, or other distinguishing attributes.
Direct-response advertising is a method of direct marketing whereby a product is promoted through
an advertisement that offers the customer the opportunity to purchase directly from the manufacturer.
Another way of viewing advertising to the ultimate customer is in terms of whether the message is
designed to stimulate either primary or selective demand. Primary demand advertising is designed to stimulate
demand for the general product class or entire industry, whereas selective demand advertising focuses on
creating demand for a particular manufacturer’s brand. Primary demand advertising is often used as part of a
promotional strategy for a new product to help it gain acceptance among customers. Selective demand
advertising aims at getting consumers to consider using a particular brand.
Advertising to the business and professional markets.
For many companies, the ultimate customer is not the mass consumer market but rather another
business, industry, or profession. Business-to-business advertising is used by one business to advertise its
products or services to another. The target for business advertising is individuals who either use a product or
service or influence a firm’s decision to purchase another company’s product or service. Three basic categories
of business-to-business advertising are industrial, professional, and trade advertising.
Advertising targeted at individuals who buy or influence the purchase of industrial goods or
services is known as industrial advertising. Industrial goods are those products that either become a physical
part of another product (raw materials, component parts), are used in the manufacture of other goods (machinery,
equipment), or are used to help the manufacturer conduct business (office supplies, computers, copy machines,
etc.). Business services, such as insurance, financial services, and health care, are also included in this category.
Industrial advertising is usually found in general business publications (such as Fortune, Business Week) or in
publications targeted to the particular industry.
Industrial advertising is often not designed to sell a product or service directly, as the purchase of
industrial goods is often a complex process involving a number of individuals. An industrial ad helps make the
company and its product or service better known by the industrial customer, assists in developing an image for
the firm, and perhaps most important, opens doors for the company’s sales representatives when they call on
these customers.
Advertising that is targeted to professional groups – such as doctors, lawyers, dentists, engineers, or
professors – to encourage them to use or specify the advertiser’s product for others’ use is known as
professional advertising. Professional groups are important because they constitute a market for products and
services they use in their business. Also, their advice, recommendations, or specification of a product or service
often influence many consumer purchase decisions.
These classifications of the various types of advertising demonstrate that this promotional element is
used in a variety of ways and by a number of different organizations. Advertising is a very flexible promotional
tool whose role in a marketing program will vary depending on the situation facing the organization and what
information needs to be communicated.
Companies are also focusing attention on international markets because of the opportunities they offer
for growth and profits. Advertising and promotion are important parts of the international marketing program of
firms competing in the global marketplace.
However, many companies are also realizing the challenges and difficulties they
face in developing and implementing advertising and promotion programs for international markets. Companies
planning on marketing and advertising their products or services abroad face an unfamiliar marketing
environment and customers with a different set of values, customs, consumption patterns and habits, as well as
differing purchase motives and abilities. Not only may the language vary from country to country, but also
several may be spoken within a country, such as in India or Switzerland.
Just as with domestic marketing, companies engaging in international marketing must carefully analyze
and consider the major environmental factors of each market in which they compete. The major environmental
factors affecting international marketing include economic, demographic, cultural, and political/legal variables.
Figure 2 shows some of the factors marketers must consider in each category when analyzing the environment of
each country or market. Consideration of these factors is important not only in evaluating the viability and/or
potential of each country as a market but also in designing and implementing a marketing and promotional
program.
Economic environment: stage of economic development, economic infrastructure, standard of living, per capita
income, distribution of wealth, currency stability, exchange rates.
Cultural environment: language, lifestyles, values, norms and customs, ethics and moral standards, taboos.
Demographic environment: size of population, number of households, household size, age distribution,
occupation distribution, education levels, employment rate, income levels.
Political environment: government policies, laws and regulations, political stability, nationalism, attitude
toward multinational companies.
Figure 2 Forces in the international marketing environment.
The discussion of differences in the marketing environment of various countries
suggests that each market is different and requires a distinct marketing and advertising program. However, in
recent years, a great deal of attention has been focused on the concept of global marketing whereby a
company utilizes a common marketing plan for all countries in which it operates, thus selling the product
in essentially the same way everywhere in the world. Global advertising would fall under the umbrella of
global marketing as a means of implementing this strategy by using the same basic advertising approach in all
markets.
The idea of a global marketing strategy and advertising program offers certain advantages to a
company, including:
· Economies of scale in production and distribution.
· Lower marketing and advertising costs as a result of reductions in planning and control.
· Lower advertising production costs.
· Ability to exploit good ideas on a worldwide basis and introduce products quickly into various world markets.
· A consistent international brand and/or company image.
· Simplification of coordination and control of marketing and promotional programs.
Advocates of global marketing and advertising contend that standardized products are possible in all
countries if marketers emphasize quality, reliability, and low prices. They argue that people everywhere want to
buy the same products and live the same way. The results of product standardization are lower design and
production costs as well as greater marketing efficiency, which translates into lower prices for consumers.
Additionally, product standardization and global marketing enable companies to roll out products faster into
world markets, which is becoming increasingly important as product life cycles become shorter and competition
increases.
Opponents of the standardized, global approach argue that differences in culture, marketing, and
economic development; consumer needs and usage patterns; media availabilities; and legal restrictions make it
extremely difficult to develop an effective universal approach to marketing and advertising. International
advertisers face many complexities not encountered by domestic advertisers. The basic issue concerns the degree
to which global advertising should be adapted to the unique characteristics of various country markets. Some
large advertisers have attempted to support their global brands with highly standardized worldwide advertising.
Standardization produces many benefits - lower advertising costs, greater coordination of global advertising
efforts, and a more consistent worldwide company or product image. However, standardization also has
drawbacks. Most importantly, it ignores the fact that country markets differ greatly in their cultures,
demographics, and economic conditions. Thus, most international advertisers think globally but act locally. They
develop global advertising strategies that bring efficiency and consistency to their worldwide advertising efforts.
Then they adapt their advertising programs to make them more responsive to consumer needs and expectations
within local markets.
Global advertisers face several additional problems. For instance, advertising media costs and
availability differ considerably from country to country. Some countries have too few media to handle all of the
advertising offered to them. Other countries are peppered with so many media that an advertiser cannot gain
national coverage at a reasonable cost. Media prices often are negotiated and may vary greatly.
Countries also differ in the extent to which they regulate advertising practices. Many countries have
extensive systems of laws restricting how much a company can spend on advertising, the media used, the nature
of advertising claims, and other aspects of the advertising program. Such restrictions often require that
advertisers adapt their campaigns from country to country.
Thus, although advertisers may develop global strategies to guide their overall advertising efforts,
specific advertising programs usually must be adapted to meet local cultures and customs, media characteristics,
and advertising regulations.
While globalization of advertising is viewed by many in the advertising industry as a difficult task,
some progress has been made in learning what products and services are best suited to worldwide appeals.
Products that can take advantage of global marketing and advertising opportunities include:
1. Brands that can be adapted for a visual appeal that avoids the problem of trying to translate words
into dozens of languages;
2. Brands that are promoted with image campaigns that lend themselves to themes playing up to
universal appeals such as sex or wealth;
3. High-tech products coming to the world for the first time – new technology products that arrive on
the world market at once and are not steeped in the cultural heritage of any one country;
4. Products with a nationalistic flavor if the country has a reputation in the field;
5. Companies and brands that rely heavily on visual appeals that are easily adapted for use in global
advertising campaigns.
Summing up, it is important to note that the globalization of marketing, and especially of advertising,
is a reality that represents an objective and completely inevitable phenomenon of our time. It can be slowed by
objective factors or by means of economic policy, but it is impossible to stop or cancel.
REFERENCES:
1. BEARDEN, W. O., INGRAM, T. N., LAFORGE, R. W.: Marketing: Principles, Perspectives. Chicago: Irwin,
1995. 631 p.
2. BELCH, G. E., BELCH, M. A.: Introduction to Advertising and Promotion: An Integrated Communications
Perspective. Homewood: Irwin, 1993. 836 p.
3. CATEORA, P. R.: International Marketing. 9th ed. Chicago: Irwin, 1996. 770 p.
4. KAATZ, R.: Advertising & Marketing Checklists: 107 Proven Checklists to Save Time & Boost Advertising
& Marketing Effectiveness. 2nd ed. Lincolnwood: NTC Business Books, a division of NTC Publishing
Group, 1995. 240 p.
5. KOTLER, P.: Marketing Management. Millennium ed., International ed. Englewood Cliffs: Prentice-Hall
International, Inc., 2000. 718 p.
6. KOTLER, P.: Marketing Management: Analysis, Planning, Implementation, and Control. 8th ed. Englewood
Cliffs: Prentice-Hall International, Inc., 1994. 801 p.
7. KOTLER, P.: Principles of Marketing. 2nd ed. Englewood Cliffs: Prentice-Hall, 1994. 692 p.
8. PATTI, C. H., HARTLEY, S. W., KENNEDY, S. L.: Business-to-Business Advertising: A Marketing
Management Approach. Chicago: Business/Professional Advertising Association (B/PAA), 1991. 286 p.
9. SCHULTZ, D., TANNENBAUM, S., ALLISON, A.: Essentials of Advertising Strategy. 3rd ed.
Lincolnwood: NTC Business Books, a division of NTC Publishing Group, 1996. 155 p.
10. SHIMP, T. A.: Advertising, Promotion, and Supplemental Aspects of Integrated Marketing Communication.
4th ed. Fort Worth: The Dryden Press – Harcourt Brace College Publishers, 1997. 589 p.
11. RUSSELL, T., LANE, R.: Kleppner’s Advertising Procedure. 2nd ed. Englewood Cliffs: Prentice Hall,
1990. 718 p.
CUSTOMER BEHAVIOUR
Radoslav Jankal
University of Žilina, Faculty of Management Science and Informatics,
Department of Management Theories
Moyzesova 20, 010 26 Žilina, Slovak Republic, tel.: +421 41 513 4458, fax: +421 41 5652 775
email: [email protected]
Abstract: The purpose of marketing is to meet the needs and wants of customers. Understanding
customer behaviour is not easy. Consumers often say one thing and then do something else, and their
thinking and ideas can change very quickly in response to new stimuli. If firms want to be
successful and profitable, they need to know very well what their customers buy and why.
Keywords: behaviour, customer, trends, technology
1 WHAT IS CUSTOMER BEHAVIOUR
One view of customer behaviour defines it as the mental and physical activities undertaken by household
and business customers that result in decisions and actions to pay for, purchase, and use products and services.
Customer behaviour reflects customers’ decisions with respect to:
· the acquisition, consumption, and disposition
· of goods and services, time, and ideas
· by (human) decision making [over time].
To understand customer behaviour, it is useful to know the answers to the questions in Figure 1. [4]
Why, when, how often, how much, by whom, where, and how are our products, competitive products, and
substitutes bought, consumed, and evaluated?
Figure 1 Questions for better understanding of customer behaviour
The starting point of the study of customer behaviour is a model that represents how the customer reacts to
different stimuli (see Figure 2). Marketing stimuli and environmental effects enter the customer’s subconscious.
Typical customer characteristics and the customer’s decision process lead to a specific buyer response. The role
of marketers is to recognize what is going on in the customer’s mind from the moment specific stimuli from the
external environment enter his subconscious to the moment he makes a purchase decision. [3]
Marketing stimuli: product, price, place, promotion.
Other stimuli: economic, technological, political, cultural.
Customer characteristics: cultural, social, personal, psychological.
Customer decision process: problem identification, collection of information, evaluation of information,
decision, behaviour after buying.
Buyer’s responses: product choice, brand choice, dealer choice, purchase timing, purchase amount.
Figure 2 Model of customer behaviour
2 TRENDS IN CUSTOMER BEHAVIOUR
Predicting future customer behaviour requires predicting the factors that influence customer behaviour.
Almost any of these diverse factors is capable of changing and, in turn, changing customer behaviour. Here we
focus on the three factors that are expected to cause the most significant change in future customer behaviour:
changes in demographics, advances in technology, and changes in public policy (see figure 3). Changes in
other customer factors are likely as well, but they will be less discernible and perhaps less dramatic. For this
reason, we explore these three factors, projecting the trajectory of developments already visible.
Demographics: aging population, women in the work force, single-person households, declining middle class,
ethnic diversity, geographic redistribution.
Technology: control over information, smart products, access to products, mass customisation.
Public policy: pragmatism over ideology, rights of passive consumers, regional economic integration.
Figure 3 Changes in the determinants of customer behaviour
Anticipating trends in customer behaviour can give companies a key strategic advantage. Companies
that will survive and thrive in the marketplace of tomorrow are those in which managers spend less time
worrying about how to position their firm among current competitors and more time trying to envisage a new
competitive space – i.e. a space defined by tomorrow’s customer needs and wants.
Anticipating customer trends gives the marketing executive three strategic advantages: being first to market,
creating new markets by channelling a latent need, and positive public opinion.
Figure 4 Strategic advantages of anticipating customer trends
Foreseeing the coming trends in customers’ needs and wants offers companies several advantages (see
figure 4). First, if you see a market need first, you can be the first to start working to meet that need.
Consequently, it reduces the fulfilment time, making the economic payback a lot quicker. For example, Sony
anticipated the Walkman phenomenon, was the first in the marketplace with its product, and succeeded.
Second, by sighting a trend, the industry can create a market by channelling a latent need. One example
is the mobile phone industry. Being able to communicate without being grounded in one location (i.e. wireless
communication) is certainly on many people’s wish list, but not everyone can afford the high initial equipment
costs. The industry responded by offering a free phone with a service contract, and the market for mobile
phones skyrocketed. Many people consider mobile phones an essential piece of equipment and the expenditure
on mobile phones is considered in the same category as expenditures made on fast food, poker machines and
electrical goods. As a nation, Australia has taken to mobile phones faster than anyone else.
Third, anticipating trends in customer needs and wants and responding to them creates positive public
opinion for the company and the industry, portraying them as responsive. For example, Australian airlines and
terminals are smoke free; many restaurants have moved to a no-smoking policy even before such a policy has
become law; several local and foreign products are now produced and marketed as being environmentally
friendly products, such as organically grown foods (hydroponic tomatoes at the supermarkets), recyclable paper
products (toilet paper, writing paper) and cosmetics developed without being tested on animals. [4]
2.1 Demographic trends
Changing demographics of the population serve as a good indicator of the future marketplace. As we
have illustrated in preceding chapters, such demographic characteristics as age, income, race and geographic
location of a customer intimately influence his or her customer behaviour. Consequently, when the
demographic make-up of a population changes, the marketplace changes in terms of its needs and wants.
Studying projections or trends in the demographic make-up of a population can therefore help marketers
anticipate the needs and wants of their customers. Six demographic trends have already begun to transform the
marketplace:
· the ageing of the population
· the rise in the number of working women
· an increase in single-person households
· the decline of the middle class
· the increase in ethnic diversity
· geographic redistribution.
2.2 Technological trends
A second force shaping future customer values is technology. Advances in technology have already
given customers increased access to information, newer generations of products, automation of transaction
processes to provide customers with greater flexibility and control, and access to some customised products.
Future developments will further change products and the very nature of customer behaviour.
In particular, technological advances are expected to include products that give customers more control
over information and information access (smart products), automation of processes that liberate customers from
operations-driven processes, and products and services that enable customised lifestyles.
Customer responses to new technology
Technological developments will in turn stimulate changes in customer behaviour. Customers will
increasingly take on the role of co-producers. In addition, they will engage in disintermediation, outsourcing
and automation of consumption (figure 5). [4]
1. Customers as co-producers
2. Disintermediation
3. Customer outsourcing
4. Automation of consumption
Figure 5 Trends in customer behaviour as a result of improved technology
2.3 Trends in public policy
Several projected trends in public policy are relevant to a study of customer behaviour. First, economic
pragmatism is prevailing over ideology. In addition, many governments are concerned with protection of
passive consumers. Finally, governments on an international level are pursuing regional economic integration.
3. CONCLUSION
Markets have to be understood before marketing plans can be developed. The consumer market buys
goods and services for personal consumption.
The customer’s behaviour is influenced by four major factors: cultural (culture, subculture, and social
class); social (reference groups, family, and roles and statuses); personal (age and life cycle stage, occupation,
economic circumstances, life-style, and personality and self-concept); and psychological (motivation,
perception, learning, and beliefs and attitudes). All of these provide clues as to how to reach and serve the
buyer more effectively. [2]
Before planning its marketing, a company needs to identify its target consumers and the type of
decision process they go through. When buying something, a buyer goes through a decision process consisting
of problem recognition, information search, evaluation of alternatives, purchase decision, and postpurchase
behavior. The marketer's job is to understand the various participants in the buying process and the major
influences on buying behavior. This understanding allows the marketer to develop a significant and effective
marketing program for the target market. Good and effective communication is very necessary and useful here.
Communication is one of the most important parts of the management of each company. It is an
irreplaceable tool of cooperation between employees and management in their common effort to achieve the
target vision, mission, and specified aims. [1]
With regard to new products, consumers respond at different rates, depending on the consumer's
characteristics and the product's characteristics. Manufacturers try to bring their new products to the attention
of potential early adopters, particularly those who are opinion leaders.
REFERENCES:
[1] BLAŠKOVÁ, M.: Riadenie a rozvoj ľudského potenciálu: uplatňovanie motivačného akcentu v procesoch
práce s ľuďmi. - 1. vyd. - Žilina : Žilinská univerzita, 2003. ISBN 80-8070-034-6
[2] KOTLER, P.: Marketing essentials. Prentice Hall. 1984. ISBN 0-13-557232-0
[3] KOTLER, P.: Marketing management. Grada. 2001. ISBN 80-247-0016-6
[4] WIDING, R. et al.: Customer Behaviour: Consumer Behaviour and Beyond. Thomson Learning. 2002.
ISBN: 0-17-010786-8
CONSUMER BEHAVIOUR IN GLOBAL MARKET
Elvira Seifoullaeva
The Russian Plekhanov Academy of Economics – the Faculty of International Economic Relations:
113054, Stremanniy per. 36, Moscow, Russia, (095) 237-95-52;
[email protected], [email protected]
Abstract: In the modern world the success of a company directly depends on the level of its
orientation on the consumer and on knowledge of the major aspects of consumer behaviour.
Consumer behaviour is influenced by many factors, such as culture and mentality, personal
attitudes and social background. Thus, market segmentation is necessary for a company. Analysing
consumer behaviour helps companies create more effective marketing strategies and remain market
leaders, constantly widening the supply area.
Keywords: consumer behaviour, marketing strategy, methods of research and evaluation,
influencing factors, consumer preferences, marketing decisions, global market.
It is well known that nowadays the success of a company mainly depends on how efficient this
company is in satisfying consumers’ needs and desires. And the only way to be closer to the consumers is to
learn as much as possible about their preferences and their behaviour.
In the modern world only a quality-focused and customer-driven strategy creates market leaders and
inevitably results in profitability and success.
Understanding consumer behaviour is the key to increasing the efficiency of a company, creating
the image of a consumer-oriented producer and, accordingly, increasing sales volume.
Consumer behavior is often defined as “the study of individuals, groups, or organizations and the
processes they use to select, secure, use, and dispose of products, services, experiences or ideas to satisfy needs,
and the impacts that these processes have on the consumer and society”.
Thus, the definition brings up some very important points:
1. Behavior occurs either for the individual, or in the context of a group (i.e., friends often influence
personal decisions concerning buying certain goods or not buying them) or an organization
(management decisions on products the company should use);
2. Consumer behavior involves the use and disposal of products as well as the study of how they are
purchased. Product use is often of great interest to the marketer, because it may influence how a
product is best positioned or how the increase of consumption can be encouraged;
3. Consumer behavior involves services and ideas as well as tangible products;
4. The impact of consumer behavior on society is also relevant.
In global marketing there are four major applications of consumer behavior:
1. For marketing strategy—i.e., for creating and adopting more effective marketing campaigns;
2. For public policy;
3. For social marketing, which involves getting ideas across to consumers rather than selling
something;
4. For helping managers evaluate the supply through the consumers’ eyes.
In international marketing as well as in marketing in general there are several crucial units that should
be analyzed. Consumer behaviour is one of them.
It can be analysed through two main research methods: secondary and primary research.
Secondary research presupposes the use of data, already collected in reports, statistics or some other sources.
However, in some cases the required information is too specific, and then a necessity to conduct
original (primary) research arises.
A company has a variety of possibilities in using different instruments of primary research, such as:
· Surveys – one of the most useful ways of getting very specific information. Surveys may contain
either open-ended questions, where a respondent is not limited to the options listed and therefore is
not influenced by them, or closed-ended questions, where the respondent is asked to select answers
from a brief list;
· Experiments – used when alternative explanations for a particular observation must be ruled out.
However, this method has a serious drawback: the consumer is removed from his or her natural
surroundings;
· Focus groups – an important marketing category, which involves getting a group of 6-12
consumers together to discuss product usage. Focus groups are especially useful if there are no
specific questions to ask yet, because they help find out what consumers' concerns might be.
Still, there are also drawbacks, such as high costs and the fact that generalization to the entire
population is difficult with such small sample sizes. Moreover, the fact that focus groups involve
social interaction means that participants might say not what they really think, but what in their
opinion will make them look better, which is known as the social desirability bias;
· Projective techniques – used when a consumer feels embarrassed to admit to certain opinions,
feelings, or preferences. For example, many older executives may not feel comfortable admitting
to being intimidated by computers. It is obvious that in such cases people will tend to respond
more openly not for themselves but for "someone else";
· Observation of consumers – often a powerful tool. Looking at how consumers select products may
yield insights into how they make decisions and what they are actually looking for.
In consumer behaviour analysis, segmentation also plays an important role. Understanding the consumer
allows a company to segment markets more meaningfully.
Segmentation basically involves dividing consumers into groups in such a way that members of a
group are, on the one hand, as similar as possible to each other, while on the other hand they differ as much as
possible from members of other segments. This enables companies to treat each segment differently by:
1. Providing different products according to customers’ preferences;
2. Offering different prices;
3. Distributing the products where they are likely to be bought by the targeted segment.
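The grouping idea above can be sketched as a tiny clustering exercise. The sketch below is illustrative only and not taken from the paper: it applies a minimal one-dimensional k-means to hypothetical monthly-spend figures to split consumers into a price-sensitive and a premium segment.

```python
# Illustrative sketch: behavioural segmentation as simple clustering.
# The spend figures are hypothetical; real studies use richer data and tools.

def kmeans_1d(values, iters=20):
    """Very small 1-D k-means with two segments; returns labels and centers."""
    centers = [min(values), max(values)]  # simple deterministic initialisation
    labels = [0] * len(values)
    for _ in range(iters):
        # assign each consumer to the nearest segment center
        labels = [min(range(2), key=lambda c: abs(v - centers[c]))
                  for v in values]
        # move each center to the mean of its members
        for c in range(2):
            members = [v for v, l in zip(values, labels) if l == c]
            if members:
                centers[c] = sum(members) / len(members)
    return labels, centers

# Hypothetical average monthly spend of eight consumers
spend = [12, 15, 14, 13, 80, 95, 88, 91]
labels, centers = kmeans_1d(spend)
print(labels)                                  # -> [0, 0, 0, 0, 1, 1, 1, 1]
print(sorted(round(c, 1) for c in centers))    # -> [13.5, 88.5]
```

Members within each resulting segment are similar in spend while the two segments differ strongly, which is exactly the criterion the paragraph above describes.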
To increase the effectiveness of a segment structure:
· Each segment must have an identity – i.e., it must contain members that can be described in some
way (e.g., price sensitive) and that behave differently from another segment;
· Each segment must engage in systematic behaviours (i.e., a price-sensitive segment should
consistently prefer the low-price item rather than randomly switching between high- and low-priced
brands);
· Each segment must offer marketing-mix efficiency potential – i.e., it must be profitable to serve.
For example, a large segment may be profitable even though the competition it attracts tends to
keep prices down. A smaller segment may be profitable if it is price insensitive or can be
targeted efficiently.
It is important to mention that consumer behaviour can be strongly influenced by culture as an external
factor.
Culture is said to represent influences that are imposed on the consumer by other individuals. It can be
defined as the "unit of knowledge, belief, art, morals, custom, and any other capabilities and habits acquired by
a person as a member of society."
Culture has several important characteristics:
1. It is comprehensive. This means that all parts must fit together in some logical fashion;
2. Culture is learned rather than being something we are born with;
3. It is manifested within boundaries of acceptable behavior;
4. Conscious awareness of cultural standards is limited;
5. Cultures fall somewhere on a continuum between static and dynamic depending on how quickly
changes are accepted.
Moreover, it is worth mentioning that some cultures tend to adopt new products more quickly than
others, which is said to be based on the following reasons:
· Modernity – the extent to which the culture is receptive to new things. For example, in some countries, such
as Britain and Saudi Arabia, tradition is valued most highly, so new products often do not fare too
well, whereas other regions long for progress;
· Homophily – the more similar to each other the members of a culture are, the more
likely an innovation is to spread within this culture and, therefore, its region, since people are more
likely to imitate models of consumer behaviour similar to themselves.
Nowadays the two most rapidly adopting countries in the world are the USA and Japan;
· Physical distance – the greater the distance between people, the less likely an
innovation is to spread;
· Opinion leadership – the more opinion leaders are valued and respected, the
more likely an innovation is to spread. The style of opinion leaders moderates this influence:
in less innovative countries opinion leaders tend to be more conservative, i.e., to reflect
the local norms of resistance.
Another vital factor influencing consumer behaviour is family decision making.
Individual members of families often serve different roles in decisions that ultimately draw on shared
family resources. Some individuals are information gatherers/holders, who seek out information about products
of relevance. These individuals often have a great deal of power because they may selectively pass on
information that favors their chosen alternatives. Influencers do not ultimately have the power to decide between
alternatives, but they may make their wishes known by asking for specific products or causing embarrassing
situations if their demands are not met.
By contrast, the decision makers have the power to determine issues such as which product or brand to
buy, where to buy it, when to buy it and, finally, whether to buy it at all.
However, it is important to differentiate the role of the decision maker from that of the purchaser. From
the marketer's point of view this introduces some problems, since the purchaser can be targeted by point-of-purchase (POP) marketing efforts that cannot be aimed at the decision maker.
Another point worth mentioning is group influence.
Humans are inherently social beings, therefore individuals influence each other greatly.
A useful framework for analysing group influence on the individual is the so-called reference group;
reference groups exert varying degrees of influence.
Primary reference groups have a much greater impact on the individual consumer than secondary reference
groups, whose influence tends to be weaker and is usually said to be limited to consumption during some
certain time period.
Among the factors that strongly influence consumer behaviour, personality is the most controversial one.
Traditional research in marketing has not been particularly successful in finding a link between
personality and consumer behavior. Still, the personal preferences of a consumer inevitably play a major role
when the purchase decision is made. Unfortunately, it is very difficult to determine which factors actually
formed that very specific attitude.
As for emotional impact, on the one hand it is used by producers to attract attention to a brand or the
company itself, and on the other hand it is vital for information processing. Appealing to emotions is almost
always successful, unless the actual reaction of consumers differs from the expected one.
Nevertheless, orientation on consumers does not mean a company should be completely dependent on
their preferences. Sometimes it is possible to change consumer behavior.
Making consumers switch from well-known brands to new ones is a question of an efficient marketing
strategy and its implementation. One way to reach such a result is temporary price discounts and coupons.
However, such an approach makes consumers buy a product on deal, which brings up the threat that they may
justify the purchase based on that deal (i.e., the low price) and later switch to other brands on the same basis.
A better way is to make the product itself more convenient to obtain and to prove its value and quality to a
larger number of consumers.
Still, the value of a product for a consumer also depends on many factors. The most important issue the
consumer takes into consideration when making a decision over a particular purchase is satisfaction of his or her
major needs.
The needs of the consumer were best structured in a hierarchy by Abraham Maslow. He suggested the
intuitively appealing notion that humans must satisfy the most basic objectives before they can move on to
"higher level" ones. Thus, an individual must satisfy physiological needs (such as food and liquid) before he or
she will be able to expend energy on less fundamental objectives such as safety. The Maslow hierarchy of needs
is as follows:
· Self-actualization
· Personal needs: status, respect, prestige
· Social needs: friendship, belonging, love
· Safety needs: financial, freedom from harm
· Physiological needs: food, water, oxygen, sex
It is a useful model for understanding the different needs of consumers both in national and in global
markets. However, it should not be taken too literally, since people may occasionally "swing" between these
needs. Thus, the art of marketing presupposes flexibility and an adequate reaction to changes in consumer
behaviour.
Therefore, a company should be perfectly informed of its success or failure in adapting its marketing
strategy to changing consumer demands. In this context consumer satisfaction is an important issue. Calculating
the financial value of a satisfied consumer is a serious problem for any company. The most common approach to
evaluating it is to focus on consumer retention.
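The retention-focused approach to valuing a satisfied consumer can be illustrated with the standard simplified customer-lifetime-value formula, CLV = m * r / (1 + d - r), where m is the annual margin, r the retention rate and d the discount rate. The sketch below is not from the paper; the figures are hypothetical.

```python
# Illustrative sketch: valuing a retained customer via the standard
# simplified customer-lifetime-value formula. Inputs are hypothetical.

def customer_lifetime_value(annual_margin, retention_rate, discount_rate):
    """Expected discounted margin from a customer retained at rate r:
    CLV = m * r / (1 + d - r), a common textbook approximation."""
    return (annual_margin * retention_rate
            / (1 + discount_rate - retention_rate))

# Hypothetical: $200 annual margin, 80% retention, 10% discount rate
print(round(customer_lifetime_value(200, 0.80, 0.10), 2))  # -> 533.33
```

The formula makes the text's point concrete: a small improvement in retention raises the financial value of a consumer disproportionately, because the retained margin compounds over future years.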
On the consumer's side, behind the very decision over a purchase lies an important process of making
a choice. In this purchase decision process a consumer passes through several stages:
1. Problem recognition: perceiving a need;
2. Information search: seeking value;
3. Evaluation of alternatives: assessing value;
4. Purchase decision: buying value;
5. Postpurchase behaviour: value in consumption or use.
Postpurchase behaviour refers to the feelings the consumer has after comparing the obtained product with
his or her expectations. In case of dissatisfaction, marketers will have to analyse whether the product was
deficient or the consumer's expectations were too high. A deficient product may require a change of design or
even of quality, whereas in the case of high consumer expectations the company might need to change its
advertising strategy or reorient to another target group.
Thus, consumer behaviour is an integral part of the marketing process and, together with financial matters
and long-term obligations, forms the basis for strategic marketing decisions.
Besides marketing-mix influences, psychological, sociocultural and sometimes situational factors also
have an impact on consumer behaviour, either motivating or discouraging consumers from obtaining particular
products.
Knowledge of consumer behaviour and an adequate response to its changes are the two major factors of a
company's success in the global market.
REFERENCES:
[1] BERKOWITZ, KERIN, HARTLEY, RUDELIUS: Marketing. Fifth edition. Irwin McGraw-Hill, 1997, pp.
152-157;
[2] www.ACNielsen/customer.com
[3] www.consumerpsychologist.com
[4] www.sciencecenter.com/services/customer.html
SALES PROMOTION
Helen Berchik
Plehanov Russian Academy of Economics, the faculty of International Economic Relations,
Moscow, Stremiannii alley., 36, tel. (095) 236-40-94, (095) 237-95-52, [email protected]
Abstract: Sales promotion includes a wide variety of promotion tools designed to stimulate earlier
or stronger market response. Sales promotion can be broken into two major categories: consumer-oriented
promotions and trade-oriented promotions. Sales promotion plays an important role in
the total promotion mix. To use it well, the marketer must define the sales-promotion objectives,
select the best tools, and evaluate the results.
Keywords: sales promotion, promotion tools, consumer-oriented promotions, trade-oriented
promotions.
Sales promotion has been defined as “a direct inducement that offers an extra value or incentive for the
product to the sales force, distributors or the ultimate consumer with the primary objective of creating an
immediate sale”. It consists of short-term incentives to encourage purchase or sales of a product or service.
Whereas advertising offers reasons to buy a product or service, sales promotion offers reasons to buy now.
Examples are found everywhere:
· A family buys a camcorder and gets a free traveling case or buys a car and gets a check for a $500 rebate;
· An appliance retailer is given a 10 percent manufacturer discount on January’s orders if the retailer
advertises the product in the local newspaper.
Sales promotion includes a wide variety of promotion tools designed to stimulate earlier or stronger
market response. Sales promotion can be broken into two major categories: consumer-oriented promotions and
trade-oriented promotions. Consumer-oriented sales promotion includes sampling, couponing, rebates, price-offs, premiums, contests, bonus packs, sweepstakes and event sponsorship. These promotions are directed at the
consumers who purchase goods and services and are designed to provide them with an inducement to purchase
the marketer’s brand. Trade-oriented sales promotion includes activities such as promotional allowances, free
goods, merchandise allowances, cooperative advertising, push money, dealer sales contests; and salesforce
promotion – bonuses, contests, sales rallies designed to motivate distributors and retailers to carry a product and
make an extra effort to promote or “push” it to their customers.
RAPID GROWTH OF SALES PROMOTION
Sales promotion tools are used by most organizations, including manufacturers, distributors, retailers,
trade associations, and nonprofit institutions. Estimates of annual sales-promotion spending run as high as $125
billion, and this spending has increased rapidly in recent years. A few decades ago, the ratio of advertising to
sales promotion spending was about 60/40. Today, in many consumer packaged-goods companies, the picture
is reversed, with sales promotion accounting for 75 percent or more of all marketing expenditures. Sales-promotion expenditures have been increasing 12 percent annually, compared to advertising's increase of only 7.6
percent.
Several factors have contributed to the rapid growth of sales promotion, particularly in consumer
markets. First, inside the company, promotion now is accepted more by top management as an effective sales
tool and more product managers are qualified to use sales promotion tools. Furthermore, product managers face
greater pressures to increase their current sales. Second, externally, the company faces more competition, and
competing brands are less differentiated. Competitors are using more and more promotions, and consumers have
become more deal oriented. Third, advertising efficiency has declined because of rising costs, media clutter, and
legal restraints. Finally, retailers are demanding more deals from manufacturers.
The growing use of sales promotion has resulted in promotion clutter, similar to advertising clutter.
Consumers are increasingly tuning out promotions, weakening their ability to trigger immediate purchase. In
fact, the extent to which U.S. consumers have come to take promotions for granted was illustrated dramatically
by the reactions of Eastern European consumers when Procter & Gamble recently gave out free samples of a
newly introduced shampoo. To P&G, the sampling campaign was just business as usual. To consumers in
Poland, however, it was little short of a miracle:
With nothing expected in return, Warsaw shoppers were being handed free samples of Vidal Sassoon
Wash & Go shampoo, just for the privilege of trying the new product: no standing in line for a product that
might not even be on the shelf. Some were so taken aback that they were moved to tears.
Although no sales promotion is likely to create such excitement among promotion-prone consumers in
the United States and other Western countries, manufacturers now are searching for ways to rise above the
clutter, such as offering large coupon values or creating more dramatic point-of-purchase displays.
PURPOSE OF SALES PROMOTION
Sales-promotion tools vary in their specific objectives. For example, a free sample stimulates consumer
trial; a free management advisory service cements a long-term relationship with a retailer. Sellers use sales
promotions to attract new triers, to reward loyal customers, and to increase the repurchase rates of occasional
users.
There are three types of new triers – nonusers of the product category, loyal users of another brand, and
users who frequently switch brands. Sales promotions often attract the last group – brand switchers – because
nonusers and users of other brands do not always notice or act on a promotion. Brand switchers are mostly looking
for low price or good value. Sales promotions are unlikely to turn them into loyal brand users. Thus, sales
promotions used in markets where brands are very similar usually produce high short-run sales response but little
permanent market-share gain. In markets where brands differ greatly, however, sales promotions can alter
market shares more permanently.
Many sellers think of sales promotion as a tool for breaking down brand loyalty and advertising as a
tool for building up brand loyalty. Thus, an important issue for marketing managers is how to divide the budget
between sales promotion and advertising. Ten years ago, marketing managers typically would first decide how
much they needed to spend on advertising and then put the rest into sales promotion. Today, more and more
marketing managers first decide how much they need to spend on trade promotion, then decide what they will
spend on consumer promotion, and then budget whatever is left over for advertising.
There is a danger in letting advertising take a back seat to sales promotion, however. Reduced
advertising spending can result in lost consumer brand loyalty. One recent study of loyalty toward 45 major
packaged-goods brands showed that when share of advertising drops, so does brand loyalty. Since 1975, brand
loyalty for brands with increased advertising spending fell 5 percent. However, for brands with decreased ad
spending, brand loyalty dropped 18 percent.
When a company price-promotes a brand too much of the time, consumers begin to think of it as a
cheap brand. Soon, many consumers will buy the brand only when it is on special. No one knows when this will
happen, but the risk increases greatly if a company puts a well-known, leading brand on promotion more than 30
percent of the time. Marketers rarely use sales promotion for dominant brands because the promotions would do
little more than subsidize current users.
Most analysts believe that sales-promotion activities do not build long-term consumer preference and
loyalty, as does advertising. Instead, promotion usually produces only short-term sales that cannot be
maintained. Small-share competitors find it advantageous to use sales promotion because they cannot afford to
match the large advertising budgets of the market leaders. Nor can they obtain shelf space without offering trade
allowances or stimulate consumer trial without offering consumer incentives. Thus, price competition is often
used by small brands seeking to enlarge their shares, but it is usually less effective for a market leader whose
growth lies in expanding the entire product category.
The upshot is that many consumer packaged-goods companies feel that they are forced to use more
sales promotion than they would like. Recently, Kellogg, Kraft, Procter & Gamble, and several other market
leaders have announced that they will put growing emphasis on pull promotion and increase their advertising
budgets. They blame the heavy use of sales promotion for decreased brand loyalty, increased consumer price
sensitivity, a focus on short-run marketing planning, and erosion of brand-quality image.
Some marketers dispute this criticism, however. They argue that the heavy use of sales promotion is a
symptom of these problems, not a cause. These marketers assert that sales promotion provides many important
benefits to manufacturers as well as to consumers. Sales promotions encourage consumers to try new products
instead of always staying with their current ones. They lead to more varied retail formats, such as the everyday-low-price store or the promotional-pricing store, which gives consumers more choice. Finally, sales promotions
lead to greater consumer awareness of prices, and consumers themselves enjoy the satisfaction of feeling like
smart shoppers when they take advantage of price specials.
Sales promotions usually are used together with advertising or personal selling. Consumer promotions
usually must be advertised and can add excitement and pulling power to ads. Trade and salesforce promotions
support the firm’s personal selling process. In using sales promotion, a company must set objectives, select the
right tools, develop the best program, pretest and implement it, and evaluate the results.
SETTING SALES-PROMOTION OBJECTIVES
Sales-promotion objectives vary widely. Sellers may use consumer promotions to increase short-term
sales or to help build long-term market share. The objective may be to entice consumers to try a new product,
lure consumers away from competitors’ products, get consumers to “load up” on a mature product, or hold and
reward loyal customers. Objectives for trade promotions include getting retailers to carry new items and more
inventory, getting them to advertise the product and give it more shelf space, and getting them to buy ahead. For
the salesforce, objectives include getting more salesforce support for current or new products or getting
salespeople to sign up new accounts.
In general, sales promotions should be consumer franchise building – they should promote the product's
positioning and include a selling message along with the deal. Ideally, the objective is to build long-run consumer
demand rather than to prompt temporary brand switching. If properly designed, every sales-promotion tool has
consumer franchise building potential.
SELECTING SALES-PROMOTION TOOLS
Many tools can be used to accomplish sales-promotion objectives. The promotion planner should
consider the type of market, the sales-promotion objectives, the competition, and the costs and effectiveness of
each tool.
Consumer-Promotion Tools
The main consumer-promotion tools include samples, coupons, cash refunds, price packs, premiums,
advertising specialties, patronage rewards, point-of-purchase displays and demonstrations, and contests,
sweepstakes and games.
Trade-Promotion Tools
More sales-promotion dollars are directed to retailers and wholesalers than to consumers. Trade
promotion can persuade retailers or wholesalers to carry a brand, give it shelf space, promote it in advertising,
and push it to consumers. Shelf space is so scarce these days that manufacturers often have to offer price-offs,
allowances, buy-back guarantees, or free goods to retailers and wholesalers to get on the shelf and, once there, to
stay on it.
Manufacturers use several trade-promotion tools. Many of the tools used for consumer promotions –
contests, premiums, displays – can also be used as trade promotions. Or the manufacturer may offer a straight
discount off the list price on each case purchased during a stated period of time (also called a price-off, off-invoice, or off-list). The offer encourages dealers to buy in quantity or to carry a new item. Dealers can use the
discount for immediate profit, for advertising, or for price reductions to their customers.
Manufacturers also may offer an allowance (usually so much off per case) in return for the retailer’s
agreement to feature the manufacturer’s products in some way. An advertising allowance compensates retailers
for advertising the product. A display allowance compensates them for using special displays.
Manufacturers may offer free goods, which are extra cases of merchandise, to middlemen who buy a
certain quantity or who feature a certain flavor or size. They may offer push money – cash or gifts to dealers or
their salesforce to “push” the manufacturer’s goods. Manufacturers may also give retailers free specialty
advertising items that carry the company’s name, such as pens, pencils, calendars, paperweights, matchbooks,
memo pads, ashtrays, and yardsticks.
Business-Promotion Tools
Companies spend billions of dollars each year on promotion to industrial customers. These business
promotions are used to generate business leads, stimulate purchases, reward customers, and motivate
salespeople. Business promotion includes many of the same tools used for consumer or trade promotions. The
most important are conventions and trade shows, and sales contests.
DEVELOPING THE SALES-PROMOTION PROGRAM
The marketer must make some other decisions in order to define the full sales-promotion program.
First, the marketer must decide on the size of the incentive. A certain minimum incentive is necessary if the
promotion is to succeed; a larger incentive will produce more sales response. Some of the large firms which sell
consumer packaged goods have a sales-promotion manager who studies past promotions and recommends
incentive levels to brand managers.
The marketer also must set conditions for participation. Incentives might be offered to everyone or only
to select groups. For example, a premium might be offered only to those who turn in box tops. Sweepstakes
might not be offered in certain states, to families of company personnel, or to people under a certain age.
The marketer then must decide how to promote and distribute the promotion itself.
The length of the promotion is also important. If the sales-promotion period is too short, many
prospects will miss it. If the promotion runs too long, the deal will lose some of its “act now” force. Brand
managers need to set calendar dates for the promotions. The dates will be used by production, sales, and
distribution. Some unplanned promotions also may be needed, requiring cooperation on short notice.
Finally, the marketer must determine the sales-promotion budget, which can be developed in one of two
ways. The marketer may choose the promotions and estimate their total cost. However, the more common way is
to use a percentage of the total budget for sales promotion. One study found three major problems in the way
companies budget for sales promotion. First, they do not consider cost effectiveness. Second, instead of spending
to achieve objectives, they simply extend the previous year’s spending, take a percentage of expected sales, or
use the “affordable approach”. Finally, advertising and sales-promotion budgets are too often prepared
separately.
PRETESTING AND IMPLEMENTING
Whenever possible, sales-promotion tools should be pretested to find out if they are appropriate and of
the right incentive size.
Companies should prepare implementation plans for each promotion, covering lead time and sell-off
time. Lead time is the time necessary to prepare the program before launching it. Sell-off time begins with the
launch and ends when the promotion ends.
EVALUATING THE RESULTS
Evaluation is also very important. Yet many companies fail to evaluate their sales-promotion programs,
and others evaluate them only superficially. Manufacturers can use one of many evaluation methods. The most
common method is to compare sales before and after the promotion.
Consumer research also would show the kinds of people who responded to the promotion and what they
did after it ended. Surveys can provide information on how many consumers recall the promotion, what they
thought of it, how many took advantage of it, and how it affected their buying. Sales promotions also can be
evaluated through experiments that vary factors such as incentive value, length, and distribution method.
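The before-and-after sales comparison mentioned above can be sketched in a few lines of Python. The sales figures and function name below are invented for illustration; this is a minimal sketch of the arithmetic, not a method from the paper:

```python
# Illustrative before/during/after comparison for a sales promotion.
# All figures are hypothetical.

def promotion_lift(before, during, after):
    """Return the sales lift during the promotion and the residual change
    after it, both relative to the pre-promotion baseline."""
    lift = (during - before) / before
    residual = (after - before) / before
    return lift, residual

lift, residual = promotion_lift(before=10_000, during=13_000, after=9_500)
print(f"Lift during promotion: {lift:.0%}")    # 30% above baseline
print(f"Change after promotion: {residual:.0%}")  # -5%: some demand was borrowed
```

A dip below baseline after the promotion ends, as in this invented example, is one sign that the promotion borrowed future sales rather than attracting new buyers.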
Clearly, sales promotion plays an important role in the total promotion mix. To use it well, the
marketer must define the sales-promotion objectives, select the best tools, and evaluate the results.
REFERENCES:
[1] BEARDEN W.O., INGRAM T.N., LAFORGE R.W.: Marketing: principles, perspectives. – Chicago: Irwin,
1995. – 631 p.
[2] KOTLER Ph.: Marketing Management. – Millennium ed., International ed. – Englewood Cliffs: Prentice-Hall
International, Inc., 2000. – 718 p.
[3] KOTLER Ph.: Marketing management: Analysis, planning, implementation, and control. – 8th ed. –
Englewood Cliffs: Prentice-Hall International, Inc., 1994. – 801 p.
[4] KOTLER Ph.: Principles of marketing. – 2nd ed. – Englewood Cliffs: Prentice-Hall, 1994. – 692 p.
[5] SHIMP T.A.: Advertising, promotion, and supplemental aspects of integrated marketing communication. –
4th ed. – Fort Worth: The Dryden Press – Harcourt Brace College Publishers, 1997.- 589 p.
INTERNATIONAL CONFERENCE ON SOFTCOMPUTING APPLIED IN COMPUTER AND ECONOMIC ENVIRONMENTS ICSC 2004
EPI Kunovice, Czech Republic, January 29-30, 2004
64
STRATEGIES OF AN INTERNATIONAL MARKET ENTRY
Ponomareva Maria Alexandrovna
Plehanov Russian Academy of Economics: Stremiannii alley, 36, 8 (095) 2379552, [email protected]
Abstract: When a company plans to enter a new market, it should first of all decide which country to
enter. There are several major criteria for the choice of country which a company should think over,
namely: market attractiveness (GNP per head, forecast demand, etc.); competitive advantage (prior
experience in similar markets, language and cultural understanding); and risk (political stability,
possibility of government intervention and similar external influences, business risk, currency risk,
etc.). Once a company decides to enter the market of a particular country, it has to determine the best
mode of entry. It can choose from indirect exporting, direct exporting, licensing, joint ventures, and
direct investment. Each succeeding strategy involves more commitment, risk, control, and profit
potential, and these factors increase from indirect exporting to direct investment.
Keywords: Indirect Export, Domestic-Based Export Merchant, Domestic-Based Export Agent,
Cooperative Organization, Export-Management Company, Direct Export, Domestic-Based Export
Department or Division, Overseas Sales Branch or Subsidiary, Traveling Export Sales
Representative, Foreign-Based Distributor or Agent, Licensing, Management contract, Contract
manufacturing, Joint Venture, Direct Investment.
When a company plans to enter a new market, it should first of all decide which country to enter. There
are several major criteria for the choice of country to enter which a company should think over. They are the
following: market attractiveness (GNP per head, forecast demand, etc.); competitive advantage (prior experience
in similar markets, language and cultural understanding); risk (political stability, possibility of government
intervention and similar external influences, business risk, currency risk, etc.). Furthermore, some activities
should be carried out, namely: 1) a strategic review/SWOT analysis; 2) market and competitor research
(marketing-environment differences, market size, segmentation, cost factors, government support, etc.);
3) establishing a network of contacts; 4) evaluation of the cost, investment, control, risk and return of the
entry strategies; 5) evaluation of SLEPT factors: social, legal, economic (level and trend in per capita income,
balance of payments, inflation, exchange-rate situation, convertibility of the currency), political and
technological; 6) rethinking the marketing mix and its conformity to standards in the context of the country
chosen; 7) choice of media. The press may not be appropriate in countries where levels of literacy are low.
TV ownership may not be widespread. Outdoor advertising tends to rely on visuals and is therefore a good
international medium. Cinema is experienced in different ways, and the quality of films varies considerably.
Radio is mainly a support medium across the world, and commercial stations may not be available.
Once a company decides to enter the market of a particular country, it has to determine the best mode of
entry. It can choose from indirect exporting, direct exporting, licensing, joint ventures, and direct investment. It
is up to the company to decide which strategy to use. Each succeeding strategy involves more commitment,
risk, control, and profit potential, and these factors increase from indirect exporting to direct investment.
The normal way to get involved in a foreign market is through export. There are two types of exporting:
occasional exporting and active exporting. When we talk about occasional exporting we mean a passive level of
involvement where a company exports from time to time on its own initiative or in response to unsolicited orders
from abroad. On the contrary, active exporting takes place when the company makes a commitment to expand
exports to a particular market. In either case, the company produces all of its goods in the home country and it
might or might not adapt them to the foreign market. Exporting involves the least change in the company’s
product lines, organization, investments, or mission.
Companies typically start with indirect exporting, in other words they work through independent
middlemen. Four types of middlemen are available to the company: 1) Domestic-Based Export Merchant – this
middleman buys the manufacturer’s products and sells them abroad on its own account. 2) Domestic-Based
Export Agent – this agent seeks and negotiates foreign purchases and is paid a commission; trading companies
are included in this group. 3) Cooperative organization – an
organization that carries on exporting activities on behalf of several producers and is partly under their
administrative control. This type is often used by producers of primary products such as fruits, nuts, and so on.
4) Export-Management Company – this middleman agrees to manage a company’s export activities for a fee.
Indirect exporting has two main advantages. Firstly, it involves less investment. The firm does not have
to develop an export department, an overseas sales force, or a set of foreign contacts, because it uses
intermediaries such as export houses, specialist export management firms, complementary exporting (i.e. using
other companies’ products to pull your own into an overseas market) etc. Secondly, it involves less risk.
International-marketing middlemen bring know-how and services to the relationship, and the seller will normally
make fewer mistakes.
However, when the company wants to obtain a greater potential return in spite of greater investment and risk, it
may decide to handle its own exports. In this case it can carry out direct exporting in the following forms: 1)
Domestic-based Export Department or Division – an export sales manager carries on the actual selling and draws
on market assistance as needed. It might evolve into a self-contained export department performing all the
activities involved in export and operating as a profit center. 2) Overseas Sales Branch or Subsidiary – an
overseas sales branch allows the manufacturer to achieve greater presence and program control in the foreign
market. The sales branch handles sales and distribution and might handle warehousing and promotion as well. It
often serves as a display center and customer-service center. 3) Traveling Export Sales Representatives – the
company sends home-based sales representatives abroad to find business. 4) Foreign-Based Distributors or
Agents – the company hires foreign-based distributors or agents to sell the goods on behalf of the company.
They might be given exclusive rights to represent the manufacturer in that country or only general rights.
To sum up, direct exporting is exporting to overseas customers, who might be wholesalers, retailers or
users, without the use of export houses etc.
Licensing is an alternative to foreign direct investment by which overseas producers are given rights to
use the licensor's production process in return for a fee or royalty payments.
A licensing agreement is a commercial contract whereby the licensor gives something of value to the
licensee in exchange for certain performances and payments. Licensing represents a simple way for a
manufacturer to become involved in international marketing. The licensor may provide any of the following: 1)
Rights to produce a patented product or use a patented production process, 2) Manufacturing know-how
(unpatented), 3) Technical advice and assistance including the supply of essential materials, components, plants,
4) Marketing advice and assistance, 5) Rights to use a trademark or brand.
The licensor gains entry into the foreign market at little risk; the licensee gains production expertise or a
well-known product or name without having to start from scratch.
Licensing is growing in extent and importance throughout the world. It is used by firms of all sizes, as it
has many advantages for a licensor: 1) It requires little investment other than the continuing costs of
monitoring the agreement, 2) It enables entry into markets that would otherwise be closed, 3) As a mode of
entry, it is relatively simple and quick, 4) The licensor gains access to knowledge of local conditions, 5) New
products can be introduced to many countries quickly because of low investment requirements, 6) It provides all
the usual benefits of overseas production, 7) It can be a source of competitive advantage, if it spreads the firm's
proprietary technology, giving it wider exposure than that of rivals.
Licensing also suffers from drawbacks, however: 1) Revenues from licensing are very low, usually less
than 10% of turnover, 2) A licensee may eventually become the licensor's competitor, 3) Although the contract
may specify a minimum sales volume, there is some danger that the licensee will not fully exploit the market, 4)
Product quality might deteriorate if the licensee has a more lax attitude to quality control than the licensor, 5)
Governments may impose restrictions or conditions on the payment of royalties to the licensor or on the supply
of components, 6) It is often difficult to control the licensee effectively. The licensee's objectives often conflict
with those of the licensor, and disagreements are common.
Companies can enter foreign markets on other bases. A company can sell a management contract to
manage a foreign hotel, airport, hospital, or other organization for a fee. In this case a firm is exporting a service
instead of a product. Management contracting is a low-risk method of getting into a foreign market, and
furthermore it yields income from the beginning. The arrangement is especially attractive if the contracting firm
is given an option to purchase some share in the managed company within a stated period. On the other hand, the
arrangement makes no sense if the company can put its scarce management talent to better uses or if there are
greater profits to be made by undertaking the whole venture. Management contracting prevents the company
from going into competition with its clients.
Another method of entry is contract manufacturing, where the firm engages local manufacturers to
produce the product. In other words a firm (the contractor) makes a contract with another firm (the contractee)
abroad whereby the contractee manufactures or assembles a product on behalf of the contractor. The contractor
retains full control over marketing and distribution, while the manufacturing is done by a local firm. The
drawback of contract manufacturing is less control over the manufacturing process and the loss of potential
profits on manufacturing. On the other hand, it offers the company a chance to start faster, with less risk, and
with the opportunity to form a partnership or buy out the local manufacturer later.
Advantages of contract manufacturing include the following: 1) There is no need to invest in plant
overseas, 2) Risk associated with currency fluctuation is largely avoided, 3) The risk of asset expropriation is
minimized, 4) Control of marketing is retained by the contractor, 5) A product manufactured in the overseas
market may be easier to sell, especially to government customers, 6) Lower transport costs and, sometimes,
lower production costs can be obtained.
Disadvantages of contract manufacturing include the following: 1) Overseas contractee producers who
are reliable and capable cannot always be easily identified, 2) Sometimes the contractee producer's personnel
must receive intensive and substantial technical training, 3) The contractee producer may eventually become a
competitor, 4) Quality control problems in manufacturing may arise.
Consequently, contract manufacturing is perhaps best suited: 1) to countries where the small size of the
market discourages investment in plant, 2) to firms whose main strengths are in marketing rather than in production.
The next strategy for entering a foreign market is the joint venture. A joint venture is an arrangement in
which two or more firms (foreign investors and local investors) join forces for manufacturing, financial and
marketing purposes, with each having a share in both the equity and the management of the business. Forming a
joint venture might be necessary or desirable for economic or political reasons. The
foreign firm might lack the financial, physical, or managerial resources to undertake the venture alone. Or the
foreign government might require joint ownership as a condition for entry.
Advantages of joint ventures include the following: 1) Some governments discourage or even prohibit
foreign firms from setting up independent operations; 2) Joint ventures are especially attractive to smaller or
risk-averse firms, or where very expensive new technologies are being researched and developed; 3) When funds
are limited, joint ventures permit coverage of a larger number of countries, since each one requires less
investment by each participant; 4) A joint venture can reduce the risk of government intervention, as a local firm
is involved; 5) The participating enterprises benefit from all sources of profit; 6) A joint venture can provide
close control over marketing and other operations; 7) A joint venture with an indigenous firm provides local
knowledge quickly.
Disadvantages of joint ventures are that there can be major conflicts of interests between the different
parties. Disagreements may arise over: 1) profit shares, 2) amount invested, 3) the management of the joint
venture; 4) the marketing strategy, or other policies. One partner might want to reinvest earnings for growth, and
the other partner might want to withdraw these earnings. Furthermore, joint ownership can hamper a
multinational company from carrying out specific manufacturing and marketing policies on a worldwide basis.
The ultimate form of foreign involvement is direct ownership of foreign-based assembly or
manufacturing facilities. The foreign company can buy part or full interest in a local company or build its own
facilities. As a company gains experience in export, and if the foreign market appears large enough, foreign
production facilities offer distinct advantages: 1) The firm does not have to share its profits with partners of any
kind; 2) The firm does not have to share or delegate decision making, so there are no losses in efficiency
arising from inter-firm conflict; 3) There are none of the communication problems that arise in joint ventures,
license agreements etc.; 4) The firm is able to operate a completely integrated and systematic international
system; 5) the firm will gain a better image in the host country because it creates jobs; 6) the firm develops a
deeper relationship with government, customers, local suppliers, and distributors, enabling it to adapt its
products better to the local marketing environment; 7) the firm assures itself access to the market in case the host
country starts insisting that purchased goods have domestic content.
Disadvantages of wholly owned overseas manufacture include the following: 1) The substantial
investment funding required prevents some firms from establishing operations overseas; 2) Suitable managers,
whether recruited in the overseas market or posted abroad from home, may be difficult to obtain; 3) Some
overseas governments discourage or even prohibit 100% ownership of an enterprise by a foreign company; 4)
This mode of entry forgoes an overseas partner's market knowledge, distribution system and other local
expertise. However, the firm has no choice but to accept these risks if it wants to operate on its own behalf in the
host country.
Many companies show a distinct preference for a particular strategy of entry. One company might
prefer exporting because it minimizes its risk. Another company might prefer licensing because it is an easy way
to make money without investing much capital. Another company might favor direct investment because it
wants full control. Yet insisting on one mode of entry is too limiting. Some countries will not permit imports of
certain goods or allow direct investment, but will accept only a jointly owned venture with a foreign national.
Consequently, companies need to master all of these entry methods. Even though a company might have
preferences, it needs to adapt to each situation. In fact, most sophisticated multinationals manage several
entry modes simultaneously.
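The trade-off described throughout the paper (each successive entry mode involves more commitment, risk, control, and profit potential) can be sketched as a simple weighted decision matrix. The scores and weights below are invented assumptions for illustration, not values from the paper; risk is weighted negatively so that it counts against a mode:

```python
# Hypothetical decision matrix for entry modes; all scores and weights are
# invented for illustration only, on a 1-5 scale per factor.
MODES = {
    # mode: (commitment, risk, control, profit potential)
    "indirect exporting": (1, 1, 1, 1),
    "direct exporting":   (2, 2, 2, 2),
    "licensing":          (2, 2, 2, 3),
    "joint venture":      (4, 3, 3, 4),
    "direct investment":  (5, 5, 5, 5),
}

def score(mode, weights=(0.2, 0.3, 0.2, 0.3)):
    """Weighted score: risk counts against a mode, the other factors for it."""
    c, r, ctrl, p = MODES[mode]
    wc, wr, wctrl, wp = weights
    return wc * c - wr * r + wctrl * ctrl + wp * p

best = max(MODES, key=score)
print(best)  # with these invented weights: direct investment
```

Changing the risk weight upward tips the ranking toward exporting, which mirrors the paper's point that a risk-averse company will prefer exporting while one seeking full control will prefer direct investment.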
SCOPES OF MARKETING STRATEGY FOR 3G MOBILE SERVICES
Emese Tokarčíková
Department of Macro and Microeconomics, Faculty of Management Science and Informatics,
University of Žilina, Moyzesova 20, 010 26 Žilina, Slovak Republic
Tel: 00421 – 041 – 513 4421, fax 0041 –041 –565 2044
e–mail : [email protected]
Abstract: Third-generation (3G) mobile services are the first multimedia services and more or
less generally available communication systems. Mobile operators using this technology are able to
offer new services to their customers. The purpose of marketing strategy is to direct mobile
operators toward offering suitable services to selected categories of customers – to be customer oriented.
Keywords: mobile services, 3rd generation, marketing strategy, customer requirements
1. INTRODUCTION
Digital communications in the world have changed greatly. Mobile phones have been one of the
technology success stories of the last few years, and more and more households also have home Internet access.
Mobile telephones allow us to talk on the move. The Internet turns raw data into helpful services that people
find easy to use. These two technologies now come together in 3G, the term used by the International
Telecommunication Union (ITU) for the third generation of mobile communication services. In the mid-1980s,
the ITU created a single standard for a family of technologies entitled 'IMT-2000', "International Mobile
Telecommunications". Following these trends, the European UMTS (which stands for Universal Mobile
Telecommunications System) falls within the ITU's IMT-2000 vision of a global family of 3G mobile
communication systems.
2. 3G MOBILE SERVICES
The simplest definition of services says that services are actions that satisfy people's needs, and mobile
services are no exception. 3G mobile services offer new ways to communicate, access information, conduct
business, learn and be entertained – liberated from slow, cumbersome equipment and immovable points of
access. 3G mobile services define different classes for four types of traffic:
· Conversational class (voice, video telephony, video gaming)
· Streaming class (multimedia, video on demand, web cast)
· Interactive class (web browsing, network gaming, database access)
· Background class (email, SMS, downloading)
Technically, the generations that provide these classes are defined as follows:

1G – AMPS (Advanced Mobile Phone Service): analogue voice service; no data service.

2G – CDMA (Code Division Multiple Access), TDMA (Time Division Multiple Access), GSM (Global System
for Mobile Communication), PDC (Personal Digital Cellular): digital voice service; 9.6 to 14.4 kbit/s; CDMA,
TDMA and PDC offer one-way data transmission only; no always-on data connection.

3G – CDMA-2000 (based on the Interim Standard-95 CDMA standard), TD-SCDMA (Time Division
Synchronous Code Division Multiple Access), W-CDMA (Wideband Code Division Multiple Access): superior
voice quality; up to 2 Mbit/s always-on data; broadband services like video and multimedia; enhanced roaming.

Fig. 1 History of mobile technologies
The UMTS Forum's Market Aspects Group has identified some common lifestyle attributes for mobile
multimedia applications. Here is a list of the possible types of services that will be available in 3G networks:
· Fun: WWW, video, postcards, snapshots, text, picture and multimedia messaging, datacast,
personalisation applications (ring tones, screen savers, desktops), jukebox, virtual companion / pet...
· Work: Rich call with image and data stream, IP telephony, B2B ordering and logistics,
information exchange, personal information manager, diary, scheduler, note pad, 2-way video
conferencing, directory services, travel assistance, work group, telepresence, FTP, instant
voicemail, colour fax...
· Media: Push newspapers and magazines, advertising, classifieds...
· Shopping: E-commerce, e-cash, e-wallet, credit card, telebanking, automatic transactions, auctions,
micro-billing shopping...
· Entertainment: News, stock market, sports, games, lottery, gambling, music, video, concerts,
adult content...
· Education: Online libraries, search engines, remote attendance, field research...
· Peace of Mind: Remote surveillance, location tracking, emergency use...
· Health: Telemedicine, remote diagnosis and health monitoring...
· Automation: Home automation, traffic telematics, machine-machine communication (telemetry)...
· Travel: Location-sensitive information and guidance, e-tour, location awareness, timetables,
e-ticketing...
· Add-on: TV, radio, PC, access to remote computer, MP3 player, camera, video camera, watch,
pager, GPS, remote control unit...
3G will become an essential part of our everyday lives and a catalyst for a whole new array of high-speed
mobile services, providing personal mobility, interactivity and access to advanced broadband and positioning
services anywhere, anytime. 3G is not about technology; it is about services. Basically, 3G opens the door to
anything customers can imagine. They will be able to do a multitude of things while going through their daily
schedule, whether at work or at leisure.
3. SCOPES OF MARKETING STRATEGY FOR 3G MOBILE SERVICES
Mobile operators need a marketing strategy that is adapted to the very different market of mobile data.
"A winning strategy needs an excellent customer and marketing conception, and a winning marketing conception
needs a suitable framework in the strategic management system of a company. The most important part of
strategy is the marketing conception, with its focus on suitable market segments and their customers." [2]
Therefore, strategic market planning is the very first step in creating an excellent marketing strategy and
conception; it includes the planning process that yields decisions on how a business unit can best compete in the
markets it elects to serve. The strategic plan is based upon the totality of the marketing process.
In creating a strategic plan for 3G mobile services, mobile operators can build on the experience and
results of 2G mobile services. It is necessary to analyze the success of GSM mobile services in order to forecast
customer perspectives for the next generation of mobile services. "Basic measurements in the customer
perspective usually consist of measurements such as:
· market share
· customer loyalty
· recruitment of new customers
· customer satisfaction
· customer profitability" [2]
Because 3G mobile services make it possible to individualize customer requirements, mobile
operators need better customer specification, close customer orientation and dialog communications.
The basic question regarding 3G mobile services, which needs to be answered clearly in the marketing
strategy, is:
How can mobile operators, software designers, and handset producers inspire users to use their
mobile phones for longer lengths of time and for more applications?
To answer this question, mobile operators first have to define:
· what kind of customers they want to address (e.g. by age, profession, demand, need,
habitation)
· with which kind of product or service they want to address customers (which 3G mobile service or
package of services to offer)
· in what respect the mobile operator is meaningful for the customer.
After that, mobile operators can set the frame of their marketing strategy:
· define the firm's identity according to external requirements (the mobile market)
· control the efficiency and expedience of its impact on potential partners in the market
· define and adhere to the firm's principles, culture and climate (so the operator can be more
competitive)
· define long-term targets, because in the case of 3G the returns on investment are long-term
matters
· create a special unit for close dialog with customers and the public
General requirements for 3G mobile services, common to all customers, have been identified as
follows (Fig. 2 Customer requirements): what the customer wants from the 3G services is one number, one bill,
one phone, global roaming, Internet access, useful services, ease of use, and value.
Beyond these general requests, customers have more individual needs. Therefore, qualitative processes
oriented to customers' requirements and focused on satisfying their demands are the determining terms of the
marketing strategy.
By answering the following questions and carrying out the accompanying analyses, operators can solve the basic problems in this field:
· Should 3G services be targeted at all users?
· What are the main needs that drive demand for 3G mobile services?
· What are the key market segments that should be addressed with 3G services?
· How big will the market for 3G services be in the future?
· Is there a killer application that will revolutionise the industry?
· What are the revenue opportunities for 3G mobile services providers?
· What should the focus of service creation be to ensure early time to profit in 3G?
· Who will be the key players in the 3G value chain and how will they share the revenue?
· Make a strategic analysis: competitive, regulatory, supplier and customer
· Analyse risks and opportunities, based on a scenario approach
· Identify the service portfolio and the associated marketing mix
· Identify the factors influencing customers' preferences
· Determine the life cycle of each service
· Define the tariff policy and communication policy
4. CONCLUSION
Mobile operators have to decide not only whether to build out or upgrade 3G infrastructures, but mainly
which mobile services they will offer to customers and how they will profit from them? There are no set answers
to these questions. For all they have to create outright marketing strategy which will answer not "why" but
"how” to improve odds bided by new mobile technology. This technology is a challenge for mobile operators
and the right strategy can made them more competitive and successful. It is also a chance to realize vision of “
building a mobile future where their customers can enjoy richer communications - anytime and anywhere, to be
a friendly partner, deliver easy-to-use and inspiring solutions that let us GET MORE out of life and to believe in
openness, respect for people and commitment to excellence.”[7]
REFERENCES
[1] TOKARČÍKOVÁ, E.: Nové služby pre tretiu generáciu (UMTS) mobilných sietí [New services for the third
generation (UMTS) of mobile networks], written work for the dissertation examination, Žilinská Univerzita
v Žiline, Žilina, 2003
[2] VODÁK, J.: Meaning of marketing conception for strategic management system on global markets, in
Proceedings of the international scientific conference "Marketing of the companies in V4 countries one step
before the entry to the European Union", Matej Bel University in Banská Bystrica, 2003
[3] KOTLER, P.: Marketing podle Kotlera [Marketing according to Kotler], Management Press, Praha, 2000
[4] http://www.umtsworld.com
[5] http://www.tomiahonen.com
[6] http://www.3g.co.uk
[7] http://www.eurotel.sk
CULTURE IN INTERNATIONAL MARKETING AND BUYER BEHAVIOUR
Kalugin Evgeniy
The Plekhanov Russian Academy of Economics, Stremanniy pereulok 36, 113054,
Moscow, Russian Federation.
tel: +7 095 2379552; e-mail:[email protected] mail.ru.
Abstract: To develop effective marketing programs, the marketing manager must have knowledge of
the needs and wants of potential buyers, how they arise, and how and where they are likely to be
satisfied.
Keywords: characteristics of culture, international marketing and buyer behavior, examples of
cultural blunders made by international marketers, the cultural sensitivity of markets, the
development of global culture, cultural analysis of global markets, cross-cultural analysis.
1. INTRODUCTION
Culture is the learned ways of group living and the group’s responses to various stimuli. It is also the
total way of life and thinking patterns that are passed from generation to generation. It encompasses norms,
values, customs, art, and beliefs. Culture is the patterns of behavior and thinking that people living in social
groups learn, create, and share. Culture distinguishes one human group from others. A people's culture includes
their beliefs, rules of behavior, language, rituals, art, technology, styles of dress, ways of producing and cooking
food, religion, and political and economic systems. Anthropologists commonly use the term culture to refer to a
society or group in which many or all people live and think in the same ways. Likewise, any group of people
who share a common culture—and in particular, common rules of behavior and a basic form of social
organization—constitutes a society. Thus, the terms culture and society are somewhat interchangeable.
Characteristics of culture: Culture is prescriptive. It prescribes the kinds of behavior considered
acceptable in the society. The prescriptive characteristic of culture simplifies a consumer’s decision-making
process by limiting product choices to those which are socially acceptable. These same characteristics create
problems for products not in tune with the consumer’s cultural beliefs. Culture is socially shared. Culture
cannot exist by itself; members of a society must share it, thus acting to reinforce culture’s prescriptive nature.
Culture is learned. Culture is not inherited genetically; it must be learned and acquired. Socialization, or
enculturation, occurs when a person absorbs or learns the culture in which he or she is raised. Culture facilitates
communication. One useful function provided by culture is to facilitate communication. Culture usually imposes
common habits of thought and feeling among people. Thus, within a given group, culture makes it easier for
people to communicate with one another. But culture may also impede communication across groups because of
a lack of shared cultural values. This is one reason why a standardized advertisement may have difficulty
communicating with consumers in foreign countries. How marketing efforts interact with a culture determines
the success or failure of a product. Advertising and promotion require special attention because they play a key
role in communicating product concepts and benefits to the target segment. Culture is subjective. People in
different cultures often have different ideas about the same object. What is acceptable in one culture may not
necessarily be so in another. In this regard, culture is both unique and arbitrary. Culture is enduring. Because
culture is shared and passed along from generation to generation, it is relatively stable and somewhat permanent.
Old habits are hard to break, and people tend to maintain their own heritage in spite of a continuously
changing world. Culture is cumulative. Culture is based on hundreds or even thousands of years of accumulated
circumstances. Each generation adds something of its own to the culture before passing the heritage on to the next
generation. Therefore culture tends to become broader based over time, because new ideas are incorporated and
become a part of the culture. Culture is dynamic. Culture is passed along from generation to generation, but one
should not assume that culture is static and immune to change. Culture is constantly changing; it adapts itself to
new situations and new sources of knowledge.
2. INTERNATIONAL MARKETING AND BUYER BEHAVIOR.
An understanding of buyer behavior is central to successful marketing. To develop effective marketing
programs, the marketing manager must have knowledge of the needs and wants of potential buyers, how they
arise, and how and where they are likely to be satisfied. Buyer behavior is affected by many factors. Class,
education, age, and psychosocial traits are just four of the many factors useful in distinguishing different buyer
groups. Marketers have long researched the relationships that exist between the marketing-mix variables and buyer needs and
responses. From this effort have evolved many buyer behavior models, concepts, and techniques.
International Marketing’s Four Buyer Behavior Tasks. Apparent similarities such as language can hide
subtle but important differences between markets. International marketers have often shown a higher propensity
to misinterpret a marketing situation when the cultural and economic environments of the foreign market are
apparently the same as their own. For example, Philip Morris lost a considerable amount of money when it tried to
introduce a U.S. cigarette to the Canadian market. Management was under the erroneous impression that
Canadians and Americans had similar smoking habits because they spoke the same language, had similar cultural
heritages, dressed more or less the same, and watched many of the same television programs. Campbell Soups
lost $30 million in Europe before it accepted the idea that British and U.S. soup consumers were different in
three important ways. First, British soup consumers had different taste preferences, and Campbell made no
attempt to modify the taste of its soups for the British palate. Second, British soup consumers had not been
educated to the condensed soup product concept, with its smaller can size. Third, British soup consumers
did not respond the same way to U.S. advertisements as U.S. consumers did. Examples of Cultural Blunders Made
by International Marketers: Language: A U.S. toothpaste manufacturer promised its customers that they would
be more “interesting” if they used the firm’s toothpaste. What the advertising coordinators did not realize,
however, was that in Latin American countries “interesting” is a euphemism for “pregnant”. Food: Chase
and Sanborn met resistance when it tried to introduce its instant coffee in France. In the French home, the consumption
of coffee plays more of a ceremonial role than in the English home. The preparation of “real” coffee is a
touchstone in the life of the French housewife, so she will generally reject instant coffee because its casual
characteristics do not “fit” French eating habits. Values: In 1963, Dow Breweries introduced a new beer
in Quebec, Canada, called “Kebec”; the promotion incorporated the Canadian flag and attempted to evoke
nationalistic pride. The strategy backfired when major local groups protested the “profane” use of “sacred”
symbols. Religion: England’s East India Company once caused a revolt when it did not modify a product. In
1857, bullets were often encased in pig wax, and the tops had to be bitten off before the bullets could be fired.
The Indian soldiers revolted, since it was against their religion to eat pork. Hundreds of people were killed before
order was restored. Social Norms and Time: A telephone company tried to incorporate a Latin flavor in its
commercials by employing Puerto Rican actors. In the ad, the wife said to her husband, “Run and phone Mary.
Tell her we will be a little late.” This commercial contained two major cultural errors: Latin wives seldom dare to order
their husbands around, and almost no Latin would feel it necessary to phone to warn of tardiness, since lateness is
expected.
The Sociocultural Dimension of Buyer Behavior. Culture influences consumption to a great extent.
Consumption patterns, living styles, and the priority of needs are all dictated by culture. Culture prescribes the
manner in which people satisfy their desires. Not surprisingly, consumption habits vary greatly. The
consumption of beef provides a good illustration. Some Chinese do not consume beef at all, believing that it is
improper to eat cattle that work on farms, thus helping to provide foods such as rice and vegetables.
3. THE CULTURAL SENSITIVITY OF MARKETS.
Markets can be divided into consumer markets and industrial markets. Consumer markets can be further
subdivided into durable goods markets and nondurable goods markets. A further profitable distinction in the
international market place is to divide durable goods into technological products and nontechnological products.
1. Industrial markets: The main distinction between industrial markets and consumer markets is that
industrial buyers are interested in solving problems. Generally, these problems are to reduce costs, to increase
production or administrative efficiency, to produce a particular type of product, or to effect a combination of these
goals. Consequently, industrial buyers tend to be rational and emphasize economic goals. Cultural and social
considerations play a relatively less important role in the purchase decision. To some extent, industrial markets
can be viewed as global markets. There are, however, differences that need to be considered when selecting the
products to be marketed and the marketing programs to be used. Government regulations, the size and
sophistication of the potential buyer’s operations, and the context within which the product or service is to be
used all have an impact on the marketing effort.
2. Consumer markets: These consist of buyers interested in satisfying a personal need or want. They are
generally more susceptible to cultural and social forces than are industrial buyers. The influence of sociocultural
factors is most apparent in the purchase of nondurable products such as clothing and foods. There are
exceptions, however, that make the international marketer’s task more interesting and challenging.
4. THE DEVELOPMENT OF GLOBAL CULTURE.
Rapid changes in technology in the last several decades have changed the nature of culture and cultural
exchange. People around the world can make economic transactions and transmit information to each other
almost instantaneously through the use of computers and satellite communications. Governments and
corporations have gained vast amounts of political power through military might and economic influence.
Corporations have also created a form of global culture based on worldwide commercial markets. Local culture
and social structure are now shaped by large and powerful commercial interests in ways that earlier
anthropologists could not have imagined. Early anthropologists thought of societies and their cultures as fully
independent systems. But today, many nations are multicultural societies, composed of numerous smaller
subcultures. Cultures also cross national boundaries. For instance, people around the world now know a variety
of English words and have contact with American cultural exports such as brand-name clothing and
technological products, films and music, and mass-produced foods. Many anthropologists have become
interested in how dominant societies can shape the culture of less powerful societies, a process some researchers
call cultural hegemony. Today, many anthropologists openly oppose efforts by dominant world powers, such as
the U.S. government and large corporations, to make unique smaller societies adopt Western commercial culture.
5. CULTURAL ANALYSIS OF GLOBAL MARKETS.
Whether a firm is pursuing a national-market or global-market strategy, it is interested in increasing the
effectiveness and efficiency of its marketing programs within and across foreign markets. It must therefore know
to what degree it can use the same product, pricing, promotion, and distribution strategies in more than one
market. Unfortunately, the dual goals of program effectiveness and efficiency are in conflict. Market
effectiveness is achieved by adapting marketing programs to market characteristics and conditions within
markets. While doing so incurs additional marketing and production costs, the firm strengthens its market
competitiveness by being more responsive to the needs of the marketplace. Efficiency, on the other hand, is
achieved by minimizing marketing program changes across markets. Thus the firm minimizes marketing and
production costs and strengthens its competitiveness vis-à-vis its competitors. The economic and competitive
implications of both goals need to be taken into account when making program adaptation decisions. Both goals
depend on understanding the cultural context of each market and the degree to which they are culturally similar.
Thus, global companies need to develop a capability to conduct cross-cultural analysis of buyer behavior. Such a
capability can help these companies optimally balance the competitive benefits to be derived from effectiveness
and efficiency.
6. CROSS-CULTURAL ANALYSIS.
“Cross-cultural analysis is the systematic comparison of similarities and differences in the material and
behavioral aspects of cultures.” In marketing, cross-cultural analysis is used to gain an understanding of
market segments within and across national boundaries. The purpose of this analysis is to determine whether the
marketing program, or elements of the program, can be used in more than one foreign market or must be
modified to meet local conditions. The approaches used to gain this understanding draw on the methods
developed by such social sciences as anthropology, linguistics, and sociology. Standard marketing research
techniques, such as multiattribute and psychographic techniques, can be used. For example, Berger, Stern, and
Johansson used the multiattribute method to study Japanese and American car buyers, and Boote used a
psychographic approach to study the segmentation of the European Community. In marketing, cross-cultural
analysis most often involves identifying the effects culture may have on family purchasing roles, product
function, product design, sales and promotion activities, channel systems, and pricing. One approach to the study
of the effects of culture on buyer behavior, and thus on the marketing-mix elements, was suggested by Engel,
Blackwell, and Miniard. It involves answering a comprehensive list of questions, although these are neither
exhaustive nor specific. For example, a manufacturer of processed foods would be interested in knowing the
impact that culture has on such things as taste, purchasing habits, and eating habits. A manufacturer of household
appliances, on the other hand, would be particularly interested in how potential buyers view a product’s
reliability, durability, and reparability.
7. CONCLUSION
There is no doubt that the international marketing process faces a large set of variables, as it takes
place across different countries and acts in different environments. One of the environments most decisive
for the success of the international marketing process is culture, which holds the reason for much
human action and behavior. For that reason, the international marketer should study deeply the cultural traits of any
country the company is planning to act in, so that special amendments to the organization’s overall plans and
actions can be made in accordance with the new market variables.
REFERENCES:
[1] International Marketing, Sixth Edition. Vern & Ravi. Dryden Press.
[2] International Marketing, Ninth Edition. Philip Cateora. IRWIN.
[3] International Marketing, Sixth Edition. Michael & Ilkka. Harcourt.
[4] International Marketing, Tenth Edition. Phillip & John Graham.
[5] Consumer Behavior, Eighth Edition. James F. Engel, Roger Blackwell & Paul Miniard.
EFFECTIVE COMMUNICATION AND PROMOTION STRATEGY
Oreshkin Alexey Gennadievich
Plehanov Russian Academy of Economics: Stremiannii alley, 36, 8 (095) 2379552, [email protected]
Abstract: In preparing marketing communications, the communicator has to understand the nine
elements of any communication process: sender, receiver, encoding, decoding, message, media,
response, feedback, and noise. The communicator's first task is to identify the target audience and
its characteristics. Next, the communicator has to define the response sought, whether it be
awareness, knowledge, liking, preference, conviction, or purchase. Then a message should be
constructed with an effective content, structure, and format. Media must be selected, both for
personal communication and nonpersonal communication. The message must be delivered by a
credible source—someone who is an expert and is trustworthy and likable. Finally, the communicator must collect feedback by watching how much of the market becomes aware, tries the
product, and is satisfied in the process.
Keywords: Advertising, Sales promotion, Public relations and Publicity, Personal selling, Direct
marketing, Sender, Receiver, Message, Media, Encoding, Decoding, Response, Feedback, Noise,
Personal Communication Channels, Nonpersonal Communication Channels.
Modern marketing calls for more than developing a good product, pricing it attractively, and making it
accessible to target customers. Companies must also communicate with present and potential stakeholders, and
with the general public. The marketing communications mix consists of advertising, sales promotion, public relations and
publicity, personal selling, and direct marketing.
Advertising. Because of the many forms and uses of advertising, it is hard to generalize about its unique
qualities as a part of the promotion mix. Yet several qualities can be noted. Advertising's public nature suggests
that the advertised product is standard and legitimate. Because many people see ads for the product, buyers know
that purchasing the product will be understood and accepted publicly. Advertising also lets the seller repeat a
message many times, and it lets the buyer receive and compare the messages of various competitors. Large-scale
advertising by a seller says something positive about the seller's size, popularity, and success.
Advertising is also very expressive, allowing the company to dramatize its products through the artful
use of print, sound, and color. On the one hand, advertising can be used to build up a long-term image for a
product (such as Coca-Cola ads). On the other hand, advertising can trigger quick sales (as when Sears advertises
a weekend sale). Advertising can reach masses of geographically spread-out buyers at a low cost per exposure.
Advertising also has some shortcomings. Although it reaches many people quickly, advertising is
impersonal and cannot be as persuasive as a company salesperson. Advertising is able to carry on only a one-way communication with the audience, and the audience does not feel that it has to pay attention or respond. In
addition, advertising can be very costly. Although some advertising forms, such as newspaper and radio
advertising, can be done on small budgets, other forms, such as network TV advertising, require very large
budgets.
Personal Selling. Personal selling is the most effective tool at certain stages of the buying process,
particularly in building up buyers' preferences, convictions, and actions. Compared to advertising, personal
selling has several unique qualities. It involves personal interaction between two or more people, so each person
can observe the other's needs and characteristics and make quick adjustments. Personal selling also allows all
kinds of relationships to spring up, ranging from a matter-of-fact selling relationship to a deep personal
friendship. The effective salesperson keeps the customer's interests at heart in order to build a long-term
relationship. Finally, with personal selling the buyer usually feels a greater need to listen and respond, even if the
response is a polite "no thank you."
These unique qualities come at a cost, however. A sales force requires a longer-term commitment than
does advertising—advertising can be turned on and off, but salesforce size is harder to change. Personal selling
is also the company's most expensive promotion tool, costing industrial companies an average of almost $200
per sales call. U.S. firms spend up to three times as much on personal selling as they do on advertising.
Sales Promotion. Sales promotion includes a wide assortment of tools— coupons, contests, cents-off
deals, premiums, and others—all of which have many unique qualities. They attract consumer attention and
provide information that may lead to a purchase. They offer strong incentives to purchase by providing
inducements or contributions that give additional value to consumers. And sales promotions invite and reward
quick response. Whereas advertising says "buy our product," sales promotion says "buy it now."
Companies use sales-promotion tools to create a stronger and quicker response. Sales promotion can be
used to dramatize product offers and to boost sagging sales. Sales-promotion effects are usually short-lived,
however, and are not effective in building long-run brand preference.
Public Relations. Public relations offers several unique qualities. It is very believable—news stories,
features, and events seem more real and believable to readers than do ads. Public relations also can reach many
prospects who avoid salespeople and advertisements—the message gets to the buyers as "news" rather than as a
sales-directed communication. And, like advertising, public relations can dramatize a company or product.
Marketers tend to underuse public relations or to use it as an afterthought. Yet a well-thought-out public
relations campaign used with other promotion mix elements can be very effective and economical.
The communication process consists of nine elements: sender, receiver, message, media, encoding,
decoding, response, feedback, and noise. To get their messages through, marketers must encode their messages
in a way that takes into account how the target audience usually decodes messages. They must also transmit the
message through efficient media that reach the target audience and develop feedback channels to monitor
response to the message.
To communicate effectively, marketers need to understand the fundamental elements underlying
effective communication. A communication model includes nine elements. Two represent the major parties in a
communication – sender and receiver. Two represent the major communication tools – message and media. Four
represent the major communication functions – encoding, decoding, response, and feedback. The last element in
the system is noise (random and competing messages that may interfere with the intended communication). The
model underscores the key factors in effective communication. Senders must know what audiences they want to
reach and what responses they want to get. They must encode their messages in a way that takes into account how the
target audience usually decodes messages. They must transmit the message through efficient media that reach
the target audience and develop feedback channels to monitor the responses.
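The nine-element model above can be illustrated with a small programming sketch. This is purely an illustrative toy, not part of the paper; the function names and the way noise is modeled are assumptions made for the example.

```python
# Toy sketch of the nine-element communication model described above:
# sender -> encoding -> message -> media (with noise) -> decoding -> receiver,
# whose response travels back to the sender as feedback.
import random

def encode(idea: str) -> str:
    """The sender puts the idea into symbolic form (the message)."""
    return idea.upper()

def transmit(message: str, noise_level: float, rng: random.Random) -> str:
    """The media carry the message; noise may corrupt individual symbols."""
    return "".join("#" if rng.random() < noise_level else ch for ch in message)

def decode(received: str) -> str:
    """The receiver interprets whatever symbols actually arrived."""
    return received.lower()

def communicate(idea: str, noise_level: float = 0.0, seed: int = 0):
    """Run one pass through the model; return (decoded message, feedback)."""
    rng = random.Random(seed)
    decoded = decode(transmit(encode(idea), noise_level, rng))
    # Feedback tells the sender whether the intended meaning survived the noise.
    feedback = "understood" if decoded == idea else "garbled"
    return decoded, feedback

print(communicate("buy it now"))        # no noise: meaning survives intact
print(communicate("buy it now", 0.4))   # heavy noise: symbols lost in transit
```

The point of the sketch is the one the text makes: communication succeeds only when encoding, media, and decoding line up, and the feedback channel is what lets the sender detect when noise has defeated the message.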
For a message to be effective, the sender's encoding process must mesh with the receiver's decoding
process. The more the sender's field of experience overlaps with that of the receiver, the more effective the
message is likely to be. This puts a burden on communicators from one social stratum (such as advertising
people) who want to communicate effectively with another stratum (such as factory workers).
The sender's task is to get his or her message through to the receiver. The target audience may not
receive the intended message for any of three reasons:
1. Selective attention: People are bombarded by 1,600 commercial messages a day, of which 80 are
consciously noticed and about 12 provoke some reaction. Selective attention explains why ads with bold
headlines promising something, such as "How to Make a Million," have a high likelihood of grabbing attention.
2. Selective distortion: Receivers will hear what fits into their belief system. As a result, receivers often
add things to the message that are not there (amplification) and do not notice other things that are there
(leveling). The communicator's task is to strive for simplicity, clarity, interest, and repetition to get the main
points across.
3. Selective retention: People will retain in long-term memory only a small fraction of the messages that
reach them. If the receiver's initial attitude towards the object is positive and he or she rehearses support
arguments, the message is likely to be accepted and have high recall. If the initial attitude is negative and the
person rehearses counterarguments, the message is likely to be rejected but to stay in long-term memory.
Because much of persuasion requires the receiver's rehearsal of his or her own thoughts, much of what is called
persuasion is actually self-persuasion.
The communicator considers audience traits that correlate with persuasibility and uses them to guide
message and media development. People of high education or intelligence are thought to be less persuasible, but
the evidence is inconclusive. Those who accept external standards to guide their behavior and who have a weak
self-concept appear to be more persuasible, as do persons who have low self-confidence.
Fiske and Hartley have outlined some general factors that influence the effectiveness of
communication:
• The greater the monopoly of the communication source over the recipient, the greater the recipient's
change or effect in favor of the source.
• Communication effects are greatest where the message is in line with the receiver's existing opinions,
beliefs, and dispositions.
• Communication can produce the most effective shifts on unfamiliar, lightly felt, peripheral issues, which
do not lie at the center of the recipient's value system.
• Communication is more likely to be effective where the source is believed to have expertise, high status,
objectivity, or likability, but particularly where the source has power and can be identified with.
• The social context, group, or reference group will mediate the communication and influence whether or
not the communication is accepted.
Developing effective communications involves eight steps: 1. identify the target audience, 2. determine
the communications objectives, 3. design the message, 4. select the communication channels, 5. establish the
total communications budget, 6. decide on the communications mix, 7. measure the communications' results,
and 8. manage the integrated marketing communication process.
In identifying the target audience, the marketer needs to perform familiarity and favorability
analyses, then seek to close any gap that exists between current public perception and the image sought.
Communications objectives may be cognitive, affective, or behavioral – that is, the company might want to put
something into the customer's mind, change the consumer's attitude, or get the consumer to act. In designing the
message, marketers must carefully consider message content, message structure, message format, and message
source. Communication channels may be personal (advocate, expert, and social channels)
or nonpersonal (media, atmospheres, and events).
PERSONAL COMMUNICATION CHANNELS
Personal communication channels involve two or more persons communicating directly with each other
face to face, person to audience, over the telephone, or through e-mail. Personal communication channels derive
their effectiveness through the opportunities for individualizing the presentation and feedback.
A further distinction can be drawn among advocate, expert, and social communication channels.
Advocate channels consist of company salespeople contacting buyers in the target market. Expert channels
consist of independent experts making statements to target buyers. Social channels consist of neighbors, friends,
family members, and associates talking to target buyers. In a study of 7,000 consumers in seven European
countries, 60 percent said they were influenced to use a new brand by family and friends.
Many companies are becoming acutely aware of the power of "word of mouth." They are seeking ways
to stimulate social channels to recommend products and services. Regis McKenna advises a software company
launching a new product to promote it initially to the trade press, opinion luminaries, and financial analysts, who
can supply favorable word of mouth; then to dealers; and finally to customers. MCI attracted customers with its
Friends and Family program, which encourages MCI users to ask friends and family members to use MCI so that
both parties will benefit from lower telephone rates. See the Marketing Memo "How to Develop Word-of-Mouth Referral Sources to Build Business."
Personal influence carries especially great weight in two situations. One is with products that are
expensive, risky, or purchased infrequently. Here buyers are likely to be strong information seekers. The other
situation is where the product suggests something about the user's status or taste. Here buyers will consult others
to avoid embarrassment.
Companies can take several steps to stimulate personal influence channels to work on their behalf:
• Identify influential individuals and companies and devote extra effort to them: In industrial selling, the
entire industry might follow the market leader in adopting innovations.
• Create opinion leaders by supplying certain people with the product on attractive terms: A new tennis
racket might be offered initially to members of high school tennis teams at a special low price. Or Toyota could
offer its more satisfied customers a small gift if they are willing to advise prospective buyers.
• Work through community influentials such as local disk jockeys, class presidents, and presidents of
women's organizations: When Ford introduced the Thunderbird, it sent invitations to executives offering them a
free car to drive for the day. Of the 15,000 who took advantage of the offer, 10 percent indicated that they would
become buyers, whereas 84 percent said they would recommend it to a friend.
• Use influential or believable people in testimonial advertising: Quaker Oats pays basketball star
Michael Jordan several million dollars to make Gatorade commercials. Jordan is viewed as the world's premier
athlete, so his association with a sports drink is a credible connection, as is his extraordinary ability to connect
with consumers, particularly children.
• Develop advertising that has high "conversation value": Ads with high conversation value often have
a slogan that becomes part of the national vernacular. In the mid-1980s, Wendy's "Where's the Beef?"
campaign (showing an elderly lady named Clara questioning where the hamburger was hidden in all that bread)
created high conversation value. Nike's "Just do it" ads have created a popular command for those unable to
make up their minds or take some action.
• Develop word-of-mouth referral channels to build business: Professionals will often encourage clients
to recommend their services. Dentists can ask satisfied patients to recommend friends and acquaintances and
subsequently thank them for their recommendations.
• Establish an electronic forum: Toyota owners who use an online service line such as America Online
can hold online discussions to share experiences.
NONPERSONAL COMMUNICATION CHANNELS
Nonpersonal channels include media, atmospheres, and events.
Media consist of print media (newspapers, magazines, direct mail), broadcast media (radio, television),
electronic media (audiotape, videotape, videodisk, CD-ROM, Web page), and display media (billboards, signs,
posters). Most nonpersonal messages come through paid media.
Atmospheres are "packaged environments" that create or reinforce the buyer's leanings toward product
purchase. Law offices are decorated with Oriental rugs and oak furniture to communicate "stability" and
"success." A luxury hotel will use elegant chandeliers, marble columns, and other tangible signs of luxury.
Events are occurrences designed to communicate particular messages to target audiences. Public-relations departments arrange news conferences, grand openings, and sports sponsorships to achieve specific
communication effects with a target audience.
Although personal communication is often more effective than mass communication, mass media might
be the major means of stimulating personal communication. Mass communications affect personal attitudes and
behavior through a two-step flow-of-communication process. Ideas often flow from radio, television, and print to
opinion leaders, and from these to the less media-involved population groups. This two-step flow has several
implications. First, the influence of mass media on public opinion is not as direct, powerful, and automatic as
INTERNATIONAL CONFERENCE ON SOFTCOMPUTING APPLIED IN COMPUTER AND ECONOMIC ENVIRONMENTS ICSC 2004
EPI Kunovice, Czech Republic, January 29-30, 2004
80
supposed. It is mediated by opinion leaders, people whose opinions are sought or who carry their opinions to
others. Second, the two-step flow challenges the notion that consumption styles are primarily influenced by a
"trickle-down" or "trickle-up" effect from mass media. People interact primarily within their own social group
and acquire ideas from opinion leaders in their group. Third, two-step communication suggests that mass
communicators should direct messages specifically to opinion leaders and let them carry the message to others.
Pharmaceutical firms should promote new drugs to the most influential physicians first.
Communication researchers are moving toward a social-structure view of interpersonal communication.
They see society as consisting of cliques, small groups whose members interact frequently. Clique members are
similar, and their closeness facilitates effective communication but also insulates the clique from new ideas. The
challenge is to create more system openness so that cliques exchange information with others in the society. This
openness is helped by people who function as liaisons and bridges. A liaison is a person who connects two or
more cliques without belonging to either. A bridge is a person who belongs to one clique and is linked to a person in another clique.
Although there are many methods used to set the promotion budget, the objective-and-task method,
which calls upon marketers to develop their budgets by defining their specific objectives, is the most desirable.
In deciding on the marketing communications mix, marketers must examine the distinct advantages and
costs of each promotional tool. They must also consider the type of product market in which they are selling,
whether to use a push or a pull strategy, how ready consumers are to make a purchase, the product's stage in the
product life cycle, and the company's market rank. Measuring the marketing communications mix's effectiveness
involves asking members of the target audience whether they recognize or recall the message, how many times
they saw it, what points they recall, how they felt about the message, and their previous and current attitudes
toward the product and company.
Managing and coordinating the entire communications process calls for integrated marketing
communications (IMC).
MARKETING PREPAREDNESS OF SMALL AND MIDDLE-SIZE ENTERPRISES FOR
THE ENTRY INTO THE EU
Jaroslav Ďaďo
Faculty of Economics, Matej Bel University, Banská Bystrica, SLOVAKIA
Tel. 00421-48-4152 786, fax 00421-48-411 6859, e-mail: [email protected]
Abstract: Globalisation trends are also affecting the countries of Eastern Europe, which until
recently were isolated from the world of marketing. The signing of the accession protocols for
entry into the EU has an enormous influence on small and middle-size enterprises. This step opens
up more room for the qualitative development of enterprises' marketing activities. Marketing
development has special importance for small and middle-size enterprises that are preparing to
enter the international market or that encounter international competition in their domestic
market.
Keywords: small and middle-size enterprises, European Union, SWOT analysis
The process of globalisation and integration of countries is reflected in the changed ratio between
production sold abroad and production consumed domestically. In the 1920s, the ratio was 93-95% to 5-7%. Now,
this ratio is reversed. Present day integration trends, specifically in Europe, are quickly and strongly influencing
Slovakia as well. On the Slovak market, these trends are more visible in the offers of foreign companies. Only a
few Slovak companies can, in this field, dictate trends on the European scale. Small and middle-size enterprises
are even less capable of this. Despite this fact, trends of globalisation are strongly showing their presence on the
Slovak market as well. Therefore, even companies that do not directly enter the European Economic Community
(EEC), have to accept global and European trends in marketing. Many companies exist that will not enter the
EEC for a long time. Nevertheless, these will have to (and already do) face heavy foreign competition on the
Slovak domestic market. Foreign competitors enter the market not only with a competing product but
also with a well-thought-out marketing approach and a marketing strategy.
Concerning the products offered on the Slovak market, globalisation trends are more and
more visible in consumption. Consumers meet products of foreign companies on a daily basis and
compare them with domestic products. This whole process directly influences big, small, and middle-size
enterprises.
The premises for the entry of small and middle-size Slovak enterprises into the EEC and for their integration
into the trends of international marketing are defined by many factors. The core areas that need to be addressed for
larger involvement of Slovak companies in European trends are mainly the following:
· the country's macroeconomic indicators and its pro-export state policy
· the competitive abilities of companies
· other factors
1. PREMISES FOR MARKETING
Slovakia started on the path of an association agreement with the European Union as a part of the Czechoslovak
Federation. Every government since the separation of the federation has declared its intention to join
international economic groupings such as the OECD, CEFTA, or the European Union. Slovakia is a member
of CEFTA and of the V4 group of countries, is also a member of the OECD, and is on the right track for entry into
the European Union. The integration efforts are also supported by a number of bilateral agreements signed by
the Slovak Government.
Even if the role of the state in the market economy is declining, the state can still support the
involvement of small and middle-size enterprises in international economic cooperation. The state can also
create better or worse conditions for the entry of foreign capital into the Slovak economy.
The development of macroeconomic indicators (such as the foreign trade balance, the volume of export,
the volume of foreign investment, etc.) shapes the development of globalisation and integration
trends. These indicators are sometimes strikingly similar across post-communist countries; the differences in
individual countries' positions emerge only from a thorough examination of the separate indicators. The development of
these factors directly influences both small and middle-size enterprises in Slovakia.
Liberalisation of the market, a symbol of globalisation, has meant catastrophe for some countries
and several sectors. In the clash of global tendencies, disagreements occur, for example, between Slovakia and
Hungary regarding the livestock market. The Poles face pressure from Slovakia and the Czech Republic in
the sugar market, Slovaks refuse to import Czech beer, Hungarians push their corn into Poland, and the Baltic
countries are faced with imports of cheaper meat. Individual countries compete with each other for
foreign capital, mainly using various tax policies, active labour-market policy tools, land prices,
"gifts" of infrastructure, etc.
The pro-export support policy constructed in several countries is based on the volume of
international trade. This indicator indirectly reflects the level of active involvement of companies in
international marketing activities.
An important factor reflecting the globalisation process is the entry of foreign investment. This
creates a premise for the cooperation of small and middle-size enterprises with large companies.
Cooperation between small and middle-size enterprises and companies with foreign investment allows
SME to become familiar with the rules of the international market and determines their successful
entry into it.
The sectoral structure of foreign direct investment (FDI) matters not only for the economy of a specific
country but also for SME*. It creates an option for involving SME in cooperation with foreign capital
and, indirectly, for entry onto the international market as well. In the year 2000, the sectoral structure of FDI in
Slovakia was dominated by production (54%), followed by transportation and communication (15%), banking
and insurance (13%), and trade (12%). Construction reached only 1% and services 2%.
The importance of marketing in the activities of SME can also be derived from macroeconomic
indicators for the SME sector. The share of SME production in Slovakia in 2001 was % of GDP;
the SME share in the total export of Slovakia was % (of 11.9 billion USD).
2. COMPANIES’ ABILITY TO COMPETE
The ability of small and middle-size enterprises in Slovakia (as in the other former member countries
of the Council for Mutual Economic Assistance) to compete is dominated by historical influences: the result of the
sectoral structure, the concentration of capacities, the technologies used, and the system of market
allocation valid until 1990. A structure of small and middle-size enterprises practically did not exist in Slovakia
from 1950 to 1990. The capacities of numerous companies were at that time sized for the needs of a market of
approximately 350 million people, and the sectoral structure of industry was unbalanced. Historical factors
are also strongly visible in the human resources of small and middle-size enterprises, from top-level
management down to the workers. This human resource phenomenon must now be seen in the relationships between
owners, managers, and workers. All of this has shaped the SME ability to compete.
The current analysis of the strengths and weaknesses of Slovak SME is summarised in Table 1.
The analysis is dominated by a marketing view. The indicated characteristics determine the premises
for integration into the process of global and European marketing.
*
Small and middle-size enterprises (translator’s note)
The analysis summarises information accumulated during workshops organised in cooperation
with PHARE. The aim of the workshops was to prepare SME managers with export potential to make
their own marketing plan for export. Within the project, eight workshops were realised, attended by
over 200 SME managers from around Slovakia.
Table No. 1 The analysis of strengths and weaknesses of SME

Strengths:
- high level of management creativity
- high analytical abilities of management
- creativity of management
- qualified work force
- internal reserves of companies
- ability to handle development of a product
- ability to produce in conditions with limited resources
- socially moderate employees
- vacant production capacities

Weaknesses:
- quality of production based on subsidies for special production
- local perception of the market
- missing real constructive production
- low ability to acquire information
- low level of marketing creativity
- low ability to find options on the market
- inappropriate usage of the owners' ownership rights
- lack of financial resources
- unclear strategy as a whole, especially the marketing strategy of the company
- orientation toward defensive strategies
- low level of willingness of managers to cooperate with owners
- low level of customer care
The core research question in these workshops was, "Why is the level of involvement of Slovak SME
in international marketing so low?" Besides the characteristics shown in Table 1, the research also revealed the
following shortcomings in the approaches of Slovak SME to marketing:
· The entry into the international market is prepared intuitively, without exact market research, without
calculations of the potential of the target market, and on the basis of insufficient information about the market.
SME see the market as undifferentiated; they cannot closely identify the customer, the target segment, its size,
or its purchasing behaviour. On the target market they know neither the real and potential competition nor the
substitute offer. They analyse the competition only superficially and are not aware of their real ability to compete.
· Their goals are focused more on means and less on outputs. The goals are not clearly defined; they lack
substance and a time frame and are not concrete.
· The basic entry strategy is direct export, without a clearly defined strategy. Companies to a high extent
form long-term ties with a single customer, whom they usually gain through the customer's own interest in
receiving a shipment of their goods. Companies are satisfied with having one customer, which increases the
risk of their position on the international market.
· Slovak suppliers do not have sufficient information about their foreign partners; the foreign partner, on the
other hand, often "studies" the Slovak supplier more thoroughly.
· The basic factor of the offer is the price of the product (usually low, based on the company's costs).
SME do not spend time creating a special market position for their offer, do not take product differences into
consideration, are unable to identify their USP, and do not base their offer on higher added value. They fall
back on the high quality of their production, which is neither declared by criteria nor certified, and which is not
compared against competing products. They cannot identify why, in what, and in comparison with what their
product has higher quality.
· A lack of confidence in their own abilities and competencies is widespread among SME. Their
identification of strengths and weaknesses is based neither on a real analysis of criteria nor on comparison with
the competition. They are unable to systematically identify threats and opportunities on the market and cannot
anticipate market trends in relation to their offer and the served segment.
· Planning of business and marketing activities has no formal expression; the planning process is
underestimated. SME do not take advantage of economic and financial modelling, do not work with
alternatives, and are unable to identify business risks connected with entry onto the international market.
· SME also lag in the organisation and management of their marketing activities. This situation is also
caused by insufficient preparedness of the human potential for executing marketing activities. Only on a small
scale do they try to predict customers' needs; their marketing lacks creativity and a conceptual approach to the
market. From the researched sample, only 10-15% of managers or marketing staff met the demanding criteria used.
One of the deciding factors that could help cope with the above-mentioned shortcomings and help SME
use their strengths is the education of people working in management positions in companies, of marketing
professionals, and of people who specifically work in international business. This education must entail
theoretical preparation and the acquisition of practical experience by working on various world markets.
The research also revealed a considerable pro-export potential among Slovak SME managers. Among the
deciding criteria of their success are:
· Managers show great effort in applying new information in practice, show interest in marketing training,
and want to change the present state of marketing in their company. They react proactively to positive stimuli
from their surroundings, treating each task as a challenge, and show high personal involvement. The personal
attendance of SME owners or managers at the workshops guarantees that they will be the guarantors and
driving forces of the change process.
· Workshop attendees showed high interest in information and information resources dealing with the
international market. More than 30% of participants had real-life experience of working in international
marketing. This experience leads them to try to establish their own marketing information system about the
international market.
· They understand export and other forms of entry into the international market as an important means of
stable profits for their company; in this field there exist real cash flows for deliveries of goods and services.
3. OTHER FACTORS
A wide range of social and cultural factors influences the process of integration of Slovakia (and not only
Slovakia) and of Slovak companies. A positive aspect of this trend is the growth in the number of cosmopolitan
consumers, dominated by younger consumers and representatives of the middle and upper class. On the other
hand, state policy partially stimulates the growth of so-called "nationally oriented consumers", who prefer the
consumption of products made in Slovakia. Marketing professionals know that globalisation trends in
consumption strongly influence consumption habits as well. For some needs and products, consumption habits
in Eastern Europe are much stronger than in Western Europe. This consumer trend lays the premise for
companies operating with imports from foreign markets.
Another important factor determining the involvement of Slovak SME in international marketing is
self-perception, both of managers and of consumers. With managers, this perception shapes their abilities and
the steps they take on the market, whereas with consumers it to a high degree shapes their consumption of a
large group of products.
The position of Slovakia within Eastern Europe in terms of self-perception is shown in research
conducted by the Fesel and GFK agency, indicated in Table 2 (the countries are identified according to the
international marking of motor vehicles). The respondents in the represented countries evaluated criteria chosen
by the agency. These criteria also influence the speed and level of acceptance of European and global trends in
consumption and in management style by SME managers. Management style shapes decision-making about the
entry of SME into the international market. These criteria strongly influence not only consumption but also how
well a company establishes itself on the international market, and on the domestic market in competition with
entering worldwide companies. A number of the above-mentioned characteristics directly reflect managing style.
Table No. 2 Ratio of self-perception (rating from 0 to 100)

Self-perception as:             H    CZ   SK   PL   BG   SU   RO
- moderní (modern)              36   36   32   41   32   29   35
- bystrí (clever)               47   34   31   34   51   30   58
- búrliví (loud)                39   44   56   52   47   62   25
- silní (masculine)             35   24   13   27   39   52   29
- seriózni (serious)            48   28   28   37   36   27   72
- mierumilovní (peaceful)       70   18   38   14   61   49   –
- vzdelaní (educated)           55   59   59   37   61   46   53
- cieľavedomí (goal-oriented)   50   40   44   52   33   16   36
- dôslední (pedantic)           28   10   20   11   27   14   14
- úspešní (successful)          28   19   23   41   21   7    22
- elegantní (elegant)           19   24   16   12   18   8    22
- pesimistickí (pessimistic)    41   24   23   25   25   17   13
- pomalí (slow)                 48   42   41   31   46   40   36
- spoločenskí (sociable)        66   35   47   63   59   62   73
- tolerantní (tolerant)         50   36   36   36   41   68   65
- konzervatívni (conservative)  22   28   25   20   28   32   26
- súcitní (compassionate)       68   63   64   53   44   46   50

Note: Countries are identified according to international car codes.
Source: According to "Mirror", Central European Economic Review, The Wall Street Journal Europe,
December 1997 – January 1998, Vol. V, No. 10, p. 7.
According to Table 2, Slovaks characterise themselves as educated, energetic, and empathetic, but also
as rather intolerant and weak. The opposite are Russians, who feel rough, sociable, and tolerant, but do not feel
modern or successful and think that they have little elegance. Romanians consider themselves pacifistic and
temperamental. Poles feel modern and successful, but inconsistent, with little elegance and a bit pugnacious.
Currently, globalisation affects small and middle-size enterprises in Eastern and Western Europe and around
the world. Globalisation forms a premise for a new approach to product development, communication, and
distribution. Globalisation penetrates the Slovak market via global products; it has a positive influence on
Slovak companies in the standardisation of products and packaging, in services, in support, and in distribution.
New distribution channels (the Internet) for some products make it possible to avert protectionist measures in a
given country. Companies must formulate their strategy with consideration for operating on a market where big
and strong international companies are established. Both defensive and offensive measures should find wider
application in Slovak companies. Given their abilities, it is apparent that they should develop a strategy of
occupying closely defined segments. Globalisation and European integration, which lead to Euromarketing,
suppress the importance of state borders. The above-mentioned trends lead to higher product quality and its
standardisation.
Within the trends of globalisation, Slovak companies must also consider trends specific to the
European market. Among the core trends that SME must take into account in their marketing strategies and
other marketing activities are the following:
· development of a common European market: removal of market barriers, free flow of goods, services,
investment, and labour, and application of a common currency
· the European Union will remain economically, legally, and culturally heterogeneous for years; needs will
be fulfilled at different levels in individual regions
· a change in the competitive environment in Europe and beyond will have a strong impact on Slovak
companies as well; competition will be stronger, and the system of so-called exclusive representation will be
limited
· the state will restrain its measures against competition, with which it has tried to regulate or protect the
domestic market
· development of the market will bring the establishment of new segments, and present segments will
change their behaviour
All of this will have a very strong impact on the marketing of Slovak small and middle-size enterprises. The
changes will be more visible in SME than in large companies and multinational corporations. They will directly
affect the formation of the SME offer: production quality, price, distribution methods, marketing
communication, consumer protection, and other fields of activity. The bottom line is that they will determine
not only the economic outcomes of SME but also their ability to survive on the market in the upcoming years.
Essentially, the effort to increase the quality of marketing activities must be visible in the strengthening of the
marketing orientation of Slovak SME and of their ability to compete on the market. A crucial factor of success
will be a change in work style and in the education of top managers and marketing professionals.
LITERATURE:
[1] "Mirror", Central European Economic Review, The Wall Street Journal Europe, December 1997 – January
1998, Vol. V, No. 10, p. 7.
[2] JAWORSKI, B. J., KOHLI, A. K. 1993. Market Orientation: Antecedents and Consequences. Journal of
Marketing, Vol. 57, July, pp. 53-71.
[3] KRIŽANOVA, A. 1995. Význam marketingového riadenia v podnikoch cestnej dopravy. In: Zborník prác F
PEDaS č. 3, Žilina: ES VŠDS, pp. 49-55.
[4] SLATER, S. F., NARVER, J. C. 1994. Does Competitive Environment Moderate the Market
Orientation-Performance Relationship? Journal of Marketing, Vol. 58, January, pp. 46-56.
[5] URAMOVA, M. 2002. Slovak Enterprises and their Competitiveness in the V4 Countries Area. In:
Geopolitical Importance of Central Europe (V4) and its Prospects. Banská Bystrica: Univerzita Mateja Bela,
Ekonomická fakulta, 2002, pp. 267-273. ISBN 80-8055-732-2.
INTERCULTURAL DIFFERENCES BETWEEN SLOVAK AND EU COUNTRIES
Vladimír Laššák, Jaroslav Ďaďo
Faculty of Economics, University of Matej Bel, Banská Bystrica, Slovakia
Tel. 00421-48-4152786, e-mail: [email protected]; [email protected]
Abstract: Intercultural management, one of the youngest disciplines of management, started to
develop after the fall of trade barriers in international trade in the last century. Globally
operating businesses are based on the free movement of goods, people, know-how, and capital, and
companies' operations develop without respect to the borders of countries or continents. The study
of interactions between different cultures and of relationships among people in international
management has become highly important. This article describes cultural differences between
Slovak and EU entrepreneurs in selected areas of business life, as recognised in our empirical
research.
Keywords: intercultural differences, European Union
INTRODUCTION
It is well known that the ability to adapt in a foreign country is directly related to one's knowledge of
the foreign culture. People feel cultural differences strongly during a very short visit to a foreign country, e.g.
on holiday or a business trip, but everybody is able to overcome them, because both the visitor and the hosts
tolerate many of their partner's habits and behaviours. Getting to know the smallest and most specific
differences of another country's culture is a kind of spice belonging to the traveller's passion.
In long-term working relations between employees from different cultures, such tolerance of many
habits and behaviours cannot be taken for granted. Everyday interaction of people who have been raised since
school age in different cultures and different value systems can turn the workplace into a hotbed of day-to-day
latent conflicts.
Doing a job under different cultural management is much more demanding and challenging, because
relations and procedures that run without any problems in "my home country" work here in a totally different
way. Simply said, my action (behaviour or requirement) is perceived and evaluated differently by my foreign
partners, and I usually face a response and reaction different from what I would usually get in my home-country
business environment.
Working in a multicultural environment requires integrating many elements and requirements of the foreign
culture into my own. This does not mean adapting to every rule and requirement of the other culture, because
one can then lose one's identity and the spontaneity of one's behaviour.
A very frequent mistake of culturally unskilled managers is judging and qualifying the behaviour of
domestic staff or employees according to criteria typical of the manager's own cultural environment. This
means that everything an employee has done is evaluated and judged against the cultural criteria of the
manager's country. For example, behaviour that most people from the EU see as "no problem" (coming a little
late, just twenty minutes, to a meeting) is common and "o.k." for them, but hardly acceptable in my own (e.g.
Slovak) culture.
1. EU MAN COMES TO SLOVAKIA
If somebody from the EU comes to visit Slovakia, he or she will probably hear that Slovakia is a country
in Central Europe and that Slovak people are in principle the same as EU people, without any serious
differences. "Well, then I shouldn't be afraid of being there and my adaptation will be without problems,"
thinks the EU visitor.
This is true only up to a point, until we compare the mentality and behaviour of Slovak people with other West
European nations. This good advice can therefore also be misleading, with the result that the foreigner will not
be sensitive enough to the situation in the new environment. He or she probably forgets that "don't be different"
cannot mean having similar attitudes and being confident in every situation.
EU entrepreneurs who analyse everyday situations in the Slovak environment according to their home
(e.g. French) cultural criteria are not able to interpret them correctly, because they naturally use a different
system for decoding these events. To better understand these cultural decoding systems, we discuss below some
cultural differences in the attitudes and behaviour of EU and Slovak people.
We focus primarily on two types of cultural issues:
1. differences in everyday life,
2. differences in the working environment, namely:
   a) differences in business trade or sales negotiations,
   b) differences in management situations.
2. DIFFERENCES IN EVERYDAY LIFE
There are many cultural differences concerning everyday life situations of an average EU citizen and a Slovak. Our research deals only with some of them. The fields described below are basic artefacts through which a specific culture can be identified.
Table 1: Cultural differences in everyday life, EU and Slovakia

Greetings
In EU: The greeting "Hello, how are you?" serves to make first contact with others. Most EU people do not expect a real answer; the obvious reply is "Hello, I'm well, thank you."
In Slovakia: Greeting has a different meaning. First contact is an opportunity to say something about oneself and to share real personal information. The answer is very often something like "My children are ill again" or "My car battery went flat this morning because of the tough winter."

Perception of time and meeting time
In EU: Being fifteen or twenty minutes late is very common, and everybody tolerates it. When invited to a family lunch or dinner, it is even expected to arrive a little late. Meetings are often set for the late evening, e.g. 19:00, but not during lunch time (12:00-14:00).
In Slovakia: The agreed meeting time is expected to be kept, both in private and in business situations. Meetings are usually arranged before or after lunchtime, e.g. at 13:00 or early in the morning (7:30), only exceptionally in the late evening. Meetings in the evening can be seen as an intrusion into private life.

Meal
In EU: The meal consists of starters, a main course, and cake or cheese. A bottle of water is always on the table. EU people are happy to spend time with others sitting around the table. Conversation topics usually concern gastronomy (food, wine).
In Slovakia: Lunch starts with soup in every season of the year. Drinks are offered after the meal. The time spent eating is much shorter, and the food is very often more important than the conversation. Discussion becomes much more intensive after lunch or dinner.

Spirits
In EU: Spirits (cognac, vodka, etc.) are not offered as an aperitif at the start, but usually at the end of a meeting.
In Slovakia: Spirits (cognac, vodka or brandy) are offered as an aperitif and also at any time of day if there is an appropriate opportunity (at an official business meeting or in the family).

Working hours
In EU: An employee works in order to live. The average EU working week is 35 hours; in other countries it is 40 or 48 hours.
In Slovakia: For the middle and lower social classes, having a job is necessary for survival. Having two or more jobs is quite common (despite an average unemployment rate of 18%).

Acceptance of foreigners in the family
In EU: It differs according to the time of arrival. A meal is offered only when the visitor comes at lunch or dinner time; otherwise soft drinks or coffee are offered.
In Slovakia: The time of the visit does not matter; meals and drinks (including spirits) are always offered. Welcoming a visitor starts with an offer of food, and the offered meal is a symbol of hospitality.

Body contact
In EU: When meeting friends, people often symbolically touch each other on the face.
In Slovakia: Slovaks touch others when shaking hands, touching the visitor's arm or shoulder as a symbol of friendship, which may be perceived as too intimate.
3. WORKING ENVIRONMENT DIFFERENCES
To understand different business-culture situations it is necessary to study misunderstandings and conflicts, because through them we can easily understand the motives that lead to such incidents. Conflict, open or hidden, extrinsic or intrinsic, is a part of everyday working life. People do not feel good in conflict situations, and therefore they try to avoid them and so to create a friendly working environment. An essential part of managing a personal conflict, however, is respect for the other side or partner. Generally speaking, we cannot be successful in global business without respecting differences in a multicultural environment. This means respecting different styles of life and thinking, and different habits, behaviour, procedures and decisions made in business situations.
In trade, a better understanding of partners and colleagues makes it possible to create a positive and effective working environment. To be successful in negotiations it is not enough to present professional competence and a sense for negotiation; co-operation, listening, understanding and knowing the habits of others are essential.
This part of the article has two sub-parts. The first focuses on negotiation matters; the second discusses differences relating to employees in EU and Slovak working situations.
Table 2: Cultural differences in trade situations and business life, EU and Slovakia

Business negotiations

Agreement
In EU: After an agreement is signed, obligatory fulfilment is expected; each side is obliged to fulfil what was promised.
In Slovakia: After an agreement is signed, fulfilment is expected, but not always entirely. The agreement is perceived as a guideline rather than an exact commitment.

Terms
In EU: Terms are set firmly, are obligatory, and are very important.
In Slovakia: Terms are a symbolic part of any plan.

Planning
In EU: Planning is done systematically, step by step, at each level of management.
In Slovakia: Planning is also done, but people do not take real care when planning; it is very often just a symbolic part of management.

Organisation of meetings
In EU: A meeting that seems a little disorganised, maybe chaotic, does not affect the opinion about the partner. Although the programme of the meeting is structured, everybody is free to follow it or not.
In Slovakia: The way a meeting is organised can reflect how the company organises its business in general. Strict respect for the agreed programme of the meeting is expected.

Legislation and relations
In EU: Fulfilling goals and achieving results are more important than good relations between partners. Laws win and stand above the personal interests of people.
In Slovakia: Very good relationships between partners are much more important than a premeditated negotiation strategy. Personal relations are much more important than legislation or rules.

Negotiation process
In EU: Partners may be seen as jumping from one topic to another and talking around the subject in fine words. Negotiation is an opportunity to exchange opinions; there is no problem in changing the topic of conversation or interrupting the partner ("French-style negotiation").
In Slovakia: The negotiation process tends to be different. Partners can jump from one topic to another without apparent logic, but a skilled businessman is usually very structured and goal-oriented in negotiation.

Decision making
In EU: Decisions are made after analysing and checking all information and knowing all possible circumstances.
In Slovakia: Decision making is in many cases intuitive. It is quite common to proceed without analysis and precise information, especially about markets.

Way of thinking
In EU: A strong sense for synthesis; short- and long-term anticipation presents no problem.
In Slovakia: People think more analytically than synthetically. Difficulties lie in general imagination (the conceptual view) and in short- or long-term anticipation.

Creativity
In EU: Creativity is developed from pupil age at basic schools and has its own tradition. Schools are oriented towards delivering know-how (knowledge, skills and attitudes).
In Slovakia: People are very creative, which can be seen in many private situations and in the parallel economy, but not in their jobs. Schools are oriented towards delivering accumulated knowledge; exact reproduction of the teacher's words or the textbook is expected.

Respecting rules
In EU: French people are rebels; they struggle to change laws (strikes, demonstrations, etc.). Customers of services are very often their hostages.
In Slovakia: Slovak people do not try to change bad laws; they try to get around them. In evading the law and agreed rules, the creativity of Slovak people knows no borders.

Relations to the boss
In EU: Employees are used to saying very openly what they think about their boss.
In Slovakia: Before the Velvet Revolution of 1989, employees were afraid to express their own opinions; they obeyed their bosses. After 1989 there is still anxiety: managers are not able to accept employees' opinions and cannot utilise employees' skills and competencies.
Table 3: Cultural differences concerning employees

Business lunch or dinner
In EU countries: A business lunch or dinner usually takes place before the official meeting. The time spent over the meal is used for assessing the partners.
In Slovakia: A business lunch or dinner usually takes place after the official meeting. The time spent over the meal is used for a short, quick discussion.

Managers
In EU countries: Managers are accepted and respected if they are competent and achieve appropriate results. Even fresh graduates can be appointed as managers; a diploma is seen as a symbol of competence.
In Slovakia: Managers are usually older; the principle of seniority is very strong. In recent years, however, there have been some positive changes, especially in foreign subsidiaries.

Goals of communication
In EU countries: Goals are declared very directly and clearly.
In Slovakia: Goals are very often declared indirectly.

Confidence
In EU countries: People are trusted if they fulfil what they promise.
In Slovakia: People are trusted if they are loyal to authority.

Degrees
In EU countries: Academic degrees (such as Dr. or Dipl.-Ing.) are not important in private and business relations.
In Slovakia: Academic degrees (such as Dr. or Dipl.-Ing.) are very important in society, though their use in companies has a declining tendency.

Agreeing
In EU countries: Always agreeing with your boss is not considered positively.
In Slovakia: Agreeing with the boss means you are safe and secure under his or her umbrella.
CONCLUSION
The goal of this paper was to describe selected cultural differences between Slovak and EU partners in common business and working situations. We have studied the main historical, political, social, cultural, economic and other reasons in order to understand the factors and forces which formed a certain typical business culture.
We compared the behaviour, habits and attitudes of managers in both environments. Our findings were structured into three tables concerning everyday life situations, negotiations and typical working situations.
We believe that studying and thinking about these problems can lead to better knowledge and understanding of cultural differences and to improved relations in the life and business of both countries. The recommendations from this study can be described as follows:
· To be successful in multicultural business it is necessary to be open, perceptive and tolerant towards the other country's culture.
· When we need to win over and involve foreign partners in successful co-operation, it is necessary to understand which differences are significant in each culture and how they can be interpreted by both sides.
· Features that differ significantly from the cultural point of view (strategic or managerial issues) can, on the other hand, surprisingly become reasons for co-operation in business situations (business issues), because they provide new stimuli.
FUZZY LOGIC AND FINANCIAL TIME SERIES
Pavel Dostál 1), Ladislav Žák 2)
1) Strakatého 15, 636 00 Brno, Czech Republic, E-mail: [email protected], http://www.iqnet.cz/dostal/, Phone: +420 5 44211639, Fax: +420 5 44234750
2) Department of Stochastic and Non-Standard Methods, Institute of Mathematics, Faculty of Mechanical Engineering, Technical University of Brno, Technická 2, 616 69 Brno, Czech Republic, E-mail: [email protected], Tel: +420 54114 2550, Fax: +420 54114 2527
Abstract: The article presents a possible prediction of time series by means of fuzzy logic. The prediction error MAPE and the estimation of trends serve as the evaluation of the method used. The obtained results show its possible applicability to the prediction of financial time series and their information efficiency during the process of decision making on the stock market.
Keywords: Time series, finance, prediction, fuzzy logic
1. INTRODUCTION
There are manifold methods used for the prediction of time series. In addition to classical methods such as the Box-Jenkins methodology, the Kalman filter, etc., we can use artificial neural networks, genetic algorithms or fuzzy logic; see articles [1-7]. This article describes research into the use of fuzzy logic for the prediction of financial time series.
We used a Fuzzy Inference System (FIS) of Sugeno type for the prediction of time series. The Sugeno FIS is designed to give one output from n inputs. The inputs of the FIS are the values of the time series preceding the value we want to predict; the output value is the predicted value. Long-term testing shows that the quality of prediction depends on the chosen number of time-series members that form the input of the FIS. The FIS does not describe the course of the time series when this number is either too low or too high (the low or high sensitivity of the regulator).
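The windowing step described above (n preceding values in, one predicted value out) can be sketched as follows. This is an illustrative Python fragment, not the authors' Matlab code, and the sample numbers are invented:

```python
# Build (lagged inputs -> next value) training pairs for an n-input predictor,
# mirroring how the FIS takes the n preceding members of the time series as
# inputs and the following member as the value to predict.
def make_lagged_pairs(series, n_inputs):
    """Return a list of (inputs, target) pairs, where each inputs list
    holds the n_inputs values immediately preceding the target."""
    pairs = []
    for t in range(n_inputs, len(series)):
        pairs.append((series[t - n_inputs:t], series[t]))
    return pairs

# Invented index values, three inputs per pair
series = [1600.0, 1605.0, 1603.0, 1610.0, 1612.0, 1609.0]
pairs = make_lagged_pairs(series, 3)
# pairs[0] == ([1600.0, 1605.0, 1603.0], 1610.0)
```

Any inference machinery (a Sugeno FIS, a neural network, etc.) can then be trained on such pairs.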
The FIS was tuned on the initial part of the time series to set the number of its input values. A clustering method was used to assign members of the time series to clusters; the number of clusters defines the number of linguistic values of each input linguistic variable. We chose the output of the Sugeno FIS in the form of a linear dependence. The constants of the FIS suitable for the examined time series were found by optimization of the Sugeno method over the preceding members of the time series. The design and tuning of the Sugeno FIS was done in Matlab (Fuzzy Logic Toolbox).
The quality of the prediction has been evaluated according to the match between the predicted and the real tendency of development of the time series, and according to the average error MAPE, defined as

MAPE = (1/L) Σ_{i=1}^{L} |Pi - Ri| / Ri ,

where R1, R2, …, RL are the real values of the time series, P1, P2, …, PL are the predicted members of the time series, and L is the number of predicted values.
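In code, the MAPE defined above is a one-liner. The numbers in the example are illustrative, not the paper's data:

```python
def mape(real, predicted):
    """Average error MAPE = (1/L) * sum_i |P_i - R_i| / R_i, where the R_i
    are the real values and the P_i the predicted values (L values each)."""
    assert len(real) == len(predicted) and len(real) > 0
    return sum(abs(p - r) / r for r, p in zip(real, predicted)) / len(real)

# Illustrative values: each prediction is off by 0.1 % of the real value
error = mape([1650.0, 1660.0], [1651.65, 1658.34])  # ~ 0.001
```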
2. EXAMPLE I. TIME SERIES OF NASDAQ INDEX
The first case tested was the time series of the Nasdaq index traded on the New York stock exchange. The values used cover the interval from 09:30 on 13.06.2003 to 16:00 on 01.07.2003 and the following two days (10-minute sampling, 520+40 values). The graph of the time series is shown in Fig. 1.
Fig.1. Nasdaq index – history
A suitable Sugeno FIS for the prediction of the Nasdaq index time series has 3 input variables and 9 values per input; see Figs. 2 and 3.
System NASDAQ: 3 inputs, 1 output, 9 rules (Sugeno type).
Fig. 2. FIS Sugeno of Nasdaq index
Fig. 3. The part of FIS surface
The evaluation of the match between predicted and real tendency gives a correct prediction in 62.5 % of cases. Fig. 4 shows the values +1 (-1), which represent an increasing (decreasing) real and predicted tendency.
Fig. 4. Nasdaq index - prediction
The prediction by means of fuzzy logic gave a value of MAPE = 0.0013. Fig. 5 shows the real values (solid line) and the predicted values (dashed line) of the time series.
Fig.5. Nasdaq index – prediction
3. EXAMPLE II. - TIME SERIES OF QQQ INDEX
The second case tested was the time series of the QQQ index traded on the New York stock exchange. The values used cover the interval from 09:30 on 13.06.2003 to 16:00 on 01.07.2003 and the following two days (10-minute sampling, 520+40 values). The graph of the time series is shown in Fig. 6.
Fig. 6. QQQ index - history
A suitable Sugeno FIS for the prediction of the QQQ index time series has 2 input variables and 8 values per input; see Figs. 7 and 8.
System QQQ: 2 inputs, 1 output, 8 rules (Sugeno type).
Fig. 7. FIS Sugeno of QQQ index
Fig. 8. The part of FIS surface
The evaluation of the match between predicted and real tendency gives a correct prediction in 65.0 % of cases. Fig. 9 shows the values +1 (-1), which represent an increasing (decreasing) real and predicted tendency.
Fig. 9. QQQ index - prediction
The prediction by means of fuzzy logic gave a value of MAPE = 0.0016. Fig. 10 shows the real values (solid line) and the predicted values (dashed line) of the time series.
Fig. 10. Index QQQ – prediction
4. CONCLUSION
The paper describes the design of a method for the prediction of financial time series by means of fuzzy logic. Compared with other methods such as the Box-Jenkins methodology, the Kalman filter, Elliott waves, artificial neural networks, genetic algorithms, etc. (see [1-7]), fuzzy logic ranks among the most successful for the prediction of financial time series. The prediction error MAPE and the prediction of the tendency of development of the time series give evidence of a usable prediction method for the process of decision making on the capital markets.
LITERATURE:
[1] DOSTÁL, P.: Neural Networks Decision Making and Stock Market. Lisbon 2000, ISF2000, conference paper, 5 p.
[2] DOSTÁL, P.: Neural Networks Decision Making and Stock Market. Lisbon 2000, ISF2000, conference paper, 5 p.
[3] DOSTÁL, P.: Neural Networks and Shares. Zlín 2000, Nostradamus 00, conference paper, pp. 22-27, ISBN 80-214-1668-8.
[4] DOSTÁL, P.: Stock Market and Financial Cybernetics. Zlín 2001, Nostradamus 01, conference paper, 7 p., ISBN 80-7318-030-8.
[5] DOSTÁL, P., ZMEŠKAL, O.: Chaos and Stock Market. Zlín 2001, Nostradamus 01, conference paper, 8 p., ISBN 80-7318-030-8.
[6] DOSTÁL, P.: Genetics and Shares. Brno 2002, Small and Medium Firm Management with Computer Support, conference paper, p. 11, ISBN 80-86510-56-5.
[7] DOSTÁL, P.: Soft Computing and Stock Market. Brno 2003, Mendel 03, conference paper, pp. 258-262, ISBN 80-214-2411-7.
[8] ŽÁK, L.: Odhad vlivu počasí na odběr elektrické energie pomocí fuzzy regulátoru [Estimating the influence of weather on electricity consumption by means of a fuzzy controller]. Praha: Automatizace no. 5-6, 2002, 5 p., ISSN 0005-125X.
[9] ŽÁK, L., RIAZ, M. K.: Předpověď spotřeby elektrické energie s využitím fuzzy regulátoru [Prediction of electricity consumption using a fuzzy controller]. Acta Mechanica Slovaca 3/2001, pp. 513-522, ISSN 1335-2393.
DETERMINATION OF TAX SHIELD AND ITS INFLUENCE IN FINANCIAL DECISIONS #
Jarmila Radová 1), Petr Marek 2), Tibor Hlačina 3)
1) Department of Banking and Insurance, Faculty of Finance and Accounting, University of Economics Prague, nam. W. Churchilla 4, 130 67 Prague, Czech Republic, Tel.: +420-2-24095161, Fax: +420-2-24095135, E-mail: [email protected]
2) Department of Corporate Finance and Valuation, Faculty of Finance and Accounting, University of Economics Prague, nam. W. Churchilla 4, 130 67 Prague, Czech Republic, Tel.: +420-2-24095147, Fax: +420-2-24095135, E-mail: [email protected]
3) European Polytechnical Institute, Kunovice, Osvobození 699, Czech Republic, Tel.: 572 548 035, Fax: 572 549 018, E-mail: [email protected]
Abstract: We aim at seeking the real moment of tax shield realization and investigate the influence that the time span between the moment of tax-shield claim origination and the real moment of tax shield realization has on the present value of the tax shield. The real moment of tax shield realization was recognized as the moment the tax shield is reflected in the taxpayer's cash flow, in the form of income tax reconciliation (differences) and paid tax deposits. Subsequently, we have investigated the dependence of the tax shield's present value on the discount rate and on the size of the time span between the moment of tax-shield claim origination and the tax period's end. The largest difference between the basic value of the tax shield and its present value was found at the maximum discount rate considered in our model and at the maximum size of this time span. The dependence was, however, not always found to be linear. We conclude by suggesting modifications of financial models that employ tax shields, using coefficients reflecting the real moment of tax shield realization.
Key Words: Tax Shield, Financial Model, Time Value of Money
JEL Classification: G3, H2
Taxes are not merely a financial managers' "nightmare"; principally they represent an important factor influencing most financial decisions, primarily through tax wedges and shields. Whereas a tax wedge diminishes the investor's return, tax shields, on the other hand, mitigate the impact of certain decisions on financial performance. A tax shield can be regarded as the sum of money by which the tax liability is reduced, due to the occurrence of a transaction that decreases the company's tax base, or entitles it to tax relief. The transaction can be understood as the posting of tax-deductible costs and expenses in the taxpayer's accounts, or the occurrence of a different matter resulting in tax liability alteration, e.g. the acquisition of fixed assets.
Mainstream finance has recently turned away from highly abstract theories in favour of detailed models of the actual functioning of the real world. And in this world, tax shields play an irreplaceable part: they alter capital budgeting decisions, influence capital structure, etc. All classic models of finance (1) assume that the real moment of tax shield realization is identical with the incurrence of the particular tax-deductible expense or other tax-liability-affecting transaction. However, this does not correspond to the real world. In our contribution, we aim to elaborate the tax shield issue in the suggested direction, under the conditions of the Czech Republic's tax law, and subsequently to apply the findings through adjustments of financial models. In our discussion, we shall seek the answers to the following questions:
1. What is the real moment of tax shield realization?
2. What influence does the real tax shield realization moment have on the tax shield's present value?

# This article has been elaborated as one of the outcomes of research projects registered with the Grant Agency of the Czech Republic under reg. nos. 402/03/1307 and 402/04/1313.
(1) For a review of the development of financial models see e.g. Copeland, T. E., Weston, J. F.: Financial Theory and Corporate Policy. Addison-Wesley, Reading 1988.
Question no. 1: What is the real moment of tax shield realization?
Cash-flow-based financial models must logically reflect the real tax shield realization in the company's cash flows at the time of the real execution of the relevant tax payment into the state budget. For illustration, a tax-deductible payment (2) incurred at any time in 2002 shall not be reflected in tax payments into the state budget until the 2002 income tax is settled, i.e. by 31st March of the next consecutive year (2003 in our example), or by 30th June 2003 in the case of taxpayers liable to an annual financial statements audit, or whose tax assessment is elaborated and presented by a tax advisor (3). The decreased tax base shall subsequently affect the amount of income tax deposits paid in the next consecutive tax deposit period. Consequently, the difference between the tax deposits paid and the real income-tax liability for the relevant period shall equal the amount by which the tax deposit payments were reduced.
Illustrative example (4):
1.9.2002: tax-deductible payment in the amount of 1,000 (payment A)
30.6.2003: tax balance reconciliation date; the difference paid for 2002 is decreased by the tax shield, i.e. 1,000 × 0.31 = 310 (payment B)
15.9.2003: income tax deposit payment decreased by 1/4 of the tax shield originated in 2002, i.e. 310 / 4 = 77.5 (payment C)
15.12.2003: income tax deposit payment decreased by 1/4 of the tax shield originated in 2002, i.e. 310 / 4 = 77.5 (payment D)
15.3.2004: income tax deposit payment decreased by 1/4 of the tax shield originated in 2002, i.e. 310 / 4 = 77.5 (payment E)
15.6.2004: income tax deposit payment decreased by 1/4 of the tax shield originated in 2002, i.e. 310 / 4 = 77.5 (payment F)
30.6.2004: tax balance reconciliation date; the difference paid for 2003 is increased by the amount by which the tax deposits paid in 2003 were reduced, i.e. 77.5 × 2 = 155 (payment G)
30.6.2005: tax balance reconciliation date; the difference paid for 2004 is increased by the amount by which the tax deposits paid in 2004 were reduced, i.e. 77.5 × 2 = 155 (payment H)
Question no. 2: What influence does the real tax shield realization moment have on the present value of the tax shield?
We shall first concentrate on the methodology of calculating the present value of the tax shield. The calculation procedure can be viewed in the illustration depicted in Table no. 1 (5).
(2) For the sake of simplicity we assume the tax shield claim originates at the same time as the realization of this payment, or that the execution of this payment shall sooner or later lead to origination of the claim, for example based on incurrence of tax-deductible costs and expenses.
(3) Sec. 40, par. 3 of Tax Administration Act no. 337/1992 Coll. We have left out of consideration financial years not corresponding to a calendar year.
(4) The illustrative example assumes that a) the investor uses the services of a tax advisor and exploits the opportunity of submitting the tax assessment by the end of June of the next consecutive year, b) the tax period is the calendar year, c) the income tax rate equals 31%, d) the last known tax liability of the taxpayer exceeded 150,000 Kč.
(5) Here we add two additional assumptions: interest compounded on a monthly basis and a fixed monthly discount rate of 1% (i.e. an annual discount rate of 12%).
Table no. 1: Present Value of Tax Shield – Illustrative Example

Date       | Payment | Number of months | Original payment | Tax shield | Present value of tax shield
1.9.2002   |    A    |        0         |     -1000.0      |            | -1000.0
30.6.2003  |    B    |       10         |                  |  +310.00   | +310/(1+0.01)^10 = 259.17
15.9.2003  |    C    |       12         |                  |   +77.50   | +77.5/(1+0.01/2)/(1+0.01)^12 = 62.89
15.12.2003 |    D    |       15         |                  |   +77.50   | +77.5/(1+0.01/2)/(1+0.01)^15 = 61.04
15.3.2004  |    E    |       18         |                  |   +77.50   | +77.5/(1+0.01/2)/(1+0.01)^18 = 59.24
15.6.2004  |    F    |       21         |                  |   +77.50   | +77.5/(1+0.01/2)/(1+0.01)^21 = 57.50
30.6.2004  |    G    |       22         |                  |  -155.00   | -155/(1+0.01)^22 = -115.00
30.6.2005  |    H    |       34         |                  |  -155.00   | -155/(1+0.01)^34 = -102.05
Total      |    X    |        X         |     -1000.0      |  +310.00   | +282.77
The basic value of the tax shield is understood as the total (not discounted) cash amount by which we can decrease the tax liability. In the case of a tax-deductible payment, or of other transactions entitling the company to claim an income-tax-base reduction, it is calculated as the product of the value of the income-tax-base reducing item and the income tax rate:

TS = TBRI × t,

where TS = basic value of tax shield, TBRI = income-tax-base reducing item, t = income tax rate.

In case the tax liability decrease is claimed on the basis of an entitled income-tax relief, the basic value of the tax shield equals the relief value:

TS = relief,

where relief = value of income-tax relief.
Now we shall approach the calculation of the present value of the tax shield. As independent variables we shall regard the monthly discount rate [i_m] and the number of months from the moment of tax-shield claim origination till the end of the tax period [n]. The other parameters reflect the model's assumptions suggested in the illustrative example.
PV(TS) = TS × { 1/(1+i_m)^(n+6)
  + [1/(4(1+i_m/2))] × [ 1/(1+i_m)^(n+8) + 1/(1+i_m)^(n+11) + 1/(1+i_m)^(n+14) + 1/(1+i_m)^(n+17) ]
  - (1/2) × [ 1/(1+i_m)^(n+18) + 1/(1+i_m)^(n+30) ] }
where PV(TS) = present value of tax shield, n = number of months from the moment of tax-shield claim origination till the end of the tax period, i_m = monthly discount rate = i / 12, i = annual discount rate.
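The formula above can be implemented directly. The sketch below is our reading of the payment schedule (reconciliation at n+6 months, four deposit reductions at n+8 to n+17 months with a half-month adjustment, and two reconciliation increases at n+18 and n+30 months); it reproduces the structure of the model, though exact figures may differ from the table depending on day-count conventions:

```python
def pv_tax_shield(ts, n, im):
    """Present value of a tax shield with basic value ts, claimed n months
    before the tax period's end, at monthly discount rate im."""
    d = lambda months: 1.0 / (1.0 + im) ** months  # monthly discount factor
    reconciliation = d(n + 6)                      # payment B (30 June, year+1)
    deposits = (d(n + 8) + d(n + 11) + d(n + 14)
                + d(n + 17)) / (4.0 * (1.0 + im / 2.0))  # payments C-F
    clawback = 0.5 * (d(n + 18) + d(n + 30))       # payments G and H
    return ts * (reconciliation + deposits - clawback)

# Sanity checks: at a zero rate PV equals the basic value; at n = 0 the
# "paradox" discussed below appears, PV(TS) exceeding the basic value.
print(pv_tax_shield(310.0, 4, 0.0))           # 310.0
print(pv_tax_shield(310.0, 0, 0.01) > 310.0)  # True
```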
The issue of discount rate determination is not addressed in our article. In fact, a precise determination is not even possible as far as theoretical approaches are concerned, because the discount rate reflects the purpose of the financial model and varies with each individual investor, reflecting their expectations. One rule, however, shall be maintained: cash flow after tax shall be discounted by a tax-adjusted discount rate. Financial models reflecting tax shields do employ cash flow after tax.
Consequently, we shall turn our attention to the dependence of [PV(TS)] on the discount rate. Graph no. 1 depicts this dependence; the individual plotted curves represent the development of the present value of the tax shield for selected numbers of months from the moment of tax-shield claim origination till the end of the tax period, i.e. in our case till the end of the calendar year.
Graph no. 1: Dependence of present value of tax shield on discount rate
(Horizontal axis: annual discount rate, 0.00-0.28; vertical axis: tax shield present value, 250.00-330.00; curves labelled 0-12 by the number of months from tax-shield claim origination till the tax period's end.)
Curve [12] demonstrates the dependence of [PV(TS)] on the discount rate in the situation when the tax shield claim originates on 1st January of the respective year, i.e. 12 months prior to the tax period's end. Similarly, curve [0] represents the researched dependence in the case when the tax shield claim originates at the year's (thus tax period's) end. All plotted curves can be characterized by a quadratic equation in the common form

Cx^2 + 2Dy + 2Ex + F = 0.

Note that the coefficient [C] is the lower, the longer the time span between the moment of tax-shield claim origination and the tax period's end. Therefore, in case the tax shield claim originated at the beginning of the tax period, the curve depicting the dependence of [PV(TS)] on the discount rate can be approximated by a linear function; such a situation is demonstrated by curve [12].
In addition, a curve extreme (maximum) can be found for each curve representing a different time span remaining till the tax-period's end and characterized by a parabolic function, with the maximum located at the individual parabola's vertex. Such a maximum represents the largest [PV(TS)], and each of the discussed functions is maximized at a different level of discount rate. The closer the tax-shield claim origination is to the tax-period's end, the higher the parabola-maximizing discount rate is.
The dependence of [PV(TS)] on the time span between the moment of tax-shield claim origination and
tax-period’s end represents another researched relationship. This relationship is depicted in Graph no. 2 for
various discount rates.
Graph no. 2: Dependence of Tax Shield Present Value on the Moment of Tax Shield Claim Origination
[Chart: tax shield present value (250.00–330.00) plotted against the number of months (0–12) from the origination of the tax-shield claim till the tax-period's end, one line per annual discount rate (0.01–0.28).]
The investigated relationship is linear for any selected discount rate. The largest [PV(TS)] logically occurs with the minimal time span between the moment of tax-shield claim origination and the tax-period's end, i.e. when the claim is realized on the last day of the tax period, in our case on 31st December. At this point, the [PV(TS)] is paradoxically even larger than the basic value. This paradox is, however, merely ostensible: it is caused by the existence of income tax deposits and by the discounting of individual tax payments. With an increasing span, the [PV(TS)] declines, with its minimum at the point of 12 months prior to the tax-period's end, i.e. on 1st January in our case.
We consider the intersection of the lines representing [PV(TS)] for various discount rates to be a rather interesting graphical finding. In case the tax-shield claim origination moment falls on the last day of the tax period, the maximum [PV(TS)] is reached for a discount rate of 20 %. Conversely, should the tax-shield claim originate 12 months prior to the tax-period's end, the maximum [PV(TS)] is reached for the lowest discount rate (in our illustration, for a 1 % annual discount rate).
With respect to the above research, there is no doubt the present and basic values of the tax shield can differ. However, the question remains whether the difference is significant enough that financial models must be adjusted to employ the present rather than the basic value. For the purposes of further research we shall introduce a function [α(i;n)] producing, for a given discount rate [i] and a given time span [n] between the tax-shield claim origination and the tax-period's end, the absolute deviation between [TS] and [PV(TS)] as a percentage of the basic tax shield value [TS].
α(i; n) = |TS − PV(TS)| / TS × 100 ,
where α(i;n) = absolute value of the deviation between the present and basic value of the tax shield.
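The deviation function can be evaluated numerically. The sketch below again assumes plain monthly discounting for PV(TS) and therefore ignores the tax-deposit effects discussed in the text, so its values illustrate only the shape of α(i; n), not the exact figures of Table no. 2:

```python
# alpha(i; n): deviation between basic and present tax shield value, in per cent.
# Sketch assuming PV(TS) = TS / (1 + i/12)^n (tax-deposit effects ignored).

def alpha(i: float, n: int) -> float:
    pv_factor = 1.0 / (1.0 + i / 12.0) ** n   # PV(TS) for TS = 1
    return abs(1.0 - pv_factor) * 100.0        # deviation as % of the basic value TS

# Tabulate alpha for a few discount rates and month spans:
for n in (0, 6, 12):
    print(n, [round(alpha(r, n), 2) for r in (0.01, 0.04, 0.08, 0.12)])
```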
Graph no. 3 shows a map of the values of function [α(i;n)] for various discount rates [i] and various time spans (numbers of months) between the moment of tax-shield claim origination and the tax-period's end [n]. For the sake of clarity, we have used discount rates up to the 24 % level, even though such rates can be regarded as unrealistic. As follows from the graph and the consecutive Table no. 2, the deviation between the present and basic value of the tax shield in most cases does not exceed 2.5 % of the basic value.
Graph no. 3: Function α(i;n) Value Map
[Map of |α| over annual discount rates i = 0.01–0.24 (horizontal axis) and numbers of months prior to the tax-period's end n = 0–12 (vertical axis), with the value bands: |α| ≤ 2.5 %; 2.5 % < |α| ≤ 5.0 %; 5.0 % < |α| ≤ 7.5 %; 7.5 % < |α| ≤ 10.0 %; 10.0 % < |α| ≤ 12.5 %; 12.5 % < |α| ≤ 15.0 %; 15.0 % < |α| ≤ 17.5 %; 17.5 % < |α| ≤ 20.0 %.]
With the annual discount rate in the interval <0.01; 0.24>, almost 50 % (49.68 %) of all cases do not exceed the 2.5 % level; with rates in the interval <0.01; 0.12>, which best corresponds to the real world, it is as many as 71.15 % of all cases. At the same time, in the interval <0.01; 0.12> deviations exceeding 10 % do not occur at all, and in <0.01; 0.24> they occur only in 10.9 % of all cases. The largest deviations occur when tax-shield claims originate at the tax-period's beginning, whereas when the claims originate at the tax-period's end, the deviation values are rather negligible.
Table no. 2: Function value frequency [α(i;n)] (relative frequencies in %)

Discount rate        Frequency   |α|≤2.5   2.5<|α|≤5.0   5.0<|α|≤7.5   7.5<|α|≤10.0   10.0<|α|≤12.5   12.5<|α|≤15.0   15.0<|α|≤17.5   17.5<|α|≤20.0       Σ
up to 0.24           absolute       155         62            35             26              15               12               6               1         312
                     relative      49.7       19.9          11.2            8.3             4.8              3.9             1.9             0.3       100.0
up to 0.12           absolute       111         30            13              2               -                -               -               -         156
                     relative      71.2       19.2           8.3            1.3               -                -               -               -       100.0
When shall the deviations be considered significant? The answer is rather straightforward and intuitive: a deviation is significant only when substituting the basic tax shield value with its present value leads to an alteration of the investment choice, or to a modification of another financial or capital decision. Thus, in each case the deviation is more or less significant according to how much the results of the applied decision-making criteria differ across the investment alternatives. This, however, is not always possible to find out prior to the calculations, and very often it is impossible even after the calculations are done.
As a certain guideline on whether it is necessary to use the present tax shield value instead of the basic one, we might use the map of function values [α(i;n)] depicted in Graph no. 3. We recommend using the present value of the tax shield instead of the basic one in all cases where one can rationally expect that the value switch shall affect the relevant decision. For simplicity, we can use the coefficients of tax shield present value [kTS(i;n)]:
kTS(i; n) = PV(TS) / TS , or PV(TS) = TS × kTS(i; n) ,
where kTS(i;n) = coefficient of present value of tax shield.
CONCLUSION
Paper conclusions can be summarized in the following points:
1. The real moment of tax shield realization occurs at the moment when the tax shield is reflected in the taxpayer's cash flows in the form of a tax payment.
2. In relation to the time span between the tax-shield claim origination and the tax period's end, holding other variables equal, the maximal present value of the tax shield is reached when the claim originates at the end of the tax period. As the time span extends, the tax shield present value decreases.
3. In relation to the discount rate, ceteris paribus, the maximal present value of the tax shield is located at the parabola vertex, i.e. the tax shield present value grows until a certain level of discount rate is reached; beyond that level the present value decreases.
4. In most cases, the absolute value of the deviation between the present and basic value of the tax shield does not exceed 2.5 % of the basic value within economically realistic discount rates. The maximal deviation is reached with the longest possible time span between the tax-shield claim origination and the tax period's end, i.e. 12 months.
5. We recommend using the present value of the tax shield instead of the basic value in all cases where we can expect the switch of values to influence a financial decision. For simplicity, the coefficients of present value of tax shield [kTS(i;n)] can be used.
LITERATURE:
[1] BREALEY, R. A., MYERS, S. C.: Principles of Corporate Finance. New York: Irwin/McGraw-Hill, 2002.
[2] COPELAND, T. E., WESTON, J. F.: Financial Theory and Corporate Policy. Reading: Addison-Wesley, 1988.
[3] COPELAND, T. E., KOLLER, T., MURRIN, J.: Valuation: Measuring and Managing the Value of Companies. New York: John Wiley, 2000.
[4] DVOŘÁK, P., RADOVÁ, J.: Finanční matematika pro každého (Financial Mathematics for Everybody). Prague: Grada, 2003.
[5] HOLEČKOVÁ, J.: Analýza vlivu zdanění na finanční rozhodování na základě metody výpočtu efektivní daňové sazby - daňových klínů (Analysis of the Impact of Taxation on Financial Decisions Based on the Method of Calculation of Effective Tax Rates - Tax Wedges). Acta Oeconomica Pragensia, 2002, vol. 10, no. 1, p. 90-103.
[6] MAREK, P., RADOVÁ, J.: Skutečný okamžik realizace daňové úspory (Actual Moment of Tax Shield Realization). Acta Oeconomica Pragensia, 2002, vol. 10, no. 1, p. 104-116.
[7] MAŘÍK, M.: Určování hodnoty firem (Corporate Valuation). Prague: Ekopress, 1998.
[8] VALACH, J. et al.: Finanční řízení podniku (Corporate Financial Management). Prague: Ekopress, 1999.
MODELLING OF INFLATION FUZZY TIME SERIES USING
THE COMPETITIVE NEURAL NETWORK
Dušan Marček
University of Žilina, Faculty of Management Science and Informatics, e-mail: [email protected]
Abstract: Based on the work in [5], a dynamic process with linguistic values of inflation rates is defined and studied in this paper. A complete fuzzy modelling approach is proposed, which includes: fuzzifying the observed data, developing a fuzzy time series model, and computing the outputs. The fuzzy rules are derived by a competitive neural network.
Keywords: Fuzzy time series, Clustering technique, Neural network, Supervised Competitive Learning (SCL)
1. INTRODUCTION
In this paper we describe one application of fuzzy set theory to inflation time series. Fuzzy logic started over thirty years ago as a theory to deal with uncertainty and vague linguistic notions. The first applications of fuzzy sets were realised in industrial machinery. Lists are available with hundreds of applications in all areas of the electrical and electronic market [1], primarily for control and predictive diagnostics of complicated systems and processes.
Fuzzy time series are proposed in order to deal with situations in which the traditional time series model is no longer applicable, i.e., when the historical data are not real numbers but linguistic values. If the historical data are real numbers, they should be fuzzified first. In this paper, we assume there is a relationship between the observations at time t and those at previous times. Thus, the modelling process is to develop fuzzy relations among the observations at different times.
The concept of fuzzy time series is applied to represent a simple relationship between the observation at time t and those at the previous times, i.e., the observation at time t is the accumulated result of the observations at the previous times (see also [5]). In the next section a possible new method is proposed as a fuzzy system with heuristic rules describing the behaviour of inflation, with linear membership functions for the fuzzification of the inflation time series.
2. MODELLING INFLATION WITH FUZZY TIME SERIES
As we mentioned above, in practice it is difficult to forecast the expected inflation since economic and financial decisions are highly erratic. The objective value of the expected inflation is vague and uncertain. Therefore, we can apply fuzzy analysis to the inflation time series.
Since the first two methods are frequently used in technical or technological systems, we will apply the third method. As we mentioned earlier, its application presupposes that a database describing the previous input-output behaviour of a system and an adequate model of the observed process are available.
Neural networks are a powerful tool for generating fuzzy rules in a fuzzy system purely from a database. They can adaptively generate the fuzzy rules of a fuzzy system by a supervised product-space clustering technique [2]. Next, in a numerical example, we will illustrate how to obtain fuzzy rules using fuzzy set theory and neural networks.
Let us consider a simple example. The data set used in this example (514 monthly inflation observations from February 1956 to November 1998) was published at http://neatideas.com/data/inflatdata.htm. As with many economic time series, the original data exhibit considerable inequality of variance over time, and the log transformation stabilises this behaviour. Fig. 1 shows the time plot of this series, which exhibits no apparent trend or periodic structure. We would like to develop a time series model for this process so that a predictor of the process output can be developed. To build a forecast model, the sample period for analysis y1, ..., y344 was defined, i.e. the period over which the forecasting model can be developed and estimated, and the ex post forecast period (validation data set) y345, ..., y514 as the time period from the first observation after the end of the sample period to the most recent observation. By using the actual and forecast values within the ex post forecasting period only, the accuracy of the model can be calculated.
[Time plot of the log-transformed inflation series, values roughly −1 to 2.5, over the years 1956–1998.]
Fig. 1. Natural logarithm of monthly inflation from February 1956 to November 1998
After some experimentation, we formulated a model relating the value yt of the series at time t to its previous value yt−1 and a random disturbance, i.e.
yt = ξ + φ1 yt−1 + εt                  (1)
where the variable yt (in our case the first difference of the inflation rate) is explained only by its previous value, and εt is a white-noise disturbance term.
Next, analogously to the conventional time series model (1), it is assumed that the observation at time t accumulates the information of the observations at the previous times, i.e. there exists a fuzzy relation Rij(t, t−1):
Rij(t, t−1):  y^i_{t−1} → y^j_t              (2)
and equivalently
Rij(t, t−1):  Y_{t−1} → Y_t                  (3)
or
y^j_t = y^i_{t−1} ∘ Rij(t, t−1)              (4)
which can be interpreted as the fuzzy implication
Rij(t, t−1) ≡ if y^i_{t−1} then y^j_t        (5)
where y^j_t ∈ Y_t, y^i_{t−1} ∈ Y_{t−1}, i ∈ I, j ∈ J, I and J are index sets for Y_t and Y_{t−1} respectively, and "∘" is the max-min composition.
The Eq. (4) is called first-order model of the fuzzy time series with lag p = 1.
The basic structure of the fuzzy rule-based system for fuzzy time series modelling was described in [4]. At this stage, we will only give some outlines of modelling fuzzy time series in a fuzzy environment. The fuzzy time series modelling procedure consists of the implementation of several steps, usually as follows:
1. Define input-output variables and the universes of discourse.
2. Define (collect) linguistic values, i.e. fuzzy sets on the universes of discourse.
3. Define (find) fuzzy relations (fuzzy rules).
4. Apply the input to the model and calculate the output.
5. Defuzzify the output of the model.
The proposed fuzzy time series modelling procedure is thus divided into five steps. In Steps 1 and 2 the input data are fuzzified; in Step 3, analogously to the conventional model (1), the fuzzy time series model, i.e. the fuzzy relation, is created. Steps 4 and 5 are considered the application of the model (i.e., analysis of economic structures and forecasting). In the literature this modelling approach is known as a fuzzy rule-based system (see Fig. 2). Below, we discuss these steps and apply them to the inflation time series in more detail. In Fig. 2 the fuzzy rule-based system has three blocks: (a) a block for fuzzification of the input variables, (b) a knowledge-base block, and (c) a defuzzification block.
Fig. 2. Structure of the fuzzy system for forecast of the inflation
In the fuzzification block we first specified the input and output variables. The input variable, indicated as {xt−1}, is the lagged first difference of the inflation values {yt}, calculated as xt−1 = yt−1 − yt−2, t = 3, 4, ... . The output variable, indicated as {xt}, is the first difference of the inflation rates {yt}, calculated as xt = yt − yt−1, t = 2, 3, ... . The variable ranges are as follows:
−0.75 ≤ xt, xt−1 ≤ 0.75
These ranges define the universe of discourse within which the data of xt−1 and xt lie and on which the fuzzy sets have to be specified. The universes of discourse were partitioned into seven intervals.
Next we specified the fuzzy set values on the input and output universe. The fuzzy sets numerically
represented linguistic terms. Each fuzzy variable assumed seven fuzzy set values as follows: NL: Negative
Large, NM: Negative Medium, NS: Negative Small, Z: Zero, PS: Positive Small, PM: Positive Medium, PL:
Positive Large.
Fuzzy sets contain elements with degrees of membership. Fuzzy membership functions can have different shapes; triangular membership functions were chosen here. Fig. 3 shows the membership function graphs of the fuzzy subsets above.
Fig. 3. The membership functions for each linguistic fuzzy set value
The input and output spaces were partitioned into seven disjoint fuzzy sets. The membership
function graphs μ_{t−1}, μ_t in Fig. 3 show that the seven intervals [−0.75; −0.375], [−0.375; −0.225], [−0.225; −0.075], [−0.075; 0.075], [0.075; 0.225], [0.225; 0.375], [0.375; 0.75] correspond respectively to NL, NM, NS, Z, PS, PM, PL.
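The interval partition above is easy to express in code. A minimal sketch of assigning a linguistic value to a crisp first difference; boundary points are resolved in favour of the first matching interval, which is an assumption of this sketch:

```python
# Map a crisp first difference x in [-0.75, 0.75] to its linguistic value,
# using the seven intervals given in the text.

INTERVALS = [(-0.75, -0.375, "NL"), (-0.375, -0.225, "NM"),
             (-0.225, -0.075, "NS"), (-0.075, 0.075, "Z"),
             (0.075, 0.225, "PS"), (0.225, 0.375, "PM"),
             (0.375, 0.75, "PL")]

def linguistic_value(x: float) -> str:
    """Return the label of the first interval containing x."""
    for lo, hi, label in INTERVALS:
        if lo <= x <= hi:
            return label
    raise ValueError("x outside the universe of discourse")

print(linguistic_value(-0.5), linguistic_value(0.0), linguistic_value(0.3))
```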
Next we specified the fuzzy rule base, or bank of fuzzy relations. A competitive neural network was applied, which uses supervised competitive learning to derive fuzzy rules from the database. The bank contains the following 5 fuzzy rules:
IF x^i_{t−1} = NL THEN x^j_t = NS
IF x^i_{t−1} = PS THEN x^j_t = NM
IF x^i_{t−1} = PS THEN x^j_t = PS
IF x^i_{t−1} = PS THEN x^j_t = PL
IF x^i_{t−1} = PM THEN x^j_t = PS                  (6)
Finally, we determined the output action given the input conditions. We used Mamdani's implication [3]. Each fuzzy rule produces its output fuzzy set clipped at the degree of membership determined by the input condition and the fuzzy rule. When the input, say x^i_{t−1} = x_344, is applied to the model (4), the output x^j_t = x_345 can be calculated. The output fuzzy value x^j_t can be computed by the following simple procedure consisting of three steps:
· Compute the membership function values μNL(x_{t−1}), μNM(x_{t−1}), ..., μPL(x_{t−1}) for the input x_{t−1} using the membership functions pictured in Fig. 3.
· Substitute the computed membership function values into the fuzzy relations (5), (6).
· Apply the max-min composition to obtain the resulting value x^j_t of the fuzzy relations.
Following the above principles, we have obtained the predicted fuzzy value for the inflation x^j_t = x_345 = 0.74933.
The inflation values in the output x^j_t, t = 345, 346, ... are not very appropriate for decision support because they are fuzzy sets. To obtain a simple numerical value in the output universe of discourse, a conversion of the fuzzy output is needed. This step is called defuzzification. The simplest defuzzification scheme seeks the value x̂_t with middle membership among the maxima of the output fuzzy set; hence this defuzzification method is called middle of maxima, abbreviated MOM. Following this method, we have obtained the predicted value x̂_345 = −0.15. The remaining forecasts for the ex post forecast period t = 346, 347, ... may be generated similarly.
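The three-step inference and the MOM defuzzification can be sketched as follows. The rule bank matches (6), but the input membership degrees and the output-set peak positions are hypothetical stand-ins, not values read off Fig. 3:

```python
# Mamdani inference by max-min composition over the rule bank (6),
# followed by middle-of-maxima (MOM) defuzzification.
# Membership degrees and triangle peaks below are assumed for illustration.

RULES = [("NL", "NS"), ("PS", "NM"), ("PS", "PS"), ("PS", "PL"), ("PM", "PS")]
PEAKS = {"NL": -0.5, "NM": -0.3, "NS": -0.15, "Z": 0.0,
         "PS": 0.15, "PM": 0.3, "PL": 0.5}        # assumed output-set peaks

def infer(mu_in):
    """Clip each fired rule's output set (min step), combine sets by max."""
    out = {}
    for antecedent, consequent in RULES:
        degree = mu_in.get(antecedent, 0.0)
        out[consequent] = max(out.get(consequent, 0.0), degree)
    return out

def mom(out):
    """Middle of maxima: average peak position of the maximal output sets."""
    top = max(out.values())
    peaks = [PEAKS[label] for label, deg in out.items() if deg == top]
    return sum(peaks) / len(peaks)

mu_input = {"PS": 0.8, "PM": 0.2}     # hypothetical fuzzified x_{t-1}
fired = infer(mu_input)
print(fired, round(mom(fired), 3))
```

Three rules fire at the top degree 0.8 (consequents NM, PS, PL), so MOM averages their peaks.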
As a final point, let us examine what has been gained by the use of a fuzzy time series model over an ordinary AR(1) model for the output x_345. For this purpose we computed prediction limits on the one-step-ahead forecast from the AR(1) model and from the fuzzy time series model. The 95 percent interval around the actual inflation value based on statistical theory is
x̂_345 ∓ u_{1−α/2} σ̂_e (1 + φ1²)^(1/2) = 0.00312 ∓ 1.96 · 0.15476 · (1 + (−0.1248)²)^(1/2) = (−0.0442; 0.05043)
where x̂_345 represents the forecast for period t = 345 made at origin t = 344, u_{1−α/2} is the 100(1 − α/2) percentile of the standard normal distribution, and σ̂_e is an estimate of the standard deviation of the noise. An intuitive method for constructing confidence intervals for the fuzzy time series model is simply to use the defuzzification methods first of maxima and first of minima to obtain prediction limits on the one-step-ahead forecast. In our example, the "confidence" interval for the fuzzy time series value x̂_345 = 0.00312 is (−0.30256; 0.3088). The actual value did not fall within the AR(1) forecast interval, and moreover, its sign is opposite to the forecast value's sign.
3. CONCLUSION
In this paper we have discussed a fuzzy model for inflation forecasting. Using a technical analogy, we designed the fuzzy system structure and derived the short-term behaviour of the inflation movements from the data. Interval linear membership functions are employed in order to avoid non-linearity. We evaluated inflation forecasts by the fuzzy model. To determine the fuzzy relation of the first-order fuzzy time series model, a neural network with the SCL clustering technique was used to derive the rules directly from the database. The method may be of real usefulness in practical applications where the expert cannot explain linguistically what control actions the process takes, or where there is no knowledge of the process.
ACKNOWLEDGEMENT
This work was supported by the Slovak grant foundation under grant No. 1/9183/02.
REFERENCES
[1] HELLENDOORN, H.: Industrial Application of Fuzzy Control. Fuzzy Structures, Current Trends. Tatra Mountains Mathematical Publications, Volume 13, 1997, pp. 69-91
[2] KOSKO, B.: Neural Networks and Fuzzy Systems - A Dynamical Systems Approach to Machine Intelligence. Prentice-Hall International, Inc., 1992
[3] MAMDANI, E. H.: Application of fuzzy logic to approximate reasoning using linguistic synthesis. IEEE Trans. Comput. 26 (1977), pp. 1182-1191
[4] MARČEK, D.: Forecasting Stock Prices With Fuzzy Time Series. Studies of University in Zilina, Mathematical-Physical Series, Vol. 12, December 1999, ISSN 80-7100-690-4, pp. 31-35
[5] SONG, Q., CHISSOM, B. S.: Fuzzy time series and its models. Fuzzy Sets and Systems 54 (1993), pp. 269-277, North-Holland
USE OF NEW METHOD FOR LEARNING OF NEURAL NETWORKS
Jindřich Petrucha, Vladimír Mikula
European Polytechnical Institute, Kunovice, Osvobození 699, Czech Republic
Tel. 572 548 035, Fax: 572 549 018, E-mail: [email protected], [email protected]
Abstract: Neural networks are used for classification and for prediction of future data. For training neural networks we can use the backpropagation procedure, although there are well-documented limitations associated with this technique. Global search techniques such as genetic algorithms and tabu search have been used for training neural networks, too. These algorithms have focused more on accuracy than on training speed, because training speed is not important in cases where the training procedure can be done off-line. In this paper a training procedure is described that can achieve the needed accuracy in limited training time. We then use data from simulation of time series and compare the proposed method with the backpropagation method in our simulator.
Keywords: artificial neural networks, artificial neurons, scatter search, improvement method, synaptic weights, layers of neurons, hidden layers, input patterns, learning (training) of neural nets, error function, time series, prediction of future values, single step prediction.
1. INTRODUCTION
Training of neural networks is a complicated problem. There are situations where the training procedure spends hours of computer time while we need the results of the problem within minutes. The speed of training arises in the context of optimising a simulation. Sometimes the data are not stored in a large database but are generated on-line; this situation requires on-line training of the neural network.
Consider, e.g., a simulation model of a factory for simulating the flow of products through the factory over a period. The product managers make virtual changes to the production to predict the impacts without ever changing real equipment or manufacturing. Such models are applied in financial planning and in marketing strategy.
A limitation of simulation is the inability to evaluate more than a fraction of the large set of options. Practical problems require considering numbers of interconnected alternatives. In fact, these complexities and uncertainties are the primary reasons that simulation is chosen as a basic means for handling such problems.
The area that incorporates artificial intelligence and evolutionary processes has led to the creation of new approaches that successfully integrate simulation and optimization. Specifically, a neural network can be used to filter out the solutions that are likely to perform poorly when the simulation is executed.
2. NEURAL NETWORKS AND PROCESS OF THE OPTIMIZATION
We try to formalize the use of a neural network as a prediction model for the simulation, and we describe an on-line training procedure that produces fairly accurate predictions within a reasonable time. Let us introduce notation that helps us formalize the optimization and training processes:
x       – a solution of the problem
f(x)    – output of the simulation
p(x,w)  – the output as predicted by the neural network with weights w when the solution x is used as input
x*      – the best solution of the problem
l       – set of lower bounds for decision x
u       – set of upper bounds for decision x
During the optimization process we try to find:
Min f(x), subject to l ≤ x ≤ u.
· The training problem consists of finding the set of weights w that minimizes an aggregate error; we use the MSE (mean squared error).
· During the search for the optimal values of x, the procedure generates the set ALL of solutions x.
· Let TRAIN be a random sample of solutions in ALL, such that |TRAIN| < |ALL|.
We define the problem:
Min g(w) = (1 / |TRAIN|) Σ_{x ∈ TRAIN} ( f(x) − p(x, w) )²
where w is the set of optimization weights of the neural network.
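A sketch of the training objective g(w); the simulator f and the network p are toy stand-ins (a quadratic and a linear model) so the computation is self-contained:

```python
# Mean squared error g(w) of network predictions p(x, w) against
# simulator outputs f(x) over the TRAIN sample. f and p are stand-ins.

def f(x):                 # simulated output (toy stand-in)
    return x * x

def p(x, w):              # network prediction (toy linear stand-in)
    a, b = w
    return a * x + b

def g(w, train):
    """Aggregate error: MSE over the TRAIN set."""
    return sum((f(x) - p(x, w)) ** 2 for x in train) / len(train)

TRAIN = [0.0, 1.0, 2.0, 3.0]
print(g((3.0, -2.0), TRAIN))      # error of candidate weights w = (3, -2)
```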
2.1 Training algorithm
Scatter search is an evolutionary method founded on strategies that have only piecemeal come to be proposed as augmentations to genetic algorithms. The approach uses principles and strategies that are still not emulated by other evolutionary methods. Good results can be found for a variety of complex optimization problems, including neural network training.
Outline of the training procedure:
0. Normalise the input and output data.
1. Start with P = ∅. Use the diversification method to construct a solution w between wlow and whigh. If w ∉ P, add w to P, otherwise discard w. Repeat this step until |P| = PSize. Apply the improvement method to the best b/2 solutions in P to generate w(1), ..., w(b/2). Generate b/2 more solutions, where w(b/2+i) = w(i)·(1 + U[−0.3, 0.3]) for i = 1, ..., b/2. RefSet = {w(1), ..., w(b)}.
2. Order RefSet according to the objective function value, so that w(1) is the best and w(b) is the worst.
while ( NumEval < TotalEval )
    3. Generate NewPairs, which consists of all pairs of solutions in RefSet that include at least one new solution. Make NewSolutions = ∅.
    for ( all NewPairs ) do
        4. Select the next pair (w(i), w(j)) in NewPairs.
        5. Obtain new solutions w as linear combinations of (w(i), w(j)) and add them to NewSolutions.
    end for
    6. Apply the improvement method to NewSolutions.
    for ( each improved w ) do
        if ( w is not in RefSet and g(w) < g(w(b)) ) then make w(b) = w and reorder RefSet
    end for
    while ( IntCount < IntLimit )
        7. Make IntCount = IntCount + 1 and w = w(1).
        8. Make w(b/2+i) = w(i)·(1 + U[−0.05, 0.05]) and apply the improvement method.
        if ( g(w) < g(w(1)) ) then make w(1) = w and IntCount = 0.
    end while
    9. Apply the improvement method to w(i) for i = 1, ..., b/2 in RefSet.
    10. Make w(b/2+i) = w(i)·(1 + U[−0.01, 0.01]) for i = 1, ..., b/2 in RefSet.
end while
Let us describe each step of this outline:
· The main goal of the diversification is to generate solutions that are diverse with respect to those generated in the past.
· In step 2 we order RefSet according to quality; we use the MSE on the training data set as the error function. The best solution is the first one in the list.
· In step 3 we construct the NewPairs; they are selected one at a time in lexicographical order and later combined linearly.
· In step 6 the NewSolutions are subject to the improvement method. Each improved solution is tested for admission into RefSet. If a new solution improves upon the worst solution in RefSet, the new solution replaces the worst and RefSet is reordered.
· In step 7 the procedure intensifies the search around the best-known solution.
· In step 8 the solution is perturbed and the improvement method is applied.
· In steps 9 and 10 the procedure applies the improvement method to the best b/2 solutions in RefSet and replaces the worst b/2 solutions with perturbations of the best b/2.
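The core loop of the outline – linear combination of RefSet pairs, admission of improving solutions, and perturbation around the best – can be condensed into a one-dimensional sketch. The objective, RefSet size and iteration counts are arbitrary choices, and the improvement method is omitted:

```python
import random

# One-dimensional scatter-search-style sketch of steps 2-10: keep a small
# ordered RefSet, combine pairs linearly, admit improving solutions,
# perturb around the best. The improvement method is omitted.

def g(w):                                  # objective to minimise (stand-in)
    return (w - 1.7) ** 2

random.seed(1)
b = 4
refset = sorted((random.uniform(-5.0, 5.0) for _ in range(b)), key=g)

for _ in range(50):
    # steps 3-5: linear combinations (midpoints) of all pairs
    candidates = [(refset[i] + refset[j]) / 2.0
                  for i in range(b) for j in range(i + 1, b)]
    # step 6: admit a candidate if it beats the worst member
    for w in candidates:
        if w not in refset and g(w) < g(refset[-1]):
            refset[-1] = w
            refset.sort(key=g)
    # steps 7-8: multiplicative perturbation around the best solution
    w = refset[0] * (1.0 + random.uniform(-0.05, 0.05))
    if g(w) < g(refset[0]):
        refset[0] = w

print(round(refset[0], 3), round(g(refset[0]), 5))
```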
2.2 Architecture of Neural Network
Our aim was to develop a program simulator able to predict data for a future period on the basis of known past data. We used the well-known feed-forward multilayer perceptron (MLP) neural network architecture consisting of at least three layers. The input layer distributes the input pattern.
[Diagram: feed-forward MLP with an input (fan-out) layer x1, x2, ..., xn0, a first hidden layer of neurons, and an output layer producing the predicted value.]
Fig. 1. MLP architecture of neural network for data prediction
There are several ways to compute the forecast. Let us assume we have a simple time series of data values D1, …, D100, we want to forecast future values Di, i > 100, and we wish to use five lagged values as inputs.
Single step – open loop forecasting:
Train a network with target Di and inputs Di-1, Di-2, …, Di-5 for each of the N training patterns.
Then forecast
D101 = Net(D100, D99, D98, D97, D96)
D102 = Net(D101, D100, D99, D98, D97)
D103 = Net(D102, D101, D100, D99, D98)
Multi-step – closed loop forecasting:
Forecast D101 as P101 = Net(D100, D99, D98, D97, D96)
Forecast D102 as P102 = Net(P101, D100, D99, D98, D97)
Forecast D103 as P103 = Net(P102, P101, D100, D99, D98)
N-step ahead forecasting:
Assume N = 3.
Compute P101 = Net(D100, D99, D98, D97, D96)
Compute P102 = Net(P101, D100, D99, D98, D97)
Forecast D103 as P103 = Net(P102, P101, D100, D99, D98)
Forecast D104 as P104 = Net(P103, P102, D101, D100, D99)
Forecast D105 as P105 = Net(P104, P103, D102, D101, D100)
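The modes differ only in which values are fed back as inputs. A sketch with a stand-in `net` (any trained mapping from five lags to the next value would do):

```python
def single_step(net, series, start, steps):
    """Open loop: every forecast uses five *actual* lagged values,
    so actual data must exist over the whole forecast horizon."""
    return [net(series[t - 5:t]) for t in range(start, start + steps)]

def multi_step(net, series, start, steps):
    """Closed loop: forecasts are fed back as inputs for later steps."""
    buf = list(series[start - 5:start])
    preds = []
    for _ in range(steps):
        p = net(buf[-5:])
        preds.append(p)
        buf.append(p)       # the forecast becomes an input
    return preds
```

N-step ahead forecasting is the hybrid: it runs closed-loop for the first N steps and then slides along the actual data, exactly as in the listing above.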
Which method we choose for computing the forecast depends in part on the requirements of our application. For our simulator we chose the single step – open loop method, and we compare the experimental results with a simulator that uses the backpropagation method.
3. SIMULATOR – PROGRAM CHARACTERISTICS
Our aim is to make a program working with time series that is able to predict data for a future period. The program uses a three-layer neural network architecture. In our simulator we focus on the problem of learning the neural network from past data and on finding the weight parameters of the net that give the best prediction.
The program is designed on the principle of a three-layer architecture, with the possibility to choose the number of neurons in the input layer and in the hidden layer. The number of neurons in the output layer is one. This information is loaded from a text file that contains the architecture description and the time-series data.
The program is written in Pascal (Delphi), using graphics components for visualising the data. The following figures show in graphical form the results obtained by the proposed program.
Fig. 2. Fundamental screen of the program, with the graph of MSE during the learning and the graph of testing and prediction of experimental data
4. TESTING
During our tests we tried to find the best dimension of the input vector and the best number of neurons in the hidden layer of the neural net architecture. The numbers of neurons are written in the header of the text file, before the time-series data.
Table 1. Experimental results

Architecture of neural network    MSE
5 - 6 - 1                         0.1606
5 - 7 - 1                         0.21
5 - 10 - 1                        0.203
10 - 15 - 1                       0.17
The best results were reached using the architecture 5 – 6 – 1. The other parameters of the algorithm were the constants b = 10, PSize = 100, TotalEval = 1000 cycles and IntLimit = 20.
5. CONCLUSION
The neural network was used for the prediction of future data, and the obtained results were compared with the actual data. There are some differences between the predictions and the actual data. The scatter search algorithm has a good ability to find results in the first phase of searching, when the procedure intensifies around the best-known solution in step 7. The experiments show that the Scatter Search algorithm can be used in practice when we need reasonable prediction accuracy during on-line learning of the neural network.
REFERENCES:
[1] NOVÁK, M.: Umělé neuronové sítě - teorie a aplikace. Nakladatelství C. H. Beck, Praha, 1998
[2] KOSKO, B.: Neural Networks and Fuzzy Systems. Prentice-Hall, 1992
[3] KUTZA, K.: Neural Networks at your Fingertips. http://www.geocities.com/CapeCanaveral/1624/
[4] LAGUNA, M., MARTÍ, R.: Neural Network Prediction in a System for Optimizing Simulations. University of Colorado, University of Valencia, 2001
[5] GLOVER, F.: A Template for Scatter Search and Path Relinking. University of Colorado, 1998
NEURAL NETWORK AND JOINT TRANSFORM CORRELATOR
APPLIED FOR THERMAL IMAGERY-BASED RECOGNITION
Krzysztof Kościuszkiewicz1), Joanna Kobel2), Jacek Mazurkiewicz3)
1,3)
Institute of Engineering Cybernetics, 2) Institute of Physics
Wroclaw University of Technology
ul. Wybrzeże Wyspiańskiego 27, 50-370 Wrocław, POLAND
Phone: +48-71-3202681, +48-71-3202825 Fax: +48-71-3212677
E-mail: [email protected], [email protected], [email protected]
Abstract: This paper focuses on two different implementations of an infrared-based biometric system. An artificial neural network and a computer-simulated joint transform correlator perform the classification and identification of human face thermograms. The accuracy of both systems is evaluated using standard ratios for biometric systems: FRR, FAR and EER.
Keywords: Multilayer Perceptron, thermogram, identification
1. INTRODUCTION
Authentication methods based on recognition and comparison of infrared face images are considered to guarantee the third-best security level among biometric methods. The infrared imagery cross-over error rate places it just behind the two eye-recognition methods: retina and iris scanning.
Infrared face recognition is based on the fact that the thermal image of each human face is individual and unrepeatable. Even identical twins have different infrared images. The IR approach possesses some advantages over visible-wavelength images. First, a thermal image is not vulnerable to disguises and cannot be altered or camouflaged. Systems based on facial thermogram technology are independent of external lighting and are passive. This means that authentication can be done in low light or even in total darkness, without physical contact or human cooperation [6, 8].
This technology also has some drawbacks. The main disadvantage is the low stability of the thermal pattern. A thermogram can be disturbed by many external and internal factors, like environment temperature, illnesses, emotional state etc.
Fig. 1. Visible wavelength and thermal photographs of the same face
2. RESEARCH MATERIAL
The experiment was performed on 10 healthy adult volunteers. The group consisted of five males and five females, aged 23 to 40. The faces were recorded with an AGEMA 900 LW system, at an ambient temperature of 21.5 °C and 28% humidity. The spectral region from 8 to 13 µm was chosen as the sensing range. For every subject, 10 images were taken in different conditions. From the original photographs, recorded as 135x270 pixel bitmaps, a square sample of the central face part was cropped. This way a database of 100
images of size 85x85 pixels was created.
3. ARTIFICIAL NEURAL NETWORK BASED SYSTEM
The neural implementation is based on the classic model of the multilayer perceptron. This topic is well described in numerous publications, e.g. [2, 3, 5]. A good introduction to general neural processing can be found in [3].
The system itself consists of three main blocks (Fig. 2). The first stage performs data acquisition and preprocessing. The artificial neural network is the second block; the main processing is done in this stage. The third (output) block transforms the output vector of the NN into the response of the whole system.
Fig. 2. Neural processing system model: image preprocessing (block 1) turns the image into the input vector, the ANN (block 2), fed also by training vectors, maps it to the output vector, and the output vector interpretation (block 3) applies a threshold and produces the response
Input to the system is presented as an 85x85 pixel 8-bit grayscale bitmap. In the first stage the bitmap is rescaled to 25x25 pixels. As the value of each pixel falls between 0 and 255, another rescaling has to be done in order to effectively utilize the linear part of the activation function. The system response should also be invariant with respect to changes in environment temperature. Both problems can be solved by rescaling all pixel values in the input image to fall into the range (0, 1). This is achieved by applying

pi = (xi - min{x}) / (max{x} - min{x}),    (1)

where:
pi is the i-th element of the ANN input vector p,
xi is the i-th element of the input image vector x,

to each pixel.
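Equation (1) is ordinary min-max normalization; as a sketch:

```python
def rescale(pixels):
    """Equation (1): min-max normalization of pixel values into [0, 1]."""
    lo, hi = min(pixels), max(pixels)
    return [(x - lo) / (hi - lo) for x in pixels]
```

Note that the extreme pixels map exactly to 0 and 1; all interior values land strictly inside the interval, which is what the linear part of the sigmoid needs.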
As stated before, the second block is a classic implementation of an MLP. The ANN has 625 neurons in the input layer and one hidden layer consisting of 12 units (the number of neurons in the hidden layer was chosen experimentally). The response of the ANN is coded in the "1 of N" manner. The unipolar sigmoid function

f(ui) = 1 / (1 + exp(-βui)),    (2)

where:
ui is the weighted sum of the i-th perceptron's input values,
β is a parameter defining the "steepness" of the activation function f,

was chosen as the nonlinear activation element, so all neuron output values fall in the range (0, 1).
The training set consisted of 3 photos of each of the first five subjects. The net was trained using the backpropagation algorithm with momentum and a variable learning rate. The mean square error (MSE) measure

MSE(o, t) = (1/N) Σi=1..N (oi - ti)²,    (3)
where:
oi is the i-th element of the network output vector o,
ti is the i-th element of the target (desired) output vector t,
N is the number of output neurons,
was used as the basis of the network energy function.
The third block of the system takes the output vector of the ANN as its input. It performs an MSE calculation of the output vector o against the set of target output vectors T. A subject is considered authorized iff the MSE of at least one target output vector ti ∈ T with o is lower than a threshold δ. The index i of the neuron with the highest response, max{oi}, indicates the class to which the corresponding input vector (and the subject itself) belongs.
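The decision rule of the third block can be written compactly. A sketch (the "1 of N" target set and the threshold value are placeholders, not the paper's actual configuration):

```python
def mse(o, t):
    """Equation (3) applied to two vectors of equal length."""
    return sum((a - b) ** 2 for a, b in zip(o, t)) / len(o)

def decide(o, targets, delta):
    """Return (authorized, class_index): authorized iff some target
    vector is within MSE threshold delta; class = argmax of o."""
    authorized = any(mse(o, t) < delta for t in targets)
    cls = max(range(len(o)), key=lambda i: o[i])
    return authorized, cls
```

Rejection and classification are thus decoupled: the argmax always names a class, but the thresholded MSE decides whether that claim is accepted at all.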
4. JOINT TRANSFORM CORRELATOR APPROACH
The joint transform correlator (JTC) is one of the main optical architectures (besides the Van der Lugt correlator) used for pattern recognition. A comprehensive bibliography describes the applications of the JTC, e.g. [1, 4].
In this study we have used a computer simulation of the JTC. As the basis of the optical correlator, the fundamental relation for optical signal correlation was applied:

s * f = FT⁻¹(S · F*),    (4)

where:
s and f are the functions (images) being correlated,
S and F are the Fourier transforms of s and f, respectively,
FT⁻¹ is the inverse Fourier transform.
For evaluation of the correlation quality, two criteria were used: Discriminant Capability (DC) and Peak-to-Correlation Energy (PCE). DC characterizes the ability to recognize the target image against a non-target (the recognized thermogram) and is defined as

DC = CC / A,    (5)

where:
CC is the non-target correlation signal (cross-correlation),
A is the target correlation signal (autocorrelation).
Peak-to-Correlation Energy is defined as

PCE = |c(0,0)|² / Ec,    (6)

where:
c(0,0) is the highest value of the correlation peak,
Ec is the correlation plane energy:

Ec = ∫∫ |c(x, y)|² dx dy,    (7)

with both integrals taken over the whole plane. The PCE calculated for a high and sharp correlation peak has a larger value than in the case of a low and broad correlation peak [7].
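Equations (4)-(7) translate directly into discrete FFTs. A numpy sketch (discrete sums stand in for the integrals of (7); with circular correlation, the zero-shift peak lands at index [0, 0]):

```python
import numpy as np

def correlate(s, f):
    """Equation (4): s * f = FT^-1(S . F*), computed with discrete FFTs."""
    return np.fft.ifft2(np.fft.fft2(s) * np.conj(np.fft.fft2(f)))

def pce(c):
    """Equation (6): squared peak value over the correlation-plane energy (7)."""
    return np.abs(c[0, 0]) ** 2 / np.sum(np.abs(c) ** 2)

def dc(cross_peak, auto_peak):
    """Equation (5): cross-correlation peak over autocorrelation peak."""
    return cross_peak / auto_peak
```

For an autocorrelation the zero-lag value equals the image energy, which is also the largest value in the plane, so PCE is always in (0, 1].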
5. ACCURACY MEASURES FOR BIOMETRIC SYSTEMS
The performance of a biometric system is measured with two main parameters: the false acceptance rate (FAR) and the false rejection rate (FRR). FAR is the probability that a biometric system will incorrectly identify an individual or will fail to reject an impostor. FRR is the probability that a system will fail to identify a subject, or to verify the legitimate claimed identity of an individual.
Usually both ratios depend on one parameter. In the ANN-based system it is the threshold of the MSE (3) calculated for the output vector o against the desired output vector t. In the JTC case it is the threshold computed for the values of the criteria described above, which are correlation measures.
Obviously, a change in either of the two measures implies a change of the other in the opposite direction. When the threshold decreases, the system becomes more rigorous, which results in a decrease of FAR but an increase of FRR. The analogous situation occurs when the threshold increases.
The equal error rate (EER), sometimes called the cross-over error rate, is evaluated at the intersection point of FAR and FRR plotted against the threshold. The corresponding threshold value, called the equal error point (EEP), is often used in a final system implementation. Another way to evaluate the optimal threshold is to find the minimum of a weighted FAR+FRR function, which minimizes the total number of authorization errors (regardless of their type).
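Given sets of genuine and impostor match scores (here distances, lower meaning a better match), the FAR/FRR curves and an EER estimate follow directly. A sketch (the threshold grid is an arbitrary choice):

```python
def far_frr(genuine, impostor, threshold):
    """Distance scores: a comparison is accepted when score < threshold."""
    far = sum(s < threshold for s in impostor) / len(impostor)
    frr = sum(s >= threshold for s in genuine) / len(genuine)
    return far, frr

def eer(genuine, impostor, grid):
    """Return (EER estimate, equal error point): the grid threshold
    where |FAR - FRR| is smallest, EER being their mean there."""
    eep = min(grid, key=lambda t: abs(far_frr(genuine, impostor, t)[0]
                                      - far_frr(genuine, impostor, t)[1]))
    f, r = far_frr(genuine, impostor, eep)
    return (f + r) / 2, eep
```

On perfectly separable scores the EER is zero and the EEP lands anywhere between the two score populations; on overlapping scores it locates the cross-over point of the two curves.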
6. ACCURACY ANALYSIS OF ANN-BASED SYSTEM
Error ratios have been averaged over 20 training sessions. For each rate, sessions with different MSE training goals have been computed. To preserve clarity, FAR and FRR are presented on two different plots (Fig. 3, Fig. 4).
One can notice that the averaged FAR increases linearly with the threshold value. This fact can be simply explained: during the training sessions, images of subjects to be rejected were not presented to the network. That would not be advisable anyway, as when training the ANN in a real implementation one cannot present images of all unauthorized subjects to the network. It is obvious that the network response to images from classes that have not been used during training has a statistical character.
Fig. 3. Average FRR of neural system
Fig. 4. Average FAR of neural system
In order to maximize system performance, the ANN should be trained to the lowest possible MSE goal. Table 1 presents EER values for the same ANN configuration and different training goals.
Table 1. Average neural system EER

    Training MSE goal    Average EER
1   0.01                 0.1393
2   0.005                0.0858
3   0.001                0.0558
7. ACCURACY ANALYSIS OF JTC BASED SYSTEM
JTC-based thermogram recognition was used in identification mode, where the biometric system identifies a person within the entire enrolled population by searching a database for a match. For accuracy evaluation, the DC (5) and PCE (6) criteria were used. The results are presented in Fig. 5 and Fig. 6.
Fig. 5. JTC error rates based on DC criterion
Fig. 6. JTC error rates based on PCE criterion
In the next step, a mean image based on 5 different thermograms was created for each person and the accuracy evaluation was repeated (Fig. 7 and Fig. 8). In both cases, using the DC criterion for correlation accuracy evaluation yielded better results. Moreover, applying thermogram averaging improved the recognition results from EER = 11.4% to EER = 8.3% for DC and from EER = 30.4% to EER = 28.3% for PCE.
Fig. 7. JTC error rates based on DC criterion
for averaged thermograms
Fig. 8. JTC error rates based on PCE criterion
for averaged thermograms
8. CONCLUSIONS AND FURTHER RESEARCH
In this paper we have shown two complete thermogram-based biometric systems. Their accuracy has been measured using standard ratios for this kind of application.
Further research in this area will include testing the systems with larger databases of subjects and developing better preprocessing methods in order to improve accuracy. Visible-wavelength and infrared-based systems shall be compared. We will also attempt to implement both systems in hardware.
REFERENCES
[1] COLIN J., LANDRU N., LAUDE V. et al., 1999: High-speed photorefractive joint transform correlator using nonlinear filters, Applied Optics, 1, pp. 283-285
[2] HECHT-NIELSEN R., 1990: Neurocomputing, Addison-Wesley, USA
[3] HERTZ J., KROGH A., PALMER R., 1991: Introduction to the Theory of Neural Computation, Addison-Wesley, USA
[4] KODATE K., INABA R., WATANABE E., KAMIYA T., 2002: Facial recognition by compact parallel optical correlator, Measurement Science and Technology, vol. 13, pp. 1756-1766
[5] OSOWSKI S., 1996: Sieci neuronowe w ujęciu algorytmicznym, Wydawnictwa Naukowo-Techniczne, Warszawa
[6] SOCOLINSKY D. A., WOLFF L. B., NEUHEISEL J. D., EVELAND C. K., 2001: Illumination Invariant Face Recognition Using Thermal Infrared Imagery, IEEE Proceedings on Computer Vision and Pattern Recognition
[7] VIJAYA KUMAR B. V. K., HASSEBROOK L., 1990: Performance measures for correlation filters, Applied Optics, 29, pp. 2997-3006
[8] WILDER J., PHILLIPS P., JIANG C., WIENER S., 1996: Comparison of visible and infrared imagery for face recognition, Proceedings of the 2nd International Conference on Automatic Face & Gesture Recognition, Killington, pp. 182-187
NEURAL NETWORK IN ADAPTIVE CONTROL
Václav Veleba, Petr Pivoňka
Dept. of Control and Instrumentation, FEEC, BUT, Božetěchova 2, 612 66 Brno, Czech Republic
Tel.: +420 5 4114-1193, Fax: +420 5 4114-1123, E-mail: [email protected]
Abstract: This paper describes some approaches to using neural networks as adaptive controllers. We used three different heterogeneous structures with a neural network working as controller, operator and estimator:
· Direct neural controllers
· PSD controllers tuned by neural networks
· Adaptive controllers tuned by the Ziegler-Nichols method with process identification using neural networks
The controllers were implemented in MATLAB-Simulink and in a programmable logic controller. They were tested on different dynamic models and on a real process. Their principles, advantages and disadvantages were compared.
1. INTRODUCTION
Neural networks have several uses in adaptive control: they can be applied as the controller, for modelling the inverse dynamics, or for parameter identification of the controlled system. The tested algorithms were implemented in ANSI C, which is easily portable. The performance of the algorithms was validated by simulation in MATLAB, and then the algorithms were implemented in the programmable logic controller B&R 2005.
During implementation, the principle of three-step design of a control algorithm [4] was strictly followed. The first step is simple simulation: the algorithm is tested in MATLAB on a number of mathematical models. In the second step we keep the same simulation scheme, but the numerical model is replaced by an interface connecting the PLC with the real process. The heterogeneous algorithm is now tested on the real process while the comfort of the simulation environment is retained. Direct implementation in the PLC is the last step; the previous steps make it very simple, with minimum changes in the code.
2. NEURAL ADAPTIVE CONTROLLERS
2.1. Direct inverse controller
The controller shown in figure 1 is one of the simplest neural controllers.
A classical neural network is trained using known inputs and targets, but a network used as this controller is trained to minimize the control error e(k). The network is trained on-line in every control step. When the error is nearly zero, we can say that the neural network approximates the inverse dynamics of the controlled process.
Fig. 1: Control with direct inverse controller (the neural network, driven by the set point w(k) and error e(k), outputs u(k) to the process producing y(k))
Signals which carry information about the process dynamics are connected to the inputs of the neural network to improve the control properties. These inputs usually contain delayed signals of the set point w(k) and the output y(k) according to figure 2, but variants with simple state reconstruction exist [5].
During control of different processes, the following properties of the direct inverse controller were established:
· Simple and efficient control mechanism
· Low memory cost – past process data are not saved
· Zero control error despite a non-integral character of the process
· Relatively fast and damped oscillatory response of the process to a step change
· Slow adaptation
· Impossibility of batch learning and of selecting significant training instances
· Steady control error during linear changes of the set point
· Occasional oscillations due to overtraining
The control process of the direct inverse controller is shown in figure 3.
Fig. 2: Direct inverse controller (network inputs: w(k), w(k-1), w(k-2), y(k-1), y(k-2), y(k-3); the network output u(k) drives the process F(s))
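The adaptation loop of figures 1-2 can be illustrated with a deliberately tiny sketch (not the authors' implementation): a linear "network" with two inputs stands in for the MLP, and its weights are nudged each step to reduce e(k). A positive process gain is assumed; this sign heuristic replaces the full gradient through the plant.

```python
def direct_inverse_step(weights, inputs, y, w_sp, rate=0.05):
    """One control step: compute u(k), then adapt the weights on-line
    to reduce e(k) = w(k) - y(k); a positive plant gain is assumed,
    so pushing u upward when e > 0 is the right direction."""
    u = sum(wi * xi for wi, xi in zip(weights, inputs))
    e = w_sp - y
    for i in range(len(weights)):
        weights[i] += rate * e * inputs[i]   # gradient-sign heuristic
    return u, e

def run(steps=600, setpoint=1.0):
    """Close the loop around a toy first-order plant
    y(k+1) = 0.9*y(k) + 0.1*u(k) and return the final output."""
    weights = [0.0, 0.0]                     # inputs: [w(k), y(k-1)]
    y = 0.0
    for _ in range(steps):
        u, _ = direct_inverse_step(weights, [setpoint, y], y, setpoint)
        y = 0.9 * y + 0.1 * u
    return y
```

The weight update acts like integral action in parameter space, which is why the scheme achieves zero steady-state error even for non-integrating plants, matching the property list above.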
Fig. 3: Control process of direct inverse controller (top: set point and process output; bottom: control value). Transfer function of process is F2(s).
2.2. Adaptive PSD controller tuned by neural networks
The PID controller and its discrete counterpart, the PSD controller, are the most common controllers in the world. Robustness, simplicity and stability are the features for which they are valued in many industrial applications. The adaptive PSD controller tuned by neural networks is based on the idea of a neural network working as a human operator setting the parameters of the PSD.
Although it provides a less efficient transient response, it is simple to implement, and the standard PSD algorithm ensures stability between a change of the plant dynamics and the response of the adaptive mechanism. Direct access to the tuned controller parameters is a decided advantage.
The structure shown in figure 4 uses a multilayer neural network. The network inputs are connected to delayed control value and control error signals. The neural network is trained to minimize the control error, analogically to the direct inverse controller.
Features of the adaptive PSD controller tuned by neural networks:
· Easy to implement
· Fast computation of the control value
· Access to the tuned controller parameters
· Robustness due to the conventional PSD algorithm
· Passable noise immunity
· Damped oscillating step response
· Occasional steady control error for a zero-order signal
The control process of the adaptive PSD controller tuned by neural networks is shown in figure 5.
Fig. 4: Adaptive PSD controller tuned by neural networks (the network reads delayed samples u(k-1)…u(k-3), y(k-1)…y(k-3) and the error e(k), and outputs the PSD parameters K, TI and TD)
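For reference, the PSD law whose parameters K, TI and TD the network supplies is the discrete PID. In incremental (velocity) form with sampling period T it reads as follows (a standard textbook formulation, not the authors' exact code):

```python
def psd_increment(K, Ti, Td, T, e, e1, e2):
    """One step of the incremental (velocity-form) PSD algorithm:
    du(k) = K*[ e(k)-e(k-1) + (T/Ti)*e(k) + (Td/T)*(e(k)-2*e(k-1)+e(k-2)) ]
    with e, e1, e2 = e(k), e(k-1), e(k-2)."""
    return K * ((e - e1) + (T / Ti) * e + (Td / T) * (e - 2 * e1 + e2))
```

The controller then applies u(k) = u(k-1) + du(k); the neural network only has to supply K, Ti and Td, which is what gives this structure its robustness.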
2.3. Adaptive Ziegler-Nichols rule based PSD controllers with neural network identification
In this chapter we discuss a controller that uses an algorithm [2] for computing the ultimate gain KULT and the ultimate period TULT. We compute the ultimate gain and the ultimate period from a parametric model of the controlled process. This model is based on a neural network. We used the popular Ziegler-Nichols algorithm to set the parameters of the PSD controller. This sequence (figure 7) ensures an adaptive response to process dynamics changes. The neural network dynamics are represented by step delays as above.
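Once KULT and TULT have been estimated from the neural model, the classical Ziegler-Nichols closed-loop table gives the controller parameters. A sketch (these are the textbook coefficients; the Takahashi controller used by the authors is a discrete-time variant of this rule):

```python
def zn_pid(k_ult, t_ult):
    """Classical Ziegler-Nichols closed-loop tuning for a PID/PSD
    controller: gain K, integral time Ti, derivative time Td."""
    K = 0.6 * k_ult
    Ti = 0.5 * t_ult
    Td = 0.125 * t_ult
    return K, Ti, Td
```

Because the tuning is a pure function of (KULT, TULT), re-running it whenever the identified model changes is what makes the whole loop adaptive.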
Fig. 5: Control process of adaptive PSD controller tuned by neural networks (top: set point and process output; bottom: control value). Transfer function of process is F2(s).
The algorithm used for the minimization can be backpropagation (BP) or Levenberg-Marquardt (LM). The advantage of LM is foremost its speed of convergence, but a disadvantage is the complicated implementation due to complex matrix computations. LM is sensitive to disturbances and offset; from this point of view, LM is similar to the most frequent identification method, the least squares method. For this reason we prefer BP, which copes with very fast sampling even at high noise levels.
The nonlinear structure together with a powerful learning algorithm has the ability to overcome some nonlinearities in the control loop, such as the quantization effect.
We used two standard controllers in the heterogeneous control structure:
· Takahashi controller
· PSD controller with derivative part filtration using the Tustin approximation
The difference between the two controllers is evident from the step responses shown in figure 6. The start of the step was set after the end of adaptation (stability of the estimated parameters).
Fig. 6: Comparison between the neural-identification-based controllers: the Takahashi controller and the PSD controller with Tustin approximation controlling process F2(s). Adaptation was over after 32 s.
Fig. 7: The structure of the controller tuned by the Z-N method with neural network identification (the neural network estimates the model parameters a1, a2, b1, b2 from the delayed samples y(k-1), y(k-2), u(k-1), u(k-2); the critical parameters KULT and TULT are then found and the Z-N algorithm sets the controller acting on e(k) to produce u(k) for the process F(s))
The Takahashi controller provides a faster transient response than the PSD with Tustin approximation, but it has a notable overshoot. The PSD with Tustin approximation, by contrast, has a derivative character.
Both adaptive controllers have similar features:
· Simple implementation of the controller and the adaptation
· Fast transient response
· Stability even with noise at the sensor
· Control of systems with complex roots and of integrating systems
· High ability to adapt to rapid changes of process dynamics and process gain
· Damped control value
· Access to the controller parameters
· An unfiltered derivative part causes oscillation of the control value if noise exists in the control loop
· The convergence speed of adaptation decreases with increasing buffer length
Fig. 8: Control process of the adaptive PSD controller with derivative part filtration using the Tustin approximation, with neural network identification (top: set point and process output; bottom: control value). Transfer function of process is F2(s).
3. TRANSFER FUNCTIONS

F1(s) = 1 / ((5s + 1)(4s + 1))

F2(s) = 5 / ((3s + 1)(4s + 1))
4. CONCLUSION
The paper describes several adaptive structures based on neural networks. The standard adaptive control approach can fail because some aspects are omitted in theory: a too short sampling period, nonlinearities in the control loop, or highly decreased numerical precision due to the A/D and D/A converters used. In such cases, a neural network can be used as an effective substitute for the standard control element.
All algorithms described in this paper were implemented in ANSI C, validated by simulation in MATLAB and then implemented in the PLC B&R 2005. The results of the simulations are shown in figures 3, 5, 6 and 8. The output of the real process controlled by the algorithm implemented in the PLC is shown in figure 9.
Fig. 9: Programmable logic controller B&R 2005: control process of the adaptive PSD controller with derivative part filtration using the Tustin approximation, with neural network identification (top: set point and process output; bottom: control value). Transfer function of process is F1(s).
ACKNOWLEDGEMENTS
This research was partially supported by GACR 102/01/1485 – Environment for Developing, Modeling and Application of Heterogeneous Systems, and by MSMT CEZ MSM: 260000013 – Automation of Technological and Manufacturing Processes.
REFERENCES
[1] OMATU, S., KHALID, M. , YUSOF, R.: NEURO-Control and its Applications, Springer, Berlin, 1996.
[2] BOBÁL, V., BÖHM, J., PROKOP, R., FESSL, J.: Practical aspects of self tuning controllers: algorithms and implementation (in Czech). Vutium, Brno, 1999.
[3] PIVOŇKA, P.: Digital control (in Czech). Electronic textbook, ÚAMT FEKT VUT, Brno, 2000.
[4] VELEBA, V.: Adaptive neural controllers (in Czech). Diploma work, ÚAMT FEKT VUT 2003, Brno.
[5] KRUPANSKÝ, P., PIVOŇKA, P.: Adaptive neural controllers and their implementation. ÚAMT FEI VUT,
Brno, 2000.
FUZZY INFERENCE SYSTEM AND PREDICTION
Libor Žák
Department of Stochastic and Non-Standard Methods, Institute of Mathematics
Faculty of Mechanical Engineering, Technical University of Brno,
Technická 2, 616 69 Brno, Czech Republic,
Email: [email protected]
Tel: +420-541142550, Fax: +420-541142527
Abstract: This paper describes the application of fuzzy set theory and a Fuzzy Inference System (FIS) to the prediction of electric load. The proposed technique utilizes fuzzy rules to incorporate historical weather and load data. The use of fuzzy logic effectively handles load variations due to special events. The fuzzy logic has been extensively tested on actual data obtained from the Czech Electric Power Company (ČEZ) for 24-hour-ahead prediction. Test results indicate that the fuzzy rule base can produce results more accurate than the artificial neural network (ANN) method.
Keywords: fuzzy sets, fuzzy logic, fuzzy inference system, prediction.
1. INTRODUCTION
Short-term load forecasting (STLF) of power demand plays a very important role in the economic and secure operation of power systems [1]. Improvement in the accuracy of load forecasts results in substantial savings in operating cost and also increases the reliability of the power supply. In order to predict the future load accurately, numerous forecasting techniques have been used during the past 40 years.
A fuzzy logic model has been selected as an alternative method for the load forecasting problem in this paper. It is a suitable technique in the case when the historical data are not real numbers but linguistic values [4]. This paper presents the results of a preliminary investigation of the feasibility of using a fuzzy logic model for short-term load forecasting. In this research, historical load and weather data are converted into fuzzy sets to produce fuzzy forecasts, and defuzzification is performed to generate a point estimate for the system load.
2. FUZZY INFERENCE SYSTEM
A fuzzy set is one which has vague boundaries. Fuzzy sets can successfully represent a human's ambiguous estimations. They are represented by a membership function, which takes values between 0.0 and 1.0. Fuzzy reasoning uses this membership function to successfully describe the human reasoning process.
2.1 Description of Fuzzy Inference System
A fuzzy set A is defined as a pair (U, μA), where U is the relevant universal set and μA: U → [0, 1] is a membership function, which assigns each element of U to the fuzzy set A. The membership of an element x ∈ U in a fuzzy set A is denoted μA(x). We write F(U) for the set of all fuzzy sets on U. A "classical" set A is then a fuzzy set with μA: U → {0, 1}; thus x ∈ A ⇔ μA(x) = 1 and x ∉ A ⇔ μA(x) = 0. Let Ui, i = 1, 2, ..., n, be universal sets. Then a fuzzy relation R on U = U1 × U2 × ... × Un is a fuzzy set R on the universal set U.
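As an illustration (not taken from the paper), a triangular membership function, a common choice when fuzzifying quantities such as temperature or load:

```python
def triangular(a, b, c):
    """Return a membership function mu: U -> [0, 1] that peaks at b
    and is zero outside the interval (a, c)."""
    def mu(x):
        if x <= a or x >= c:
            return 0.0
        if x <= b:
            return (x - a) / (b - a)   # rising edge
        return (c - x) / (c - b)       # falling edge
    return mu
```

With a = 0, b = 5, c = 10 this encodes a linguistic value like "medium": full membership at 5, partial membership on either side, none beyond the interval.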
Fuzzy Inference System: One of the possible applications is a fuzzy inference system (FIS), i.e. a fuzzy regulator. There are a few types of regulators. For our purposes we use a regulator of type P: u = R(e), where the action value depends only on the regulation deviation:
Input variables: Ei = (Ei, T(Ei), Ei, G, M), i = 1, ..., n.
Output variables: U = (U, T(U), U, G, M).
INTERNATIONAL CONFERENCE ON SOFTCOMPUTING APPLIED IN COMPUTER AND ECONOMIC ENVIRONMENTS ICSC 2004
EPI Kunovice, Czech Republic, January 29-30, 2004
131
We consider the fuzzy regulator as a statement of the type ℜ = ℜ1 else ℜ2 else ... else ℜp,
where ℜk has the form
ℜk = if e1 is XE1,k and e2 is XE2,k and ... and en is XEn,k then u is YU,k,
where e1 ∈ E1, ..., en ∈ En, u ∈ U, XEi,k ∈ T(Ei), YU,k ∈ T(U) for all i = 1, ..., n and all k = 1, ..., p.
The meaning of the statement ℜ is denoted M(ℜ) = R. This fuzzy relation on
E1 × E2 × ... × En × U is defined by R = M(ℜ) = ∪k=1..p M(ℜk), where "else" is interpreted as union and M(ℜk) is
defined as M(ℜk) = AE1,k × AE2,k × ... × AEn,k × AU,k. This is a fuzzy relation on E1 × E2 × ... × En × U,
where AEi,k = M(XEi,k) is the fuzzy set on the universal set Ei for all i = 1, ..., n and AU,k = M(YU,k) is the fuzzy set on the
universal set U for all k = 1, ..., p.
Let aEi, i = 1, ..., n, be the regulation deviations, where each aEi is a fuzzy set on Ei.
The action value aU is then defined by aU = (aE1 × aE2 × ... × aEn) ∘ R. This is the
composition of the fuzzy relation (aE1 × aE2 × ... × aEn) on the universal set E1 × E2 × ... × En with the relation R defined on the
universal set E1 × ... × En × U. The result of this composition is a fuzzy set on the universal set U.
In many cases we do not want the output of the fuzzy regulator to be a fuzzy set; we require a concrete
value z0 ∈ Z, i.e. we want to perform defuzzification. The centroid method is the most frequently used method of
defuzzification. A FIS defined in this way is called a Mamdani system.
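As an illustration, the type-P Mamdani regulator above can be sketched numerically for a single input e and output u: the rule meanings M(ℜk) are built as min-products on discretized universes, combined by union (max), composed with the fuzzified input by max-min, and defuzzified by the centroid method. The universes, the two rules and all membership functions below are invented for illustration and are not taken from the paper.

```python
import numpy as np

# Numeric sketch of the type-P Mamdani regulator: rule meanings M(R_k) are
# min-products, combined by union (max), composed with the fuzzified input
# by max-min, and defuzzified by the centroid method.

E = np.linspace(-1.0, 1.0, 41)   # universe of the regulation deviation e
U = np.linspace(-1.0, 1.0, 41)   # universe of the action value u

def tri(x, a, b, c):
    """Triangular membership function evaluated on a grid."""
    return np.maximum(np.minimum((x - a) / (b - a), (c - x) / (c - b)), 0.0)

# Two illustrative rules: "if e is Negative then u is Positive" and vice versa.
rules = [(tri(E, -1.0, -0.5, 0.0), tri(U, 0.0, 0.5, 1.0)),
         (tri(E, 0.0, 0.5, 1.0), tri(U, -1.0, -0.5, 0.0))]

# R = union over k of the min-product relations A_k x B_k.
R = np.zeros((E.size, U.size))
for A, B in rules:
    R = np.maximum(R, np.minimum.outer(A, B))

def infer(e0):
    a_E = tri(E, e0 - 0.05, e0, e0 + 0.05)             # fuzzified crisp input
    a_U = np.max(np.minimum(a_E[:, None], R), axis=0)  # max-min composition
    return float(np.sum(U * a_U) / np.sum(a_U))        # centroid defuzzification

print(infer(-0.5))   # ~0.5: the centroid of the "Positive" consequent
```

A crisp deviation is first turned into a narrow fuzzy set, so the same composition handles crisp and fuzzy inputs alike.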
Fig. 1: Block diagram of a fuzzy inference system (nonfuzzy input -> fuzzification interface -> fuzzy inference machine with fuzzy rule base -> defuzzification interface -> nonfuzzy output).
2.2 Description of the FIS for prediction of load
The four principal components of a fuzzy system are shown in Figure 1. The fuzzification interface
performs a scale mapping that changes the range of values of the input variables into a corresponding universe of
discourse; it also performs fuzzification, converting nonfuzzy (crisp) input data into suitable linguistic values,
which may be viewed as labels of fuzzy sets. The fuzzy rule base consists of a set of linguistic control rules
written in the form "If a set of conditions is satisfied, Then a set of consequences is inferred". The fuzzy inference
machine is the decision-making logic that employs rules from the fuzzy rule base to infer fuzzy control
actions in response to fuzzified inputs. The defuzzification interface performs a scale mapping that converts the
range of values of the output variables into a corresponding universe of discourse; it also performs defuzzification,
yielding a nonfuzzy (crisp) control action from an inferred fuzzy control action [2]. A commonly used
defuzzification rule known as the centroid method is used here, according to which the defuzzification interface
produces a crisp output defined as the center of gravity of the distribution of possible actions. This centroid
approach produces a numerical forecast sensitive to all the rules.
Fuzzy logic is a tool for representing imprecise, ambiguous, and vague information. Fuzzy set
operations are grounded on a solid theoretical foundation; although they deal with imprecise quantities, they do
so in a precise and well-defined way. Fuzzy operations that act on the membership functions lead to
consistent and logical conclusions [3]. If we use appropriate membership function definitions and rules, we can
achieve useful results. One of the most useful properties of the fuzzy set approach is that contradictions in the
data need not cause problems. Fuzzy systems are stable, easily tuned, and can be conventionally validated.
Designing fuzzy sets is simple. Abstract reasoning and human-like responses in cases involving
uncertainty and contradictory data are the main properties of fuzzy systems [4].
Fig. 2: Triangular type membership functions for input load and input temperature.
Fig. 3: Gaussian curve membership functions for input load and input temperature.
Fig. 4: Trapezoidal membership functions for input load and input temperature.
2.3. Fuzzy rules for prediction of load
Heuristic and expert knowledge are often expressed linguistically in the form of If-Then rules. These
rules can be extracted from common sense, intuitive knowledge, general principles and laws, and other means
that reflect actual situations. "If the temperature is extremely cold, Then the load demand will be very
high" is an example of such a statement. The input temperature is sorted into eight categories labeled
Extremely Cold (ExC), Very Cold (VC), Cold (C), Normal (N), Warm (W), Hot (H), Very Hot (VH), and
Extremely Hot (ExH). The input and output load is sorted into seven categories labeled Extremely Low
(ExL), Very Low (VL), Low (L), Medium (M), High (H), Very High (VH), and Extremely High (ExH), as shown
in Figure 2.
A tentative list of input and output variables was compiled using statistical analysis and engineering
judgment. The input and output variables were normalized to the [0, 1] region. The next step was the
selection of the shape of the fuzzy membership function for each variable. This choice is essentially arbitrary; one usually starts
with a particular shape of membership function and changes it if the forecasting accuracy is not good. The
triangular mappings shown in Figure 2 are very common; however, alternative shapes, such as Gaussian
and trapezoidal curves with different amounts of overlap, were also used, as shown in Figures 3 and 4,
respectively.
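The three membership-function shapes compared here can be sketched directly; the breakpoints and widths below are illustrative stand-ins on a normalized [0, 1] universe, not the tuning used in the paper.

```python
import math

# The three membership-function shapes compared in the paper; all breakpoints
# and widths are invented for illustration.

def triangular(x, a, b, c):
    """Rises from a to peak b, falls to c."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def trapezoidal(x, a, b, c, d):
    """Rises a..b, flat b..c, falls c..d."""
    if x <= a or x >= d:
        return 0.0
    if b <= x <= c:
        return 1.0
    return (x - a) / (b - a) if x < b else (d - x) / (d - c)

def gaussian(x, mean, sigma):
    return math.exp(-((x - mean) ** 2) / (2.0 * sigma ** 2))

# Degree to which a normalized load of 0.55 is "Medium" under each shape:
print(triangular(0.55, 0.3, 0.5, 0.7))        # 0.75
print(trapezoidal(0.55, 0.2, 0.4, 0.6, 0.8))  # 1.0
print(gaussian(0.55, 0.5, 0.1))
```

Overlap between neighboring categories is controlled by how far the supports of adjacent functions extend into each other.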
Fig. 5: Triangular type membership functions for output load and the FIS surface for triangular inputs.
3. TEST RESULTS
Two years of historical load and temperature data were used: one year (1998) for the fuzzy rule base
design and the following year (1999) for testing the model performance. The inputs used to forecast the hourly load
for day (t) are: the hourly load of the previous day (t-1) and the minimum, maximum and average temperature for
day (t). If the accuracy is unsatisfactory, the number and/or shape of
the fuzzy membership functions can be changed and a new fuzzy rule base is obtained. The iterative process of
designing the rule base, choosing a defuzzification algorithm, and testing the system performance may be
repeated several times with a different number of fuzzy membership functions and/or different shapes of fuzzy
memberships. The fuzzy rule base that provides the minimum error measure for the test set is selected for real-time forecasting.
The quality of the load prediction has been evaluated by the mean absolute percentage error (MAPE).
Let (R1, R2, ..., Rk) be the real values of the time series and (P1, P2, ..., Pk) the predicted values,
where k is the number of predicted members. Then

MAPE = (1/k) Σi=1..k |Pi - Ri| / Ri ,

MAP = maxi=1..k |Pi - Ri| / Ri .
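The two error measures defined above can be implemented directly, assuming the real and predicted series are plain lists of positive load values; the sample series below is invented.

```python
# Direct implementation of MAPE and MAP as defined above; the example series
# are invented load values.

def mape(real, pred):
    """Mean absolute percentage error over k members."""
    k = len(real)
    return sum(abs(p - r) / r for r, p in zip(real, pred)) / k

def map_err(real, pred):
    """Maximum absolute percentage error over k members."""
    return max(abs(p - r) / r for r, p in zip(real, pred))

real = [100.0, 200.0, 400.0]
pred = [110.0, 190.0, 400.0]
print(mape(real, pred) * 100)     # ~5.0 (%)
print(map_err(real, pred) * 100)  # ~10.0 (%)
```

Multiplying by 100 converts the ratio to the percentage figures reported in Tables 1-3.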
Using different membership functions, the mean absolute percentage error (MAPE) and the maximum
absolute percentage error (MAP) for working days, the weekend and special days are computed as given
in Tables 1, 2 and 3, respectively. Similarly, the MAPE for each day of the week during the 3rd week of January 1999
using fuzzy logic and the back-propagation neural network is shown in Figure 6.
                       Gaussian curve        Triangular          Trapezoidal
Working days         MAPE (%)  MAP (%)    MAPE (%)  MAP (%)    MAPE (%)  MAP (%)
Monday                 2.8       6.1        2.3       8.8        2.2       4.8
Tuesday                1.3       4.7        1.1       3.9        1.4       5.4
Wednesday              0.99      2.7        1.4       4.9        1.2       3.5
Thursday               0.88      2.3        1.8       4.0        1.1       2.5
Friday                 0.89      2.4        1.6       4.3        1.2       3.2
Table 1: MAPE and MAP for working days in the 3rd week of January 1999 using various membership functions.
                       Gaussian curve        Triangular          Trapezoidal
Weekend days         MAPE (%)  MAP (%)    MAPE (%)  MAP (%)    MAPE (%)  MAP (%)
Saturday               3.6       8.5        5.4       7.8        6.0       8.6
Sunday                 3.1       7.6        4.9       8.6        6.2       9.8
Table 2: MAPE and MAP for the 3rd weekend in January 1999 using various membership functions.
                         Triangular          Gaussian curve       Trapezoidal
Special days           MAPE (%)  MAP (%)    MAPE (%)  MAP (%)    MAPE (%)  MAP (%)
New year day             2.6       8.0        3.4       7.6        3.8       7.4
Easter                   3.0       7.1        3.0       8.1        2.7       6.8
Labor day                2.6       7.8        4.0       9.2        3.3       9.2
Victory day              3.2       7.9        1.3       4.1        4.0       9.9
Yanhose day              2.0       4.1        4.8       4.4        2.2       5.6
Independence day         1.6       4.4        2.2       5.0        4.8       8.5
Students' day            2.6       5.1        1.8       6.8        2.7       5.4
Christmas eve            3.1       5.6        3.6       5.1        1.7       4.4
Christmas 1st day        3.4       8.7        4.6       8.9        4.9       7.3
Christmas 2nd day        3.2       9.8        4.2       7.9        5.1      10.1
Table 3: MAPE and MAP for special days/holidays during the year 1999 using various membership functions.
To compare the performance of the fuzzy logic model, a back-propagation network (BPN) was also
developed. A separate BPN with only one hidden layer and 24 hidden neurons was used for each day using the
same inputs and single output as the fuzzy logic model. The training and testing of BPN were performed with the
same training and testing data sets as mentioned earlier.
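A minimal sketch of that comparison network follows. The paper specifies only one hidden layer with 24 neurons and a single output; the input count (24 hourly loads of day t-1 plus min/max/avg temperature), the random weights, and the sigmoid hidden activation are illustrative assumptions.

```python
import numpy as np

# Forward pass of a one-hidden-layer network with 24 neurons and one output,
# matching the BPN architecture described above. Weights are random here; the
# paper trained one such net per day with back-propagation.

rng = np.random.default_rng(0)

n_inputs = 27                                # assumed: 24 hourly loads + 3 temperatures
W1 = rng.normal(0.0, 0.1, (24, n_inputs))    # input -> hidden weights
b1 = np.zeros(24)
W2 = rng.normal(0.0, 0.1, 24)                # hidden -> output weights
b2 = 0.0

def forward(x):
    h = 1.0 / (1.0 + np.exp(-(W1 @ x + b1)))  # sigmoid hidden layer
    return float(W2 @ h + b2)                 # single crisp load estimate

x = rng.uniform(0.0, 1.0, n_inputs)           # normalized inputs
print(forward(x))
```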
Fig. 6: MAPE for each day of the week during the 3rd week of January 1999 using fuzzy logic and the back-propagation neural network.
4. CONCLUSIONS
A forecasting technique based on a fuzzy logic approach has been presented in this paper. This approach
can be used to forecast loads with better accuracy than the ANN technique. The flexibility of the proposed approach,
which offers a logical set of rules readily adaptable and understandable by an operator, may be a very suitable
answer to the implementation and usage problems that have consistently limited the adoption of STLF models.
The technique is simple to implement on a personal computer and allows for operator intervention.
Various membership functions have been discussed, and their effects on model performance have been
demonstrated for the particular application data sets. The proposed model has been able to generate forecasts
with a MAPE frequently below 2.8 % for working days, 3.6 % for weekends and 3.4 % for special days. The
simulation results demonstrate the effectiveness of the fuzzy model for 24-hour-ahead prediction.
REFERENCES:
[1] YANG, H.-T., HUANG, C.-M.: "A new short-term load forecasting approach using self-organizing fuzzy ARMAX models", IEEE Transactions on Power Systems, Vol. 13, No. 1, 1998, pp. 217-225.
[2] HAYKIN, S.: Neural Networks: A Comprehensive Foundation, Macmillan College Publishing Company, New York, 1994.
[3] MASTERS, T.: Practical Neural Network Recipes in C++, Academic Press Inc., 1993.
[4] KHAN, M. R., ŽÁK, L., ONDRŮŠEK, Č.: "Forecasting weekly electric load using a hybrid fuzzy-neural network approach", submitted for publication in Engineering Mechanics - International Journal of Theoretical and Applied Mechanics, Technical University of Brno, Czech Republic, December 2000.
[5] KHAN, M. R., ŽÁK, L., ONDRŮŠEK, Č.: "Fuzzy logic based short-term electric load forecasting", ELEKTRO 2001, 4th International Scientific Conference, Žilina, 2001, pp. 19-25, ISBN 80-7100-836-2.
[6] ŽÁK, L.: "Fuzzy objects and fuzzy clustering", Proc. 8th Zittau Fuzzy Colloquium, Zittau, 2000, pp. 293-302, ISBN 3-00-006723-X.
ACKNOWLEDGEMENT
The paper was supported by the research plan CEZ: J22/98:261100009 "Non-traditional methods for
investigating complex and vague systems" and by the Grant Agency of the Czech Republic under grant No.
101/01/0345.
EVOLVING CONTROLLERS
Pavel Ošmera 1), Uday K. Chakraborthy2), Imrich Rukovanský 3)
1) Brno University of Technology, Technická 2, 616 69 Brno, Czech Republic
   E-mail: osmera@uai.fme.vutbr.cz
2) Dept. of Math & Computer Sc., Univ. of Missouri, St. Louis, MO 63121, USA
   E-mail: [email protected]
3) European Polytechnical Institute Kunovice, Osvobozeni 699, 686 04 Kunovice, Czech Republic
   Tel. 572 548 035, Fax: 572 549 018, E-mail: [email protected]
Abstract: The adaptive significance of genetic algorithms (GAs) with diploid chromosomes and an
artificial immune system has been studied. An artificial immune system was designed to support a
parallel evolutionary algorithm. We implemented hybrid and parallel genetic algorithms for design
of evolvable controllers. A flexible structure with PID controllers can be designed by parallel
evolutionary algorithms. The adaptive significance of parallel GAs and the comparison with
standard GAs are presented.
Keywords: artificial immune system (AIS), parallel genetic algorithms, neural networks, evolving
controllers, adaptive systems, immune network model
1 INTRODUCTION
Darwin suggested that slight variation among individuals significantly affects the gradual evolution of
the population. He called this differential reproductive process of varying individuals natural selection.
Gregor Mendel (1822-1884) accurately observed patterns of inheritance and proposed a mechanism to
account for some of the patterns. Genes determine individual traits. Various kinds of offspring appear in
proportions that can be predicted from Mendel’s laws. We often use the term Mendelian genetics (see memory I
in Figure 1) to refer to the most basic patterns of inheritance in sexually reproducing organisms with more than
one chromosome. Mendel presented his classic paper Experiments in Plant Hybrids to the Natural Science Society
in Brno in 1865. Mendel observed that the spherical seed trait was dominant, being expressed over the dented
seed trait, which he called recessive. In diploid organisms, chromosomes come in pairs (memory Ia and memory
Ib in Fig. 1).
Dawkins [3] described organisms as vehicles for the genes, built to carry them around and protect them.
Pictures, books, tools and buildings are meme vehicles (memory III). In our own bodies, thousands of genes
cooperate to produce organs, resulting in a machine that effectively carries all of its genes around inside it.
However, when we examine a biological cell or an organism, the situation is quite different: not only
are these systems open, but they exist only because they are open. A flux of matter and energy feeds them
from the outside world. The free energy E can create P different organisms (species), each
species with Ni copies created by DNA information Ii, where Qin_i is the metabolic heat released inside an
organism by its activities and Qout is the metabolic heat that is lost. Ti is the temperature, mi the mass, Vi
the volume, W the scrap (waste), and c1, c2 are constants [7, 8]:

E = Estructure + Eheat = Σi=1..P (c1 Ni Vi oi + Qin_i) + W + Qout,  where  Qin_i = c2 Ni mi Ti   (1)
Here oi is the measure of energetic order (defined by the energetic density oi = Estructure_i / Vi). On Earth there appeared
a complex biosphere with food chains that must satisfy (1). An increase of energy ΔE is covered by solar or
terrestrial activity. In a far-from-equilibrium condition, various types of self-organization processes may occur. The
goal of evolution (fitness Fe) is to maximize the accumulated energy in living systems that have a complex
evolution structure (see Fig.1):
Fe = max_through_evolution { c1 Σi=1..P Ni Vi oi }   (2)
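A toy numeric instance of the balance (1) and of the quantity maximized in (2) may make the bookkeeping concrete; the two-species "biosphere", its magnitudes and the constants below are entirely invented for illustration.

```python
# Toy numeric instance of the energy balance (1) and the structural energy
# maximized by the evolutionary fitness (2); all values are invented.

c1, c2 = 1.0, 0.01
species = [  # (N_i copies, m_i mass, V_i volume, T_i temperature, o_i energetic order)
    (1000, 0.5, 0.5, 310.0, 2.0),
    (10, 50.0, 45.0, 305.0, 1.5),
]
W, Q_out = 5.0, 20.0                                       # waste and lost metabolic heat

E_structure = sum(c1 * N * V * o for N, m, V, T, o in species)
Q_in = sum(c2 * N * m * T for N, m, V, T, o in species)    # metabolic heat terms of (1)
E = E_structure + Q_in + W + Q_out                         # total free energy, eq. (1)
F_e = c1 * sum(N * V * o for N, m, V, T, o in species)     # accumulated structural energy, eq. (2)
print(E_structure, Q_in, F_e)
```

The fitness (2) counts only the structural term of (1); heat, waste and losses do not contribute to it.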
A living system appears very complex from the thermodynamic point of view. Certain reactions are
close to equilibrium, while others are not. Open systems evolve to higher and higher forms of complexity.
Molecular biology shows that not everything in a cell is alive in the same way. Some processes reach
equilibrium and others are dominated by regulatory enzymes far from equilibrium [4].
2 SELF-ORGANIZATION AND ADAPTATION OF COMPLEX SYSTEMS
In dynamical systems, three phases can be found: order, complexity, and chaos; analogously, water can
exist in solid, transitional, and fluid phases [4, 8]. In nonlinear systems, chaos theory tells us that the slightest
uncertainty in our knowledge of the initial conditions will often grow inexorably, rendering our predictions nonsense.
Complex adaptive systems share certain crucial properties (non-linearity, a complex mixture of positive and
negative feedback, nonlinear dynamics, emergence, collective behavior, spontaneous organization, etc.). In the
natural world, such systems include brains, immune systems, ecology, cells, developing embryos, and ant
colonies. In the human world, they include cultural and social systems. Each of these systems is a network of a
number of “agents” acting in parallel. In a brain, the agents are nerve cells; in ecology, the agents are species; in
a cell, the agents are organelles such as the nucleus and the mitochondria; in an embryo, the agents are cells, and
so on. Each agent finds itself in the environment produced by its interactions with the other agents in the system.
It is constantly acting and reacting to what the other agents are doing. Emergent properties arise from the
interaction of many parts: the kinds of things that a group of agents can do collectively, which no
individual agent can do alone. There is no master agent, for example no master neuron in the brain. Complex adaptive
systems have a lot of levels of organization (hierarchical structures), with agents at a given level serving as
building blocks for agents at a higher level. The immune system is governed by local interaction between cells
and antibodies; there is no central controller, the control is distributed. Similar behavior can be found in the
development of the Internet.
We can use biological laws to describe the development of the Internet. It is no wonder that complex
adaptive systems (with multiple agents, building blocks, internal models, and perpetual novelty) are so hard to
analyze using standard mathematics. We need mathematics and computer simulation techniques that emphasize
internal models, emergence of new building blocks, and a rich web of interactions among multiple agents.
We now have a good understanding of chaos and fractals showing how simple systems with simple
parts can generate very complex behaviors. The edge of chaos is a special region unto itself, the place where you
can find systems with lifelike, complex behavior. Living systems are actually very close to this edge-of-chaos
phase transition, where things are much looser and more fluid. Natural selection is not an antagonist of self-organization;
it is a force that constantly pushes emergent, self-organizing systems towards the edge of chaos
from the chaos area. Mutation and crossover are the opposite forces, pushing the systems from the order area to the
chaos area. Evolution always seems to lead to the edge of chaos. The complex evolutionary structure described in [7,
8] can be transformed into the structure of computational intelligence (see Fig. 2).
A random genetic crossover or mutation may give a species the ability to run much faster than before.
The agent starts changing, then it induces changes in one of its neighbors, and finally you get an avalanche of
changes until everything stops changing again. Systems get to the edge of chaos through adaptation: each
individual (agent) tries to adapt to all the others. Co-evolution can also get them there; the whole system co-evolves
to the edge of chaos. In ecosystems or ecosystem models, three regimes can be found: the ordered regime,
the chaotic regime, and the edge of chaos, which acts like a phase transition. When the system is at the phase transition,
order and chaos are in balance. There is an evolutionary metadynamics, a process that tunes the
internal organization of each agent so that they all reside at the edge of chaos. The maximum fitness occurs right
at the phase transition.
3 AN ARTIFICIAL IMMUNE SYSTEM
The immune system is a complex, self-organizing and highly distributed system that has no centralized
control and that uses learning and memory when solving particular tasks. An artificial immune system (AIS)
fully exploits self-organizing properties of the vertebrate immune system. The biological immune system is an
efficient natural protection system whose primary function is to generate multiple antibodies from the antibody
gene libraries and to keep them alive even when an unknown foreign pathogen infects the body. The AIS has
several desired features for optimization purposes, such as robustness, flexibility, learning ability and memory.
The AIS is self-organizing, since it determines survival of newly created clones, and it determines its own size.
This is referred to as meta-dynamics of the system. These characteristics are often useful in scheduling problems.
Two distinct algorithms have emerged as successful implementations of artificial immune systems: the
immune network model described by Jerne and the negative selection algorithm developed by Forrest. The
immune system implements two types of selection [11]: negative selection and clonal selection. The two
processes are usually dealt with separately by researchers. Negative selection, which operates on lymphocytes
maturing in the thymus (T-cells), ensures that these lymphocytes do not respond to self-proteins. This is
achieved by killing any T-cell that binds to a self-protein while maturing. The second selection process, called
clonal selection, operates on lymphocytes that have matured in the bone marrow (called B-cells). Any B-cell that
binds to a pathogen is stimulated to copy itself. The copying process is subject to high probability of errors
(“hypermutation”). The combination of copying with mutation and selection amounts to an evolutionary
algorithm that gives rise to B-cells that are increasingly specific to the invading pathogen. If the same or similar
pathogens invade in the future the immune system will respond much more quickly because it maintains a
memory of successful responses from previous infections. We used a constraint-handling approach based on
emulation of an immune system (particularly, using the negative selection model) that was incorporated into a
parallel GA [11].
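The clonal-selection step described above (stimulated B-cells copied with hypermutation, better binders retained) can be sketched as follows; the bit-string antibodies, the bit-matching affinity measure and all parameters are invented for illustration.

```python
import random

# Toy clonal-selection step: bit-string "antibodies" that bind a pathogen are
# cloned with hypermutation, and better binders displace worse ones.

random.seed(1)
L = 16
pathogen = [random.randint(0, 1) for _ in range(L)]

def affinity(cell):
    """Invented binding measure: number of matching bits."""
    return sum(int(a == b) for a, b in zip(cell, pathogen))

def hypermutate(cell, rate=0.2):
    """Copying with a high probability of errors."""
    return [1 - bit if random.random() < rate else bit for bit in cell]

cells = [[random.randint(0, 1) for _ in range(L)] for _ in range(10)]
start_best = max(affinity(c) for c in cells)

for _ in range(30):                        # repeated exposure to the pathogen
    best = max(cells, key=affinity)        # stimulated B-cell
    clones = [hypermutate(best) for _ in range(5)]
    cells = sorted(cells + clones, key=affinity, reverse=True)[:10]

final_best = max(affinity(c) for c in cells)
print(start_best, final_best)              # best affinity never decreases
```

Because the truncation keeps the current best binder, the evolved population acts as the "memory" that makes a repeat infection easier to answer.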
Fig. 1 Complex evolutionary structure (block diagram relating memory Ia/Ib - diploid genes for the structure of the body, and of the brain and instincts; memory II - the brain as carrier of memes; memory III - the memes of mankind in culture, the social environment and the family; and memory IVa/IVb - the Darwinian selection process with parasites and the prey-predator interaction - through the individual as carrier of genes, the immune system, hormones, learning, instinctive and conscious behavior, the Baldwin effect and the Lamarckian imitation of memes).
Fig. 2 Soft Computing (the evolutionary structure of Fig. 1 mapped onto computational intelligence: memory III corresponds to game theory and collective-behavior intelligence - particle swarm optimization, multi-agent EAs, ant colony optimization, culture algorithms; the family level to evolutionary computation - evolution strategies, genetic programming, genetic and parallel genetic algorithms; the individual to hormone systems, clonal selection, artificial immune algorithms, fuzzy systems, neural networks, fuzzy-neural modeling and intelligent control; memories Ia/Ib to diploid GAs with sexual reproduction, DNA computing and messy GAs; memories IVa/IVb to cooperative co-evolutionary algorithms, parasitic optimization, parallel hierarchical EAs, evolutionary multi-objective optimization, evolvable hardware, data mining and scheduling).
4 DESIGN OF EVOLVABLE CONTROLLERS
In recent years, there has been growing interest in using intelligent approaches such as fuzzy, neural
network, evolutionary methods, and their combined technologies for the PID controller [13], [14]. The
Proportional-Integral-Derivative (PID) controller has been widely used owing to its simplicity and robustness in
chemical processes, power plants, and electrical systems. Its popularity is also due to easy implementation in
hardware and software. However, with only P,I,D parameters, it is very difficult to control a plant with complex
dynamics, such as large dead time, inverse response and highly nonlinear characteristics. That is, since the PID
controller is usually poorly tuned, a high degree of experience and technology is required for tuning it in the
actual plant. Also, a number of approaches have been proposed to implement mixed control structures that
combine a PID controller with fuzzy logic.
GAs can be used for the optimal design of a fuzzy controller. The GA chooses the number and
position of membership functions and modifies the rules. The fuzzy method does not require professional
expertise or mathematical analysis of the plant model. Unfortunately, fuzzy controllers are still difficult to
implement on commercial low-cost microcontrollers and may not offer significant improvements in terms of
robustness and performance. The use of classical PID controllers is preferred whenever non-linear techniques
are not strictly required. A GA is used to optimize both the structure and the associated parameters of controllers.
We used several versions of GAs:
Standard GA - one population with population size (number of individuals) of 40. Individuals are
sorted by fitness. The higher the fitness, the higher the probability that the individual will be selected as a
parent (a roulette-wheel mechanism). The second parent is selected in the same manner. After that, crossover,
mutation, and correction are applied. The reproduction process is repeated until the worse half of the population is
replaced.
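The roulette-wheel selection used above can be sketched with fitness-proportional sampling; the population and fitness values below are invented, and `random.choices` supplies exactly this weighting.

```python
import random

# Fitness-proportional ("roulette wheel") parent selection: an individual is
# drawn with probability proportional to its fitness. Population and fitness
# values are invented for illustration.

random.seed(0)
population = ["a", "b", "c", "d"]
fitness = [1.0, 1.0, 1.0, 7.0]    # "d" occupies 70% of the wheel

def roulette(pop, fit):
    return random.choices(pop, weights=fit, k=1)[0]

draws = [roulette(population, fitness) for _ in range(10000)]
share_d = draws.count("d") / len(draws)
print(share_d)                    # close to 0.7
```

Sampling both parents independently through the same wheel reproduces the scheme of the standard GA described above.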
GA with two sub-populations (the sexual approach, where male and female are distinguished) - each
sub-population has 20 individuals, sorted by fitness. The first parent is selected from the male sub-population
while the second parent is selected from the female sub-population. The selection
probability of the first parent is determined from a uniform distribution, while the selection probability of the
second parent is obtained using a modified roulette-wheel approach that prefers better individuals.
Crossover, mutation, and correction operators are then applied [12]. The reproduction is repeated until the worse
half of every sub-population is replaced.
The GA with two sub-populations is a special case of the parallel GA with two sub-populations (male
and female) and sexual selection. Sexual recombination generates a variety of genotypes that increase the
evolutionary potential of the population [2,7]. As it increases the variation among the offspring produced by an
individual, it improves the chance that some of them will be successful in the varying environment. Mutation,
genetic drift, migration, nonrandom mating, and natural selection cause changes in variation within the
population. The adaptation of GAs depends on the speed of landscape changes through time.
A parallel GA can have a two-level structure. The first level is created by several populations with
different GAs [7], [8]. The best or random individuals from the first level are sent to the second level. At this
level, the standard GA with elitism runs. This two-level structure allows us to find a better solution than that
found by GAs in the first level; the best solution from the first level can never be lost but only overtaken in the
second level.
We tested the following modifications of GAs:
Limited lifetime - an individual is not removed (replaced) until its minimum lifetime is reached [2].
The minimum lifetime is randomly generated when the individual is born. Individuals can survive for several
generations even if they are not good, so they have an opportunity to improve their fitness during their lifetime. This
approach can slow down the evolution process, but it also prevents the loss of potentially good individuals.
HC operator - a hill-climbing approach. In randomly selected individuals and in randomly
selected genes, several small modifications are carried out; the modification with the best fitness is retained
for further use.
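A sketch of the HC operator under stated assumptions: the fitness function, number of tries and step size below are invented stand-ins; the operator perturbs one randomly chosen gene several times and keeps the best modification.

```python
import random

# Sketch of the HC (hill-climbing) operator: one randomly chosen gene of an
# individual is perturbed several times; the best-scoring perturbation is kept.

random.seed(2)

def fitness(ind):
    """Invented stand-in fitness: optimum when all genes equal 0.5."""
    return -sum((g - 0.5) ** 2 for g in ind)

def hc_operator(ind, tries=5, step=0.1):
    i = random.randrange(len(ind))              # randomly selected gene
    best = list(ind)
    for _ in range(tries):
        cand = list(ind)
        cand[i] += random.uniform(-step, step)  # small modification
        if fitness(cand) > fitness(best):
            best = cand                         # retain the best modification
    return best

ind = [0.1, 0.9, 0.3, 0.7]
improved = hc_operator(ind)
print(fitness(ind), fitness(improved))          # fitness never decreases
```

Because the unmodified individual is kept as the fallback, the operator can only improve or preserve fitness, which explains the fast early convergence reported below.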
Adaptive version of the GA with an artificial immune system (as in [11]) - we used a constraint-handling
approach based on emulation of the immune system (particularly, using the negative selection model) that
was incorporated into a parallel GA.
Comments on Figures 3 and 4:
G_   standard genetic algorithm
S_   GA with diploid chromosomes (sexual reproduction, two sub-populations)
H_   hill-climbing approach applied
L    limited lifetime applied
A    adaptive version of the GA with an artificial immune system applied
P    parallel structure of the GA applied
INTERNATIONAL CONFERENCE ON SOFTCOMPUTING APPLIED IN COMPUTER AND ECONOMIC ENVIRONMENTS ICSC 2004
EPI Kunovice, Czech Republic, January 29-30, 2004
142
Fig. 3: The comparison of standard methods
Fig. 4: GAs with parallel architecture
5. DISCUSSION
The artificial immune system and sexual reproduction are typical of complicated creatures. Parallel GAs with sexual reproduction and an artificial immune system can increase the efficiency and robustness of systems, and thus they can better track optimal parameters in a changing environment. From the experimental session it can be concluded that modified standard GAs with two sub-populations can design controllers much better than classical versions of GAs do. The HC modification of the standard GA (see Fig. 3) can improve the speed of convergence, while the lifetime modification supports a higher variety of the population. In the beginning, the HC modifications give better results in most cases; the lifetime modifications converge more slowly, but in the end they can find a better solution than the HC modifications. It is not easy to say which individual modifications are the best (see Fig. 4). If we join them together in the parallel GA, then at the higher level it is not important which method contributes more to the final solution.
Diploid chromosomes can increase the efficiency and robustness of GAs [1, 7], and they can better
track optimal parameters in a changing environment. It can be demonstrated that standard GAs with haploid
chromosomes are unable in many cases to correctly locate optimal solutions for time-dependent objective
functions.
Acknowledgments: This work has been supported by MŠMT grant No: CZ J22/98 260000013
REFERENCES
[1] OŠMERA, P.: An Application of Genetic Algorithms with Diploid Chromosomes, Proceedings of
MENDEL’98, p. 86 – 89, Brno, Czech Republic, 1998.
[2] OŠMERA, P. - Roupec, J.: Limited Lifetime Genetic Algorithms in Comparison with Sexual Reproduction
Based GAs, Proceedings of MENDEL’2000, Brno, Czech Republic, 2000.
[3] DAWKINS, R.: The Selfish Gene, Oxford University Press, Oxford, 1976.
[4] KAUFFMAN, S.A.: Investigations, Oxford University Press, New York 2000
[5] PRIGOGINE, I., STENGERS, I.: Order out of Chaos, Flamingo 1985
[6] OŠMERA, P.: Complex Adaptive Systems, Proceedings of MENDEL’2001, Brno, Czech Republic (2001)
137 – 143
[7] OŠMERA, P.: Complex Evolutionary Structures, Proceedings of MENDEL’2002, Brno, Czech Republic
(2002) 109 – 116.
[8] OŠMERA, P.: Genetic Algorithms and their Applications, habilitation thesis (in Czech), 2002, pp. 3–114.
[9] WALDROP, M. M.: Complexity – The Emerging Science at the Edge of Order and Chaos, Viking, 1993.
[10] DAVIS, L.: Job Shop Scheduling with Genetic Algorithms, International Conference ICGA’85 (1985) 132–138.
[11] COELLO, C. A., CORTÉS, N. C.: A Parallel Implementation of an Artificial Immune System to Handle Constraints in Genetic Algorithms, WCCI 2002, 819–824, Hawaii.
[12] OŠMERA, P., MASOPUST, P.: Schedule Optimization Using Genetic Algorithms, Proceedings of
MENDEL’2002, Brno, Czech Republic (2002) 109 – 116.
[13] CUPERTINO, F., NASO, D., SALVATORE, L., TURCHIANO, B.: Design of Cascaded Controllers for
DC Drivers using Evolutionary Algorithms, Proceedings of CEC’2002, Honolulu, USA (2002) 309 – 316.
[14] KIM, H. D., HONG, P.W., PARK, J.I.: Auto-tuning of Reference Model Based PID Controller Using
Immune Algorithm, Proceedings of CEC’2002, Honolulu, USA (2002) 509 – 516.
AN IMPROVED ALGORITHM BASED ON STOCHASTIC HEURISTIC METHODS FOR
SOLVING STEINER TREE PROBLEM IN GRAPHS
Daniel Smutek
Brno University of Technology, Faculty of Mechanical Engineering, Institute of Automation and Computer Science, Technická 2, 616 69 Brno
e-mail: [email protected]
Abstract: The Steiner Problem in a Graph (SPG) is one of the classic problems of combinatorial
optimization. Given a graph and a designated subset of the vertices, the task is to find a minimum cost
subgraph spanning the designated vertices. In this paper we present an improved algorithm based
on stochastic heuristic methods (tabu search, simulated annealing and genetic algorithms) which
approximates SPG instances of up to 500 vertices in feasible time. Computational results are given
for randomly generated problems from the OR-Library having 500 vertices. The SPG arises in a large
variety of diverse optimization problems such as network design, multiprocessor scheduling and
integrated circuit design.
Keywords: heuristic, Steiner tree, minimum spanning tree, graph
1 INTRODUCTION
The Steiner Tree Problem in a Graph (SPG) is a variant of the minimal spanning tree problem. We are given a connected, undirected graph G = (V, E) with a vertex set V, an edge set E, a positive weight function defined on its edges, and a non-empty set C ⊆ V of terminals. The problem is to find a shortest tree interconnecting C. A tree satisfying these properties is a Steiner minimum tree (SMT) or Steiner tree (see Fig. 1). Vertices in C are called terminal vertices or customer vertices; all terminal vertices must be in the Steiner tree. The other vertices in V, which are not terminal vertices, are called Steiner points or optional vertices and form the set S. Steiner points can be in the Steiner tree, but they do not have to be.
The SPG has been shown to be NP–hard [11] for general graphs. As a result, the existing exact methods can only solve moderately sized instances, and heuristic approaches are required to tackle the larger instances likely to be encountered in real–life applications of the problem. Since the SPG can be cast as a combinatorial problem, we can use stochastic heuristic methods such as genetic algorithms (GA), simulated annealing (SA) or tabu search (TS).
The purpose of this paper is to present an improved algorithm for the SPG that consists of three steps. The first step is graph reduction, the second step applies a stochastic heuristic, and the final step is a simple deterministic heuristic. A computer program was created to test the efficiency of the designed algorithm. The graphs with 500 vertices presented in the OR-Library were then tested by the program. We also compare our algorithm with two well-known approximation algorithms: the shortest paths approximation (SPA) created by Takahashi and Matsuyama [20] and the distance network approximation (KMB) created by Kou, Markowsky, and Berman [13].
This work continues previous work which solved the SPG with the GA, SA [19] and TS [18] for graph instances with up to 100 vertices only.
Fig. 1: Steiner minimum tree (bold lines) in a graph (legend: terminal vertices and Steiner points)
2 GRAPH REDUCTIONS
Before the GA, SA or TS itself is executed, an attempt to reduce the size of the given problem is made using standard graph reduction techniques. Routine GraphReductions of Figure 2 performs three rather simple kinds of reductions, all of which are described in [21, 22]. More elaborate reductions, as well as proofs of the correctness of the reductions used here, can be found in [2]. Let evw denote the edge between vertices v and w, and let sp(v, w) ⊆ E denote the shortest path between vertices v and w. The three reductions used are as follows:
a) Any Steiner vertex of degree 1 can be removed along with the edges incident with it.
b) If v ∈ V – C, deg(v) = 2 and euv, evw ∈ E, then v, euv and evw can be deleted from G and replaced by a new edge between u and w of equivalent cost. More specifically, if euw ∉ E, then E = E ∪ {euw} and c(euw) = c(euv) + c(evw). If there is an edge from u to w already, i.e., euw ∈ E, then c(euw) = min {c(euw), c(euv) + c(evw)}.
c) If evw ∈ E and c(evw) > c(sp(v, w)), then no SMT can include evw, which can therefore be deleted.
To obtain the largest possible overall reduction of G, the above reductions are performed repeatedly [3] as described below. Knowledge of the cost of a shortest path is required whenever a reduction of type c is performed; we can find a shortest path with, e.g., Dijkstra’s algorithm [1]. The chosen scheme for performing reductions in the routine GraphReductions is shown in Figure 2. The routine terminates when no reduction of any type succeeds for a complete iteration, i.e., when no reduction can reduce G further.
repeat
repeat
Reduction(a);
until no improvement in one iteration;
Reduction(b);
Reduction(c);
until no improvement in one iteration;
Fig. 2: Outline of routine GraphReductions
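Reductions (a) and (b) can be sketched in Python on a graph stored as a nested adjacency dict. This is an illustrative sketch, not the authors' implementation; reduction (c) is omitted because it additionally needs shortest-path costs (e.g. from Dijkstra's algorithm):

```python
def reduce_graph(adj, terminals):
    """Repeatedly apply reductions (a) and (b) in place.
    adj: {u: {v: cost}} for an undirected graph; terminals are never removed."""
    changed = True
    while changed:
        changed = False
        for v in list(adj):
            if v in terminals or v not in adj:
                continue
            if len(adj[v]) == 1:                 # (a): degree-1 Steiner vertex
                (u,) = adj[v]
                del adj[u][v], adj[v]
                changed = True
            elif len(adj[v]) == 2:               # (b): degree-2 Steiner vertex
                u, w = adj[v]
                cost = adj[v][u] + adj[v][w]
                del adj[u][v], adj[w][v], adj[v]
                if w in adj[u]:                  # parallel edge: keep the cheaper one
                    cost = min(cost, adj[u][w])
                adj[u][w] = adj[w][u] = cost
                changed = True
    return adj
```

For example, a degree-1 Steiner vertex and a degree-2 Steiner vertex between two terminals both disappear, leaving a single terminal-to-terminal edge of combined cost.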
3 STOCHASTIC HEURISTIC METHODS
3.1 Tabu search
The general principles of tabu search algorithms were first described by Glover [5, 6, 7, 8] and,
independently, by Hansen and Jaumard [9]. Given a search or solution space defining possible solutions, the
basic idea of the approach is to explore this search space by moving at each step from the current solution to the
best one in its "neighborhood". A neighborhood is defined as a set of possible solutions that are found by
applying an appropriate transformation of the current solution. A key feature of tabu search is that it allows
moves resulting in a degradation of the objective function, thus avoiding the trap of local optimality. One of the
fundamental objectives of the method is to prevent the search from cycling. This is accomplished by forbidding
the choice of moves to recently encountered solutions or moves that "reverse" the effect of recent moves. Such
moves are said to be "tabu". The search can be intensified on some specific types of solutions or it can be
diversified to previously unexplored portions of the solution space.
3.2 Simulated Annealing
Simulated Annealing is an optimization method applicable to searching for the global minimum of a cost function. Annealing denotes a physical process in which a solid is first heated to a high temperature and then cooled slowly down to the original temperature. Annealing gives the system the opportunity to jump out of local minima with a reasonable probability while the temperature is still relatively high. The success of this method depends especially on the choice of the starting temperature and on the cooling rate. The method is described, e.g., in [17].
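The acceptance rule at the core of the method is the Metropolis criterion: a better solution is always accepted, a worse one with probability exp(–Δf/T). A generic sketch, not the authors' exact implementation:

```python
import math
import random

def accept(delta_f, temperature, rng=random.random):
    """Metropolis rule: always accept an improvement (delta_f < 0);
    accept a worsening move with probability exp(-delta_f / T)."""
    if delta_f < 0:
        return True
    return rng() < math.exp(-delta_f / temperature)
```

At high temperature almost every move is accepted; as T falls, worsening moves become increasingly unlikely, which is exactly the "jumping out of local minima" behaviour described above.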
3.3 Genetic algorithms
Genetic algorithms [17] are based on natural evolution. The algorithm maintains a population of
individuals, each of which corresponds to a specific solution to the optimization problem at hand. Starting with a
set of random individuals, a process of evolution is simulated. The main components of this process are
crossover, which mimics propagation, and mutation, which mimics the random changes occurring in nature.
After a number of generations, highly fit individuals will emerge corresponding to good solutions to the given
optimization problem. A phenotype is the physical appearance of an individual, while a genotype is the
corresponding representation or genetic encoding of the individual. Crossover and mutation are performed in
terms of genotypes, while fitness is defined in terms of phenotypes. For a given genotype, the corresponding
phenotype is computed by a decoder. One of the most important steps in designing a GA is to specify the representation of the Steiner tree by a chromosome (see Fig. 3).
4 IMPLEMENTATION OF HEURISTIC METHODS
4.1 The Solution Space
For all of the presented heuristic methods (GA, SA, TS), the solution is determined by a set of Steiner points S; it can therefore be represented by a bit–string in which a 1 indicates that the corresponding Steiner vertex is included in the solution tree. The length of the bit–string is equal to the number of Steiner points (see Fig. 3).
Fig. 3: Representation of the solution space by a bit–string (its length is the number of Steiner points)
4.2 The Bit–string decoder
If the representation of the solution space is made, we have to build a decoder which computes minimal
Steiner tree. For that purpose we use the procedure which decodes the bit–string, returns the Steiner tree and the
cost of this tree. The decoding procedure works as follows (see Fig. 4):
1) Assign Steiner points from an input graph G to corresponding bits in the bit–string. If the value of the
bit is 0, then set an attribute of the corresponding Steiner point to FALSE.
2) Remove all Steiner points with attribute FALSE and remove all edges that are incident with them too.
We obtain the subgraph F, which is made from terminal vertices and from the rest of Steiner points.
3) Search through the subgraph F with the breadth-first search algorithm [1] and determine if the
spanned tree is connected over all terminal vertices.
·
If the subgraph F is disconnected between terminal vertices (e.g. terminal vertex is standalone),
then value of the corresponded bit–string is evaluated with the penalty function. In other case
we can continue with the next step.
·
Make the minimal spanning tree K(F) from the graph F using minimal spanning tree algorithm.
In our case we used Kruskal’s algorithm (e.g. in [1]). The value of the corresponded bit–string
is set to the weigh of K(F). In this case K(F) is the Steiner tree. If K(F) is minimal from all
possible combinations of spanning trees, then K(F) is the minimal Steiner tree.
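The three decoding steps can be sketched in Python; the data layout and the PENALTY constant are our own illustrative choices, not the paper's implementation:

```python
from collections import deque

PENALTY = 10**9  # hypothetical large penalty for disconnected subgraphs

def decode(bits, steiner_points, terminals, edges):
    """Decode a bit-string into the cost of the spanning tree K(F).
    bits[i] selects steiner_points[i]; edges is a list of (cost, u, v)."""
    # step 1+2: keep terminals plus selected Steiner points; build subgraph F
    kept = set(terminals) | {p for p, b in zip(steiner_points, bits) if b}
    sub = [(c, u, v) for (c, u, v) in edges if u in kept and v in kept]
    adj = {v: [] for v in kept}
    for c, u, v in sub:
        adj[u].append(v)
        adj[v].append(u)
    # step 3: breadth-first search over F starting from one terminal
    start = next(iter(terminals))
    seen, queue = {start}, deque([start])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                queue.append(v)
    if not set(terminals) <= seen:
        return PENALTY  # F is disconnected between terminal vertices
    # Kruskal's algorithm on the component of F containing the terminals
    parent = {v: v for v in seen}
    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v
    cost = 0
    for c, u, v in sorted(e for e in sub if e[1] in seen):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            cost += c
    return cost
```

Selecting a Steiner point that shortcuts two terminals lowers the decoded cost; excluding every connecting vertex triggers the penalty.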
Fig. 4: Decoding procedure: (1) assign values from the bit–string to the Steiner points of the input graph G; (2) remove Steiner points with attribute FALSE and all incident edges, building the subgraph F; (3) the spanning tree K(F) (bold lines) is the minimal Steiner tree
However, this scheme does not ensure that every bit string corresponds to a valid solution, which is a disadvantage of the decoding procedure. Algorithms based on this procedure are in practice usable only on graphs with between 50 and 100 vertices [19]. In the next part of this paper we show how this disadvantage can be corrected.
4.3 Correction techniques
In our algorithm we used four correction techniques:
1) If a terminal vertex has degree 1 (i.e. the terminal vertex is incident with only one edge) and its neighbor is a Steiner vertex, then that Steiner vertex must be in the minimal Steiner tree, so we set the corresponding position of the bit–string to 1. This step is performed every time a new bit–string is created.
2) If the initial solution (i.e. bit–string) is penalized, then we find the Steiner vertex with the maximal number of incident edges, set the corresponding position of the bit–string to 1 and recalculate the value of the bit–string with the decoding procedure. This step is repeated until the initial solution (or initial chromosome) is valid, and it is performed only when generating the initial solution.
3) If the initial solution is penalized, then we find all Steiner vertices lying on the shortest path between randomly chosen terminal vertices, set all corresponding bits of the bit–string to 1 and recalculate the value of this bit–string. This step is repeated until the initial solution is valid, and it is performed only when generating the initial solution.
4) If the initial solution is penalized, then we select a low-cost edge and its incident Steiner points, set all corresponding bits of the bit–string to 1 and recalculate the value of this bit–string. This step is repeated until the initial solution is valid, and it is performed only when generating the initial solution.
The correction method for a concrete graph was chosen after testing the correction techniques on the graph instances. The results of the test are shown in Tab. 1.
Graph   GA               TS               SA
c01     Correction 4     Correction 4     Correction 4
c02     Correction 2     Correction 2     Correction 2
c03     Correction 2     Correction 2     Correction 2
c04     Correction 3     Correction 3     Correction 3
c05     Correction 2     Correction 2     Correction 2
c06     Correction 3     Correction 3     Correction 3
c07     Without method   Correction 3     Correction 3
c08     Correction 3     Correction 3     Correction 3
c09     Without method   Correction 3     Correction 3
c10     Without method   Correction 3     Correction 3
c11     Without method   Without method   Without method
c12     Without method   Without method   Without method
c13     Without method   Without method   Without method
c14     Without method   Without method   Without method
c15     Without method   Without method   Without method
Tab. 1: Correction techniques for graphs
4.4 The Initial Solution
The initial solution is a randomly generated bit–string. However, a randomly generated initial solution is not guaranteed to be valid. This disadvantage is corrected with the specially designed correction techniques (see chapter 4.3).
4.5 The Neighborhood Structure
Given a particular solution x ∈ X, the neighborhood of such a solution is denoted by N(x). N(x) consists of the set of minimum spanning trees that can be obtained by the removal or the addition of a single Steiner node j ∈ S from/to the current solution x. Thus, the neighborhood of the current solution in TS and SA is generated by inverting one position in the bit–string. The neighborhood structure in the GA is given by the crossover operator (e.g. uniform crossover).
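The single-bit-inversion neighborhood can be written compactly (an illustrative sketch over bit-strings stored as lists):

```python
def neighborhood(bits):
    """All bit-strings reachable by inverting exactly one position
    (the TS/SA neighborhood N(x) described above)."""
    return [bits[:i] + [1 - bits[i]] + bits[i + 1:] for i in range(len(bits))]
```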
4.6 Tabu Search
t:=1;
generate randomly initial solution Po;
correction of the Po;
Pmin := Po(t);
fmin := f(Po(t));
TL := (); {Tabu List}
while t <= max_iterations do
begin
Ploc := Po(t);
floc := f(Po(t));
change := false;
for i:=1 to count_of_Steiner_points do
begin
Q := Transform(Po(t),i); {inverts a position i in the bit–string}
if (f(Q) < floc) and ((not InTabuList(i)) or (f(Q) < fmin)) then
begin
Ploc := Q;
floc := f(Q);
Vloc := i; change := true;
end;{if}
end; {for}
if change = false then {if all transformations are unsuccessful, invert one bit randomly}
begin
i := (Random(count_of_Steiner_points))+1; {generate randomly a number in the interval [1,SP]}
Q := Transform(Po,i); {invert one position in the bit–string randomly}
Ploc := Q;
floc := f(Q);
end;
if floc < fmin then
begin
fmin := floc;
Pmin := Ploc;
end;
if change = true then
if |TL| < k then
TL := TL + Vloc else
TL := TL – V1 + Vloc;
t := t+1;
Po(t) := Ploc;
end; {while}
Pmin is the approximation of the optimal solution.
Fig. 5: The Tabu search algorithm
The initial solution Po is generated randomly and it is corrected with the correction technique. TL is a tabu list (FIFO) which contains the last progressive transformations; its length is set to 4. The number of iterations was chosen to be t = 200. The aspiration criterion is implemented in the command
if (f(Q) < floc) and ((not InTabuList(i)) or (f(Q) < fmin)),
where the aspiration criterion is represented by the expression (f(Q) < fmin).
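The FIFO tabu list of length 4 and the admissibility test with the aspiration criterion might be sketched as follows (names are our own, not the paper's):

```python
from collections import deque

def update_tabu(tabu, move, k=4):
    """FIFO tabu list of maximal length k: append the latest move, drop the oldest."""
    tabu.append(move)
    if len(tabu) > k:
        tabu.popleft()

def admissible(move, f_new, f_loc, f_min, tabu):
    """A move is admissible if it improves the local best and is either
    not tabu or satisfies the aspiration criterion f_new < f_min."""
    return f_new < f_loc and (move not in tabu or f_new < f_min)
```

The aspiration criterion lets a tabu move through whenever it would beat the best solution found so far.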
4.7 Simulated annealing
randomly generate starting bit string P0;
correction of the Po;
Pmin := P0;
set starting temperature T0 > 0;
set terminal temperature Tf;
T := T0;
nT := number of Steiner points;
repeat
for i := 1 to nT do
begin
randomly generate neighborhood state P from a set of neighborhoods N(P0);
Df := f(P) – f(P0);
if Df < 0 then
begin
P0 := P;
{ better solution is always accepted}
if f(P) < f(Pmin) then
Pmin := P;
{ updating of the best solution}
end
else
begin
randomly select r from the uniform distribution on (0,1);
if r < exp(–Df / T) then
P0 := P;
{ moving to the worse solution}
end;
end;
T := decr * T;
until T < Tf; { crystallization in the annealing }
{ Pmin is the approximation of the optimal solution }
Fig. 6: The Simulated annealing algorithm
The initial solution P0 is generated randomly and it is corrected with the correction technique. The starting temperature is chosen to be 10000 and the terminal temperature is chosen to be 1. The number of repetitions of the Metropolis algorithm is chosen to be 100. The cooling schedule is given by T := decr * T, where the decrement decr = 0.99.
4.8 Genetic algorithm
randomly generate starting population P(0) := {P1, ..., PN};
correction of the P(0);
compute fitness of all chromosomes: {f(P1),..., f(PN)};
seek for best chromosome Pmin from starting population: f(Pmin) ≤ f(P), ∀P ∈ P(0);
t := 1;
while t < iteration_count do
begin
repeat
select 2 parents R1, R2 from population P(t) ;
generate offspring D crossovering parents R1 and R2;
mutate offspring D;
until not (D in P(t));
compute fitness of offspring D;
seek for worst chromosome Pmax in population P(t): f(Pmax) ≥ f(P), ∀P ∈ P(t);
replace chromosome Pmax with offspring D;
if f(D) < f(Pmin) then
Pmin := D;
{update best solution}
t := t + 1;
end;
{Pmin is approximation of optimal solution}
Fig. 7: The Genetic algorithm
The starting population is randomly generated and contains 200 chromosomes. Parents of new offspring are selected with binary tournament selection. The mutation operator randomly inverts one bit in the offspring’s chromosome with probability 1/n, where n is the length of the chromosome. Least-fit member replacement is implemented in the program: the least-fit chromosome is replaced by the new offspring in each GA iteration. The algorithm is stopped after 50000 iterations.
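Binary tournament selection and the 1/n mutation can be sketched as below; we read the 1/n rate as an independent per-bit flip probability, which is one common interpretation:

```python
import random

def binary_tournament(population, fitness, rng=random):
    """Binary tournament: draw two individuals at random, keep the fitter
    (lower cost wins, since we minimize the tree weight)."""
    a, b = rng.sample(population, 2)
    return a if fitness(a) <= fitness(b) else b

def mutate(bits, rng=random):
    """Invert each bit independently with probability 1/n,
    n being the chromosome length."""
    n = len(bits)
    return [1 - b if rng.random() < 1.0 / n else b for b in bits]
```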
5. FINAL ALGORITHM
The final algorithm is designed as follows (see Fig. 8):
GraphReductions(InputGraph);
SteinerTree := Stochastic heuristic(TS or SA or GA) with Correction algorithm;
ApproximationOfSteinerMinimumTree := FinalOptimization(SteinerTree);
Fig. 8: The Final algorithm
Routine FinalOptimization performs simple local hill–climbing by executing a sequence of mutations on the bit–string, each of which improves the solution. An exhaustive strategy is used, so that when the routine has finished, no single mutation exists which can improve the solution further.
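Routine FinalOptimization amounts to exhaustive single-bit hill climbing; a sketch in Python, where cost stands in for the bit-string decoder:

```python
def final_optimization(bits, cost):
    """Repeatedly apply the best improving single-bit inversion until
    no inversion improves the solution any further."""
    best, best_cost = list(bits), cost(bits)
    improved = True
    while improved:
        improved = False
        for i in range(len(best)):
            cand = best[:i] + [1 - best[i]] + best[i + 1:]
            c = cost(cand)
            if c < best_cost:
                best, best_cost = cand, c
                improved = True
    return best, best_cost
```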
6. COMPUTATIONAL RESULTS
The FinalAlgorithm was tested on the SPG instances from the OR-Library [12] (http://elib.zib.de/steinlib/testset.php), denoted c01 – c15. The problem sizes after the initial graph reductions are listed in Tab. 2, and the computational results of the final algorithm are given in Tab. 3. Each graph (c01 – c15) was tested ten times to obtain the solution closest to the optimum. In Tab. 3 we present the minimum of the ten achieved solutions (column “Min”), the average cost over the ten runs (column “Ø”) and the computational time (column “time”; h = hour, m = minute, s = second, ms = millisecond). Empty cells in the result table indicate that the SPA algorithm did not find a solution within 5 hours. The testing program was built in Borland Delphi 6.0 and the algorithm was tested on a P4 Celeron computer, 2018 MHz, 512 MB RAM, running Windows XP Professional.
Graph   |V|   |E|    reduced |V|   reduced |E|   |C|   Optimum
c01     500   625    149           267           5     85
c02     500   625    133           242           10    144
c03     500   625    208           329           83    754
c04     500   625    250           374           125   1079
c05     500   625    336           457           250   1579
c06     500   1000   369           847           5     55
c07     500   1000   383           870           10    102
c08     500   1000   395           882           83    509
c09     500   1000   424           916           125   707
c10     500   1000   443           934           250   1093
c11     500   2500   499           2184          5     32
c12     500   2500   499           2239          10    46
c13     500   2500   498           2210          83    258
c14     500   2500   500           2195          125   323
c15     500   2500   500           2174          250   556
Tab. 2: Original and reduced graphs “C”
Graph  Optimum  GA (Min / Ø / time)    SA (Min / Ø / time)     TS (Min / Ø / time)     KMB (Min / time)  SPA (Min / time)
c01    85       85 / 85.8 / 4m         85 / 87.2 / 4m          85 / 92.2 / 2m          88 / 31ms         87 / 94ms
c02    144      148 / 153 / 3m         145 / 148.1 / 3m        148 / 156.3 / 1m        144 / 78ms        144 / 391ms
c03    754      754 / 757.3 / 7m       754 / 755.1 / 8m        758 / 762.6 / 2m        783 / 8s          772 / 6m
c04    1079     1079 / 1079 / 9m       1079 / 1080.4 / 11m     1079 / 1084.9 / 3m      1118 / 25s        1098 / 20m
c05    1579     1579 / 1579 / 12m      1579 / 1579.2 / 20m     1579 / 1579.1 / 4m      1602 / 3m         1595 / 3h 11m
c06    55       55 / 63 / 15m          55 / 61.9 / 23m         61 / 73.3 / 18m         60 / 187ms        55 / 594ms
c07    102      107 / 113 / 15m        107 / 114.7 / 25m       114 / 126.1 / 20m       114 / 827ms       102 / 4s
c08    509      516 / 520.8 / 16m      514 / 522 / 32m         524 / 532.6 / 19m       533 / 56s         523 / 33m
c09    707      713 / 720.9 / 19m      713 / 716.6 / 38m       720 / 734 / 21m         727 / 4m          722 / 1h 55m
c10    1093     1093 / 1095 / 26m      1093 / 1097.5 / 47m     1093 / 1095.2 / 17m     1122 / 18m        -
c11    32       36 / 42.8 / 46m        35 / 39.2 / 1h 20m      36 / 41.2 / 1h 24m      35 / 4s           34 / 7s
c12    46       53 / 60.5 / 51m        58 / 61.4 / 1h 24m      58 / 69.2 / 1h 30m      49 / 14s          48 / 1m
c13    258      264 / 268.4 / 53m      272 / 275.7 / 1h 27m    281 / 283.6 / 1h 17m    276 / 14m         -
c14    323      324 / 325.9 / 47m      335 / 339.6 / 1h 50m    342 / 343.6 / 1h 24m    342 / 37m         -
c15    556      556 / 557 / 56m        564 / 565.9 / 1h 45m    560 / 561.8 / 50m       574 / 2h 9m       -
Tab. 3: Computational results
In the result table Tab. 3 we can see that the GA found the global optimum 7 times, SA found it 6 times and TS found it 4 times. If the set of terminal vertices is small (5 or 10), the best results are obtained using the KMB or SPA heuristic. For much larger sets of terminal vertices (83 – 250), the approximation algorithms KMB and SPA were practically unusable due to their inefficiency and runtime consumption.
7. CONCLUSION
In this paper an improved algorithm based on stochastic heuristic methods for the Steiner Problem in a Graph was presented. The main idea behind the algorithm is the application of a spanning tree algorithm to a subgraph created by removing some Steiner points from the input graph. The state space is represented by a bit string. This scheme does not ensure that every bit string corresponds to a valid solution; we corrected this disadvantage with special correction techniques. The input graph is reduced before the heuristic algorithm is started, so the heuristics are applied to smaller graph instances. The algorithms presented in this paper are in practice usable on graphs with up to 500 vertices and 2500 edges. The algorithms were verified on graphs from the OR-Library using a program. In future work we may improve the time complexity of the base algorithms (Kruskal’s and Dijkstra’s) with heap data structures to obtain results in a shorter time, and we may also combine the stochastic heuristic methods with the approximation algorithms to obtain better solutions.
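The heap-based improvement mentioned above can be illustrated with Dijkstra's algorithm using a binary heap (Python's heapq), which brings the running time to O((|V| + |E|) log |V|); a generic sketch:

```python
import heapq

def dijkstra(adj, source):
    """Shortest-path costs from source; adj: {u: [(v, cost), ...]}."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry, vertex already settled with a shorter path
        for v, w in adj.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist
```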
REFERENCES
[1] CHARTRAND, G. and OELLERMANN, O. R.: Applied Algorithmic Graph Theory. McGraw-Hill, New York, 1993.
[2] DUIN, C. W. and VOLGENANT, A.: Reduction tests for the Steiner problem in graphs. Networks 19, 1989, pp. 549–567.
[3] ESBENSEN, H.: Computing Near-Optimal Solutions to the Steiner Problem in a Graph Using a Genetic Algorithm. Networks, 1995, Vol. 26, pp. 173–185.
[4] GENDREAU, M., LAROCHELLE, J–F. and SANSO, B.: A Tabu Search Heuristic for the Steiner Tree
Problem, Networks, 1999, Vol. 34, pp. 162–172.
[5] GLOVER, F., TAILLARD, E. and de WERRA, D.: A user's guide to tabu search, Ann Oper Res 41 (1993),
pp. 3–28.
[6] GLOVER, F.: Future paths for integer programming and links to artificial intelligence. Comput Oper Res
13 (1986), pp. 533 – 549.
[7] GLOVER, F.: Tabu search–Part I, ORSA J Comput 1 (1989), pp. 190–206.
[8] GLOVER, F.: Tabu search–Part II, ORSA J Comput 2 (1990), pp. 4–32.
[9] HANSEN, P. and JAUMARD, B.: Algorithms for the maximum satisfiability problem, RUTCOR Research
Report, Rutgers, New Brunswick, NJ, 1986, pp. 43–87.
[10] HERTZ, A. and de WERRA, D.: Tabu search techniques – A tutorial and application to neural networks.
Oper Res Spekt 11 (1989), pp. 131–141.
[11] KARP, R. M.: Reducibility among combinatorial problems. In R.E. Miller and J.W. Thatcher, editors,
Complexity of Computer Computations, pages 85-103. Plenum Press, 1972.
[12] KOCH, T., MARTIN, A. and VOß S., SteinLib Testdata Library, http://elib.zib.de/steinlib/steinlib.php
[13] KOU, L., MARKOWSKY, G. and BERMAN, L.: A fast algorithm for Steiner trees. Acta Info. 15, 1981,
pp. 141–145.
[14] PLESNÍK, J.: Grafové algoritmy. ALFA, Bratislava, 1983.
[15] ŠEDA, M.: Aplikace Steinerových stromů v síťové optimalizaci. In Proceedings of the 4th Scientific –
Technical Conference PROCESS CONTROL 2000 (ŘÍP 2000), Kouty nad Desnou, 2000, 10 str.
[16] ŠEDA, M.: Solving the Steiner Tree Problem Using Local Search Methods. In Proceedings of the 22nd International Conference Telecommunications and Signal Processing – TSP’99, Brno, 1999, pp. 102–105.
[17] ŠEDA, M.: Využití moderních heuristických metod v rozvrhování. [Dissertation thesis], Brno 1998. – VUT
Brno. Faculty of Mechanical Engineering. Department of Automation and Computer Science.
[18] SMUTEK, D.: A Tabu Search Heuristic for the Steiner Tree Problem in Graphs. In Proceedings of the First
International Conference on Soft Computing Applied in Computer and Economic Environments ICSC
2003. Evropský polytechnický institut, Kunovice, 2003, pp. 124–128.
[19] SMUTEK, D.: Steinerovy stromy v grafech. [Diploma thesis] Brno 2000. - VUT FSI Brno. Faculty of
Mechanical Engineering. Department of Automation and Computer Science.
[20] TAKAHASHI, H. and MATSUYAMA, A.: An approximate solution for the Steiner problem in graphs,
Math. Jap. 24, 1980, pp. 573–577.
[21] WINTER, P. and MACGREGOR SMITH, J.: Path–distance heuristics for the Steiner problem in
undirected networks. Algorithmica 7, 1992, pp. 309 – 327.
[22] WINTER, P.: Steiner problem in networks: A survey, Networks 17, 1987, pp. 129–167.
DETECTING PARETO OPTIMAL SOLUTIONS WITH PSO
Ulrike Baumgartner, Ch. Magele, W. Renhart
Kopernikusgasse 24/3, A-8010 Graz, Austria,
email: [email protected],
phone: +43 316 873 7257
Abstract - Real-world optimization problems often require minimizing or maximizing more than one objective, and these objectives are, in general, in conflict with each other. Such problems (multiobjective optimization problems, vector optimization problems) are usually treated by using weighted sums or other decision-making schemes. An alternative way is to look for the Pareto optimal front. In this paper the Particle Swarm algorithm is modified to detect the Pareto optimal front.
Keywords - Stochastic Optimization, Particle Swarm Optimization, Pareto Optimality, Multimodal
Problems
I. INTRODUCTION
Stochastic optimization methods have successfully been applied to scalar and vector optimization
problems. In the latter case, the different objectives, which in general conflict with each other, have to be
transformed into a single scalar objective function. This is usually done by normalizing and processing the
contributions to the objective function (weighted sums, fuzzy membership functions, ...) [1]. Another approach is
to find the Pareto optimal front, which summarizes all Pareto optimal solutions [2]. A Pareto optimal solution is,
by definition, one in which no objective can be improved without disadvantaging at least one other
objective.
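This definition can be sketched as a simple dominance test (a minimal illustration assuming maximization of all objectives; the function names are ours, not the authors'):

```python
def dominates(a, b):
    """True if objective vector a Pareto-dominates b (maximization):
    a is at least as good everywhere and strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_front(points):
    """Keep only the points not dominated by any other point."""
    return [p for p in points if not any(dominates(q, p) for q in points if q != p)]
```

Applied to the objective vectors (1, 2), (2, 1) and (0, 0), the first two survive as the Pareto front because neither dominates the other, while (0, 0) is dominated by both.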
Particle Swarm Optimization (PSO) [3] is a stochastic optimization technique which imitates the social
behaviour of a bird flock flying around and sitting down on a pylon. The idea of the particular PSO
implementation, discussed in this paper, is that the flock should now sit down on the transmission line (the
pareto front) and the pylons are controlling points (sampling points) for the flock. The reliability and the
performance of the proposed method will be tested using analytical examples.
2. PARETO OPTIMALITY
The solution to a multiobjective problem is, as a rule, not a particular value but a set of values of the
optimization variables such that, for each element in this set, none of the objective functions can be further
improved without worsening some of the remaining objective functions (every such value of an optimization
variable is referred to as Pareto optimal).
The simplest way of illustrating this behaviour is shown in Fig. 1. The point xt+1 is Pareto optimal in
both cases because (1) or (2) is true (OF ... objective function value).
Figure 1. Pareto optimal points - single dimension
Figures 2 and 3 illustrate Pareto optimality in a two-dimensional parameter space on simplified Schaffer
functions (3) [4].
In point A both objective function values can be improved by moving into the region enclosed by the
(negative) gradient vectors, while in point B the two objectives are in conflict; an improvement of one objective
causes a deterioration of the other, which corresponds to the definition of a Pareto optimal point. The
gradient vectors can be formulated as a positive linear combination. It is obvious that all objective function
values can be improved for point C, while point D is again Pareto optimal; none of the objective functions can be
improved without worsening some of the remaining objective functions. In point D it is also possible to state
g1k1 + g2k2 + g3k3 = 0, where k1, k2 and k3 are positive numbers.
Figure 2. Two conflicting objectives - 2D
Figure 3. Three conflicting objectives - 2D
Figure 4. Two conflicting objectives - 3D
Figure 5. Three conflicting objectives - 3D
The three-dimensional cases are shown in Fig. 4 and Fig. 5; the behaviour of Pareto optimal solutions
and the corresponding gradient vectors is the same as in two dimensions. The mathematical interpretation of
these considerations requires linear algebra and the treatment of a system of equations built from the gradient
direction vectors. A point is identified as Pareto optimal if (4) is fulfilled, that is, if the rank of the gradient
matrix G is smaller than m.
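Condition (4) is not reproduced here; the following sketch therefore tests only the stated rank criterion rank(G) < m, assuming the m gradient vectors are stored as the rows of G:

```python
import numpy as np

def gradient_rank_test(gradients):
    """Necessary condition for Pareto optimality from the text: the
    gradient matrix G (one gradient per row) has rank smaller than the
    number of objectives m, i.e. the gradients are linearly dependent."""
    G = np.array(gradients, dtype=float)
    return np.linalg.matrix_rank(G) < G.shape[0]
```

At a point like B in Fig. 2 the two gradients are anti-parallel, e.g. g1 = (1, 0) and g2 = (-1, 0), so rank(G) = 1 < 2 and the criterion is met; at a point like A, e.g. g1 = (1, 0) and g2 = (0, 1), rank(G) = 2 and the point is not Pareto optimal.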
3. PARTICLE SWARM OPTIMIZATION
In this population-based search procedure, s particles fly around in a multidimensional search space.
During flight each particle adjusts its position xt and velocity vt according to its own experience and the position
of the best of all particles.
The discrete ordinary differential equation controlling the motion of a single swarm member is given in
(6).
The PSO algorithm for single objectives works as described in the following: The swarm is uniformly
spread in the
parameter space, bounded by [xmin, xmax]. The fitness of all swarm members is evaluated and the first swarm
leader is specified. The position and velocity of each swarm member are updated according to (6).
After evaluating the performance of all particles a new swarm leader (the member with the best performance) is
found. If the swarm collapses (i.e. the swarm radius falls below a certain small positive constant ε) or the
maximum number of iterations has been exceeded, the algorithm terminates.
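Equation (6) is not reproduced here; the sketch below therefore substitutes the standard PSO velocity update (inertia plus cognitive and social terms) as an assumed stand-in, not the authors' exact formulation. The stopping rules follow the text: swarm collapse below ε or the iteration limit.

```python
import random

def pso(f, dim, xmin, xmax, s=20, maxit=200, eps=1e-9, w=0.7, c1=1.5, c2=1.5):
    # spread the swarm uniformly in [xmin, xmax]
    x = [[random.uniform(xmin, xmax) for _ in range(dim)] for _ in range(s)]
    v = [[0.0] * dim for _ in range(s)]
    pbest = [xi[:] for xi in x]                  # personal best positions
    leader = min(pbest, key=f)[:]                # first swarm leader
    for _ in range(maxit):
        for i in range(s):
            for d in range(dim):
                v[i][d] = (w * v[i][d]
                           + c1 * random.random() * (pbest[i][d] - x[i][d])
                           + c2 * random.random() * (leader[d] - x[i][d]))
                x[i][d] = min(max(x[i][d] + v[i][d], xmin), xmax)
            if f(x[i]) < f(pbest[i]):
                pbest[i] = x[i][:]
        leader = min(pbest, key=f)[:]            # new swarm leader
        # swarm radius: largest coordinate distance of any particle to the leader
        radius = max(max(abs(xi[d] - leader[d]) for d in range(dim)) for xi in x)
        if radius < eps:                         # swarm collapsed
            break
    return leader
```

On a simple 2D sphere function the sketch converges to the vicinity of the optimum within the default iteration budget.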
4. PARTICLE SWARM OPTIMIZATION AND PARETO OPTIMALITY
To implement the idea of a bird flock landing on a transmission line using pylons as orientation points,
the whole particle swarm is partitioned into sub-swarms. The "pylons" are additional objective functions, which
are weighted sums of the main objectives (no extra function calls – only multiplications and additions).
The algorithm again starts with a randomly distributed starting configuration, the initialization swarm,
generated in the range [xmin, xmax]. The objective function values OF1, OF2, ..., OFm of each
swarm member are determined and (7) is calculated for a set of weights.
For each weighted sum, the particle yielding the best objective function value becomes a swarm leader
x_best(t,k). The swarm is equally partitioned and each part of the swarm evolves in the direction of its own swarm
leader x_best(t,k) according to (8), where x_best(t,k) = min(OFwk) is the position of the current swarm leader of
swarm part k. The size of a swarm part k is set to s/n.
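The selection of sub-swarm leaders from the weighted sums (7) can be sketched as follows (a minimal illustration; the names and data layout are ours):

```python
def sub_swarm_leaders(swarm, objectives, weight_sets):
    """For each weight vector w_k, the particle minimizing the weighted
    sum OFw_k = sum_j w_kj * OF_j(x) becomes the leader of sub-swarm k.
    Once the OF_j values are known, only multiplications and additions
    are needed - no extra forward-problem solutions."""
    leaders = []
    for w in weight_sets:
        leaders.append(min(swarm,
                           key=lambda x: sum(wj * f(x)
                                             for wj, f in zip(w, objectives))))
    return leaders
```

With two one-dimensional objectives, x and (x - 2)^2, the weight vectors (1, 0) and (0, 1) pick the two individual optima of a small swarm as leaders.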
The major modification compared to the standard algorithm is the detection of Pareto optimal solutions.
To speed up the algorithm, this process is twofold. The first step (preliminary Pareto decision) narrows
down the choice by testing all swarm members against the simple Pareto formulations (1) and (2). The fitness
OFj(xt+1) and the previous fitness OFj(xt) are already calculated, therefore no additional solutions of the
forward problem are needed in this step. If (1) or (2) is fulfilled, the point may be Pareto optimal and
the procedure continues with a substantially stricter selection process (main Pareto decision), taking the gradient
information (9) into account. To do so, additional objective function values OF1, OF2, ..., OFm are evaluated at
positions slightly displaced from the current position xt+1 of the particle under investigation (OFj,Δx).
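Formulations (1) and (2) are not reproduced here; as a rough sketch, the preliminary decision can be read as checking whether the step from xt to xt+1 left the objectives in conflict, using only fitness values that were computed anyway (our interpretation, hedged accordingly):

```python
def preliminary_pareto_decision(of_prev, of_next):
    """Candidate test (minimization assumed): between the old and new
    fitness vectors some objectives improved while others deteriorated,
    so the new point may be Pareto optimal. Candidates are then passed
    on to the stricter gradient-based main decision."""
    improved = any(a < b for a, b in zip(of_next, of_prev))
    worsened = any(a > b for a, b in zip(of_next, of_prev))
    return improved and worsened
```

A move that improves one objective and worsens the other is flagged as a candidate; a move that improves both objectives is not, since the particle can still advance.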
If a particle is recognized as Pareto optimal, it has reached its final position and is removed from the
swarm. If the number of remaining swarm members has reached a prescribed threshold or the maximum number
of iterations maxit has been exceeded, the algorithm stops.
5. ANALYTICAL TEST FUNCTIONS
The simple test functions (3) were scaled up to five-dimensional space. To obtain the Pareto optimal
front, the proposed Pareto PSO version was applied and compared to a classical PSO algorithm, both in
reliability of the results and in computational effort.
The Pareto PSO version was started with a population size s = 100 and the algorithm was stopped when
at least 25 Pareto optimal points were detected. Only one solution per objective can be found by a classical PSO run;
therefore the standard algorithm was applied to each weighted sum of objectives (7) sequentially. The same
weighted sums were used for the Pareto PSO version as orientation points. The standard PSO procedure was
applied twice, first with a constant swarm size sS for all dimensions and second with a swarm size sS
depending on the dimension and the number of main objectives. Two test cases were investigated, one with two
conflicting objectives and the other with three conflicting objectives. The complete series included ten
optimization runs for two, three, four and five optimization parameters (dimensions), respectively.
A. Analytical Test Functions - Two Conflicting Objectives
For the two-objective test series the weights were chosen as l = [0 0.25 0.5 0.75 1], leading to five
objective function values OFwk = lk·OF1 + (1 − lk)·OF2. Table 1 summarizes and compares the results.
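The five orientation-point objectives can be written out directly (an illustrative sketch of the weighted sums above):

```python
weights = [0.0, 0.25, 0.5, 0.75, 1.0]   # l_k for the five orientation points

def of_w(k, of1, of2):
    """OFw_k = l_k * OF1 + (1 - l_k) * OF2 for the k-th orientation point."""
    lk = weights[k]
    return lk * of1 + (1.0 - lk) * of2
```

For OF1 = 4 and OF2 = 2 this yields the five values 2, 2.5, 3, 3.5 and 4, one per orientation point.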
The proposed Pareto PSO was able to locate at most 36, 34, 32 and 31 Pareto optimal points (pops)
during a single run in the 2D, 3D, 4D and 5D case, respectively, while the mean number of found pops was
between 25.1 and 28.6. As can be seen, the number of function calls increased from 2198 in the 2D case to
10391 in the 5D case. This is not the case when running the classical PSO with a constant swarm size sS = 20.
But this approach fails completely if higher-dimensional problems are tackled. In the 5D case, the best run was
able to determine two (out of five) optimal points; the mean value turned out to be even worse, namely 0.6 (out of
five) orientation points.
Scaling the swarm size with the dimension of the problem (sS =20, 40, 60 and 80 in the 2D, 3D, 4D and
5D case, respectively) brings the classical PSO back into the game, at the cost of additional function calls, which
are comparable to the ones of the pareto PSO. The mean value of obtained optimal points corresponds in all
cases with the number of orientation points, which means that all possible solutions are found. Nevertheless one
should prefer the solutions of the pareto PSO, which represent the pareto front much better than the discrete
orientation points.
Figure 6. Two conflicting objectives - 2D
Figure 7. Pareto optimal front - OF1 and OF2
Figure 8. Two conflicting objectives - 3D
Figure 9. Two conflicting objectives - 3D - all pops
Figures 6 and 8 show the Pareto optimal solutions (denoted by dots) obtained during a single run in the
2D and 3D case, respectively. They are nicely distributed along the Pareto optimal front, spanned by the
orientation points (denoted by stars). In Fig. 9 all Pareto optimal solutions in the 3D case obtained in the 10
successive runs are plotted. Figure 7 displays an OF2(OF1) diagram, which is the same for all tested cases with
two conflicting objectives.
B. Analytical Test Functions - Three Conflicting Objectives
When three conflicting objectives have to be taken into account, 15 orientation points are set up as
given in (7). From Fig. 10 it can be seen that in the 2D case these orientation points (denoted by stars) are
distributed along and inside a triangle with the three individual optimal points (max(OF1), max(OF2) and
max(OF3)) as vertices. Figure 11 also shows the distribution of Pareto optimal points (denoted by dots) obtained
by a single Pareto PSO run, while in Fig. 8 the pops of the 3D case are given.
Figure 10. Three conflicting objectives - 2D
Figure 11. Three conflicting objectives - 3D
A closer look at Tab. 2 reveals results similar to the previous ones, but much more in favour of
the Pareto PSO. The classical PSO with a constant swarm size sS fails completely once the 2D space is left.
Increasing the swarm size in proportion to the dimension of the problem restores the reliability of the
method, but at the cost of an enormous number of function calls (about 50000 in 5D). The Pareto PSO, though, is
able to approximate the Pareto front very accurately with a remarkably low computational effort.
6. CONCLUSION
The particle swarm optimization algorithm was applied to multiobjective optimization problems with
two and three objectives. The algorithm showed excellent performance in terms of the number of solutions
of the forward problem and delivered a very reliable representation of the Pareto optimal front.
The Pareto optimal front gives a survey of all reasonable solutions of the multiobjective problem. The
user is then free to choose the preferred realization from the whole set of solutions.
REFERENCES:
[1] ALOTTO, P.G., BRANDSTÄTTER, B., CELA, E., FÜRNTRATT, G., MAGELE, Ch., MOLINARI, G.,
NERVI, M., PREIS, K., REPETTO, M. and RICHTER, K.R.: “Stochastic Algorithms in Electromagnetic
Optimization” IEEE Trans. Magn., vol. 34, No. 5, pp 3674-3684, 1998
[2] DI BARBA, P., FARINA, M., SAVINI, A.: “An improved technique for enhancing diversity in Pareto
evolutionary optimization of electromagnetic devices”. COMPEL: Int. J. for Computation and Maths. in
Electrical and Electronic Eng., Volume 20, No. 2, 2001
[3] BRANDSTÄTTER, B., BAUMGARTNER, U.: “Particle swarm optimization - mass-spring system
analogon” Magnetics, IEEE Transactions on , Volume: 38 Issue: 2 Part: 1 , March 2002
[4] SCHAFFER, J.D.: “Multiple Objective Optimization with Vector Evaluated Genetic Algorithms”. In Genetic
Algorithms and their Applications: Proc. first Int. Conf. on Genetic Algorithms, pages 93–100, 1985.
THE RISK ANALYSIS OF SOFT COMPUTING PROJECTS
Branislav Lacko
Institute of Automation and Computer Science, Faculty of Engineering
Brno University of Technology, Technicka 2, 616 69 Brno, Czech Republic
lacko@uai.fme.vutbr.cz
Abstract: The paper provides basic information on the RIPRAN method which has been developed
at VUT Brno to support project teams in risk analysis of soft computing projects. The method is
suitable especially for the risk analysis of company information system projects.
Key words: Project Control, Risk Analysis Method, Risk Identification, Risk Quantification
1. THE IMPORTANCE AND POSITION OF PROJECT RISK MANAGEMENT
1.1. The goal of project management
Project Risk Management includes processes dealing with identification, analysis and response to risks
in projects, with the aim of minimizing their impact on the project [3].
The goal of project management is represented by a successful project. A project can be deemed
successful if complying with the following:
- The goals planned have been achieved
and
- The project has been finished as scheduled
and
- The budget planned has been observed
and
- Sources available have been used
and
- Optimum efforts have been made to carry out the project
and
- Design and documentation have been done in required quality
and
- The project has no negative impact on environment, project participants or other projects
The above-mentioned factors characterizing a successful project show us that the requirements for a
successful project are very demanding. At the same time we must reject the approach of deliberately stipulating
small goals and low requirements to make their fulfilment easy – a so-called soft project. A project designed
to be easily completed as a successful project is not a quality project!
Soft computing projects are very complicated:
· The goal is very sophisticated
· The goal changes during the project
· The goal is difficult to describe
The main goal of project management is to design and carry out successful projects.
In practice, we meet unsuccessful projects very often. A project is considered unsuccessful in the
following cases:
· Any of the goals planned has failed to be achieved
· The project has not been finished in the term scheduled
· Costs planned have been exceeded
· Available sources have not been used or necessary sources have been lacking
· Inadequate efforts have been made to achieve the goals, meet the deadline and keep the costs
· The project has been done in confusion and a whole number of critical situations occurred, the reasons for which could have been foreseen beforehand
· The project has a negative impact on the environment or other projects
A whole number of projects even show a combination of several of the unsuccessful characteristics mentioned
above.
There are many general reasons for a project's failure:
· The project's definition (goals, deadlines, costs, sources) does not correspond to real needs
· Changing requirements for the project due to changes in the project's surroundings, which have not been accepted
· Underestimated planned costs
· Lack of necessary works
· Wrong choice of contractors for the project's individual parts
· Wrong identification of the initial state
· Wrong project team management
· Bad team work
· Underestimation of various unfavourable factors or even their negligence
· Insufficiently precise project design
· Insufficient qualification of team members
· Insufficient involvement of the end user of the project results in the project
· Failure in mastering project management methods
· Bad environment and bad standard of work management in the company carrying out the project
· Wrong link to other projects
· Unclear definition of goals
· Wrong estimates of time and costs of specific works
· Failure of the project manager
· Fluctuating team staff
· Bad team line-up
· etc.
Moreover, with software projects there is the danger of project impairment due to the occurrence of
programming errors.
As we can see, the list of frequently repeated, general failures is not and cannot be complete.
Furthermore, each of us is sure to remember specific reasons for some concrete software project failure!
Project management, however, aims to assure the greatest possible chance of achieving a successful
project by means of a project risk analysis. That is the main purpose of the project risk analysis [1].
Therefore, we must ask the following questions as early as the project design stage:
· To what extent can we expect our project to be successful?
· To what extent can we expect our project to fail?
· What can endanger our project's success?
· What can support our project's success?
· What can we do to increase the expected success of the project?
· What can we do to decrease the expected failure of the project?
Answering the above-mentioned questions and a broader analysis of the reasons help us prepare
measures moderating the possibility of the project's failure and increasing the probability of the project being
successful.
1.2. The importance of the environment's impact on the project
In many cases, the source of the project's failure lies in the mistakes committed by the individual project
team members. We can understand them as internal reasons for the project's failure. These can be eliminated
relatively easily by thorough preparation of the team staff and by improving their professional knowledge and
experience. With software projects these mistakes include mistakes in programme products as a consequence of
programmers' bad work.
The influences coming from the project's environment have a far more substantial, unfavourable and
sometimes hardly predictable impact. These can be called the external causes of the project's failure. These
influences may often come as a response to those exerted by the project on its environment.
A project and its environment influence each other mutually!
What changes can the project environment cause?
· Changing the goal
· Changing the road to the goal
· Changing the sources available
· Changing the conditions of the project implementation
A good project manager thus does not focus exclusively on the project – he observes the project's
environment as well.
The relation between project management (PM) and risk engineering (RE) can be shown as a scheme in
which the two fields overlap in Project Risk Management.
2. BASIC RISK ENGINEERING TERMS
The RIPRAN method (see below) is based on the risk engineering principle [1], namely that to
analyse the risk we must first determine the following quadruple and prepare a relevant list thereof:
Threat – Scenario – Probability – Loss
As the number of accidental events can never be predicted precisely, the list cannot be complete. The
incompleteness of the list is also caused by the knowledge, or rather lack of knowledge, of the project team members;
therefore we talk about a representative list, i.e. a list presenting all the significant risks we were able to
determine and which we take as a base for the specific risk analysis.
Let us briefly present the meaning of these terms as understood by the RIPRAN method.
Threat
A hazard which threatens the project and causes disastrous consequences and troubles within it (e.g. a
violent windstorm, an insufficient loan, glaze ice, devaluation of the currency, a strike, the project manager's resignation, a bad sub-delivery for the project, ...).
Scenario
The course of events we assume in the project as a consequence of the threat's occurrence (e.g. we are not
granted a loan – the project will not be covered financially; Johnny falls ill – we lose the single staff member
who is able to do the work for our project...).
Probability
The probability of the scenario occurring, expressed as P ∈ ⟨0, 1⟩.
We relate the probability to the project duration 1) and/or a so-called reference time 2) during which we feel
endangered by the threat. Note that this is a compound probability: a scenario occurs with a certain probability
given a threat which itself occurs with a certain probability. Usually we expect both phenomena to be mutually
independent. If the windstorm probability is 0.03 and the probability that a windstorm makes the crane
fall down is 0.7, then the resulting probability which we take into account in this case is 0.7 ×
0.03 = 0.021.
1) E.g. for a strong windstorm of force 11 on the Beaufort scale in our geographic conditions, the
probability of occurrence within one year is 0.01, but over 100 years it is 0.63.
2) E.g. cable laying is to be done from 1 March to 25 March – then we are interested in the occurrence
of ground freezing in this period.
Loss
The loss to the project incurred through the occurrence of the scenario. We usually express it in financial
units (though of course other measures are possible, such as time delay, loss of staff lives, etc.).
To each such n-tuple we can assign the risk value:
Risk value = probability × loss
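The quartet and its risk value can be sketched as follows (illustrative names; the windstorm numbers are taken from the example above):

```python
def risk_value(probability, loss):
    """Risk value = probability * loss (loss e.g. in financial units)."""
    return probability * loss

# Compound probability for mutually independent phenomena, as in the text:
p_threat = 0.03     # the windstorm occurs
p_scenario = 0.7    # the windstorm makes the crane fall down
p_total = p_threat * p_scenario   # resulting probability, 0.021
```

With a loss of, say, 100 000 financial units, the risk value of this threat–scenario pair would be 0.021 × 100 000 = 2100.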
3. THE RIPRAN METHOD
3.1. Characteristics of the method.
The RIPRAN method (RIsk PRoject ANalysis) is a simple empirical method for project risk
analysis, intended especially for medium-sized enterprises.
It is strictly based on a process approach to risk analysis. It understands risk analysis as a process
(inputs into the process – outputs of the process – activities transforming the inputs to the outputs with a certain
target).
The method accepts the philosophy of Total Quality Management (TQM); therefore it comprises activities
assuring the quality of the project risk analysis, as required by ISO 10006 [2].
The method is designed to respect the principles of Project Risk Management as described in the PMI and
IPMA literature.
It is focused on processing the project risk analysis, which must be carried out before the
implementation itself.
This does not mean that we should not work with threats in the other stages. On the contrary – in each
stage of the project's life cycle we must carry out activities which gather basic data for the project risk
analysis of the implementation stage and evaluate the possible risks of failure in the stage currently
in progress. The risks recorded will then be used for the general project risk analysis.
The entire project risk analysis process [3] consists of three activities:
· Risk identification
· Risk quantification
· Risk reduction
These activities are conceived as mutually related processes.
3.2. Risk identification
Goal: Finding threats and scenarios
Inputs:
· Project description
· Historical data on previous projects (Post Implementation Analysis, Trouble List)
· Prognoses of possible external influences
· Prognoses of possible internal influences
· Experience
Activities: Application of WHAT-IF questions; application of WHAT-IS questions
Output: A list of pairs threat – scenario
№ | Threat | Scenario
Quality supporting activities:
· Input data validity and completeness test
· Test of the team competency and completeness
· Test of the up-to-date condition of prognostic input data
· Test of output pair list completeness
3.3. Risk quantification
Goal: Evaluate the probability of scenarios and the scope of damages, and assess the risk rate
Inputs:
· A list of pairs threat – scenario
· Statistical data on previous projects and various other statistical data
· Experience
Outputs:
Complete n-tuples (threat, scenario, probability, loss) – interim result
№ | Threat | Scenario | Probability | Loss | Risk Value
· List I – for completing the project design
· List II – for information on possible operative actions
· List III – for the follow-up risk reduction process
· Preliminary standard of the general project risk
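The quantification step can be sketched as extending the identified pairs to full n-tuples (a minimal illustration; the lookup functions for probability and loss are placeholders for the team's estimates):

```python
def quantify(pairs, estimate_probability, estimate_loss):
    """Extend (threat, scenario) pairs from risk identification to the
    RIPRAN n-tuples (threat, scenario, probability, loss, risk value)."""
    rows = []
    for threat, scenario in pairs:
        p = estimate_probability(threat, scenario)
        loss = estimate_loss(threat, scenario)
        rows.append({"threat": threat, "scenario": scenario,
                     "probability": p, "loss": loss,
                     "risk value": p * loss})
    return rows
```

The resulting rows correspond to the table above; List III for the risk reduction process could then be obtained by filtering the rows whose risk value exceeds an agreed threshold.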
Quality supporting activities:
· Input data list validity and completeness test
· Test of the team competency and completeness
· Test of statistical data up-to-date condition
· Test of output pair list completeness
3.4. Plans for the risk reduction
Goal: Based on information and awareness of the danger, measures are to be prepared to reduce the risks impact.
Input:
List of n-tuples (threat, scenario, probability, loss) which must be taken into account (List III)
Activities: Creative application of 9 prototype risk reduction measures
Output:
· Plans of risk reduction (a plan of measures to reduce the risk)
· Evaluation of the associated project risk based on the measures
Quality supporting activities:
· Test of the input list validity and completeness
· Test of the team competency and completeness
· Checking the plans for project risk reduction
4. CONCLUSION
Soft computing projects usually lack a chapter on risk analysis. This is certainly one of the reasons why
a whole number of soft computing projects fail. When designing software, our software firms should use
modern project management to a much greater extent, also considering the problems of risks in project design and
implementation.
The RIPRAN method is a suitable means of carrying out the risk analysis and elaborating a high-quality
design to decrease the risk. The Institute of Automation and Computer Science wants to focus on concrete
company projects of control and information systems, soft computing projects, etc.
At present the 1st version is available and a 2nd version is being prepared, which will include computer
support for the RIPRAN method plus a form of documents complying with the requirements of the ISO 9000:2000
standards.
REFERENCES:
[1] BOEHM, B.W.: Risk Management. IEEE Computer Society Press, Los Alamitos, 1975. ISO 10006
[2] IPMA Competence Baseline. Report of International Project Management Association Project Management
Body of Knowledge. Project Management Institute. Upper Darby 1996, (Chap. 11 Risk Project Management)
[3] LACKO, B.: RIPRAN Method. Research Report 02/2000. TU of Brno – Faculty of Mechanical Engineering –
Institute of Automation and Computer Science, Brno 2000, 57 pp.
The paper was elaborated within the framework of the research plan of the Ministry of Education No. CZ J22/98 260
000013.
NETWORK VIRTUAL LABORATORY – INDUSTRIAL ROBOT
Wlodzimierz M. Baranski, Tomasz Walkowiak
Institute of Engineering Cybernetics, Wroclaw University of Technology
ul. Janiszewskiego 11/17, 50-372 Wroclaw, POLAND,
Phone: +48-71-3202969, +48-71-3203996 Fax: +48-71-3212677,
E-mail: {mwbar, twalkow}@ict.pwr.wroc.pl
Abstract: The paper describes an approach to distance learning in computer engineering. A
virtual simulated laboratory available to students through the Internet is presented. The aim of the
laboratory is to teach students how to control an external device: an industrial robot of type b. Students
can test the robot's functionality by setting individual bits of the control ports.
Furthermore, distant access to a programming environment is implemented, which gives the user the
possibility to write programs controlling a virtual robot. The whole system has been implemented
in Java and requires only a web browser with the Java plugin on the user's end.
Keywords: distance learning, virtual laboratory, industrial robot, Internet
1. INTRODUCTION
The development of the Internet has a big influence on the global information society. As is stressed in [7],
education policy is of the utmost importance for the success of the transformation towards the global
information society.
We think that one of the solutions for global education is Internet-based distance learning. In
science and engineering the laboratory is the lifeblood of teaching. We focus here on computer
engineering laboratory exercises in which students write and verify their own programs on real peripheral devices.
In this testing phase equipment is very often destroyed, so that during the next laboratory session students
are not able to work. The service and price of such devices are usually high and repairs require a qualified
service man. The idea of replacing real devices with a simulator with Internet access makes
laboratory exercises possible in distance learning. It is important to remember that the simulator must behave in the same
way as the real device. The distance learning laboratory presented in this paper is functionally compatible with
the real device [1].
In the case of distance learning there is no possibility to access a real laboratory. Therefore, practical
classes are a real challenge. Generally, there are two approaches to this problem and each of them has its own
advantages and disadvantages:
· a real laboratory with distance access;
· a virtual laboratory.
In the work reported here the concept of a virtual laboratory was used. The idea of distance access to real
devices was rejected because of the following disadvantages: the limited number of simultaneously working students
(depending on the number of real devices) and the total cost of the system – devices, webcams (to view the
actual state of the process) and converters of distant user commands.
The virtual laboratory does not exist in reality. Real devices, on which students practise, are replaced by
programs simulating the behaviour of those devices in real time. Access to such a laboratory, gained through the
Internet, is limited only by the availability of free resources on the server and the network bandwidth, not by
the number of real devices or the time of day. There is no need to use additional devices such as webcams or
special interfaces; therefore no special service is needed in case of a device malfunction. The simulation
software can run on the client side or on the server (when the simulator needs more resources). There is also an
option of running one part of the program on the user's machine and the other on the server. The choice
depends on many factors such as the quality of the network connection, the complexity of the simulation algorithm,
the difficulty of installation on the user's machine, system maintenance (distribution of new versions and patches)
and computer system security in the broad sense.
2. NETWORK VIRTUAL LABORATORY
Network Virtual Laboratory (NetVL) [2] is a computer-aided learning system developed at Wroclaw University of Technology. The aim of NetVL is to support learning to program such external devices as a milling table [4] or an industrial robot. There is also a possibility of programming microprocessor devices such as PCs. The main purpose of the system is distance access to those devices through the Internet. The user is expected to have only a Java-enabled Internet browser and, of course, Internet access.
Besides the ability to view the simulated device and its registers, the user can operate in the programming and running environment. He can write programs controlling the devices, then compile and run them without having any compiler installed on his machine. Access to the compiler and to the computer where the user's program is actually run is possible through the Internet. As with inspecting the simulated device, only a Java-enabled browser is required. A Java applet which provides communication with the compiler runs in the browser's window.
NetVL consists of three main modules (see Fig. 1):
· Management subsystem (MS) – allows user identification and acts as a communication point for all other modules;
· Virtual external device (VED) – simulates and visualises the behaviour of a given external device;
· Virtual microprocessor (VM) – makes it possible to create and run, on the device simulator, user programs which interact with the VED.
Fig. 1. NetVL modules communications
3. VIRTUAL EXTERNAL DEVICE (VED)
Internal structure of the virtual device (Fig. 1) consists of:
· Virtual Ports. Module that represents (communication) ports of a real control computer (but accessed through the net). Acts as an interface between the user (VM and Manipulator) and the Simulator. This module runs on the server side.
· Internal Ports. The same functionality as the Virtual Ports, but with additional information on the internal state of the device. It sends on-line info to the Visualization module that visualizes the device's functionality. This module runs on the server side.
· Simulator. Module that simulates the device's behavior. It runs on the server side and works on-line. The outside world can change the internal state of the simulator at any time by accessing the virtual ports (by playing with the manipulator or through any outside program). The work of the simulator is visualized on-line in the visualization module, where the user can see the movement of the simulated device and the results of playing with the virtual ports.
· Manipulator. Gives the ability to control the device (Simulator) by simply playing with the input ports in Virtual Ports. This module runs on the user's computer.
· Visualization. Module that visualizes the behavior of the device based on the data written to the Internal Ports. This module runs on the user's computer.
4. ROBOT EXTERNAL DEVICE
One of the realised Virtual External Devices is an industrial robot [7]. The External Device simulates the behaviour of an industrial robot type b (IRb) [5]. The industrial robot type b, shown in Fig. 2, consists of: arm motor, arm, forearm, shoulder motor, hand motor, handswivel motor, gripper, shoulder, base motor and base. The robot is made up of several rigid bodies, called links, connected sequentially by revolute or prismatic joints to achieve the required rotational and/or translational motion. Mechanically, a robot manipulator is composed of an arm and a wrist attached to a tool fixture (a gripper). The arm typically has three degrees of freedom, which accomplish the major positioning of the robot arm and place the wrist unit at the work piece. The wrist consists of up to three rotary motions that help to obtain an appropriate orientation of the tool toward the object.
The aim of this lesson is to give familiarity with an industrial robot (Fig. 4) and to teach how to control and operate it. It teaches the student how to write programs for controlling such devices. The user is able to control a virtual robot; the robot simulator allows the user to play with its functionality (Fig. 3). Practical work (writing one's own programs) allows the user to practise the methods on real problems.
Fig. 2. Industrial robot type b
Fig. 3. Robot applet
4.1. The ports – general description
The robot is connected to a controlling unit (say a PC) and controlled through one 16-bit input virtual port (the Control Port, assigned to address 300h) and ten 16-bit output virtual ports (at addresses 300h-312h).
The robot has ten output ports:
· the first at address 300h and known as a Status Port,
· the second at address 302h and known as a Sensor Port,
· the next five known as Motor Ports:
· the Base Motor Port at address 304h,
· the Shoulder Motor Port at address 306h,
· the Arm Motor Port at address 308h,
· the Hand Motor Port at address 30Ah,
· the Handswivel Motor Port at address 30Ch,
· and the next three known as Position Ports:
· the X-Coordinate Port at address 30Eh,
· the Y-Coordinate Port at address 310h,
· the Z-Coordinate Port at address 312h.
4.2. The ports – detailed description
The Control/Status Port consists of an input port (Control Word) and an output port (Status Word)
located at the same address (300h).
The Control Word controls the state of the robot, making it possible to do the following:
· turn the power on or off for each motor in the joints of the robot,
· turn the power on or off for the gripper,
· set the direction of motion of each motor (up/down or left/right),
· set the type of the gripper operation (close or open; when the gripper is on and the gripper operation is set to close, the robot can write in space).
The Status Word includes the same information, but is read-only. Its function is to show the dynamic state of the robot:
· which motors are on or off,
· whether the gripper is on or off,
· what direction of motion is set for each motor (up/down or left/right),
· what type of gripper operation is being performed (close or open).
The Sensor Port (302h) lets you check the state of the sensors that show when the extreme positions of
each motor (right/left or highest/lowest) are reached and check the state of the gripper (closed or open).
The Motor Ports (304h-30Ch) give the actual state of each motor (deflection in degrees relative to its standard position at reset). Each port has its own range of motion:
· Arm Motor from -30 to 30 degrees,
· Shoulder Motor from -40 to 20 degrees,
· Hand Motor from -20 to 200 degrees,
· Handswivel Motor from -90 to 90 degrees,
· Base Motor from -170 to 170 degrees.
The Coordinate Ports (30Eh-312h) give the current position of the gripper (edge). The position of the
gripper is described by three coordinates: X, Y and Z (at addresses 30Eh, 310h, 312h, respectively).
Fig. 4. Robot simulator in action
4.3. Simulator architecture
The robot simulator was written in Java and its components agree with the VED concept presented earlier. It allows communication with the NetVL system components as presented in Fig. 5. It consists of two separate programs: a robot server (a Java program running as a daemon) and a robot client (with the front end presented in Fig. 3), which is a Java applet.
Fig. 5. Robot Simulator architecture
5. VIRTUAL MICROPROCESSOR (VM)
The Virtual Microprocessor (VM) subsystem of the NetVL is focused on teaching how to write programs. The internal structure of the virtual microprocessor consists of the following elements:
· Programming environment. Allows the user to prepare (write, compile), run and test controlling programs on a remote computer. It will run some commercial compilers and processor simulators and send terminal-like outputs to the user. The executable programs will stay on the server side. It will also allow execution of these programs. The program running on the server side will communicate with the outside world: with the user through the Terminal and with the VED via the Virtual Ports. This module will be run on the server side.
· Control. Module that allows the user to edit programs and to execute compilation or run programs on the remote computer. Any compiler or linker outputs will be seen on the user's screen. This module will be run on the client side.
· Terminal. Module that allows terminal-like communication between an executing program and the user.
· Real device. There is a possibility that the program will be run on a real device fitted in the computer on the server side [2].
The user of the VM can:
· create a new file or directory;
· edit and save an existing file (Fig. 6);
· change a file or directory name (Fig. 7);
· delete a file or directory.
All users have their own disk space that can only be accessed by them (and, of course, by the administrator). When an authorized user accesses a given VM for the first time, the VM server automatically creates a user disk space.
The user can create executable programs. The user applet in the VM subsystem allows the user to run compilers or linkers on remote machines and to see the program outputs in a terminal-like window. Programs can be run on a remote machine and can communicate with the user through a terminal-like interface.
One realization of the VM [2] (the so-called PC VM) is based on the Borland C++ 5.5 freeware compiler. It allows the user to write any C or C++ language program (though only a terminal user interface is supported) and run it. Using a terminal window, the user can communicate with a program running on a remote computer.
Fig. 6. VM applet – contents of directory
Fig. 7. VM applet – contents of directory
The PC VM also has the ability to run programs that control Virtual External Devices (VED), such as a milling table (see Fig. 9). We have developed a special C library giving the user two functions (inb() and outb()) that allow reading from or writing to a VED port. Writing to or reading from a VED port is performed over a TCP/IP connection with the VED server. The visual effects of controlling the VED can be seen in the client applet.
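As an illustration of the library's use, the sketch below drives the robot VED ports described earlier through inb() and outb(). Only the two function names and the port addresses come from the paper; the Control Word bit layout and the local stubs standing in for the TCP/IP-backed library are our assumptions.

```c
#include <stdint.h>

/* Port addresses from the robot VED description above. */
enum {
    CONTROL_PORT    = 0x300,  /* write: Control Word           */
    SENSOR_PORT     = 0x302,  /* extreme positions, gripper    */
    BASE_MOTOR_PORT = 0x304,  /* base deflection in degrees    */
};

/* inb()/outb() are the two calls provided by the NetVL C library;
 * there they talk TCP/IP to the VED server. These local stubs stand
 * in for them so the sketch is self-contained. */
static uint16_t port_space[16];
static uint16_t inb(uint16_t addr)               { return port_space[(addr - 0x300) / 2]; }
static void     outb(uint16_t addr, uint16_t v)  { port_space[(addr - 0x300) / 2] = v; }

/* HYPOTHETICAL Control Word bits -- the paper does not give the layout. */
enum { BASE_MOTOR_ON = 0x0001, BASE_DIR_LEFT = 0x0002 };

/* Start the base motor turning left, read back its deflection, stop it. */
static int swing_base_once(void) {
    outb(CONTROL_PORT, BASE_MOTOR_ON | BASE_DIR_LEFT);
    int deflection = (int16_t)inb(BASE_MOTOR_PORT);  /* -170..170 degrees */
    outb(CONTROL_PORT, 0);                           /* all motors off    */
    return deflection;
}
```

In the real PC VM the same calls would travel over TCP/IP to the VED server, and the robot applet would visualize the resulting motion.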
Fig. 8. NetVL modules communications
6. SUMMARY
We have presented a virtual laboratory system available to students equipped with a computer and Internet access. This distance-learning laboratory system is very useful for teaching students how to control an external device: an industrial robot. All materials are delivered over the Internet. Tests proved that ordinary phone-line modem access is sufficient for effective usage; the main drawbacks are long delays. All the software has been written efficiently and computing resources are not heavily challenged. It is possible [3] to include part or the whole of the system in standard Learning Management Systems (for example, WebCT).
7. ACKNOWLEDGMENT
The work reported here was partly sponsored by the European Union in the framework of the "Telematics Application Programme" INCO-Copernicus project "Multimedia Education: An Experiment in Delivering CBL Material" [6].
REFERENCES
[1] BARANSKI W., MAJEWSKI J., DOBROWOLSKI A., Functional simulation of microprocessor external
devices, Proceedings of 4-th International Conference Computer Aided Engineering Education, CAEE
'97. Krakow, Poland, September 11-13, 1997, vol II, pp.280-287.
[2] BARANSKI W., WALKOWIAK T., MM-EDU: network virtual laboratory, Technical Reports Institute of
Engineering Cybernetics, SPR 2/2001, Wroclaw University of Technology, 2001.
[3] BARANSKI W., WALKOWIAK T., Multimedia approach to distance learning, First International Conference on Soft Computing Applied in Computer and Economic Environments, ICSC 2003, January 30-31, 2003, Kunovice, Czech Republic, pp. 150-158.
[4] BARANSKI W., WALKOWIAK T., Network Virtual Laboratory - Milling Table. ICTINS'2003, International Conference on Information Technology and Natural Sciences, October 19-21, 2003, Al-Zaytoonah University, Amman, Jordan, pp. 122-126.
[5] CRAIG, J.J. Introduction to Robotics: Mechanics and Control. Addison-Wesley, 1986.
[6] MM-EDU project http://mm-edu.ict.pwr.wroc.pl.
[7] Poland and Global Information Society: Logging on. Human Development Report, UNDP, Warsaw 2002.
[8] WALKOWIAK T., Storyboard of integrated MM-EDU module full version, Technical Reports Institute of
Engineering Cybernetics, SPR 43/2000, Wroclaw University of Technology, 2000.
SIMD APPROACH TO RETRIEVING
ALGORITHM OF MULTILAYER PERCEPTRON
Jacek Mazurkiewicz
Institute of Engineering Cybernetics, Wrocław University of Technology
ul. Janiszewskiego 11/17, 50-372 Wrocław, POLAND,
Phone: +48-71-3202681 Fax: +48-71-3212677,
E-mail: [email protected]
Abstract: The paper proposes a partially parallel realisation of the retrieving phase of the Multilayer Perceptron algorithm. The proposed method is based on pipelined systolic arrays – an example of SIMD architecture. The discussion is organised around the operations which create the successive steps of the algorithm; the data transferred among the calculation units are the second criterion of the analysis. The efficiency of the proposed approach is discussed on the basis of implementation quality criteria for systolic arrays. The results of the discussion show that it is possible to create an architecture which provides massive parallelism and reprogrammability.
Keywords: Multilayer Perceptron, systolic array, SIMD architecture
1. INTRODUCTION
The paper is related to a partially parallel realisation of the retrieving phase of the Multilayer Perceptron algorithm. The proposed method is based on pipelined systolic arrays. The described methodology can be used as the theoretical basis for hardware or software simulators of the Multilayer Perceptron. The discussion is based on the following assumptions:
· the outcome of algorithms realised according to the proposed methodology is exactly the same as the outcome of the classical Multilayer Perceptron algorithm,
· a three-layer Multilayer Perceptron is taken into account, but the presented approach can easily be adapted to more sophisticated Multilayer Perceptron networks,
· the systolic structure is realised using only digital elements; input and output data are represented in a proper binary code,
· the number of neurons which create the layers of the Multilayer Perceptron is unrestricted, but the maximum number of elementary processors can be limited.
2. MULTILAYER PERCEPTRON ALGORITHM – RETRIEVING PHASE
The Multilayer Perceptron network is composed of sets of neurons which create the layers. The classical net includes three layers:
· input layer,
· hidden layer,
· output layer.
Each layer consists of a proper number of neurons. The size of the input layer equals the size of the input vector, the size of the output layer is related to the code used to describe the network output, and the number of hidden layer neurons is estimated in various ways. For our discussion let's assume that we have:
· N neurons in the input layer,
· K neurons in the hidden layer,
· L neurons in the output layer.
This means that we operate with N-element input vectors and L-element output vectors. It is possible – of course – to discuss more than a single hidden layer, but it does not change the general idea of the presented approach. There are no connections among neurons of the same layer, but the output signal from a single neuron is transmitted as an input signal to all neurons of the next layer. So the outputs of the input layer neurons are the inputs for the hidden layer neurons, and the outputs of the hidden layer neurons are the inputs for the output layer neurons.
The only task of the input layer neurons is to transfer the input vector components to the hidden layer neurons. Thus there is no need to discuss their implementation – we only have to guarantee the transfer of the input vector components to all hidden layer neurons. Neurons from the hidden and output layers realise exactly the same operation, but on different data.
For hidden layer neurons we can describe the following equation:

$$ u_i = f\left( \sum_{l=1}^{N} x_l\, w_{li}^{(1)} \right) \qquad (1) $$

where:
$u_i$ – output value calculated by a single neuron of the hidden layer – a single component of the K-element vector generated by the hidden layer neurons,
$x_l$ – component of the N-element input vector,
$w_{li}^{(1)}$ – weight associated with the connection between input vector component $x_l$ and the hidden layer neuron indexed by i,
f( ) – non-linear, usually sigmoid, neuron activation function.
For output layer neurons we can describe the following equation:

$$ y_i = f\left( \sum_{l=1}^{K} u_l\, w_{li}^{(2)} \right) \qquad (2) $$

where:
$y_i$ – output value calculated by a single neuron of the output layer – a single component of the L-element vector generated by the output layer neurons,
$u_l$ – component of the K-element vector generated by the hidden layer neurons,
$w_{li}^{(2)}$ – weight associated with the connection between hidden layer output component $u_l$ and the output layer neuron indexed by i,
f( ) – non-linear, usually sigmoid, neuron activation function.
As we can notice, each neuron calculates a weighted sum which is the argument of the neuron activation function. Calculations related to neurons of the same layer can be done in parallel, but the sequence of operations ought to be preserved across successive layers. We assume – of course – that the values of the weights are fixed and ready to use, as a product of any learning algorithm proper for the Multilayer Perceptron.
3. DATA DEPENDENCE GRAPHS FOR MULTILAYER PERCEPTRON DURING RETRIEVING
PHASE
A Data Dependence Graph is a directed graph that specifies the data dependencies of an algorithm. In a Data Dependence Graph, nodes represent computations and arcs specify the data dependencies between computations. For regular and recursive algorithms, the Data Dependence Graphs are also regular and can be represented by a grid model. The design of a locally linked Data Dependence Graph is a critical step in the design of a systolic array [8].
The Data Dependence Graphs for the Multilayer Perceptron retrieving algorithm ought to be discussed individually for each layer. Let's start with the hidden layer. The input layer – as we noticed before – is responsible only for proper distribution of the input vector components, so this layer is absent during retrieving algorithm realisation. The hidden layer includes K neurons (according to the assumption in part 2 of this document) and each neuron of this layer collects signals from N neurons of the input layer. For such a topology there are (N × K) weights – obtained during the learning phase [8].
The hidden layer ought to be described by a rectangular Data Dependence Graph (Fig. 1). Each node in this graph – excluding the last horizontal line of nodes – is responsible for calculating an elementary multiplication product. This means that each node realises the operations described by rule (1), but without the summing, using the proper component of the N-element input vector and the proper weight, and without calculating the value of the activation function. The local memory of each node should hold the value of the single weight obtained earlier by the learning algorithm. The size of the graph equals the size of the weight matrix plus a single extra horizontal line with nodes responsible for activation function calculation. Each node of the graph is loaded with two signals. The first one is a component of the input vector. The second one is the current value of the weighted sum calculated by a single neuron of the Multilayer Perceptron. So each node ought to add the calculated multiplication product to the loaded previous value of the weighted-sum signal (1) and pass the updated value to the next node in the same column. The proposed solution requires the local memory capacity of each node to be large enough to store a single weight, but this way it reduces to a minimum the amount of data which ought to be transmitted during retrieving algorithm realisation by the presented Data Dependence Graph. The operations realised by a single neuron of the Multilayer Perceptron are described by a single column of the graph. The components of the input vector are loaded to nodes by the horizontal arcs of the Data Dependence Graph. These values are passed to the next neighbour on the right-hand side. The current value of the weighted sum generated by a single neuron of the Multilayer Perceptron is loaded by the vertical arcs (Fig. 1). The updated value is passed to the next bottom neighbour. This way we can observe only point-to-point communication among the nodes, which means that the presented Data Dependence Graph is a local graph. This property guarantees construction of a well defined systolic array to realise the retrieving algorithm of the Multilayer Perceptron [4]. The nodes of the last, extra horizontal line calculate the value of the activation function with the previously calculated weighted sum as the argument. We propose to realise this operation using a lookup table instead of analytical calculation. Such a solution guarantees quite a short operation time while preserving a satisfactory level of precision. The only problem is that the local memory ought to store the lookup table.
Fig. 1. Data Dependence Graph for hidden layer of Multilayer Perceptron during retrieving algorithm
Now let's discuss the Data Dependence Graph for the output layer of the Multilayer Perceptron. In general the steps which ought to be realised are the same, but the number of neurons is different and the neurons are loaded with the vector generated by the hidden layer neurons. The output layer includes L neurons (according to the assumption in part 2 of this document) and each neuron of this layer collects signals from K neurons of the hidden layer. For such a topology there are (L × K) weights – obtained during the learning phase [8].
The output layer ought to be described by a rectangular Data Dependence Graph (Fig. 2). Each node in this graph – excluding the last horizontal line of nodes – is responsible for calculating an elementary multiplication product. This means that each node realises the operations described by rule (2), but without the summing, using the proper component of the K-element vector generated by the hidden layer neurons and the proper weight, and without calculating the value of the activation function. The local memory of each node should hold the value of the single weight obtained earlier by the learning algorithm. The size of the graph equals the size of the weight matrix plus a single extra horizontal line with nodes responsible for activation function calculation. Each node of the graph is loaded with two signals. The first one is a component of the vector generated by the hidden layer neurons. The second one is the current value of the weighted sum calculated by a single neuron of the Multilayer Perceptron. So each node ought to add the calculated multiplication product to the loaded previous value of the weighted-sum signal (2) and pass the updated value to the next node in the same column. The proposed solution requires the local memory capacity of each node to be large enough to store a single weight, but this way it reduces to a minimum the amount of data which ought to be transmitted during retrieving algorithm realisation by the presented Data Dependence Graph. The operations realised by a single neuron of the Multilayer Perceptron are described by a single column of the graph. The components of the vector generated by the hidden layer neurons are loaded to nodes by the horizontal arcs of the Data Dependence Graph. These values are passed to the next neighbour on the right-hand side. The current value of the weighted sum generated by a single neuron of the Multilayer Perceptron is loaded by the vertical arcs (Fig. 2). The updated value is passed to the next bottom neighbour. This way we can observe only point-to-point communication among the nodes, which means that the presented Data Dependence Graph is a local graph. This property guarantees construction of a well defined systolic array to realise the retrieving algorithm of the Multilayer Perceptron [4]. The nodes of the last, extra horizontal line calculate the value of the activation function with the previously calculated weighted sum as the argument. We propose to realise this operation using a lookup table instead of analytical calculation. Such a solution guarantees quite a short operation time while preserving a satisfactory level of precision. The only problem is that the local memory ought to store the lookup table.
In general the Data Dependence Graphs presented above are quite similar, and this gives a chance to create a single set of Elementary Processors (PE) with switched functions, operating first on the hidden layer and next on the output layer. The same idea can be adopted for more than a single hidden layer if the Multilayer Perceptron has a more sophisticated topology.
Fig. 2. Data Dependence Graph for output layer of Multilayer Perceptron during retrieving algorithm
4. MAPPING DATA DEPENDENCE GRAPHS ONTO SYSTOLIC ARRAY STRUCTURE
4.1. Processor assignment via linear projection
Mathematically, a linear projection is often represented by a projection vector $\vec{d}$. Because the Data Dependence Graph of a locally recursive algorithm is very regular, the linear projection maps an n-dimensional Data Dependence Graph onto an (n-1)-dimensional lattice of points, known as the processor space [8]. It is common to use a linear projection for processor assignment, in which nodes of the Data Dependence Graph along a straight line are projected onto an Elementary Processor in the processor array (Fig. 1).
4.2. Schedule assignment via linear scheduling
A scheduling scheme specifies the sequence of the operations in all Elementary Processors. More precisely, a schedule function represents a mapping from the n-dimensional index space of the Data Dependence Graph onto a 1-D schedule (time) space. Linear scheduling is very common for schedule assignment (Fig. 1). A linear schedule is based on a set of parallel and uniformly spaced hyperplanes in the Data Dependence Graph. These hyperplanes are called equitemporal hyperplanes – all the nodes on the same hyperplane are scheduled to be processed at the same time. A linear schedule can also be represented by a schedule vector $\vec{s}$, which points in the direction normal to the hyperplanes. For any computation node indexed by a vector $\vec{n}$ in the Data Dependence Graph, its scheduled processing time is $\vec{s}^{\,T}\vec{n}$.
4.3. Mapping policies
Given a Data Dependence Graph and the projection direction $\vec{d}$, not all schedule vectors $\vec{s}$ are valid for the Data Dependence Graph. Some may violate the precedence relations specified by the dependence arcs. For systolic design, the schedule vector $\vec{s}$ in the projection procedure must satisfy the following two conditions [1]:
- causality condition:

$$ \vec{s}^{\,T}\vec{e} > 0 \qquad (3) $$

where $\vec{e}$ represents any of the dependence arcs in the Data Dependence Graph;
- positive pipeline period:

$$ \vec{s}^{\,T}\vec{d} \neq 0 \qquad (4) $$
In this way the rectangular Data Dependence Graphs are converted into linear pipelined systolic arrays. We can observe this situation for both the hidden and output layers of the Multilayer Perceptron (Fig. 3, Fig. 4). The number of elementary processors used for array construction equals the number of neurons in the simulated layer of the Multilayer Perceptron neural network. Each elementary processor combines all functions described by the nodes of the Data Dependence Graph placed in the same column of a single slab of the Data Dependence Graph. If we want to reduce the number of elementary processors, we can change the classical systolic structure into a ring structure, where each elementary processor is responsible for modelling a greater number of neurons. Of course, the reduction of the number of elementary processors ought to be done in a way which preserves the same number of neurons per single processor.
Fig. 3. Systolic array for hidden layer of Multilayer Perceptron during retrieving algorithm
Fig. 4. Systolic array for output layer of Multilayer Perceptron during retrieving algorithm
5. EFFICIENCY OF PROPOSED APPROACH
5.1. Computation time
This is the time interval between starting the first computation and finishing the last computation of the problem. Given a coprime schedule vector $\vec{s}$, the computation time of a systolic array can be computed as [1]:

$$ T = \max_{\vec{p},\,\vec{q}\,\in L} \left\{ \vec{s}^{\,T}(\vec{p} - \vec{q}) \right\} + 1 \qquad (5) $$

where L is the index set of the nodes in the Data Dependence Graph. In the presented architecture the schedule vector is defined as $\vec{s} = [1, 1]$. The total computation time is the sum of the computation time related to the hidden layer and the computation time related to the output layer. These two values ought to be calculated independently because the operations of each layer are realised in sequence and we have two independent Data Dependence Graphs. So the total computation time equals:

$$ T_{systol} = T_{syshid} + T_{sysout} \qquad (6) $$
where:
$T_{syshid}$ – computation time for the hidden layer neurons,
$T_{sysout}$ – computation time for the output layer neurons.
For the hidden layer we have N+1 elements along the vertical axis and K elements along the horizontal axis of the Data Dependence Graph for the Multilayer Perceptron (Fig. 1). Based on this remark the computation time can be estimated as:

$$ T_{syshid} = (N + K)\,t \qquad (7) $$

where t is the processing time of an elementary processor.
For the output layer we have K+1 elements along the vertical axis and L elements along the horizontal axis (Fig. 2). Based on this remark the computation time can be estimated as:

$$ T_{sysout} = (K + L)\,t \qquad (8) $$

Based on (6) we finally obtain:

$$ T_{systol} = (N + 2K + L)\,t \qquad (9) $$
5.2. Pipelining period
This is the time interval between two successive computations in a processor. As previously discussed, if both d and s are irreducible, then the pipelining period equals [1]:

α = s'd    (10)

In the presented approach the schedule vector is defined as s = [1, 1] and the projection direction vector equals d = [0, 1]. The pipelining period is constant: α = 1. This means the time interval between two successive computations in an elementary processor is as short as possible. The pipelining period is exactly the same for both Data Dependence Graphs, so there is no point in combining them into a single value.
5.3. Processor utilization rate
Let us define the speed-up factor as the ratio between the sequential computation time and the array computation time; the utilization rate is then the ratio between the speed-up factor and the number of processors [1]:

speed-up = sequential computation time / array computation time    (11)

utilization rate = speed-up / number of processors    (12)
The sequential computation time is always proportional to the number of nodes which are responsible for calculations. On the other hand, the computation time related to each layer ought to be analysed independently, because of the operation sequence in the Multilayer Perceptron. Based on these remarks, the total sequential computation time equals:

Tseq = Tseqhid + Tseqout    (13)

For the hidden layer, the sequential computation time can be estimated as:

Tseqhid = (N + 1)Kt    (14)

For the output layer, the sequential computation time can be estimated as:

Tseqout = (K + 1)Lt    (15)
Based on (13), we finally obtain:

Tseq = ((N + L + 1)K + L)t    (16)

where t is the processing time of an elementary processor.
For the systolic implementation we need K elementary processors to realise the calculations for the hidden layer and L elementary processors to realise the calculations for the output layer, respectively. If we assume that all elementary processors need the same time period to perform their calculations, the speed-up and utilization rate factors for a three-layer Multilayer Perceptron can be estimated as:

speed-up = ((N + L + 1)K + L) / (N + 2K + L)    (17)

utilization rate = ((N + L + 1)K + L) / ((N + 2K + L)(K + L))
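The efficiency formulas (9), (16) and (17) can be checked numerically. The following Python sketch computes all four figures of merit; the layer sizes N, K, L and the unit processing time t in the example are illustrative values, not taken from the paper:

```python
def systolic_metrics(N, K, L, t=1.0):
    """Equations (9), (16) and (17) for a three-layer Multilayer Perceptron
    with N inputs, K hidden and L output neurons; t is the cycle time of an
    elementary processor."""
    T_systol = (N + 2 * K + L) * t          # array computation time, eq. (9)
    T_seq = ((N + L + 1) * K + L) * t       # sequential time, eq. (16)
    processors = K + L                      # K hidden-layer + L output-layer PEs
    speed_up = T_seq / T_systol             # eq. (17)
    utilization = speed_up / processors     # eq. (17)
    return T_systol, T_seq, speed_up, utilization

# Example: 10 inputs, 8 hidden neurons, 4 outputs
T_sys, T_seq, su, ur = systolic_metrics(N=10, K=8, L=4)
print(T_sys, T_seq, round(su, 3), round(ur, 3))   # 30.0 124.0 4.133 0.344
```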
6. CONCLUSIONS
Summarising, the paper proposed a new methodology for simulating the retrieving algorithm of a Multilayer Perceptron neural network based on a systolic array structure. The discussion focused on the operations realised during the successive steps of the algorithm and on the data transferred among the calculation units. It is clear which operations can be done in parallel and where a sequence is necessary. The estimates of the total computation time, speed-up and utilization rate are very promising. The proposed methodology is presented for a three-layer network, because such a Multilayer Perceptron is very often used in different tasks. On the other hand, there are no barriers to adopting the solution for more sophisticated Multilayer Perceptrons. The main idea is to create further Data Dependence Graphs for the additional layers and transfer them into linear systolic structures. The time parameters in the efficiency calculation will then increase additively. The proposed approach creates no barriers to tuning the Multilayer Perceptron to completely new tasks.
The proposed methodology can be used as a basis for VLSI structures which implement the Multilayer Perceptron, or as a basis for a set of general-purpose processors - such as transputers or DSP processors - which can be used for Multilayer Perceptron neural network implementation. Finally, the proposed methodology can also be useful for a parallel software realisation of the Multilayer Perceptron and can easily be adopted for other feedforward neural nets. The main advantage of the proposed architecture is that it can easily be extended to larger networks and, this way, to different practical tasks where the Multilayer Perceptron can be used, for example as a classification tool.
REFERENCES
[1] KUNG, S. Y., 1993: Digital Neural Networks, PTR Prentice Hall.
[2] MAZURKIEWICZ, J., 1998: A processor pipeline architecture for Hopfield neural network, II SCANN'98 Slovak Conference on Artificial Neural Networks, Smolenice, Trnava, Slovakia, pp. 158-163.
[3] MAZURKIEWICZ, J., 2001: Efficiency of systolic simulator based on elementary processors with switched functions for Hopfield neural network, VII International MENDEL 2001 Conference on Soft Computing, Brno, 6-8 June 2001, Czech Republic, pp. 349-354.
[4] MAZURKIEWICZ, J., 2001: SIMD-type simulator for recurrent neural nets, 35th Spring International MOSIS'01 Conference Modelling and Simulation of Systems, Ostrava, 9-11 May 2001, Czech Republic, vol. 1, pp. 173-180.
[5] MAZURKIEWICZ, J., 2003: Systolic realisation of self-organising neural networks, 1st International ICSC 2003 Conference on Soft Computing Applied in Computer and Economic Environments, Kunovice, 30-31 January 2003, Czech Republic, pp. 116-123.
[6] PETKOV, N., 1993: Systolic Parallel Processing, North-Holland.
[7] SHIVA, S. G., 1996: Pipelined and Parallel Computer Architectures, Harper Collins Publishers.
[8] ZHANG, D., 1999: Parallel VLSI Neural System Design, Springer-Verlag.
ACOUSTIC LOCALIZATION OF MULTIPLE VEHICLES
Wojciech Zamojski, Tomasz Walkowiak
Institute of Engineering Cybernetics, Wroclaw University of Technology
ul. Janiszewskiego 11/17, 50-372 Wroclaw, POLAND,
Phone: +48-71-3203996 Fax: +48-71-3212677,
E-mail: {zamojski,twalkow}@ict.pwr.wroc.pl
Abstract: In this paper we describe an approach to the localization of multiple vehicles based on the sound emitted by them. The problem is known as acoustic direction of arrival (DOA) estimation in the case of multiple sources. The standard techniques for solving this problem were found inadequate in the presence of normal disturbances (such as those produced by wind) and wide-band non-stationary signals. The paper presents a method based on the MUSIC beamforming technique. The use of a multilayer perceptron neural network to determine the number of sound sources on a scene monitored by an array of linearly spaced microphones is presented, as well as some heuristic algorithms for cancelling false sources. The method is shown experimentally to deal with this problem. Field experiments included scenes with zero, one or two moving vehicles.
Keywords: direction of arrival estimation, localisation of vehicles, neural networks
1. INTRODUCTION
The paper discusses the problem of localization of military ground vehicles based on the sound emitted by them [5][7][8]. The sources of those sounds are mainly the engines, tracks and wheels of the vehicles. Their characteristics depend on many factors, such as the speed of a vehicle, environmental conditions and surface type. However, a priori knowledge of those characteristics is not needed. Localization here is understood as a direction angle, not a position in the XY space. This problem is known in signal processing as direction of arrival estimation [2]. There are standard algorithms for dealing with the problem, even in the case of multiple sources [4]. These assume that the sound sources are narrow-band in the presence of white noise and that the number of sources to be localised and tracked is known. This is not the case in our work, since the sound emitted by vehicles is a wide-band non-stationary signal. Also, the noise cannot be assumed to be white; it is mainly produced by wind and trees. Moreover, the number of vehicles present in the analysed field is unknown a priori and cannot be determined by non-acoustic observation and measurement. Therefore, the number of sound sources (vehicles) present has to be estimated on the basis of the received sound signals.
In our experiments there were 4 microphones placed in a ULA (uniform linear array) configuration with a span of d = 0.6 m. Therefore, we could detect direction angles from -90° to +90°, not the full 360° space. The number of microphones (4) in theory allows localising and tracking 3 sound sources; however, due to limitations in our field experiments, we focus on cases with zero, one or two vehicles present.
2. BACKGROUND
The input data are the sounds emitted by military wheeled vehicles, such as the BRDM and Star, and caterpillar vehicles, such as the BWP and the Gozdzik assault gun, in a military field [7]. The inspected vehicles followed particular routes at speeds within a certain range (Fig. 1). We marked several points on the vehicle routes, which allows later analysis of the results.
The experiments were repeated at two different locations, in dissimilar weather conditions. The vehicles moved around at speeds ranging from 0 to 30 km/h (in most cases around 15 km/h). We assumed that the area can be treated as flat, without any obstacles obscuring sound propagation and
producing unwanted reverberations. The mentioned factors allow us to assume that the vehicles are far-field point sources, which implies that the sound wave arriving at the measurement point is flat.
The sounds were gathered using 4 Brüel & Kjær 4188 microphones. The microphones were connected to 4 Brüel & Kjær 2671 preamplifiers. Next, the signals were gathered in a NEXUS 2693A0S4 amplifier. Finally, the data were converted into digital form using a PC and a dSpace DSP set. The format used was 16 bit at 5 kHz sampling. We also recorded an additional signal with information on when each control point was passed by a vehicle.
Fig. 1. Experiment field
3. BEAMFORMING
The microphones placed in the uniform linear array allow creating a beamforming system. Such a system magnifies signals coming from a given direction and attenuates those from all other directions. It acts as a kind of directional microphone. The beamforming is performed by summing M delayed signals y_i(t) according to the equation:

z(t) = Σ_{i=1..M} y_i(t - D_i)    (1)

By setting proper values of the delays D_i, one can steer the beamforming system to a given direction angle α. The power P(α) of the output signal z(t) reaches its maximum when the beam is directed at the source of the signal [2]. Therefore, our problem is equivalent to finding the K largest maxima of the function P(α), where K is the number of vehicles present in the analysed field.
More convenient for digital signal processing is the operation of the beamforming system in the frequency domain. It can be obtained by Fourier transformation of equation (1) [8]. However, in such a case we obtain a set of signal powers for each analysed frequency, so the global power is calculated as a sum of the individual powers for the given frequencies. Since most of the signal energy is concentrated in the 50-200 Hz band (the lower band limit is set to ignore the sound generated by wind, i.e. most of the signal power up to 50 Hz was generated by wind), we limit our frequency analysis to the 50-200 Hz band. Thus we get a new equation for the output power of the beamforming system:

P(α) = Σ_{f=50..200} e' R_f e    (2)
where e is the steering vector (denoting the phase shifts of the individual microphones for a given frequency):

e = [ exp(-jωD_1 T), exp(-jωD_2 T), ..., exp(-jωD_M T) ]'    (3)

and R_f is the so-called spatial correlation matrix, given as a product of the Discrete Fourier Transformation Y_f of a windowed (multiplied by a Hamming window) snapshot of the input signals from each microphone. Let us denote it as:

R_f ≡ Y_f Y_f'.    (4)

It is worth mentioning that the R_f matrix (of size 4x4, since we have 4 microphones) is really a function of the frequency f. In our experiments we used an adaptive method of estimating the spatial covariance matrix R_f, the so-called MUSIC algorithm.
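The conventional frequency-domain beamformer of equations (2)-(4) can be sketched as follows. This is a minimal illustration with a synthetic single-source snapshot; the plane-wave delay model, the nominal speed of sound of 343 m/s and the FFT length are assumptions, while M = 4 microphones, d = 0.6 m spacing and 5 kHz sampling follow the experimental setup described above:

```python
import numpy as np

M, d, fs, c = 4, 0.6, 5000, 343.0     # mics, spacing [m], sampling [Hz], speed of sound [m/s]

def steering_vector(angle_deg, f):
    """Phase shifts (3) of a far-field plane wave across the ULA."""
    tau = d * np.sin(np.radians(angle_deg)) / c     # inter-microphone delay
    return np.exp(-2j * np.pi * f * tau * np.arange(M))

def beam_power(Y, freqs, angles):
    """Power (2): sum of the quadratic forms e' R_f e over the 50-200 Hz band.
    Y has shape (M, n_freqs) - one DFT snapshot per microphone."""
    band = (freqs >= 50) & (freqs <= 200)
    P = np.zeros(len(angles))
    for fi in np.nonzero(band)[0]:
        R = np.outer(Y[:, fi], Y[:, fi].conj())     # spatial matrix (4)
        for ai, a in enumerate(angles):
            e = steering_vector(a, freqs[fi])
            P[ai] += np.real(e.conj() @ R @ e)
    return P

# Synthetic single source at +30 degrees near 120 Hz
freqs = np.fft.rfftfreq(512, 1 / fs)
fi = np.argmin(np.abs(freqs - 120))
Y = np.zeros((M, len(freqs)), complex)
Y[:, fi] = steering_vector(30, freqs[fi])
angles = np.arange(-90, 91)
P = beam_power(Y, freqs, angles)
print(int(angles[np.argmax(P)]))   # 30
```

The beam sweep over the -90°..+90° grid is exactly the scan whose maxima are searched for in the following sections.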
4. MUSIC ALGORITHM
The MUSIC (MUltiple SIgnal Classification) algorithm was first presented in [4]. Its data model matches the one assumed in this system. For a ULA microphone array with M omnidirectional sensors impinged upon by a set of K <= M narrow-band, zero-mean, far-field signals (K - number of signal sources), according to [2] the eigenvectors corresponding to the K largest eigenvalues λ_{i,f} of the spatial covariance matrix R_f (given by equation (4)) span the signal subspace. This space is orthogonal to the noise subspace, which is spanned by the remaining eigenvectors. Therefore, the adaptive version of the spatial covariance matrix R_f can be calculated as [4]:

(R_f^MUSIC)^-1 = Σ_{i=K+1..M} v_{i,f} v_{i,f}'    (5)

where the eigenvectors v_{i,f} are sorted according to the corresponding eigenvalues λ_{i,f}:

λ_{1,f} >= λ_{2,f} >= ... >= λ_{M-1,f} >= λ_{M,f} > 0.    (6)
Therefore, the inversion of the MUSIC covariance matrix is the basis for the calculation of the adaptive beamforming power:

P(α) = Σ_{f=50..200} [ e' (R_f^MUSIC)^-1 e ]^-1.    (7)
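A hedged sketch of the noise-subspace power (5)-(7) for a single frequency bin follows. The two-source scene and the small diagonal noise floor are illustrative assumptions; numpy's `eigh` returns eigenvalues in ascending order, so the noise subspace is simply the first M-K eigenvector columns:

```python
import numpy as np

M, d, f, c = 4, 0.6, 120.0, 343.0   # mics, spacing [m], bin frequency [Hz], speed of sound [m/s]

def steering(angle_deg):
    """Plane-wave steering vector of the ULA at frequency f."""
    tau = d * np.sin(np.radians(angle_deg)) / c
    return np.exp(-2j * np.pi * f * tau * np.arange(M))

def music_power(R, angles, K):
    """Equations (5)-(7) for one bin: project each steering vector onto the
    noise subspace (the M-K eigenvectors of R with the smallest eigenvalues)
    and return the reciprocal of the projected energy."""
    w, V = np.linalg.eigh(R)            # eigenvalues in ascending order
    En = V[:, : M - K]                  # noise-subspace eigenvectors
    return np.array(
        [1.0 / np.real(steering(a).conj() @ En @ En.conj().T @ steering(a))
         for a in angles])

# Synthetic spatial matrix: two sources at -20 and +40 degrees plus a small
# noise floor on the diagonal
R = (np.outer(steering(-20), steering(-20).conj())
     + np.outer(steering(40), steering(40).conj())
     + 0.01 * np.eye(M))
angles = np.arange(-90, 91)
P = music_power(R, angles, K=2)
peaks = sorted(int(a) for a in angles[np.argsort(P)[-2:]])
print(peaks)   # [-20, 40]
```

The sharp peaks at the true directions illustrate why the paper prefers the MUSIC power (7) over the conventional power (2) when two vehicles are present.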
5. NUMBER OF VEHICLES RECOGNITION BY NEURAL NETWORK
In our case we do not know the number of sources K, so we are unable to form a definite signal subspace matrix. On the other hand, the decomposition of the matrix R gives information on the eigenvalues λ_{i,f}, which can be interpreted as the powers of consecutive potential sound sources. Since K is unknown, the assumption is that there are M pseudo-sources, some of which correspond to the real sources and the remaining ones to noise. One also has to admit that the preceding considerations are about narrow-band sources, whereas in this case the moving vehicles are rather wide-band emitters.
Fig. 2. Pseudo-spectra λ1-λ4 (eigenvalue versus frequency f [Hz]) for one vehicle (left side) and two vehicles (right side)
The proposed solution is the introduction of pseudo-spectra of the pseudo-sources. A pseudo-spectrum is the set of k-th eigenvalues in all processed bands (e.g. the 1st pseudo-spectrum comprises the largest eigenvalues of the estimate of the covariance matrix R in each band). In the case of this system, there are 4 pseudo-spectra (four eigenvalues, since we had four microphones). As one can see in Figure 2, the noise pseudo-spectra have a common structure (i.e. they follow the 1/f pattern) and contain much less energy than the pseudo-spectra associated with real sources.
The decision on the real number of sources is taken by a neural classifier [1]. The multilayer perceptron neural network, with its very good ability to estimate unknown functions, uses the data from the pseudo-spectra to determine the number of sources. Using all the data contained in the pseudo-spectra would imply too much complexity, so we decided on deriving some measures describing the pseudo-spectra. The first class of measures used is connected with the power of the sources, whereas the second one deals with the structure of each pseudo-source. The power of a pseudo-source is expressed by the following equation:

P_i = Σ_f λ_{i,f}.    (8)

The measure describing the structure is the ratio between the power below 50 Hz and the whole P_i:

K_i = ( Σ_{f<50 Hz} λ_{i,f} ) / P_i.    (9)

The final feature vector consisted of 8 components: { log(P1), ..., log(P4), log(K1), ..., log(K4) }. The logarithmic vector space was dictated by the fact that most relations between powers are either divisions or multiplications. Therefore, the neural network has 8 input neurons corresponding to the feature vector, 22 hidden-layer neurons (selected experimentally) and 2 output neurons. The output coding was 0,0 for no vehicles, 1,0 for one vehicle and 1,1 for two sources. The network was trained using the Levenberg-Marquardt algorithm [1], which gave the fastest convergence [6].
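The feature extraction of equations (8)-(9) can be illustrated as follows; the eigenvalue arrays here are randomly generated stand-ins for real pseudo-spectra, and the frequency grid is illustrative:

```python
import numpy as np

def features(freqs, lambdas):
    """Equations (8)-(9): lambdas is a (4, n_freqs) array of eigenvalues
    sorted in descending order per frequency bin; returns the 8-component
    vector [log P1..log P4, log K1..log K4]."""
    P = lambdas.sum(axis=1)                      # power of each pseudo-source (8)
    below = lambdas[:, freqs < 50].sum(axis=1)   # energy below 50 Hz (wind)
    K = below / P                                # structure measure (9)
    return np.concatenate([np.log(P), np.log(K)])

# Synthetic stand-in for four pseudo-spectra on a 0-250 Hz grid
freqs = np.arange(0.0, 250.0, 5.0)
rng = np.random.default_rng(1)
lambdas = np.sort(rng.random((4, freqs.size)), axis=0)[::-1] + 0.01
x = features(freqs, lambdas)
print(x.shape)   # (8,) - input to the 8-22-2 multilayer perceptron
```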
6. MAXIMUM DETECTION
As stated in section 4, the localisation is equivalent to finding the local maxima of the function P(α) given by equation (7). In the case when the neural network answers that there is only one vehicle, the problem is trivial - just a simple maximum detection. However, in the case of two vehicles, it is more sophisticated. Several disturbances are present in the power signal due to background noise. Therefore, a heuristic algorithm for maximum detection was developed. In the first step the power function is convolved
with a 7-point Hanning window. Next, the local minima MIN_q and maxima MAX_q of the smoothed power function are detected (i.e. a local maximum is present at a given angle when the power values for the previous and next arguments are lower). Maxima below a given threshold value T_p are rejected. The threshold value is calculated from all detected maxima:

T_p = 0.1 (1/Q) Σ_{q=1..Q} MAX_q,    (10)

where Q is the number of maxima.
In the next step, some disturbances resulting in false maxima are rejected. Two kinds of disturbance can be distinguished, presented in Fig. 3. The detection of false maxima is done by analysing the values of the maxima relative to the nearest minima: (MAX_q - MIN_{q-1}) / MAX_q and (MAX_q - MIN_q) / MAX_q. When both of these values are below a given threshold (experimentally set to 0.2), the maximum is rejected (assumed to be a Type A disturbance). Similarly, in the case of a Type B disturbance, the maximum is rejected when one of these values is below another threshold (experimentally set to 0.05). Finally, the two largest maxima are selected and their argument values give the estimated positions (angles) of the two vehicles.
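The maxima-detection heuristic above can be sketched as follows. The smoothing window, threshold (10) and the 0.2/0.05 rejection thresholds follow the text; the edge fallback for missing minima and the synthetic two-bump power function are illustrative assumptions:

```python
import numpy as np

def detect_maxima(P):
    """Smooth P with a 7-point Hanning window, find strict local maxima,
    apply threshold (10) and the Type A / Type B rejection rules, and
    return at most two surviving maxima as indices into P."""
    w = np.hanning(7)
    Ps = np.convolve(P, w / w.sum(), mode="same")
    maxima = [i for i in range(1, len(Ps) - 1)
              if Ps[i - 1] < Ps[i] > Ps[i + 1]]
    minima = [i for i in range(1, len(Ps) - 1)
              if Ps[i - 1] > Ps[i] < Ps[i + 1]]
    if not maxima:
        return []
    Tp = 0.1 * np.mean(Ps[maxima])                 # threshold, eq. (10)
    keep = []
    for i in maxima:
        if Ps[i] < Tp:
            continue
        left = [m for m in minima if m < i]        # nearest minima on each side;
        right = [m for m in minima if m > i]       # the array edges act as fallback
        lv = Ps[left[-1]] if left else Ps[0]
        rv = Ps[right[0]] if right else Ps[-1]
        rl, rr = (Ps[i] - lv) / Ps[i], (Ps[i] - rv) / Ps[i]
        if rl < 0.2 and rr < 0.2:                  # Type A disturbance
            continue
        if min(rl, rr) < 0.05:                     # Type B disturbance
            continue
        keep.append(i)
    return sorted(sorted(keep, key=lambda i: -Ps[i])[:2])

# Synthetic power with bumps at -30 and +40 degrees over a 1-degree grid
angles = np.arange(-90, 91)
P = (np.exp(-((angles + 30) / 8.0) ** 2)
     + 0.8 * np.exp(-((angles - 40) / 8.0) ** 2) + 0.01)
peaks = detect_maxima(P)
print([int(angles[i]) for i in peaks])   # [-30, 40]
```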
Fig. 3. Two kinds of disturbance (Type A and Type B) in the power signal, described by the maxima MAX_q and the neighbouring minima MIN_{q-1}, MIN_q
Fig. 4. Possible estimated positions of vehicles in the analysed field: panels a)-l) show the angles α', α'' and the distances D1-D4 with respect to the tracked directions a_1^2, a_2^2
7. TRACING ALGORITHM
The results of the previous localization step require further processing. The main aim of this part of the system is to assign the two angles to given objects and to average the results.
First, the two angles α' and α'' from the previous step are assigned to the directions a_1^2 and a_2^2 calculated by the MUSIC method. This is done based on the nearest-neighbour rule. First, four distances D1, D2, D3 and D4 are calculated (see Fig. 4) according to the following equations:

D1 = || α' - a_1^2 ||, D2 = || α'' - a_1^2 ||, D3 = || α' - a_2^2 ||, D4 = || α'' - a_2^2 ||.    (11)

Next, the directions are assigned according to simple rules, deduced from Figure 4:

If D1 + D2 <= D3 + D4, then a_1^2 = α' and a_2^2 = α''.    (12)
If D1 + D2 > D3 + D4, then a_1^2 = α'' and a_2^2 = α'.
Having assigned the directions to the traced objects, the directions are averaged. The averaging is based on the idea that the vehicles drive along straight lines, i.e. changes of direction are slow. This assumption is justified for military vehicles. Therefore, a linear regression model is used. However, in some cases the directions obtained from the MUSIC algorithm are far away from the previous direction estimates. This is caused by signal disturbances, for example by bad weather conditions. Therefore, a heuristic algorithm was developed. It is performed for each of the two analysed directions as follows:
· Linear prediction of the current direction based on the ten previous directions, resulting in a_new.
· Calculation of the current position based on the MUSIC algorithm, i.e. a.
· Linear prediction of the next direction based on the nine previous directions and the current one from the MUSIC block, resulting in a_next.
· Finally, the current position is given by:
a' = A a_new + B a_next, if a differs more than a given threshold from a_new or a_next (A + B = 1);
a' = C a_new + D a_next + E a, otherwise (C + D + E = 1).    (13)

Fig. 5. System overview: a single frame of N samples from the microphones y1(t)-y4(t) is passed through an FFT; control (steering) vectors are generated for angles in the range α ∈ (-π, π); the MUSIC power is determined for the hypotheses that n is 1 and that n is 2; the power maximum for all α, or the 2 greatest local maxima of the power for all α, are traced; a neural network determines the number of vehicles; the final result is the DOA a_1 of one vehicle or the DOA's a_1^2, a_2^2 of two vehicles
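The assignment rule (11)-(12) and the smoothing step (13) can be sketched as below. The blending weights A..E and the outlier threshold are illustrative, since the paper does not give their values, and the linear predictions are implemented here with a least-squares line fit over the last ten directions:

```python
import numpy as np

def assign(tracked, measured):
    """Rule (11)-(12): match the new angles (alpha', alpha'') to the tracked
    directions a_1^2, a_2^2 by the nearest-neighbour criterion of the text."""
    (a1, a2), (n1, n2) = tracked, measured
    D1, D2 = abs(n1 - a1), abs(n2 - a1)
    D3, D4 = abs(n1 - a2), abs(n2 - a2)
    return (n1, n2) if D1 + D2 <= D3 + D4 else (n2, n1)

def smooth(history, a_music, thr=10.0, A=0.5, B=0.5, C=0.4, D=0.3, E=0.3):
    """Step (13): linear prediction of the current and next direction, then a
    weighted blend; weights and threshold are illustrative (A+B=1, C+D+E=1)."""
    hist = np.asarray(history, dtype=float)
    k, b = np.polyfit(np.arange(10), hist[-10:], 1)
    a_new = k * 10 + b                              # predicted current direction
    k2, b2 = np.polyfit(np.arange(10), np.append(hist[-9:], a_music), 1)
    a_next = k2 * 10 + b2                           # predicted next direction
    if abs(a_music - a_new) > thr or abs(a_music - a_next) > thr:
        return A * a_new + B * a_next               # MUSIC outlier discarded
    return C * a_new + D * a_next + E * a_music

# A vehicle moving linearly at 2 degrees per frame
history = [2.0 * i for i in range(10)]
print(round(smooth(history, 20.0), 3))
```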
8. SYSTEM OVERVIEW AND PERFORMANCE
The system presented above (see Fig. 5 for a system overview) was tested on real data gathered in a military field (see section 2). Exemplar results for the BWP and Gozdzik vehicles are presented in Fig. 6. Some problems with the estimation of the direction angle of a vehicle localized far away from the microphones (for large values of the angle) can be noticed. This effect is due to the fact that the vehicle sound was “blown away” by the very strong wind.
Fig. 6. Exemplar results of the localization of two vehicles (direction [°] versus time [s])
The performed experiments showed quite good effectiveness of the presented method. The use of the MUSIC algorithm, tracking, and neural networks for the estimation of the number of vehicles resulted in a good overall estimation of the direction angle of the vehicles. In the area from -45° to +45° the direction angle can be estimated with an error below 3°. For distances up to 100 m, the error of localization of the vehicle position is within the length of the observed vehicle (± 5.24 m). In the part of the half-plane beyond the -45° to +45° area, the results are less satisfactory. This is probably due to the larger distance from the vehicle to the position of the microphone array (more than 100 m) and therefore a low signal-to-noise ratio. But even in this area the system is able to estimate the position quite well. However, some real problems can be encountered in bad weather conditions. On the other hand, this is a limitation of all methods based on acoustic signals.
The overall accuracy of the neural network estimation of the number of vehicles is more than 90% [6]. The neural network estimator has some problems when two vehicles meet, resulting in one sound source (two vehicles at the same position); therefore it is physically impossible to reach 100% efficiency.
9. CONCLUSION AND FURTHER WORK
The research related to mobile object localization based on acoustic information can be used as the basis for new systems supporting contemporary military equipment. The system observes the scene in a passive way, so it is very hard to find and destroy military devices equipped with such an intelligent attachment. The proposed localisation algorithm can be implemented very efficiently using DSP technology (some preliminary work has been done on it) and microcontrollers, as an intelligent robot. We think that a future investigation is to extend the system to other kinds of signals obtained from moving vehicles, such as visual (from a video camera) or seismic ones. This could improve the results, especially in bad weather conditions. Seismic signals could be analysed in a similar way to the acoustic signals presented here; however, video signals require more sophisticated localisation algorithms - see for example [3] for some preliminary experiments.
REFERENCES:
[1] BISHOP, Ch. M.: (1995) Neural Networks for Pattern Recognition, Clarendon Press, Oxford.
[2] JOHNSON, D.H., Dudgeon D.E. (1993) Array Signal Processing: Concepts and Techniques. Prentice Hall,
Englewood Clifs, New York.
[3] MAMICA, J., WALKOWIAK, T. (2003) Genetic Algorithm for Motion Detection, MENDEL’2003, 9th
International Conference on Soft Computing. June 4-6, 2003, Brno, Czech Republic, pp. 29-34.
[4] SCHMIDT, R.O. (1979) Multiple emitter location and signal parameter estimation. Proc. RADC Spectrum
Estimation Workshop, Rome, 243-258
[5] WALKOWIAK, T. (2003), Acoustic Localization of Vehicles, ICTINS’2003 International Conference on
Information Technology and Natural Sciences, October 19-21, 2003, AL-Zaytoonah University, Amman,
Jordan, pp. 303-306.
[6] WALKOWIAK, T., ZOGAL, P (2003) Neural Network Approach to Acoustic Detection of Number of
Vehicles, INNSC 2002 – 6th International Conference Neural Networks and Soft Computing, June 11-15,
2002, Zakopane. Heidelberg; New York: Physica-Verlag, cop. 2003, series Advances in Soft Computing,
pp. 909-914.
[7] ZAMOJSKI, W. et al. (1999) A System for Recognition and Tracking of Moving Objects on the Basis of
Acoustic Information (in Polish). Reports of the Institute of Engineering Cybernetics, Wroclaw University
of Technology, PRE 17/99, Wroclaw, Poland.
[8] ZAMOJSKI, W., MAZURKIEWICZ, J., WALKOWIAK, T. (1997), Mobile Object Localisation Based on
Acoustic Information, Proceedings of the IEEE International Symposium on Industrial Electronics,
Guimaraes, Portugal, July 7-11, 1997, vol. 3, pp. 813-818.
THE FUZZY RELATIONAL DATABASE SYSTEM FSEARCH 2.0
Josef Bednář
Department of Stochastic and Optimisation Methods, Institute of Mathematics
Faculty of Mechanical Engineering, Brno University of Technology,
Technická 2, 616 69 Brno, Czech Republic
E-mail: [email protected], Phone: +420-541142550, Fax: +420-54114 2800
Abstract: The main purpose of this paper is the theoretical development of methods of fuzzy logic and fuzzy searching in fuzzy sets, especially fuzzy sets with a polygonal membership function. On the basis of these methods, a new approach has been developed to the theoretical solution of fuzzy relational database systems. The results will be applied to the creation of the fuzzy database system Fsearch 2.0 as an upgrade of the one-dimensional fuzzy database system Fsearch 1.0.
Keywords: fuzzy metric, fuzzy distance, fuzzy metric space, fuzzy number, fuzzy point
1. INTRODUCTION
In a classical database, information is represented by exact data. Such database systems can only deal with unambiguous and well-defined data, but a database is a computer model of the real world, and in the real world there exist uncertain or ambiguous data and information which cannot be defined in a certain, well-defined form by any means. In everyday life, we often make decisions based on such fuzzy data. An approach exists in which fuzzy set theory is used to design a fuzzy relational database. The emphasis of this paper is on fuzzy searching in fuzzy sets, because information which is represented as a fuzzy set in a fuzzy database is useful only when we can find it among a great quantity of other information.
If we search for an item in a classical database, we typically set an interval, look for all items whose values lie within the interval, and order these items by different rules. This is a special case of a fuzzy question, because an interval is a special case of a fuzzy number. We generalise the question so that it is represented by a fuzzy number. The values of a quantity are numeric data of a vague character, therefore they are represented by fuzzy numbers; these fuzzy numbers are called information. We look for all items for which the intersection of the information and the question is not empty; this intersection is called the answer. We can order these items by different rules.
2. FUNDAMENTAL DEFINITIONS OF FUZZY SETS THEORY
We will use the following basic notions of the theory of fuzzy sets [1, 2].
Definition: Suppose that X ≠ ∅ is a universe of discourse (i.e. the set of all possible elements to be considered with respect to a vague property). Then a fuzzy set A in X is defined as the set of all ordered pairs {(x, μA(x)); x ∈ X, μA(x) ∈ [0, 1]}, where μA: X → [0, 1] is the membership function of the fuzzy set A. The number μA(x) is the grade of membership of x in A, from 1 for full belongingness to 0 for full non-belongingness, through all intermediate values.
Definition: The support of a fuzzy set A = (X, μA) is defined as the ordinary set
supp A = {x ∈ X; μA(x) > 0}.
The kernel of a fuzzy set A = (X, μA) is defined as the ordinary set
ker A = {x ∈ X; μA(x) = 1}.
The complement of a fuzzy set A = (X, μA) is defined as the fuzzy set Ā = (X, μ_Ā), such that μ_Ā(x) = 1 - μA(x) for all x ∈ X.
The height of a fuzzy set A = (X, μA) is defined as the number
h(A) = sup_{x ∈ X} μA(x);
if h(A) = 1, then the fuzzy set A is called a normal fuzzy set.
Definition: A fuzzy set A = (X, μA) is included in (is a subset of) a fuzzy set B = (X, μB), A ⊆ B, if and only if μA(x) ≤ μB(x) for all x ∈ X. Two fuzzy sets A = (X, μA), B = (X, μB) are equal, A = B, if and only if μA(x) = μB(x) for all x ∈ X. A fuzzy set A = (X, μA) is empty, A = ∅, if and only if μA(x) = 0 for all x ∈ X.
Definition: The intersection of two fuzzy sets A = (X, μA), B = (X, μB) is defined as the fuzzy set C = (X, μC) = A ∩ B, such that μC(x) = min{μA(x), μB(x)} for all x ∈ X.
The union of two fuzzy sets A = (X, μA), B = (X, μB) is defined as the fuzzy set C = (X, μC) = A ∪ B, such that μC(x) = max{μA(x), μB(x)} for all x ∈ X.
Definition: Suppose α ∈ [0, 1]; then an α-cut of a fuzzy set A = (X, μA) is defined as the ordinary set Aα ⊆ X such that Aα = {x ∈ X; μA(x) ≥ α}, for all α ∈ [0, 1].
Definition: A fuzzy number A = (X, μA) is defined as a fuzzy set, where X ⊆ R, which is normal and convex (i.e. all of whose α-cuts are convex) and whose μA is piecewise continuous.
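The notions above can be illustrated on a discretised universe. In the following sketch, triangular membership functions are used as a simple stand-in for the polygonal ones mentioned in the abstract, and the measure is approximated by a Riemann sum:

```python
import numpy as np

X = np.linspace(0, 10, 1001)        # discretised universe of discourse

def tri(a, b, c):
    """Triangular fuzzy number: support (a, c), kernel {b}."""
    return np.maximum(np.minimum((X - a) / (b - a), (c - X) / (c - b)), 0.0)

A = tri(2, 4, 6)                    # information stored in the database
B = tri(3, 5, 7)                    # question
C = np.minimum(A, B)                # intersection = answer
height = C.max()                    # h(C); C is not normal here
alpha_cut = X[C >= 0.25]            # 0.25-cut of the answer
m = C.sum() * (X[1] - X[0])         # measure m(C), a simple Riemann sum
print(round(float(height), 3), round(float(m), 3))   # 0.75 1.125
```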
3. THE CHARACTERISTICS OF FUZZY SETS
The answers (fuzzy sets) C to a question B can be ordered by different means. We order the answers from the most satisfactory to the least satisfactory. The fuzzy database will be used by different users; therefore, we select several suitable characteristics of an answer. The answers will be ordered downward by the height of the answers C and by the following characteristics. These characteristics and their properties are described in Mendel'99 [3].
Definition: The measure of a fuzzy set A = (X, μA) is defined as the number
m(A) = ∫_X μA(x) dx = ∫_{supp A} μA(x) dx.
Definition: The relative measure of a fuzzy set C = (X, μC) with respect to a fuzzy set B = (X, μB) is defined as the number m(C/B) = m(C) / m(B), if m(B) ≠ 0.
Note: To understand the following n-dimensional characteristics, we must note that a question (information, answer) in more dimensions, which are designated a1, a2, …, an, is defined as the Cartesian product of the partial questions (information, answers) in the single dimensions. The measure and the relative measure of an n-dimensional fuzzy set can be defined directly, but this approach is unsuitable because of the numerically demanding calculation. Therefore the characteristics of an n-dimensional fuzzy set are defined by means of the characteristics of the partial questions (information, answers) in the single dimensions.
Designation: the question B(ai) – the question B in dimension ai,
the information A(ai) – the information A in dimension ai,
the answer C(ai) – the answer C in dimension ai,
the answer C(a1, a2, …, an) – the n-dimensional fuzzy set C(a1, a2, …, an),
where C(a1, a2, …, an) = C(a1) × C(a2) × … × C(an).
Definition: The height of an n-dimensional answer C(a1, a2, …, an) is defined as the height of the fuzzy set C(a1, a2, …, an) = C(a1) × C(a2) × … × C(an). The height of an n-dimensional answer C(a1, a2, …, an) is denoted by h(C(a1, a2, …, an)). Therefore h(C(a1, a2, …, an)) = min(h(C(a1)), h(C(a2)), …, h(C(an))).
Definition: The average height of an n-dimensional answer C(a1, a2, …, an) is defined as the number
hg(C(a1, a2, …, an)) = ( ∏_{i=1}^{n} h(C(ai)) )^{1/n}.
The minimal relative measure of an n-dimensional answer C(a1, a2, …, an) is defined as the number
m(C(a1, a2, …, an)/B(a1, a2, …, an)) = min( m(C(a1)/B(a1)), m(C(a2)/B(a2)), …, m(C(an)/B(an)) ).
The average relative measure of an n-dimensional answer C(a1, a2, …, an) is defined as the number
mg(C(a1, a2, …, an)/B(a1, a2, …, an)) = ( ∏_{i=1}^{n} m(C(ai)/B(ai)) )^{1/n}.
The measure of an n-dimensional answer itself is not defined, because it is not standardised.
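For a discrete sketch of these n-dimensional characteristics (our illustration; the function names are not from the paper): the height of an n-dimensional answer is the minimum of the per-dimension heights, and the two "average" characteristics are geometric means.

```python
import math

def height_nd(heights):
    """h(C(a1,...,an)) = min(h(C(a1)), ..., h(C(an)))."""
    return min(heights)

def average_height_nd(heights):
    """hg(C(a1,...,an)): n-th root of the product of the per-dimension heights."""
    return math.prod(heights) ** (1.0 / len(heights))

def minimal_relative_measure_nd(rel_measures):
    """Minimum over the per-dimension relative measures m(C(ai)/B(ai))."""
    return min(rel_measures)

def average_relative_measure_nd(rel_measures):
    """Geometric mean of the per-dimension relative measures."""
    return math.prod(rel_measures) ** (1.0 / len(rel_measures))

h = [1.0, 0.8, 0.5]
print(height_nd(h))                    # 0.5
print(round(average_height_nd(h), 4))  # 0.7368
```

The geometric mean avoids the numerically demanding n-dimensional integration mentioned in the note above, since only per-dimension values enter the computation.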
4. THE MODIFICATION OF INTERSECTION
Chart 1. The scheme of fuzzy searching: Question ∩ Information → Answers → Evaluation of the characteristics of the answers → Optimal item selection.
Chart 1, the scheme of fuzzy searching in fuzzy information stored in a fuzzy database, has been developed on the basis of the foregoing considerations. Different users will use the fuzzy database, and they may require less uncertainty in the answers. Therefore we define two operations which are modifications of the intersection but are more rigorous.
Definition: The confident intersection of two fuzzy sets A = (X, μA), B = (X, μB) is defined as the fuzzy set C = (X, μC) = A ⊓ B (written here with a distinct symbol to distinguish it from the ordinary intersection) such that μC(x) = max{0, μA(x) + μB(x) − 1}.
Definition: The algebraic product of two fuzzy sets A = (X, μA), B = (X, μB) is defined as the fuzzy set C = (X, μC) = A · B such that μC(x) = μA(x) · μB(x).
Theorem 1. For any two fuzzy sets A = (X, μA), B = (X, μB), A ⊓ B ⊆ A · B ⊆ A ∩ B.
Theorem 2. Suppose that classical sets A, B with characteristic functions χA, χB: X → {0, 1} are defined in a universe X. If we interpret the characteristic functions χA and χB as membership functions μA and μB, then A ⊓ B, A · B and A ∩ B all coincide with the intersection of the classical sets.
Theorem 3. If A = (X, μA) is a fuzzy set with complement Ā, then A ⊓ Ā = ∅, while in general A ∩ Ā ≠ ∅ and A · Ā ≠ ∅.
Definition: A question B1 is a contraction of a question B if the fuzzy number B1 = (X, μB1) is included in the fuzzy set B = (X, μB).
Theorem 4: Suppose that fuzzy sets and fuzzy numbers are defined in a universe X ⊆ Rn, information A(a1, a2, …, an) is fixed, a sequence of questions {Bm(a1, a2, …, an)} is defined so that question Bm+1(a1, a2, …, an) is a contraction of question Bm(a1, a2, …, an), and a sequence of answers {Cm(a1, a2, …, an)} is defined either so that Cm = A ∩ Bm, or so that Cm = A · Bm, or so that Cm = A ⊓ Bm, m∈N. Then
1. The sequence of the heights of the answers Cm(a1, a2, …, an), {h(C1), h(C2), …, h(Cm), …}, is non-increasing and its values lie in the interval H = ⟨0, 1⟩.
2. The sequence of the average heights of the answers Cm(a1, a2, …, an), {hg(C1), hg(C2), …, hg(Cm), …}, is non-increasing and its values lie in the interval H = ⟨0, 1⟩.
3. The sequence of the minimal relative measures of the answers Cm(a1, a2, …, an), {m(C1/B1), m(C2/B2), …, m(Cm/Bm), …}, is not in general monotonic and its values lie in the interval H = ⟨0, 1⟩.
4. The sequence of the average relative measures of the answers Cm(a1, a2, …, an), {mg(C1/B1), mg(C2/B2), …, mg(Cm/Bm), …}, is not in general monotonic and its values lie in the interval H = ⟨0, 1⟩.
5. The sequence of the products of the average heights and the average relative measures of the answers Cm(a1, a2, …, an), {hg(C1)·mg(C1/B1), hg(C2)·mg(C2/B2), …, hg(Cm)·mg(Cm/Bm), …}, is not in general monotonic and its values lie in the interval H = ⟨0, 1⟩.
5. THE FUZZY DATABASE SYSTEM FSEARCH 2.0
The foregoing results were applied in the creation of the fuzzy database system FSearch 2.0 for metal materials. In this database, fuzzy sets are represented as trapezoidal fuzzy numbers [3, 4] whose membership function is given by
μA(x) = max{ min{ (x − a)/(b − a), 1, (x − d)/(c − d) }, 0 }.
The trapezoidal fuzzy number is defined by four numbers, as is evident from its membership function; therefore information is stored in the fuzzy database as four-dimensional vectors, and a question is represented as a four-dimensional vector too.
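As a sketch (assuming the usual convention a ≤ b ≤ c ≤ d with a ≠ b and c ≠ d), the trapezoidal membership function can be transcribed directly:

```python
def trapezoid(x, a, b, c, d):
    """mu_A(x) = max(min((x - a)/(b - a), 1, (x - d)/(c - d)), 0).

    (a, b, c, d) is the four-dimensional vector that the database stores
    for each item of information and for each question."""
    return max(min((x - a) / (b - a), 1.0, (x - d) / (c - d)), 0.0)

# Example: support [0, 3], kernel [1, 2].
print(trapezoid(1.5, 0, 1, 2, 3))   # 1.0 (inside the kernel)
print(trapezoid(0.5, 0, 1, 2, 3))   # 0.5 (on the rising edge)
print(trapezoid(3.5, 0, 1, 2, 3))   # 0.0 (outside the support)
```

The degenerate cases b = a or c = d (a one-sided jump instead of a sloped edge) would need a separate branch; the formula above assumes both slopes exist.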
Since the search algorithm has now been described, we next demonstrate a run of the fuzzy database on a particular example.
Example: The identification marks of metal materials and the values of their material characteristics are stored in the file material.dat. The editing window, which is shown in Fig. 1, is intended for file management. For illustration, the dimensions designated a1, a2, a3, a4 are, in turn, tensile strength, hardness, normalisation annealing, annealing point. If the fuzzy database contains some data, then we can insert a question, as shown in Fig. 2.
· Select the material characteristics for searching.
· Complete Table 1 on the editing panel.
· Press the Solve button. The characteristics defined in part 2 are displayed in Table 2. We can order the materials downward according to these characteristics in the panel above Table 2.
· If we press the Draw button, the graphic window shown in Fig. 3 opens, and the particular answers for the materials specified in Table 2 are displayed in this window.
Figure 1. The editing window (standard number of material, material characteristics).
Figure 2. The question window (Table 1, the Solve button, the "Downward order by" panel, the Draw button).
Figure 3.
Acknowledgement: This paper was supported by research project CEZ: J22/98: 261100009 "Non-traditional methods for investigating complex and vague systems". The author would like to acknowledge Associate Professor Zdeněk Karpíšek for his advice and support. Thanks go to Pavel Štarha for his assistance in programming the fuzzy database FSearch 2.0.
REFERENCES
[1] NOVÁK, V.: Fuzzy množiny a jejich aplikace. SNTL, Praha 1990.
[2] DUBOIS, D. and PRADE, H.: Fuzzy Sets and Fuzzy Logic. Prentice Hall, New York 1980.
[3] KLIR, G. and YUAN, B.: Fuzzy Sets and Fuzzy Logic: Theory and Applications. Prentice Hall PTR, New Jersey 1995.
[4] BEDNÁŘ, J.: The Theoretical Properties of Fuzzy Searching. In: Mendel '99, 5th International Conference on Soft Computing. Brno 1999, pp. 199-204.
[5] JANG, J.-S. R. and GULLEY, N.: Fuzzy Logic Toolbox User's Guide. The MathWorks, Inc., Natick 1995.
[6] KACPRZYK, J.: Fuzziness in Database Management Systems. Physica-Verlag, Heidelberg 1995.
A SPECIAL DATABASE FOR THE MEASURING IN FIELD
Dalibor Bartoněk
Institute of Geodesy, Faculty of Civil Engineering, Brno University of Technology,
Veveří 95, 662 37 Brno, Czech Republic.
Tel. 54114 7204, Fax: 54114 7218, E-mail: [email protected]
Abstract: Measuring in the field is a commonly performed activity in various industrial branches. For the results to be useful, the data must be stored in a suitable database. There is a problem: should one use one of the many commercial database systems, or create a special-purpose database? This question is discussed, and the authors' own approach to the database design is presented in this paper. The goal is a database design that occupies a minimum of memory capacity and provides information as rapidly as possible. It is assumed that the basic database design rules are respected, e.g. to avoid redundant data (normalization), to ensure that the relationships among attributes are represented, and to facilitate the checking of updates for violation of database integrity constraints. A special optimization method is used for the reduction of common attributes, for storing strings of variable length and for database reorganization. The database model was implemented in the information system for the anticorrosive protection of pipelines used by gas enterprises in the Czech Republic.
Keywords: Data model, database, object, entities, attributes, period of measuring, measured values, quantities, relational model, E-R diagram, common and individual attributes, equivalence classes, overlay table, optimization.
1. INTRODUCTION
The measuring of quantities on objects in the field belongs to the common activities in many spheres of use. The gained values serve either to ensure the running of certain devices, or they inform the users, e.g. within the frame of prophylactic check-ups. In order to take effective advantage of these values it is necessary to store them in an optimally designed database. There are two possibilities: 1. to use a commercial database software product (dBASE, MS Access, Oracle etc.), 2. to create one's own special-purpose database in a programming language (Borland Pascal, Delphi, C etc.). The commercial database systems, however, were designed for general purposes with a wide offer of functions, which brings 2 main problems:
·
a lot of functions lead to complication of the system and inconvenient usage (some functions would hardly be made use of, while other, missing ones would have to be programmed by the experts themselves),
·
by complications we also mean the tedious training of operators who lack experience with computers.
Having discussed all the "pros and cons", the authors, together with a group of potential users, decided to make their own solution in the Borland Pascal programming language (an MS-DOS application), later in Borland Delphi (an application for Windows). The results were used in the system for the anticorrosive protection of gas pipelines.
2. ANALYSIS OF THE MEASURING PROCESS IN THE FIELD
Let us consider measuring on objects of various types. On each of them we measure different quantities at different times. The measuring can be done:
· regularly (periodically, e.g. monthly, quarterly, yearly etc.),
· irregularly, which means the measurements are done at random according to our needs.
The value of the measured quantity is under these hypotheses a function of two variables:
v = f(t, p),     (1)
where
v is the measured value,
t is the type of the object,
p is the period of the measuring.
The value of the measured quantity depends upon the type of the object and the period of the measuring. This means that on every type of object, various quantities can be measured depending upon the time of the measuring. In every period (monthly, yearly etc.) a different set of quantities is measured on the given type of object; at the same time, the required measuring plan, which describes the individual periods of the measuring, further determines the functional relationship (1). There are several basic types of object. Every object can be either of a basic type or a combination of several basic types. This fact should be taken into consideration as well.
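Relationship (1) can be sketched, purely as an illustration of ours, as a lookup from an (object type, period) pair to the quantities to be measured; the entries below are invented examples:

```python
# Hypothetical measuring plan: (object type t, period p) -> quantities.
MEASURING_PLAN = {
    ("pipeline", "monthly"): ["protective_potential"],
    ("pipeline", "yearly"): ["protective_potential", "soil_resistivity"],
    ("station", "quarterly"): ["output_voltage", "output_current"],
}

def quantities_to_measure(object_type, period):
    """Return the quantities whose values v = f(t, p) are taken."""
    return MEASURING_PLAN.get((object_type, period), [])

print(quantities_to_measure("pipeline", "yearly"))
# ['protective_potential', 'soil_resistivity']
```

This is exactly the role the "list of quantities" entity plays in the data model described in the next section.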
3. CONCEPTION OF THE DATA MODEL
For the storage of all the data measured in the field we have decided to choose the most commonly used model, the relational database. The relational data model will contain these entities:
· object and sub-objects with the ISA hierarchy,
· period of the measuring,
· measuring (with measured values),
· list of quantities, representing equation (1).
Object and period are strong entity sets, whereas measuring and measured values are weak entity sets, because they are dependent on the previous entities according to equation (1). The list of quantities represents the relationship between period and measuring and realizes the function in equation (1).
The object attributes can, from the point of view of their belonging to the objects, be divided into 3 classes:
a) key attributes – Ak,
b) common attributes for each type of object – Ac,
c) individual attributes, which distinguish the individual types of objects – Ai. From the point of view of the length of the stored data, these attributes can be divided into two groups:
· fixed length,
· variable length.
The E-R diagram of the data model of measuring in the field is shown in Fig. 1.
[E-R diagram: the entity "object" (with an ISA hierarchy of sub-object1 … sub-objectn carrying the individual attributes Aio1 … Aion) is related to the entities "period", "list of quantities", "measuring" and "measured values"; the key and other attributes (Ako, Akp, Akq, Aco, Adm, Adv, Aom, Aov, Aop, Aoq) are explained below.]
Explanations:
Ako, Akp and Akq – primary key attributes of the object, period and list of quantities,
Aco – common attributes of the object, Aio1 … Aion – individual attributes of sub-objects No 1, …, n,
Adm – discriminator of the measuring entity, Adv – discriminator of the measured values entity,
Aom – other attributes of the measuring entity, Aov – other attributes of the measured values entity,
Aop – other attributes of the period entity, Aoq – other attributes of the list of quantities entity,
…… cardinality 1, cardinality N, …… non-obligatory relationship,
double rectangles – weak entity sets, double ellipses – grouped attributes.
Fig. 1. E-R diagram of the data model of measuring in the field
4. IMPLEMENTATION OF THE DATA MODEL
The data model in Fig. 1 was transformed into the tables of a relational database according to the rules described in (Korth, H. F., Silberschatz, A., 1996) and (Pokorný, J., 1998). The tables are of two types:
· static, which means they have a fixed structure (the tables of the object, sub-objects, period, measuring and list of quantities),
· dynamic, the structure of which is variable according to equation (1) (the table of measured values).
As for the static tables, the most complicated problem was the creation of the tables of the object and sub-objects (the ISA hierarchy). To avoid data redundancy, the method of attribute overlay was used. The principle consists in dividing the sub-objects into basic types so that any sub-object can be composed as a combination of one or more objects of these basic types. Thus the number of sub-objects is greatly (approximately by one order of magnitude) reduced. Then the overlay table is created, the structure of which is shown in Table 1. The rows of the overlay table consist of the attributes Aio1 – Aion; the columns are the basic types of sub-objects. In each cell we write either the symbol "*" (string data type) or "#" (other data types). A further reduction of the database structure is achieved by processing the overlay table. The process consists of 2 phases:
1) Deleting all the columns in Table 1 which have null values in all their rows. Objects of these types are sufficiently covered by the attributes in the basic table of the object (common attributes). We obtain a new table.
2) In the new table we join columns step by step according to these rules:
a) We join columns that have the same symbols in their rows.
b) Further, we join columns that differ in n symbols only, according to this criterion:
∑_{i=1}^{n} cᵢ·t_{Oj} < ∑_{i=1}^{m} dᵢ·t_{Ok}     (2)
where
n – the number of attributes in which the lower-level entity sets Oj and Ok (sub-objects) differ,
m – the number of attributes in which the lower-level entity sets Oj and Ok (sub-objects) are the same,
cᵢ – the length of the i-th attribute (common to the lower-level entity sets Oj and Ok) in bytes,
dᵢ – the length of the i-th attribute (in which the lower-level entity sets Oj and Ok differ) in bytes,
t_{Oj} – the estimated frequency of occurrence of the lower-level entity set Oj in the database,
t_{Ok} – the estimated frequency of occurrence of the lower-level entity set Ok in the database.
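Since formula (2) is hard to read in the printed original, the following is only a hedged sketch of the idea as we reconstruct it: two candidate columns Oj and Ok are merged when the frequency-weighted size of their shared attributes is outweighed by the frequency-weighted size of the attributes in which they differ. The function name and the exact form of the inequality are our assumptions.

```python
def should_join(common_sizes, diff_sizes, t_oj, t_ok):
    """common_sizes: byte lengths c_i of the attributes shared by Oj and Ok,
    diff_sizes: byte lengths d_i of the attributes in which they differ,
    t_oj, t_ok: estimated row frequencies of Oj and Ok in the database."""
    return sum(common_sizes) * t_oj < sum(diff_sizes) * t_ok

# Invented numbers: 12 shared bytes vs. 2 differing bytes.
print(should_join([4, 8], [2], t_oj=100, t_ok=500))  # 1200 < 1000 -> False
```

The point of such a criterion is that the decision is driven by estimated storage cost, not by the raw attribute counts alone.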
Table 1: Attributes overlay table (rows: the attributes A1 … Am; columns: the basic sub-object types O1 … On; each cell contains "*" for an attribute of string data type or "#" for an attribute of another data type).
After the reduction process, every row in Table 1 must contain at least one character * or # (complete overlay). The columns of Table 1 represent equivalence classes of sub-objects. We can define 2n tables in the database, where n is the number of columns in Table 1: n tables for attributes of string data type (a special compression method for string saving is used at the internal database level) – see Fig. 3, and n tables for attributes of non-string data types – see Fig. 4. The internal structure of the database for saving object records is shown in Fig. 2. We use two files; the first holds the basic data, i.e. the common attributes Aco1, which must have a value (not null); the second file contains two groups of attributes: Aco2 (common attributes that may be null) and Aio (individual attributes of the object type). The field "Info object" uses a trigger for record resizing. The measuring data are also saved in two files: the first file holds all attributes of the measuring, Am = Ako + Adm + Aom; the second file holds the measured values. This file has 3 parts: 1. "info-measuring" serves the trigger for record resizing, 2. "bitmap" is a sequence of bits (flags) corresponding to the measured values – if a flag bit is zero, the corresponding value is null; if it is "1", the corresponding value is valid, 3. the array of measured values itself. D1 is the length of the bitmap, D2 is the length of the array of measured values.
The dynamic table of measured values is created with the help of the table of the quantities list, part of which is shown in Table 4. This table realizes the dependency in equation (1). A quantity Qi is loaded into the dynamic table of measured quantities if it is measured on one of the sub-object types <T1, … Tm> and in one of the periods <P1, … Pn> written in its row – see Table 4.
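The bitmap layout described above can be sketched as follows (our illustration; the paper's exact on-disk field widths are not reproduced): one flag bit per quantity, with values stored only for the non-null flags.

```python
import struct

def pack_measured(values):
    """values: list of floats or None. Returns bitmap bytes + packed doubles."""
    bits = 0
    present = []
    for i, v in enumerate(values):
        if v is not None:
            bits |= 1 << i          # flag bit "1": the value is valid
            present.append(v)
    d1 = (len(values) + 7) // 8     # D1: length of the bitmap in bytes
    return bits.to_bytes(d1, "little") + struct.pack(f"<{len(present)}d", *present)

def unpack_measured(blob, count):
    d1 = (count + 7) // 8
    bits = int.from_bytes(blob[:d1], "little")
    values = struct.unpack(f"<{bin(bits).count('1')}d", blob[d1:])
    it = iter(values)
    return [next(it) if (bits >> i) & 1 else None for i in range(count)]

row = [1.45, None, 12.61, None, 4587.0]
assert unpack_measured(pack_measured(row), len(row)) == row
```

The saving is largest when many quantities are null in a given period, which is exactly the situation equation (1) produces: each (type, period) pair measures only a subset of all quantities.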
Fig. 2. Internal structure of the object records saved in the files (File 1 – head: the basic data of the object, Aco1; File 2 – extended: Aco2, Aio, and Info_object with pointer and size).
Table 2. Part of the text attributes of a sub-object type (Attribute i, Attribute i+1, …, Attribute m, with values string i, string i+1, …, string m).
Fig. 3. Compression method of saving the text attributes into the file (the file stores ni, string i, ni+1, string i+1, …, nm, string m, where ni is the number of characters of string i).
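The compression scheme of Fig. 3 (length-prefixed strings) can be sketched like this; the two-byte width of the length prefix nᵢ is our assumption, since the paper does not state it:

```python
import struct

def write_strings(strings):
    """Store each text attribute as n_i (its byte count) followed by the bytes."""
    out = bytearray()
    for s in strings:
        data = s.encode("utf-8")
        out += struct.pack("<H", len(data))  # n_i
        out += data                           # string i
    return bytes(out)

def read_strings(blob, count):
    strings, pos = [], 0
    for _ in range(count):
        (n,) = struct.unpack_from("<H", blob, pos)
        strings.append(blob[pos + 2 : pos + 2 + n].decode("utf-8"))
        pos += 2 + n
    return strings

attrs = ["steel 11 373", "", "normalised"]
assert read_strings(write_strings(attrs), 3) == attrs
```

Compared with fixed-width columns, no padding is stored, which is the point of handling variable-length attributes separately at the internal database level.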
Table 3. Measured quantities (Q1 = 1.45, Q2 = 12.61, …, Qk = 4587).
Fig. 4. Method of saving the non-text attributes and measured values into the file (File 1 – head: Am, Info_meas. (2 B), pointer; File 2 – values: bitmap BM of length D1, then the array of measured values, 4 B each, of total length D2).
Table 4: Part of the quantity list
Quantities | Sub-object types | Periods of measuring
Qi         | T1 … Tm          | P1 … Pn
5. DATABASE UPDATE FOR USE IN WINDOWS
At present the database model has been updated to run under the Windows operating system. MS Access was used as the base relational database system. The source code of the application is written in Borland Delphi, and it is connected with MS Access by an ODBC interface through the data module. The conception is shown in Fig. 5.
[Fig. 5 scheme: MS Access database file (tables) ↔ ODBC link ↔ data module (database, data sources) + BDE (Borland Database Engine) components (tables, queries) ↔ application.]
Fig. 5. Conception of the current database solution
The advantage of this conception consists in simple data manipulation (queries, filters etc.); on the other hand, the database insert operation has to be secured in the source code against primary-key collisions. If a primary-key collision occurs, MS Access raises an exception error in the application.
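The guard against primary-key collisions can be sketched as follows, substituting SQLite for the paper's MS Access + ODBC stack (the table and column names are invented): the insert is wrapped so that a collision is reported to the caller instead of surfacing as an unhandled exception.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE object (ako INTEGER PRIMARY KEY, name TEXT)")

def safe_insert(conn, ako, name):
    """Return True if the row was inserted, False on a primary-key collision."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO object (ako, name) VALUES (?, ?)", (ako, name))
        return True
    except sqlite3.IntegrityError:
        return False

print(safe_insert(conn, 1, "pipeline segment"))  # True
print(safe_insert(conn, 1, "duplicate key"))     # False
```

The same pattern (catch the engine's integrity exception at the insert site) is what the Delphi source code has to implement around the ODBC call.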
6. CONCLUSION
The database model was used in the information system for anticorrosive gas-pipeline protection in gasworks in the Czech Republic. This system, named GAS-ACOR, has already been running for more than 10 years, and it has been upgraded according to the users' needs. The first version was made in Borland Pascal for MS-DOS. The last upgrade consisted in introducing a module for universal file printing in the network environment. This module was created in Borland Delphi. The authors provide the users with help in the form of a hot line or personally.
Last year, work began on converting the GAS-ACOR system from MS-DOS to Windows. The source modules are written in Borland Delphi. A test run of the new system is in operation at the South Moravia gasworks in Brno. The introduction of GAS-ACOR into practice will mean considerable time savings and an increase in the efficiency of the anticorrosive protection.
REFERENCES
[1] KORTH, H. F., SILBERSCHATZ, A., (1996), Database System Concepts. Third Edition. McGraw-Hill.
[2] POKORNÝ, J., (1998), Databázová abeceda. SCIENCE, Veletiny, [in Czech].
[3] SEDLÁČEK, S., BARTONĚK, D., (1996), GAS-ACOR. User manual. SHINE, Brno, [in Czech].
INTERNET - A MODERN E-LEARNING MEDIUM
Marek Woda, Tomasz Walkowiak
Institute of Engineering Cybernetics, Wroclaw University of Technology
ul. Janiszewskiego 11/17, 50-372 Wroclaw, POLAND,
Phone: +48-71-3202969, +48-71-3203996 Fax: +48-71-3212677,
E-mail:{ mwoda,twalkow}@ict.pwr.wroc.pl
Abstract: The World Wide Web, popularly called the Internet, is presented as a contemporary medium for distance learning. Some techniques and technical aspects of substituting for traditional methods are also discussed. Conventional techniques of teaching are, in experts' opinion, rather outdated and should be replaced by new ones. The authors describe a way to improve the traditional form of e-learning by the use of Learning Agents.
1. INTRODUCTION
Distance learning is a process of teaching which takes place between a teacher and students who are usually separated in time and space. This kind of teaching is becoming ever more popular, so, willing to vary this way of education, educators have started using all the available tools [6] offered by modern e-technologies.
Though the idea of distance learning came into being about three centuries ago, along with the appearance of the first mail courses, in the information and computerization age it is becoming an even more attractive form of teaching. The growing need to gain larger and larger knowledge, and to reduce the time required by stationary methods, especially for beginners, has driven the development and improvement of distance learning routines. This kind of learning is often the only alternative for people who are handicapped or who for health reasons cannot afford studies far away from home. It is also an alternative proposition in the case of training the unemployed. On the other hand, the development of this form of education gives universities the opportunity to reach a potentially wider audience.
Essentially there exist only two categories of distance learning: synchronous and asynchronous [2]. The synchronous one requires the simultaneous participation of both students and tutors (educators). The main advantage of this kind of education is the mutual interaction that takes place in real time. Exemplary forms of synchronous teaching are interactive television, teleconferences, different kinds of internet chat-rooms etc. Asynchronous teaching does not require the students' and instructor's simultaneous participation. Students are not required to come together at a specific place and time. The students have plenty of possibilities: the choice of place, the range of time, what they intend to learn and what materials they are going to use, according to the tasks they were entrusted with. Asynchronous education is more flexible than synchronous. The asynchronous forms of teaching appear as electronic letters, audio-video broadcast transmissions, all sorts of correspondence, or Web pages. The biggest disadvantage of this form is the possibility of the student accumulating a significantly large amount of unfinished tasks [4].
Nowadays the utilization of the Internet in Poland for distance teaching is at an initial stage of development. The Internet, first of all, serves for asynchronous access to educational materials. This kind of access to information has become a substitute for the traditional textbook.
The dynamic development of new techniques for sending and processing data through the net lets us improve and develop more effective forms of teaching. Research on this subject is carried out all over the world. One of the academic centres in Poland which carries out this kind of experiments is Wroclaw University of Technology, one of the leaders in putting multimedia and the Internet into practice in e-learning.
2. TECHNOLOGICAL ASPECTS OF DISTANCE LEARNING
The research group of the Institute of Engineering Cybernetics at Wroclaw University of Technology developed the experimental course MultiMedia Education: "An Experiment in Delivering CBL Material" (no. PL1046) [1]. That course exploits the newest internet technologies. All the techniques discussed below (in point 2) concern this course, but they are applicable in general to most e-learning solutions.
The material of this course concerns microprocessor systems, and it is accessible to users equipped only with a computer with an internet browser and Internet access. All materials are delivered through the net. Tests have proven that access by telephone line is sufficient for the effective use of all materials.
When developing distance learning courses, many conventional solutions ought to be replaced by alternative ones, and the main goal is to integrate various techniques of learning to make this process as attractive as possible to learners. Traditional lectures can be replaced by previously recorded video presentations. Every user then has the possibility, with the assistance of an internet browser, to view and listen to earlier recorded lectures (a quick preview of the content is a great support) along with additional notes that appear on screen. The traditional textbook should be replaced too. For this purpose multimedia presentations must be used. Taking advantage of animations, and requiring interaction from the user's side, greatly increases the effectiveness of this form of learning.
Fig. 1. Main menu of Multimedia Education Course
(based on: mm-edu.ict.pwr.wroc.pl)
Another task is to create an internet form of laboratory activity. This is frequently the greatest challenge in such projects, but if it is successful it results in the development of a virtual laboratory. Such a laboratory exists only virtually, not physically: the devices accessible to students are entirely simulated by computer software. The purpose of such a laboratory is to make advanced learning content available in a network environment. Furthermore, distant access to a programming environment ought to be implemented, which gives the user the possibility to write programs for controlling any device.
Certainly we cannot stop thinking about checking learning progress. Therefore we must include several sets of test questions to check the student's knowledge. The results of those tests should be immediately available to the teacher and the student. To test practical skills, additional sets of tasks that verify the gained knowledge should be placed in the course content; students are required to perform special tasks with the virtual devices. To integrate all the above-mentioned activities, a course management system must be used, e.g. the commercial WebCT system. Everything needed is included in its environment, and it provides a decent set of administrative tools to assist the instructor in the process of management and continuous improvement of the course.
2.1. Internet University
The first element that is crucial when constructing a base for e-learning solutions is the virtual equivalent of a university (understood as an institution or facility), which has to fulfil three essential functions: student management, dissemination of knowledge, and testing the progress of its absorption. These assignments are fulfilled by distance education supporting systems called Course Management Software (CMS).
With the assistance of CMS systems, integrating many techniques is feasible. Such systems can make course material sharing possible, check the progress of knowledge absorption by each student, provide a users' authorisation policy, present extensive statistics and much more.
Virtual Learning Environments [9] are an important part of all distance learning courses. In many e-learning solutions the WebCT core system is commonly used for the designed course. WebCT is a widely used course management system in higher education, enabling the delivery of online education around the world. The WebCT solution integrates rich and flexible pedagogical tools with the existing campus infrastructure. A secure "virtual classroom" environment can be deployed enterprise-wide to supplement the traditional classroom or for pure distance programs.
Use of the WebCT course management system allows institutions to efficiently leverage campus resources, both to extend their offerings and to enhance the teaching and learning experience. For example, one of the many benefits of using WebCT is the ability to offer an "always on" environment, providing more time for the student to interact with professors and classmates, as well as with the course material, in an efficient, engaging and effective manner.
Course management is possible through the browser window. Worth mentioning is the complex students' progress indicator module. It usually integrates a wide range of tests and quizzes along with open questions and simple content analysis. It automatically checks the answers, with result analysis for the entire student group, which makes it easy to check the students' knowledge.
WebCT provides rich capabilities for tracking student learning data within a course. Among the numerous pieces of student performance data that instructors can analyse are the portions of the online course on which students focus their time, the frequency with which students participate in online learning activities, and student performance assessments.
Fig. 2. Assessment centre in WebCT Core
These data provide a rich source of information for improving the learning experience, both for individual students and for continually improving the course experience for all students, current and future.
Other useful features of the WebCT system are the questionnaire and quiz facilities. WebCT is one of many available distance learning systems. There are many different solutions available: LearningSpace (Lotus Notes), TopClass (WBT Systems) or BlackBoard (BlackBoard Inc.).
2.2. Internet lectures
One of the most important forms of training is the lecture. In remote learning, TV or video
broadcast is usually used. When producing an internet lecture it is necessary to transfer audio and video
recordings to a computer using special video equipment (cameras, video capture cards). After initial processing,
the video material is compressed to lower the bandwidth demand when transmitting it to the audience. Files with
recorded audio-video content are stored on an internet server and later transmitted on demand to the students'
computers as a data stream. The quality of audio and video on the student's computer depends strongly on the
bandwidth of the connection to the server. One producer of software enabling such video transmission is Real
Networks Inc.
Not only audio-video recordings can be used: it is also possible to transmit to students
a virtual view of the lecturer's whiteboard. To put all this together we used equipment called the MiMio
Whiteboard. MiMio is an exceptionally interesting equivalent of the conventional blackboard and can be used in
any distance learning course to improve the comprehensibility of lectures. It transforms a standard whiteboard
from a simple writing medium into a powerful collaboration and communication tool. Any whiteboard can be
made digital simply by attaching MiMio.
Fig. 3. Application of MiMio Whiteboard technology
After that, everything written or drawn is captured electronically into the computer, in colour and in real
time, ready to be saved and shared with anyone, anywhere, anytime. MiMio improves productivity and
collaboration by allowing meeting participants, team members and students to concentrate on sharing ideas,
participating in discussions and solving problems instead of taking notes. Most important in our situation,
however, is that the whole process of creating a picture on the whiteboard can be recorded. Combined with
audio and video recording, this makes it possible to produce a presentation containing a real lecture.
Fig. 4. MiMio tool
2.3. Internet textbook
One of the most popular forms of learning is simply the book. In computer aided learning the ideal
substitute for a book is the so-called multimedia presentation, which integrates many technologies in one
dynamic presentation. One can say that this is even more than just a book: the possibility of interaction with
users has a similar effect to practice classes.
The most popular system for creating guided dynamic presentations is Macromedia Authorware.
Authorware is an integration program. It is designed for putting multimedia together in one structured,
interactive program. It is not the only such tool, but it is arguably the best of them.
The metaphor used is that of a procedural program's execution flow line. The developer adds graphics,
animation or sounds to the flow line in the order he wants them to appear. Media components can be created in
Photoshop, Director, SoundEdit, FreeHand, Authorware itself, etc., and can stay on the screen for as long as the
developer wants, by setting the components' properties.
Media components are added to the flow line using Icons, hence Authorware's Icon-Based authoring
scheme. There are different kinds of icons that represent different media components. You drag an icon from the
Icon Palette to the main flow line and double-click on it to add desired media. The icons that distinguish
Authorware from other authoring tools are: Framework, Interaction and Decision Icons. The Framework Icon
allows you to navigate through parts of your program, the Interaction Icon allows you to interact with the user,
and the Decision Icon allows you to take specific paths according to criteria you set. It is important to note that
other programs will allow you to create the same functionality, but they will inevitably require heavy scripting,
while Authorware Icons have their functionality self-contained.
2.4. Laboratory exercises
One of the most indispensable forms of classes is the laboratory exercise – working with genuine objects.
It allows students to attain practical skills and to combine theoretical knowledge with practical proficiency.
In distance learning there is usually no opportunity for contact with real devices; therefore, practical classes
are a real challenge. Generally, there are two approaches to this problem, each with its own advantages and
disadvantages: a real laboratory with remote access and a virtual laboratory.
The idea of remote access to real devices in a laboratory is usually rejected because of the following
shortcomings: the limited number of simultaneously working students (depending on the number of devices
available in the classroom) and the overall cost of the system - the devices, video cameras (to view the actual
state of the process) and any other additional equipment that makes communication possible. Therefore the
Virtual Laboratory is used more often. This kind of facility does not physically exist: the genuine devices on
which students would operate are replaced by programs that simulate the activities of those devices, usually in
real time. This kind of laboratory can be accessed from the user's computer over an Internet connection and is
limited only by the amount of resources on the server and the network bandwidth, not by the number of real
devices or the time of day. There are no additional requirements to start exercising: neither additional devices
such as video cameras nor any other special interfaces are required; therefore no supplementary human service is
needed in case of a device malfunction. The simulation software usually runs on the client side; only rarely does
it run solely on the server, when the simulator consumes an extraordinarily large amount of resources. Sometimes
one part of the simulator (responsible merely for visualisation) runs on the user's machine while the second part
runs on the server (which simulates the behaviour of the device). The choice depends on many factors, such as
the quality of the network connection, the complexity of the simulation algorithm, the difficulty of installation
on the user's machine, system maintenance (distribution of new versions and patches) and, broadly interpreted,
computer system security.
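As a minimal sketch of the client/server split described above, the following hypothetical Java fragment keeps the device behaviour on the server side while the client part is responsible merely for rendering the state it receives. The StepperMotorSimulator device and its command set are illustrative assumptions, not part of any concrete laboratory described here.

```java
// Hypothetical sketch of a virtual laboratory device split into a
// server-side simulator and a thin visualisation client.
public class VirtualLab {

    // Server part: simulates the behaviour of a simple stepper motor.
    public static class StepperMotorSimulator {
        private int position = 0;   // current step position (hidden state)

        // Processes one textual command and returns the resulting state,
        // which is all the client ever sees.
        public int execute(String command) {
            if (command.equals("STEP_CW")) {
                position++;
            } else if (command.equals("STEP_CCW")) {
                position--;
            } else if (command.equals("RESET")) {
                position = 0;
            }
            return position;
        }
    }

    // Client part: responsible merely for visualisation of the received state.
    public static String render(int position) {
        return "motor position: " + position;
    }

    public static void main(String[] args) {
        StepperMotorSimulator server = new StepperMotorSimulator();
        server.execute("STEP_CW");
        int state = server.execute("STEP_CW");
        System.out.println(render(state));
    }
}
```

Because the client only renders states it receives, the same visualisation code works whether the simulator runs locally or on the server, which is exactly the deployment choice discussed above.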
The aim of each virtual laboratory is to support learning on as many types of devices as possible,
along with the possibility of programming microprocessor-based devices such as PCs. The main intention of the
system is remote access to those devices through the Internet. The user is expected to have an internet browser
with a runtime environment such as Java and at least a dial-up connection.
Naturally, all virtual peripheral devices can be programmed in a previously fixed programming
language, e.g. C/C++ or Java.
The students can see the controlled device through a window in the Internet browser. Usually the same
window contains a module presenting the status of the device components that can be accessed from the
program. It is also possible to manipulate any moving part of the device by sending it a suitable sequence
of commands.
Besides the ability to view the computer-generated device, the user can operate in the programming and
running environment: he can write programs that control the devices, then compile and run them.
In order to prevent unauthorised access to the laboratory, every admission is controlled. Each time
a student logs on to the lab he must obey the rules and provide his login and password. Only after a successful
login is access to all the devices and tutorials granted.
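A minimal sketch of such admission control in Java, assuming a hard-coded account table; a real laboratory would check the credentials against a server-side user database, and the account shown is purely illustrative.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: admission control for the virtual laboratory.
public class LabAccess {
    private final Map<String, String> accounts = new HashMap<>();

    public LabAccess() {
        // Illustrative account only; a real system would query a server.
        accounts.put("student1", "secret");
    }

    // Access to devices and tutorials is granted only when both the login
    // and the password match a known account.
    public boolean login(String user, String password) {
        return password != null && password.equals(accounts.get(user));
    }

    public static void main(String[] args) {
        LabAccess lab = new LabAccess();
        System.out.println(lab.login("student1", "secret"));
        System.out.println(lab.login("student1", "wrong"));
    }
}
```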
To ensure portability, most virtual laboratories are written in the Java language, which makes it
possible to operate them on many hardware platforms.
3. DISADVANTAGES AND PROBLEMS OF DISTANCE LEARNING
Despite many undeniable strong points, distance learning utilizing the Internet has its own weak
points. Students feel isolated without a real tutor from whom they can seek advice, and they often lose their
eagerness to learn [2]. The distance learning process also leads to isolation of the human being, by limiting
interpersonal contacts, and finally to a dehumanization of the acquisition of knowledge. Among the worst
disadvantages we may include the time-consuming and expensive process of preparing complicated teaching
materials (e.g. course content, lessons, electronic lectures, quizzes, tests) [8], the complexity of management,
and the impossibility of teaching some disciplines of science, such as medicine.
Not infrequently we encounter the phenomenon of a well prepared distance learning course being
characterized by worse efficiency than the regular one. The cause of this phenomenon is the human factor. It can
be clearly observed with students who are not proficient in operating computers [6]. The main reasons are that
students are incapable of selecting the essential information from the so-called "information noise", and the
lack of interpersonal contact with other learners.
Apart from the problems mentioned before, an elementary problem in distance learning lies in the difficulty
of matching material that will suit the student's needs and be appropriate for his background and level of
knowledge. Another huge difficulty is the problem of how to manage the students with a minimal number of
workers. If the number of students exceeds, let us say, a hundred or more, it is considerably more difficult to
manage them or track their learning progress. With a growing number of learners, assessment and the
scheduling of classes or exams become highly inconvenient.
When working with students, attention must be paid to the different rates at which individual students
absorb knowledge. It follows indirectly that each learner has his own mechanism of knowledge perception and
absorption.
In the classical form of teaching all the presented procedures are executed by a "human factor", which can
cause (and usually causes) crisis situations, such as a lack of control and supervision over the entire group
during the learning process (because of the tutor's tiredness), or delivery of inadequately prepared learning
content (because of ignorance of the students' skills). All these factors have an effect on the learner's eagerness.
A remedy for the inconveniences presented above in connection with e-learning, and a way to improve
the efficiency of absorbing knowledge, could be the application of Learning Agents [11].
4. AGENTS AND DISTANCE LEARNING
4.1. What is an agent?
Speaking in the most general terms, an agent is a process that operates in the background and performs
some activities, according to its own agenda, after an event occurs.
In the Internet environment an agent is a system component that resides on the client side, in the user's
computer. It collects information about the user's state and sends the data back to the server, which should be
perceived as a repository of data and a management system [5].
The agents' management system can operate on a main computer in the net, though in the case of
distributed management systems it can also be located in many nodes of the net; some local data-harvesting
agents are also possible, which periodically send the gathered data to the main node.
Fig 5. Illustration of agent activities: observation of the process, measured data vector, data processing, data analysis, image of user activities
In simplification, an "agent" is a computer program which is capable of executing and supervising complex
operations such as retrieving data, "customer maintenance", etc.
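As a minimal sketch of such a background agent, the following hypothetical Java fragment records observed user events and periodically passes them on to a server-side repository. The MonitoringAgent name, the event strings and the list standing in for the remote server are all illustrative assumptions.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: a client-side agent that buffers observed user
// events and periodically flushes them to the management system.
public class MonitoringAgent {
    private final List<String> buffer = new ArrayList<>();
    private final List<String> server;   // stands in for the remote repository
    private final int flushThreshold;

    public MonitoringAgent(List<String> server, int flushThreshold) {
        this.server = server;
        this.flushThreshold = flushThreshold;
    }

    // Called for every observed user action; transmits the buffered events
    // to the main node once the buffer is full.
    public void observe(String event) {
        buffer.add(event);
        if (buffer.size() >= flushThreshold) {
            server.addAll(buffer);
            buffer.clear();
        }
    }

    public static void main(String[] args) {
        List<String> server = new ArrayList<>();
        MonitoringAgent agent = new MonitoringAgent(server, 2);
        agent.observe("opened lesson 1");
        agent.observe("started quiz");
        System.out.println("events on server: " + server.size());
    }
}
```

The buffering mirrors the periodic transmission to the main node mentioned above: the agent works autonomously and the user never triggers the transfer explicitly.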
In academic circles the term agent is still being discussed. Scientists cannot agree on one coherent
definition of the agent, which is why the term is widely used for many, sometimes quite different, products and
technologies. Some call agents simply "robots", abbreviated to "bots", because the idea of agents is to help
people out with their common, boring actions. What is more, agents are capable of performing their actions even
when users are temporarily not connected to the Internet: agents' autonomy is one of their essential features.
Multi-agent systems find application almost everywhere; nonetheless, network applications of agents
are their fastest developing domain. Agents are commonly used for collecting user data and for carrying out
very complex tasks through activity analysis and adjustment to suit user needs. Such systems have a great
influence on distance learning techniques that exploit the Internet as a transport medium, where they are
perceived as a kind of distributed supervision over learners. Agents can significantly relieve the human
supervisor of his duties and help in the proper selection of course content to be optimal for every individual.
During work with students there is a need to pay attention to the various levels of knowledge absorption
of particular individuals, and this involves calculating an intelligent user profile. One of the best known means
of determining a person's profile, abilities, interests or weak points is the application of a multi-agent system,
which is one of many techniques that support e-learning [10].
The minimal aim in this case is to create an agent that will navigate and select material within the limits
of one Web course page. This aim can easily be achieved once the ability to recognize and classify learners is
attained, for example if the system is able to differentiate at least two groups of students.
The first dilemma that we encounter is the problem of how to estimate whether a given portion of material
has been absorbed or not, and then what course level will be suitable for the student as the next step.
These problems cannot be solved directly by the agent, but with its help the tutor can get very useful hints
for each individual. Assigning a proper level to each learner, on the basis of estimating some previously
established parameters, is the right task for an agent. The next step that should be undertaken is analysis of the
user's interests, based on his prior activity, so that the presented learning material does not bore him but
encourages further learning.
This can take place on the basis of a complex synthesis of the user's movement path across the course
content and an initial skill analysis. Each node in the hierarchy can have ascribed to it values that correspond
to an agreed worth: units that describe the user's skill. A thorough analysis of the navigation process, along
with counting these units and interpreting them properly, reveals the desired user profile. In further stages,
constant analysis of the user's progress allows automatic material selection and relieves the supervisor's duties
to a significant degree.
Both the learner's behaviour and the agent's analysis of it can be used by other agents, for example
retrieval agents, in order to find information that could be useful in the near future. The action of retrieving data
is only indirectly triggered by the learner and happens without his knowledge, in an independent way.
4.2. Characterization of user profile
A single agent, or even a group of them, can be scheduled to determine the intellectual profile of users.
The agent that models user behaviour, adapts the interface and marks the nodes in the movement path in the
knowledge tree passes on information about the user's interest in the material. All the information is collected
and passed on to a classification agent. This is the first step in ascribing one of the possible advancement levels
to a particular student. The analysis of the collected data allows a general evaluation and assignment to one of
three levels (basic, intermediate, advanced). Qualification to one of these levels requires earlier selection of
material in a knowledge base.
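The unit counting and three-level assignment described above can be sketched in Java as follows; the unit values and the thresholds separating the levels are illustrative assumptions, since no concrete parameters are fixed here.

```java
// Hypothetical sketch: a classification agent that sums the skill units
// ascribed to the knowledge-tree nodes a student visited and maps the
// total to one of three advancement levels.
public class ClassificationAgent {
    public enum Level { BASIC, INTERMEDIATE, ADVANCED }

    // Sums the unit values ascribed to the visited nodes.
    public static int score(int[] visitedNodeUnits) {
        int total = 0;
        for (int units : visitedNodeUnits) {
            total += units;
        }
        return total;
    }

    // Maps the accumulated score to a level; the thresholds are invented.
    public static Level classify(int score) {
        if (score < 10) return Level.BASIC;
        if (score < 25) return Level.INTERMEDIATE;
        return Level.ADVANCED;
    }

    public static void main(String[] args) {
        int[] path = {3, 4, 6};   // units gathered along one navigation path
        System.out.println(classify(score(path)));
    }
}
```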
4.3. Navigational hints
The navigational agent, as the name indicates, is responsible for proper "navigation" across the course
material placed on the WWW page. Materials can be placed in many locations, so the task of such an agent is
to facilitate access to them and to collect information about them, depending on the user profile obtained from
the agent that analyzes the user's movement across the course content. Through the material classification
performed by the above-mentioned agent, the most vital materials or information are presented using the
interface agent. Materials strongly connected with the course topic are queued and wait for user access.
Every time the student makes a choice, each movement is tracked and analyzed by the group of
agents, which keep trying to adapt the course content to suit the user's needs.
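A hedged Java sketch of how a navigational agent might queue materials by their fit to the user profile; the Material class and its relevance scores are hypothetical, and a real agent would derive relevance from the profile and movement analysis described above.

```java
import java.util.PriorityQueue;

// Hypothetical sketch: the navigational agent ranks course materials so
// that the most relevant ones wait at the head of the queue for access.
public class NavigationalAgent {
    public static class Material {
        public final String title;
        public final double relevance;   // fit to the current user profile

        public Material(String title, double relevance) {
            this.title = title;
            this.relevance = relevance;
        }
    }

    // Highest relevance first.
    private final PriorityQueue<Material> queue =
        new PriorityQueue<>((a, b) -> Double.compare(b.relevance, a.relevance));

    public void offer(Material m) {
        queue.add(m);
    }

    public Material next() {
        return queue.poll();
    }

    public static void main(String[] args) {
        NavigationalAgent agent = new NavigationalAgent();
        agent.offer(new Material("Loose background reading", 0.2));
        agent.offer(new Material("Core lecture notes", 0.9));
        System.out.println(agent.next().title);
    }
}
```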
4.4. Harvesting data
Finding materials useful for a given course or lesson is not an easy matter, not only because of the
need for their appropriate selection, but also because of their placement. It is extremely seldom that a complete
course is placed on a www site along with a thorough set of materials concerning the course subject. Usually
additional materials are
placed on multiple web pages outside the main course page.
The process of retrieving such data is usually too time-consuming for the average user, and the data
found do not always fully match the course subject. A harvesting agent can turn out to be exceptionally useful
in such a situation: it helps retrieve data using the additional information gathered by the previously mentioned
agents (information about the type of user and his preferences).
The results of its work are strictly connected with the tasks performed (or not) by the other agents, and
the efficiency of the harvesting agent is strictly dependent on them.
5. SUMMARY
Distance education plays an ever more important role in the education of many social groups. Its
importance is growing considerably year by year, thanks, among other things, to spreading Internet accessibility.
Its popularity, especially among students from large-area countries (like Canada, the USA or Australia), matches
that of traditional education. Both in Europe and in the USA the majority of well known universities have started
teaching using e-learning courses. For these reasons distance learning should not be considered a marginal
phenomenon, and academics' interest in it should be aroused.
It should be emphasized that e-learning is more effective and more economical than the conventional
form, but only on one condition: adequate preparation of the material along with great effort put into its
realization. Distance education possesses many advantages but quite a few negatives too. Some of them can be
removed effortlessly, but unfortunately, to get rid of the other disadvantages, a brand new intelligent model of
teaching employing so-called smart agents ought to be introduced.
Multi-agent systems are becoming more popular in current e-learning solutions. In particular, individual
agents are perceived as a kind of supervisor of a simultaneous, distributed education process for many
participants. They can be very helpful in preparing personal learning paths for individual students and come
in handy for estimating the learning progress of each student.
The agent architecture used in e-learning solutions creates new possibilities for efficient and quick work in
the distance learning area. Agent-specific techniques are mainly used for estimating knowledge absorption,
adjusting tasks to be suitable for an individual, and optimizing the whole process of acquiring knowledge for
each student.
The presented approach, which incorporates Learning Agents into the teaching process, may considerably
shorten the time of acquiring knowledge by increasing efficiency, and allows the number of students to increase
without employing more human supervisors or even additional control.
We therefore think that highly specialized distance learning systems with built-in agent techniques are
more effective for delivering e-learning multimedia content and solving distance problems than classical ones.
Multi-agent systems will, in the near future, most probably constitute the basis of the majority of systems
for distance learning education.
REFERENCES
[1] BARÁNSKI, M., WALKOWIAK, T., "MM-EDU: Network Virtual Laboratory", Raporty Inst. Cyb. Tech. PWr., Seria SPR 2/2001, 2001.
[2] BOCZUKOWA, B., "Edukacja na odległość" (Distance Education), Akademia Podlaska, Siedlce, 2000. [In Polish]
[3] CRADDOCK, I., MENDDRELA, M., "Capture and Asynchronous Remote Delivery of Handwritten, Audio and Video Lecture Content", ALT 2001.
[4] Distance Learning Week: What is Distance Learning? Public Broadcasting Services, 1999; http://www.pbs.org/adultlearning/als/dlweek/whatis.htm
[5] HECZKO, R., "Systemy Wieloagentowe" (Multi-agent Systems), http://ie.silesnet.cz/mas-pl.html [In Polish]
[6] KUBIAK, M., "Internet dla nauczycieli – nauczanie na odległość" (The Internet for Teachers: Distance Teaching), Mikom, Warszawa, 1997. [In Polish]
[7] KUBIAK, M., "Wirtualna edukacja po polsku" (Virtual Education the Polish Way), Computerworld, 2000/26. [In Polish]
[8] Online Journal of Distance Learning Administration, 1998, http://www.westga.edu
[9] Project MM-EDU, http://mm-edu.ict.pwr.wroc.pl
[10] STEINER, V., "Technology in Education", 1995, http://www.wested.org/tie/dlrn/distance.html
[11] CORTES GARCÍA, U., SANGUESA SOLÉ, R., BÉJAR, J., HALL, T., "Improving Learning Tools by Means of Cooperative Agents Technology", Dept. Llenguatges i Sistemes Informàtics, Universitat Politècnica de Catalunya.
USING THE RUP FOR SMALL WEB INFORMATION SYSTEMS
Petr Švec
Brno University of Technology, Faculty of Mechanical Engineering,
Institute of Automation and Computer Science
Technická 2, 616 69 Brno, Czech Republic
e–mail: [email protected]
Abstract: This paper deals with describing and customizing a method of proceeding in software
development. The objective is to reduce the quantity of operations performed in the Rational Unified
Process and thereby to accelerate the development of small information systems with special demands
for deployment on the Internet.
Keywords: Rational Unified Process, Unified Modeling Language, Web Application Extension,
customizing RUP.
1. INTRODUCTION
It is necessary to apply project management during the development of large projects as well as of small
ones. The Rational Unified Process (RUP) is a clearly defined and structured software engineering process
framework. It is a software development approach that is iterative, architecture–centric and use–case–driven,
but above all it is customizable to all the needs of its users. Various configurations can be made to support small
or large teams and more or less formal approaches to development. The paper is concerned with the particular
parts of the RUP that are presented in [4, 5]. The books [1, 6, 7] introduce the use–case–driven approach, and
the Unified Modeling Language (UML) is explained in [2, 8, 9].
2. THE STRUCTURE OF THE RUP
The RUP consists of two dimensions: a dynamic aspect (horizontal) and a static aspect (vertical). The
dynamic aspect expresses the lifecycle, phases, iterations and milestones. The static aspect expresses activities,
disciplines, artifacts and roles.
Figure 1: The structure of the RUP
The dynamic structure deals with the lifecycle, or time dimension, of the project. The RUP uses an
iterative approach that divides a project into four phases: Inception, Elaboration, Construction, and Transition.
Each phase consists of one or more iterations. Each iteration focuses on producing the technical matters
necessary to achieve the business objectives of that phase. At the end of each phase there is a milestone that
helps to decide whether we can continue with the next phase or not, thereby decreasing possible risks.
The Inception phase establishes a good understanding of all the customers' needs, the requirements and the
scope of the system. Further, it mitigates many of the business risks and produces the business analysis for
building the system. The Elaboration phase takes care of many of the most technical tasks (design,
implementation, tests, key components, etc.) and tries to pinpoint the major technical risks (performance, data
security, etc.) by implementing and validating actual code. The Construction phase does most of the
implementation across the whole project. Finally, the Transition phase builds the final version of the product
and delivers it to the customer.
The static structure defines how individual disciplines are ordered in each iteration of the concrete
phase. The disciplines are logical containers for all process elements – roles, activities, artifacts and associated
concepts, guidelines and templates. The disciplines can be divided into two main parts – fundamental and
supporting. The fundamental part includes business modeling, requirements, analysis and design,
implementation, test and deployment. The supporting part includes configuration and change management,
project management and environment. The individual processes explain who does what, how and when. A role
is an individual or a group that has competence and responsibility for a certain part of the system. An artifact
is a piece of information that can be produced, modified or used by other processes. An activity is a unit of
work that is assigned to a specific role. Workflows are used to describe meaningful sequences of activities that
produce some valuable result and to show the interactions between roles. The most common workflows are the
disciplines.
3. PRINCIPLES OF THE RUP APPROACH
The essential principles, so–called best practices, are iterative development, requirements management,
architecture and use of components, modeling and UML, quality of process and product, and configuration and
change management.
A major goal of iterative development is to reduce risk early in the project. This is done by analyzing
and reducing the top risks in each iteration. As shown in Figure 2, some requirements work, analysis, design,
implementation and testing are done in each iteration. Each iteration finishes with an executable application.
Figure 2: The structure of iteration
Tools called "use cases" are used to capture the customer's requirements in requirements management.
They are used during the subsequent analysis, design, implementation and test disciplines. Component–based
development encapsulates data, and the functionality operating upon that data, into a component. Every
component has an interface that allows other applications or parts of the system to connect to it. This feature
of component–based development is called encapsulation, and it makes components easy to reuse.
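The encapsulation idea can be illustrated with a small Java sketch; the ReportComponent interface and its implementation are invented for illustration and do not come from the RUP itself.

```java
// Hypothetical sketch: a component hides its data behind an interface, so
// other parts of the system connect only through that interface and the
// implementation can be replaced without affecting its clients.
public class ComponentDemo {

    // The published interface of the component.
    public interface ReportComponent {
        String generateReport(String customer);
    }

    // The implementation encapsulates its state behind the interface.
    public static class SimpleReportComponent implements ReportComponent {
        private int reportsProduced = 0;   // hidden state

        @Override
        public String generateReport(String customer) {
            reportsProduced++;
            return "Report #" + reportsProduced + " for " + customer;
        }
    }

    public static void main(String[] args) {
        // Clients depend only on the interface type.
        ReportComponent component = new SimpleReportComponent();
        System.out.println(component.generateReport("ACME"));
    }
}
```

Because clients hold only the interface type, the component can be reused or swapped for another implementation, which is the reuse benefit the text attributes to encapsulation.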
In the RUP, the models that are maintained and developed describe the individual parts of the system.
To construct these models the UML is used. The UML is a graphical language for visualizing, specifying,
constructing and documenting the concrete parts of the system. Besides the UML we have to use the Web
Application Extension (WAE) to represent Web pages and other architecturally significant elements in the model.
Figure 3: System modeling from different views
A model is a simplification of reality that completely describes a system from different views (Figure 3).
The models are used for better understanding of the system we are modeling. They are important because they
help the development team and the customer to communicate.
As mentioned above, the software is rebuilt in each iteration. As soon as the software is built, testing is
executed. Regression testing ensures that new defects are not introduced in new iterations. Finally, change
management is a systematic approach to managing changes in the requirements.
4. CUSTOMIZING THE RUP FOR SMALL WEB INFORMATION SYSTEM
In our example we introduce a small web information system for an outsourcing organization. The
outsourcing service must be founded upon regular communication (feedback) between the customer and the
provider. Our system includes secure access for users, administration of customers' requirements, administration
of transactions, generation of reports, etc.
We adapt all of the processes (disciplines) in the RUP to our needs. The RUP is a very customizable
process. We have two dimensions, shown in Figure 4: the horizontal axis expresses the documentation
quantity and the vertical axis expresses the development approach. In our example, the creation of a small web
information system, we follow the slender-documentation approach and iterative development.
Our first discipline, used at the start of each iteration, is business modeling. Business modeling is used
to understand the structure and the dynamics of the organization and the current problems in the target
organization, and to identify improvement potentials. Business modeling provides a common language in which
the customer and the provider can understand each other. The business model consists of business use cases,
which describe the services our business provides to our customers. A further advantage of business modeling
for our system is the possibility of mapping the business models, together with the object models, into the
subsequent models. For our outsourcing information system, business modeling is valuable for understanding
how the new system affects the way we conduct our business. In the case of building a small project we focus
only on the parts of the business that are the most unclear or crucial, and the information is captured less
formally. There are six business modeling scenarios: organization chart, domain modeling, one business – many
systems, generic business system, new business, and revamp. The organization chart scenario is suitable for our
purposes. In this scenario we build a simple chart of the organization and its processes so that we get a good
understanding of the requirements of the information system we are building. Business modeling is a part of the
software–engineering project, primarily performed during the Inception phase.
Figure 4: The possibilities customizing the RUP for projects
The business object model mentioned above captures the responsibilities, organizational units and
items within the organization, and how they are related to each other. The business object model gives insight
into the organization. In our example project we have a simplified business use case model capturing the
different requirements (orders) and the confirmed protocols for the services performed. Further, we have a
couple of business class diagrams and an activity diagram that implement the object model. The first class
diagram represents the individual relations between entities and employees (their responsibilities), and the
second represents the associations between workers. The single activity diagram describes the flow of a
customer's requirement until the filling in and confirmation of the protocol is done.
In the requirements discipline the most important models of the system arise: the use case models. They
are created from the vision document, the supplementary specification and partly from the business model. The
use case models are used to establish and maintain agreement between the customers and the providers on what
should be done and why, in order to give system developers a better understanding of the system requirements,
to define the boundaries of the system, to provide a basis for planning the technical contents of iterations, to
provide a basis for estimating the cost and time to develop the system, and to define a user interface for the
system. In our example, we collect the customer's requests and develop a brief vision document containing a set
of key customer needs and the high–level features of the system. Further, we compile a risk list enumerating the
hazards that could arise. The most important requirements to include in the vision document are decided on the
basis of an analysis of the cost of implementing the desired features. Then we translate these features into
detailed software requirements for designing and building the system and for identifying test cases for repeated
testing of the system behavior. These detailed and most essential requirements are captured in the use–case
model and other supplementary specifications. While we are detailing these requirements, we will certainly find
flaws that must be repaired by getting clarifications from our customer. The same person can take the role of
analyst and developer, which means that the person who describes a use case will also implement it; so the
second possibility for capturing requirements is to spend less time on documenting the detailed requirements
and come back to them as we implement the use case. Then we design the user interface by using a prototyping
tool. The main inputs to the design of the user interface are the vision document, the use–case model and the
supplementary specifications. Finally, we must build a glossary that defines a common terminology to be used
across the project.
Next there is the analysis and design discipline, whose purpose is to translate the requirements into a specification that describes how to implement the system, that is, into a set of classes and subsystems. This transformation is driven by the use cases, so it ignores many of the nonfunctional requirements and the constraints of the implementation environment. The design of the system is captured in a design model. The design model is the primary artifact of the analysis, consisting of collaborations of classes providing the system
behavior. For transparency, in our project we can have classes aggregated into packages and subsystems; e.g. the system consists of subsystems for administering users, administering requirements and administering protocols. We elaborate the design of the system to the point when it can be implemented through a direct and systematic transformation of the design into code. We omit the analysis model, which is an abstraction, or generalization, of the design representing only the most important details of how the system works. If we used the analysis model, we would have to ensure that the analysis and design models remain consistent. For that reason the analysis model is superfluous for our small project. The set of operations performed by classes, subsystems and components is called an interface. We use interfaces because they improve the flexibility of designs by reducing dependencies between parts of the system and therefore make them easier to change. Interfaces are used by components that are dependent only on the interfaces of other components.
The purpose of the implementation discipline is to implement classes and objects in terms of components and to test the developed components. The tests are limited to unit tests of individual components. During iterative software development, operational versions of the system or of its parts are created which represent a subset of the capabilities to be provided in the final product, so-called builds. They represent ongoing attempts to demonstrate functionality. After we develop some subsystem of the project, we integrate it into the complete system. The RUP approach is incremental software integration. For example, while we are gradually developing subsystems, e.g. administration of customers or of requirements, we integrate them into the whole system. The benefit of this approach is that individual components are tested properly, in order to locate faults. To reduce uncertainty affecting the stability, performance, project commitment, funding, understanding of requirements, usability, and ultimately the look and feel of the product, we use prototypes. In the RUP we can optionally use two kinds of prototypes: a behavioral prototype and an evolutionary prototype. The behavioral prototype focuses on exploring specific behavior of the system, whereas the evolutionary prototype evolves to become the final system. The evolutionary prototype is used in our small project. The key artifacts of an implementation are an implementation subsystem, a component, and an integration build plan. The implementation subsystem is a collection of components and other implementation subsystems which is used to structure the implementation model by dividing it into smaller parts. The component is a piece of software code (e.g. source or binary). The integration build document defines the sequence in which the components and subsystems should be implemented. In our small project we create a component diagram. The component diagram contains components which are connected to each other.
The purpose of the test discipline is to assess product quality and to provide feedback. The test discipline is carried out throughout the lifecycle. It is used primarily for verifying that all requirements have been implemented correctly and without defects. The simplified test model contains test procedures, test cases and notes. A test case is a set of execution conditions and expected test results developed for a specific test. The test cases can be derived from use cases. A test procedure is the set of detailed instructions for the setup, execution, and evaluation of test results for test cases. Finally, the notes are textual information describing constraints or additional information. We can also use the prototype as a tool for testing.
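As an illustration, a test case derived from a use case can also be captured directly as an automated test. The sketch below is hypothetical: the `ProtocolService` class, its methods and the use-case name are invented stand-ins for the protocol-administration subsystem mentioned above, not part of the actual project:

```python
# Hypothetical sketch: a test case derived from the "confirm protocol" use case.
# The class under test and its API are illustrative assumptions only.

class ProtocolService:
    """Minimal stand-in for the subsystem administering protocols."""
    def __init__(self):
        self._protocols = {}

    def create_protocol(self, order_id):
        # Execution condition: a draft protocol exists for a customer's order.
        self._protocols[order_id] = "draft"
        return order_id

    def confirm(self, order_id):
        if self._protocols.get(order_id) != "draft":
            raise ValueError("only draft protocols can be confirmed")
        self._protocols[order_id] = "confirmed"

    def status(self, order_id):
        return self._protocols[order_id]

def test_confirming_a_draft_protocol():
    svc = ProtocolService()
    svc.create_protocol("order-1")
    svc.confirm("order-1")
    # Expected result: after confirmation the protocol status is "confirmed".
    assert svc.status("order-1") == "confirmed"

test_confirming_a_draft_protocol()
print("test passed")
```

The execution condition and the expected result map directly onto the steps of the underlying use case, so such tests can be rerun after every build.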
The deployment discipline guarantees that the finished software is delivered to the customer. It includes, above all, testing the software in its final operational environment, distributing the software and installing it. There are several kinds of deployment processes: deployment of software in custom-built systems, deployment of shrink-wrapped software, and deployment of software that is downloadable over the Internet. Since we are developing an information system, we use deployment of software in custom-built systems. The key artifact of the deployment discipline is a deployment diagram. The deployment diagram depicts the implementation and environment of a system.
5. CONCLUSION
This paper outlined a way of customizing the RUP for small web information systems. It described the most significant steps used in this method, omitting the less important parts and using a less formal approach that is not burdened by an enormous quantity of documentation.
REFERENCES
[1] BITTNER, K., SPENCE, I.: Use Case Modeling. 1st ed., Addison–Wesley, 2002.
[2] BOGGS, W., BOGGS, M.: UML with Rational Rose 2002. 1st ed., SYBEX Inc., 2002.
[3] CONALLEN, J.: Building Web Applications with UML. 1st ed., Addison–Wesley, 2002.
[4] KROLL, P., KRUCHTEN, P.: The Rational Unified Process Made Easy: A Practitioner’s Guide to the RUP.
1st ed., Addison–Wesley, 2003.
[5] KRUCHTEN, P.: The Rational Unified Process: An Introduction. 2nd ed., Addison–Wesley, 2000.
[6] KULAK, D., GUINEY, E.: Use Cases: Requirements in Context. 2nd ed., Addison–Wesley, 2003.
[7] LEFFINGWELL, D., WIDRIG, D.: Managing Software Requirements: A Use Case Approach. 2nd ed.,
Addison–Wesley, 2003.
[8] MELLOR, S. J., BALCER, M. J.: Executable UML: A Foundation for Model–Driven Architecture. 1st ed.,
Addison–Wesley, 2002.
[9] SINAN, S. A.: Learning UML. 1st ed., O’Reilly & Associates, Inc., 2003.
SHAPE OPTIMISATION OF THE FERROMAGNETIC CORE
OF FLUXSET MAGNETOMETER
Gábor Vértesy1, J. Pávó2
1
Research Institute for Technical Physics and Materials Science
Hungarian Academy of Sciences
H-1121 Budapest, Konkoly Thege út 29-33, Hungary
phone: +361-3922677, fax: +361-3922226, e-mail: [email protected]
2
Department of Broadband Infocommunications and Electromagnetic Theory
Budapest University of Technology and Economics
H-1521 Budapest, Egry József u. 18., Hungary
phone: +361-4632913, fax +361-4633189, e-mail: [email protected]
Abstract: Optimisation of the shape of the sensing core of a high-sensitivity magnetic field sensor (Fluxset sensor) is presented. An optimisation method is discussed that can be used for the enhancement of the sensitivity of the Fluxset sensor by varying the shape of the ferromagnetic core in the sensor. The task is solved by electromagnetic modelling of ferromagnetic conductor materials in a non-conducting medium. The presence of the ferromagnetic object is represented by volumetric and surface magnetic currents flowing in the homogeneous medium. The actual distribution of these magnetic currents is determined by the solution of the integral equation derived in the paper. Numerical examples are presented to demonstrate the results obtained for the modelling of ferromagnetic films.
Keywords: Magnetic sensors, Numerical electromagnetic field modelling
INTRODUCTION
Magnetic field sensors play an important and continuously increasing role in many fields of science and modern technology. Fluxgate-type sensors are solid-state devices for measuring the absolute strength of a surrounding magnetic field or the difference in the field strength between two different points within a magnetic field. Their measuring range and resolution fall just within the gap between inexpensive sensors, such as magnetoresistive or Hall-type sensors, and very expensive magnetometers based on quantum effects, such as SQUIDs.
A new type of magnetic field sensor (Fluxset), which belongs to the family of fluxgate sensors, has been developed recently [1] for measuring DC and AC (up to 100 kHz frequency) low-level magnetic fields (less than 1 nT) with high accuracy. Its principle of operation is close to the so-called pulse-position type fluxgate magnetometers [2]. The particular advantage of these magnetometers is an output signal that can be simply converted into a binary signal. The measurement of a small magnetic field is reduced to a high-accuracy time measurement through the displacement of the magnetization curve produced by the field. The Fluxset sensor has excellent sensor properties because of its principle of operation: good linearity, good stability, calibration accuracy, a high signal-to-noise ratio, high sensitivity of magnetic field measurement, ideal spatial resolution and a wide operating temperature range. The probes are suitable for axial measurement of the magnetic field; the transverse sensitivity is negligible. The sensor is small, versatile, inexpensive and sufficiently robust to meet the demands of industry, and it requires no microelectronic technology. It can be easily integrated due to the simple construction of both the probe and the electronics.
One of the most important factors determining the sensitivity of the device is the sensing element. The optimisation of the shape of this ferromagnetic core might help to improve the performance of the sensor. In order to find the optimal shape of the core, exact modelling of the electromagnetic field, i.e. rigorous solution of
the corresponding direct problem, is required. In this paper we present a method which can be used for the shape optimisation of the ferromagnetic core of the Fluxset sensor. We use a recently developed method for the modelling of the electromagnetic field in the presence of a ferromagnetic conductor [3]. The method is based on the solution of an integral equation. Since the size of the ferromagnetic core in the Fluxset sensor is small, the developed method is extremely efficient, because only the discretisation of the volume occupied by the ferromagnetic object is required. The ferromagnetic material is assumed to be linear. The presence of the ferromagnetic object is represented by volumetric and surface magnetic currents flowing in the homogeneous medium. The actual distribution of these magnetic currents is determined by the solution of the integral equation derived in this paper. The main advantage of the presented model is that the numerical solution of the integral equation requires only the discretisation of the volume occupied by the ferromagnetic object.
FLUXSET SENSOR
The schematic drawing of a Fluxset sensor is shown in Fig. 1. The sensor is made of two solenoids wound one over the other. The inner and outer solenoids are called the driving and pick-up solenoids, respectively. The sensing element of the probe is an amorphous alloy ribbon with high initial permeability and low saturation. Manufacturing techniques and properties of amorphous magnets have been well known for a long time, and a number of monographs devoted to these aspects have been published (see, e.g., [4-6]). The idealized magnetization curve of the core material is shown in Fig. 2. The actual BH curve and other parameters of the material strongly depend on the various mechanical, heat, chemical, etc. treatments of the raw material carried out to improve the magnetic properties of the core [7]. The approximate length and diameter of the different realizations of the sensor are between 5 and 15 mm and around 2 mm, respectively. The thickness of the ribbon core is about 25 µm.
Fig. 1: Schematic drawing of a Fluxset sensor (the figure shows the pick-up coil with its output voltage vs(t), the driving coil with its excitation current id(t), the glass sensor support, and the ferromagnetic core)
The core material is periodically magnetized to saturation in both directions by the triangular current excitation, id(t), of the driving coil. Considering the idealized magnetization curve of the core material shown in Fig. 2, we can see that the induced voltage impulse in the pick-up solenoid, vs(t), is almost zero when the core is saturated; on the other hand, this induced voltage has a relatively large value (proportional to the time derivative of the driving current) while the driving magnetic field changes its direction (i.e. while the core material is magnetized in the linear range). If an external magnetic field (i.e. the field to be measured) is superimposed on the periodic driving magnetic field, the time spent in saturation in one direction (e.g. when B = Bs) is longer than the time spent in saturation in the other direction (e.g. when B = -Bs); consequently the impulse, vs(t), is shifted. The time shift of the impulse, vs(t), can be accurately measured, and as a result the external field can be determined. One can see that the measured magnetic field is actually the field at the sensor core. Since the volume of the sensor core is very small, we assume that it does not significantly disturb the magnetic field to be measured; consequently the sensor measures approximately the magnetic field that would be detected around its core if we neglect the disturbing effect of the presence of the sensor core.
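The pulse-position principle described above can be sketched numerically. The sketch below is illustrative only: it uses an idealized piecewise-linear BH curve and arbitrary unit parameters (the drive amplitude, permeability and saturation level are assumed values, not those of the actual sensor), and it locates the pick-up pulse by the centroid of |vs(t)| over a half period:

```python
import numpy as np

def B_of_H(H, mu=1.0, Bs=1.0):
    # Idealized BH curve of Fig. 2: linear below saturation, clipped at +/-Bs.
    return np.clip(mu * H, -Bs, Bs)

def H_drive(t, T=1.0, H0=2.0):
    # Symmetric triangular driving field of period T and amplitude H0.
    return H0 * (4.0 * np.abs(t / T - np.floor(t / T + 0.5)) - 1.0)

T = 1.0
t = np.linspace(0.0, T, 100001)
for H_ext in (0.0, 0.05):                 # no external field / small superimposed field
    B = B_of_H(H_drive(t) + H_ext)
    vs = -np.gradient(B, t)               # pick-up voltage ~ -dB/dt (per turn and area)
    half = t < T / 2                      # one direction change per half period
    w = np.abs(vs[half])
    tc = np.sum(t[half] * w) / np.sum(w)  # centroid of the pulse = its time position
    print(f"H_ext = {H_ext:+.2f}  ->  pulse centre at t = {tc:.5f}")
```

With no external field the pulse sits at the centre of the rising half period; superimposing a small external field shifts the pulse in time, which is exactly the quantity the Fluxset electronics converts into a field reading.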
Fig. 2: Idealized BH curve of the core material
The accuracy of the measurement basically depends on the shape of the pulse, vs(t), detected on the sensor coil. Since the pulse shape is formed by the fast change of the magnetic flux density inside the core when the exciting field changes its sign, the time dependence of the pulse can be calculated by assuming a linear ferromagnetic material whose permeability is obtained as the gradient of the BH curve. Of course, the results of such calculations are useful only if the core material is not saturated. It is easy to see that the sharper the detected voltage signal, the easier it is to detect small time differences, and consequently the more accurately the magnetic field is measured. Based on this consideration we may conclude that the more uniform the magnetic field in the core (i.e. the different regions of the core become saturated at the same time instant), the higher the accuracy that can be achieved in the measurement. The requirement of uniformity of the magnetic field in the core might be fulfilled by changing the shape of the sensor core. Since our goal is to find a configuration where the whole volume of the sensor core becomes saturated at the same time, the assumption of a linear ferromagnetic material is acceptable.
CALCULATION OF THE ELECTROMAGNETIC FIELD WITH THE PRESENCE OF THE FERROMAGNETIC CONDUCTOR
Integral equation describing the ferromagnetic material
Assume that a linear ferromagnetic conductor occupies the volume region V1 in the otherwise
homogeneous non-conducting medium. The schematic drawing of the studied geometrical configuration is
shown in Fig. 3.
Fig. 3: Ferromagnetic conductor in a homogeneous non-conducting medium.
The ferromagnetic object is surrounded by air, and its conductivity and permeability are $\sigma_1$ and $\mu_1$, respectively. Assuming that the electromagnetic field varies in time as the real part of $\exp(j\omega t)$, the Maxwell equations are written in the following form:
$\nabla \times H(r) = [\sigma(r) + j\omega\varepsilon_0]\,E(r),$  (1)

$\nabla \times E(r) = -j\omega\mu(r)\,H(r),$  (2)
INTERNATIONAL CONFERENCE ON SOFTCOMPUTING APPLIED IN COMPUTER AND ECONOMIC ENVIRONMENTS ICSC 2004
EPI Kunovice, Czech Republic, January 29-30, 2004
223
where

$\sigma(r) = \begin{cases} \sigma_1, & r \in V_1, \\ 0, & \text{otherwise}, \end{cases}$  (3)

$\mu(r) = \begin{cases} \mu_1, & r \in V_1, \\ \mu_0, & \text{otherwise}. \end{cases}$  (4)
Taking the curl of (1) and substituting (2) into this equation we obtain

$\nabla \times \nabla \times H(r) - \omega^2\mu_0\varepsilon_0\,H(r) = \left[\omega^2\varepsilon_0(\mu(r)-\mu_0) - j\omega\mu(r)\sigma(r)\right] H(r) + \nabla\sigma(r) \times E(r).$  (5)
Assuming (1), E(r) can be written as

$E(r) = \frac{1}{\sigma(r) + j\omega\varepsilon_0}\,\nabla \times H(r).$  (6)
Using (6), E(r) can be eliminated from (5). As a result we obtain

$\nabla \times \nabla \times H(r) - \omega^2\mu_0\varepsilon_0\,H(r) = \left[\omega^2\varepsilon_0(\mu(r)-\mu_0) - j\omega\mu(r)\sigma(r)\right] H(r) + \delta(r-r_0)\,\frac{\sigma_1}{\sigma(r)+j\omega\varepsilon_0}\,\nabla \times H(r) \times \hat{n},$  (7)
where $r_0$ is the co-ordinate vector pointing to the surface $S_1$ of the volume $V_1$, $\hat{n}$ is the outward normal unit vector of surface $S_1$ (see Fig. 3) and $\delta(r)$ is Dirac's delta function. Note that the first term on the right-hand side of (7) is zero outside of $V_1$ and the second term is nonzero only on the surface $S_1$. Based on (7), the difference between the electromagnetic field calculated with and without the presence of the ferromagnetic object may be regarded as the field generated by a secondary source concentrated in the volume $V_1$ inside a homogeneous non-conducting medium (in the present case, air). This secondary source might be a volumetric magnetic current flowing in $V_1$ and a magnetic surface current concentrated on the surface $S_1$ (see Fig. 3). In order to find the actual values of these secondary sources, the Maxwell equations in a homogeneous non-conducting medium, with the assumption of the magnetic current, $M(r)$, and magnetic surface current, $M_s(r)$, are written as
$\nabla \times H(r) = j\omega\varepsilon_0\,E(r),$  (8)

$\nabla \times E(r) = -j\omega\mu_0\,H(r) + M(r) + \delta(r - r_0)\,M_s(r).$  (9)
Note that $M(r) = 0$ outside of $V_1$ and $M_s(r) \neq 0$ only on $S_1$. Taking the curl of (8) and substituting (9) into it we get:

$\nabla \times \nabla \times H(r) - \omega^2\mu_0\varepsilon_0\,H(r) = j\omega\varepsilon_0\,M(r) + \delta(r - r_0)\,j\omega\varepsilon_0\,M_s(r).$  (10)
Comparison of (7) and (10) shows that the effect of the presence of the ferromagnetic conductor can be modelled by placing the following magnetic current and magnetic surface current into the homogeneous non-conducting medium:
$M(r) = \nu(r)\,H(r); \quad \nu(r) = \begin{cases} \dfrac{\omega^2\varepsilon_0(\mu_1-\mu_0) - j\omega\sigma_1\mu_1}{j\omega\varepsilon_0}, & r \in V_1, \\ 0, & \text{otherwise}, \end{cases}$  (11)

$M_s(r) = \vartheta(r)\,\nabla \times M(r) \times \hat{n}; \quad \vartheta(r) = \begin{cases} \dfrac{\sigma_1}{\left[\omega^2\varepsilon_0(\mu_1-\mu_0) - j\omega\sigma_1\mu_1\right](\sigma_1 + j\omega\varepsilon_0)}, & r \in S_1, \\ 0, & \text{otherwise}. \end{cases}$  (12)
The magnetic field may be written as the sum of an external field, $H_0(r)$, and the field generated by the magnetic currents, $H_f(r)$,

$H(r) = H_0(r) + H_f(r).$  (13)
Here H0 represents the field induced by sources outside of the examined homogeneous medium (consequently it
satisfies the homogeneous wave equation), while Hf represents the magnetic field due to the presence of the
ferromagnetic conducting object, i.e. due to the secondary sources in (7) or (10). Hf can be expressed with the
help of dyadic Green's functions as [8],
$H_f(r) = j\omega\varepsilon_0 \int_{V_1} G(r|r') \cdot M(r')\,dv' + j\omega\varepsilon_0 \int_{S_1} G(r|r') \cdot M_s(r')\,ds',$  (14)
where $G(r|r')$ is the dyadic Green's function that transforms the magnetic excitation into the magnetic field. By substituting (14) into (13) and multiplying the equation by $\nu(r)$ (see (11)), the following integral equation is obtained:
$M(r) = M_0(r) + j\omega\varepsilon_0\,\nu(r) \int_{V_1} G(r|r') \cdot M(r')\,dv' + j\omega\varepsilon_0\,\nu(r) \int_{S_1} G(r|r') \cdot \left[\vartheta(r')\,\nabla \times M(r') \times \hat{n}\right] ds',$  (15)
where

$M_0(r) = \nu(r)\,H_0(r)$  (16)
is assumed to be known (in most practical problems M0(r) represents the excitation). By solving (15), M(r) can
be calculated, consequently the electromagnetic field of the considered configuration is obtained. Note that only
the volume region V1 is to be discretised for the numerical calculation of the field.
NUMERICAL SOLUTION OF THE INTEGRAL EQUATION
In a Cartesian co-ordinate system, the discrete approximation of (15) is found by subdividing the region of space occupied by the ferromagnetic object, $V_1$, into a regular lattice of N cells, each cell being a rectangular parallelepiped. The distances between neighbouring points in the x, y and z co-ordinate directions are $\Delta x$, $\Delta y$ and $\Delta z$, respectively. The magnetic current is approximated as
$M(r) \approx \sum_{n=1}^{N} M^n a_n(r) = \sum_{n=1}^{N} \left(M_x^n \hat{x} + M_y^n \hat{y} + M_z^n \hat{z}\right) a_n(r); \quad r \in V_1,$  (17)
where r=x x̂ +y ŷ +z ẑ ( x̂ , ŷ and ẑ are the unit vectors of the co-ordinate directions). The approximating
function, an, is chosen to be linear, i.e.,
æ x - xn y - yn z - z n ö
an ( x, y, z ) = f çç
,
,
÷;n = 1,2, K , N ;
Dy
Dz ÷ø
è Dx
.
ì1- | x | - | y | - | z | + | xy | + | xz | + | yz | - | xyz |, - 1 < x, y, z < 1
f ( x, y, z ) = í
0,
otherwise
î
After testing the discrete approximation of (15) by the testing function,
(18)
(19)
t m ( x, y , z ) = am ( x, y , z );m = 1, 2, K, N ,
n
a system of 3N linear equations is obtained. Solving this equation system, the unknown coefficients Mi (i=x,y,z;
n=1,2,...,N) can be calculated.
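The basis function (19) factors as $(1-|x|)(1-|y|)(1-|z|)$ on its support, i.e. it is the standard trilinear "tent" function. A small sketch (Python, with illustrative lattice values) shows this and checks that the basis functions on a regular lattice sum to one at interior points, so the expansion (17) can reproduce a constant current distribution exactly:

```python
import numpy as np

def f(x, y, z):
    # Eq. (19); on (-1,1)^3 it equals (1-|x|)(1-|y|)(1-|z|) after factoring.
    inside = (np.abs(x) < 1) & (np.abs(y) < 1) & (np.abs(z) < 1)
    return np.where(inside, (1 - np.abs(x)) * (1 - np.abs(y)) * (1 - np.abs(z)), 0.0)

def a_n(x, y, z, xn, yn, zn, dx, dy, dz):
    # Eq. (18): the basis function centred at the lattice point (xn, yn, zn).
    return f((x - xn) / dx, (y - yn) / dy, (z - zn) / dz)

dx = dy = dz = 0.5                      # illustrative cell sizes (Delta x, y, z)
centres = [(i * dx, j * dy, k * dz)
           for i in range(5) for j in range(5) for k in range(5)]
p = (1.1, 0.9, 1.3)                     # a point in the interior of the lattice
total = sum(a_n(*p, xn, yn, zn, dx, dy, dz) for xn, yn, zn in centres)
print(round(float(total), 10))          # -> 1.0 (partition of unity)
```

Because each basis function overlaps only its immediate neighbours, the resulting system matrix couples each cell to at most 26 neighbouring cells through the integrals of type (20).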
The most serious difficulties in the solution of the integral equation arise when the coefficient matrix is to be calculated. The coefficients are obtained as linear combinations of the following type of integrals,

$\int_{V_1} \left[\, \int_{V_1} G(r|r') \cdot a_n(r')\,\hat{\xi}\,dv' \right] \cdot t_m(r)\,\hat{\zeta}\,dv; \quad m, n = 1, 2, \ldots, N; \quad \hat{\xi}, \hat{\zeta} = \hat{x}, \hat{y}, \hat{z}.$  (20)
Using considerations similar to those published in [9], the spatial Fourier transform of the integral in the square brackets can be evaluated analytically. This analytical formula is particularly useful because the kernel of the integral (i.e. the dyadic Green's function) is singular. This singularity might cause serious problems in the numerical evaluation of the integral. Having the mentioned analytical formula, a fast and numerically stable solution of the integral equation is obtained.
MODELLING OF FERROMAGNETIC FILMS; NUMERICAL EXAMPLES
For modelling of the ferromagnetic core of Fluxset sensor, the above introduced integral equations
should be applied for a high permeability ferromagnetic film in homogeneous magnetic field. In this section we
deal with linear ferromagnetic object that thickness is very small compared to its other dimensions. The
permeability of the material is assumed to be high, on the other hand the electric conductivity of the film is
sought not to be considerably high. Based on these assumptions, the ferromagnetic film is modelled as an
infinitesimally thin film, i.e. a mathematical surface where the tangential component of the magnetic field is
zero. The prescribed boundary conditions are fulfilled by placing tangential magnetic surface current, Ms(r) on
the surface representing the ferromagnetic film. The eddy currents flowing inside the film are neglected in this
model.
As an example, assume that a rectangular ferromagnetic film is placed in the xy plane. The sizes of the film in the x and y directions are 5 mm and 1 mm, respectively. The external magnetic field is x-directed, its time variation is assumed to be sinusoidal with a frequency of 100 kHz, and the magnitude of the magnetic field is 1 A/m. The x component of the magnetic surface current representing the ferromagnetic material in the case of the given external field is plotted in Fig. 4. The y component of the magnetic surface current (not shown here) is considerably smaller than its x component. Note that the magnetic current has a $\pi/2$ phase shift with respect to the external magnetic field. In Fig. 5, the z component of the magnetic field due to the presence of the ferromagnetic film is plotted above the surface representing the film (note that this field is in phase with the external field). Knowing the x component of the magnetic surface current and using (9), the induced voltage can easily be calculated in the solenoid around the ferromagnetic core. For this calculation we assume that the voltage induced by the magnetic field around the core is negligible compared to the induction due to the magnetic currents.
Fig. 4: x component of the magnetic surface current, Msx [V/m].
Fig. 5: z component of the magnetic field above the ferromagnetic film, Hz [A/m].
OPTIMISATION OF THE SHAPE OF THE CORE
Based on the outlined method for the calculation of the magnetisation of the sensor core, the optimal core shape is found by a simulated annealing optimisation procedure. The objective, expressing the requirement of smoothness of the magnetisation of the core material, is the minimisation of the expression

$\frac{1}{N} \sum_{n=1}^{N} \left| \overline{H}_t - H_t^n \right|,$  (21)

where $\overline{H}_t$ denotes the average of the absolute value of the tangential magnetic field on the surface, S, and $H_t^n$ is the absolute value of the tangential component of the magnetic field at the n-th point. Several constraints are posed on the optimisation. One of the constraints comes from the symmetry of the sensor arrangement; consequently the core is designed to be symmetric as well. The outer dimension of the sensor defines the maximum size of the core: it cannot be bigger than a 5 mm × 0.8 mm rectangle. To obtain a sensor signal with sufficient amplitude, the minimal size of the core is also fixed.
Due to the symmetry and the special elongated shape of the sensor, the shape of the core is described with a small number of parameters; accordingly, the number of degrees of freedom of the optimisation is 4. The particular choice of the simulated annealing optimisation procedure [10] is supported by the facts that this algorithm is very easy to apply and that it almost always works if it is properly implemented. On the other hand, the efficiency of this method is not particularly good, because of the high number of function calls required to find the optimum. However, the design of a Fluxset sensor is not the kind of problem that has to be solved very frequently, which is why the robustness of the optimisation is the primary concern and efficiency is not crucially important.
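The simulated annealing loop itself is simple; the sketch below shows its structure for a 4-parameter shape vector. The objective here is a mock stand-in for (21) (in the real procedure each evaluation would require solving the integral equation (15) for the candidate core shape), and the cooling schedule, bounds and step sizes are arbitrary illustrative choices:

```python
import math
import random

def roughness(params):
    # Mock objective in the spirit of eq. (21): mean absolute deviation of a
    # synthetic "tangential field" from its average over N surface points.
    a, b, c, d = params
    N = 50
    Ht = [abs(math.sin(a + b * k / N) + c * (k / N) ** 2 + d) for k in range(N)]
    mean = sum(Ht) / N
    return sum(abs(mean - h) for h in Ht) / N

def simulated_annealing(objective, x0, bounds, steps=5000, t0=1.0, seed=0):
    rng = random.Random(seed)
    x, fx = list(x0), objective(x0)
    best, fbest = list(x), fx
    for k in range(steps):
        T = t0 * (1.0 - k / steps) + 1e-9           # linear cooling schedule
        i = rng.randrange(len(x))                    # perturb one parameter
        cand = list(x)
        lo, hi = bounds[i]
        cand[i] = min(hi, max(lo, cand[i] + rng.gauss(0.0, 0.1 * (hi - lo))))
        fc = objective(cand)
        # Metropolis rule: accept improvements always, worse moves with prob. exp(-df/T)
        if fc < fx or rng.random() < math.exp((fx - fc) / T):
            x, fx = cand, fc
            if fc < fbest:
                best, fbest = list(cand), fc
    return best, fbest

bounds = [(0.0, 2.0)] * 4            # 4 shape parameters, as in the paper
x0 = [1.0, 1.0, 1.0, 1.0]
best, fbest = simulated_annealing(roughness, x0, bounds)
print(fbest <= roughness(x0))        # -> True: never worse than the start
```

The acceptance of occasional uphill moves at high temperature is what makes the method robust against local minima, at the price of the many objective evaluations noted above.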
As a demonstration of the result of the optimisation, the tangential magnetic field component parallel to the axis of the sensor is plotted in Fig. 6 for the optimum-shape sensor core. As the final result, an ellipsoidal shape of the sensor core was found to be the optimal one. Note that the tangential magnetic field perpendicular to the axis of the core is negligible compared to the one plotted in Fig. 6. In the presented plot the function values at the points outside the core surface are set to zero; these values have no meaning.
Fig. 6: x component of the tangential magnetic field for the optimum shape core
(in phase with the external magnetic field)
CONCLUSIONS
An optimisation method targeting the enhancement of the sensitivity of the Fluxset sensor by varying the shape of the ferromagnetic core of the sensor has been discussed. As part of the optimisation method, the solution of the related direct problem, i.e. the calculation of the magnetisation of a ferromagnetic conducting thin film due to a given external field, has been outlined. An integral equation has been derived for the modelling of the electromagnetic field of a ferromagnetic conductor in a homogeneous non-conducting medium. The presence of the ferromagnetic object is represented by secondary sources, namely by volumetric and surface magnetic currents concentrated in the region occupied by the ferromagnetic object. The numerical solution of the derived integral equations has been outlined, using analytical expressions for the spatial Fourier transform of the singular kernel of the integral equations. Numerical examples have been presented to demonstrate the results of the optimisation.
ACKNOWLEDGMENTS
This work was supported by the Hungarian Scientific Research Fund (T035264).
REFERENCES
[1] VÉRTESY, G., GASPARICS, A., SZÖLLŐSY, J.: High sensitivity magnetic field sensor, Sensors and
Actuators A, 85 (2000) 202-208
[2] RIPKA, P.: Review of fluxgate sensors, Sensors and Actuators A, 33 (1992) 129
[3] PÁVÓ, J., SEBESTYÉN, I., VÉRTESY, G., DARÓCZI, Cs.S.: Electromagnetic field modelling of the
ferromagnetic core of Fluxset type sensors, in Studies in Applied Electromagnetics and Mechanics 11,
Applied Electromagnetics and Computational Technologies, eds.: H. Tsuboi and I. Sebestyén, IOS Press,
Amsterdam, pp.172-181, 1997
[4] LUBORSKY, F.E.: Amorphous Metallic Alloys, Butterworths, London 1983
[5] MOORJANI, K., COEY, J.M.D.: Magnetic Glasses, Elsevier, Amsterdam 1984
[6] LACHOWICZ, H.K.: Magnetic materials – progress and challenges, Journal of Technical Physics, 42 (2001) 127
[7] VÉRTESY, G., GASPARICS, A., VÉRTESY, Z.: Improving the sensitivity of Fluxset magnetometer by
processing of the sensor core, J. Magn. Magn. Mat., 196-197 (1999) 333
[8] TAI, C.T.: Dyadic Green's Functions in Electromagnetic Theory, Intext, Scranton, 1971
[9] PÁVÓ, J. and MIYA, K.: Reconstruction of crack shape by optimization using eddy current field
measurement, IEEE Trans. Magn., 30 (1994) 3407
[10] PRESS, W.H. et al.: Numerical Recipes, Cambridge University Press, Cambridge, 1986
EVALUATION OF DEFECTS OF PIPELINE ISOLATION DETECTED BY THE PEARSON METHOD
Dalibor Bartoněk1), Imrich Rukovanský2)
1) Institute of Geodesy, Faculty of Civil Engineering, Brno University of Technology, Veveří 95, 662 37 Brno, Czech Republic. Tel. 54114 7204, Fax: 54114 7218, E-mail: [email protected]
2) European Polytechnical Institute, Kunovice, Osvobození 699, Czech Republic. Tel. 572 548 035, Fax: 572 549 018, E-mail: [email protected]
Abstract: This article deals with the PEARSON software system for evaluating defects of pipeline
insulation found by measurement with the Pearson method. It describes a new way to apply this, by
now classical, method of detecting insulation defects, including special software which automatically
processes the data with regard to the individual defects of pipeline insulation on given routes. The
conclusion sums up practical experience, complemented with a graphical output from the software.
Keywords: Defect, pipeline insulation, Pearson method, locator, organiser, measuring evaluation.
1. INTRODUCTION
The extremely wide steel network of gas pipelines and product lines in the Czech Republic is usually
placed in soil, where exposure to the aggressive medium of the soil causes material corrosion. This negative
phenomenon can be avoided if the pipeline is protected:
1. actively, by a system of anticorrosive devices spread over the territory, working on an electrochemical principle [1],
2. passively, by insulation of the pipeline, which prevents the access of H2O + O2 to the steel mantle and thus protects it from corrosion.
Even the best insulation has a limited durability; therefore its state must be checked not only when
the pipeline is installed, but also at certain time intervals. There are many check-up methods; some of them can
be found in ČSN 03 8375 (Czech standard). Except for visual methods, during which it is necessary to
uncover the pipeline, the remaining ways of checking the quality of insulation can be divided into two
groups:
1. methods finding the individual insulation defects,
2. methods finding the average quality of insulation in a given section of pipeline.
The Pearson method, which we discuss here, belongs to the first group.
2. PRINCIPLE OF PEARSON METHOD
When this method is used, alternating voltage from a tone generator is fed into the pipeline at a
certain point (check-up output or connecting object). At defective places of the pipeline insulation the signal
starts spreading into the environment, where it causes a voltage drop. To detect defects on the pipeline, a
locator and a signal receiver are necessary. The insulation defects are searched for by two persons wearing
special shoes with metal tips ensuring galvanic contact with the earth. They move either one after another or
next to each other, and the receiver detects the voltage difference between the points of their positions – see
Fig. 1. If both of them are at points with undamaged insulation, no voltage difference is detected. If one of
them comes across a defective point, the receiver detects a change of signal level, which causes both an
increase of the deflection of the measuring device (receiver) and an acoustic signal. The defects are located
relative to fixed points on the pipeline (devices on the route). These points have their own positioning, which
is the distance in [km] from the beginning of the given route. The measurement requires experience, because
the device can record a change of the signal even if there is no defect, e.g. due to a change of environment.
Fig. 1. Principle of the Pearson method
3. INNOVATION OF THE METHOD
In the original variant, the Pearson method was carried out by at least three persons. Two of them worked
as explained in the previous chapter, and a third person had to write the measuring results, together with the
positioning of the measured places, into a special form. The positioning values were found by measuring
distances with a tape from the fixed points of the route so that the defects could be located. The measured data
were manually rewritten into the computer to be evaluated later. The greatest disadvantages of this variant were:
· complicated manipulation with the tape, especially in broken terrain,
· a great probability of making mistakes, both in the manual recording of the measured data and in the
course of rewriting them into the computer.
In the mid-1990s a more effective variant of pipeline defect detection therefore started to be
considered. The innovation of this method consisted in implementing a pipeline locator with a special
A-shaped detection probe, connected by a serial cable with a PSION-type organiser – see Fig. 2.
Fig. 2. Block scheme of the device configuration for measuring by the Pearson method (probe, locator, switch, PSION organiser, tip for contact with the earth)
The RD 432 PDL locator is a product of the English company Radiodetection, which also produces special
measuring probes to order. The PSION holder with the switch is produced by the company Elgas Pardubice (CZ).
Measurement in the field proceeds as follows:
A generator of an AC signal with a frequency of 8 or 16 kHz is connected to the check point of the
pipeline. In this variant the measuring is done by only two persons; one of them identifies the route with the
locator, and the other, equipped with the devices shown in Fig. 2, follows him. In one hand he holds the probe
connected with the locator and places its tips into the surface above the pipeline; with the other hand he
handles the locator with the PSION organiser. Both the PSION and the probe are connected by serial cable
with the locator. The probe receives the signal from the pipeline and sends it into the locator, where it is
processed into the form of a text message. This standard message goes through the serial interface into the
organiser where, with the help of special software, it is written into a file. There are two ways to write the
measured values:
· manually, by pressing a given button on the locator,
· automatically, at regular time intervals which can be set on the control panel of the PSION.
The positioning of the measured point is in this case determined automatically from the number of
steps walked by the workman with the measuring equipment between two writings of measured values. The
accuracy of the positioning calculation requires the handling person to keep the same length of step during
measuring. This disadvantage, which taxes the person heavily (it requires great experience), is balanced by the
fact that the unpleasant manipulation with the tape is avoided. Furthermore, the measuring can be interrupted
by the mechanical switch at any time when necessary (when tired etc.) – see Fig. 2. The software in the
organiser adds the number of steps to the initial positioning during every interval between writings of values.
The initial positioning, step length and number of steps can be set before measuring. When the worker comes
to a fixed point on the route, he writes its precise positioning into the file either manually or semi-automatically
by choosing from the menu list. This list must be transferred from the PC into the organiser before measuring
in the terrain and serves as a basis for converting the approximate positioning in steps into the precise
positioning in km. The file must contain at least two fixed points, otherwise the precise positioning cannot be
calculated. Every measuring record can be interactively complemented with a note. This is important when the
state of the terrain prevents continued measuring (natural obstacle, private allotment etc.). In this case the
conventional character “*” is written into the note of the measuring record (both at the beginning and at the
end of the affected section). The given section is then excluded from the statistical evaluation by the program
in the computer. When measuring, one of three variants of the Pearson method can be chosen:
· current, used for pipeline defect detection over distances on the order of hundreds of metres,
· voltage, detecting defects in sections on the order of tens of metres,
· CD (current direction), used for measured distances on the order of units of metres.
The current or voltage method is used if we need to test a longer section of the route to find pipeline
defects. If the amplitude of the signal in a section of a given length drops by more than 80% (current method)
or by more than 70 dB (voltage method), it is very likely that pipeline defects occur in the given section. This
section should then be measured by the CD method, which detects pipeline defects with an accuracy of 0.5 m.
4. SOFTWARE FOR PEARSON METHOD
The software for measuring and evaluation of the results, made by the SHINE firm (owner S. Sedláček), has two
parts:
1. procedures for the PSION organiser,
2. software for the PC.
Ad 1) The software for the PSION was created by the author of this paper in OPL (Organiser Programming
Language), which is available in the organiser. It comprises procedures ensuring the following functions:
· scanning the list of files with measured values,
· a procedure for creating a new file of measurements,
· data transmission between the PC and the organiser,
· deleting and editing records of measurements,
· scanning the file with measurements or the file with fixed points of the route,
· a module for help,
· a subroutine for system information,
· a procedure for measuring, the most complicated one, because among other things it analyses the
message from the locator, out of which only the most important information is chosen (to reduce
memory usage).
Ad 2) The PEARSON application for the PC processes and evaluates the values measured in the terrain. It is
composed of relatively separate modules, each of which performs a certain group of functions. It comprises the
following modules:
· a communication module ensuring data transmission between the computer and the organiser,
· a database module with basic functions for database operations with the files (databases of the
routes, fixed points and measurements),
· a module for graphical presentation of the measured results. The obtained values can be shown on the
screen, printed on the printer or drawn on the plotter – see fig. 3. On the graph we can see
considerable clusters of defects at positionings 62, 62.4 and 62.8 km,
· a module of system functions. It comprises utilities for data backup, exit into the operating system,
running of a favourite text editor for possible file editing etc.,
· the main module for the processing of measurements, containing procedures for
a) data transformation from the format of the organiser into the database of measurements,
b) calculation converting the approximate positioning of the measured place (in steps), which is found
between two fixed points, into the accurate positioning according to the interpolation equation:
xm = x1p + (xmk − x1pk) · (x2p − x1p) / (x2pk − x1pk)    (1)

where x1p and x2p are the accurate positionings of the 1st and 2nd fixed points on the route in [m],
x1pk and x2pk are the approximate positionings of the 1st and 2nd fixed points of the route in steps,
xm and xmk are the accurate positioning in [m] and the approximate positioning in steps of the measured place.
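Equation (1) is a plain linear interpolation between two fixed points, which can be sketched directly in code. The function name and the sample values below are ours, chosen only to illustrate the conversion; they do not come from the paper.

```python
def accurate_positioning(xmk, x1p, x2p, x1pk, x2pk):
    """Convert the approximate positioning xmk (in steps) of a measured place
    lying between two fixed points into the accurate positioning in metres,
    by linear interpolation between the fixed points (equation (1))."""
    return x1p + (xmk - x1pk) * (x2p - x1p) / (x2pk - x1pk)

# Illustrative values: fixed point 1 at 62 000 m (step count 0),
# fixed point 2 at 62 500 m (step count 625); measured place at step 250.
print(accurate_positioning(250, 62000.0, 62500.0, 0, 625))  # -> 62200.0
```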
c) filters serving for scanning of the database of measured values according to various criteria,
d) creation of the measuring protocol, which is composed of three parts:
· a header of the measuring, which contains common data, e.g. the name of the organisation, the
names of the persons who measured, the measuring conditions, the name and number of the route,
and the method and devices which were used,
· a detailed report about the measuring, which contains the records of the measured places of the
route,
· statistics of the measuring. In this part the defects are divided according to intensity into small,
medium and large. The criteria of the division can be set in the configuration file; in fig. 3 the
boundary values are 0.5, 3.5 and 7.5. Within the statistics, above all the following data are
processed:
· the number of defects in the given section,
· cluster analysis, i.e. the number of clusters of defects and the number of defects in every cluster.
A cluster means a group of defects of the pipeline whose mutual distance does not exceed a
certain limit (in the graph in fig. 3 the limit has a value of 5 m).
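The cluster analysis described above amounts to grouping consecutive defect positions whose gaps stay under the distance limit. A minimal sketch, with our own function name and made-up defect positions (the paper gives no data beyond the 5 m limit):

```python
def find_clusters(positions, limit=5.0):
    """Group defect positions [m] into clusters: consecutive defects whose
    mutual distance does not exceed `limit` belong to the same cluster."""
    clusters = []
    for pos in sorted(positions):
        if clusters and pos - clusters[-1][-1] <= limit:
            clusters[-1].append(pos)   # close enough: extend current cluster
        else:
            clusters.append([pos])     # gap too large: start a new cluster
    return clusters

# Illustrative defect positions in metres along the route.
defects = [62000.0, 62003.0, 62004.5, 62400.0, 62402.0, 62800.0]
clusters = find_clusters(defects)
print(len(clusters))               # -> 3 clusters
print([len(c) for c in clusters])  # -> [3, 2, 1] defects per cluster
```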
Fig. 3. Graphical output of measuring of pipeline defects detected by the Pearson method, CD variant (date 29/05/1996; route 5320400000 ZNOJMO – KVĚTNOV; method CD; axes: intensity (length) [m] vs. positioning [km]; defect classes small, medium, large)
5. CONCLUSION
The Pearson method described above is used in the following gas companies in the Czech Republic:
· Jihomoravská plynárenská a. s. Brno,
· Severomoravská plynárenská a. s. Ostrava,
· Středočeská plynárenská a. s. Praha,
· Východočeská plynárenská a. s. Hradec Králové,
· Severočeská plynárenská a. s. Ústí nad Labem
· Východoslovenské plynárny Michalovce (Slovak Republic).
Furthermore, the method is used as a part of the system of anticorrosive protection in the enterprises:
· MERO ČR, Kralupy nad Vltavou and
· ČEPRO a. s.
Practical experience has proved that this method is reliable, because in the areas with the occurrence of
great clusters of defects the insulation of the pipeline was always considerably damaged.
REFERENCES
[1] ČLUPEK, O., DAVIDOVÁ, H.: Anticorrosive protection. GAS s. r. o., Praha, 1998, 141 pp., ISBN 80-902339-8-8, in Czech.
[2] SEDLÁČEK, S., BARTONĚK, D.: System PEARSON. User manual, SHINE Brno, 1996, in Czech.





























































            




                   


 
                   


                  












                
              
                

                 




 
                  

 

 
 

                 
                

 
  

                
  
  

                 

 
        

 
   
 
                 
 
                 
 
              

 
 
           

SYSTEMS FOR EVALUATION OF ANTICORROSIVE PROTECTION OF PIPELINE
Dalibor Bartoněk1), Jaroslav Nesvadba2)
1) Institute of Geodesy, Faculty of Civil Engineering, Brno University of Technology,
Veveří 95, 662 37 Brno, Czech Republic.
Tel. 54114 7204, Fax: 54114 7218, E-mail: [email protected]
2) European Polytechnical Institute Kunovice, Osvobozeni 699, 686 04 Kunovice, Czech Republic
Tel. 572 548 035, Fax: 572 549 018, E-mail: [email protected]
Abstract: The program systems GAS-ACOR and GASSERV, made for processing the periodical
measurements within the anticorrosive protection of pipelines, are described in this paper. These
systems have been used since the 1990s in Czech gas enterprises and in some gas enterprises in
Slovakia as well. They have also been used in the companies Transgas Inc., ČEPRO Inc. and
MERO ČR Inc.
Keywords: Gas-pipeline, corrosion, protective devices, database, quantities.
1. INTRODUCTION
Czech gasworks operate a transport pipeline network of over 33 000 km, of which 14 000 km are remote
and transit gas conduits and 19 000 km local gas conduit networks [2]. Lately, polyethylene pipelines have been
widely used for gas distribution; this material is, however, not suitable for remote gas conduits due to its limited
dimensions and pressure rating. The commonly used insulated steel pipeline buried in the earth brings a very
serious problem: corrosion of the metal. Metal materials located in the earth can be attacked by corrosion for
various reasons. Physical and chemical influences connected with the character of the soil cause so-called simple
corrosion. Electric currents can flow through the earth, and their effect on buried devices causes what is called
"corrosion caused by stray currents". Under certain conditions colonies of microorganisms can attack the surface
of metals; this is called microbial corrosion. The direct corrosion losses, calculated as a share of the gross national
product, are around 4 % in all countries; 10 – 15 % of these direct losses concern pipelines located in the earth.
Indirect losses, i.e. losses brought about by a primary corrosion problem, such as production outages, ecological
disasters etc., are even several times higher than the direct losses. These data confirm that anticorrosive protection
of pipelines is of great importance both from the point of view of economy (especially with gas conduits) and
from the point of view of safety of operation. The application of anticorrosive protection is rather expensive, but
given its rapid economic return, and compared to the damages which can appear when the care is neglected, the
investment is well worthwhile.
2. THE PRINCIPLE OF ANTICORROSIVE PROTECTION
As early as 1819 a member of the French Academy of Sciences, L. J. Thénard, published in one of his books
that corrosion is an electrochemical phenomenon. It was Sir Humphry Davy and his pupil Michael Faraday who
in 1834 discovered the quantitative dependence between corrosion losses and electric current. Two reactions
proceed during the corrosion of metals:
· anodic, in the course of which the metal dissolves,
· cathodic, also called the depolarisation reaction, during which the electrons arising from the anodic reaction
are consumed and the corrosive medium is thereby reduced.
The aim of anticorrosive protection is to minimise the losses brought about by corrosion (both economic and
ecological). According to whether outer power is supplied to the protected object, we distinguish anticorrosive
protection:
INTERNATIONAL CONFERENCE ON SOFTCOMPUTING APPLIED IN COMPUTER AND ECONOMIC ENVIRONMENTS ICSC 2004
EPI Kunovice, Czech Republic, January 29-30, 2004
255
· passive,
· active.
Corrosion processes on a pipeline or other devices located in the earth can appear only when water or
oxygen comes into contact with the metal. In passive protection it is necessary to separate the surface of the
metal from the environment with suitable insulation material, in such a way that no depolarisator has access to
the metal. Further, there is the possibility of various organisational measures leading to the reduction of
corrosion (the choice of the route, the way the pipeline is located, the choice of the accessories etc.). Active
protection starts from the fact that corrosion processes are caused by electrochemical reactions on the surface of
the metal, and thus it is possible to influence them by other electrochemical reactions. According to the
polarisation we differentiate anodic protection (the protected device acts as the anode) and cathodic protection
(the protected device acts as the cathode). For the protection of the outer surfaces of devices located in the earth,
only cathodic protection can be used.
The principle of cathodic protection is drawn in fig. No 1. The graph in fig. No 1a outlines the dependence
of the anodic Ja and the cathodic Jc current density on the potential U. At the value of the corrosion potential
Ucor, the current Ja characterising dissolution of the metal equals Jc, which determines the return of the metal to
the environment.
By lowering the potential below Ucor, the value of Ja drops, which results in reduced dissolution of the metal.

Fig. No. 1. The principle of cathodic protection: a) potential graph (axes U, ±J; curves Ja, Jc; values Umin, Ucor), b) scheme of the configuration (anode, insulation, pipeline)

Our aim is to lower the value of U below Ucor so that the corrosion drops under the technically
possible limit. The value of the potential at which this state is achieved is denoted the minimum protection
potential Umin. The power must be supplied from an external source. The value of the protective current
corresponding to the current density J must always be somewhat higher than the corrosive current read from the
graph in fig. 1a for Umin. The scheme of the cathodic protection of a pipeline is shown in fig. 1b. The device
we want to protect must be connected to the negative pole of a DC source and a suitable anode to the positive
pole; in this way any steel device can be cathodically protected. The devices along the routes of gas conduits,
namely stations of cathodic protection (SCP), electric polarised drains (EPD) and saturation (S), work on the
principle of cathodic protection.
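The protection criterion described above can be sketched as a small check: the device potential must lie between the minimum protection potential Umin and the corrosive potential Ucor, and the protective current must somewhat exceed the corrosive current read off for Umin. The function names and the 10 % safety margin below are illustrative assumptions, not part of the original systems.

```python
def is_protected(u: float, u_min: float, u_cor: float) -> bool:
    """A device is cathodically protected when its potential has been
    lowered below the corrosion potential Ucor, down to at least the
    minimum protection potential Umin (all values in volts)."""
    return u_min <= u < u_cor

def required_current(j_cor: float, margin: float = 1.1) -> float:
    """The protective current must be somewhat higher than the
    corrosive current J_cor read from the polarisation graph at Umin;
    the 10 % margin here is an illustrative assumption."""
    return j_cor * margin

# Potential between Umin and Ucor (negative values, as in practice):
print(is_protected(-0.95, u_min=-1.2, u_cor=-0.85))
```

Run against a measured potential, the check mirrors reading fig. 1a by eye: below Ucor the anodic dissolution is suppressed, below Umin nothing further is gained.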
3. ANALYSIS OF ANTICORROSIVE PROTECTION SYSTEM IN GASWORKS
The extensive gasworks network in the Czech Republic is divided into individual parts – routes – according
to the topological principle. On each route, devices of anticorrosive protection are placed, both active (SCP,
EPD, S) and passive (joining objects – JO, insulation fittings – IF, protective fittings – PF); a detailed description
is given in [2]. On every device, various quantities characterising the state of the protection against corrosion are
measured. The measuring is done periodically at various intervals (14 days, one month, three months, yearly),
depending on the type of the device and the quantities. According to the valid enterprise standards, every
measurement must be written into a protocol, which is supplemented with a graph of protective quantities for
each route once a year. In order to achieve unambiguous identification, the routes are denoted with numerical
codes, and each device has its own positioning, which is its distance from the beginning of the route in km. Let
us define the requirements for the database of devices and measurements; they are not the only ones within the
system of anticorrosive protection, nevertheless they are essential for its correct function. The database of the
objects on the routes must be designed in such a way that it is possible:
· to insert or delete a certain device at any place on the route,
· to insert several devices into a place with the same positioning on the route,
· to keep all information concerning the individual devices; the range of information depends on the type of the
device. There are about 100 various devices, which can be divided from the point of view of data attributes
into the following groups:
1. station of cathodic protection (SCP), controlled station of cathodic protection (CSCP),
2. electric polarised drain (EPD),
3. joining object (JO),
4. joining object with diode element (JOD),
5. joining object between gas conduit and another device (JOA),
6. insulation fitting (IF),
7. protection fitting (PF),
8. potential scanner (PS)
Each group within the data model is represented by the same set of attributes. From the point of view of the
measured quantities we must supplement the above-mentioned groups with some further groups:
· critical point – any device where the cathodic protection is almost exhausted (T),
· all devices except for SCP, CSCP or EPD – (X).
Every device can be characterised by one of the described types or by a combination of these types (e.g. a joining
object with an insulation fitting which is at the same time a critical point – JOIFT). Note that some
combinations are impossible, e.g. SCP + T or EPD + SCP.
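For illustration, the type codes and the combination rule just described can be modelled as sets; the forbidden pairs follow the examples named in the text, while everything else (names and validation logic) is an assumption rather than the systems' actual data model.

```python
# Illustrative sketch of the device-type codes used in the text.
BASE_TYPES = {"SCP", "CSCP", "EPD", "JO", "JOD", "JOA", "IF", "PF", "PS"}
EXTRA_TYPES = {"T", "X"}  # critical point; "all except SCP/CSCP/EPD"

# Combinations the text names as impossible, expressed as forbidden pairs.
FORBIDDEN = {frozenset({"SCP", "T"}), frozenset({"EPD", "SCP"})}

def valid_combination(types: set[str]) -> bool:
    """True when every code is known and no forbidden pair is present."""
    if not types <= BASE_TYPES | EXTRA_TYPES:
        return False
    return not any(pair <= types for pair in FORBIDDEN)

# e.g. a joining object with an insulation fitting that is a critical point:
print(valid_combination({"JO", "IF", "T"}))  # the JOIFT case from the text
```

A real implementation would of course carry the full attribute sets per group; this only shows how the combination constraint can be enforced.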
Further, it must be possible:
1. to file the measured values of quantities (Q) on all devices (D) in various periods (P). Each quantity is a
function of two variables:

Q = f(P, D)    (1)

Nowadays about 200 – 300 various quantities, sorted according to the locality and the number of devices
placed on the routes, are in use;
2. to retrieve measured values on a device which does not exist at the given moment (change of the
positioning, change of the type of device and so on);
3. to restructure records in the device and measurement database simply, without loss of data integrity.
These are the most important requirements for the database of devices and measurements. Some other important
functions of the anticorrosive protection system are mentioned in the following paragraph.
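The requirement expressed by equation (1) can be sketched as a mapping keyed by route, device positioning and period. This is a minimal illustrative model, not the GAS-ACOR database schema; all names and sample values are invented.

```python
from collections import defaultdict

class MeasurementDB:
    """Illustrative sketch of storing quantities as Q = f(P, D):
    values keyed by (route, device positioning in km) and period."""

    def __init__(self):
        # (route, position_km) -> {period: {quantity: value}}
        self._data = defaultdict(dict)

    def record(self, route, position_km, period, quantity, value):
        self._data[(route, position_km)].setdefault(period, {})[quantity] = value

    def history(self, route, position_km, quantity):
        """All recorded periods of one quantity on one device -- usable
        even if the device no longer exists at that positioning."""
        device = self._data.get((route, position_km), {})
        return {p: q[quantity] for p, q in device.items() if quantity in q}

db = MeasurementDB()
db.record(108, 12.4, "2000-01", "U_on", -1.12)
db.record(108, 12.4, "2000-02", "U_on", -1.08)
print(db.history(108, 12.4, "U_on"))
```

Because measurements are keyed by positioning rather than by a device record, the history survives changes of the device type at that place, which is exactly requirement 2 above.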
4. SOFTWARE SYSTEMS GAS-ACOR AND GASSERV
In the late 1980s a team of research workers of the former Czech gasworks, headed by Svatopluk Sedláček in
cooperation with the author of this paper, conceived the idea of creating a system for the automation of
anticorrosive protection [3] - [6]. The wide experience of these technicians from all the involved enterprises in
the Czech Republic and Slovakia was made use of. Building on generally accessible software, e.g. dBASE and
later MS Office, already tested in practice, was considered. These systems, however, were designed for general
purposes with a wide range of functions, which brings two main problems:
· the multitude of functions complicates the system and makes its usage ineffective (some functions would
hardly be used, while other, missing ones would have to be programmed by the experts themselves),
· this complexity means tedious training of operators who lack experience with computers.
Having discussed all the pros and cons, the authors, together with a group of potential users, decided to
create their own solution in the Turbo Pascal programming language, and subsequently in Borland Pascal.
In 1990 – 1992 the basic modules of the programs GAS-ACOR (used in Czech gasworks, and since 1997 in
ČEPRO) and GASSERV (used in Transgas, and since 1996 in MERO CR) were created. Both systems share the
same conception and have the following groups of functions at their disposal:
1. Database. Besides the described databases of devices and measurements, databases of pipeline defects are
at our disposal, further a database of all routes (GAS-ACOR), measurements in industrial premises
(GASSERV), and databases of soil resistance and of pipeline defect measurements by the "pigl-sigl"
method (GASSERV for MERO). The database of the routes serves as a code list, so that it is not possible
to assign non-existing route numbers. The same holds for the code list of devices.
2. Measuring. In this part of the application we can read or edit the measured values of the quantities in the
form of tables. Various types of measuring are at our disposal (monthly, within 3 months, yearly), and at
the same time it is possible to scan as many as six different periods on the same route. The output of the
measured quantities as graphs of protective quantities on the screen, printer and plotter also ranks among
the very important functions of the system. In the graph the following can optionally be shown:
· a short-form description of the device,
· as many as 6 different curves of the same quantity measured in various periods,
· curves of various quantities measured in the same period for a given route,
· places with defective insulation found by the Pearson method,
· an auxiliary raster in both the horizontal and vertical direction for easy reading of the values on the x
and y axes,
· points on the routes with pipeline defects found by inspection with the "pigl-sigl" method
(only in GASSERV for MERO).
A graph of the protective quantities can be seen in fig. No 2. It is possible to import or export the measured
values to organisers of the PSION, CASIO or BVComp type, with the help of which the measuring is done
directly in the terrain. A very important function of the system is the transfer of measurements between the
ČEPRO and MERO organizations through a special conversion file (containing the corresponding numbers of
routes used in ČEPRO and MERO). The tables of measurements have built-in filters on the values of the
individual quantities and on the names of the devices, with the possibility to print.
3. Application. This menu item was made for the needs of the user, because the whole submenu is written in
the text file GACMENU.TXT. For each item of the pull-down menu it is necessary to write 3 lines into the
GACMENU.TXT file:
· the first line, with the keyword "Menu Item" followed by the name of the item,
· the second line, beginning with the keyword "Menu Help", with the text of the help,
· the third line, containing the batch statement which is run when this menu item is activated.
The advantage is that everybody can make his own submenu by inserting 3 lines into the text file in any text
editor. The number of items is limited to 100. The following menu items are preset:
· yearly backup of measurements,
· daily backup of measurements,
· communication with PSION, BVComp or CASIO organisers,
· run of an own special interpreter of SQL statements with a built-in list of the most frequent
queries.
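A routine for reading the three-line-per-item GACMENU.TXT format described above might look as follows; the keywords come from the text, but the exact file layout, sample content and error handling are assumptions.

```python
def parse_menu(text, limit=100):
    """Parse the 3-line-per-item GACMENU.TXT format described above:
    'Menu Item' + name, 'Menu Help' + help text, then a batch statement.
    The precise layout is assumed, not taken from the user manual."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    items = []
    for i in range(0, len(lines) - 2, 3):
        name_ln, help_ln, command = lines[i], lines[i + 1], lines[i + 2]
        if not (name_ln.startswith("Menu Item") and help_ln.startswith("Menu Help")):
            raise ValueError(f"malformed menu item near line {i + 1}")
        items.append({
            "name": name_ln[len("Menu Item"):].strip(),
            "help": help_ln[len("Menu Help"):].strip(),
            "command": command,
        })
    if len(items) > limit:
        raise ValueError("menu is limited to %d items" % limit)
    return items

sample = """Menu Item Yearly backup
Menu Help Back up all measurements for the year
BACKUP.BAT /yearly"""
print(parse_menu(sample)[0]["name"])
```

Keeping the submenu in a plain text file, as the system does, is what lets a user extend the menu in any editor without touching the program itself.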
4. Outputs. This menu item contains a pull-down menu with the following items:
· conversion into TXT or DBF format,
· compression of all database files,
· joining database files of various enterprises.
5. Services. This menu item comprises:
· configuration and initial settings of parameters for output on the printer and plotter,
· running an external favourite text editor,
· exit to the operating system.
Fig. No. 2. Graph of the protective quantities (year 2000, routes 108 – 115; x axis: positioning [km])
5. CONCLUSION
The GAS-ACOR and GASSERV systems have already been in operation for more than 10 years and have
been upgraded according to the users' needs. The last upgrade consists in the introduction of a module for
universal file printing in a network environment; this module was created in Borland Delphi. The authors
provide the users with help in the form of a hot line or personally. This year, work has been under way on
converting the GAS-ACOR system from MS-DOS to Windows. The source modules are written in Borland
Delphi and the database is created in MS Access; the link between the source modules and the database is
realized through the ODBC (Open Database Connectivity) interface. The test run of the new system will take
place in the South Moravia gasworks in Brno. The introduction of GAS-ACOR into practice will mean
considerable time savings and an increase in the efficiency of the anticorrosive protection.
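The ODBC link mentioned above decouples the source modules from the database engine: the application issues SQL through a generic interface, and the backend (here MS Access) could in principle be swapped. The sketch below illustrates this idea with Python's standard DB-API, using SQLite as a stand-in backend; the table and column names are invented, not taken from the real schema.

```python
import sqlite3

# Illustrative stand-in: the same DB-API calls would run over an ODBC
# connection to the MS Access database; table/column names are invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE measurement (route INTEGER, pos_km REAL, "
             "period TEXT, quantity TEXT, value REAL)")
conn.execute("INSERT INTO measurement VALUES "
             "(108, 12.4, '2000-01', 'U_on', -1.12)")
rows = conn.execute("SELECT value FROM measurement "
                    "WHERE route = 108 AND quantity = 'U_on'").fetchall()
print(rows)
conn.close()
```

Because only standard SQL crosses the interface, the application code stays unchanged when the database behind the ODBC driver is replaced, which is the point of the architecture described in the conclusion.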
REFERENCES
[1] THÉNARD, L. J.: Annales de Chimie et de Physique 11, 1819, p. 40.
[2] ČLUPEK, O., DAVIDOVÁ, H.: Anticorrosive protection. GAS s. r. o., Praha, 1998, 141 pp., ISBN 80-902339-8-8. In Czech.
[3] SEDLÁČEK, S., BARTONĚK, D.: GAS-ACOR. User manual. SHINE, Brno, 1996. In Czech.
[4] SEDLÁČEK, S., BARTONĚK, D.: GASSERV. User manual. SHINE, Brno, 1997. In Czech.
[5] SEDLÁČEK, S., BARTONĚK, D.: GASSERV pro MERO. User manual. SHINE, Brno, 1998. In Czech.
[6] SEDLÁČEK, S., BARTONĚK, D.: GAS-ACOR pro ČEPRO. User manual. SHINE, Brno, 1999. In Czech.
COURSE TIMETABLING BY AN ARTIFICIAL IMMUNE SYSTEM
Jaroslav Čechman
Institute of Automation and Computer Science
Brno University of Technology, Technická 2, 616 69 Brno, Czech Republic
E-mail: [email protected]
Abstract: This paper is aimed at the design of an artificial immune system used to solve a basic
school course timetabling problem. The design of the basic components and methods is
progressively depicted. The experimental section covers the parameters that have a bearing on the
behaviour of the artificial immune system.
Keywords: timetabling, optimization, artificial immune system
1 INTRODUCTION
Scheduling is difficult for two reasons. Firstly, it is a computationally complex problem, described in
computer science terms as NP-complete. This means that search techniques that deterministically and
exhaustively search the space of possibilities will probably fail because of time requirements. In addition, search
techniques that use heuristics to prune the search are not guaranteed to find an optimal (or even good)
solution. Secondly, scheduling problems are often complicated by the details of a particular scheduling task.
Timetabling is a common example of a scheduling problem and can manifest itself in several different
forms. The particular form of timetable required is specific to the environment or institutions in which it is
needed.
The application of computers to timetabling problems has a long and varied history. Almost as soon as
computers were first built there were attempts to use them to solve these problems. The first generation of
computer timetabling programs in the early 1960’s was largely an attempt to reduce the associated
administration work. Soon, programs were presented with the aim of fitting classes and teachers to periods.
There has been a growing interest in the use of metaphors extracted from the immune system for the
development of the so-called artificial immune systems. AIS have been applied to a wide variety of domain
areas, such as pattern recognition and classification, optimization, data analysis, computer security and robotics.
The paper is arranged as follows: the following section describes a background and a framework to
general design of artificial immune systems. The design of fundamental components and methods of AIS is
detailed in section 3. In section 4 the timetable requirements are defined and section 5 describes experiments and
results. The final section discusses all findings and draws conclusion.
2 ARTIFICIAL IMMUNE SYSTEMS
The establishment of the field of artificial immune systems (AISs) has been difficult for a number of
reasons. Firstly, the number of people active in the research area is still small, though it has been increasing in
the past few years. Secondly, researchers have found it difficult to identify the difference between an AIS and the
work undertaken in theoretical immunology. Thirdly, the application domain of artificial immune systems is very
wide. Finally, only recently has the first textbook proposing a general framework for designing AIS been published.
There have been a limited number of attempts to define the field of artificial immune systems. The present
work adopts the concept in which artificial immune systems are defined as computational systems inspired by
theoretical immunology and observed immune functions, principles and models, applied to solve problems [4].
This definition covers some of the aspects mentioned above by drawing a fine line between AIS and theoretical
immunology: the applicability. While works on theoretical immunology are usually aimed at modelling and
providing a better understanding of immune functioning and laboratory experiments, work on AIS is
applied to solving problems in computing, engineering and other research areas. This is more akin to a soft
computing paradigm.
It is beyond the scope of this paper to present and discuss the theoretical aspects of artificial immune systems
in more detail. However, for the sake of comprehensibility, we summarize a framework for designing an artificial
immune system in terms of the following basic elements:
· a representation for the components of the system;
· a set of mechanisms to evaluate the interactions of individuals with the environment and with each other;
· procedures of adaptation that govern the system dynamics.
The framework for designing artificial immune systems thus rests on: a representation to create abstract models of
immune organs, cells and molecules; a set of functions, termed affinity functions, to quantify the interactions of
these "artificial elements"; and a set of general-purpose algorithms to govern the dynamics of the AIS. Figure 1
summarizes the elements involved in the framework for engineering an AIS; this can be thought of as a layered
approach to the design procedure.
Figure 1: The framework to engineer AIS and its layered structure.
3 AIS DESIGNING – TIMETABLE IMPLEMENTATION
Educational timetabling is a major administrative activity for a wide variety of institutions. A
timetabling problem can be defined as the problem of assigning a number of events to a limited number of
time periods. A. Wren defines timetabling [9] as follows:
"Timetabling is the allocation, subject to constraints, of given resources to objects being placed in space time, in
such a way as to satisfy as nearly as possible a set of desirable objectives."
In this paper, we concentrate on the basic school course timetabling problem. This problem is subject
to many constraints, usually divided into two categories: "hard" and "soft" [2].
Hard constraints are rigidly enforced. Examples of such constraints are:
· no resource (pupils or staff) can be demanded to be in more than one place at any one time;
· for each time period there should be sufficient resources (e.g. rooms, teachers, etc.) available for all the
events scheduled in that time period.
Soft constraints are those that are desirable but not absolutely essential. In real-world situations it is, of
course, usually impossible to satisfy all soft constraints. Examples are:
· time assignment: a school subject may need to be scheduled in a particular time period;
· time constraints between events: one course may need to be scheduled before/after another;
· spreading events out in time: pupils should not have two lessons in the same subject on the same day;
· coherence: teachers may prefer to have all their lessons within a number of days and to have a number of
lecture-free days; these constraints conflict with the constraints on spreading events out in time;
· resource assignment: teachers may prefer to teach in a particular room, or it may be the case that a particular
school subject must be scheduled in a certain room.
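To make the two categories concrete, the hard and soft constraints above can be sketched as a penalty function over a candidate timetable. This is an illustrative sketch only: the `Lesson` record, the penalty weights, and the six-slots-per-day assumption are ours, not the paper's.

```python
from collections import Counter
from dataclasses import dataclass

SLOTS_PER_DAY = 6  # assumption: six teaching periods per day, five-day week


@dataclass(frozen=True)
class Lesson:
    group: str    # e.g. "1.A"
    subject: str  # e.g. "History"
    teacher: str  # e.g. "T01"
    room: str     # e.g. "301"
    slot: int     # 0..29, week-wide slot index


def penalty(lessons, hard_weight=100, soft_weight=1):
    """Sum penalties over a candidate timetable; hard violations weigh heavily."""
    p = 0
    # Hard: no teacher, room, or group may appear twice in the same slot.
    for attr in ("teacher", "room", "group"):
        clashes = Counter((getattr(l, attr), l.slot) for l in lessons)
        p += hard_weight * sum(n - 1 for n in clashes.values() if n > 1)
    # Soft: a group should not have the same subject twice on one day.
    repeats = Counter((l.group, l.subject, l.slot // SLOTS_PER_DAY) for l in lessons)
    p += soft_weight * sum(n - 1 for n in repeats.values() if n > 1)
    return p
```

A timetable with penalty 0 satisfies every modelled constraint; an AIS would seek cells minimizing this value (equivalently, maximizing affinity).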
3.1 A Representation Scheme
The fundamental components of artificial immune systems are immune cells (T-cells, B-cells). They
carry surface receptor molecules whose shapes are complementary to the shapes of antigens, allowing them to
recognize disease-causing agents and then perform an effector function. The immune cells and molecules are
therefore the elements that have to be modelled and used to create an AIS.
Perelson and Oster [8] first proposed the concept of shape-space (Sh). Bearing in mind that the
recognition of antigens is performed by the cell receptors, shape-space allows a quantitative description of the
interactions between receptor molecules and antigens. As in the biological immune system, in a shape-space the
degree of binding (degree of match, or affinity) between an antigenic receptor (antibody Ab or T-cell receptor
TCR) and an antigen (Ag) is measured via regions of complementarity.
The set of features that describe the relevant properties of a molecule from a recognition perspective is
termed its generalized shape. The generalized shape of an antibody is described by a set of L parameters. Thus, a
point in an L-dimensional shape-space specifies the generalized shape of the antibody binding region with regard
to its antigen binding properties. Mathematically, the generalized shape of a molecule m, either an antibody (Ab)
or an antigen (Ag), can be represented as an attribute string (set of coordinates) m = (m1, m2, ..., mL), m ∈ Sh^L,
or by more elaborate structures such as a neural network or a Petri net.
In the course timetabling case, in addition to the basic attributes (teacher, subject and classroom), the string
should contain information about the scheduled time (day, hour). If we imagine the timetables as a sequence of
per-group timetables, we can use a matrix notation in which the matrix coordinates encode the scheduled time of
an item for each group. The item consists of information such as the teacher's name, the classroom number, the
subject's name, etc. Figure 2 depicts this conception: there are three groups (1.A, 1.B, 1.C) and three subjects
(History, Mathematics, Physics). Group 1.A has a history lesson on Monday morning (first time slot) and a
physics lesson on Thursday afternoon (sixth time slot). Group 1.B has a mathematics lesson on Tuesday morning
(eighth time slot). In other words, the timetables form a matrix S(g,d,h), where item s_{i,j,k} is a record that
contains information about the subject, teacher, classroom, etc. In our case item s_{0,0,0} corresponds to the
history lesson of group 1.A on Monday morning.
Figure 2: Schedule Scheme
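The matrix representation S(g, d, h) described above can be sketched directly as a nested structure. The concrete item fields (subject, teacher, room) and the day/hour indexing are our reading of the text, not a specification from the paper.

```python
# Sketch of the timetable matrix S(g, d, h): groups x days x hours.
# An item is a small record; empty slots are None.

GROUPS, DAYS, HOURS = 3, 5, 6  # 1.A, 1.B, 1.C; five days; six slots per day


def empty_timetable():
    """Build an empty g x d x h matrix of schedule items."""
    return [[[None for _ in range(HOURS)]
             for _ in range(DAYS)]
            for _ in range(GROUPS)]


S = empty_timetable()
# s_{0,0,0}: history lesson for group 1.A on Monday morning (first slot).
S[0][0][0] = {"subject": "History", "teacher": "T01", "room": "301"}
# Group 1.A physics lesson on a Thursday afternoon slot (illustrative mapping).
S[0][3][5] = {"subject": "Physics", "teacher": "T03", "room": "303"}
# Group 1.B mathematics lesson on Tuesday morning.
S[1][1][0] = {"subject": "Mathematics", "teacher": "T02", "room": "302"}
```

Each cell of the immune population then corresponds to one such matrix, i.e. one complete candidate timetable.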
3.2 Affinity Measures
The type of shape-space (representation) used to model an antibody and an antigen partially
determines the measure used to calculate their affinity. As the Ag–Ab affinity is related to their distance, it can be
estimated via any distance measure between two strings or vectors, such as the Euclidean, the Manhattan, or the
Hamming distance. Hence, if the coordinates of an antibody are given by Ab = (Ab1, Ab2, ..., AbL) and those of
an antigen are given by Ag = (Ag1, Ag2, ..., AgL), then the distance D between them can be defined as:
D = \sqrt{\sum_{i=1}^{L} (Ab_i - Ag_i)^2}    (1)

D = \sum_{i=1}^{L} |Ab_i - Ag_i|    (2)
D = \sum_{i=1}^{L} \delta_i, where \delta_i = 1 if Ab_i \neq Ag_i and \delta_i = 0 otherwise    (3)
where Eq. 1 is the Euclidean distance, Eq. 2 the Manhattan distance and Eq. 3 the Hamming distance.
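The three affinity distances of Eqs. (1)–(3) are straightforward to implement for attribute strings of equal length L; a minimal sketch:

```python
from math import sqrt


def euclidean(ab, ag):
    """Eq. (1): Euclidean distance between antibody and antigen strings."""
    return sqrt(sum((a - g) ** 2 for a, g in zip(ab, ag)))


def manhattan(ab, ag):
    """Eq. (2): Manhattan (city-block) distance."""
    return sum(abs(a - g) for a, g in zip(ab, ag))


def hamming(ab, ag):
    """Eq. (3): Hamming distance, counting mismatched coordinates."""
    return sum(1 for a, g in zip(ab, ag) if a != g)
```

For example, for Ab = (0, 3) and Ag = (4, 0) the Euclidean distance is 5.0 and the Manhattan distance is 7; the Hamming distance applies naturally to symbolic attributes such as subject or teacher names.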
As stated above, the constraints of a timetabling problem form two categories: hard and soft. To quantify these
constraints we establish certain functions (discrete or continuous) that depend on how important the constraint
is. In the basic school case we may prefer to reschedule the history lesson to the afternoon, while subjects such
as mathematics and physics should occur in the morning. Figure 3 depicts two examples of priority graphs. The
time flow in Fig. 3(a) signifies our request to place a particular subject in the fourth time slot (the best
solution). Fig. 3(b) refers to the case when a subject is taught twice a week.
Figure 3: Samples of Priority Graphs. (a) Daily subject characteristic. (b) Weekly subject characteristic.
The cell affinity corresponds to a quantitative evaluation of the solution. If the solution contains many
collisions (conflicts in the schedule), then the affinity of the cell that represents this solution is lower than the
affinity of a cell representing a solution without collisions. The criterial function is derived from the cell affinity:
f_a = \sum_{i=1}^{g} \sum_{j=1}^{d} \sum_{k=1}^{h} s_{i,j,k} \to \max    (4)
where f_a is the cell affinity, g, d and h are the boundaries of the timetable (max. number of groups, days per
week, and hours per day), and s_{i,j,k} is an item of the matrix S(g,d,h).
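The criterial function of Eq. (4) can be sketched as a triple sum over the timetable matrix. How a single item is scored (its priority-graph value, collision penalties) is not spelled out in the text, so we leave it as a pluggable function; the example scorer below is purely illustrative.

```python
def cell_affinity(S, score_item):
    """Eq. (4) sketch: sum per-item scores over the whole matrix S(g, d, h).

    S: nested lists indexed [group][day][hour]; score_item: item -> number.
    """
    return sum(score_item(item)
               for group in S
               for day in group
               for item in day)


# Example scoring rule (illustrative): one point per scheduled (non-empty) item.
def count_scheduled(item):
    return 0 if item is None else 1
```

An AIS run then amounts to searching for the matrix S that maximizes this affinity.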
3.3 Immune Algorithms
Immune algorithms govern the behaviour of the system. The fundamental functions are the cell generator,
clonal selection and affinity maturation.
The immune cells and molecules are generated in the bone marrow. The genes used to encode the
receptor molecules are stored in separate and distinct libraries. The encoding of these molecules occurs through
the concatenation of different gene segments that are randomly selected from each of the gene libraries. The
bone marrow model is used to create the matrix that represents the immune receptors.
The simplest bone marrow model is one that generates the matrix items using a random number
generator. A more sophisticated model takes advantage of knowledge about the timetable arrangement. This
means that we can control the growth of new cells so that the system does not produce transparently useless cells.
In [4] the authors focused on the clonal selection principle and the affinity maturation process of the adaptive
immune response in order to develop an algorithm suitable for tasks such as machine learning, pattern
recognition and optimization. The main immune aspects taken into account in developing the algorithm were:
selection and cloning of the most stimulated cells proportionally to their affinity; death of non-stimulated cells;
affinity maturation and selection of cells proportionally to their affinity; and generation and maintenance of
diversity.
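The clonal selection loop summarized above can be sketched generically: select the most stimulated cells, clone them proportionally to affinity, mutate the clones, and refresh the weakest cells to maintain diversity. All parameter values and the linear clone-allocation rule below are our illustrative choices, not the paper's.

```python
import random


def clonal_selection(init_pop, affinity, mutate, generations=50,
                     n_select=5, clones_per_rank=3, n_replace=2, rng=random):
    """Generic clonal selection sketch; returns the best cell found."""
    pop = list(init_pop)
    for _ in range(generations):
        pop.sort(key=affinity, reverse=True)
        best = pop[:n_select]
        clones = []
        # Higher-affinity cells receive more (mutated) clones.
        for rank, cell in enumerate(best):
            n = clones_per_rank * (n_select - rank)
            clones.extend(mutate(cell) for _ in range(n))
        # Keep the fittest individuals at a fixed population size.
        pop = sorted(pop + clones, key=affinity, reverse=True)[:len(init_pop)]
        # Refresh the tail of the population to maintain diversity.
        for i in range(1, n_replace + 1):
            pop[-i] = mutate(rng.choice(pop))
    return max(pop, key=affinity)
```

For timetabling, a cell would be a matrix S, `affinity` the criterial function, and `mutate` one of the two operators described below; the same skeleton works for any maximization task.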
On the basis of this algorithm two types of mutation were designed: hypermutation and somatic
mutation. The difference is that hypermutation uses the whole timetable range of a group to reschedule a given
subject, whereas somatic mutation shifts the subject within a single day only. Moreover, somatic mutation takes
the subject's priority time flow into account.
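The contrast between the two operators can be sketched on a single group's week viewed as a flat list of slots (None = free). The slot geometry and the random choice of target slot are our assumptions; the paper's somatic mutation additionally weights targets by the subject's priority graph, which we omit here for brevity.

```python
import random

SLOTS_PER_DAY = 6  # assumption: six slots per day


def hypermutation(week, i, rng=random):
    """Move the subject at slot i to any free slot anywhere in the week."""
    free = [j for j, s in enumerate(week) if s is None]
    if not free:
        return week
    week = list(week)            # work on a copy
    j = rng.choice(free)
    week[j], week[i] = week[i], None
    return week


def somatic_mutation(week, i, rng=random):
    """Shift the subject at slot i to a free slot within the same day only."""
    day = i // SLOTS_PER_DAY
    lo, hi = day * SLOTS_PER_DAY, (day + 1) * SLOTS_PER_DAY
    free = [j for j in range(lo, hi) if week[j] is None]
    if not free:
        return week
    week = list(week)
    j = rng.choice(free)
    week[j], week[i] = week[i], None
    return week
```

Hypermutation thus explores the whole search range of the group, while somatic mutation performs a local, day-bounded refinement.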
As stated above, the number of mutations is inversely proportional to the cell affinity. This means that the
highest-affinity cell (with affinity D_max) undergoes a mutation in which only one subject is rescheduled. The
mutation rate α of all cells is determined by Eq. 5:

α(D*) = exp(-ρ·D*)    (5)

where ρ is a parameter that controls the smoothness of the inverse exponential function, and D* is the
normalized affinity, determined by D* = D/D_max.
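A small sketch of the mutation-rate rule of Eq. 5; the value of ρ below is illustrative only.

```python
from math import exp


def mutation_rate(affinity, max_affinity, rho=3.0):
    """Eq. (5): alpha(D*) = exp(-rho * D*), with D* = D / D_max.

    The fittest cell (D* = 1) receives the lowest mutation rate; rho
    controls how quickly the rate decays with normalized affinity.
    """
    d_star = affinity / max_affinity  # normalized affinity in (0, 1]
    return exp(-rho * d_star)
```

In a run, this rate would decide how many subjects of a cell's timetable are rescheduled by the mutation operators above.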
4 TIMETABLE SPECIFICATIONS
We solve the basic school course timetabling problem. Table 1 summarizes all the requirements that must
be observed. A category of subjects groups subjects with an identical characteristic, i.e. the first category covers
all subjects that share the characteristic of the priority graph depicted in Fig. 3(a). All subjects and teachers
have only symbolic names.
Table 1: Timetable specifications
Category  Subject  Lessons a Week  Teacher  Classroom 8.A  8.B  8.C  8.D
1.        S01      2               T01      301            302  303  304
          S02      2               T02      101            101  101  101
          S03      5               T03      301            302  303  304
          S04      2               T04      301            302  303  304
          S05      2               T05      301            302  303  304
2.        S06      2               T06      301            302  303  304
          S07      2               T07      301            302  303  304
3.        S08      5               T08      301            302  303  304
          S09      1               T09      301            302  303  304
4.        S10      1               T10      102            102  102  102
          S11      2               T11      301            302  303  304
          S12      2               T12      103            103  103  103
5.        S13      2               T13      301            302  303  304
5 EXPERIMENTS AND RESULTS
The experiment comprises two levels. On the first level we work with a simplified timetabling problem
in which no schedule conflicts are present (only one group is taken into account). On the second level we work
with the timetabling problem exactly as specified in Table 1.
Figure 4 compares two kinds of mutation: hypermutation (M1) and a combination of hypermutation and
somatic mutation (M2). Figure 5 shows how important it is to find a suitable 'smoothness' parameter (cS) for
optimal system operation. The value of cS2 (ρ) is computed inversely to Eq. 5, with α(D*) = 1. In Fig. 6
different population sizes (PS) are explored.
As stated above, clonal selection produces new clones of the cells proportionally to their affinity, but
the size of the population is limited. The clonal ratio parameter (CR) determines the share of cloned cells in
the whole population. CR is set so that the clone population size is 10% (CR1), 50% (CR2) and 100%
(CR3) of the whole population. Figure 7 shows how effective the clone population size is in exploring the
search space.
Figure 8 is similar to Fig. 6, but all groups of the timetable are present. Figure 9 shows the average number
of conflicts in the schedule.
Figure 4: Type of mutation. M1 – hypermutation; M2 – hypermutation and somatic mutation.
Figure 5: Mutation rate. cS1 < cS2 < cS3.
Figure 6: Population size. PS1=100, PS2=500, PS3=1000.
Figure 7: Clone ratio. CR1≈10%, CR2≈50%, CR3≈100% of population size.
Figure 8: Population size. PS1=500, PS2=1000.
Figure 9: Conflicts in the schedule. PS1=500, PS2=1000.
6 CONCLUSIONS
This paper has shown the power of AIS to solve the course timetabling problem. The previous section shows
that it is very important to evaluate correctly such parameters as the population size, mutation rate and clone
ratio. These parameters are essential to effective system functionality. The two types of mutation procedure show
that mutation control can be used to gain optimization efficiency.
REFERENCES
[1] BURKE, E.; ROSS, P. (Eds.) The Practice and Theory of Automated Timetabling. Springer Lecture Notes
in Computer Science Series, vol. 1153, 1996.
[2] BURKE, E.; KINGSTON, J.; JACKSON, K.; WEARE, R. Automated University Timetabling: The
State of the Art. The Computer Journal, 1997, pp. 565–571.
[3] BURKE, E.; PETROVIC, S. Recent Research Directions in Automated Timetabling. European Journal of
Operational Research, 2002.
[4] de CASTRO, L. N.; TIMMIS, J. Artificial Immune Systems: A New Computational Intelligence Approach.
Springer, 2002. ISBN 1-85233-594-7.
[5] de CASTRO, L. N.; TIMMIS, J. Artificial Immune Systems as a Novel Soft Computing Paradigm. To
appear in the Soft Computing Journal, vol. 7, 2003.
[6] ECHMAN, J. Optimalizace systému s využitím umělého imunitního systému (System Optimization Using
an Artificial Immune System). Diploma Thesis, Brno University of Technology, 2003.
[7] OŠMERA, P.; MASOPUST, P. Schedule Optimization Using Genetic Algorithms. Proceedings of
MENDEL 2002, Brno, Czech Republic, 2002, pp. 132–138.
[8] PERELSON, A.; OSTER, G. Theoretical Studies of Clonal Selection: Minimal Antibody Repertoire Size
and Reliability of Self–Nonself Discrimination. Journal of Theoretical Biology, 1979, vol. 81, pp. 645–670.
[9] WREN, A. Scheduling, Timetabling and Rostering – A Special Relationship? In [1], pp. 46–75.
MANAGING A HIGH SPEED LAN USING DISTRIBUTED
ARTIFICIAL INTELLIGENCE
Ibrahiem M.M-El-Emary
Faculty of Applied Science, Al-Balqa Applied University, Al Salt, Jordan.
E-mail:[email protected]
Abstract: This paper is concerned with a practical application of distributed artificial intelligence
to managing a high-data-rate, bus-structured local area computer network that uses a deterministic
multiple access protocol. In the selected network, which is managed using distributed artificial
intelligence, the dynamic sharing of the available bandwidth among stations is achieved by forming a
"train" to which each station may append a packet after issuing a reservation. Reservations and
packet transmissions are governed by the reception of control packets (tokens) issued by the network
end stations. The suggested management approach depends on using intelligent autonomous
agents, which are responsible for various tasks, among them: the election of the end stations, recovery
from failures, and the insertion of new stations into the network. All these tasks are based on the use
of special tokens.
Keywords: Distributed Artificial Intelligence, Autonomous Agents, Neural Networks, EXPRESS-NET,
CSMA/CD
1. INTRODUCTION
The main task of network management system researchers is to develop new tools that work better
than the currently available tools for this laborious work. The work presented in this paper is directed at
delegating as much as possible to the machine, using network administrators as knowledge engineers who
teach the machine how it should perform its work, based on what are called Intelligent Autonomous Agents.
There are a few predictable advantages of using Intelligent Autonomous Agents for any management
system, network management being just one. First of all, there is the intelligent nature, which sounds promising.
Having an intelligent and adaptable system is usually better than having dedicated applications for specific
solutions. Likewise, the word Autonomous gives us the idea of something that can work by itself or needs almost
no human interference. Finally, agent gives the impression of a helper or a wizard that somehow works between
the machine and the human. So, the main objective proposed here is to design a system where the human system
administrator does not have to work as the truly intelligent (meaning cognitive) element in the management
process, feeding the system with the rules of work, or knowledge, it needs to operate: no more boring work, but
intelligent work [5].
The network managed by the above approach is a local area network called TOKENET [4]. Its
architecture is based on a linear bidirectional bus and on a deterministic channel access protocol whose
efficiency is similar to that of the EXPRESS protocols [3]. The advantages of this network over the linear
topology (L-EXPRESS-NET) are that stations need not be numbered, and that no silence counter is needed in
order to schedule transmissions. The LAN is named TOKENET since tokens (control packets) are used in the
operation of the channel access scheme. The protocol algorithms were developed for a scenario in which
Ethernet-like networks cannot be used, i.e. a large population of users connected by several kilometers of
cable, whose traffic requirements must be satisfied at a very high data rate (>100 Mbps). The packet
transmission time is assumed to be much shorter than the end-to-end propagation delay, so that CSMA/CD
yields extremely poor performance.
In order to manage the above-mentioned network using Intelligent Autonomous Agents, we organize
this paper as follows: section two describes the architecture of the managed network. In section three, we
explain the approach used to access this type of network. The system initialization procedures are discussed in
section four. The concepts of recovering from a failure and
inserting a new station are described in section five. Section six clarifies how to manage TOKENET using
Intelligent Autonomous Agents. Finally, section seven concludes the paper and outlines further work.
2. ARCHITECTURE OF THE TOKENET NETWORK
TOKENET is based on a linear topology. It is thus possible to identify the rightmost and leftmost active
stations (where by active we mean that stations are executing the protocol algorithm). These two stations are
named “end stations”. Stations are connected to the bus by means of passive taps that can simultaneously receive
and transmit [1]. Packet collisions can be detected. Interfaces are capable of recognizing a limited set of short
control packets, named tokens, whose functions will be described later on. The reception of tokens as well as the
reception of packets may produce some response by a station. The time needed in order to start this response is td
seconds; this includes the time for the recognition of the token (or the detection of the end of a packet reception)
and the time to initiate the response. td should be of the order of a few bit times [4].
Data packets are transmitted in trains, composed of a token (the locomotive) followed by some packets
(possibly none, in which case the train is said to be empty). The time separation between successive elements of
the same train is td seconds. Packets may board the train only after a reservation has been issued. By listening to
other stations' reservations, each station can determine the correct position of its packet in the train.
Transceivers can operate in four different modes depending on the state of the station: the CSMA-CD mode and
the polite sensing mode are used during system initialization in order to elect the end stations; the reservation
mode is used when a station has a packet ready for transmission in order to reserve a place on the next train; the
scheduling mode is used after a reservation has been made in order to transmit the packet in the train in the
reserved position.
The CSMA-CD mode implies the transmission of packets following the CSMA-CD protocol [1, 2]. It is
assumed that tc seconds are necessary to recognize a collision. The time tc is a few times longer than td. When a
station is in the polite sensing mode it starts transmitting its packet after recognizing the end of a packet
transmission from another station. If a collision is detected in the initial tc seconds the transmission is aborted
and a new attempt will be made at the end of the colliding packet. When a station is in the reservation mode it
transmits a short burst (any predefined bit pattern) when it receives the reservation token. Stations in the
scheduling mode count other stations' reservations, determine the position of their reserved wagon in the train,
and transmit their own packet in the reserved position. If a collision occurs in the initial tc seconds the
transmission is immediately halted. Receivers are always enabled. Packets (and bursts) must be separated by at
least td seconds to
be received.
Packets can be of variable length. Before each packet a preamble is transmitted in order to synchronize
the destination station receiver. The preamble duration is of the order of several tens of bits. In some cases a long
preamble is necessary. It is not necessary, instead, to transmit a preamble before the reservation bursts, since
they must only be detected and need not be decoded.
3. ACCESS SCHEME USED IN TOKENET NETWORK
Assume the two end stations have been found, and denote, as in Fig.1, by L the leftmost active station,
and by R the rightmost active station. Upon recognizing the end of a train, R issues an R-token, denoted by Tr,
and after td seconds transmits a reservation burst. The token and the burst propagate towards all stations to the
left of R. Each station with packets ready for transmission, upon reception of Tr, immediately transmits a burst
of short duration that serves as a packet reservation. Due to the station latency time the burst is separated from
Tr by a gap of duration td. Denote by tb the time duration of the burst. Provided that the distance between any
two adjacent stations is larger than the distance over which the line signal propagates in (td+tb)/2 seconds,
stations receive the bursts transmitted by stations to their left separated by intervals of duration larger than td. It
is thus possible for each station to count the number of bursts transmitted by stations to its left. Denote by Nj the
count of station j. Bursts transmitted by stations to the right are instead seen as completely overlapping and are
thus indistinguishable.
Fig.1 TOKENET topology; L and R are the network end stations
Finally Tr reaches L, which may transmit a reservation burst like any other station if it has a packet ready
for transmission. After either transmitting the burst or just hearing the overlapped bursts of all stations with ready
packets, L issues an L-token Tl that serves as the locomotive of a train of packets. If L has just transmitted a
reservation burst it also appends its data packet to Tl, leaving a silent gap of duration td in between. The train
now propagates in the L-to-R direction and the stations that just made a reservation may append packets.
Before appending its own packet to the train, station j must recognize Tl and count Nj packets in the train
behind Tl. The train propagates all the way to R, collecting new wagons. R may append a packet to the train
too, and when R recognizes the end of the train it issues a new R-token, thus starting a new cycle.
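The reservation bookkeeping just described can be sketched as pure logic, without the timing details: each station j counts the bursts of ready stations to its left (Nj), and that count is exactly the position of its packet in the next train. This is our illustrative rendering, not code from the paper.

```python
def train_positions(ready):
    """Compute each ready station's wagon position in the next train.

    ready: list of booleans, stations ordered left (L) to right (R);
    returns {station_index: position_in_train} for stations with packets.
    """
    positions, n_left = {}, 0
    for j, has_packet in enumerate(ready):
        if has_packet:
            positions[j] = n_left  # N_j = bursts heard from stations to the left
            n_left += 1
        # stations without packets transmit no burst and occupy no wagon
    return positions
```

For example, with four stations of which the first, third and fourth have packets, `train_positions([True, False, True, True])` yields positions 0, 1 and 2: the train fills strictly left to right.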
A fault in the operation of a station may cause a truncation of the train: if a station has a wrong
reservation count, it will either transmit too early (the transmission is stopped after tc seconds because of the
collision) or not transmit at all because it will not find enough packets in the train before its own. Both events
prevent all stations to the right of the faulty one from transmitting in the current train. The same thing happens if
a station issues a reservation and then does not transmit the packet. If, instead, a station transmits a packet
without transmitting the corresponding reservation, a collision may take place if some station to its right
transmits a packet, but other stations can still board the train. Otherwise, if no collision occurs, the train may be
larger than expected. All the above fault conditions only influence the next cycle. Other faults, concerning either
token losses or failures of the end stations, must be resolved through a system reinitialization.
Quantitatively, to show the impact of the constraint on the minimum distance between adjacent
stations, consider a 100 Mbps transmission speed and assume that the sum (td+tb) is equivalent to ten bit
times, and that the propagation delay is 10 µs/km. The time interval corresponding to five bits is 50 ns, which is
equivalent to a five-meter distance. Stations must hence be separated by at least five meters of cable.
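The arithmetic above can be spelled out in a few lines (the 10 µs/km propagation delay is the value assumed in the text):

```python
BIT_RATE = 100e6       # bit/s: 100 Mbps, so one bit time is 10 ns
PROP_DELAY_S_PER_KM = 10e-6  # propagation delay assumed in the text


def min_station_separation_m(td_plus_tb_bits=10):
    """Minimum adjacent-station spacing so bursts stay distinguishable.

    The constraint is that stations be farther apart than the signal
    travels in (td + tb) / 2 seconds.
    """
    half_window_s = (td_plus_tb_bits / 2) / BIT_RATE  # = 5 bit times = 50 ns
    return half_window_s / PROP_DELAY_S_PER_KM * 1000  # km -> m
```

With the stated assumptions this evaluates to 5 m, matching the figure in the text.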
4. INITIALIZATION PROCEDURE FOR TOKENET USING CSMA/CD PROTOCOL
If an active station does not hear any packet, token, or reservation burst on the bus for more than To
seconds (where To is of the order of several round-trip delays), it goes into initialization mode in order to elect
the end stations and restart the protocol operation. The initialization procedure consists of the following steps:
*(A) Using the CSMA/CD protocol one station (say station A) acquires the channel and issues an initialization
token. (Note: this may require several retransmission attempts. At each attempt only the stations which collided
in the previous attempt participate.) To avoid ambiguities, the initialization token length is such that its
transmission time is longer than the end-to-end propagation delay.
*(B) Upon hearing the initialization token, the remaining active stations compete again (as in step (A)) until one
station (say station B) acquires the channel and issues a response token.
*(C) Station A issues a token called token A. Station B issues token B immediately upon hearing token A. All
the other stations are listening and can determine their relative position with respect to station B as follows: for
the stations in region 1 (see Fig. 1) there is a gap (larger than td seconds) between the time token A is received
and the time token B is received. For the stations in region 2, token A is received immediately (more precisely,
td seconds) before token B. Stations in region 1 are temporarily inhibited from responding to R station location
tokens (see next step).
*(D) Station A issues an R station location token that starts a train of tokens, since all stations in region 2
(stations that are not inhibited) append a similar token to the train using the polite sensing mode. Each station
monitors the channel after transmitting its token. If a station does not hear any other transmission after its own,
it concludes that it is the right end station (R). The token has a special format, namely, the preamble length is such
that the preamble transmission time is larger than the end-to-end propagation delay. This is necessary to
guarantee that all region 2 stations correctly receive the tokens.
*(E) The newly elected R station issues an L station location token, and the left end station (L) is located using
the same procedure as above. Again, the token preamble transmission time is longer than the end-to-end
propagation delay.
*(F) The L station issues an L token to notify the end of the initialization phase.
*(G) The R station receives the L token and issues an R token, thus starting the data transfer phase.
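The region test of step (C) can be sketched as a tiny decision rule: a station compares the gap between its reception of token A and token B. The concrete value of td and the threshold between "about td" and "larger than td" below are illustrative assumptions.

```python
TD = 5e-8  # station latency td in seconds; illustrative value


def region(t_token_a, t_token_b, td=TD):
    """Classify a station's position from token A/B arrival times.

    Region 1: a gap clearly larger than td separates the two tokens
    (the station is temporarily inhibited). Region 2: token B follows
    token A after roughly td seconds.
    """
    gap = t_token_b - t_token_a
    return 1 if gap > 2 * td else 2  # threshold choice is an assumption
```

In a real implementation this comparison would run in the station interface; here it only illustrates why the two regions are distinguishable at all.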
5. OPERATION CONCEPT OF FAILURE RECOVERY AND INSERTING A NEW STATION
The failure of an intermediate station (i.e., a station located between L and R) has no impact on the
protocol; thus it does not require a specific recovery procedure. The failure of the left or right end station,
instead, requires the selection of a replacement. Assume that the R (L) station fails. The L (R) station will detect
the failure from the fact that the R-token (L-token) does not come back after the line has been idle for a
round-trip time. After learning that the R (L) station is down, the L (R) station decides it is the new R station,
issues an L station location token, and the initialization procedure is executed from step (E) on. At the end, the
new end stations are elected. Note that if the L station fails, the old R station remains the R station after the
recovery, whereas if the R station fails, the old L station becomes the new R station.
When a new station becomes active, it is necessary to allow it to join the other stations in the dynamic
sharing of the communication channel. If the new station is in an intermediate position, i.e., between station L
and station R, then it may immediately participate in the protocol procedures. If, instead, the station is either off
to the right or off to the left, then it is necessary to reinitialize the whole network. The new station decides
whether it must start the initialization procedure itself by listening to the channel.
If the new station receives the R token and reservation bursts immediately followed (after td seconds)
by the L token, then it concludes it is off to the left. If the new station detects the end of the train immediately
followed by an R token, then it concludes that it is off to the right. If the new station, by listening to the channel,
determines that it is not located between L and R, then it decides it is the new R station and issues an L station
location token, thus triggering the execution of the initialization procedure from step (E) on. At the end of the
execution of the algorithm the new end stations are elected, and the new station has become station R.
6. TOKENET NETWORK MANAGEMENT USING AUTONOMOUS AGENT APPROACH
The main objective of this section is to show how the agents fit into a standard network management
system. The approach used presents a diagram of a conventional system and then replaces its internal
components with another diagram using agent-based components with similar functionalities.
An ordinary network management system could have a workflow like the one presented in Fig. 2 [6], in which
data is collected from a managed device (1) using some standard management protocol such as SNMP. After
that, in step (2), the collected data is analyzed and then condensed, creating, in step (3), the real management
information. Later on, managers make their analysis in step (4) and, if required, react in order to correct
weak points of the system.
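The four-step workflow of Fig. 2 can be sketched as a plain pipeline: collect, condense, analyze, react. All names, the averaging rule, and the threshold-based reaction are illustrative choices of ours, not from the paper.

```python
def manage(devices, collect, threshold):
    """One pass of the Fig. 2 workflow over a set of managed devices.

    collect: device -> list of raw utilization samples (e.g. via SNMP polling).
    Returns the list of corrective actions suggested in step (4).
    """
    raw = {d: collect(d) for d in devices}                      # (1) collect
    condensed = {d: sum(v) / len(v) for d, v in raw.items()}    # (2) condense
    info = sorted(condensed.items(), key=lambda kv: -kv[1])     # (3) analyze
    actions = [f"inspect {d}" for d, load in info if load > threshold]  # (4) react
    return actions
```

In the agent-based version of Fig. 3, step (1) is taken over by collector agents (AgP), steps (2)–(3) by manager agents (AgM), and step (4) surfaces through interface agents (AgI).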
Based on Fig. 2, we replace this management approach with the agent-oriented one shown in Fig. 3. The
types of agents used are: data collector (AgP), manager (AgM), interface (AgI) and facilitator (AgF) agents,
the last of which is used to coordinate knowledge exchange between agents of the same community.
In the following, we describe each part of the generic agent’s framework shown in Fig.3 as follows:
(a) Data collector agents (AgP): this type of agent retrieves data from devices and saves the information in a local knowledge database. This knowledge can be kept and processed locally and then shared with the whole community through the agents' framework shown in Fig.4. Most of a data collector agent's functionality (namely, collecting data) is implemented by special knowledge rules loaded into its knowledge database. These rules, in association with native knowledge implemented in the agent's code, allow it to interface with managed equipment through the SNMP protocol. A single data collector may collect information from several different devices, and it can be set up with multiple goals, with one or more goals for each device.
INTERNATIONAL CONFERENCE ON SOFTCOMPUTING APPLIED IN COMPUTER AND ECONOMIC ENVIRONMENTS ICSC 2004
EPI Kunovice, Czech Republic, January 29-30, 2004
272
Fig.2 Network management workflow.
Fig.3 Network management with agents
Symbols used in Fig.3 are given by the following:
AgP: collector agent.
AgM: manager agent.
AgI: Interface agent.
AgF: Facilitator agent.
KQML: Knowledge Query and Manipulation Language.
Fig.4 Generic agent structure
(b) Management agents (AgM): these are responsible for organizing the operation of agents of the same community. In addition, they can implement a higher level of data analysis based on shared knowledge (data) coming from several different collector agents. Finally, management agents coordinate multiple data collector agents working on a special set of devices and later correlate their data. Besides these, there are other types of management agents associated with the agent community and not directly with network management environments. By this we mean that those agents will be present in any agent community, not just those dedicated to management: self-creating agents, agents with neural network structures, wizard agents (those that hold and share a massive knowledge database), and many others being projected in order to create an agent virtual world [7].
The types of agents specially set up for network management purposes are:
* Data analyzers, which analyze data retrieved from different knowledge bases, usually from distinct data collector agents.
* Data consolidators, which combine or correlate data coming from a set of distinct data collector agents, creating new consolidated information.
* Future value predictors, which are implemented using neural network structures designed for time series prediction. With this class of agents our intention is to implement entities that could, in theory, predict future values in a time series. This feature is quite helpful in designing proactive management systems.
* Value classifiers, which are implemented using neural networks as well, and permit the classification of data in scales such as HIGH, NORMAL, or LOW instead of linear values. This classification is implemented by pattern matching and is highly adjustable; it is even possible to create neural network structures that adjust themselves based on past values of the time series.
So far the neural agents have been studied as an improvement to the system functionality. Neural agents work as servers for the neural network structures, to which batches of data are submitted for processing and from which results are returned. In the future we intend to have hybrid agents, with both rule-based and neural structures, inside INTERFACE modules.
(c) Interface agents (AgI): these are the bridges between the agent communities and the human managers. There are various possibilities for these interfaces, such as e-mail interfaces, terminal-like interfaces, and web interfaces. The human manager selects the preferred interface; in some respects the friendliest is the terminal-like interface, where humans can "chat" with agents using a structured language similar to natural language. Interface agents, like any others, have their functionality structured inside the community and also the agent world. When a new sentence is input, the local parser, implemented by means of rules inside the knowledge database, tries to translate and understand it. If this is not possible locally, usually due to lack of local knowledge, the task is escalated to a more skilled agent (inside the community or not) that tries to obtain a translation. If in the end the agents still cannot understand the sentence entered by the human, the current implementation replies with an "I can't understand you" sentence and asks the human to repeat it.
(d) Facilitator agents (AgF): these are communication management agents needed in the agent community to work as brokers for the information exchange process. They also implement useful tasks related to data exchange, such as listing available service names, routing, mediation, and translation.
7. CONCLUSIONS AND FUTURE WORK
Managing the TOKENET local area network has been the aim of this paper. The reason for selecting this network type is its suitability for high-speed local area computer networks using a bidirectional bus; all network operations which must be executed through the agents are based on the transmission and reception of control packets (tokens).
Tokens issued by the network stations are used to schedule packet transmissions in the form of a train, on which a packet is admitted only after a reservation has been issued by the originating station. The algorithms for the election of the end stations, the recovery from failures, and the insertion of a new station are all based on special token transmissions. The architecture of TOKENET is completely distributed: any station may be elected an end station, depending on which stations are connected to the network at a given time. When end stations fail, replacements are elected among the stations that survived the failure.
Our future work will be driven by the following: to integrate the promising field of neural networks even more closely into the current interface module, in a transparent and extensible way. We will then have truly hybrid agents that can take advantage of the best of both worlds. For network management, a direct use is the creation of neural structures that predict temporal series dynamically; this would enable proactive network management through future value prediction.
REFERENCES
[1] TANENBAUM, A. S.: Computer Networks, Fourth Edition, Pearson Education, Inc., Prentice Hall PTR, Upper Saddle River, New Jersey, 2003.
[2] STALLINGS, W.: Data and Computer Communications, 6th Edition, Prentice Hall, Upper Saddle River, N.J., 2000.
[3] FRATTA, L., BORGONOVO, F., TOBAGI, F. A.: "The EXPRESS-NET: A Local Area Communication Network Integrating Voice and Data", Proceedings of the International Conference on Performance of Data Communication Systems and their Applications, Paris, France, September 1981.
[4] MARSAN, M. A., GERLA, M.: "TOKENET: A Token-Based Local Area Network", Proceedings of the Mediterranean Electrotechnical Conference, Athens, Greece, May 1983.
[5] CHIKHOUHOU, M. M., CONTI, P., MARCUS, K., LABETOULLE, J.: "A Software Agent Architecture for Network Management: Case Studies and Experience Gained", Journal of Network and Systems Management, Special Issue on Intelligent Agents for Telecommunications Management, Vol. 8, No. 3, September 2000.
[6] MAES, P.: "Agents that Reduce Work and Information Overload", Communications of the ACM, Vol. 37, No. 7, 1994.
MARKOV CHAINS AND EXAMPLE OF THEIR USE
Vítězslav Ševčík
Brno University of Technology, Faculty of Mechanical Engineering
Institute of Automation and Information Technology, Technická 2, 616 69 Brno
Tel.: +420 541 143 349, Fax: +420 541 143 334
Email: [email protected]
Abstract: This article clarifies and summarises the fundamentals of Markov analysis and presents some fields of usage of Markov chains as examples. Markov analysis helps us to understand a certain sort of stochastic processes, namely those that can be described by a so-called Markov chain. This article clarifies the fundamental methods and shows how the Markov chain can be utilised, focusing on the descriptive methods.
Keywords: Markov chain, Markov analysis, optimising, theory, finite-state first-order Markov chain
1. INTRODUCTION
The Markovian model is applicable in many practical situations. Therefore it can be a component of many algorithms which deal with an appropriate sort of stochastic processes. This article was written to present the basic concepts of Markov analysis and optimisation under uncertainty.
Figure 1 Stages of the process (the time axis 0, h, 2h, 3h, … is divided into Stage 1, Stage 2, Stage 3, …)
First of all, we present some attributes of processes named in honour of the Russian mathematician Andrei Andreevich Markov. A Markov process is a stochastic process with the so-called Markov property. In general, both discrete-time and continuous-time processes can be considered. In practice, the process time can be divided into intervals, which makes it discrete. These time intervals are called stages. Furthermore, let us consider that all stages are of the same duration h, as shown in Figure 1. The process can be found in one of several different states in each period of time. Let us consider that this set of possible process states is finite and discrete. We say that we have a finite and discrete state space denoted by U = {1, 2, …, u}. We consider that the set U includes all the possible states of the system; in other words, the system cannot appear in any other, unexpected state, which leads to some limitations in practice. In this case, the set of states is said to be collectively exhaustive. Furthermore, let us consider that the system cannot be found in more than one state in each period. The states are then said to be mutually exclusive.
Figure 2 System with three states (a1, a2, a3) – transition probabilities between the current stage n and the following stage n + 1
The modelling of stochastic process behaviour
Stochastic processes must have the Markov property so that a Markov chain can describe them. We say that a stochastic process has the Markov property if "the future depends only on the present, not on the past". In other words, the following stage depends only on the current stage, not on the "path" leading to the current state. That means that if we have a function which describes the transitions between states in the course of a stage, together with information about the current state of the system, we are able to model the system behaviour throughout all the following stages.
Thus, the question is: "If we are in some state now, in what state will we be in the following period?" Since the system is stochastic, there is of course no definite answer. We do not know the following state, but we know the probabilities that the system will be in a particular state after the following stage, conditioned on the current state of the system in the present stage. This is called the conditional state probability. If we denote the state of the system in period n as X_n and the state of the system in the following period n + 1 as X_{n+1}, then for the probability of being in state j in period n + 1, provided that we are in state i in period n, we write:

Pr[X_{n+1} = j | X_n = i]    (1)
Thus, the answer to our above-mentioned question about the following stage is not a single state number, but the vector of conditional state probabilities a = (a1, a2, …, as), where s is the number of possible states. This probability vector can be interpreted as the chances of moving from a particular state to each of the s possible states in the following stage. Since the system cannot appear in any other states and must occupy one and only one of the possible states, the sum of all these probabilities must equal one:

∑_{j=1}^{s} a_j = 1    (2)
The conditional state probability represented by (1), together with the number of possible states, is the basic information about the system. This probability is called the transition probability because it represents the possibility of a transition from one state to another across one stage. Let us present an illustration of a system with three possible states a1, a2, a3; Figure 2 illustrates the transition probabilities of this system. As written above, if we know the actual system stage, we can compute a probability vector that represents the condition of the system in the following period. Let the studied system be in the first state in the current stage. Therefore, the probability vector which represents the current stage is a^(n) = (1, 0, 0). (Here, the number one represents certainty, because we know it.) As we can see, in the following stage the system transfers into the second state, because the probability of this transition is 1 and any other transition must have a zero probability according to Equation 2. The probability vector of the following stage is a^(n+1) = (0, 1, 0).

Figure 3 Transition diagram

As noted in [9], the transition probability concept is the key to Markov analysis. To clarify the situation, we can show another interpretation of the transition probabilities of our example by the transition diagram. In Figure 3 we can see a mathematical graph which represents the possibilities of the system moving from state to state. It can be interpreted as a fraction of a set of elements (for example, the number of broken machines) or as the probability of one element reaching a particular state. The most frequently used representation of transition probabilities is the transition matrix:

         a1   a2   …   aj   …   as      (state at the following period)
    a1 ( p11  p12  …  p1j  …  p1s )
    a2 ( p21  p22  …  p2j  …  p2s )
     ⋮ (  ⋮    ⋮   ⋱    ⋮        ⋮ )
P = ai ( pi1  pi2  …  pij  …  pis )
     ⋮ (  ⋮    ⋮        ⋮   ⋱    ⋮ )
    as ( ps1  ps2  …  psj  …  pss )
    (rows: state in the current period)

Figure 4 General structure of the transition matrix

It is a natural way to represent a
INTERNATIONAL CONFERENCE ON SOFTCOMPUTING APPLIED IN COMPUTER AND ECONOMIC ENVIRONMENTS ICSC 2004
EPI Kunovice, Czech Republic, January 29-30, 2004
278
model of Markovian processes in the memory of a universal computer. It also enables the use of matrix algebra for Markov analysis. Let us denote this matrix P. The general structure of this matrix can be seen in Figure 4. Each element p_ij represents the transition probability from state i to state j. The rows of this matrix are the probability vectors discussed above, and Equation 2 applies to each row of matrix P; therefore we can write:
∑_{j=1}^{s} p_ij = 1    (3)
We can get the answer to our question about the probability vector in particular stages, if we know the probability vector of the initial stage and the transition matrix of the Markov chain, by means of matrix algebra. Any probability vector of the n-th stage a^(n), multiplied by matrix P, gives the probability vector of the following stage a^(n+1):

a^(n+1) = a^(n) P    (4)
For our example the transition matrix can be constructed as follows:

        (  0    1    0  )
    P = (  0   1/2  1/2 )    (5)
        ( 1/3   0   2/3 )
If we continue our numerical example, the probability vector of the following stage can be computed:

            (  0    1    0  )
    (0 1 0) (  0   1/2  1/2 ) = (0  1/2  1/2)
            ( 1/3   0   2/3 )
By means of matrix algebra we obtain the resulting probability vector a^(n+2) = (0, 1/2, 1/2); the correctness of this outcome can be verified by a glance at the graphical representation of the system transition probabilities in Figure 2 or Figure 3. Please note that the same result can be obtained by multiplying the initial state probabilities by the corresponding transition matrix power:

a^(n) = a^(0) P^n    (6)

where a^(n) is the vector of state probabilities in stage n and a^(0) in the initial stage. The proof of Equation 6 is quite simple and can be found in [8]. For our simple numerical example we get:
         (  0    1    0  )²   (  0   1/2  1/2  )
    P² = (  0   1/2  1/2 )  = ( 1/6  1/4  7/12 )
         ( 1/3   0   2/3 )    ( 2/9  1/3  4/9  )
and then for the second stage:

                    (  0   1/2  1/2  )
    a^(2) = (1 0 0) ( 1/6  1/4  7/12 ) = (0  1/2  1/2)
                    ( 2/9  1/3  4/9  )
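The stage-by-stage computations above reduce to a few lines of matrix algebra. A minimal sketch using NumPy (the matrix is the one given in (5)):

```python
import numpy as np

# Transition matrix (5) of the three-state example.
P = np.array([[0, 1, 0],
              [0, 1/2, 1/2],
              [1/3, 0, 2/3]])

a0 = np.array([1.0, 0.0, 0.0])           # initial stage: state 1 with certainty

a1 = a0 @ P                              # Equation (4): one stage ahead
a2 = a0 @ np.linalg.matrix_power(P, 2)   # Equation (6): two stages ahead

print(a1)   # a^(1) = (0, 1, 0)
print(a2)   # a^(2) = (0, 1/2, 1/2), as computed by hand above
```

The same pattern `a0 @ np.linalg.matrix_power(P, n)` gives the state distribution for any stage n.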
Types of the Markov chain
                  a1   a2   a3   a4
              a1 ( 1/2  1/2   0    0  )
P_reducible = a2 ( 1/3  2/3   0    0  )
              a3 ( 1/4  1/4  1/4  1/4 )
              a4 (  0    0   1/4  3/4 )

Figure 5 System with an ergodic set
As mentioned in the introduction, we deal with Markov chains which have a finite number of states, and we consider only the dependence of the following state on the current one. We call them finite-state first-order Markov chains (as opposed to infinite-state and higher-order ones). The dependence of successive stages is stochastic, and we describe it by the transition probability matrix P. The transition probabilities p_ij are assumed to be unchanging over time; we call them stationary. The system states can be classified according to their probability of incidence through time. When state i is accessible from state j and state j is accessible from state i, we say they communicate. If i communicates with both j and k, then j communicates with k. Communication is therefore a property of a certain class of states; we call such states ergodic, and such a set of states is called an ergodic set (a closed set). (We use the term ergodic if every part of the observed sample demonstrates the same statistical parameters as the whole sample.) Whenever the system enters this closed set, the set cannot be left.
               a1    a2   a3    a4
           a1 (  1    0    0     0   )
P_absorb = a2 (  0    1    0     0   )
           a3 ( 1/2   0   1/5   3/10 )
           a4 (  0   3/10 1/2   1/5  )

Figure 6 System with two absorbing states
An example of such a system can be seen in Figure 5. States 1 and 2 communicate with one another, but do not communicate with the two other states; we call them ergodic states. In the corresponding transition diagram we can verify that states 1 and 2 cannot be escaped once entered. We say that the process has been absorbed in this ergodic set. No matter where the process starts out, it will end in the ergodic set after a finite number of stages. In other words, the transition probabilities into states 3 and 4 approach zero in high powers of the transition matrix (compare with Equation 6). Hence states 3 and 4 are called transient (a transient set). Processes that can be split up in this way are called reducible. A process may have more than one ergodic set.
Another case is illustrated in Figure 6. This process will be absorbed either in state 1 or in state 2, which is evident from the transition diagram. States 1 and 2 are called absorbing states; in the corresponding transition matrix they can be distinguished from the other states by the ones on the leading diagonal. Another sort of finite Markov chain is the case where the system always moves from state to state. Such a simple system is

Figure 7 Cyclic system
illustrated in Figure 7. These Markov systems are called cyclic. A chain that is not cyclic is called aperiodic. The system that was used as the first example in this article (see Figure 3) is called regular. This system has a set of states which all communicate with one another. We say that the system is irreducible (it cannot be split into parts with different properties). Moreover, this system is aperiodic. Such a Markov chain is called regular, and its behaviour is well predictable, especially for long-term planning.
Steady State Probabilities
It can be observed that the probability vector of the current stage tends to remain constant after a number of transitions. In other words, the matrix power P^k approaches a unique limiting matrix as k grows. The process is then said to have reached a steady state. For a regular Markov chain this is always the case.
Figure 8 Behaviour of the regular system (expected state probabilities over 13 stages): a) one half of the machinery is prepared and the other half is in operation at the beginning; b) all machines are under maintenance at the beginning.
Since the system reaches a steady state, the probability distribution of the following stage remains the same:

a^(0) P^(k+1) = a^(k+1) = a^(k)    (7)

where a^(k) = (a1^(k), a2^(k), …, as^(k)) is the vector of steady state probabilities. Let us denote this vector as q = (q1, q2, …, qs) and the limiting matrix as Q. In view of the fact that this probability distribution remains constant over all following stages once this situation has occurred, we can write:

Q = QP    (8)

and

q = qP.    (9)
From the theory of matrix calculus, Equation 9 can be expressed as:
        a1   a2   a3    a4
    a1 (  0    0   7/10  3/10 )
P = a2 (  0    0   3/5   2/5  )
    a3 ( 1/5  4/5   0     0   )
    a4 ( 3/5  2/5   0     0   )

Figure 9 More complicated cyclic system
q_j = ∑_{i=1}^{s} p_ij q_i   for j = 1, 2, …, s.    (10)
Therefore we can express Equation 9 as a system of s simultaneous linear equations:

q1 = p11 q1 + p21 q2 + … + ps1 qs
q2 = p12 q1 + p22 q2 + … + ps2 qs
 ⋮
qs = p1s q1 + p2s q2 + … + pss qs    (11)
In the system of linear equations (11) one equation is always redundant (i.e. it presents the same information as that given by another equation in a different form), and hence the system cannot be solved for a unique solution. Therefore we must replace one of these equations by another one. We can expediently use Equation 2 in the following form:

∑_{j=1}^{s} q_j = 1.    (12)
If we want to obtain the steady state probabilities of our system described by transition matrix (5), the following equations can be used:

q1 = 0·q1 + 0·q2 + (1/3)q3
q2 = 1·q1 + (1/2)q2 + 0·q3
q3 = 0·q1 + (1/2)q2 + (2/3)q3    (13)
We can directly substitute the first equation for q1 into the equations for q2 and q3. We then get two equations which both express the same thing, thus one is redundant:

q2 = (1/3)q3 + (1/2)q2  ⇒  (1/2)q2 = (1/3)q3
q3 = (1/2)q2 + (2/3)q3  ⇒  (1/3)q3 = (1/2)q2
This is a common property of all three equations and, generally, of all systems of equations constructed this way, because of the transition matrix condition (see Equation 3). Therefore we replace one of the three equations by the equation q1 + q2 + q3 = 1 and then obtain the following solution:

q1 = 1/6,  q2 = 1/3,  q3 = 1/2
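The manual substitution generalises: replace one equation of system (11) by the normalisation (12) and solve the resulting linear system. A sketch in NumPy:

```python
import numpy as np

# Transition matrix (5) of the example.
P = np.array([[0, 1, 0],
              [0, 1/2, 1/2],
              [1/3, 0, 2/3]])
s = P.shape[0]

# q = qP is equivalent to (P^T - I) q = 0; one equation is redundant,
# so we replace the last row by the normalisation sum(q) = 1 (Equation 12).
A = P.T - np.eye(s)
A[-1, :] = 1.0
b = np.zeros(s)
b[-1] = 1.0

q = np.linalg.solve(A, b)
print(q)   # approximately (1/6, 1/3, 1/2)
```

This works for any regular chain, since the replaced system is nonsingular there.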
Whatever state this system starts in, it will soon end up with this probability distribution over the individual states. Now we can connect our example with a practical planning problem: consider that we have a group of machines.
Figure 10 Steady state and cyclic behaviour of the system (expected state probabilities over 33 stages)
Each machine can be in one of three states in the course of operation: state 1 = "prepared to operate", state 2 = "in operation", state 3 = "out of order". Then, if 150 machines are needed in production and we know the steady state probabilities, we can easily compute how many machines we need to buy and how much space we need for machinery maintenance. In our example, the 150 machines in operation correspond to one third of the whole set (q2 = 1/3); therefore we need to buy 3 × 150 = 450 machines. Of this number, we must reserve space for the expected 450 × 1/2 = 225 machines in the repair shop. Figure 8 shows the expected system behaviour through 13 stages. In the first graph (8a) we can see a system with 50 % of the machines in operation and 50 % prepared at the beginning. The second graph (8b) shows the same system, but with all machines under maintenance at the beginning. In both cases the system soon settles into the stable form.
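The fleet arithmetic follows directly from the steady state vector; a small sketch (the state names follow the example above):

```python
# Steady state probabilities of the machine example:
# "prepared to operate", "in operation", "out of order".
q = {"prepared": 1/6, "in_operation": 1/3, "out_of_order": 1/2}

needed_in_operation = 150

# In steady state a fraction q["in_operation"] of the fleet is working,
# so the fleet size is the required number divided by that fraction.
fleet = needed_in_operation / q["in_operation"]
repair_shop_places = fleet * q["out_of_order"]

print(round(fleet), round(repair_shop_places))   # 450 225
```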
1.1. More complicated systems
Consider the system characterised in Figure 9. This system forms one ergodic set, and we can obtain the vector of steady state probabilities (computed numerically: a = [0.19642, 0.30357, 0.25892, 0.24107]). However, situations can occur in which the system cannot reach these steady state probabilities, depending on the initial state.
Figure 10 contains two graphs showing the distributions of the expected probabilities in particular stages through 33 periods of time. In the first graph the stable behaviour of this system can be seen: it reaches the equilibrium probability very soon when the initial probability vector is a^(0) = (1/4, 1/4, 1/4, 1/4). The second graph in Figure 10 shows the same system under another initial condition, a^(0) = (3/5, 0, 2/5, 0). From this graph we can see that in this case the system behaviour is cyclic. Therefore this system is not regular. More generally, we can say that systems that are not regular need not reach steady state probabilities even if they exist.
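This contrast can be verified numerically by iterating Equation 4 from both initial vectors and comparing successive stages; a sketch (the matrix is the one from Figure 9):

```python
import numpy as np

# Transition matrix of the system in Figure 9.
P = np.array([[0, 0, 7/10, 3/10],
              [0, 0, 3/5, 2/5],
              [1/5, 4/5, 0, 0],
              [3/5, 2/5, 0, 0]])

def iterate(a0, n):
    """Apply Equation 4 n times: a^(n) = a^(0) P^n."""
    a = np.array(a0, dtype=float)
    for _ in range(n):
        a = a @ P
    return a

# Balanced start: successive stages become indistinguishable (steady state).
a_even = iterate([1/4, 1/4, 1/4, 1/4], 50)
a_odd = iterate([1/4, 1/4, 1/4, 1/4], 51)
print(np.abs(a_even - a_odd).max())   # practically zero

# Unbalanced start: the distribution keeps oscillating with period 2.
b_even = iterate([3/5, 0, 2/5, 0], 50)
b_odd = iterate([3/5, 0, 2/5, 0], 51)
print(np.abs(b_even - b_odd).max())   # stays large: no steady state reached
```

The oscillation reflects the chain's period: states {1, 2} and {3, 4} exchange all probability mass at every step, so only a start with equal mass in both groups can converge.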
First Passage Time
In our example of "machines in operation" a question might be of interest: "How long will it take to go from state 'out of order' to state 'prepared to operate'?", which means from state 3 to state 1. In the terminology of Markov analysis this is called the first passage time, and it means the expected number of transitions required before the process moves from state i to state j for the first time. Let m_ij be the number of transitions from state i to state j. It is possible to calculate the expected number of transitions (sometimes called the average number of transitions) using the following equation:

m_ij = 1 + ∑_{k≠j} p_ik m_kj    (14)
thus, in our example, we can compute m31 as follows:

m31 = 1 + p32 m21 + p33 m31
m31 = 1 + 0·m21 + (2/3)·m31
m31 = 3
Therefore we can say that the machine remains in the repair shop for three stages of the process on average. In this case the result depends only on the transition probability p33, as we can easily see from the transition diagram (Figure 3). Generally, there can be many more dependencies, and therefore more computing steps are needed to find the first passage time.
More information can be found in [2] or [10].
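For a fixed target state j, Equation 14 is a linear system in the unknowns m_ij, i ≠ j, and can be solved directly; a sketch in NumPy reproducing m31 = 3:

```python
import numpy as np

# Transition matrix (5) of the machine example.
P = np.array([[0, 1, 0],
              [0, 1/2, 1/2],
              [1/3, 0, 2/3]])

def first_passage_times(P, j):
    """Expected first passage times m_ij into target state j (0-based).

    For fixed j, Equation 14, m_ij = 1 + sum_{k != j} p_ik * m_kj,
    is the linear system (I - Q) m = 1, where Q is P with row and
    column j deleted."""
    others = [i for i in range(P.shape[0]) if i != j]
    Q = P[np.ix_(others, others)]
    m = np.linalg.solve(np.eye(len(others)) - Q, np.ones(len(others)))
    return dict(zip(others, m))

m = first_passage_times(P, 0)   # target: state 1, "prepared to operate"
print(m[2])   # m31: about 3 stages from "out of order" to "prepared"
print(m[1])   # m21: about 5 stages from "in operation" to "prepared"
```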
CONCLUSION
In this paper I tried to demonstrate some properties of Markov analysis. Thanks to these properties, Markov analysis is a suitable descriptive tool for stochastic processes in different branches of industry.
The paper focused on the possibilities of utilisation in management, production and process planning. The above analysis is also a suitable tool for computer support, especially in the management of various enterprises. Therefore, I recommend wider utilisation of software applying these methods. Due to the applicability of the above methods, it would be good to motivate Czech software firms to produce software that includes them.
REFERENCES
[1] GRINSTEAD, C. M., SNELL, J. L.: Introduction to probability. GNU version, distributed under GPL, 1997
[2] KEMENY, J. G., SNELL, J. L.: Finite Markov Chains. Springer-Verlag, New York 1976
[3] KEMENY, J. G., SNELL, J. L., GERALD, L. T.: Introduction to Finite Mathematics. New GNU version,
distributed under GPL
[4] KLAPKA, J., DVOŘÁK, J., POPELA, P.: Metody operačního výzkumu. VUTIUM, Brno 2001
[5] REIBMAN, A., SMITH, R., TRIVEDI, K. S.: Markov and Markov reward model transient analysis: An
overview of numerical approaches. European Journal of Operational Research, 40, pp. 257-267, 1989
[6] STEVENSON, W. J.: Introduction to Management Science. Irwin, Boston 1998
[7] STEWART, W. J.: Introduction to the Numerical Solution of Markov Chains. Princeton University Press,
Princeton 1995
[8] ŠEVČÍK, V.: Využití stochastických procesů markovského typu k řešení ekonomických problémů. Diploma thesis, VUT Brno, 2002
[9] TURBAN, E., MEREDITH, J.: Fundamentals of Management Science. Irwin, Boston 1991
[10] WALTER, J.: Stochastické modely v ekonomii. SNTL/Alfa 1970
APPLICATIONS OF THE TWO-DIMENSIONAL HELLINGER AND SHANNON
QUASI-NORM
Petr Jurák, Zdeněk Karpíšek
Department of Statistical and Optimization Methods of the Institute of Mathematics
at the BUT Faculty of Mechanical Engineering, Technická 2, 616 69 Brno
E-mail: [email protected], Tel.: +420-541 142 532
Abstract. This paper is concerned with the solution to a statistical problem of finding the two-dimensional simultaneous discrete probability distribution of a discrete random vector by
minimizing the Hellinger distance and Shannon pseudo-distance with given constraints. The
constraints are chosen in a general form using suitable functions linear in unknown probabilities.
An economic example is used to illustrate the theoretical results.
Keywords: Hellinger quasi-norm, Shannon quasi-norm, probability estimations, distribution fitting
1. INTRODUCTION
Knowing the probability distribution is an important but often neglected part of finding a solution to the majority of statistical problems. Usually, estimates are made of such distributions using previous experience, asymptotic properties of statistics, etc. However, problems with multi-modality and the dimension of the probabilities to be found are often encountered. Therefore, methods of searching for the probability distribution in question based on entropy [1,2] or on distribution distances seem to be a reasonable option [3,4].
The paper is concerned with making estimates of the simultaneous discrete probability distribution of a random vector (X, Y) based on its observed values. Certain constraints are specified on the searched-for probabilities using suitable functions φ_ijk(x, y), i = 1, …, n1, j = 1, …, n2, k = 0, 1, …, K. In addition, this probability distribution p = (p_ij), i = 1, …, n1, j = 1, …, n2, must have a minimum distance from a given probability distribution q = (q_ij), i = 1, …, n1, j = 1, …, n2. Since this is a discrete problem in 2D, we can use the Hellinger distance [3] given as

D(p, q) = ∑_{i=1}^{n1} ∑_{j=1}^{n2} (√p_ij − √q_ij)²    (1.1)

and, next, the Shannon pseudo-distance [4] defined below:

S(p, q) = ∑_{i=1}^{n1} ∑_{j=1}^{n2} (p_ij ln p_ij − q_ij ln q_ij).    (1.2)

The distribution p can then be found either by non-linear optimization [5] or directly, by deriving a system of non-linear equations. Assume that the observed random vector (X, Y) takes on a finite number of different pairs of values (x_i, y_j) with unknown probabilities p_ij = P(X = x_i, Y = y_j), i = 1, …, n1, j = 1, …, n2, n1 > 1, n2 > 1. Observing this random vector yields a two-dimensional sample ((x_1, y_1), …, (x_n, y_n)), which, when sorted, results in a sorted sample ((x_i, y_j), f_ij / n), where f_ij is the frequency of the observed pair (x_i, y_j). The following theorems are proved in [8].
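The two criteria (1.1) and (1.2) are straightforward to evaluate numerically; a minimal sketch (with the convention 0·ln 0 = 0; the test distributions below are illustrative, not data from the paper):

```python
import numpy as np

def hellinger(p, q):
    """Hellinger distance (1.1) between two discrete 2D distributions."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    return np.sum((np.sqrt(p) - np.sqrt(q)) ** 2)

def shannon_pseudo(p, q):
    """Shannon pseudo-distance (1.2), taking 0*ln(0) = 0 by convention."""
    def plogp(x):
        x = np.asarray(x, float)
        safe = np.where(x > 0, x, 1.0)   # ln(1) = 0 where x == 0
        return x * np.log(safe)
    return np.sum(plogp(p) - plogp(q))

# Illustrative 2x2 distributions.
p = np.array([[0.25, 0.25], [0.25, 0.25]])
q = np.array([[0.4, 0.1], [0.4, 0.1]])
print(hellinger(p, q), shannon_pseudo(p, q))
```

Minimizing either quantity over p subject to the linear constraints is then an ordinary constrained optimization problem.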
2. HELLINGER QUASI-NORM
Table 4.2 – Probability estimates using the Hellinger quasi-norm
y_j \ x_i      1        2        3        4        5        6        7        8        9       10       11       12
  5      0.015334 0.014640 0.013992 0.013386 0.012818 0.012286 0.011787 0.011317 0.010875 0.010458 0.010064 0.009693
 10      0.014296 0.013617 0.012986 0.012398 0.011848 0.011335 0.010854 0.010403 0.009980 0.009582 0.009207 0.008854
 15      0.013360 0.012698 0.012085 0.011515 0.010985 0.010490 0.010028 0.009596 0.009191 0.008811 0.008455 0.008119
 20      0.012513 0.011869 0.011275 0.010724 0.010212 0.009736 0.009293 0.008879 0.008492 0.008130 0.007791 0.007472
 25      0.011743 0.011119 0.010543 0.010011 0.009518 0.009061 0.008635 0.008239 0.007870 0.007525 0.007202 0.006900
 30      0.011043 0.010438 0.009881 0.009367 0.008893 0.008453 0.008045 0.007667 0.007314 0.006985 0.006678 0.006390
 35      0.010404 0.009817 0.009279 0.008783 0.008327 0.007905 0.007514 0.007152 0.006815 0.006501 0.006209 0.005936
 40      0.009818 0.009250 0.008730 0.008253 0.007813 0.007408 0.007034 0.006687 0.006365 0.006066 0.005787 0.005528
 45      0.009281 0.008731 0.008229 0.007768 0.007346 0.006957 0.006598 0.006266 0.005958 0.005673 0.005408 0.005160
Table 4.3 – Probability estimates using the Shannon quasi-norm
y_j \ x_i      1        2        3        4        5        6        7        8        9       10       11       12
  5      0.015094 0.014470 0.013872 0.013298 0.012749 0.012222 0.011716 0.011232 0.010768 0.010323 0.009896 0.009487
 10      0.014165 0.013559 0.012979 0.012424 0.011892 0.011384 0.010897 0.010431 0.009985 0.009558 0.009149 0.008758
 15      0.013292 0.012705 0.012143 0.011607 0.011094 0.010603 0.010135 0.009687 0.009259 0.008849 0.008458 0.008084
 20      0.012474 0.011905 0.011362 0.010843 0.010349 0.009876 0.009426 0.008996 0.008585 0.008194 0.007820 0.007463
 25      0.011706 0.011155 0.010630 0.010130 0.009654 0.009199 0.008766 0.008354 0.007961 0.007586 0.007229 0.006889
 30      0.010985 0.010453 0.009946 0.009464 0.009005 0.008569 0.008153 0.007758 0.007382 0.007024 0.006684 0.006360
 35      0.010309 0.009794 0.009306 0.008841 0.008400 0.007981 0.007583 0.007205 0.006845 0.006504 0.006179 0.005871
 40      0.009674 0.009178 0.008707 0.008260 0.007836 0.007434 0.007053 0.006691 0.006347 0.006022 0.005713 0.005419
 45      0.009078 0.008600 0.008146 0.007717 0.007310 0.006924 0.006559 0.006213 0.005886 0.005575 0.005281 0.005003
Fig. 4.1: Probability estimates using the Hellinger quasi-norm – 3-D plot of p_ij and the relative frequencies f_ij/n over x = 1,...,12 and y = 5,...,45
Fig. 4.2: Probability estimates using the Shannon quasi-norm – 3-D plot of p_ij and the relative frequencies f_ij/n over x = 1,...,12 and y = 5,...,45
Next we perform a chi-square test for both estimated distributions at a significance level of α = 0.05.
For the number of degrees of freedom df = 12·9 − 3 − 1 = 104, the critical interval for not rejecting the
hypothesis that the distributions fit well is ⟨0; 127.689288⟩. The value of the test statistic for the distribution
obtained by minimizing the Hellinger quasi-norm is

χ²_H = n1·n2 Σ_{i=1}^{n1} Σ_{j=1}^{n2} ( f_ij/n − p_ij )² / p_ij = 16.74351002

and, for the Shannon quasi-norm, it is

χ²_S = n1·n2 Σ_{i=1}^{n1} Σ_{j=1}^{n2} ( f_ij/n − p_ij )² / p_ij = 16.10787788.
Thus we do not reject the good-fit hypothesis at a significance level of α = 0.05. However, by adding
further constraints, we may calculate the probability values more precisely.
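The test statistic above is straightforward to reproduce. A minimal sketch (the n1·n2 factor follows the formula as the paper writes it; the function name and the toy 2×2 data are our own):

```python
def chi_square_stat(freq, p, n):
    """Statistic n1*n2 * sum_ij (f_ij/n - p_ij)^2 / p_ij, as used in the text.

    freq -- matrix of observed frequencies f_ij
    p    -- matrix of hypothesised probabilities p_ij
    n    -- sample size
    """
    n1, n2 = len(freq), len(freq[0])
    return n1 * n2 * sum(
        (freq[i][j] / n - p[i][j]) ** 2 / p[i][j]
        for i in range(n1)
        for j in range(n2)
    )

# toy check: 4 observations over a 2x2 support, hypothesised uniform p_ij = 0.25
stat = chi_square_stat([[2, 0], [1, 1]], [[0.25, 0.25], [0.25, 0.25]], n=4)
```

The fit hypothesis is not rejected when the statistic falls inside the critical interval quoted in the text, here ⟨0; 127.689288⟩ for df = 104.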
5. CONCLUSION
This paper is an extension of the theoretical results of [2,3,4] and their applications. These results may
be implemented on a PC either using non-linear optimization or by finding a solution to a system of non-linear
equations for the Lagrange multipliers. In Example 4.1 a non-linear system was implemented and solved in Maple.
However, the numerical solution of the system of non-linear equations proved to be strongly dependent on the
starting values. Example 4.1 also shows that the results achieved correspond very well to the observed values
but, to achieve greater accuracy, further constraints have to be added (on the variance, for one). Due to its
statistical properties and flexibility, the Shannon quasi-norm seems to be more suitable than the Hellinger one.
However, both quasi-norms produce very similar estimates.
REFERENCES
[1] KARPÍŠEK, Z. Statistical Properties of Discrete Probability Distributions with Maximum Entropy. Folia
Fac. Sci. Nat. Univ. Masarykianae Brunensis, Mathematica 9, Brno 2001, p. 21-32, ISBN 80-210-2544-1.
[2] KARPÍŠEK, Z., JURÁK, P. Modelling of Probability Distribution with Maximum Entropy. In: MENDEL '01,
7th International Conference on Soft Computing. Brno, 2001, p. 232-239, ISBN 80-214-1894-X.
[3] KARPÍŠEK, Z., JURÁK, P. Estimate of Discrete Probability Distribution by Means of Hellinger Distance.
In: MENDEL '02, 8th International Conference on Soft Computing. Brno, 2002, p. 301-306, ISBN 80-214-2135-5.
[4] KARPÍŠEK, Z., JURÁK, P. Odhady diskrétních rozdělení pravděpodobnosti pomocí kvazinorem, In: 1.
Mezinárodní matematický workshop – Brno 2002, 7 pp, CD-ROM, ISBN 80-86433-16-1.
[5] KLAPKA, J., DVOŘÁK, J., POPELA, P. Metody operačního výzkumu. Brno: PC-DIR, 1996.
[6] PITMAN, E. J. G. Some Basic Theory for Statistical Inference. New York: John Wiley & Sons, 1978.
[7] SHANNON, C., WEAVER, W. The Mathematical Theory of Communications. Urbana, Illinois, 1949.
[8] JURÁK, P. Diskrétní rozdělení pravděpodobnosti s maximální entropií. PhD Thesis (to appear).
[9] JURÁK P., KARPÍŠEK Z. Hellingerova a Shannonova kvazinorma ve dvourozměrném prostoru. Aplimat
2004, Bratislava (to appear).
The paper is part of the work on the CEZ: J22/98: 261100009 research design entitled „Non-traditional
methods for studying complex and vague systems“.
THE NEW MODEL OF SYSTEM FUZZY RELIABILITY
Pavel Jelínek, Zdeněk Karpíšek
Department of Statistical and Optimization Methods of the Institute of Mathematics
at the BUT Faculty of Mechanical Engineering, Technická 2, 616 69 Brno
E-mail: [email protected], Tel.: +420-541 142 532
Abstract: Using Yager-type fuzzy probability, the paper defines fuzzy reliability generated by one
probability measure. An algorithm computing the fuzzy reliability of a combined system for a fuzzy time
to failure is presented. The results can be used to determine the reliability of real objects or systems
in cases where the observed random data are of a vague numerical type.
Keywords: fuzzy possible time of failure, fuzzy reliability generated by probability measure, fuzzy
reliability for fuzzy possible time of failure
1. INTRODUCTION
The basic notion in the theory of probability is a random event (a subset of the basic space Ω), which
may or may not occur depending on the implementation of a certain set of conditions (of a random experiment).
We assume that, for a particular implementation, we can decide whether this event has or has not occurred.
However, in practice, this requirement may not be complied with in a simple way. This is the case, for example,
if a die is cast on a rugged surface or if we formulate a random event using vague linguistic terms such as: „a
small number will be the result“, „the quantity will assume a value approximately equal to x0“ or „a small time to
failure“. Such events may be suitably interpreted by fuzzy sets. On the other hand, the probability value itself
may be of a vague nature expressed, for example, as „a very probable event“, „a more or less probable event“ or
„the probability is approximately 0.9“. These inaccurate values may also be described by fuzzy sets and fuzzy
numbers [2,3].
2. YAGER-TYPE FUZZY RELIABILITY
A fuzzy number B̃ = (ℝ⁺, μ_B̃(t)) with a non-decreasing membership function μ_B̃(t), where μ_B̃(0) = 0, is
called a fuzzy possible time of failure. The set of all fuzzy possible times to failure is denoted ℬ.
Let P be a crisp probability measure defined on the universal set ℝ⁺ and P̃ the fuzzy probability measure
generated by the measure P [4,5,6]. Let, further, B̃ = (ℝ⁺, μ_B̃(t)) be a fuzzy possible time of failure and ℬ the set of
all fuzzy possible times to failure. Let A* be the set of all generalized fuzzy numbers [4]. The fuzzy function
R̃ : ℬ → A*, where R̃(B̃) = P̃(B̃), is called a fuzzy reliability generated by the probability measure P. R̃(B̃) is
called the fuzzy reliability for the fuzzy possible time of failure B̃.
Let P̃ be a fuzzy probability measure defined on the universal set ℝ⁺ and R the crisp reliability
corresponding to the measure P. Let, further, R̃ be the fuzzy reliability generated by the measure P and
B̃ = (ℝ⁺, μ_B̃(t)) a fuzzy possible time of failure. We have

μ_R̃(B̃)(p) = sup { α | α ∈ [0; 1], R(B_α1) = p }.
3. ALGORITHM COMPUTING FUZZY RELIABILITY OF SYSTEMS
A number of methods are used to calculate the reliability of a combined system of mutually
independent elements using their reliabilities. The algorithm that follows is based on a combination of a list
method and a method of paths. The list method is based on establishing the set of all the possible logical events
in the system that, subsequently, serves to calculate the actual reliability of the system using mutually
independent random events. A path in the graph of our system is defined as a sequence of arcs connecting
different vertices between the input and output nodes of the graph. The path can then be used to determine the
resulting system state. To calculate all the paths and their states the following property [1] is used.
We assume that a combined reliability system S of n elements has a non-empty simple acyclic digraph
with an adjacency matrix A = (a_kl), k, l = 1,...,n. Then an element c_kl of the matrix C = (E − A)⁻¹ − E, where E is the
unit matrix, expresses all the paths from vertex k to vertex l in the given graph if the arithmetic operation of
addition + is replaced by the logical addition ∨ and the arithmetic product · by the logical product ∧ in the
strings c_kl of elements a_uv, u, v = 1,...,n. Moreover, for k ≠ l, we have c_kl = D_lk where D_lk is the algebraic
complement of the element d_lk of the matrix D = E − A. In the matrix A, we put a_kl = 0 exactly if no arc goes
from vertex k to vertex l; otherwise a_kl = A_i where A_i is an element of the system S. This is actually a one-to-one
mapping of the set of the elements A_i of the system S to the set of arcs a_kl. If the graph is not simple (with
multiple arcs), it can be transformed into a homeomorphic simple graph by, for example, suitably halving arcs.
This makes it possible to calculate all the paths from the input vertex k = 1 to the output vertex l = n of the
graph for the system S using D_kl for any particular variation of the states of the list elements and, subsequently,
to determine the state the system is in. If S directly denotes the state of the system and the state of the given
element is substituted for A_i, then the state of the system is S = sgn(c_1n) = sgn(D_n1) where, in the algebraic
complement D_n1, we put a_uv = 1 or 0 if the state of the corresponding element is, according to the list, A_i = 1
or 0, respectively. This is because c_lk expresses, for given states A_i, the number of uninterrupted paths, and
c_lk = 0 exactly if no path from vertex k to vertex l exists for the given states A_i.
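The reachability content of this property can be sketched with boolean matrices: replacing + by ∨ and · by ∧, the series A ∨ A² ∨ ... ∨ A^(n−1) plays the role of C = (E − A)⁻¹ − E, and c_1n > 0 exactly if an uninterrupted path survives for the given element states. The sketch below (with hypothetical helper names; the symbolic bookkeeping of the strings of a_uv is elided) only answers whether such a path exists:

```python
def bool_matmul(P, A):
    """Matrix product over the (or, and) semiring."""
    n = len(A)
    return [[int(any(P[k][m] and A[m][l] for m in range(n)))
             for l in range(n)] for k in range(n)]

def path_matrix(A):
    """Boolean analogue of C = (E - A)^(-1) - E: C = A v A^2 v ... v A^(n-1)."""
    n = len(A)
    C = [[0] * n for _ in range(n)]
    P = [row[:] for row in A]
    for _ in range(n - 1):          # paths of length 1, 2, ..., n-1
        C = [[C[k][l] or P[k][l] for l in range(n)] for k in range(n)]
        P = bool_matmul(P, A)
    return C

def system_state(A):
    """S = sgn(c_1n): 1 iff a path leads from input vertex 1 to output vertex n."""
    return int(path_matrix(A)[0][-1] > 0)

# hypothetical simple acyclic digraph with arcs 1->2, 2->3, 2->4, 3->4;
# a_kl = 1 iff the arc exists and its element is currently up
A_up = [[0, 1, 0, 0],
        [0, 0, 1, 1],
        [0, 0, 0, 1],
        [0, 0, 0, 0]]
```

Setting an arc's entry to 0 models the failure of the corresponding element; the system is up while `system_state` remains 1.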
Next we assume that all the system elements are mutually independent and the fuzzy possible time of
failure B̃ has a continuous membership function μ_B̃(t). Now, at an arbitrary (non-negative) time t, we can
establish the resulting fuzzy reliability of the system by performing the steps of the so-called FJKS algorithm:

1. Generate the list of all possible states of the elements of the given system in the form of a
   matrix with 2^n rows formed by all the n-th class variations (A_1, ..., A_n) of the two-element
   set {0, 1} with repetition (these are in fact the binary numbers from 0 to 2^n − 1).
2. Using the algebraic complement D_n1, calculate, for each variation, the state of the system
   S_j = sgn(c_1n) = sgn(D_n1), j = 1,...,2^n, where, for the elements a_kl of the adjacency matrix,
   we substitute the logical value of the state of the corresponding system element from step 1.
3. Denote by

   R_S(t) = Σ_{j=1}^{2^n} { S_j ∏_{i=1}^{n} [ (1 − R_i(t))^(1−A_i) (R_i(t))^(A_i) ] }

the reliability of the system S where R_i(t) is the reliability of A_i (we put 0⁰ = 1). For the fuzzy possible time of
failure B̃, the resulting fuzzy reliability R̃_S(B̃) of the system has the membership function

μ_R̃_S(B̃)(p) = 1                              for p < R_S(min Ker(B̃)),
               α such that p = R_S(B_α1)      for p ∈ [ R_S(min Ker(B̃)), R_S(min Supp(B̃)) ],
               0                              for p > R_S(min Supp(B̃)).
INTERNATIONAL CONFERENCE ON SOFTCOMPUTING APPLIED IN COMPUTER AND ECONOMIC ENVIRONMENTS ICSC 2004
EPI Kunovice, Czech Republic, January 29-30, 2004
292
4. EXAMPLE
Figure 1 shows the combined system of three mutually independent elements.
Fig. 1: Combined system S (element 1 in series with the parallel pair of elements 2 and 3)
The reliabilities of all elements are the same:

R_1(t) = R_2(t) = R_3(t) = exp(−t/60).
The fuzzy possible time of failure B̃(τ) depends on the time parameter τ. Its membership function (Figs. 2, 3) is

μ_B̃(τ)(t, τ) = 0                  for t < 3τ/4,
               (4t − 3τ)/τ        for t ∈ [ 3τ/4, τ ],
               1                  for t > τ,

where the parameter τ ∈ ℝ⁺ means the end of the time to failure and the value 3τ/4 means the possible end of the time
to failure.
Fig. 2: Graph of μ_B̃(τ)(t, τ)

Fig. 3: Graph of μ_B̃(40)(t, 40)
Using the FJKS algorithm we get the reliability of the system S:

R_S(t) = (exp(−t/60))² (1 − exp(−t/60)) + (exp(−t/60))² (1 − exp(−t/60)) + (exp(−t/60))³
       = 2 exp(−t/30) − exp(−t/20).
Figures 4 and 6 show the graphs of the membership function of the fuzzy reliability of an element for the
fuzzy possible time of failure B̃(τ). Figures 5 and 7 show the graphs of the membership function of the fuzzy
reliability of the system S for the same fuzzy possible time of failure B̃(τ).

Fig. 4: Graph of μ_R̃1(B̃(τ))(p, τ) for an element

Fig. 5: Graph of μ_R̃S(B̃(τ))(p, τ) for the system S

Fig. 6: Graph of μ_R̃1(B̃(40))(p, 40) for an element

Fig. 7: Graph of μ_R̃S(B̃(40))(p, 40) for the system S
5. CONCLUSION
The selection of the membership function for the fuzzy possible time of failure is of a subjective nature and
can be based either on an expert's opinion or on statistical data of a vague nature used for expert evaluations.
This restricts the applicability and interpretation of the conclusions arrived at. On the other hand, using fuzzy
set theory is of some advantage since it can be used to obtain more plausible, even though approximate, solutions
to actual problems than those achieved using „classical“ methods. We have presented a fuzzy reliability model
which can deal with possibly uncertain and precarious information about the observed probability distribution. This
fuzzification, contrary to Zadeh-type fuzzy reliability [2], assumes only one reliability measure, and its concept is
nearer to the classical theory of probability. We plan to implement the model on a PC for system fuzzy
reliability calculations.
REFERENCES
[1] KARPÍŠEK, Z., JELÍNEK, P. Algoritmus pro výpočet fuzzy spolehlivosti systému. In: Sborník konference
1. Mezinárodní matematický workshop – Brno 2002. FAST VUT, CD-ROM, p. 6, 2002, ISBN 80-86433-16-1.
[2] KARPÍŠEK, Z., JELÍNEK, P. Fuzzy stochastické metody modelování spolehlivosti. In: Sborník celostátního
semináře Analýza dat 2002/II. Lázně Bohdaneč, p. 90 – 103, 2002, ISBN 80-239-0204-0.
[3] KARPÍŠEK, Z., SLAVÍČEK, K. Fuzzy stochastické metody. In: 2nd International Conference APLIMAT
2003 (part I). Bratislava, p. 141 – 148, 2003, ISBN 80-227-1813-0.
[4] KARPÍŠEK, Z., SLAVÍČEK, K. Two Fuzzy Probability Measures. In: 3rd Conference of the European
Society for Fuzzy Logic and Technology EUSFLAT 2003. Zittau, p. 669 – 674, 2003, ISBN 3-9808089-4-7.
[5] SLAVÍČEK, K. Fuzzy pravděpodobnostní míra (Fuzzy Probability Measure). PhD Thesis. FSI VUT, Brno
2002.
[6] YAGER, R. R. A Note on Probabilities on Fuzzy Events. In: Information Science (18), Elsevier, p. 113 –
129, 1979.
The paper is part of the work on the CEZ: J22/98: 261100009 research design entitled „Non-traditional
methods for studying complex and vague systems“.
ON THE APPLICATION OF INTELLIGENT AUTONOMOUS AGENTS
FOR MANAGING THE REAL TIME COMMUNICATION NETWORK
Ibrahiem M. M. El Emary
Faculty of Applied Science, Al Balqa Applied University, Al Salt, Jordan.
E-mail:[email protected]
Abstract. This paper is concerned with a practical application of distributed artificial intelligence
for managing the real time communication networks. This type of network was chosen because it is
needed in interactive applications and all those requiring low data delivery time. A double bus
broadcasting architecture is presented after a short description of the three principal network
families, and its characteristics are outlined, showing how it is possible to introduce a priority
scheme to reduce data delay. To manage such a network well, we present a new tool, based on Intelligent
Autonomous Agents, that works better than the currently available ones. An Intelligent Autonomous Agent is
essentially software that assists people and acts on their behalf: Autonomous Agents help automate repetitive
tasks, remember rules that users frequently forget, and summarize complex data in understandable,
human-friendly reports or alerts.
Keywords: Distributed Artificial Intelligence, Autonomous Agents, Neural Networks, High Priority
Minipackets (HPMP), LAN, CSMA/CD, SNMP, KQML, CORBA, MLOG, KB, Inference,
MIB, NMS and CMIP.
1. INTRODUCTION
When computer networks started to grow in the early years, a management protocol was badly needed
to monitor LAN faults. Specialists first built the first SNMP protocol using ICMP (Internet Control Message
Protocol) messages; Ping messages were used to test the connections between the stations to make sure that
there was no cut or downed cable in the network [1].
The main task of network management system researchers is to develop new tools that work better
than the currently available ones for doing this laborious work. The work presented in this paper is directed at
delegating as much work as possible to the machine, using network administrators as knowledge engineers
who teach the machine how it should perform its work, based on what is called an Intelligent Autonomous Agent.
There are a few predictable advantages of applying Intelligent Autonomous Agents to any management system,
network management being just one. First of all, there is their intelligent nature, which sounds promising: having
an intelligent and adaptable system is usually better than having dedicated applications for specific solutions.
Likewise, the word Autonomous conveys the idea of something that can work by itself and needs almost no human
interference. Finally, Agent gives the impression of a helper or a wizard that somehow works between the
machine and the human. So, the main objective proposed here is to design a system where the human system
administrators do not have to work as the main workforce available. Instead, we want them to work as the
truly intelligent – meaning cognitive – element in the management process, feeding the system with the rules of
work – or knowledge – it needs to operate: no more boring work, but intelligent work.
The network suggested to be managed using the above tool is a local computer network,
since LANs have seen great development in various fields such as office and industrial automation,
permitting the sharing of expensive resources, satisfying the need for local collection of information, and
providing fast data exchange for process control. A great proliferation of different architectures has thus arisen,
characterized by simple topologies and simple network interfaces; communications are normally very fast. There
are, however, some applications requiring real-time communications, that is, communications where the data
delivery time Td is short compared to the evolution time of the processes needing to communicate. As such a
requirement must necessarily be satisfied if the application needs it, it is necessary to make Td as short as possible.
Surely, if the transmission data rate increases, the communication channel holding time decreases, so that, at
traffic parity, the delay time in the queue to gain access to the network decreases. This is not, however, an
efficient way to achieve real-time communication. In fact, increasing the data transmission rate requires
performing in hardware some functions that can normally be performed in software. Moreover, expensive
hardware is necessary to work at a high bit rate. Finally, the communication protocols must also be accurately
designed to suit the high speed required. With the aim of examining a real-time communication network
managed by Intelligent Autonomous Agents to improve its performance, section 2 describes
three types of local networks – bidirectional broadcasting, token passing, and unidirectional
broadcasting – examining how data requirements influence their real-time behavior. Section 3 presents a double bus
network architecture, pointing out the benefits produced by using two different paths for reservation and
data frames, and examining how, even with distributed management, it is possible to achieve message
priority. Section 4 presents the criteria that should be verified in well-planned agents. Section 5 describes the
architecture of the Intelligent Autonomous Agents. Section 6 presents the implementation of the Autonomous
Agents. Section 7 concludes this paper with conclusions and future work.
2. ARCHITECTURE OF THE WELL KNOWN LAN
Three families of Local Area Network (LAN) are currently considered: bidirectional broadcasting
networks, ring networks, and unidirectional broadcasting networks. In the first, we find Ethernet, Net/One, Z-net,
etc. Such a system uses a common coaxial transmission cable connecting all the communicating devices by means
of a passive interface, so achieving simple and reliable broadcast communication. The problem of controlling
access to the communication channel has given rise to many different access techniques; the most widely used is
"carrier sense multiple access with collision detection" (CSMA/CD), which is satisfactory when exchanging files,
graphics, or other non-interactive data, but can delay voice too long (more than the maximum value Td max
allowed by the vocoder's rate) when there is a heavy load. Better behavior can be obtained with the slotted Ethernet
technique, where collision is minimized by means of a time slotting mechanism and message transmission is
guaranteed within a fixed time [1].
In the second, we find Primenet, the Cambridge ring, the DCS ring, etc. In a ring network, messages are passed
from node to node along a unidirectional cable by means of an active interface; a token is circulated around the
ring, and a station has the right to transmit data on the ring only when it holds the token. Such an access technique,
using an appropriate protocol, ensures a maximum message delivery time, so that it can satisfy user requirements
when there is a light load; when traffic increases, and voice and data are mixed, the delay can however reach a bad
value [2].
In the third family, we find Expressnet. In a unidirectional broadcasting network, transmission signals
are forced to propagate on the cable in only one direction by means of special taps that attenuate the signal in
the opposite direction. In this Expressnet type, there are two interconnected unidirectional channels,
implemented by folding a single cable, so that each station can receive messages on one channel and transmit
data on the other. This system provides the advantages of broadcast communication and the regularity of
token ring communication. This architecture has shown good behavior with voice and data.
All the above architectures show satisfactory behavior in normal conditions, but in a hard
environment they cannot satisfy all the user requirements in terms of delivery time and reliability. It is possible
to decrease the delay time Td by increasing the transmission bit rate, but this is not an efficient way to use the
communication channel, because network traffic is normally bursty, so that, only for a short period of
time, a data traffic jam can arise with long data queues. For this reason it is better to distinguish data on the basis
of their delivery time requirements, giving a high priority to short data that need real-time
communication, such as voice packets, and a lower priority to data which can be delayed or which reserve the
communication channel for a long time. In this way it is possible to integrate interactive data on every network by
means of an adequate protocol. If the network has to be used in a distributed process control system – the matter of
this paper – it is moreover necessary to take the environmental constraints and the application requirements into
consideration.
3. STRUCTURE AND OPERATION CONCEPT OF THE DOUBLE BUS NETWORK
A considerable improvement in throughput and reliability and a low delivery time can be achieved by a
double bus network. Reliability is often a very important parameter; many networks provide a redundant cable to
be put into service in case of failure of the main one, so avoiding stopping the entire network. Such a use of the
redundant communication path is, however, a waste; in fact, with a little more hardware it is possible to use both
channels at the same time. In this way we can improve throughput, as there are two independent paths where data
can travel, and in case of failure of one bus, communication can continue (in a reduced way) on the other one.
Moreover, it is easy to implement a protocol that makes provision for reservations and high priority packets so
that urgent messages can travel with low delay; finally, it is possible to provide full duplex transmission (all the
previously mentioned tasks represent the main jobs required from the network management technique).
Double bus architecture surely improves the behavior of any type of network; in this section a
bidirectional broadcasting local network is considered. The block diagram of the simplified controller hardware is
shown in Fig. 1. As the figure shows, each station uses two independent buffers for received data and one for
transmitted data, thus permitting full duplex service. There are two receivers permanently connected to each
cable and only one transmitter, which can be switched onto one of the cables. In this way both channels are
continuously monitored, whereas it is possible to transmit on only one channel at a time. Bus 1 is used to transmit
data packets after a successful reservation; Bus 2 is used to transmit reservation minipackets, voice packets, and
also data packets. When a user has to send data, it must perform a reservation by sending a reservation
minipacket on Bus 2. If the reservation is successful, all users update their reservation tables by including the one
just performed. As reservations are performed with the CSMA/CD technique, a collision can occur; in this case,
the conflicting users must reschedule transmission after an opportune delay, depending on the adopted contention
resolution protocol.
All minipackets must have a length of at least 2Td (where Td is the time necessary to connect the
most remote users) so that a collision is surely detected if there is one. When a user has finished its reservation,
Bus 2 is available for other reservations; with this mechanism, a priority chain on Bus 1 is produced, according
to the following rule: the first reserved user is the first to transmit data. A maximum number N of reservations
can be performed, to avoid a too long reservation queue. Data transmission on Bus 1 is collisionless,
as it is performed on the basis of the reservation order. When a user has become the first in the reservation
table, it can begin to transmit its data packets after detecting the end of carrier (EOC) on Bus 1. Using Bus 2 only
to perform reservations is obviously a restrictive way to employ the communication channel bandwidth; at this
point it is useful to introduce the high priority minipacket (HPMP) concept. Such a minipacket can be a
fast data packet (for example a voice packet), or a reservation packet for a long data packet which needs fast
delivery.
To avoid the possibility of collision between HPMPs and normal minipackets, a time slot mechanism
can be used, so that, after each transmission on Bus 1, a time slot t = 2.5Td is reserved for high priority
minipackets; within this time no normal reservation can be performed. Moreover, HPMPs can be transmitted even
when the maximum number N of normal reservations has been reached.
Of course, collisions can occur between HPMPs, but in this case a contention resolution procedure must
act. An HPMP reservation is identified by all users and causes a back shift of all normal reservations, whereas the
user that has performed the prioritized reservation is moved forward in the reservation table.
A further improvement in using the communication channel can be achieved by transmitting data
packets on Bus 1 when there are long data queues. If the reservation table is full, the first user in the list, instead
of waiting for Bus 2 to become available, sends its data packet on Bus 1. No collision with normal minipackets is
possible, as the reservation table is full and no user will try to perform a normal reservation; to avoid collision
with HPMPs, it is sufficient to delay such a data transmission by 2.5Td.
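The ordering rules of the reservation table (first reserved transmits first, a cap of N pending reservations, and HPMPs back-shifting normal reservations) can be sketched as a small queue model. This illustrates only the ordering logic, not the timing; the class and method names and the value of N are assumed:

```python
from collections import deque

N_MAX = 8   # assumed maximum number N of pending normal reservations

class ReservationTable:
    """Sketch of the reservation table each station keeps for Bus 1 access."""

    def __init__(self):
        self.table = deque()

    def reserve(self, user):
        """Normal reservation minipacket on Bus 2: FIFO, refused when full."""
        if len(self.table) >= N_MAX:
            return False
        self.table.append(user)
        return True

    def reserve_priority(self, user):
        """HPMP: the prioritized user moves forward, back-shifting the rest."""
        self.table.appendleft(user)
        return True

    def next_to_transmit(self):
        """On end of carrier (EOC) on Bus 1, the first reserved user sends."""
        return self.table.popleft() if self.table else None
```

A voice packet reserved via `reserve_priority` thus overtakes every pending normal reservation, matching the back-shift rule described above.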
Fig. 1: Block diagram of the simplified controller hardware (two receivers on Bus 1 and Bus 2, one switchable
transmitter with encoder, collision detection, EOC detection and decoders, one output and two input high-speed
buffers, and a control unit on a common control bus). Single lines represent serial data paths and double lines
bit-parallel data paths.
4. CRITERIA TO BE VERIFIED IN A WELL PLANNED AGENTS
Our objective is to develop a decentralized computer network management platform borrowing existing
concepts from distributed artificial intelligence. The guidelines for the system are: high degree of adaptability,
module reusability, mobility, self-generation and environment plasticity. These guidelines sound simple, but they actually hide what we have deep in mind: "To create a system that can really work by itself, generate a new system for a new environment alone, develop new components for new circumstances, and learn as much as possible by itself, asking us only when there is no other way". From these guidelines, we derived seven commandments that drove us toward what we could call "measurable" goals. These commandments are the following:
Agents must be truly autonomous; this means that any agent must be able to control itself and its own actions. Agents should also be able to work independently of what happens to other agents, and have enough freedom to adapt themselves to any new environment dynamics. Without this property we would have just an ordinary system in which every new event demands the full attention of the system administrator, who must react to the new environmental conditions and set up new operating parameters.
Agents should be goal-driven and rule-based; a set of rules constitutes the knowledge base for any administration system, human ones included. Goals, which are rules themselves, are the motivation agents have to work. We define static goals as parameters set up during agent creation, and dynamic goals as those added to the agent's behavior in response to environmental changes or incoming messages.
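The static/dynamic distinction can be made concrete with a toy illustration (ours, not the paper's code): static goals are fixed at construction time, while events append dynamic goals at run time.

```python
class Agent:
    """Toy illustration of static goals (fixed at creation) versus dynamic
    goals (added at run time); names are hypothetical."""

    def __init__(self, static_goals):
        self.static_goals = list(static_goals)   # set up during agent creation
        self.dynamic_goals = []                  # derived from events or messages

    def on_event(self, event):
        # an environmental change or incoming message adds a dynamic goal
        self.dynamic_goals.append(f"handle:{event}")

    def goals(self):
        return self.static_goals + self.dynamic_goals

a = Agent(["monitor-links"])
a.on_event("link-down")
assert a.goals() == ["monitor-links", "handle:link-down"]
```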
The agent environment must behave as a global knowledge base, where each data item or piece of information stored by one agent is sharable throughout the whole system (by other agents). Agents' intercommunication and knowledge exchange are essential to ensure that all needed information is available for decision making. Besides, in the global system (meaning worldwide over the Internet), one agent could learn a new rule from another community and share it within its own community, thus increasing the overall community's knowledge. Human interaction is reduced when one automatic system can learn new tricks from another automatic system that is more knowledgeable in some field.
Agents should be capable of learning new skills, either from other agents, the community of agents, the global community or, as a last resort, human beings. These new rules should be stored dynamically in the agent's local knowledge database. The other possibility is inferring new rules from environment data combined with current knowledge.
Agents should be self-generating; they should adapt themselves to new environments and generate subsets of their own characteristics, creating new elements for new problems. This allows an agent to become a self-generating system, reducing human interaction as much as possible.
Specifically for network management, agents should interface with commercial solutions using standard protocols; this means that agents should know, through rules in their databases, how to interact with well-known protocols such as SNMP and CMIP (Common Management Information Protocol), how to retrieve information from log files, and also how to interact with other existing management systems.
Finally, agents should have a human-friendly interface; the friendliest interface between humans and machines is a natural-language-like protocol. In the meantime, we should use a more formal specification for the human-agent communication language and provide Application Program Interfaces (APIs) for developing other interfaces, such as HTML, NNTP and TELNET.
5. ARCHITECTURE OF THE INTELLIGENT AUTONOMOUS AGENTS
In the work presented here, we define Autonomous agents as any software that assists people and acts
on their behalf. How they are internally implemented or externally appear is up to the environment where they
are inserted, and up to their developers. These discrepancies between implementations can be clearly observed by comparing different works on this subject, such as [3, 4, 5].
Therefore the main idea is that an autonomous agent is a software application that works by letting human users delegate tasks to it. These applications should work independently but, on the other hand, should interact with their human masters in order to acquire new knowledge and learn new tricks. For network management applications, autonomous agents can help by automating repetitive tasks, remembering rules that users frequently forget, and summarizing complex data in understandable, human-friendly reports or alerts.
To illustrate the framework for a generic agent: the objective of the generic agent concept is to define the basic structures, mechanisms and abilities that guarantee a minimum standard behavior, internal and external, for any agent that is created. In other words, a generic agent is a template that the agency system itself uses to create new agents. On top of this template, new functionality can be added so that the newly created agent best fits its destination environment [6]. The generic agent concept defines the following three modules, as shown in Fig. 2.
Fig. 2 Generic agent structure (modules: INFERENCE, INTERACTION, COMMUNICATION)
Fig. 3 Detail of the INFERENCE module (components: RR, CV, CC, KB)
*Inference module; which implements the rule deduction engine and also stores the knowledge database, as shown in Fig. 3. The symbols denoting the various components of the inference module are the following:
RR: rule resolution component
CV: life cycle component
CC: cooperation component
KB: knowledge base
*Interaction module; which performs the interface between the agent and the environment.
*Communication module; which implements the message exchange procedures between agents. Communications are implemented by means of the Knowledge Query and Manipulation Language (KQML) [7], which was chosen for its strong orientation toward agent applications.
For extensibility purposes, the generic agent architecture is being adapted to be compatible with the Common Object Request Broker Architecture (CORBA). We want agents to interface with CORBA services, either making use of distributed objects or serving as CORBA repositories sharing internal functionalities.
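The three-module decomposition can be sketched as follows. The paper's agents are implemented in Java; this Python sketch is our illustration of the structure only, and all class, method and variable names here are hypothetical.

```python
class InferenceModule:
    """Rule deduction plus knowledge base (KB); cf. Fig. 3."""
    def __init__(self):
        self.kb = {}                 # KB: rule name -> rule (a callable here)
    def resolve(self, rule, *args):
        return self.kb[rule](*args)

class InteractionModule:
    """Interface between the agent and its environment (e.g. SNMP variables)."""
    def __init__(self, environment):
        self.environment = environment
    def read(self, variable):
        return self.environment.get(variable)

class CommunicationModule:
    """Message exchange between agents; the paper uses KQML for this."""
    def __init__(self):
        self.outbox = []
    def send(self, performative, receiver, content):
        # minimal KQML-style message (field set is our assumption)
        self.outbox.append(f"({performative} :receiver {receiver} :content {content})")

class GenericAgent:
    """Template from which new agents are created; extra functionality
    would be layered on top of these three modules."""
    def __init__(self, environment):
        self.inference = InferenceModule()
        self.interaction = InteractionModule(environment)
        self.communication = CommunicationModule()

agent = GenericAgent({"ifOperStatus": "up"})
agent.inference.kb["link_ok"] = lambda status: status == "up"
status = agent.interaction.read("ifOperStatus")
assert agent.inference.resolve("link_ok", status) is True
agent.communication.send("tell", "manager", "link_ok")
assert agent.communication.outbox == ["(tell :receiver manager :content link_ok)"]
```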
6. IMPLEMENTATION OF THE AUTONOMOUS AGENTS
The kernel of every agent in the system is exactly the one defined in the generic agent structure described in the section above; thus there is only one source code for every agent. The distinction between agents is made through the rules loaded into the knowledge base and the native rules compiled into the code. Native rules follow the Java definition and are used strictly when the task is so specific that a knowledge rule could not be implemented instead.
The rule language is called Mlog. Mlog is a subset of the Prolog language, optimized for size and resource consumption. It is possible to submit knowledge to an agent in the IF-THEN-ELSE rule format, and the agent will translate it to Mlog format, helped by interface agents that carry IF-THEN-ELSE-to-Mlog translation knowledge. Many Prolog programs will run on an Mlog-based agent, and Prolog converters should allow any Prolog application to be made compatible. Programming in Mlog consists of:
(a) Declaring some facts about objects and their relationships (for network management, objects can be seen as devices).
(b) Defining some rules about objects and their relationships.
(c) Asking questions about objects and their relationships.
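Steps (a)-(c) can be illustrated with a deliberately tiny, hypothetical rule engine. The real Mlog is a Prolog subset; the restrictions here (single-argument predicates, one-condition rules) and all names are our simplifications for illustration.

```python
class MlogLite:
    """A tiny stand-in for an Mlog-style knowledge base, not the authors' engine."""

    def __init__(self):
        self.facts = set()      # (a) facts: (predicate, args) tuples
        self.rules = []         # (b) rules: head(X) :- body(X), as (head, body) pairs

    def fact(self, predicate, *args):
        self.facts.add((predicate, args))

    def rule(self, head, body):
        self.rules.append((head, body))

    def ask(self, predicate, *args):
        # (c) question: true if stated as a fact or derivable through a rule
        if (predicate, args) in self.facts:
            return True
        return any(head == predicate and self.ask(body, *args)
                   for head, body in self.rules)

kb = MlogLite()
kb.fact("router", "r1")             # declare a fact about a device
kb.rule("manageable", "router")     # manageable(X) :- router(X)
assert kb.ask("router", "r1")
assert kb.ask("manageable", "r1")   # derived through the rule
assert not kb.ask("manageable", "h9")
```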
One major difference between Prolog and Mlog is that Mlog handles exceptions such as unknown clauses or invalid values instead of simply emitting an error message. Rules are not analyzed when added to the knowledge base (KB) but during execution; if an unknown clause is reached inside rule resolution, an unknown-clause exception is thrown and the cooperation component is activated. The agent will thus try to learn that clause from the community before deciding to interrupt that execution sequence. The same applies to invalid values and other exceptions. Once Mlog is understood, it becomes straightforward to see how the inference module is implemented (see Fig. 3). The knowledge base (KB) is implemented as a set of Mlog sentences, as in Prolog. The rule resolution component (RR) is the Mlog engine; it is based on a complete but reduced Prolog engine that processes the KB rules against the expression being executed. The cooperation component (CC) is a new feature, not usual in Prolog systems. It implements learning capabilities and is used either when new knowledge is required to accomplish a rule resolution, or when new knowledge is offered to the agent by the community. In other words, it handles how to add knowledge to or remove it from the internal knowledge base, and how to ask the community for new knowledge and later add it. A flowchart of the interplay between CC and RR is presented in Fig. 3.
Fig. 3 Basic rule resolution flowchart: execute rule(par1, par2, ...) -> initialize variables -> load information about the rule being executed from the knowledge base -> if the rule is unknown, learn it from the community -> analyze parameters -> if the rule is native, execute the native rule(par1, par2, ...); otherwise perform Mlog rule resolution.
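The flowchart's logic can be sketched directly in code. This is our Python illustration of the control flow only (the real components are Mlog/Java); the exception, class and rule names are hypothetical.

```python
class UnknownClause(Exception):
    """Raised during execution (not at KB load time) when a clause is missing."""

class Community:
    """Stand-in for the other agents that may already know the missing rule."""
    def __init__(self, shared):
        self.shared = shared
    def teach(self, name):
        if name not in self.shared:
            raise UnknownClause(name)
        return self.shared[name]

def execute_rule(name, kb, native, community):
    """Follow the flowchart: look the rule up; if unknown, learn it from the
    community (CC); then run it natively or via Mlog resolution (RR)."""
    if name not in kb and name not in native:
        kb[name] = community.teach(name)     # cooperation component activated
    if name in native:
        return native[name]()                # native (Java-like) rule
    return kb[name]()                        # Mlog rule resolution

community = Community({"ping_ok": lambda: True})
kb, native = {}, {"reboot": lambda: "rebooted"}
assert execute_rule("reboot", kb, native, community) == "rebooted"
assert execute_rule("ping_ok", kb, native, community) is True  # learned from community
assert "ping_ok" in kb                                         # and added to the KB
```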
The life cycle component (CV) is an internal component, implemented in Mlog, that compiles the agent's set of goals and has the agent pursue them within a certain amount of time. It submits these goals, one by one, for execution in the rule resolution component.
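One possible reading of that description, sketched in Python: the CV iterates over the goals within a time budget and hands each one to rule resolution. The time budget and all names are our assumptions.

```python
import time

def life_cycle(goals, resolve, budget_s=0.05):
    """Illustrative CV: pursue the agent's goals for a limited amount of time,
    submitting them one by one to rule resolution (RR, here `resolve`)."""
    results = []
    deadline = time.monotonic() + budget_s
    for goal in goals:
        if time.monotonic() >= deadline:   # time budget exhausted
            break
        results.append((goal, resolve(goal)))
    return results

done = life_cycle(["check-links", "check-cpu"], resolve=lambda g: f"{g}:ok")
assert done == [("check-links", "check-links:ok"), ("check-cpu", "check-cpu:ok")]
```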
The interaction module is not a processing module per se, but a set of rules and native methods declared inside an agent to interface with environmental variables. In network management applications, SNMP interface clauses are implemented in this module. The goal of creating an independent module for these rules and native rules is to make the system functionality easier to understand. From the compilation point of view, the independent modules are linked into the code depending on the functions to be exercised by the agent being built.
Finally, the communication module is much like the interaction module: a set of rules and native methods targeting communication procedures. Matters such as TCP/IP packet exchange, KQML protocol formatting, communication session control, and other aspects of information exchange are handled in this module.
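The KQML formatting part can be illustrated with a small helper. The field names below (:sender, :receiver, :language, :content) follow common KQML usage; the exact parameter set the authors used is not specified in the paper, so treat this as an assumption.

```python
def kqml(performative, sender, receiver, content, language="Mlog"):
    """Format a KQML-style performative as a string (illustrative only)."""
    return (f"({performative} :sender {sender} :receiver {receiver} "
            f":language {language} :content \"{content}\")")

msg = kqml("ask-one", "agent1", "agent2", "router(r1)")
assert msg == '(ask-one :sender agent1 :receiver agent2 :language Mlog :content "router(r1)")'
```

In a full system this string would be carried over a TCP/IP session managed by the same module.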
7. CONCLUDING COMMENTS AND FURTHER DEVELOPMENT
As we have seen in this paper, there are quite a few predictable advantages to using intelligent autonomous agents for managing a computer network. First of all, there is their intelligent nature, which sounds promising: having an intelligent and adaptable system is usually better than having dedicated applications for specific solutions. It was also shown that applying distributed artificial intelligence offers significant advantages over a large monolithic system, among them the following:
System modularity; it is easier to build and maintain a collection of quasi-independent modules than one huge one.
Efficiency; not all knowledge is needed for all tasks. By modularizing it, we gain the ability to focus the intelligent autonomous agents' efforts where they are most likely to pay off.
A communication structure that enables information to be passed back and forth among agents.
As future work to be done by others, we recommend the following points:
* To create a series of new agent modules and module plug-ins, in order to make agents available to an increasing number of new working environments.
* To create a global wizard structure, that is, a set of Internet-available agents to which local agents will refer when they need new, locally unknown information. In these structures, as much information as possible will be stored for all the different fields where agents will be applied.
REFERENCES
[1] TANENBAUM, A. S.: Computer Networks, Fourth Edition, Prentice Hall PTR, Upper Saddle River, NJ 07458, 2003.
[2] KALYANARAMAN, S.: Simple Network Management Protocol, Rensselaer Polytechnic Institute, 2000.
[3] KAUTZ, H. A.: Bottom-up Design of Software Agents, Communications of the ACM, Vol. 37, No. 7, USA, 1994.
[4] MAES, P.: Agents that Reduce Work and Information Overload, Communications of the ACM, Vol. 37, No. 7, USA, 1994.
[5] FERBER, J.: Multi-Agent Systems: An Introduction to Distributed Artificial Intelligence, pp. 4-8, Addison-Wesley, France, 1999.
[6] CICHOCKI, A.: Architectures and Electronic Implementation of Neural Network Models, in: Neural Networks for Optimization and Signal Processing, John Wiley & Sons, UK, 1993.
[7] KAHANI, M., BEADLE, P. H. V.: Decentralized Approaches for Network Management, Computer Communications Review, ACM SIGCOMM, Vol. 27, No. 3, pp. 36-47, USA, 1997.
ON THE APPLICATION OF INTELLIGENT AUTONOMOUS AGENTS
FOR MANAGING THE REAL TIME COMMUNICATION NETWORK
Ibrahiem M. M. El Emary
Faculty of Applied Science, Al Balqa Applied University, Al Salt, Jordan.
E-mail:[email protected]
Abstract. This paper is concerned with a practical application of distributed artificial intelligence to managing real-time communication networks. This type of network was chosen because it is needed in interactive applications and in all those requiring a low data delivery time. After a short description of the three principal network families, a double-bus broadcasting architecture is presented and its characteristics are outlined, showing how a priority scheme can be introduced to reduce data delay. To manage such a network well, we present a new tool, better than those currently available, based on Intelligent Autonomous Agents: software that assists people and acts on their behalf. Autonomous Agents help by automating repetitive tasks, remembering rules that users frequently forget, and summarizing complex data in understandable, human-friendly reports or alerts.
Keywords: Distributed Artificial Intelligence, Autonomous Agents, Neural Networks, High Priority Minipackets (HPMP), LAN, CSMA/CD, SNMP, KQML, CORBA, Mlog, KB, Inference, MIB, NMS and CMIP.
1. INTRODUCTION
When computer networks started to grow in the early years, a management protocol was badly needed to monitor LAN faults. Before the first SNMP protocol, specialists used ICMP (Internet Control Message Protocol) messages: ping messages were used to test the connections between stations and make sure that no cable in the network was cut or down [1].
The main task of network management system researchers is to develop new tools that work better than those currently available for this laborious work. The work presented in this paper is directed at delegating as much work as possible to the machine, using network administrators as knowledge engineers who teach the machine how to perform its work, based on what is called the Intelligent Autonomous Agent.
There are a few predictable advantages to using Intelligent Autonomous Agents in any management system, network management being just one. First of all, there is the intelligent nature, which sounds promising: having an intelligent and adaptable system is usually better than having dedicated applications for specific solutions. Likewise, the word autonomous gives us the idea of something that can work by itself and needs almost no human interference. Finally, agent gives the impression of a helper or a wizard that somehow works between the machine and the human. So, the main objective proposed here is to design a system where the human system administrators do not have to work as the main available workforce. Instead, we want them to work as the truly intelligent (meaning cognitive) element in the management process, feeding the system with the rules of work, or knowledge, it needs to operate: no more boring work, only intelligent work.
The network suggested to be managed with the above tool is the local computer network, since LANs have seen great development in various fields such as office and industrial automation, permitting the sharing of expensive resources, satisfying the need for local collection of information, and providing fast data exchange for process control. A great proliferation of different architectures has thus arisen, characterized by simple topologies and simple network interfaces; communications are normally very fast. There are, however, some applications requiring real-time communication, that is, communication where the data delivery time Td is short compared to the evolution time of the processes that need to communicate. As such a requirement must be satisfied whenever the application needs it, it is necessary to make Td as short as possible.
Surely, if the transmission data rate increases, the communication channel holding time decreases, so that, for the same traffic, the queueing delay to gain access to the network decreases. This is not, however, an efficient way to obtain real-time communication. In fact, increasing the data transmission rate requires performing in hardware some functions that can normally be performed in software. Moreover, expensive hardware is needed to work at a high bit rate. Finally, the communication protocols must also be carefully designed to suit the required high speed. With the aim of examining a real-time communication network managed with Intelligent Autonomous Agents to improve its performance, section 2 describes three types of local networks: bidirectional broadcasting, token passing, and unidirectional broadcasting, examining how data requirements influence their real-time behavior. In section 3, a double-bus network architecture is presented, pointing out the benefits of using two different paths for reservation and data frames, and examining how, even with distributed management, message priority can be achieved. Section 4 presents the criteria that should be verified in well-planned agents. Section 5 describes the architecture of the Intelligent Autonomous Agents. Section 6 presents the implementation of the Autonomous Agents. Section 7 concludes the paper with future work.
2. ARCHITECTURES OF THE WELL-KNOWN LANs
Three families of Local Area Network (LAN) are currently considered: bidirectional broadcasting networks, ring networks and unidirectional broadcasting networks. In the first we find Ethernet, Net/One, Z-net, etc. Such systems use a common coaxial transmission cable connecting all the communicating devices by means of a passive interface, so achieving simple and reliable broadcast communication. The problem of controlling access to the communication channel has given rise to many different access techniques; the most used is carrier sense multiple access with collision detection (CSMA/CD), which is satisfactory when exchanging files, graphics, or other non-interactive data, but can delay voice too long (more than the maximum value Td max allowed by the vocoder's rate) when there is a heavy load. Better behavior can be obtained with the slotted Ethernet technique, where collisions are minimized by means of a time-slotting mechanism and message transmission is guaranteed within a fixed time [1].
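The collision-detection requirement behind CSMA/CD can be quantified with a back-of-the-envelope calculation: a frame must outlast one cable round trip (2Td) so that the sender is still transmitting when a collision signal returns. The cable length, bit rate and propagation speed below are illustrative values of ours, not figures from the paper.

```python
def min_frame_bits(cable_m, bit_rate_bps, v_m_per_s=2e8):
    """Minimum frame length (in bits) for reliable collision detection:
    the frame must last at least 2*Td, one full round trip on the cable."""
    td = cable_m / v_m_per_s            # one-way propagation delay Td
    return 2 * td * bit_rate_bps        # bits transmitted during 2*Td

# e.g. 2500 m of cable at 10 Mbit/s: 2*Td = 25 microseconds -> 250 bits minimum
assert round(min_frame_bits(2500, 10e6)) == 250
```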
In the second family we find Primenet, the Cambridge Ring, the DCS ring, etc. In a ring network, messages are passed from node to node along a unidirectional cable by means of an active interface; a token circulates around the ring, and a station has the right to transmit data on the ring only when it holds the token. This access technique, with an appropriate protocol, ensures a maximum message delivery time, so it can satisfy user requirements when the load is light; when traffic increases and voice and data are mixed, however, the delay can reach unacceptable values [2].
In the third family we find Expressnet. In a unidirectional broadcasting network, transmitted signals are forced to propagate on the cable in only one direction by means of special taps that attenuate the signal in the opposite direction. In this Expressnet type there are two interconnected unidirectional channels, implemented by folding a single cable, so that each station can receive messages on one channel and transmit data on the other. This system combines the advantages of broadcast communication with the regularity of token-ring communication, and the architecture has shown good behavior with voice and data.
All the above families' architectures show satisfactory behavior in normal conditions, but in a hard environment they cannot satisfy all user requirements in terms of delivery time and reliability. It is possible to decrease the delay time Td by increasing the transmission bit rate, but this is not an efficient use of the communication channel, because network traffic is normally bursty, so that only for short periods of time can a traffic jam with long data queues arise. For this reason it is better to distinguish data on the basis of their delivery-time requirements, giving high priority to short data needing real-time communication, such as voice packets, and lower priority to data that can be delayed or that reserve the communication channel for a long time. In this way interactive data can be integrated on any network by means of an adequate protocol. If the network is to be used in a distributed process control system, the subject of this paper, it is moreover necessary to take the environmental constraints and the application requirements into consideration.
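Delivery-time-based priority can be sketched as a two-class scheduler: short real-time packets (e.g. voice) are served before bulk data regardless of arrival order. The priority values and names here are illustrative, not taken from the paper.

```python
import heapq

def schedule(packets):
    """Serve voice packets before data packets, preserving arrival order
    within each class (a simple two-class priority queue)."""
    heap = []
    for order, (kind, name) in enumerate(packets):
        priority = 0 if kind == "voice" else 1       # voice gets higher priority
        heapq.heappush(heap, (priority, order, name))
    return [heapq.heappop(heap)[2] for _ in range(len(heap))]

out = schedule([("data", "d1"), ("voice", "v1"), ("data", "d2"), ("voice", "v2")])
assert out == ["v1", "v2", "d1", "d2"]
```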
3. STRUCTURE AND OPERATION CONCEPT OF THE DOUBLE BUS NETWORK
A considerable improvement in throughput and reliability, together with a low delivery time, can be achieved with a double-bus network. Reliability is often a very important parameter: many networks provide a redundant cable to be activated in case of failure of the main one, so that the entire network does not stop. Such a use of the redundant communication path is, however, a waste; in fact, with a little more hardware it is possible to use both channels at the same time. In this way throughput improves, as there are two independent paths on which data can travel, and in case of failure of one bus, communication can continue (in a reduced way) on the other one. Moreover, it is easy to implement a protocol that provides for reservations and high-priority packets, so that urgent messages travel with low delay; finally, full-duplex transmission can be provided (all the tasks mentioned above represent the main jobs required of the network management technique).
The double-bus architecture surely improves the behavior of any type of network; in this section a bidirectional broadcasting local network is considered. The block diagram of the simplified controller hardware is shown in Fig. 1. As the figure shows, each station uses two independent buffers for received data and one for transmitted data, thus permitting full-duplex service. There are two receivers permanently connected to the cables and only one transmitter, which can be switched onto either cable. In this way both channels are continuously monitored, whereas it is possible to transmit on only one channel at a time. Bus 1 is used to transmit data packets after a successful reservation; Bus 2 is used to transmit reservation mini packets, voice packets, and also data packets. When a user has to send data, it must perform a reservation by sending a reservation mini packet on Bus 2. If the reservation is successful, all users update their reservation tables to include the one just performed. As reservations are performed with the CSMA/CD technique, a collision can occur; in this case the conflicting users must reschedule transmission after an opportune delay, depending on the adopted contention-resolution protocol.
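The paper leaves the contention-resolution protocol open ("an opportune delay"). One common choice, used by Ethernet's CSMA/CD, is truncated binary exponential backoff; the sketch below is our illustration, with an illustrative slot time and a fixed random seed for reproducibility.

```python
import random

def backoff_delay(attempt, slot_s=2.5e-5, rng=random.Random(1)):
    """Truncated binary exponential backoff: after the i-th collision, wait
    a random number of slots drawn from 0 .. 2^min(i,10) - 1."""
    k = min(attempt, 10)                 # truncation keeps the range bounded
    return rng.randrange(2 ** k) * slot_s

d1 = backoff_delay(1)
assert d1 in (0.0, 2.5e-5)               # attempt 1 draws from {0, 1} slots
assert backoff_delay(3) <= 7 * 2.5e-5    # attempt 3 draws from 0..7 slots
```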
All mini packets must have a length at least equal to 2Td (where Td is the time necessary to connect the most remote users), so that a collision is surely detected if one occurs. When a user has finished its reservation, Bus 2 is available for other reservations; with this mechanism a priority chain on Bus 1 is produced, according to the following rule: the first user to reserve is the first to transmit data. A maximum number N of reservations can be performed, to avoid an excessively long reservation queue. Data transmission on Bus 1 is collisionless, as it is performed on the basis of the reservation order: when a user has become the first in the reservation table, it can begin to transmit its data packets after detecting the end of carrier (EOC) on Bus 1. Using Bus 2 only to perform reservations is obviously a restrictive way to employ the communication channel bandwidth; at this point it is useful to introduce the high-priority mini packet (HPMP) concept. Such a mini packet can be a
speed data packet (for example a voice packet), or a reservation packet for a long data packet which need a speed
delivery.
To avoid the possibility of collision between HPMP and normal mini packets, a time slot mechanism
can be used, so that, after each transmission on Bus 1, a time slot t=2.5Td is reserved for high priority mini
packets; within this time no normal reservation can be performed. Moreover HPMP can be transmitted also when
the maximum number N of normal reservations which can be performed, has been reached.
Of course collisions can occur between HPMP, but in this case a procedure to solve contention must
act. A HPMP reservation is identified by all users and causes a back shift to all normal reservations, whereas the
user that has performed the prioritized reservation is moved forward in the reservation table.
A further improvement in using the communication channel can be achieved by transmitting data
packets on bus 1 when there are long data queues. If the reservation table is full, the first user in the list instead
of waiting for bus 2 available sends its data packet on bus 1. No collision is possible with normal mini packets as
reservation table is full and no user will try to perform a normal reservation, to avoid collision with HPMP, it will be
sufficient to delay such a data transmission for 2.5Td.
INTERNATIONAL CONFERENCE ON SOFTCOMPUTING APPLIED IN COMPUTER AND ECONOMIC ENVIRONMENTS ICSC 2004
EPI Kunovice, Czech Republic, January 29-30, 2004
299
Bus 1
Bus 2
Reciever1
Collision
Detect.
Encoder
Output
high
speed
buffer
Reciever2
E.O.C
E.O.C
Decoder
Decoder
Control
E
status
Input
high
speed
buffer 1
Input
high
speed
buffer 2
Control bus
Fig.1 Block diagram of the simplified controller hardware.
Single lines represent serial data paths and double lines are bit parallel data paths
4. CRITERIA TO BE VERIFIED IN A WELL PLANNED AGENTS
Our objective is to develop a decentralized computer network management platform borrowing existing
concepts from distributed artificial intelligence. The guidelines for the system are: high degree of adaptability,
module reusability, mobility, self generation and environment plasticity. These guidelines sound simple but
actually they hide what we have deep in our mind: "To create a system that can really work by itself, generate a
new system for a new environment alone, develop new components for new circumstances, and learn as much as
possible by itself asking us only when there is no other way". From these guidelines, we see that there are seven
commandments that drove us towards what we could call" measurable "goals. These commands are given by the
following:
Agents must be truly autonomous; this means that any agent must be able to control itself and its own
actions. Also, they should be able to work independently from what happened to other agents and have enough
INTERNATIONAL CONFERENCE ON SOFTCOMPUTING APPLIED IN COMPUTER AND ECONOMIC ENVIRONMENTS ICSC 2004
EPI Kunovice, Czech Republic, January 29-30, 2004
300
freedom to adapt themselves to any possible new environment dynamics. Without this item we will have just a
usual system where whatever new happens it demands full system administration attention to properly react to
new environmental conditions and setup new operating parameters.
Agents should be goal-driven and rule-based; a set of rules constitutes the knowledge base for any
administration system-here including human ones. And goals, which are rules by themselves, are the motivations
agents have to work. We define static goals as parameters setup during agent creation, and dynamic goals those
added to agent behavior derived from environmental changes or incoming messages.
The agent environment must behave as a global knowledge base; where each data item or information stored
by one agent is sharable throughout the whole system (by other agents). Agents' intercommunication and
knowledge exchange is essential to ensure that all needed information would be available for decision making.
Besides, in the global system- meaning worldwide over internet- information one agent could learn a new rule
from another community and share that within its own community, thus increasing the overall community's
knowledge. We intend to reduce human interaction when we have one automatic system learning new tricks
from another automatic system more knowledgeable than the first one in some field.
Agents should be capable of learning new skills; either from other agents, the community of agents, the
global community or, as last resort, human beings. These new rules should be dynamically being stored on
agent's local knowledge databases. The other possibility is inferring new rules based on environment data along
with current knowledge.
Agents should be self-generating; they should adapt themselves to new environments and generate subsets of
their own characteristics creating new elements for new problems. This allows an agent to become a selfgeneration system, reducing human interaction as much as possible.
Specifically for network management agents; they should interface with commercial solutions using standard
protocols; this means that agents should know, through rules in their databases, how to interact with well-known
protocols such as SNMP and CMIP (Communication Management Information Protocol) protocols, how to
retrieve information from log files, and also how to interact with other existing management systems.
Finally, agents should have a human-friendly interface; the friendly interface between humans and machines is a natural-language-like protocol. In the meantime, we should use a more formal specification for the human-agent communication language and provide Application Program Interfaces (APIs) for developing other interfaces, such as HTML, NNTP and TELNET.
5. ARCHITECTURE OF THE INTELLIGENT AUTONOMOUS AGENTS
In the work presented here, we define autonomous agents as any software that assists people and acts on their behalf. How they are internally implemented or externally appear is up to the environment into which they are inserted and to their developers. These discrepancies between implementations can be clearly observed in different works on this subject, such as [3, 4, 5]. Agents must be truly autonomous; this means that any agent must be able to control itself and its own actions. They should also be able to work independently of what happens to other agents and have enough freedom to adapt themselves to any possible new environment dynamics. Without this property we would have just a usual system in which anything new demands the system administrator's full attention to react properly to new environmental conditions and to set up new operating parameters.
Therefore the main idea is that an autonomous agent is a software application that works by allowing human users to delegate tasks to it. These applications should work independently but, on the other hand, should interact with their human masters in order to acquire new knowledge and learn new tricks. For network management applications, autonomous agents could help by automating repetitive tasks, remembering rules that users frequently forget, and summarizing complex data in understandable, human-friendly reports or alerts.
To illustrate the framework for a generic agent: the objective of the generic agent concept is to define the basic structures, mechanisms and abilities that guarantee a minimum standard behavior, internal and external, for any agent that is created. In other words, a generic agent is a template that the agency system itself uses to create new agents. On top of these templates new functionality can be added to make the newly created agent best fit its destination environment [6]. The generic agent concept defines the following three modules, as shown in Fig. 2.
INTERNATIONAL CONFERENCE ON SOFTCOMPUTING APPLIED IN COMPUTER AND ECONOMIC ENVIRONMENTS ICSC 2004
EPI Kunovice, Czech Republic, January 29-30, 2004
301
Fig.2 Generic Agent Structure
Fig.3 Detail for the INFERENCE Module
*Inference module, which implements the rule deduction engine and also stores the knowledge database, as shown in Fig. 3. The symbols denoting the components of the inference module are the following:
RR: rule resolution component
CV: life cycle component
CC: cooperation component
KB: knowledge base
*Interaction module, which performs the interface between the agent and the environment.
*Communication module, which implements the message exchange procedures between agents. Communications are implemented by means of the Knowledge Query and Manipulation Language (KQML) [7], which was chosen due to its strong commitment to agent applications.
For extensibility purposes, the generic agent's architecture is being adapted to be compatible with the Common Object Request Broker Architecture (CORBA). We want agents to interface with CORBA services, either making use of distributed objects or serving as CORBA repositories sharing internal functionalities.
6. IMPLEMENTATION OF THE AUTONOMOUS AGENTS
The kernel of every agent in the system is exactly the one defined in the generic agent structure described in the section above. Thus we have only one single source code for every agent. The distinction between agents is made through the rules loaded into their knowledge bases and the native rules loaded into their code. Native rules follow the Java definition and are used strictly when a task is so specific that a knowledge rule could not be implemented instead.
The rules language is called Mlog. Mlog is a subset of the Prolog language, optimized for size and resource consumption. It is possible to submit knowledge to an agent in the IF-THEN-ELSE rule format, and the agent will translate it to Mlog format, helped by interface agents that have IF-THEN-ELSE to Mlog translation knowledge. Many Prolog programs will run on an Mlog-based agent, and there should be Prolog converters to allow any Prolog application to be compatible. Programming in Mlog consists of:
(a) Declaring some facts about objects and their relationships (for network management, we can see objects as devices).
(b) Defining some rules about objects and their relationships.
(c) Asking questions about objects and their relationships.
One major difference between Prolog and Mlog is that Mlog handles exceptions such as unknown clauses or invalid values instead of simply throwing out an error message. Rules are not analyzed when added to the knowledge base (KB) but during execution; if an unknown clause is reached inside the rule resolution, an unknown-clause exception is thrown and the cooperation component is activated. Thus the agent will try to learn that clause from the community before deciding to interrupt that execution sequence. The same applies to invalid values and other exceptions. Once Mlog is understood, it becomes straightforward to see how the inference module is implemented (see Fig. 3). The knowledge base (KB) is implemented by a set of Mlog sentences, as in Prolog. The rules resolution component (RR) is the Mlog engine; it is based on a complete but reduced Prolog engine that processes the KB rules against the expression being executed. The cooperation component (CC) is a new feature, unusual in Prolog systems. It implements learning capabilities and is used either when new knowledge is required to accomplish a rule resolution or when new knowledge is offered to the agent by the community. In other words, it handles how to add knowledge to or remove it from the internal knowledge bases, and how to ask the community for new knowledge and later add it. A flowchart of the CC/RR co-existence is presented in Fig. 3.
[Flowchart: Execute rule(par1, par2, ...) → initialize variables → load information about the rule being executed from the knowledge base → if the rule is not known, learn it from the community → analyze parameters → if the rule is native, execute the native rule(par1, par2, ...); otherwise perform the Mlog rule resolution.]
Fig.3 Basic Rule Resolution Flowchart
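The flowchart's decision logic can be sketched in Python. The class and method names below (Agent, execute, a learn-from-community step) are our own illustration of the paper's scheme, not its actual implementation; plain callables stand in for Mlog clauses and Java native rules.

```python
class UnknownClauseError(Exception):
    """Raised when a rule is unknown and cannot be learned from the community."""

class Agent:
    def __init__(self, kb, native, community=()):
        self.kb = dict(kb)            # knowledge-base rules (stand-ins for Mlog clauses)
        self.native = dict(native)    # native rules (stand-ins for Java-coded tasks)
        self.community = list(community)

    def execute(self, rule, *params):
        # Load information about the rule being executed; if it is unknown,
        # try to learn it from the community before interrupting execution.
        if rule not in self.kb and rule not in self.native:
            learned = self._learn_from_community(rule)
            if learned is None:
                raise UnknownClauseError(rule)
            self.kb[rule] = learned
        # Native rules bypass the Mlog resolution and run directly.
        if rule in self.native:
            return self.native[rule](*params)
        return self.kb[rule](*params)

    def _learn_from_community(self, rule):
        for peer in self.community:
            if rule in peer.kb:
                return peer.kb[rule]
        return None

# A peer that already knows how to classify a status code; our agent learns it.
peer = Agent(kb={"status_ok": lambda code: code == 0}, native={})
agent = Agent(kb={}, native={}, community=[peer])
print(agent.execute("status_ok", 0))   # the rule is learned from the peer, then executed
```

Executing an unknown rule with an empty community raises `UnknownClauseError`, mirroring the flowchart's failure path.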
The life cycle component (CV) is an internal Mlog-implemented component that compiles the set of goals the agent has to pursue within a certain amount of time. After that, it submits these goals, one by one, for execution in the rules resolution component.
The interaction module is not a processing module per se but a set of rules and native methods declared inside an agent to interface with environmental variables. In the network management application, SNMP interface clauses are implemented in this module. The goal of creating an independent module for those rules and native rules is to make the system functionality easier to understand. From the compilation point of view, the independent modules are linked inside the code depending on the functions that will be exercised by the agent being built.
Finally, the communication module is much like the interaction module: a set of rules and native methods targeting communication procedures. Matters such as TCP/IP packet exchange, KQML protocol formatting, communication session control, and others related to information exchange are treated in this module.
7. CONCLUDING COMMENTS AND FURTHER DEVELOPMENT
As we have seen in this paper, there are quite a few predictable advantages to using intelligent autonomous agents for managing a computer network. First of all, there is the intelligent nature, which sounds promising: having an intelligent and adaptable system is usually better than having dedicated applications for specific solutions. It was also shown that applying distributed artificial intelligence offers significant advantages over a large monolithic system, among them the following:
System modularity: it is easier to build and maintain a collection of quasi-independent modules than one huge one.
Efficiency: not all knowledge is needed for all tasks. By modularizing it, we gain the ability to focus intelligent autonomous agent efforts in the ways that are most likely to pay off.
A communication structure that enables information to be passed back and forth among agents.
As future work to be done by others, we recommend the following points:
* To create a series of new agent modules and module plug-ins, in order to make agents available to an increasing number of new working environments.
* To create a global wizard structure, that is, a set of Internet-available agents to whom local agents will refer in case they need locally unknown new information. In these structures we will store as much information as possible for all the different fields where agents will be applied.
REFERENCES
[1] TANENBAUM, A. S.: Computer Networks, Fourth Edition, Prentice Hall PTR, Upper Saddle River, NJ 07458, 2003
[2] KALYANARAMAN, S.: Simple Network Management Protocol, Rensselaer Polytechnic Institute, 2000
[3] KAUTZ, H. A.: Bottom-up Design of Software Agents, Communications of the ACM, Vol. 37, No. 7, U.S.A., 1994
[4] MAES, P.: Agents that Reduce Work and Information Overload, Communications of the ACM, Vol. 37, No. 7, U.S.A., 1994
[5] FERBER, J.: Multi-Agent Systems: An Introduction to Distributed Artificial Intelligence, pp. 4-8, Addison-Wesley, France, 1999
[6] CICHOCKI, A.: Architectures and Electronic Implementation of Neural Network Models – Neural Networks for Optimization and Signal Processing, John Wiley & Sons, U.K., 1993
[7] KAHANI, M., BEADLE, P. H. V.: Decentralized Approaches for Network Management, Computer Communications Review, ACM SIGCOMM, Vol. 27, No. 3, pp. 36-47, U.S.A., 1997
HOLOGRAPHIC REDUCED REPRESENTATION IN
ARTIFICIAL INTELLIGENCE AND COGNITIVE SCIENCE
Vladimír Kvasnička, Jiří Pospíchal
Slovak Technical University, 812 37 Bratislava, Slovakia
1. INTRODUCTION
A modern view of the relation between brain and mind is based on the neuroscience paradigm [3], according to which the architecture of the brain is determined by the connections between neurons, their inhibitory or excitatory character, and the strength of the connections. The human brain displays great plasticity: synapses are perpetually formed (but also deleted) during the learning process. It can be stated that the ability of the brain not only to perform cognitive activities but also to serve as the memory and control center for our motor activities is fully encoded by its architecture. The metaphor of the human brain as a computer should therefore be formulated so that the computer is a parallel distributed computer (containing many billions of neurons, elementary processors interconnected into a complex neural network). A program in such a parallel computer is directly encoded in the architecture of the neural network, i.e. the human brain is a single-purpose parallel computer represented by a neural network, which cannot be reprogrammed without a change of its architecture.
It follows from the above general statement that the mind and the brain create one integral unit, which is characterized by a complementary dualism. The mind is in this approach understood as a program carried out by the brain, while this program is specified by the architecture of the distributed neural network representing the brain. The brain and the mind are two different aspects of the same object:
· When talking about the brain, we have in mind a „hardware" structure, biologically determined by neurons and their synaptic connections (formally represented by a neural network).
· When talking about the mind, we have in mind cognitive and other similar activities of the brain, which are carried out on a symbolic level, where the transformation of symbolic information is processed on the basis of (simple) rules.
The complementary dualism between brain and mind causes certain difficulties in the interpretation of the cognitive activities of the mind. A purely neural approach to this interpretation focuses on the search for neural correlates of neural and cognitive activities (connectionism). The application of the neural paradigm to the interpretation of symbolic cognitive activities has the „side effect" of „dissolving" these activities in their microscopic description: symbols quasi „disappear" in the detailed description of the activities of neurons, the strengths of synaptic connections, etc. On the other hand, the absolute acceptance of the symbolic paradigm in the interpretation of the cognitive activities of the mind (cognitivism), ignoring the fact that the mind is thoroughly embedded in the brain, leads to a conceptual sterility, to an effort to explain the cognitive activities of the human mind only in phenomenological terms derived from the concept of the symbol. It leads to symbolic constructs (methods, algorithms, etc.) for which there usually does not exist any experimental support in neuroscience. The goal of this paper is to highlight an alternative approach, which may overcome the gap between the connectionist and cognitivist approaches to the description and interpretation of the cognitive activities of the human brain [10-13]. We shall show that the application of a distributed representation allows connectionism and cognitivism to be integrated: mental representations (symbols) are specified by distributed patterns of neural activities, while over these distributed patterns we can introduce formal algebraic operations, which not only allow cognitive operations to be modeled mathematically, but also allow the processes of storage and retrieval of information from memory to be simulated.
Figure 1. A visualization of the transition from a neural network to a distributed representation. The state of the neural network at time t is given by the activities of single neurons, which are determined by the activities at the previous time t-1 and by the weight coefficients of single oriented connections. Using a certain abstraction, these activities can be ordered into one big one-dimensional array (vector) of real numbers (their size is indicated by the level of gray of the corresponding component – neuron). In the distributed representation the architecture of the neural network is ignored, i.e. two distributed representations must be understood as totally independent, without mutual relations; their incidental connections derived from the neural network are completely ignored. New unary and binary operations are introduced on the distributed representations, which make it possible to create new distributed representations from the original ones.
We shall turn our attention to a nontraditional style of performing calculations using distributed patterns. This approach is substantially different from classical numeric and symbolic computations, and it is a suitable model tool for understanding the global properties of neural networks. We shall show that such „neurocomputing" is based on extensive randomly created patterns (represented by multidimensional vectors with random entries), see fig. 1. This approach, whose basic principles were formulated already at the end of the sixties [2,4,5,9,14], was crowned by a series of works by Tony Plate [7-9] on „holographic reduced representation" (HRR). We shall show which types of computation can be implemented in this approach and whether they help us to understand the processes in the brain during cognitive activities. Our contribution to the development of HRR consists in its application to the modeling of cognitive processes of reasoning by application of the rules modus ponens and modus tollens. Kanerva [16-18] in the middle of the nineties proposed a certain alternative to HRR, which is based on randomly generated binary vectors.
2. A MATHEMATICAL FORMULATION OF HOLOGRAPHIC REPRESENTATION
The aim of this chapter is a presentation of the basic properties of the holographic representation, which was developed by T. Plate [7-9]. Its basic notion is a conceptual vector, which is represented by an n-dimensional vector

a ∈ R^n ⇒ a = (a_0, a_1, ..., a_{n-1})    (1)

whose components are random numbers with a normal distribution

a_i = N(0, 1/n)    (∀ i ∈ {0, 1, ..., n-1})    (2)

where N(0, 1/n) denotes a random number with mean 0 and variance 1/n.
Over conceptual vectors a binary operation „convolution" is defined, which assigns to a couple of vectors a third vector, ⊗ : R^n × R^n → R^n, or

c = a ⊗ b    (3)

The components of the resulting vector c = (c_0, c_1, ..., c_{n-1}) are determined by the formula

c_i = Σ_{j=0}^{n-1} a_j b_{[i-j]}    (i = 0, 1, ..., n-1)    (4)

where the index in the square brackets, [k], is defined using a modulo n operation as follows¹
¹ The standard arithmetic operation k modulo n is defined as the remainder after integer division of k by n. The definition of [k] used here differs from this standard definition for negative k: while the standard definition can produce a negative result, in our definition a negative result is transformed by adding n to it.
INTERNATIONAL CONFERENCE ON SOFTCOMPUTING APPLIED IN COMPUTER AND ECONOMIC ENVIRONMENTS ICSC 2004
EPI Kunovice, Czech Republic, January 29-30, 2004
306
k' = k mod n    (5a)

[k] = k'        (if k' ≥ 0)
[k] = n + k'    (if k' < 0)    (5b)
Since this is one of the basic notions of the holographic representation, we give a table of the values [k] for -4 ≤ k ≤ 4 and n = 4.

k   | -4  -3  -2  -1   0   1   2   3   4
k'  |  0  -3  -2  -1   0   1   2   3   0
[k] |  0   1   2   3   0   1   2   3   0
Figure 2. Convolution of two vectors a and b for n=3 (see (6)).
The convolution of two vectors a and b for n = 3 has the following form (see fig. 2)

c_0 = a_0 b_0 + a_1 b_2 + a_2 b_1
c_1 = a_0 b_1 + a_1 b_0 + a_2 b_2    (6)
c_2 = a_0 b_2 + a_1 b_1 + a_2 b_0
The convolution satisfies the following properties:
(1) commutativity, a ⊗ b = b ⊗ a
(2) associativity, (a ⊗ b) ⊗ c = a ⊗ (b ⊗ c)
(3) distributivity, a ⊗ (αb + βc) = α(a ⊗ b) + β(a ⊗ c)
(4) existence of a unit vector, 1 ⊗ a = a, where 1 = (1, 0, ..., 0)
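The definitions (1)-(5) and the properties above can be checked with a short pure-Python sketch (the dimension n = 64 and the seed are arbitrary illustrative choices; note that Python's `%` operator already adds n to a negative remainder, which is exactly the [k] convention of eq. (5)):

```python
import random

def conceptual_vector(n, rng):
    # eq. (2): components drawn from N(0, 1/n) -- mean 0, variance 1/n
    return [rng.gauss(0.0, n ** -0.5) for _ in range(n)]

def conv(a, b):
    # eq. (4): c_i = sum_j a_j * b_[i-j], with circular (mod n) indexing
    n = len(a)
    return [sum(a[j] * b[(i - j) % n] for j in range(n)) for i in range(n)]

def close(u, v, eps=1e-9):
    return max(abs(x - y) for x, y in zip(u, v)) < eps

rng = random.Random(2004)
n = 64
a = conceptual_vector(n, rng)
b = conceptual_vector(n, rng)
c = conceptual_vector(n, rng)

assert close(conv(a, b), conv(b, a))                    # property (1), commutativity
assert close(conv(conv(a, b), c), conv(a, conv(b, c)))  # property (2), associativity
unit = [1.0] + [0.0] * (n - 1)
assert close(conv(unit, a), a)                          # property (4), unit vector
```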
The convolution can also be expressed by a circulant matrix [1]
c_0 = a_0 b_0 + a_1 b_2 + a_2 b_1
c_1 = a_0 b_1 + a_1 b_0 + a_2 b_2
c_2 = a_0 b_2 + a_1 b_1 + a_2 b_0

⇒

( c_0 )   ( a_0  a_2  a_1 ) ( b_0 )
( c_1 ) = ( a_1  a_0  a_2 ) ( b_1 )    ⇒    a ⊗ b = circ(a) b    (7)
( c_2 )   ( a_2  a_1  a_0 ) ( b_2 )

where the 3×3 matrix is circ(a).
This specific example is generalized to an arbitrary dimension n as follows

c = a ⊗ b  ⇔  c = circ(a) b,    where

            ( a_0      a_{n-1}  ...  a_2  a_1     )
            ( a_1      a_0      ...  a_3  a_2     )
circ(a) =   ( ...      ...      ...  ...  ...     )    (8)
            ( a_{n-2}  a_{n-3}  ...  a_0  a_{n-1} )
            ( a_{n-1}  a_{n-2}  ...  a_1  a_0     )
where the general circulant matrix has elements

(circ(a))_{ij} = a_{[i-j]}    (9)
The circulant matrix has the property

circ(a ⊗ b) = circ(a) circ(b)    (10)

and since the convolution is a commutative operation, circulant matrices mutually commute,

circ(a) circ(b) = circ(b) circ(a)    (11)
Let X be the inverse matrix of a circulant matrix circ(a),

X circ(a) = circ(a) X = E    (12)

Its alternative form is

X = circ^{-1}(a)    (13)

Let a^{-1} be the inverse vector of the vector a, a^{-1} ⊗ a = 1 = (1, 0, ..., 0). Then, assuming that the circulant matrix is regular, det circ(a) ≠ 0, it follows that

circ^{-1}(a) = circ(a^{-1})    (14)
Let us define a unary operation, involution (see fig. 3),

(·)* : R^n → R^n    (15)

by the formula

b = a* = (a_{[0]}, a_{[-1]}, ..., a_{[-n+2]}, a_{[-n+1]})    (16)

(a_0, a_1, a_2, ..., a_{n-2}, a_{n-1})* = (a_0, a_{n-1}, a_{n-2}, ..., a_2, a_1)

Figure 3. Visualization of the unary operation of involution.
The operation of involution satisfies the equations

(a + b)* = a* + b*    (17a)
(a ⊗ b)* = a* ⊗ b*    (17b)
(a ⊗ b*) · c = a · (b ⊗ c)    (17c)
a** = a    (17d)
circ(a*) = circ^T(a)    (17e)
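Involution is a pure index permutation, so equalities such as (17a) and (17d) hold exactly, while (17c) holds up to floating-point error. A small pure-Python sketch under the same conventions as above:

```python
import random

def conv(a, b):
    # circular convolution, eq. (4)
    n = len(a)
    return [sum(a[j] * b[(i - j) % n] for j in range(n)) for i in range(n)]

def involution(a):
    # eq. (16): a* = (a_0, a_{n-1}, a_{n-2}, ..., a_1)
    return [a[0]] + a[:0:-1]

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

rng = random.Random(3)
n = 32
a = [rng.gauss(0.0, n ** -0.5) for _ in range(n)]
b = [rng.gauss(0.0, n ** -0.5) for _ in range(n)]
c = [rng.gauss(0.0, n ** -0.5) for _ in range(n)]

assert involution(involution(a)) == a                      # (17d): a** = a
star_of_sum = involution([x + y for x, y in zip(a, b)])
sum_of_stars = [x + y for x, y in zip(involution(a), involution(b))]
assert star_of_sum == sum_of_stars                         # (17a)
lhs = dot(conv(a, involution(b)), c)
rhs = dot(a, conv(b, c))
assert abs(lhs - rhs) < 1e-9                               # (17c)
```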
One of the basic aspects of the holographic representation is the possibility of reconstructing the original components that were used in the construction of the convolution of two vectors. This possibility is very important, since it allows us to decode the original information from complex conceptual vectors. The reconstruction of x from c ⊗ x is based on the formula

x̃ = c* ⊗ (c ⊗ x) ≈ x    (18)

according to which the convolution of c* with the vector c ⊗ x produces the vector x̃, which is similar to the original vector x, x̃ ≈ x. In the component formalism the vector x̃ has the form

x̃_i = Σ_{j=0}^{n-1} c_j (c ⊗ x)_{[i+j]}    (19)
where

(c ⊗ x)_{[k]} = Σ_{l=0}^{n-1} c_l x_{[[k]-l]}

Using this formula in (19), one gets after simple algebraic manipulations

x̃_i = x_i Σ_{j=0}^{n-1} c_j² + Σ_{j,l=0 (j≠l)}^{n-1} c_j c_l x_{[[i+j]-l]}
    = ( Σ_{j=0}^{n-1} c_j² ) ( x_i + h_i ),    h_i = (1 / Σ_{j=0}^{n-1} c_j²) Σ_{j,l=0 (j≠l)}^{n-1} c_j c_l x_{[[i+j]-l]}    (20a)
This result can be reformulated in the form

x̃ = (c · c) (x_0 + h_0, x_1 + h_1, ..., x_{n-1} + h_{n-1})^T = (c · c)(x + h)    (20b)

where h is interpreted as a random noise with a normal distribution with zero mean and a standard deviation much smaller than that of x.
The overlap of the resulting vector x̃ with the original vector x is determined from a scalar product by

-1 ≤ overlap(x, x̃) = (x · x̃) / (|x| |x̃|) ≤ 1    (21)

where the inequalities follow directly from the Schwarz inequality of linear algebra. The closer this value is to its maximum value, the more similar² the vectors x̃ and x are.
Fig. 4 shows a histogram of overlaps for the product c ⊗ x built from couples of randomly generated, different conceptual vectors c and x of dimension n = 1000. It is evident from the figure that the most common overlap between x̃ = c* ⊗ (c ⊗ x) and x is around 0.7, from which it follows that the vectors x̃ and x are similar, x̃ ≈ x.
² In the case that the overlap value approaches -1, the vectors x̃ and x are also similar, even though they have opposite orientations (they are anticollinear).
Figure 4. The histogram of overlaps between the vectors x̃ and x (of dimension n = 1000) has its highest frequency around 0.7, from which it follows that the vectors x̃ and x are similar.
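The reconstruction experiment behind fig. 4 can be reproduced in a few lines of pure Python (the seed is an arbitrary choice; with n = 1000 the overlap typically lands near 0.7, matching the histogram):

```python
import random

def conv(a, b):
    # circular convolution, eq. (4)
    n = len(a)
    return [sum(a[j] * b[(i - j) % n] for j in range(n)) for i in range(n)]

def involution(a):
    # eq. (16)
    return [a[0]] + a[:0:-1]

def overlap(x, y):
    # eq. (21): cosine of the angle between x and y
    d = sum(u * v for u, v in zip(x, y))
    return d / ((sum(u * u for u in x) * sum(v * v for v in y)) ** 0.5)

rng = random.Random(42)
n = 1000
c = [rng.gauss(0.0, n ** -0.5) for _ in range(n)]
x = [rng.gauss(0.0, n ** -0.5) for _ in range(n)]

x_rec = conv(involution(c), conv(c, x))   # eq. (18): x~ = c* (x) (c (x) x)
print(overlap(x_rec, x))                  # typically around 0.7 for n = 1000
```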
Let us turn our attention to a second possibility of verification of formula (20b), with the application of the approach called the „superposition memory". Let us have a set containing p+q randomly generated conceptual vectors, X = {x_1, x_2, ..., x_p, x_{p+1}, ..., x_{p+q}}, with p < q. Using the first p vectors from X, we define a memory vector t as their sum

t = Σ_{i=1}^{p} x_i    (22)
Figure 5. Illustration of the superposition memory for the first 7 vectors of the set X, which contains 14 randomly generated conceptual vectors of dimension n = 1000. The threshold value ϑ can in this case be set to 0.2.
The vector t represents a superposition memory, which in a simple additive way contains vectors from the set X. The decision whether some vector x ∈ X is contained in t must be based on the value of the overlap (21)

overlap(x, t) = (x · t) / (|x| |t|)    (23)

If this value is greater than a predefined threshold value, overlap(x, t) ≥ ϑ, then the vector x is included in the superposition memory t; in the opposite case, overlap(x, t) < ϑ, the vector x is not included in t (see fig. 5).
In both of the previous examples (see figs. 3 and 4) the same method was used for determining which conceptual vectors „appear" in some other, more complex conceptual vector (which can be the result of complicated previous calculations – transformations). The method is called „clean-up" and is specified as follows: let us have a set of vectors X = {x_1, x_2, ..., x_n} and some vector t. We face the decision whether the memory vector (trace) t contains a superposition component which is similar (or not similar) to some vector from the set X. This problem can be solved by calculating the so-called overlap (23); formally,

x ≈ t = yes    (overlap(x, t) ≥ ϑ)
x ≈ t = no     (overlap(x, t) < ϑ)    (24)

where ϑ is a chosen threshold value for accepting the size of the overlap as a positive answer. The result of this clean-up process is a subset of vectors

X(t) = {x ∈ X; x ≈ t} ⊆ X    (25)
We can also put the question in a rather different form: is the memory vector t similar to any of the vectors from the set X? The answer to this more general question is decided from the maximum value of the overlap

overlap(t, X) = max_{x∈X} overlap(t, x)    (26)

Then we can rewrite (24) in the form

x ≈ X = yes    (overlap(x, X) ≥ ϑ)
x ≈ X = no     (overlap(x, X) < ϑ)    (27)
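A superposition memory with clean-up, eqs. (22)-(25), can be simulated directly; below we take p = q = 7 and ϑ = 0.2 as in fig. 5 (the seed is an arbitrary choice):

```python
import random

def overlap(x, y):
    # eq. (23): normalized scalar product
    d = sum(u * v for u, v in zip(x, y))
    return d / ((sum(u * u for u in x) * sum(v * v for v in y)) ** 0.5)

rng = random.Random(5)
n, p, q = 1000, 7, 7
X = [[rng.gauss(0.0, n ** -0.5) for _ in range(n)] for _ in range(p + q)]

# eq. (22): the memory vector is the sum of the first p vectors
t = [sum(col) for col in zip(*X[:p])]

theta = 0.2   # acceptance threshold, as in fig. 5
members = [i for i, x in enumerate(X) if overlap(x, t) >= theta]
print(members)   # clean-up recovers the indices of the stored vectors
```

For a stored vector the overlap is close to 1/√p ≈ 0.38, while for an unstored one it is zero-mean noise of order 1/√n, so the threshold 0.2 separates the two cases cleanly.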
3. ASSOCIATIVE MEMORY
The construction of the associative memory belongs to the main results of the holographic reduced representation, which can be further generalized by so-called chunking. Let us have a set of conceptual vectors X = {x_1, x_2, ..., x_n} and a training set A_train = {c_i → x_i; i = 1, 2, ..., m}, which contains m < n associated couples of conceptual vectors c_i → x_i, where c_i is the input to the associative memory (the cue) and x_i is the output from the memory. Let us create a memory vector t representing the associative memory created from the training set A_train

t = c_1 ⊗ x_1 + ... + c_m ⊗ x_m = Σ_{i=1}^{m} c_i ⊗ x_i    (28)

Let us suppose that we know in advance only the inputs c_i to the associative memory; we do not know the possible outputs from the set X_train = {x_1, x_2, ..., x_m}. The response of the associative memory to the input – the cue c_i – is determined by the clean-up process represented by formula (27). In the first step we calculate the vector x̃_i = c_i* ⊗ t; then, by the process based on the maximum value of the overlap, we find whether x̃_i ≈ x_i ∈ X

overlap(x̃_i, X) = max_{x∈X_train} overlap(x̃_i, x)    (29)
The associative memory will be illustrated by the following two examples.
1st example
This example uses only the training set A_train = {c_i → x_i; i = 1, 2, ..., m}, randomly generated for m = 8, while the dimension of the conceptual vectors is n = 1000. For each associated couple c_i → x_i we calculate t_i = c_i ⊗ x_i. The values of overlap(c_i* ⊗ t_i, x_j) (rows i, columns j) are presented in the table

 i\j      1        2        3        4        5        6        7        8
  1   0.71703 -0.01820  0.01452  0.02776 -0.01488 -0.01922 -0.02442  0.01358
  2  -0.03998  0.73804  0.01510  0.01430  0.00276  0.02346 -0.00545 -0.01626
  3  -0.02757 -0.01736  0.64667  0.00474 -0.11580 -0.00812  0.01476  0.00379
  4   0.00785  0.00374 -0.01899  0.68728 -0.15340  0.00005 -0.00561  0.00136
  5  -0.00466  0.00426 -0.01831 -0.00827  0.70767  0.04175 -0.03384 -0.00668
  6  -0.01467  0.02522 -0.01403 -0.01316 -0.03000  0.71444  0.00078 -0.00526
  7   0.02966  0.00892 -0.00301 -0.00358  0.01285  0.00971  0.70790  0.01816
  8  -0.00344 -0.01080  0.00843 -0.01871  0.00324 -0.02629  0.00851  0.58957
It is evident from the table that the overlaps are sufficiently great only for the diagonal values, while the non-diagonal overlaps are smaller by an order of magnitude. We can therefore unambiguously decide from the overlap whether c_i* ⊗ t_i ≈ x_i is associated with the cue c_i.
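The first example is easy to replay at a smaller scale; the sketch below uses m = 6 couples of dimension n = 512 (smaller than the paper's n = 1000 purely to keep the pure-Python convolutions fast) and checks via eqs. (28)-(29) that every cue retrieves its own output:

```python
import random

def conv(a, b):
    # circular convolution, eq. (4)
    n = len(a)
    return [sum(a[j] * b[(i - j) % n] for j in range(n)) for i in range(n)]

def involution(a):
    # eq. (16)
    return [a[0]] + a[:0:-1]

def overlap(x, y):
    # eq. (21)
    d = sum(u * v for u, v in zip(x, y))
    return d / ((sum(u * u for u in x) * sum(v * v for v in y)) ** 0.5)

rng = random.Random(9)
n, m = 512, 6

def vec():
    return [rng.gauss(0.0, n ** -0.5) for _ in range(n)]

cues = [vec() for _ in range(m)]
outs = [vec() for _ in range(m)]

# eq. (28): t = sum_i c_i (x) x_i
t = [0.0] * n
for c, x in zip(cues, outs):
    cx = conv(c, x)
    t = [ti + pi for ti, pi in zip(t, cx)]

# eq. (29): decode each cue and pick the output with the maximal overlap
decoded = []
for c in cues:
    x_rec = conv(involution(c), t)
    decoded.append(max(range(m), key=lambda j: overlap(x_rec, outs[j])))
print(decoded)   # each cue retrieves its own output index
```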
2nd example
In this illustrative example we use the training set A_train = {c_i → x_i}, generated for m = 10 associated couples – vectors of dimension n = 1000. The memory is represented by the memory vector t = c_1 ⊗ x_1 + ... + c_m ⊗ x_m. The following table shows 20 „clean-up" experiments, in which the associative input was, with 50% probability, either a vector c_i from the training set or a randomly generated conceptual vector. The table contains the maximal values of the overlaps (29), by which we can unambiguously determine whether the used input has an associated counterpart in the training set.
 #  | max. overlap | input index  | index of output with max. overlap
 1  | 0.311        | 6            | 6
 2  | 0.047        | rand. gener. | nonexistent
 3  | 0.383        | 5            | 5
 4  | 0.373        | 10           | 10
 5  | 0.316        | 3            | 3
 6  | 0.397        | 4            | 4
 7  | 0.074        | rand. gener. | nonexistent
 8  | 0.065        | rand. gener. | nonexistent
 9  | 0.069        | rand. gener. | nonexistent
 10 | 0.039        | rand. gener. | nonexistent
 11 | 0.344        | 7            | 7
 12 | 0.402        | 8            | 8
 13 | 0.032        | rand. gener. | nonexistent
 14 | 0.073        | rand. gener. | nonexistent
 15 | 0.017        | rand. gener. | nonexistent
 16 | 0.004        | rand. gener. | nonexistent
 17 | 0.033        | rand. gener. | nonexistent
 18 | 0.056        | rand. gener. | nonexistent
 19 | 0.373        | 10           | 10
 20 | 0.037        | rand. gener. | nonexistent
It follows from the table that the associative memory with the clean-up process identifies inputs unambiguously: the values of the maximum overlap for the conceptual vectors specify well the existence (or nonexistence) of the corresponding associative outputs.
Figure 6. Illustration of the analysis of a combination of a superposition memory and an associative memory. In the first step we carried out the clean-up process, by which we found that the memory contains only two items – the atomic vectors a and b – which in the next step serve as the input for further analysis of the associative memory, where we found the vectors c and d as outputs. During the clean-up process we moreover verified, with a negative result, whether the superposition memory contains the further vectors c, d, ..., i, j. With great probability we can therefore decide that the memory vector t is the combination of a superposition and an associative memory, t = a + b + a ⊗ c + b ⊗ d. The dimension of the used vectors is n = 1000.
Combination of superposition memory and associative memory
We shall show that a combination of a superposition and an associative memory can also be reliably analyzed, which will prove an advantageous feature of the holographic representation in our further applications. Let us presume that we have 10 conceptual vectors a, b, c, d, ..., i, j, and from the first four we construct a combination of a superposition and an associative memory as follows

t = a + b + a ⊗ c + b ⊗ d    (30)

By the clean-up procedure we can find out that the vector t contains as its parts the vectors a and b, which we shall in the following step remove from the vector t

t' = t - a - b = a ⊗ c + b ⊗ d    (31)

From the remaining superposition part we can find out, by the same analysis as for the associative memory, that it contains the two couples a → c and b → d, see fig. 6.
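A direct simulation of eqs. (30)-(31) (the atom names, the threshold ϑ = 0.2, and the reduced dimension n = 512 are illustrative choices to keep the pure-Python convolutions fast):

```python
import random

def conv(a, b):
    # circular convolution, eq. (4)
    n = len(a)
    return [sum(a[j] * b[(i - j) % n] for j in range(n)) for i in range(n)]

def involution(a):
    return [a[0]] + a[:0:-1]

def overlap(x, y):
    d = sum(u * v for u, v in zip(x, y))
    return d / ((sum(u * u for u in x) * sum(v * v for v in y)) ** 0.5)

def add(u, v):
    return [x + y for x, y in zip(u, v)]

def sub(u, v):
    return [x - y for x, y in zip(u, v)]

rng = random.Random(11)
n = 512
names = "abcdefghij"
V = {s: [rng.gauss(0.0, n ** -0.5) for _ in range(n)] for s in names}

# eq. (30): t = a + b + a (x) c + b (x) d
t = add(add(V["a"], V["b"]), add(conv(V["a"], V["c"]), conv(V["b"], V["d"])))

theta = 0.2
atoms = [s for s in names if overlap(V[s], t) >= theta]   # clean-up finds the atoms

# eq. (31): strip the atomic part, then decode the associative part
t2 = t
for s in atoms:
    t2 = sub(t2, V[s])
pairs = {}
for s in atoms:
    x_rec = conv(involution(V[s]), t2)
    pairs[s] = max(names, key=lambda r: overlap(x_rec, V[r]))
print(atoms, pairs)   # the atoms a, b and their associated partners c, d
```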
4. SEQUENCE OF SYMBOLS
The construction of the associative memory does not allow the storing of structured data; the aim of this chapter is to show that a holographic distributed representation is able to process a linear sequence of symbols, represented by a sequence of conceptual vectors.
To concretize our thoughts, let us study a sequence of 6 conceptual vectors of dimension n = 1000

sequence = {a → b → c → d → e → f}    (32)

For these vectors we construct a memory vector

t_0 = a + a⊗b + a⊗b⊗c + a⊗b⊗c⊗d + a⊗b⊗c⊗d⊗e + a⊗b⊗c⊗d⊗e⊗f    (33)

We know that this vector contains the sequence of vectors coded by (33), but we do not know which vectors these are or in what order. We shall show that by the clean-up procedure we can reconstruct the original sequence (32) from the vector t_0 step by step, using the following procedure (see fig. 7):
1. step: a = clean _ up ( t0 ) , t1 := t0 - a ,
t%1 := a* Ä t1 ,
2. step: b = clean _ up ( t%1 ) , t2 := t1 - a Ä b ,
*
t%2 := ( a Ä b ) Ä t2 ,
3. step: c = clean _ up ( t%2 ) , t3 := t2 - a Ä b Ä c ,
*
t%3 := ( y1 Ä y2 Ä y3 ) Ä t3 ,
4. step: d = clean _ up ( t%3 ) , t4 := t3 - a Ä b Ä c Ä d ,
*
t%4 := ( a Ä b Ä c Ä d ) Ä t4 ,
5. step: e = clean _ up ( t%4 ) , t5 := t4 - a Ä b Ä c Ä d Ä e ,
*
t%5 := ( a Ä b Ä c Ä d Ä e ) Ä t5 ,
6. step: f = clean _ up ( t%5 ) .
Figure 7. Representation of the reconstruction of the sequence of symbols abcdef (the single symbols are represented by randomly generated conceptual vectors). The clean-up procedure is successively applied to the sequence of vectors t, by which we obtain the corresponding sequence of symbol vectors.
Figure 8. The overlap for the single vectors of the sequence (32), which were obtained by reconstruction from the vector t0. It is apparent from the figure that degradation of the reconstruction appears relatively quickly; already the sixth vector f is reconstructed with an overlap smaller than 0.2.
The function clean_up(·) carries out the clean-up process for the given vector t with respect to the set of vectors X = {a, b, ..., f, g, h, ...}. The single steps of the reconstruction of the sequence of conceptual vectors (symbols) are shown in fig. 8, from which it follows that the reconstruction of a sequence of symbols degrades rather quickly; already for the sixth vector the overlap is smaller than 0.2.
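The six-step reconstruction can be sketched as follows (again our own numpy sketch; the codebook X and the helper names are assumptions). As figure 8 shows, later symbols are recovered with ever smaller overlap, so for longer sequences the decoding eventually fails.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 1000
def vec(): return rng.normal(0.0, 1.0 / np.sqrt(n), n)
def conv(u, v):  # circular convolution via FFT
    return np.real(np.fft.ifft(np.fft.fft(u) * np.fft.fft(v)))
def inv(u):  # involution u*
    return np.concatenate(([u[0]], u[:0:-1]))
def overlap(u, v): return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))
def clean_up(t, X): return max(X, key=lambda s: overlap(t, X[s]))

X = {s: vec() for s in "abcdefgh"}

# memory vector (33): a + a(x)b + a(x)b(x)c + ...
t0, prefix = np.zeros(n), None
for s in "abcdef":
    prefix = X[s] if prefix is None else conv(prefix, X[s])
    t0 = t0 + prefix

# steps 1-6: clean up, subtract the decoded prefix, unbind with its involution
decoded, trace, prefix = [], t0, None
for _ in range(6):
    probe = trace if prefix is None else conv(inv(prefix), trace)
    s = clean_up(probe, X)
    decoded.append(s)
    prefix = X[s] if prefix is None else conv(prefix, X[s])
    trace = trace - prefix
print("".join(decoded))
```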
The sequence of symbols can also be coded by an associative memory, where the entry vector ci specifies the ith position of the given symbol. The above-mentioned illustrative example is represented by a memory vector

t = c1⊗a + c2⊗b + c3⊗c + c4⊗d + c5⊗e + c6⊗f    (34)

The recognition of this sequence consists in the search for the vector associated with the input vector ci; by application of the clean-up process a "training set" can be constructed

Atrain = {(c1, a), (c2, b), (c3, c), (c4, d), (c5, e), (c6, f)}    (35)

which unambiguously specifies the sequence of vectors. The advantage of such a technique lies in the accuracy of recognition, which does not degrade as fast as with the original procedure based on the memory vector (33).
The associative approach to the implementation of the memory for a sequence of symbols (represented by atomic vectors a, b, ...) can easily be modified into the form of a so-called "stack memory". The associative entries ci are determined as follows

ci = p^i    (i = 1, 2, ...)    (36)

where p^i is the ith (convolutive) power of a randomly generated conceptual vector p, p^i = p^(i−1) ⊗ p. Then the memory vector for a sequence of symbols has the form

t = p⊗x1 + p^2⊗x2 + p^3⊗x3 + ... + p^n⊗xn    (37)

where the xi are single items from the memory {a, b, ...}. An associative memory interpreted in this way is called a "stack memory"; with the help of the power entries p^i we can easily change its contents, see fig. 9. Over this memory we can define three different operations by which its contents can be changed:
Figure 9. Three possible operators for the stack memory represented by a vector.
(1) push(t, x) = p⊗x + p⊗t, places the new item x on the top of the stack.
(2) top(t) = clean_up(p* ⊗ t), recognizes the top item of the stack.
(3) pop(t) = p^(−1) ⊗ t − top(t), removes the top item from the stack.

The most problematic is the third operation, which removes the top item from the stack. A correct implementation needs an application of the exact inverse vector p^(−1); the approximation of this inverse vector by the involution, p^(−1) ≈ p*, leads to a fast degradation of the stack memory.
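A minimal sketch of the three stack operations, assuming the same numpy helpers as before; pop uses the involution p* in place of the exact inverse p^(−1), which is precisely the approximation that makes repeated popping degrade.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 1000
def vec(): return rng.normal(0.0, 1.0 / np.sqrt(n), n)
def conv(u, v): return np.real(np.fft.ifft(np.fft.fft(u) * np.fft.fft(v)))
def inv(u): return np.concatenate(([u[0]], u[:0:-1]))
def overlap(u, v): return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))
def clean_up(t, X): return max(X, key=lambda s: overlap(t, X[s]))

X = {s: vec() for s in "abcde"}
p = vec()  # position vector; p* only approximates p^(-1)

def push(t, x): return conv(p, x) + conv(p, t)        # operation (1)
def top(t): return clean_up(conv(inv(p), t), X)       # operation (2)
def pop(t): return conv(inv(p), t) - X[top(t)]        # operation (3), approximate

t = np.zeros(n)
for s in "abc":      # push a, then b, then c: c ends up on top
    t = push(t, X[s])
top1 = top(t)
t = pop(t)
top2 = top(t)        # already noisier after the approximate pop
print(top1, top2)
```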
5. MEMORY CHUNKS
Memory chunks help to overcome the problems with the degradation of the memory for a sequence of symbols (see chap. 4). Let us have a set of conceptual vectors S = {a, b, ..., k, l, ...}; this set shall be divided into disjoint subsets, the chunks

S = S1 ∪ S2 ∪ S3 ∪ S4 ∪ ...    (Si ∩ Sj = ∅ for i ≠ j)    (38)

Let us study a set S = {a, b, c, d, e, f, g, h}; its decomposition into chunks looks as follows (see fig. 10)

S1 = {a, b, c}, S2 = {d, e}, S3 = {f}, and S4 = {g, h}    (39)
Figure 10. Illustration of the decomposition of the set S into disjoint chunks, see (38).
The chunks are represented by a vector, which represents the sequence of chunks {s1 → s2 → s3 → s4}

t = s1 + s1⊗s2 + s1⊗s2⊗s3 + s1⊗s2⊗s3⊗s4    (40b)

while the single chunks are defined by the corresponding sequences of vectors (see fig. 11)

s1 = a + a⊗b + a⊗b⊗c    (40c)
s2 = d + d⊗e    (40d)
s3 = f    (40e)
s4 = g + g⊗h    (40f)

The processing of the memory chunks can be divided into two steps:
1. step – by a clean-up process we identify the chunks contained in t (we presume that the clean-up process uses the set X = {x1, x2, ..., xn} from the end of the 2nd chapter, where this process was specified, enlarged by the chunks s1, s2, s3, s4, i.e. in our illustrative example X = {a, b, c, d, e, f, g, h, s1, s2, s3, s4}).
2. step – the identified chunks are further processed by the clean-up technique.
Figure 11. The chunking of 8 conceptual vectors into 4 chunks. In the 1st stage the clean-up process identifies the chunks, which are then further analyzed in the 2nd stage down to the vectors describing elementary concepts.
Figure 12. The representation of the two-step clean-up process, where in the first step the chunks are identified, while in the second step the vectors corresponding to atomic concepts are identified within the already identified chunks.
The result of the two-step clean-up process is shown in fig. 12. It is evident from this figure that, in the case of a long sequence of conceptual vectors, the fast degradation during the clean-up process can be partially overcome by grouping the concepts into chunks, which are separately coded at the highest level.
Figure 13. Illustration of chunks of higher order, where some of the used chunks are composed of simpler chunks.
Figure 14. Illustration of the 3-step clean-up of the chunked trace t specified by formula (41h). In the first step the trace t is analyzed and found to contain the two chunks s12 and s345. In the second step the two chunks from the previous step are analyzed and found to contain the chunks s1, s2, ..., s5. In the last, third step the 5 chunks identified in the previous step are successively analyzed; they contain the vectors a, b, ..., k.
The chunking method can be generalized so that chunks of higher order are created, i.e. chunks composed of chunks, see fig. 13, where the single chunks are defined as follows (the used concept vectors a, b, ..., k have the dimension n = 1000)

s1 = a + a⊗b + a⊗b⊗c    (41a)
s2 = d + d⊗e    (41b)
s3 = f    (41c)
s4 = g + g⊗h    (41d)
s5 = i + i⊗j + i⊗j⊗k    (41e)
s12 = s1 + s1⊗s2    (41f)
s345 = s3 + s3⊗s4 + s3⊗s4⊗s5    (41g)
t = s12345 = s12 + s12⊗s345    (41h)
The clean-up process for the chunked memory trace t specified by (41h) is shown in fig. 13. In the 1st step we analyze the trace t; its analysis tells us that the trace contains the two chunks s12 and s345. In the second step we analyze the chunks from the previous step and recognize their contents as the chunks s1, s2, ..., s5. In the last, third step we analyze the chunks from the previous step, which already contain the atomic conceptual vectors a, b, ..., k. The overlaps of the resulting vectors in the clean-up process are shown in fig. 14.
The presented illustrative examples show that the memory-chunk approach represents an effective way to overcome the fast degradation of the original version of the successive analysis of the vector (33). By combining several conceptual vectors into a chunk we gain a simple opportunity to extend our ability to correctly analyze larger sets of conceptual vectors. The chunking process can have several hierarchic levels, which removes the limits on our ability to store and recall conceptual vectors.
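The two-stage process can be sketched by combining the sequence code of chapter 4 with a chunk-level codebook (our own numpy sketch; seq_code and seq_decode are assumed helper names, not the paper's).

```python
import numpy as np

rng = np.random.default_rng(4)
n = 1000
def vec(): return rng.normal(0.0, 1.0 / np.sqrt(n), n)
def conv(u, v): return np.real(np.fft.ifft(np.fft.fft(u) * np.fft.fft(v)))
def inv(u): return np.concatenate(([u[0]], u[:0:-1]))
def overlap(u, v): return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

def seq_code(vs):  # trajectory code (33): v1 + v1*v2 + v1*v2*v3 + ...
    t, prefix = np.zeros(n), None
    for v in vs:
        prefix = v if prefix is None else conv(prefix, v)
        t = t + prefix
    return t

def seq_decode(t, X, k):  # the successive clean-up of chapter 4
    out, trace, prefix = [], t, None
    for _ in range(k):
        probe = trace if prefix is None else conv(inv(prefix), trace)
        s = max(X, key=lambda q: overlap(probe, X[q]))
        out.append(s)
        prefix = X[s] if prefix is None else conv(prefix, X[s])
        trace = trace - prefix
    return out

atoms = {s: vec() for s in "abcdefgh"}
chunks = {"s1": seq_code([atoms[s] for s in "abc"]),   # (40c)
          "s2": seq_code([atoms[s] for s in "de"]),    # (40d)
          "s3": seq_code([atoms[s] for s in "f"]),     # (40e)
          "s4": seq_code([atoms[s] for s in "gh"])}    # (40f)
t = seq_code([chunks[s] for s in ("s1", "s2", "s3", "s4")])  # (40b)

order = seq_decode(t, chunks, 4)                 # 1st stage: identify chunks
inner = seq_decode(chunks[order[0]], atoms, 3)   # 2nd stage: open the 1st chunk
print(order, inner)
```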
6. CODING OF RELATIONS
Holographic reduced representation can also serve as a suitable means for encoding relations (predicates). Let us study a binary relation P(x, y); when Pascal notation is used, this relation is formally specified by the head

function P(x: type1; y: type2): type3    (42)

The single arguments of the relation are specified by the types type1 and type2, which specify the domains over which these variables are defined; similarly, the relation P itself is understood as a function whose domain of values is specified by the type type3. In many cases the domains of the variables and also the domain of the relation itself are equal to each other; their specifications can therefore be omitted, which substantially reduces the holographic representation of relations. The reduced form of relation (42) looks as follows

function P(x; y)    (43)

where we know in advance the types of the variables x, y, and also the type of the relation P itself. The holographic representation of the relation (42) can have the following form

t = P + variable1 + variable2 + P ⊗ (type3 + variable1⊗(x + type1) + variable2⊗(y + type2))    (44)

Its decoding is carried out step by step. In the first step we use the clean-up procedure to recognize the name (identifier) of the relation P and also the names (identifiers) of its variables x and y. In the second step we identify the type type3 of the relation P; in the last, third step we use the previous results to identify the variables x, y and also their types type1 and type2. In many cases the representation of the relation P(x, y) is satisfactory in the following simplified form (see (43))

t = P + variable1⊗x + variable2⊗y    (45)
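Decoding the simplified representation (45) can be sketched as follows (our own numpy sketch; the role vectors var1, var2 and the candidate sets are illustrative assumptions). The relation name is found by a direct clean-up, and each argument by unbinding with the involution of its role vector.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 1000
def vec(): return rng.normal(0.0, 1.0 / np.sqrt(n), n)
def conv(u, v): return np.real(np.fft.ifft(np.fft.fft(u) * np.fft.fft(v)))
def inv(u): return np.concatenate(([u[0]], u[:0:-1]))
def overlap(u, v): return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))
def clean_up(t, X): return max(X, key=lambda s: overlap(t, X[s]))

names = {s: vec() for s in ["P", "Q", "x", "y", "z", "var1", "var2"]}

# simplified representation (45): t = P + var1(x)x + var2(x)y
t = (names["P"]
     + conv(names["var1"], names["x"])
     + conv(names["var2"], names["y"]))

rel = clean_up(t, {s: names[s] for s in ("P", "Q")})
arg1 = clean_up(conv(inv(names["var1"]), t), {s: names[s] for s in ("x", "y", "z")})
arg2 = clean_up(conv(inv(names["var2"]), t), {s: names[s] for s in ("x", "y", "z")})
print(f"{rel}({arg1}, {arg2})")
```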
The chosen method of holographic representation of relations can easily be generalized to more complex (higher-order) relations, where the variables are predicates as well, e.g. P(x, Q(y, z)), where the "inner" predicate Q is characterized by

function Q(y: type3; z: type4): type5    (46)

In order to create the higher-order relation P(x, Q(y, z)), we must presume a type compatibility between the second variable of the relation P and the type of the relation Q, i.e. type2 = type5. In the simplified approach, where all the types are the same, it is not necessary to distinguish the types of the single variables and of the relations themselves. A simplified holographic representation of relation (46) has the following form

t′ = Q + variable1⊗y + variable2⊗z    (47)

By substituting the representation (47) for the variable y in the representation (45) we get the following resulting representation of the higher-order relation P(x, Q(y, z))
t = P + variable1⊗x + variable2⊗(Q + variable3⊗y + variable4⊗z)
  = P + variable1⊗x + variable2⊗Q + variable2⊗variable3⊗y + variable2⊗variable4⊗z    (48)
Figure 15. A set of 48 similar figures, each containing two objects placed either next to each other or above each other; the objects are either small or big. Every column contains a couple of similar objects, which differ only in their placement or size.
1st illustrative example – similarity between geometric figures

In figure 15 there are presented 48 = 6×8 geometric patterns, each of which contains two objects in either a horizontal or a vertical arrangement; moreover, the objects can be of two sizes, small and big. Let us mark the holographic representations of the corresponding atomic concepts as follows:

Objects: tr (triangle), sq (square), ci (circle), st (star)
Unary relations: sm (small), lg (large)
Binary relations: hor (horizontal), ver (vertical)
Variables: ver_var1 (1st variable of the binary relation ver), ver_var2 (2nd variable of the binary relation ver), hor_var1 (1st variable of the binary relation hor), hor_var2 (2nd variable of the binary relation hor)

The single figures from fig. 15 are characterized by the relations given in the following table.
row    specification
1      ver(lg(x), lg(y))
2      hor(lg(x), lg(y))
3      hor(sm(x), lg(y)) and hor(lg(x), sm(y))
4      ver(sm(x), lg(y)) and ver(lg(x), sm(y))
5      ver(sm(x), sm(y))
6      hor(sm(x), sm(y))
The holographic representations of the single cases from this table have the following form (compare with equation (45)).
INTERNATIONAL CONFERENCE ON SOFTCOMPUTING APPLIED IN COMPUTER AND ECONOMIC ENVIRONMENTS ICSC 2004
EPI Kunovice, Czech Republic, January 29-30, 2004
319
t1,x,y = ver + ver_var1 ⊗ ⟨lg⊗x⟩ + ver_var2 ⊗ ⟨lg⊗y⟩
t2,x,y = hor + hor_var1 ⊗ ⟨lg⊗x⟩ + hor_var2 ⊗ ⟨lg⊗y⟩
t3,x,y = hor + hor_var1 ⊗ ⟨sm⊗x⟩ + hor_var2 ⊗ ⟨lg⊗y⟩   or   hor + hor_var1 ⊗ ⟨lg⊗x⟩ + hor_var2 ⊗ ⟨sm⊗y⟩
t4,x,y = ver + ver_var1 ⊗ ⟨sm⊗x⟩ + ver_var2 ⊗ ⟨lg⊗y⟩   or   ver + ver_var1 ⊗ ⟨lg⊗x⟩ + ver_var2 ⊗ ⟨sm⊗y⟩
t5,x,y = ver + ver_var1 ⊗ ⟨sm⊗x⟩ + ver_var2 ⊗ ⟨sm⊗y⟩
t6,x,y = hor + hor_var1 ⊗ ⟨sm⊗x⟩ + hor_var2 ⊗ ⟨sm⊗y⟩
    (49)
where x and y are holographic representations of the single objects (tr, sq, ci, st) and the bracket ⟨u⟩ indicates that the vector u is normalized. The similarity between single figures is determined by the overlap of their holographic representations

similarity(X, X′) = overlap(t, t′)    (50)
Figure 16. Illustrative presentation of similar figures for the two chosen figures 1 and 48 (see fig. 15). The single arrows are marked by the overlap between the figures, calculated by formula (50).
The obtained results are shown in fig. 16. The dominant feature controlling the similarity value is the horizontal or vertical arrangement of the objects. The overlap (i.e. also the similarity) between two figures which have different arrangements is usually smaller than 0.1.

In general, holographic reduced representation allows a fairly simple determination of the similarity of objects specified by a predicate structure (42) or by its generalization through further nested predicates (see (48)). This possibility opens new horizons for future developments in fundamental methods of searching for similar objects or analogies, which are considered very difficult problems for artificial intelligence, requiring special symbolic techniques [7].
2nd illustrative example – similarity between binary numbers
We shall study the similarity between binary numbers of length 3, which are represented by a sequence (a1a2a3) ∈ {0,1}^3. Such a number can be understood as an ordered triple of binary symbols, which are represented in the distributed representation as follows (see (33))

t(a1a2a3) = ta1 + ta1⊗ta2 + ta1⊗ta2⊗ta3    (51a)
ta = zero (for a = 0),  ta = one (for a = 1)    (51b)

where zero and one are distributed representations of the digits '0' and '1'. For example, the binary number (101) is represented as follows

t(101) = one + one⊗zero + one⊗zero⊗one    (52)
Figure 17. Graphic representation of the similarity between distributed representations of binary numbers; the size of the overlap between couples is proportional to the brightness of the corresponding square area (the areas on the skew diagonal are the brightest ones). The darkest areas lie in the two opposite corners, which correspond to the representations of maximally distant couples of numbers.
The similarity between single representations also reflects the similarity between the corresponding binary numbers; the similarity of the representations of two binary numbers is inversely proportional to their distance (e.g. to the absolute value of their difference). In the following table the similarities between the representations of binary numbers are given, calculated from their overlap (23).
        000    001    010    011    100    101    110    111
000    1.00   0.74   0.50   0.46   0.21   0.16   0.14   0.11
001    0.74   1.00   0.71   0.42   0.45   0.15   0.12   0.15
010    0.50   0.71   1.00   0.75   0.70   0.43   0.09   0.08
011    0.46   0.42   0.75   1.00   0.46   0.70   0.37   0.04
100    0.21   0.45   0.70   0.46   1.00   0.74   0.43   0.39
101    0.16   0.15   0.43   0.70   0.74   1.00   0.70   0.35
110    0.14   0.12   0.09   0.37   0.43   0.70   1.00   0.69
111    0.11   0.15   0.08   0.04   0.39   0.35   0.69   1.00
The maximum similarity is between couples of representations assigned to two neighboring integers; the minimum similarity of two representations occurs if the corresponding numbers have the maximum distance, which equals 7. The values from the table are graphically represented in fig. 17.
This simple illustrative example shows that in the framework of the holographic distributed representation one can use (at least potentially) associative representations of the type (28), where the associative cues correspond to numbers. It means that in this distributed approach there exists a possibility of an associative simulation of an arbitrary function, which substantially increases the potential of the method for universal use.
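The similarity table can be reproduced qualitatively with a short numpy sketch (our own code; since the conceptual vectors are random, the individual values vary from run to run around those in the table).

```python
import numpy as np

rng = np.random.default_rng(6)
n = 1000
def vec(): return rng.normal(0.0, 1.0 / np.sqrt(n), n)
def conv(u, v): return np.real(np.fft.ifft(np.fft.fft(u) * np.fft.fft(v)))
def overlap(u, v): return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

zero, one = vec(), vec()

def code(bits):  # trajectory code (51a) for a 3-bit string
    t, prefix = np.zeros(n), None
    for b in bits:
        v = one if b == "1" else zero
        prefix = v if prefix is None else conv(prefix, v)
        t = t + prefix
    return t

nums = [format(i, "03b") for i in range(8)]   # "000" .. "111"
T = {s: code(s) for s in nums}
sim = {(u, v): overlap(T[u], T[v]) for u in nums for v in nums}
print(round(sim[("000", "001")], 2), round(sim[("000", "111")], 2))
```

Neighboring numbers share the leading terms of their trajectory codes, so their overlap stays high, while maximally distant numbers share nothing and their overlap is near zero.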
7. REASONING BY MODUS PONENS AND MODUS TOLLENS
The simulation of reasoning processes (inference) belongs to the basic problems repeatedly solved in artificial intelligence and cognitive science [15]. Fodor's critique of connectionism [19] was based precisely on the brash conclusion that artificial neural networks are not able to simulate higher cognitive activities, which were purported to be an exclusive domain of the classical symbolic approach. This opinion of Fodor's proved to be incorrect; the further development of the theory of neural networks showed that connectionism is a universal computational tool which has no forbidden domains of inapplicability. Of course, it can transpire that in some domains its application is extremely cumbersome and exceedingly complicated, and that there exist other approaches which, in the given domain, provide a substantially simpler and more direct solution than the one provided by neural networks.
In this chapter we shall show a possibility of representing the two basic modes of deductive reasoning of propositional logic,

p ⇒ q              p ⇒ q
p          and     ¬q
-----              -----
q                  ¬p
    (53)

which are called modus ponens and modus tollens, respectively. These modes of reasoning are equivalent to the following tautologies of propositional logic

((p ⇒ q) ∧ p) ⇒ q    (54a)
((p ⇒ q) ∧ ¬q) ⇒ ¬p    (54b)
The implication p ⇒ q can be understood as a binary relation, which can be represented in the holographic distributed representation as follows (see formula (47))

t_{p⇒q} = op⊗impl + var1⊗p + var2⊗q    (55)

This contains a sum of three parts: the first part specifies the type of the relation (implication), while the second and third parts specify the first (antecedent) and the second (consequent) variable of the relation of implication, respectively. This conceptual vector representing the relation of implication can be transformed as follows

t̃_{p⇒q} = t_{p⇒q} ⊗ T    (56a)

where

T = var1*⊗p*⊗p*⊗q + var2*⊗q*⊗q*⊗p    (56b)

The transformed representation of the implication is represented by a sum of two associated couples

t̃_{p⇒q} ≈ p*⊗q + q*⊗p    (57)

which gives the holographic representation of the rules modus ponens and modus tollens

p ⊗ t̃_{p⇒q} ≈ q    (58a)
q ⊗ t̃_{p⇒q} ≈ p    (58b)

The first formula (58a) can be understood as a holographic representation of modus ponens (see (53) and (54a)), while the other formula is a holographic counterpart of modus tollens (see (53) and (54b)).
A similar result can also be obtained by an alternative approach, which is based on the disjunctive form of the implication

(p ⇒ q) ≡ (¬p ∨ q)    (59)

The distributed representation of the implication in this alternative form can be expressed by

t_{p∨q} = op⊗disj + var1⊗p + var2⊗q    (60)

By a transformation of this representation we can get (see (55))

t̃_{p∨q} = t_{p∨q} ⊗ T ≈ p⊗q* + q⊗p*    (61a)

where

T = var1*⊗q* + var2*⊗p*    (61b)

This transformation is much simpler than the one in the previous case (56b). The rules modus ponens and modus tollens are now realized by formulas similar to (58a-b). Moreover, we also get the following two "rules"

q* ⊗ t̃_{p∨q} ≈ p*    (62a)
p* ⊗ t̃_{p∨q} ≈ q*    (62b)

which remind us of the well-known fallacies

p ⇒ q              p ⇒ q
q          and     ¬p
-----              -----
p                  ¬q
    (63)

that are known as "affirming the consequent" and "denying the antecedent", respectively. This fault is caused by the fact that the transformed representations of the implication t̃_{p⇒q} and t̃_{p∨q} are not identical; the representation t̃_{p∨q} leads to the unexpected results (63), which represent erroneous modes of reasoning (that are, however, often used by people without knowledge of the principles of logic).
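The chain (55)-(58) can be checked numerically (our own numpy sketch; the variadic conv helper is an assumption). The recovered conclusions are noisy, with overlaps well below 1, but they are clearly separated from unrelated random vectors.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 1000
def vec(): return rng.normal(0.0, 1.0 / np.sqrt(n), n)
def conv(*vs):  # circular convolution of several vectors via FFT
    out = np.zeros(n); out[0] = 1.0  # delta vector, the identity of convolution
    for v in vs:
        out = np.real(np.fft.ifft(np.fft.fft(out) * np.fft.fft(v)))
    return out
def inv(u): return np.concatenate(([u[0]], u[:0:-1]))
def overlap(u, v): return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

op, impl, var1, var2, p, q = (vec() for _ in range(6))

t = conv(op, impl) + conv(var1, p) + conv(var2, q)      # eq. (55)
T = (conv(inv(var1), inv(p), inv(p), q)
     + conv(inv(var2), inv(q), inv(q), p))              # eq. (56b)
tt = conv(t, T)                                         # eq. (56a), approx (57)

mp = overlap(conv(p, tt), q)   # modus ponens (58a): p unbinds to q
mt = overlap(conv(q, tt), p)   # modus tollens counterpart (58b)
print(round(mp, 2), round(mt, 2))
```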
8. PREDICATE LOGIC
We shall further deal with a simple form of predicate logic based on unary predicates P(x), whose distributed representation has the form (see chapter 6)

t_{P(x)} = pred⊗P + pred_var⊗x    (64)

The connection of this predicate with the universal quantifier, (∀x)P(x), can be represented in the following way

t_{(∀x)} = uni_quant⊗uni + uni_quant_var⊗x    (65a)
t_{(∀x)P(x)} = t_{(∀x)} + t_{(∀x)} ⊗ (pred⊗P + pred_var⊗x)    (65b)

Both conceptual vectors t_{P(x)} and t_{(∀x)P(x)} can be recognized and extracted by a clean-up procedure.
This process is unnecessarily complicated for our purposes of further study of reasoning processes in the framework of predicate logic and their distributed representation; the application of the conceptual vector t_{(∀x)} for the representation of the symbol (∀x) basically only complicates the analysis of composed conceptual vectors containing t_{(∀x)} as a constituent. We shall therefore cease to use the symbol (∀x) explicitly; its meaning will be substituted by the usage of a "universal variable" x, i.e. a predicate P(x) containing the universal variable x is interpreted as (∀x)P(x), so we can, with a certain caution, use the "formula" (∀x)P(x) ≡ P(x).

In predicate logic there exists a rule of universal instantiation, which concretizes a predicate with a universal quantifier onto a predicate with a concrete variable a, (∀x)P(x) ⇒ P(a), which is a result of a simple tautology of propositional logic, ((p ∧ q) ∧ p) ⇒ q. With an application of the universal variable x we shall rewrite this concretization into a simpler form

P(x) ⇒ P(a)    (66)
We shall construct a distributed representation of this universal instantiation of a simple unary predicate by a transformation vector T, see equations (56-58). The distributed representation of (66) looks as follows

t_{P(a)} ≈ t_{P(x)} ⊗ T    (67)

where T = x* ⊗ a, so that

t_{P(a)} ≈ t_{P(x)} ⊗ x* ⊗ a    (68)

We can recapitulate our considerations by saying that the distributed representation of the universal instantiation is concretized by a transformation vector T, which helps to substitute the universal variable x by a "specified" variable a.
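Universal instantiation by the transformation T = x* ⊗ a can be verified directly (our own numpy sketch; the vector names follow (64) and (66)-(68)).

```python
import numpy as np

rng = np.random.default_rng(8)
n = 1000
def vec(): return rng.normal(0.0, 1.0 / np.sqrt(n), n)
def conv(u, v): return np.real(np.fft.ifft(np.fft.fft(u) * np.fft.fft(v)))
def inv(u): return np.concatenate(([u[0]], u[:0:-1]))
def overlap(u, v): return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

pred, P, pred_var, x, a = (vec() for _ in range(5))

t_Px = conv(pred, P) + conv(pred_var, x)   # t_P(x), eq. (64)
t_Pa = conv(pred, P) + conv(pred_var, a)   # t_P(a), the target

T = conv(inv(x), a)                        # transformation vector, eq. (68)
approx = conv(t_Px, T)                     # t_P(x) (x) x* (x) a

s_target = overlap(approx, t_Pa)
s_orig = overlap(approx, t_Px)
print(round(s_target, 2), round(s_orig, 2))
```

The transformed vector is noticeably closer to t_P(a) than to the original t_P(x); the pred⊗P part becomes noise under the transformation, which bounds the achievable overlap.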
We shall use this simplified representation of quantified predicates to study the so-called generalized modus ponens and generalized modus tollens
(∀x)(P(x) ⇒ Q(x))            (∀x)(P(x) ⇒ Q(x))
P(a)              and        ¬Q(a)
-----------------            -----------------
Q(a)                         ¬P(a)
    (69a)

or in a simplified form using the universal variable x

P(x) ⇒ Q(x)            P(x) ⇒ Q(x)
P(a)          and      ¬Q(a)
-----------            -----------
Q(a)                   ¬P(a)
    (69b)

These generalized schemes of deductive reasoning follow directly from their standard sentential form (53) and from the concretization (66). The distributed representation of the main (top) premise of these rules has the form

t_{P(x)⇒Q(x)} = op⊗impl + var1⊗(pred⊗P + pred_var⊗x) + var2⊗(pred⊗Q + pred_var⊗x)    (70)
The concretization of the implication P(x) ⇒ Q(x) onto P(a) ⇒ Q(a) can be formally expressed by an implication (see (66))

(P(x) ⇒ Q(x)) ⇒ (P(a) ⇒ Q(a))    (71)

where the right-hand side has the following distributed representation

t_{P(a)⇒Q(a)} = op⊗impl + var1⊗(P + a) + var2⊗(Q + a)    (72)

Similarly as in the introductory illustrative example (see (67)), this transfer expressed by the implication (71) can be written in the distributed representation by the following transformation

t̃_{P(a)⇒Q(a)} = t_{P(a)⇒Q(a)} ⊗ T ≈ (P + a)*⊗(Q + a) + (Q + a)*⊗(P + a)    (73)

where the new transformed distributed representation t̃_{P(a)⇒Q(a)} satisfies the formulas which represent the rules (69), modus ponens and modus tollens respectively

(P + a) ⊗ t̃_{P(a)⇒Q(a)} ≈ (Q + a)    (74a)
(Q + a) ⊗ t̃_{P(a)⇒Q(a)} ≈ (P + a)    (74b)
Illustrative example – modeling of reflexive reasoning

In this illustrative example we shall show that the holographic distributed representation provides formal tools which can be used to simulate the reasoning process based on the generalized modus ponens (69). This process was widely studied by Shastri and Ajjanagadde [15] with the connectionist system called SHRUTI, which was able to simulate reflexive reasoning based on predicate logic. Similar results can also be achieved by the formalism of holographic distributed representation.
[Figure 18 diagram: from the rules give(x,y,z) ⇒ own(y,z), buy(x,y) ⇒ own(x,y), own(x,y) ⇒ can_sell(x,y) and the facts give(John,Mary,book), own(Mary,book), buy(John,something), the derived conclusions are own(Mary,book), own(John,something), can_sell(Mary,book) and can_sell(John,something).]
Figure 18. Illustration of an application of the generalized rule modus ponens (69) for deduction or knowledge discovery (marked by gray shading and by incoming arrows) from the implications (1-3) and from the input facts (a-c), marked by outgoing arrows.
Let us have a formal system containing three general rules (see fig. 18):
1. give(x, y, z) ⇒ own(y, z),  type x: donor; type y: acceptor; type z: object,
2. buy(y, z) ⇒ own(y, z),  type y: buyer; type z: object,
3. own(y, z) ⇒ can_sell(y, z),  type y: owner; type z: object,
and three observations (facts)
· give(John, Mary, book),
· own(Mary, book),
· buy(John, something).
What are the deductive conclusions of this system? The results are shown in figure 18; we shall now deduce them with an application of the distributed representation based on conceptual vectors and operations over them.

Let us analyze the first generalized modus ponens from figure 18

give(x, y, z) ⇒ own(y, z)
give(John, Mary, book)
-----------------------
own(Mary, book)
    (75)

With an application of the approach described by (70-74) we can realize this scheme of reasoning by a representation of conceptual vectors; its single items (going top down) are represented as follows

t1(1) = op⊗impl + var1⊗t_give(x,y,z) + var2⊗t_own(y,z)    (76a)
t2(1) = t_give(John,Mary,book) = give + give_var1⊗John + give_var2⊗Mary + give_var3⊗book    (76b)
t3(1) = t_own(Mary,book) = own + own_var1⊗Mary + own_var2⊗book    (76c)

where the conceptual vectors t_give(x,y,z) and t_own(y,z) are constructed analogously to (45). In the first step we must carry out a concretization of the implication give(x, y, z) ⇒ own(y, z), so that the general variables x, y, z are substituted by the concrete variables John, Mary, book. This concretization is carried out by a transformation T, which is specified by formulas (67-68)

t̂1(1) ≈ t1(1) ⊗ T    (77)

where

t̂1(1) = op⊗impl + var1⊗t_give(John,Mary,book) + var2⊗t_own(Mary,book)    (78a)
T = t1(1)* ⊗ t̂1(1)    (78b)

The thus concretized representation of give(John, Mary, book) ⇒ own(Mary, book) is in the next step applicable for modus ponens, carried out by formulas (73-74)

t̃1(1) ≈ t̂1(1) ⊗ T′    (79a)

where T′ is a transformation vector analogous to (56b), and the resulting conceptual vector t̃1(1) already represents modus ponens, i.e. the following formula holds

t_give(John,Mary,book) ⊗ t̃1(1) ≈ t_own(Mary,book)    (80)

The other three generalized modus ponens inferences from figure 18 can be realized in a similar way.
Illustrative example – generalization by induction

Let us have a "training" set composed of a sequence of simple unary predicates, which can be interpreted as observations

Atrain = {m(ai); i = 1, 2, ..., p}    (81)

Our goal will be to generalize these particular predicates into the form with a universal quantifier (or, in our simpler formalism, with a universal variable)

(∀x)[m(x)] ≡ m(x)    (82)

It means that the single particular cases m(ai) from the training set (81) are generalized into a formula which is not directly deducible from them (see fig. 19)

m(a1) ∧ ... ∧ m(ap) ⇒ m(x)    (83)

Figure 19. Diagrammatic scheme of the inductive generalization, which consists of three stages. In the first stage we have isolated observations m(ai), which are not related. In the second stage the isolation of the single observations is replaced by their unification into a single set, which expresses the fact that the observations are not isolated and independent, but have something in common. In the final, third stage, the unified form of the observations is inductively generalized using a universal variable x, which represents the class of objects with the same property m.
Let each predicate m(a_i) from the training set A_train be holographically represented in the following way

t_m(a_i) = rel ⊗ m + m_var ⊗ a_i   (84)
These represented conceptual vectors can be mutually transformed by transformational vectors T_i,j

t_m(a_j) ≈ t_m(a_i) ⊗ T_i,j   (85a)

T_i,j = t_m(a_i)* ⊗ t_m(a_j) ≈ a_i* ⊗ a_j   (85b)

Let us define a new transformation vector T̃_i

T̃_i = Σ_{j=1..q} T_i,j ≈ a_i* ⊗ x   (86a)

x = a_1 + a_2 + ... + a_p   (86b)
where the conceptual vector x is assigned to the new "generalized" vector, which represents each argument a_i
from the training set A_train (in our further considerations about induction it is understood as a stand-alone
conceptual entity equal to the representations of the original objects a_1, a_2, ..., a_p)

t_m(a_i) ⊗ T̃_i ≈ t_m(x)   (87)

This formula can be interpreted as an inductive generalization, where the particular objects a_i from the
training set, in representations t_m(a_i), are substituted by a new object x that can be interpreted as a new universal
object.
9. CONCLUSIONS
Holographic reduced representation offers a new, unconventional solution to one of the basic problems of
artificial intelligence and cognitive science, which is to find a suitable distributed coding of structured
information (sequences of symbols, nested relational structures, etc.). The distributed representation used is
based on two operations: the unary operation of involution and the binary operation of convolution over a domain
of n-dimensional randomly generated conceptual vectors whose elements follow the normal distribution N(0, 1/n).
Application of this distributed representation allows us to model various types of associative memory, which are
represented by a conceptual vector, and also to decode a memory vector, i.e. to determine the conceptual (atomic)
vectors it is composed of. Such an analysis of the memory vector is carried out by a clean-up procedure that
determines, from the overlap of the vectors, which of the vectors is most similar to the memory vector. We
have also described the process of chunking of vectors, which allows us to overcome the undesirably fast
degradation of success in retrieving all the components of the memory vector. The countermeasure against the
degradation consists in chunking several vectors into one, which is then put into the memory vector.
Holographic reduced representation allows measuring the similarity between two structured concepts by the simple
algebraic operation of the scalar product of their distributed representations. This fact can be very useful when we
want to model processes which search through memory to find similar (analogical) single components. In the
last part of the paper we have demonstrated that holographic reduced representation may also be used to
model an inference process based on the rules modus ponens and modus tollens, and moreover we have
demonstrated the effectiveness of the approach for modeling an inductive generalization process.
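The binding and clean-up machinery summarized above can be sketched in a few lines of Python. This is our illustrative sketch, not the authors' implementation; all names are ours, and only the two operations named in the paper (convolution for binding, involution for approximate unbinding) and the scalar-product clean-up are shown:

```python
import math
import random

n = 512  # dimensionality of the conceptual vectors

def rand_vec():
    # elements drawn from N(0, 1/n), as in the paper
    return [random.gauss(0.0, math.sqrt(1.0 / n)) for _ in range(n)]

def cconv(a, b):
    # circular convolution, the binding operation (the paper's ⊗)
    return [sum(a[k] * b[(j - k) % n] for k in range(n)) for j in range(n)]

def involution(a):
    # involution a*: a*[j] = a[-j mod n], the approximate inverse used for unbinding
    return [a[-j % n] for j in range(n)]

def dot(a, b):
    # scalar product, used both for clean-up and for similarity of concepts
    return sum(u * v for u, v in zip(a, b))

random.seed(1)
role, filler, other = rand_vec(), rand_vec(), rand_vec()
trace = cconv(role, filler)             # memory vector role ⊗ filler
probe = cconv(involution(role), trace)  # decoding: role* ⊗ trace ≈ filler

# clean-up: the decoded vector overlaps the stored filler,
# not an unrelated conceptual vector
assert dot(probe, filler) > dot(probe, other)
```

The decoded probe is only approximately equal to the stored filler, which is why the clean-up step against a list of known atomic vectors is needed in practice.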
ACKNOWLEDGMENTS
This work was supported by the grants # 1/0062/03 and # 1/1047/04 of the Scientific Grant Agency of
the Slovak Republic.
REFERENCES
[1] DAVIS, P. J.: Circulant Matrices. Chelsea Publishing, New York, 1999.
[2] GABOR, D.: Holographic model for temporal recall. Nature, 217 (1968), 1288–1289.
[3] McCLELLAND, J. L. M., RUMELHART, D. E., and the PDP research group (eds.): Parallel distributed
processing: Explorations in the microstructure of cognition, volumes 1 and 2. The MIT Press, Cambridge,
MA (1986).
[4] METCALFE Eich, J.: Levels of processing, encoding specificity, elaboration, and charm. Psychological
Review, 92 (1985), 1–38.
[5] MURDOCK, B. B.: A theory for the storage and retrieval of item and associative information. Psychological
Review, 89 (1982), 316–338.
[6] PLATE, T. A.: Distributed Representations and Nested Compositional Structure. Department of Computer
Science, University of Toronto. Ph.D. Thesis, 1994.
[7] PLATE, T. A.: Holographic Reduced Representation: Distributed Representation for Cognitive Structures,
CSLI Publications, Stanford, CA, 2003.
[8] PLATE, T. A.: Holographic Distributed Representations. IEEE Transaction on Neural Networks, 6 (1995),
623-641.
[9] SLACK, J. N.: The role of distributed memory in natural language processing. In O’Shea, T. (ed.): Advances
in Artificial Intelligence: Proceedings of the Sixth European Conference on Artificial Intelligence, ECAI-84.
Elsevier Science Publishers, New York, 1984.
[10] SMOLENSKY, P., LEGENDRE, G., MIYATA, Y.: Principles for an Integrated Connectionist/Symbolic
Theory of Higher Cognition. Technical Report CU-CS-600-92, Department of Computer Science and 92-8,
Institute of Cognitive Science. University of Colorado at Boulder, 1992, 75 pages.
[11] SMOLENSKY, P.: On the proper treatment of connectionism. The Behavioral and Brain Sciences, 11
(1988), 1-74.
[12] SMOLENSKY, P.: Tensor product variable binding and the representation of symbolic structures in
connectionist networks. Artificial Intelligence, 46 (1990), 159-216.
[13] SMOLENSKY, P.: Neural and conceptual interpretations of parallel distributed processing models. In J. L.
McClelland, D. E. Rumelhart, & the PDP Research Group, Parallel Distributed Processing: Explorations
in the Microstructure of Cognition. Volume 2: Psychological and Biological Models. MIT Press - Bradford
Books, Cambridge, MA, 1986, pp. 390-431.
[14] WILLSHAW, D. J., BUNEMAN, O. P., LONGUET-HIGGINS, H. C.: Non-holographic associative
memory. Nature, 222 (1969), 960–962.
[15] SHASTRI, L., AJJANAGADDE, V.: From simple associations to systematic reasoning: connectionist
representation of rules, variables, and dynamic bindings using temporal synchrony. Behavioral and Brain
Sciences, 16 (1993), 417-494.
[16] KANERVA, P.: The Spatter Code for Encoding Concepts at Many Levels. In Marinaro, M., Morasso, P. G.
(eds.): ICANN ’94, Proceedings of the International Conference on Artificial Neural Networks. Springer
Verlag, London, 1994, pp. 226–229.
[17] KANERVA, P.: Binary Spatter-Coding of Ordered K-tuples. In von der Malsburg, C., von Seelen, W.,
Vorbruggen, J. C., Sendhoff, B. (eds.): ICANN'96, Proceedings of the International Conference on
Artificial Neural Networks. Springer Verlag, Berlin, 1996, pp. 869-873.
[18] KANERVA, P.: Analogy as a Basis of Computation, In Uesaka, Y., Kanerva, P., Asoh, H.: Foundations of
real-world intelligence. CSLI Publications, Stanford, CA, 2001.
[19] FODOR, J. A., PYLYSHYN, Z. W.: Connectionism and cognitive architecture: A critical analysis.
Cognition, 28 (1988), 3-71.
A NOTE ABOUT NON-LINEAR FUNCTION APPROXIMATION
WITH FUNCTION BOUNDARY
Radomil Matoušek
Institute of Automation and Computer Science, Brno University of Technology,
Technická 2, 616 69 Brno, Czech Republic.
Institute of Scientific Instruments, Academy of Sciences of the Czech Republic,
Královopolská 147, 612 64 Brno, Czech Republic.
[email protected]
Abstract: The approximation of engineering data by polynomial techniques is widely known,
as are the advantages and disadvantages of these techniques. This paper discusses a different method of
approximation based on a special kind of function called the Radial Basis Function (RBF). Further, the
paper proposes to embed a genetic algorithm (GA) in the traditional learning algorithm of Radial Basis
Function Networks (RBFN).
Keywords: Genetic Algorithms (GA), Radial Basis Function (RBF), approximation with
function boundary.
INTRODUCTION
In engineering applications there exist approximation tasks which cannot be solved by traditional
mathematical techniques based on polynomial regression. The cause is the physical nature of the problem. For
example, absolute temperature cannot go below zero, deformation resistance must be positive, and so on. In
these cases the behaviour of a polynomial can be unacceptable.
This paper recommends using a special kind of RBF function. The approximation method is taken over
from the neural network denoted RBFN. This paper proposes to embed a genetic algorithm in the traditional
learning algorithm of radial basis function networks. Each function centre, function width and output of an RBF
network is encoded as a binary string, and the concatenation of the strings forms a chromosome. In each
generation cycle, the GA determines the parameters of the RBFN.
RADIAL BASIS FUNCTION NETWORK
Radial Basis Function Networks (RBFN) are a new and extremely powerful type of feedforward artificial
neural network. An RBFN is very often used to implement the mapping f: ℝⁿ → ℝᵐ such that

f_k(x) ≈ w_k0 + Σ_{j=1..M} w_kj Φ(‖x − c_j‖ / σ_j) = y_k(x),   1 ≤ k ≤ m

where x is an input vector in ℝⁿ, y_k(x) is the kth network output, w_kj denotes the weight connecting hidden
node j and output node k, c_j denotes the jth function centre, and σ_j denotes the jth function width.
Common forms of the basis function are:

Φ(r) = e^(−r²)              (Gaussian)
Φ(r) = r² log r             (thin-plate spline)
Φ(r) = (r² + c²)^(−1/2)     (inverse multiquadratic)
Φ(r) = (r² + c²)^(1/2)      (multiquadratic)

for r ≥ 0.
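The mapping above can be sketched for a single output with Gaussian basis functions as follows. This is an illustrative sketch of ours, not the paper's code; the function and parameter names are assumptions:

```python
import math

def rbf_output(x, centres, widths, weights, bias):
    """Evaluate f(x) = w0 + sum_j w_j * exp(-(||x - c_j|| / sigma_j)^2)."""
    total = bias
    for c, s, w in zip(centres, widths, weights):
        r = math.dist(x, c) / s          # radial distance scaled by the width sigma_j
        total += w * math.exp(-r * r)    # Gaussian basis function
    return total

# a tiny network with two hidden units, evaluated at the first centre
y = rbf_output(x=(0.0, 0.0),
               centres=[(0.0, 0.0), (1.0, 1.0)],
               widths=[1.0, 1.0],
               weights=[2.0, 1.0],
               bias=0.5)
```

At the first centre the first unit contributes its full weight 2.0, the second contributes exp(−2) of its weight, and the bias is added on top.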
The topology of the RBFN and of an RBF unit can be described as follows:

[Figure: a general RBFN and an RBF unit on the planar feature space; the training proceeds in phase A and phase B (GA training), with input x_i.]
OPTIMISATION AND GENETIC ALGORITHMS
The optimization task for a GA is assumed to be

max {F(a) | a ∈ {0,1}^l},   min {f(a) | a ∈ {0,1}^l},

where F is the objective function, denoted the fitness function, which satisfies F(a) > 0 for all a. For the following
discussion we will assume a minimization task with objective function f(a). The current simple GA consists of n binary
strings a (individuals), each of length l. This set of n individuals is denoted the population P. Each
individual a is a feasible solution to the problem. The transition between successive populations (P_i, P_{i+1}, ...) is
achieved by applying the genetic operators of selection S, crossover C and mutation M. This transition is denoted
a generation (in numerical terminology, an iteration).
A simple GA with a generational model was used. A short formal description of the GA is
P_{i+1} = g(P_i, ξ), ξ ∈ (S, C, M), where P_0 is a pseudo-random or heuristic setting. As we know, the selection,
crossover and mutation operators determine the genetic algorithm’s behavior. The selection operator has influence on
the convergence behavior of the GA.
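The generational loop P_{i+1} = g(P_i, ξ) with the operators S, C and M can be sketched as follows. This is our illustrative sketch, not the paper's implementation; all parameter values, names and the choice of tournament selection are assumptions:

```python
import random

def simple_ga(f, l=16, n=30, generations=60, pc=0.8, pm=0.02):
    """Minimize f over binary strings of length l with a generational GA."""
    # P0: pseudo-random initial population
    pop = [[random.randint(0, 1) for _ in range(l)] for _ in range(n)]
    for _ in range(generations):
        def select():  # selection S: binary tournament on the objective f
            a, b = random.sample(pop, 2)
            return a if f(a) <= f(b) else b
        nxt = []
        while len(nxt) < n:
            p1, p2 = select(), select()
            if random.random() < pc:  # crossover C: one-point
                cut = random.randrange(1, l)
                p1, p2 = p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]
            # mutation M: independent bit flip with probability pm per gene
            nxt += [[g ^ (random.random() < pm) for g in p] for p in (p1, p2)]
        pop = nxt[:n]
    return min(pop, key=f)

# toy usage: minimize the number of ones (the optimum is the all-zero string)
random.seed(0)
best = simple_ga(sum)
```

In the paper's setting, f would decode each binary string into RBF centres, widths and weights and return the approximation error of the resulting network.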
TEST PROBLEM
As a test example of RBF function approximation and GA learning, we used a set of data points from
measurements of the deformation resistance of steel.
[Figure: polynomial and RBF approximations of the measured data; y (0-150) versus x (0-2).]
CONCLUSION
This paper has shown a method for function approximation which combines the RBFN and a GA to find the
parameters of RBF networks. The result was supported by simple statistics.

Approximation          Δ MSE    Training time
The best polynomial    100 %    < 1 s
RBF GA learning         90 %    5 s
Different powers of polynomial were confronted with the RBF. The MSE statistics do not show a demonstrably
better quality of the RBF. But the power of the RBF solution showed in its always-suitable solutions, i.e. the
function values fulfilled the condition f(x) ≥ 0.
[Figure: polynomials of different powers versus the RBF approximation near the boundary; the polynomial function becomes negative and is therefore unacceptable, while the RBF solution, with the centres of the RBF units marked, satisfies f(x) ≥ 0.]
REFERENCES
[1] OŠMERA, P.: An Application of Genetic Algorithms with Diploid Chromosomes, Proceedings of
MENDEL’98, p. 86 – 89, Brno, Czech Republic, 1998.
[2] GOLDBERG, D. E.: Genetic Algorithms in Search, Optimization and Machine Learning, Wesley, 1989.
[3] LIGHT, W.: Advances in Numerical Analysis, Oxford Science Publications, Oxford, 1992.
ACKNOWLEDGEMENTS
The paper was supported by the research projects CEZ: J22/98: 261100009 “Non-traditional methods for
investigating complex and vague systems”, VZ MŠMT ME 526 and VZ MŠMT No. 260000013.
ANALYSIS METHODS FOR IMAGE PRE-PROCESSING AND 2-DIMENSIONAL
PATTERN RECOGNITION
Roman Weisser
Orlová, 73514, Czech Republic, [email protected]
Abstract: The subject of my thesis is the analysis of algorithms for the pre-processing of visual data and
object recognition in 2-dimensional technological scenes. This work was developed within the research
project CEZ: J22/98: 261100009 „Non-traditional methods for investigating complex and vague
systems“.
INTRODUCTION
Image pre-processing:
Methods of image pre-processing are used to improve the image for further processing.
The purpose of image pre-processing is to reduce the noise created during digitization and image transmission,
to remove distortion arising from the properties of the sensor device, and to suppress or emphasize other
characteristics relevant to subsequent processing.
Methods used for image pre-processing: conversion to greyscale, histogram equalization, filtration, image
sharpening, contrast and brightness correction.
Segmentation:
Segmentation is a procedure which partitions the image into parts closely corresponding to objects.
Methods used for segmenting an image: thresholding, edge detection, coloring.
Object description:
The next level of pattern recognition is the description of the found regions.
Methods of object description: feature-based description, syntactical description.
Pattern recognition:
Pattern recognition consists of sorting objects into classes. A class is a subset of objects whose elements have a
common feature from the classification standpoint. Objects have a physical character; in computer vision they are
most frequently understood as parts of a segmented image.
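As an illustration of the simplest segmentation method listed above, thresholding can be sketched as follows. This is our illustrative sketch, not code from the thesis; the names and the example scene are assumptions:

```python
def threshold(image, t):
    """Binarize a greyscale image (list of rows of 0-255 values):
    pixels >= t become 1 (object), the rest 0 (background)."""
    return [[1 if px >= t else 0 for px in row] for row in image]

# a 3x3 toy scene with a bright region in the upper right corner
scene = [[10, 12, 200],
         [11, 220, 210],
         [9, 10, 13]]
mask = threshold(scene, 128)  # -> [[0, 0, 1], [0, 1, 1], [0, 0, 0]]
```

The quality of the resulting mask depends entirely on the choice of t, which is why the thesis pairs thresholding with prior filtration.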
TESTING METHOD
The single methods were tested on the following computer:
Microprocessor: AMD Thunderbird 1132 @ 1267 MHz, 266 DDR
RAM: 256 MB / 266, CL 2
Methods for image pre-processing:

action                                        time (s)
Sharpening – Roberts operator                 0.105
Sharpening – Laplace operator                 0.12
Coloring transformation – greyscale           0.055
Brightness, contrast transformation           0.05
Histogram equalization                        0.125
Segmentation – threshold                      0.02
These methods were tested on a scene with:
Width: 512 pixels, Height: 512 pixels
Number of objects: 5
Coloring table (times in seconds; rows: number of objects, columns: bitmap size / object type):

number of objects   512×512 randomly   512×512 coincident   1024×1024 coincident
1                   0.06               0.05                 0.19
2                   0.07               0.06                 0.25
3                   0.1                0.07                 0.3
4                   0.131              0.07                 0.32
5                   0.161              0.171                0.391
From the introduced table it follows that the coloring method depends not only on the image size, but also on
the contours of the objects. The object type named coincident contains identical objects, consistently rotated and
moreover placed so as to minimize the use of colours via the colour equivalence table. The objects named
randomly are randomly rotated and mutually varied.
Filtration table:

Averaging       mask 3   mask 5   mask 7   mask 9
time (s)        0.661    1.161    1.783    2.553

Median          time (s): 0.451

Gauss filter    sigma 0.1   0.5    1      1.5     2
time (s)        0.1         0.12   0.14   0.171   0.2

Rotating mask   iterations 1   5       10      15      20
time (s)        0.14           0.681   1.362   1.993   2.724
This table summarizes the methods for filtering scenes. The fastest method for filtering an image is the Gaussian
filter, which also produced very good results. The parameter sigma of the Gaussian filter indicates the degree of
smoothing of the scene. The filtration mask means the size of the array used for filtration: the bigger this array,
the more neighbouring pixels are included in the filtration, and thus the greater the smoothing (blur). But, as the
averaging table shows, using a bigger filtration mask greatly increases the filtration time.
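The averaging filtration with a k×k mask described above can be sketched as follows. This is our illustrative sketch, not code from the thesis; names and border handling (clamping) are assumptions:

```python
def average_filter(image, k):
    """Averaging (box) filtration with a k x k mask (k odd); borders are clamped."""
    h, w, r = len(image), len(image[0]), k // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # collect the k x k neighbourhood, clamping indices at the borders
            vals = [image[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
                    for dy in range(-r, r + 1) for dx in range(-r, r + 1)]
            out[y][x] = sum(vals) / len(vals)  # mean of the neighbourhood
    return out

# a single noise spike is spread over its neighbourhood by a 3x3 mask
noisy = [[0, 0, 0],
         [0, 9, 0],
         [0, 0, 0]]
smooth = average_filter(noisy, 3)  # the central spike 9 becomes 1.0
```

The k² pixel reads per output pixel make the cost grow with the mask size, matching the times in the averaging table above.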
METHODS FOR EDGE DETECTION
In this part single edge detectors are compared. The introduced methods are compared according to a few
characteristics:
Safe edge: indicates the cleanness of the edge – whether residual noise remains, whether the edge width has the
same pixel count everywhere, etc.
Doubling edge: criticizes edge reduplication, or a double display of the object with a displacement.
Noise: criticizes the cleanness of the scene.
Thickness of edge: this item criticizes the edge width throughout the edge length.
Contour preservation: indicates shape modification of the object after edge detection.
Erasure of edge: mistaken erasing of whole edge sections has been watched for.
The last item informs about the setting of the input values.
From the results it follows that the fastest method for edge detection is the zero-crossing operator. But the resulting
scene contains considerable noise. This can be prevented if the input of this method is a thresholded
image.
Prewitt operator, Laplace operator, Gradient operator, Zero crossing table:
The second fastest method is the gradient method. The result of this method is an image with less noise than the
zero-crossing operator. If the input of this method is a thresholded image, it generates a 2-pixel-wide edge (ideal
for further processing).
The next methods are the Isotropic operator, the Sobel operator and the Prewitt operator. These methods have
approximately identical outputs and differ only occasionally in edge detection. The fastest response has the Sobel
operator, then the Isotropic operator, and in last place is the Prewitt operator.
Sobel, Isotropic operator table:
It has been observed that the accelerated calculation of the Sobel operator and of the Isotropic operator does not
markedly affect the output image, while it speeds these methods up about twofold. These methods can cleanly
separate real edges from noise, so the resulting scene is almost without noise.
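The Sobel operator discussed above can be sketched as follows. This is our illustrative sketch, not code from the thesis; the gradient magnitude is approximated from the standard horizontal and vertical 3×3 masks, and the border handling (left at 0) is an assumption:

```python
import math

GX = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]  # horizontal Sobel mask
GY = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]  # vertical Sobel mask

def sobel(image):
    """Return the gradient magnitude of a greyscale image (borders left at 0)."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = sum(GX[j][i] * image[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            gy = sum(GY[j][i] * image[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            out[y][x] = math.hypot(gx, gy)  # gradient magnitude
    return out

# a vertical step edge: the response peaks along the step
step = [[0, 0, 9, 9] for _ in range(4)]
edges = sobel(step)
```

The "accelerated calculation" mentioned above typically replaces math.hypot(gx, gy) with the cheaper |gx| + |gy| approximation.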
The next tested method is the Laplace operator. This is the slowest of all the mentioned methods, and its output
contains too much noise.
If the major criterion is not detection speed, then for scenes without prior preparation the most suitable of the
mentioned methods is the Prewitt operator.
The next method is the deviation method. When using a 3 × 3 mask, this method generates very good edges. The
deviation method has a handicap, though: it is slow in comparison with competing operators. When using
bigger masks the lines expand and the time requirements increase, and the scene after edge detection
must undergo a thinning operation (erosion method).
Deviation method table:
The next method for edge detection is the Canny detector, which is used in medicine in computed
tomography (CT). It is a very good edge detector, generating edges 2 pixels wide and moreover smooth. With
decreasing sigma the speed of edge detection soars, but noise in the resulting image rises just as quickly.
With increasing sigma the shapes of single objects in the image become modified: sharp edges of these objects turn
into curves. That is one of the reasons why this detector is suitable for detecting objects that consist
largely of curved shapes. The shape modification is especially perceptible in grammar-based classification, which is
sensitive to any diversity. If an optimal setting of this detector is found, it produces results fulfilling all criteria with
first-rate quality. The only handicap of this detector is speed: in comparison with the other methods tested here, it
ranks among the slowest algorithms.
Canny detector table:
The next edge detector is the Rothwell edge detector, which detects edges just as well as the Canny
detector, but the output image has no shape modification. However, with increasing sigma this method trims the
margins of the scene, and consequently objects near the margin may be modified so much that they will be badly
classified. This edge detector leaves sharp corners on edges and is therefore suitable for syntactical methods of
image recognition. Edges detected by this method are 2 pixels wide.
Rothwell detector table:
Wavelet transformation table:
Symbols in the following tables:
The single methods were classified by mutual comparison.
METHODS FOR OBJECT CLASSIFICATION
These methods constitute the last and uppermost step in the theory of computer vision.
The following methods were compared with each other:
Recognition with the aid of moments,
Recognition with the aid of Fourier descriptors,
Recognition with the aid of a grammar which describes the edge of the object,
Recognition with the aid of a neural network (back propagation).
The fastest methods for pattern recognition are recognition with the aid of a grammar and the neural network.
Next is the method of Fourier descriptors, and last is recognition with the aid of moments.
The moment-based recognition method gives very good results. This method classifies most objects faultlessly
already when trained on a single model etalon. The moment method can be used with a different edge detector
than the one for which the moments were enumerated, since the moments are not too sensitive to changes of object
edges, e.g. the Canny detector (at higher sigma, rounded edges) and the Sobel operator. This method is unfit for
applications expected to recognize objects with minimally dissimilar shapes, because it is not sensitive to small
shape changes. It is fit for applications that require fast recognition of dissimilar objects in different rotations.
The next method is Fourier descriptors. This method is not invariant with respect to rotation, so during
classification it does not recognize objects which are rotated differently than the objects in the etalon scenes. The
problem with rotation can be sorted out by successive training of models, but this increases the hazard of mistaken
classification. This method is faster than moment-based pattern recognition and is fit for applications that do not
expect to recognize objects in dissimilar rotations.
The next method is recognition with the aid of a grammar. If primitives mark single partitions of the edge, the
grammar is very sensitive to small mistakes in edge detection. The grammar has to be put together specifically for
a definite type of edge detector; e.g. it is a fault to put together a grammar for objects detected by, say, the zero-crossing
operator and then recognize objects by means of the Canny edge detector with a sizeable sigma (edge modification).
Grammars fit applications which require recognition of objects in different rotations and where the emphasis is on
recognizing small changes in edge segments. Creating a grammar requires time and knowledge of the grammar
description of edges. The rules of the grammar must be created manually, not automatically as with the other methods.
The last tested method is the back propagation algorithm. This method is the fastest of all the compared methods.
For the description of objects, this method used a 70-element feature vector oriented from the centre of gravity to
the object edge. This method reliably recognizes objects with considerably different shapes, but it can faultily
identify objects with similar shapes. This method recognizes objects in dissimilar rotations. The error of this
method grows for constricted objects.
CONCLUSION
The subject of my thesis was the analysis of algorithms for the pre-processing of visual data and object recognition
in 2-dimensional technological scenes.
Methods of pre-processing are time-consuming, so it is suitable to use them only when necessary.
Every operation on a scene decreases the amount of information contained in it. Using pre-processing methods is not
always necessary and depends on the quality of the scene. If the image contains a large quantity of noise, it is
advisable to insert Gaussian filtration, which is very fast and gives good results for minimal sigma.
After filtration it is also much simpler to find a threshold in the image than in an unfiltered image.
If higher edge quality is demanded (e.g. in syntactical recognition, where the primitives are single edge
partitions), it is fit to threshold the image before edge detection. For threshold-segmented images I
unambiguously advise gradient edge detection, or the zero-crossing operator, which in combination with thresholding
have very good time as well as qualitative results. Very good results are also achieved by the Canny edge detector
and the Rothwell edge detector, even without thresholding.
Recommended / unrecommended methods for image pre-processing:
For object recognition, brightness transformations are not necessary (changes of brightness and contrast, histogram
equalization, etc. are methods suitable when a human has to qualitatively analyse the scene). For most cases it is not
necessary to use higher-order masks in the averaging method.
Recommended / unrecommended methods for edge detection:
The worst results are served by the wavelet transformation, which does not belong among methods where the
requirement is to accurately determine the object edge, in contrast, for instance, to methods concentrating on face
recognition.
There is no need to use higher-order masks in the deviation method, because the edge width dilates and a thinning
method would then have to be inserted into the subsequent procedure.
Since the Prewitt operator, the Sobel operator and the Isotropic operator give identical results in the quality of the
output image, these methods can be judged according to the duration of the calculation. Therefore I would
recommend only the Sobel operator and the Sobel accelerated calculation.
Recommended applications for the pattern recognition methods:
Grammar recognition – fits where the requirement is to recognize rotated objects, to detect single edge partitions
with high accuracy without the possibility of meaningful mistakes, and where high classification speed is demanded.
A meaningful mistake is understood as a mistake which cannot be covered by the rules.
Recognition with the aid of Fourier descriptors – fits where high classification speed is demanded and no rotation of
objects is supposed.
Recognition with the aid of moments – fits where an accurate edge course is not that important, and a merely
„hard“ partition into the single classes is sufficient; e.g. it does not matter whether between two edges there is a
sharp passage or a short curve. The advantage compared to Fourier descriptors is invariance to object rotation,
though at the expense of slower calculation.
Recognition by neural network – fits where high classification speed with randomly rotated scenes is demanded
and where we need to tolerate differences between learned etalons and classified scenes.
LITERATURE
[1] WEISSER R., Software pro klasifikaci objektů [thesis], Brno, VUT FSI 2003
[2] KOZÁK M., Využití genetických algoritmů v systémech pro rozpoznávání objektů [thesis], Brno, VUT FSI
2002
[3] MÁLEK M., Programový systém pro rozpoznávání objektů [thesis], Brno, VUT FSI 2002
ON THE DISREGARDED FORMS OF COMPUTER CRIME
Karel Pichl,1) Markéta Pichlová2)
1)
Military Academy in Brno, K 303. Kounicova 65, P.B. 13, 612 00 BRNO, CZECH REPUBLIC
Tel. +420-973442336, Fax: +420-973442987, E-mail: [email protected]
2)
Gaz de France, Département Etudes Générales; 361, Avenue du president Wilson – B.P. 33
93211 SAINT-DENIS LA PLAINE CEDEX, FRANCE.
Tel. +33-1-49225000, Fax +33-1-49225142, E-mail: [email protected]
Abstract: This paper reports on the unjustifiably disregarded forms of computer crime, focussing on
the present status in the Czech Republic. They include: ”An assault on a computer, program, data,
or communication facility; unauthorised use of a computer or communication facility; unauthorised
access to data; theft of a computer, programme, data, or communication facility; changing of programs, data and technical equipment, and finally, abuse of information technology to commit a
crime of another form.” The paper mentions also some weak points of computer systems and the
methods used in an unauthorised access to information systems.
Keywords: Computer crime, criminal acts, dematerialised crime, intellectual theft, copyright violation, unauthorised used of a computer or communication facility, unauthorised access to data, weak
points of computer systems, methods of unauthorised access to information systems, computer viruses, computer espionage, software piracy.
I. INTRODUCTION
While ”classical” crime has remained practically unchanged over the centuries, modern technology has
also brought about new forms of crime. A murder remains a murder, no matter if the criminal uses a laser gun
instead of a slaughter axe. Together with the increasing dematerialisation of commonly used tools (in particular information and financial tools), a new phenomenon has emerged, which could be termed dematerialised
crime.
It is rooted in dishonesty in the fields of business, enterprise, banking and insurance. It exploits computer systems with databases filled with information, the knowledge or manipulation of which bestows significant advantages. It is committed by persons with a good to very high level of expertise.
Dematerialised crime is a result of the dishonesty of people with a high level of responsibility and powers,
working at positions enjoying trust – the ”white collars”, as the clerk positions are sometimes called. In committing the crime, they use their brains and often highly sophisticated methods. This type of crime does not carry any
signs of violence as it is commonly understood, but is almost always based on an illegal manipulation or fraud by
the offender. It is reasonable to say that if these culprits exerted the same energy, creativity and smartness in
doing business, they would be highly successful. The prevailing motive of these crimes is profit, a personal
gain. Dematerialised crime is very lucrative – yielding a hundred, a thousand or more times the proceeds of
”classical” crime. However, there may also be other motives or combinations thereof:
· The feeling of superiority over the employer, police, or public, the feeling of impunity and of being undiscoverable.
· The belief that small losses cannot harm the company.
· A wish to compensate dissatisfaction about work or personal life.
· A feeling of being exploited or injured by the employer.
· Desire for adventure and risk.
The victims of this crime are mostly large banks and multinational companies (which often have no interest in making public the information that a fraud or other threat has taken place), or the public funds. In both
cases, the injured parties are major operators of computers and in particular of computer networks and long-distance data transmissions. In such systems, there is a greater anonymity of access to the network due to the
greater openness of the network and the variability of possible ways of access.
In a majority of cases of dematerialised crime, modern technologies are involved that in fact enable the
very existence of the intangible assets. They therefore serve very well both as objects of and as tools for this type
of crime.
Regarding the computer, the dematerialised crime may be divided into two basic categories:
- offences where the computer (program, data etc.) is a tool,
- offences where the computer (program, data etc.) is the target.
Examples of the first category include credit card fraud, bank account fraud, withholding of business
proceeds in trade, etc. The second group covers software piracy, hacking, unauthorised use of computers
(theft of processor time), etc. A typical feature of dematerialised crime, however, is that its various aspects and
forms are combined, and that the crime often has an international character.
Computer crime is in general discussed very little in the literature, and the authors of the few articles
that exist concentrate almost exclusively on software rights, copyright and licences and their application to
software. That is too little. Other forms of computer crime remain completely disregarded, including:
· An assault on a computer, data or communication facility.
· Unauthorised use of a computer or communication facility.
· Unauthorised access to data, retrieving confidential information (computer espionage) or other
information on persons.
· Theft of a computer, data or communication facility.
· Changing of programs and data (and sometimes of the technical connection of a computer or
communication facility).
· Abuse of information technology to commit a crime of another form.
· Frauds committed in relation to information technology.
Whenever any criminal activity related to information technology takes place, it is indispensable to
seek the assistance of a computer expert, who can prevent the deletion of ”mined” data, restore deleted information, prevent the further spreading of a virus, etc.
II. WEAK POINTS OF COMPUTER SYSTEMS
In general, there are six major ”weak points” of computer systems, which may be abused to commit
crime. They are:
· system software,
· user programs,
· databases,
· terminals,
· telecommunication connections,
· microcircuits.
The perpetrators of computer crimes very often focus on computer software, targeting both the operating system and testing programs that ensure and control the operation of the computer, and application software
serving to solve particular tasks. Such software may be stolen or degraded in some way.
Databases concentrate a large amount of information to be further processed by a computer. The objective of the offenders is to capture the information and use it to their advantage, or to degrade or destroy it in order to inflict significant damage on its owner.
Terminals are essential components of mainframe computers and computer networks. They allow the
user to communicate with a machine that may be a few metres or hundreds of kilometres away. In a
terminal network, the computers are mostly connected by telephone lines. The perpetrators seek to gain access to
terminals, or they connect secretly to the respective connection lines. They are thus able to obtain input and output information from the computer processing and sometimes even to interfere with the user programs. The telecommunication lines also offer an opportunity for illegal connection to the network.
Another vulnerable part of computers is the microcircuits, where undesired actions are difficult to identify. Making use of the LSI and VLSI technologies employed, ”secondary records” of the information being processed
may be created.
III. METHODS OF UNAUTHORISED ACCESS TO COMPUTER SYSTEMS
The literature reports on a number of methods of illegal access to computer systems. More than fifty
methods have been identified. Each of the methods usually has various modifications and variations, some
methods are used in combination. In principle, the methods may be divided into:
· Active unauthorised access methods (direct methods), where the offender deliberately exercises an activity seeking in particular to obtain information processed in the computers.
· Passive unauthorised access methods (these may appear unintentional; however, they often involve a disguising of the true intentions).
Active unauthorised access methods
Active unauthorised access methods include in particular:
· destruction or theft of data,
· the Trojan horse method,
· method of alienating small amounts from various sources,
· unauthorised use of application software,
· misuse of asynchrony in computer processing,
· misuse of identifiers and passwords,
· misuse of transmission lines,
· time bomb,
· logical bomb,
· information purging method,
· wiretapping,
· role sharing,
· misuse of the failure and repair tasks,
· simulation and modelling.
Passive unauthorised access methods
To degrade or misuse the processed information, these methods rely on:
· damaging of memory media,
· malfunction of reading (writing) equipment,
· wrong marking or misnaming of a file, program or privilege,
· modification or replacing of data or a program,
· incorrect user instructions,
· wrong user address,
· switching of terminals (redirection of output to another terminal),
· error in identification (of files, terminals, authorised users),
· failure to observe applicable organisational guidelines,
· input of wrong information into the computer memory,
· external effects (e.g. a magnetic field).
Due to the limited scope of this paper, the individual methods will not be discussed in detail here (an attentive reader can in any case appreciate their practical applications). The methods will be briefly
introduced in our conference presentation; for more details, the reader is referred to the literature in the field.
The following part of this paper is dedicated to methods that are often unjustifiably disregarded.
IV. UNJUSTIFIABLY DISREGARDED FORMS OF COMPUTER CRIME
1. Assault on a Computer, Program, Data or Communication Facility
In considering this case, it is necessary to recognise that there are two areas of action, which may also
be combined: an assault on the computer hardware and accessories (theft, damage, etc.) and an assault on
the information stored in the computer. An example of a combination of the two is the theft of a notebook with
valuable software, including text and data files storing e.g. a large work of literature, a technical specification, or
the complete information system of a company.
As regards an assault on the computer hardware as physical property, the case is quite clear, since the penal law
does not distinguish between a computer and any other thing. In this respect, a computer may be stolen,
damaged or illegally used.
Probably the simplest to describe are offences of the category of physical assaults on computer facilities, magnetic media, computer network lines, etc.
However, the issue is how to determine the price (value) of the damaged data or any other damages occurring or threatening as a result of such a crime. While the price of a computer, its part or a magnetic medium
as such is rather easy to determine, assessing the value of information (data stored) is a difficult task.
There are two factors relevant to the price of information: the costs expended in obtaining and compiling the information, and the value that the information may have for its owner (or user). It must be admitted
that the second factor – i.e. the value for the owner – belongs to an area requiring much further work. Indeed,
there exists no established methodology, and any experiments will only lay the very foundations for a general
approach.
Logical assaults are much more sophisticated: the offenders directly exploit the properties (or the weaknesses) of the computer system. Examples include in particular the so-called logical bombs (programs activated
under certain conditions and performing certain actions) and viruses (harmful programs introduced from
outside by a deliberate action or by failure to observe the rules for handling diskettes and for remote communication),
which delete memory contents, block computer operation by performing a large number of useless operations,
etc.
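The triggering of these programs – a time bomb firing on a preset date, a logical bomb firing when a logical condition becomes true – can be illustrated with a deliberately harmless sketch. The date, the threshold and the returned messages below are invented for illustration; the destructive payload is replaced by a plain return value:

```python
import datetime

def check_trigger(today: datetime.date, files_processed: int) -> str:
    """Benign illustration of the two trigger styles; no real payload."""
    # A "time bomb" fires when a preset date is reached.
    if today >= datetime.date(2004, 12, 31):
        return "time-bomb trigger met"
    # A "logical bomb" fires when some logical condition becomes true,
    # e.g. a processing counter exceeding a threshold.
    if files_processed > 10_000:
        return "logical-bomb trigger met"
    return "no trigger"

print(check_trigger(datetime.date(2004, 1, 29), 500))  # no trigger
```

The point of the sketch is that the trigger condition is indistinguishable from ordinary program logic, which is why such code is so hard to find by inspection.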
In this area, there are two categories of offenders: those who are proud of their authorship, and those
who remain cautiously clandestine. Crimes may also be committed through a programmer’s negligence, because
not every programmer can appreciate all the effects that may cause his program to behave as a logical bomb or a
virus. Unfortunately, the law does not recognise negligent damage to another person’s property.
2. Unauthorised Use of a Computer or a Communication Facility
The second category of computer crime mainly includes two types of action: computing on a computer, and the use of a program or communication facility belonging to another person without that person’s knowledge or permission – i.e. unauthorised use of computers.
Probably the most typical offence of this type committed from ”inside” is the private business activity of
programmers. Statistics show that over a half of the perpetrators of computer crimes are data processing
experts – system programmers, managers, technical staff. The legal awareness of this staff is often inversely
proportional to their commitment to IT work and their fascination by technical and software capabilities.
A member of staff who has good knowledge, skills and interesting work in which he achieves remarkable
success may still not be, or may not feel, sufficiently remunerated. In such a situation, an opportunity, an offer or
another trigger may arise that leads to a crime. And for people who deliver little work for little money, the line
is even easier to cross.
In all of the above cases, two typical features of computer crime are present:
a) Computing on a computer (and using the software) of the employer.
b) Selling programs created within an employment contract to other users on the offender’s own account and in his own name.
While case b) carries clear signs of a crime, case a) is usually judged very tolerantly by those around (not to mention by the offender himself). It is always necessary to distinguish between a failure to observe working
obligations, a violation of ownership rights, and a crime.
It is practically impossible to track unauthorised use of a personal computer that is not connected to a
computer network and is used by a single user. The operations are not documented in any way, because personal
computers usually do not register processor time. Larger computer systems (mainframes) and computer networks usually make it possible to determine that certain tasks were carried out and, in most cases, to identify the user who launched them. The difficult issue, however, is to find out whether the use was appropriate or
unauthorised.
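The kind of activity registration such systems keep can be sketched as follows. This is a minimal illustration only: the user and task names are invented, and a real mainframe accounting system records far more detail (processor time, devices used, etc.):

```python
import datetime

# Minimal sketch of task-launch registration: every launch is recorded
# with the user and a timestamp, so it can later be established *who*
# ran *what* -- though not whether the use was authorised.
audit_log = []

def launch_task(user, task):
    """Record a task launch in the audit log before it runs."""
    audit_log.append({
        "user": user,
        "task": task,
        "time": datetime.datetime.now().isoformat(),
    })

launch_task("jnovak", "payroll_report")
launch_task("jnovak", "private_compile")  # a later review must judge intent

# The log answers "who launched which task", nothing more.
for entry in audit_log:
    print(entry["user"], entry["task"])
```

The sketch makes the paper's point concrete: the log establishes the fact of use and the identity of the user, while the question of authorisation remains a human judgment.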
3. Unauthorised Access to Data
In this area, covering unauthorised access to data and the retrieval of confidential information (computer espionage) or of information that in other countries is protected by personality protection laws, a crime may
in practice be proved only after such information is actually used. On the other hand, it may be difficult to
penalise database hacking, in particular when done through a modem. The essential problem is to prove
an intent to cause damage, as required by the law. Indeed, many hackers perform their activities only as an
intellectual hobby, having no material or other ambitions, and not needing to share their enjoyment with anyone.
They may well be satisfied by knowing that they are smarter than the designer of the security system of the computer. However, it may also happen that a hacker makes a particularly successful hack and, as he
browses the database, decides to cross the line of a hobby and misuse the information obtained, usually for some form of compensation. However, at the time he gained access to the data carrier, he
had no intention of misusing the information. And the illegal use of such information is yet another issue.
We doubt that the ministry of the interior or of defence, or a major bank, would be happy to find out that
some adolescent spends his nights browsing through their databases. But unless they can establish and prove
that he used, or at least intended to use, the information, they can do nothing (at least as far as the penal law is
concerned) but quietly improve their data protection systems.
Mere copying of data (and programs) without using them is practically impossible to detect. Investigating and proving such acts is rather difficult, as there is usually a significant time lag between the illegal
obtaining of the information and its misuse. The only evidence is once again the above-mentioned activity
tracking and registration.
More than ten years ago, on 29 April 1992, Act No. 256/1992 Coll., on the protection of personal data in information systems, was passed. This legislation provides for the protection of personal information; in particular, it
stipulates the obligations and responsibilities related to the protection of information in the operation of any information system that processes personal data of individuals. The law is constructed entirely in civil-law
terms, not stipulating any penal liability but defining the claims (Art. 22) that accrue to the injured party in case of a
breach of obligations on the part of the operator, intermediary or another liable person.
4. Theft of a Computer, Program, Data or Communication Facility
While the theft of tangible property in this fourth category of crime remains a theft, the so-called
”intellectual theft” consisting in the copying of a program or data is once again more difficult to qualify legally. In
the former case, the perpetrator must physically take the things away, while in the latter case, the original objects remain in place and the perpetrator takes away only their copies (or alternatively, he may remove the
originals and leave copies). In this respect, it is in most cases impossible to find out whether a digital record is an
original or a copy. The only evidence available is secondary evidence, such as invoices, delivery certificates,
witness testimony, printer outputs or files in magnetic memory, and in particular the registered date of their creation.
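As a minimal illustration of such registered dates, the metadata of a file can be read as follows. This is a sketch only, using a throwaway temporary file; note that such timestamps are easily forged, so in practice they corroborate other evidence rather than prove originality on their own:

```python
import datetime
import os
import tempfile

# Create a throwaway file standing in for a seized data file.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"ledger data")
    path = f.name

# os.stat exposes the registered size and modification time.
info = os.stat(path)
print("size:", info.st_size, "bytes")
print("modified:", datetime.datetime.fromtimestamp(info.st_mtime))

os.remove(path)  # clean up the temporary file
```

On most filesystems only modification, access and status-change times are kept, which is precisely why the paper treats the recorded dates as secondary rather than conclusive evidence.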
In the Czech Republic, as well as in most post-communist and developing countries, such software piracy is regarded by many as a completely normal way of acquiring the programs needed. In this respect, the provisions of the copyright law should be recalled, stipulating that computer code too may be subject to copyright. This has important repercussions also in the penal law area. The cited copyright law precisely stipulates the
rights of the author, so that not only the person spreading programs as a pirate, but also anyone having such programs on his computer, is likely to run the risk of penal prosecution.
The value of software may be determined from the selling price or otherwise (based on expert
opinion), so that everyone is able to calculate not only how much he has saved, but also how much he will be
liable to pay if the software used is found to have been obtained illegally. And for the illegality of the software to come to light, one dissatisfied employee or former employee is enough. Everyone knows very well
that legal software comes with original manuals, original diskettes and licence certificates, while illegal programs are characterised by the fact that there is either no manual or it is photocopied. The nature of programs,
their structure and the other evidence available usually provide no means of determining conclusively which is the
original product and which is a copy.
5. Changes in Programs, Data and Technical Equipment
This fifth category includes in particular:
· Changing of programs and data by other programs (viruses) or by direct programmer intervention (which is common in particular in computer-assisted fraud).
· Less commonly, an alteration of the connection or another attribute of technical equipment or a
communication facility.
In the case of changes to programs or data, it is possible (in addition to the possibilities mentioned above) to try to
find older or deleted versions of files on the magnetic media and make a comparison. However, it is advisable to
charge a qualified professional with this task. It is therefore necessary to ensure that the whole computer, or at
least the magnetic memories, remain untouched. This can also be ensured by removing the hard disk from the computer and depositing it in a safe deposit box.
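Such a comparison of an archived version against a seized copy can be sketched with cryptographic digests. This is an illustrative example only; the file contents are invented, and a real examination would of course be done on forensic images by the qualified professional mentioned above:

```python
import hashlib

def fingerprint(data: bytes) -> str:
    """SHA-256 digest of the data; any change to the data changes the digest."""
    return hashlib.sha256(data).hexdigest()

# Invented file contents standing in for an archived version and a seized copy.
original = b"IF balance < 0 THEN alert"
tampered = b"IF balance < 0 THEN skip "

# Comparing digests reveals tampering even when the two files have
# identical names and sizes.
print(fingerprint(original) == fingerprint(original))  # True
print(fingerprint(original) == fingerprint(tampered))  # False
```

The digest comparison shows *that* a change occurred; locating *what* changed still requires a line-by-line comparison of the two versions.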
An alteration of the technical connection is a crime that is easier to detect and document, and where it
should not be a problem to gather evidence on the case. The most appropriate evidence is the equipment itself.
Detailed documentation of the original connection and assembly diagrams is also indispensable, as well as the
history of signals and tests. As noted above, investigating assaults on the information content of computers for
penal law purposes will obviously pose a greater problem.
Despite several unclear stipulations, the law penalises virtually all unauthorised interference with software. In the authors’ opinion, this should also include viruses, regardless of whether the virus destroys or merely
damages the data stored. In this respect too, the lack of penal liability for negligent acts is apparent, because a
co-worker who introduces a highly destructive virus to a computer through e.g. a computer game may inflict
much more damage than someone who deliberately makes pirate copies.
6. Misuse of Information Technology to Commit Other Crime
The easiest, and therefore probably the most widespread, way to illegal income using IT is the manipulation of data – goods in stock, revenues, health insurance and the number of employees, fulfilment of the plan, account balances, etc. Unlike the manipulation of hard-copy documents, the manipulation of electronic data has
several advantages for the offender:
· Deletion or replacement of data on magnetic media is much easier and leaves practically no
traces.
· People (an inspector, a customer, etc.) a priori consider computer outputs reliable
and trust them (though subconsciously).
It is precisely this high level of trust in computer outputs that makes offences committed using information technology so successful, and nothing is easier than to
”improve” the outputs in this sense. Indeed, who is going to re-calculate columns of numbers added up by a
computer, or even by a simple calculator with an output slip printer? There are tens of thousands of people who
were attracted to participate in a financial ”game” of the pyramid type by the slogan ”No need to look for followers – everything is controlled by a computer”.
Data may be changed by manipulating the documents from which they are recorded, by altering the
data on the storage media, in the course of a computation performed by the computer, or in the printer output.
Making a change in an already printed output is technically the trickiest; however, if the printer output is also
saved as a file, the task becomes rather easy. With less qualified staff, the most common form of fraud is the
manipulation of the input documents (i.e. inputting wrong data). A co-worker who processes his records on a
computer – whether a PC or a terminal of a mainframe machine – may change data practically arbitrarily
prior to, during and after the processing. However, he or she must obviously somehow reconcile the change with the remaining information systems of the company, with which there are logical links.
The main sources of evidence for criminal offences are records kept correctly and in full, in compliance
with the stipulations of the law, the carrying out of all required bookkeeping and other operations, and comparison of
the results of manual and computer processing. If offences of this type are suspected, it is necessary to secure all
hard-copy documents as well as to record the current status of the data and programs in the computer. If the computer cannot be taken out of service, its status must be archived to provide evidence for experts.
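The comparison of manual and computer processing mentioned above can be sketched as an independent recomputation: the detail records are re-added by hand (or by a separate program) and checked against the total the system reports. All figures below are invented for illustration:

```python
# Detail records (invented figures) and the total reported by the
# possibly manipulated system output.
detail_records = [1200.00, 845.50, 310.25, 99.99]
reported_total = 2555.74  # the output has been "improved" by 100

# Independently recompute the total from the detail records.
recomputed = round(sum(detail_records), 2)

if recomputed != reported_total:
    print(f"discrepancy: recomputed {recomputed}, reported {reported_total}")
else:
    print("totals agree")
```

Trivial as it is, this is exactly the check that the blind trust in computer outputs described above causes inspectors and customers to skip.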
7. Frauds Committed in Relation to Information Technology
As a result of the assaults mentioned above, yet more involved criminal activity may take place. The
common criterion for judging that a crime has taken place is in general the presence of the factual content of the