Geoinformatics FCE CTU

Faculty of Civil Engineering, Czech Technical University in Prague
Volume 3, 2008
Proceedings of the workshop Geoinformatics FCE CTU 2008, September 18-19th, Prague, 2008.
Editorial board:
Editor in Chief:
Members:
Aleš Čepek, Czech Technical University in Prague
Jáchym Čepický, Help Service - Remote Sensing
Martin Hrubý, Brno University of Technology
Martin Landa, Czech Technical University in Prague
Geoinformatics, Faculty of Civil Engineering, Czech Technical University in Prague
ISSN 1802-2669
This book was prepared from the input files supplied by the authors. No additional English, Czech or Slovak style
corrections of the included articles were made by the compositor.
Published by Faculty of Civil Engineering, Czech Technical University in Prague.
Contents
1 GAL Framework – Current State of the Project
2 The importance of computational geometry for digital cartography
3 Change Detection with GRASS GIS – Comparison of images taken by different sensors
4 Moebius: An interface to web map services
5 ISO 19115 for GeoWeb services orchestration
6 Deriving Hydrological Response Units (HRUs) using a Web Processing Service implementation based on GRASS GIS
7 Toolbar icons for GIS applications
8 Projekt OpenStreetMap z pohledu geoinformatika (The OpenStreetMap project from a geoinformatician's point of view)
9 GUI pro orchestraci GeoWebových služeb (A GUI for the orchestration of GeoWeb services)
Geinformatics FCE CTU 2008
GAL Framework – Current State of the Project
Radek Bartoň, Martin Hrubý
Faculty of Information Technology
Brno University of Technology
E-mail: [email protected], [email protected]
Keywords: design, GIS, GRASS, open source, library, dynamic language, remote procedure
call
Abstract
The GAL (GIS Abstraction Layer) Framework is a component-architecture-oriented¹ remote procedure call (RPC) library, together with implementations of GIS-related subsystems that communicate through the library and a set of demonstration and testing tools that use those services. It does not aim to be a full-featured solution for building GIS applications, but a proposal for a possible incremental modernization of GRASS GIS². This article summarizes the current state of the project, its history, applications and potential, and presents options for further development and areas of possible participation. Only the interest of other developers or users, and time, may transform this idea into something practically usable.
History and Motivation
The project originated as the master's thesis of one of the article's authors at the Faculty of Information Technology of the Brno University of Technology in February 2007. From the beginning it was intended to be a higher-level abstraction layer above the GRASS GIS core libraries, allowing rapid and clear GRASS module development. It also allows the current implementations to be replaced step by step with new ones, provided that the communication interfaces are well designed and preserved. This could help during a possible modernization of GRASS GIS. Support for distributed computing and for dynamic languages was considered too.
¹ http://trac.edgewall.org/wiki/TracDev/ComponentArchitecture
² http://grass.osgeo.org/
The initial stage of the project was the design of the core communication mechanisms. It lasted until July 2007, when the first steps to implement them were taken. The library design was introduced in last year's volume³ of the Geoinformatics FCE CTU workshop proceedings, which also discussed the motivation for the project in relation to the internal organization of GRASS.
The main development of the framework, including the design of the initial general-purpose, raster display and raster processing interfaces, took place during the first half of 2008, until the end of May, when the project was presented to a diploma thesis committee. Development has not stopped since then, and it may continue if there is enough interest.
Current State
The library is divided into several subsystems, which are developed in parallel so that particular features of the example tools can be implemented. These tools are mainly, but not only, a reimplementation of the d.mon module functionality and a real-time 3D visualization tool called d.roamer, which is similar to nviz but puts emphasis on interactivity. This section briefly describes the progress of each subsystem; the designed interfaces and implemented modules are discussed in the following sections.
In general, the GRASS libraries have been used in the implementation of these subsystems wherever it was feasible, but the possibility of replacing them with different implementations has always been kept in mind.
Core Subsystem
This part of the library defines the basic ways in which components communicate through interfaces, abstracts the event processing libraries in use into a single event loop, and provides a general model for RPC-based subsystems such as the D-Bus⁴ subsystem. What the terms "component" and "interface" mean in the context of the GAL Framework, and what the "component architecture" is, was explained last year⁵ and can also be found in this document⁶.
The core subsystem is naturally the most mature part of the framework. The only things that remain to be done here are a proper implementation of the event processing loop, since the current one is quite naive, and improvements in comfort for the user (the module programmer), which are not crucial at this stage of development.
³ http://geoinformatics.fsv.cvut.cz/gwiki/GAL Framework
⁴ http://www.freedesktop.org/wiki/Software/dbus
⁵ http://geoinformatics.fsv.cvut.cz/gwiki/GAL Framework
⁶ http://trac.edgewall.org/wiki/TracDev/ComponentArchitecture
Exception Subsystem
So far this subsystem contains a class hierarchy of exception objects. Exceptions are the only mechanism used to signal error states that occur during communication between components.
Raising and handling exceptions locally is supported natively by GCC, but passing an exception across the D-Bus message bus does not work yet.
D-Bus Subsystem
The D-Bus subsystem is currently the only RPC communication implementation. The D-Bus library was chosen for its simplicity and its orientation towards desktop systems, but it will probably be replaced in the future with ORBit2, an implementation of the CORBA architecture, for the sake of robustness.
The current implementation allows only a single process to act as a server that provides components with interface implementations. This has to be changed soon so that any number of processes is accessible to any client module.
General Subsystem
Together with the core, exception and D-Bus subsystems, the general subsystem can be separated and reused in any other project that needs a component architecture implementation, because it contains general-purpose objects, interfaces and components. For example, command-line argument parsing and environment variable management are located here.
The subsystem is quite solid; only access to the documentation strings of module arguments has to be improved. This does not mean, however, that it needs no other extensions. If new requirements for general functionality arise, their implementation may be placed here.
GIS Subsystem
This subsystem should include all tools for GIS-related computations. Currently it only holds information about the active user and the default region, together with the means to control them. Possible algorithms for map projections or general GIS data transformations are yet to be introduced.
Raster Subsystem
This subsystem covers everything related to raster data access, manipulation and conversion. The raster architecture is designed so that data are accessed by tiles. A request for a tile contains the desired dimensions, position and resolution of the tile in a layer region object. Colour rules and a colour table for data presentation are associated with the returned tile, similarly to GRASS. Actual data storage is currently left to GRASS through the GRASSlib library.
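The tile request described above can be sketched in C++ as follows. This is only an illustration and not the actual GAL Framework API: the names TileRequest, Extent and tileExtent are hypothetical, and it is an assumption made here that the position is the north-west corner of the tile in map units and that the resolution is the size of one cell.

```cpp
#include <cassert>

// Hypothetical tile request; none of these names come from the GAL sources.
struct TileRequest {
    int cols;          // desired tile width in cells
    int rows;          // desired tile height in cells
    double west;       // assumed: x coordinate of the tile's north-west corner
    double north;      // assumed: y coordinate of the tile's north-west corner
    double resolution; // assumed: size of one cell in map units
};

// Map extent covered by a tile.
struct Extent {
    double west, south, east, north;
};

// Computes the extent that a request covers in the layer region.
Extent tileExtent(const TileRequest &request)
{
    assert(request.cols > 0 && request.rows > 0 && request.resolution > 0.0);
    Extent extent;
    extent.west = request.west;
    extent.north = request.north;
    extent.east = request.west + request.cols * request.resolution;
    extent.south = request.north - request.rows * request.resolution;
    return extent;
}
```

A provider built along these lines could use the computed extent to decide which part of the stored layer has to be read and resampled for the tile.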
The present design of the raster data representation is quite preliminary and needs an adequate degree of outside review followed by appropriate modifications. Hence any comments or suggestions would be a welcome contribution. If the progress of the project allows practical usage of the library along with GRASS, new implementations of raster data storage may be added. Some examples of data analysis modules should be implemented too.
Display Subsystem
Raster data are passed to this part of the framework and displayed. The basic element of this process is a raster image object defined by its dimensions, number of channels and bit depth. The first existing component implementing raster data visualization emulates the eight monitors of d.mon, but uses Qt 4.x for window management and OpenGL for rendering. The second is the component of the d.roamer module, which displays raster data as a 3D scene with terrain. Vector data display has not been worked out yet.
Dynamic Languages Bindings
To allow easy development of modules written in scripting or dynamic languages, the SWIG⁷ wrapper generator was employed. The existing bindings target Python and Java.
Unfortunately, technical difficulties with the dynamic and heterogeneous nature of the designed communication methodology led to many customizations of the wrapper and to some limitations. For example, server-side module development in dynamic languages is for now impossible without using D-Bus communication. In other words: "It is not possible to call Python/Java code from C++ code directly." The possibility to write client-side modules, which is the main reasonable use of dynamic languages, is nevertheless available.
Designed and Implemented Interfaces
Although this article is not meant to serve as a library reference, some important communication interfaces should be listed and explained here to give an idea of the GAL Framework's approach. Interfaces are designed as interface objects, which hold the interface configuration state (a list of available functions with their signatures, the way of communication, etc.) and which are imported into a module on demand from the GAL core.
• INodeController – the basic interface for managing an independent process from outside. It is mainly used internally; for example, the g.quit module calls the process termination function of this interface. Other functions will serve for communication negotiation.
• IRasterDisplayer – displays any raster image on a monitor. This can be tiles of a raster layer or simply any raster image (legend, icon, etc.).
• IRasterLayerDisplayer – allows direct display of a raster layer on the monitor. This may help to avoid unnecessary computations, which improves performance, and lets a monitor handler record a list of raster layer display requests.
• IRasterLayerProvider – gives tiled access to GIS raster data. The current implementation uses the GRASS libraries for low-level data manipulation.
• IEnvironmentProvider – provides different storages for global variables. The present implementations are volatile memory, GRASS mapset configuration and GRASS global configuration storage.
⁷ http://www.swig.org/
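The idea of importing an interface on demand can be illustrated with a small in-process C++ sketch. It is not the GAL Framework API: the classes Interface, EnvironmentProvider, MemoryEnvironment and Core and their member functions are hypothetical names chosen for this example, and the real library communicates through RPC rather than through a local map.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical base class of all interface objects.
struct Interface {
    virtual ~Interface() = default;
};

// Hypothetical counterpart of IEnvironmentProvider: a store for global variables.
struct EnvironmentProvider : Interface {
    virtual void set(const std::string &name, const std::string &value) = 0;
    virtual std::string get(const std::string &name) const = 0;
};

// Volatile (in-memory) storage, one of the storage kinds listed above.
class MemoryEnvironment : public EnvironmentProvider {
    std::map<std::string, std::string> variables_;
public:
    void set(const std::string &name, const std::string &value) override
    {
        variables_[name] = value;
    }
    std::string get(const std::string &name) const override
    {
        auto it = variables_.find(name);
        return it == variables_.end() ? std::string() : it->second;
    }
};

// Hypothetical core: components register implementations by interface name,
// modules request them on demand.
class Core {
    std::map<std::string, std::shared_ptr<Interface>> interfaces_;
public:
    void provide(const std::string &name, std::shared_ptr<Interface> implementation)
    {
        assert(implementation != nullptr);
        interfaces_[name] = std::move(implementation);
    }
    // Returns a null pointer if the interface is unknown or has another type.
    template <typename T>
    std::shared_ptr<T> request(const std::string &name) const
    {
        auto it = interfaces_.find(name);
        if (it == interfaces_.end()) return nullptr;
        return std::dynamic_pointer_cast<T>(it->second);
    }
};
```

A client module written this way depends only on the abstract interface, so the implementation behind it (volatile memory or a GRASS configuration file) can be exchanged without touching the module.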
Example Tools
A few modules known from GRASS GIS were developed to test and demonstrate the functionality of the designed and implemented interfaces. They are described here.
g.gald, g.quit, g.list and g.gisenv
Some modules from the general category were rewritten as tests of the designed interfaces, namely g.list and g.gisenv. In addition, the g.gald and g.quit modules were introduced.
Figure 1 shows an example of their usage. First, the g.gald module, which provides the implementation of all available functionality, is executed as a daemon. Then g.list is used to list the raster layers of a mapset and the g.gisenv module displays the defined environment variables. Finally, the g.quit module terminates the running g.gald module.
Figure 1: Some modules from the general category.
d.mon, d.move, d.resize and d.rast
The user interface of the reimplemented d.mon module is shown in Figure 2. The d.mon module only instructs the waiting g.gald process to show a monitor; g.gald then displays the monitor window itself. The same holds for the d.rast module, which reads raster data from GRASS and sends them to g.gald. The other controlling modules, d.move and d.resize, tell g.gald to move or resize the window.
Figure 2: d.mon module in action.
d.roamer
The last presented module is called d.roamer; it allows the user to fly over a visualized terrain in real time. Screenshots can be found in Figures 3 and 4. The first shows the terrain rendered with filled faces, the second as a wireframe, which demonstrates the level-of-detail algorithm in use, called geo mip-mapping.
Figure 5 contains a diagram of the internal communication between the d.roamer and d.rast modules using the framework. As with the g.gald, d.mon and d.rast modules in the previous section, data are read from the GRASSRasterLayerProvider component and passed through the IRasterLayerProvider and IRasterDisplayer interfaces to d.roamer's RoamerComponent component.
Areas of Future Development
As the reader may have noticed, a vector subsystem is not yet present in the framework at all. The reason is that it was not necessary to focus on an area as complex as vector data architecture for the proof of concept of the proposed communication strategy. Hopefully, a decent vector implementation will result from the master's thesis of Bc. Jan Kittler, with whom the article author is cooperating. He should design a new internal and external representation of vectors and some analytical tools with a user interface. Core parts should be implemented in C++, the analysis tools and user interface in C#. This will introduce the need for C# bindings for the GAL Framework.
Figure 3: d.roamer module interface with full-faced terrain.
Figure 4: d.roamer module interface with wireframe terrain.
Figure 5: Architecture of d.roamer module.
Because of the large scope of the project, further outside contributions would be more than welcome. Safe multi-threaded processing of events in the loop, including thread-safe access to all internal data of the library, may be worked out. A better raster architecture, as well as any number of raster or vector data format implementations, may be added. And finally, new modules using the GAL Framework may be developed. Bachelor's or diploma theses on these topics could be written.
Some Statistics
• 20 months of development by a single person.
• 9000 lines of code (according to http://www.ohloh.net/projects/9183/analyses/latest).
• 6500 lines of comments (mainly Doxygen documentation).
• C++ as the main language, with Python and Java bindings.
• 41 commits to the SVN repository (svn://gal-framework.no-ip.org:3691).
• Depends on the D-Bus, libxml2, libgcj or libffi, Qt 4.x, SoTerrain⁸ and GRASSlib libraries (some of them optionally).
• The homepage is a Trac instance at http://gal-framework.no-ip.org.
⁸ http://blackhex.no-ip.org/wiki/SoTerrain
References
1. Christopher Lenz, Dave Abrahams and Christian Boos. Trac Component Architecture
http://trac.edgewall.org/wiki/TracDev/ComponentArchitecture, July 2007.
2. Radek Bartoň and Martin Hrubý. GAL Framework. In Proceedings of the workshop
Geoinformatics FCE CTU 2007. Czech Technical University in Prague, September 2007.
3. GRASS Development Team. GRASS GIS. http://grass.itc.it.
4. freedesktop.org. D-Bus. http://www.freedesktop.org/wiki/Software/dbus.
5. SWIG. Simplified Wrapper and Interface Generator. http://www.swig.org.
6. Radek Bartoň. SoTerrain. http://blackhex.no-ip.org/wiki/SoTerrain, October 2007.
The importance of computational geometry for digital cartography
Tomáš Bayer
Faculty of Science, Charles University in Prague
[email protected]
Keywords: computational geometry, digital cartography, open source, GIS, automated generalization, convex hull
Abstract
This paper describes the use of computational geometry concepts in digital cartography. It shows the importance of 2D geometric structures, geometric operations and procedures for automated or semi-automated simplification. The article focuses on automated building simplification procedures; some techniques are illustrated and discussed. Concrete examples with requirements for the lowest possible time complexity are given, with emphasis on the smallest-area enclosing rectangle, convex hull and self-intersection procedures. The presented results illustrate the relationship between digital cartography and computational geometry.
Introduction
The human need to capture and represent the surrounding landscape is very old. The first evidence can be found on the walls of caves or on animal horns; it is associated with the beginnings of cartography. Cartography is a science over two millennia old, but it has changed radically during this period. Adding mathematical foundations and analytical methods to the process of data acquisition and mapping resulted in the birth of the earth sciences. Thanks to new knowledge in mathematics, physics, computational geometry, statistics and informatics, the methods of creating maps have been rapidly modified and improved (Kolar et al., 2008). The transformation of analogue maps into digital maps, in which the Earth is represented cartographically by planar structures (e.g. points, lines, polygons), brought some new problems that can be solved effectively using computational geometry. New geometric structures such as the topological skeleton, Voronoi diagrams and the Delaunay triangulation have come into use in digital cartography.
From computational geometry to natural sciences
Computational geometry arose as a response to the changes in data acquisition and data processing techniques in the 1960s. The transformation of data into digital form brought a new representation of the landscape, based on its decomposition into 0D, 1D, 2D and 3D entities. The process of creating maps suffered from a lack of tools for digital data analysis and synthesis, which led to the need to process the data with as few manual interventions by an operator as possible. A number of new techniques for analysing planar or spatial data and their relationships were created. These exact methods were based on linear algebra, geometry, cartography, statistics and adjustment calculus.
Based on a synthesis of these findings, a new field, "computational geometry", was established. Computational geometry studies the properties of geometric algorithms in 2D or 3D and tries to find solutions to geometric problems that are optimal with respect to time complexity. Since the amount of data we are able to process keeps growing, it is necessary to solve problems efficiently. Because cartography and informatics look at problems differently, this article tries to find a unifying perspective that emphasizes the importance of computational geometry for cartographic education. So that the Czech Republic does not become a passive consumer of information technologies, it is necessary to invest in the development of its own solutions to geoinformatic problems. This fact plays an important role and cannot be underestimated in the long term.
The educational process must be adapted to these facts. It is not sufficient to focus only on the practical solving of problems. Based on an analysis, the student should be able to find an optimal solution to the problem. In general terms, it is necessary to strengthen the teaching of natural sciences. In today's highly technical world, the ability to assess a problem exactly plays an important role. It helps to reduce the subjectivity of human decision-making and its dependence on ideologies. This concept, however, conflicts with the demand for a practical focus of higher education. These problems can be illustrated on the teaching of computational geometry with a focus on interrelationships and interdisciplinary links. Students would then be able to grasp a problem comprehensively and solve it much more effectively.
Computational geometry and building generalization
A map is an abstract expression of reality. To maintain the basic characteristics of cartographic products (impartiality, clarity, lucidity, ...), a controlled reduction of information must be performed. This process is called "generalization" and results in a simplification of the map content. Generalization also plays an important role in computer graphics, where it reduces the amount of information and shortens the visualization process.
Generalization is a subjective process that relies on the knowledge and experience of the cartographer. Computational geometry makes the simplification process less dependent on the subjective view of a cartographer. Expressing the simplification process as an algorithm is not unambiguous. It is not easy to find and set a geometric criterion that a simplified element is supposed to satisfy. Simplification consists of several interdependent steps; performing one step triggers the next. This part of the article uses information and mathematical background from a new simplification algorithm proposed by the author.
Generalization factors. There are several important factors that affect the results of generalization. They can be divided into four categories: map scale, the purpose of the map, the characteristics of the territory, and the cartographic symbols used.
Geometric generalization of the building. Geometric generalization carries out a controlled reduction of the map content based on an analysis of the geometric properties of the elements. It tries to remove those elements that are not significant in the map context. Some geometric structures, such as the Voronoi tessellation or the Delaunay triangulation, can be used. Automated or semi-automated generalization of buildings is a problem that has been solved in many ways. Commonly used simplification algorithms cannot be applied, because they do not maintain the internal angles ($\pm\frac{\pi}{2}$) between the polygon edges representing the building, see Fig. 1. This constraint makes building simplification more difficult.
Figure 1: Building generalization that does not maintain the internal angles.
Requirements for the algorithm. Designing an algorithm with reasonable time complexity (quadratic or better) that provides appropriate cartographic results while minimizing the need for manual corrections seems to be a hard problem. In addition, we have the following requirements for the simplification algorithm:
• ability to detect and simplify a building in any position,
• removal of self-intersections,
• ability to preserve the area (an equal-area or nearly equal-area algorithm),
• regulation of the simplification factor by the user,
• ability to simplify complex and non-convex shapes.
In terms of computational geometry, we explain the first and second points in more detail.
Scheme of the simplification process
Automated or semi-automated simplification of buildings based on the least squares method is currently being solved in many ways. From the cartographic perspective it provides relatively good results. The simplification process can be briefly described by the following steps:
1. Detection of the angle of rotation ϕ of the building:
• construction of the convex hull of a set of points,
• construction of the smallest-area enclosing rectangle of a set of points.
2. Rotation of the building by the angle −ϕ.
3. Detection of the vertices and edges of the building based on recursion:
• calculation of the splitting criterion σ,
• recursive decomposition of the edge into a set of new edges.
4. Rotation of the building by the angle ϕ.
To simplify the mathematical calculations, the generalized building is rotated by the angle −ϕ, so that its edges are parallel to the x and y axes.
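Steps 2 and 4 of the scheme are plain planar rotations. The following C++ sketch shows one way to write them; the names Point2D and rotatePoints are illustrative and do not come from the author's implementation, and the choice of the rotation centre is left to the caller as an assumption of this example.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

struct Point2D {
    double x, y;
};

// Rotates all points by `angle` radians counter-clockwise around `center`.
// Calling it with -phi moves a building into the basic position,
// calling it again with +phi restores the original orientation.
std::vector<Point2D> rotatePoints(const std::vector<Point2D> &points,
                                  double angle, Point2D center)
{
    assert(std::isfinite(angle));
    const double c = std::cos(angle), s = std::sin(angle);
    std::vector<Point2D> result;
    result.reserve(points.size());
    for (const Point2D &p : points) {
        const double dx = p.x - center.x, dy = p.y - center.y;
        result.push_back({center.x + c * dx - s * dy,
                          center.y + s * dx + c * dy});
    }
    return result;
}
```

Rotating by −ϕ and later by ϕ around the same centre returns the vertices to their original positions up to floating-point rounding.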
Detection of the building rotation
We consider a building to be a non-convex rectangular polygon in the plane. The building is usually not oriented in the basic position, in which all edges are parallel to the x and y axes of the coordinate system. In general position the building is rotated, and the rotation angle ϕ must be detected as the first step of the simplification algorithm.
The accuracy with which the angle of rotation ϕ is determined significantly affects the effectiveness of the algorithm. The most common method of detecting ϕ is based on the construction of the minimum bounding box (the rectangle with the minimum area enclosing all points), followed by the detection of the angle formed by the x axis and the longer edge of the rectangle, see Fig. 2.
Since the calculation is carried out over a large set of points, it is necessary to choose the procedure with the lowest time complexity. The procedure runs over a non-convex polygon, which makes the process more complex. Commonly available algorithms achieve quadratic time complexity $O(N^2)$ for this operation. Using the rotating calipers method published in [2], we can perform this step in linear time. This procedure is usable only for convex polygons, so the first step is the transformation of the non-convex polygon into its convex hull. Which method of convex hull construction is the best choice: Jarvis scan, Graham scan or QuickHull? Given the time complexity requirements, the Graham scan appears to be the best variant.
It may be interesting to compare the detected angle ϕ with the angle of the street line constructed using a topological skeleton (e.g. the straight skeleton). This technique is currently at the research stage.
Graham scan
The Graham scan constructs the convex hull in sub-quadratic time, with $O(N \lg N)$ complexity. It assumes that no three points in the set are collinear. The algorithm is based on the idea of the right turn. For each triplet $P_i, P_{i+1}, P_{i+2}$, $i \in \{1, \dots, n-2\}$, we analyze the position of $P_{i+2}$ relative to the segment $P_i P_{i+1}$ (left or right turn). Let us denote $\vec{u} = P_i - P_{i+1}$ and $\vec{v} = P_{i+1} - P_{i+2}$. The right turn criterion can be written as
$$\begin{vmatrix} u_x & u_y \\ v_x & v_y \end{vmatrix} \leq 0.$$
Figure 2: Detection of the building rotation using the convex hull and the smallest-area enclosing rectangle.
The first step consists of finding a point Q with an extreme x coordinate ($x_{max}$). The points are then sorted according to the angle ω measured between a line parallel to the x axis and the segment QP. When calculating the angle, it is necessary to determine ω in the interval (0, 2π). Note that computing the angle ω from
$$\omega = \arccos\left(\frac{(x_2-x_1)(x_3-x_2)+(y_2-y_1)(y_3-y_2)}{\sqrt{(x_2-x_1)^2+(y_2-y_1)^2}\,\sqrt{(x_3-x_2)^2+(y_3-y_2)^2}}\right)$$
brings numerical troubles.
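The right turn criterion and the angle computation can be sketched in C++ as follows. The function names are illustrative only. Replacing the arccos formula with std::atan2 is a substitution made for this example and not the author's stated method: arccos returns values only between 0 and π and loses precision when its argument is close to ±1, whereas atan2 resolves the quadrant directly.

```cpp
#include <cassert>
#include <cmath>

struct Point2D {
    double x, y;
};

// Determinant | u_x u_y ; v_x v_y | with u = a - b and v = b - c.
double turnDeterminant(Point2D a, Point2D b, Point2D c)
{
    const double ux = a.x - b.x, uy = a.y - b.y;
    const double vx = b.x - c.x, vy = b.y - c.y;
    return ux * vy - uy * vx;
}

// Right turn criterion as written in the text (determinant <= 0).
// Note that collinear triples also satisfy it.
bool isRightTurn(Point2D a, Point2D b, Point2D c)
{
    return turnDeterminant(a, b, c) <= 0.0;
}

// Angle of the segment q -> p measured from the positive x axis.
// Negative results of atan2 are shifted by 2*pi, so the value lies
// between 0 and 2*pi.
double polarAngle(Point2D q, Point2D p)
{
    assert(p.x != q.x || p.y != q.y); // the angle of a zero vector is undefined
    const double twoPi = 2.0 * std::acos(-1.0);
    double omega = std::atan2(p.y - q.y, p.x - q.x);
    if (omega < 0.0) omega += twoPi;
    return omega;
}
```

For sorting purposes the angles do not have to be computed at all; comparing two points with the sign of the same determinant gives the same order without any trigonometric function.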
Sorting algorithm. The relationship between computational geometry and informatics can be illustrated by the sorting algorithm. Which algorithm is appropriate for sorting the set of points with respect to time complexity? Since the set of points forming a single building is not too large, the choice of sorting algorithm does not play an important role there. However, since the sorting procedure may be repeated for data consisting of thousands of buildings, it is efficient, in terms of the overall approach to the problem, to use QuickSort. A QuickSort implementation is available in many programming languages as a standard sorting procedure.
Data structure and implementation. The concept of data representation also plays an important role. One possible solution using a stack can be found in [3]. Every point is represented by its unique identifying number, its coordinates x, y, and a flag indicating that the point has been deleted from the hull. A correct definition of the copy constructor and of the comparison and assignment operators is important. See the following source code sample:
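class Point
{
private:
    int num;      // unique identifying number of the point
    bool del;     // true if the point has been deleted from the hull
    double x, y;  // coordinates of the point
    // ...
public:
    // Copy constructor: copies all attributes of the point.
    Point(const Point &point)
        : num(point.num), del(point.del), x(point.x), y(point.y)
    {
    }
    // Ordering by y; points with equal y are ordered by descending x.
    bool operator<(const Point &point) const
    {
        return (y < point.y) || ((x > point.x) && (y == point.y));
    }
    // Two points are equal if their coordinates coincide.
    bool operator==(const Point &point) const
    {
        return (x == point.x) && (y == point.y);
    }
    // Copy assignment; returns a reference so that assignments can be chained.
    Point &operator=(const Point &point)
    {
        num = point.num;
        del = point.del;
        x = point.x;
        y = point.y;
        return *this;
    }
    // ...
};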
Collinearity problem. Collinearity negatively affects the process of convex hull construction. Collinear points have the same angle, so how should they be sorted? Let us denote two collinear points $P_i$, $P_j$ and the Euclidean distances $s_i$, $s_j$ from those points to Q. We define a new sorting rule: if $\omega_i = \omega_j$, then the closer point, with distance $\min(s_i, s_j)$, is considered to come earlier. Coincident points represent a special case of the collinearity problem. For GIS data this problem is not so important, because the data are topologically valid (which also means they contain no duplicate points).
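The Graham scan with the sorting rule for collinear points can be sketched in C++ as follows. This is an illustration under stated assumptions and not the author's implementation: the names are made up for the example, the input is assumed to contain at least three points and no duplicates, std::sort stands in for QuickSort, and the points are ordered by the sign of a cross product instead of an explicitly computed angle ω.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

struct Point2D {
    double x, y;
};

// > 0: o -> a -> b turns left, < 0: turns right, 0: the points are collinear.
double cross(const Point2D &o, const Point2D &a, const Point2D &b)
{
    return (a.x - o.x) * (b.y - o.y) - (a.y - o.y) * (b.x - o.x);
}

double squaredDistance(const Point2D &a, const Point2D &b)
{
    const double dx = a.x - b.x, dy = a.y - b.y;
    return dx * dx + dy * dy;
}

// Returns the hull vertices in counter-clockwise order, starting at the pivot Q.
std::vector<Point2D> grahamScan(std::vector<Point2D> points)
{
    assert(points.size() >= 3);
    // Pivot Q: the point with maximum x; ties are broken by minimum y.
    auto pivot = std::min_element(points.begin(), points.end(),
        [](const Point2D &a, const Point2D &b) {
            return a.x > b.x || (a.x == b.x && a.y < b.y);
        });
    std::iter_swap(points.begin(), pivot);
    const Point2D q = points.front();

    // Sort the remaining points by angle around Q; for collinear points
    // the one closer to Q comes earlier.
    std::sort(points.begin() + 1, points.end(),
        [&q](const Point2D &a, const Point2D &b) {
            const double turn = cross(q, a, b);
            if (turn != 0.0) return turn > 0.0;
            return squaredDistance(q, a) < squaredDistance(q, b);
        });

    // Scan: keep only left turns; right turns and collinear points are popped.
    std::vector<Point2D> hull;
    for (const Point2D &p : points) {
        while (hull.size() >= 2 &&
               cross(hull[hull.size() - 2], hull.back(), p) <= 0.0)
            hull.pop_back();
        hull.push_back(p);
    }
    return hull;
}
```

Because the pivot has the maximum x coordinate, all other points lie within an angular range smaller than π around it, which is what makes the cross-product comparison a valid ordering.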
Smallest area enclosing rectangle
The problem of constructing the smallest-area enclosing rectangle has been solved in many ways. The solution presented here, described in [5], solves the problem in linear time using two calipers orthogonal to each other. The procedure is called rotating calipers. The construction is based on the repeated rotation of a rectangle; the rectangle is gradually improved and in each step becomes a closer approximation of the smallest-area enclosing rectangle. One edge of the smallest enclosing box must be collinear with one segment of the convex hull.
Let us denote by $\varphi_j$, $j \in \langle 1, 4 \rangle$, the four angles formed by the four edges of the enclosing box and the four edges of the convex hull at the points of contact $V_j$. Let $V_j'$ denote the successor of the point $V_j$, and $M_j$ a vertex of the enclosing box. The vertices (and thus the edges) of the enclosing box are oriented clockwise.
We find the minimum angle $\varphi_{min} = \min(\varphi_j)$ and rotate the rectangle by the angle $\varphi_{min}$. Another edge of the rectangle becomes collinear with some segment of the convex hull. Three points of contact do not change. However, one point $V_j$, the start point of the collinear segment, changes to its successor $V_j'$. We calculate the area $S$ of the rectangle and compare it with the minimum area $S_{min}$, which was initialized to $\infty$ during the first step. If $S < S_{min}$, we store $S_{min} = S$. These steps are repeated while $\sum \varphi_{min} < \frac{\pi}{2}$, which leads to the result $\sum \varphi_{min} = \varphi$. Because buildings are represented by rectangular polygons, more than one edge of the rectangle may be collinear with segments of the convex hull. Because errors accumulate, numerical inaccuracy is a problem of the presented algorithm.
Figure 3: A problematic construction of the smallest-area enclosing rectangle leads to inappropriate simplification.
For our purposes it is sufficient to determine the angle ϕ with an accuracy of one degree,
so we do not have to deal with this problem in more detail. For some specific shapes the
smallest area enclosing rectangle is not the best way to detect the rotation
of the building. This is typical for Z- or L-shaped buildings, where the deviation between the
calculated angle ϕ and the true angle may reach several tens of degrees, see
Fig. 3. It is important to note that the steps above represent only an auxiliary geometric
construction with a certain percentage of errors.
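The collinearity property stated above can also be used directly, without calipers. The following minimal sketch is not the linear-time procedure of [5]: it simply tries every hull edge as a candidate direction, which takes quadratic time in the number of hull vertices but is easy to verify. The function names and the monotone-chain hull construction are assumptions made for illustration.

```python
import math


def convex_hull(points):
    """Convex hull by the monotone chain method, counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def min_area_rectangle(points):
    """Return (area, rotation in degrees within [0, 90)) of the smallest
    enclosing rectangle, testing every hull edge as a candidate side."""
    hull = convex_hull(points)
    if len(hull) < 3:
        raise ValueError("at least three non-collinear points are required")
    best_area, best_angle = math.inf, 0.0
    for i in range(len(hull)):
        (x1, y1), (x2, y2) = hull[i], hull[(i + 1) % len(hull)]
        angle = math.atan2(y2 - y1, x2 - x1)
        c, s = math.cos(angle), math.sin(angle)
        # Coordinates of the hull in a frame aligned with the current edge.
        us = [x * c + y * s for x, y in hull]
        vs = [-x * s + y * c for x, y in hull]
        area = (max(us) - min(us)) * (max(vs) - min(vs))
        if area < best_area:
            best_area, best_angle = area, math.degrees(angle) % 90.0
    return best_area, best_angle
```

For an L- or Z-shaped footprint this search returns the direction of the minimum rectangle as well, so it shares the limitation discussed above.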
The detection of self intersections
During cartographic generalization we can encounter the problem
of self intersections: undesirable shapes created as a
result of the generalization process. Because such data are topologically incorrect,
this error is very dangerous. A closed “pseudoregion” results from the crossing of
two or more line segments, with no vertex inserted at the point of intersection.
GIS software will consider such a pseudoregion topologically incorrect, see Fig. 4.
One possible solution is a test that verifies the existence of self intersections.
Before an edge is removed or split, it is verified whether the resulting edge
intersects any other edge of the building. If so, the simplification of this edge is
canceled. Unfortunately, this step significantly slows down the algorithm.
How can self intersections be detected efficiently, with better than quadratic time
complexity? The Bentley–Ottmann algorithm is one possible solution.
Figure 4: Problem of self intersection after the splitting procedure.
Bentley–Ottmann algorithm
The Bentley–Ottmann algorithm, published in 1979, finds the intersections of a set of N line segments
in O((N + K) lg N ) time, where K is the number of intersections; for few intersections this is close to O(N lg N ). A brute force algorithm, which checks all possible
pairs, has quadratic time complexity. The Bentley–Ottmann algorithm
is an application of the sweep line technique: a sweep line parallel to the y axis moves over the segments from left to right
and divides the set into a processed part and an unprocessed part. Intersections are calculated
only between segments that are currently cut by the sweep line.
Data structures. The design of the data structures plays an important role. The first data
structure is a priority queue of event points sorted by x coordinate. For each point, the
queue stores whether it is a start point, an end point or an intersection. When the sweep line
reaches a point, the corresponding event is processed. The second data structure, usually a balanced tree,
stores the segments in the order in which they intersect the sweep line.
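The first of the two data structures can be sketched with a binary heap. This is only an illustration of the event ordering, not a full implementation; the event type codes and the function name are assumptions.

```python
import heapq

# Hypothetical event type codes; they also break ties at equal coordinates.
START, INTERSECTION, END = 0, 1, 2


def build_event_queue(segments):
    """Create a priority queue of start and end events sorted by x (then y).

    Each segment is a pair of (x, y) points; each event is a tuple
    (x, y, event type, segment index)."""
    events = []
    for index, (p, q) in enumerate(segments):
        left, right = sorted((p, q))
        heapq.heappush(events, (left[0], left[1], START, index))
        heapq.heappush(events, (right[0], right[1], END, index))
    return events


queue = build_event_queue([((0, 0), (4, 4)), ((5, 1), (1, 3))])
while queue:
    x, y, kind, index = heapq.heappop(queue)
    print(x, y, kind, index)
```

Intersection events found during the sweep would be pushed into the same heap with the INTERSECTION code.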
Lines intersection. The intersection of two lines can be found from their parametric equations; the slope-intercept form causes problems for lines parallel to the y axis. Let the first
line l1 be given by two points P1 = [x1 , y1 ], P2 = [x2 , y2 ], the second line l2 by two points
P3 = [x3 , y3 ], P4 = [x4 , y4 ], and let Q = [xq , yq ] be their intersection. The parametric equations of the lines
can be written as
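\[
\begin{pmatrix} x_q \\ y_q \end{pmatrix}
= \begin{pmatrix} x_1 \\ y_1 \end{pmatrix} + s \begin{pmatrix} x_2 - x_1 \\ y_2 - y_1 \end{pmatrix}
= \begin{pmatrix} x_3 \\ y_3 \end{pmatrix} + t \begin{pmatrix} x_4 - x_3 \\ y_4 - y_3 \end{pmatrix},
\]
where
\[
s = \frac{y_1 (x_3 - x_4) + y_3 (x_4 - x_1) + y_4 (x_1 - x_3)}{(x_2 - x_1)(y_3 - y_4) - (y_2 - y_1)(x_3 - x_4)}, \qquad
t = \frac{y_1 (x_3 - x_2) + y_2 (x_1 - x_3) + y_3 (x_2 - x_1)}{(x_2 - x_1)(y_3 - y_4) - (y_2 - y_1)(x_3 - x_4)}.
\]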
For s ∈ (0, 1) and t ∈ (0, 1) the two segments intersect, and the intersection Q follows from the formulas above.
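The formulas for s and t translate directly into code. The sketch below is a minimal illustration; the function name is an assumption, and parallel or collinear segments (zero denominator) are simply reported as not intersecting.

```python
def segment_intersection(p1, p2, p3, p4):
    """Intersection Q of segments P1P2 and P3P4, or None.

    Uses the parametric formulas for s and t; the open intervals (0, 1)
    mean that touching end points are not counted as intersections."""
    (x1, y1), (x2, y2), (x3, y3), (x4, y4) = p1, p2, p3, p4
    denom = (x2 - x1) * (y3 - y4) - (y2 - y1) * (x3 - x4)
    if denom == 0:
        return None  # parallel or collinear lines
    s = (y1 * (x3 - x4) + y3 * (x4 - x1) + y4 * (x1 - x3)) / denom
    t = (y1 * (x3 - x2) + y2 * (x1 - x3) + y3 * (x2 - x1)) / denom
    if 0 < s < 1 and 0 < t < 1:
        return (x1 + s * (x2 - x1), y1 + s * (y2 - y1))
    return None


print(segment_intersection((0, 0), (4, 4), (0, 4), (4, 0)))
```

A brute force self-intersection test of a building outline would call this function for every pair of non-adjacent edges, which gives the quadratic behavior mentioned above.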
Intersection of segments. The sweep line moves over the segments and stops at events
of three types: (1) start point of a segment, (2) intersection point of two segments,
(3) end point of a segment, see Fig. 5. If the event point is a start point, the segment is
inserted into the sweep line structure and tested against its two neighbors along the sweep line.
If an intersection is found, it becomes a new event point. If the event point is an
intersection of two segments, we swap their order; each of the two segments then has a new
adjacent segment along the sweep line, which must be tested for intersection. If the event point
is an end point, the segment is removed, and the two segments that become adjacent are tested for
intersection. The Bentley–Ottmann algorithm assumes
that no segment is parallel to the sweep line and that no three segments pass through one point.
Figure 5: Bentley–Ottmann algorithm with positions of the sweep line.
The segments intersecting the sweep line are stored in a balanced binary tree. This data
structure is very efficient and enables update operations in O(lg N ) time. The implementation of the Bentley–Ottmann algorithm is therefore rather difficult and uses a
combination of several dynamic data structures.
Conclusion
This paper presents the use of computational geometry in digital cartography. Automated or semi-automated building simplification was chosen as an illustrative example,
and several algorithms were given and discussed. The paper argues for more
intensive teaching of computational geometry and tries to offer a unifying perspective that emphasizes its importance for cartography education. If we are not to become
only passive consumers of information technologies, it is necessary to invest in the development of our own geoinformatic solutions. This fact plays an important role and, as
mentioned above, cannot be underestimated in the long-term perspective.
References
1. DE BERG M., SCHWARZKOPF O., VAN KREVELD M., OVERMARS M.: Computational
geometry: Algorithms and applications, 2000, Springer-Verlag.
2. DUTTER M.: Generalization of buildings derived from high resolution remote sensing
data, 2007.
3. O'ROURKE J.: Computational geometry in C, 2005, Cambridge University Press.
4. SESTER M.: Generalization based on least squares adjustment, International Archives
of Photogrammetry and Remote Sensing, 2000.
5. TOUSSAINT G.: Solving Geometric Problems with the Rotating Calipers, McGill University Montreal, 1983.
Change Detection with GRASS GIS –
Comparison of images taken by different
sensors
Michael Fuchs, Rainer Hoffmann and Friedhelm Schwonke
Federal Institute for Geosciences and Natural Resources (BGR)
[email protected]
Keywords: Remote Sensing, Change Detection, diversity, GRASS, Yemen
Abstract
Images of American military reconnaissance satellites of the 1960s (CORONA) were used in combination with images from modern sensors (SPOT, QuickBird) to detect changes in land
use. The pilot area is located about 40 km northwest of Yemen’s capital Sana’a and covers approximately 100 km². To produce comparable layers from images of distinctly different
sources, the moving window technique was applied, using the diversity parameter. The resulting difference layers reveal plausible and interpretable change patterns, particularly in areas
where urban sprawl occurs.
The comparison of CORONA images with images taken by modern sensors proved to be an
additional tool to visualize and quantify major changes in land use. The results should serve
as additional basic data, e.g. in regional planning.
The computation sequence was executed in GRASS GIS.
Introduction
GRASS GIS (http://grass.osgeo.org), with its extended functionality and operability, is more than
a common geographic information system. It is powerful in raster data processing, offers
fundamental functions for terrain and landscape analysis with extended tools for hydrological
modeling, and provides basic functionality for remote sensing. Furthermore, it can process
three-dimensional data. This functionality makes it a suitable framework for studies that
combine GIS with remote sensing tools.
Change Detection – State of the art
Change Detection is a group of methods commonly used in remote sensing. Because of
the repetitive coverage of earth orbiting satellites at short intervals and consistent image
quality, methods of Change Detection have become part of environmental observation systems
(Lunetta & Elvidge 1999; Owe 2007).
Change Detection is defined as: “The sensing of environmental changes that uses two or
more scenes covering the same geographic area acquired over a period of time.” (Glossary of
Canada Centre for Remote Sensing, http://www.ccrs.nrcan.gc.ca/glossary). Aside from visual
interpretation, different algorithms are applied.
Essential aims of Change Detection are:
• Detection and evaluation of land use changes
• Support the monitoring of disasters triggered by geological, meteorological or man-made
factors.
The use of Change Detection algorithms requires two preconditions:
1. Changes in land cover must result in changes in radiance values.
2. Changes in radiance due to land cover changes must be large with respect to radiance
changes caused by other factors, such as atmospheric conditions, sun angle or vegetation
phenology.
The preconditions mentioned assume that scenes from the same sensor type are processed.
Scenes should be selected carefully, because differences in radiation, precipitation
and surface temperature in combination with phenological variations lead to discrepancies
in reflectance properties. These sources of interference have to be eliminated as far as possible.
Phenological variations are reduced by using scenes taken in the same season of the
year. Additionally, climate data should be available to assess the phenological stage of the
vegetation.
Well-known satellite missions have been operating continuously for decades. Landsat missions
for instance have been delivering images since 1972 with repetition rates of 18 days (MSS)
and 16 days (Landsat 4, 5, 7), respectively.
The data preparation includes:
• Image registration with geometric correction
• Radiometric calibration with atmospheric correction
The goal is to achieve high quality images with geographic precision of less than one pixel
and correlation of radiometric calibration close to 1.
The applied methods of Change Detection range from simple difference procedures to multivariate statistical routines. Change Detection can be applied directly to multiband stacks or
to derived or classified layers. An overview of Change Detection methods can be found in
Théau (2006); the comparison and evaluation of methods and their applicability is described
in Peinado (2001). Some major methods used in remote sensing are listed below according
to Théau (2006) and Yang (1999):
• Image differencing
• NDVI, Tasseled Cap
The Tasseled Cap transformation (TC) optimizes data viewing for vegetation studies and is one
of the available methods for enhancing the spectral information content of Landsat TM data. Four
bands are calculated: brightness, greenness, wetness, and haze.
• Image ratioing
• Principal Components Analysis (PCA)
This technique is usually used to reduce the number of spectral components (spectral bands)
to fewer principal components accounting for the most variance in the original multispectral
images. Image spectral bands of two or more dates are treated as a single data set. After
performing PCA, information that is common to multidate images is mapped to the first
component (unchanged areas) whereas information that is unique to one of the dates is
mapped to the following components (changed areas).
• Composite Analysis
Supervised and unsupervised classifications are used to analyze these datasets. Classes where
changes occurred are expected to present statistics significantly different from where changes
did not take place.
• Change Vector Analysis
• Comparison of post-classifications

                   | MODIS                                                       | Landsat                                        | SPOT                             | QuickBird
-------------------+-------------------------------------------------------------+------------------------------------------------+----------------------------------+---------------------------------
Continuity         | Since 1999                                                  | Since 1972                                     | Since 1986                       | Since 2002
Spatial Coverage   | 2330 km (cross track) by 10 km (along track at nadir)       | 170 x 183 km                                   | 60 x 60 km                       | 16.5 x 16.5 km
Spatial Resolution | 250 m (bands 1-2), 500 m (bands 3-7), 1000 m (bands 8-36)   | 30 m (pan 15 m)                                | 10 m (pan 5 or 3 m)              | 2.44 m (pan 0.61 m)
Band Numbers       | Multispectral (36) (hyperspectral)                          | Multispectral (7) + panchromatic               | Multispectral (4) + panchromatic | Multispectral (4) + panchromatic
Repetition         | 2 days                                                      | 16 days                                        | 2.5 – 26 days                    | 1 – 3.5 days
Acquisition Costs  | free                                                        | selective imagery free, further cost 0.02 $/km² | 0.94 $/km²                       | 22 $/km²
Tab. 1: Technical Data of Selected Remote Sensing Satellites
The critical step of all the methods mentioned is deciding where to place the threshold for changes.
Furthermore, the exact nature of the changes needs careful interpretation, which requires
knowledge of the investigation area and ground checks.
Reconnaissance Satellite Photos – CORONA
The term stands for a series of U.S. military reconnaissance satellites (KH 1 to KH 5)
which were operated between 1959 and 1972. The satellites of the CORONA series delivered
panchromatic photographs of many areas of the world.
Images of the first generation were declassified in the 1990s. The ground resolution of the two KH-4 systems (1963 – 1972) ranged between 2 and 3 m. The photographs
cost 30 $ each and can be ordered at http://edcsns17.cr.usgs.gov/EarthExplorer.
CORONA photos are used in various research projects. One application is the derivation of
elevation models, because many scenes provide stereoscopic records (Schmidt et al. 2001).
Grosse et al. (2005) used CORONA images for the visual interpretation of thermokarst
processes. Another area of application is the preparation and support of archeological
excavations (Goossens et al. 2001). In geological mapping CORONA images are used
where other high resolution images are missing. Lorenz (2004) completed the mapping of
Paleozoic strata in the Russian Arctic with CORONA images.
Method
CORONA images are an essential source of information, in particular for those decades for which
other high resolution images are missing. This applies to the 1960s, when
only military reconnaissance satellites were operating. For this decade, only CORONA images are
available; they have been accessible since 1996 (http://edc.usgs.gov/guides/disp1.html).
The methods of Change Detection mentioned above are based on scenes taken by the same
sensor type at different dates. The method described in this paper is based on the image
differencing method, but compares scenes that were taken by different sensors. For this, the
steps for the preparation and harmonization of the image information are very important.
These working steps comprise the geometric correction of the CORONA image, the transformation of the RGB channels of the modern satellite data into one panchromatic channel
and the resampling to the pixel resolution of the CORONA image. Then the moving window algorithm can be applied. The computation sequence ends with the
subtraction step (Fig. 1).
The core of the computation sequence uses the moving window technique. This technique
is offered by the GRASS raster module r.neighbors (http://grass.osgeo.org). The command
r.neighbors can be run with different parameters. Basically two groups of parameters exist.
The first group comprises the statistical parameters. The second group comprises two parameters
commonly used in landscape analysis (McGarigal & Marks 1995): the diversity and
the interspersion. Diversity is defined as the number of different values
within the neighborhood. The parameters of the second group yield a dimensionless
value for each pixel. Therefore images taken with different sensors can be compared,
as outlined below.
Fig. 1: Flow chart of sequential computation
For each pixel, the number of different pixel values in its neighborhood is identified and
stored as a new value. The size of the moving window is therefore a sensitive
parameter with a strong influence on the result layers and on the pattern of diversity (Fig. 2).
Tests show that a matrix size of 25x25 delivers
interpretable result layers; hence 625 cells are included in the computation. With
a pixel size of 2.5 m, the radius of influence is 12 * 2.5 m = 30 m (leaving out the central
pixel).
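The diversity parameter can be illustrated without GRASS. The following pure Python sketch counts the distinct values in a square window around each cell; it is not the r.neighbors implementation, and the function name and the clipping of the window at the raster border are assumptions.

```python
def diversity(raster, size=3):
    """Number of different values in a size x size window around each cell.

    raster is a list of rows; the window is clipped at the raster border."""
    if size % 2 == 0:
        raise ValueError("window size must be odd")
    half = size // 2
    rows, cols = len(raster), len(raster[0])
    result = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            values = set()
            for rr in range(max(0, r - half), min(rows, r + half + 1)):
                for cc in range(max(0, c - half), min(cols, c + half + 1)):
                    values.add(raster[rr][cc])
            result[r][c] = len(values)
    return result


# Radius of influence for the 25x25 window and 2.5 m pixels used in the text.
radius_m = (25 // 2) * 2.5
print(radius_m)
print(diversity([[1, 1, 2], [1, 1, 2], [3, 3, 3]]))
```

Because the result is a count of distinct values rather than a radiance, layers derived from different sensors become comparable in the sense described above.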
Fig. 2: Inclusion of neighborhood size in the moving window
In addition to the diversity parameter, the entropy formula (Eq. 1) is used in the computation.
The Shannon Diversity Index (SHDI) is computed in our own application written in Fortran
90, and the results are written out as absolute values. The SHDI is based on
information theory and is also called negentropy (Palm 1985). It represents the amount of information
for a defined quantity. The entropy formula is commonly used in different research areas,
such as landscape analysis (McGarigal & Marks 1995). In soil geography, where it is called areal
heterogeneity, the entropy is a measure of uncertainty and was discussed
as an indicator of landscape structure (Altmann & Haase 1987). The measure of uncertainty
for a defined quantity of information is a basis of evaluations in human geography (Paulov 1991),
and it is discussed in cartography to support the process of generalization (Bjorke 1996).
\[
\mathrm{SHDI} = - \sum_{i=1}^{m} p_i \ln p_i \tag{1}
\]
• with range: 0 – ln m
• pi – proportion of cells with the i-th value among all cells in the window
• m – count of different values
• SHDI = 0 if the window contains the same value in all cells.
• SHDI increases with the number of different values in the window.
• Maximum entropy is reached when all values in the window are different; it then equals ln m.
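Eq. 1 and its properties can be checked with a few lines of code. This sketch is not the authors' Fortran 90 application; the function name is an assumption, and the window is passed as a flat list of cell values.

```python
import math
from collections import Counter


def shdi(window_values):
    """Shannon Diversity Index (Eq. 1) of the cell values in one window."""
    total = len(window_values)
    counts = Counter(window_values)
    return sum(-(n / total) * math.log(n / total) for n in counts.values())


print(shdi([7, 7, 7, 7]))   # one value only
print(shdi([1, 2, 3, 4]))   # all values different
print(math.log(4))
```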
The result layers are combined by subtraction. Sources of error such as clouds
or shadows can be masked. For this purpose the results of supervised or unsupervised classifications
can be used, because cloud and shadow classes are normally well separated from other classes in
multivariate space.
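The subtraction and masking step can be sketched as follows. The function name and the use of None for masked cells are assumptions; the layers are plain lists of rows with identical dimensions.

```python
def difference_layer(reference, recent, mask=None):
    """Cell-wise difference reference - recent.

    mask is an optional layer of booleans; cells where it is True
    (for example clouds or shadows) are set to None."""
    result = []
    for r, (ref_row, new_row) in enumerate(zip(reference, recent)):
        row = []
        for c, (a, b) in enumerate(zip(ref_row, new_row)):
            masked = mask is not None and mask[r][c]
            row.append(None if masked else a - b)
        result.append(row)
    return result


old_entropy = [[3.0, 4.0], [2.0, 1.0]]
new_entropy = [[3.0, 4.5], [0.5, 1.0]]
shadow_mask = [[False, False], [False, True]]
print(difference_layer(old_entropy, new_entropy, shadow_mask))
```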
Area of investigation and data input layer
The test site is located northwest of Yemen’s capital Sana’a and covers an area of 10 x
10 km. In this arid to semi-arid climate zone the land has been cultivated since ancient times
and water is scarce. The test site is a cuesta landscape with altitudes between 2500
and 3000 m a.s.l., with wide valleys and a network of wadis (Fig. 3). Farming
within the test site is characterized by extensive irrigation using groundwater from wells. On
a limited scale run-off water is used, too. Arable land is mainly located in the valleys and on
man-made terraces on the pediments in front of the escarpments and on dip slopes.
Aside from arable farming, various other categories of land use can be found
(Fig. 4).
Due to the long term technical cooperation between the geological surveys of the Republic
of Yemen and Germany, there are satellite images available in the Federal Institute for Geosciences and Natural Resources (BGR). For this study SPOT data (http://www.spot.com)
and GoogleEarth-QuickBird data (http://earth.google.com) were chosen for the comparison
with CORONA images (Tab. 2).
Fig. 3: View to the investigation area northeast of Shibam city, Yemen (photo R. Kringel
11/2006, BGR)
Sensor type | Spectral bands          | Spatial resolution [m] | Date       | Source
------------+-------------------------+------------------------+------------+----------------------------------------
CORONA      | panchromatic            | 2.5                    | 07.11.1967 | USGS (http://www.usgs.gov)
QuickBird   | color composite         | 0.6                    | 14.11.2003 | Google Earth (http://earth.google.com)
SPOT        | RGB, NIR + panchromatic | 10 + 2.5               | 23.04.2004 | SPOT IMAGE (http://www.spot.com)
Tab. 2: Images used
Fig 4: Exemplary land uses in the study area (photos by R. Kringel 11/2006, BGR)
Results
The first results are the computed diversity layers derived from the panchromatic images.
The diversity/heterogeneity is quantified at date A (Fig. 5) and date B (Fig. 6–7). Areas
with low and high diversity can be delineated and combined with land cover classes. The
two entropy layers of QuickBird (Fig. 7) and SPOT (Fig. 6) data show largely consistent patterns of
heterogeneity. This is confirmed by the correlation of 0.84 (Tab. 4). In contrast, the comparison between the CORONA image and the modern scenes shows only weak correlations
(0.43 and 0.34, Tab. 4).
layer                                  | minimum | maximum | mean  | variance
---------------------------------------+---------+---------+-------+---------
CORONA panchromatic                    | 0       | 248     | 152.4 | 2348.0
CORONA entropy                         | 0       | 4.98    | 3.97  | 0.65
CORONA diversity                       | 1       | 172     | 85.5  | 941.2
QuickBird panchromatic                 | 0       | 254     | 116.6 | 1338.6
QuickBird entropy                      | 0       | 5.25    | 4.16  | 0.29
QuickBird diversity                    | 1       | 224     | 100.2 | 1287.4
SPOT panchromatic                      | 0       | 230     | 75.3  | 245.9
SPOT entropy                           | 0       | 4.48    | 3.1   | 0.21
SPOT diversity                         | 1       | 118     | 34.4  | 166.4
CORONA-QuickBird entropy difference    | -5.08   | 3.16    | -0.19 | 0.58
CORONA-QuickBird diversity difference  | -193    | 130     | -14.7 | 1280.5
CORONA-SPOT entropy difference         | -4.06   | 2.81    | 0.87  | 0.61
CORONA-SPOT diversity difference       | -84     | 139     | 51.1  | 852.8
Tab. 3: Univariate statistics for input and output layers
The entropy layers mark agricultural terraces, plantations, infrastructure, and settlement areas
as highly diverse. The visible patterns with high entropy coincide with the border areas of
wadis. This can be explained by intensive human activities and changes in land use. Parts
with low entropy comprise areas covered by clouds or shadows, areas on the higher part of
the plateaus as well as barren land.
Fig. 5: CORONA image (left), its entropy pattern (middle), and distribution of entropy
values (right)
Fig. 6: SPOT panchromatic image (left), its entropy pattern (middle), and distribution of
entropy values (right)
entropy layer | CORONA | QuickBird
--------------+--------+----------
QuickBird     | 0.43   |
SPOT          | 0.34   | 0.84
Tab. 4: Correlation coefficients between the entropy layers
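The text does not state how the coefficients in Tab. 4 were computed. Assuming the usual Pearson correlation over corresponding cells, a minimal sketch looks as follows; the function name is an assumption, and the layers are passed as flat lists of cell values.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equally long value lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


print(pearson([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
```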
Fig. 7: QuickBird panchromatic image (left), its entropy pattern (middle), and distribution
of entropy values (right)
The difference images (difference = CORONA entropy – panchromatic image entropy) illustrate the intensity of change between date A and date B. Areas with shadows and cloud cover
have to be excluded, even though high differences can occur there. For these areas an assessment of
change is not possible. The threshold between change and no change is drawn in the middle
of the distribution of difference values. A negative value stands for change. The more negative the
value, the stronger the change (Tab. 3).
Fig. 8: CORONA, QuickBird image and entropy difference layer of Shibam
Fig. 9: CORONA, SPOT image and entropy difference layer of Shibam
The city of Shibam is located in the center of the test site (Fig. 8, 9). This 2.5 km²
clip illustrates how the entropy difference pattern can be read. Negative difference values have an orange-yellow coloring and are plotted transparently on the
panchromatic image. Areas with very strong differences are caused by shadows (visible along
the geological fault zone, Fig. 8 and 9 left). These areas were not considered in the final generalization pattern. The remaining pattern clearly reflects the distribution of building areas
and infrastructure at the edge of the town as well as changes in land use. Both
resulting entropy difference layers show a very similar pattern.
Shadows and clouds were eliminated from the difference patterns by masking, and the remaining
patterns were generalized. As a result, several categories can be distinguished (Fig. 10):
• extension of settlements (1)
• urban sprawl as result of construction of new roads (2)
• plant settlement (3)
• extension of plantation (4)
The areas shown in Fig. 10 mark the changes due to human activities. Even in this
rural area, changes in infrastructure (mainly the construction of new roads) and urban sprawl
are clearly visible. During the last four decades the changes have reached considerable
dimensions: the city has doubled in size.
The use of CORONA images for Change Detection (simple image differencing method) extends the evaluated period by one decade. The entropy and diversity difference images show
plausible and interpretable change patterns. The comparison of CORONA images of the 1960s with images taken by modern sensors turned out to be a promising complementary approach
to visualize and quantify major changes in land use.
Fig. 10: Changes in land use marked on topographic map from the eighties.
References
1. Altmann, R.; Haase, G. (1987): Zur Kennzeichnung von Merkmalsvariabilität, Kontrast
und Arealheterogenität als Eigenschaften der Landschaftsstruktur. Strukturen und
Prozesse in der Geographie: Beiträge zur quantitativ arbeitenden Geographie, Band
19, 145-154, Haack: Gotha.
2. Bjorke, J. (1996): Framework for Entropy-based Map Evaluation. Cartography and
Geographic Information Systems, 23, 2, 78-95.
3. GRASS (2008): Geographic Resources Analysis Support System, GRASS GIS 6.3.0
http://grass.osgeo.org
4. Grosse, G.; Schirrmeister, L.; Kunitsky, V. V.; Hubberten, H.-W. (2005): The Use of
CORONA Images in Remote Sensing of Periglacial Geomorphology: An Illustration
from the NE Siberian Coast. Permafrost and Periglac. Process. 16: 163–172.
5. Goossens, R.; De Man J.; De Dapper M. (2001): Research on the possibilities of
CORONA-satellite-data to replace conventional aerial photographs in geo-archaeological
studies, practised on Sai (Sudan). In A Decade of Trans-European Remote Sensing Cooperation, Buchroithner MF (ed.). Balkema Publishers: Lisse/Netherlands; 257–262.
6. Lorenz, H. (2004): Integration of Corona and Landsat Thematic Mapper data for
bedrock geological studies in the high Arctic. International Journal of Remote Sensing,
Volume 25, Number 22: 5143-5162.
7. Lunetta, R. S.; Elvidge, Ch. D. (eds.) (1999): Remote sensing change detection:
environmental monitoring methods and applications, Taylor & Francis, London.
8. McGarigal, K.; Marks B. J. (1995): FRAGSTATS: spatial pattern analysis program for
quantifying landscape structure. USDA For. Serv. Gen. Tech. Rep. PNW-351.
9. Owe, M. (ed.) (2007): Remote sensing for environmental monitoring and change detection: a compilation of papers presented at the IAHS Symposium on Remote Sensing
for Environmental Monitoring and Change Detection, in Perugia, as part of the 24th
IUGG General Assembly, 2007, IAHS publication, 316.
10. Palm, G. (1985): Information und Entropie. In: Natur und Wissenschaft. Konkursbuch
14, Zeitschrift für Vernunftkritik, 95-110. Tübingen: Konkursbuchverlag.
11. Paulov, J. (1991): Entropie in der Humangeographie – Einleitende konzeptionelle Übersicht.
Petermanns Geographische Mitteilungen 2/1991, 89 – 97, Gotha: Haack.
12. Peinado, L. O. (2001): Comparison of Change Detection Methods for the Extraction of
Land Cover Parameters. Herbert Utz.
13. Schmidt, M.; Goossens, R.; Menz, G.; Altmaier, A.; Devriendt, D. (2001): The use
of CORONA satellite images for generating a high resolution digital elevation model.
IEEE vol. 7: 3123 – 3125.
14. Théau, J. (2006): Detection of Changes Using Remote Sensing: an Overview of Principles and Applications. Geo-Spatial and Range Sciences Conference, online:
http://giscenter.isu.edu/gisday/grsc archives/chdetection.pdf
15. Yang, X. M. (1999): Change Detection Based on Remote Sensing Information Model and
its Application on Coastal Line of Yellow River Delta. GISdevelopment-Proceedings,
online: http://www.gisdevelopment.net/aars/acrs/1999/ps5/ps5043.asp
Moebius: An interface to web map services
David Procházka*, Jana Procházková**
* Dep. of Informatics, Faculty of Business and Economics, Mendel University in Brno
** Dep. of Mathematics, Faculty of Mechanical Engineering, Brno University of Technology
[email protected]
Keywords: Indexing, Searching, Web Map Service
Abstract
Our article presents a concept of a geospatial search engine based on a Web Map Service
(WMS) compliant virtual mapserver. This virtual mapserver is able to index mapservers
based on the WMS standard and to create a unified interface to all shared map layers. The
presented approach also allows searching the map layers within the virtual mapserver and
processing the results directly in GIS tools.
Introduction
Two basic approaches can be recognized for retrieving files or, more generally, pieces of
information: searching and classification. Searching is a widely used method and is replacing
the classification approach in many applications (for instance retrieval of a relevant web
page). In geoinformatics, however, ontological classification is predominantly used: metadata
catalogs (http://mis.cenia.cz), semantic rules (see [1], [2]), etc. Although these methods have
some benefits, they also have many drawbacks: 1. Catalogues and other ontologically based
approaches require manual administration (delays in updating, limited range, etc.). 2. It
is hard to classify geodata into a fixed set of categories because every layer can be viewed
from many aspects (origin, resolution, coverage, content). For an overview of currently used
approaches see [3] or [4].
Generally, these approaches do not solve the basic problem: geodata is spread across the
Internet on many mapservers, and finding these mapservers is usually the first problem.
For this reason, there is a need for a geospatial version of a search service such as Google
(http://www.google.com), Jyxo (http://www.jyxo.cz) or similar engines. Nowadays geodata
is usually published through different map services, therefore we have focused on them. The
presented search engine uses OpenGIS standards for communication, especially the Web
Map Service (see [5]).
Geospatial search service
The presented solution has three basic parts. The first is an indexing engine that finds as many
mapservers as possible. Indexing must be as autonomous as possible because manual administration would quickly become the bottleneck of the engine. Secondly, it is necessary to create
a unified interface to the indexed mapservers. For this purpose we have designed a virtualization engine. The
third part is a search engine working with the given indices. Such an engine must be very simple
and intuitive (e.g. like Google). The following sections present the basic structure of these
components.
Indexing tool
The indexing tool (called Indexer) is a web service written in Python. It is started with
the address of the mapserver to be indexed. It then sends a GetCapabilities request to
the given mapserver and parses the resulting XML file. Pieces of information connected
directly to a map layer (bounding box, name, title, ...) are combined with information valid
for several layers. For instance, the Abstract of a mapset (or of the mapserver itself) is valid for
all layers in the mapset; therefore the index of each layer must contain this information.
The result of this process is an index with the following structure.
• NickName – unique identification string, composed of the name of the map layer
(unique within the mapserver) and a unique identification of the mapserver (chosen
during indexing, usually a part of the URL; the result is e.g. [email protected]),
• Name – name of the layer (content of the Name element),
• WMS – version of WMS supported by the mapserver, taken from the head of the
GetCapabilities file,
• Address – URL of the mapserver where the layer is stored, hence also the URL
used for the GetMap requests, again taken from the head of the GetCapabilities file,
• Access – access mode of the layer, used for security reasons; there are three options:
all (everyone is able to access this layer), black (everyone except users from IP
addresses on a blacklist), white (only users from IP addresses on a whitelist),
• Descriptions – content of the Title elements in Layer elements (description of
the layer or mapset) and usually also of the Title in the head of the GetCapabilities file,
• Abstracts – list of Abstracts taken from the head of the GetCapabilities file and from instances of the Layer element,
• SRSs – list of supported coordinate systems,
• BoundingBoxes and LatLonBoundingBox – define the bounding box of the layer,
• MinScale and MaxScale – minimal and maximal scale of the layer, taken from the
lowest instance of the Layer element (could be replaced or extended by the ScaleHint element),
• Formats – list of supported output formats, taken from the head of the GetCapabilities
file,
• Opaque and Queriable – same meaning as in the GetCapabilities file, values are taken
from the lowest instance of the Layer element,
• Styles – list of Styles, i.e. names of the NamedStyles known to the WMS and applicable to this
layer.
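The decomposition of a GetCapabilities document into per-layer index entries can be sketched as follows. This is only a minimal illustration, not the Indexer's actual code: the sample document, the function name index_capabilities and the "name@server" form of the NickName are assumptions made for the example, and only a few of the index fields listed above are filled in. Values from the head of the document and from parent Layer elements are inherited by every named layer.

```python
import xml.etree.ElementTree as ET

XLINK_HREF = "{http://www.w3.org/1999/xlink}href"

# Hypothetical, heavily shortened WMS 1.1.1 capabilities document.
SAMPLE = """<WMT_MS_Capabilities version="1.1.1">
  <Service>
    <Title>Demo server</Title>
    <Abstract>Demo mapset</Abstract>
  </Service>
  <Capability>
    <Request><GetMap>
      <Format>image/png</Format>
      <DCPType><HTTP><Get>
        <OnlineResource xmlns:xlink="http://www.w3.org/1999/xlink"
                        xlink:href="http://example.org/wms?"/>
      </Get></HTTP></DCPType>
    </GetMap></Request>
    <Layer>
      <Title>All layers</Title>
      <SRS>EPSG:4326</SRS>
      <Layer queryable="1">
        <Name>coast</Name>
        <Title>Coastline</Title>
        <Abstract>European coastline</Abstract>
        <LatLonBoundingBox minx="-16" miny="36" maxx="10" maxy="42"/>
      </Layer>
    </Layer>
  </Capability>
</WMT_MS_Capabilities>"""


def index_capabilities(xml_text, server_id):
    """Return one index entry (a dict) per named layer."""
    root = ET.fromstring(xml_text)
    service = root.find("Service")
    get_map = root.find("Capability/Request/GetMap")
    # Values from the head of the document are shared by all layers.
    head = {
        "WMS": root.get("version"),
        "Address": get_map.find("DCPType/HTTP/Get/OnlineResource").get(XLINK_HREF),
        "Formats": [f.text for f in get_map.findall("Format")],
    }
    entries = []

    def walk(layer, titles, abstracts, srss, box):
        # Extend the inherited values with those of this Layer element.
        title, abstract = layer.findtext("Title"), layer.findtext("Abstract")
        titles = titles + ([title] if title else [])
        abstracts = abstracts + ([abstract] if abstract else [])
        srss = srss + [s.text for s in layer.findall("SRS")]
        own_box = layer.find("LatLonBoundingBox")
        if own_box is not None:
            box = tuple(float(own_box.get(k)) for k in ("minx", "miny", "maxx", "maxy"))
        name = layer.findtext("Name")
        if name:  # only named layers can be requested with GetMap
            entries.append(dict(head, NickName=f"{name}@{server_id}", Name=name,
                                Descriptions=titles, Abstracts=abstracts,
                                SRSs=srss, LatLonBoundingBox=box))
        for child in layer.findall("Layer"):
            walk(child, titles, abstracts, srss, box)

    head_titles = [t for t in [service.findtext("Title")] if t]
    head_abstracts = [a for a in [service.findtext("Abstract")] if a]
    for top in root.findall("Capability/Layer"):
        walk(top, head_titles, head_abstracts, [], None)
    return entries


ENTRIES = index_capabilities(SAMPLE, "demo")
```

In this sketch the entry for the layer "coast" carries both the abstract of the whole mapset and its own abstract, which is the inheritance described in the text.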
All the described pieces of metadata are provided by the WMS itself. As there is no unified
metadata system for geospatial data, it is not possible to rely on information stored in
metadata files in different formats. However, GetCapabilities documents contain enough
different elements to provide comprehensive information about a map layer. The basic problem
is that these elements are frequently not used. Abstracts and descriptions are very brief,
and information about supported resolutions, accuracy, etc. is usually missing completely.
From our point of view the situation is slowly getting better, but there is still much room
for improvement.
Currently it is necessary to pass the URL of a mapserver to the Indexer; the appropriate
indices are then created automatically. To index more mapservers without manual input, the
indexing tool should be accompanied by some kind of web crawler for automatic mapserver
discovery, as described in [6].
It is necessary to emphasize that the contents of the indices have to be checked periodically.
There are two possible approaches. The first one checks only that the layer still exists,
which can usually be done with a GetMap request for a small part of the layer. The second
approach could be called "reindexing": the mapserver is indexed again, and if a newly created
index entry matches an old one, their contents must be the same; otherwise the old entry has
to be replaced.
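The reindexing check can be sketched in a few lines. The sketch assumes, purely for illustration, that the index is a dictionary keyed by NickName and that each entry is a dictionary of index fields; the function name reindex is hypothetical. The existence check via a small GetMap request is not shown, because it needs network access.

```python
def reindex(index, new_entries):
    """Replace every stored entry whose content differs from the new one.

    index       -- dict mapping NickName -> entry (modified in place)
    new_entries -- entries produced by indexing the mapserver again
    Returns the NickNames that were added or replaced.
    """
    changed = []
    for entry in new_entries:
        key = entry["NickName"]
        if index.get(key) != entry:   # missing or different content
            index[key] = entry
            changed.append(key)
    return changed


# Hypothetical entries used only to exercise the function.
STORED = {"coast@demo": {"NickName": "coast@demo", "Abstracts": ["old text"]}}
FRESH = [{"NickName": "coast@demo", "Abstracts": ["new text"]}]
CHANGED = reindex(STORED, FRESH)
```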
Virtualization tool
There are many approaches to the virtualization (or rather aggregation) of web services. Probably
the most successful projects are GIDB [6] and GeoBrain [7], [8]. Our project uses a different
approach, whose concept is described in more detail in [4]. The standard "old-style"
approach is to create lists of mapserver URLs or to create a WMS interface to them (GIDB).
This still leaves the user with a number of different mapservers. In our approach we merge
the layers of all indexed mapservers into one huge virtual mapserver. Such a mapserver contains
no data; the virtual layers are generated from the indices stored in the database.
Every WMS-compliant mapserver must be able to respond to GetMap and GetCapabilities
requests. The following sections describe the implementation of these requests in
our virtual mapserver, called Moebius.
GetCapabilities
The implementation of the GetCapabilities request is straightforward: Moebius has
the indices, which contain all the information necessary for generating the GetCapabilities (GC)
document. The response is therefore in fact a translation of the indices into a GC file. The
first part of the GC file contains information about Moebius itself (supported formats, address
of the service, contact information, etc.). This information is stored in a configuration
file. The second part is generated by the translation method. In the current implementation
the indices are stored in XML, so a parser only has to generate a slightly different XML tree
according to the GC DTD.
Figure 1: The uDig application with our virtual mapserver Moebius. The bottom window
displays the content of the Moebius map service.
GetMap
Every GetMap request must be decomposed according to the number of requested layers. For
every requested layer a new GetMap request is created and sent to the real mapserver.
The response, an image, is stored by Moebius. After the mapservers have returned all
requested images, Moebius merges them into one image, which is returned to the client.
The client therefore does not know that it is in fact receiving data originating from
different mapservers (see Fig. 2).
Figure 2: Scheme of the GetMap request implementation in the Moebius
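The decomposition step can be sketched as follows. This is an assumption-laden illustration rather than the Moebius code: the index contents, the "layer@server" NickNames and the function name decompose_getmap are hypothetical, styles are left empty, and transparent=TRUE is added on the assumption that the returned images are to be overlaid. Fetching and merging the images is not shown, since it needs network access and an imaging library.

```python
from urllib.parse import urlencode

# Hypothetical index: NickName -> (real layer name, address of the real mapserver)
INDEX = {
    "coast@serverA": ("coast", "http://a.example.org/wms"),
    "radar@serverB": ("radar", "http://b.example.org/wms"),
}


def decompose_getmap(params, index):
    """Return one GetMap URL per requested virtual layer, in drawing order."""
    urls = []
    for nick in params["layers"].split(","):
        name, address = index[nick]
        # Same request, but with the real layer name and a transparent background.
        real = dict(params, layers=name, styles="", transparent="TRUE")
        urls.append(address + "?" + urlencode(real, safe=",:/"))
    return urls


REQUEST = {
    "service": "wms", "version": "1.1.1", "request": "getmap",
    "layers": "coast@serverA,radar@serverB", "srs": "EPSG:4326",
    "bbox": "-180,-90,180,90", "styles": "", "format": "image/png",
    "width": "500", "height": "400",
}
URLS = decompose_getmap(REQUEST, INDEX)
```

Each URL in URLS addresses exactly one real mapserver; the virtual mapserver would request them all and merge the returned images before answering the client.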
An example using layers from two mapservers is example URL¹;
the result is the following single image (Fig. 3).
Figure 3: Example of a GetMap request with layers from two different mapservers.
Search engine
A web page is the common interface for searching the web. This approach, although usually
effective, is inconvenient in this situation. Suppose that a user formulates a question
and receives an answer in the form of a list of links. Such a response has a significant
disadvantage: if the user wants to add some of the layers to a project in a GIS, it is
necessary to copy the addresses of the mapservers, the names of the layers, etc. by hand.
Therefore we have designed a completely different solution.
The basic idea of our approach is this: if GetCapabilities means "return all available
layers", there should be another request, FindMap, which means "return the layers which
fulfil the given criteria". The response to such a request should again be a GetCapabilities
file, just with a limited number of layers. This approach allows the response to be processed
directly in a GIS application, because from the point of view of the GIS application every
response is in fact an independent WMS mapserver.
¹ http://echo.mendelu.cz/cgi-bin/moebius/moebius.py?service=wms&version=1.1.1&request=getmap&layers=topp:tdwg level [email protected],Radarsat [email protected]&srs=EPSG:4326&bbox=-180,-90,180,90&styles=&format=image/png&width=500&height=400&
Structure of FindMap request
The FindMap request is similar to other WMS requests. Its parameters allow the user to formulate what he is searching for and where it should be. This is done using
the following parameters:
• request=FindMap – identification of the request, mandatory or optional
depending on the implementation of the service,
• words=keyword,keyword,... – list of keywords which are searched for in the indices,
mandatory,
• bbox=minx,miny,maxx,maxy – bounding box of the search, mandatory,
• operator=and,or – defines the relation between keywords, optional (default value is "or"),
• version=1.0.0 – version of the request, currently not used, reserved for future development,
• exceptions=exception format – defines the format of exceptions, optional,
• abstract=0..n – number from 0 to n which represents the significance of instances
of keywords in this part of the index (0 – abstract is not used in the calculation, n –
abstract has the highest significance).
An example of such a FindMap request is:
example URL²
The response is the corresponding part of the GetCapabilities document of Moebius, containing
layers for the matching part of China.
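How a service might read and validate the FindMap parameters listed above can be sketched as follows. The function name parse_findmap and the error handling are assumptions made for this illustration; only the parameters words, bbox and operator are handled.

```python
from urllib.parse import parse_qs, urlsplit


def parse_findmap(url):
    """Extract and validate the FindMap parameters words, bbox and operator."""
    query = parse_qs(urlsplit(url).query)

    def first(key, default=None):
        return query.get(key, [default])[0]

    if first("words") is None or first("bbox") is None:
        raise ValueError("parameters 'words' and 'bbox' are mandatory")
    bbox = tuple(float(v) for v in first("bbox").split(","))
    if len(bbox) != 4:
        raise ValueError("bbox must be minx,miny,maxx,maxy")
    operator = first("operator", "or").lower()   # "or" is the default
    if operator not in ("and", "or"):
        raise ValueError("operator must be 'and' or 'or'")
    return {"words": first("words").split(","), "bbox": bbox, "operator": operator}


QUERY = parse_findmap(
    "http://echo.mendelu.cz/cgi-bin/moebius/search.py"
    "?words=Three,Gorges&operator=and&bbox=-180,-90,180,90"
)
```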
From the request (especially the bounding box part) it is obvious that the user assumes that
there is only one place called "Three Gorges" and that he does not know where this place is.
Therefore he is searching the whole Earth.
More usual is the second application of this service, where the user is searching in some
specific part of the Earth. For example, a user searching for the coast of the Iberian
Peninsula can either search for the keywords "coast" and "Iberian"/"Iberian peninsula" on
the whole Earth, or for "coast" just within the bounding box of the peninsula:
1. example URL³
2. example URL⁴
The second approach is obviously much more effective. A mapserver in Spain or Portugal
will probably have a layer called "coast", but it is much less probable that this layer will
be called "Iberian coast". Moreover, the coast looked for could be part of a larger layer,
e.g. a European coast layer. In this case searching for "Iberian coast" is clearly
ineffective.
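The effect of the bounding box can be illustrated with a small spatial filter. The index entries, their bounding boxes and the function names below are hypothetical; the sketch only shows that a layer described merely as a "coast" layer is still found when the search box overlaps it, while a keyword that appears in no description finds nothing even on the whole Earth.

```python
def intersects(a, b):
    """True if boxes a and b, given as (minx, miny, maxx, maxy), overlap."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def find_layers(index, words, bbox):
    """Return NickNames of layers overlapping bbox that mention any keyword."""
    hits = []
    for entry in index:
        text = " ".join(entry["Descriptions"]).lower()
        if intersects(entry["LatLonBoundingBox"], bbox) and any(
            w.lower() in text for w in words
        ):
            hits.append(entry["NickName"])
    return hits


# Hypothetical index entries.
INDEX = [
    {"NickName": "coast@eu", "Descriptions": ["European coast"],
     "LatLonBoundingBox": (-25, 34, 45, 72)},
    {"NickName": "coast@asia", "Descriptions": ["Asian coast"],
     "LatLonBoundingBox": (60, 0, 150, 60)},
]

IBERIA = (-16, 36, 10, 42)          # assumed box around the Iberian Peninsula
WHOLE_EARTH = (-180, -90, 180, 90)
```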
² http://echo.mendelu.cz/cgi-bin/moebius/search.py?words=Three,Gorges&operator=and&bbox=-180,-90,180,90
³ http://echo.mendelu.cz/cgi-bin/moebius/search.py?words=iberian,coast&operator=and&bbox=-180,-90,180,90
⁴ http://echo.mendelu.cz/cgi-bin/moebius/search.py?words=coast&bbox=-16,42,10,36
Search method and calculation of the relevance
Probably the most widespread search method is the Inverted Index (see [9], [10] and many
others). A great advantage of this approach is its simplicity. The Inverted Index method is
used by Google and many other search engines. The gist is building records which contain
tuples of a keyword and its occurrences in documents (e.g. golf is in documents 1, 7 and 9).
Usually the records also contain information about the position or positions of the keyword
in the document (golf appears in document 1 at positions 7, 25 and 78). This method is
frequently extended with a thesaurus, a dictionary and other improvements. What is important
for implementation is that these improvements can be added independently.
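A positional inverted index of the kind described above can be sketched in a few lines. The documents and the function name are hypothetical, and tokenization is reduced to lower-casing and splitting on whitespace.

```python
from collections import defaultdict


def build_inverted_index(documents):
    """Map each word to {document id: [positions of the word in that document]}."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in documents.items():
        for position, word in enumerate(text.lower().split()):
            index[word][doc_id].append(position)
    return index


# Hypothetical documents keyed by document id.
DOCUMENTS = {
    1: "golf is a sport",
    7: "Volkswagen Golf is a car",
    9: "map of a golf course",
}
INVERTED = build_inverted_index(DOCUMENTS)
```

Looking up INVERTED["golf"] returns the documents and positions where the word occurs, but, as the next paragraph points out, it says nothing about which meaning of the word is intended.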
Methods based on the Inverted Index do not reflect the semantic meaning of documents. If a
user is searching for the word "Golf", they are only able to find documents containing this
word. Usually they are not able to recognize the difference between golf (the sport) and the
Volkswagen Golf (the car). Some engines (such as Google) use the search history to guess the
semantic meaning of a question (e.g. a user who usually asks for information about cars).
But what if there is a page about Tiger Woods, which is in fact about the sport of golf but
does not contain the word golf itself? Inverted Index methods are usually not able to
recognize it. Therefore there is a need for a more complex method which is able to work with
semantic relations.
Latent Semantic Analysis
An important method based on the analysis of semantic meaning is Latent Semantic Analysis
(LSA), also called Latent Semantic Indexing. LSA is a technique in natural language
processing for analyzing relationships between a set of documents and the terms they contain
by producing a set of concepts related to the documents and terms. LSA is based only on
mathematical principles and does not rely on exact keyword matches. An important advantage
is that a similar document need not contain a given keyword (see [11]) and can still be found.
The input of the algorithm is a set of different documents and one document which contains
the keywords. LSA will find the documents which are close to given keywords.
LSA can use a term-document matrix which describes the occurrences of terms in documents.
It is a sparse matrix where the rows correspond to terms (typically stemmed words that appear
in the documents) and the columns correspond to documents.
$$
X = \begin{pmatrix} x_{11} & \cdots & x_{1n} \\ \vdots & \ddots & \vdots \\ x_{m1} & \cdots & x_{mn} \end{pmatrix} \tag{1}
$$
A typical example of the weighting of the elements of the matrix is one based on Inverse
Document Frequency (IDF, commonly combined with term frequency as tf-idf): each element of
the matrix is proportional to the number of times the term appears in the document, and rare
terms are upweighted to reflect their relative importance.
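A weighted term-document matrix can be sketched as follows. The weighting used here, term count multiplied by log(n / document frequency), is one common tf-idf variant and is an assumption of this example, as are the documents and the function name; no stemming is performed.

```python
import math


def term_document_matrix(documents):
    """Return (terms, matrix); rows are terms, columns are documents."""
    tokens = [doc.lower().split() for doc in documents]
    terms = sorted({t for doc in tokens for t in doc})
    n = len(documents)
    matrix = []
    for term in terms:
        df = sum(1 for doc in tokens if term in doc)   # document frequency
        idf = math.log(n / df)                         # rare terms get more weight
        matrix.append([doc.count(term) * idf for doc in tokens])
    return terms, matrix


# Hypothetical layer descriptions.
TERMS, MATRIX = term_document_matrix(["coast coast map", "river map"])
```

In this example the term "map" occurs in every document, so its weight is zero, while "coast" and "river" each identify a single document.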
The next step is applying the Singular Value Decomposition (SVD; for the mathematical
background see [12] or [13]). Its output is a product of three special matrices:
$$
X = K \cdot S \cdot D^T \tag{2}
$$
Matrix $K$ contains the eigenvectors $u_i$ of $X X^T$ (in columns), and $D^T$ is the matrix of the eigenvectors $v_i$ of $X^T X$ (in rows). Matrix $S$ is a diagonal matrix of the singular values, i.e. the square roots of the eigenvalues of $X^T X$, which
are written in descending order on the main diagonal.
It turns out that when the $s$ largest singular values and their corresponding singular
vectors from $K$ and $D^T$ are selected, the result is the rank-$s$ approximation of $X$ with
the smallest error (in the Frobenius norm). This approximation translates the term and
document vectors into a concept space. Equation (2) can then be rewritten as:
$$
M_s = K_s \cdot S_s \cdot D_s^T \tag{3}
$$
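A deliberately tiny worked instance of equations (2) and (3) may help; the matrix below is chosen only because its decomposition can be read off directly, and it is not taken from the described implementation.

```latex
\[
X = \begin{pmatrix} 2 & 0 \\ 0 & 1 \end{pmatrix}
  = \underbrace{\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}}_{K}
    \underbrace{\begin{pmatrix} 2 & 0 \\ 0 & 1 \end{pmatrix}}_{S}
    \underbrace{\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}}_{D^T},
\qquad
X^T X = \begin{pmatrix} 4 & 0 \\ 0 & 1 \end{pmatrix}
\;\Rightarrow\; \sigma_1 = \sqrt{4} = 2,\quad \sigma_2 = \sqrt{1} = 1
\]
\[
M_1 = K_1 \cdot S_1 \cdot D_1^T
    = \begin{pmatrix} 1 \\ 0 \end{pmatrix} (2) \begin{pmatrix} 1 & 0 \end{pmatrix}
    = \begin{pmatrix} 2 & 0 \\ 0 & 0 \end{pmatrix},
\qquad
\lVert X - M_1 \rVert_F = \sqrt{0^2 + 0^2 + 0^2 + 1^2} = 1 = \sigma_2
\]
```

Keeping only the largest singular value ($s = 1$) therefore discards exactly the contribution of the smaller one, and the Frobenius error equals the discarded singular value.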
The last step is to find which documents are close to the given query (which can be viewed
as a mini document). To do this, we must first translate the query $q$ into the concept
space, obtaining $q^*$, using the same transformation that was applied to the documents (the
SVD transformation). Then we compare it to the documents (vectors $v_i$) using cosine similarity:
$$
\cos(q^*, v_i) = \frac{q^* \cdot (D_s^T)_i}{|q^*| \cdot |(D_s^T)_i|} \tag{4}
$$
For vectors with non-negative components the result of equation (4) lies in the interval
⟨0, 1⟩; in general the cosine lies in ⟨−1, 1⟩. A result near zero shows that there is no
similarity between the query and the document. A value near one shows that there is a high
similarity, hence we have probably found a relevant result.
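The comparison in equation (4) can be sketched in plain Python. The vectors below are hypothetical concept-space coordinates; the projection of the query into the concept space is assumed to have been done already and is not shown.

```python
import math


def cosine(u, v):
    """Cosine of the angle between vectors u and v (0.0 if either is a zero vector)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank(query, documents):
    """Return (similarity, document id) pairs, most similar document first."""
    scored = [(cosine(query, vec), doc_id) for doc_id, vec in documents.items()]
    return sorted(scored, reverse=True)


# Hypothetical document vectors in a two-dimensional concept space.
DOCS = {"a": [0.9, 0.1], "b": [0.1, 0.9]}
RANKED = rank([1.0, 0.0], DOCS)
```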
This approach is very promising. Although it is necessary to understand and implement
several mathematical algorithms, the results outweigh the difficulties. However, for the
presented approach to give accurate results, at least a few longer sentences describing the
content are needed. Nowadays descriptions of layers usually contain only a few words, and
hence it is not possible to use LSA efficiently right now. It is necessary to wait until
owners of mapservers publish more complete and precise metadata. For the time being it is
necessary to use an algorithm which is able to work with less information.
Implemented method
Our approach is a combination of the Inverted Index and LSA principles. The implemented
search engine uses the indices created by the indexing tool. Relevance is based on the number
of instances of the searched keywords and on their position in the indices.
For every element $e_i$, $i = 0, 1, \ldots, n-1$, of the index (Abstracts, Titles, ...) there
is a coefficient of importance $w_i$. The coefficient $w_i$ starts at zero and is increased by
one for every keyword $k_s$ found in the element. This calculation of importance is done
for all elements $e_i$.
Furthermore, a coefficient $v_i$ is defined for every element $e_i$. This is the weight of the
element. The weight is designed to emphasize important elements such as Keywords: if a
keyword appears in the element Keywords, it is more important than its appearance in the
element Abstract. Every element has its weight given by the default settings of the engine
or by the user as a parameter of the FindMap request. These two coefficients are used to
calculate the importance of the layer:
$$
W = \prod_{\substack{i=0 \\ w_i \neq 0}}^{n-1} w_i \cdot v_i
$$
where
$$
\mathrm{bin}(k_s, e_i) = \begin{cases} 1, & \text{keyword } k_s \text{ is in element } e_i \\ 0, & \text{keyword } k_s \text{ is not in element } e_i \end{cases}
$$
and the coefficient of importance $w_i$ for element $e_i$ is given by:
$$
w_i = \sum_{\forall s} \mathrm{bin}(k_s, e_i)
$$
The calculation is based on the following assumption: if more keywords appear in some
element, it is more probable that the layer is relevant. If there are more elements
containing more keywords, the relevance is much higher. For this reason the values of the
non-zero coefficients are multiplied.
The second important coefficient used for the calculation of the relevance is the coefficient of instances. The number of instances of the searched keyword $k_s$ in element $e_i$ is called
$a_i$. We calculate the sum $m_s$ of these instances for every keyword $k_s$:
$$
m_s = \sum_{i=0}^{n-1} a_i
$$
If the operator between the keywords is "and" and there is at least one $m_s = 0$, the
coefficient of instances for that layer is set to zero. In all other cases the coefficient
is given by the equation
$$
M = \sum_{\forall s} m_s
$$
The value which represents the relevance of a layer, $R$, is calculated by multiplying the
coefficients $W$ and $M$ presented above:
$$
R = W \cdot M
$$
This formula reflects the idea that not only the number of instances of the keywords is
important, but also their position in the index and their proximity.
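The relevance calculation can be followed step by step in a short sketch. The element texts, the weights and the function name relevance are assumptions made for this illustration; keyword matching is reduced to comparing lower-cased, whitespace-separated words.

```python
def relevance(elements, weights, keywords, operator="or"):
    """Compute R = W * M for one layer.

    elements -- dict: element name -> text of that index element (e_i)
    weights  -- dict: element name -> weight v_i (1.0 if missing)
    keywords -- searched keywords k_s
    """
    W = 1.0
    m = {k: 0 for k in keywords}            # m_s: instances of k_s over all elements
    for name, text in elements.items():
        words = text.lower().split()
        counts = {k: words.count(k.lower()) for k in keywords}   # a_i per keyword
        w_i = sum(1 for c in counts.values() if c > 0)           # sum of bin(k_s, e_i)
        if w_i:                              # only non-zero coefficients are multiplied
            W *= w_i * weights.get(name, 1.0)
        for k, c in counts.items():
            m[k] += c
    if operator == "and" and any(v == 0 for v in m.values()):
        return 0.0                           # some keyword is missing entirely
    return W * sum(m.values())               # R = W * M


# Hypothetical index elements and weights of one layer.
ELEMENTS = {"Keywords": "coast sea",
            "Abstract": "coast of the iberian peninsula coast"}
WEIGHTS = {"Keywords": 2.0, "Abstract": 1.0}
```

For the keywords "coast" and "iberian" this gives W = (1 · 2.0) · (2 · 1.0) = 4 and M = 3 + 1 = 4, so R = 16.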
Conclusion and further development
The key innovations of the presented approach are the virtualization of multiple Web Map
Servers and the method of searching. It is necessary to emphasize that the virtualization engine creates a single WMS-compliant interface to all mapservers. Hence the virtual mapserver
Moebius can be opened in every GIS tool that connects to OGC WMS. The FindMap request,
which is embedded into Moebius, allows search results to be processed directly in GIS applications. This was achieved by selecting the GetCapabilities document language as the output
format of the search results. Moreover, because this GetCapabilities document is an ordinary
XML document, it can be transformed into any other XML-based format (XHTML, KML,
etc.), which allows the results to be processed in many more ways.
Currently we are developing an extension to Moebius which transforms GetCapabilities
files into Keyhole Markup Language (KML, for a description see [14]). This makes it possible
to load virtually any WMS mapserver (or a search result) in Google Earth. An example
of such a translated search result is shown in Fig. 4. The KML document itself contains no
data; all geodata is loaded on demand using the Moebius WMS. The generated KML supports the
Super-Overlay technology (see [14]).
A great challenge ahead of us is the optimization of the ranking algorithm. The currently
used approach is very simple. We are working on a rank for every layer that would express
its reliability (similar in meaning to PageRank [15]). It will be based on
observing the usage of different layers (frequently used layers are probably more relevant).
Extending our algorithm with such a rank could significantly improve the search results.
Although we have a working proof of concept, a lot of work must be done before
this application can be used for everyday work. Currently we are experimenting with a new
solution for the storage of the indices and we are trying to remove the performance bottlenecks.
The source code of our solution, written in Python, is available at
http://echo.mendelu.cz, together with more examples and further information. If
you are interested in this project, do not hesitate to contact us.
Figure 4: Google Earth application with opened KML file with the results of the search
References
1. Cruz, I. F. et al. Handling semantic heterogeneities using declarative agreements. In GIS
'02: Proceedings of the 10th ACM international symposium on Advances in geographic
information systems, pp. 168–174, ACM Press, New York, NY, USA, 2002.
2. Wiegand, N. et al. A web query system for heterogeneous government data. In Proceedings of the 2004 annual national conference on Digital government research. Digital
Government Research Center, 2004.
3. Procházka, D. Modelování a vizualizace vymezeného geografického prostoru (Ph.D. Thesis). MUAF in Brno, Brno, 2008, online: http://echo.mendelu.cz/disertace.pdf
4. Procházka, D., Motyčka, A. Geospatial Search Service. In Collaboration, software and
services in information society, Ljubljana, Slovenia, 2008.
5. De La Beaujardiere, J. OpenGIS Web Map Server Specification Implementation, 2007,
online: http://www.opengeospatial.org/standards/wms
6. Sample, J. et al. Enhancing the US Navy's GIDB Portal with Web Services. In Internet
Computing, IEEE. Sept.-Oct. 2006, 10, 5, pp. 53–60.
7. Zhao, P., Di, L. Semantic Web Service Based Geospatial Knowledge Discovery. In
IEEE International Conference on Geoscience and Remote Sensing Symposium 2006.
2006, pp. 3490–3493.
8. Yue, P. et al. Semantic Augmentations for Geospatial Catalogue Service. In IEEE
International Conference on Geoscience and Remote Sensing Symposium 2006. 2006,
pp. 3486–3489.
9. Manning, Ch. D., Raghavan, P., Schütze, H. Introduction to Information Retrieval.
Cambridge University Press, Cambridge, MA, 2008, online: http://nlp.stanford.edu/IR-book/html/htmledition/irbook.html
10. Black, P. E. Inverted index. In Dictionary of Algorithms and Data Structures, U.S.
National Institute of Standards and Technology, 2008, online: http://www.nist.gov/dads/HTML/invertedIndex.html
11. Yu, C., Cuadrado, J., Ceglowski, M., Payne, J. S. Patterns in Unstructured Data – Discovery, Aggregation, and Visualization. National Institute for Technology and Liberal
Education (NITLE), 2008, online: http://www.knowledgesearch.org/lsi/cover page.htm
12. Aggarwal, C. C., Yu, P. S. On effective conceptual indexing and similarity search in text
data. In Proceedings of the IEEE International Conference on Data Mining, 2007, pp.
3–10.
13. Wall, M. E., Rechtsteiner, A., Rocha, L. M. A Practical Approach to Microarray
Data Analysis. Kluwer, Norwell, MA, 2003.
14. Google, Inc. Keyhole Markup Language Introduction. Mountain View, CA, 2008,
online: http://code.google.com/apis/kml/documentation/
15. Page, L., Brin, S., Motwani, R., Winograd, T. The PageRank Citation Ranking: Bringing
Order to the Web. Stanford University, 1999, online: http://dbpubs.stanford.edu:8090/pub/1999-66
Acknowledgement
This article was written in the context of project VZ MSM 6215648904/03/03/01 of the Ministry
of Education, Youth and Sports of the Czech Republic.
ISO 19115 for GeoWeb services
orchestration
Jan Růžička
Institute of Geoinformatics, VSB-TU of Ostrava
[email protected]
Keywords: ISO 19115, GeoWeb, Orchestration, BPEL, MIDAS, Dublin Core, INSPIRE
Abstract
The paper describes the theoretical and practical possibilities of the ISO 19115 standard in
the process of generating dynamic GeoWeb service orchestras. There are several ways to
instantiate orchestras according to the current state of the services and the user's needs;
some of them are briefly described in the paper. The most flexible way is based on metadata
that describe the geodata used by the services. The most common standard used for geodata
metadata in the EU is ISO 19115. The paper examines whether the standard is able (without
extensions) to hold enough information for orchestration purposes. It defines a minimal set
of metadata items, named "ISO 19115 Orchestration Minimal", that must be available for
geodata evaluation in the process of orchestration. The second part of the article is
probably less optimistic. It describes how the possibilities of ISO 19115 are (or were, or
are planned to be) used for metadata creation in the Czech Republic. This part is based on
analyses of ISO 19115 core, the MIDAS system, Dublin Core and the INSPIRE metadata IR.
Abstract (translated from the Czech version)
The paper describes the theoretical and practical possibilities of the ISO 19115 standard in
the process of creating dynamic orchestras of GeoWeb platform services. In principle,
orchestra instances can be created in many ways, based on the current state of the services
and the requirements of the user. Some of them are briefly described in the paper. The most
flexible way is based on metadata describing the geodata used by the services. Currently the
most widely used standard within the EU is ISO 19115. The paper should describe whether the
standard is able (without extensions) to hold all the items necessary for orchestration. The
paper defines a minimal set of metadata items, named "ISO 19115 Orchestration Minimal",
which is necessary for assessing geodata in the process of orchestration. The second part of
the paper will probably be less optimistic, because it deals with the real possibilities of
exploiting the potential of the ISO 19115 standard for orchestration within the Czech
Republic. This part is based on an analysis of ISO 19115 core, the MIDAS system, Dublin Core
and the INSPIRE metadata IR.
Orchestras
Orchestration is a process in which (real or abstract) processes are modelled by means of a
formalized description. Process modelling is a technique that uses several description tools,
mainly schemas or diagrams, to describe processes, usually real ones, inside an enterprise.
The processes can span several organizations.
A model of a process is transformed from abstract languages (BPMN (Business Process
Modelling Notation), UML (Unified Modelling Language)) into a form that can be run directly
on a computer. The best known language for such runnable process models is BPEL
(Business Process Execution Language). Running a process means reading inputs, invoking web
services, deciding according to results, repeating some parts of the process and other necessary
operations.
Process modelling makes it possible to describe processes inside an enterprise formally,
to find duplicate processes, to find processes that are not optimised, etc. It helps with
the optimisation of processes and of resource management. Where possible, the description
is then available in the form of a BPEL-like language and the processes can
be invoked directly.
GeoWeb services orchestration can be done in many ways. The GA 205/07/0797 team has
researched two possible ways of orchestration.
Simple orchestras
The first way is based on orchestras where the services searched for while an orchestra
instance is being built use the same data sources and the same algorithms for manipulating
the data source and the inputs. The content of the data source can differ only in the
spatio-temporal extent of the working area. We can speak about replication of services
(or distribution in a horizontal plane). The instances of the services that are connected
to the orchestra are selected according to the current state of the services, such as
performance, speed or provider.
These services differ in their physical binding. This kind of orchestra is focused on
optimising the run of the orchestra and needs no specific manipulation. It is only necessary
to identify identical services using some key. For our testing purposes we use a common
identification based on the identification of the standardisation organisation, of the
standard and of the service. Such an identification is shown in the following example:
http://gis.vsb.cz/ogc/wms/1.1.1/ZABAGED/0.1. The items are given by the parts of the URL.
The first item is the domain of the guarantor of the service type. The second item is the
abbreviation of the name of the standardisation organisation. The third item is the
abbreviation of the name of the standard. The fourth item is the version of the standard.
The fifth item is the abbreviation of the service. The last item is the version of the
service type.
This type of orchestra is simpler to manage than the second one.
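The structure of such an identifier can be made explicit with a small parser. The class and function names below are hypothetical and chosen only for this sketch; the item order follows the description above.

```python
from typing import NamedTuple
from urllib.parse import urlsplit


class ServiceType(NamedTuple):
    guarantor: str          # domain of the guarantor of the service type
    organisation: str       # abbreviation of the standardisation organisation
    standard: str           # abbreviation of the standard
    standard_version: str   # version of the standard
    service: str            # abbreviation of the service
    service_version: str    # version of the service type


def parse_service_type(identifier):
    """Split a service type identifier into its six items."""
    parts = urlsplit(identifier)
    items = [p for p in parts.path.split("/") if p]
    if not parts.netloc or len(items) != 5:
        raise ValueError("expected domain plus five path items: " + identifier)
    return ServiceType(parts.netloc, *items)


KEY = parse_service_type("http://gis.vsb.cz/ogc/wms/1.1.1/ZABAGED/0.1")
```

Two service instances whose identifiers parse to the same ServiceType could then be treated as replicas of each other when an orchestra instance is built.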
Dynamically created orchestras
The second way is based on orchestras where the instances of the services need only be
similar to each other in terms of data sources and algorithms. For example, we can use a
service based on a railway data source where tracks are just simple lines between stations,
or a service based on a railway data source where tracks are modelled by their real headway.
We can switch between these sources in many cases, such as routing (finding the best routes)
where the main routing parameter is time. This type of orchestra is more difficult to
manage than the first one.
Our research shows that the first type of orchestra will usually be used, but there are still
situations in which an orchestration system should be able to prepare the second type of
orchestra. There are two ways to handle this problem.
The first solution is simple, but difficult to manage in the long term, because
it is static rather than dynamic. There must be a simple database (no matter how
it is organised, relational or XML) in which the relations between data sources (services)
are defined. Related services can be called a group of similar services.
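The static solution amounts to a lookup table of groups. In the sketch below a plain dictionary stands in for the relational or XML database, and the group name, service identifiers and function name are hypothetical.

```python
# Hypothetical groups of similar services; a real system would keep these
# relations in a relational or XML database.
SIMILAR_GROUPS = {
    "railways": {
        "http://example.org/wms/railways-simple",
        "http://example.org/wms/railways-detailed",
    },
}


def alternatives(service_id, groups):
    """Return the other members of every group that contains service_id."""
    found = set()
    for members in groups.values():
        if service_id in members:
            found |= members - {service_id}
    return sorted(found)
```

The weakness mentioned in the text is visible here: every new service has to be added to the table by hand before it can be offered as an alternative.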
The second solution is based on evaluating data sources by analysing their metadata. This
article describes why this way is so complicated and probably impossible.
Metadata items useful for data evaluation
When searching for available services for building dynamic orchestras, we are looking
for similar data sources. First of all we have to specify the metadata items that can be
used to evaluate whether the data are similar enough for our orchestra.
There are many different standards in this area that define metadata items, but nowadays
probably the most important one is ISO 19115 (ISO 19139). For our research we identify only
items from this standard.
We can name this set of items ISO 19115 Orchestration Full. The Minimal set of items
that are necessary for running similarity tests is described later.
Administrative metadata
Item | Description of usage and problems
--- | ---
MD_Metadata/dateStamp | Date that the metadata was created. Useful for evaluation of metadata reliability.
MD_Metadata/metadataMaintenance | Frequency and scope of metadata updates. Useful for evaluation of metadata reliability.
MD_Identification/resourceMaintenance | Frequency and scope of data updates. Individual items are described later.
MD_MaintenanceInformation/: maintenanceAndUpdateFrequency, userDefinedMaintenanceFrequency, updateScope, updateScopeDescription | Only supplemental information, but useful when information about temporal extent is not available.
MD_ReferenceSystem | A reference system is not necessary for analyses, but for using the service. Usually we have enough information in the EPSG code that is included in the metadata for a service, but sometimes a full description is necessary.
Table 1: Administrative metadata items from ISO 19115 Orchestration Full
Quality metadata
Item | Description of usage and problems
--- | ---
MD_DataIdentification/spatialResolution | Density of spatial data. Very useful.
MD_Resolution/: equivalentScale, distance | We can use both options of the resolution, but the distance is easier to evaluate.
MD_Metadata/dataQualityInfo | Quality of a resource. Individual items are described later.
DQ_DataQuality | Very important item. Its items (associations) are described later.
LI_Lineage/: statement, processStep, source | Very useful items, but unfortunately only a simple table of items and the free text domain is used. Very difficult to handle free text for automatic evaluation. Only the items for defining the source are not described by free text alone, but this is not enough.
DQ_Element/: nameOfMeasure, measureIdentification, measureDescription, evaluationMethodType, evaluationMethodDescription, evaluationProcedure, dateTime, result | This abstract element should be completely included. Of course the main item is result, described later.
DQ_Result/DQ_ConformanceResult/: specification, explanation, pass; DQ_Result/DQ_QuantitativeResult/: valueType, valueUnit, errorStatistic, value | These items are quite well defined and useful for evaluation. Even the domains are good enough for automatic evaluation.
DQ_Completeness/: DQ_CompletenessCommission, DQ_CompletenessOmission | Described by DQ_Element.
DQ_PositionalAccuracy/: DQ_AbsoluteExternalPositionalAccuracy, DQ_GriddedDataPositionalAccuracy, DQ_RelativeInternalPositionalAccuracy | Described by DQ_Element.
DQ_TemporalAccuracy/: DQ_AccuracyOfATimeMeasurement, DQ_TemporalConsistency, DQ_TemporalValidity | Described by DQ_Element.
DQ_ThematicAccuracy/: DQ_ThematicClassificationCorrectness, DQ_NonQuantitativeAttributeAccuracy, DQ_QuantitativeAttributeAccuracy | Described by DQ_Element.
Table 2: Quality metadata items from ISO 19115 Orchestration Full
Usage metadata
Item | Description of usage and problems
MD_Identification/resourceSpecificUsage | Specific applications for which the resource was used.
MD_Usage/specificUsage, userDeterminedLimitations | Very useful item, but unfortunately only the free text domain is used. Very difficult to handle free text for automatic evaluation.
MD_Identification/resourceConstraints | Constraints on a resource. Individual items are described later.
MD_Constraints/useLimitation | Very useful item, but unfortunately only the free text domain is used. Very difficult to handle free text for automatic evaluation.
MD_LegalConstraints/accessConstraints, useConstraints, otherConstraints | Very useful items, but unfortunately only simple table of items and the free text domain is used. Very difficult to handle free text for automatic evaluation. Information that there is copyright or license is not very useful for evaluation, if the resource can be used in orchestration.
MD_SecurityConstraints/classification, userNote, classificationSystem, handlingDescription | Useful only in some very specific applications. Only simple table of items and the free text domain is used. Very difficult to handle free text for automatic evaluation.
Table 3: Usage metadata items from ISO 19115 Orchestration Full
Extent metadata
Item | Description of usage and problems
MD_DataIdentification/extent; EX_Extent/description, geographicElement, temporalElement, verticalElement; EX_GeographicExtent/extentTypeCode; EX_BoundingPolygon/polygon; EX_GeographicBoundingBox/westBoundLongitude, eastBoundLongitude, southBoundLatitude, northBoundLatitude; EX_GeographicDescription/geographicIdentifier; EX_TemporalExtent/extent; EX_VerticalExtent/minimumValue, maximumValue, unitOfMeasure, verticalDatum | Spatio-temporal extent. For geographic extent, a polygon is preferred to a bounding box.
Table 4: Extent metadata items from ISO 19115 Orchestration Full
Content and structure metadata
Item | Description of usage and problems
MD_DataIdentification/spatialRepresentationType | Method used for spatial representation. The list of available items is very simple. We can use it only to distinguish between raster and vector. The other items described later must be used for better evaluation.
MD_DataIdentification/language | Language used within the dataset. Necessary for evaluation. We can usually use a dataset in a different language only when dealing solely with geometry or topology.
MD_DataIdentification/topicCategory | Main theme of the dataset. Not very useful, but can be used for basic evaluation.
MD_Keywords/keyword, Type, ThesaurusName | More useful than topicCategory for basic evaluation.
MD_GridSpatialRepresentation/numberOfDimensions, axisDimensionsProperties, cellGeometry; MD_Dimension/dimensionName, dimensionSize, resolution | More precise information about the grid. We can also include MD_Georectified and MD_Georeferenceable, but these are not necessary for analyses.
MD_VectorSpatialRepresentation/topologyLevel, geometricObjects; MD_GeometricObjects/geometricObjectType, geometricObjectCount | More precise information about vector data. The number of objects can be significant for analyses of similarity.
MD_FeatureCatalogueDescription/featureTypes, featureCatalogueCitation | Information about the feature catalogue used and the selected set of features from the catalogue.
MD_CoverageDescription/attributeDescription, contentType, dimension | Information about values in grid data cells.
MD_ImageDescription/illuminationElevationAngle, illuminationAzimuthAngle, imagingCondition, imageQualityCode, cloudCoverPercentage, processingLevelCode, compressionGenerationQuantity, triangulationIndicator; MD_RangeDimension/sequenceIdentifier, descriptor; MD_Band/maxValue, minValue, units, bitsPerValue, peakResponse, toneGradation, scaleFactor, offset | Information about digital image record.
Table 5: Content and structure metadata items from ISO 19115 Orchestration Full
Minimal set of Metadata items for automatic data evaluation
The following list shows the minimal set of metadata items that must be available to test the similarity of the analysed datasets. We call this set ISO 19115 Orchestration Minimal. Without these items, metadata are not useful for running similarity tests. This recommendation should be applied to all newly created metadata. Items that are generally useful, but whose domain is not suitable for automatic evaluation, are not included. Some of the items are not applicable to all resources (e.g. you cannot specify MD_Band for vector data).
MD_DataIdentification/spatialResolution
MD_Resolution/equivalentScale
MD_Resolution/distance
MD_Metadata/dataQualityInfo
DQ_DataQuality
LI_Lineage/source
DQ_CompletenessCommission/DQ_Element/DQ_Result
DQ_CompletenessOmission/DQ_Element/DQ_Result
DQ_AbsoluteExternalPositionalAccuracy/DQ_Element/DQ_Result
DQ_GriddedDataPositionalAccuracy/DQ_Element/DQ_Result
DQ_RelativeInternalPositionalAccuracy/DQ_Element/DQ_Result
DQ_AccuracyOfATimeMeasurement/DQ_Element/DQ_Result
DQ_TemporalConsistency/DQ_Element/DQ_Result
DQ_TemporalValidity/DQ_Element/DQ_Result
DQ_ThematicClassificationCorrectness/DQ_Element/DQ_Result
DQ_NonQuantitativeAttributeAccuracy/DQ_Element/DQ_Result
DQ_QuantitativeAttributeAccuracy/DQ_Element/DQ_Result
MD_DataIdentification/extent
EX_Extent/geographicElement/EX_BoundingPolygon/polygon
EX_Extent/geographicElement/EX_GeographicBoundingBox
EX_Extent/temporalElement/EX_TemporalExtent/extent
EX_Extent/verticalElement/EX_VerticalExtent
MD_DataIdentification/spatialRepresentationType
MD_DataIdentification/language
MD_DataIdentification/topicCategory
MD_Keywords
MD_Keywords/keyword
MD_Keywords/Type
MD_Keywords/ThesaurusName
MD_GridSpatialRepresentation
MD_GridSpatialRepresentation/numberOfDimensions
MD_GridSpatialRepresentation/axisDimensionsProperties
MD_Dimension/dimensionName
MD_Dimension/dimensionSize
MD_Dimension/resolution
MD_GridSpatialRepresentation/cellGeometry
MD_VectorSpatialRepresentation
MD_VectorSpatialRepresentation/topologyLevel
MD_VectorSpatialRepresentation/geometricObjects
MD_GeometricObjects/geometricObjectType
MD_GeometricObjects/geometricObjectCount
MD_FeatureCatalogueDescription
MD_FeatureCatalogueDescription/featureTypes
MD_FeatureCatalogueDescription/featureCatalogueCitation
MD_CoverageDescription
MD_CoverageDescription/attributeDescription
MD_CoverageDescription/contentType
MD_CoverageDescription/dimension
MD_RangeDimension/sequenceIdentifier
MD_RangeDimension/descriptor
MD_Band
MD_Band/maxValue
MD_Band/minValue
MD_Band/units
MD_Band/bitsPerValue
MD_Band/peakResponse
MD_Band/toneGradation
MD_Band/scaleFactor
MD_Band/offset
MD_ImageDescription
MD_ImageDescription/illuminationElevationAngle
MD_ImageDescription/illuminationAzimuthAngle
MD_ImageDescription/imagingCondition
MD_ImageDescription/imageQualityCode
MD_ImageDescription/cloudCoverPercentage
MD_ImageDescription/processingLevelCode
MD_ImageDescription/compressionGenerationQuantity
MD_ImageDescription/triangulationIndicator
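As a minimal sketch of how such a list could be applied, the following Python snippet reports which required items are missing from a metadata record. The flat dictionary used to represent a record, the function name and the shortened list of required paths are assumptions made for illustration only; they are not defined by ISO 19115 or by the research described here.

```python
# Hypothetical, shortened subset of the ISO 19115 Orchestration Minimal paths.
REQUIRED_PATHS = [
    "MD_DataIdentification/spatialResolution",
    "MD_DataIdentification/extent",
    "MD_DataIdentification/language",
    "MD_DataIdentification/topicCategory",
    "LI_Lineage/source",
]


def missing_items(record, required=REQUIRED_PATHS):
    """Return the required paths that are absent or empty in a flat record."""
    return [path for path in required
            if record.get(path) in (None, "", [], {})]


# Example record (assumed structure: element path -> value).
record = {
    "MD_DataIdentification/language": "cze",
    "MD_DataIdentification/topicCategory": "elevation",
    "MD_DataIdentification/extent": {"west": 12.1, "east": 18.9,
                                     "south": 48.5, "north": 51.1},
}

print(missing_items(record))
```

A record for which the function returns an empty list would, under these assumptions, be a candidate for automatic similarity tests.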
Expected metadata extent
The set of items defined above as ISO 19115 Orchestration Minimal will probably not be generally available in the future. We can expect that only a few closed communities, e.g. companies, will be able to describe all their resources at this level of detail. In general, we can expect that available metadata will never be this detailed.
We can expect that metadata available in the Czech Republic will be prepared at several levels of detail. These levels must be known for geodata evaluation.
They are:
• metadata according to INSPIRE IR (INSPIRE, 2007),
• metadata according to ISO 19115 core (ISO/TC 211, 2003),
• metadata according to the Dublin Core basic set (DCMI, 2007),
• metadata according to the completeness of the MIDAS database (CAGI, 2007).
Other alternatives are not expected.
Metadata according to INSPIRE
The list of items is taken from the draft implementing rules (INSPIRE, 2007).
Level 1 is the basic level, which will always be required (unless a conditional rule defines different options).
• Resource title.
• Temporal reference – when the information is meaningful.
• Geographic extent of the resource.
• Resource language – when text is used.
• Resource topic category.
• Keyword.
• Service type – in the case of a service.
• Resource responsible party.
• Abstract.
• Resource locator – if any reference exists.
The second level is an extended level, and we cannot expect full implementation of this level for all catalogues (datasets or services).
• Constraints.
• Lineage.
• Conformity.
• Service type version – in the case of a service.
• Operation name – in the case of a service.
• Distributed computing platform – e.g. Web Services.
• Resource identifier – e.g. URI.
• Spatial resolution.
INSPIRE specifies other metadata elements that can be available, but their usage by data (service) providers is doubtful. The same problem applies to the second level of metadata, whose usage depends on the provider's decision. We can expect only the following items: resource title, geographic extent of the resource, resource language, resource topic category, keyword, resource responsible party, abstract and in some cases temporal reference. That level of detail is not enough for orchestration, but it can be used for a basic selection of services.
Metadata according to ISO 19115 core
ISO 19115 core is more detailed than the INSPIRE requirements and will be better applicable to orchestration. However, it still lacks, for example, quality reports. Items in the core are Mandatory (M), Conditional (C) or Optional (O).
• Dataset title (M)
• Dataset reference date (M)
• Dataset responsible party (O)
• Geographic location of the dataset (by four coordinates or by geographic identifier) (C)
• Dataset language (M)
• Dataset character set (C)
• Dataset topic category (M)
• Abstract describing the dataset (M)
• Distribution format (O)
• Additional extent information for the dataset (vertical and temporal) (O)
• Spatial resolution of the dataset (O)
• Spatial representation type (O)
• Reference system (O)
• Lineage (O)
• On-line resource (O)
• Metadata file identifier (O)
• Metadata standard name (O)
• Metadata standard version (O)
• Metadata language (C)
• Metadata character set (C)
• Metadata point of contact (M)
• Metadata date stamp (M)
Metadata according to Dublin Core
Dublin Core is a general standard and can be used to define one's own items, but we cannot expect that providers will use such capabilities. They will probably use only the simple list of metadata items.
• Title
• Creator
• Subject
• Description
• Publisher
• Contributor
• Date
• Type
• Format
• Identifier
• Source
• Language
• Relation
• Coverage
• Rights
Metadata according to MIDAS database completeness
We have analysed the MIDAS database, and we can probably expect the same provider behaviour in the future. The following table categorises metadata items according to their completeness in the MIDAS database. The MIDAS system contains metadata about 3400 datasets.
Mandatory and conditional items were always filled in (this was enforced by the system). Optional items were filled in when a list of options was available. The completeness of alternate title, temporal extent (date from), reference data and dataset usage is very interesting. Quality elements (except lineage) are largely neglected.
Completeness | Metadata items
80 – 100 % | Title, abstract, coordinate system for metadata, metadata update, spatial schema, lineage, horizontal spatial accuracy, update frequency, data structure, format, language, classification, direct coordinate system, responsible party.
60 – 80 % | Alternate title, temporal extent (date from), planar extent (by coordinates), reference data.
40 – 60 % | Dataset usage
20 – 40 % | Memo, planar extent (by description)
5 – 20 % | Abbreviated title, version, purpose of production, temporal extent (by description), metadata language, spatial coverage, scale, temporal extent (date to).
< 5 % | English title, English abstract, update date, fees, metadata update plan, vertical spatial accuracy, logical consistency, completeness, homogeneity, resolution, quality, vertical extent, distribution units, medium, indirect reference system, vertical reference system, features description
Table 6: Completeness of the metadata items in the MIDAS database
Comparison to ISO 19115 Orchestration Minimal
ISO 19115 Orchestration Minimal | INSPIRE | ISO 19115 core | Dublin Core
MD_Resolution | + | – | –
LI_Lineage/source | + | + | +
DQ_CompletenessCommission | – | – | –
DQ_CompletenessOmission | – | – | –
DQ_AbsoluteExternalPositionalAccuracy | – | – | –
DQ_GriddedDataPositionalAccuracy | – | – | –
DQ_RelativeInternalPositionalAccuracy | – | – | –
DQ_AccuracyOfATimeMeasurement | – | – | –
DQ_TemporalConsistency | – | – | –
DQ_TemporalValidity | – | – | –
DQ_ThematicClassificationCorrectness | – | – | –
DQ_NonQuantitativeAttributeAccuracy | – | – | –
DQ_QuantitativeAttributeAccuracy | – | – | –
EX_BoundingPolygon | + | + | +
EX_GeographicBoundingBox | + | + | +
EX_TemporalExtent | + | + | +
EX_VerticalExtent | + | + | +
SpatialRepresentationType | – | – | –
Language | + | + | +
TopicCategory | + | + | +
MD_Keywords | + | – | +
MD_GridSpatialRepresentation | – | – | –
MD_VectorSpatialRepresentation | – | – | –
MD_FeatureCatalogueDescription | – | – | –
MD_CoverageDescription | – | – | –
MD_ImageDescription | – | – | –
MIDAS* (the row positions of the marks in this column are not recoverable from the source layout): +, +**, +, +, +, +, +, +, + within the first 21 rows; +**, +, - within the last five rows.
Table 7: Comparison to ISO 19115 Orchestration Minimal
* Items completed in over 60% of records have been included
** Partly
The following table shows the percentage of the items that will probably be included according to the selected standard, directive or system.
Standard, directive, system | Percent of the ISO 19115 Orchestration Minimal items available
INSPIRE | 34
ISO 19115 Core | 27
Dublin Core | 31
MIDAS | 42
Table 8: Percent of the ISO 19115 Orchestration Minimal items available
Conclusion
The results of the research are not very optimistic, because in no realistic case can we expect metadata to be detailed enough for efficient orchestration. Building orchestras dynamically requires alternative ways of evaluating the served geodata.
According to the results of our research, we have decided to use metadata for geodata, but not as the only source for geodata evaluation. We are preparing a methodology for this evaluation.
The basic principles of the methodology are summarised in the following points:
• If possible, use simple orchestras
• Do not base the creation of groups of similar services on metadata for geodata
• Use experts' evaluation of the orchestras' results to create groups of similar services
• Update the groups of similar services according to the evaluation of new results
• Evaluate the results of simple orchestras as well
If you are interested in the methodology under preparation, please read the article that will be published in the proceedings of the symposium GIS Ostrava 2009.
References
CAGI. (2007). MIDAS. 2001-2007. At http://gis.vsb.cz/midas/, [accessed 2 July 2007].
DCMI. (2007). Dublin Core Element Set v. 1.1 – Reference Description, online¹, [accessed 12 April 2007].
INSPIRE. (2007). DT Metadata – Draft Implementing Rules for Metadata, online², [accessed 12 April 2007].
ISO/TC 211. (2003). ISO/FDIS 19115:2003. ISO/TC 211 Secretariat, Oslo, Norway, 152 p.
Růžička, J., Kaszper, R. Opět o metadatech v geoinformatice. Proceedings 1. národní kongres v Česku – Geoinformatika pro každého, May 29-31 2007, Mikulov, Czech Republic, online³, [accessed 2 July 2007].
Support
The article is supported by the Grant Agency of the Czech Republic (GACR) as a part of the project GA 205/07/0797 GeoWeb services orchestration. The article is supported by the open source community as well. We have used the open source projects GeoNetwork Open Source, WSCO, Apache Tomcat, Jetty, Open Office, GIMP, Dia, PostGIS, PHP, PostgreSQL, Apache HTTP Server, GNU/Linux Ubuntu, GNU/Linux Debian, X11, MySQL, Freefont and others for this article.
¹ http://dublincore.org/documents/dces/
² http://www.ec-gis.org/inspire/reports/ImplementingRules/draftINSPIREMetadataIRv2_20070202.pdf
³ http://mikadapress.com/prednasky/Ruzicka.pdf
Deriving Hydrological Response Units
(HRUs) using a Web Processing Service
implementation based on GRASS GIS
Christian Schwartze
Department of Geography – Chair of Geoinformatics, Geohydrology and Modelling
University Jena
[email protected]
Keywords: QGIS, GRASS, WPS, PyWPS, Web Processing Service, Python, HRU, Hydrological Response Units
Abstract
QGIS releases equal to or newer than 0.7 can easily be connected to GRASS GIS by means of a toolbox that provides a wide range of standard GRASS modules you can launch – albeit only on data coming from GRASS. This QGIS plugin is expandable through XML configurations describing the assignment of options and inputs for a certain module. But how about embedding a precise workflow in which the individual processes do not necessarily consist of a single GRASS module? Especially for a sequence of dependent tasks it makes sense to merge the relevant GRASS functionality into a separate, encapsulated QGIS extension. Its architecture and development are tested and combined with the Web Processing Service (WPS) for remote execution, using the concept of hydrological response units (HRUs) as an example. The results of this study may be suitable for discussing and planning other wizard-like geoprocessing plugins in QGIS that should also make use of an additional GRASS server.
Brief background
Hydrological Response Units may be considered as spatial entities intended for use in water modelling. The delineation of such regions, as assumed in the present work, is based on physiographical characteristics of the catchment area [2] and aims at partitioning it into zones that are similar in terms of both topography and dynamics. For further information, such as various extensions, refer to e.g. [1] and [5]. Details and sub-steps of the derivation used by the planned tool are discussed in section 4.
Architecture
Because a complete HRU derivation consists of many tasks, it was decided to split it into modules developed as processes for PyWPS 2.0.1 [8]. To meet the requirements of a client/server system, albeit in this case with all components (including WPS) running on a single machine, a user-friendly client that runs the individual tasks sequentially is appropriate and has to be developed. QGIS was chosen for this purpose, not only on account of its Python scripting support, but also because of its very good GIS visualization capabilities and basic spatial tools. As PyWPS comes with native GRASS support, all HRU-relevant computation is done by GRASS, here version 6.2.2. The plugin also benefits, for example, from the temporary GRASS sessions in PyWPS, since only the important main data are swapped out when an HRU task ends – no extra management of GRASS mapsets is needed. PyWPS thus serves as a kind of middleware between two GIS; in other words, it separates processing from visualization in the HRU tool.
Figure 0: Architecture
Extending QGIS
In order to write a new extension for QGIS [6], you start work in an empty subfolder of /python/plugins/ in your installation directory. The Plugin Manager gets its information about available Python plugins from the __init__.py file, which is created first and is the starting point for all subsequent implementation code. More precisely, the first activation of the plugin by the installation routine results in a call of the classFactory() function, which returns a plugin instance that sets up the toolbar icon, menu entries and other plugin-related control items.
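The entry point described above can be sketched as follows. This is only an illustration under stated assumptions: the class name HruPlugin and the FakeIface stand-in are hypothetical, and a plain string replaces the QAction object that a real plugin would register. The method names classFactory(), initGui(), unload(), addToolBarIcon() and removeToolBarIcon() follow the usual QGIS Python plugin interface.

```python
class HruPlugin:
    """Plugin object; QGIS calls initGui() on activation and unload() on removal."""

    def __init__(self, iface):
        self.iface = iface   # QGIS interface object handed over by classFactory()
        self.action = None

    def initGui(self):
        # A real plugin would create a QAction here; a string stands in for it.
        self.action = "HRU derivation"
        self.iface.addToolBarIcon(self.action)

    def unload(self):
        self.iface.removeToolBarIcon(self.action)


def classFactory(iface):
    """Entry point that QGIS looks up in the plugin's __init__.py."""
    return HruPlugin(iface)


class FakeIface:
    """Hypothetical stand-in for the QGIS interface, so the sketch runs outside QGIS."""

    def __init__(self):
        self.toolbar = []

    def addToolBarIcon(self, action):
        self.toolbar.append(action)

    def removeToolBarIcon(self, action):
        self.toolbar.remove(action)


iface = FakeIface()
plugin = classFactory(iface)
plugin.initGui()
print(iface.toolbar)
```

Inside QGIS the interface object is supplied by the application, so only classFactory() and the plugin class would be needed.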
The sample HRU plugin
Adaptability of the plugin options and functionality was a main focus during development. Later changes and improvements in the HRU derivation process should be easy to integrate. Hence, a module concept was designed, and the phases of the current HRU workflow were mapped onto ready-to-use components instantiated through Python classes. If you want to write an extension for the HRU derivation plugin, you have to become acquainted with the abstract Python class HRUModule. A module designed for the process chain has to be a subclass of HRUModule and has to implement four common functions:
• SetInput() specifies the layout of a tabbed widget and arranges the necessary input forms.
• Validate() collects the relevant module input parameters, checks them and formats them into a valid PyWPS parameter string.
• UpdateWizard() manages the module's impact on any other tabbed widget within the plugin, e.g. enabling subsequent wizard tabs, filling out forms or predefining options in upcoming tasks.
• UpdateMapView() handles modifications that concern the visualisation of map layers and linked legend entries in QGIS.
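The four functions listed above suggest an interface along the following lines. This is a sketch, not the plugin's actual code: only the class name HRUModule and the four method names come from the text, while the method arguments, the DemoModule subclass and the layout of the parameter string are assumptions chosen for illustration.

```python
from abc import ABC, abstractmethod


class HRUModule(ABC):
    """Sketch of the abstract module class; method names follow the text."""

    @abstractmethod
    def SetInput(self):
        """Arrange the input forms of the module's wizard tab."""

    @abstractmethod
    def Validate(self):
        """Check the inputs and return them as a parameter string."""

    @abstractmethod
    def UpdateWizard(self, wizard):
        """Adjust other tabs, e.g. enable the next step."""

    @abstractmethod
    def UpdateMapView(self, map_view):
        """Adjust map layers and legend entries."""


class DemoModule(HRUModule):
    """Hypothetical concrete module used only to exercise the interface."""

    def SetInput(self):
        self.inputs = {"dem": "dem_filled", "threshold": "1000"}

    def Validate(self):
        empty = [name for name, value in self.inputs.items() if not value]
        if empty:
            raise ValueError("missing input(s): %s" % ", ".join(empty))
        # Separator and key=value layout are assumptions, not the PyWPS format.
        return ";".join("%s=%s" % item for item in sorted(self.inputs.items()))

    def UpdateWizard(self, wizard):
        wizard["next_tab_enabled"] = True

    def UpdateMapView(self, map_view):
        map_view.append("demo result layer")


module = DemoModule()
module.SetInput()
print(module.Validate())
```

Declaring the four methods as abstract means that a subclass which forgets one of them cannot be instantiated, so an incomplete module fails early rather than in the middle of the wizard sequence.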
The individual processes were implemented according to the guidelines in [4]. The HRU derivation was divided into logical units, which resulted in seven module classes. Once coded, you can integrate such modules using the statement
self.wizard.addTab(WaterFlowModule(), WaterFlowModule.MODULE_ICON, WaterFlowModule.MODULE_TAB_DEF)
which embeds a tab in the wizard whose initial state is disabled until another module releases it. This guarantees the correct order of the derivation steps; however, a return to steps already performed is possible at any time. Stepping back is particularly useful for testing the influence of various input parameters. In PyWPS [8] each process stores its assigned and calculated data in GRASS mapsets that do not outlive the end of the process. That means a series of n PyWPS tasks is instantiated along with n temporary mapsets whose names follow the pattern tmpmapset<x>.
Although all processes could alternatively be handled in a single persistent, already existing GRASS location/mapset, the temporary variant has been used. Each process implementation therefore ends with lines containing some g.copy calls. The advantage is that no interim result ever belongs to the user's location; it is removed at the end of the WPS process. When a process is triggered twice (or several times), the GRASS data are simply overwritten by the WPS process while copying them to the persistent mapset.
The workflow in more detail
All the processes explained in the following subsections have something in common: their results are relocated from a process-owned temporary mapset to a persistent mapset inside a predefined GRASS location. The (estimated) computing time stored in the process code proves helpful for users while they track the execution in the wizard (see the progress bar).
Preparation
The QGIS/GRASS based HRU derivation starts with an option dialogue in which you have to specify the essential data, including the digital elevation model (DEM), region characteristics (land use, soils and geology) and the locations of gauges. The former are all raster maps, whereas the latter is usually imported as a shapefile. To minimize the computational effort in the pending tasks, users have to drag a bounding box, keeping the rough catchment area in mind. The underlying WPS process produces a subimage of each stated data layer using GDAL/OGR and imports them into a locally installed GRASS location.
Another preprocessing task, integrated into the wizard sequence as a separate module, processes the DEM to obtain a depressionless elevation model (see the Preparation tab next to Setup, still disabled and not explicitly highlighted in the screenshot in figure 1). That is, another WPS process is triggered that not only runs r.fill.dir multiple times but also provides the slope and aspect of the area.
Figure 1: Setup module
Reclassification
As long as slope, aspect and sinkless elevation data are represented by real-life surface values (gathered by whatever measuring method), an intersection between them is hard to handle. For that reason the reclassification module expects rules defining classes of categories, entered in three respective tables (figure 2). Recommended ranges may be accepted or changed. Internally, typical GRASS rule files are written and serve as input for r.reclass.
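How the table entries might be turned into such a rule file is sketched below. The function name, the class boundaries and the labels are hypothetical; only the general "low thru high = category label" form of r.reclass rules is taken from GRASS. Note that r.reclass works on integer category values, so the ranges here are integers.

```python
def reclass_rules(classes):
    """Format (low, high, new_category, label) tuples as r.reclass rule lines."""
    lines = []
    for low, high, category, label in classes:
        lines.append("%d thru %d = %d %s" % (low, high, category, label))
    return "\n".join(lines) + "\n"


# Hypothetical slope classes in degrees, as they could be entered in the table.
slope_classes = [
    (0, 5, 1, "flat"),
    (6, 15, 2, "moderate"),
    (16, 90, 3, "steep"),
]

rules = reclass_rules(slope_classes)
print(rules)
```

The resulting text could then be written to a file and passed to r.reclass, which in GRASS 6 typically reads its rules from standard input.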
Generation of waterflow related maps
Figure 2: Reclassification module
In the next step you have to make a set of water flow oriented maps available (figure 3). This includes the drainage direction, the accumulation and the location of watershed basins. An additional raster map has to show the segmented stream network (so-called "reaches"). One GRASS analysis tool covers the computation of all desired maps in a single command: r.watershed. Unlike in many other WPS processes, this almost elementary case leads to a quite concise task description in Python.
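A sketch of how such a single r.watershed call could be assembled is given below. The helper function, the map names and the threshold value are assumptions for illustration; the option names elevation, threshold, accumulation, drainage, basin and stream are those of the r.watershed module. The command is only built and printed here, since running it requires an active GRASS session.

```python
def watershed_command(elevation, threshold, prefix="hru"):
    """Build an r.watershed call that writes all four water flow maps at once."""
    return [
        "r.watershed",
        "elevation=%s" % elevation,
        "threshold=%d" % threshold,            # minimum size of a basin in cells
        "accumulation=%s_accum" % prefix,
        "drainage=%s_drain" % prefix,
        "basin=%s_basin" % prefix,
        "stream=%s_stream" % prefix,
    ]


cmd = watershed_command("dem_filled", 1000)
print(" ".join(cmd))
# Inside a GRASS session the list could be passed to subprocess.call(cmd).
```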
Speaking about watershed basins means distinguishing between the type of basin derivation defined by r.watershed and the one given by r.water.outlet. The latter GRASS module determines a basin when you pass a geographic coordinate, e.g. a gauge position. Using r.water.outlet in a further WPS process, together with a well-placed overlay statement inside the loop over the gauges, yields a gauge-oriented basin map. For accurate results you will probably have to move gauges onto reaches manually, but this can be done quickly since QGIS offers a vector data editing mode (figure 3, right).
Overlay strategy
The fifth step of the wizard (figure 5) serves as a special intersection operation between currently eight preset or calculated raster maps. These include the reclassified DEM, slope and aspect data as well as soils, land use and geology information. In addition, the watershed basin map and the basins relative to gauges in the catchment are required. The idea is shown in figure 4a and consists of the following steps:
1. Load the gauge basin map from subsection 4.3 as a reference map for the spatial extent of the resulting HRU dataset and construct a map that masks out the relevant area
2. Join the mask and the above-mentioned data layers separately using r.patch and apply r.null to redefine the null value in the new masked datasets
3. Merge the non-zero data in the eight maps of (2) via r.cross into a single map
4. Make use of r.clump to relabel occurrences of non-adjacent regions which still have the same category
Figure 3: Water flow module
Figure 4a: Overlay method
This procedure does not yet result in final HRUs, since many spurious, tiny areas may occur. Eliminating these almost pixel-sized intersection snippets and reallocating them is an essential part of the postprocessing. For vector data, v.clean with the correct parameters does the job. The same is true for r.reclass.area on raster maps, but with the limitation that the respective areas are filled with the GRASS no-data cell value. Filling them by taking nearby areas into account is one solution discussed in [3]. The following script operates in a similar way:
1. Detect areas which are smaller than a specific threshold, e.g. 28125 m² (= 45 pixels, 25 m resolution assumed)
2. While such areas exist do:
(a) Get the one-pixel-wide boundary of each area and fill the interior with NULL
(b) For every pixel on the boundary do:
i. Reassign the category value with the largest occurrence in the 3x3 neighbourhood (corresponds to the mode value)
ii. Mark the remaining NULL values as removable, minimal areas (pink colored in 2 and 3, figure 4b)
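The idea behind these steps can be illustrated with a small pure-Python sketch on a list-of-lists raster. It is a simplified variant, not the script used in the plugin: small regions are set to NULL (None) as a whole and then refilled layer by layer from the outside with the mode of the non-NULL values in the 3x3 neighbourhood. The function names, the 4-connectivity used to detect regions and the example grid are assumptions.

```python
from collections import Counter, deque


def regions(grid):
    """Yield lists of cells forming 4-connected regions of equal category."""
    rows, cols = len(grid), len(grid[0])
    seen = set()
    for r in range(rows):
        for c in range(cols):
            if (r, c) in seen:
                continue
            seen.add((r, c))
            cells, queue = [], deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                cells.append((y, x))
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and (ny, nx) not in seen
                            and grid[ny][nx] == grid[y][x]):
                        seen.add((ny, nx))
                        queue.append((ny, nx))
            yield cells


def clean(grid, min_cells):
    """Dissolve regions smaller than min_cells into their neighbours."""
    grid = [row[:] for row in grid]            # work on a copy
    rows, cols = len(grid), len(grid[0])
    for cells in list(regions(grid)):
        if len(cells) < min_cells:
            for y, x in cells:
                grid[y][x] = None              # NULL marks cells to reallocate
    while any(v is None for row in grid for v in row):
        updates = {}
        for y in range(rows):
            for x in range(cols):
                if grid[y][x] is not None:
                    continue
                votes = Counter(
                    grid[ny][nx]
                    for ny in range(max(0, y - 1), min(rows, y + 2))
                    for nx in range(max(0, x - 1), min(cols, x + 2))
                    if grid[ny][nx] is not None)
                if votes:                      # cell touches a valid neighbour
                    updates[(y, x)] = votes.most_common(1)[0][0]
        if not updates:
            break                              # nothing left to borrow from
        for (y, x), value in updates.items():
            grid[y][x] = value
    return grid


demo = [
    [1, 1, 1, 2],
    [1, 3, 1, 2],
    [1, 1, 2, 2],
]
cleaned = clean(demo, min_cells=2)
for row in cleaned:
    print(row)
```

With the threshold from the text, min_cells would be 28125 // (25 * 25) = 45 cells. Because each pass only fills NULL cells that touch valid neighbours, a larger snippet is split between the surrounding regions rather than assigned to a single one.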
As indicated in the output map (4, figure 4b), snippets are not simply reallocated to one neighbouring region but rather merge into adjacent areas proportionally. When the superordinate WPS process has finished this cleaning, the HRUs obtain their final form. However, this raises the question of whether the underlying data associated with each HRU are still representative. Because the cleaning algorithm manipulates the original overlay map (see above) depending on the number of eliminated areas and their location relative to each other, any dominant characteristic (e.g. soil type) could change. For this reason a further script takes the regenerated and cleaned HRU map as a kind of template. Based on it, all data layers are checked to determine a potentially new raster category that accounts for the major portion within each HRU. This is done by calling r.statistics with the mode method as aggregation option.
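What the mode aggregation computes can be shown with a few lines of plain Python. This is only an illustration of the principle on small list-of-lists rasters, not a replacement for r.statistics; the function name and the example values are made up.

```python
from collections import Counter


def dominant_category(hru_map, layer):
    """Return {hru_id: most frequent category of `layer` inside that HRU}."""
    counts = {}
    for hru_row, layer_row in zip(hru_map, layer):
        for hru_id, value in zip(hru_row, layer_row):
            if hru_id is None or value is None:
                continue                       # skip NULL cells
            counts.setdefault(hru_id, Counter())[value] += 1
    return dict((hru_id, c.most_common(1)[0][0]) for hru_id, c in counts.items())


# Cleaned HRU map (template) and a soil layer on the same grid.
hru = [
    [1, 1, 2],
    [1, 2, 2],
]
soil = [
    [7, 7, 9],
    [8, 9, 7],
]
print(dominant_category(hru, soil))
```

In this example HRU 1 is dominated by soil category 7 and HRU 2 by category 9, even though both HRUs also contain cells of other categories.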
At the end of the overlay section it is appropriate to store these derived, and probably new, categories as labels of the HRU raster map. A piped combination of r.stats, some awk commands and r.reclass on the cleaned HRU data writes a label entry, separated by vertical bars, that represents the values of the linked data layers:
[...]
# (excerpt; relies on `import os` and on inp_list defined earlier in the script)
# inputs: comma-separated list of data layers (with newly determined raster values)
inputs = inp_list.rstrip(",")
# awk program that emits one reclass rule per line: "<cat> = <cat> <label 1>|<label 2>|..."
awk_cmd = "'{print $1,\" = \",$1,"
for i in range(1, len(inputs.split(","))+1):
    awk_cmd += "$"+str(i*2)+"\"|\""
awk_cmd += "}'"
g_cmd = "r.stats -l input=%s | " % inputs
g_cmd += "awk %s | " % awk_cmd
g_cmd += "r.reclass --o input=%s output=%s_result" % \
    (os.getenv("GIS_OPT_INPUT"), os.getenv("GIS_OPT_INPUT"))
os.system(g_cmd)
[...]
Topological network
Figure 5: Overlay module
While the preceding paragraph created the prerequisites for feeding physiographic properties into a model, the next section focuses on how to include relations between HRUs. It aims at identifying drainage from one HRU into others, and furthermore into streams inside the catchment (routing). The acquisition of the topological sequence is therefore split into two parts and is exemplified by figure 6, where pink lines mark HRU borders:
"HRU to HRU"
1. Use r.mapcalc to derive, in turn,
(a) the borderlines of the HRU map
(b) the drainage direction on the borderlines from (1a) only – see step 1, figure 6
(c) the drainage destination (ID of the HRU) on the borderlines only – see step 2, figure 6
(d) the accumulation data on the borderlines only – see step 3, figure 6
2. Do a non-null overlay (r.cross -z) between the HRU source map and (1c) to hold the
"HRU to HRU" relation as raster labels
3. Use (2) as base map in r.statistics to sum up the accumulation data with regard to one
and the same destination HRU – see step 4, figure 6
4. Finally, overlay again (r.cross) to append the accumulation sums (3) to the "HRU to HRU"
relation map (1c)
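What steps 2 and 3 compute can be sketched in plain Python. This is only an illustration of the aggregation, not the GRASS workflow itself: the function name border_flow_sums and the tuple layout of the border cells are hypothetical.

```python
from collections import defaultdict


def border_flow_sums(cells):
    """Sum accumulation per (source HRU, destination HRU) pair.

    `cells` is an iterable of (source_hru, dest_hru, accumulation) tuples,
    one per border cell. Cells with a null (None) source or destination are
    skipped, mirroring a non-null overlay; cells draining into their own HRU
    are ignored as well (an assumption of this sketch).
    """
    sums = defaultdict(int)
    for src, dst, acc in cells:
        if src is None or dst is None or src == dst:
            continue
        sums[(src, dst)] += acc
    return dict(sums)
```

Each key of the returned dictionary corresponds to one "from HRU, to HRU" relation, and its value to the accumulated amount attached to that relation.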
As is evident, the operations use r.cross twice. Consequently, all required
information about relations within the topological sequences is summarized in HRU raster
labels following the pattern "category <from hru>; category <to hru>; <amount>". This can be
verified by looking into the GRASS category file (/cats subdirectory) of the result
layer:
[...]
2:category 10; category 19; 53
3:category 10; category 20; 14
4:category 17; category 39; 141
[...]
Figure 6: Relation HRU to HRU
According to the first two lines, HRU 10 drains into HRUs 19 and 20 with accumulation values
of 53 and 14, respectively. Using this GRASS category file as input for a small awk script,
the topology information can easily be transformed into a more general format that joins
one-to-many HRU relations into one output row:
[...]
10 19,53 20,14
17 39,141
[...]
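The paper does not list the awk script that performs this join. A rough Python equivalent, assuming the category lines have exactly the form shown above, could look as follows; the names CAT_LINE and join_relations are illustrative only.

```python
import re
from collections import OrderedDict

# Matches e.g. "2:category 10; category 19; 53" (reaches may be negative IDs).
CAT_LINE = re.compile(r"^\d+:category (-?\d+); category (-?\d+); (\d+)$")


def join_relations(cat_lines):
    """Join one-to-many relations into rows '<source> <dest>,<amount> ...'."""
    rows = OrderedDict()
    for line in cat_lines:
        match = CAT_LINE.match(line.strip())
        if not match:
            continue  # skip header lines and anything not matching the pattern
        src, dst, amount = match.groups()
        rows.setdefault(src, []).append("%s,%s" % (dst, amount))
    return ["%s %s" % (src, " ".join(dsts)) for src, dsts in rows.items()]
```

For the three category lines quoted earlier this yields one row for HRU 10 with both destinations (19,53 and 20,14) and one row for HRU 17.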
As mentioned earlier, the topology delineation is separated into two parts. One part has just
been discussed; the other is still outstanding. Instead of draining into neighbouring HRUs,
water may also flow directly into a reach. This implies some changes compared to the prior
approach (in figure 7, blue lines illustrate the stream network):
"HRU to reach"
1. Use r.mapcalc, taking a stream buffer into account, to get the reaches into which the
cells neighbouring a stream flow (see figure 7)
2. Perform nearly the same operations as in "HRU to HRU", beginning with (1d), but
ignore accumulation located exactly on streams
Figure 7: Relation HRU to reach
Since step 1 marks reaches with negative numbers to avoid confusion with HRU identifiers,
the process can carry on by parsing the category file as already done for "HRU to HRU".
The concluding step concatenates both into a final, comprehensive topology report. To this
end, tools from the UNIX command line are employed, for instance sort and join. Only then
can meaningful weights (relative to the total outflow of every HRU) be computed with a
few awk instructions, i.e.:
# awk program (single-quoted for the shell): turns absolute amounts into weights per source HRU
AWK_calc_weights_in_topo = "'BEGIN {print \"#TOPOLOGY N:M * FORMAT: <Source-HRU> <Dest-HRU>| \
<Dest-Reach>;<Rate>[ <Dest-HRU>|<Dest-Reach>;<Rate> ...]\"} \
{for (i = 2; i <= NF; ++i) \
{split($i,a,\",\"); \
sum = sum + a[2]; } \
line = $1; \
printf line\" \"; \
for (i = 2; i <= NF; ++i) \
{split($i,a,\",\"); \
printf a[1]\";\"\"%.3f\"\" \", a[2]/sum; } \
sum=0; \
print \"\"; \
line = \"\";} \
END {}'"
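For readers less familiar with awk, the same weighting can be written as a small Python function. This is a sketch for illustration, not part of the original scripts: the name topology_weights is hypothetical, and the input row is assumed to have the form "<source> <dest>,<amount> ..." with a non-zero total.

```python
def topology_weights(row):
    """Convert '<source> <dest>,<amount> ...' into '<source> <dest>;<rate> ...'.

    Each rate is the amount divided by the total outflow of the source HRU,
    printed with three decimals as in the awk program.
    """
    fields = row.split()
    source, targets = fields[0], fields[1:]
    pairs = [target.split(",") for target in targets]
    total = sum(float(amount) for _, amount in pairs)
    parts = ["%s;%.3f" % (dest, float(amount) / total) for dest, amount in pairs]
    return "%s %s" % (source, " ".join(parts))
```

Unlike the awk version, this sketch does not print a trailing space after the last entry; the rates of one row sum to 1 up to rounding.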
The ultimate result looks like:
[...]
1542 1543;0.640 1655;0.175 1934;0.010 -14;0.004 -12;0.171
1543 875;0.955 1655;0.001 -12;0.044
1547 1165;0.382 1377;0.176 1468;0.029 1629;0.412
1568 1482;1.000
[...]
Figure 8: Topology module
Conclusion
The duration of the whole derivation process in QGIS depends on the size of the subregion
selected during the initial wizard step (setup). The larger the chosen bounding box, the
longer the computing time (see table 1). This is mainly attributed to the water-flow-oriented
section of the wizard, which uses r.watershed in the backend. At the expense of execution
time, this GRASS module yields more accurate maps than r.terraflow [7], which is why it was
preferred. However, it remains to be checked whether the fast version [9] considerably
improves the performance of the watershed basin analysis. Alternatively, the implementation
could be changed to use r.terraflow, provided that its output raster maps are accurate
enough for the HRU derivation. There is also a need to optimize the part of the overlay
algorithm in which the resulting HRUs are relabeled after removing very small areas.
Currently a simple r.reclass statement does the job, but not very fast, which may also
affect the total computation time.
watercourse, gauge | catchment size | number of HRUs | duration
Erlbach, Thieschitz (Thuringia, GER) | 105 km² | 2116 | 12 min
Hasel, Ellingshausen (Thuringia, GER) | 340 km² | 6832 | 45 min
Gera, Erfurt-Möbisburg (Thuringia, GER) | 850 km² | 16696 | 2.5 h
Table 1 – Performance of the HRU derivation in GRASS using the QGIS extension
References
1. Flügel, W.A. (1995): Delineating Hydrological Response Units by Geographical Information
System analyses for regional hydrological modelling using MMS/PRMS in the drainage basin of
the river Bröl, Germany. In: Kalma, J.D. & Sivapalan, M. (1995): Scale Issues in Hydrological
Modelling. 183-194
2. Leavesley, G.H.; Lichty, R.W.; Troutman, B.M.; Saindon, L.G. (1983): Precipitation-Runoff
Modeling System; User's Manual, Denver
3. Neteler, M. and Mitášová, H. (2008): Open Source GIS: A GRASS GIS Approach, Third
Edition, Springer, ISBN 978-0-387-35767-6
4. Pfennig, B.; Fink, M.; Krause, P.; Müller Schmied, H. (2006): Leitfaden für die Ableitung
prozeßorientierter Modelleinheiten (HRU's) für die hydrologische Modellierung
5. Staudenrausch, H. (2001): Untersuchungen zur hydrologischen Topologie von
Landschaftsobjekten für die distributive Flußeinzugsgebietsmodellierung. Dissertationsschrift. Jena
6. http://www.qgis.org/ – QuantumGIS
7. http://grass.itc.it/ – Geographic Resources Analysis Support System
8. http://pywps.wald.intevation.org/ – Python Web Processing Service
9. http://markus.metz.giswork.googlepages.com/r.watershed fast version.tar.gz – Metz, M.
(2008): r.watershed.fast
Toolbar icons for GIS applications
Robert Szczepanek
Institute of Water Engineering and Water Management, Cracow University of Technology
robert.szczepanek iigw.pl
Keywords: icon, GIS, usability, GUI
Abstract
The graphical user interface is an important element of today's software. A discussion of
design aspects of toolbar icons is presented. Three concepts related to GIS applications are
proposed. A preliminary icon set, gis-0.1, oriented towards usability and simplicity, is outlined.
Introduction
Graphical user interfaces (GUI) have become a standard element of desktop applications. Toolbar
icons are probably the most frequently used elements of a GUI. Some of them are universal
(fig.1), some are commonly used in a certain domain (fig.2) and some are application specific
(fig.3).
Fig.1 Universal icons
Fig.2 Domain specific icons – GIS
Fig.3 Application specific icons – QGIS
GIS applications differ and have different interfaces. This is good, because we like diversity.
The philosophy and implementation of GIS functions differ among applications.
But should they really use different symbols for the same objects and actions? Why are traffic
signs (almost) the same in different countries? Shouldn't we try the same in our
domain?
If you feel familiar with GIS applications, try a short quiz¹ by Karsten Berlin [1]. As will
be shown later, even simple icons like import and export can be misunderstood. My proposal
is to shift the icon learning curve from the application-specific group to the domain one (fig.4).
This is more a matter of symbology than of final visual implementation, so every GIS application
can keep its identity untouched. I don't intend to present "the only right" solution, but rather
to add my voice to the discussion.
Fig.4 Icon learning curve
Behind the scene – meaning of words and symbols
Let's start from the very beginning. Analyzing different applications, I found that simple
operations like add, new and create are treated as synonyms and often mixed in any combination.
Is this correct? According to the definitions in table 1, not exactly.
We can treat new and create as synonyms, but create is an action, while new isn't. Both
relate to an object that did not exist before, while add is used for an operation on existing
objects. So there are two basic actions. Create is used when we bring something into existence,
for example create layer in the sense of creating a new layer. Add is used when we put an
existing object into some group, for example add layer to a composition/group of layers.
Looking at an object's death (tab.2), we find more serious existential problems.
The first problem is that we have a cross-definition. Erase is defined by delete and remove,
while delete is defined by erase and remove. Delete and remove seem to be the simpler
cases. Removed objects still exist after the operation; we only change their properties. So
remove can be treated as the reverse of add. The delete operation results in annihilation of
the object. Erase can be used in both contexts, so it should be avoided or used only in the
sense of cleaning an object without annihilating it. Finally we get the following antonyms:
create – delete, add – remove.
¹ http://www.karsten-berlin.net/gisusability.php?top=games
Word | Part of speech | http://www.merriamwebster.com | http://www.thefreedictionary.com
add | verb | 1: to join or unite so as to bring about an increase or improvement; 4: to include as a member of a group | to join or unite so as to increase in size, quantity, quality, or scope
new | adjective | having recently come into existence | having been made or come into being only a short time ago; recent
create | verb | to bring into existence | to cause to come into existence
Table 1 – Meaning of words: add, new and create.
Word | Part of speech | http://www.merriamwebster.com | http://www.thefreedictionary.com
erase | verb | 1 a: to rub or scrape out (as written, painted, or engraved letters); d: to delete from a computer storage device | to remove (recorded material) from a magnetic tape or other storage medium
delete | verb | to eliminate especially by blotting out, cutting out, or erasing | to remove by striking out or canceling
remove | verb | to change the location, position | to move from a place
Table 2 – Meaning of words: erase, delete and remove.
How this relates to visual representation can be checked in table 3. The results are based on
the Google picture search mechanism: the first 100 hits of each search were generalized. This
method is neither representative nor objective, but it gives a rough picture of how different
actions are visualized.
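[Table 3: icon symbols and their hit counts per action; the icon images and the column layout are not reproduced. Counts as extracted: add, new, create: 54 4 7 1 4 3 9 4; erase, delete, remove: – 4 15 2 58 31 14 1 3 19 4]
Table 3 – Basic action icons representation based on first 100 hits in Google picture search.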
The most unambiguous signs are [icon], corresponding to the add action, and [icon], corresponding
to delete. Both are very universal and have no connotation with any specific object. For the
creation action I would recommend the [icon] sign, because [icon] is less neutral. The remove
action is identified with the [icon] sign, but at the same time this sign is better known as the
delete action, so we take the second most frequent, the [icon] sign. For the erase action we have
the [icon] sign, which is not neutral. Unfortunately, no better sign has been found yet.
Finally, we get the following set of signs:
create
delete
add
remove
erase (hopefully to be replaced in the future by more neutral sign)
Toolbar icons from GIS application perspective
Icons in toolbars are used as convenient shortcuts to commands. A good icon should be
unambiguous and easy to remember [3]. Apart from artistic and visual aspects, there are also
some technical issues in icon design.
Size
Because of the limited area for toolbars and the number of potential icons in an application,
icon size is one of the critical elements. Icon size determines recognizability, so we can't
make icons too small. But the available workspace is also limited and depends on the standard
display resolution, which changes constantly. Icon size is therefore a compromise between
screen resolution, our perceptual capabilities and the available space within the application.
Usually a set of icons in different sizes is prepared, and different levels of detail are
visualized depending on the size. The toolbar icon sizes suggested for Windows are 16x16, 24x24
and 32x32 pixels [2][7]. Microsoft's recommendations state that icons of these sizes should be
simplified, so we can forget about photorealistic pictures. GIS and CAD applications usually
run on big monitors, where 16x16 pixel icons are really small. The two larger sizes should
thus be considered as basic.
Perspective, lights and shadows
Toolbar icons should always be flat, not 3D, even at the 32x32 size [7]. In some cases this is
difficult to achieve. One such symbol is the layer, which will be discussed later. According
to Microsoft's suggestions, for a flat icon the lighting comes from the upper left at 130
degrees, and parallel light rays produce shadows that all have the same length and density.
However, the use of shadows in icons at 24x24 or smaller sizes is not recommended [5][7].
Colors
In interface design, color is often overused. One of the most important points is that the
color table must be consistent: aggressive colors next to pastel ones don't look good. Color
is often used to communicate status. The interpretation of red, yellow, and green for status
is consistent globally [7]. However, color should not be used as the primary medium of a
message. There are different ways to use saturation or hue to reinforce an icon's message.
There are also other ways to play with visual effects, such as gradients, which make a picture
more realistic. Toolbar icons should not use colors and designs similar to other elements of
the interface, e.g. warning alerts [3].
File format and naming conventions
An icon for a toolbar can be saved in many different formats. Raster formats are still the
most popular, but vector formats seem likely to take their place in the near future.
When drawing an icon, transparency is usually needed. Transparency can have 256 levels in file
formats with an 8-bit alpha channel (PNG, TIFF) or 2 levels in the 1-bit case (GIF, PNG), where
one color is selected as transparent. This transparent color should be chosen carefully. The
most popular and safest choice is magenta (#FF00FF). Among raster formats PNG seems to be the
most suitable, and among vector formats SVG. At present, the complete procedure of icon design
is the following:
• paper and pencil – initial concept, sketch
• vector program – primary, scalable digital version
• raster program – final raster version
Some designers skip the first or even the first two steps. Making raster icons from a vector
file is not so straightforward, and for smaller icons the picture has to be generalized. Simply
downscaling big raster icons to a smaller size doesn't work either [7].
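The difference between 1-bit color-key transparency and an alpha channel can be shown with a few lines of Python. This is an illustrative sketch only: the function name key_to_alpha and the representation of an image as nested lists of RGB tuples are assumptions, not part of any icon toolchain mentioned here.

```python
COLOR_KEY = (255, 0, 255)  # magenta, #FF00FF, used as the transparent color


def key_to_alpha(pixels, key=COLOR_KEY):
    """Convert rows of (r, g, b) tuples into (r, g, b, a) tuples.

    Pixels equal to the key color become fully transparent (alpha 0); all
    other pixels become fully opaque (alpha 255), i.e. 2 levels only.
    """
    return [
        [(r, g, b, 0 if (r, g, b) == key else 255) for (r, g, b) in row]
        for row in pixels
    ]
```

An 8-bit alpha channel would allow any value between 0 and 255 in the fourth component, which is what makes smooth icon edges possible.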
A simple and consistent naming convention for icon files can be advantageous. A good example
of such consistency is Quantum GIS (QGIS):
• mActionAddRasterLayer.png – for the add action on a raster layer
• grass_add_vertex.png – for GRASS modules
Icon as message
What makes an icon: shape, content, color? All of these elements are important, but their
roles differ.
Icon shape has changed in recent years from rough 2D pictures to photorealistic visualization.
The Windows Aero (Vista) icon set is, compared to the previous version (XP style), more
realistic than illustrative, while toolbar icons have less detail and no perspective, to
optimize for smaller sizes and visual distinctiveness [7]. The fascination with visualization
technologies will end when we understand that effective pictogram recognition is not a matter
of the level of realism but rather of association.
Content is the most conservative element and, once widespread, becomes a de facto standard.
A good example of such a standard is the icon for the save operation. Everyone recognizes the
icon with a 3.5" diskette instantly, but who will know in 5 years what is shown on that icon?
Sometimes content is not directly related to function and, when used in a domain-specific icon
group, can be difficult for a new user to recognize. There are many discussions of this
problem: should we be conservative and preserve old symbols, which are part of our history, or
try to find better ones?
The understanding of color's role and usage changed when accessibility became an important
issue. Any message, including graphics, should be accessible to everyone, so color cannot be
used as the primary or only method of communication. In a time of globalization this is a big
challenge, but color-related problems are even more complicated. Colors and symbols have a
cultural context and sometimes even religious connotations. In some places white is associated
with weddings, in others with funerals. The same problem applies to black. But color is not
the only sensitive element of a message. When drawing a forefinger, we often do not know what
connotation it has in other cultures.
The last important element of icon communication is the context in which the icon exists. A
left arrow can represent direction of movement, speed of movement or some conventional
operation like undo, import or export. It depends on the neighboring icons. Context can
simplify or complicate the message, so an icon's final location should be considered already
at the design stage.
Snapshot of selected GIS toolbar icons
Just to give an idea of the diversity of design approaches, the following figures (5-14)
present the toolbars of selected GIS applications.
Fig.5 GRASS 6.3 toolbar.
Fig.6 QGIS 1.0 toolbar.
Fig.7 ArcMap toolbar.
Fig.8 GeoMedia Viewer 5.2 toolbar.
Fig.9 gvSIG toolbar.
Fig.10 IDRISI32 toolbar.
Fig.11 MapInfo 8 toolbar.
Fig.12 OpenJUMP toolbar.
Fig.13 Thuban toolbar.
Fig.14 uDig toolbar.
Implementation of gis-0.1 icon set for GRASS and QGIS
When designing GIS domain icons, several assumptions were taken into account. Some of them,
like recognizability and transferability, are obvious but hard to implement. Others are
controversial but, in my opinion, worth testing. GRASS (with wxPython) and QGIS were chosen
for the test implementation. Both applications support themes, so everyone is able to
customize icon sets. The new wxPython-based GUI of GRASS [6] uses the Silk icon set [8] as
standard, which is nice and well designed, but not always able to address GIS needs. There are
also other interesting projects related to icon development, like Tango [10], but all of them
are general purpose.
Toolbar block context
There are two approaches to icon design within a toolbar. The first one is declarative: the
icon is self-explanatory without any additional information. To make an icon for "add layer"
we need a picture of the object (layer) and of the action (add). The second one is simplified
(contextual). In this approach we divide the toolbar into a caption showing the object
(inactive) and icons showing only actions. So "add layer" can be represented just by the
action (add), and the object will be known from the context: the layer toolbar.
Concept 1a: Where possible, decompose object from action and create icons consisting of
both elements.
This concept is based on the methodology described by Y. Gilyov [4]. An icon can be solid or
contain two elements: an object part and an action part. Where possible, the object-action
approach should be used. If action primitives are well defined, they become reusable, which
improves recognizability. A good example in this direction is the 'add' action, which is used
in a wide range of icons. The action part should probably be placed in the lower right part of
the icon, framed by a semi-transparent background (fig.15). Transparency lets the object part
partially use the action area without disturbing the action part too much. There is only one
limitation: as the space for the action part is very limited, the action primitive must be
really simple.
Fig.15 Object-action method of icon design
Concept 1b: Group icons by object.
The second (contextual) design is probably more scalable and easier to implement, especially
for small icons. We need just one set of action icons (add, remove, etc.) for any object.
In many applications it is difficult to figure out the toolbar context. Usually we know it just
because we use the application, but for a beginner this is a big challenge.
Sometimes the simplified design leads to misunderstanding. The most popular and most frequently
used icons (new, open, save) come first in the toolbar, but without any additional information.
We know that they correspond to the root object in the object tree, but sometimes it is
difficult to guess what the root object is. In a GIS application it can be a composition
(IDRISI), a mapset (GRASS), a project (QGIS) or maybe something else. Why not show it
explicitly?
Here we come to a conclusion: every simplified toolbar should begin with a graphical
caption (icon) representing the object (fig.16). Of course its visual representation should
differ from that of the action icons.
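The contextual approach can be modelled as a tiny data structure. The following Python sketch is only an illustration of the idea; the class name Toolbar, the ACTIONS tuple and the commands method are hypothetical and do not come from GRASS or QGIS.

```python
from dataclasses import dataclass

# One shared set of action icons, reused by every toolbar (assumed names).
ACTIONS = ("create", "delete", "add", "remove")


@dataclass
class Toolbar:
    caption: str              # object shown once, as an inactive caption icon
    actions: tuple = ACTIONS  # action-only icons that follow the caption

    def commands(self):
        """Return the full meaning of each icon, derived from the context."""
        return ["%s %s" % (action, self.caption) for action in self.actions]
```

With this structure, Toolbar("layer") and Toolbar("project") share the same four action icons, and only the caption tells the user which object they act on.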
Content
Icon should be simple and easy to guess. Let’s analyze GIS related symbols from table 4.
Fig.16 Contextual method of icon design
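Action | Hit counts of the most frequent symbols (icon images not reproduced)
close | 84
refresh | 65, 10
save | 60, 10
edit | 53, 5
display | 33, 9
open | 20, 12, 4
map | 15, 14, 9
export | 15, 7
import | 12, 5
exit | 11, 11
pan | 11, 4
layer | 6, 5
show | 5, 1, 9
Table 4 – Common GIS icons representation based on first 100 hits in Google picture search.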
The most unambiguous sign is [icon], corresponding to the close action. But we have already
decided to use it for the delete action. One possible solution is to use a synonym, which in
this case is the exit action, represented by [icon]. The save icon has two main symbols, with
a predominance of [icon].
But technology changes very fast. What should we do with historical objects in our icons? Is
it better to use physical objects or some metaphors?
Concept 2: New, more neutral objects or metaphors can replace some old-technology icons.
There's a push to get rid of the file-folder metaphor and the 3.5" floppy disk for saving. An
icon should not rely on visualizing current technology. Those symbols are used because everyone
is familiar with them. The second sign, [icon], is far more neutral and universal. The
situation is similar for the open action, which is associated with a folder picture and arrows.
The map icon is a very difficult case. The usual association with a globe [icon] is a proper
one, but not the best from the GIS point of view. The second most frequent is a 3D view of a
paper map [icon].
The import/export example shows the problems of interpretation. In this case the majority is
probably right: when we export, the arrow must go "from" the object. A synthesis of this action
with a proposal of more neutral icons for the open and save actions is presented in fig.17.
Fig.17 Basic actions – export, import, open, save.
The pan operation is represented by [icon] or fingers, but we must remember cultural
connotations, so this sort of sign should be avoided. So for pan we choose [icon]. The layer
object is represented by three parallel rectangles, with a predominance of the 2D view. The
show operation is assigned the human eye sign [icon].
Explicit or not
The last concept is based on the observation that fast and easy perception does not require
the whole object.
Concept 3: An object or symbol need not be shown in full to be recognized properly.
This can be seen in favicon design and in some modern interfaces. One good example of
implementation is the VirtualBox² interface. If properly designed, this could solve the
problem of very limited icon size. At this stage of the research, implementation of this
concept has not been tested yet.
Final note
The presented concepts and practical implementations of the gis-0.1 icon set are still under
development. The most recent version is available at http://www.szczepanek.pl/icons.grass.
References
1. Berlin K. (2007): GIS usability games, online³
2. Creating Windows XP Icons, Windows XP Technical Articles, 2001, online⁴
3. Designing toolbar icons, Apple Human Interface Guidelines, online⁵
4. Gilyov Y. (2007): Designing an iconic language, online⁶
5. Kortunov D. (2008): 10 Mistakes in Icon Design, online⁷
6. Landa, M. (2007): GUI development for GRASS GIS. Geoinformatics FCE CTU 2007,
Workshop Proceedings Vol. 2, Prague, online⁸
7. Microsoft Windows Vista Development Center, online⁹
8. Silk icons, online¹⁰
9. Szczepanek R. (2008): Custom icons for GRASS, online¹¹
10. Tango Desktop Project, online¹²
² http://www.virtualbox.org/wiki/Screenshots
³ http://www.karsten-berlin.net/gisusability.php?top=games
⁴ http://msdn.microsoft.com/en-us/library/ms997636.aspx
⁵ http://developer.apple.com/documentation/UserExperience/Conceptual/AppleHIGuidelines/XHIGIcons/chapter 15 section 9.html
⁶ http://turbomilk.com/blog/cookbook/usability/designing an iconic language/
⁷ http://turbomilk.com/blog/cookbook/criticism/10 mistakes in icon design/
⁸ http://geoinformatics.fsv.cvut.cz/wiki/index.php/GUI development for GRASS GIS
⁹ http://msdn.microsoft.com/en-us/library/aa511280.aspx
¹⁰ http://www.famfamfam.com/lab/icons/silk/
¹¹ http://www.szczepanek.pl/icons.grass/
¹² http://tango.freedesktop.org/Tango Desktop Project
The OpenStreetMap Project from a Geoinformatics Specialist's Point of View
Daniel Bárta
Institute of Geoinformatics, VSB-TU of Ostrava
[email protected]
Keywords: OpenStreetMap, open geodata
Abstract
This paper discusses the conditions suitable for the creation of open-licensed geographic
data and distinguishes different levels of their openness. It then focuses on the
OpenStreetMap community project, which aims to create and maintain free geographic data. The
paper gives a first insight into the project and presents its key features and its history.
From open source to open geodata
In the late 1980s a need to create free/open software began to emerge, perhaps first among
programmers. The efforts of individuals to create suitable licences for publishing programs,
and to promote and where necessary defend the rights of authors and users, were later united
under the banner of the Free Software Foundation (GNU) or the non-profit Open Source
Initiative. With hindsight we can say that many projects born of this idea play a significant
role in many branches of information technology: the idea of a few enthusiasts turned into a
phenomenon. As an example, take the kernel of the GNU/Linux operating system, which is
distributed under the widely used GNU GPL, a licence that is now in its third version. A key
factor in the general spread of open software was the move of hardware from the halls of
computing centres to every desk at work and at home.
A similar process took place with the hardware of geoinformatics and related fields. In the
1990s the US military's Navstar GPS project [1] was put into operation and opened to the
public. Satellite signal receivers are gradually making their way from submarines and American
aircraft carriers into every motor vehicle and into the hands of hikers. The initial need of
ordinary users was positioning and navigation; later came leisure uses such as geocaching.
There is therefore no reason why a process similar to the liberation of program code could not
begin in geoinformatics and also, which is the subject of this paper, in the liberation of
geodata.
Openness of geodata
The Free Software Foundation describes a way of looking at computer programs in terms of the
degree of freedom with which one can work with them [2]. Applying this approach to geodata by
analogy, we can consider:
Freedom I
The possibility to view the data (metadata) for any purpose.
There are many such options today, offered both by commercial entities and by state
organisations. To display data they use either a purpose-built or a standardised map server.
Owing to the licence, the data may be used only for viewing and for personal needs, and
information about metadata is available only through users' vague deductions. The simplest
form of a map server interface is a set of web pages viewable in an ordinary web browser,
built on HTML, JavaScript and AJAX. They are generally publicly accessible without
registration and are tailored to human users, but they lack a suitable, standardised interface
for machine processing.
Czech commercial map servers usually offer satellite or aerial imagery, a road map, city
street maps, tourist maps or trails, and sometimes old maps from the 19th century. They are
often limited to the territory of Czechia and possibly its nearest neighbours. Examples include:
• http://amapy.atlas.cz
• http://mapy.seznam.cz
• http://supermapy.centrum.cz
Foreign map servers containing relevant data for the territory of the Czech Republic typically
have lower-quality and older geodata, because their providers are foreign organisations whose
focus lies outside the Czech Republic. They offer mainly satellite or aerial imagery, road
maps and city street maps. For example:
• http://maps.google.com
• http://maps.yahoo.com
• http://maps.live.com
Occasionally, atypical services appear on the Czech Internet that make parts of the state map
series accessible, for example:
• visualisation of UIR-ADR on RZM10 by MPSV¹
A more advanced way of exchanging visualised geodata is a service following the WMS standard,
usually operated alongside a map server, which can easily be used in other software or
processed automatically. For example [14]:
• WMS CENIA² (does not provide correct output for EPSG:4326)
• WMS Regional Forest Development Plan (Oblastní plán rozvoje lesa), ÚHUL³
• WMS Cadastral map, ČÚZK⁴
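What automated use of a WMS service means in practice can be sketched by building a GetMap request. The following Python example is a minimal sketch under stated assumptions: it uses the standard WMS 1.1.1 request parameters, the function name getmap_url is illustrative, and the layer name in the usage note is hypothetical (real layer names would have to be read from the service's GetCapabilities response). No request is actually sent.

```python
from urllib.parse import urlencode


def getmap_url(base_url, layers, bbox, width, height,
               srs="EPSG:4326", image_format="image/png"):
    """Build a WMS 1.1.1 GetMap URL; bbox is (minx, miny, maxx, maxy)."""
    params = [
        ("SERVICE", "WMS"),
        ("VERSION", "1.1.1"),
        ("REQUEST", "GetMap"),
        ("LAYERS", ",".join(layers)),
        ("STYLES", ""),
        ("SRS", srs),
        ("BBOX", ",".join(str(value) for value in bbox)),
        ("WIDTH", str(width)),
        ("HEIGHT", str(height)),
        ("FORMAT", image_format),
    ]
    # Append to an existing query string if the base URL already has one.
    separator = "&" if "?" in base_url else "?"
    return base_url + separator + urlencode(params)
```

For example, getmap_url("http://wms.cuzk.cz/wms.asp", ["example_layer"], (14.0, 49.9, 14.6, 50.2), 800, 400) returns a URL that a GIS client or a script could fetch to obtain a map image, provided the server supports the requested layer and coordinate system.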
Freedom II
The possibility to study the data and metadata and adapt them to one's needs. Access to the
source data is a precondition.
Here the options are considerably fewer. We can turn to the comprehensive commercial datasets
of various providers (ČÚZK, see the table; Arcdata, T-Mapy, ...). The licence for these
datasets is usually restrictive, however: the data are available in source form, but the ways
in which they may be used are substantially limited.
Data package | Price for the territory of the Czech Republic
ZABAGED planimetry | 3,700,000 CZK
ZABAGED altimetry | 1,000,000 CZK
Orthophotomap of the Czech Republic (0.5 m/px) | 2,400,000 CZK
Sample data prices, ČÚZK price list valid from 1 January 2007, taken from [3]
For some data in the raster data model (e.g. aerial imagery in the visible spectrum), source
data can be provided through a WMS service. A suitable way of providing source data in the
vector data model is a WFS service. One of the few operators of WMS/WFS services is ÚHUL:
• WFS ÚHUL – forest cover of the Czech Republic⁵ (currently unavailable)
• WMS ÚHUL – panchromatic aerial imagery of the Czech Republic, data source ČÚZK⁶
¹ http://mapy.mpsv.cz:8080/mapy2/mpsv2.html
² http://geoportal.cenia.cz/
³ http://geoportal2.uhul.cz/cgi-bin/oprl.asp?service=WMS
⁴ http://wms.cuzk.cz/wms.asp
⁵ http://212.158.143.149/cgi-bin/wfs?service=WFS
⁶ http://geoportal2.uhul.cz/cgi-bin/oprl.asp?service=WMS
Freedom III
The possibility to make copies and distribute them freely.
For a typical example we have to look to the USA, where data created by government organizations are most often released into the public domain, i.e. provided free of charge but without warranty:
• vector data: NIMA (VMap0, VMap1), US Census (TIGER)
• raster data: NASA (DEM, Landsat 7, SRTM)
These datasets have worldwide coverage at scales up to 1:1,000,000, or more detailed for selected areas of US interest (USA, Mexico, part of the former USSR).
In the Czech Republic there are likewise datasets that are freely available with the possibility of redistribution, but no formally defined license exists for them. Custodians such as MPSV or ŘSD informally assume or allow free use of the data, whereas e.g. HEIS VÚV opposes it. In general, the attitude of organizations and individuals towards providing their own data to third parties is marked by reluctance and by uncertainty about defining a license of their own. Where consent is given, it is legally unformulated and expressed orally or by e-mail. This holds even when the data are produced from public funds and are publicly available in source format, or are the result of individuals' leisure-time activity. [14][15] On Czech territory this concerns, for example, the following datasets:
• registers:
– UIR-ADR^7, custodian MPSV
– UIR-ZSJ^8, custodian ČSÚ
• vector data:
– generalized road network^9 of the Road Databank Ostrava (Silniční databanka Ostrava), administered by ŘSD
– watercourses^10, Povodí Labe
Freedom IV
The freedom to modify the data, derive other data from them and share these changes publicly. Access to the source data is a precondition.
Licenses that meet or enforce this definition exist, but no datasets distributed under such a license are known in the Czech Republic apart from OpenStreetMap.
^7 http://forms.mpsv.cz/uir/
^8 http://www.czso.cz/csu/rso.nsf/i/prohlizec uir zsj
^9 http://www.rsd.cz/rsd/rsd.nsf/0/DFFC2FF000FC1FB3C1256DBF002CCEE3
^10 http://www.pla.cz/planet/ram.aspx?id=21
Preconditions for the emergence of an open-geodata project
The emergence of a project aimed at creating original open geodata (OpenStreetMap and similar projects) is usually motivated by:
• Missing geodata, or existing geodata that are not publicly available under sufficiently free conditions.
• The human need to create and to produce value outside paid work.
• The need to share one's knowledge and results without restrictions and to give them to the community.
and it presupposes:
• freedom of movement
• free time (after work, after school)
• cheap and available hardware
• access to services (GPS, Internet)
Under these circumstances a community project can arise. OpenStreetMap (OSM) is of course not the first project aimed at creating or collecting geodata. It was most often preceded by maps created by users of Garmin GPS receivers and navigators. Later, local maps built on a principle similar to OSM appeared in Western Europe, as did special-purpose maps (e.g. for the Wikipedia project), special or local maps, and unified packages of third-party datasets such as FreeGeodataCZ^11. OSM is exceptional, however, in its viability, adaptability and human potential. It collects data comprehensively, independently of the target map output and use, yet it builds interfaces for easy import and export to existing target devices (proprietary GPS units, GIS programs). It clearly subscribes to free licenses and makes use of other legal data sources. Parts of the data model are open to the users, who adjust it to their needs and capabilities. The project is not meant for a selected region or nationality only; data can be created for the whole world and in any language.
The aim of the project is to create open planimetric geographic data with a wide range of supporting applications, based on the principles of open and shared community work.
Figure 1: Logo of the OpenStreetMap project
History of OpenStreetMap
The OSM project started in July 2004 in England, where the OSM domain^12 is registered; the people behind it are Stephen Coast and Richard Fairhurst. Prominent contributors joined from Germany: Immanuel Scholz, Frederik Ramm, Jochen Topf and others.
• At the beginning of 2006 national sections start to form, usually at the level of states, which cooperate on creating data in their region.
• In April 2006 the OpenStreetMap Foundation is established, tasked with raising funds to support the OSM project.
• In October 2006 the first users from the Czech Republic join and the first Czech data appear.
• In December 2006 the release of Ikonos satellite imagery through the maps.yahoo.com^13 server for legal data creation is an important event for OSM.
• In November 2007 the complete network of class I and II roads and motorways of the Czech Republic is available in OSM.
^11 http://grass.fsv.cvut.cz/wiki/index.php/FreeGeodataCZ
^12 http://www.openstreetmap.org/
License
Within the OSM project it is customary to use the GNU GPL license for the supporting software. These are often Java, Perl, C, Python or Ruby applications that use other free-software libraries. According to analyses by some lawyers, this license enjoys legal protection in the Czech Republic as well [4],[5],[6].
The geodata use the Creative Commons Attribution-ShareAlike 2.0^14 license (CC BY-SA 2.0 for short); some users additionally release their data into the public domain. CC BY-SA 2.0 allows the data to be freely copied, modified and even sold, provided that any modification or interpretation of them is again made available under the same license.
The Public Geodata License^15 (PGL; Czech translation^16), formulated earlier in France, was in the end not adopted by the community.
Reference frame and geodata model
Planimetric component
The OSM project collects planimetric data, for which the WGS-84 geodetic datum is used, as defined in EPSG:4326.
Altimetry
Elevation is not a subject of data collection. For overlapping objects (most often bridges, tunnels, green areas and water bodies) a thematic key can be used to define the drawing order of the individual features.
If elevation data are to be used as supplementary information to the planimetry, in the form of relief or contour lines, the most commonly used source is SRTM3 or GTOPO30.
Thematic component
The thematic component is robust and is the most dynamic part of the community wiki [7]. Users propose and approve the various properties they need to map or consider important. At present it contains sets of tags for physical objects [17]:
• transport routes and facilities (road, rail, water and air transport)
• civic, industrial and military objects and sites
• use of the cultural and urban landscape or land cover, water bodies
• civic amenities
• tourist and historic objects
and abstract, extending, supplementary or restricting sets of tags:
• routes (public transport, cycle corridors)
• administrative boundaries
• leisure activities
• surroundings of objects
• accessories and general properties
• restrictions (mainly traffic)
• names
• local topography
• annotations
^13 http://maps.yahoo.com
^14 http://creativecommons.org/licenses/by-sa/2.0/
^15 http://cemml.carleton.ca:8080/OGUG/Members/drsampson/pgl/public-geodata-license
^16 http://gis.templ.net/pgl/index.html
Data primitives
The central database [8] gathers the geodata created by users. The data consist of two basic elements, each carrying a unique index, a timestamp, an author and information about its existence (validity). They are:
• nodes: the only elements that carry direct positional information themselves.
• ways: ordered, oriented sequences of nodes in which each node occurs at most once.
• areas: a closed way (its first and last node are identical) is considered an area.
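The closed-way rule above can be sketched in a few lines. This is only an illustration, not code from the OSM project: the function names are made up, ways are represented simply as lists of node references, and the requirement of at least four references (three distinct nodes plus the repeated first one) is an added assumption to exclude degenerate rings.

```python
def is_closed(way_node_refs):
    """Return True when the first and last node reference are identical."""
    # Assumption: a meaningful ring needs three distinct nodes plus the repeat.
    return len(way_node_refs) >= 4 and way_node_refs[0] == way_node_refs[-1]


def classify(way_node_refs):
    """Label a way as 'area' or 'way' following the closed-way rule."""
    return "area" if is_closed(way_node_refs) else "way"


print(classify([1, 2, 3, 4]))  # open sequence of nodes
print(classify([1, 2, 3, 1]))  # closed ring
```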
Extending elements
• tags: an enumeration of possible variables (keys) and their values for the descriptive component of the geodata
• relations: a hint of the future, offering extended possibilities for grouping primitives and assigning them roles, in order to simplify the editing and maintenance of objects.
The evolution of the structure of the data primitives is, in brief, as follows [9]:
1. nodes, segments (oriented edges) + tags
2. nodes, segments (oriented edges), ways (sequences of edges) + tags
3. current state: nodes, ways + tags, relations
4. future?: nodes, ways + tags, with full use of relations, change history and meta-editing data [12][13]
The primitives are shown schematically in Figure 2, and the structure of their file representation in Figure 3.
Figure 2: Primitives of the OSM model: node, way, area
<?xml version="1.0" encoding="UTF-8"?>
<osm version="0.5">
  <node id="" lat="" lon="" visible="" timestamp="" user="">
    <tag k="" v="" />
  </node>
  <way id="" visible="" timestamp="" user="">
    <nd ref="" />
    <tag k="" v="" />
  </way>
  <relation id="" visible="" timestamp="" user="">
    <member type="" ref="" role="" />
    <tag k="" v="" />
  </relation>
</osm>
Figure 3: Sample XML representation of the OSM model
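As a rough illustration of how a file with this structure can be read, the sketch below parses a small document with Python's standard `xml.etree.ElementTree` and separates it into nodes, ways and relations. The sample values (identifiers, coordinates, user name, tags) are invented for the example, and the dictionary layout of the result is an arbitrary choice, not a format defined by OSM.

```python
import xml.etree.ElementTree as ET

# Made-up sample data following the template in Figure 3.
SAMPLE = """<osm version="0.5">
  <node id="1" lat="49.1951" lon="16.6068" visible="true"
        timestamp="2007-06-09T12:00:00Z" user="example">
    <tag k="name" v="Example place"/>
  </node>
  <node id="2" lat="49.1960" lon="16.6080" visible="true"
        timestamp="2007-06-09T12:00:00Z" user="example"/>
  <way id="10" visible="true" timestamp="2007-06-09T12:05:00Z" user="example">
    <nd ref="1"/>
    <nd ref="2"/>
    <tag k="highway" v="residential"/>
  </way>
  <relation id="100" visible="true" timestamp="2007-06-09T12:10:00Z" user="example">
    <member type="way" ref="10" role=""/>
    <tag k="type" v="route"/>
  </relation>
</osm>"""


def tags_of(element):
    """Collect the <tag k=... v=...> children into a dict."""
    return {t.get("k"): t.get("v") for t in element.findall("tag")}


def parse_osm(xml_text):
    """Split an OSM XML document into nodes, ways and relations."""
    root = ET.fromstring(xml_text)
    nodes = {
        n.get("id"): {"lat": float(n.get("lat")), "lon": float(n.get("lon")),
                      "tags": tags_of(n)}
        for n in root.findall("node")
    }
    ways = {
        w.get("id"): {"nodes": [nd.get("ref") for nd in w.findall("nd")],
                      "tags": tags_of(w)}
        for w in root.findall("way")
    }
    relations = {
        r.get("id"): {"members": [(m.get("type"), m.get("ref"), m.get("role"))
                                  for m in r.findall("member")],
                      "tags": tags_of(r)}
        for r in root.findall("relation")
    }
    return nodes, ways, relations


nodes, ways, relations = parse_osm(SAMPLE)
print(len(nodes), "nodes;", ways["10"]["nodes"], relations["100"]["members"])
```

Only nodes carry coordinates; a way is resolved to geometry by looking up its `nd` references in the node dictionary.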
Through the API, the central OSM database gives users the latest data for the requested geographic area and accepts their corrections only incrementally. The whole history therefore remains archived. Its use is not yet fully implemented in any user editor, though partial support can be found in the online editor Potlatch. The possibilities of the history are demonstrated by the web application OSM History^17, which creates an animated raster image showing the growth of data in a selected area over time.
GPS receiver records
The database also has a dedicated part for collecting the raw records from GPS receivers (tracklogs) in the GPX format. The source data thus do not remain hidden with their original users but can serve as a basis for new geodata derived in another way or at another time.
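A tracklog in GPX is itself a small XML document, so its points can be extracted with the standard library alone. The sketch below is an illustration under stated assumptions: the two-point sample is invented, and the function ignores the XML namespace so that it accepts both GPX 1.0 and 1.1 files.

```python
import xml.etree.ElementTree as ET

# Made-up two-point tracklog in GPX 1.1.
SAMPLE_GPX = """<gpx version="1.1" creator="example"
     xmlns="http://www.topografix.com/GPX/1/1">
  <trk><trkseg>
    <trkpt lat="49.1951" lon="16.6068"><time>2007-06-09T10:00:00Z</time></trkpt>
    <trkpt lat="49.1960" lon="16.6080"><time>2007-06-09T10:00:05Z</time></trkpt>
  </trkseg></trk>
</gpx>"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts before tag names."""
    return tag.rsplit("}", 1)[-1]


def track_points(gpx_text):
    """Return (lat, lon, time) for every <trkpt> in the document."""
    points = []
    for element in ET.fromstring(gpx_text).iter():
        if local_name(element.tag) != "trkpt":
            continue
        time_text = None
        for child in element:
            if local_name(child.tag) == "time":
                time_text = child.text
        points.append((float(element.get("lat")),
                       float(element.get("lon")), time_text))
    return points


points = track_points(SAMPLE_GPX)
print(len(points), "track points, first:", points[0])
```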
Data sources
The main data source of the OSM project is the individual records (tracklogs) from users' GPS receivers. Their gradual growth is supplemented by several license-compatible datasets with extensive coverage:
1. the vector map VMap0 (author NIMA): whole world, 1:1,000,000
2. Landsat 7 satellite images acquired in 1999-2001 (author NASA): resolution 30 m
3. satellite images of national capitals (provider Yahoo): in the Czech Republic only Prague and its surroundings (resolution ~2 m, Ikonos images from 2002)
4. aerial images of the Czech Republic from 1998-2001 originating from ČÚZK; ÚHUL provides them through WMS and grants the license for OSM
5. maps without copyright (free license)
6. maps whose copyright has expired (in the Czech Republic 70 years after the death of the (last) author)
Local datasets such as TIGER in the USA or AND in the Netherlands are not included in this list; they are usually looked after by the national OSM mapping groups.
^17 http://openstreetmap.gryph.de/history/
Project components
The OSM project consists of several physically or logically separate parts [10]:
• www (Amsterdam, NL): map server that exposes the store of raster tiles
• tile (London, UK): store of map tiles in raster format
• tilegen: rendering server that creates raster map tiles from planet.osm
• planet (London, UK): weekly export of the current version of the geodata from the database into a single XML file; after bz2 compression its size is in the hundreds of MB (300 MB in July 2007)
• api (London, UK): API to the geodata database
• db (London, UK): geodata database, run on MySQL, which provides data for editing, accepts modified or new data and keeps the data history
• wiki (York, UK): wiki interface for the long-term exchange of information within the project, maintained by all users
• svn (York, UK): Subversion interface for developing applications and scripts
• dev (Amsterdam, NL): test interface for developers; some development and testing takes place on private machines, e.g. the JOSM editor in Germany
• mail (York, UK): interface for the e-mail lists talk, talk-dev, talk-*
• blog (York, UK): blog with brief news from conferences and events around OSM
Figure 4: Diagram of OSM components. Taken from [10].
Software
API
The API [11] is a key part of OSM because it connects the outside world with the geodata database. It makes maximum use of existing standards and adds only what is necessary. It builds on the IP network layer, the TCP transport layer and the HTTP application layer. The latest and only supported API version is 0.5. The basic client request is specified for HTTP as:
"http:" "//" host [ ":" port ] [ abs_path [ "?" query ]]
A query for one particular node element, e.g.:
http://api.openstreetmap.org/api/0.5/node/35
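The request pattern above can be captured in a small helper. This is a hedged sketch that only builds the URL and sends nothing: the function name is invented, and the assumption that `way` and `relation` elements follow the same path pattern as `node` is made by analogy, since the text shows the node form only.

```python
def element_url(kind, element_id, host="api.openstreetmap.org", version="0.5"):
    """Build the URL of one element following the pattern shown above."""
    # Assumption: ways and relations use the same path scheme as nodes.
    if kind not in ("node", "way", "relation"):
        raise ValueError("unknown element type: " + repr(kind))
    return "http://{}/api/{}/{}/{}".format(host, version, kind, int(element_id))


print(element_url("node", 35))
```

Keeping the API version as a parameter reflects the fact that only one version is supported at a time, so a client has to follow version changes.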
User data editors
These are programs with which users access the data store from their home computers and edit the OSM geodata. Edits can only be made to data held in the central data store, over a connection that is:
1. temporary (e.g. JOSM): the user first downloads a data file, makes the edits, checks for conflicts and sends the data back to the data store.
2. permanent (e.g. Potlatch): the user finds the area to edit on the map server, is given access to its vector form on request, and can send changes continuously or revert them (even beyond the current user's own edits).
The editors include:
• JOSM (see Figure 5): "Java OSM" is a fully functional and usable editor of OSM data. Its original author is Immanuel Scholz. The program has its own tools for creating, editing and modifying data and for tagging them. It can resolve conflicts between concurrent edits and shows the authors of individual elements. It is currently available compiled as stable version 1.5 and as a development version. It can save the created data to disk and display GPS tracklogs in the GPX format as a background. It is extensible through plugins; among the most interesting are an advanced WMS client (whose implementation allows very efficient work with WMS, of a kind unknown in GIS products such as ArcGIS), Mappaint for improved rendering of the edited data, and a Validator of correct tagging.
• Potlatch: a Flash web application for online data editing, written by Richard Fairhurst. It is developed primarily for license compatibility with Yahoo Maps, which are used as the background layer under the geodata being drawn. In development since January 2007.
• others, such as Osmeditor, Merkaartor, Osmpedit and a Java online applet: their development has been discontinued for various reasons, or their developers do not keep pace with the OSM project, and their latest releases are often incompatible with the current API.
Figure 5: The Java editor JOSM 1.5 (WMS and mappaint plugins) with data of Brno from 9 June 2007, with Landsat imagery as a background. Run on GNU/Linux Ubuntu 7.04 and Sun Java 1.6.
Renderers
Programs that transform data from a file in the OSM XML format into vector images in the SVG XML format or into PNG raster images.
• Mapnik (see Figure 6): a program written in C++ with a Python interface, linked against other libraries and intended mainly for running on a server. It assumes that planet.osm has been imported into a PostgreSQL database. Given an extent in latitude and longitude, it produces a store of images usable mainly by a map server. The result, updated roughly once a week, is available as the default data source on the official map server.
• Osmarender (see Figure 7): a stand-alone renderer, currently at version 6. It uses XSL transformation stylesheets and, through an XML parser, creates vector map images in the SVG format. It is intended for end users (also available as a JOSM plugin).
• tiles@home: an extended and modified version of Osmarender with support for distributed computing modelled on seti@home. A user either chooses an area to keep up to date, or takes over a request from the server that was generated by user demand or by a change of data in the database. The client downloads the current data, renders the images for the tile store and sends them back. The continuously updated result is available as an optional data source on the official map server.
Figure 6: Sample of data displayed in a web browser. Motorways and expressways of the Czech Republic and its neighbours rendered by Mapnik, available on the www.openstreetmap.org map server, as of 2 April 2007.
The OSM phenomenon
OpenAerialMap
As the OSM project spreads among users, sister projects arise that are not directly part of OSM but support it or extend its possibilities. One such project is OpenAerialMap (www.openaerialmap.org), which aims to aggregate known remote-sensing imagery in the visible spectrum under a free license. The base is a Landsat 7 image, which at small scales is overlaid with more detailed images. The server communicates mainly through a WMS interface and acts as a map server that answers user requests from local copies or forwards them to the original servers of the data custodians. If the license allows, the images are cached. Another option is to upload one's own acquired and rectified images directly. Some users go so far as to produce their own original remote-sensing data by combining unmanned aerial vehicles, GPS receivers and cameras.
Figure 7: Sample of data displayed in a web browser. The centre of Brno (incomplete road network only) rendered by Osmarender version 4, available on the www.openstreetmap.org map server, as of 2 April 2007.
The State of the Map
Many OSM users present the project at various conferences. Within the community, however, a need arose for feedback on the project and for personal contact. Therefore the conference The State of the Map^18 was held on 14-15 July 2007 at the University of Manchester (UK), covering the theoretical foundations, state and development of OSM and of sister or other inspiring geoinformation projects. The next conference took place in Limerick (Ireland) on 12-13 July 2008. The third will be held on 10-12 July 2009 in Amsterdam, the Netherlands.
Figure 8: Logo of The State of the Map conference
^18 http://www.stateofthemap.org/
Local meetings
In the countries of Western Europe, which also have the larger numbers of users, training events for new users, informal meetings and mapping events are organized. The aim of these events is to systematically cover with data a part of an urbanized area that has not yet been fully recorded, or to finish mapping outlying parts of towns.
OSM Foundation
A foundation independent of the project has also been established in England; its aim is to raise money for the support and promotion of the OSM project. It is a legal entity represented by people who take part in the project's development and bear its burden. The funds are intended for the development, operation and maintenance of the project's hardware.
Properties of a community project
Community projects have specific properties that follow from the character of their users and their organization. In thinking about them, the example of Wikipedia can help: it has a longer history and greater popularity and, despite its different focus, similar problems.
A geoinformatician's view
For a basic evaluation of projects we usually consider measures such as financial and time efficiency, or fitness for purpose. In OSM the financial measure cannot be applied because the work is voluntary, and the time needed to reach even a general goal is perceived very subjectively by each user.
One of the goals can be defined as the creation of large-scale planimetric maps, with the possibility of generalization to medium and small scales, with the content of road maps, city plans and cycling maps. Another goal is a routable map for navigation. The system architecture makes both goals possible, and their fulfilment is only a question of the number of volunteers and of defining the required level of quality and, above all, of content. The hardware, too, can be kept in operation for thousands of long-term active users.
The evolution of the data model shows its spontaneous growth along with the users' eagerness to work. The decision to start the project from scratch, without a robust and proven data model, still causes complications today. These concern above all the convertibility of the OSM format into standard GIS formats and the subsequent use of geoinformation tools (e.g. GDAL). Another historical burden of the data model is that it does not lend itself to easy, long-term data maintenance, because its current implementation in the editors requires low-level access to the data and therefore sufficient knowledge and skill on the part of users. The original simplicity of the data model made the supporting applications easy to develop; now, in the transitional stage from a simple to an advanced model structure, the management of both geographic features and supporting applications is non-trivial.
From the point of view of a GIS operator, the project is useful as a supplementary data source or for basic orientation when no other data are available at the moment (e.g. a third-party check of the elementary correctness of georeferencing). At present the following has to be considered in OSM:
• quality: neither the quality of positional measurement nor that of the content is defined.
• metadata about mapped objects, changes made and information sources are neither uniform nor generally used.
• conventions for creating data are defined only in general terms.
• data coverage: the extent of the mapped territory cannot be specified, and a statistical comparison of completeness is difficult as well (e.g. roads in OSM versus the Unified Transport Vector Map, Jednotná dopravní vektorová mapa).
• data convertibility is non-trivial; the complicated system of roles is not simple enough to allow a long-term, universal export to other formats.
• geodetic foundations: WGS-84 parameters are used, so after a successful format conversion the data are fully compatible with standards.
• local knowledge contained in the map can be valuable information; in the optimal case it can be up to date (changes in classical maps take a long time and cost additional money) and express the actual use (not only the first or original purpose).
Project
The OSM project is like an organism: there is no final or stable version. It keeps growing both in the quality of its content and in the extent of the mapped territory. Many parts of the project are under basic and continuous development; they are usable and can be made to work, but require considerable skill and experience. Because of the constant growth and change, manuals for scripts and programs often do not exist. Frequent changes of the rules for editing and entering data leave many dependent parts of the project lagging behind, so that, for example, some tags cannot be rendered on the global map server.
The great variability of the system is driven by user demand and by particular mapping interests. This results in tagging of features that is neither uniform nor systematic. A problem of every starting project is sparse data coverage, whose growth slows down over time or concentrates on urbanized or frequently visited places. A user who works only with the output of his own GPS receiver has contributed all he knows within roughly a year, unless OSM becomes his main hobby and he seeks out travel deliberately. In summer 2007 about 10 users/editors of data were active in the Czech Republic, in spring 2008 about 20, half of them connected with Prague and the rest scattered among towns and market towns. Basic, gradual mapping of the "Hic sunt leones" places would need many more users.
Another big question is how the data can be kept up to date, and whether a self-correcting mechanism for errors exists on the users' side. Changes of the mapped objects and verification of the data are also a problem without a larger number of responsible users who would watch over the data for the areas where they move every day and where they themselves have local knowledge.
Users
The main engine of the project is Europe, specifically the English and the Germans, since this is where the project has the largest number of active users and developers and high data coverage. They set the basic tone of the project and also have a large membership base. Outside the national groups, communication is always in English.
Most users come professionally from outside the geosciences; often they are students with an evident interest in computing. They therefore need to learn elementary habits in visual interpretation, as well as syntax, semantics, systematics and topology. Even if we disregard the users' differing maturity and assume that their knowledge is of the same level and up to date, they still produce data of varying quality, owing to different methods of collection and editing, personal habits and everyday occupations.
Users also have differing ideas about the project, which shape their approach to it:
• "Someday" describes a user who sees data entry as a long-term marathon
• "Right now" describes a user who sees the entry and use of data as a matter of the present moment
• "Quality" describes a user who regards high data value (accuracy, truthfulness, verification) as the key parameter
• "Anything" describes a user who enters anything and cares mainly about high data penetration
All users are equal and there are no formal classes (administrators) that would settle disputes or vouch for edits and interventions. The main developers have certain privileges; the approval of new tags is highly democratic. Because the blank areas are still too large, users rarely run into one another, and disputes so far are minor and arise at the international level, e.g. Greece and the Middle East, where long-running political problems occasionally seep through.
An interesting aspect is vandals who would like to harm the project. If their activity were not discovered in time, it would be difficult (after blocking them) to restore the vandalized data to their original state, because the history in the main OSM database can only be accessed discretely and from the present backwards. Moreover, no user program or set of scripts has been developed for working with the history.
Conclusion
The OpenStreetMap project has existed for several years and lives a life of its own, outside the established structures concerned with mapping the surface of a predominantly urbanized land. It is going through perhaps unnecessary teething troubles; it is at its beginning, not far from the moment when the map was completely blank. Filling in the blank spots may be too great a task for the first generation of enthusiasts, which in the Czech Republic is small. It will therefore be a long time before OSM can be considered as a sole data source. Nevertheless, OpenStreetMap is a viable source of free geodata. The public, which is slowly adopting it as its own, is its great potential. It is up to geoinformaticians whether they want to get involved and bring their experience to it, so that they can later use it as a relevant or parallel source of geodata.
References
1. Rapant Petr: Družicové polohové systémy. VŠB-TU Ostrava, 2002. 200 pp. ISBN 80-248-0124-8. [cited 2008-03-30]. Available at: http://gis.vsb.cz/Publikace/Knizni Publikace/DNS GPS/DNS GPS.pdf
2. Free Software Foundation: The Free Software Definition [online]. [cited 2007-06-30]. Available at: http://www.gnu.org/philosophy/free-sw.html
3. Zeměměřický úřad (2007): Výňatek z ceníku výkonů a výrobků ZÚ [online]. [cited 2007-06-30]. Available at: http://www.cuzk.cz/GenerujSoubor.ashx?NAZEV=30-ZU CENIK
4. Aujezdský Josef (2005): GNU GPL a použití českého práva [online]. Root. [cited 2007-06-30]. Available at: http://www.root.cz/clanky/gnu-gpl-a-pouziti-ceskeho-prava/
5. Otevřel Petr (2007): Rozsudek ohledně GNU/GPL – přituhuje? [online]. Právo v informačních technologiích. [cited 2007-06-30]. Available at: http://www.pravoit.cz/view.php?nazevclanku=rozsudek-ohledne-gnugpl-prituhuje&cisloclanku=2007050004
6. Čermák Jiří (2001): GNU/GPL – Právní rozbor licence [online]. Root. [cited 2007-06-30]. Available at: http://www.root.cz/clanky/gnugpl-pravni-rozbor-licence/
7. wiki OpenStreetMap (2007): Map Features [online]. [cited 2007-06-30]. Available at: http://wiki.openstreetmap.org/index.php/Map Features
8. wiki OpenStreetMap (2007): Database schema [online]. [cited 2007-06-30]. Available at: http://wiki.openstreetmap.org/index.php/Database schema
9. Coast Stephen (2007): This Mapping Stuff Could Really Take Off. In The State Of The Map 2007. Manchester: [s.n.], 2007. Available at: http://www.slideshare.net/chippy/this-mapping-thing-could-really-take-off/
10. wiki OpenStreetMap (2007): Platform Status [online]. [cited 2007-06-30]. Available at: http://wiki.openstreetmap.org/index.php/Platform Status
11. wiki OpenStreetMap (2007): Protocol [online]. [cited 2007-06-30]. Available at: http://wiki.openstreetmap.org/index.php/Protocol
12. Ramm Frederik, Topf Jochen (2007): Towards a New Data Model for OSM [online]. [cited 2008-03-30]. Available at: http://www.remote.org/frederik/tmp/towards-a-new-data-model-for-osm.pdf
13. Schuyler Erle (2007): In response to "Towards a New Data Model for OSM" [online]. [cited 2008-03-30]. Available at: http://freemap.in/ sderle/osm-data-model.html
14. OpenStreetMap, talk-cs: WikiProject Czechia/free map2osm, a list of selected datasets for OSM-cs. [cited 2008-06-30]. Available at: http://wiki.openstreetmap.org/index.php/WikiProject Czechia/free map2osm
15. Martin Landa: reply in the FreeGeoCZ mailing list, 27 December 2006. [cited 2007-06-30]. Available at: http://mailman.fsv.cvut.cz/pipermail/freegeocz/2006-December/000118.html
GUI for the Orchestration of GeoWeb Services (GUI pro orchestraci GeoWebových služeb)
František Klímek
Institute of Geoinformatics, VSB-TU of Ostrava
[email protected]
Keywords: GeoWeb, geoinformatics, web services, orchestration, BPEL, GUI
Abstract
The research project "Orchestration of services for GeoWeb" (Orchestrace služeb pro GeoWeb), GA 205/07/0797, carried out at the Institute of Geoinformatics of VŠB-TU Ostrava, investigates the orchestration of web services in the GIS domain and verifies the practical capabilities of the available languages for describing and planning business processes. One part of the project deals with the design of a graphical user interface that would let users work with these service orchestras at different levels of functionality. What degree of functionality do the individual users require? Should they be able to search for orchestras, run them, parametrize them, modify them, or even design them? The following text tries to answer these questions. It summarizes the basic facts about orchestration in the GeoWeb domain, an analysis and description of the characteristics of the individual users, the design of the end-user graphical interface itself, and a description of the components that should be available in this interface for working with orchestras.
Introduction
Web services are inevitably becoming part of most information systems. With the growing number of freely available and commercial services, it becomes possible to connect them into functional units. By merely chaining services statically we cannot exploit their potential, let alone the potential of service-oriented architecture (SOA), which attracts the interest of all areas of the IT industry and is quickly making its way into the core of applications critical to business operations. It is therefore necessary to start chaining services dynamically, i.e. connecting them according to the user's current needs and possibilities (connection status, finances, required accuracy of results, response time, etc.). At present two ways of chaining web services are discussed, known as orchestration and choreography [PRAM].
Orchestration
Standard technologies for working with web services, such as WSDL (Web Services Description Language), SOAP (Simple Object Access Protocol) and UDDI (Universal Description, Discovery and Integration), give us the means to describe, locate and invoke individual services. Although a web service may offer many methods, each WSDL file describes literally atomic (low-level) functions. What these basic technologies do not give us are the important details that describe the behaviour of a service as part of a larger, more complex collaboration. When the collaboration is a collection of activities (methods, services) designed to accomplish a given business goal, it is called a business process. The description of the collection of activities that makes up this business process is called orchestration [PRAM].
Within the project, several languages commonly used for orchestration were analysed, and BPEL
was found suitable for orchestration in the GeoWeb environment. The main function of BPEL is the
orchestration of web services, i.e. coordinating the functionality offered by the "backend" part
of one or more systems. This functionality is decomposed into operations that can be called
through a web service. Conversely, a BPEL process itself sits behind a web service that defines
its interface, i.e. its input operations. For each entry into the process (in BPMN, a Start or
Intermediate Message Event) the web service describing the BPEL interface therefore has one
operation. Process inputs need not occur only at the start; asynchronous processes may have
inputs at various points. One can thus say that BPEL implements a web service. The application
using the web service does not know whether a process is hidden behind it or whether it is
implemented, for example, by an EJB module. BPEL is also platform independent; implementations
exist for Java EE, .NET and other platforms. A process implemented in BPEL with one tool should
also be portable to and executable in another tool. Some vendors of business process management
systems (BPMS), however, use their own BPEL extensions, which prevent this portability [TBPEL].
Architecture of the proposed system
One of the main goals of the grant project is to establish a methodology and describe an
architecture for how the whole composed system, comprising services in various forms,
orchestrations, catalogues, etc., could look and cooperate. To design the graphical user
interface, it is of course necessary to know this architecture at least at a basic level and to
know where the GUI component enters it. The following lines therefore describe the system
architecture of the research project in its current form. Further changes can be expected before
the project ends, but they should not be dramatic, so the concept of the graphical interface
should not change fundamentally.
The core of orchestration is the service registry, which provides mechanisms for registering,
categorizing and, above all, searching for web services in real time. A user who needs a specific
service searches the registry, obtains the service description and can start using it. The
registry covers not only services but also processes, whose interfaces essentially correspond to
those of services, and it offers an interface for searching by description, parameters, keywords,
performance metrics, type, etc. It is to this registry, or to a set of connected and potentially
cooperating registries, that the user connects through the graphical user interface (GUI) to
search for the required services or processes. The main requirement on the GUI application is
therefore the ability to communicate with the service registry, to let users formulate requests
in a form they understand, and to visualize the registry's responses in a user-friendly form.
The whole architecture with its individual components is shown in Fig. 1.
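The kind of registry lookup described above can be sketched as follows. This is only an
illustrative in-memory model, not the project's registry: the class names RegistryEntry and
ServiceRegistry, the field names and the sample entries are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class RegistryEntry:
    """One registered service or process (all field names are illustrative)."""
    name: str
    kind: str                      # "service" or "process"
    description: str
    keywords: set = field(default_factory=set)
    response_time_s: float = 0.0   # example performance metric
    price: float = 0.0             # example cost metric


class ServiceRegistry:
    """In-memory stand-in for the service registry (hypothetical API)."""

    def __init__(self):
        self._entries = []

    def register(self, entry):
        self._entries.append(entry)

    def search(self, text=None, keywords=None, kind=None, max_response_time_s=None):
        """Return entries matching every criterion that was supplied."""
        wanted = set(keywords or [])
        results = []
        for e in self._entries:
            if text and text.lower() not in e.description.lower():
                continue
            if wanted and not wanted <= e.keywords:
                continue
            if kind and e.kind != kind:
                continue
            if max_response_time_s is not None and e.response_time_s > max_response_time_s:
                continue
            results.append(e)
        # Faster entries first, so a GUI can list them in a useful order.
        return sorted(results, key=lambda e: e.response_time_s)


registry = ServiceRegistry()
registry.register(RegistryEntry("wms-ortho", "service", "Orthophoto map service",
                                {"map", "raster"}, response_time_s=0.8))
registry.register(RegistryEntry("flood-analysis", "process", "Flood extent analysis process",
                                {"analysis", "hydrology"}, response_time_s=12.0, price=5.0))
registry.register(RegistryEntry("wms-cadastre", "service", "Cadastral map service",
                                {"map", "vector"}, response_time_s=0.3))

maps = registry.search(keywords={"map"}, kind="service")
print([e.name for e in maps])   # ['wms-cadastre', 'wms-ortho']
```

Because processes expose the same kind of interface as services, both are stored as the same
record type here and distinguished only by the kind field.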
Fig. 1: Individual components of the proposed system
• Service 1..n
• Adapter
• Monitoring
• Service registry
• BPEL processor
• GUI
GUI
One of the outputs of the grant project is to be a graphical user interface (GUI). The interface
should allow users to work with orchestrations. The original plan was that orchestrations could
also be created in it, but this seems inappropriate (see further in the text); an external
application is better suited to that task. The GUI should therefore "only" visualize an
orchestration with its current service instances and allow the user to choose other service
instances by searching the registry, and thus to optimize the orchestration according to
individual requirements. The system could also address the needs of users, at least using a basic
set of user profile parameters. That is, a user context should be defined and an adequate
orchestration (or rather its instance) found in the knowledge base according to it.
A GUI designed and described in this way should then be implemented, for example, as a plugin
for one of the desktop GIS applications (OpenJump [OJ] appears to be a suitable candidate), or
made available as a web application, which also seems a very suitable option given its potential
reach to a large number of users. The latter variant could be implemented, for example, together
with OpenLayers [OL], a JavaScript library that allows maps to be displayed in a browser without
depending on any server-side part.
Users
If the requirement is to design a GUI with high-quality, comprehensible and intuitive controls,
it is necessary, unconventionally, to start from the middle, i.e. from the U. A GUI is designed
primarily for users, so one must start from an analysis of the users who will access the process
and of their needs. Each of them will certainly have different ideas and requirements about how
the GUI should look, how much detail about a given process it should provide and what it should
allow. First, therefore, we need to look at the roles and users that access the process.
Sources on this topic, e.g. [UBPM], [RBPM], [TILSOA] or [BOSSOA], list a large number of diverse
roles that are more or less necessary for properly designing and maintaining processes built on
this architecture. To name just a few (a closer description of the individual roles and their
competences can be found in the cited sources):
• Process owner
• Top (strategic) team or manager
• Line manager
• BPM animator
• IT specialist
• Business consultant
• BMS architect
• Process teams
• Innovation agent
• Innovation centre
• Process customer
This division comes from the environment of enterprise applications and companies that use
similar technologies and processes built on service-oriented architecture. It is certainly not a
complete and fixed list, because every company may adapt the roles to its current needs [RLBPM].
In the environment described here, however, we restrict the users to the following two groups,
which are the relevant ones for designing a GUI for the service registry and orchestration.
Users creating a process
These are users who create a process and make it available for use. Typically they are companies
creating processes that include, for example, services they develop themselves. The purpose is
thus the use of their services, which brings, for example, financial profit or publicity. A
second group creating processes may be enthusiasts who are interested in these technologies or
who create a process for their own needs and are happy to share it with others. These users
usually have at their disposal people (or are such people themselves) who are skilled in
designing and creating processes. They are teams whose members not only have the necessary
knowledge but usually also have the software needed both to design a process and to deploy it on
an application server. They can thus be described as users creating processes who then want to
register the process in the service registry and have an interest in its being used. From the
point of view of GUI design, these users already have most of what they need, whether as
commercial or open source solutions, so there is no need to devise further tools for them to
visualize or modify a process.
Users consuming a process
There is, however, a second group of users who are consumers of the processes created in this
way and only want to run them or modify them slightly (parametrize them). These users want to
find a particular process and work with it, most often just to obtain its description and run
it. This work, which consists of communicating with the service registry, should be user
friendly and should not require deeper knowledge of SOA. No such user environment, especially
one for communicating with the proposed registry, is currently available. What should it be
like? What should it make available to the user?
User requirements
The following lines describe possible user requirements for this GUI. They are ordered from the
simplest to the more advanced, which reach the boundary of process design, i.e. the boundary
with tools intended for the group of users creating processes.
• finding the required process
The main and basic requirement of users is to find the process or service they need. Users must
of course be offered a way to refine the search based on the metrics available from the service
registry.
• running the selected process
Together with the requirement above, running a process is the second and last of the main
requirements. A GUI meeting only these two requirements would probably suffice for the vast
majority of users of the registry's services.
• parametrization of the process (modification based on metrics)
Depending on how deeply the user wants to work with the process, we can speak of simple and more
complex parametrization. Simple parametrization means merely adjusting the input parameters of
the process or choosing the criterion by which the process should be adapted. The user's request
may be, for example: use only services that are free of charge. In simple parametrization the
work is left to the orchestration core, which takes over the selection logic. Only a template is
sent as input; the core turns it into a concrete form and returns the result to the user. In
more complex parametrization, by contrast, the user takes over the responsibility and the logic
and chooses, for example, a substitute for just one particular service where higher accuracy is
required.
• workflow support
Some processes can be defined as long-running processes with human interaction (Human Task
Management) [UBPM]; for these it would be appropriate to include in this unified GUI the user
interface that mediates the interaction. If the result includes a service that requires its
input to be refined, the user should not have to hunt for where to enter the refinement; instead
a simple way to do so should be offered, e.g. as part of tracking the orchestration state. So
if, during the process, the user is asked to specify whether an analysis is to be performed for
the municipality of Janovice nad Úhlavou or Janovice (Frýdek-Místek district), the user does so
by choosing from the options offered directly in the proposed GUI.
• displaying the process
The requirement to display a process will arise not only among users who want to parametrize or
modify an offered process in a more complex way; there will surely also be users who merely want
to see which services are involved.
• saving the process
After modifying a process into the desired form, some users will want to save the modified
process in the service registry so that it can be reused in the already edited form.
• displaying and searching for processes depending on the user
This item meets the needs of users who like to work in a user context, where the application
knows about the user and offers results intended specifically for them. A user whose profile
says that they are "thrifty" and use only free services will therefore not be offered paid
services.
• status tracking
Lets users track the current state of the process they started and shows information such as the
estimated time remaining until it completes.
• monitoring
Some users will require more detailed information about a running process, such as which service
is currently engaged and which service is being waited for. It would be useful to display the
process with the currently running steps highlighted.
• debugging
If an orchestration fails, some users will certainly want to know why and where it failed.
Debugging should let them run the process step by step and thus reveal the weak point, i.e. find
the service that returns incorrect results or none at all. Based on this, users can choose a
substitute service for the weak point and so run the required process, for example, faster: once
a slow service is identified, it is replaced by a service that provides comparable usable data
more quickly.
• process designer
For the group of users who are process consumers this appears unnecessary (see above).
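Returning to the parametrization requirement in the list above, the difference between the
simple and the more complex variant can be sketched in a few lines. This is a hedged
illustration only: the Candidate record, the resolve_template function, the role names and the
sample services are hypothetical and do not come from the project. In the simple case the core
fills a template using a criterion supplied by the user; in the complex case the user overrides
one step by hand.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A concrete service instance that can fill one step of a template."""
    name: str
    role: str          # which abstract step it can perform
    price: float
    accuracy_m: float  # smaller is more accurate


def resolve_template(template, candidates, accept, prefer):
    """Turn a list of abstract roles into concrete services.

    accept: predicate a candidate must satisfy (e.g. free of charge)
    prefer: sort key; the candidate with the smallest key wins
    """
    resolved = {}
    for role in template:
        options = [c for c in candidates if c.role == role and accept(c)]
        if not options:
            raise LookupError(f"no acceptable service for step '{role}'")
        resolved[role] = min(options, key=prefer)
    return resolved


CANDIDATES = [
    Candidate("geocoder-a", "geocode", price=0.0, accuracy_m=50.0),
    Candidate("geocoder-b", "geocode", price=2.0, accuracy_m=5.0),
    Candidate("router-a", "route", price=0.0, accuracy_m=20.0),
]
TEMPLATE = ["geocode", "route"]

# Simple parametrization: the core applies the rule "free services only".
free_only = resolve_template(TEMPLATE, CANDIDATES,
                             accept=lambda c: c.price == 0.0,
                             prefer=lambda c: c.accuracy_m)

# More complex parametrization: the user replaces one step by hand.
custom = dict(free_only)
custom["geocode"] = CANDIDATES[1]   # pays for higher accuracy

print({role: c.name for role, c in free_only.items()})
print({role: c.name for role, c in custom.items()})
```

Keeping the selection rule as a pair of plain functions means the same core routine can serve
other criteria, such as a maximum response time, without changing the template.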
GUI elements
The GUI will consist of elements that are usable on their own but also interlinked in certain
ways. Depending on what the user is doing, the elements relevant to the current activity will be
displayed interactively. The elements are listed in an order that tries to correspond to the
possible user requirements.
• SearchBox
• Results panel
• Dialog for working with a process
• Dialog showing detailed information about a process
• Dialog for simple parametrization
• Dialog for process visualization
• Map panel
• Button for running the process
• Button for saving the process
• Process progress tracker
• Process monitor
• Process debugger
• Process result view
• Login dialog
Form of the displayed process
BPMN is usually used when designing a new process. The primary goal of BPMN is to provide a
notation that is readily understandable by all business users: the business analysts who design
processes, the technical developers who implement the technology that executes them, and the
managers who monitor and manage them. BPMN creates a standardized bridge between business
process design and implementation. Another goal of BPMN is to allow XML languages designed for
process design and execution (such as BPEL4WS) to be visualized with a business-oriented
notation [REEN].
Only afterwards is this notation of the designed process usually converted into its
implementation in BPEL, BPML or another process execution language. BPMN defines how to map
individual elements and sequences of elements to BPEL, so a process model can be converted into
its executable form. Because modelling in BPMN is relatively free, however, it is usually not
possible to generate BPEL automatically; some BPMS tools do offer this function, at the cost of
certain restrictions on modelling itself [UBPM3]. Automatic generation can also be made possible
by strictly following the rules defined in BPMN.
Unlike BPMN, BPEL has no implicit graphical representation and describes a process at the
executable level; it is essentially program code. It is BPEL, however, that will be available
from the registry for visualizing a process in the GUI. Some software tools for building
SOA-based applications, such as NetBeans [NB], ease the transition from BPMN to BPEL by trying
to use the same graphical elements, but this is by no means the rule [TBPEL]. Since services
will be stored in the registry in BPEL form, this approach seems suitable. The process is
visualized in a form that, with a little effort, is understandable even to moderately advanced
users, and it is precisely the more advanced users who can be expected to require more advanced
functionality for working with orchestrations. Fig. 2 shows a sample process created and
visualized in NetBeans, and Fig. 3 shows the process visualized with the WEEP Engine [WEEP],
which can convert a BPEL file into SVG or PNG. This engine could be well suited to a working
implementation of the GUI described here.
Fig. 2: BPEL process visualized with NetBeans [NB]
Fig. 3: BPEL process visualized with the WEEP Engine [WEEP]
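Since BPEL is plain XML, the information a GUI needs for a basic process view (which activities
exist and which partner services they call) can be read with a standard XML parser. The sketch
below is an assumption-laden illustration, not the WEEP Engine or NetBeans approach: the sample
process, its activity names and the helper functions are invented for the example, and the code
matches elements by local name so that it does not depend on a particular BPEL namespace.

```python
import xml.etree.ElementTree as ET

# A tiny hand-written process; real files from a registry would be larger.
BPEL_SAMPLE = """\
<process name="FloodReport"
         xmlns="http://docs.oasis-open.org/wsbpel/2.0/process/executable">
  <sequence>
    <receive name="start" partnerLink="client" operation="run" createInstance="yes"/>
    <invoke name="geocodePlace" partnerLink="geocoder" operation="geocode"/>
    <invoke name="fetchTerrain" partnerLink="terrainWcs" operation="getCoverage"/>
    <reply name="done" partnerLink="client" operation="run"/>
  </sequence>
</process>
"""


def local_name(tag):
    """Strip the '{namespace}' prefix ElementTree puts in front of tag names."""
    return tag.rsplit("}", 1)[-1]


def list_activities(bpel_text, kinds=("receive", "invoke", "reply")):
    """Return (kind, name, partnerLink, operation) tuples in document order."""
    root = ET.fromstring(bpel_text)
    found = []
    for element in root.iter():
        kind = local_name(element.tag)
        if kind in kinds:
            found.append((kind, element.get("name"),
                          element.get("partnerLink"), element.get("operation")))
    return found


activities = list_activities(BPEL_SAMPLE)
invoked = [a[2] for a in activities if a[0] == "invoke"]
print(invoked)   # ['geocoder', 'terrainWcs']
```

A list like this is enough for the simplest display requirement, showing which services are
involved; a full graphical rendering would additionally have to handle structured activities
such as flows and conditions.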
Usage scenario
The following lines describe a possible scenario of working with the graphical interface for
orchestrating GeoWeb services. Depending on the form, desktop application or web interface, the
user starts either by invoking a menu item in the application for which the plugin has been
created, or by launching a web browser and entering the address of the web client application.
The user is then shown the initial dialog window containing a text field and a map field, both
intended for searching for services. There will also be an advanced option that allows the
search expression to be refined or that already at this point requests that the resulting
orchestrations be returned parametrized, e.g. by price. A login option should also be available,
which would then influence the services and processes found. The user is then shown the services
and orchestrations found, with the option to display more details. Geographic details will again
be shown using the component that provides map output.
Fig. 4: Design of the home page of the portal serving ordinary users
Fig. 5: Display of the services found
After an orchestration is selected, the user can directly either parametrize it in the simple
way or run it. If a modification is requested, the registry modifies the process and returns it
in a similar dialog window (web page); parametrization can be repeated until the user is
satisfied.
Fig. 6: Display of all details of a process, including the options to parametrize and run it
When more complex parametrization is chosen, the process is shown to the user in its graphical
form (see Fig. 2 or Fig. 3). When a service is to be replaced by another, the service search
dialog is used again to choose it. After the process is started, the user is shown a progress
dialog and then the result.
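A progress dialog, and likewise the step-by-step debugging mentioned among the requirements,
needs a status record after every step of the process. One minimal way to model this, assuming
each step is a local callable with an expected duration (the function run_stepwise, the step
functions, the durations and the coordinates are all made up for illustration), is a generator
that yields one record per step and stops at the first failure.

```python
def run_stepwise(steps, data):
    """Run steps one at a time, yielding a status record after each.

    steps: list of (name, callable, expected_seconds)
    Yields dicts that a GUI could render in a progress or debug view.
    """
    total = sum(expected for _, _, expected in steps)
    done = 0.0
    for index, (name, func, expected) in enumerate(steps, start=1):
        try:
            data = func(data)
            status, error = "ok", None
        except Exception as exc:        # report the failure instead of crashing
            status, error = "failed", str(exc)
        done += expected
        yield {
            "step": index,
            "name": name,
            "status": status,
            "error": error,
            "remaining_s": total - done,   # estimate from expected durations
            "data": data,
        }
        if status == "failed":
            return


def geocode(place):
    return {"place": place, "xy": (13.2, 49.3)}   # made-up coordinates


def broken_analysis(payload):
    raise RuntimeError("service returned no data")


def render(payload):
    return "report"


STEPS = [("geocode", geocode, 2.0),
         ("analysis", broken_analysis, 30.0),
         ("render", render, 5.0)]

log = list(run_stepwise(STEPS, "Janovice nad Úhlavou"))
for record in log:
    print(record["step"], record["name"], record["status"], record["remaining_s"])
```

The last record in the log identifies the weak point ("analysis" in this run), which is the step
for which the user would then look up a substitute service.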
Proposed interface
It follows from the above that the GUI will access the Service Registry and the BPEL processor.
The following lines describe the basic interface to these two components.
GUI – Service Registry
Fig. 7: Display of information about the progress of a running process
• getServices() – returns a list of processes/services adjusted according to metrics, templates,
the user, etc. The returned list also includes basic metrics and information about the
processes and services.
• getDetail() – returns all available information about a process or service. It can return the
process as a BPEL file, which can then be visualized.
• save() – saves a modified process to the service registry for later reuse.
GUI – BPEL processor
• execute() – calls the BPEL processor to run a particular service stored in the service
registry, or a service that has been modified by the user and is not meant to be saved in the
registry.
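The four operations listed above can be written down as a small typed interface. The paper
gives only the operation names, so everything else in the sketch below is an assumption: the
parameters, the return types, the class names and the two in-memory fakes, which exist only so
that the consumer workflow (search, fetch detail, execute) can be exercised without a server.
The camelCase method names are kept to mirror the names used in the text.

```python
from typing import Protocol


class RegistryClient(Protocol):
    # Parameter lists are hypothetical; only the method names come from the text.
    def getServices(self, query: str, profile: dict) -> list: ...
    def getDetail(self, service_id: str) -> dict: ...
    def save(self, service_id: str, bpel: str) -> None: ...


class ProcessorClient(Protocol):
    def execute(self, bpel: str, inputs: dict) -> dict: ...


class InMemoryRegistry:
    """Fake registry so the GUI logic can be tried out locally."""

    def __init__(self):
        self._store = {}

    def getServices(self, query, profile):
        hits = []
        for sid, item in self._store.items():
            if query.lower() not in item["title"].lower():
                continue
            if profile.get("free_only") and item["price"] > 0:
                continue
            hits.append({"id": sid, "title": item["title"], "price": item["price"]})
        return hits

    def getDetail(self, service_id):
        return dict(self._store[service_id])

    def save(self, service_id, bpel, title="", price=0.0):
        self._store[service_id] = {"title": title or service_id, "price": price, "bpel": bpel}


class EchoProcessor:
    """Fake BPEL processor: it does not interpret BPEL, it only echoes inputs."""

    def execute(self, bpel, inputs):
        return {"status": "finished", "inputs": inputs, "bpel_chars": len(bpel)}


def find_and_run(registry: RegistryClient, processor: ProcessorClient, query, profile, inputs):
    """The basic consumer workflow: search, fetch detail, execute."""
    hits = registry.getServices(query, profile)
    if not hits:
        return None
    detail = registry.getDetail(hits[0]["id"])
    return processor.execute(detail["bpel"], inputs)


registry = InMemoryRegistry()
registry.save("flood-free", "<process/>", title="Flood analysis (free)")
registry.save("flood-pro", "<process/>", title="Flood analysis (paid)", price=9.0)

result = find_and_run(registry, EchoProcessor(), "flood", {"free_only": True}, {"area": "Janovice"})
print(result)
```

Passing the user profile to getServices() is one way to realize the "results depending on the
user" requirement; the registry, not the GUI, then decides which entries to hide.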
Conclusion
At present the graphical user interface is designed at a theoretical level, and model dialogs
and components that could be useful for working with orchestrations have been created. To
confirm its usability and user friendliness, however, the most important step will be
confronting the design directly with users. Only after this interaction with a selected, diverse
group of users, in the first phase also carried out at a theoretical level, is it appropriate to
proceed to the practical implementation of the GUI. A second round of interaction with users
should follow, incorporating the comments arising from real work with the GUI. The current
design is based on the currently proposed architecture, which may still change slightly, and
such changes may be reflected in the proposed graphical interface.
References
[BOSSOA] Bose S., Bieberstein N., Fiammante M., Jones K., Shah R., SOA Project Planning Aspects,
online: http://www.informit.com/articles/article.aspx?p=422305&seqNum=5
[NB] NetBeans product home page, online: http://www.netbeans.org/
[OJ] OpenJump project home page, online: http://openjump.org/wiki/show/HomePage
[OL] OpenLayers project home page, online: http://www.openlayers.org/
[PRADP] Prager M., Řetězení webových služeb v prostředí open source GIS. Master's thesis,
Ostrava, 2007, online: http://gisak.vsb.cz/ pra089/texty/DP pra089 v1 0.pdf
[PRAM] Prager M., Maršík V., Využití orchestrace služeb pro řešení úloh v rámci ISKŘ,
online: http://gis.vsb.cz/GIS Ostrava/GIS Ova 2008/sbornik/Lists/Papers/093.pdf
[RBPM] Role BPM, BPM Portál, online: http://www.procesy.cz/Metodiky/Role-BPM.htm
[REEN] BPMN & BPEL for business analysts, Úvod do kurzu,
online: http://www.reengine.cz/index/bpmn-and-bpel-for-business-analysts.do
[RLBPM] Organizační struktury v procesním řízení, BPM slovníček,
online: http://bpm-slovnik.blogspot.com/2007/09/organizace.html#Role
[TBPEL] Vašíček P., Seriál BPM prakticky, 5. část: Tvorba BPEL modulu,
online: http://bpm-sme.blogspot.com/2008/04/5-tvorba-bpel-modulu.html
[TILSOA] Tilkov S., Roles in SOA Governance, online: http://www.infoq.com/articles/tilkov-soa-roles
[UBPM] Vašíček P., Seriál BPM prakticky, 1. část: Proč BPM s open source nástroji,
online: http://bpm-sme.blogspot.com/2008/02/1-uvod-do-bpm-pro-sme.html
[UBPM3] Vašíček P., Seriál BPM prakticky, 3. část: Úvod do BPMN,
online: http://bpm-sme.blogspot.com/2008/03/3-uvod-do-bpmn.html
[WEEP] WEEP project home page, online: http://weep.gridminer.org/index.php/About WEEP