Metody Informatyki Stosowanej

Transkrypt

Metody Informatyki Stosowanej
Polska Akademia Nauk Oddział w Gdańsku
Komisja Informatyki
Metody Informatyki Stosowanej
Nr 4/2008 (Tom 17)
Szczecin 2008
Metody Informatyki Stosowanej
Kwartalnik Komisji Informatyki Polskiej Akademii Nauk Oddział w Gdańsku
Komitet Naukowy:
Przewodniczący:
prof. dr hab. inż. Henryk Krawczyk, czł. koresp. PAN, Politechnika Gdańska
Członkowie:
prof. dr hab. inż. Michał Białko, czł. rzecz. PAN, Politechnika Koszalińska
prof. dr hab. inż. Ludosław Drelichowski, Uniwersytet Technologiczno-Przyrodniczy w Bydgoszczy
prof. dr hab. inż. Janusz Kacprzyk, czł. koresp. PAN, Instytut Badań Systemowych PAN
prof. dr hab. Jan Madey, Uniwersytet Warszawski
prof. dr hab. inż. Leszek Rutkowski, czł. koresp. PAN, Politechnika Częstochowska
prof. dr hab. inż. Piotr Sienkiewicz, Akademia Obrony Narodowej
prof. dr inż. Jerzy Sołdek, Politechnika Szczecińska
prof. dr hab. inż. Andrzej Straszak, Instytut Badań Systemowych PAN
prof. dr hab. Maciej M. Sysło, Uniwersytet Wrocławski
Recenzenci:
prof. dr hab. inż. Ludosław Drelichowski, Uniwersytet Technologiczno-Przyrodniczy w Bydgoszczy
dr hab. inż. Krzysztof Giaro, prof. PG, Politechnika Gdańska
prof. dr hab. inż. Larysa Globa, National Technical University of Ukraine
prof. dr hab. inż. Janusz Górski, Politechnika Gdańska
prof. Shinya Kobayashi, Ehime University
prof. dr hab. inż. Leonid Kompanets, Politechnika Częstochowska
dr hab. inż. Georgy Kukharev, prof. PS, Politechnika Szczecińska
dr hab. inż. Eugeniusz Kuriata, prof. UZ, Uniwersytet Zielonogórski
dr hab. inż. Emma Kusztina, prof. PS, Politechnika Szczecińska
dr hab. inż. Karol Myszkowski, Politechnika Szczecińska
prof. dr hab. inż. Andrzej Piegat, Politechnika Szczecińska
dr hab. Jacek Pomykała, Uniwersytet Warszawski
prof. dr hab. inż. Remigiusz Rak, Politechnika Warszawska
prof. dr hab. inż. Valeriy Rogoza, Politechnika Szczecińska
dr hab. inż. Khalid Saeed, prof. PB, Politechnika Białostocka
prof. dr hab. inż. Boris Sovetov, St. Petersburg Electrotechnical University
dr hab. inż. Antoni Wiliński, prof. PS, Politechnika Szczecińska
dr hab. Zenon Zwierzewicz, prof. AM, Akademia Morska w Szczecinie
Redaktor Naczelny:
Antoni Wiliński
Sekretarz redakcji:
Piotr Czapiewski
ISSN 1898-5297
ISBN 978-83-925803-6-2
Wydawnictwo:
Polska Akademia Nauk Oddział w Gdańsku
Komisja Informatyki
Adres kontaktowy: ul. Żołnierska 49 p. 104, 71-210 Szczecin
Druk: Pracownia Poligraficzna Wydziału Informatyki Politechniki Szczecińskiej
Nakład 510 egz.
Spis treści
Włodzimierz Bielecki, Krzysztof Kraska
INCREASING DATA LOCALITY OF PARALLEL PROGRAMS EXECUTED IN EMBEDDED SYSTEMS . . . . . 5
Bartosz Bielski, Przemysław Klęsk
PASSIVE OPERATING SYSTEM FINGERPRINTING USING NEURAL NETWORKS AND INDUCTION OF
DECISION RULES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
Janusz Bobulski
ANALYSIS OF 2D PROBLEM IN HMM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
Michał Choraś, Adam Flizikowski, Anna Stachowicz, Marta Redo, Rafał Renk,
Witold Hołubowicz
ONTOLOGY NOTATION AND DESCRIPTION OF VULNERABILITIES IN HETEROGENEOUS NETWORKS
AND CRITICAL INFRASTRUCTURES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
Magdalena Ciszczyk, Emma Kusztina
THE ROLE OF STANDARDIZATION IN THE PROCESS OF FORMING QUALITY OF EDUCATIONAL
REPOSITORY IN ODL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
Paweł Forczmański
UNIFIED JPEG AND JPEG-2000 COLOR DESCRIPTOR FOR CONTENT-BASED IMAGE RETRIEVAL . . 53
Dariusz Frejlichowski
AN OUTLINE OF THE SYSTEM FOR COMPUTER-ASSISTED DIAGNOSIS BASED ON BINARY
ERYTHROCYTES SHAPES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
Jarosław Jankowski
PERSONALIZACJA PRZEKAZU INTERAKTYWNEGO Z UDZIAŁEM METOD ANALIZY CZYNNIKOWEJ . . . . . . . . . . . . 71
Henryk Krawczyk, Sławomir Nasiadka
METODA WYZNACZANIA KONTEKSTU DLA APLIKACJI STEROWANYCH ZDARZENIAMI . . . . . . . . 81
Mariusz Kubanek, Szymon Rydzek
A HYBRID METHOD OF PERSON VERIFICATION WITH USE INDEPENDENT SPEECH AND FACIAL
ASYMMETRY . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
Krzysztof Makles
IDENTYFIKACJA CHARAKTERYSTYK MODELI W TRYBIE ON-LINE WRAZ Z WIZUALNĄ
REKONSTRUKCJĄ CHARAKTERYSTYKI. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
Wojciech Maleika, Michał Pałczyński
VIRTUAL MULTIBEAM ECHOSOUNDER IN INVESTIGATIONS ON SEA BOTTOM MODELING . . . . . . . . . . . . 111
Radosław Mantiuk
HDRLIB: BIBLIOTEKA DO SZYBKIEGO PRZETWARZANIA OBRAZÓW HDR WYKORZYSTUJĄCA
ZAAWANSOWANE MOŻLIWOŚCI PROCESORÓW CPU . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121
Ireneusz Mrozek, Eugenia Buslowska, Bartosz Sokol
MPS(3N) TRANSPARENT MEMORY TEST FOR PATTERN SENSITIVE FAULT DETECTION . . . . . . 129
Adam Nowosielski, Krzysztof Kłosowski
ANALIZA NATĘŻENIA RUCHU OSÓB NA MONITOROWANYCH OBSZARACH . . . . . . . . . . . . . . . 139
Walenty Oniszczuk
MODELLING AND ANALYSIS OF TWO-SERVER NETWORKS WITH FINITE CAPACITY BUFFERS AND
BLOCKING . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151
Jerzy Pejaś
DIRECTED THRESHOLD SIGNCRYPTION SCHEME FROM BILINEAR PAIRING UNDER SOLE CONTROL
OF DESIGNATED SIGNCRYPTER . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161
Andrzej Piegat, Marek Landowski
LACK OF INFORMATION – AVERAGE DISTRIBUTION OF PROBABILITY DENSITY . . . . . . . . . . . . 173
Jacek Pomykała, Bartosz Źrałek
DYNAMIC GROUP THRESHOLD SIGNATURE BASED ON DERANDOMIZED WEIL PAIRING
COMPUTATION . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 183
Orest Popov, Anna Barcz, Piotr Piela
DETERMINING THE EFFECTIVE TIME INTERVALS IN THE RECURRENT PROCESSES OF
IDENTIFICATION OF DYNAMIC SYSTEMS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 195
Michał Słonina, Imed El Fray
INFRASTRUKTURA ZARZĄDZANIA UPRAWNIENIAMI WYKORZYSTUJĄCA KONTROLĘ DOSTĘPU
OPARTĄ NA ROLACH ZINTEGROWANĄ Z JĄDREM SYSTEMU OPERACYJNEGO . . . . . . . . . . . . . 203
Marcin Suchorzewski
EVOLVING WEIGHTED TOPOLOGIES FOR NEURAL NETWORKS USING GENETIC PROGRAMMING ..... 211
Galina Ţariova, Alexandr Ţariov
EFEKTYWNE ALGORYTMY WYZNACZANIA WYRAŻENIA POSTACI Y=(A B)X ............................. 221
Tatiana Tretyakova, Abdullah Zair
FORMATION OF THE CONTENTS OF KNOWLEDGE'S BASES FOR LOCAL INTELLIGENT DECISION
SUPPORT SYSTEMS .................................................................................................................... 231
Jarosław Wątróbski, Zbigniew Piotrowski
IMPACT OF THE PRESENCE OF LINGUISTIC DATA ON THE DECISION AID PROCESS .......................... 241
Increasing data locality of parallel programs
executed in embedded systems
Włodzimierz Bielecki, Krzysztof Kraska
Szczecin University of Technology,
Faculty of Computer Science and Information Technology
Abstract:
Increasing data locality in a program is a necessary factor to improve performance of
software parts of embedded systems, to decrease power consumption and to reduce the
on-chip memory size. A possibility of applying a method of quantifying data locality to a novel
method of extracting synchronization-free threads is introduced. It can be used to
agglomerate extracted synchronization-free threads for adapting a parallel program to a
target architecture of an embedded system under various loop schedule options (space-time
mapping) and the influence of well-known techniques to improve data locality. The
choice of the best combination of loop transformation techniques with regard to data
locality makes it possible to improve program performance. A way of analysing data
locality is presented. Experimental results are depicted and discussed. Conclusions and
future research are outlined.
Keywords:
data locality, compilers, parallel processing, embedded systems
1. Introduction
Embedded systems involved in data processing consist of programmable processors,
program components processed by the processors, and hardware components, often realized
in FPGAs, cooperating with the software parts of the system. Software components
enable making corrections quickly, code reuse and flexible program changes, which permits
reducing the time of delivering a product to the market. But programmable processors
consume considerably more energy and they are significantly slower than their hardware
counterparts. Hardware solutions assure greater performance and smaller power
consumption; however, the design time may be long and the design process is expensive [9].
Multiprocessor architectures for embedded systems are widespread on the contemporary
electronic market. For example, the Xilinx FPGA Virtex-4FX chip includes up
to two PowerPC405 processors, National Semiconductor's Geode chips make it possible to join
several processors to build a multiprocessor system based on the x86 architecture, and the
HPOC project (Hundred Processors, One Chip) undertaken at Hewlett Packard attempts
to consolidate hundreds of processors on one chip using co-resident on-chip memory [4].
Similarly to the computer software development, the embedded system development
needs programming languages, debuggers, compilers, linkers and other programming
tools. Approved as an IEEE standard, the SystemC language is an example of the tool
that enables the implementation of both software and hardware parts of embedded systems.
The optimal implementation of software components designed for multiprocessor
embedded systems is critical for their performance and power consumption. However,
poor data locality is a common feature of many existing numerical applications [6].
Such applications are often represented with affine loops where the considerable quantities
of data placed in arrays exceed the size of a fast but small cache memory. For an
inefficient code, referenced data have to be fetched to a cache from external memory
although they could be reused many times. Cache memories are expensive but often
operate at the full speed of a processor, while cheaper but more capacious external memory
modules operate at a several times lower frequency; hence the time to access data located
in a cache memory is significantly shorter. Improvement in data locality can be obtained by
means of high-level program transformations. Increasing data locality within a program
improves the utilization of the fast data cache and limits accesses to slower memory
modules at lower levels. Finally, it yields a general performance improvement for the software
parts of embedded systems.
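The cache effect described above can be seen in a minimal example. The code below is not taken from the paper; the array, its size and the loop bodies are illustrative assumptions. It only shows how the loop order changes the access stride and, consequently, how often a new cache line has to be fetched from the slower external memory.

#include <vector>

const int N = 2048;                          // example size, not from the paper

// The inner loop over j walks consecutive addresses of a row-major array,
// so every fetched cache line is fully reused (good spatial locality).
void rowOrder(std::vector<double>& a) {
    for (int i = 0; i < N; ++i)
        for (int j = 0; j < N; ++j)
            a[(long)i * N + j] += 1.0;
}

// The inner loop over i jumps by N elements, so nearly every access
// touches a different cache line (poor spatial locality).
void columnOrder(std::vector<double>& a) {
    for (int j = 0; j < N; ++j)
        for (int i = 0; i < N; ++i)
            a[(long)i * N + j] += 1.0;
}

int main() {
    std::vector<double> a((long)N * N, 0.0);
    rowOrder(a);
    columnOrder(a);
    return 0;
}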
2. Analysis of data locality for synchronization-free slices
A new method of extracting synchronization-free slices (SFS) in arbitrarily nested
loops was introduced in [1]. The method enables us to extract more parallel threads than
other methods. The well-known technique invented by Wolfe [3] estimates data reuse
factors. It makes it possible to adopt such an order of loop execution that increases data
locality in a program. In relation to the method of extracting synchronization-free slices,
the estimation of data locality is a necessary activity to obtain improved performance
for a program executed on a target architecture. The SFS method extracts the maximal
number of parallel threads; however, any target embedded architecture consists of a
fixed number of CPU cores, usually smaller than the number of threads extracted.
Hence, it is necessary to adjust the level of parallelism in a program to the target architecture [10]. Our previous research conducted on parallel computers indicates that the
extraction of synchronization-free slices as well as applying the tiling and the array
contraction techniques within an individual thread can considerably increase the performance of a parallel program. For example, the results of the experiments performed
for the Livermore Loops Kernel 1 (hydro fragment) [5] and the matrix multiplication
algorithm [6] indicate the considerable gains in the execution time (Figure 1a
and Figure 1b) [2].
On the contrary, the example of a simple code in Figure 2 executed on the same target
architecture proves that the extraction of parallel slices can, under certain circumstances,
limit the performance of a program – the execution time of the parallel code (8
seconds) was about 30% greater than that of the corresponding sequential code (6
seconds). It can be noticed that the parallel code has a decreased spatial-reuse factor
value for the reference to the array a[], caused to a large extent by the need to maintain
coherence between the caches of multiple processors.
Figure 1. Execution time of (a) Livermore Loops Kernel 1, (b) matrix multiplication
…
#define SIZE 100000000
int main(void)
{
    double *a = new double[SIZE];
    for(long i=0; i<SIZE; i++)
        a[i] = (double)i + 1;
    for(long i=0; i<(SIZE-2); i++)
        a[i+2] = sin((a[i]*SIZE+1)/SIZE);
    return 0;
}
(a) Sequential code
#include <omp.h>
…
#define SIZE 100000000
int main(void)
{
    omp_set_num_threads(2);
    double *a = new double[SIZE];
    for(long i=0; i<SIZE; i++)
        a[i] = (double)i + 1;
    #pragma omp parallel for private(j,i) shared(a)
    for(long j=0; j<2; j++)
        for(long i=j; i<(SIZE-2); i+=2)
            a[i+2] = sin((a[i]*SIZE+1)/SIZE);
    return 0;
}
(b) Parallel code
Figure 2. Decreased performance in the parallel version of a simple code
Assuring the optimal performance of a program with the level of parallelism limited
to the capabilities of a target architecture requires the iterative estimation of data locality
for different ways of slice agglomeration, in combination with different types of
scheduling (space-time mapping) and the application of well-known techniques for
improving data locality. The choice of the best combination of the above-mentioned
elements with respect to a generalized data locality factor will assure the optimal
performance of a program.
For both the Livermore Loops Kernel 1 and the matrix multiplication algorithm,
the level of parallelism was adjusted to the target architecture by applying the function
omp_set_num_threads( omp_get_num_procs( ) ) from the OpenMP
library. The number of threads created for the target architecture (2 x Intel Pentium 4
with Hyper-Threading technology) was 4. The size of the DataCache-L1 line for the
processor was 128 bytes, which corresponds to 32 array elements. Since the
#pragma parallel directive in both programs omitted the type of iteration scheduling,
the compiler applied the static scheduling policy, allocating ¼ of the consecutive
iterations to each thread [12].
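The adjustment described above can be sketched as follows; the loop body is a placeholder, and only the OpenMP calls and the schedule(static) clause correspond to what is described in this paragraph.

#include <omp.h>
#include <vector>

int main() {
    const long SIZE = 1000000;                       // placeholder problem size
    std::vector<double> a(SIZE, 1.0);

    // One thread per available processor, as done for the benchmarks.
    omp_set_num_threads(omp_get_num_procs());

    // schedule(static) splits the iteration space into equal, consecutive
    // chunks, one chunk per thread (with 4 threads: 1/4 of the iterations each).
    #pragma omp parallel for schedule(static)
    for (long i = 0; i < SIZE; ++i)
        a[i] = a[i] * 2.0 + 1.0;                     // placeholder work
    return 0;
}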
The data locality factors for the matrix multiplication algorithm and the Livermore
Loops Kernel 1 calculated with the method [3] are presented in Table 1, Table 2 and
Table 3.
Table 1. Reuse factors for the matrix multiplication algorithm

Reference | Temporal (k, j, i) | Spatial (k, j, i) | Self-reuse Rk*, Rj*, Ri* | Cumulative self-reuse | Data footprint Fk*, Fj*, Fi*
z[i][j]   | N, 1, 1   | 1, 32, 1 | N, 32, 1   | N, 32N, 32N  | 1, N/32, N²/128
x[i][k]   | 1, N, 1   | 32, 1, 1 | 32, N, 1   | 32, 32N, 32N | N/32, N/32, N²/128
y[k][j]   | 1, 1, N/4 | 1, 32, 1 | 1, 32, N/4 | 1, 32, 8N    | N, N/32, N²/32
Σ         | Ri = N²/32 + N/16, Rj = N + N/32 + 1, Rk = 72N; i: 64N + 32, j: N + 33, k: N/4 + 2; i: N + 64, j: N + 33, k: 2; Fi* = 6N²/128
In the case of the matrix multiplication, there are no separate references to the same
array, therefore only the self-reuse factors were calculated. For arrays of size N x N,
where N=2048, the data footprint for the outermost loop, Fi*, considerably exceeded the
size of DataCache-L1, causing accesses to a slower memory level:
   Fi* = 6·N² / 128;  N = 2048;  Fi* = 196608.
Table 2. Temporal-reuse factors for the Livermore Loops Kernel 1 (hydro fragment)

Reference | Temporal (k, l) | Spatial (k, l) | Self-reuse Rk, Rl | Cumulative self-reuse Rk*, Rl* | Data footprint Fk*, Fl*
x[k]      | 1, loop | 32, 1 | 32, loop | 32, 32·loop | n/32, n/128
y[k]      | 1, loop | 32, 1 | 32, loop | 32, 32·loop | n/32, n/128
z[k+10]   | 1, loop | 32, 1 | 32, loop | 32, 32·loop | n/32, n/128
z[k+11]   | 1, loop | 32, 1 | 32, loop | 32, 32·loop | n/32, n/128
Σ         | –       | –     | 128, 4·loop | 128, 128·loop | n/8, n/32
Table 3. Spatial-reuse factors for the Livermore Loops Kernel 1 (hydro fragment)

Reference | Group-temporal reuse (k, l) | Self-spatial reuse (k, l) | Cumulative group reuse (k, l)
z[k+11]   | 1, 1    | 32, 1 | 32, 32
z[k+10]   | 1, loop | 32, 1 | 32, 32·loop
The value Fi* = 196608 for the 4-byte array element size gives 768 KB. After applying
the tiling technique and splitting the data into blocks with the side B = 32 elements, the
value of the data footprint decreased, so that the data could be placed entirely
in DataCache-L1:

   Fi* = 6·B² / 128;  B = 32;  Fi* = 48.
The value Fi* = 48 for the 4-byte array element size gives 192 B. It should be noticed
that DataCache-L1 was shared between the 2 parallel threads.
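For reference, the blocking (tiling) transformation discussed above can be sketched for the i-j-k matrix multiplication nest as below. The code assumes row-major arrays of 4-byte float elements and the block side B = 32; it is only an illustration, not the exact source used in the experiments.

const int N = 2048;      // example matrix size
const int B = 32;        // block side assumed in the footprint estimate above

// Blocked matrix multiplication z = z + x*y: while one combination of B x B
// tiles is processed, its data footprint (cf. Fi* = 6·B²/128 above) fits into
// DataCache-L1 and the tiles are reused many times before being evicted.
void matmulTiled(const float* x, const float* y, float* z) {
    for (int ii = 0; ii < N; ii += B)
        for (int jj = 0; jj < N; jj += B)
            for (int kk = 0; kk < N; kk += B)
                for (int i = ii; i < ii + B; ++i)
                    for (int j = jj; j < jj + B; ++j) {
                        float s = z[i * N + j];
                        for (int k = kk; k < kk + B; ++k)
                            s += x[i * N + k] * y[k * N + j];
                        z[i * N + j] = s;
                    }
}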
In the case of the Livermore Loops Kernel 1 (hydro fragment), the self-reuse factors
are identical for the source with fine-grained parallelism and the source with synchronization-free
slices extracted. There is also group reuse between the references z[k+11]
and z[k+10], sorted so that the reuse distance between adjacent references is lexicographically
nonnegative. There are also self-temporal and self-spatial reuse factors for
both references. The group-spatial reuse factor equals one, since there is the
self-spatial reuse factor. To take into account reuse between references, a generalized
data reuse factor for the outermost loop Ll is computed by dividing the data footprint
by the cumulative group reuse factor, which finally gives:

   (n/32) / (32·loop) = n / (32²·loop).
3. Experiments
Experiments were performed by means of the software simulator IBM PowerPC
Multi-Core Instruction Set Simulator v1.29 (MC-ISS) [7] intended for the PowerPC
405/440/460 embedded systems development and the related IBM RISCWatch v6.0i
debugger [8]. Cache utilization figures were obtained from the DCU statistics (sim readdcu)
of the simulator.
The following configuration of the simulator was used to conduct the experiments:
– 2 x PowerPC405 processors,
– 16 KB two-way set-associative DataCache-L1 (8 words / 32-byte cache line),
– no DataCache-L2.
The sources used in the experiments were developed in a manner representative
of embedded software development, using a cross-platform development environment
composed of an Intel PC workstation and the target executable architecture
[8]. The examined C sources were compiled on Fedora 4 Linux x86 to the PowerPC
Embedded ABI file format by means of the gcc-3.3.1 compiler and executed in the
target system environment using the MC-ISS software simulator. Due to the target
architecture limitations, two data-processing threads were extracted in the sources.
Iterations of a parallel loop were assigned to threads according to the static scheduling
policy, i.e., each thread was assigned half of the consecutive loop iterations [12].
Table 4 shows the results achieved for the matrix multiplication code being simulated in the MC-ISS embedded software simulator.
Table 4. The experimental results of DCU utilization for the matrix multiplication code (N=256, B=8)

RISCWatch STATUS   | Sequential          | Parallel SFS            | Parallel SFS with Blocking | Parallel SFS with Blocking & Array Contraction
                   | CPU0      | CPU1    | CPU0       | CPU1       | CPU0       | CPU1          | CPU0       | CPU1
DCU total accesses | 31852424  | N/A     | 127044104  | 127044104  | 145012296  | 145012296     | 119846472  | 119846472
DCU misses         | 2160751   | N/A     | 8634538    | 8634538    | 317789     | 317789        | 317789     | 317789
Misses/total [%]   | 6,8%      | N/A     | 6,8%       | 6,8%       | 0,22%      | 0,22%         | 0,27%      | 0,27%
Table 5 shows the results obtained for the Livermore Loops Kernel 1 (hydro fragment) code executed in the MC-ISS embedded software simulator.
Table 5. The experimental results of DCU utilization for the Kernel 1 (loop=100; array_size=8192*sizeof(int))

RISCWatch STATUS   | Sequential         | Parallel            | Parallel SFS          | Parallel SFS with Array Contraction
                   | CPU0      | CPU1   | CPU0      | CPU1    | CPU0      | CPU1      | CPU0      | CPU1
DCU total accesses | 11527637  | N/A    | 5800687   | 5800687 | 11576472  | 11576472  | 8399916   | 8399916
DCU misses         | 309546    | N/A    | 155799    | 155799  | 5130      | 5130      | 5131      | 5131
misses/total [%]   | 2,69%     | N/A    | 2,69%     | 2,69%   | 0,04%     | 0,04%     | 0,06%     | 0,06%
The examined sources have achieved the same (in the first case) and much better (in
the second case) DCU misses/total ratio after synchronization-free slices extraction.
Obviously, the increase in performance of the programs due to the parallel execution with no
synchronizations is not considered in the tables. Applying the blocking technique has
further improved data locality. For the Parallel SFS sources with the array contraction
technique applied, the misses/total ratio does not properly render the data locality
improvement, since reused data were placed in CPU registers and therefore the DCU total
accesses factors were decreased. In fact, data locality is better than previously due to the
usage of registers, the fastest memory.
The results achieved in the foregoing experiments confirmed the results previously
achieved for real multiprocessor computers. They indicated a significant improvement
of the DCU utilization for a PowerPC405 processor used in embedded systems
where well-known optimization techniques to improve data locality were applied within
synchronization-free slices.
4. A source-to-source compiler
We intend to implement the results of our research in an academic source-to-source
compiler. Figure 3 illustrates the structural overview of the software to be built.
Figure 3. A structural overview of the software to be built based on the results of the research
The MainController is responsible for managing the execution of all compiler modules.
The SlicesExtractionModule implements the method of extracting parallel synchronization-free
slices [1]. It performs the extraction of slices from an input C source,
taking advantage of the Omega Calculator Library [11] to carry out a dependence analysis.
The LoopTransformationModule analyses possible combinations of the slices agglomeration
and applies to the output code various space-time scheduling options as well as
techniques for improving data locality (i.e. tiling and array contraction).
The DataLocalityEstimationModule implements the method of calculating data locality factors.
The VOPC domain model of the module worked out during the research is presented in
Figure 4. Both latter modules use the linear algebra engine from Wolfram Research's
Mathematica package. The CodeGenerationModule is responsible for the generation of
source code destined for the target architecture; there is the explicit assumption that the
tool will be able to generate output code according to the OpenMP 2.0 standard.
Figure 4. VOPC domain model of the DataLocalityEstimationModule
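The sketch below shows one hypothetical way the modules listed above could be composed by the MainController. Only the module names come from the text; the method names, data types and control flow are illustrative assumptions, not the authors' implementation.

#include <string>
#include <vector>

struct Slice { };                                      // a synchronization-free slice (placeholder)
struct Variant { std::string code; double locality; }; // one transformed-code candidate

class SlicesExtractionModule {
public:
    std::vector<Slice> extract(const std::string&) { return {}; }            // stub
};
class LoopTransformationModule {
public:
    std::vector<Variant> variants(const std::vector<Slice>&) { return {}; }  // stub
};
class DataLocalityEstimationModule {
public:
    double estimate(const Variant&) { return 0.0; }                          // stub
};
class CodeGenerationModule {
public:
    std::string emitOpenMP(const Variant& v) { return v.code; }              // stub
};

class MainController {
public:
    std::string compile(const std::string& source) {
        std::vector<Slice> slices = extractor.extract(source);
        std::vector<Variant> candidates = transformer.variants(slices);
        const Variant* best = nullptr;
        for (Variant& v : candidates) {
            v.locality = estimator.estimate(v);          // assess each candidate
            if (!best || v.locality > best->locality)
                best = &v;                               // keep the best-locality variant
        }
        return best ? generator.emitOpenMP(*best) : source;
    }
private:
    SlicesExtractionModule       extractor;
    LoopTransformationModule     transformer;
    DataLocalityEstimationModule estimator;
    CodeGenerationModule         generator;
};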
5. Conclusion
The estimation of data locality for parallel threads of a loop extracted by means of
the method introduced in [1] is a necessary activity to assure optimal performance for a
program adjusted to and executed in a target architecture. The experiments presented
in this paper, targeted at an embedded system environment, indicate the identical
gains achieved through the increased utilization level of the DCU for sources where
well-known optimization techniques to improve data locality were applied.
An important observation obtained from the research is that other software components
of the system, working simultaneously during the program execution, have an unforeseen
impact on the cache utilization. For this reason, it is correct to assume that the
available cache size differs for various configurations of the system. To optimize
cache utilization, it is worth estimating the actual size of the available cache memory.
Let us notice that the limitations of using the results of our work stem from the limitations
of the method introduced in [3]. It applies group reuse only to separate array
references whose index expressions produce the same coefficient matrix A. Our effort is
targeted toward the expansion of the presented method to handle the cases where:
A1 i1 + c1 = A2 i2 + c2 .
References
[1] Bielecki W., Siedlecki K. Extracting synchronization-free slices in perfectly nested
uniform and non-uniform loops. Electronic Modeling, 2007.
[2] Bielecki W., Kraska K., Siedlecki K. Increasing Program Locality by Extracting
Synchronization-Free Slices in Arbitrarily Nested Loops. Proceedings of the Fourteenth International Multi-Conference on Advanced Computer Systems ACS2007,
2007.
[3] Wolfe M. High Performance Compilers for Parallel Computing. Addison-Wesley,
1996.
[4] Richardson S. MPOC. A Chip Multiprocessor for Embedded Systems. [online]
http://www.hpl.hp.com/techreports/2002/HPL-2002-186.pdf, HP Laboratories,
2002.
[5] Netlib Repository at UTK and ORNL [online].
http://www.netlib.org/benchmark/livermorec.
[6] Aho A. V., Lam M. S., Sethi R., Ullman J. D. Compilers: Principles, Techniques
and Tools, 2nd Edition. Addison-Wesley, 2006.
[7] IBM PowerPC Multi-Core Instruction Set Simulator. User’s Guide, IBM Corporation, 2008.
[8] IBM RISCWatch Debugger. User’s Manual, IBM Corporation, 2008.
[9] Stasiak A. Klasyfikacja Systemów Wspomagających Proces Przetwarzania
i Sterowania. II Konferencja Naukowa KNWS'05, 2005.
[10] Griebl M. Automatic Parallelization of Loop Programs for Distributed Memory Architectures. Habilitation thesis, Universität Passau, 2004.
[11] Kelly W., Maslov V., Pugh W., Rosser E., Shpeisman T., Wonnacott D. The omega
library interface guide. Technical Report CS-TR-3445, University of Maryland,
1995.
[12] Chandra R., Dagum L., Kohr D., Maydan D., McDonald J., Menon R. Parallel
Programming in OpenMP. Morgan Kaufmann, 2001.
Passive operating system fingerprinting using neural
networks and induction of decision rules
Bartosz Bielski, Przemysław Klęsk
Szczecin University of Technology,
Faculty of Computer Science and Information Technology
Abstract:
One of the most difficult tasks for people managing a big or even medium-size computer
network is determining the accurate number of hosts that are protected. This information
is really helpful for accurately configuring network-based devices such as intrusion
detection systems. Exact knowledge of the operating systems (residing in hosts) can be
useful for excluding many alerts that cannot apply to the remote operating system being
examined. In this context, we consider a classification problem (we try to recognize
the class of the operating system) when some of the characteristics of the system are modified
by its user or any other program (e.g. for Internet connection tuning). We use neural
networks (MLP, RBF) and rule induction techniques. It should be stressed that existing
fingerprinting tools achieve high accuracy when tested on the "clean" versions of
operating systems, but they fail to detect systems with modified TCP/IP parameters.
Keywords:
passive OS fingerprinting, TCP/IP fingerprinting, operating systems recognition, neural
networks, induction of decision rules
1. Introduction
Accurate operating system fingerprinting by passive analysis of the network traffic
can be used for maintaining a security access policy. This type of policy may list the
types of hosts that are allowed and disallowed (for example, an administrator can disallow
really old versions of operating systems which may have serious bugs).
Remote operating system fingerprinting is the way of determining the operating
system of a remote host. It is based on the fact that the TCP/IP protocol specification
does not clearly describe how to use all fields in the TCP and IP headers. Due to this fact,
developers of operating systems implement the TCP/IP stack differently, which can be
used for identification. Even versions and patches can be identified, because programmers
use, e.g., different security features in the systems.
Remote operating systems can be identified by three approaches: "just looking",
active and passive fingerprinting. The "just looking" method is really not accurate and may
give inadequate results, because of how easily the presented information can be modified;
active scanning provides detailed information by actively sending packets, and passive
analysis provides real-time (but usually less detailed) information. Nevertheless, scanning
consumes host resources and network bandwidth, and it requires more time on
broad networks. Moreover, it can cause some network devices to stop servicing. Passive
fingerprinting has none of the flaws mentioned above.
Figure 1. OSI and TCP/IP stack
(source: http://tutorials.beginners.co.uk/introducing-tcp-ip.htm?p=2)
Passive fingerprinting is a method of recognizing operating systems based only on
the packet traffic which is already transmitted. There is no need to send extra packets to
the remote host, because all the packets may be used to identify an attacker or any person
that is doing a security audit.
The main goal of this research is to determine how accurately remote operating systems
can be detected using passive fingerprinting by means of neural networks and
induction of decision rules. Another goal is to evaluate the fingerprinting on some
user-modified TCP/IP stacks, on which current recognition tools fail to work, and to determine
how well neural networks can identify operating systems that were not in the training
set.
2. Passive OS fingerprinting – existing solutions
Based on our observations, which were confirmed by some earlier research [3],
we can say that currently existing fingerprinting tools are mostly rule based (very simple
rules) or nearest neighbour implementations (usually 1-NN). Using such an approach,
there is no way to accurately fingerprint operating systems having any modifications
that were not included in the fingerprinting database of the systems. On the other hand,
there is no way to include all such information in the database because of the variety of
possible modifications.
3. Passive OS fingerprinting using neural networks
We try the application of neural networks – MLP, RBF – to our problem, knowing
their successful application in such pattern-recognition areas as: handwriting recognition, identifying vehicles, medicine, etc.
The database of the operating systems was taken from the open-source tool named
ettercap [13], which at present includes the largest set of the OS examples – 1765.
The structure of values included in the ettercap set of operating systems is presented
in Figure 2.
WWWW : 4 digit hex field indicating the TCP Window Size
MSS  : 4 digit hex field indicating the TCP Option Maximum Segment Size
       if omitted in the packet or unknown it is "_MSS"
TTL  : 2 digit hex field indicating the IP Time To Live
WS   : 2 digit hex field indicating the TCP Option Window Scale
       if omitted in the packet or unknown it is "WS"
S    : 1 digit field indicating if the TCP Option SACK permitted is true
N    : 1 digit field indicating if the TCP Options contain a NOP
D    : 1 digit field indicating if the IP Don't Fragment flag is set
T    : 1 digit field indicating if the TCP Timestamp is present
F    : 1 digit ascii field indicating the flag of the packet
       S = SYN
       A = SYN + ACK
LEN  : 2 digit hex field indicating the length of the packet
       if irrelevant or unknown it is "LT"
OS   : an ascii string representing the OS
Figure 2. TCP/IP parameters used to identify operating system
(source: ettercap database)
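To make the field layout above concrete, the sketch below splits a single fingerprint entry into the listed fields, assuming one colon-separated line per entry in the field order of Figure 2; the sample entry itself is invented for illustration.

#include <iostream>
#include <sstream>
#include <string>
#include <vector>

// Splits one colon-separated fingerprint entry into the fields of Figure 2
// (WWWW, MSS, TTL, WS, S, N, D, T, F, LEN) followed by the OS label.
std::vector<std::string> splitFingerprint(const std::string& line) {
    std::vector<std::string> fields;
    std::stringstream ss(line);
    std::string item;
    while (std::getline(ss, item, ':'))
        fields.push_back(item);
    return fields;
}

int main() {
    const char* names[] = {"WWWW","MSS","TTL","WS","S","N","D","T","F","LEN","OS"};
    std::string entry = "7FFF:05B4:40:00:1:1:0:1:A:3C:Example OS";   // invented sample
    std::vector<std::string> f = splitFingerprint(entry);
    for (std::size_t i = 0; i < f.size() && i < 11; ++i)
        std::cout << names[i] << " = " << f[i] << "\n";
    return 0;
}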
First of all, different detailed versions of operating systems were grouped into larger
classes – in order to have a sensible proportion: number of examples / number of
classes, see Table 1. Experiments in which we tried to identify the exact OS version
were conducted later.
As seen in Figure 3, in the state of full knowledge about operating systems and full trust,
the examined neural network can identify systems with about 100% probability, just like
current rule-based tools. However, the situation in which we can trust computers and
systems we do not own is nowadays very rare.
Table 1. Operating systems classes

Pos. | OS class              | Number of examples
1.   | windows               | 599
2.   | linux                 | 349
3.   | bsd                   | 169
4.   | other                 | 164
5.   | solaris               | 132
6.   | unix                  | 94
7.   | mac_os                | 89
8.   | network modem/router  | 83
9.   | cisco device          | 44
10.  | network printer       | 39
Figure 3. Results of recognition by many kinds of neural networks
When the knowledge and trust drop, and especially when we cannot trust that remote
systems are clean and unmodified, the neural network still gets really good prediction results.
The good scores are especially valuable when there is no strict signature in the database
of operating systems, so that the system cannot be recognised at all.
As seen in Figure 4, we compared the best neural network from earlier research (a 3-layer
multilayer perceptron network – MLP with 10 neurons in every hidden layer)
with a multi-stage neural network architecture that can detect the operating system version
too. The multi-stage architecture was built only for systems that meet the minimum 1%
share of occurrences (Windows, Linux, BSD, etc.).
Figure 4. Results by recognition depth
Figure 5. Best neural network topology (3 layers, 10 neurons in every hidden layer)
One should note that for modified TCP/IP stacks, and when the original signatures
were not placed in the training set, the network gives a 97.7% correct classification rate.
When trying to additionally determine the exact version of the OS, the rate dropped,
but it was still very high – 85%.
4. Induction of decision rules, Pareto-optimal rules
To have a better understanding of the data set, we also decided to perform the induction of
decision rules. Discovering interesting and statistically relevant rules may allow performing
simple OS fingerprinting without a neural network classifier, since 'if-then' rules
are of high interpretive value for humans.
4.1. Greedy induction of decision rules
Say we want to find all possible rules with the length of the premise equal to p, i.e. in
the premise there are exactly p terms connected with 'and'. For example, when p = 3,
a possible rule is:

   if (x3 = 2) and (x5 = 3) and (x6 = 1) then (y = 2).
The algorithm to find all such rules can be named greedy, meaning that we will
iterate over all (n over p) combinations of input variables xj, and for each single combination
we will also iterate over all possible settings of values for these variables.
The exact total number of iterations in such an algorithm (and at the same time the
total number of rules) can be calculated as follows. Let Cn,p be the set of all (n over p)
combinations of indices of variables, e.g. C4,2 = {(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)}.
The exact number of iterations is then equal to

   Σ_{(j1, j2, …, jp) ∈ Cn,p} mj1 mj2 ⋯ mjp ,                          (1)

where mj denotes the number of possible values of the variable xj. And if we wanted to
find all rules with premises of length equal to p or less than p, then the number of
iterations (and rules) is

   Σ_{k=1}^{p} Σ_{(j1, j2, …, jk) ∈ Cn,k} mj1 mj2 ⋯ mjk .              (2)
For a single fixed premise, what should be the value chosen as the decision? Let X
be a general notation for the premise and Y for the decision, e.g. in the rule

   if (x3 = 2) and (x5 = 3) and (x6 = 1) then (y = 2)

the premise is denoted by X = x and the decision by Y = y. Then, for a fixed premise
X = x we deal with a certain conditional probability distribution of decisions
P(Y | X = x). In other words, from the whole data set we take the subset containing only
the cases where X = x and within this subset we look at the distribution of different
decisions y ∈ {1, 2, …, my}. As the decision value we choose such y for which
P(Y = y | X = x) is maximal.
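A minimal sketch of this choice, assuming the data set is stored as rows of discrete attribute values with one decision per row; all names are illustrative. The function returns the majority decision for a fixed premise together with its confidence P(Y = y | X = x).

#include <cstddef>
#include <map>
#include <utility>
#include <vector>

std::pair<int, double> chooseDecision(
        const std::vector<std::vector<int>>& rows,          // attribute values x
        const std::vector<int>& decisions,                  // decision y per row
        const std::vector<std::pair<int, int>>& premise) {  // (variable index, value) terms
    std::map<int, int> counts;                              // decision -> count within X = x
    int support = 0;
    for (std::size_t r = 0; r < rows.size(); ++r) {
        bool match = true;
        for (const auto& term : premise)
            if (rows[r][term.first] != term.second) { match = false; break; }
        if (!match) continue;
        ++support;
        ++counts[decisions[r]];
    }
    int bestY = -1, bestCount = 0;
    for (const auto& c : counts)
        if (c.second > bestCount) { bestY = c.first; bestCount = c.second; }
    double confidence = support ? double(bestCount) / support : 0.0;
    return {bestY, confidence};                             // argmax_y P(Y = y | X = x)
}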
4.2. Rule assessment measures
Each rule should be assigned a certain number assessing how interesting this rule is
and how statistically relevant it is for the decision. The most basic and common measure
is the confidence¹ of the rule, which is exactly equal to P(Y = y | X = x). Nevertheless,
we focus on two other rule assessment measures, which are more sensitive²:
– Conditional entropy given the fixed X = x:

   H(Y | X = x) = − Σ_{y∈Y} P(Y = y | X = x) log2 P(Y = y | X = x).        (3)

– Kullback-Leibler's number (divergence):

   KL(P(Y | X = x) || P(Y)) = Σ_{y∈Y} P(Y = y | X = x) log2 [ P(Y = y | X = x) / P(Y = y) ].   (4)
See also [12].
In the case of conditional entropy, the smaller the entropy the better the rule. In the
case of Kullback-Leibler's number, the greater the number the better the rule.
The conditional entropy attains the largest (so the worst) values for the rules having
distributions of decisions P (Y | X = x) close to uniform. This is natural, since such
rules poorly predict the decision variable. On the other hand, the more the distribution
P (Y | X = x) is concentrated around a single decision value the closer the entropy is to
zero – the better the rule.
As regards the Kullback-Leibler's number, it specifically rewards those rules for
which the distribution P(Y | X = x) differs largely from the distribution P(Y) – the
distribution of decisions within the whole data set. This can also be well explained having
in mind a naive zero-rule classifier. The naive classifier responds all the time with
the same decision y which is the most frequent one in the whole data set (having the
greatest P(Y = y) probability). One could say informally that by applying the
Kullback-Leibler's number we look for 'surprising' rules, i.e. ones that allow us to predict
well the rare decisions (which is more difficult), not the frequent decisions (which is easy).
That is why the Kullback-Leibler's number is also called divergence or the pseudo distance
of distributions [12]. It must not be treated as an actual distance function (a
metric), for example because it is not a symmetric function with respect to its arguments.
Based on the above measures, one can assess not only the rules but also the
premises alone, i.e. one can detect statistically relevant premises and statistically relevant
variables (premises of length one). For this purpose, weighted versions of the former
formulas are applied. One should take into consideration all the settings of values
for a fixed set of variables in the premise:
¹ The confidence measure in conjunction with the support (the number of cases in the dataset
agreeing with the premise of the rule) is the basis of the known Apriori algorithm, which is applied to find association rules in large datasets [11].
² Especially in the case of unequal decision classes and also in the case of more than two decisions.
   H(Y | X) = Σ_{x∈X} P(X = x) H(Y | X = x).                              (5)

   KL(P(Y | X) || P(Y)) = Σ_{x∈X} P(X = x) KL(P(Y | X = x) || P(Y)).      (6)
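Both measures can be computed directly from empirical distributions. The sketch below assumes that P(Y | X = x) and P(Y) are given as vectors of class probabilities over the same set of decision classes, with P(Y = y) > 0 for every class that occurs.

#include <cmath>
#include <cstddef>
#include <vector>

// Conditional entropy H(Y | X = x) of a rule, formula (3).
double conditionalEntropy(const std::vector<double>& pGivenX) {
    double h = 0.0;
    for (double p : pGivenX)
        if (p > 0.0) h -= p * std::log2(p);
    return h;
}

// Kullback-Leibler's number KL(P(Y | X = x) || P(Y)), formula (4).
double klNumber(const std::vector<double>& pGivenX, const std::vector<double>& pY) {
    double kl = 0.0;
    for (std::size_t y = 0; y < pGivenX.size(); ++y)
        if (pGivenX[y] > 0.0) kl += pGivenX[y] * std::log2(pGivenX[y] / pY[y]);
    return kl;
}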
4.3. Pareto-optimal rules
All found rules can be graphed as points in a two-dimensional plot, where on one
axis we put the support of the rule (the number of cases in the data set agreeing with the
premise of the rule) and on the other axis we put the assessment measure (the KL
number or the conditional entropy), see Fig. 6. As one can see, these two magnitudes
are in conflict, since we are interested in rules which are supported by as many examples
as possible, but on the other hand we are interested in rules with a high assessment
measure (which typically happens for rare rules, with small support). Therefore the
rules lying on the border of the graph can be treated as the most interesting. They are
called Pareto-optimal rules, in the sense that for each rule from this border (the Pareto
border) there exists no other rule dominating the former, i.e. with both the support and
the assessment measure greater or equal, and at least one of these two magnitudes
strictly greater.
Figure 6. Pareto-optimal rules marked as asterisks within a graph of all rules
The concept of a Pareto-optimal set of rules can also be found e.g. in [13].
Interesting rules for passive OS fingerprinting
Probability (frequency) distribution of decision variable in the whole data set was:
P(Y = BSD) = 0.19807, P(Y = Windows) = 0.34222, P(Y = Cisco router) = 0.050511,
P(Y = Linux) = 0.095914, P(Y = MacOS) = 0.067537, P(Y = network modem/router) =
0.022134, P(Y = network printer) = 0.014188, P(Y = other) = 0.081158, P(Y = Solaris)
= 0.074915, P(Y = Unix) = 0.053348.
All rules for which the decision distribution differs relevantly from the distribution
above can be considered as interesting ones. Below we show interesting rules with premises
of length 1, 2, 3. In parentheses we show the support and the probability of the decision (confidence).
Rules with premises of length = 1:
1: if (LEN = 48) then (OS = Windows). (395/1762) (0.68101) KL=0.37212.
2: if (LEN = 52) then (OS = Windows). (76/1762) (0.75) KL=0.68209.
3: if (LEN = 64) then (OS = Windows). (130/1762) (0.63846) KL=0.47577.
4: if (WWWW = 60000-65536) then (OS = Windows). (316/1762) (0.59494) KL=0.45079.
5: if (S = 1) then (OS = Windows). (753/1762) (0.58964) KL=0.36388.
6: if (TTL = 128) then (OS = Windows). (786/1762) (0.53308) KL=0.19704.
7: if (LEN = 60) then (OS = BSD). (282/1762) (0.46454) KL=0.49247.
8: if (WWWW = 20001-30000) then (OS = Solaris). (65/1762) (0.35385) KL=1.1769.
9: if (TTL = 64) then (OS = BSD). (759/1762) (0.33465) KL=0.20533.
10: if (TTL = 255) then (OS = Solaris). (165/1762) (0.27273) KL=0.73742.
Rules with premises of length = 2:
1: if (S = 1) and (LEN = 60) then (OS = BSD). (119/1762) (0.88235) KL=1.6055.
2: if (WWWW = 60000-65536) and (S = 1) then (OS = Windows). (181/1762) (0.87845)
KL=0.97709.
3: if (TTL = 128) and (LEN = 48) then (OS = Windows). (239/1762) (0.86611) KL=0.90106.
4: if (TTL = 128) and (S = 1) then (OS = Windows). (366/1762) (0.85792) KL=0.86164.
5: if (WWWW = 0-10000) and (LEN = 60) then (OS = BSD). (125/1762) (0.728) KL=1.0152.
6: if (TTL = 255) and (D = 0) then (OS = MacOS). (80/1762) (0.4625) KL=1.0222.
7: if (TTL = 255) and (D = 1) then (OS = Solaris). (85/1762) (0.43529) KL=1.1085.
8: if (MSS = 40001-50000) and (TTL = 64) then (OS = Cisco router). (74/1762) (0.43243)
KL=1.2226.
9: if (TTL = 255) and (N = 0) then (OS = MacOS). (94/1762) (0.40426) KL=1.095.
10: if (WWWW = 60000-65536) and (LEN = -1) then (OS = Unix). (64/1762) (0.34375)
KL=1.2274.
11: if (TTL = 255) and (S = 0) then (OS = MacOS). (125/1762) (0.304) KL=1.0686.
12: if (TTL = 64) and (S = 0) then (OS = Linux). (432/1762) (0.24769) KL=0.46046.
13: if (S = 0) and (D = 0) then (OS = other). (464/1762) (0.21552) KL=0.56091.
Rules with premises of length = 3:
1: if (TTL = 128) and (S = 1) and (F = 0) then (OS = Windows). (185/1762) (0.93514)
KL=1.1885.
2: if (WWWW = 0-10000) and (S = 1) and (LEN = 60) then (OS = BSD). (79/1762) (0.92405)
KL=1.8509.
3: if (S = 1) and (T = 1) and (LEN = 60) then (OS = BSD). (104/1762) (0.91346) KL=1.7969.
4: if (S = 1) and (F = 1) and (LEN = 60) then (OS = BSD). (83/1762) (0.89157) KL=1.6533.
5: if (S = 1) and (N = 1) and (LEN = 60) then (OS = BSD). (119/1762) (0.88235) KL=1.6055.
6: if (MSS = 0-10000) and (TTL = 128) and (S = 1) then (OS = Windows). (337/1762)
(0.85757) KL=0.86806.
7: if (WS = -1) and (D = 0) and (LEN = -1) then (OS = other). (116/1762) (0.34483)
KL=1.1299.
8: if (N = 0) and (D = 0) and (F = 1) then (OS = MacOS). (250/1762) (0.28) KL=0.87672.
5. Summary
In this paper we mine interesting rules for the problem of passive OS fingerprinting.
We also present results of classification with a neural network, which in comparison to
the currently existing tools seems to be much more efficient. Very low error rates were
obtained by both the OS name and OS version classifier. In comparison to existing
solutions this is the best known classification rate in passive OS fingerprinting tested on
unseen or modified data.
References
[1] Berrueta D. B. A practical approach for defeating Nmap OS-Fingerprinting.
[online] http://www.zog.net/Docs/nmap.html. [06/12/2007]
[2] Hortop P. Active OS Fingerprinting Tools. [online]
http://www.networkintrusion.co.uk/ osfa.htm, 2006a. [20/02/2008]
[3] Lippmann R., Fried D., Piwowarski K., Streilein W. Passive Operating System
Identification From TCP/IP Packet Headers, 2003. [20/02/2008]
[4] Internetworking Basics, [online]
http://www.cisco.com/univercd/cc/td/doc/cisintwk/ito_ doc/introint.htm
[07/10/2007]
[5] Introducing TCP/IP, http://tutorials.beginners.co.uk/introducing-tcp-ip.htm?p=2.
[08/11/2007]
[6] Dawson K. T. Linux – podręcznik administratora sieci. O'Reilly, Wydawnictwo
RM, Warszawa 2000.
[7] Kosiński R. Sztuczne sieci neuronowe. Dynamika nieliniowa i chaos. Wydawnictwa Naukowo-Techniczne, Warszawa 2007
[8] Kwiatkowska A. Systemy wspomagania decyzji. Wydawnictwa Naukowe
PWN/MIKOM, Warszawa 2007
[9] Osowski S. Sieci neuronowe w ujęciu algorytmicznym. Wydawnictwa Naukowo-Techniczne, Warszawa 1996.
Passive operating system fingerprinting using neural networks…
25
[10] Agrawal R., Imielinski T., Swami A. Mining associations between sets of items in
massive databases, in Proceedings of the 1993 ACM SIGMOD Int'l Conf. on
Management of Data, 1993, pp. 207–216.
[11] Gray R. M. Entropy and Information Theory, Springer Verlag, New York, USA.
Information Systems Laboratory, Electrical Engineering Department, Stanford
University, 1990.
[12] Słowiński R., Brzezińska I., Greco S. Application of Bayesian confirmation measures for mining rules from support-confidence Pareto-optimal set, in Rutkowski,
Tadeusiewicz, Żurada, 8th International Conference, Zakopane, Poland, June 2006,
pp. 1018–1026.
[13] Ettercap – remote fingerprinting tool, http://ettercap.sourceforge.net [20/05/2008]
Analysis of 2D problem in HMM
Janusz Bobulski
Czestochowa University of Technology,
Institute of Computer and Information Science
Abstract:
Hidden Markov models are widely applied in data classification. They are used in many
areas where 1D data are processed. In the case of 2D data, some problems with
applying 2D HMM appear. This paper describes the important limitations of HMM when we have
to process two-dimensional data.
Keywords:
image processing, hidden Markov models HMM, pseudo 2DHMM, 2DHMM
1. Introduction
Hidden Markov models are widely applied in data classification. They are used in
speech recognition, character recognition, 2-D shape classification, biological sequence
analysis, financial data processing, texture analysis, face recognition, etc. This wide
application of HMM is a result of its effectiveness. When we work with one-dimensional data,
we have good tools and solutions for this. But when we process two-dimensional data,
we should apply a two-dimensional HMM. There are problems, because there is no good
and efficient solution of the three basic problems of 2D HMM [1, 2]:
1. Given the observation O={O1,…,OT} and the model λ=(A, B, π), efficiently compute
P(O|λ):
– hidden states complicate the evaluation;
– given two models λ1 and λ2, this can be used to choose the better one.
2. Given the observation O={O1,…,OT} and the model λ=(A, B, π), find the optimal state
sequence q = (q1, q2,…, qT):
– an optimality criterion has to be decided (e.g. maximum likelihood);
– an "explanation" for the data.
3. Given O={O1,…,OT}, estimate the model parameters λ=(A, B, π) that maximize P(O|λ).
2. Classic HMM
HMM is used in the identification process. An HMM is a doubly stochastic process
with an underlying stochastic process that is not observable (hidden), but can be observed
through another set of stochastic processes that produce a sequence of observations. Let
O={O1,…,OT} be the sequence of observations of feature vectors, where T is the total
number of feature vectors in the sequence. The statistical parameters of the model may
be defined as follows [3].
– The number of states of the model, N (Fig. 1).
– The transition probabilities of the underlying Markov chain, A = {aij}, 1 ≤ i, j ≤ N, where
aij is the probability of transition from state i to state j, subject to the constraint
Σ_{j=1}^{N} aij = 1.
– The observation probabilities, B = {bj(Ot)}, 1 ≤ j ≤ N, 1 ≤ t ≤ T, which represent the
probability of the t-th observation conditioned on the j-th state.
– The initial probability vector, Π = {πi}, 1 ≤ i ≤ N.
Figure 1. One-dimensional HMM [7]
Hence, the HMM requires three probability measures to be defined, A, B, π, and the
notation λ = (A, B, π) is often used to indicate the set of parameters of the model.
The parameters of the model are generated at random at the beginning. Then they
are estimated with the Baum-Welch algorithm, which is based on the Forward-Backward
algorithm. A second way to estimate the parameters is the Viterbi algorithm, which is very
similar to the Forward-Backward algorithm. The forward algorithm calculates the coefficient
αt(i) (the probability of observing the partial sequence (o1,…,ot) such that state qt is i).
The backward algorithm calculates the coefficient βt(i) (the probability of observing the
partial sequence (ot+1,…,oT) such that state qt is i). The Baum-Welch algorithm, which
computes λ, can be described as follows [1]:
1. Let the initial model be λ0.
2. Compute the new λ based on λ0 and the observation O.
3. If log P(O|λ) – log P(O|λ0) < DELTA, stop.
4. Else set λ0 ← λ and go to step 2.
The parameters of the new model λ, based on λ0 and the observation O, are estimated
from the equations of the Baum-Welch algorithm (Fig. 2) [1], and then are recorded to the database.
Baum-Welch algorithm (forward-backward)
The forward probability αj(t) for 2 ≤ j ≤ N − 1 and 1 ≤ t ≤ Tr is calculated with the
following recurrent formula:

   αj(t) = [ Σ_{i=2}^{N−1} αi(t − 1) aij ] bj(ot^r),

with the initial condition:

   α1(1) = 1,   αj(1) = a1j bj(o1^r)  for 2 ≤ j ≤ N − 1,

and the finishing condition:

   αN(Tr) = Σ_{i=2}^{N−1} αi(Tr) aiN .

The backward probability βi(t) for 2 ≤ i ≤ N − 1 and 1 ≤ t ≤ Tr is calculated with the
following recurrent formula:

   βi(t) = Σ_{j=2}^{N−1} aij bj(o_{t+1}^r) βj(t + 1),

with the initial condition:

   βi(Tr) = aiN ,   1 ≤ i ≤ N,

and the finishing condition:

   β1(1) = Σ_{j=2}^{N−1} a1j bj(o1^r) βj(1) .

Figure 2. Baum-Welch algorithm [1].
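For comparison, a self-contained version of the forward recursion for discrete observations is sketched below. It uses the common formulation with an explicit initial distribution π instead of the dedicated entry and exit states of Figure 2, and it omits the numerical scaling that is needed in practice for long sequences.

#include <cstddef>
#include <vector>

// Forward algorithm: alpha[t][i] = P(o_1..o_t, q_t = i | lambda).
// A is N x N, Bemit is N x M (discrete emission probabilities), pi has N
// entries, O holds observation symbol indices 0..M-1 (O must be non-empty).
double forwardProbability(const std::vector<std::vector<double>>& A,
                          const std::vector<std::vector<double>>& Bemit,
                          const std::vector<double>& pi,
                          const std::vector<int>& O) {
    const std::size_t N = pi.size(), T = O.size();
    std::vector<std::vector<double>> alpha(T, std::vector<double>(N, 0.0));
    for (std::size_t i = 0; i < N; ++i)                   // initialisation
        alpha[0][i] = pi[i] * Bemit[i][O[0]];
    for (std::size_t t = 1; t < T; ++t)                   // recursion
        for (std::size_t j = 0; j < N; ++j) {
            double s = 0.0;
            for (std::size_t i = 0; i < N; ++i)
                s += alpha[t - 1][i] * A[i][j];
            alpha[t][j] = s * Bemit[j][O[t]];
        }
    double p = 0.0;                                       // termination: P(O | lambda)
    for (std::size_t i = 0; i < N; ++i)
        p += alpha[T - 1][i];
    return p;
}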
Problem:
For a given observation vector O = (o1, o2, ..., oT), estimate the model parameters
λ = (π, A, B) in order to maximize P(O | λ).
Problem solution:
– Estimate the parameters of the model λ = (π, A, B) for maximum P(O | λ),
– Define ξt(i, j) as the probability of being in state i at time t and in state j at time t + 1:
   ξt(i, j) = αt(i) aij bj(ot+1) βt+1(j) / P(O | λ)
            = αt(i) aij bj(ot+1) βt+1(j) / [ Σ_{i=1}^{N} Σ_{j=1}^{N} αt(i) aij bj(ot+1) βt+1(j) ]
Relations in the algorithm:
– define γt(i) as the probability of being in state i at time t, given the observation sequence:

   γt(i) = Σ_{j=1}^{N} ξt(i, j)

– Σ_{t=1}^{T} γt(i) is the expected number of times state i is visited,
– Σ_{t=1}^{T−1} ξt(i, j) is the expected number of transitions from state i to state j,
– πi = expected frequency in state i at time (t = 1) = γ1(i),
– aij = (expected number of transitions from state i to state j) / (expected number of
transitions from state i):

   aij = Σt ξt(i, j) / Σt γt(i)

– bj(k) = (expected number of times in state j and observing symbol k) / (expected
number of times in state j):

   bj(k) = Σ_{t: ot = k} γt(j) / Σt γt(j)
Viterbi algorithm [4, 5] (Fig. 3)
– Define δt(i) – the highest probability of a path ending in state i at time t.
Algorithm:
– initialisation:

   δ1(i) = πi bi(o1),   1 ≤ i ≤ N
   ψ1(i) = 0

– recursion:

   δt(j) = max_{1≤i≤N} [ δt−1(i) aij ] bj(ot),   2 ≤ t ≤ T, 1 ≤ j ≤ N
   ψt(j) = arg max_{1≤i≤N} [ δt−1(i) aij ]

– termination:

   P* = max_{1≤i≤N} [ δT(i) ]
   qT* = arg max_{1≤i≤N} [ δT(i) ]

– path (state sequence) backtracking:

   qt* = ψt+1(q*_{t+1}),   t = T − 1, T − 2, ..., 1

Figure 3. Viterbi algorithm [1]
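A self-contained sketch of the Viterbi recursion above for discrete observations, again in the common formulation with an initial distribution π and 0-based indices (in practice log-probabilities or scaling should be used to avoid underflow).

#include <cstddef>
#include <vector>

// Returns the most probable state sequence q* for a non-empty observation
// sequence O, following the initialisation / recursion / termination /
// backtracking steps listed above.
std::vector<int> viterbiPath(const std::vector<std::vector<double>>& A,
                             const std::vector<std::vector<double>>& Bemit,
                             const std::vector<double>& pi,
                             const std::vector<int>& O) {
    const std::size_t N = pi.size(), T = O.size();
    std::vector<std::vector<double>> delta(T, std::vector<double>(N, 0.0));
    std::vector<std::vector<int>>    psi(T, std::vector<int>(N, 0));
    for (std::size_t i = 0; i < N; ++i)                     // initialisation
        delta[0][i] = pi[i] * Bemit[i][O[0]];
    for (std::size_t t = 1; t < T; ++t)                     // recursion
        for (std::size_t j = 0; j < N; ++j) {
            double best = -1.0; int arg = 0;
            for (std::size_t i = 0; i < N; ++i) {
                double v = delta[t - 1][i] * A[i][j];
                if (v > best) { best = v; arg = int(i); }
            }
            delta[t][j] = best * Bemit[j][O[t]];
            psi[t][j] = arg;
        }
    std::vector<int> q(T, 0);                               // termination
    for (std::size_t i = 1; i < N; ++i)
        if (delta[T - 1][i] > delta[T - 1][q[T - 1]]) q[T - 1] = int(i);
    for (std::size_t t = T - 1; t > 0; --t)                 // backtracking
        q[t - 1] = psi[t][q[t]];
    return q;
}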
The testing process consists of computing the probability of the observation being generated
by the models saved in the database and choosing the model for which the likelihood is
maximum. In the proposed method, probabilities are calculated separately for each of
the three models representing parts of the face, and then they are added. The face for which
the sum of probabilities is maximum is chosen as the correct face. The probability of
generating a sequence of observations is computed from equations (2)-(4) [1]:

   P(O | λ) = Σ_q P(O | q, λ) P(q | λ)                                     (2)

   P(O | q, λ) = Π_{t=1}^{T} P(ot | qt, λ) = bq1(o1) bq2(o2) ... bqT(oT)   (3)

   P(q | λ) = πq1 aq1q2 aq2q3 ... aqT−1qT                                  (4)
3. Pseudo 2DHMM
A pseudo 2D hidden Markov model is an extension of the 1D HMM. A P2DHMM
consists of a number of superstates. The topology of the superstate model is a linear model,
where only self-transitions and transitions to the following superstate are possible. Inside
the superstates there are linear 1D HMMs. The state sequences in the rows are independent of the state sequences of neighboring rows [6, 7].
Figure 4 shows a pseudo two-dimensional hidden Markov model which consists of
four superstates. Each superstate contains a three-state one-dimensional HMM.
4. 2DHMM
An extension of the HMM to work on two-dimensional data is the 2D HMM (Fig. 5).
The principle of two-dimensional hidden Markov models was described in the paper
[2]. A 2D HMM can be regarded as a combination of one state matrix and one observation
matrix, where transitions between states take place according to a 2D Markovian
probability and each observation is generated independently by the corresponding state
at the same matrix position. It was noted that the complexity of estimating the parameters
of 2D HMMs or using them to perform maximum a posteriori classification is
exponential in the size of the data. Similarly to 1D HMM, the most important thing for 2D
HMMs is also to solve the three basic problems, namely probability evaluation, optimal
state matrix and parameter estimation. Li Yujian in [2] proposed an analytic solution
of these problems. But this solution has some disadvantages. First, the computation of
the parameters and the probability is very complex [8]. Second, this solution can be applied
only to the left-right type of HMM. And third, we can use only a small size of HMM. 2D
HMM is still limited by the computational power of the machine.
Figure 4. Pseudo 2D-HMM [6]
Figure 5. 2D Markovian transitions among states [2]
5. Conclusion
We can apply three approaches to 2D data analysis:
– reduce the dimensionality of the data to a 1D vector and use a 1D HMM,
– divide the data into segments and use a pseudo 2D HMM,
– use complex analytic calculations in a 2D HMM.
The presented 2D HMM solution is not yet a real full 2D HMM. Therefore, future work on
two-dimensional hidden Markov models is needed. A future solution has to
resolve the three basic problems of HMM for ergodic models and a larger set of states and data.
References
[1] Kanungo T. Hidden Markov Model Tutorial. www.cfar.umd.edu/~kanungo, 1999.
[2] Yujian Li. An analytic solution for estimating two-dimensional hidden Markov
models. Applied Mathematics and Computation 185 (2007), pp. 810-822.
[3] Rabiner L. R. A tutorial on hidden Markov models and selected applications in
speech recognition. Proc. IEEE 77, 1989, pp. 257-285.
[4] Hu J., Brown M. K., Turin W. HMM Based On-Line Handwriting Recognition.
IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 18, No. 10
(1996), pp. 1039-1045.
[5] Forney G. D. The Viterbi Algorithm. Proc. IEEE, Vol. 61, No. 3 (1973), pp. 268-278.
[6] Eickeler S., Müller S., Rigoll G. High Performance Face Recognition Using
Pseudo 2-D Hidden Markov Models. European Control Conference, 1999.
[7] Eickeler S., Müller S., Rigoll G. Recognition of JPEG compressed face images based
on statistical methods. Image and Vision Computing 18 (2000), pp. 279-289.
[8] Li J., Najmi A., Gray R. M. Image classification by a two dimensional Hidden
Markov model. IEEE Trans. Signal Process. 48 (2000), pp. 517-533.
Ontology notation and description of vulnerabilities
in heterogeneous networks and critical infrastructures
Michał Choraś 1,2, Adam Flizikowski 1,2, Anna Stachowicz 1,2, Marta Redo 1,2,
Rafał Renk 1,3, Witold Hołubowicz 1,3
1 ITTI Ltd. Poznań, 2 Institute of Telecommunications, UT&LS, Bydgoszcz,
3 Adam Mickiewicz University, Poznań
Abstract:
In this paper a novel approach to describing heterogeneous network vulnerabilities is
presented. We propose an ontology-based approach which can be utilized to describe
vulnerabilities in critical architectures, in single and multiple domains, as well as in
heterogeneous networks. In the paper our ontology based on asset-vulnerabilities is
presented. The created ontology will be used in the security-resiliency framework developed
in the INTERSECTION Project as well as in the Critical Infrastructures and SCADA
ontology in the INSPIRE Project.
Keywords:
network security, network vulnerabilities, ontology, heterogeneous networks
1. Introduction and motivation to apply ontologies
In both computer science and information science, an ontology is a form of data
model that represents a domain and can be used, for example, to reason about the objects in
that domain and the relations between them.
Ontologies are used in many different areas, e.g. artificial intelligence, the semantic web, software engineering and the information architecture of IT systems, as a form
of knowledge representation about the world or some part of it.
Generally, ontologies describe:
 individuals: the basic objects;
 classes: sets, collections or types of objects;
 attributes: properties, features, characteristics, or parameters that objects can have
and share;
 relations: ways that objects can be related to one another.
In general there are two kinds of ontologies:
1. A domain ontology (or domain-specific ontology) models a specific domain, or part
of the world. It represents the particular meanings of terms as they apply to that domain. Example domain ontologies are: the Gene Ontology for genes, the Foundational Model of Anatomy for human anatomy, SBO (Systems Biology Ontology) for computational models in biology, the Plant Ontology for plant structures and growth/development stages, and CIDOC CRM (Conceptual Reference Model) – an ontology for
“cultural heritage information”.
2. An upper ontology (or foundation ontology) is a model of the common objects that
are generally applicable across a wide range of domain ontologies. It contains a core
glossary in whose terms objects in a set of domains can be described. There are several standardized upper ontologies available for use, including Dublin Core, GFO,
OpenCyc/ResearchCyc, SUMO, and DOLCE.
Ontologies are usually exchanged as XML and RDF/XML constructs built
from higher-level ontology languages, such as OWL (Web Ontology Language) [1] and
SWRL (Semantic Web Rule Language) [2].
A comparison of ontology languages is given in [3]. The OWL language, standardised
by the World Wide Web Consortium (W3C), has strong industry support and is widely
used for policy-oriented services. There are different OWL variants, such as OWL Full
(the complete language), OWL DL (the full language used with reasoning systems) and OWL
Lite (used for simple constraints and a class hierarchy) [4].
Such ontology languages can be used to specify security-related problems (policies, vulnerability descriptions) in a way that allows formal semantic specification and
machine reasoning.
Using ontology languages, the policy is defined by concepts (taxonomies), relations,
functions, axioms, and instances.
In our understanding, to successfully apply the created ontology, the following elements have to be taken into account:
 Classes and their attributes with restrictions (created in OWL)
 Rules for these classes and attributes (created in SWRL)
 Instances stored in a related relational database.
To apply the ontology, restrictions and rules are crucial – without them the ontology
would not be functional.
Hereby, we propose to apply the ontology in the security-resiliency framework and in
the notation of Critical Infrastructure vulnerabilities.
The paper is organized as follows: in Section 2 the INTERSECTION Vulnerability
Ontology (IVO) is shown in detail. Our original asset-based approach to the problem of
vulnerability description, and the related standards, is presented in Section 2.2. In Sections
2.3 and 2.4 we show how to apply ontologies to real-life architectures and tools. In
Section 3 the motivation and our approach to ontology-based notation for Critical Infrastructures and SCADA are presented. Conclusions are given thereafter.
2. INTERSECTION Vulnerability Ontology (IVO)
2.1. On applying ontologies to heterogeneous networks
In the past years critical infrastructures were physically and logically separate systems with little interdependence. As digital information gained more and more importance for the operation of such infrastructures, what we might call a “cyber component”
of each critical system grew. These cyber components are currently connected through
heterogeneous networks and do represent the information infrastructure on which critical infrastructures rely and depend [5].
Unfortunately, the increasing complexity and heterogeneity of the communication
networks and systems used to connect such cyber components also increase their level
of vulnerability. Furthermore, the progressive disuse of dedicated communication infrastructures and proprietary networked components, together with the growing adoption
of IP-based solutions, exposes critical information infrastructures to cyber attacks coming from the Internet.
These infrastructures are characterised by a vulnerability level similar to other systems connected to the Internet, but the socio-economic impact of their failure can be
huge.
Some of the networks in which vulnerabilities should be identified, methodically classified and well described are Wireless LAN, GSM/UMTS, Sensor Networks, Satellite
Networks and WiMAX.
In this paper we propose an ontology-based approach to describing network vulnerabilities. While ontologies are becoming an important tool for the description of security-related issues (attacks, threats), not only in computer networks [6], there are not many
complete ontologies that also include vulnerability descriptions. Furthermore, there is a
lack of ontologies related to heterogeneous networks.
We will show our approach to vulnerability identification based on the ISO/IEC
13335-1:2004 standard [7] and the Shared Information/Data Model [8]. Our ontology is
created in the OWL DL language using the Protégé application [9].
2.2. Definition of a vulnerability – assets-based approach
Network vulnerabilities are often mistaken for threats and attacks. Therefore we decided to clearly define a vulnerability as an asset-related network weakness. Such weaknesses are then exploited by threats and attacks.
This definition of vulnerability is based on the ISO/IEC 13335 standard and is shown in
Figure 1.
Network assets should also be defined and described. We decided to use the Shared Information/Data (SID) Model, in which network assets and the relations between them are
defined. The SID Model provides Physical Resource Business Entity Definitions. The SID asset
description is specified in UML and visualized using UML diagrams.
We decided to use the following structure for the information regarding network vulnerabilities to be included in our ontology (a sketch of such a record is given after the list):
1. Vulnerability identification (ID) – if possible according to CVE (Common Vulnerabilities and Exposures) or CCE (Common Configuration Enumeration)
2. Vulnerability name (short)
3. Vulnerability description (long)
4. Likelihood of the vulnerability
5. Examples of related threats and attacks
6. Network assets the vulnerability is related to
7. How serious the damage caused by the vulnerability can be (impact, severity) – according to CVSS (Common Vulnerability Scoring System)
8. Level of tolerance – can the vulnerability be tolerated, under what conditions, and what is the risk of tolerating it?
9. How (if at all) the vulnerability may affect heterogeneous networks
10. Parameters of vulnerabilities (if possible)
11. Possible solutions to avoid it or minimise impact, security mechanisms against described vulnerability and related attacks
12. Keywords
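A minimal sketch of such a vulnerability record, expressed as a plain data structure, is shown below; the field names are our own illustration and do not come from CVE, CVSS or the IVO schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Illustrative sketch of a vulnerability record following the 12-point
# structure above; field names are our own, not part of any official schema.
@dataclass
class VulnerabilityRecord:
    vuln_id: str                      # 1. CVE/CCE identifier if available
    name: str                         # 2. short name
    description: str                  # 3. long description
    likelihood: str                   # 4. e.g. "low" / "medium" / "high"
    related_threats: List[str]        # 5. example threats and attacks
    related_assets: List[str]         # 6. SID assets the weakness belongs to
    severity_cvss: Optional[float]    # 7. CVSS base score (impact, severity)
    tolerance: str                    # 8. conditions under which it is tolerable
    heterogeneity_impact: str         # 9. effect on interconnected networks
    parameters: dict = field(default_factory=dict)             # 10. parameters
    countermeasures: List[str] = field(default_factory=list)   # 11. safeguards
    keywords: List[str] = field(default_factory=list)          # 12. keywords

record = VulnerabilityRecord(
    vuln_id="CVE-XXXX-YYYY", name="Unencrypted management interface",
    description="Management traffic sent in clear text over the access link.",
    likelihood="medium", related_threats=["eavesdropping"],
    related_assets=["Physical_Resources"], severity_cvss=5.0,
    tolerance="tolerable only on isolated segments",
    heterogeneity_impact="weakness propagates across interconnected domains",
)
```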
Figure 1. Vulnerabilities identification and definition on the basis of networks assets [7]
2.3. Vulnerability Ontology in the INTERSECTION framework
One of the goals of the INTERSECTION project is to identify and classify heterogeneous network vulnerabilities.
The knowledge about vulnerabilities is needed to cope more effectively with threats
and attacks, and to enhance network security.
Therefore network vulnerabilities should be identified, described, classified, stored
and analyzed.
To achieve these goals, a vulnerability ontology is required and has been developed.
The major aim of our ontology is to describe vulnerabilities beyond single-domain
networks and to extend relations/restrictions onto heterogeneous networks.
The aim of our asset-based approach is to show Vulnerabilities as a part of a security
ontology connected with Network Assets – Resources, Risk, Threats, Attacks and Safeguards – and the interconnections between them.
Every subclass inherits properties and restrictions from its superclass, which is why
we decided to structure our ontology in this way. For example, the classes Wired and Wireless inherit Resources, Topology, Vulnerabilities, Network_Structure, Risk and Safeguards from the superclass Network.
In our approach, we consider the Resources and Vulnerabilities classes to be the
most important components.
Figure 2. Our ontology visualized in Protege
The class Resources is based on the division proposed in the SID (Shared Information/Data) Model.
It includes the following subclasses:
 Physical_Resources,
 Logical_Resources,
 Software and Service.
The class Vulnerabilities is connected with Resources (vulnerabilities are exposed by them). That is why
the subclasses of the Vulnerabilities class are (a minimal OWL sketch of this hierarchy is given after this paragraph):
 Physical_Resources_Vulnerabilities,
 Logical_Resources_Vulnerabilities,
 Software_Vulnerabilities.
Finally, our INTERSECTION Vulnerability Ontology (IVO), visualized in Protégé, is
presented in Figure 2.
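A minimal sketch of the class hierarchy described above, expressed in OWL and serialized with the rdflib library, is shown below; the namespace URI and the property name isExposedBy are placeholders, not the identifiers used in the actual IVO.

```python
from rdflib import Graph, Namespace, RDF, RDFS
from rdflib.namespace import OWL

# Placeholder namespace -- not the project's real ontology URI.
IVO = Namespace("http://example.org/ivo#")
g = Graph()
g.bind("ivo", IVO)

hierarchy = {
    IVO.Network: [IVO.Wired, IVO.Wireless],
    IVO.Resources: [IVO.Physical_Resources, IVO.Logical_Resources,
                    IVO.Software, IVO.Service],
    IVO.Vulnerabilities: [IVO.Physical_Resources_Vulnerabilities,
                          IVO.Logical_Resources_Vulnerabilities,
                          IVO.Software_Vulnerabilities],
}
for parent, children in hierarchy.items():
    g.add((parent, RDF.type, OWL.Class))
    for child in children:
        g.add((child, RDF.type, OWL.Class))
        g.add((child, RDFS.subClassOf, parent))

# An illustrative object property linking vulnerabilities to the
# resources that expose them.
g.add((IVO.isExposedBy, RDF.type, OWL.ObjectProperty))
g.add((IVO.isExposedBy, RDFS.domain, IVO.Vulnerabilities))
g.add((IVO.isExposedBy, RDFS.range, IVO.Resources))

print(g.serialize(format="turtle"))
```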
2.4. Application of the INTERSECTION Vulnerability Ontology
The created ontologies will support the heterogeneous networks security architecture in
the following aspects:
1) the ontology provides information about the influence of heterogeneity on network security and resiliency issues;
2) the ontology will support and interact with Intrusion Detection and Anomaly Detection
Systems – the ontology will provide information about security risks and threats in specific interconnection scenarios. The IDS system will receive information on how to act in
such scenarios (e.g. how often the packets should be sniffed, what features should be
extracted, etc.);
3) the ontology will support decisions of Intrusion Tolerance Systems – the ontology will
provide information about tolerance, expected False Positives, etc.;
4) the ontology will provide useful information for the security architecture visualization
module – additional information for end-users;
5) the threat ontology supports the Complex Event Processor Module (a part of the IDS system) – the threat ontology will drive the decision engine while performing the correlation activity;
6) the ontology is a core for the design of a relational vulnerabilities database (similar to
NVD) created in the FP7 INTERSECTION Project. The database is called IVD (INTERSECTION Vulnerability Database) and is now available at:
http://reebonium.itti.com.pl/intersection/index.php?module=home.
Moreover, the IVO ontology is used in the real-life ontology-logic based tool called
PIVOT.
Figure 3. PIVOT in operation (screenshot)
PIVOT (Project INTERSECTION Vulnerability Ontology Tool) is the ontology-logic based manager tool and repository developed within the INTERSECTION project
research.
Our goal was to apply the ontology in a real-life application.
It is an end-user oriented application which allows the vulnerability ontology to be browsed and modified. One of its biggest advantages is the client-server architecture,
which allows one ontology to be shared by multiple users (e.g. by network operators). The
ontology interface built into PIVOT is user-friendly and intuitive.
The PIVOT interface in operation is shown in Figure 3. PIVOT is now available at:
http://reebonium.itti.com.pl:8081/lps-4.1.1/pivot/index.html.
3. Critical infrastructures vulnerability notation and description
Another apparent aspect of heterogeneous network vulnerabilities is related to
Critical Infrastructures and SCADA (Supervisory Control and Data Acquisition) systems.
The increasing success of information and communication technologies, together
with the progressive disuse of dedicated communication networks, is leading to a new
way of controlling and managing critical infrastructures, which are currently organized
as strictly connected, albeit different, elements of a single system rather than as
autonomous entities to be appropriately integrated.
More precisely, systems controlling critical infrastructures are rapidly moving from
dedicated and proprietary solutions towards IP-based integrated frameworks made of
off-the-shelf products.
SCADA is an acronym for Supervisory Control and Data Acquisition. SCADA systems are used to monitor and control a plant or equipment in industries or critical infrastructures such as water and waste control, energy, oil and gas refining and
transportation. These systems encompass the transfer of data between a SCADA central
host computer and a number of Remote Terminal Units (RTUs) and/or Programmable
Logic Controllers (PLCs), and between the central host and the supervising operator terminals.
A SCADA system gathers information (such as where a leak on a pipeline has occurred), transfers the information back to a central site, then alerts the home station that
a leak has occurred, carrying out necessary analysis and control, such as determining if
the leak is critical, and displaying the information in a logical and organized fashion,
usually through mimic diagrams.
These systems can be relatively simple, such as one that monitors environmental
conditions of a small office building, or very complex, such as a system that monitors
all the activity in a nuclear power plant, a power distribution network or the activity of a
municipal water system. Traditionally, SCADA systems have made use of the Public
Switched Network (PSN), or leased line for monitoring purposes.
Today many systems are monitored using the infrastructure of the corporate Local
Area Network (LAN) / Wide Area Network (WAN), secure tunnels over public networks
and, mainly in the electric power industry, particular communication technologies like
power line communication (PLC); wireless technologies (including satellite) are now
being widely deployed for monitoring purposes. We also observe an increasing integration of Wireless Sensor Networks (WSN) for fine-grained monitoring and of mobile
devices as sources of information and diagnostics for technical staff. New communication standards such as ZigBee, WLAN and Bluetooth are also emerging in SCADA and PCS
systems.
Moreover, more and more often, in order to guarantee the resilience of the SCADA
communication system, a heterogeneous network is implemented through the integration of two or more of the above-mentioned technologies.
These changes have exposed SCADA systems to the vulnerabilities that have long threatened standard, general-purpose ICT systems. For example, the switch from leased telecommunication lines to a shared communication infrastructure (in some cases a public
network), as well as the interconnection of the SCADA system with the company network and
systems (for administrative purposes, for example to facilitate billing or production
forecasting), have introduced many new vulnerabilities into SCADA systems, dramatically increasing the risk of attacks both from the inside and from the Internet.
The shared communications infrastructure thus becomes an obvious target for disrupting a SCADA network.
As companies have added new applications, remote access points and links to other
control systems, they have introduced serious online risks and vulnerabilities that cannot be addressed by their physical control policies.
For example, an attack could be mounted against a field network based on wireless
technologies – an attack that constricts or prevents the real-time delivery of SCADA messages, resulting in a loss of monitoring information or of control over portions of the SCADA
system. Likewise, an attacker may engineer a denial of service (DoS) to inhibit vital
features of a complex SCADA system, such as control data aggregation in a distributed
or layered control system, or to cause a lack of real-time status and historical data synchronization in a central SCADA back-up system.
While the physical security of critical infrastructure components (including the control
system) has already been investigated, scarce attention has been paid so far to the analysis of vulnerabilities resulting from the use of commercial communication networks to
transport management information between ICT systems devoted to the control of critical infrastructures.
Therefore, in the context of the INSPIRE Project, a Critical Infrastructures and
SCADA vulnerabilities ontology is required.
Such an ontology will be applied in a Stream Condenser, which aggregates events and
translates low-level events (i.e. events related to specific aspects of individual components of a SCADA system) into high-level events (i.e. events which are relevant in a
resiliency-oriented view of the overall SCADA system).
The transition from low level events to high level events is made possible by the use
of ontologies.
Ontologies represent an effective tool to describe complex and interdependent symptoms related to faults and attacks, since they are meant to provide formal specification
of concepts and their interrelationships.
The ontology based hierarchical organization of event patterns can be used to partially automate the process of deriving queries for resiliency analysis.
In a typical scenario, the system administrator would search for high-level events related to the resiliency of the SCADA system, with no need to bother with details related
to the specific characteristics of a particular RTU.
Therefore, our further work is now connected with extending and reusing the IVO ontology described in Section 2, in order to efficiently describe relations in critical infrastructures, telecommunication networks and SCADA systems.
4. Conclusions
In this paper our ontology describing heterogeneous network vulnerabilities has
been presented. We motivated and proposed our own approach to defining,
identifying and describing network vulnerabilities.
The created ontology is now applied in a real-life heterogeneous networks security-resiliency framework architecture.
Moreover, on the basis of the created ontology, a vulnerabilities database (similar to
the NVD database) focusing on heterogeneous networks has been created. PIVOT – an
ontology-logic based tool – has also been developed.
Finally, our new approach of applying the ontology to describe relations and vulnerabilities in Critical Infrastructures, telecommunication networks and SCADA systems has also been presented, motivated and discussed.
Acknowledgement
The research leading to these results has received funding from the European
Community's Seventh Framework Programme (FP7/2007-2013) under grant agreement
no. 216585 (INTERSECTION Project) and grant agreement no. 225553 (INSPIRE
Project).
References
[1] OWL Web Ontology Language Semantics and Abstract Syntax, June 2006,
http://www.w3.org/TR/owl-features/ .
[2] SWRL: A Semantic Web Rule Language Combining OWL and RuleML, W3C
Member Submission, http://www.w3.org/Submission/SWRL/.
[3] A. Gomez, O.Corcho, Ontology languages for the Semantic Web, IEEE Intelligent
Systems, Jan./Febr., 2002.
[4] FP6 NetQos Project – Deliverable 2.7.: Graphical User Interface and database
repository.
[5] FP7 INTERSECTION (INfrastructure for heTErogeneous, Resilient, Secure,
Complex, Tightly Inter-Operating Networks) Project – Description of Work.
[6] Ekelhart A., Fenz S., Klemen M., Weippl E., Security Ontologies: Improving
Quantitative Risk Analysis, Proc. of the 40th Hawaii International Conference on
System Sciences, 2007.
[7] ISO/IEC 13335-1:2004, Information Technology – Security Techniques – Management of information and communications technology security – Part 1: Concepts and models for information and communications technology security
management.
[8] Shared Information/Data Model – TeleManagement Forum, October 2002.
[9] http://protege.stanford.edu/
[10] Choraś Michał, Renk R., Flizikowski A., Hołubowicz W., Ontology-based description of networks vulnerabilities, Polish Journal of Environmental Science,
2008.
[11] FP7 INSPIRE (INcreasing Security and Protection through Infrastructure REsilience) Project – Description of Work.
[12] D’Antonio S., Romano L., Khelil A., Suri N., INcreasing Security and Protection
through Infrastructure Resilience: the INSPIRE Project, Proc. of The 3rd International Workshop on Critical Information Infrastructures Security (CRITIS'08), October 2008.
[13] G.A. Campbell, Ontology for Call Control, Technical Report CSM-170, ISSN
1460-9673, June 2006.
The role of standardization in the process of forming
quality of educational repository in ODL
Magdalena Ciszczyk, Emma Kusztina
Szczecin University of Technology,
Faculty of Computer Science and Information Technology
Abstract:
Creation of an educational repository results in the development of new conditions
regarding the learning process, especially in the case of ODL (Open and Distance Learning).
Working in a network environment demands developing mechanisms that allow for the
preparation and distribution of didactic materials on a chosen computer platform. It then
requires harmonizing and standardizing the form and way of presenting the content and
analyzing quality indicators in order to evaluate the repository as a final product. In the
article the authors discuss the meaning of standardization in the process of creating the
educational repository.
Keywords:
repository, standardization, quality, competence
1. Introduction
Knowledge development in all its forms, its dissemination and practical utilization play
a fundamental role in creating economic welfare. As a result, great significance is attached to recruiting employees who will be able to fulfil the tasks placed before
them and contribute to the success of the organization. In order to recognize such employees, to define the features that determine the effectiveness of human performance at
certain work positions, and to learn the methods of planning the process of
employee development, the concept of competences was introduced.
With the appearance of the new requirements, Higher Education Institutions started
adapting their educational offers to this new situation on the job market. However, the
crucial element of adapting educational offers to job market needs is understanding the
scope and structure of competences, which are the basis for evaluating potential staff.
This creates the need for standardization activities directed at developing an unambiguous way of presenting the scope and content of required competences.
The competence approach to recruiting staff began spreading as early as the 1960s
and 1970s, when McBer & Co. developed the first competence
model aimed at improving the methods of selecting diplomats to represent the USA abroad
[25]. Around the same time Competency Based Teacher Education (CBTE) was also
proposed [26].
Nowadays, all over Europe work is being carried out on competence standardization for the needs of learning programmes (e.g. [8]), as well as to better describe
jobs (e.g. [10]). One of the goals of the Bologna Process initiative, aimed at creating
the European Higher Education Area, is to introduce a common Qualification Framework
based on competences [9].
Classification of competences, so that their meaning can be understood unambiguously by
both humans and machines, is the goal of competence description standards such as the
IEEE Reusable Competency Definition [24] or HR-XML [22]. These standards were
also analyzed and utilized in the works of the TENCompetence project (2005-2009)
[14]. Perhaps all of this will soon lead to the creation of a common catalogue of competences
that will support the mobility of students during their studies and provide a basis
for better understanding of academic achievements and acquired competences by
employers.
From the point of view of Higher Education Institutions, standardization of competences is going to make educational offers compatible with job market
requirements. This compatibility will be guaranteed through elaborating appropriate
learning programmes, as well as the indispensable didactic materials. The problem of elaborating didactic materials, their extension and further development with regard to guaranteeing
that appropriate competences are obtained on their basis, gains special significance in the learning process carried out in ODL conditions. The development of LMS systems provides
means of distributing the didactic materials but does not solve the problems of creating
and using the content.
As a result, new ideas emerge, aimed at supporting the distance learning process.
One such idea is creating an educational repository which possesses mechanisms
suitable for adapting the didactic material placed in it to the needs of a particular educational situation consisting of: learning goal, basic knowledge, required competences.
2. Role of the repository in realizing the learning process
From the etymological point of view, a repository is an ordered storehouse or storage of documents together with the means for using them. Today it is widely
understood as a place for keeping and extending digital resources. Depending on the goal and range of
use, different types of repositories can be distinguished: domain knowledge (subject-specific), institutional, personal or national repositories. Many of them are used for distributing scientific publications, but repositories of
didactic materials supporting the teaching/learning process are becoming more
and more popular.
An educational repository is designed for modeling the philosophical, scientific, scientific-technical and scientific-technological state of a chosen knowledge domain by preparing
didactic materials in the form of Learning Objects (LO) and subsequently arranging the
LOs in sequences and distributing them in the network environment [1]. The educational
repository is the basis for obtaining competences by the student according to the educational goals. The educational repository consists of two main parts. The first one is the
specific software (Oracle iLearning, LMS – WBTServer, Claroline, Moodle) which
allows for storing and extending didactic materials [18]. The second one is the content
of the stored didactic materials. The quality of the first part is guaranteed by
keeping to the existing standards (e.g. ISO 9126), which might, however, require additional
interpretation. On the other hand, the guarantee of the quality of the content and the possibility
of exchanging didactic materials between different participants of the learning process
is based on the standards describing the structure and content of the materials with
the use of metadata (e.g. SCORM, AICC [19]). What is lacking are standards defining the way
of representing the type and depth of knowledge within the specified goal and object of
teaching/learning.
The authors suggest that the scope of theoretical and procedural knowledge delivered
through the repository mechanisms should be presented as a triple: “domain description
excerpt – typical task – typical solution” [1],[12]. The division of knowledge into three
types is justified by the main goal of creating the repository – assuring that
competences will be obtained by the participants of the learning process. In conditions of
individual learning, such a division of knowledge strongly influences the process of solidifying competences. This approach is confirmed by the definition of competence presented in [13], which is based on:
1) knowledge in the common sense (declarative knowledge – I know “what”),
2) abilities (procedural knowledge – I know “how”, I can),
3) attitudes (I want and I am ready to use my knowledge).
The proposed “triple” makes it possible to structure theoretical and procedural knowledge and to link it with the results of personal experience. It forces students to increase
their involvement in the learning process while preparing individual tasks, and the students' creativity also increases. After assimilating a “triple” elaborated by a teacher, a student is able, using the theoretical ground, to e.g. formulate a new task and give new mechanisms for its solution, or to build a new solution to the same task. The solved tasks
and their solutions can become a basis for filling the repository with new didactic material and can become new LOs (a simple sketch of such a structure is given below).
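As a simple illustration, the proposed triple could be represented by a plain data structure like the one below; all field names and the example content are our own, not part of SCORM or any LMS schema.

```python
from dataclasses import dataclass, field
from typing import List

# Illustrative sketch of a Learning Object built around the proposed
# "domain description excerpt - typical task - typical solution" triple.
@dataclass
class LearningObjectTriple:
    domain_excerpt: str          # fragment of the domain description (theory)
    typical_task: str            # a task representative for that fragment
    typical_solution: str        # worked solution demonstrating the method
    target_competences: List[str] = field(default_factory=list)
    keywords: List[str] = field(default_factory=list)

lo = LearningObjectTriple(
    domain_excerpt="Shortest paths in weighted graphs (Dijkstra's algorithm).",
    typical_task="Find the shortest route between two given cities.",
    typical_solution="Step-by-step run of Dijkstra's algorithm on the map graph.",
    target_competences=["graph modelling", "algorithm selection"],
)
```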
In accordance with the previously presented foundations, creating the educational repository in ODL conditions requires two main elements: the information environment (the application software needed for distributing didactic materials and realizing the
teaching/learning process) and the content, with the procedures for its creation and utilization.
Only together do these two parts constitute the final product, which can be submitted for
quality assessment from the point of view of the technical aspects of the product and, even
more importantly, from the point of view of the user and his/her expectations.
3. Quality concept in repository development
Evaluation of the repository as a quality final product requires an analysis of the definition of quality [23]. The concept of a product's quality is an inherent element of its
production and usage. Plato considered quality as the degree of achieved excellence,
something inconceivable, absolute and universally recognizable, though impossible to
define precisely [2]. Aristotle said that quality is a set of features distinguishing a
thing from other things of the same type [3].
Currently, quality is often understood as the degree of a certain object's concordance with the expectations of the experiencing subject [2]. Juran defines quality as a feature or a set of features that can be distinguished and that are essential for a certain
product, e.g. size, appearance, reliability, manufacturing method [4]. In the context of
the user, the same author defines quality as usability, the degree to which the specified
product satisfies the needs of a specified purchaser [5]. On the other hand, Crosby talks
about quality in the aspect of production and treats it as the conformity with requirements (specification) [6]. According to Deming, quality is the degree of homogeneity
and reliability of the product at the lowest possible costs and maximal adaptation to
market requirements [7]. The term is also interpreted in the ISO standard: ISO 9000:2000 defines quality as the degree to which a set of inherent properties fulfils
the requirements, where a requirement is a need or an expectation which has been defined, is
commonly accepted or is current [2].
The quoted definitions do not exhaust the range of interpretations of the term available in the literature, but they can become a basis for discussing the quality of the educational repository, which requires a complex, multi-aspect analysis to decide
whether the provided product properly realizes its functions and ensures a uniform quality
of educational transfer for each participant of the learning process, strengthened by
mutual cooperation. Moreover, they are also the basis for considering the quality of the
repository on the one hand as the quality of the product, and on the other hand as the quality of
repository development and utilization (the preparation and utilization process).
4. Quality standards as a part of quality repository assessment
When analyzing the repository as a computer environment (software) and as content,
one should take into account that there exist standards which can be the basis of
integral repository quality.
The ISO 9126 norm is fundamental for assessing software quality and
is based on three sets of characteristics: internal quality, external quality and in-use quality
[23]. Internal quality is the totality of an application's features related to its development and maintenance; thus it may include static and dynamic models as well as documentation and source code. External quality covers the assessment of those software
features which concern its utilization in practice and the realization of users' needs. The basic
criteria of internal and external quality include: functionality, reliability,
usability, efficiency, maintainability and portability [15][16][17]. According to the ISO
norm, in-use quality is the degree to which the requirements of a user working in a
certain application environment are met. In-use quality defines the ability of an application
to realize the assumed functions in terms of four indicators: effectiveness, productivity, user satisfaction and safety [11].
Although the repository is a software product, its technical aspects are not considered in this article. This results from the fact that many tested commercial (Oracle iLearning, WebCT Vista, LMS – WBTServer, Claroline, etc.) and noncommercial
(Moodle) solutions that can play the role of the educational repository already exist on the market. For the needs of education, the Moodle platform is especially widely used. This
application, being an LMS (Learning Management System) type solution, enables placing didactic materials within it, administering users, reporting user activity and supporting
their mutual communication.
The authors, however, using the quality indicators of the third characteristic (in-use
quality), made an attempt at interpreting them in the context of the preparation
and usage of the repository content (Table 1, Table 2).
Table 1. In-use quality: effectiveness and productivity characteristics [23]

Effectiveness – capability of the software to provide the user with the possibility to achieve implied goals in the specified use context. Interpretation for the needs of the knowledge repository:
 Task effectiveness – competences guaranteeing level: the degree of correctly realized goals of the specified task (which part of the goals of the specified task was correctly realized? amount of finished tasks / total amount of tasks "attempted" by users); cooperation level between the teaching/learning process participants.
 Task completion – degree of thematic fulfillment of the repository: LOs in the form of triples "domain description excerpt – typical task – typical solution" with a corresponding test-task; correctness of task solutions.
 Error frequency – the frequency of errors made while solving tasks.

Productivity – capability of the software to provide the user with the proper amount of resources in relation to achieved effectiveness in the specified context of use. Interpretation for the needs of the knowledge repository:
 Task time – time spent on performing a task.
 Task efficiency – users' efficiency; effort put into solving a task (number of solution attempts).
 Economic productivity – how cost-effective is the user?
 Productive proportion – proportion of time spent by the user on performing productive actions.
 Relative user efficiency – students' efficiency regarding task solution compared to the teacher's efficiency.
Table 2. In-use quality: safety and satisfaction characteristics [23]

Safety – capability of the software to achieve the accepted risk level reached by the user in the specified context of use; it includes the health and safety consequences of software usage as well as unintended economic consequences. Interpretation for the needs of the knowledge repository:
 User health and safety – influence of the teaching/learning process based on the repository on the health and safety of users.
 Economic damage – incidence of economic damage caused by realization of the teaching/learning process using the software (repository).
 Software damage – frequency of software malfunction.
 Safety of people affected by use of the system – incidence of threats to user safety when exploiting the repository.

Satisfaction – capability of the software to meet users' needs in specified use conditions. Interpretation for the needs of the knowledge repository:
 Satisfaction scale – user satisfaction level as a result of the teaching/learning process based on the repository.
 Satisfaction questionnaire – user satisfaction regarding specific software features.
 Discretionary satisfaction – proportion of users choosing the software for teaching/learning needs.
Although the ISO 9126 norm enables assessing the quality of a software product, there
also exists a group of standards which focus more on content description and management: SCORM, AICC, IMS or PENS [19].
Nowadays, the most popular of them is SCORM (Sharable Content Object Reference Model), developed by such organizations as IMS, AICC, LTSC and ADL
[20][21]. The leading standardization works concern four main areas:
 Content packaging standard – defining the ways of grouping and linking the files that make up one didactic unit, so that each of the many files ends up in the right place on the
target learning platform. In SCORM this standard was created on the basis of IMS
GC.
 Communication standard – defining what information should be exchanged and
how communication between the management system and the didactic unit should
be realized (format of data transmission, communication protocol between the management system and the didactic unit). The communication standard of SCORM (SCORM
Run-time Environment, RTE) was built by adapting the entire AICC standard.
 Metadata standard – containing the information needed for indexing didactic materials, which increases the effectiveness of searching for and using them, but above all defines the structure of particular elements (courses or other
didactic units). Metadata are saved and processed in the form of an XML document;
in SCORM the SCORM Metadata Generator is used for this purpose.
 Sequencing standard – determining the sequence of LOs in an interactive environment.
It concerns the issue of a learner's path through a course. The sequencing standard
in SCORM is based on the IMS Simple Sequencing Specification.
Thanks to the intensive efforts to assure transferability of didactic
materials between different LMS systems, SCORM makes it possible
to describe the way a course is created and to pass the content of a course to the learning
process participant; however, there is still a lack of solutions for such problems as where the
didactic content is stored or which learning model is being passed. From the point of view of an
educational repository, in which the contained didactic material should guarantee obtaining competences, it is essential to explain what kind of didactic materials and what
piece of knowledge a didactic unit includes. It is necessary to search for new mechanisms supporting standardization and appropriate content management, and thus increasing the quality of the educational repository.
5. Conclusion
1. The specificity of the educational repository as a final product lies in the fact that
this product undergoes a constant development process.
2. Assessment of the repository as a final product should consider the basic task of
this instrument: guaranteeing the acquisition of competences during the learning process.
3. It is necessary to prepare an appropriate structure and scope of theoretical and procedural knowledge – the proposed division into “domain description excerpt – typical
task – typical solution”.
4. Evaluation of the educational repository should be performed regarding both the
producer's side (creator of the repository) and the client's side (participant of the learn-
ing process), taking into account the fact that the educational repository consists of the
information platform and the content.
5. There exist standards that can be interpreted for the needs of quality assessment
of the educational repository. However, they do not cover the quality assessment of the repository in the learning process in its entirety.
6. The standardization processes in the area of describing competences, knowledge or
software solutions indicate the openness and relevance of the problem.
7. In ODL implementations it is a very attractive topic for multi-disciplinary research in such domains as cognitive science, pedagogy, informatics or knowledge
representation methods.
References
[1] Kushtina E. The concept of an open information system of distance learning. [In
Polish] Szczecin: Wydawnictwo Uczelniane Politechniki Szczecińskiej, 2006.
[2] Urbaniak M. Management of quality, environment and safety in economic practice.
[In Polish]Warszawa: Difin, 2007.
[3] Arystoteles Categories. [In Polish] Warszawa: PWN, 1990.
[4] Juran J.M., Gryna F.M. Quality, designing, analysis. [In Polish] Warszawa: WNT,
1974.
[5] Juran J.M. Quality Control Handbook. New York: McGraw-Hill, 1988.
[6] Crosby Ph. Running things: the art of making things happen. New York: McGraw-Hill, 1986.
[7] Deming W.E. Quality, productivity and competitive position. Cambridge, Massachusetts: MIT Press, 1982.
[8] Tuning Educational Structures In Europe Project. [online]
http://www.relint.deusto.es/TuningProject/index.htm, 2008.
[9] Bologna Process. [online] http://www.bologna-bergen2005.no, 2007.
[10] Gluhak A., Adoue F. Catalogue of competences – European competence profiles
for Multimedia jobs v 1.0, European ICT Jobs, CompTrain, R4.2. [online]
http://www.annuaire-formation-multimedia.com/catalogue.pdf, 2008.
[11] ISO/IEC 9126-4: Software engineering – Product quality. Part 4: Quality in use
metrics (2004). ISO, Geneva, 59 p., 2004.
[12] Różewski P., Różewski J. Method of competence testing in virtual laboratory environment. [In Polish] In: Operational and system researches 2006, pp. 349-360,
Warszawa: Akademicka Oficyna Wydawnicza EXIT, 2006.
[13] Kossowska M., Sołtysińska I. Employee training and organization development.
[In Polish] p. 14, Kraków: Oficyna Ekonomiczna, 2002.
[14] TENCompetence. [online] http://www.tencompetence.org/, 2007.
[15] Hofman R. Software quality models – history and perspectives. [online]
http://www.teycom.pl/docs/Modele_jakosci_oprogramowania.pdf, 2007.
[16] Abran A., Al-Qutaish R.E., Cuadrado-Gallego J.J. Analysis of the ISO 9126 on
Software Product Quality Evaluation from the Metrology and ISO 15939 Perspectives. http://www.gelog.etsmtl.ca/publications/pdf/1005.pdf, 2007.
[17] Glosiene A., Manzuch Z. Usability of ICT-based systems, In CALIMERA Deliverable 9. [online]
http://www.calimera.org/Lists/Resources%20Library/The%20end%20user%20exp
erience,%20a%20usable%20community%20memory/State%20of%20the%20Art%
20Review%20on%20Usability%20of%20Core%20Technologies.doc, 2007.
[18] Kusztina E., Zaikin O., Ciszczyk M., Tadeusiewicz R. Quality factors for knowledge repository: based on e- Quality project. [online] EUNIS 2008,
http://eunis.dk/, 2008.
[19] Waćkowski K., Chmielewski J.M. Rola standaryzacji platform w e-learningu,
[online] http://www.e-mentor.edu.pl/artykul_v2.php?numer=19&id=406, 2008.
[20] Kotrys R. Standardy w nauczaniu na odległość, Poznańskie Warsztaty Poligraficzne. [online] http://www.pwt.et.put.poznan.pl/2004/PWT1613.pdf, 2008.
[21] SCORM. [online] http://www.adlnet.gov/scorm/, 2008.
[22] HR-XML measurable competencies. [online] http://www.hr-xml.org, 2008.
[23] Kusztina E., Ciszczyk M. ISO 9126 norm interpretation from the point of view of
knowledge repository quality in a network environment. ACS 2008, 2008.
[24] IEEE P1484.20.1 reusable competency definitions draft 8. [online]
http://www.ieeeltsc.org/workinggroups/wg20Comp/wg20rcdfolder/IEEE_1484.20.
1.D8.pdf, 2008.
[25] Kline S.D. Management competency models and the life-long learning project:
what role for absel?. [online] http://sbaweb.wayne.edu/~absel/bkl/vol13/13be.pdf,
2008.
[26] College of Education Field-Based/Competency-Based Teacher Education
(FB/CBTE); [online]
http://www.temple.edu/cte/programs/fieldbased/FBCBTE.html, 2008.
Unified JPEG and JPEG-2000 color descriptor
for content-based image retrieval
Paweł Forczmański
Szczecin University of Technology,
Faculty of Computer Science and Information Technology
Abstract:
The problem investigated in this paper concerns image retrieval based on the compressed
form of an image, which gives considerable advantages in comparison to traditional methods involving
image decompression. The main goal of this paper is to discuss a unified visual descriptor
for images stored in the two most popular image formats – JPEG/JFIF and JPEG-2000 – in
the context of content-based image retrieval (CBIR). Since the problem of CBIR attracts
special interest nowadays, it is clear that new approaches should be discussed. To
achieve this goal a unified descriptor based on low-level visual features is proposed. The
algorithm operates in both DCT and DWT compressed domains to build a uniform,
format-independent index. It is represented by a three-dimensional color histogram
computed in CIE L*a*b* color space. Sample software implementation employs a
compact descriptor calculated for each image and stored in a database-like structure. For
a particular query image, a comparison in the feature-space is performed, giving
information about images' similarity. Finally, images with the highest scores are retrieved
and presented to the user. The paper provides an analysis of this approach as well as the
initial results of application in the field of CBIR.
Keywords:
JPEG, JPEG-2000, visual descriptor, content-based image retrieval
1. Introduction
Content-Based Image Retrieval (CBIR) has been a very attractive topic in the scientific
community for many years. Although many academic solutions are known (e.g.
IBM's QBIC [1], MIT's Photobook [2], VisualSEEK [3]), there are almost no commercial or industrial applications available. This is mostly caused by the non-trivial problems
that developers of such systems have to solve. The most important are the way the images
are represented in the database and the way they are compared.
The automatic recognition of objects within an image can employ various
features. The most popular and widely used are: shape, texture, color, luminance, the context of the information (background, geographical, meteorological, etc.) and behavior
(mostly movement). It is possible to use more than one feature at the same time, but
such an approach is rather rare [4]; usually, each recognition method is limited to only
one feature. On the other hand, the literature survey shows that combining object instances coming from different sources (instead of different features of the same image)
is gaining popularity among researchers [5].
The literature survey shows that color and texture are among the most dominant features used in content-based image retrieval. This paper focuses on color, since it is the
most popular characteristic discussed in the literature and guarantees good efficiency
when it comes to single-feature recognition [6],[7]. The approach presented
here involves a direct compressed-domain representation of color features, which requires
less computing power. The following sections explain the principles of the respective visual
descriptor, the method of comparison and preliminary results of experiments performed
with the help of prototype software and benchmark databases.
2. Algorithm Overview
The image retrieval described in this paper is based on a unified low-level descriptor
which captures information about image colors. It treats an image in a holistic manner,
hence no segmentation has to be performed. This approach is motivated by the fact that in
general-purpose applications people look mostly for images belonging to different
classes, containing many objects that are hard to segment and describe. The developed method is
less complicated than other recent proposals such as [8] and [9], yet gives relatively good
results. A preliminary discussion and the results of this approach were included in [10].
In this paper, by an uncompressed image we understand a joint set of matrices of
the same dimensions, whose elements describe individual pixel intensities
(often represented by 8-bit integer values). In the case of a gray-scale image there is only
one matrix, while a true-color image contains at least 3 matrices. On the other hand, a
compressed image is a dedicated structure containing size-reduced data needed to
reconstruct the visual information corresponding to the original (uncompressed) image.
A compressed file is usually a bitstream that is hard to inspect in a visual way.
2.1. Color Histogram Descriptor
As noted above, color is one of the most important features taken into consideration
when it comes to content-based image retrieval. The use of color is motivated by the
way the Human Visual System (HVS) works. It has been shown that in good lighting conditions a human being pays attention first to the intensity and color of objects, second to shape
and movement, and then to texture and other properties.
There have been many color descriptors proposed in the past, most of them based on
different color-subspace histograms and dominant/mean values. Nowadays, when the
MPEG-7 standard is being introduced, the most promising are compact descriptors
which join color information and its distribution: Scalable Color (SCD), Dominant
Color (DCD), Color Layout (CLD). However, not all of the above descriptors can be
easily implemented for images represented in the compressed domain.
The color information captured by the descriptor discussed here is represented by a
three-dimensional histogram of pixel values in the CIE L*a*b* color space. This space was
intentionally chosen, since it resembles human perception and is perceptually linear [11]. This means
that the difference between two colors observed by a human is proportional to the numerical distance in the color space. The CIE L*a*b* space is defined by 3 components: L
– pixel intensity (L=0 means black, L=100 – white), a – pixel position between red and
green (a<0 green, a>0 red), b – pixel position between blue (b<0) and yellow (b>0).
The algorithm requires the image to be represented in the CIE L*a*b* space, which is derived from the XYZ color space, calculated from the RGB space according to the
following formula [12]:
$$
\begin{cases}
X = 0.431\,R + 0.342\,G + 0.178\,B \\
Y = 0.222\,R + 0.707\,G + 0.071\,B \\
Z = 0.020\,R + 0.130\,G + 0.939\,B
\end{cases}
\qquad (1)
$$
It is an intermediate step, which leads to the final XYZ to CIE L*a*b* conversion:
$$
\begin{cases}
L^{*} = 116\, f(Y/Y_n) - 16 \\
a^{*} = 500\,\left[\, f(X/X_n) - f(Y/Y_n) \,\right] \\
b^{*} = 200\,\left[\, f(Y/Y_n) - f(Z/Z_n) \,\right]
\end{cases}
\qquad (2)
$$
where:
$$
f(t) =
\begin{cases}
t^{1/3} & \text{if } t > \left(\frac{6}{29}\right)^{3} \\[4pt]
\frac{1}{3}\left(\frac{29}{6}\right)^{2} t + \frac{4}{29} & \text{otherwise}
\end{cases}
\qquad (3)
$$
For research purposes the parameters Xn, Yn and Zn have been fixed so that they correspond roughly to mid-day sun in Europe, i.e. to the daylight
illuminant D65 (Xn = 0.950469, Yn = 1, Zn = 1.088970) [11].
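As an illustration, a minimal Python sketch of the conversion defined by equations (1)-(3) is shown below; it only mirrors the coefficients and the D65 white point quoted above (the original prototype was implemented in Java, so this is not the authors' code).

```python
import numpy as np

# White point quoted above (daylight illuminant D65).
XN, YN, ZN = 0.950469, 1.0, 1.088970

def rgb_to_lab(rgb):
    """Convert an array of RGB triples (values in [0, 1]) to CIE L*a*b*
    using equations (1)-(3)."""
    rgb = np.asarray(rgb, dtype=float)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    # equation (1): RGB -> XYZ
    x = 0.431 * r + 0.342 * g + 0.178 * b
    y = 0.222 * r + 0.707 * g + 0.071 * b
    z = 0.020 * r + 0.130 * g + 0.939 * b

    # equation (3): the piecewise helper function f(t)
    def f(t):
        return np.where(t > (6 / 29) ** 3,
                        np.cbrt(t),
                        t / (3 * (6 / 29) ** 2) + 4 / 29)

    # equation (2): XYZ -> L*a*b*
    L = 116 * f(y / YN) - 16
    a = 500 * (f(x / XN) - f(y / YN))
    b_star = 200 * (f(y / YN) - f(z / ZN))
    return np.stack([L, a, b_star], axis=-1)

print(rgb_to_lab([[1.0, 1.0, 1.0], [0.5, 0.2, 0.1]]))
```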
Each dimension of the color space is discretized into 8 bins, which gives 512 bins in total for the
whole space. The bin width for L is equal to 100/8 = 12.5; for a and b it is equal to 15 (6 inner bins per dimension). The first and the last interval of a and b are
virtually unlimited: the first one covers values less than -45, while the last one covers values greater than 45, which restricts the space to colors obtainable in practice from RGB. These outer intervals are relatively wide, which is implied by the observation that most
images of real scenes (i.e. photographs) do not require more precise quantization;
hence the total number of bins can be reduced without noticeable loss in the center of
the space.
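A possible realization of this quantization is sketched below; the exact handling of interval edges is our assumption, while the bin widths and the ±45 boundaries follow the description above.

```python
import numpy as np

# Sketch of the bin-index computation implied by the quantization above:
# 8 bins for L (width 12.5) and 8 bins for a and b (6 inner bins of width 15
# plus two unbounded outer bins below -45 and above 45). Indices are 0-7.
def lab_bin_indices(lab):
    lab = np.asarray(lab, dtype=float)
    L, a, b = lab[..., 0], lab[..., 1], lab[..., 2]
    l_idx = np.clip(np.floor(L / 12.5), 0, 7).astype(int)
    # shift a/b so that -45 starts bin 1 and values above 45 fall into bin 7
    a_idx = np.clip(np.floor((a + 45) / 15) + 1, 0, 7).astype(int)
    b_idx = np.clip(np.floor((b + 45) / 15) + 1, 0, 7).astype(int)
    return np.stack([l_idx, a_idx, b_idx], axis=-1)

print(lab_bin_indices([[50.0, -50.0, 10.0]]))   # -> [[4 0 4]]
```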
A histogram of pixel values is a standard feature capturing invariant image characteristics. The histogram employed in the color descriptor here has two variants: an unweighted (direct) and a weighted one. The first type is a typical histogram which only counts
the occurrences of pixels having the respective values. In the second variant the
method of incrementing the respective bin is modified in such a way that not only the actual bin
is incremented, but all its direct neighbors are influenced, too. Each neighboring bin is
incremented according to its weight, which is calculated from the distance between the bin center and the color being analyzed. Since the CIE L*a*b* space is uniformly discretized, this
distance reflects the perceptual similarity of colors. This approach is similar to the one
presented in [13]; however, there are two main differences: instead of CIE L*u*v*, CIE
L*a*b* is used, and instead of calculating distances to all bins, only distances to
neighboring bins are considered. The first method of histogram calculation is relatively
fast, while the second one reflects human (more fuzzy) cognition more precisely; however, it requires almost two times longer calculations. Values of the unweighted histogram
can be stored in unsigned integer form, while for the weighted histogram they should be
saved in a floating-point representation.
The algorithm of histogram calculation works in an iterative manner: for each pixel
in the analyzed image the proper histogram bin (in the unweighted variant) or all neighboring bins (for the weighted histogram) are incremented. The number of the bin to be incremented is calculated according to the standard Euclidean distance between the pixel color
and each bin's center. The weight wi of neighboring bin i depends on its distance di to the analyzed color and the sum of the distances to all N neighboring bin centers:
$$
w_i = \frac{d_i}{\sum_{j=1}^{N} d_j} \, . \qquad (4)
$$
The centers of the first and the last interval of the a and b components are arbitrarily set
to -52.5 and 52.5, respectively [10].
The calculated histogram is normalized in such a way that all bin values are divided by the
total number of pixels in the thumbnail and then multiplied by 255, giving an unsigned
8-bit representation (for both the unweighted and the weighted variant). Both histograms for a
sample image are presented in Fig. 1.
Figure 1. Color histograms for a sample image
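Under the binning assumptions above, both histogram variants can be sketched as follows; the 3x3x3 neighborhood and the nearest-centre bin selection are simplifications of the original (Java) implementation.

```python
import numpy as np
from itertools import product

BIN_L = np.arange(8) * 12.5 + 6.25                                      # L centers
BIN_AB = np.array([-52.5, -37.5, -22.5, -7.5, 7.5, 22.5, 37.5, 52.5])   # a, b centers

def histograms(lab_pixels):
    """Return (unweighted, weighted) 8x8x8 histograms of CIE L*a*b* pixels.
    The weighted variant spreads each pixel over the 3x3x3 neighborhood of
    its bin, with weights given by equation (4)."""
    unweighted = np.zeros((8, 8, 8))
    weighted = np.zeros((8, 8, 8))
    centers = (BIN_L, BIN_AB, BIN_AB)
    for pix in np.asarray(lab_pixels, dtype=float):
        idx = tuple(int(np.argmin(np.abs(c - v))) for c, v in zip(centers, pix))
        unweighted[idx] += 1
        # collect the valid neighbors of the pixel's bin (including itself)
        neigh = [tuple(np.add(idx, off)) for off in product((-1, 0, 1), repeat=3)
                 if all(0 <= idx[k] + off[k] < 8 for k in range(3))]
        dists = np.array([np.linalg.norm([centers[k][n[k]] - pix[k] for k in range(3)])
                          for n in neigh])
        w = dists / dists.sum()                         # equation (4)
        for n, wi in zip(neigh, w):
            weighted[n] += wi
    return unweighted, weighted

h_direct, h_weighted = histograms([[50.0, 10.0, -20.0], [80.0, 0.0, 0.0]])
print(h_direct.sum(), round(h_weighted.sum(), 3))       # both sums equal 2.0
```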
It is possible to increase the histogram computation speed by taking into consideration
only a thumbnail instead of the whole image. In the case of JPEG files such a thumbnail is accessible
through the DC components of each block [14]. Since the JPEG algorithm divides the image into
8x8 pixel blocks, the thumbnail is 8 times smaller than the original image. In the case of
JPEG-2000 files, in order to analyze a thumbnail of the same size, the histogram is
calculated on the LL band of the third decomposition level. All previous iterations of the wavelet
transform have to be performed, however since they work on a reduced data volume this is
relatively fast. The respective data used to create the image thumbnail are presented in Fig. 2.
Figure 2. Single DCT block and DC component used to generate image thumbnail (left) and
DWT decomposition and third level low-low component (right)
A color histogram calculated over a thumbnail does not introduce any significant errors and the color information is preserved. Moreover, this operation is relatively fast,
since it employs a 64 times smaller number of pixels and does not require the inverse DCT to
be performed in the case of JPEG. In the case of JPEG-2000 the most time-consuming iterations (related to the inverse DWT) are also eliminated.
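For reference, a spatial-domain equivalent of this thumbnail can be sketched as per-block averaging: the DC coefficient of an 8x8 DCT block is proportional to the block mean, so working directly on the DC values (or, approximately, on the LL3 band for JPEG-2000) yields essentially the same data; the bitstream parsing itself is omitted here.

```python
import numpy as np

def thumbnail_from_blocks(channel, block=8):
    """Average each non-overlapping 8x8 block of one image channel.
    This approximates (up to scale) the thumbnail obtained from the DC
    coefficients of JPEG blocks or from the LL3 band of JPEG-2000."""
    h, w = channel.shape
    h8, w8 = h - h % block, w - w % block              # crop to a multiple of 8
    blocks = channel[:h8, :w8].reshape(h8 // block, block, w8 // block, block)
    return blocks.mean(axis=(1, 3))

channel = np.random.rand(64, 48)
print(thumbnail_from_blocks(channel).shape)            # -> (8, 6)
```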
2.2. Similarity Measure
The color-based similarity between a query image Q and a database image B is calculated using the L1
metric of the respective histograms H_Q and H_B:
$$
D^{(C)}_{Q,B} = \sum_{l=1}^{8} \sum_{a=1}^{8} \sum_{b=1}^{8} \left| H_Q(l,a,b) - H_B(l,a,b) \right| , \qquad (5)
$$
where l, a and b index the respective dimensions of the histograms.
For identical histograms (not necessarily identical images) the distance D^(C) is equal to zero, while the
maximal possible distance is equal to 255 (completely different histograms).
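A minimal sketch of the L1 comparison of equation (5), together with a simple top-5 ranking as used in the experiments below, could look as follows (function and variable names are ours, not from the original tool).

```python
import numpy as np

def l1_distance(hq, hb):
    """Equation (5): L1 distance between two 8x8x8 color histograms."""
    return np.abs(hq - hb).sum()

def retrieve(query_hist, database, top=5):
    """Rank database entries (name -> histogram) by increasing distance."""
    scored = sorted(database.items(), key=lambda kv: l1_distance(query_hist, kv[1]))
    return [name for name, _ in scored[:top]]

db = {f"img{i}": np.random.randint(0, 256, (8, 8, 8)).astype(float) for i in range(10)}
print(retrieve(db["img3"], db))        # 'img3' itself ranks first (distance 0)
```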
The developed algorithm was implemented as a working model in Java 2 SE and tested
on several publicly available benchmark datasets [15],[16] containing over 6000 images
of different motifs. The main window of the application 'ImSearch' is presented in
Fig. 3. This software was used to perform all the experiments described further in the
paper.
Figure 3. 'ImSearch' window
3. Experiments
The experiments performed using developed software were devoted to the following
problems:
 investigation of the identity of the descriptor for both JPEG and JPEG-2000 files;
 testing the robustness to compression artifacts;
 evaluating the classification ability of the descriptor.
The first group of tests involved images compressed by means of both algorithms.
The assumption here is that the same image stored in JPEG and JPEG-2000 files should
look identical to a human observer no matter which compression type is chosen (the
compression ratios should not be extremely high). Of course, the same should be true
for a computer program.
The set of images used in this experiment consists of 100 pictures randomly selected
from the database. All images were compressed by means of JPEG and JPEG-2000
creating two subsets. In order to eliminate the influence of compression artifacts, the compression quality was set to the highest level (lossless compression). The JPEG set was taken
as the reference one, while the JPEG-2000 images were taken as query samples. The experiment was conducted according to the following scenario. There are 100 single queries.
During a single step the program returns 5 images which are similar to the
query one. They are ordered in decreasing similarity order. The maximal position of
the correct image (in the range of 1 to 5) is memorized. A perfect response is
when it occupies exactly the first position. The worst case is when it is
in the fifth place. The results of the experiment are presented in Fig. 4 as an average over
100 queries. They confirm the identity of the color descriptors calculated both for JPEG
and for JPEG-2000 files. The differences between the weighted and unweighted variants of
the color histogram are very slight and do not favor either of them.
The goal of the following experiment was to investigate the robustness of the descriptor to variable compression quality. The assumption here is that images with different compression ratios (quality) are recognized as similar by a human observer.
The input data in this case were 50 randomly chosen JPEG images taken from the
main database. It was limited to JPEG files only, since the first experiment proved the
descriptor to be effectively identical for JPEG and JPEG-2000 (at least for high
compression quality). Each picture was stored in 5 compression variants – the quality
factor was set to 90%, 75%, 60%, 45%, and 30% – making 250 base images. Testing
lower compression qualities was not necessary, since they introduce high distortions and
are useless in practice. The query image was assumed to have the maximal possible quality.
The result of a single query is five images arranged in decreasing similarity order.
The expected perfect response is when, for a certain query image, its five copies (with
different compression) are retrieved. The results presented in Fig. 5 prove that the descriptor in the weighted histogram variant is independent of the compression quality.
The results for the unweighted variant are slightly worse.
[Chart: 'JPEG to JPEG-2000 relevance' – accuracy [%] (94–100) vs. maximal position of the correct answer (1–5), for the weighted and unweighted variants]
Figure 4. Results for query image stored in JPEG-2000 and base images stored in JPEG files
[Chart: 'Compression robustness' – accuracy [%] (94–100) vs. number of correct answers (5/5 … 1/5), for the weighted and unweighted variants]
Figure 5. Results of the retrieval for base images with variable (lower) compression quality
The last experiment is devoted to the evaluation of the classification ability of the discussed
method. It gives information about the practical usability of the proposed
algorithm. This test was performed on a limited number of images to satisfy the classification
conditions (not all objects in the database are represented in a sufficient way, i.e.
there is often a single image per class). There are 66 pictures stored in JPEG format divided
into 11 thematic classes (6 images per class). Each class contains different representations
of the same scene (pictures taken from various perspectives, depicting
various parts of it or different instances of the same object) – see Fig. 6.
Figure 6. Instances of eleven classes used in the third experiment: 'arch', 'asphalt', 'clouds',
'pavement', 'gravel', 'meadow', 'facade', 'flowers', 'street', 'tree', 'monument'
The experiment involved checking how many images returned by the computer program belong to the actual class. It was performed for both the weighted and unweighted
variants of the color histograms. The results presented in Fig. 7 show high accuracy, especially in the case of four images out of the five resulting ones.
[Chart: 'Classification ability' – accuracy [%] (0–100) vs. number of correct answers (5/5 … 1/5), for the weighted and unweighted variants]
Figure 7. Average results of the retrieval for 11 image classes
Summary
In the article some important aspects of CBIR were presented, employing color
information only, extracted directly from compressed files. The algorithm operates in both
DCT and DWT domains and produces a three-dimensional color histogram computed
in the CIE L*a*b* color space. The prototype software calculates a compact descriptor for every
image and stores it in a database-like structure. This makes it possible to search for similar
images in a large database using an example picture presented by the user. The main
advantage over existing methods is the lower computational cost of retrieval, since the
image analysis is performed directly on the compressed form of the image. The retrieval
accuracy is comparable to other well-known approaches, yet the computation time
can be radically decreased.
Acknowledgements
Software implementation and the results of experiments are a part of the MSc Thesis
of Mr Adam Bania, presented in 2007 at Szczecin University of Technology, Szczecin,
Poland.
References
[1] Flickner M., Sawhney H., Niblack W., Ashley J., Qian Huang, Dom B., Gorkani
M., Hafner J., Lee D., Petkovic D., Steele D., Yanker P. Query by image and video
content: the QBIC system. Computer. Vol. 28(9), Sep 1995
[2] Pentland A., Picard R. W., Sclaroff S. Photobook: Content-based manipulation of
image databases. SPIE Storage and Retrieval for Image and Video Databases II,
1994
[3] Smith J. R., Chang Shih-Fu. VisualSEEk: a Fully Automated Content-Based Image
Query System. ACM Multimedia, Boston, MA, November 1996.
[4] Forczmański P., Frejlichowski D. Strategies of shape and color fusions for content
based image retrieval, Computer Recognition Systems 2, Springer, Berlin, 2007
[5] Kukharev G., Mikłasz M.: Face Retrieval from Large Database, Polish Journal of
Environmental Studies, vol. 15, no. 4C, 2006
[6] Deng Y., Manjunath B. S., Kenney C., Moore M. S., Shin H.: An Efficient Color
Representation for Image Retrieval. IEEE Transactions on Image Processing, vol.
10, no.1, 2001
[7] Manjunath B. S., Ohm J.-R., Vasudevan V. V., Yamada A.: Color and Texture
Descriptors. IEEE Transactions on Circuits and Systems for Video Technology,vol.
11, no. 6, 2001
[8] Guocan Feng, Jianmin Jiang. JPEG compressed image retrieval via statistical features. Pattern Recognition, 36(4):977–985, April 2003
[9] Au K.M., Law N.F., Siu W.C. Unified feature analysis in JPEG and JPEG 2000-compressed domains. Pattern Recognition, 40(7):2049–2062, 2007
[10] Forczmański P., Bania A. Content-based Image Retrieval in Compressed Domain.
Polish Journal of Environmental Studies, 2008
[11] Fairchild M. D. Color Appearance Models. John Wiley and Sons Ltd., 2005
[12] Lab color space, http://en.wikipedia.org/wiki/Lab_color_space [online]
[13] Lu G., Phillips J. Using perceptually weighted histograms for colour-based image
retrieval. Fourth International Conference on Signal Processing, Beijing, 1998
[14] Schaefer G. Jpeg image retrieval by simple operators. Proceedings of the International Workshop on Content-Based Multimedia Indexing, Brescia, Italy, 2001
[15] Free Public Domain Photo Database: pdphoto.org [downloaded 08.05.2008]
[16] Free Images – Free Stock Photos: www.freeimages.co.uk [downloaded 09.05.2008]
An outline of the system for computer-assisted diagnosis
based on binary erythrocytes shapes
Dariusz Frejlichowski
Szczecin University of Technology,
Faculty of Computer Science and Information Technology
Abstract:
The paper presents some initial results of the work on the system for Computer-Assisted
Diagnosis (CAD) of diseases involving erythrocytes (e.g. anaemia). The approach is
based on binary shapes of red blood cells (RBCs), extracted from digital microscopic
images. This comes from the fact that some diseases have their origin in the deformation of
RBC shapes, which makes proper delivery of oxygen to body tissues impossible. As a
result, the blood circulation becomes disturbed. The approach under development was
examined and implemented in the form of a prototype in the Matlab environment.
Keywords:
pattern recognition, binary image, shape description, erythrocytes’ shape analysis
1. Introduction
The role of red blood cells is to deliver oxygen from the lungs to body tissues and carbon dioxide back. Sometimes their abnormal work can lead to a serious disease, e.g.
anaemia or malaria. Those diseases are caused by deformation of erythrocytes' shapes.
This is understandable, as deformed RBCs cannot deliver oxygen properly, which in turn disturbs
the blood circulation. Therefore, computer-assisted automatic diagnosis
of some selected diseases can be based on erythrocyte shapes. It can use digital microscopic images of human blood stained using the MGG (May-Grunwald-Giemsa) method
(a few examples are provided in Fig. 1) as an input.
In order to realize the automatic diagnosis the template matching was chosen. This
approach finds the database template which is the most similar to the object under processing. Using the object itself and comparing it with the base element is not possible
because of its differences in comparison with the template (caused by rotation, shifting,
resizing, noise, and so on). Therefore some features have to be used. The most popular
ones are: shape, colour, luminance, texture, and the context of the information. Amongst them
special attention is paid to shape, because in some applications it is the most
relevant and the least changeable feature that can be used. To avoid the mentioned problems, a shape has to be represented properly, using so-called shape descriptors. Some of
them are invariant to the deformations. In this paper three various methods of that kind
were applied for examination - UNL-F, PDH, Log-Pol-F. In all of them transformation
of boundary points from Cartesian to polar co-ordinates is performed. This transformation has several important advantages. Firstly, it gives shape description invariant to
translation and scaling (after normalisation). The rotation becomes circular shifting after
derivation, but all above methods solve this problem easily. Moreover, the representation of a boundary in polar co-ordinates is more convenient, if we are analysing the
overall character of a shape.
Figure 1. Some examples of the input data – microscopic images of human blood
In the discussed system for CAD the feature representation (using a shape descriptor) stage is crucial for the efficient work. However, it is not the only one. Therefore,
three additional steps, typical for recognition based on digital images have to be performed: pre-processing, extraction of particular shapes (both are preceding the feature
description) and matching with the database elements. Using results of the last step with
some rules based on medical knowledge it is possible to make a diagnosis.
The first mentioned stage (pre-processing) uses i.a. thresholding and median filtering. The next one - feature localisation – is based on tracing regions for every separate
object in image. The extracted object is represented using a shape descriptor and
matched with elements in the database. This is a short description of the method as a
whole. The more detailed information about its main parts is provided in the subsequent
sections.
2. Pre-processing
The input microscopic images are represented in greyscale. If not, they have to be
converted to such a representation first.
The first step of the pre-processing stage is thresholding, which gives a binary image as a result. Three different approaches to this problem were experimentally analysed:
rigid thresholding ([1]), thresholding based on the histogram ([2]) and the usage of fuzzy measures
([3]). The last approach gave significantly better results than the others and was
applied in the final scheme. In order to reduce the influence of noise, median filtering
of the binary image was utilised. An example of the achieved binary representation of
the microscopic image is provided in Fig. 2. As can be easily seen, the binary image
contains black pixels for the background and white ones for the cells.
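A minimal sketch of this stage is shown below (Python); since the fuzzy-measure thresholding of [3] is not detailed in the paper, Otsu's histogram-based threshold is used purely as a stand-in, followed by median filtering of the binary image. It is also assumed that the cells are darker than the background.

import numpy as np
from scipy.ndimage import median_filter

def binarize_blood_image(gray, median_size=3):
    """gray: 2D array of greyscale intensities in the range 0-255."""
    hist, _ = np.histogram(gray, bins=256, range=(0, 256))
    p = hist / hist.sum()
    levels = np.arange(256)
    best_t, best_var = 0, -1.0
    for t in range(1, 256):                      # Otsu: maximise between-class variance
        w0, w1 = p[:t].sum(), p[t:].sum()
        if w0 == 0 or w1 == 0:
            continue
        m0 = (levels[:t] * p[:t]).sum() / w0
        m1 = (levels[t:] * p[t:]).sum() / w1
        var_between = w0 * w1 * (m0 - m1) ** 2
        if var_between > best_var:
            best_t, best_var = t, var_between
    binary = gray < best_t                       # assumed: cells darker than background
    filtered = median_filter(binary.astype(np.uint8), size=median_size)
    return filtered.astype(bool)                 # white = cells, black = background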
Figure 2. Exemplary result of the thresholding process applied to an input image
Figure 3. An example of the co-ordinates extraction for a cell using contour tracing. Small
arrows in the middle part of the image denote the directions of tracing.
The second stage of the pre-processing phase is the localisation of cells. It is realised
through tracing the regions of every separate object in the image. Each time, the maximal
co-ordinates in each direction are stored. Only objects entirely placed in the image are considered;
others (placed on boundaries) are not. In Fig. 3 the strategy for tracing every cell's
contour is depicted.
In the developed CAD system the diagnosis is based only on erythrocytes (in fact, in
future it is very tempting to extend it to other blood particles, which can increase the
number of diseases diagnosed). Therefore, the next step of the pre-processing is the rejection of thrombocytes and leukocytes. It is realised through the rejection of regions
significantly bigger (for leukocytes) or smaller (thrombocytes) than the others (erythrocytes). That process gives us additional benefit - it rejects some occluded shapes, which
are very difficult to recognise. Unfortunately, it is still possible that some particles of
those kinds have similar area to RBC. In that case the rejection is based on histogram,
because thrombocytes and leukocytes have very dark (almost black) parts. Therefore,
histogram derived for them is different from the one achieved for erythrocytes. Obviously, in this case we have to go back for a moment to the greyscale representation.
Additionally, the image before the thresholding is enhanced through histogram equalisation. As it turned out, this process reduces the number of occluded shapes.
3. A single cell representation using shape description
As explained in the first section, three polar shape representations were chosen for the experiments. In this section, a brief description of their features is provided. Details can be found in the referenced literature.
The first approach, UNL-F ([4]), is composed of two transformations. Firstly, the
transformation of boundary points to polar co-ordinates is performed, but with some
differences in comparison to the traditional way of doing this transformation. The derived co-ordinates are put into a matrix, in which the row corresponds to the distance from
the centroid, and the column to the angle. The obtained matrix is 128 x 128 pixels in size. In
fact, that gives another binary image, therefore the Fourier transform can be applied to
it. From the spectrum only the most important part is cut out and represents the shape. The
resultant shape description is invariant to scaling (thanks to normalisation), translation
(calculation of co-ordinates with respect to a particular point within the object) and rotation
(Fourier transform). The representation is also robust to noise.
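The general idea can be sketched as follows (Python/NumPy, not the UNL-F implementation of [4]); the 128x128 polar matrix and the cropped Fourier spectrum follow the description above, while details such as the size of the kept spectrum block are illustrative.

import numpy as np

def unl_f_like_descriptor(contour, size=128, keep=16):
    """contour: (N, 2) array of boundary points (x, y)."""
    contour = np.asarray(contour, dtype=np.float64)
    centroid = contour.mean(axis=0)
    d = contour - centroid
    rho = np.hypot(d[:, 0], d[:, 1])
    theta = np.arctan2(d[:, 1], d[:, 0])                    # angle in (-pi, pi]
    rho = rho / rho.max() * (size - 1)                      # normalisation -> scale invariance
    col = ((theta + np.pi) / (2 * np.pi) * (size - 1)).astype(int)   # angle -> column
    row = rho.astype(int)                                   # distance -> row
    polar = np.zeros((size, size))
    polar[row, col] = 1.0                                   # binary polar image
    spectrum = np.abs(np.fft.fft2(polar))                   # rotation -> circular shift, removed by |FFT|
    return spectrum[:keep, :keep]                           # only the most important (low-frequency) part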
The Log-Pol-F ([5], [6]) was the second approach examined. It uses the centroid as
the origin of the transform, as in the UNL-F. The polar-logarithmic transform is used
here and, afterwards, the Fourier transform. Thanks to the similarity to the earlier
method, the Log-Pol-F has similar properties. There are only two differences. The invariance to scaling is achieved using the logarithmic transform, not normalisation. And the
usage of polar co-ordinates is simpler - only distances from the centroid are taken.
The Point Distance Histogram ([7]) was the last algorithm explored. The method can
use various ways of deriving the origin of the polar transform. Here, to keep the similarity to the earlier approaches, the centroid was utilised. After calculating this point, all
distances between points on the contour and the centroid are derived. Only the point
with the maximal distance for a particular angle is taken, one for every angle from 0 to 359
degrees. That gives three hundred and sixty values. The next stage is normalisation
according to the maximal distance. The last step is the derivation of the histogram of
those values. In the experiments the number of bins in the histogram was set to 50, which
is the optimal value for small shapes. The PDH is invariant to translation (thanks to the
derivation according to the centroid), rotation (thanks to the usage of the histogram) and
scaling (normalisation).
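A minimal sketch of the PDH computation described above may look as follows (Python/NumPy, illustrative only; angles for which no contour point exists are simply skipped here, and the bin count of 50 follows the text).

import numpy as np

def point_distance_histogram(contour, bins=50):
    """contour: (N, 2) array of boundary points; returns a PDH with `bins` bins."""
    contour = np.asarray(contour, dtype=np.float64)
    centroid = contour.mean(axis=0)
    d = contour - centroid
    rho = np.hypot(d[:, 0], d[:, 1])
    ang = np.degrees(np.arctan2(d[:, 1], d[:, 0])) % 360
    max_rho = np.zeros(360)                       # one value per integer angle 0..359
    for a, r in zip(ang.astype(int), rho):
        max_rho[a] = max(max_rho[a], r)           # maximal distance for that angle
    max_rho = max_rho / max_rho.max()             # normalisation by the maximal distance
    hist, _ = np.histogram(max_rho[max_rho > 0], bins=bins, range=(0.0, 1.0))
    return hist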
The comparison of shape descriptors was crucial in developing the method. Based
on the template matching approach, in each case descriptions were first calculated for the
base elements (12 various classes of erythrocytes, 5 instances in each) and stored. After
localisation and extraction of the object being identified, it was described using the particular method as well. The matching between it and all base elements was performed using the
C1 metric ([8]). The maximal value indicated the closest base element. Also, for each type
of RBC, the threshold for rejecting the resultant similarity measure was experimentally established. The above process was performed automatically, using digital microscopic
images (1000x magnification), with several dozens to hundreds of shapes within. As it
turned out, the best overall results were achieved using UNL-F (93% RR). Log-Pol-F
was slightly worse (91% RR), and PDH achieved a Recognition Rate equal to 84%. It
means that robustness to noise is more important (UNL-F, Log-Pol-F) than detailed
analysis of the boundary (PDH). In the developed system for diagnosis the UNL-F was
applied as the method for shape description.
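The matching stage can be outlined as below (Python, illustrative); the C1 metric of [8] is not reproduced here, so a simple normalised correlation stands in for it, and the per-class rejection thresholds are hypothetical placeholders.

import numpy as np

def similarity(desc_a, desc_b):
    """Stand-in similarity (higher = more similar); the paper uses the C1 metric of [8]."""
    a, b = desc_a.ravel(), desc_b.ravel()
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def classify_cell(descriptor, base, thresholds):
    """base: dict class_name -> list of stored descriptors; thresholds: dict class_name -> float."""
    best_class, best_sim = None, -np.inf
    for name, templates in base.items():
        for t in templates:
            s = similarity(descriptor, t)
            if s > best_sim:
                best_class, best_sim = name, s
    # reject when the best similarity does not exceed the experimentally set threshold
    if best_class is not None and best_sim < thresholds.get(best_class, 0.0):
        return None, best_sim
    return best_class, best_sim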
4. A prototype of the system for automatic diagnosis
The complete scheme of the proposed approach, composed using stages described in
the former sections, was implemented in Matlab. This environment was chosen for two
reasons. Above all, the implementation had to help in examining the methods used at
particular stages of the approach. Secondly, the initial automatic diagnosis could be
performed. At present, the prototype only had to inform whether the image contained some
affected, deformed cells. However, some initial decision rules were also utilised.
Thanks to this the first experiments on the automatic diagnosis were also carried out. In
future, the system will perform the diagnosis completely automatically.
In fig. 4 the window of the prototype in Matlab is provided. Five separate parts of
the window can be mentioned. The first one (see fig.4.a) is devoted to the selection of
the image for analysis. The second one (fig.4.b) contains the result of calculating the
particular types of red blood cells. It is important for diagnosis. The next one (fig.4.c)
presents the image under analysis. The suggested result of the analysis is provided in the
next subwindow (fig.4.d). The last element of the window is the button for exiting the
application (fig.4.e).
Some results (labelled erythrocytes) of the applied prototype of the system are presented in Fig. 5. For better visibility, only the subwindows with the analysed images are provided. The first image (the left one) presents the blood of a healthy man, with
normoerythrocytes and only one doubtful case. For the second image (in the middle) the
potential diagnosis was thalassemia and sideropenic anaemia, and in the last case (on the
right) hemolytic anaemia was suggested.
Figure 4. A window of the prototype in Matlab (particular parts of the window are described
in the text)
Figure 5. Few pictorial results of the suggested diagnosis, performed using explored approach
(rectangles enclose the affected RBCs)
5. Concluding remarks and future work
The developed prototype of the system was tested on over 50 microscopic images.
The suggested diagnosis was in each case the same as the one made by a human. It means
that future work on the presented approach is worth doing. First of all, it has to be tested
on a much bigger collection of images. Secondly, some steps of the pre-processing stage
can still be improved. And finally, the number of considered shapes can be larger. Not
only erythrocytes, but also leukocytes and thrombocytes can be recognised, in order to
increase the number of diseases under consideration.
Even though the initial results are very promising, completely automatic computer
diagnosis was not the goal of the work presented here. The purpose of the system under
development is rather close to the retrieval of microscopic images and pointing out
the doubtful cases in large databases. After this stage the diagnosis of those cases should be made
by professionals. It means that the system is primarily developed for supporting the analysis of microscopic images. However, in future it can be extended to a
completely automatic system.
References
[1] Kukharev G., Kuźmiński A. Biometric Techniques. Part 1. Face Recognition Methods, (in Polish), WI PS Press, Szczecin, 2003.
[2] Sanei S., Lee T.K.M. Cell Recognition based on PCA and Bayesian Classification.
In: 4th International Symposium on Independent Component Analysis and Blind
Signal Separation, Nara, 2003, pp. 239-243.
[3] Kim K.S., Kim P.K., Song J.J., Park Y.C. Analyzing Blood Cell Image to Distinguish its Abnormalities, In: 8th ACM International Conference on Multimedia, Los
Angeles 2000, pp. 395-397.
[4] Frejlichowski D. Contour Objects Recognition Based On UNL-Fourier Descriptors,
In: Enhanced Methods in Computer Security, Biometrics and Artificial Intelligence
Systems, 2005, pp. 203-208.
[5] Kuchariew G. Digital Images Processing and Analysis (in Polish), WI PS Press,
Szczecin, 1998.
[6] Luengo-Oroz M. A., Angulo J., Flandrin G., Klossa J. Mathematical Morphology in
Polar-Logarithmic Coordinates. Application to Erythrocyte Shape Analysis. In:
Marques, J. S., Pérez de la Blanca, N., Pina, P. (eds.) IbPRIA 2005. LNCS, vol.
3523, 2005, pp. 199-205.
[7] Frejlichowski D. Shape Representation Using Point Distance Histogram. Polish J.
of Environ. Stud. vol. 16, no. 4A, 2007, pp. 90-93.
[8] Lam K.M., Yan H. An Analytic-to-Holistic Approach for Face Recognition Based
on a Single Frontal View. IEEE Trans. on PAMI. vol. 20, no. 7, 1998, pp. 673-686.
Personalizacja przekazu interaktywnego z udziałem
metod analizy czynnikowej
Jarosław Jankowski
Politechnika Szczecińska, Wydział Informatyki
Abstract:
Growth in the complexity of internet applications, variability of data structures, use of
multimedia objects make it difficult to model information structures and design access
interfaces. The potential of an application is not always exploited and the obtained results
are not always maximized. In the article a procedure is presented for adapting objects used in the
interactive communication process. It assumes object decomposition and identification of
the system response functions for different selected collections of input variables, with
simultaneous minimization of the number of variants. The approach used gives the
possibility to verify various configurations and to minimize the search space of design
variants in web applications, where the combinatorial nature of the task makes it difficult to use
other methods.
Keywords:
Internet, online marketing, adaptive systems
1. Wstęp
Rozwój systemów informacyjnych i aplikacji internetowych następuje w wymiarze
technologicznym i społecznym, który coraz częściej stanowi podstawę funkcjonowania
wielu przedsięwzięć. Obecnie zarysował się nowy trend w realizacji systemów internetowych, którego konsekwencją jest między innymi przedefiniowanie potrzeb w zakresie
kategoryzacji i dostępu do danych [17]. Rozwój Internetu szerokopasmowego umoŜliwił wprowadzenie nowych obszarów zastosowań, zwiększyła się ilość treści multimedialnych oraz nastąpiła decentralizacja funkcji wydawniczych. W wielu aplikacjach
zorientowanych na środowiska interaktywne występują problemy z tworzeniem interfejsów dostępowych do danych oraz modelowaniem struktur dla zapewnienia uŜyteczności
i generowania określonych efektów. W artykule przedstawiono koncepcję dekompozycji obiektów interaktywnych na elementy składowe w ramach procedury poszukiwania
wartości ekstremalnych funkcji odpowiedzi, która identyfikuje charakterystyki systemu
i daje moŜliwość maksymalizacji oczekiwanych efektów w procesie adaptacyjnym.
2. Ewolucja struktur danych w aplikacjach internetowych
Sektor prywatny związany z technologiami internetowymi po okresie stagnacji na
początku dekady notuje od pewnego czasu ponowny wzrost, potwierdzony dynamiką
inwestycji. Raport Komisji Europejskiej na temat rozwoju nowych przedsiębiorstw
i innowacji w sektorze elektronicznym wskazuje na zwiększenie liczby firm z tej grupy
[9]. Nowo utworzone organizacje gospodarcze charakteryzuje wyŜszy poziom innowacyjności oraz fakt iŜ operują one w obszarach związanych z protokołami VoIP, identyfikacją radiową, Internetem i technologiami mobilnymi. Wraz z rozwojem serwisów
zorientowanych na społeczności internetowe wykorzystuje się nowe struktury aplikacji
internetowych oraz odmienne podejścia przy ich tworzeniu i eksploatacji, między innymi w postaci ontologii i sieci semantycznych, których kierunki rozwoju wskazują
J. Davies i R. Studer [6]. Ta faza ewolucji Internetu integruje sześć zasadniczych elementów: tworzenie treści przez uŜytkowników, zbiorowa inteligencja, repozytoria danych na niespotykaną dotąd skalę, sieci społeczne, efekt skali, otwartość [1]. Jak wskazują M. Madden i S. Fox aktualne stadium rozwoju jest wynikiem naturalnej ewolucji
dokonującej się w drodze weryfikacji róŜnych znanych wcześniej koncepcji [14]. Obszar ten stanowi przedmiot badań firm analitycznych, m.in. Gartner Group, koncentrujący się na akceptacji określonych technologii przez biznes w ramach cykli (ang. hype
cycle) [10]. Wyodrębnione etapy obejmują wprowadzanie koncepcji i technologii, oraz
oddziaływanie ich na zaangaŜowanie firm technologicznych i inwestorów. Następny
etap cyklu odnosi się do weryfikacji faktycznej wartości innowacji i zazwyczaj skutkuje
redukcją zainteresowania mediów (Rys 1.).
Zainteresowanie biznesu i mediów
Mashupy
Rozpoznawanie mowy w
technologiach mobilnych
Web 2.0
IPv6
Web 2.0 i elementy składowe
Folksonomie
Korporacyjne sieci sematyczne
Papier cyfrowy
Analizy sieci społecznych
RSS korporacyjne
Architektura sterowana modelem
Znaczniki RFID
Inteligencja zbiorowa
Architektura sterowana zdarzeniami
Teleobecność
VoIP
Systemy GRID
Wewnętrzne webserwisy
Tłumaczenia w czasie rzeczywistym
Offline Ajax
Ajax
RFID w obrocie hurtowym
Rzeczywistość rozszerzona
Biometryka w płatnosciach
Przewidywanie zachowań rynków
Smartfony
Wiki
Tablet PC
Obliczenia kwantowe
Blogi korporacyjne
Alikacje lokalizacyjne
Komunikatory w biznesie
Technologie lokalizacyjne
Płatności mobilne
Architektury Tera
Logiki DNA
Okres adaptacji technologii
- mniej niŜ dwa lata
- od 2 do 5 lat
- od 5 do 10 lat
- ponad 10 lat
Sieci sensoryczne
t
Powstanie technologii
Masowe zainteresowanie
Weryfikacja
WdroŜenia i standaryzacja
q
Rysunek 1. Cykl adaptacji nowych technologii (źródło: opracowanie własne na podstawie
Gartner's Emerging Technologies Hype Cycle, 2006)
Analizy te identyfikują umiejscowienie Web 2.0 i obszarów powiązanych (np.: folksonomie [18], analizy sieci społecznych, agregatory internetowe typu mashup) w okresie największego zainteresowania. Wynikiem ewolucji są zaawansowane przeobraŜenia,
oddziałujące na kształt środowiska internetowego oraz przyczyniające się do upowszechnienia narzędzi syndykacji, wykorzystania standardów RSS, sieci semantycznych, tworzenia aplikacji z wykorzystaniem protokołów asynchronicznych AJAX [15].
WdraŜanie warstwy technologicznej Web 2.0 umoŜliwia integrację narzędzi stosowanych w innych aplikacjach internetowych. Realizowana w ten sposób funkcja informa-
cyjna podnosi jakość usług i wpływa jednocześnie na kreowanie nowych struktur danych zorientowanych na dostęp do treści generowanych przez uŜytkowników (ang. user
generated content). W tym obszarze L. Rosenfeld i P. Morville identyfikują architekturę
informacji jako nową dziedzinę badań [19]. Zwiększaniu uŜyteczności systemów towarzyszy powstawanie podstaw do tworzenia systemów rekomendujących, które bazują na
wzorcach behawioralnych, wskazywanych między innymi w pracach B. Jian Hu i Z.
Hua-Jun z Microsoft Research [13]. Integrowane są metody budowy struktur interfejsów zorientowanych na eksplorację zawartości systemów handlu elektronicznego w
postaci systemów filtracji kolaboracyjnej, których metody zunifikowanej integracji
przedstawili J. Wang i P. Arjen [22]. Widoczne jest ich powiązanie z algorytmami eksploracji danych czy przypisywania do zbiorów danych, obiektów multimedialnych lub
zbiorów tekstowych kategorii lingwistycznych. Inne kierunki zbliŜają zbiory danych do
koncepcji sieci semantycznych i wymagają takiego opisu struktur danych, by umoŜliwić
ich elastyczne przetwarzanie. Determinuje to równieŜ powstawanie nowych obszarów
badawczych w obszarze identyfikacji struktur danych i dopasowania treści serwisów do
preferencji odbiorców i szerzej pojętej personalizacji.
3. ZałoŜenia personalizacji i parametryzacji obiektów interaktywnych
Dla poszczególnych zastosowań moŜna zidentyfikować charakterystyczne struktury
danych wykorzystywanych w komunikacji z uŜytkownikiem. Miary ich uŜyteczności,
szerzej omawiane przez A. Granica i V. Glavinica [12], umoŜliwiają identyfikację
obiektów, które stanowią warstwę pośredniczącą między uŜytkownikiem a częścią
aplikacyjną i podlegają modyfikacji oraz dopasowaniu w czasie. S. DeLoach
i E. Matson precyzują dla środowiska internetowego podejścia adaptacyjne i agentowe
stosowane w projektowaniu systemów informatycznych i interfejsów, gdzie za rozwiązanie idealne przyjmuje się implementację systemu, który dostosowuje się do zmieniających się warunków i potrzeb uŜytkowników [7]. W niniejszym artykule przedstawiono koncepcję dekompozycji obiektu interaktywnego i wyodrębnienie parametrów podzielonych na zbiory danych wejściowych, m.in. grupę parametrów ilościowych i jakościowych. Parametry ilościowe C1,C2,...,Cm reprezentują mierzalne cechy obiektu i jego
części składowych, do których moŜna zaliczyć parametry elementów graficznych. Cechy ilościowe reprezentowane są przez zmienne o charakterze ciągłym lub dyskretnym.
Zmienne dyskretne ujęte są w postaci zbiorów alternatywnych wartości
ZD={e1,e2,....,en}. Przy generowaniu wariantu przekazu interaktywnego następuje selekcja elementów z podanego zbioru, w oparciu o zadaną funkcję selekcji fs(x,z,n),
odpowiedzialną za wybór elementu x ze zbioru z dla wywołania przekazu n, która ma na
celu maksymalizowanie efektów (na przykład kampanii reklamowej) lub uŜyteczności
obiektu. W przypadku parametrów ciągłych dla zmniejszenia przestrzeni decyzyjnej
moŜna zastosować wybrane algorytmy dyskretyzacji, np. proponowane przez L. ChangHwana [5]. Procedura realizowana jest wieloetapowo, w oparciu o strukturę i zbiory
wejściowe w sposób zautomatyzowany generowany jest zbiór n obiektów o róŜnych
cechach. Stanowi on wyjściowy zbiór testujący, który podlega weryfikacji w środowisku interaktywnym dla wyodrębnionych grup odbiorców. Przy generowaniu wariantu
następuje selekcja elementów z podanego zbioru, w oparciu o zadaną funkcję selekcji
sel(e1,E,o), odpowiedzialną za wybór elementu x ze zbioru z dla obiektu interaktywne-
go o. W takim ujęciu obiekt interaktywny przyjmuje strukturę zbioru Oi = {e1, e2, …, en},
gdzie ei należy do zbioru Ei elementów składowych komponentu. Dla obiektu interaktywnego definiuje się zbiór czynników, które wpływają na uzyskane efekty: parametry
wejściowe xk, k = 1, 2, …, i, oraz parametry wyjściowe ym, m = 1, 2, …, j, które stanowią wyniki obserwacji uzyskane przy określonym doborze czynników wejściowych. Prowadzone badania mają na celu maksymalizację funkcji oceny efektów e = f(x1, x2, …, xk).
Procedura obliczeniowa realizowana jest w komunikacji ze środowiskiem operacyjnym,
z wykorzystaniem zasileń informacyjnych i układów pomiarowych. W przypadku środowiska interaktywnego podstawą do generowania kolejnych wariantów obiektów jest
uŜyteczność i analiza skuteczności poszczególnych wersji. Dla efektywnej realizacji
zadań obliczeniowych i generowania struktur istotny jest dobór metod wspomagania
decyzji i algorytmizacja procesu analizy danych.
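Poniżej przedstawiono poglądowy szkic opisanej dekompozycji i funkcji selekcji (Python, spoza artykułu); nazwy funkcji oraz przykładowe zbiory elementów składowych są wyłącznie ilustracyjne.

import itertools
import random

# ilustracyjne zbiory alternatywnych wartości ZD dla cech dyskretnych obiektu
ZD = {
    "kolor_tla": ["#FFFFFF", "#000000", "#3366CC"],
    "uklad":     ["poziomy", "pionowy"],
    "naglowek":  ["wariant_A", "wariant_B", "wariant_C"],
}

def sel(zbior, oceny):
    """Funkcja selekcji: wybiera element zbioru maksymalizujący dotychczasową ocenę efektów
    (np. skuteczność kampanii); przy braku danych wybiera element losowo."""
    if not oceny:
        return random.choice(zbior)
    return max(zbior, key=lambda e: oceny.get(e, 0.0))

def generuj_warianty(zd, n):
    """Generuje n wariantów obiektu interaktywnego jako kombinacje elementów składowych."""
    wszystkie = list(itertools.product(*zd.values()))
    random.shuffle(wszystkie)
    return [dict(zip(zd.keys(), w)) for w in wszystkie[:n]]

# przykład: wyjściowy zbiór testujący 5 wariantów do weryfikacji w środowisku interaktywnym
warianty = generuj_warianty(ZD, 5)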
4. Podstawy metodyczne systemu i etapy procedury optymalizacyjnej
Realizacja układu adaptacyjnego generowania obiektów spersonalizowanych interaktywnych wymaga doboru metod analitycznego przetwarzania danych i pozyskiwania
rozwiązań decyzyjnych. Podstawowym problemem przedstawionej koncepcji jest duŜa
przestrzeń decyzyjna i kombinatoryczna struktura problemów obliczeniowych. Za
główne zadania na tym etapie badań przyjęto poszukiwanie wariantów optymalnych
przy ograniczeniu ilości kroków wymaganych dla uzyskania ekstremum (lub wartości
zbliŜonych do ekstremum) funkcji odpowiedzi. W kompletnym eksperymencie dla
weryfikacji wszystkich elementów konieczne jest opracowanie planu badań, który
obejmuje 2k wariantów obiektu. Dla redukcji wymiarowości problemu proponuje się
wykorzystanie metody generowania struktury obiektów dla poszczególnych etapów w
oparciu o modele poliselekcyjne, stosowane w analizie czynnikowej. ZałoŜeniem tych
metod jest generowanie planów doświadczeń i ograniczenie przestrzeni poszukiwań.
Literatura podaje wiele modeli generowania tzw. planów frakcyjnych (z wykorzystaniem stopniowego przyrostu wartości poszczególnych parametrów) uznanych za podstawę tego nurtu badawczego publikowanych między innymi w pracach C. Bayena
i I. Rubina [2], D. Montgomeryego [21] oraz S. Deminga i S. Morgana [8]. Rozwój
metod w tym obszarze zapoczątkowała metodologia analizy odpowiedzi RSM (ang.
response surface methodology) zaproponowana przez Boxa i Wilsona [4], która wspomaga planowanie eksperymentu w celu wyznaczenia ekstremum funkcji wielu
zmiennych:
x = f(u1, u2, …, un)   (1)
W metodzie tej zakłada się, że funkcja f(u) jest nieznana, ciągła i ma jedno ekstremum. Zmienne uk (k = 1, 2, …, s) mogą przyjmować wartości na dwóch głównych poziomach us0 + ∆us, us0 − ∆us. W metodzie założono, że nieznaną charakterystykę można
aproksymować w otoczeniu danego punktu u1,0, u2,0, …, us,0 z użyciem hiperpłaszczyzny
przedstawionej za pomocą równania [2]:
\hat{x} = b_0 + b_1 u_1 + \dots + b_s u_s + b_{11} u_1^2 + \dots + b_{ss} u_s^2 + b_{12} u_1 u_2 + \dots + b_{s-1,s} u_{s-1} u_s   (2)
gdzie b0, b1, …, bs, b11, …, bss, b12, …, bs−1,s są współczynnikami wymagającymi określenia.
W przypadku istnienia dużej liczby zmiennych poszukuje się planów realizacji doświadczeń, które umożliwiają zbadanie głównych czynników determinujących zachowanie funkcji. W takich sytuacjach stosuje się plany Placketta i Burmana, definiowane
jako nasycone z uwagi na wykorzystanie wszystkich elementów do identyfikacji efektów głównych bez pozostawionych stopni swobody [11].
Przykładowe postępowanie w oparciu o plany frakcyjne zilustrowano dla obiektu
z 6 parametrami o wartościach ciągłych, reprezentujących składowe zmiennych odpowiedzialnych za wizualizację i dobór skali barw elementu interaktywnego. Dla poszczególnych parametrów określono dopuszczalne zakresy wartości z zakresu RGB
<0,255>. W pierwszej fazie wykorzystano metodę selekcji planu początkowego eksperymentu Placketta-Burmana i wygenerowano układ eksperymentu. Określono początkowy punkt centralny dla wartości c1,1 = 127.5, c1,2 = 63.5, c1,3 = 31.5, c2,1 = 63.5, c2,2 = 63.5,
c2,3 = 31.5.
Tabela 1. Rozkład przyrostowy parametrów (Źródło: obliczenia własne)

Id   C   c11  c12  c13  c21  c22  c23
 1   1    1    1    1    1    1    1
 2   1    1   -1    1   -1   -1    1
 3   1    0    0    0    0    0    0
 4   1   -1    1    1    1   -1    1
 5   1    0    0    0    0    0    0
 6   1    1    1   -1   -1   -1    1
 7   1    1   -1   -1    1    1    1
 8   1    0    0    0    0    0    0
 9   1   -1    1   -1   -1    1    1
10   1    0    0    0    0    0    0
11   1   -1   -1    1   -1    1    1
12   1   -1    1    1   -1   -1   -1
13   1   -1   -1   -1    1   -1    1
14   1   -1   -1   -1   -1   -1   -1
15   1    1   -1    1    1   -1   -1
16   1   -1    1   -1    1    1   -1
17   1    1   -1   -1   -1    1   -1
18   1    1    1    1   -1    1   -1
19   1    0    0    0    0    0    0
20   1   -1   -1    1    1    1   -1
21   1    1    1   -1    1   -1   -1

Tabela 2. Plan początkowy w otoczeniu punktu c0 (Źródło: obliczenia własne)

Id   c1,1  c1,2  c1,3  c2,1  c2,2  c2,3   R
 1   255   255   255   255   255   255    0
 2   255     0   255   100     0   255    4
 3   128   128   128   178   128   128    2
 4     0   255   255   255     0   255    6
 5   128   128   128   178   128   128    3
 6   255   255     0   100     0   255    4
 7   255     0     0   255   255   255    6
 8   128   128   128   178   128   128    2
 9     0   255     0   100   255   255    3
10   128   128   128   178   128   128    2
11     0     0   255   100   255   255    4
12     0   255   255   100     0     0    4
13     0     0     0   255     0   255    6
14     0     0     0   100     0     0    4
15   255     0   255   255     0     0    1
16     0   255     0   255   255     0    6
17   255     0     0   100   255     0    4
18   255   255   255   100   255     0    6
19   128   128   128   178   128   128    2
20     0     0   255   255   255     0    6
21   255   255     0   255     0     0    6
W Tabeli 1 przedstawiono poszczególne etapy z uwzględnieniem przyrostu +∆u,
stabilizacji lub jednostkowej dekrementacji wartości zmiennych -∆u. Macierz przyrostowa dla każdej zmiennej wyznacza kierunek zmian w poszczególnych krokach eksperymentu. Na tej podstawie w otoczeniu punktu centralnego generowany jest plan początkowy eksperymentu (Tabela 2). Plan eksperymentu przedstawia hiperpłaszczyzna na
Rys. 2. Dla poszczególnych wartości parametrów wejściowych dokonano oceny struktury obiektu, a dla poszczególnych wariantów planu uzyskano rangi. Na Rys. 3 przedstawiono wpływ zmian wartości parametru c1,1 na odpowiedź układu przy wartościach
c1,3 = 0, c2,1 = 178.
Rysunek 2. Plan początkowy dla etapów eksperymentu (źródło: obliczenia własne)
Rysunek 3. Rozkład odpowiedzi dla zmian c1,1 przy c1,3 = 0, c2,1 = 178 (źródło: obliczenia własne)
Analiza obiektu odbywa się w procesie iteracyjnym, w którym wykorzystuje się dane z układów pomiarowych oraz początkowe wartości parametrów z układów wejścia.
Ekstremum funkcji wyznacza się interakcyjnie w kilku etapach. W pierwszym realizuje
się niewielką liczbę doświadczeń, które umożliwiają znalezienie zorientowanego lokalnie opisu matematycznego, wyznaczającego powierzchnię daną równaniem:
\hat{x} = b_0 + b_1 u_1 + b_2 u_2 + \dots + b_s u_s   (3)
Współczynniki b1, …, bs określają kierunek gradientu, który umożliwia realizację poszukiwań punktu ekstremalnego. Metoda poszukiwania kierunku najszybszego wzrostu
funkcji (ang. method of steepest ascent) zakłada przemieszczanie kierunku eksperymentu
zgodnie ze ścieżką wzrostu wartości funkcji odpowiedzi obiektu. Po pierwszym etapie
przetwarzania współczynniki regresji modelu są wykorzystywane do identyfikacji
ścieżki poszukiwań. Na Rys. 4 przedstawiono etapy ewaluacji wariantów planu.
Rysunek 4. Etapy ewaluacji planu (źródło: obliczenia własne)
Rysunek 5. Rozkład odpowiedzi układu (źródło: obliczenia własne)
Dalsze poszukiwania kolejnych rozwiązań koncentrują się wokół punktów z wartościami największymi. Zmiany wartości xj podczas przemieszczania w kierunku wzrostu
funkcji są proporcjonalne do istotności współczynnika regresji \hat{\beta}_j. W pracy [3] przedstawiono model w postaci:

\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x_1 + \hat{\beta}_2 x_2 + \dots + \hat{\beta}_k x_k   (4)
Ścieżka najszybszego wzrostu funkcji przebiega od punktu środkowego projektu
w kierunku największego wzrostu wartości funkcji z ograniczeniem sferycznym
\sum_{j=1}^{k} x_j^2 = r^2. Procedura maksymalizacji funkcji wykorzystuje mnożnik Lagrange'a:

Q(x_1, x_2, \dots, x_k) = \hat{\beta}_0 + \sum_{j=1}^{k} \hat{\beta}_j x_j - \lambda \left( \sum_{j=1}^{k} x_j^2 - r^2 \right)   (5)
gdzie dla mnożnika λ określa się wartości skokowe zmiany parametrów ∆ i wyznacza
się wartości pozostałych zmiennych. Szczegółowe metody wyznaczania rozwiązań
omówiono w pracach R. H. Myersa i D. C. Montgomery'ego [21] oraz A. I. Khuriego i J. A.
Cornella [20]. W kolejnej iteracji otrzymano odpowiedź układu przedstawioną na Rys. 5,
gdzie dla poszczególnych kombinacji parametrów uzyskano wzrost wartości funkcji
oceny. Na tej podstawie można wyznaczyć parametry obiektu dla kolejnego etapu pomiarów. Liczba iteracji uzależniona jest od kosztów realizacji badań testujących i szybkości przyrostu funkcji oceny w stosunku do oczekiwań decydenta. Uzyskane efekty
wskazują na możliwość szerszego zastosowania metod planowania struktur obiektów
interaktywnych i dokonania ich weryfikacji z udziałem planów frakcyjnych. W zależności od wyznaczonego celu możliwa jest identyfikacja funkcji odpowiedzi systemu
oraz poszukiwanie kombinacji parametrów wejściowych w celu jej maksymalizacji.
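Poniżej zamieszczono poglądowy szkic (Python/NumPy, spoza artykułu) wyznaczania ścieżki najszybszego wzrostu na podstawie dopasowanego modelu pierwszego rzędu; dane wejściowe, wielkość kroku i promień ograniczenia są umowne.

import numpy as np

def wspolczynniki_modelu(X, y):
    """Dopasowanie modelu pierwszego rzędu y = b0 + b1*x1 + ... + bk*xk metodą najmniejszych kwadratów.
    X: macierz planu (n x k) w jednostkach kodowanych, y: zaobserwowane odpowiedzi (rangi/oceny)."""
    A = np.column_stack([np.ones(len(X)), X])
    b, *_ = np.linalg.lstsq(A, y, rcond=None)
    return b[0], b[1:]                      # b0 oraz wektor [b1, ..., bk]

def sciezka_najszybszego_wzrostu(beta, r_max=3.0, krok=0.5):
    """Punkty na ścieżce najszybszego wzrostu: przemieszczenie proporcjonalne do współczynników
    regresji, z ograniczeniem sferycznym sum(x_j^2) = r^2 (por. równania (4)-(5))."""
    kierunek = beta / np.linalg.norm(beta)
    promienie = np.arange(krok, r_max + 1e-9, krok)
    return [r * kierunek for r in promienie]

# przykład użycia na planie z Tabeli 1 (kolumny c11..c23) i uzyskanych rangach R:
# b0, beta = wspolczynniki_modelu(X_plan, R)
# punkty = sciezka_najszybszego_wzrostu(beta)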
5. Podsumowanie
Wzrost złoŜoności aplikacji internetowych, zmienność struktur danych, wykorzystanie obiektów multimedialnych utrudnia modelowanie struktur informacyjnych
i personalizacji przekazu. Nie zawsze wykorzystuje się potencjał aplikacji i maksymalizuje uzyskane rezultaty. Przedstawiona procedura identyfikacji obiektu interaktywnego
i doboru parametrów moŜe stanowić rozszerzenie istniejących systemów adaptacyjnych
w Internecie. Wykorzystane podejście daje moŜliwość weryfikowania róŜnych układów
i minimalizację przestrzeni poszukiwań wariantu projektowego w aplikacjach internetowych, gdzie kombinatoryczna natura zadania utrudnia stosowanie innych metod.
W wielu przypadkach przydatne jest odwzorowanie systemu, a matematyczny model
analityczny wyznaczany jest na podstawie jego zachowań. Istotne jest ograniczenie
udziału decydenta w poszukiwaniu rozwiązań i automatyzacja procedur obliczeniowych. Kolejne etapy badań zakładają konstrukcję środowiska symulacyjnego i integrację z systemami internetowymi dla potrzeb dalszych eksperymentów.
Bibliografia
[1] Anderson P. What is Web 2.0? Ideas, technologies and implications for education,
JISC, Department of Information Science, Loughborough University 2006
[2] Bayne C. K., Rubin I. B. Practical experimental designs and optimization methods
for chemists, VCH Publisher, 1986
[3] Borror C., Kowalski S., Montgomery D. C. A Modified Path of Steepest Ascent for
Split-Plot Experiments, Journal of Quality Technology, Vol. 37, No. 1, 2005
[4] Box G. E. P., Wilson K. B. On the experimental attainment of optimum conditions,
Journal of the Royal Statistical Society, 1951
[5] Chang-Hwan L. Hellinger-based discretization method for numeric attributes in
classification learning, Knowledge-Based Systems archive, Volume 20, Issue 4,
Elsevier Science Publishers, 2007
[6] Davies J., Studer R., Warren P. Semantic Web Technologies: Trends and Research
in Ontology-based Systems, Wiley Publishing, 2006
[7] DeLoach S. A., Matson E. An Organizational Model for Designing Adaptive Multiagent Systems, The AAAI-04 Workshop on Agent Organizations: Theory and
Practice, 2004
[8] Deming S. N., Morgan, S. L. Experimental design: A chemometric approach,
Amsterdam, The Netherlands: Elsevier Science Publishers B.V., 1993
[9] European Commission. The role of new companies in e-business innovation and
diffusion, e-Business W@tch, European Commission. DG Enterprise & Industry,
Special Impact Study No. 2, 2006
[10] Gartner Group. Gartner's 2006 Emerging Technologies Hype Cycle Highlights Key
Technology Themes, 2006
[11] Giesbrecht F., Gumpertz P. Planning, Construction, and Statistical Analysis of
Comparative Experiments, Wiley Series in Probability and Statistics, 2005
[12] Granic A., Glavinic V. Functionality specification for adaptive user interfaces
Electrotechnical Conference, Melecon, 2000
[13] Jian Hu B., Hua-Jun Z., Hua L., Cheng N., Zheng C. Demographic prediction
based on user's browsing behavior, Proceedings of the 16th international conference on World Wide Web table of contents, Microsoft Research, Canada, 2007
[14] Madden M., Fox S. Riding the Waves of Web 2.0, Pew Internet Project, 2006
[15] Mahemoff M. Ajax. Wzorce projektowe, Helion, 2007
[16] Montgomery D. C. Design and analysis of experiments (3rd ed.). New York: Wiley, 1991
[17] O'Reilly T. What Is Web 2.0 Design Patterns and Business Models for the Next
Generation of Software, 2005
[18] Sturtz D. Communal Categorization, The Folksonomy.INFO622, Content Representation, 2004
[19] Rosenfeld L., Morville P. Information Architecture for the World Wide Web: Designing Large-Scale Web Sites, O'Reilly Media, 2002
[20] Khuri A. I., Cornell J. A. Response surfaces: Design and analysis, Dekker, New York, 1996
[21] Montgomery D. C., Myers R. H. Response Surface Methodology, Wiley, New York, 1995
[22] Wang J., Arjen P., Marcel J. T. Unifying User-based and Item-based Collaborative
Filtering Approaches by Similarity Fusion, SIGIR, 2006
Metoda wyznaczania kontekstu dla aplikacji
sterowanych zdarzeniami*
Henryk Krawczyk, Sławomir Nasiadka
Politechnika Gdańska, Wydział Elektroniki Telekomunikacji i Informatyki
Abstract:
A method of extending software applications by introducing elements of environmental
context information is presented in the paper. The definition of context is based on the
general context ontology. The information coming from the environment is transformed
into certain execution conditions. As an algorithm of the transformation, a Petri net was
used. It enables the application to properly react to certain events occurring in the
environment. This is how “pervasive computing” applications are constructed.
Keywords:
pervasive computing, context, events, ontology, Petri net, context analysis algorithms
1. Wprowadzenie
Popularność urządzeń przenośnych, takich jak telefony komórkowe, PDA, karty
RFID oraz inne obiekty zawierające róŜnego rodzaju czujniki, spowodowała znaczący
wzrost popularności aplikacji typu „pervasive”. Termin ten po raz pierwszy zastosował
Mark Weiser w 1991 [1] roku a oznacza on niezauwaŜalne dla uŜytkownika końcowego
wkomponowanie urządzeń fizycznych w jego codzienne Ŝycie. Ich zasadniczą własnością powinno być ukrywanie swojego istnienia i funkcjonowania, tak aby osoba z nich
korzystająca nie musiała skupiać się nad ich technicznymi aspektami pracy, ale mogła
skoncentrować swoją uwagę na rozwiązaniu właściwego problemu.
Jedną z realizacji koncepcji aplikacji typu „pervasive” są aplikacje kontekstowe.
Istota ich działania polega na dynamicznym dopasowywaniu się do warunków pracy
bez konieczności podejmowania przez uŜytkownika dodatkowych działań. Jedną
z pierwszych, realizujących to podejście inicjatyw, był system „Active Badge Location
System” [2] zbudowany przez Wanta i Hoopera w 1992 roku. Wykorzystywał on promieniowanie podczerwone do lokalizowania uŜytkownika i kierował przychodzące na
jego numer rozmowy telefoniczne do aparatu, który w danej chwili był najbliŜej danej
osoby. Badając pewien kontekst – w tym wypadku połoŜenie – system potrafił na bieŜąco dostosowywać swoje działanie do zmieniających się warunków pracy, nie angaŜując przy tym uŜytkownika.
Kluczowym pojęciem staje się zatem kontekst jako definicja otoczenia środowiska
aplikacji, gdyŜ stanowi on pryzmat poprzez który postrzegają one otaczający świat
i przetwarzają pochodzące z niego sygnały. Pierwsze opracowania definiowały kontekst
*
Wykonano w ramach grantu ministerialnego nr MNiSW N519 022 32/2949
w sposób statyczny – jako listę moŜliwych elementów, na które system powinien zwracać uwagę. W zaleŜności od zmian zachodzących w ramach tych elementów, powinny
następować zmiany w zachowaniu programu, dzięki czemu moŜna uzyskać adaptowalność funkcjonowania aplikacji. Źródłem informacji o zmianach otaczających aplikację
obiektów mogą być nie tylko informacje bezpośrednio wprowadzane przez uŜytkownika, lecz przede wszystkim róŜnego rodzaju czujniki badające fizyczne właściwości
zewnętrznego świata. Kontekst moŜna takŜe definiować przy uŜyciu synonimów, jednakŜe takie definicje są często zbyt ogólne i przez to trudno realizowane w postaci algorytmów. Przyjęcie optymalnego modelu kontekstu jest zatem niezwykle trudne
i w sposób bezpośredni rzutuje na przyszłe moŜliwości aplikacji. Musi być on z jednej
strony wystarczająco szczegółowo określony, aby mógł zostać zrealizowany w świecie
komputerów a z drugiej na tyle ogólny, aby nie ograniczać się do pewnej z góry określonej liczby stanów jakie moŜe napotkać system podczas funkcjonowania.
Mając właściwie zdefiniowany kontekst naleŜy zaimplementować metodę jego wyznaczania – czyli wyodrębnienia z dostarczanych do systemu sygnałów (zdarzeń) tych,
które są dla niego uŜyteczne i stanowią jego rzeczywisty kontekst działania. Metoda ta
moŜe być statyczna – na stałe zaszyta w programie i korzystająca z zamkniętej grupy
pojęć określających otoczenie aplikacji, bądź teŜ dynamiczna, która jest trudniejsza
w implementacji jednakŜe duŜo bardziej elastyczna, pod kątem dostosowania systemu
do nowego środowiska pracy. Wybór pomiędzy tymi dwiema opcjami jest najczęściej
podyktowany przyjętym modelem definicji kontekstu. Artykuł ten omawia jedną z koncepcji dynamicznego wyznaczania kontekstu opartą na wykorzystaniu mechanizmu
sieci Petriego.
2. Charakterystyka aplikacji kontekstowych
Ostatnie 20 lat stanowi ciągły rozwój aplikacji kontekstowych dzięki coraz większej
dostępności róŜnego rodzaju czujników, funkcjonujących w rzeczywistych środowiskach, z których mogą czerpać dane dotyczące otaczającego je świata rzeczywistego.
Sensory te nie muszą być juŜ dedykowane, lecz znajdują się w coraz tańszych przedmiotach codziennego uŜytku, które jeszcze dodatkowo ciągle podlegają minimalizacji.
Istnieje wiele podejść architektonicznych dla modelowania aplikacji kontekstowych.
ZaleŜą one głównie od dostępności sensorów, liczby uŜytkowników systemu czy teŜ
innych zasobów. Biorąc po uwagę te elementy moŜna wyodrębnić trzy główne kierunki
rozwoju [3]:
 bezpośredni dostęp do czujników – model najczęściej spotykany w sytuacjach gdy
czujniki wbudowane są w oprogramowane urządzenia. System moŜe wtedy bezpośrednio dokonywać odczytów i od razu interpretować uzyskane dane bez konieczności istnienia specjalnej warstwy pośredniczącej. PoniewaŜ jest to ścisłe powiązanie
oprogramowania i warstwy fizycznej podejście takie znacznie utrudnia skalowalność i przyszłą rozbudowę aplikacji oraz wykorzystanie w środowisku rozproszonym;
 dostęp do czujników przy wykorzystaniu warstwy pośredniczącej – podejście to
opiera się na koncepcji rozdzielenia fizycznego dostępu do sensorów od danych
przez nie dostarczanych. Specjalna warstwa pośrednicząca ukrywa szczegóły techniczne wymagane do bezpośredniej komunikacji z urządzeniem końcowym i prze-
kazuje jedynie wartości przez nie zwracane. Dzięki takiemu modelowi znacznie łatwiej rozwijać aplikacje;
 architektura oparta na serwerze kontekstu – aby umoŜliwić korzystanie z danych
dostarczanych przez czujniki przez wielu klientów w środowisku rozproszonym naleŜy wprowadzić dodatkowy komponent, który będzie zarządzał wieloma jednoczesnymi połączeniami i udostępniał zgromadzone przez czujniki dane. Dodatkową
zaletą tego rozwiązania jest odciąŜenie urządzeń końcowych (które często mają
niewielką moc obliczeniową) od skomplikowanego przetwarzania danych na których muszą pracować i przeniesienia tych operacji na serwer kontekstu.
Najbardziej ogólny model warstwowy aplikacji kontekstowej został przedstawiony
za [4,5] na rysunku 1.
Rysunek 1. Warstwowy model aplikacji kontekstowej
Warstwa sensorów bezpośrednio odpowiada za zbieranie danych ze świata rzeczywistego (np. z czujników badających właściwości fizyczne, z innych aplikacji poprzez
RMI czy teŜ usługi sieciowe, itp.) i przekazywanie ich do warstwy przetwarzania danych surowych. Odpowiada ona za przetwarzanie surowych danych przy pomocy sterowników (w przypadku źródeł danych mierzących fizyczne własności – czujników)
bądź odpowiedniego API (w przypadku komunikacji z innych systemami). Odpowiednio zinterpretowane i zorganizowane (w zaleŜności od konstrukcji aplikacji – mogą być
na przykład w postaci XML) dane są następnie przedstawiane przy pomocy ujednoliconego interfejsu do warstwy wstępnego przetwarzania, której zadaniem jest agregowanie, formatowanie i takie przekształcenie informacji, aby były one uŜyteczne dla wyŜej
połoŜonych w hierarchii abstrakcji elementów systemu (np. konwersja temperatury na
binarną wartość mówiącą czy pojawił się ogień). Warstwa prezentacji odpowiada za
bezpośrednie dostarczanie przetworzonych danych dla klientów, którzy mogą prowadzić komunikację w sposób synchroniczny bądź asynchroniczny. Właściwa interpretacja informacji i odpowiednia do nich zmiana zachowania realizowana jest w ramach
ostatniej warstwy. Model ten jest abstrakcyjny i nie wszystkie istniejące systemy realizują wszystkie warstwy, jednakŜe w większości moŜna wyodrębnić opisane powyŜej
funkcje.
Łącząc architekturę aplikacji kontekstowej opartą na serwerze kontekstu oraz przedstawiony powyżej warstwowy model uzyskuje się postać systemu zaprezentowaną na
rysunku 2.
Rysunek 2. Model programowania Publish/Subscribe dla aplikacji kontekstowych
Klienci wysyłają do serwera żądania powiadomienia w przypadku zaistnienia interesującego je kontekstu. W tym celu muszą przekazać informację o tym, jakie dane ich
interesują i jakie muszą zaistnieć między tymi danymi zależności, aby mogły je uznać
za użyteczne. Klienci zatem reprezentują najwyższą warstwę modelu przedstawionego
na rysunku 1. Serwer odpowiada za realizację czterech dolnych warstw modelu. Zbiera on
dane z sensorów, wstępnie je interpretuje i przetwarza oraz, mając wyspecyfikowane
przez klientów interesujące ich konteksty, przeprowadza pod kątem tych kontekstów
analizę przychodzących do niego danych. W momencie stwierdzenia zgodności zestawu
informacji, jakie przyszły w postaci zdarzeń, z oczekiwaniami klienta, serwer powiadamia go o tym fakcie.
Kluczowe dla działania powyższego mechanizmu jest przyjęcie wspólnej zarówno
dla serwera, jak i dla klienta reprezentacji kontekstu, tak aby przesyłane dane mogły być
tak samo interpretowane.
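Opisany mechanizm rejestracji subskrypcji i powiadamiania klientów ilustruje poniższy szkic (Python, spoza artykułu); struktura danych opisująca kontekst oraz nazwy klas są wyłącznie przykładowe.

class SerwerKontekstu:
    """Uproszczony serwer kontekstu w modelu Publish/Subscribe."""

    def __init__(self):
        self.subskrypcje = []   # pary (warunek, funkcja_powiadomienia)

    def subskrybuj(self, warunek, powiadom):
        """Klient rejestruje interesujący go kontekst jako funkcję-warunek na zbiorze zdarzeń."""
        self.subskrypcje.append((warunek, powiadom))

    def publikuj(self, zdarzenia):
        """Dane z sensorów (po wstępnym przetworzeniu) porównywane są z oczekiwaniami klientów."""
        for warunek, powiadom in self.subskrypcje:
            if warunek(zdarzenia):
                powiadom(zdarzenia)

# przykład: klient chce być powiadomiony, gdy temperatura w pokoju 104 przekroczy 50 st. C
serwer = SerwerKontekstu()
serwer.subskrybuj(
    lambda z: z.get("miejsce") == "pokoj_104" and z.get("temperatura", 0) > 50,
    lambda z: print("Powiadomienie klienta:", z),
)
serwer.publikuj({"miejsce": "pokoj_104", "temperatura": 62})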
3. Reprezentacja kontekstu
Istnieje kilka sposobów reprezentacji kontekstu w aplikacjach kontekstowych. W zależności od warunków pracy systemu i rodzaju przetwarzanych przez nie danych można
wyróżnić dwa zasadnicze: jednoznaczne oraz synonimiczne. Pierwszy zakłada sztywne
określenie, iż kontekst zawiera pewien niezmienny zbiór elementów, które jednoznacznie opisują interesujące z punktu widzenia oprogramowania aspekty. Przykładami mogą
być [6], gdzie autorzy opisują kontekst jako miejsce, otaczające obiekty (w tym ludzie)
oraz zmiany zachodzące w stanie tych obiektów oraz [7], w którym Dey wymienia stan
emocjonalny uŜytkownika, cel uwagi, miejsce oraz czas, obiekty i ludzie. Podejście
oparte na synonimach zaprezentowane zostało np. w [8] gdzie kontekst oznacza aspekty
trwającej sytuacji. Tego rodzaju definicje są jednakŜe przewaŜnie zbyt ogólne i przez to
nie mogą być zaimplementowane w postaci zrozumiałej dla komputerów. Pewnym
rozwiązaniem tego problemu moŜe być wykorzystanie ontologii, która zawierałaby
wiedzę na temat środowiska działania aplikacji (w postaci pojęć i relacji) i jednocześnie
umoŜliwiała przetwarzanie tej wiedzy przez maszyny.
Rysunek 3. Przykładowe pojęcia z ontologii kontekstu SOUPA
Zasadnicze cele jakie naleŜy osiągnąć podczas tworzenia ontologii kontekstu zawierają [9]:
 prostotę opisu;
 elastyczność i rozszerzalność o nowe elementy opisu;
 moŜliwość wspierania róŜnych typów kontekstów;
 umiejętność wyraŜania moŜliwie największej liczby wartości kontekstu.
Przykładem środowiska działającego w oparciu o ontologię kontekstu jest CoBrA,
której opis działania moŜna znaleźć w [10] i [11].
Ontologie umoŜliwiają przedstawienie kontekstu w najbardziej ogólny sposób. Dodatkowo dzięki swoim własnościom pozwalają na wnioskowanie i łatwą rozbudowę
o nowe informacje oraz fakty. Jedną z ontologii kontekstu zaprezentowano na rysunku
3. Jest to ontologia SOUPA [12], która została wykorzystana w środowisku CoBrA.
Zawiera ona większość podstawowych pojęć jakie są wykorzystywane do definiowania
danych kontekstowych w istniejących systemach.
Na rysunku 3 przedstawiono podstawowe klasyfikacje informacji stanowiące punkt
wyjścia dla reprezentacji kontekstu w róŜnego typu aplikacjach. Do operacji na ontologii niezbędny jest edytor, który przygotowuje opis wykorzystywany przez serwer oraz
klienta z rysunku 2.
4. Wykorzystanie sieci Petriego
Aplikacje kontekstowe sterowane zdarzeniami potrzebują do pracy danych kontekstowych. Proces wyznaczenia potrzebnego im kontekstu musi uwzględniać, to iŜ informacje od sensorów przychodzą w sposób asynchroniczny i często mają róŜną postać.
Dodatkowo wszystkie dane posiadają pewne określone cechy, które decydują o ich
przydatności dla oprogramowania, natomiast sama definicja kontekstu z jakiej korzysta
system moŜe być statyczna bądź dynamiczna. Wszystkie te czynniki powodują, iŜ istotne staje się skonstruowanie mechanizmu umoŜliwiającego wyznaczenie kontekstu,
który byłby z jednej strony na tyle elastyczny aby odpowiadał wymogom dynamicznej
definicji kontekstu, a z drugiej strony potrafił przetwarzać konkretne dane docierające
do aplikacji.
Propozycją takiego mechanizmu moŜe być mechanizm oparty na sieci Petriego [13]
rozszerzonej o adnotacje, w której Ŝetony reprezentują przychodzącą do systemu informację, natomiast struktura sieci odzwierciedla warunki jakie muszą zaistnieć, aby aplikacja odebrała przychodzący z zewnętrznego świata ciąg zdarzeń jako uŜyteczny.
Adnotacje dołączane do Ŝetonów (informacji ze zdarzeń) umoŜliwiają logiczne ich
łączenie, badanie waŜności oraz wiarygodności. Podczas uruchamiania tranzycji następuje pobranie odpowiednich Ŝetonów (na podstawie adnotacji) z powiązanych stanów
i po ich analizie umieszczenie kolejnego Ŝetonu w stanie docelowym tranzycji. Funkcjonowanie całego mechanizmu moŜna przedstawić przy pomocy listy warunków jakie
odnoszą się do danych wejściowych.
Rodzaje informacji jakie zawarte są w adnotacjach dołączonych do Ŝetonów mogą
mieć swoje definicje zanurzone w ontologii kontekstu, dzięki czemu moŜna jeszcze
bardziej zwiększyć elastyczność całego systemu. Podstawowe rodzaje przenoszonych
przez adnotacje danych to:
 czas dodania – określa kiedy dokładnie została dodana informacja;
 czas waŜności – określa jak długo informacja reprezentowana przez Ŝeton jest waŜna;
 źródło – z jakiego sensora pochodzi informacja. Będzie ona opisana przy pomocy
zrozumiałej dla klienta i zgodnej z przyjętą metodą reprezentacji kontekstu konwencją (na przykład OWL). Podczas przechodzenia Ŝetonów przez sieć element ten zawiera dodatkowo informacje o historii powstania Ŝetonu. Na jego podstawie moŜna
ustalić z jakich poprzednich stanów i ich wartości nastąpiło przejście do stanu,
w którym aktualnie znajduje się Ŝeton;
 powiązania – pole umoŜliwia logiczne grupowanie danych. Podczas przekazywania
do sieci informacji elementarnych muszą zostać zachowane powiązania między nimi. W przykładzie z rysunku 4 informacja na przykład o temperaturze jest istotna
tylko w kontekście miejsca odczytu tej temperatury – które to miejsce reprezentowane jest przez osobny Ŝeton. Dla późniejszego przetwarzania naleŜy zatem powiązać Ŝeton reprezentujący miejsce odczytu temperatury z Ŝetonem reprezentującym
samą temperaturę;
 inne dane zaleŜne od specyfiki aplikacji i środowiska.
Podczas przechodzenia przez transformację do nowych Ŝetonów dołączane są nowe
adnotacje, zawierające odpowiednio przetworzone dane.
Zasygnalizowane powyżej rozszerzenie sieci Petriego można przedstawić w sposób następujący:
Pa = <Sa, T, A, M0>,
gdzie:
Pa – sieć Petriego z adnotacjami,
Sa – skończony zbiór stanów, z których każdy posiada adnotacje,
T – skończony zbiór tranzycji, T: Sa -> Sa,
A – skończony zbiór łuków między stanami a tranzycjami, A: (Sa x T) ∪ (T x Sa) -> N+,
M0 – początkowe rozłożenie żetonów w stanach Sa.
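Poniższy szkic w języku Python ilustruje, jak można odwzorować strukturę Pa = <Sa, T, A, M0> wraz z adnotacjami żetonów opisanymi wcześniej (czas dodania, czas ważności, źródło, powiązania, ślad). Nazwy klas i pól są wyłącznie przykładowe i nie pochodzą z oryginalnego opracowania – jest to poglądowa propozycja struktury danych, a nie rzeczywista implementacja autorów.

```python
from dataclasses import dataclass, field
import time

@dataclass
class Zeton:
    """Żeton z adnotacjami: wartość, źródło, czas dodania, czas ważności, powiązania, ślad."""
    wartosc: object
    zrodlo: str                                      # np. identyfikator sensora (konwencja OWL)
    czas_dodania: float = field(default_factory=time.time)
    czas_waznosci: float = 60.0                      # w sekundach
    powiazania: dict = field(default_factory=dict)   # np. {"miejsce": "sala 101"}
    slad: list = field(default_factory=list)         # historia powstania żetonu

    def aktywny(self) -> bool:
        return time.time() - self.czas_dodania <= self.czas_waznosci

@dataclass
class SiecPetriego:
    """Pa = <Sa, T, A, M0>: stany z adnotacjami, tranzycje, łuki, rozłożenie początkowe."""
    stany: dict = field(default_factory=dict)        # Sa: nazwa stanu -> lista żetonów (M0 i kolejne)
    tranzycje: list = field(default_factory=list)    # T: funkcje sprawdzające warunki na żetonach
    luki: list = field(default_factory=list)         # A: pary (stan, tranzycja) lub (tranzycja, stan)

    def dodaj_zeton(self, stan: str, zeton: Zeton):
        self.stany.setdefault(stan, []).append(zeton)
```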
Na rysunku 4 przedstawiono sieć Petriego dla przykładu uruchomienia alarmu pożarowego w momencie, gdy temperatura i zadymienie w pewnym miejscu wzrosną powyżej pewnych progów. Tabela 1 reprezentuje warunki takiego mechanizmu.
Tabela 1. Tabela mechanizmu przetwarzania kontekstu
Zdarzenia elementarne (dostarczane przez sensory): odczyt temperatury (T) w miejscu X1; odczyt poziomu zadymienia (D) w miejscu X2.
Rodzaje informacji elementarnych: T – temperatura, D – poziom zadymienia, X – miejsce.
Warunki: w X1: T > 40°, w X2: D > 10% i X = X1 = X2.
Zdarzenia użyteczne (przekazywane do klientów): spełniony został warunek przy wartościach T = 42°, D = 15%, X = sala 101.
Klient, rejestrując się w serwerze, określa typy informacji oraz warunki jakie muszą
one spełniać, aby zdarzenie było dla niego uŜyteczne (są to kolumny: „Rodzaje informacji elementarnych” oraz „Warunki”). Dodatkowo klient przekazuje konkretne wartości jakie występują podczas porównywania warunków (w przykładzie 40° dla
temperatury i 10% dla zadymienia). Serwer, po stwierdzeniu zajścia zdarzeń uŜytecznych dla klienta, informuje go o nim oraz dodatkowo przesyła konkretne wartości danych zawarte w Ŝetonach.
Serwer, po otrzymaniu od klienta listy zdarzeń elementarnych oraz warunków na
nich operujących, tworzy dynamicznie sieć Petriego (rysunek 4).
Rysunek 4. Przykładowa sieć Petriego dla tabeli 1
Sieć Petriego potrafi przyjmować dane typu „temperatura”, „zadymienie” oraz
„miejsce”. W momencie odebrania przez system zdarzenia z czujnika jest ono przekazywane do sieci Petriego, w której następuje stwierdzenie iŜ typ danej zgadza się
z typem stanu wejściowego „temperatura” oraz „miejsce” (czujnik wysyłając daną wysyła odczyt temperatury w określonym miejscu) – informacje „temperatura” oraz „miejsce” dostarczane osobno nie mają dla systemu znaczenia podczas analizowania
problemu czy zaistniał poŜar czy nie. To logiczne grupowanie danych określone jest na
poziomie reprezentacji kontekstu. Następuje utworzenie nowego Ŝetonu, dla którego
adnotacje zawierają wartość odczytanej temperatury, źródło z którego pochodzi odczyt,
czas waŜności informacji oraz ślad. Analogiczna sytuacja ma miejsce w momencie
przychodzenia do systemu odczytów dotyczących poziomu zadymienia.
Tranzycje analizują przychodzące do nich Ŝetony. Gdy tranzycja stwierdzi, Ŝe
w kaŜdym z jej stanów wejściowych jest przynajmniej po jednym Ŝetonie moŜe odczytać wartości tych Ŝetonów i przyrównać je do warunku (a tym samym do konkretnych
wartości) przekazanego przez klienta. JeŜeli porównanie wypadnie pozytywnie (warunek jest spełniony) tranzycja moŜe umieścić nowy Ŝeton w swoim stanie wyjściowym,
umieszczając dodatkowo w tym Ŝetonie ślad (na podstawie jakich wcześniejszych Ŝetonów został on utworzony) – w przypadku ostatniej tranzycji będzie to równoznaczne
z poinformowaniem klienta, o tym iŜ zaistniała interesująca go sytuacja i przekazaniem
wartości ostatniego Ŝetonu.
Powyższy przykład może być opisany przy pomocy następującego algorytmu wykorzystującego mechanizm publish/subscribe (poglądowy szkic kodu zamieszczono poniżej, po opisie algorytmu):
Klient:
1. Wyślij do serwera definicję interesującego kontekstu wraz z wartościami
szczegółowymi (w przykładzie: interesujący kontekst to temperatura > 40°
i zadymienie > 10% w tym samym miejscu).
2. Zarejestruj metodę obsługi zdarzenia dla powiadomienia od serwera
o wystąpieniu kontekstu (subscribe).
Serwer:
1. Oczekuj na rejestracje przez klienta definicji kontekstu i przekazanie wartości
szczegółowych.
2. Po otrzymaniu od klienta definicji kontekstu dekompozycja i utworzenie na
jej podstawie sieci Petriego.
3. Oczekuj na zdarzenia.
4. W przypadku zajścia zdarzenia dostarczającego informacje pasującą typem
do stanów wejściowych utworzonej sieci Petriego przekaŜ ją do tej sieci
(utworzenie Ŝetonu).
5. Za kaŜdym razem gdy tworzony jest nowy Ŝeton w sieci uruchom procedurę
sprawdzania tranzycji.
a. Sprawdź czy tranzycja ma aktywne (takie którym nie minął termin waŜności) Ŝetony w swoich stanach wejściowych.
b. JeŜeli są Ŝetony określonego typu sprawdź czy są wśród nich takie, które
odpowiadają logicznemu pogrupowaniu (w przykładzie: czy odczyt temperatury i odczyt zadymienia dotyczą tego samego miejsca).
c. Jeżeli są żetony odpowiadające logicznemu pogrupowaniu, sprawdź czy spełniają one warunek zawarty w tranzycji (w przykładzie: czy temperatura w danym miejscu > 40° i czy zadymienie w tym miejscu przekracza 10%). Jeżeli tak, zaznacz, iż żetony te zostały wykorzystane przez tranzycję i utwórz nowy żeton w stanie docelowym tranzycji.
d. Jeżeli stan docelowy tranzycji jest stanem końcowym – przekaż do klienta informację o zaistnieniu interesującego go kontekstu i przekaż mu wartości żetonu w stanie docelowym (publish).
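Poniżej zamieszczono poglądowy szkic serwera kontekstu dla opisanego przykładu alarmu pożarowego. Nazwy klas, metod i struktura danych są założeniami autora szkicu (nie jest to implementacja autorów artykułu); logiczne grupowanie żetonów uproszczono tu do dopasowania odczytów po wspólnym miejscu.

```python
# Szkic mechanizmu publish/subscribe dla przykładu z tabeli 1 (nazwy przykładowe).

class SerwerKontekstu:
    def __init__(self):
        self.subskrypcje = []            # pary: (definicja kontekstu, metoda obsługi klienta)
        self.ostatnie = {}               # ostatnie odczyty: (typ, miejsce) -> wartość

    def subscribe(self, definicja, callback):
        # klient rejestruje definicję kontekstu oraz metodę obsługi powiadomienia
        self.subskrypcje.append((definicja, callback))

    def zdarzenie(self, typ, miejsce, wartosc):
        # odebranie zdarzenia elementarnego z sensora (odpowiednik utworzenia żetonu)
        self.ostatnie[(typ, miejsce)] = wartosc
        self._sprawdz_tranzycje(miejsce)

    def _sprawdz_tranzycje(self, miejsce):
        # odpowiednik procedury sprawdzania tranzycji: grupowanie logiczne po miejscu
        t = self.ostatnie.get(("temperatura", miejsce))
        d = self.ostatnie.get(("zadymienie", miejsce))
        for definicja, callback in self.subskrypcje:
            if t is not None and d is not None \
               and t > definicja["prog_T"] and d > definicja["prog_D"]:
                callback({"T": t, "D": d, "X": miejsce})      # publish

# Przykład użycia (wartości jak w tabeli 1):
serwer = SerwerKontekstu()
serwer.subscribe({"prog_T": 40.0, "prog_D": 10.0},
                 lambda dane: print("ALARM:", dane))
serwer.zdarzenie("temperatura", "sala 101", 42.0)
serwer.zdarzenie("zadymienie", "sala 101", 15.0)   # w tym momencie następuje powiadomienie
```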
Przedstawione powyŜej rozszerzenie sieci Petriego daje duŜe moŜliwości w zakresie
budowania mechanizmów reakcji systemu na dochodzące do niego zdarzenia. Rozwiązanie to wspiera w sposób naturalny kontekst, ze względu na określenie moŜliwych
rodzajów danych wejściowych. Informacje mogą przychodzić w sposób asynchroniczny, a dołączenie do nich dodatkowych adnotacji pozwala na późniejszą analizę niezaleŜnie od aktualnego stanu całej sieci. Ścisłe zdefiniowanie poszczególnych elementów
struktury umoŜliwia dynamiczne jej budowanie na podstawie danych przekazanych
przez uŜytkownika lub uzyskanych z innych źródeł takich jak np. ontologia. Samo przetwarzanie przekazywanych przez klienta warunków do formalnej postaci, uŜytecznej dla
serwera kontekstu, takŜe moŜna oprzeć o ontologie, co będzie skutkować jeszcze większym poziomem dynamizmu całego systemu.
5. Podsumowanie
Aplikacje kontekstowe wymagają zaimplementowania mechanizmu, który umoŜliwiałby efektywne i elastyczne przetwarzanie zdarzeń przychodzących do systemu.
Wymogi te spełnia zaprezentowane rozszerzenie sieci Petriego. Pozwala ono na automatyczne przetwarzanie asynchronicznie przychodzących do programu informacji i określanie, czy spełniają one warunki wpływające na przebieg wykonania programu. Dalsza
praca będzie polegała na zaimplementowaniu przedstawionej koncepcji i wdroŜeniu jej
w rzeczywistych systemach. Struktura sieci Petriego moŜe być budowana w sposób
dynamiczny – na podstawie dostarczanych przez uŜytkownika danych, bądź uzyskiwana
jako efekt przetwarzania ontologii. Warunki odzwierciedlone w tranzycjach sieci są
przekazywane przez użytkownika, dlatego też należy opracować mechanizm ich transformacji na formalną postać. Daje to jeszcze większe odciążenie użytkownika końcowego od znajomości technicznych aspektów działania aplikacji.
Bibliografia
[1] Weiser M. The Computer for the Twenty-First Century, Scientific American, September 1991, s. 94-104
[2] Want R., Hopper A., Falcăo V., Gibbons J. The Active Badge Location System,
ACM Transactions on Information Systems, 10(1), 1992, s. 91-102
[3] Chen H. An Intelligent Broker Architecture for Context-Aware Systems, PhD Dissertation proposal, 2003
[4] Anind K. Dey, Gregory D. Abowd. A Conceptual Framework and a Toolkit for
Supporting the Rapid Prototyping of Context-Aware Applications, HumanComputer Interaction (HCI) Journal, Volume 16 (2-4), s. 97-166
[5] Ailisto H., Alahuhta P., Haataja V., Kyllönen V., Lindholm M. Structuring Context
Aware Applications: Five-Layer Model and Example Case, Proceedings of the
Workshop on Concepts and Models for Ubiquitous Computing, Goteborg, Sweden,
2002
[6] Bill N. Schilit and Marvin M. Theimer Disseminating Active Map Information to
Mobile Hosts. IEEE Network, 8(5). 1994, s. 22-32
[7] Anind K. Dey. Context-aware computing: The CyberDesk project, Proceedings of
the AAAI 1998 Spring Symposium on Intelligent Environments. Menlo Park, CA:
AAAI Press.
[8] Hull R., Neaves P., Bedford-Roberts J. Towards situated computing, In Proceedings of International Symposium on Wearable Computers, 1997
[9] Korpipää P., Mäntyjärvi J. An Ontology for Mobile Device Sensor-Based Context
Awareness, Proc. Context ’03, LNAI no. 2680, 2003
[10] Baldauf M., Dustdar S. A survey on context aware systems, International Journal of
Ad Hoc and Ubiquitous Computing, 2004
[11] Chen H., Finin T., Anumap J. An Ontology for Context-Aware Pervasive Computing Environments, The Knowledge Engineering Review Vol. 18, 2003, s. 197-207
[12] Chen H., Finin T., Joshi A. The SOUPA Ontology for Pervasive Computing, Ontologies for Agents: Theory and Experiences, 2005, s. 233-258
[13] Carl Adam Petri, Kommunikation mit Automaten PhD., 1962
A hybrid method of person verification with use of independent speech and facial asymmetry
Mariusz Kubanek, Szymon Rydzek
Czestochowa University of Technology,
Institute of Computer and Information Science
Abstract:
In person identification or verification, the prime interest is not in recognizing the words but in determining who is speaking them. In person identification systems, a test signal from an unknown speaker is compared with the signals of all known speakers in the set; the signal with the maximum probability identifies the unknown speaker. In security systems based on person identification and verification, faultless identification is of great importance for safety. In person verification systems, a test signal from a known speaker is compared with the recorded signals in the set associated with the label of the tested person; more than one signal is recorded for every user in the set.
In order to increase safety, this work proposes an own approach to person verification based on independent speech and facial asymmetry. Extraction of the audio features of a person's speech is done using cepstral speech analysis. The idea of improving the effectiveness of the face recognition technique is based on processing information about face asymmetry in the most informative part of the face – the eyes region.
Keywords:
speaker verification, speech signal, independent speech, speech coding, facial asymmetry
1. Introduction
The most important problem in the process of person verification is suitable coding of the audio signal [3]. In general, speech coding is a procedure to represent a digitized speech signal using as few bits as possible while maintaining a reasonable level of speech quality. Speech coding has matured to the point where it now constitutes an important application area of signal processing. Due to the increasing demand for speech communication, speech coding technology has received growing levels of interest from the research, standardization, and business communities. Advances in microelectronics and the vast availability of low-cost programmable processors and dedicated chips have enabled rapid technology transfer from research to product development; this encourages the research community to investigate alternative schemes for speech coding, with the objective of overcoming deficiencies and limitations. The standardization community pursues the establishment of standard speech coding methods for various applications that will be widely accepted and implemented by the industry. The business communities capitalize on the ever-increasing demand and opportunities in the consumer, corporate and network environments for speech processing products [1,2,3].
In [9-11] an approach based on a holistic representation of face asymmetry characteristics was proposed to improve face recognition techniques as well as expression classification. The authors used affine transformations to normalize the image on the basis of three landmark points: the inner eye corners and the base of the nose. In the experiment two asymmetry measures were used: D-face (Density Difference) and S-face (Edge Orientation Similarity). The authors reported a mean error of 3.60/1.80% (FRR/FAR) when using both asymmetry measures fused with Fisher Faces in the task of expression classification.
The idea of improving the effectiveness of the face recognition technique used in the hybrid method is based on processing information about face asymmetry in the most informative part of the face – the eyes region. Such an approach to person identification has been reported as efficient and applicable in real-time systems [12].
In this work, a method of person verification based on independent speech and facial asymmetry is proposed. For the extraction of the audio features of a person's speech, the mechanism of cepstral speech analysis is applied. For acoustic speech recognition, twenty-dimensional MFCC (Mel Frequency Cepstral Coefficients) are used as the standard audio features.
2. Preliminary processing of the signal
Analysis of the audio channel should begin with filtering the signal and removing the components that constitute disturbances. Next, because of the non-stationary nature of the speech signal, caused by the dynamic properties of human speech, the following stage relies on dividing the input signal into stationary frames [4]. The signal is stationary in short time windows (10–30 ms) [5]. Every such stationary frame is replaced by an observation symbol in the process of creating observation vectors. In the created system it was assumed that the length of every frame equals 30 ms, which at the given sampling rate (8 kHz) corresponds to 240 samples. The obtained frames do not overlap.
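As a rough illustration (not part of the original system), the sketch below shows how a signal sampled at 8 kHz can be split into non-overlapping 30 ms frames of 240 samples each; function and variable names are the sketch author's assumptions.

```python
import numpy as np

def split_into_frames(signal, sample_rate=8000, frame_ms=30):
    """Split a 1-D signal into non-overlapping stationary frames (30 ms -> 240 samples at 8 kHz)."""
    frame_len = int(sample_rate * frame_ms / 1000)       # 240 samples
    n_frames = len(signal) // frame_len                  # drop the incomplete tail
    return np.reshape(signal[:n_frames * frame_len], (n_frames, frame_len))

# Example: 2 s of a synthetic signal gives 66 frames of 240 samples each.
frames = split_into_frames(np.random.randn(16000))
print(frames.shape)   # (66, 240)
```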
3. The mechanism of cepstral speech analysis
Speech processing applications require specific representations of speech information. A wide range of possibilities exists for parametrically representing the speech signal. Among these, the most important parametric representation of speech is the short-time spectral envelope [4,6]. Linear Predictive Coding (LPC) and Mel Frequency Cepstral Coefficients (MFCC) spectral analysis models have been used widely for speech recognition applications. Usually, together with MFCC coefficients, first and second order derivatives are also used to take into account the dynamic evolution of the speech signal, which carries relevant information for speech recognition.
The term cepstrum was introduced by Bogert et al. in [7]. They observed that the logarithm of the power spectrum of a signal containing an echo had an additive periodic component due to the echo. In general, voiced speech can be regarded as the response of the vocal tract articulation equivalent filter driven by a pseudo-periodic source [7].
The characteristics of the filters follow the characteristics of the human auditory system [5]. The filters have triangular bandpass frequency responses. The bands of the filters are spaced linearly for frequencies below 1000 Hz and logarithmically above 1000 Hz. In the mel-frequency scaling, all the filter bands have the same width, equal to the intended characteristic of the filters in normal frequency scaling.
The spectrum of the signal of every frame, obtained by the Fast Fourier Transform (FFT), is filtered by the bank of filters. The next step is to calculate the output of each filter by multiplying the filter's amplitude with the average power spectrum of the corresponding frequency of the voice input.
Finally, the mel-frequency cepstral coefficients (MFCC) are derived by taking the log of the mel-power-spectrum coefficients and converting them back to the time (quefrency) domain using the Discrete Cosine Transform (DCT). The number of mel-coefficients used for speaker recognition purposes is usually from 12 to 20 [8].
In this work, twenty-dimensional MFCC are used as the standard audio features for speech coding. Next, the cepstral coefficients obtained for all frames are added up appropriately. In this way, each independent utterance is coded by twenty cepstral coefficients.
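A minimal sketch of the described coding step, assuming the librosa library is available; the parameter choices (20 coefficients, 30 ms frames, no overlap, 8 kHz) follow the text, while the exact aggregation over frames used by the authors is not specified, so a simple mean over frames is taken here purely for illustration.

```python
import numpy as np
import librosa

def encode_utterance(wav_path):
    """Code an independent utterance by twenty aggregated MFCC coefficients (illustrative)."""
    y, sr = librosa.load(wav_path, sr=8000)                  # 8 kHz, as in the paper
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=20,
                                n_fft=240, hop_length=240)   # 30 ms frames, no overlap
    return np.mean(mfcc, axis=1)                             # one 20-element vector per utterance
```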
4. Face asymmetry
The input information for the face recognition procedure was a frontal face image taken at a resolution of 768x576 pixels with 24-bit colour depth. Based on image processing procedures developed for the purpose of automatic face feature extraction, the face image was processed in order to find 8 landmark points (see Table 1).
Table 1. Description of landmark points detected using the automatic procedure of feature extraction
At the beginning of the processing we have to determine the area of the picture that contains the face. It is done by thresholding the image in the Irb space (see Fig. 1).
Figure 1. Method of determination of the face area
In the next step we have to determine the area of each eye. It is done in three stages: 1 – horizontal projection for the whole image; 2 – vertical projection in the closest area of the maximum value acquired in the first stage, in which two maximum values are captured, constituting the horizontal coordinates of the eyes; 3 – in the last stage the horizontal projection is performed again, but only in the closest area of the values from stage 2. Next, we have to find the centre and diameter of the circle describing the iris. Processing is done on the thresholded eye image. After thresholding, the image is filtered with a median filter, and after Canny edge detection the points on the edge of the iris are determined (see Fig. 2).
Figure 2. Determination of the eye regions, using differences of pixel lightness and vertical and horizontal projections, and determination of the characteristic points of the eye, based on skin colour, with use of Canny edge detection
To find the eye corners we search in a specially determined area. The minimum value of luminance determines the eye corners. To find the edges of the eyelids we used thresholding in the R/G, B/G space. Next, to describe the eyelid edges we used polynomial approximation (see Fig. 3).
Figure 3. Result of determination of the eye's iris and corners and the edges of the eyelids
We used geometrical dependencies of the face to determine the mouth area. In the mouth area we also used R/G, B/G thresholding. The result of determination of the mouth corners and external edges is shown in Fig. 4.
Figure 4. Result of determination of edges of mouth
The face description vector was constructed on the basis of measures of geometrical dependencies between the detected landmark points and asymmetry measures. All distance measurements were done using the Muld unit, which is based on the diameter of the iris. Such an approach allows us to avoid the necessity of image scaling for size normalization. Asymmetry measurement was done in two ways.
In the case of measuring the asymmetry of the eyelid shape, the input information was the set of points detected on the eyelid. In the first step the coordinates of the centroid were found using:

x_c = \frac{1}{N}\sum_{t=0}^{N-1} x(t), \qquad y_c = \frac{1}{N}\sum_{t=0}^{N-1} y(t)     (1)

where t = 0,1,...,N-1, N – number of points detected on the eyelid, (x(t), y(t)) – coordinates of the detected points.
Next, the values of the distances between the centroid and the eyelid points were found using:

r(t) = \sqrt{(x(t) - x_c)^2 + (y(t) - y_c)^2}, \quad t = 0,1,...,N-1     (2)

The asymmetry measure was defined as the set of differences of the corresponding distances between the eyelid and the centroid, calculated for the right and left eye:

d(t) = r_R(t) - r_L(t), \quad t = 0,1,...,N-1     (3)

where r_R(t) and r_L(t) – distances between the centroid and the eyelid points for the right and left eye respectively.
In the case of measuring the asymmetry of the eyelid corner positions, a dependence constructed on the basis of the Weber–Fechner law was used:

F_{Asym}(F, L) = \ln\frac{F}{L}     (4)

where F and L – results of the feature measurement (the distance between the corresponding landmark points).
The measure of face similarity was calculated as the weighted mean value of the scaled measures of feature differences:

d_{face} = \frac{\sum_{i=1}^{n} w_i \, s_i \, d_{feature,i}}{n}     (5)

where d_face – value of the similarity of face descriptions, s – vector of scaling values, w – weights vector, d_feature – vector of the measures of feature differences, n – number of features.
The measures of feature differences were defined as Euclidean distances between the corresponding feature values. For features whose description consists of a set of measures (eyelid shape), the difference was defined as the mean square value of the elements of the vector.
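The sketch below (an assumed, simplified implementation, not the authors' code) computes the centroid, the radial distances and the asymmetry differences defined by formulas (1)-(3) for two sets of corresponding eyelid points given as NumPy arrays.

```python
import numpy as np

def eyelid_asymmetry(points_right, points_left):
    """Formulas (1)-(3): centroid, centroid-to-eyelid distances, right/left differences.

    points_right, points_left: arrays of shape (N, 2) with corresponding eyelid points.
    """
    def radial_distances(points):
        centroid = points.mean(axis=0)                        # (1): x_c, y_c
        return np.linalg.norm(points - centroid, axis=1)      # (2): r(t)

    r_right = radial_distances(points_right)
    r_left = radial_distances(points_left)
    return r_right - r_left                                   # (3): d(t)

# Toy example with N = 4 corresponding points per eyelid:
d = eyelid_asymmetry(np.array([[0., 0.], [1., .2], [2., .3], [3., 0.]]),
                     np.array([[0., 0.], [1., .1], [2., .2], [3., 0.]]))
print(d.round(3))
```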
5. Experimental results
In our experiment, the error level of person verification using only audio independent speech and using coupled audio speech and facial asymmetry was tested. The research was carried out using the authors' base of independent utterances and face pictures of different users. Sixty users were tested, each speaking two long independent utterances, one used as training data and one as test data. For each user one photo was taken, and a vector of asymmetry was built. The research was conducted for different degrees of disturbance of the audio signal (Signal to Noise Ratio, SNR = 0, 5, 10, 20 dB). It was accepted that for SNR = 20 dB the audio signal is clean. It was assumed that the face pictures are clean for every SNR of the audio.
In the research, recordings of independent speech were used. Samples were taken at a frequency of 8 kHz with 16-bit encoding. Two samples for each user were recorded, which after encoding with 20 cepstral coefficients were put in the system database with the corresponding person ID. In the verification process the user has to read a long enough sentence randomly selected by the system, and in parallel the vector of the facial asymmetry of the user is automatically created from the photo registered by the camera. Before the recording, the user gives his own ID number. The sentence is compared with all encoded sentences in the database by calculating the Euclidean distance of two vectors. If the ID of at least one of the two vectors (audio and visual) nearest to the examined one is identical with the ID given by the user, the verification is qualified as correct. In the case when simultaneous agreement of the IDs of both the audio and the visual signal was required, FAR and FRR errors did not appear for any SNR, therefore this case is not described in this work. The audio signal and face asymmetry were assigned weights of 50% and 50% respectively.
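As an illustration only (the paper does not give the decision rule beyond the description above), the following sketch compares a test audio vector and a test asymmetry vector with enrolled templates by Euclidean distance and accepts the claimed ID if at least one of the two nearest templates carries that ID; the equal 50/50 weights correspond to treating both modalities on a par. All names and the toy data are assumptions of the sketch author.

```python
import numpy as np

def verify(claimed_id, audio_vec, face_vec, audio_db, face_db):
    """Accept the claimed ID if the nearest audio template or the nearest face
    template (by Euclidean distance) belongs to that user (illustrative rule)."""
    def nearest_id(query, database):
        ids = list(database.keys())
        dists = [np.linalg.norm(query - database[i]) for i in ids]
        return ids[int(np.argmin(dists))]

    return claimed_id in (nearest_id(audio_vec, audio_db),
                          nearest_id(face_vec, face_db))

# Toy databases: user_id -> template vector.
audio_db = {"u1": np.zeros(20), "u2": np.ones(20)}
face_db = {"u1": np.zeros(8),  "u2": np.ones(8)}
print(verify("u1", np.full(20, 0.1), np.full(8, 0.9), audio_db, face_db))  # True (audio matches)
```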
Table 2 shows the results of person verification with use of only audio independent speech. Table 3 shows the results of person verification with use of audio independent speech and facial asymmetry.
Table 2. Results of person verification with use of only audio independent speech (average for sixty speakers)

FRR [%]:  SNR 20 dB – 1.67;  SNR 15 dB – 6.67;  SNR 10 dB – 13.67;  SNR 5 dB – 20.00;  SNR 0 dB – 41.46
FAR [%]:  SNR 20 dB – 0.00;  SNR 15 dB – 0.00;  SNR 10 dB – 0.00;   SNR 5 dB – 1.67;   SNR 0 dB – 5.00
Table 3. Results of person verification with use of audio independent speech and facial asymmetry (average for sixty speakers)

FRR [%]:  SNR 20 dB – 0.00;  SNR 15 dB – 1.67;  SNR 10 dB – 3.67;  SNR 5 dB – 5.00;  SNR 0 dB – 8.34
FAR [%]:  SNR 20 dB – 0.00;  SNR 15 dB – 0.00;  SNR 10 dB – 0.00;  SNR 5 dB – 0.00;  SNR 0 dB – 1.67
The audio signal was degraded to lower quality with the use of a noise generator. In the experiment the level of FAR/FRR errors was investigated. The sampling frequency was set to 8 kHz because of the possibility of easy transmission of such signals, which in the future can be used to build a system of user identification over the phone.
6. Conclusion and future work
In this paper, a new approach to person verification based on independent speech and facial asymmetry was presented. The basic idea is that not only speech and the naturally looking face, but also other facial features such as facial asymmetry, contain much useful information necessary for verification. The robustness of the new approach was evaluated. The new approach will be implemented in the module of our user identification and verification system. The method was proposed to improve the effectiveness of user verification.
An advantage of our approach is the simplicity and functionality of the proposed methods, which fuse together audio and visual signals. The obtained research results show that fewer errors are committed if, for verification, information about the tone of voice of the examined user is combined with information about the asymmetry of his face. A decisively lower level of errors was obtained in user verification based on independent speech and facial asymmetry, in comparison to audio speech alone, particularly in environments where the audio signal is disrupted.
In the future, we plan to build a verification system in which the error rate will be close to 0% for an unlimited number of users. Research will also be carried out on the suitable selection of the proportions of the weights and on decreasing the length of single independent utterances.
References
[1] Kubanek M. Analysis of Signal of Audio Speech and Process of Speech Recognition. Computing, Multimedia and Intelligent Techniques, 2, pp 5564, 2006.
[2] Kubanek M. Method of Speech Recognition and Speaker Identification with use
Audio-Visual Polish Speech and Hidden Markov Models. Biometrics, Computer
Security Systems and Artificial Intelligence Applications, Saeed K., Pejaś J., Mosdorf R., Springer Science + Business Media, New York, pp. 45-55, 2006.
A hybrid method of person verification with use independent speech and…
99
[3] Chu Wai C. Speech coding algorithms. Foundation and Evolution of Standardized
Coders. A John Wiley & Sons, New Jersey 2000.
[4] Rabiner L., Yuang B. H. Fundamentals of Speech Recognition. Prentice Hall Signal
Processing Series, 1993.
[5] Wiśniewski A. M. Hidden Markov Models in Speech Recognition. Bulletin IAiR
WAT, 7,Wroclaw 1997 [In Polish].
[6] Kanyak M. N. N., Zhi Q., Cheok A. D., Sengupta K., Chung K. C. Audio-Visual
Modeling for Bimodal Speech Recognition. Proc. Symp. Time Series Analysis,
2001.
[7] Bogert B. P., Healy M. J. R., Tukey J. W. The Frequency Analysis of Time-Series for Echoes. Proc. Symposium on Time Series Analysis, pp. 209-243, 1963.
[8] Wahab A., See NG G., Dickiyanto R. Speaker Verification System Based on Human Auditory and Fuzzy Neural Network System. Neurocomputing Manuscript
Draft, Singapore.
[9] Liu Y., Schmidt K., Cohn J., Mitra S. Facial Asymmetry Quantification for Expression Invariant Human Identification. AFGR02, pp 198-204, 2002.
[10] Liu Y., Weaver R., Schmidt K., Serban N., Cohn J. Facial Asymmetry: A New
Biometric. CMU-RI-TR, 2001.
[11] Mitra S., Liu Y. Local Facial Asymmetry for Expression Classification. Proc. of
the 2004 IEEE Conference on Computer Vision and Pattern Recognition CVPR'04,
2004.
[12] Rydzek S. A Method to Automatically Authenticate a Person Based on the Asymmetry Measurements of the Eyes and/or Mouth. PhD thesis, Czestochowa University of Technology, 2007.
Identyfikacja charakterystyk modeli w trybie on-line
wraz z wizualną rekonstrukcją charakterystyki
Krzysztof Makles
Politechnika Szczecińska, Wydział Informatyki
Abstract:
The main purpose of identification of model characteristics in on-line mode is to combine random process identification under on-line conditions, i.e. without storing measurement data in computer memory, with the functionality of computer graphical models. This approach allows building models for random process identification in the frequency, magnitude and phase domains, with the possibility of instant observation of the results and on-demand parameter fitting. This article provides a solution for on-line visualization of the identified characteristic in time-partition mode.
Keywords:
simulation, filtering, point analysis, visual identification, on-line
1. Wstęp
Nowoczesne środowisko komputerowe pozwala na identyfikację charakterystyk
modeli w trybie on-line. Dodatkowo, obecnie stosowane środowiska do modelowania
i symulacji, takie jak Matlab-Simulink, pozwalają na łatwe prototypowanie modeli do
identyfikacji w trybie on-line. W pracach [1] oraz [4] pokazano podstawy budowania
modeli do identyfikacji systemów w trybie on-line, pozwalających na natychmiastową
identyfikację na podstawie napływających danych pomiarowych, czyli bez gromadzenia
tych danych w pamięci komputera. Zaproponowane modele składają się z dwóch części:
modeli do analizy punktowej [5], oraz modeli do rekonstrukcji charakterystyki na podstawie wyników analizy punktowej. W pracy [3] pokazano podstawy budowy modeli
rekonstrukcyjnych z przykładem realizacji w środowisku Matlab-Simulink. Porównanie
zaproponowanych modeli z modelami opartymi o analizę FFT pokazano w pracy [6].
Propozycję modeli rekonstrukcyjnych z automatycznym doborem parametrów w oparciu o kryterium stabilności modelu rekonstrukcyjnego pokazano w pracy [7].
2. Obiekty identyfikacji
Na rysunku 1 pokazano uogólniony schemat blokowy zastosowania proponowanego
podejścia. Źródłem informacji pierwotnej moŜe być obiekt rzeczywisty lub symulator
tego obiektu. Na wyjściu systemu selekcji otrzymujemy wektor sygnałów (procesów)
losowych. Skład tego wektora zaleŜy od typu zadania identyfikacji (jednowymiarowa
B(t)=X(t) lub dwuwymiarowa B(t)=[X(t),Y(t)]). Obiekty identyfikacji przedstawione są na wyjściach bloku identyfikacyjnego (On-line Visual Identification Toolbox).
Rysunek 1. Schemat ogólny
ZałoŜono, Ŝe wektor B(t) posiada cechy stacjonarności, czyli Ŝe obiekty identyfikacji
nie zaleŜą od czasu bieŜącego t. Dla jednego procesu losowego X(t) obiektami identyfikacji mogą być charakterystyki HX(p) w zakresie amplitud (p=x), czasu opóźnienia
(p=τ) oraz częstotliwości (p=ω).
3. Zasady budowy modeli
Wszystkie charakterystyki identyfikowane mogą być funkcjami jednej lub dwóch
zmiennych. W tabeli przedstawiono trzy przykłady charakterystyk HX(p) w zakresie
amplitud, czasu opóźnienia oraz częstotliwości. Charakterystyki te są funkcjami jednej
zmiennej i zaleŜą od dwóch parametrów a1 oraz a2.
W trakcie identyfikacji moŜna określić wartości funkcji H X ( p ) tylko dla konkretnych wartości argumentu p (w punktach analizy). W związku z tym powstają problemy
rekonstrukcji funkcji na podstawie poszczególnych punktów [za pomocą metod interpolacji i (lub) aproksymacji]. Przede wszystkim trzeba określić liczby punktów analizy
oraz odległości między punktami. Jest to problem optymalizacji, którego skutecznym
rozwiązaniem moŜe być sposób heurystyczny zrealizowany przez uŜytkownika z
uwzględnieniem otrzymywanej aktualnej informacji w postaci graficznej (w trybie online).
Prowadzi to do konieczności budowy dwóch typów modeli:
 modeli do analizy punktowej;
 modeli do rekonstrukcji charakterystyki identyfikowanej na podstawie wyników
analizy punktowej.
Tabela 1. Przykłady charakterystyk H_X(p)

1) H_X(p) = f_X(x) = \frac{1}{\sigma_X \sqrt{2\pi}} \exp\left(-\frac{(x-\mu_X)^2}{2\sigma_X^2}\right),  p = x;   parametry: a_1 = \mu_X, a_2 = \sigma_X
2) H_X(p) = K_X(\tau) = \exp(-a|\tau|)\cos 2\pi f_0 \tau,  p = \tau;   parametry: a_1 = a, a_2 = f_0
3) H_X(p) = G_X(f) = \frac{2a}{a^2 + 4\pi^2 (f+f_0)^2} + \frac{2a}{a^2 + 4\pi^2 (f-f_0)^2},  p = f;   parametry: a_1 = a, a_2 = f_0
4. Model do analizy punktowej
Jak pokazano w artykułach [4, 5], podstawowymi elementami analizy punktowej w trybie on-line są filtry, które mogą być (w zależności od specyfiki konkretnego zadania identyfikacji) filtrami różnej przepustowości (dolno-, górno- albo środkowoprzepustowe) oraz różnych typów (Butterwortha, Czebyszewa, Bessela).
W ogólnym przypadku parametry modeli do analizy punktowej w postaci filtrów są parametrami zmiennymi, przy czym zmianą tych parametrów kieruje użytkownik (w trybie on-line).
5. Modele do rekonstrukcji identyfikowanej charakterystyki
Analiza możliwości rekonstrukcji w trybie on-line prowadzi do wniosku, że temu celowi najbardziej odpowiadają modele optymalizacyjne [4, 5]. Aby określić model identyfikacyjny, należy wykonać trzy kroki:
– Zbudować funkcję celu F(a_1, a_2, ..., a_n), której punkt minimum odpowiada rozwiązaniu zadania identyfikacji. Funkcja F(a_1, a_2, ..., a_n) zależy od niewiadomych parametrów a_1, a_2, ..., a_n charakterystyki podlegającej identyfikacji [H_X(p), H_XY(p_1|p_2), H_XY(p_2|p_1) albo H_XY(p_1, p_2)].
– Określić, w sposób analityczny albo numeryczny, pochodne cząstkowe:

\frac{\partial F(a_1, a_2, ..., a_n)}{\partial a_i}, \quad i = 1, ..., n.     (1)

– Zbudować układ równań różniczkowych, którego rozwiązaniem są niewiadome parametry:

\frac{da_i}{dt} = -R \, \frac{\partial F(a_1, a_2, ..., a_n)}{\partial a_i}, \quad i = 1, ..., n.     (2)
Układowi (2) odpowiada poszukiwanie minimum funkcji F(a_1, a_2, ..., a_n) wzdłuż trajektorii najszybszego spadku. Szybkość spadku zależy od współczynnika R > 0 (akceleratora poszukiwania). Przy wyborze akceleratora istnieje problem jego wielkości, gdyż z jednej strony zwiększenie R powoduje wzrost szybkości poszukiwania, ale z drugiej strony powiększenie akceleratora może spowodować utratę stabilności modelu identyfikacyjnego [5]. W tej sytuacji najlepszym rozwiązaniem problemu jest automatyczne dopasowywanie parametru R bezpośrednio w toku identyfikacji komputerowej [7].
6. Wizualna rekonstrukcja cech modeli
Dla wszystkich charakterystyk identyfikowanych korzystna jest prezentacja wizualna w postaci przebiegów czasowych w trybie podziału czasu (szybkiej periodyzacji) [1-4]. Prezentacja wizualna dostarcza użytkownikowi informacji potrzebnej do podejmowania decyzji i współpracy z modelem identyfikacyjnym w trybie on-line. Pozwala to na interaktywne dopasowywanie parametrów modelu identyfikacyjnego w taki sposób, żeby wyniki identyfikacji najbardziej odpowiadały wymaganiom postawionym ze strony użytkownika (reprezentatywność, dokładność itd.).
Tryb szybkiej periodyzacji pomyślany został w taki sposób, aby wizualizacja charakterystyki następowała w czasie rzeczywistym. Ponieważ rekonstrukcja identyfikowanej charakterystyki odbywa się na podstawie analizy punktowej, czyli od punktu początkowego do końcowego, sygnałem periodyzującym jest sygnał piłokształtny. Parametrami tego sygnału są amplituda A oraz okres T.
Na rysunku 2 pokazano diagram Simulink modelu testowego do wizualizacji w trybie szybkiej periodyzacji. Wizualizowany jest sygnał opisany wzorem:

x(t) = e^{-a_1 t} \cos(a_2 t).     (3)

Rysunek 2. Model do periodyzacji sygnału (3) w środowisku Matlab-Simulink
Parametry sygnału piłokształtnego określono w sposób następujący: amplituda A = 1, okres T = 16 s. Parametry wizualizowanego sygnału określono na: a_1 = 1, a_2 = 10. Charakterystykę amplitudową tego sygnału pokazano na rysunku 3. Takie określenie parametrów sygnału piłokształtnego powoduje, że wizualizowana jest 1 sekunda przebiegu badanego sygnału. Na rysunkach 4 oraz 5 pokazano wizualizację w trybie szybkiej periodyzacji dla badanego sygnału oraz 1 sekundę jego przebiegu czasowego.
Rysunek 3. Charakterystyka amplitudowa sygnału testowego opisanego wzorem (3)
Rysunek 4. Wizualizacja w trybie szybkiej periodyzacji dla sygnału x(t) = e^{-a_1 t} cos(a_2 t) przy amplitudzie sygnału piłokształtnego A = 1
W kroku drugim zwiększona została amplituda sygnału piłokształtnego do wartości A = 2. Spowodowało to wizualizację badanego przebiegu równą 2 sekundom symulacji przebiegu czasowego (rysunki 6, 7). Okres sygnału piłokształtnego pozostał niezmieniony.
Rysunek 5. Charakterystyka amplitudowa sygnału x(t) = e^{-a_1 t} cos(a_2 t) dla czasu trwania symulacji równego 1 sekundę
Rysunek 6. Wizualizacja w trybie szybkiej periodyzacji dla sygnału x(t) = e^{-a_1 t} cos(a_2 t) przy amplitudzie sygnału piłokształtnego A = 2
Rysunek 7. Charakterystyka amplitudowa sygnału x(t) = e^{-a_1 t} cos(a_2 t) dla czasu trwania symulacji równego 1 sekundę i amplitudy A = 2
Zmiana okresu sygnału piłokształtnego wymusza zmianę parametru „Time range”
w oscyloskopie. Wartości te muszą być sobie równe, przy czym krótszy czas periodyzacji zapewnia szybszą reakcję modelu wizualizacji w trybie szybkiej periodyzacji na
zmiany wizualizowanego przebiegu.
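Dla zilustrowania opisanego mechanizmu poniżej zamieszczono prosty szkic (poza środowiskiem Simulink, w języku Python z biblioteką NumPy; wartości parametrów jak w przykładzie), w którym sygnał piłokształtny o amplitudzie A i okresie T odwzorowuje czas bieżący na cyklicznie powtarzany odcinek argumentu sygnału (3).

```python
import numpy as np

# Parametry jak w przykładzie: A = 1 (wizualizowana 1 s przebiegu), T = 16 s, a1 = 1, a2 = 10.
A, T = 1.0, 16.0
a1, a2 = 1.0, 10.0

t = np.arange(0.0, 2 * T, 0.001)                            # czas rzeczywisty symulacji
pila = A * (t % T) / T                                      # sygnał piłokształtny (amplituda A, okres T)
x_wizualizowany = np.exp(-a1 * pila) * np.cos(a2 * pila)    # x(pila): okresowo powtarzany fragment

# W ciągu jednego okresu T oscyloskop pokazuje fragment przebiegu x(t) dla t z przedziału [0, A].
print(round(pila.max(), 3), x_wizualizowany[:5])
```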
W przypadku, gdy funkcja rekonstrukcyjna ma postać wielomianu:

a_1 x + a_2 x^2 + \ldots + a_{m-1} x^{m-1} + b,     (4)

model do wizualizacji w trybie podziału czasu przyjmuje postać pokazaną na rysunku 8.
Rysunek 8. Praktyczna realizacja modelu do wizualizacji w trybie podziału czasu dla sygnału opisanego wzorem (4)
7. Przykład analizy widma sygnału
Jako przykład analizy widmowej w trybie on-line z wizualizacją w trybie podziału czasu użyto sygnału prostokątnego opisanego wzorem:

x(t) = \begin{cases} -A & \text{dla } -\pi < t < 0 \\ \;\;A & \text{dla } \;\;0 < t < \pi \end{cases}     (5)

gdzie A jest amplitudą sygnału. Rozwinięcie na składowe harmoniczne dla tego sygnału ma postać:

x(t) = \frac{4A}{\pi}\left(\sin\omega t + \frac{1}{3}\sin 3\omega t + \frac{1}{5}\sin 5\omega t + \ldots\right) = \frac{4A}{\pi}\sum_{n=0}^{\infty}\frac{1}{2n+1}\sin\big((2n+1)\omega t\big)     (6)

gdzie \omega = \frac{2\pi}{T} = 1 jest częstotliwością podstawową, z jaką przebiega sygnał prostokątny.
Ze wzoru (6) wynika, że widmo sygnału prostokątnego składa się z harmonicznych o częstotliwościach będących całkowitą, nieparzystą wielokrotnością częstotliwości podstawowej i o amplitudach malejących jak 1/(2n+1) wraz ze wzrostem częstotliwości harmonicznych. Na rysunku 9 pokazano przykładową analizę harmoniczną sygnału prostokątnego.
Rysunek 9. Analiza widmowa sygnału prostokątnego
Tabela 2. Parametry modelu do analizy widma

Punkty analizy (Hz): f_1 = 0, f_2 = 0.5, f_3 = 1, f_4 = 1.5, f_5 = 2, f_6 = 2.5, f_7 = 3, f_8 = 3.5, f_9 = 4, f_10 = 4.5, f_11 = 5, f_12 = 5.5, f_13 = 6, f_14 = 6.5, f_15 = 7, f_16 = 7.5, f_17 = 8, f_18 = 8.5, f_19 = 9, f_20 = 9.5
Liczba odcinków aproksymacji: 19
Granice symulacji zmiennej f̃ (Hz): F_1 = 0, F_2 = 9.5
Okres powtórzenia sygnału cyklicznego f̃ (s): T_0 = 5
Stała czasowa filtrów (s): T = 0.02
Parametry modelu do identyfikacji widma w trybie on-line (rysunek 10) pokazano w tabeli 2. Wizualizacja w trybie podziału czasu dokonana została na podstawie 19 odcinków interpolacji. Wyniki symulacji dla tak określonych parametrów pokazano na rysunku 11.
Rysunek 10. Model w środowisku Matlab-Simulink do identyfikacji widma wraz z rekonstrukcją charakterystyki (analizator 20-punktowy)
Rysunek 11. Wyniki analizy widmowej uzyskane z modelu do identyfikacji wraz z wizualizacją w trybie on-line
Na oscyloskopie zaobserwowano wyraźne impulsy w punktach analizy f_3 = 1 Hz, f_7 = 3 Hz, f_11 = 5 Hz, f_15 = 7 Hz i f_19 = 9 Hz. Otrzymany wynik jest zgodny z wynikami analizy teoretycznej.
8. Podsumowanie
W artykule omówiono model do wizualizacji identyfikowanej charakterystyki w trybie podziału czasu. Pokazano przykłady modeli dokonujących wizualizacji poprzez interpolację oraz za pomocą wielomianu. Modele pokazane w tym artykule, w połączeniu z modelami do rekonstrukcji charakterystyk w oparciu o wyniki analizy punktowej oraz filtrację dolnoprzepustową, pozwalają na identyfikację procesów losowych w trybie on-line. Zaproponowane modele mogą zostać zastosowane w analizie danych, metrologii i dziedzinach pokrewnych. Jako przykład zastosowania pokazano identyfikację widma w trybie on-line wraz z wizualizacją w trybie podziału czasu dla sygnału prostokątnego. Otrzymane wyniki wizualizacji są zgodne z wynikami analitycznymi. Pokazane modele są częścią autorskiego zestawu bloków dla pakietu Matlab-Simulink o nazwie „Identyfikacja wizualna".
Bibliografia
[1] Moiseev V., Ciarciński A., Makles K., Modele graficzne do identyfikacji sygnałów
w trybie on-line, 4 Sesja Naukowa Informatyki, Szczecin, 1999, pp. 147-153.
[2] Moiseev V., Makles K., Identification of random processes via on-line filters,
Seventh International Conference on Advanced Computer Systems, Szczecin, Poland, 2000, pp. 108-111.
[3] Moiseev V., Ciarciński A., Dynamic data reconstruction under on-line conditions,
Seventh International Conference on Advanced Computer Systems, Szczecin, Poland, 2000, pp. 218-221.
[4] Makles K., Random Processes Visual Identification – model building general
principles, 9th International Conference on Advanced Computer Systems, Szczecin, Poland. 2002, pp. 337-344.
[5] Makles K., Random Processes Visual Identification – models for multipoint analysis, 10th International Conference on Advanced Computer Systems, Szczecin, Poland. 2003.
[6] Makles K., Wizualna identyfikacja procesów losowych – analiza porównawcza
modeli on-line z modelami korzystającymi z szybkiej transformaty Fouriera, Roczniki Informatyki Stosowanej Wydziału Informatyki Politechniki Szczecińskiej Nr
6, Szczecin, 2004, pp.167-174.
[7] Makles K., Random Processes Visual Identification – proposal of parameters
fitting automation, 13th International Conference on Advanced Computer Systems,
Szczecin, Poland. 2006.
[8] Moiseev V., Interactive Optimization with Graphic Support, Publisher Informa,
Szczecin, Poland, 2000, ISBN 83-87362-26-3.
[9] Ljung L., System identification: theory for the user, Prentice Hall, Englewoods
Cliffs, 1999
[10] Söderström T., Stoica P., Identyfikacja systemów, Wydawnictwo Naukowe PWN,
Warszawa, 1997.
Virtual multibeam echosounder in investigations on sea
bottom modeling
Wojciech Maleika, Michał Pałczyński
Szczecin University of Technology,
Faculty of Computer Science and Information Technology
Abstract:
The article presents problems of verifying DTM creation methods on the basis of real sounding data. A concept of solving these problems by applying virtual hydrostatic sounding has been presented. A method of simulating a single-beam and a multibeam echosounder has been worked out, taking into account vessel movement and transducer parameters. Elements of ray tracing have been used in the algorithm. The proposed method may be a valuable research tool for methods of creating DTM and for planning real marine soundings.
Keywords:
digital terrain model, multibeam echo sounder, ray tracing
1. Introduction
One of the most important and at the same time most difficult tasks undertaken in
the complex process of constructing spatial information systems, is the creation of a
digital terrain model (DTM), which is the basic information layer used by systems describing phenomena in a valuable way and provides them with the basis of spatial organisation. Contemporary DTM users set high requirements, laying stress both on data
quality (accuracy, reliability, up-to-dateness), the dynamics of their processing and
visualising and the possibilities of analyses in real time.
In order to construct the DTM of a sea bottom, measurement information has to be
gathered first. Modern measurement systems with devices that make it possible to
record observation results in a continuous and fully automatic way (e.g. multibeam echo
sounders), permit the acquisition in a relatively short time of a huge amount of information about the shape of the sea bottom. [1,2]. Measurement systems register the location
and depth (spatial coordinates) of many million points during one measurement session.
The processing of such amount of data, which in addition are mostly irregularly scattered in space, requires the application of specially prepared methods and properly selected processing algorithms. The DTM is usually made on the basis of a GRID structure (a regular net of squares) (Fig. 1). There are numerous methods of determining
GRID based on measurement data, the ones most frequently applied being kriging,
minimum curvature, nearest neighbour, natural neighbour, modified Shepard’s method,
radial basis function, polynomial regression, inverse distance to a power, triangulation
with linear interpolation, and methods based on artificial intelligence [3,4]. These methods make use of a series of differentiated algorithms to establish the values of interpolated parameters at node points. The selection of interpolation method in the case of
unevenly distributed measurement data should be determined by a number of features
characterising this data set: the degree of homogeneity of data dispersion, the number of points per area unit, the population variance (degree of data changeability) and the type of surface reflected by the data [5].
An essential problem when researching and seeking optimal interpolation methods is
the impossibility of explicit verification of errors emerging in the obtained DTMs,
which results from a different representation of input data (unevenly distributed points
in spaces of real numbers) and output data (evenly distributed points in spaces of integers). In effect it is impossible to compare diverse interpolation methods with regard to
mapping accuracy.
In research currently conducted different approaches are applied to solve the situation:
 for assessing a particular method only the subjective visual estimation of created
models is taken into account,
 for the needs of research synthetic test surfaces are generated (recorded in GRID
structure), on the basis of which points are randomly drawn which constitute the
source material in further research.
The first of the methods described makes a quantitative assessment of the researched
interpolation methods impossible, whereas the other is based on synthetically generated
data which can in a large degree distort the results obtained (there frequently occurs
here a matching between the function on the basis of which the test data were generated
and the researched interpolation method; the sampled data are usually evenly distributed). Neither do the generated synthetic test surfaces fully reflect the surfaces appearing in a natural environment.
Figure 1. Source and grid data in sea bottom modelling
2. Idea of virtual marine sounding
The idea of “virtual marine sounding” described in the article is the authors’ suggestion of solving the difficulties when researching and verifying DTM creation methods.
The method presented is founded on a proper preparation of test data based on real
measurement data. The process (Fig. 2) can be divided into two basic stages:
 forming a test surface in the shape of a GRID net of high resolution, based on real
data,
 marine sounding simulation: a virtual unit equipped with a single or multibeam
sounder performing sounding in the area of the test surface created.
These are the features of the solution proposed:
 the measurement points obtained in result of virtual sounding have a form identical
as the real data: uneven distribution of points, description in the domain of real
numbers,
 simple verification of DTM models obtained in further research: a comparison between the obtained GRID structure with the test GRID net,
 flexible possibility of controlling the virtual sounding parameters: vessel speed,
single or multibeam sounder, sounder work parameters, selection of sounding profiles.
Figure 2. Data processing in virtual sounding
3. Method description
The presented method of virtual sounding is based on an approach known as ray
tracing [6, 7]. It consists in the assumption that a narrow acoustic beam can be
represented as a line called ray, with a course corresponding to a wave propagation
trajectory, permitting simulation of wave propagation both in the spatial and temporal
aspect, by means of relatively simple mathematical models. This approach was successfully applied for a synthesis of simulated sonar images [8,9,10]. The sonar’s broad
acoustic beam was divided into narrow subspaces in these applications and modelled by
means of a pencil of numerous rays. Applying the ray tracing method seems all the
more suitable for simulating the work of both a vertical and a multibeam echo sounder,
since the acoustic beams applied in them are very narrow, which permits assigning one
single ray to each of them. The idea of virtual sounding requires the echo sounder simulator to generate a set of points corresponding to the points of hitting submarine targets
by the acoustic impulses; from the point of view of this application, there is no need to
determine the echo intensity or the beam’s return time.
The space of virtual sounding is located in a Cartesian coordinates system, where the
coordinates X and Y correspond to the coordinates in the UTM system, whereas coordinate Z denotes depth, axis Z being directed upwards; this requires a change of coordinate sign from the generated points, as real echo sounders generate depths in the form of
positive numbers.
3.1. Input data
The presented method of virtual sounding requires defining the following input data:
 DTM in the form of a high-resolution GRID net: The grid’s sides are always placed
parallel to axes X and Y. Nx and Ny – number of grid nodes located along both axes,
xmin, ymin, xmax, ymax – coordinates of grid vertices, Zij, (i=1..Nx, j=1..Ny) – depth in a
particular node.
 Parameters of the multibeam echosounder transducer:
Nbeam – number of beams, ∆α – obtuse angle formed by the extreme beams, T[s] –
time between emission of successive acoustic impulses, R0=(x0, y0, z0) – echo
sounder’s location in relation to the position determined for the vessel. It was assumed in the method presented that all beams were of equal width, all generated the
impulse at the same time and that none of them affected the work of the other
beams.
 Parameters of vertical echo sounder transducer:
T[s] - time between emission of successive acoustic impulses, R0=(x0, y0, z0) - echo
sounder’s location in relation to the position determined for the vessel.
 The route of the vessel is given in the form of sequence Nposition of positions expressed as coordinates in UTM system. The course must also be determined for each
position.
Pship=(xship, yship) – location of the vessel for a given position, Rship = [kxship,
kyship] – course of the vessel for a given position, with |Rship| = 1. In the case of
artificial route synthesis the vessel’s speed and parameter T of the transducer must
be taken into account in order to determine proper distances between successive positions.
The proposed resolution of the initially generated GRID net equals 0.1m x 0.1m.
Typical parameters of the multibeam echo sounder exemplified by SIMRAD EM 3000
[11] are as follows: 128 beams, obtuse angle 150°, and the time between impulses
equals 0.04s.
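A small sketch of how the input data listed above could be represented in code. The names and types are the sketch author's assumptions, not the authors' implementation; the example values correspond to the SIMRAD EM 3000 parameters quoted in the text.

```python
import math
from dataclasses import dataclass

@dataclass
class MultibeamTransducer:
    n_beams: int = 128                            # number of beams
    delta_alpha: float = math.radians(150.0)      # angle between the extreme beams [rad]
    pulse_period: float = 0.04                    # T [s], time between successive acoustic impulses
    offset: tuple = (0.0, 0.0, 0.0)               # R0 = (x0, y0, z0), location relative to the vessel

@dataclass
class GridDTM:
    x_min: float                                  # coordinates of the grid origin (UTM)
    y_min: float
    cell: float                                   # grid resolution, e.g. 0.1 m
    depths: list                                  # Nx x Ny array of node depths Z_ij (negative values)

sounder = MultibeamTransducer()
print(round(math.degrees(sounder.delta_alpha)), "degrees,", sounder.n_beams, "beams")
```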
3.2. Algorithm of virtual sounding with multibeam echo sounder
Virtual sounding requires tracking the propagation of an acoustic beam set of the
echo sounder for each of successive Nposition vessel’s positions. The aim of tracking the
beam is calculating the coordinates of the point of its hitting the bottom. The result of
the whole algorithm’s functioning is finding the set of all points, at which particular
beams hit the bottom for successive vessel positions.
The beams of a multibeam echo sounder are emitted from transducers located at
such short distances from one another, that their locations can be considered as one
point in space; its location for a given position of the vessel can be calculated as follows:
P_{transducer} = \begin{bmatrix} x_{transducer} \\ y_{transducer} \\ z_{transducer} \end{bmatrix} = P_{ship} + R_0 = \begin{bmatrix} x_{ship} + x_0 \\ y_{ship} + y_0 \\ z_0 \end{bmatrix}     (1)
The axes of all multibeam echo sounder transducers lie in a plane perpendicular to
the course. In the method presented it is assumed that the angles between them are equal
and the axis of the central beam is directed vertically downwards. The other beams are
distributed symmetrically on both sides of the vessel and the number of beams Nbeam and
angle ∆α determined by axes of the extreme beams are the echo sounder’s technical
parameters. As the plane where the transducers are located is perpendicular to the
course, the vector located in horizontal plane, directed perpendicularly to port side is
calculated (Fig. 3a):

R_{transducer} = \begin{bmatrix} kx_{transducer} \\ ky_{transducer} \\ kz_{transducer} \end{bmatrix} = \begin{bmatrix} -ky_{ship} \\ kx_{ship} \\ 0 \end{bmatrix}     (2)
The directional vector Rα of the axis of each of the Nbeam beams is calculated according to the formulas:
Rα = (kxα, kyα, kzα)^T = (kxtransducer × cos(α), kytransducer × cos(α), −tan(α) × cos(α))^T, for α ≠ π/2,  (3)
Rα = (0, 0, −1)^T, for α = π/2,  (4)
α = α0 + i × dα,  i = 0…(Nbeam − 1),  i ≠ (Nbeam − 1)/2,
α0 = (π − ∆α)/2,  dα = ∆α/(Nbeam − 1).  (5)
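Read together, equations (2)–(5) fully determine the beam geometry for one vessel position. The following Python fragment is a minimal sketch of this computation; the function and parameter names are illustrative and are not part of the original implementation.

import math

def beam_directions(n_beam, delta_alpha, course):
    """Directions R_alpha of all beam axes for one vessel position.

    course = (kx_ship, ky_ship) is the unit course vector; following
    eqs. (2)-(5) the central beam points straight down and the others
    fan out in the plane perpendicular to the course."""
    kx_t, ky_t = -course[1], course[0]            # R_transducer, eq. (2)
    alpha0 = (math.pi - delta_alpha) / 2.0        # eq. (5)
    d_alpha = delta_alpha / (n_beam - 1)
    rays = []
    for i in range(n_beam):
        alpha = alpha0 + i * d_alpha
        if abs(alpha - math.pi / 2.0) < 1e-12:    # central beam, eq. (4)
            rays.append((0.0, 0.0, -1.0))
        else:                                     # eq. (3)
            c = math.cos(alpha)
            rays.append((kx_t * c, ky_t * c, -math.tan(alpha) * c))
    return rays

# Example: SIMRAD EM 3000-like transducer, vessel heading north
directions = beam_directions(128, math.radians(150.0), (0.0, 1.0))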
Each beam is assigned a ray representing the wave propagation trajectory. In the presented method the medium is assumed to be homogeneous, so the trajectory of acoustic wave propagation is rectilinear; this assumption is close to reality for shallow waters. The ray starts at the point Ptransducer and its direction coincides with the beam axis Rα (Fig. 3b).
Figure 3. Determination of ray direction: (a) top view, (b) rear view
The ray's parametric equation is:
Pα = Ptransducer + t × Rα,  t > 0.  (6)
Ray tracing consists in searching for intersections of the ray with the objects making up the bottom; if intersections are found, the closest one is selected, and the coordinates of this point constitute the result of the algorithm for a given position and beam.
The bottom surface given by the GRID net consists of meshes determined by four vertices which are usually not coplanar. In order to obtain elementary surfaces in the form of flat polygons, each mesh of the GRID net is divided into two triangles (Fig. 4a). The test for the ray intersecting a triangle and the calculation of the coordinates of the hit point are carried out with a method known from image synthesis: the intersection point of the ray with the plane defined by the triangle vertices is determined, and then it is checked whether the projection of the intersection point onto plane OXY lies inside the projection of the triangle [12,13]. The complete algorithm of virtual sounding is presented in Fig. 6.
In the general case the ray tracing algorithm requires checking the intersections with all bottom model objects [12]. Due to the specificity of a GRID net and the regularity of the set of traced rays, however, the number of grid meshes tested for intersection can be considerably reduced [14], which improves the method's efficiency. The most important technique of decreasing the number of intersection tests is choosing the set of meshes located exactly "under" the ray, i.e. those whose projections onto the XY plane are intersected by the projection of the ray (Fig. 4b).
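The intersection test described above (plane intersection followed by a 2D point-in-triangle check on the OXY projection) can be sketched in Python as follows; this is an illustrative implementation under the stated assumptions, not the authors' code.

import numpy as np

def ray_triangle_hit(origin, direction, tri):
    """Intersection of ray P = origin + t*direction (t > 0) with triangle
    'tri' (3x3 array of vertices): intersect the triangle's plane, then
    check whether the XY projection of the hit point lies inside the XY
    projection of the triangle."""
    origin = np.asarray(origin, dtype=float)
    direction = np.asarray(direction, dtype=float)
    a, b, c = np.asarray(tri, dtype=float)
    normal = np.cross(b - a, c - a)
    denom = np.dot(normal, direction)
    if abs(denom) < 1e-12:                  # ray parallel to the plane
        return None
    t = np.dot(normal, a - origin) / denom
    if t <= 0.0:                            # intersection behind the transducer
        return None
    p = origin + t * direction

    def cross2(u, v, w):                    # z-component of (v-u) x (w-u) in OXY
        return (v[0]-u[0])*(w[1]-u[1]) - (v[1]-u[1])*(w[0]-u[0])

    s1, s2, s3 = cross2(a, b, p), cross2(b, c, p), cross2(c, a, p)
    inside = (s1 >= 0 and s2 >= 0 and s3 >= 0) or (s1 <= 0 and s2 <= 0 and s3 <= 0)
    return p if inside else None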
Figure 4. Intersection testing: (a) triangulation, (b) choosing meshes for intersection tests using projection onto the XY plane
3.3. Algorithm of virtual sounding with vertical echo sounder
A vertical echo sounder generates a single acoustic beam directed vertically downwards in relation to the vessel's deck, which significantly simplifies the algorithm of determining the points where the wave hits the bottom. The transducer's position Ptransducer is calculated according to equation (1). Assuming that the vessel is not subject to roll, the direction of the beam axis always equals Rα = [0, 0, −1]. Since the GRID net is defined in a plane perpendicular to the beam axis, the intersected grid mesh is always the one containing the projection of the vessel's position, so it suffices to check the intersection of the ray with starting point Ptransducer and direction Rα against one or two triangles (Fig. 5).
Figure 5. Intersection test for the vertical echo sounder
Figure 6. Algorithm of virtual sounding (flowchart: for each vessel position the ship's position and course Pship, Rship are read, the transducer position Ptransducer (1) and the vector Rtransducer (2) are calculated; then for each beam i = 0…Nbeam−1 the ray direction Rα(i) is determined from (3)–(5), the nearest intersection of the ray (6) with the GRID net is sought and, if found, the intersection point P = (x, y, z) is recorded; the procedure is repeated until the last position is processed)
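Building on the two earlier sketches (beam_directions and ray_triangle_hit), the overall loop of Fig. 6 could look as follows; triangles_under_ray is an assumed helper returning only the meshes selected as in Fig. 4b, and the whole listing is a sketch rather than the authors' implementation.

import numpy as np

def virtual_sounding(positions, courses, r0, n_beam, delta_alpha, triangles_under_ray):
    """Sketch of the loop from Fig. 6: returns the bottom hit points for all
    positions and beams."""
    hits = []
    for p_ship, r_ship in zip(positions, courses):
        origin = np.array([p_ship[0] + r0[0], p_ship[1] + r0[1], r0[2]])  # eq. (1)
        for direction in beam_directions(n_beam, delta_alpha, r_ship):
            best, best_d = None, float("inf")
            for tri in triangles_under_ray(origin, direction):
                p = ray_triangle_hit(origin, direction, tri)
                if p is not None:
                    d = float(np.dot(p - origin, p - origin))
                    if d < best_d:                 # keep the nearest intersection
                        best, best_d = p, d
            if best is not None:
                hits.append(best)                  # one bottom point per beam and position
    return hits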
4. Conclusion
The method of virtual sounding presented in this work can be a valuable tool supporting research on DTM creation methods and on methods of planning sounding work. Its application permits:
 assessing the accuracy of various interpolation methods (GRID nets in particular)
with the application of real sounding data,
 researching the effect of sounding work parameters such as profile density, vessel
speed, vessel movement trajectory etc. on bottom model accuracy,
 researching the effect of echo sounder parameters on bottom model accuracy,
 comparing the accuracy of bottom models built on the basis of soundings using single-beam and multibeam echo sounders.
In the course of further research on the virtual sounding method, consideration can be given to:
 heterogeneity of the medium in the form of a sound velocity profile dependent on depth, which will make it necessary to track the non-linear trajectory of wave propagation,
 disturbances such as research vessel roll, position and course errors, and measurement unreliability of the echo sounder.
References
[1] Grabiec D. Marine Hydrography – Quo vadis? A Brief Look at Development Directions of Hydrographic Measurement Ways and Means,
http://pl.wikisource.org/wiki/Hydrografia_morska, 2008 [In Polish].
[2] Fourgassie A. L’hydrographie moderne. Cols Bleus 6, pp 10-13, 2000.
[3] Stateczny A., Kozak M. Space Modelling of the Sea Bottom Shape Using the Geostatistic Method. Polish Journal of Environmental Studies, vol. 14, Supplement I,
2005.
[4] Łubczonek J. Hybrid Neural Model of the Sea Bottom Surface. Artificial Intelligence and Soft Computing – ICAISC 2004, Springer, Berlin/Heidelberg, 2004.
[5] Davis J. C. Statistics and Data Analysis in Geology. John Wiley & Sons, New
York, 1986.
[6] Coates R. Underwater Acoustic Systems. Macmillan New Electronics, London
1990.
[7] Etter P. Underwater Acoustic Modeling. Elsevier Applied Science, London & New
York, 1991.
[8] Bell J. M., Linnett L. M. Simulation and Analysis of Synthetic Sidescan Sonar Images. IEE Proceedings, Radar, Sonar and Navigation, 1997.
[9] Pałczyński M. A Method of Synthesis of Simulated Sonar Images for the Purpose
of Comparative Navigation. Ph. D. thesis, Szczecin University of Technology,
Szczecin 2008 [In Polish].
[10] Stateczny A., Pałczyński M. Synthesis of Simulated Sonar Images by Means of
Acoustic Rectilinear Rays. Polish Journal of Environmental Studies, vol. 15, No.
4C, 2006.
[11] Kongsberg Maritime, High Resolution Focused Multibeam Echo Sounders, http://www.hydrographicsociety.nl/documents/hydrographicsociety/downloads/km%20multibeam.pdf, 2008.
[12] Glassner A. An Introduction to Ray-Tracing. Academic Press, London, 1989.
[13] Zabrodzki J. Computer Graphics. Wydawnictwa Naukowo-Techniczne, Warszawa,
1994 [In Polish].
[14] Pałczyński M. Adaptation of the Ray-Tracing Algorithm to the Needs of Synthesis
of Simulated Sonar Images, X Sesja Naukowa Informatyki (10th IT Scientific Session) Szczecin, 2005 [In Polish].
HDRLIB: a library for fast processing of HDR images using advanced CPU capabilities
Radosław Mantiuk
Politechnika Szczecińska, Wydział Informatyki
Abstract:
Developments in high dynamic range imaging (HDRI), especially in display and camera technology, have a significant impact on the broader usage of HDR data in image processing, analysis and synthesis. The typical 8-bit graphics pipeline (each of the red, green and blue colour channels stored in 8 bits) is being replaced with the HDR pipeline, in which colour is represented by floating point values. The HDRI pipeline does not suffer from many of the problems of 8-bit systems; the most important advantage of HDRI is the possibility of storing the full range of luminance and chrominance visible to the HVS (Human Visual System). Unfortunately, HDR images are significantly larger than those in 8-bit systems, therefore the speed and effectiveness of their processing are especially important for practical applications. In the paper we propose using SIMD and multi-threading CPU technologies to speed up HDRI processing. We present a new HDRI architecture in which attention was paid to memory access optimization, effectiveness of vector data processing and parallelization of computations. We test the proposed solution on a novel, original implementation and discuss the achieved speed-ups.
Keywords:
SIMD, multi-threading programming, CPU programming, HDR images, image
processing and analysis.
1. Introduction
The HDRI (High Dynamic Range Imaging) technology [1] is an extension and generalization of commonly used image processing and analysis algorithms. The image data used in it are recorded with a precision corresponding to the accuracy of the human visual system (HVS). Thanks to this, processing methods can be oriented towards obtaining images of perceptual quality rather than, as in the case of conventional LDR (Low Dynamic Range) images, towards meeting the technical requirements of devices (e.g. monitors) [3]. HDRI technology replaces the standard graphics pipeline, offering more accurate and more universal solutions [8][7].
The measure of the practical applicability of image processing algorithms in software is not only the quality of the obtained images but also the execution time. HDR image data are represented by floating point numbers, which significantly increases their size compared to standard images. For example, a single LDR pixel occupies 24 bits of data (8 bits per RGB colour component), whereas an HDR pixel is stored as three floating point numbers occupying 96 bits of memory. The increased data size requires more computing power to process it.
The paper presents the use of native CPU features, such as multiple cores and vector instructions [2], to accelerate HDRI algorithms. The proposed solutions differ substantially from the standard image processing pipeline, since apart from floating point computations they require specific HDRI operations (e.g. tone mapping [5]).
The described HDRI methods were implemented in the authors' own software, the HDRLIB library. Apart from parallelization at the process level and at the data level, a number of programming techniques accelerating the computations were applied in the library. Examples include memory management based on processing image fragments in the fast CPU cache, and operation queueing which increases the effectiveness of SIMD (Single Instruction Multiple Data) vector instructions. The paper presents the results of testing the HDRLIB library and compares its operation with standard image processing and analysis techniques.
Examples of existing software that uses SIMD instructions and multi-threading to accelerate computations include: the VIPS (VASARI Image Processing System) library used for processing large images [6], Intel's IPP (Integrated Performance Primitives) tools supporting the compilation of multi-threaded programs [9], the IPT (Image Processing Library) [10] supporting image processing, the GENIAL (GENeric Image Array Library) package [11] accelerating signal processing algorithms, and the ITK (Insight Segmentation and Registration Toolkit) [12] and MITK (Medical Imaging ToolKit) [13] libraries used in medical imaging. None of these packages, however, is intended for processing HDR images or supports native HDRI algorithms.
The paper omits issues related to the use of GPUs (Graphics Processor Units) in HDR imaging [4]. Although GPUs offer much greater parallelization capabilities than CPUs, their practical application is limited by the lack of standardized functionality, whereas the leading goal behind the HDRLIB library was its commercial application.
The second and third sections of the paper present the use of CPU capabilities to accelerate HDR image processing. The HDRLIB library and the results of its testing are described in section four. Section five summarizes the paper and indicates directions for further work.
2. Using SIMD instructions and multi-threaded processing
SIMD technology was first applied in the 1960s in supercomputers (for this reason called vector computers), but its widespread use became possible only in the 1990s with the appearance of Intel Pentium processors with the MMX (MultiMedia eXtensions) extension. That SIMD instruction set was not yet suitable for processing HDR images, because it lacked floating point instructions. These were introduced in subsequent versions of the extension, named SSE (Streaming SIMD Extensions) (SSE, SSE2, SSE3, SSSE3, SSE4), and in their counterparts implemented by other processor manufacturers (e.g. AltiVec from Motorola and IBM). Currently every CPU manufacturer offers a set of SIMD instructions implementing mathematical and logical operations as well as memory access. An important feature of the SIMD extensions is control over writing data to the cache (avoiding cache pollution). SSE instructions process four single precision (4-byte) floating point numbers in one clock cycle, allowing a fourfold speed-up of computations.
Modern CPUs are built of several cores (multi-core CPUs), thanks to which they are functionally similar to multi-processor computers. Applications running on them can perform computations using multi-threading, which yields speed-ups close to the number of cores and processors in the system.
SIMD instructions and CPU multi-threading can be effectively used to accelerate HDR image processing. Examples of time-consuming HDRI operations are matrix computations such as matrix multiplication and addition, multiplication of a matrix by a scalar, raising matrix elements to a power, taking their logarithms, transposition and matrix inversion. Accumulation operations are also troublesome: computing the sum of pixels in an image, finding maximum and minimum values, computing arithmetic and geometric means. HDRI technologies frequently execute conditional instructions masking groups of pixels or colour channels, and use floating point LUTs (Look-Up Tables). These simple operations are used to build complex algorithms such as image scaling and rotation, colour conversions, histogram computation, construction of a Gaussian pyramid, convolution and computation of statistical coefficients.
3. Accelerating HDR image processing
Using SIMD instructions to accelerate HDR image processing consists in simultaneously performing an operation on all four colour channels (RGBA) or, in the case of HDR images containing only a luminance channel, in processing four pixels at once. The SIMD instruction set available in processors does not offer all operations needed in HDR image processing; for example, the SSE standard (up to SSSE3) lacks exponentiation and logarithm instructions. The proposed solution to this problem is to approximate the logarithm and power functions with numerical series [7]. Thanks to SIMD instructions such approximations are computed very efficiently.
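As an illustration of the idea (and not of the HDRLIB code itself), the sketch below approximates log2 on a vector of pixel values by splitting each value into mantissa and exponent and evaluating a truncated series on the mantissa; production SIMD code would use tuned minimax coefficients instead of plain Taylor terms.

import numpy as np

def log2_approx(x, terms=6):
    """Illustrative vectorised log2 via mantissa/exponent split and a
    truncated series; accuracy improves with more terms or better
    coefficients."""
    m, e = np.frexp(np.asarray(x, dtype=np.float32))   # x = m * 2**e, m in [0.5, 1)
    t = m - 1.0                                        # ln(m) = ln(1 + t)
    ln_m = np.zeros_like(t)
    for k in range(1, terms + 1):                      # t - t^2/2 + t^3/3 - ...
        ln_m += ((-1.0) ** (k + 1)) * t ** k / k
    return e + ln_m / np.log(2.0)

pixels = np.array([0.25, 1.0, 3.5, 100.0], dtype=np.float32)  # four SIMD "lanes"
print(log2_approx(pixels))   # compare with np.log2(pixels)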
Figure 1 presents the HDR data processing pipeline used in the HDRLIB library. Data exchange between the CPU and RAM is the bottleneck of the algorithms, therefore in the proposed solution the image is divided into fragments whose size does not exceed the size of the processor's L1 cache. In this way time-consuming accesses to the main RAM are avoided. The fragments are processed independently of one another. In the processing pipeline as many threads are started as the operating system offers; when the computations associated with a given thread are finished, the next image fragment is handed over to it.
Figure 1. Accelerating HDR image processing based on SIMD instructions, multi-threaded computation and operation queueing
Another technique applied to minimize data exchange with RAM is the operation queueing mechanism. Pixel data fetched from memory are subjected to several operations in sequence, with the intermediate results kept in processor registers; only the final result is written back to RAM. The queueing mechanism requires a non-standard programming style, in which the actual execution of an operation takes place not at the moment a function is called but only after a special function executing the queue is invoked. Thanks to operation queueing it is additionally possible to analyse and optimize the instructions in the queue before they are executed [7].
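The tile-based, multi-threaded pipeline with deferred (queued) operations described above can be modelled conceptually as follows. The sketch uses Python and NumPy purely to illustrate the control flow; the class and parameter names are invented for this example and do not correspond to the HDRLIB API.

import numpy as np
from concurrent.futures import ThreadPoolExecutor

class TileQueue:
    """Conceptual model of the pipeline: operations are only queued when
    "called", and each tile is pulled through the whole queue in one pass
    before its result is written back."""
    def __init__(self):
        self.ops = []

    def add(self, fn):                     # queue an operation instead of running it
        self.ops.append(fn)
        return self

    def run(self, image, tile_rows=64, workers=4):
        out = np.empty_like(image)

        def process(r0):                   # one tile: apply all queued operations
            tile = image[r0:r0 + tile_rows]
            for fn in self.ops:
                tile = fn(tile)
            out[r0:r0 + tile_rows] = tile  # single write-back per tile

        with ThreadPoolExecutor(max_workers=workers) as pool:
            list(pool.map(process, range(0, image.shape[0], tile_rows)))
        return out

hdr = np.random.rand(512, 512, 3).astype(np.float32)
q = TileQueue().add(lambda t: t * 0.18).add(np.log1p)   # scale, then logarithm
ldr = q.run(hdr)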
4. Implementation and testing of the HDRLIB library
The HDRLIB library is intended for fast processing of HDR images. It employs the described mechanisms of accelerating computations, in particular data-level parallelization (SSE instructions), multi-threading, management of RAM access, and operation queueing.
The library was implemented in C++ as a dynamic library compiled under MS Windows. Library functions are called through an API written in C, with a structure similar to the OpenGL API. The implementation automatically detects the processor's capabilities and uses the multi-threading and SSE instructions offered by the CPU. Thread handling and synchronization are based on the Win32 Event System mechanism provided by the operating system.
The library was tested on a sample HDR image with a resolution of 3088x2056 pixels. Using a single image was sufficient, because the speed of the tested algorithms does not depend on the image content and scales linearly with the number of pixels in the image. The aim of the tests was to determine the computation speed-up with respect to the standard version of the software without the accelerating mechanisms. It was assumed that the accelerated algorithms must preserve the original image quality.
Selected basic HDR image processing operations were tested: taking logarithms, raising to a power, conditional instructions, and computing the minimal and maximal luminance in the image. A complex HDRI algorithm was also tested: tone compression (the gamma operator and the logarithmic operator [1]).
The test results are presented in Table 1. A particularly large speed-up (resulting from the efficient implementation of the approximation functions) is visible for operations using logarithms and exponentiation (rows 1 and 2). Operations with conditional expressions show smaller speed-ups, because with SIMD instructions it is impossible to skip all masked pixels (in a four-element vector all pixels are processed, even if three of them are masked) (row 5). Complex operations such as tone operators using accumulation operations (computation of minimal and maximal luminance) are executed several times faster (rows 3 and 4).
The implemented library was also compared with results obtained with the popular VIPS library [6]. For example, the logarithmic mean is computed by VIPS 11 times slower than by the HDR pipeline implementation presented in the paper. This fact underlines the benefits of developing a library specialized in HDR image processing.
Table 1. Results of the HDR data processing acceleration tests. The tests were carried out on a PC with an Intel Core 2 Quad Q6600 processor, 2.4 GHz, 4 GB RAM. For each operation the execution time is given for the standard single-core CPU implementation and for the HDRLIB library (SIMD) running on one and on four cores; the last value is the speed-up factor of the four-core library version with respect to the standard CPU version.
1. Exponentiation: 3734 ms (CPU, one core), 313 ms (HDRLIB, one core), 141 ms (HDRLIB, four cores); speed-up 26.48.
2. Logarithm: 2641 ms (CPU, one core), 156 ms (HDRLIB, one core), 63 ms (HDRLIB, four cores); speed-up 41.92.
3. Exponential tone mapping (computation of the minimal and maximal luminance in the image, gamma operator with exponent 1.8, normalization of the result and writing to memory in the BGRA8888 format): 1703 ms (CPU, one core), 375 ms (HDRLIB, one core), 210 ms (HDRLIB, four cores); speed-up 8.12.
4. Logarithmic tone mapping (computation of the minimal and maximal luminance in the image, logarithmic operator, normalization of the result and writing to memory in the BGRA8888 format): 1157 ms (CPU, one core), 375 ms (HDRLIB, one core), 203 ms (HDRLIB, four cores); speed-up 5.69.
5. Logarithmic tone mapping with masking (50% of the pixels are not included in the computations): 953 ms (CPU, one core), 297 ms (HDRLIB, one core), 297 ms (HDRLIB, four cores); speed-up 3.20.
5. Summary
The paper presented the use of advanced hardware mechanisms offered by CPUs to accelerate the processing of HDR images. Thanks to the application of multi-threading, SIMD instructions, effective memory management and the operation queueing mechanism, a multiple speed-up of computations was obtained in comparison with the standard way of implementing the algorithms (e.g. more than 8-fold for the logarithmic tone mapping operation).
The presented solutions were implemented in the commercial HDRLIB library.
The authors plan further development of the library, in particular exploiting the capabilities of new CPUs successively appearing on the market (an example is the SSE4 standard, which among other things offers fast computation of a weighted sum, useful in image processing). The library will also be extended with new HDRI algorithms, including local tone operators.
References
[1] Reinhard E., Ward G., Pattanaik S., Debevec P. High Dynamic Range Imaging.
Data Acquisition, Manipulation, and Display. Morgan Kaufmann, 2005.
[2] Gummaraju J., Rosenblum M. Stream programming on general-purpose processors. Proceedings of the 38th annual IEEE/ACM International Symposium on
Microarchitecture, Barcelona, Spain, 2005, pp. 343-354.
[3] Mantiuk R., Krawczyk G., Mantiuk R., Seidel H.P. High Dynamic Range Imaging
Pipeline: Perception-motivated Representation of Visual Content. In: Proc of SPIE
– Volume 6492. Human Vision and Electronic Imaging XII. 649212.
[4] Mantiuk R., Tomaszewska A., Pająk D. Wykorzystanie procesorów graficznych do szybkiego przetwarzania obrazów HDR. Pomiary Automatyka Kontrola 7'2007, ISSN 0032-4110, pp. 106-108 [In Polish].
[5] Reinhard E., Stark M., Shirley P., Ferwerda J. Photographic Tone Reproduction for Digital Images. ACM Trans. on Graph., vol. 21, no. 3, pp. 267-276, 2002.
[6] Martinez K., Cupitt J. VIPS – a highly tuned image processing software architecture. In Proceedings of IEEE International Conference on Image Processing 2, pp.
574-577, Genova, 2005.
[7] Mantiuk R., Pająk D. Acceleration of High Dynamic Range Imaging Pipeline Based on Multi-threading and SIMD Technologies. Lecture Notes in Computer Science, vol. 5101, no. I, 2008, Poland, pp. 780-789.
[8] Mantiuk R., Krawczyk G., Mantiuk R. High Dynamic Range Imaging Pipeline: Merging Computer Graphics, Physics, Photography and Visual Perception. Proc. of Spring Conference on Computer Graphics (poster materials), 20-22.04.2006, Casta Papiernicka, Slovakia, pp. 37-41.
[9] Taylor S. Intel Integrated Performance Primitives Book. ISBN 0971786135, ISBN-13 978-0971786134, 2004.
[10] Harmonic Software Inc. IPT – The Image Processing Toolbox for O-Matrix.
http://www.omatrix.com/ipt.html
[11] Laurent P. GENIAL – GENeric Image Array Library.
http://www.ient.rwth-aachen.de/team/laurent/genial/genial.html
[12] Ibanez L., Schroeder W., Ng L., Cates J. The ITK Software Guide. Second Edition.
Kitware, Inc. Publisher, November 2005.
[13] Zhao M., Tian J., Zhu X., Xue J., Cheng Z., Zhao H. The Design and Implementation of a C++ Toolkit for Integrated Medical Image Processing and Analysis. In
Proc. of SPIE Conference, V.6 5367-4, 2004.
MPS(3N) transparent memory test for Pattern Sensitive
Fault detection
Ireneusz Mrozek, Eugenia Busłowska, Bartosz Sokół
Bialystok Technical University, Faculty of Computer Science
Abstract:
Conventional memory tests based on only one run have constant and low fault coverage, especially for Pattern Sensitive Faults (PSF). To increase fault coverage, multiple run March test algorithms have been used. As has been shown earlier, the key element of multiple run March test algorithms is the set of memory backgrounds; only for an optimal set of backgrounds can high fault coverage be achieved. For such optimal backgrounds, the analytical calculation of NPSFk fault coverage for 3 and 4 runs of the MPS(3N) test is presented in this paper. All of the analytical calculations are confirmed and validated by adequate experiments.
Keywords:
transparent memory tests, memory testing, pattern sensitive faults, march tests
1. Introduction
It is becoming highly important to test various kinds of defects rapidly and precisely in order to improve the quality of modern memories, especially RAM (Random Access Memory) in a SoC (System-on-a-Chip) design environment. RAM testing is quickly becoming a more difficult issue with the rapidly increasing capacity and density of RAM chips. Faults modeled from memory defects have been summarized in [1], [2]. The neighborhood pattern sensitive fault (NPSF) model is not new, but it is still widely discussed in the memory testing literature and is becoming more and more important for memory testing. Traditional March algorithms [1] have been widely used in RAM testing because of their linear time complexity, high fault coverage, and ease of built-in self-test (BIST) implementation. It is known that the traditional March algorithms do not generate all neighborhood patterns that are required for testing the NPSFs; however, they can be modified to gain detection abilities for NPSFs. Based on traditional March algorithms, different approaches have been proposed to detect NPSFs, such as the tiling method [1], [4], the two-group method [1], the row-March algorithm [4] and transparent testing [3], [5], [6].
2. Transparent memory testing
March tests are superior in terms of test time and simplicity of hardware implementation and consist of sequences of March elements. A March element includes a sequence of read/write (r/w) operations which are all applied to a given cell before proceeding to the next cell. The way of moving to the next cell is determined by the address sequence order.
The transparent technique is a well known memory testing approach that restores the initial contents of the memory once the test phase has been finished. It is therefore suitable for periodic field testing while preserving the memory content. A transparent BIST is based on a transparent March test that uses the memory's initial data to derive the test patterns. A transparent test algorithm ensures that the last written data is always equal to the first read value in order to satisfy the transparency property [7], [8]. Transparent tests are particularly suitable for Built-In Self-Test.
Let us concentrate our attention on two March memory tests, namely MATS+: {⇑(ra,wā); ⇓(rā,wa)}, and the modified PS(3N) test, MPS(3N): {⇑(ra,wā,rā,wa,ra)} [9]. It is quite important to emphasize that the implementation of MPS(3N) in some cases does not need the value of the fault-free signature, and the MPS(3N) testing procedure can be interrupted by the system at any time due to the preservation of the initial memory contents at any stage of testing. In the case of the MATS+ test the initial contents are restored only at the end of the test procedure.
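The transparency property of MPS(3N) can be illustrated with a short simulation sketch (plain Python, not a BIST implementation): the march element reads the current contents, writes and reads the inverted value, and finally restores the original value, so the memory is left unchanged.

def mps3n(memory):
    """Transparent MPS(3N) march element {⇑ (ra, w~a, r~a, wa, ra)} simulated
    on a list of bits; returns the sequence of read values and leaves the
    memory contents unchanged."""
    reads = []
    for addr in range(len(memory)):           # ⇑ : increasing address order
        a = memory[addr]; reads.append(a)     # ra
        memory[addr] = a ^ 1                  # w~a
        reads.append(memory[addr])            # r~a
        memory[addr] = a                      # wa
        reads.append(memory[addr])            # ra
    return reads

mem = [0, 1, 1, 0, 1]
before = list(mem)
sig = mps3n(mem)
assert mem == before       # transparency: initial contents preserved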
3. MPS(3N) memory tests efficiency analyses
To investigate the memory March tests, let us suppose that an NPSFk involves memory cells with increasing order of addresses α(0), α(1), α(2), ..., α(k − 1), such that α(0) < α(1) < α(2) < ... < α(k − 1), and that the base cell has the address α(i), where 0 ≤ i ≤ k − 1.
One of the simplest March tests is MPS(3N), which checks only all possible transitions in the current cell. Let us concentrate our attention on the Passive NPSFk (PNPSFk) as the most difficult faults to detect. First of all, it should be emphasized that, due to scrambling as well as specific optimisation techniques, there is a huge number of faults of this type. Any k arbitrary memory cells out of all N memory cells can be involved in a PNPSFk. The exact number of all PNPSFk within N memory cells is determined according to equation (1):
L(PNPSFk) = 2 × (N − k + 1) × 2^(k−1) × C(N, k−1) = 2^k × (N − k + 1) × C(N, k−1),  (1)
where C(n, m) denotes the binomial coefficient.
The number Q(PNPSFk) of faults detectable during one MPS(3N) memory test run is (2):
Q(PNPSFk) = 2 × (N − k + 1) × C(N, k−1).  (2)
And the fault coverage FC for MPS(3N) is:
FCMPS3N = Q(PNPSFk)/L(PNPSFk) × 100% = (1/2^(k−1)) × 100%.  (3)
The exact values of fault coverage for different k and for MPS(3N) test are presented
in Table 1.
Table 1. PNPSFk fault coverage [%] for the MPS(3N) test
k:        3      4      5      6      7     8     9
FCMPS3N:  25     12.5   6.25   3.125  1.6   0.8   0.4
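Equations (1)–(3) and Table 1 can be reproduced with a few lines of Python; the function names below are illustrative.

from math import comb

def pnpsf_count(N, k):
    """Total number of PNPSFk faults in an N-cell memory, eq. (1)."""
    return 2**k * (N - k + 1) * comb(N, k - 1)

def mps3n_single_run_coverage(k):
    """Fault coverage of one MPS(3N) run, eq. (3): FC = 1 / 2^(k-1)."""
    return 100.0 / 2**(k - 1)

# Reproduces Table 1: 25%, 12.5%, 6.25%, ... for k = 3..9
for k in range(3, 10):
    print(k, mps3n_single_run_coverage(k))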
There are several ways to increase the FCMPS3N values shown in Table 1. Among them the most promising is multiple run memory testing. The key idea behind this approach is selecting different backgrounds in order to increase the fault coverage. Let us analyze the efficiency of this approach for different numbers of backgrounds. In the case of two backgrounds the fault coverage increases twofold when the second background is the inverse of the first one. More complicated problems arise for three and four runs of the memory test with different backgrounds.
4. Three run MPS(3N) memory test efficiency analyses
For the implementation of three run MPS(3N)-like memory tests, three different backgrounds B1, B2 and B3 have to be generated. In the case of three run memory testing every consecutive background should not be similar to the previous one or, more precisely, should be as dissimilar as possible compared with the backgrounds applied during the previous test sessions [10], [11]. A memory background can be regarded as a binary vector, and the set of backgrounds can be defined as a set of binary vectors Bi = bi1bi2...biN, i ∈ {1, 2, ..., 2^N}, where bic ∈ {0, 1}, ∀c ∈ {1, 2, ..., N}, and N is the one-bit-wide memory size. The following sets can be regarded as candidates for the backgrounds:
S1: B11 = bi1 bi2 … biN,
    B12 = ¬bi1 ¬bi2 … ¬biN (the bitwise inversion of B11),
    B13 = ¬bi1 … ¬bi(N/2) bi(N/2+1) … biN (the first half of the cells inverted).
S2: B21 = bi1 bi2 … biN,
    B22 = ¬bi1 … ¬bi(N/2) bi(N/2+1) … biN (the first half inverted),
    B23 = bi1 … bi(N/2) ¬bi(N/2+1) … ¬biN (the second half inverted).
S3: B31 = bi1 bi2 … biN,
    B32 = ¬bi1 … ¬bi(2N/3) bi(2N/3+1) … biN (the first two thirds inverted),
    B33 = bi1 … bi(N/3) ¬bi(N/3+1) … ¬biN (the last two thirds inverted).
According to the definition of S3, for N = 18 the backgrounds B31, B32 and B33 are as follows:
S3 for N = 18
B31 = 000000000000000000
B32 = 111111111111000000
B33 = 000000111111111111
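The background sets can also be generated programmatically. The sketch below follows the definitions given above (S1 and S2 splitting the memory into halves, S3 into thirds) and, for N = 18 and an all-zero base background, reproduces the S3 example; the helper names are illustrative.

def invert(bits, lo, hi):
    """Return a copy of 'bits' with positions lo..hi-1 inverted."""
    return [b ^ 1 if lo <= j < hi else b for j, b in enumerate(bits)]

def background_sets(base):
    """Candidate background sets S1, S2, S3 built from a base background."""
    n = len(base)
    s1 = [base, invert(base, 0, n), invert(base, 0, n // 2)]
    s2 = [base, invert(base, 0, n // 2), invert(base, n // 2, n)]
    s3 = [base, invert(base, 0, 2 * n // 3), invert(base, n // 3, n)]
    return s1, s2, s3

s1, s2, s3 = background_sets([0] * 18)
print("".join(map(str, s3[1])))   # 111111111111000000
print("".join(map(str, s3[2])))   # 000000111111111111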
For the simplicity of the following discussion, let us suppose that N is divisible by 2 and 3. As a simple modification of these sets, any random backgrounds B13, B22 and B23 with half zeros and half ones can be used, as well as an arbitrary B32 with 2N/3 ones and N/3 zeros and B33 with N/3 ones and 2N/3 zeros. The efficiency of these sets of backgrounds can be calculated analytically. Let us start with the first set, S1.
A. MPS(3N) efficiency analysis for S1
The first background B11 from the set S1 allows the MPS(3N) test to detect Q11 = Q(PNPSFk) (see (2)) PNPSFk faults. Because background B12 is the inversion of B11, different patterns appear within arbitrary k−1 cells compared with the case of background B11; therefore Q12 also equals Q(PNPSFk) and Q11_12 = 2Q(PNPSFk). In the case of background B13 the new k-bit patterns, compared with those generated on the basis of backgrounds B11 and B12, consist of binary codes excluding the all-zero code (generated by B11) and the all-ones code (generated by B12). Taking into account the structure of B13, the additional number of patterns is:
Q13 = 2 × (N − k + 1) × Σ_{i=1}^{k−2} C(N/2, k−1−i) × C(N/2, i).  (4)
Let kn denote the number of neighborhood cells (kn = k − 1). Then
Q13 = 2 × (N − kn) × Σ_{i=1}^{kn−1} C(N/2, kn−i) × C(N/2, i).  (5)
The total number of patterns generated after three runs of the MPS(3N) test based on the set S1 equals:
Q11_12_13 = 2 × (N − kn) × [ 2 × C(N, kn) + Σ_{i=1}^{kn−1} C(N/2, kn−i) × C(N/2, i) ].  (6)
(6)
Taking into account that for real applications N is a big integer number, k << N and
Nk >> Nk−1 last equation for the case of even N can be simplifies to:

2
1
Q11_12 _13 ≈ 2 ( N − kn ) N kn 
+
 kn ! 2 kn

kn −1
∑
i =1

.

( kn − i )! × i ! 
1
(7)
Under the same conditions L(PNPSFk) from (1) simplifies to 2 × 2^kn × (N − kn) × (N^kn/kn!); then the fault coverage FCMPS3N((B11,B12,B13), k) for three runs of the MPS(3N) test can be estimated as
FCMPS3N((B11,B12,B13), k) = Q11_12_13/L(PNPSFk) × 100% = [ 1/2^(kn−1) + (kn!/2^(2kn)) × Σ_{i=1}^{kn−1} 1/((kn−i)! × i!) ] × 100% = [ (3 × 2^(kn−1) − 1)/2^(2kn−1) ] × 100%.  (8)
In the case of k = 3, FCMPS3N((B11,B12,B13), 3) = (5/8) × 100% = 62.5%, which is considerably higher than for one run of the MPS(3N) test.
B. Efficiency analysis for S2
The first, all-zero background B21 from the set S2 allows detecting the same number of faults as in the case of B11: Q21 = Q(PNPSFk) (see (2)). Backgrounds B22 and B23, due to their structure, allow generating the same number of new patterns compared with background B21 within any k memory cells. It should be noted that B22 is the inverse version of B23, which allows detecting a different subset of PNPSFk faults. The value of Q22 = Q23 can be calculated as
Q22 = 2 × (N − kn) × [ C(N/2, kn) + Σ_{i=1}^{kn−1} C(N/2, kn−i) × C(N/2, i) ].  (9)
The total number of patterns generated after three runs of the MPS(3N) test on the basis of S2 equals:
Q21_22_23 = 2 × (N − kn) × [ C(N, kn) + 2 × ( C(N/2, kn) + Σ_{i=1}^{kn−1} C(N/2, kn−i) × C(N/2, i) ) ].  (10)
Under the same assumptions on k and N as in the previous case,
Q21_22_23 ≈ 2 × (N − kn) × N^kn × [ 1/kn! + 2/(2^kn × kn!) + (2/2^kn) × Σ_{i=1}^{kn−1} 1/((kn−i)! × i!) ].  (11)
Then the fault coverage FCMPS3N((B21,B22,B23), k) for three runs of the MPS(3N) test can be estimated as:
FCMPS3N((B21,B22,B23), k) ≈ [ 1/2^kn + 2/2^(2kn) + (2 × kn!/2^(2kn)) × Σ_{i=1}^{kn−1} 1/((kn−i)! × i!) ] × 100% = [ (3 × 2^(kn−1) − 1)/2^(2kn−1) ] × 100%.  (12)
The results obtained for both sets of backgrounds (S1, S2) show that FCMPS3N((B11,B12,B13), k) = FCMPS3N((B21,B22,B23), k) (see (8) and (12)).
5. Four run MPS(3N) memory test efficiency analyses
Similarly to the three run MPS(3N)-like memory tests, four different backgrounds B1, B2, B3 and B4 have to be generated in such a way that every consecutive background is not similar to the previous one or, more precisely, is as dissimilar as possible compared with the backgrounds applied during the previous test sessions [10], [11]. To construct B1, B2, B3 and B4 the previous sets S1, S2 and S3 can be used. As a result, the set S4 consisting of four backgrounds can be generated as B41 = B11, B42 = B12, B43 = B13 and B44 = B23.
S4: B41 = bi1 bi2 … biN,
    B42 = ¬bi1 ¬bi2 … ¬biN,
    B43 = ¬bi1 … ¬bi(N/2) bi(N/2+1) … biN,
    B44 = bi1 … bi(N/2) ¬bi(N/2+1) … ¬biN.
It should be noted that the Hamming distance between any pair of backgrounds from the set S4 is not less than N/2; more precisely, it takes the values N and N/2, which is very close to the conditions shown in [11]. The best solution for four run MATS+-like tests is the set S5, constructed as an extension of S3.
S5: B51 = bi1 bi2 … biN,
    B52 = ¬bi1 … ¬bi(2N/3) bi(2N/3+1) … biN (the first two thirds inverted),
    B53 = bi1 … bi(N/3) ¬bi(N/3+1) … ¬biN (the last two thirds inverted),
    B54 = ¬bi1 … ¬bi(N/3) bi(N/3+1) … bi(2N/3) ¬bi(2N/3+1) … ¬biN (the first and last thirds inverted).
As an example, for the case of N = 18 the corresponding backgrounds from the set S5 are:
S5 for N = 18
B51 = 000000000000000000
B52 = 111111111111000000
B53 = 000000111111111111
B54 = 111111000000111111
As in the case of three run memory testing, let us suppose that N is divisible by 2 and 3. The efficiency of these sets of backgrounds can be calculated analytically.
A. Efficiency analysis for S4
The first three backgrounds B41, B42 and B43 allow generating the same patterns as the corresponding set S1, which is why Q41_42_43 = Q11_12_13. Taking into account the structure of B44, it is easy to show that the additional number of patterns is
Q44 = 2 × (N − kn) × Σ_{i=1}^{kn−1} C(N/2, kn−i) × C(N/2, i).  (13)
The total number of patterns generated after four runs of the MPS(3N) test on the basis of S4 equals:
Q41_42_43_44 = 4 × (N − kn) × [ C(N, kn) + Σ_{i=1}^{kn−1} C(N/2, kn−i) × C(N/2, i) ].  (14)
Taking into account that for real applications N is a big integer, k << N and N^k >> N^(k−1), the fault coverage FCMPS3N((B41,B42,B43,B44), k) for four runs of the MPS(3N) test can be estimated as
FCMPS3N((B41,B42,B43,B44), k) ≈ [ (2^kn − 1)/2^(2kn−2) ] × 100%.  (15)
In the case of k = 3, according to (15) it is easy to calculate that FCMPS3N((B41,B42,B43,B44), 3) ≈ (3/4) × 100% = 75%, which is considerably higher than for one run of the MPS(3N) test.
6. Experimental results
To validate the analytical results, many experiments have been carried out. The experiments were done for two types of PNPSFk faults (PNPSF3 and PNPSF5), different memory sizes and selected sets of backgrounds. In each case all PNPSF3 (PNPSF5) faults were generated, which made it possible to obtain the exact number of faults activated by the 3rMPS(3N) test session, so each time the exact value of the fault coverage could be calculated. All experimental results are presented in Tables 2 and 3.
Table 2. Experimental results: PNPSF3 fault coverage for the 3rMPS(3N) test session
Set / N:   8        16       32       64       128
S1:        64.29%   63.33%   62.90%   62.70%   62.60%
S2:        64.29%   63.33%   62.90%   62.70%   62.60%
S3:        68.75%   65.63%   65.07%   66.41%   66.27%
The results from Tables 2 and 3 confirm equations (8) and (12). Table 2 presents the experimental PNPSF3 fault coverage for the 3 run MPS(3N) test and the S1, S2, S3 sets of backgrounds; Table 3 contains the results of the same experiments for PNPSF5 faults. It should be noticed that in this type of memory testing the fault coverage depends only slightly on N. A particularly strong influence on the fault coverage can be observed only for very small N, which is shown in detail in Fig. 1.
Table 3. Experimental results: PNPSF5 fault coverage for the 3rMPS(3N) test session
Set / N:   8        16       32       64       128
S1:        18.57%   18.27%   18.12%   18.04%   18.01%
S2:        18.57%   18.27%   18.12%   18.04%   18.01%
S3:        18.75%   18.68%   18.59%   18.56%   18.54%
Figure 1. FCMPS3N((B11,B12,B13), 3) – comparison of the experimental and estimated results
It has to be noticed, however, that with increasing N the influence of the memory size on the fault coverage becomes smaller. In our analytical investigation we assumed a large value of N; that is why the analytical results differ slightly from the experimental ones. In all experimental results we can see that with increasing N the experimental values of fault coverage come closer and closer to the theoretical values. For example, in the case of S1 and PNPSF3 the analytically calculated fault coverage equals 62.5% (see (8)); for N = 8 the experimental value equals 64.29%, but for N = 128 it equals 62.6% and is almost the same as the theoretical one. The same can be observed for PNPSF5: for example, for the backgrounds S3 the theoretical FCMPS3N((B31,B32,B33), 5) = 18.52%, while the experimental value for N = 128 equals 18.54%, so again the theoretical and experimental values agree. The same experiments have been done for 4 run test sessions, with the sets S4 and S5 used as backgrounds. The results obtained for the 4rMPS(3N) test are given in Tables 4 and 5; these results confirm the analytical estimations, in particular equation (15), as well.
Table 4. Experimental results: PNPSF3 fault coverage for the 4rMPS(3N) test session
Set / N:   8        16       32       64       128
S4:        78.57%   76.67%   75.81%   75.40%   75.20%
S5:        87.50%   85.71%   84.19%   82.81%   82.55%
Table 5. Experimental results: PNPSF5 fault coverage for the 4rMPS(3N) test session
Set / N:   8        16       32       64       128
S4:        24.64%   24.04%   23.73%   23.59%   23.51%
S5:        25.00%   24.86%   24.70%   24.62%   24.58%
7. Conclusions
In this paper three and four run MPS(3N) test sessions were investigated. For the best backgrounds, known from previous publications, the PNPSFk fault coverage for 3rMPS(3N) and 4rMPS(3N) was calculated. All analytically obtained results were confirmed and validated by many experiments; based on them we can say that the experimental results confirmed the validity of the theoretical predictions. In all our investigations we focused on the MPS(3N) test. It is quite important to emphasize that a transparent implementation of MPS(3N) in some cases does not need the value of the fault-free signature, as is necessary in many other transparent tests. Moreover, it should be stressed that the MPS(3N) testing procedure can be interrupted by the system at any time owing to the preservation of the initial memory contents at any stage of testing, whereas in the case of many other transparent tests the initial contents are available only at the end of the test procedure.
Acknowledgement
This work was supported by the grant number W/WI/5/06.
References
[1] Goor A. J. V. D. Testing Semiconductor Memories: Theory and Practice. Chichester, England: John Wiley & Sons, 1991.
[2] Hayes J. P. Detection of pattern-sensitive faults in random-access memories. IEEE
Trans. Computers, vol. 24, no. 2, pp. 150–157, 1975.
[3] Cockburn B. F. Deterministic tests for detecting scrambled pattern sensitive faults
in RAMs, in MTDT ’95: Proceedings of the 1995 IEEE International Workshop on
Memory Technology, Design and Testing. Washington, DC, USA: IEEE Computer
Society, pp. 117–122, 1995.
[4] Franklin M., Saluja K. K. Testing reconfigured RAM’s and scrambled address
RAM’s for pattern sensitive faults, IEEE Trans. on CAD of Integrated Circuits and
Systems, vol. 15, no. 9, pp. 1081–1087, 1996.
[5] Karpovsky M. G., Yarmolik V. N. Transparent memory testing for pattern-sensitive faults, in Proceedings of the IEEE International Test Conference on TEST: The Next 25 Years. Washington, DC, USA: IEEE Computer Society, pp. 860–869, 1994.
[6] Nicolaidis M. Theory of transparent BIST for RAMs, IEEE Trans. Comput., vol. 45,
no. 10, pp. 1141–1156, 1996.
[7] Yarmolik V. N., Klimets Y., Demidenko S. March PS(23N) test for DRAM pattern-sensitive faults, in ATS '98: Proceedings of the 7th Asian Test Symposium. Washington, DC, USA: IEEE Computer Society, pp. 354–357, 1998.
[8] Yarmolik V. N. Contents independent RAM built in self test and diagnoses based
on symmetric transparent algorithm, in DDECS’2000: Proceedings of the 3rd
Workshop on Design and Diagnostics of Electronic Circuits and Systems, Smolenice - Slovakia, April 5-7, pp. 220–227, 2000.
[9] Yarmolik V. N., Murashko I., Kummert A., Ivaniuk A. Transparent Testing of
Digital Memories, Minsk, Belarus: Bestprint, 2005.
[10] Yarmolik S. V. Address sequences with different average Hamming distance, in: Abstracts of the 1st International Conference for Young Researchers in Computer Science, Control, Electrical Engineering and Telecommunications, Zielona Gora, Poland, September 18-20, pp. 67–68, 2006.
[11] Yarmolik S. V., Mrozek I. Multi background memory testing, in MIXDES2007:
Proceedings of the 14th International Conference Mixed design of integrated circuits and systems. Ciechocinek, Poland: IEEE Computer Society, June 21-23, pp.
511–516, 2007.
Analysis of the intensity of human traffic in monitored areas
Adam Nowosielski, Krzysztof Kłosowski
Politechnika Szczecińska, Wydział Informatyki
Abstract:
A new and previously unavailable quality is being brought in with the development of intelligent monitoring systems; video analysis of human dynamics is an excellent example. Based on an analysis of the problem, an algorithm for human traffic analysis from a video stream is developed.
Keywords:
video analysis of human dynamics, CCTV systems, video analysis
1. Introduction
The use of image recognition systems in everyday life is ever greater and keeps growing. The reason for this is the transfer of tedious and burdensome tasks performed by humans to automated IT systems. Such a tendency is also visible in monitoring systems, in which an operator has always been present. Compared with a computer system the operator is limited (he cannot analyse several images at once), overloaded (his ability to stay focused drops after as little as half an hour of work) and less reliable (questions of dedication and honesty) [1].
Despite the development of automatic image recognition and analysis systems, one can still find, e.g. at crossroads or at the entrances of shopping centres, employees whose task is to count people by hand. The problem of automating this process is taken up in this article. It should be stressed here that video observation and video sequence analysis technologies are non-invasive, contactless and the most natural ones; they do not restrict the movements of the user in any way. Systems based on these technologies can offer a new, previously unknown quality and make life easier by creating intelligent environments [2]. It must be emphasized, however, that a task such as tracking a person is one of the most complex tasks performed by dynamic vision systems [3]. The reason is that the human silhouette is both dynamic and deformable, and the person wears clothing of varying colour and texture.
Understanding and interpreting human behaviour in complex scenes is therefore not a trivial task. The problem of tracking a person has attracted considerable interest, and a number of approaches to solving it have been developed which can be classified into one of two categories [3]:
 appearance based approaches,
 model based approaches.
Methods from the first group use information about the colour, contour or texture of the tracked person. Their undoubted advantage is high speed of operation [3, 4]. Colour based methods allow a rough separation of the object from the background and its tracking (tracking of a connected-component region whose colour corresponds to the registered model). This group of approaches is heavily used in the problem of detecting and tracking the face itself [3, 5]; the problem then reduces to the task of face detection and its further tracking, an issue that is well described in the literature [5]. Its application, however, is limited to environments in which cameras register frontal images of faces.
While colour and object tracking based on it can be considered a global method, approaches using contours employ local information, such as the arrangement of edges and lines. An example of an algorithm for detecting moving objects by means of contours (coping successfully with the task of tracking people) can be found in [4].
The second group of methods, based on a model, is characterized by rich semantic knowledge about the tracked object. These approaches are, however, computationally complex; the problems that occur are related mainly to changes of scale, translation, rotations and deformations of the tracked objects [3].
In the remaining part of the article the possibilities and advantages of using an automatic system for analysing the intensity of pedestrian traffic are presented, together with the assumptions for building such a system. The system itself, which performs the stated task on the basis of the developed algorithm for analysing image sequences from a camera, is then presented.
2. Assumptions
The situations mentioned in the introduction, in which a person counts "by hand" the people who have moved through a given area in a particular direction, show that there is a need to create such a system. Commercial solutions are already beginning to appear in this field, but they are closed and the algorithms they are based on are not disclosed.
Information about the number of passing people is used in traffic intensity analysis. It makes it possible to introduce improvements and conveniences in the everyday life of local communities. In the case of shopping centres, information about the intensity of customer traffic can be used, for example, when planning the staffing of checkouts (before queues form and some customers give up their purchases) or the location of promotional stands. By comparing the number of people visiting a shop with the number of issued receipts, one can determine how many people leave without buying anything, and so on.
While the above example of a shopping centre has a typically commercial character, the analysis of pedestrian traffic at crossroads can be of great importance in the process of planning transport in whole cities. If a clear tendency is observed that people move from one stop to another, it may prove beneficial to change the route of a given means of public transport.
The most important issue in such a system is the placement of the camera and thus the kind of images acquired. In a traditional monitoring system, which could serve as the source of video sequences for a system analysing traffic intensity, images are obtained at considerable angles so as to register the whole silhouette of the passers-by [3, 6]. A similar situation occurs in face recognition systems: cameras are installed in a way that allows acquiring face images as close to frontal as possible [7, 8]. In a crowd of walking people registered "from the front" continual occlusions occur; the faces of the same people disappear and then reappear after some time in another part of the frame. Under such conditions the task of counting people can be performed, but mechanisms of face tracking and of matching faces lost in one frame (occlusion) to new faces in subsequent frames have to be considered. The main inconvenience of this way of installing the camera, however, results from the practical impossibility of analysing traffic in two directions. Such an analysis would require two cameras independently monitoring the movement in opposite directions. An additional camera entails a whole infrastructure (cabling, equipment analysing the image data from the camera), which raises the cost of the system and does not make the task any easier.
Consequently, a more convenient situation is when a camera placed directly above the monitored area is directed vertically downwards. Figure 1a shows an example of installing a camera directly above the monitored area: the camera is mounted at the edge of two walls of a building and points vertically downwards. On the right (Fig. 1b) one can see what the image from such a camera may look like.
Figure 1. Example of camera installation above the monitored area and the image obtained from such a camera
Additionally, it was assumed that the movement of people in the monitored area should take place vertically or horizontally in the image (typical when monitoring a corridor, an entrance or a tunnel). The monitoring process can take place both inside buildings and outdoors; however, situations should be avoided in which very strong light sources are present in the scene, causing significant shadows.
3. System structure
The scheme of the developed system for analysing the intensity of human traffic in monitored areas is presented in Fig. 2. The following components can be distinguished in the system: the preprocessing block, the person detection block, the feature extractor, the person tracking block, and the statistics generator.
Figure 2. Scheme of the system's operation
Each component of the system is analysed in the following subsections.
3.1. Preprocessing block
At the preprocessing stage the consecutive frames of the video sequence are prepared for further analysis. The following operations are performed:
 scaling,
 creating a grey-level image,
 masking.
Scaling of the input image is an important aspect of the functioning of the whole system. This operation makes it possible to significantly reduce the amount of data processed in the subsequent stages, while not introducing significant disturbances into the process of analysing pedestrian traffic intensity; additionally it helps to eliminate noise from the analysed images.
The conducted experiments showed that for scenes such as the one presented in Fig. 3 (first image), only reducing the resolution below 100 pixels horizontally (while preserving the proportions of the original image) led to noticeable errors, manifested by a considerable shrinking of the objects of interest. The first image of Fig. 3 was rescaled to the resolutions 640x480, 160x120 and 80x60. The ellipse marks the region of interest containing two people. The following images in Fig. 3 show this enlarged fragment for each of the resolutions (normalized to one size). A loss of quality is visible, but the silhouettes of the people remain distinguishable.
Figure 3. A sample scene and fragments showing a close-up of the scene at different resolutions
The best ratio of computation speed-up to the quality of the obtained results was achieved when the resolution of the downscaled image was in the range from 100 to 160 pixels. It should be stressed that these parameters apply to scenes such as the one presented in Fig. 3; depending on the height at which the camera is mounted and on its focal length, a large variety of images covering different fields of view can be obtained.
The second preprocessing operation, creating a grey-level image, prepares the image in the form required by the person detection block.
The masking operation is intended to eliminate irrelevant areas of the scene. Determining the areas in which people move is, at the current stage of the system's development, done manually by the operator when the system is started. The aim of this operation is to mask out areas where additional noise may appear or areas where people cannot move (but where, for example, their shadows appear). An example is presented in Fig. 4.
Figure 4. The idea of masking the region of interest
The left part of Fig. 4 shows a sample image from a camera above a platform. In this image a circle marks the area where the shadows of people walking on the footbridge above the platform appear. Defining the region of interest as a rectangle (right image in Fig. 4) allows limiting the search area to the relevant fragment only, where people appear and there is no additional noise.
The output of the preprocessing block consists of the grey-level image (image GRAYi) and the colour RGB image (image RGBi).
144
Adam Nowosielski, Krzysztof Kłosowski
3.2. Person detection block
The task of the next block, as its name indicates, is to detect persons in the sequence of preprocessed frames. The detection process itself is based on constructing and analysing a difference image.
3.2.1. Constructing the difference image
The difference image technique consists in comparing the current image with a reference image. The result is a difference image presenting the differences between the compared images.
There are two basic methods of constructing a difference image: one based on a fixed reference (background) image and one based on the image from the previous frame. The drawback of the first approach is noise resulting from changes in illumination, camera vibrations, moving shadows, etc. In turn, a difference image obtained by comparison with the previous frame contains information only about the elements where a significant change occurred. Noise is largely eliminated, but gaps appear when an object of uniform colour moves (changes are then captured in the difference image only at the edges of the object). This may cause an element to be split into two or more parts.
In the developed system, the difference image is generated as the sum of two difference images produced with the two methods described above. This approach attenuates the individual weaknesses of each method. Thanks to the summation, the values of the points for which the difference between the reference image and the current image was large are reinforced, while the noise generated by each method is attenuated (each method generates a different kind of noise). The shortcomings of both methods are also compensated in the resulting difference image: the gaps in objects produced by the second method are "patched" by the first method. This situation is illustrated in Fig. 5. The top row contains, in order, the background image, the previous frame and the current frame. The bottom row shows, in order, the difference images obtained from the reference image and from the previous frame, and the final image obtained by combining the two difference images.
It should be emphasised here that by introducing weights into the construction of the combined difference image, the influence of each component can be attenuated or reinforced. This may be necessary, for example, if the time intervals between successive frames of the sequence are too long.
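A minimal sketch of how such a weighted combination of the two difference images could be computed is given below (Python/NumPy; this is an editorial illustration, not the authors' implementation, and the weight values are arbitrary):

import numpy as np

def combined_difference(current, reference, previous, w_ref=0.5, w_prev=0.5):
    """Combine a background-based and a frame-to-frame difference image.

    current, reference, previous: greyscale frames as 2-D arrays.
    w_ref, w_prev: illustrative weights that attenuate or reinforce each component.
    """
    cur = current.astype(np.float32)
    d_ref = np.abs(cur - reference.astype(np.float32))    # difference w.r.t. the reference image
    d_prev = np.abs(cur - previous.astype(np.float32))    # difference w.r.t. the previous frame
    return w_ref * d_ref + w_prev * d_prev                # weighted sum of the two difference images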
3.2.2. Analysis of the difference image
The analysis of the difference image in the presented system consists of three steps: thresholding, filtering and indexing.
The task of thresholding is to process the difference image so as to extract the significant changes in the source scene. The choice of the threshold plays a very important role here: a threshold that is too low extracts both the moving objects and the noise, while one that is too high leads to splitting the objects or eliminating them entirely. An example of thresholding the difference image with different threshold values is presented in Fig. 6. The result is a binary image.
Figure 5. Construction of the difference image by the system: a) reference image, b) previous frame, c) current frame, d) difference image obtained from the reference image, e) difference image obtained from the previous frame, f) resulting difference image
Experiments carried out on various scenes, under various lighting conditions, showed that the threshold value should be set individually for each scene observed by a particular camera. Automating this process could be considered, but no such attempt has been made at the current stage of the work.
Figure 6. Thresholding of the difference image: a) difference image; binary images after thresholding with the value b) 135, c) 55, d) 20, e) 5
Even with a suitable threshold value, a number of problems may appear in the resulting image: noise in the form of single pixels or small pixel groups, splitting of individual persons, and merging of a group of persons into one common object. A number of attempts were made to eliminate these defects by filtering the binary image. The best results were obtained using erosion with a small structuring element (the system processes downscaled images). Erosion removes pixel clusters smaller than the structuring element. It may also disconnect touching elements. Depending on the shape of the objects after thresholding, this property can be beneficial or not: it is an advantage when it separates touching objects, and a drawback when it splits an object into smaller parts.
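A rough sketch of the thresholding and erosion steps is shown below (a SciPy-based illustration only; the threshold of 55 and the 3x3 structuring element are example settings, not values prescribed by the system):

import numpy as np
from scipy.ndimage import binary_erosion

def threshold_and_filter(diff_image, threshold=55):
    """Binarise the difference image and remove small pixel clusters by erosion."""
    binary = diff_image > threshold              # too low a threshold keeps noise, too high splits objects
    struct = np.ones((3, 3), dtype=bool)         # small structuring element (the images are downscaled)
    return binary_erosion(binary, structure=struct)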
The last stage of the difference image analysis is the indexing of connected-component regions, which (with exceptions) correspond to moving persons. Binary images after thresholding and filtering often contain objects that are too small, usually resulting from the break-up of larger objects. To avoid indexing them, an elimination process is carried out. Depending on its size and extent and on its distance from the other objects, an object may be classified for removal, merging or addition.
The output of the person detection block is a matrix whose cells correspond to the pixels of the image. In this matrix, connected-component regions are marked with consecutive integers.
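The labelling of connected components, together with the elimination of overly small regions, could be sketched as follows (an illustration only; the min_size parameter is hypothetical, and the full removal/merging/addition classification described above is not reproduced here):

import numpy as np
from scipy.ndimage import label

def index_objects(binary, min_size=20):
    """Label connected components and drop regions smaller than min_size pixels."""
    labels, count = label(binary)                # consecutive integer labels, 0 = background
    sizes = np.bincount(labels.ravel())
    relabelled = np.zeros_like(labels)
    next_id = 1
    for region_id in range(1, count + 1):
        if sizes[region_id] >= min_size:         # keep only sufficiently large regions
            relabelled[labels == region_id] = next_id
            next_id += 1
    return relabelled                            # matrix with regions marked by consecutive integers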
3.3. Person tracking block and feature extractor
In order to analyse the intensity of pedestrian traffic in monitored areas, a person tracking mechanism has to be defined. In the literature, mechanisms based on motion, on a model, or on their combinations are used to estimate (predict) the position of an object in the next frame on the basis of the current and previous frames. In this work an approach was adopted in which an attempt is made to assign the persons detected in the current frame to their counterparts from the previous frames.
If the tracked object in the processed current difference frame Di is bounded by the rectangular region x1 ≤ x ≤ x2 and y1 ≤ y ≤ y2, then its zero- and first-order moments are given by:

M_{00} = Σ_{x=x1}^{x2} Σ_{y=y1}^{y2} D_i(x, y),    (1)

M_{10} = Σ_{x=x1}^{x2} Σ_{y=y1}^{y2} x · D_i(x, y),    (2)

M_{01} = Σ_{x=x1}^{x2} Σ_{y=y1}^{y2} y · D_i(x, y).    (3)

The centroid of the tracked object can then be computed as:

x_c = M_{10} / M_{00},   y_c = M_{01} / M_{00}.    (4)
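For illustration, equations (1)-(4) can be computed directly as in the sketch below (an assumption of this example is that the difference frame is stored as a NumPy array indexed as [y, x]):

import numpy as np

def centroid(diff, x1, x2, y1, y2):
    """Centroid of a tracked object from the moments (1)-(4) of the
    difference frame D_i inside the rectangle x1 <= x <= x2, y1 <= y <= y2."""
    region = diff[y1:y2 + 1, x1:x2 + 1].astype(np.float64)
    xs = np.arange(x1, x2 + 1)
    ys = np.arange(y1, y2 + 1)
    m00 = region.sum()                           # M00, eq. (1)
    m10 = (region * xs[np.newaxis, :]).sum()     # M10, eq. (2)
    m01 = (region * ys[:, np.newaxis]).sum()     # M01, eq. (3)
    return m10 / m00, m01 / m00                  # (xc, yc), eq. (4)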
Information about the object's centre is the first factor used when assigning the objects detected in the current frame Di to the objects detected in the previous frame Di−1. If the previous frame contains several objects whose centroids are close to the examined object (from the current frame), the most similar of them is selected.
Similarity is determined on the basis of colour information. From the (unprocessed) colour frame RGBi, the region x1 ≤ x ≤ x2, y1 ≤ y ≤ y2 corresponding to the tracked object is selected. For this region a coarsely quantised histogram with the parameter BIN = 8 is computed according to the formula (for each of the channels R, G and B separately):

H_R^{BIN}(b) = Σ_{j=(b−1)·256/BIN}^{b·256/BIN − 1} H_R(j),    (5)

H_G^{BIN}(b) = Σ_{j=(b−1)·256/BIN}^{b·256/BIN − 1} H_G(j),    (6)

H_B^{BIN}(b) = Σ_{j=(b−1)·256/BIN}^{b·256/BIN − 1} H_B(j),    (7)

where H_R(j), H_G(j), H_B(j) denote the number of pixels of a given channel (R, G and B) with intensity j = 0, 1, …, 255 within the given region, and b = 1, 2, …, BIN.
By building such histograms for each colour component with the parameter BIN = 8, a 24-element feature vector X = (H_R^{BIN}, H_G^{BIN}, H_B^{BIN}) is obtained for each tracked person. The Euclidean distance measure is used when comparing two feature vectors.
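A sketch of the coarse colour histogram feature (equations (5)-(7)) and of the Euclidean matching step is given below; it is an illustration only, not the authors' code:

import numpy as np

BIN = 8                                          # number of bins per colour channel

def colour_feature(rgb_patch):
    """24-element feature vector: an 8-bin histogram for each of the R, G, B
    channels (equations (5)-(7)) computed over the object's bounding box."""
    parts = []
    for c in range(3):                           # channels R, G, B
        hist, _ = np.histogram(rgb_patch[:, :, c], bins=BIN, range=(0, 256))
        parts.append(hist)
    return np.concatenate(parts).astype(np.float64)

def most_similar(current_feat, previous_feats):
    """Index of the previous-frame object whose histogram lies closest
    (in Euclidean distance) to the current object's feature vector."""
    distances = [np.linalg.norm(current_feat - f) for f in previous_feats]
    return int(np.argmin(distances))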
Feature extraction based on a histogram is not complicated and allows the task to be carried out quickly in practice. The main drawback of histogram-based image recognition is the loss of spatial information. It turns out, however, that for the task at hand this property is extremely beneficial: the shape of moving people seen from above is continuously deformed, whereas colour information is much more stable in this respect. This was confirmed by practical experiments on real sequences.
In order to count people correctly, taking into account the direction of their movement, 8 virtual gates were placed in the scene. The task of these gates is to divide the scene into sectors.
3.4. Statistics generator
The task of the statistics generator is to collect information about the intensity of pedestrian traffic in the monitored areas and to prepare this information in a graphical form enabling its further analysis by interested parties (e.g. the management of a hypermarket).
Two kinds of statistics are generated in the system: current and historical. The first kind is generated in real time and describes the number of objects present in the scene at a given moment; this is the current traffic intensity. It is realised by counting all the indexed and matched objects present in the image.
The second kind of statistics concerns the observed scene within a given time window. It presents the distribution of the intensity of moving persons over a specified period, split into two directions of movement: up and down (or, respectively, left and right).
4. System operation
A quantitative assessment of any image recognition system (and such a system is dealt with here) can be obtained by running appropriate tests on benchmark databases and computing suitable indicators. While there is a huge number of benchmark databases for popular and widely studied problems (e.g. face recognition), for novel solutions and less popular problems such databases do not exist or are few, often with restricted access. Comparing methods with one another then becomes a major problem [3].
Therefore, in order to verify the developed system, original sequences were prepared in which the movement of people was recorded. The scenes were recorded at two different locations with different traffic intensities. Some shots were taken for horizontal movement, others for vertical movement. The duration of each sequence is several minutes.
In the first group of sequences people move in the horizontal direction. An example is given in Fig. 7. The left part shows two example frames of the sequence. The right-hand side contains the statistics generated by the system, presented as plots: the upper plot shows the total number of detected persons, while the lower plots show the number of persons moving in each direction. The fragment of the analysis shown in the upper plot indicates that a person crossed the scene every once in a while. At one point a peak appears, which means that another person entered the scene while one was already present (registered). From the lower plots it can be read that at the moment of the peak the person moving left was leaving the scene, while the newly appearing person was heading right. The two steps overlap, hence the peak in the upper plot: for a brief moment there are two persons in the scene.
In the quoted sequence the traffic intensity was rather low (at most two persons, which was influenced by the appropriate framing of the scene). Under such conditions the effectiveness of the developed system was one hundred percent for various sequences lasting several minutes.
For the second kind of sequences (an example in Fig. 8) worse results were obtained. In scenes of this kind a much higher intensity of pedestrian traffic was recorded. However, this was not the main cause of the loss of counting accuracy. It should be noted that in this sequence the camera placement does not meet the stated assumptions: the scene is recorded at a considerable angle, so people are seen not from above but obliquely, and their silhouettes are visible. Under such conditions objects become occluded and merged in the difference image, which the developed algorithm was not designed to handle. For this reason, a result of about 60% accuracy should be regarded here as a success.
Figure 7. The first kind of sequence and the result of the system's operation
Figure 8. The second kind of sequence and the result of the system's operation
5. Conclusions
The article has presented the problem of analysing the intensity of pedestrian traffic in monitored areas. It has been shown that a traditional surveillance system, which could serve as a source of video sequences for the traffic analysis system, does not necessarily meet the required image quality criteria. A similar situation occurs in the case of face recognition systems: in both cases different observation angles are required. For this reason, the article developed a method of counting bidirectional pedestrian traffic using a camera installed directly above the observed scene and pointing downwards.
The experiments carried out demonstrated the effectiveness of the system and showed its practical applicability.
Finally, it should be noted that at the current stage of development the system counts all moving objects. If a dog runs through the scene, it will indeed be included in the final statistics. Is there a solution to this problem? A better feature extractor and a prepared database of human silhouettes seen from above through the "eye" of the camera may prove helpful.
References
[1] Woodward Jr. J. D., Horn Ch., Gatune J., Thomas A. Biometrics A Look at Facial
Recognition. Prepared for the Virginia State Crime Commission. RAND Public
Safety and Justice. 32 s., 2003
[2] Pentland A., Choudbury T. Face Recognition for Smart Environments. IEEE
Computer 33(2), 50–55, 2000
[3] Wang J. J., Singh S. Video analysis of human dynamics – a survey. Real-Time
Imaging 9(5), 321-346, 2003
[4] Yokoyama M., Poggio T. A Contour-Based Moving Object Detection and Tracking. Proceedings of the 14th International Conference on Computer Communications and Networks, 271-276, 2005
[5] Yang M., Kriegman D., Ahuja N. Detecting Faces in Images: A Survey. IEEE
Transactions on Pattern Analysis and Machine Intelligence 24(1), 34-58, 2002
[6] Kruegle H. CCTV Surveillance, Second Edition: Video Practices and Technology.
Butterworth-Heinemann, 672 s., 2006
[7] Kukharev G., Kuźmiński A. Techniki Biometryczne. Część 1. Metody Rozpoznawania Twarzy. Pracownia Poligraficzna WI PS, 310 s., Szczecin, 2003
[8] Nowosielski A. Identyfikacja człowieka na podstawie zestawu obrazów twarzy
pochodzących z sekwencji video. Metody Informatyki Stosowanej, Tom 13, Nr
1/2008, 113-125, 2008
Modelling and analysis of two-server networks
with finite capacity buffers and blocking
Walenty Oniszczuk
Bialystok Technical University, Faculty of Computer Science
Abstract:
The study presented in this paper is motivated by the performance analysis of two-server
networks with blocking. This model is based on the performance of a Markovian three-node queuing network with finite capacity buffers, for which new and practical results are provided. Here, a two-dimensional state graph is constructed and a set of steady-state equations is created. These equations allow for calculating the state probabilities of each graph state. The results of the investigation allow the definition of the region where the model is useful and where the desired QoS is satisfied.
Keywords:
Two-server computer network with blocking, Quality-of-Service (QoS), Markovian exact
algorithm
1. Introduction
Queuing network models have been widely applied as a powerful tool for modelling and performance evaluation of discrete flow systems, such as computer and communication networks, computer systems and production lines. Finite buffer queues and blocking
have been introduced and applied as more realistic models of systems with finite capacity resources. The main aim of this paper is to formulate such a model and examine the
queuing behaviour under a blocking mechanism. To facilitate this aim, we study a Markovian queuing network with a single server at each node and finite capacity buffers.
Classical queuing theory provides a conventional framework for formulating and solving the queuing network models (QNMs) with blocking. The variability of interarrival
and service times of jobs can be modelled by probability distributions.
Exact and approximate analytical methods have been proposed in the literature for solving equations describing system performance [1, 9, 10]. These techniques led to efficient computation algorithms for the analysis of QNMs. However, there are still many
important and interesting finite capacity queues under various blocking mechanisms and
synchronization constraints to be analyzed [2-8].
2. The network model
Let us consider an open queuing model with blocking, with a single job class and
three nodes: a source station, a main server A and an additional server B (see Fig. 1).
Figure 1. Two-server network model
External jobs arrive at the server A according to a Poisson process with rate λ from
the source station. After service completion at the server A, the job proceeds to the
server B with probability 1 – σ, and with probability σ the job departs from the network.
Jobs leaving the server B are always fed back to the server A. The service times at each
server are exponentially distributed with rates µA and µB, respectively. The successive service times at both servers are assumed to be mutually independent and independent of the state of the network. A finite capacity buffer (with capacity m1 and m2, respectively) is allowed in front of each server. A job, upon service completion at server A, attempts with probability 1 – σ to join server B. If server B is full at that epoch, then the first server must hold the completed job and becomes blocked (i.e., not available for service of incoming jobs) until the second server completes service. The nature of the service process in this case depends only on the service rate of station B, which allows one to treat this job as located in an additional place in buffer B. Similarly, if the first buffer (with capacity m1) ahead of the first station is full, then the source station or server B is blocked. In this case, the nature of the service process depends only on the service rate of station A and we can treat these jobs as located in additional places in buffer A. There can be a maximum of m1+3 jobs assigned to the first servicing station, including the jobs in the source
and server B that can be blocked. Similarly, there can be a maximum of m2+2 jobs
assigned to station B with a job blocked in server A (see Figure 2).
In this special type of multistage network with blocking a deadlock may occur. For
example, let us suppose that server A is full and server B blocks it. A deadlock will
occur if the job in service at server B must be sent to server A upon completion of its
service. We assume that a deadlock is detected instantaneously and resolved without
any delay by exchanging all the blocked jobs simultaneously. Generally, blocking is a very important mechanism for controlling and regulating the intensity of the arriving job stream, which comes from the source station to the servicing stations. The arrival rate to the first server depends on the state of the network, and the blocking factor reduces the rate at which the source sends traffic to this server.
3. Performance analysis
The queuing network model described in Section 2 is a continuous-time homogeneous Markov chain. The queuing network model reaches a steady-state condition and the
underlying Markov chain has a stationary state distribution. The underlying Markov
process of a queuing network with finite capacity queues has finite state space. In this
class of network we may denote the state of the network by the pair (i,j), where i represents the number of jobs in server A and j denotes the number in server B (including the
jobs in service or in blocking). For any nonnegative integer values of i and j, (i,j) represents a feasible state of this queuing network, and pi,j denotes the probability for that
state in equilibrium. These states and the possible transitions among them are shown in
Figure 2. The flux into a state of the model is just given by all arrows into the corresponding state, and the flux out of the state is determined from the set of all outgoing
arrows from the state. The arrows indicate the only transitions that are possible for this
model. Transitions from top to bottom represent a change of state due to an arrival from
the source station. Diagonal transitions from left to right or from right to left represent a
change of state due to a job completing service at server A or at server B. Finally, transitions indicated by bottom to top arrows represent a change of state due to departures
from the network. This occurs at rate µAσ. The state diagram of the blocking network
(see Fig. 2) contains all possible non-blocked states (marked by ellipse) as well as the
blocking states (marked by rectangle). The number of states in the blocking network is
the sum of all possible non-blocking states plus all the blocking states:
(m2+2)(m1+3)+m1+2+m2+1. Based on an analysis of the state-space diagram, the process of constructing the steady-state equations of the Markov model can be divided into several independent steps, which describe similar, repeatable schemas. The steady-state equations for the non-blocking states are:
λ · p_{0,0} = µ_A σ · p_{1,0}

(λ + µ_B) · p_{0,j} = µ_A(1−σ) · p_{1,j−1} + µ_A σ · p_{1,j}    for j = 1, ..., m2+1

(λ + µ_A σ + µ_A(1−σ)) · p_{i,0} = λ · p_{i−1,0} + µ_B · p_{i−1,1} + µ_A σ · p_{i+1,0}    for i = 1, ..., m1

(λ + µ_A σ + µ_A(1−σ) + µ_B) · p_{i,j} = λ · p_{i−1,j} + µ_B · p_{i−1,j+1} + µ_A σ · p_{i+1,j} + µ_A(1−σ) · p_{i+1,j−1}    for i = 1, ..., m1, j = 1, ..., m2+1

(λ + µ_A σ + µ_A(1−σ)) · p_{m1+1,0} = λ · p_{m1,0} + µ_B · p_{m1,1} + µ_A σ · p_{m1+2,0} + µ_A σ · p_{m1+3,0}

(λ + µ_A σ + µ_A(1−σ) + µ_B) · p_{m1+1,j} = λ · p_{m1,j} + µ_B · p_{m1,j+1} + µ_A σ · p_{m1+2,j} + µ_A σ · p_{m1+3,j} + µ_A(1−σ) · p_{m1+3,j−1}    for j = 1, ..., m2

(λ + µ_A σ + µ_A(1−σ) + µ_B) · p_{m1+1,m2+1} = λ · p_{m1,m2+1} + µ_B · p_{m1,m2+2} + µ_A σ · p_{m1+2,m2+1} + µ_A(1−σ) · p_{m1+3,m2}    (1)
Figure 2. States and transition diagram for a two-server network with blocking
For states with blocking the equations are:
(λ + µ_B) · p_{0,m2+2} = µ_A(1−σ) · p_{1,m2+1}

(λ + µ_B) · p_{i,m2+2} = λ · p_{i−1,m2+2} + µ_A(1−σ) · p_{i+1,m2+1}    for i = 1, ..., m1

µ_B · p_{m1+1,m2+2} = λ · p_{m1,m2+2} + µ_A(1−σ) · p_{m1+2,m2+1}

(µ_A σ + µ_A(1−σ)) · p_{m1+2,j} = λ · p_{m1+1,j}    for j = 0, ..., m2

(µ_A σ + µ_A(1−σ)) · p_{m1+2,m2+1} = λ · p_{m1+1,m2+1} + µ_B · p_{m1+1,m2+2}

(µ_A σ + µ_A(1−σ)) · p_{m1+3,j} = µ_B · p_{m1+1,j+1}    (server B blocking)    for j = 0, ..., m2    (2)
Here, a queuing network with blocking, under appropriate assumptions, is formulated as a Markov process and the stationary probability vector can be obtained using
numerical methods for linear systems of equations [10]. The generation of the rate matrix Q can now be accomplished by going through the list of states and generating all
the feasible transitions out of each state and the associated rate of transition. For homogeneous Markov processes in steady state, we simply have:
xQ = 0
(3)
where x is the stationary probability vector whose l-th element xl is the steady-state
probability that the system is in state l. Vector x can be obtained from (3) and the normalizing condition for all network states ∑ xl = 1 , using equation-solving techniques.
4. Performance measures
The procedures for calculating the main measures and the quality of service (QoS)
parameters use the steady-state probabilities in the following manner:
1. The average number of blocked jobs in the source node:

n_blS = Σ_{j=0}^{m2+1} 1 · p_{m1+2,j} + 1 · p_{m1+1,m2+2}    (4)

2. The average number of blocked jobs in server A:

n_blA = Σ_{i=0}^{m1+1} 1 · p_{i,m2+2}    (5)

3. The average number of blocked jobs in server B:

n_blB = Σ_{j=0}^{m2} 1 · p_{m1+3,j}    (6)

4. The average number of active (non-blocked) jobs in server A:

l_A = Σ_{i=1}^{m1+2} Σ_{j=0}^{m2+1} 1 · p_{i,j} + Σ_{j=0}^{m2} 1 · p_{m1+3,j}    (7)

5. The average number of active (non-blocked) jobs in server B:

l_B = Σ_{i=0}^{m1+2} Σ_{j=1}^{m2+1} 1 · p_{i,j} + Σ_{i=0}^{m1+1} 1 · p_{i,m2+2}    (8)

6. The average number of jobs in the first buffer v_A:

v_A = Σ_{i=2}^{m1+1} Σ_{j=0}^{m2+1} (i−1) · p_{i,j} + Σ_{i=1}^{m1} i · p_{i,m2+2} + Σ_{j=0}^{m2+1} m1 · p_{m1+2,j} + Σ_{j=0}^{m2} m1 · p_{m1+3,j} + m1 · p_{m1+1,m2+2}    (9)

7. The average number of jobs in the second buffer v_B:

v_B = Σ_{i=0}^{m1+2} Σ_{j=2}^{m2+1} (j−1) · p_{i,j} + Σ_{i=0}^{m1+1} m2 · p_{i,m2+2} + Σ_{j=2}^{m2} (j−1) · p_{m1+3,j}    (10)

8. The mean blocking time in the source node:

t_blS = n_blS · 1/µ_A    (11)

9. The mean blocking time in server A:

t_blA = n_blA · 1/µ_B    (12)

10. The mean blocking time in server B:

t_blB = n_blB · 1/µ_A    (13)

11. The mean waiting time in buffer A:

w_A = v_A · (1/µ_A + t_blA)    (14)

12. The mean waiting time in buffer B:

w_B = v_B · (1/µ_B + t_blB)    (15)

13. The mean response time of jobs at server A:

q_A = w_A + 1/µ_A + t_blA    (16)

14. The mean response time of jobs at server B:

q_B = w_B + 1/µ_B + t_blA    (17)

15. The mean overall network response time of a job:

t_res = 1/λ + t_blS + (1/σ) · q_A + ((1−σ)/σ) · q_B    (18)

where 1/σ and (1−σ)/σ are the numbers of visits of a job to server A and server B, respectively (obtained directly from the traffic equations – see Fig. 1).

16. Server A and B utilization parameters:

ρ_A = l_A + n_blA   and   ρ_B = l_B + n_blB    (19)

17. Source node blocking probability p_blS:

p_blS = Σ_{j=0}^{m2+1} p_{m1+2,j} + p_{m1+1,m2+2}    (20)

18. Server A blocking probability p_blA:

p_blA = Σ_{i=0}^{m1+1} p_{i,m2+2}    (21)

19. Server B blocking probability p_blB:

p_blB = Σ_{j=0}^{m2} p_{m1+3,j}    (22)

20. The effective input rate (intensity):

λ_eff = 1 / (1/λ + t_blS)    (23)
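For illustration, the blocking probabilities (20)-(22) and the effective input rate (23) could be evaluated from a computed probability vector as in the sketch below (the dictionary p of state probabilities and the function name are assumptions of this example, not part of the paper):

def qos_measures(p, m1, m2, lam, muA):
    """Blocking probabilities (20)-(22) and effective input rate (23)
    computed from steady-state probabilities p[(i, j)]."""
    p_blS = sum(p.get((m1 + 2, j), 0.0) for j in range(m2 + 2)) + p.get((m1 + 1, m2 + 2), 0.0)
    p_blA = sum(p.get((i, m2 + 2), 0.0) for i in range(m1 + 2))
    p_blB = sum(p.get((m1 + 3, j), 0.0) for j in range(m2 + 1))
    n_blS = p_blS                          # each blocked-source state holds one blocked job, eq. (4)
    t_blS = n_blS / muA                    # mean source blocking time, eq. (11)
    lam_eff = 1.0 / (1.0 / lam + t_blS)    # effective input rate, eq. (23)
    return p_blS, p_blA, p_blB, lam_eff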
5. Numerical results
In this section, we present numerical results of the network with blocking that illustrate the qualitative analysis of the previous sections. To demonstrate this, the following
configuration was chosen: the inter-arrival rate λ from the source station to server A is
changed within a range from 0.5 to 5.0. The service rates in server A and server B are
equal to: µA = 1.2, µB = 3.0. The depart probability σ is chosen as 0.3 and the buffer
capacities are equal to: m1 = 4, m2 = 5.
For this model with blocking, the following results were obtained; the majority of
them are presented in Figure 3 and Table 1.
[Chart: "Measures of effectiveness (1)" – results plotted against the input stream intensities]

Figure 3. Graphs of QoS parameters, where blA-pr is the server A blocking probability, blS-pr is the source station blocking probability, blB-pr is the server B blocking probability, lam-eff is the effective input rate, and util-A and util-B are the server utilization coefficients.
Table 1. The measures of effectiveness
λ      wA      wB      tblA    tblS    tres     tblB
0.5    1.651   0.030   0.000   0.048   11.259   0.036
1.0    3.217   1.485   0.018   0.269   19.760   0.294
1.5    3.439   2.260   0.037   0.430   22.119   0.260
2.0    3.495   2.310   0.046   0.518   22.253   0.209
2.5    3.524   2.268   0.051   0.574   22.143   0.174
3.0    3.542   2.214   0.054   0.612   22.005   0.143
3.5    3.555   2.165   0.057   0.641   21.877   0.131
4.0    3.564   2.122   0.059   0.662   21.766   0.117
4.5    3.571   2.085   0.060   0.679   21.672   0.105
5.0    3.576   2.054   0.062   0.693   21.590   0.096
For the second group of experiments the following parameters were chosen: the service rates at station A and station B are equal to µA = 4.6 and µB = 2.7. The inter-arrival rate λ from the source station to station A is 2.4. The feedback probability 1−σ is varied within the range from 0.1 to 0.9 (with step 0.1). The buffer capacities are: m1 = 3, and m2 is varied within the range from 1 to 9 (with step 1). For this model the following results were obtained; the majority of them are presented in Table 2 and Fig. 4.
Table 2. The measures of effectiveness
m2    1−σ    vA      vB      lA      lB      ρA
1     0.1    0.525   0.008   0.556   0.093   0.557
2     0.2    0.636   0.048   0.605   0.202   0.607
3     0.3    0.781   0.157   0.660   0.333   0.662
4     0.4    0.983   0.452   0.716   0.491   0.725
5     0.5    1.363   1.463   0.756   0.689   0.808
6     0.6    2.347   4.777   0.661   0.891   0.950
7     0.7    2.802   6.719   0.539   0.932   0.993
8     0.8    2.942   7.905   0.464   0.955   0.999
9     0.9    2.989   8.969   0.411   0.978   1.000
[Chart: "Measures of effectiveness (2)" – results plotted against the buffer m2 capacity]

Figure 4. Graphs of QoS parameters, where blA-pr is the server A blocking probability, blS-pr is the source station blocking probability, blB-pr is the server B blocking probability, and util-A and util-B are the server utilization coefficients.
The results of the experiments clearly show that blocking is a very important mechanism for controlling and regulating the intensity of the arriving job stream. As noted above, the feedback probability 1−σ and the blocking factor considerably change the performance measures in such networks. Figures 3-4 and Tables 1-2 illustrate the dependence of the measures of effectiveness and the QoS parameters on the blocking mechanism and on the feedback probability.
6. Conclusions
In this paper, we investigated the problem of analytical (mathematical) modelling
and calculation of the stationary state probabilities for a two-server computer network
with blocking. We have developed an analytical, queuing-based model of the blocking characteristics in this kind of network. Fundamental performance characteristics and station blocking probabilities of such a network were derived, followed by numerical examples. The results confirm the importance of a special treatment for models with blocking and feedback, which justifies this research. The results can be used for capacity planning and performance evaluation of real-time computer networks where blocking and feedback are present. Moreover, this proposal is useful in designing buffer sizes or channel capacities for a given blocking probability requirement.
Acknowledgements:
This work is supported by the Bialystok Technical University S/WI/5/08 grant.
References
[1] Balsamo S., de Nitto Persone V., Onvural R. Analysis of Queueing Networks with
Blocking. Kluwer Academic Publishers, 2001.
[2] Balsamo S., de Nitto Persone V., Inverardi P. A review on queueing network models with finite capacity queues for software architectures performance prediction.
Performance Evaluation, No 51(2-4), 2003, pp. 269-288.
[3] Badrah A., et al. Performance evaluation of multistage interconnection networks
with blocking – discrete and continuous time Markov models. Archiwum Informatyki Teoretycznej i Stosowanej, No 14(2), 2002, pp. 145-162.
[4] Boucherie R.J., van Dijk N.M. On the arrival theorem for product form queueing
networks with blocking, Performance Evaluation , No 29(3), 1997, pp. 155-176.
[5] Economou A., Fakinos D. Product form stationary distributions for queueing networks
with blocking and rerouting. Queueing Systems, No 30(3/4), 1998, pp.
251-260.
[6] Gomez-Corral A., Martos M.E. Performance of two-stage tandem queues with
blocking: The impact of several flows of signals. Performance Evaluation, No 63,
2006, pp. 910-938.
[7] Mei van der R.D. et al. Response times in a two-node queueing network with feedback. Performance Evaluation, No 49, 2002, pp. 99-110.
[8] Oniszczuk W. Analysis of an Open Linked Series Three-Station Network with
Blocking. In: Pejaś, J., Saeed K. (eds) Advances in Information Processing and Protection, Springer Science+Business Media, LLC, 2007, pp. 419-429.
[9] Perros H.G. Queuing Networks with Blocking. Exact and Approximate Solution.
Oxford University Press, 1994.
[10] Stewart W. J. Introduction to the Numerical Solution of Markov Chains. Princeton
University Press, 1994.
Directed threshold signcryption scheme from bilinear
pairing under sole control of designated signcrypter
Jerzy Pejaś
Faculty of Computer Science and Information Technology,
Szczecin University of Technology
Abstract:
The paper presents a new ID-based directed threshold signcryption scheme derived from a bilinear pairing and gap Diffie-Hellman groups. This scheme (called ID-DTS-DS) combines the functionalities of signature and encryption and allows a designated signcrypter to prepare an encrypted and signed message in cooperation with an authorized subset of shareholders who are in possession of the shares related to the signcrypter's private key. Furthermore, our scheme guarantees that the encryption and signature process can be successfully finished only under the signcrypter's sole control. We analyze the security of the ID-DTS-DS scheme and show its correctness, verifiable directedness, public non-repudiation verifiability, confidentiality and unforgeability.
Keywords:
Bilinear pairing, signcryption scheme, ID-based signatures, directed signature scheme.
1. Introduction
The European Union Directive on electronic signatures [[24]] defines the legal framework for the usage of the electronic signature. This framework describes the conditions under which an electronic signature can be regarded as legally equivalent to a handwritten signature. One of them is the most important: an electronic signature should be created using a secure-signature-creation system that the signatory can maintain under his sole control.
This requirement means that both in private and in public environments a signatory must protect his or her private key against unauthorized usage. Of course, the protection of the signing process is a crucial issue in a public and distributed environment where some components of a secure-signature-creation system are located in different places (see W. Chocianowicz, et al. [[25]]).
Other problems concern the different security expectations of a signatory and a message receiver, respectively. A signatory usually signs a document with a given receiver in mind. For example, this is the case when two parties are going to enter into an agreement and the signatory wants to sign this agreement with a particular well-known receiver. On the other side, in many practical cases the receiver of a signed document wants the third party to know nothing about the origin and validity of the signed message without the help of the signatory or the designated verifier (some examples are signatures on medical records, tax information or messages of routing protocols). The signatures that meet such requirements are called directed signatures and were first proposed by C. H. Lim and P. J. Lee [[1]].
Due to the properties of directed signatures, only the designated verifier can be convinced that the signature actually comes from the proper signer. However, when a dispute between a signatory and a verifier arises, both the signatory and the designated verifier in a directed signature scheme can prove to a third party that the signature is valid (see R. Lu, et al. [[2]]).
A directed signature scheme ensures authenticity and non-repudiation of a message, but in many situations we also want to enjoy confidentiality of the signed message. In other words, we need to achieve these features simultaneously. A traditional approach to this objective is to "sign-then-encrypt" the message, or otherwise to employ special cryptographic schemes. In 1997, Yuliang Zheng proposed a novel public key cryptographic primitive that combines encryption and signing in one step at a lower computational cost, which is called signcryption (Y. Zheng [[3]]). Y. Zheng claimed that the schemes provide data integrity, confidentiality and non-repudiation. But achieving non-repudiation for signcryption is not a trivial task. The reason is simple: the signcrypted message is "encrypted".
A solution to the signcryption non-repudiation problem is given, for example, by J. Malone-Lee (J. Malone-Lee [[4]]). His non-interactive scheme seems to be more attractive for solving the signcryption non-repudiation problem. Simply, a signcrypted message always contains non-repudiation evidence – we only need to extract it from there and deliver it to the third party for verification in case a dispute arises.
Directed signature non-repudiation causes difficulties similar to those of signcryption schemes. However, now the signature receiver is able to prove the validity of the signature to any third party without disclosing his/her private signing key. Some examples of such schemes are presented in the papers of R. Lu et al. [[2]] and M. Kumar [[5]].
The current work is an extended and modified version of the article [[26]], which contains a proposal for combining a directed signature scheme and a signcryption scheme. We believe that, due to the special form of the proposed signcryption scheme (ID-DTS-DS), the signing process can be under the sole control of a designated signcrypter. This special form is based on three basic building blocks: a verified Shamir's secret sharing scheme (R. Lu et al. [[2]], A. Shamir [[6]]), the concept of an ID-based system (A. Shamir [[7]], D. Boneh, M. Franklin [[8]]) and a bilinear pairing (A. Joux, K. Nguyen [[10]], E.M. Ng [[11]]). The most important is the ID-based key generation process, which is supervised by the signcrypter and the receiver as well (see Section 4). This means that each of them is in possession of a secret unknown to the other.
The rest of this paper is organized as follows. In the next section, we summarize the bilinear pairings and the complexity assumptions on which we build. Then, in Section 3 we define a secure Verifiable Secret-Sharing scheme. Our new (m, n) directed threshold signcryption (ID-DTS-DS) scheme is presented in Section 4, followed by the security analysis in Section 5. Finally, we conclude the paper in Section 6.
2. Bilinear pairing maps and hard cryptographic problems
Bilinear pairing (see D. Boneh [[12]], D. Boneh and M. Franklin [[8]]) is a mathematical structure that has recently been applied extensively in cryptography. It gives rise to
many cryptographic schemes that are yet to be (efficiently) constructed using other
cryptographic primitives, e.g. aggregate signature (D. Boneh et al. [[13]]) and short
signature (D. Boneh et al. [[14]]). We summarize some concepts of bilinear pairings
using similar notions in [[9], [15]]. Let ( G1 , +) and ( G2 , +) be two cyclic abelian
groups of prime order q (sometimes G1 is also written multiplicatively in the literature).
Below, we consider P and Q as two generators of G1 (aP and bQ denote P and Q added to themselves a and b times, respectively). Because q is prime, it turns out that qP = O (here O denotes the zero element of G1); we say that P has order q and that P generates G1* (= G1 \ {O}).
The bilinear pairing is given as e: G1 × G1 → G2 such that it is:
– bilinear: e(aP, bQ) = e(abP, Q) = e(P, abQ) = e(P, Q)^{ab} for all P, Q ∈ G1 and all a, b ∈ Z*q; equivalently, this can be restated in the following way: for P, Q, R ∈ G1, e(P+Q, R) = e(P, R)·e(Q, R) and e(P, Q+R) = e(P, Q)·e(P, R);
– non-degenerate: there exist P, Q ∈ G1 such that e(P, Q) ≠ 1 ∈ G2; in other words, if P and Q are two generators of G1, then e(P, Q) is a generator of G2;
– computable: given P, Q ∈ G1, there is an efficient algorithm to compute e(P, Q).
To construct the bilinear pairing, we can use the modified Weil pairing (D. Boneh,
M. Franklin [[8]]) or Tate pairing (P. S. L. M. Berreto, et al. [[16]]) associated with
supersingular elliptic curves (see also A. Menezes [[17]] for a description of these pairings). With such a group G1 we can define the following hard cryptographic problems (e.g. H. Ker-Chang Chang et al. [[18]]):
– Discrete logarithm (DL) problem: given P, P' ∈ G1, find an integer n such that P' = nP whenever such an integer exists.
– Computational Diffie–Hellman (CDH) problem: given a triple (P, aP, bP) ∈ G1 for unknown a, b ∈ Z*q, find the element abP.
– Decision Diffie–Hellman (DDH) problem: given a quadruple (P, aP, bP, cP) ∈ G1 for unknown a, b, c ∈ Z*q, decide whether c ≡ ab (mod q) or not. If so, (P, aP, bP, cP) is called a valid Diffie-Hellman tuple.
– Gap Diffie–Hellman (GDH) problem: a class of problems where the CDH problem is hard but the DDH problem is easy, i.e. it is possible to solve a given instance (P, aP, bP) ∈ G1 of the CDH problem with the help of a DDH oracle (see A. Joux, K. Nguyen [[11]]) that is able to decide whether a tuple (P, a'P, b'P, c'P) ∈ G1 satisfies c' ≡ a'b' (mod q).
– Modified Generalized Bilinear Inversion (mGBI) problem (J. Baek [[20]]): given h ∈ G2 and P ∈ G1, compute P' ∈ G1 such that e(P, P') = h.
– Weak Diffie-Hellman (WDH) problem (M. Choudary Gorantla, et al. [21]): for Q ∈ G1 and for some a ∈ Z*q, given (P, Q, aP) compute aQ.
The DDH problem is easy in G1 (D. Boneh [[12]], A. Joux, K. Nguyen [[10]]). Such
groups where the CDH problem is hard but the DDH problem is easy are called Gap
Diffie-Hellman (GDH) groups. Details about GDH groups can be found in D. Boneh,
et al. [[13]].
3. Definitions for secure Verifiable Secret-Sharing (VSS)
Let us assume that a user S1) with identity IDS generates a secret (a private key) SkS ∈ G1 and wants to distribute it among n parties (e.g. n servers) indexed in a set A = {1, 2, …, n} (|A| = n), such that any m of the shareholders (1 ≤ m ≤ n < q) can recover SkS if necessary, while fewer than m shareholders get no information about SkS (we say that an authorized subset B ∈ PS(A)2), |B| = m, forms an access structure Γ(Pm,n) of some (m, n) threshold scheme). To construct our new ID-DTS-DS scheme we need to use the following secure Verifiable Secret-Sharing (VSS) scheme (based on Shamir's secret-sharing scheme) over the group G1 (compare also J. Baek, Y. Zheng [[19]]):
Distribution phase: a signcrypter S distributes his/her private key to n parties
(servers) as follows:
(a) chooses uniformly at random the coefficients ai=piP∈ G1* for 1 ≤ i ≤ m-1,
where pi∈ Z*q;
(b) defines a function f : A→ G1 such that
f(x) = Σ_{j=0}^{m−1} a_j x^j    (1)

and sets a0 = SkS.
(c) uses the polynomial f(x) for share generation, i.e. computes the values f(i) = Si for
every i ∈ A;
(d) sends these shares to the corresponding i ∈ A over private channels;
(e) computes the corresponding verification key βi = e(P, Si) for every i ∈ A and
evidences α0 = e(P, SkS),αj = e(P, pjP) for 1 ≤ j ≤ m-1;
(f) to each i ∈ A sends Si and all evidences αj for 0 ≤ j ≤ m-1 over the broadcast
channel; these values can be additionally published and may be accessed by
any shareholder.
Verification phase: after receiving the data from the signcrypter S, every party i ∈ A verifies its share Si (while keeping it secret) according to the following procedure:
(a) calculates the value e(P, Si) and compares it with the value βi received from the signcrypter S;
(b) checks whether its share Si is valid by computing:

e(P, Si) = ∏_{j=0}^{m−1} (α_j)^{i^j}    (2)

1) In our scheme such a user is called a signcrypter.
2) PS(X) denotes the power set of X, i.e. the set of all subsets of X.

(c) chooses any authorized subset B ∈ PS(A) and verifies that:

e(P, SkS) = ∏_{j∈B} β_j^{λ_j},   where λ_j = ∏_{i∈B, i≠j} i/(i−j),   λ_j ∈ Z*q    (3)
If conditions (a), (b) and (c) hold, a party broadcasts a "commit" message; otherwise, it broadcasts an "abort" message. The last condition may be repeated for other authorized subsets B, until a shareholder is assured that he possesses a share really related to the secret SkS.
Reconstruction phase: for some authorized subset B∈PS(A) the secret (the private
key) SkS can be reconstructed by computing:
f(0) = SkS = Σ_{j∈B} λ_j S_j,   where λ_j = ∏_{i∈B, i≠j} i/(i−j),   λ_j ∈ Z*q    (4)
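For illustration only, the sketch below performs the reconstruction of equation (4) over the scalar field Z_q with integer shares; a real implementation would add the shares S_j as elliptic-curve points of G1 via a pairing library, and the prime q and the example shares here are purely hypothetical:

# Reconstruction of equation (4) over the scalar field Z_q; a real implementation
# would add the shares S_j as points of the group G1 using a pairing library.
q = 2**127 - 1                             # illustrative prime group order

def lagrange_coefficient(j, B):
    """lambda_j = prod_{i in B, i != j} i / (i - j)  (mod q)."""
    num, den = 1, 1
    for i in B:
        if i != j:
            num = (num * i) % q
            den = (den * (i - j)) % q
    return (num * pow(den, -1, q)) % q

def reconstruct(shares):
    """shares: dict {index j: scalar share S_j}; returns f(0) = Sk_S mod q."""
    B = list(shares)
    return sum(lagrange_coefficient(j, B) * shares[j] for j in B) % q

# Example: the secret 1234 shared with f(x) = 1234 + 7x (m = 2) at points 1, 2, 3.
shares = {1: 1241, 2: 1248, 3: 1255}
print(reconstruct({1: shares[1], 3: shares[3]}))   # any two shares recover 1234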
It can be shown that the above VSS scheme is correct, i.e. we can state and prove the following modified lemma formulated by J. Baek in [[20]]:
Lemma 1. In the VSS scheme, the shares held by all the uncorrupted parties can be interpolated to a unique function f(x) of degree m−1, and m or more of these shares can reconstruct the secret SkS. Also, equations (2) and (3) provide a correct procedure for checking the validity of each share.
Proof. Below we prove only the correctness of equation (3), while J. Baek in [[20]] proves the remaining part of this lemma. If each share Si is correct, then by using the publicly known verification keys βi = e(P, Si) we have:

∏_{j∈B} β_j^{λ_j} = ∏_{j∈B} e(P, S_j)^{λ_j} = ∏_{j∈B} e(P, λ_j S_j)
= e(P, λ_{j1} S_{j1}) · e(P, λ_{j2} S_{j2}) ⋯ e(P, λ_{jm} S_{jm})
= e(P, Σ_{j∈B} λ_j S_j) = e(P, SkS)    (5)
The last equality in equation (5) results from equation (4).
4. New ID-based Directed Threshold Signcryption (ID-DTS) scheme
The ID-DTS-DS scheme involves four roles: a trusted party as a Private Key Generator (PKG), a signcrypter S in a double role – as a trusted dealer and a signature composer, a message signer group of n parties (n servers) indexed in a set A (see Section 3)
and a message recipient R. It consists of six algorithms: Setup, KeyGenerate, Share,
Signcrypt, Unsigncrypt (if private verifiability is sufficient) and PVerify (if public
verifiability is needed).
Setup: This algorithm is run by the PKG to generate its master key pair and all necessary common parameters. Let G1 be a cyclic additive group of order q ≥ 2^k generated by P ∈ G1, where q is a prime number and k is a security parameter, and let G2 be a cyclic multiplicative group of the same order q. The bilinear map e: G1 × G1 → G2 is specified, a master key s is picked uniformly at random from Z*q and a public key Ppub = sP is computed. Then, the following secure hash functions are selected: H1: {0, 1}* → G1* and H2: {0, 1}* × G2 → {0, 1}^t ⊆ Z*q. Let E and D be the encryption and decryption algorithms of a symmetric key cipher. The PKG publishes the system parameters params = {G1, G2, e, q, t, P, Ppub, H1, H2, E, D} and keeps s as a master key, which is known only to itself.
KeyGenerate: This protocol allows generating the keys for the sender (a signcrypter
S with identity IDS) and the recipient (the designated verifier R with identity IDR) of the
message. The protocol is taken from the work of M. Choudary Gorantla, et al. [[21]]; it eliminates the need for a secure channel and uses a signcrypter's chosen secret value to avoid the key escrow problem, which is an inherent drawback of ID-based cryptosystems. In this protocol each user first calculates his/her parameters user-params and sends them to the PKG along with his/her identifier. Then, the key generation proceeds as follows (see M. Choudary Gorantla, et al. [[21]]):
(a) A signcrypter S chooses two secret values sS1, sS2∈Z*q, calculates his/her user
parameters as user-params = {sS1sS2PkS, sS1PkS, sS2P, sS1sS2P}, where PkS =
H1(IDS||tS) and sends them to the PKG. A signcrypter S also sends his/her identifier IDS along with the user-params. The parameter tS means the validity period of the publicly calculated component PkS.
(b) PKG verifies IDS, recalculates PkS and checks whether the equalities
e(P, sS1sS2PkS) = e(sS1sS2P, PkS) = e(XS, sS1PkS) hold good, where XS = sS2P. If
not, it aborts the process.
(c) PKG calculates SkS’ = s sS1 sS2 PkS and public validation parameters PS = (XS,
YS, tS) where YS = sXS = ssS2P.
(d) PKG sends SkS’ to S and publishes PS.
(e) The signcrypter S verifies the correctness of SkS' by checking e(SkS', P) = e(sS1sS2PkS, Ppub). The signcrypter S also verifies whether the published public key component XS is equal to sS2P and calculates SkS = (sS1)^{−1} SkS' after successful verification.
The PKG performs the same calculations for the recipient R. Finally, SkS and PkS are the private and public key of the signcrypter S, while SkR and PkR are the private and public key of the recipient R. The sender and the recipient keep their private keys secret, and anyone is able to calculate the public keys and validate these values using the appropriate public validation parameters PS = (XS, YS, tS), where YS = sXS = ssS2P, or PR = (XR, YR, tR), where YR = sXR = ssR2P, which belong to the sender S or the recipient R, respectively.
Share: A designated signcrypter associated with an identity IDS performs the distribution phase of the secure Verifiable Secret-Sharing (VSS) scheme (see section 3) and
distributes all necessary information among n signature generation parties, indexed in a
set A. Each party verifies the validity of its share by performing the verification phase of the secure Verifiable Secret-Sharing (VSS) scheme. As a result, the signcrypter does not keep any share, but he or she keeps the value sS2 ∈ Z*q as a secret.
Signcrypt: A designated signcrypter is going to make a signature on a message
m ∈ {0, 1}* for the designated recipient IDR. Suppose that an authorized group of the
parties B∈PS(A) is involved in this process. Then:
(a) A designated signcrypter with IDS checks that the equality e(XR, Ppub) = e(YR,
P) holds good. If not, aborts the signcryption.
(b) Chooses a random number v ∈ Z*q and computes m̂ = H2(m || e(vPkR, YR)). The sender broadcasts this last value to each party i ∈ B.
(c) Each party i ∈ B randomly chooses two numbers ri, ki ∈ Z*q and computes σ_i = m̂ Si + ki ri P ∈ G1*. Next, these individual outputs σ_i (the disturbed partial signatures on the message m) as well as Ri = riP and Ki = kiP are sent back to the sender IDS.
(d) After receiving the partial signatures σ_i from all parties i ∈ B, the designated signcrypter verifies their correctness by checking whether the following equation holds (compare equation (2)):

e(σ_i, P) = γ_i · (∏_{j=0}^{m−1} (α_j)^{i^j})^{m̂},   γ_i = e(Ki, Ri),   ∀ i ∈ B    (6)

(e) If the individual partial signatures are valid, the sender computes the group (common) signature

σ = Σ_{j∈B} λ_j σ_j,   where λ_j = ∏_{i∈B, i≠j} i/(i−j),   λ_j ∈ Z*q    (7)

and checks its correctness:

e(σ, P) = µ · (∏_{j∈B} β_j^{λ_j})^{m̂},   µ = ∏_{j∈B} γ_j^{λ_j}    (8)

The correctness of equation (8) obviously results from equations (6) and (5).
(f) If equation (8) holds, the common signature is accepted and the sender performs four successive steps:
a. Computes two commitments Vpub = vPpub and VSR = sS2 v YR (the random number v ∈ Z*q was generated in step (b)).
b. Computes the encrypted signature σS = σ + vYR and a key k = H1(σS).
c. Computes C = Ek(m || vYS || vYR || PKS) and W = σS + sS2Vpub.
d. The final directed threshold signcryption on the message m is SDTS = (W, µ, C, Vpub, VSR).
Unsigncrypt: To unsigncrypt a signcrypted message SDTS = (W, µ, C, Vpub, VSR) from the sender IDS, the recipient IDR follows the steps below. As a result, the designated recipient recovers the original message and then checks its integrity and origin.
(a) Check the correctness of the equality:

e(W, P) = µ · e(H2(m || e(PkR, sR2Vpub)) PkS, YS) · e(sR2Vpub, P) · e((sR2)^{−1} VSR, P)    (9)
This equation should be valid because:

e(W, P) = e(σ + vYR + sS2Vpub, P) = e(σ, P) · e(vYR, P) · e(sS2Vpub, P)    (10)

where (see equation (7)):

e(σ, P) = e(Σ_{j∈B} λ_j σ_j, P) = e(m̂ Σ_{j∈B} λ_j S_j + Σ_{j∈B} λ_j k_j r_j P, P)
= e(m̂ SkS, P) · e(Σ_{j∈B} λ_j k_j r_j P, P) = e(m̂ SkS, P) · ∏_{j∈B} γ_j^{λ_j}
= η · e(H2(m || e(vPkR, YR)) s sS2 PkS, P)
= η · e(H2(m || e(PkR, sR2Vpub)) PkS, YS)    (11)
e(vYR, P) = e(v s sR2 P, P) = e(sR2 v s P, P) = e(sR2Vpub, P)    (12)

e(sS2Vpub, P) = e(sS2 v s P, P) = e(v YS, P) = e((sR2)^{−1} sS2 v s sR2 P, P) = e((sR2)^{−1} VSR, P)    (13)

Of course, the following two equations should also be fulfilled (see the KeyGenerate algorithm and equations (13)-(14)):

e(YS + YR, P) = e(XS + XR, Ppub)    (14)

e((sR2)^{−1} VSR + sR2Vpub, Ppub) = e(YS + YR, Vpub)    (15)
If equations (9), (14) and (15) are not valid, then a signcrypted message SDTS
should be rejected.
(b) Compute k = H1(σS) = H1(W − (sR2)^{−1} VSR); note that

σS = W − sS2Vpub = W − (sR2)^{−1} sS2 v s sR2 P = W − (sR2)^{−1} VSR    (16)

(c) Compute m || vYS || vYR || PKS = Dk(C).
(d) Next, calculate σ = σS − sR2Vpub and check the following equality (compare equation (11)):

e(σ, P) = η · e(H2(m || e(PkR, sR2Vpub)) PkS, YS)    (17)

Accept the common signature σ and the message m if and only if both sides of equation (17) are equal; otherwise, reject.
PVerify: If necessary (for example, to settle a dispute between the sender and the recipient), the recipient can forward the signcrypted message SDTS = (W, µ, C, Vpub, VSR) to any third party IDC and convince it that the ciphertext is the signcrypted version of a given plaintext message m produced by the sender. To make this verification the third party IDC needs additional information, i.e. VS = vYS and VR = vYR, from the sender IDS or from the recipient IDR. Then, because sS2Vpub = (sR2)^{−1} VSR = vYS and sR2Vpub = vYR (compare equations (13) and (14)):
(a) The third party checks the correctness of equations (9), (14) and (15). If these equations hold, the values (W, µ, Vpub, VS, VR) are accepted, otherwise they are rejected.
(b) If the above values are valid, the third party IDC can follow the steps below:
a. Compute k = H1(W − vYS).
b. Compute m || vYS || vYR || PKS = Dk(C) and compare vYS and vYR with the values received from the sender IDS or the recipient IDR. If the values differ, reject the signcrypted message as invalid and abandon the verification process.
c. Finally, check the following equality:

e(σ, P) = µ · e(H2(m || e(PkR, vYR)) PkS, YS)    (18)
5. Analysis of our ID-DTS-DS scheme
In this section we discuss the security aspects of the proposed scheme. The security considerations include correctness, verifiable directedness, public non-repudiation verifiability, confidentiality and unforgeability, as follows.
Correctness. From Theorem 1 we can see that the correctness of our scheme is sound.
Theorem 1. Given a valid signcrypted message SDTS = (W, µ, C, Vpub, VSR), following
the steps in the ID-DTS-DS directed threshold signcryption scheme, the recipient will
surely recover and verify the message m from the signcrypted message.
Proof: The proof is obvious and results from the construction of the Unsigncrypt algorithm.
Verifiable directedness. A directed threshold signcryption scheme is said to have the property of verifiable directedness if the invisibility and transitivity conditions are
satisfied (see R. Lu et al. [[2]]).
Theorem 2. The proposed ID-DTS scheme is indeed a directed threshold signcryption scheme.
Proof: The detailed proving techniques of the above statement are similar to those
presented in the paper of R. Lu et al. [[2]] and are omitted here.
Public Non-repudiation Verifiability. An ID-based directed threshold signcryption
scheme is said to provide public non-repudiation verifiability if given a signcrypted
message and possibly some additional information provided by the recipient, a third
party can verify that signcrypted message (and all information included in it) can be
proven to have originated from a specific sender, without knowing the recipient’s private key.
Theorem 3. The proposed ID-DTS scheme is indeed a public non-repudiation verifiability scheme.
Proof: The implication of Theorem 2 is that our scheme satisfies the transitivity property. This means that, with the help of the signer IDS or the designated verifier IDR, any third party can check the validity of a signcrypted message SDTS = (W, µ, C, Vpub, VSR), i.e. it can recover the message m and verify the signature σ as well (see the PVerify algorithm).
When a signature σ is valid, then the equation e(σ, P) = µ · e(H2(m || e(PkR, vYR)) PkS, YS) must be met. This implies that the source of this signcrypted message SDTS is the signer IDS with public key PkS and, moreover, that this message was intended for the receiver IDR. Additional proof of the correctness of these conclusions also results from equations (9), (14) and (15), which bind together the identities of the signer and the receiver. We also observe that, because of the KeyGenerate algorithm, the signer cannot claim that the PKG knows the signcrypter's private key and is capable of signing any messages at will.
Confidentiality. Decryption of C = Ek(m || vYS || vYR || PKS) requires knowledge of the session key k = H1(σ + vYR). A passive or an active adversary has no information about the value v or sR2. It is difficult to calculate vYR or sR2·Vpub from the public values YR or Vpub only, since this requires solving the discrete logarithm problem without knowledge of v or sR2 (note that the last value is the secret part of the private key owned by the receiver R).
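To make the role of the session key in this argument concrete, below is a toy sketch (not the scheme's actual primitives): the group-element values vYS, vYR and PKS are replaced by plain byte strings, H1 by SHA-256, and Ek by a simple SHA-256-based stream XOR; all names and values are purely illustrative.

import hashlib

def keystream(k: bytes, n: int) -> bytes:
    out = b""
    ctr = 0
    while len(out) < n:
        out += hashlib.sha256(k + ctr.to_bytes(4, "big")).digest()
        ctr += 1
    return out[:n]

def E(k: bytes, pt: bytes) -> bytes:            # toy stand-in for E_k (stream XOR)
    return bytes(a ^ b for a, b in zip(pt, keystream(k, len(pt))))

D = E                                           # for a stream XOR, D_k is the inverse of E_k

# toy stand-ins for the values bound into the ciphertext
m, vYS, vYR, PKS = b"message", b"vYS-value", b"vYR-value", b"PKS-value"
k = hashlib.sha256(b"sigma-plus-vYR").digest()  # stands in for k = H1(sigma + vYR)

C = E(k, b"||".join([m, vYS, vYR, PKS]))        # C = E_k(m || vYS || vYR || PKS)
recovered = D(k, C).split(b"||")
print(recovered[0], recovered[1:] == [vYS, vYR, PKS])   # b'message' True

In the real scheme, the inability of an adversary to derive k corresponds to the hardness of computing vYR or sR2·Vpub discussed above.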
Unforgeability. Let us assume the adversary is allowed to corrupt the designated verifier R and at most m−1 parties in the authorized group B. Such a corrupted group is not able by itself to reconstruct the private signing key SkS and therefore to generate the correct final signature σ. Moreover, the signer S can detect each attempt at making such forgeries.
On the other hand, even knowing the m−1 partial signatures σji, ji ∈ B (for i = 1, …, m−1) created by the controlled parties, the adversary is not able to calculate the share Sjm, jm ∈ B, of the m-th honest party, and hence cannot forge a valid final signature σ.
6. Conclusions
The ID-DTS-DS scheme presented in this paper allows a designated signcrypter to control the signcryption process. This is done in such a way that shares of the signer's private key are distributed among n different shareholders (servers), and an authorized subset of shareholders is able to prepare a valid signcrypted message only when the designated signcrypter prepares two commitments, Vpub = v·Ppub and VSR = sS2·vYR. The second one depends on a secret sS2 known to the designated signcrypter and being a part of his or her private key. Furthermore, nobody knows either the verifier or the signcrypter of a signcrypted message unless the verifier agrees to disclose their identities.
We analyze different security aspects of the proposed scheme and, among them, informally show confidentiality and unforgeability. However, these last aspects require techniques from provable security, such as the sequence of games (a.k.a. the game hopping technique) that the adversary plays, starting from the real attack game (see V. Shoup [22] and B. Blanchet, D. Pointcheval [23]). Due to space limitations, formal security proofs of the proposed ID-DTS scheme's confidentiality under adaptive chosen-ciphertext-and-identity attacks and of its existential unforgeability under adaptive chosen-message-and-identity attacks are omitted here.
References
[1] Lim C. H., Lee P. J. Modified Maurer-Yacobi’s Scheme and its Applications, Proceedings of the Workshop on the Theory and Application of Cryptographic Techniques: Advances in Cryptology, Lecture Notes In Computer Science; Vol. 718,
1992, pp. 308 - 323
[2] Lu R., Lin X., Cao Z., Shao J., Liang X. New (t,n) threshold directed signature
scheme with provable security, Information Sciences, vol. 178 (2008), pp.756–765
[3] Zheng Y. Digital Signcryption or How to Achieve Cost (Signature & Encryption)
<< Cost(Signature) + Cost(Encryption), Advances in Cryptology, Lecture Notes in
Computer Science, volume 1294, pages 165–179. Springer-Verlag, 1997
[4] Malone-Lee J. Signcryption with non-interactive non-repudiation, Technical Report CSTR-02-004, Department of Computer Science, University of Bristol, 2004.
[5] Kumar M. A Cryptographic Study Of Some Digital Signature Schemes, PhD Thesis, Ambedkar University, AGRA-INDIA, 2005, http://arxiv.org/abs/cs/0501010v2
[6] Shamir A. How to share a secret, Communications of the ACM, 22(11):612–613,
November 1979
[7] Shamir A. Identity-based cryptosystems and signature schemes, Advances in Cryptology-Crypto 1984, LNCS 196, pp.47-53, Springer-Verlag, 1984
[8] Boneh D., Franklin M. Identity Based Encryption from the Weil Pairing, SIAM
Journal of Computing, Vol. 32, No. 3, pp. 586-615, 2003
[9] Sherman S. M. Chow Forward Security from Bilinear Pairings: Signcryption and
Threshold Signature, MSc Thesis, University of Hong Kong, August 2004
[10] Joux A., Nguyen K. Separating decision Diffie-Hellman from Diffie-Hellman in
cryptographic groups, Journal of Cryptology, 2003, 16(4), pp. 239-247
[11] Ng E. M. Security Models and Proofs for Key Establishment Protocols, Msc thesis,
University of Waterloo, Ontario, Canada, 2005
[12] Boneh D. The decisional Diffie-Hellman problem, in Third Algorithmic Number
Theory Symposium, pages 48–63, Springer-Verlag, 1998.
[13] Boneh D., Gentry C., Lynn B., Shacham H. Aggregate and Verifiably Encrypted
Signatures from Bilinear Maps, Lecture Notes in Computer Science, Vol. 2656, pp.
416–432, Springer, 2003.
[14] Boneh D., Lynn B., Shacham H. Short Signatures from the Weil Pairing, Lecture
Notes in Computer Science, vol.2248, pp. 514–532, Springer, 2001.
[15] Sherman S. M. Chow, Lucas C.K. Hui, S.M. Yiu, K.P. Chow. Forward-secure
multisignature and blind signature schemes, Applied Mathematics and Computation 168 (2005), pp.895–908
[16] Barreto P. S. L. M., Kim H. Y., Scott M. Efficient algorithms for pairing-based
cryptosystems, Advances in Cryptology – Crypto 2002, Lecture Notes in Computer
Science Vol.2442, Springer-Verlag (2002), pp. 354-368.
[17] Menezes A. Elliptic curve public key cryptosystems, Kluwer Academic Publishers,
1995.
[18] H. Ker-Chang Chang, Erl-Huei Lu, Pin-Chang Su Fail-stop blind signature scheme
design based on pairings, Applied Mathematics and Computation 169 (2005),
pp. 1324–1331
[19] Baek J., Zheng Y. Identity-Based Threshold Signature from the Bilinear Pairings,
Proceedings of ITCC 2004, Track, IEEE Computer Society, 2004
[20] Baek J. Construction and Formal Security Analysis of Cryptographic Schemes in
the Public Key Setting, PhD Thesis, Monash University, January, 2004
[21] M. Choudary Gorantla, Raju Gangishetti, Manik Lal Das, Ashutosh Saxena. An
Effective Certificateless Signature Scheme Based on Bilinear Pairings, Proceedings
of the 3rd International Workshop on Security in Information Systems, WOSIS
2005, Miami, USA, May 2005, INSTICC Press 2005
[22] Shoup V. Sequences of games: a tool for taming complexity in security proofs,
Cryptology ePrint Archive, 2004/332
[23] Blanchet B., Pointcheval D. Automated Security Proofs with Sequences of Games,
CRYPTO'06, Lecture Notes on Computer Science, Santa Barbara, CA, August
2006. Springer Verlag
[24] Directive 1999/93/EC of the European Parliament and of The Council of 13 December 1999 on a Community framework for electronic signatures, Official Journal
of the European Communities, January 19, 2000.
[25] Chocianowicz W., Pejaś J., Ruciński A. The Proposal of Protocol for Electronic
Signature Creation in Public Environment, in Enhanced Methods in Computer Security, Biometric and Artificial Intelligence Systems, Springer New York 2005
[26] Pejaś J. ID-based Threshold Directed Signcryption Scheme Using a Bilinear Pairing, Polish Journal of Environmental Studies
Lack of information – average distribution
of probability density
Andrzej Piegat¹, Marek Landowski¹,²
¹ Faculty of Computer Science and Information Technology, Szczecin University of Technology
² Quantitative Methods Institute, Szczecin Maritime University
Abstract:
In many real problems we have to face information gaps. The paper presents the method of decreasing granulation of elementary events, which finds an average distribution of probability density representing an infinite number of distributions. If we know the range of the unknown value and have qualitative knowledge about the general character of its distribution, then we can use this knowledge to surmount the information gap in a better way than the way proposed by the principle of indifference. According to the authors' knowledge the concept of the safe distribution is new and unknown in the literature. It is of fundamental importance for probability theory.
Keywords:
Bayesian networks, prior probability distribution, principle of indifference, information
gaps, uncertainty theory.
1. Introduction
Solving problems under uncertainty (partial lack of knowledge) is one of the most
difficult aims of artificial intelligence (AI). People can solve such problems. To make
AI comparable with the human intelligence it has to be also able to solve problems
under uncertainty. Problems of information gaps are being intensively investigated at
present [7]. An information gap means lack of knowledge about values of variables,
about distributions of their probability or possibility, about variability intervals, etc. The
problem of information gaps is a common one in Bayesian networks where the prior
distributions are necessary but they frequently are unknown.
Prior distributions are also necessary in all other problems where Bayes' rule [5] is
applied. Let A and B be two events. The conditional probability p(A|B) of event A is not
known but necessary for a problem solving. If the inverse conditional probability p(B|A)
and the prior probability p(A) are known then the unknown probability p(A|B) can be
calculated with Bayes' theorem [5],
p( A | B) = α ⋅ L( B | A) ⋅ p( A)
(1)
where: α – normalizing coefficient, L(B|A) = p(B|A) – likelihood of event A given B.
Bayes' theorem is used in probabilistic, automated reasoning. Unfortunately, it
requires knowledge of the prior probability p (A) that frequently is not known. Also in
many other problems, which have nothing to do with Bayes' rule, some data frequently
are unknown.
E.g., to solve a problem we need the numerical value x* of variable x. But its value is unknown (an information gap). Experts can then give us the interval [xmin, xmax] in which the value is contained. An example of such a simple problem is the famous Bertrand problem [3].
"The train leaves at noon to travel a distance of 300 km. It travels at a speed of between 100 km/h and 300 km/h. What is the probability that it arrives before 2 p.m.?"
To answer the above question, knowledge of the probability density distribution (function) pdf(v) is necessary. However, this distribution is unknown. Can we, in spite of all, answer the question? Of course, we cannot give a precise and 100% sure answer, because we are dealing with partial ignorance. Nevertheless, we can try to give an approximate and surely not the worst solution of the problem with the use of the principle of indifference (the PI) by Laplace [2], [3], [5]. This principle suggests assuming the uniform, rectangular distribution, as shown in Fig. 1.
Figure 1. The uniform distribution of probability density (pd) of the train velocity in Bertrand's
problem suggested by the principle of indifference
If the train has to travel the distance of 300 km in a time shorter than 2h, its velocity
must be higher than 150 km/h. Probability of this event for the uniform distribution of
the velocity equals 3/4. Thus, the PI allows us to get a solution of problems with
information gaps. However, it has been attacked and is at present attacked by its
opponents, e.g. in [1] and [7].
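For concreteness, the arithmetic behind the value 3/4 quoted above, under the uniform assumption on the interval [100, 300] km/h:

P(v > 150) = (300 − 150) / (300 − 100) = 150/200 = 3/4.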
Thus Yakov writes in [7]: "The uncertainty is an information gap .... We have no information, not even probabilistic. By the principle of indifference elementary events, about which nothing is known, are assigned equal probabilities. Knowing a specific probability is knowledge."
The above argumentation is not correct. The PI does not claim that the uniform distribution is the real distribution. It only suggests assuming this distribution to get a credible, approximate solution of the problem. This principle should only be used if we have to take a decision and we need a possibly good solution of a problem in conditions of ignorance. O. Magidor gives in [3] a convincing explanation of the question:
“Assume that two alternatives are equally probable if you do not have any reason
not to do so. Such formulations present action-guiding principles. They do not claim
that in the case mentioned the probabilities are equal - they simply direct us to assume
so. In these cases, the principle can at best be viewed as a rationality principle.”
However, is the uniform distribution of probability really a rational assumption?
A. Piegat carried out investigations, published in [4], that were based on the method
of decreasing granulation of elementary events (DGEE-method). The investigations
showed that the uniform distribution is the average distribution of the infinite number of
all possible distributions that can exist in a given interval [xmin, xmax] of the variable.
Assuming the uniform pdf(x) we minimize the sum of relative errors in relation to the
real distribution, independently of which form this distribution has! Thus, the uniform
distribution can be called not only a rational one but also a safe one, in the sense of the
absolute error sum minimization.
2. The Average, Unimodal Distribution of Probability Density – Version without Boundary Distributions (AUPDD-VwBD)
Let us assume that our knowledge concerning the real value x* of variable x is as follows:
I. The value x* is contained in the interval [xmin, xmax]. For simplicity let us assume the normalized interval [0, 1].
II. The pdf(x) is a unimodal one. However, we do not know whether it is symmetric or asymmetric. Therefore, we have to allow the left asymmetry, the symmetry, and the right asymmetry of the pdf(x).
III. We know that the distribution is not a boundary one, which means that its maximum lies neither at the right boundary 1 nor at the left boundary 0 of the interval [0, 1]. It lies between the boundaries.
Fig. 2 presents a few examples of unimodal distributions which satisfy conditions I-III.
Figure 2. Few examples of unimodal distributions from the infinite number of distributions that
satisfy conditions I, II, III
The number of possible unimodal, non-boundary (UnB) distributions is infinitely large. The real pdf(x)-distribution in the considered problem can be any one of them. Is it, in general, possible to determine the average distribution of an infinite number of distributions?
It seems impossible at first glance. However, there exists a possibility. The average distribution can be determined with the method of decreasing granulation of elementary events that was proposed by A. Piegat in [4]. According to condition I the variable x can take values only in the normalized interval [0, 1]. This interval can be partitioned into n subintervals ∆xi of width 1/n. Let i be the number of a subinterval, 1 ≤ i ≤ n. As an elementary event we understand the event x* ∈ ∆xi. By the granulation of the elementary event we understand the width 1/n of the subinterval ∆xi of the event. Not only the variable x but also the probability p of elementary events is granulated. In this case the granulation of probability 1/n means that probability can take only (n+1) discrete values, e.g. for n = 3 probability can only take the values p ∈ {0, 1/3, 2/3, 1}. Now, let us consider the question how many distributions are possible for the granulation 1/3. This granulation is assumed both for the variable x and for the probability p of elementary events. All possible UnB-distributions (histograms) for this case are shown in Fig. 3.
Figure 3. All 4 possible UnB-distributions of probability for the granulation 1/3 of the variable x
and of probability p
On the basis of all possible distributions from Fig. 3 the average probabilities p1, p2, p3 of the particular elementary events x ∈ [0, 1/3], x ∈ [1/3, 2/3], and x ∈ [2/3, 1] can be calculated:
p1 = (1/4)·Σ_{j=1}^{4} p1j = (1/4)·(0 + 1/3 + 1/3 + 0) = 1/6
p2 = (1/4)·Σ_{j=1}^{4} p2j = (1/4)·(1 + 2/3 + 1/3 + 2/3) = 4/6
p3 = (1/4)·Σ_{j=1}^{4} p3j = (1/4)·(0 + 0 + 1/3 + 1/3) = 1/6
The average UnB-distribution for the granulation 1/3 is shown in Fig. 4.
Figure 4. The average UnB-distribution for granulation 1/3 in the form of a histogram (a)
and of pdf(x) (b)
In the second step of the method of decreasing granulation, the granulation was decreased from 1/3 to 1/4. Decreasing the granulation causes an increase in the number of possible UnB-distributions. This time they are shown in the form of a table (Tab. 1).
Table 1. The 12 possible distributions of probability at granulation 1/4 and the average probability of the particular elementary events i, i ∈ {1, 2, 3, 4}

Distribution j   | P1j  | P2j  | P3j  | P4j
1                | 0    | 4/4  | 0    | 0
2                | 1/4  | 3/4  | 0    | 0
3                | 0    | 3/4  | 1/4  | 0
4                | 1/4  | 2/4  | 1/4  | 0
5                | 0    | 2/4  | 2/4  | 0
6                | 0    | 2/4  | 1/4  | 1/4
7                | 0    | 0    | 4/4  | 0
8                | 0    | 1/4  | 3/4  | 0
9                | 0    | 0    | 3/4  | 1/4
10               | 0    | 1/4  | 2/4  | 1/4
11               | 1/4  | 1/4  | 2/4  | 0
12               | 1/4  | 1/4  | 1/4  | 1/4
Σ_{j=1}^{12} pij | 4/4  | 20/4 | 20/4 | 4/4
Pi aver          | 1/12 | 5/12 | 5/12 | 1/12
pd aver          | 1/3  | 5/3  | 5/3  | 1/3
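The averaging just illustrated can be reproduced mechanically. Below is a small sketch (not from the paper) that, for a given granulation 1/n, enumerates all histograms with probabilities that are multiples of 1/n, keeps those that are weakly unimodal and non-boundary, and averages them. One detail is our own reading of the non-boundary condition, inferred from the cases listed above: the uniform histogram is admitted, while a plateau of maximal bins touching a boundary is not. With this reading the sketch reproduces the 4 histograms and the average (1/6, 4/6, 1/6) for granulation 1/3, and the 12 rows and averages of Table 1 for granulation 1/4.

from fractions import Fraction
from itertools import product

def unb_histograms(n):
    """Histograms (p1, ..., pn) with p_i = k_i/n and sum 1 that are weakly unimodal
    and whose maximum is not attained at a boundary bin (uniform admitted)."""
    result = []
    for ks in product(range(n + 1), repeat=n):
        if sum(ks) != n:
            continue
        m = max(ks)
        peak = ks.index(m)                                  # first position of the maximum
        unimodal = all(a <= b for a, b in zip(ks[:peak + 1], ks[1:peak + 1])) and \
                   all(a >= b for a, b in zip(ks[peak:], ks[peak + 1:]))
        uniform = len(set(ks)) == 1
        if unimodal and (uniform or (ks[0] < m and ks[-1] < m)):
            result.append(tuple(Fraction(k, n) for k in ks))
    return result

def average(histograms):
    return [sum(h[i] for h in histograms) / len(histograms) for i in range(len(histograms[0]))]

for n in (3, 4):
    hs = unb_histograms(n)
    print(n, len(hs), average(hs))
# 1/3: 4 histograms, average (1/6, 2/3, 1/6); 1/4: 12 histograms, average (1/12, 5/12, 5/12, 1/12)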
Fig. 5 shows the comparison of the average pdf(x) for the granulations 1/3 and 1/4.
Figure 5. The average distribution of probability density (pd) of the 4 possible distributions from Fig. 3 for the granulation 1/3 (a) and the average distribution of the 12 possible distributions for the granulation 1/4 (b)
If we stepwise decrease the granulation of the variable x and of the probability p (1/3, 1/4, 1/5, 1/6, …, 1/n), then we more and more approach a limiting distribution that represents the infinite number of possible UnB-distributions and which corresponds to the infinitely small granulation 1/n: n → ∞. The increase in the number of distributions is very strong and rapid: the granulation 1/3 corresponds to 4 possible distributions, 1/4 to 12 distributions, …, 1/25 to 927,797, and 1/27 to 1,888,686 distributions. This rapidly increasing number of distributions causes a very large memory burden for computers. However, the investigations made by the authors [4]
have shown that successive average distributions corresponding to decreasing granulation of elementary events quite quickly approach a certain limiting distribution, and that the differences between distributions corresponding to small granulations 1/n become negligible. In practice one can observe this phenomenon already at granulations 1/24, 1/25, 1/26. Therefore the authors stopped generating the distributions at the granulation 1/27. The average UnB-distribution for this granulation (Fig. 6) was approximated with formula (2):
pd(x) = (97.0167x⁶ − 291.0501x⁵ + 331.8880x⁴ − 178.6925x³ + 39.1084x² + 1.7296x + 0.0423) / 0.9989    (2)
where x ∈ [0, 1]; the mean absolute error equals 0.0017.
Figure 6. The average distribution of probability density (pd) representing 1,888,686 possible UnB-distributions at the granulation 1/27 (a) and its smoothed approximation (mean absolute error equals 0.0017) (b)
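As a quick plausibility check (not part of the original paper), one can verify numerically that the smoothed approximation (2) behaves like a probability density, i.e. that it integrates to roughly 1 over [0, 1]. A minimal sketch using the coefficients quoted above:

import numpy as np

# Coefficients of formula (2), highest degree first.
coeffs = np.array([97.0167, -291.0501, 331.8880, -178.6925, 39.1084, 1.7296, 0.0423])

def pd(x):
    """Average UnB probability density approximated by formula (2)."""
    return np.polyval(coeffs, x) / 0.9989

# Exact integral of the polynomial part over [0, 1] is the sum of c_k / (k + 1).
degrees = np.arange(len(coeffs) - 1, -1, -1)
area = float(np.sum(coeffs / (degrees + 1))) / 0.9989
print(round(area, 4))        # ~1.0, so (2) is (numerically) a proper density on [0, 1]
print(round(pd(0.5), 3))     # density value at the centre of the interval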
3. The Average, Unimodal and Boundary Distribution of Probability
Density (AUPDD)
Let us assume that we have the following knowledge about the numerical value x* of variable x and about its distribution:
VI. The unknown value of the variable is contained in the interval [xmin, xmax]. For simplicity the normalized interval [0, 1] is used.
VII. The pdf(x) is a unimodal one. However, we do not know whether it is symmetric or asymmetric. Therefore, we have to allow the left asymmetry, the symmetry, and the right asymmetry of the pdf(x).
VIII. We know that the distribution can be a boundary one, which means that its maximum may lie at the right boundary 1 or at the left boundary 0 of the interval [0, 1].
Fig. 7 presents a few examples of distributions corresponding to the knowledge contained in VI, VII, VIII.
In the process of determining the average (AUDPD) distribution the method of decreasing granulation of elementary events was also applied. For the granulation 1/27 there exist 1,894,704 distributions satisfying conditions VI, VII, VIII. The average distribution is shown in Fig. 8.
Figure 7. Few examples of unimodal (AUDPD) – distributions of probability density of possible
values x* of variable x.
Figure 8. The average, unimodal, right-asymmetric (AUDPD) – distribution and its polynomial
approximation (3), mean absolute error equals 0.0012
pd27(x) = (102.4538x⁶ − 307.3615x⁵ + 350.4505x⁴ − 188.6318x³ + 41.6527x² + 1.4363x + 0.0560) / 0.9999    (3)
4. The Average, Unimodal, Right-Asymmetric (or Left-Asymmetric)
Distribution of Probability Density – Version without Boundary
Distributions (AUAPDD-VwBD)
Let us assume that we have the following knowledge about the numerical value x* of variable x and about its distribution:
IX. The unknown value of the variable is contained in the interval [xmin, xmax]. For simplicity the normalized interval [0, 1] is used.
X. The real distribution of probability density is unimodal and right-asymmetric (or left-asymmetric). The probability of the left side exceeds 0.5.
XI. The real distribution surely is not a boundary one.
Fig. 9 presents a few examples of distributions corresponding to the knowledge contained in IX, X, XI.
Figure 9. Few examples of unimodal, right-asymmetrical (AUADPD-VwtBD) – distributions of
probability density of possible values x* of variable x
In the process of determining the average (AUADPD-VwtBD) distribution the method of decreasing granulation of elementary events was also applied. For the granulation 1/27 there exist 938,461 distributions satisfying conditions IX, X, XI. The average distribution is shown in Fig. 10.
Figure 10. The average, unimodal, right-asymmetric (AUADPD-VwtBD) distribution and its polynomial approximation (4), mean absolute error equals 0.0049
pd_aver1/27(x) = (−455.4885x⁵ + 593.9639x⁴ − 333.0256x³ + 73.1849x² + 3.8007x + 0.0819) / 0.9997   for x ∈ [0, 14/27]
pd_aver1/27(x) = (1167.1152x⁶ − 5692.2778x⁵ + 11532.8869x⁴ − 12435.4133x³ + 7535.8986x² − 2438.0789x + 329.87) / 0.9997   for x ∈ (14/27, 1]
(4)
5. The Average, Unimodal, Asymmetric Distribution of Probability
Density – Version with Boundary Distributions (AUADPD-VBD)
Let us assume that we have the following knowledge about the numerical value x* of variable x and its distribution:
XII. The real value of the variable is contained in the interval [xmin, xmax]. For simplicity the normalized interval [0, 1] will be used.
XIII. The real distribution pdf(x) is unimodal and right-asymmetric (or left-asymmetric). The probability of the left side is higher than 0.5.
XIV. We cannot exclude (it is possible) that the real distribution is a boundary one.
Fig. 11 shows a few examples of distributions that satisfy the above conditions and correspond to our qualitative knowledge.
Figure 11. Examples of unimodal, right-asymmetric, (AUADPD-VBD) distributions – version
with boundary distributions
Fig. 12 presents the average (AUADPD-VBD) distribution for the granularity 1/27. For this granularity there exist 941,470 possible distributions.
Figure 12. The average (AUADPD-VBD) - distributions for granulation 1/27 (a) and its
polynomial approximation (b).
The AUADPD-VBD distribution pd_aver1/27(x) was approximated by the polynomial (5), with mean absolute error equal to 0.0054:
pd_aver1/27(x) = (−431.3794x⁵ + 560.4006x⁴ − 316.2614x³ + 69.8168x² + 3.9402x + 0.0978) / 0.9994   for x ∈ [0, 14/27]
pd_aver1/27(x) = (1163.4027x⁶ − 5674.178x⁵ + 11496.2364x⁴ − 12395.9281x³ + 7512.0008x² − 2430.3619x + 328.8288) / 0.9994   for x ∈ (14/27, 1]
(5)
6. Conclusions
In the paper the authors present average probability density distributions obtained by the method of decreasing granulation of elementary events. The average distributions can be used when problems with information gaps occur and we have some qualitative information about the unknown, lacking distribution. When we have knowledge about the unimodality or symmetry of the distribution, we can find the appropriate average distribution using the method of decreasing granulation. The polynomials (2), (3), (4) and (5) can be used instead of the uniform distribution as an assumption for the information gap if, according to our knowledge, the real distribution has a specific characteristic. The average distribution replaces the lack of knowledge, and the solution obtained using the average distribution is characterized by a minimal error relative to any possible solution. So it gives an opportunity to make a right decision. The problem of average distributions is of fundamental importance for probability theory and, more generally, for uncertainty theory.
References
[1] Baudrit C., Dubois D., Guyonnet D. Joint propagation and exploitation of probabilistic and possibilistic information in risk assessment. IEEE Transactions on Fuzzy Systems, Vol. 14, No. 5, pp. 593-608, 2006.
[2] Wikipedia, The Free Encyclopedia,
http://en.wikipedia.org/wiki/Principle_of-indifference, 2008.
[3] Magidor O. The classical theory of probability and the principle of indifference. 5th
Annual Carnegie Mellon/University of Pittsburgh Graduate Philosophy
Conference, http:// www.andrew.cmu/org/conference/2003, pp.1-17, 2003.
[4] Piegat A., Landowski M. Surmounting information gaps - safe distributions of probability density. Methods of Applied Computer Science, No. 2/2007, pp. 113-126, Szczecin, 2007 [in Polish]
[5] Russell S., Norvig P. Artificial Intelligence – A Modern Approach. Second edition,
Prentice Hall, Upper Saddle River, NJ, 2003.
[6] Rutkowski L. Metody i techniki sztucznej inteligencji. Wydawnictwo Naukowe
PWN, Warszawa, 2005.
[7] Yakov B. H. Info-gap decision theory-decisions under severe uncertainty. Second
edition, Academic Press, London, 2006.
Dynamic group threshold signature based on
derandomized Weil pairing computation
Jacek Pomykała, Bartosz Źrałek
Warsaw University, Institute of Mathematics
Polish Academy of Sciences , Institute of Mathematics
Abstract:
We propose the Weil Pairing based threshold flexible signature scheme for dynamic
group. The protocol applies the simple additive secret sharing device. Its security is based
on the computational Diffie-Hellman problem in the gap Diffie-Hellman groups. The
computation of the Weil pairing is the crucial point of our proposition. We have managed
to avoid the random numbers generation in the corresponding Miller’s algorithm without
an essential increase in the computational cost. The system is particularly interesting
when the threshold size is small in relation to the group cardinality.
Keywords:
threshold cryptography, Weil Pairing, secret sharing, digital signature
1. Introduction
Threshold cryptography constitutes a great challenge in modern fault-tolerant distributed systems. The corresponding threshold cryptosystems allow one to increase trust and to improve the availability and efficiency of services in electronic commerce.
Generally speaking, we can distinguish between threshold encryption and decryption systems, according to whether the respective shared secret is reconstructed in the encryption phase (digital signature) or in the decryption phase (threshold decryption systems). In this paper we shall be concerned with threshold signature schemes, where the group G private key is distributed among the group members, so that messages can be signed only by the authorized subgroups of G. We shall consider the case of a dynamic group, with a flexible threshold level. This means that the group can increase (within suitable restrictions) and the threshold level (size) can vary together with the group extension. The proposed scheme, based on additive secret sharing, makes it possible to simplify and make more efficient the process of joining new members to the group than in the traditional approach based on the polynomial secret sharing device. To handle the variable threshold level efficiently, we have relaxed the restriction of assigning distinct shares to distinct group members. The efficient performance of the corresponding protocol is due to the application of the BLS short signatures, which provide us with an elegant and secure way of performing the respective verifiable secret sharing process (for the corresponding partial signatures). An additional simplification concerns the computation of the Weil pairing, which plays a significant role in the signature verification process. We were able to derandomize the corresponding Miller's algorithm used for the computation of the Weil pairing. The security of the proposed protocol is based on the computational Diffie-Hellman (C-DH) problem.
The current work is an extended and modified version of the article [19]. In particular we present here the efficient computation of the Weil pairing, proving that the suitable Miller's algorithm can be derandomized.
2. Preliminaries and assumptions
In the general approach to threshold secret sharing protocols we require that r distinct values (called the secret shares) are distributed among the group members (not necessarily in one-to-one correspondence). The minimal number of shares needed to reconstruct the secret is denoted by t and called the threshold level. The second important parameter, k (called the corruption bound), points out the maximal number of distinct shares that could be controlled by the corrupted players. In our model we admit a dynamic group G (of variable cardinality n), with r = r(n), t = t(n) and k = k(n) being functions of n. For security reasons we have the following inequalities: k < t ≤ r ≤ n.
Let us assume for the moment that the r shares are distributed uniformly in the initial group G. This means that the classes Ci of members with the same shares have cardinality comparable with n/r, and therefore the maximal number of completely corrupted classes is approximately kr/n. This value should not exceed r − t in order to allow the honest users to reconstruct the secret value, hence k < min(r, (r − t + 1)n/r). Clearly the case r = t implies the worst estimate k < min(r, n/r) ≤ √n for the admissible number of corrupted players, but as a benefit it gives us the chance to replace the polynomial secret sharing by the additive sharing device. Having in view efficient performance and the protocol's security, we deal with this case in the paper.
The system consists of a trusted dealer D, a public board and the (dynamic) group G, identified with the set of identities G = {ID}. G is split into disjoint classes composed of members having the same secret shares. The dealer and the group members are connected to each other and to the public board by secure point-to-point channels. Denote by C = {C1, …, Ct} the set of all initial classes, by L (L ≥ 2) the number of disjoint classes an initial class can be split into, and by a the number of new players added to a class. We assume that:
k < t ≤ (2/3)·√(n/L)    (1)
In the system the group cardinality can increase from n to n+t(L-1)a and the threshold
size from t to tL. The new members are added to all except one of the classes falling
apart.
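As a quick illustration with purely hypothetical numbers (not taken from the paper): for an initial group of n = 30 members split into t = 3 classes, with L = 2 and a = 5, the group may grow to

n + t(L − 1)a = 30 + 3·1·5 = 45 members,   while the threshold size may grow from t = 3 to tL = 6.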
3. Related work
The idea of threshold cryptosystems has been introduced in [5] and [6]. Their solution was based on the secret sharing device (see [16], [1]). The flexibility of the threshold size in the threshold decryption system based on RSA [15] cryptosystem has been
the subject of paper [7]. In [14] the simple RSA based decryption system applying the
Chinese remainder theorem and the message padding device was presented. Threshold
flexible signature scheme based on RSA cryptosystem was considered in [11]. The
efficient threshold signature scheme based on BLS short signature [4] was presented in
[2]. The dynamic groups approach to threshold signatures in gap Diffie-Hellman groups was the subject of papers [13] and [18]. Another approach, particularly interesting for ad hoc groups, based on additive secret sharing using the RSA cryptosystem, has been proposed in [12].
Here we apply additive secret sharing to propose a Weil pairing based threshold signature scheme for dynamic groups. Besides the computational savings due to the application of the BLS signatures, the computation of the Weil pairing (cf. [3]) is derandomized without significant additional cost.
4. Gap Diffie-Hellman groups and Weil Pairing
Let G = (G, +) be a group of prime order q and P, Q be any nontrivial elements of
G. The discrete logarithm problem (DLP) in G may be stated as follows:
Find a ∈ Z q such that aP = Q.
Let us formulate the following related problems.
The computational Diffie-Hellman problem (C-DH): Given the triple (P, aP, bP)
find the element abP.
The decisional Diffie-Hellman problem (D-DH): Given a quadruple (P, aP, bP, cP)
decide whether c = ab (q) (in which case we shall write that (P, aP, bP, cP)=DH). Here
and in the sequel we use the notation x (q) to stand for x (mod q).
We call the group G=(G, +) a gap Diffie Hellman (G-DH) group if (roughly speaking) the D-DH problem is computationally easy, while the C-DH problem is hard.
In this paper we shall apply the elliptic group structure with the corresponding Weil
pairing [8] to obtain the required GDH group. To be more precise, let E be an elliptic
curve over a finite field K of characteristic p and let n be an integer not divisible by p. Denote by cl(K) the algebraic closure of K. It can be shown that the group E[n] of n-torsion points of E/cl(K) is isomorphic to Zn × Zn. The Weil pairing is a map
e: E[n] ×E[n] → cl(K)*,
satisfying the following properties:
1. Alternation: For all P, Q ∈ E[n], e(P, Q) = e(Q, P)-1.
2. Bilinearity: For any P, Q, R ∈ E[n] we have e(P + Q, R) = e(P, R)e(Q, R).
3. Non-degeneracy: If P ∈ E[n] is such that for all Q ∈ E[n], e(P, Q) = 1, then P = O.
4. Computability: There exists an efficient algorithm to compute e(P, Q) for any P,
Q ∈ E[n].
We now turn our attention to a more concrete situation. Let p be a prime, a ∈ Zp*. Consider the elliptic curve E over Fp and the map Φ: E/cl(Fp) → E/cl(Fp) defined by
E: Y² = X³ + a, Φ(O) = O; Φ(x, y) = (ζx, y), where ζ ∈ Fp*2 \ {1}, ζ³ = 1, p = 2 (3),
or
E: Y² = X³ + aX, Φ(O) = O; Φ(x, y) = (−x, iy), where i ∈ Fp*2, i² = −1, p = 3 (4).
One can easily check that Φ is an automorphism. Pick a point P ∈ E/Fp of prime order q, q | p + 1 = card E/Fp. Then E[q] = <P, Φ(P)>. We define the modified Weil pairing ê by
ê: G1 × G1 → G2,  ê(R, S) = e(R, Φ(S)),  where G1 = <P>, G2 = Fp*2.
It is easy to show that
Fact 4.1 For every R ∈ <P> such that ê(R, P) = 1, we have R = O.
It is known that the C-DH problem in G1 is hard (cf. [3]), but, as shown in [10], not harder than the DLP in G2. The existence of the Weil pairing implies directly that the D-DH problem is easy in G1. A randomized algorithm computing the Weil pairing was proposed by Miller [9]. Below we propose a derandomization of Miller's algorithm.
5. A simple deterministic version of Miller's algorithm for
computing the Weil pairing
Let's briefly recall the explicit construction of the Weil pairing. We refer the reader
to [17] for further details. Keep the notation from the previous section. Let E be an
elliptic curve over a finite field K and let n be an integer not divisible by p = char K. Let (f) stand for the divisor of a rational function f ∈ K(E). The support of a divisor D = Σ_{P∈E} nP·(P) is the set supp D = {P ∈ E : nP ≠ 0}. For a rational function f ∈ K(E) and a divisor D such that supp(f) ∩ supp D = ∅, the evaluation of f at D is f(D) = Π_{P∈supp D} f(P)^{nP} ∈ K*.
We now define the value e(P, Q) for P, Q ∈ E[n]. Choose R, S ∈ E such that
AP = ( P + R ) − ( R ) and AQ = (Q + S ) − ( S ) have disjoint supports. Take a function f P ( f Q )
representing the principal divisor nAP (nAQ respectively). Then
e( P, Q) = f P ( AQ ) / fQ ( AP )
In Miller's algorithm the values f P ( AQ ) and fQ ( AP ) are computed separately, yet
analogously. We restrict our attention to the computation of f P ( AQ ) .
Let Ab = b(P + R) − b(R) − (bP) + (O), (fb) = Ab. Note that (fP) = (fn) and fP(AQ) = fn(AQ). We have the recursive formula
f_{b+c}(AQ) = f_c(AQ)·f_b(AQ)·g1(AQ) / g2(AQ)    (2)
where g1 is the line passing through bP and cP, (g1)= (bP) + (cP) + (-(b + c)P) - 3(O), g2
the line passing through (b + c)P and -(b + c)P, (g2)= ((b + c)P) + (-(b + c)P) - 2(O).
Furthermore fn satisfies the initial condition f1 = h1/h2, where h1 is the line passing
through P + R and -(P + R), (h1)= (P + R) + (-(P + R)) - 2(O), h2 the line passing
through P and R, (h2) = (P) + (R) + (-(P + R)) - 3(O).
Dynamic group threshold signature based on derandomized Weil pairing computation
187
Let C(fb(AQ), fc(AQ), bP, cP) denote the right-hand side of the recursive formula (2). Write n in binary representation with digits aj (am = 1). The above considerations lead to the following Miller's algorithm:
Algorithm 5.1 (V. Miller)
INPUT: P, Q ∈ E[n], R, S ∈ E
OUTPUT: fP(AQ)
Let g := f1(AQ) , f := g, Z := P
For j = m - 1 to 0 do
Let f := C(f, f, Z, Z), Z := 2Z
If aj = 1 then let f := C(f, g, Z, P), Z := Z + P
Return f
It can be easily shown that Miller's algorithm succeeds with "high probability" for a "random" choice of R, S. Our idea of its derandomization depends on setting R := P, S := Q and checking whether the supports of the appropriate functions intersect nontrivially.
Suppose for example that supp(fb) ∩ supp AQ ≠ ∅ for some b involved in Miller's algorithm and that 2 ∤ n. Then one of the following conditions is satisfied:
1. 2P = 2Q, that is 2(P − Q) = O. This implies P = Q (since 2 ∤ n) and e(P, Q) = 1.
2. P = Q or P = 2Q. This forces e(P, Q) = 1.
3. Q = 2P or Q = bP or Q = O. This implies e(P, Q) = 1.
4. 2Q = O, which gives again Q = O (as 2 ∤ n).
5. bP = 2Q. There exist x, y ∈ Z such that 2x + ny = 1. We have Q = (2x + ny)Q = x(2Q) + y(nQ) = xbP and e(P, Q) = 1.
This leads to the following algorithm:
Algorithm 5.2 (deterministic version of Miller's algorithm)
INPUT: P, Q ∈ E[n]
OUTPUT: fP(AQ) or a relation between P and Q proving that e(P, Q) = ±1
Let R1 := {2P = 2Q, 2P = Q, -2P = 2Q, -2P = Q, O = 2Q, O = Q, P = 2Q, P = Q},
R2 := {Z = 2Q, Z = Q, -2Z = 2Q, -2Z = Q, 2Z = 2Q, 2Z = Q},
R3 := {Z = 2Q, Z = Q, -(Z + P) = 2Q, -(Z + P) = Q, Z + P = 2Q, Z + P = Q}
If a relation in R1 is satisfied then return this relation
Let g := f1(AQ) , f := g, Z := P, b := 1
For j = m - 1 to 0 do
If a relation in R2 is satisfied then return this relation with Z = bP
Let f := C(f, f, Z, Z), Z := 2Z, b := 2b
If aj = 1 then
If a relation in R3 is satisfied then return this relation with Z = bP
Let f := C(f, g, Z, P), Z := Z + P, b := b + 1
Return f
Note that the algorithm has to check only a "small" number of equalities. Moreover, if 2 ∤ n, it always outputs fP(AQ) or a relation between P and Q proving that e(P, Q) = 1.
6. The protocol
6.1. Distribution phase
1. The dealer D, executing the algorithm Keygen, generates the bilinear structure (G1, G2, ê, P) as in section 4, together with a secure hash function H: {0, 1}* → G1. Moreover, he chooses L and the initial value of the threshold size t.
2. Using the public algorithm, D splits the set G into disjoint classes C1, …, Ct such that the following condition holds: card Ci = n/t + θi, |θi| ≤ 1, for every i. He publishes the list of pairs (ID, Ci).
3. D generates at random the group secret s ∈ Zq, chooses randomly the shares s1, s2, ..., st−1 and computes st = s − Σ_{i=1}^{t−1} si and the verification keys s1P, ..., stP. He sends the shares by the secure channels to the corresponding group members and divulges the list of pairs (Ci, siP).
4. D destroys all the computed values and quits the system.
6.2. Threshold increasing phase
1. If some initial class Ci (with the attached secret value si) should be split to
increase the threshold size by an integer l, l ≤ L-1, then the first initializer from Ci starts
the algorithm Threshold increase. He is therefore responsible for
(a) Splitting Ci into disjoint classes Ci, 1, …, Ci, l+1 as in step 2 of the
distribution phase, with obvious modifications. The initializer publishes all the
corresponding pairs (ID, Ci, j).
(b) Via the secure point-to-point channels, distributing new shares uj to each member of the corresponding Ci,j, j = 1, …, l+1, and asking every user from a fixed class C', chosen among Cm or Cm,1 for some m ≠ i, to add si − u1 − ... − ul+1 to his share. The initializer publishes all the pairs (Ci,j, ujP) and the pair (C', s'P + (si − u1 − ... − ul+1)P), where s'P is the public key of C'.
2. Every action in (a) and (b) taken above by the current initializer must be approved (via the public board) by at least t players from the corresponding class (Ci, Ci,1, …, Ci,l+1 or C'). If not, then the next initializer from Ci executes the Threshold increase algorithm.
6.3. Joining phase
1. If a new user requests through the public board to join a class Ci, j, j≠1, then each
member of Ci, 1 checks whether the limit a for players addition has not been exhausted in
Ci, j. Either at least t confirmations coming from Ci, 1 are collected and the corresponding
new pair (ID, Ci, j) is published, or the request is rejected.
2. The new member asks for the secret share from the corresponding class via a
secure channel and verifies that the received private key matches with the public key
from the public board.
6.4. Signing and verification phase
1. Given any message m the members apply the algorithm Sign having as input m
and the corresponding shares si and as output the corresponding partial signature σi.
2. Any group member can check the validity of the partial signatures and combine
the first t valid partial signatures using the algorithm Check-combine.
3. Any user can verify the final signature using the algorithm Verify.
7. The algorithms
The threshold flexible signature scheme (in the dynamic group) is a 5-tuple:
TFSS=(Keygen, Threshold increase, Sign, Check-combine, Verify).
Algorithm 1. Keygen(1κ)
B ← (G1, G2, ê, P), ord P = q
return B
Algorithm 2. Threshold increase(t, l, C, Ci, C')
u1, ..., ul+1 ←random Zq*
u0 ← u1+...+ul+1 mod q
s' ← s' + (si - u0) (the members of C', who were given the value si - u0, add it to their
share)
Kj ← new class, j=1, …, l+1
uj ← new share for Kj, j=1, …, l+1
C ← (C \ {Ci}) ∪ {K1, …, Kl+1}
return (t + l, C )
Algorithm 3. Sign(M, s(ID))
parse M as (m, t)
σID ← s(ID)H(m||t)
return (M, σID)
Algorithm 4. Check-combine(M, σ1, …, σt)
parse M as (m, t)
for i=1, …, t do
r ← (P, siP, H(m||t), σi) = DH ?
if r=0 then reject
σ ← σ1 +…+ σt
return σ
Algorithm 5. Verify(M, σ)
parse M as (m, t)
return (P, sP, H(m||t), σ) = DH ?
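To make the interplay of the above algorithms concrete, the following toy sketch (not from the paper) mimics the additive sharing of the distribution phase, Sign, Check-combine and Verify numerically: elements of G1 are represented by their discrete logarithms with respect to P, and the pairing is replaced by the map e(aP, bP) := g^(ab) in Zp*, which is bilinear by construction but has none of the security of the Weil pairing; all parameter values are arbitrary toy choices.

import hashlib, random

# Toy parameters: q | p - 1, g of order q in Z_p^*.  NOT secure, purely illustrative.
q = 101
p = 607                            # 607 - 1 = 2 * 3 * 101
g = pow(3, (p - 1) // q, p)        # an element of order q in Z_p^*

def H(msg: str) -> int:
    """Hash-to-group stand-in: returns the 'discrete log' of H(m||t) with respect to P."""
    return int.from_bytes(hashlib.sha256(msg.encode()).digest(), "big") % q

def e(a: int, b: int) -> int:
    """Toy 'pairing': e(aP, bP) := g^(ab).  Usable here only because we keep the
    exponents explicitly, which a real adversary cannot do."""
    return pow(g, (a * b) % q, p)

# Distribution phase: additive sharing of the group secret s among t classes.
t = 4
shares = [random.randrange(1, q) for _ in range(t - 1)]
s = random.randrange(1, q)
shares.append((s - sum(shares)) % q)
verification_keys = shares[:]          # stands in for the public values s_i * P

# Sign: partial signatures sigma_i = s_i * H(m||t); Check-combine and Verify reduce
# to the DDH test (P, s_iP, H(m||t), sigma_i) performed via the pairing.
h = H("message" + "||" + str(t))
partials = [(si * h) % q for si in shares]
assert all(e(si, h) == e(1, sig) for si, sig in zip(verification_keys, partials))
sigma = sum(partials) % q              # combined signature s * H(m||t)
print("verify:", e(s, h) == e(1, sigma))   # True

The assertion corresponds to Check-combine's test of the tuple (P, siP, H(m||t), σi), and the final comparison to Verify's test of (P, sP, H(m||t), σ).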
8. Correctness and security discussion
8.1. Correctness
Recall that (P, aP, bP, cP) is a Diffie-Hellman tuple if and only if ê(aP, bP)= ê(P,
cP). This is a direct consequence of the bilinearity of ê and fact 4.1. Moreover, σi =
siH(m||t) if and only if (P, siP, H(m||t), σi) is a Diffie-Hellman tuple. The correctness of
Check-combine follows. The algorithm Verify is also correct; if (P, siP, H(m||t), σi) is
a Diffie-Hellman tuple for every i then so is (P, sP, H(m||t), σ).
8.2. Robustness
We will prove that the system of confirmations described in the threshold increasing and joining phases (sections 6.2 and 6.3) allows a coalition of honest players to permit legal actions and at the same time prevents a coalition of corrupted players from validating illegal ones.
Consider in the first place the case when an initializer proposes a splitting of Ci into Ci,1, …, Ci,l+1. On the one hand, there are, by (1), at most t−1 corrupted players in Ci. Therefore the corrupted users alone cannot validate the splitting. On the other hand, if the splitting is proper, it will be permitted by the honest members of Ci. Indeed, still from (1), there are at least n/t − 1 − (t − 1) ≥ (7/2)·t of the honest ones.
Suppose that the splitting is proper and has been accepted. The initializer now
distributes new shares to the players of Ci, 1, …, Ci, l+1 and a correcting additive term to
the members of C' (cf. step 1b from the threshold increasing phase). We deal with the
case when confirmations are coming from a class Ci, j, j fixed. When members of C'
report confirmations then the argument is similar if C' = Cm, 1 and as above if C' = Cm
(m≠i). As previously, the corrupted players of Ci,j cannot validate the initializer's action. It remains to show that Ci,j contains at least t honest users, which boils down to justifying the inequality card Ci,j ≥ 2t − 1. We have card Ci,j ≥ n/(t(l+1)) − 1/(l+1) − 1. Therefore it is enough to prove that 2t + 1/(l+1) ≤ n/(t(l+1)). This in turn is a direct
consequence of (1). The following should be observed: even if the initializer acts in
such a way that exactly t confirmations from Ci, j are reported, no honest player of Ci, j
will be left aside in the game. At least one honest member of Ci, j had to receive the
corresponding correct private share from the initializer. This honest user can send it (via
the secure point-to-point channels) to the players of Ci, j who would have obtained the
incorrect share.
Finally, we must show that the last word on whether to add new users to some Ci, j,
j≠1, in the joining phase belongs to the honest members of Ci, 1. This follows from the
above, for new users cannot join Ci, 1 and, as has been already proved, card Ci, 1 ≥ 2t-1.
Note that we have shown in particular that every class contains at least one honest
player; there is thus always enough honest users to issue a correct combined signature.
8.3. Unforgeability
We first show that the corrupted players do not control, at any time, all the secret
shares and therefore cannot reconstruct the group secret. The proof is by induction on
the number of successful Threshold increase operations followed by additions of new
members.
Originally the dealer creates t classes. By (1), t exceeds the number of corrupted
players. There is thus at least one class containing only honest players, hence at least
one secure secret share.
Now assume that after some number of initial class splittings followed by new players additions the system is secure; it consists of the classes C1', …, Cr' with the corresponding secret shares s1', …, sr' and there are at most r-1 corrupted players. Suppose
that an initial class Ci' is split into K1, …, Kl+1 and the threshold size increased by calling
Threshold increase(r, l, C, Ci', Cj'). We can also suppose that the members of the split
Ci' are all honest. In the contrary case there is no corrupted player among the members
of some class Cm', m≠i. Then the secret share sm' (or sm' + ( si'-u1-...-ul+1) if m=j) is secure. Consider the situation when new users are added to K2, …, Kl+1. There is no loss of
generality in assuming that these new players, as well as some player of Cj' are corrupted. The class K1 has only honest members. Then the corrupted players cannot compute its secret u1. Indeed, their knowledge about u1 is nothing but a system of l+1
equations (they know the numbers si'-u1-...-ul+1, u2, ..., ul+1) in l+2 unknowns: u1, ..., ul+1
and si'. u1, ..., ul+1 are random, as every user from Ci', in particular the initializer who
distributed the new shares, is honest by assumption. si' is also random, because it is
derived from a random number generated by the honest dealer and unknown to the
corrupted players. Consequently, the system is still secure. It remains to prove that the
signature scheme itself is secure. The security of each partial signature against existential forgery on adaptive chosen-message attack follows from theorem 3.2 of [4] - an
adversary who could forge a partial signature with non-negligible probability could also
solve the C-DH problem in G1 with non-negligible probability. For the security of the
combined signature let us consider the worst case when the adversary possesses the
secret shares s1, …, st-1, but doesn't know the remaining secret share st. Suppose that he
can forge a combined signature sH(m||t) of some chosen message m. Then he can forge
the partial signature (s-s1-…-st-1)H(m||t) of m on behalf of the class with the secret share
s-s1-…-st-1 = st. This number, however, is random from the adversary's point of view.
We are thus led to a contradiction with the security of the partial signatures justified
above.
9. Final remarks
In the paper we have proposed the model and security analysis of a threshold flexible signature scheme based on the derandomized Weil pairing computation. The solution has a nice property: the length of the corresponding secret shares and verification keys depends neither on the group cardinality nor on the threshold level. We proved that the signature scheme is robust and unforgeable in the random oracle model, provided the C-DH problem is intractable. For clarity of presentation we have considered in detail only the most important case, when each class is to be split only once. However, this obstruction can easily be overcome by suitably decreasing the bound for the threshold level. Finally, let us also remark that by suitable changes in the protocol we can extend the model to the more general case with the inequality k + h ≤ t, where h points out the least number of honest members needed to participate in signing.
References
[1] Blakley G. R. Safeguarding cryptographic keys, AFIPS Conference Proceedings,
48, 1979, 313-317.
[2] Boldyreva A. Threshold signatures, multisignatures and blind signatures based on
the Gap-Diffie-Hellman-Group signature scheme, LNCS 2567, 2003.
[3] Boneh D., Franklin M. Identity-based encryption from the Weil Pairing, Proc.
Crypto, LNCS 2139, 2001, 213-229.
[4] Boneh D., Lynn B., Shacham H. Short Signatures from the Weil Pairing, J. Cryptology 17(4), 2004, 297-319.
[5] Desmedt Y. Society and group oriented cryptography: a new concept. In: Crypto'87, LNCS 293, 120-127, Springer-Verlag, 1988.
[6] Desmedt Y., Frankel Y. Threshold cryptosystems, In: Crypto'89, LNCS 435, 307-315, Springer-Verlag, 1990.
[7] Ghodosi H., Pieprzyk J., Safavi-Naini R. Dynamic Threshold Cryptosystems: A
New Scheme in Group Oriented Cryptography, Proceedings of PRAGOCRYPT '96
-- International Conference on the Theory and Applications of Cryptology (J.
Pribyl, ed.), Prague, CTU Publishing house, 1996, 370-379.
[8] Joux A. A one round protocol for tripartite Diffie-Hellman, Proceedings of ANTS IV,
LNCS vol. 1838 (2000), 385-394.
[9] Miller V. Short programs for functions on curves, unpublished manuscript, 1986.
[10] Menezes A., Okamoto T., Vanstone S. Reducing elliptic curve logarithms to logarithms in a finite field, IEEE Transactions on Information Theory, 39, 1993, 1639-1646.
[11] Nakielski B., Pomykała J. A., Pomykała J. M. A model of multi-threshold signature
scheme, Journal of Telecommunications and Information Technology, 1, 2008, 51-55.
[12] Pietro R. Di, Mancini L. V., Zanin G. Efficient and Adaptive Threshold Signatures
for Ad hoc networks, Electronic Notes in Theoretical Computer Science, 171,2007,
93-105.
[13] Pomykała J., Warchoł T. Threshold signatures in dynamic groups, Proceedings of
FGCN, Jeju Island, Korea, IEEE Computer Society, 2007, 32-37.
[14] Nakielski B., Pomykała J. Simple dynamic threshold decryption based on CRT
and RSA, submitted.
[15] Rivest R., Shamir A., Adleman L. M. A Method for Obtaining Digital Signatures and Public-Key Cryptosystems, Communications of the ACM 21(2), 1978, 120-126.
[16] Shamir A. How to share a secret, Communications of the ACM, 22(11), 1979, 612-613.
[17] Silverman J. H. The Arithmetic of Elliptic Curves, Springer, 1986.
[18] J. Pomykała, T. Warchoł, Dynamic multi-threshold signature without the trusted
dealer, International Journal of Multimedia and Ubiquitous Engineering, v. 3 no 3,
2008, 31-42.
[19] Pomykała J., Źrałek B. Threshold flexible signature in dynamic group, Proceedings of the 15th International Multi-Conference ACS, Miedzyzdroje, 15-17 October 2008.
Determining the effective time intervals in the recurrent
processes of identification of dynamic systems
Orest Popov, Anna Barcz, Piotr Piela
Szczecin University of Technology,
Faculty of Computer Science and Information Technology
Abstract:
When using mathematical models of dynamic objects in simulation and modeling tasks, we must pay attention to the quality and speed of creation of those models. Depending on the amount of information about the considered object, we can distinguish various tasks of model creation. One of the ways to create a model is to use identification methods. When we need the mathematical model of the considered system while the system is operating, it is necessary to use recurrent identification algorithms. The duration of the identification algorithm is specified a priori by the researcher. In the paper, a way of using the singular value decomposition (SVD) in the identification equation for determining the effective time intervals of the identification process is shown. It allows one to determine the time of ending the identification algorithm.
Keywords:
dynamics model, task of identification, singular value decomposition
1. The task of identification
Generally speaking, the task of identification of real dynamic objects can be seen as a problem of determining the structure and defining the parameters of their mathematical models. Depending on the amount of information, we can distinguish various tasks of identification [3]. Determining the structure and parameters of a real object's model is one of the basic problems of the modern theory of systems. Without a high quality mathematical model it is not possible to provide a high quality simulation of the behavior of some real objects, and thus computer-aided control is not possible. Identification of nonlinear dynamic objects can be done in two ways. The first one relies on collecting the right amount of data and then carrying out the identification procedure (off-line identification). Such an organization of the identification process is not effective with respect to the speed of algorithm operation and means that the model is available only after the end of the whole identification process. However, in many cases it is necessary to have a model of the object available on-line, while the object is in operation (on-line identification). The recurrent methods of identification have such a possibility. In these methods the evaluation of model parameters at a given measurement time is defined as the evaluation of parameters at the previous time plus a certain correction. One method of this type is the recurrent least squares method.
Let us assume that the identifiable dynamic object can be described by a linear mathematical model with fixed parameters in the form:
Ẋ = AX + BU,   X ∈ R^n, U ∈ R^m    (1)
where X – vector of state variables, U – vector of control signals, A and B – unknown matrices with adequate dimensions and fixed coefficients.
Let us assume, also, that we can measure X(t) and U(t) and that it is possible to calculate Ẋ(t) at each moment of time. Thus the identification equation for the matrices A and B, based on the least squares method, can be written in the known form [4]:
C_s P_s = R_s    (2)
where C_s = (Â_s  B̂_s) – the matrix of the evaluations of the elements of matrices A and B, R_s = Σ_{j=1}^{s} (Ẋ_j Z_j^T), P_s = Σ_{j=1}^{s} Z_j Z_j^T, Z_j = [X_j; U_j], s – the number of measurement points. Let us emphasize that the dimensions of the matrices are: dim C_s = n × (n + m), dim P_s = (n + m) × (n + m), dim R_s = n × (n + m).
If the condition rank P_s = (n + m) is fulfilled, the solution of the identification equation (2) is unambiguous and has the form:
C_s = R_s P_s^{-1}
The same solution in recurrent form can be written as [3, 4]:
C_s = R_s P_s^{-1} = (R_{s-1} + Ẋ_s Z_s^T)·(P_{s-1}^{-1} − K_s P_{s-1}^{-1} Z_s Z_s^T P_{s-1}^{-1})    (3)
where s – the number of the next measurement, K_s = 1 / (1 + Z_s^T P_{s-1}^{-1} Z_s).
To use the recurrent algorithms, initial values of R_0 and P_0^{-1} are required for their start-up. Usually R_0 is a matrix of zeros of dimensionality (n × (n + m)), while P_0^{-1} is a diagonal matrix of dimensionality ((n + m) × (n + m)) whose diagonal elements are taken from the range (100 ÷ 10^6) [1].
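For illustration only (not from the paper), here is a minimal numerical sketch of the recurrent least-squares identification described by equations (2)-(3); the variable names and the test system are hypothetical, and P_0^{-1} is started as a large diagonal matrix as recommended above.

import numpy as np

def rls_identify(X, Xdot, U, p0=1e4):
    """Recurrent least squares estimate of C = (A B) in Xdot = A X + B U.
    X, Xdot have shape (N, n), U has shape (N, m); returns C of shape (n, n+m).
    P_s^{-1} is propagated directly, as in the recurrent form (3)."""
    n, m = X.shape[1], U.shape[1]
    R = np.zeros((n, n + m))                      # R_0: zero matrix
    Pinv = p0 * np.eye(n + m)                     # P_0^{-1}: large diagonal start-up values
    for x, xdot, u in zip(X, Xdot, U):
        z = np.concatenate([x, u]).reshape(-1, 1)             # Z_s = [X_s; U_s]
        K = 1.0 / (1.0 + (z.T @ Pinv @ z).item())             # K_s
        Pinv = Pinv - K * (Pinv @ z) @ (z.T @ Pinv)           # P_s^{-1}
        R = R + xdot.reshape(-1, 1) @ z.T                     # R_s = R_{s-1} + Xdot_s Z_s^T
    return R @ Pinv                                           # C_s = R_s P_s^{-1}

# Quick self-check on synthetic data from a known (hypothetical) second-order system.
rng = np.random.default_rng(0)
A = np.array([[0.0, 1.0], [-2.0, -0.5]]); B = np.array([[0.0], [1.0]])
X = rng.normal(size=(500, 2)); U = rng.normal(size=(500, 1))
Xdot = X @ A.T + U @ B.T
print(np.round(rls_identify(X, Xdot, U), 3))      # approximately the block matrix (A B)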
2. Singular value decomposition in identification tasks
It is known that any rectangular real numerical matrix W of size (n × p) can be presented by the singular value decomposition in the form [2]:
W = Q·M·R^T    (4)
where Q and R – orthogonal matrices of dimensionality (n × n) and (p × p) accordingly. If n < p, the rectangular matrix M has the special form
M = (diag(µ1, µ2, …, µn)  0)    (5)
i.e. a diagonal block of the singular values followed by an (n × (p − n)) zero block. The size of the matrix M is (n × p). The nonnegative elements {µ1, µ2, …, µn} of this matrix are the singular values of the matrix W. They are ordered decreasingly, that is
µ1 ≥ µ2 ≥ ⋯ ≥ µn > 0.    (6)
The singular value decomposition can be applied to the real matrix P_s^{-1} appearing in the identification equation (3). This matrix is computed for each measurement while the identification algorithm is running:

P_s^{-1} = G_s \cdot T_s \cdot H_s^T,    (7)

where G_s and H_s are orthogonal matrices of suitable dimensions and the matrix T_s is constructed in the same way as the matrix M.
Let us introduce the quantity that is the inverse of the spectral condition number of the matrix T_s [5, 6]:

\xi \stackrel{\Delta}{=} \frac{1}{\mathrm{Cond}_2(T_s)} = \frac{\tau_n}{\tau_1}.    (8)
The quantity ξ is a convenient quantitative measure of the informative density of the identification process, as shown in [7], and it can be used to determine the stop time of the identification process. In fact, the most efficient part of the identification process is the time interval in which the informative density index ξ of the process is rising. The moment at which the condition dξ/dt = max is met determines the end of the efficient part of the identification process; continuing the identification beyond this moment has only a minimal influence on the results already achieved.
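A small sketch of how the index ξ and the stop criterion could be evaluated numerically is given below. It assumes NumPy, a stream of P_s^{-1} matrices produced by the recurrent algorithm, and a simple finite-difference estimate of dξ/dt (the finite-difference choice is ours, not specified by the authors).

```python
import numpy as np

def informative_density(P_inv):
    """xi = 1 / Cond2(T_s): ratio of the smallest to the largest singular value of P_s^{-1}."""
    sv = np.linalg.svd(P_inv, compute_uv=False)     # singular values, in decreasing order
    return sv[-1] / sv[0]

def stop_time(P_inv_sequence, dt):
    """Return the time at which the growth rate of xi is largest (the proposed stop time)."""
    xi = np.array([informative_density(P) for P in P_inv_sequence])
    rate = np.diff(xi) / dt                         # finite-difference estimate of d(xi)/dt
    return (np.argmax(rate) + 1) * dt, xi
```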
3. Determining the stop time of the recurrent identification algorithm using the singular value decomposition in the identification equation – an example
The application of the singular value decomposition to identification tasks will be demonstrated on the example of a fourth-order system with one control signal: the spring-pendulum system presented in Figure 1.
Figure 1. The spring-pendulum system
Consider a model of the spring-pendulum system that is characterized, under some conditions, by the following set of nonlinear differential equations:

\begin{cases}
(M + m\sin^2\varphi)\,\ddot{z} = u(t) - c z - (k_1 + k_2\sin^2\varphi)\,\dot{z} + m\sin\varphi\,(l\dot{\varphi}^2 + g\cos\varphi) \\
m l^2 (M + m\sin^2\varphi)\,\ddot{\varphi} = -\,m l\cos\varphi \cdot u(t) + m l\cos\varphi \cdot c z + l\cos\varphi\,(m k_1 - M k_2)\,\dot{z} \\
\qquad -\,k_2 l^2 (M + m\sin^2\varphi)\,\dot{\varphi} - (M+m) m g l\sin\varphi - m^2 l^2 \sin\varphi\cos\varphi \cdot \dot{\varphi}^2
\end{cases}    (9)
where: M – mass of the block, m – mass of the pendulum, l – length of the pendulum, z – displacement of the block from the equilibrium point, φ – displacement of the pendulum from the equilibrium point, u – control signal, g – acceleration of gravity, c – spring constant, k_1 – friction coefficient of the block, k_2 – friction coefficient of the pendulum.
As a result of linearizing the object model at a selected point of the state space, the object (9) can be described by a linear model with fixed parameters of the form (1).
Additionally, for simplification, let us assume that U(t) = 0; the identification is then applied to the parameters of the linear homogeneous model

\dot{X} = AX, \qquad X \in R^4,    (10)

where X = (z \;\; \varphi \;\; \dot{z} \;\; \dot{\varphi})^T is the vector of state variables and A is the matrix of unknown parameters.
The measurement data necessary for the identification were obtained by simulating the nonlinear model (9) for fixed initial conditions. The identification experiment lasted 20 s and measurements were taken every 0.01 s. After identification based on the recurrent least squares method, the matrix A was obtained in the following form:
A = \begin{pmatrix}
 0.0020 & -0.0120 & 1.0006 & 0.0001 \\
-0.0022 & 0.1073 & 0.0186 & 1.0187 \\
-0.2002 & 1.2227 & -0.0490 & -0.0130 \\
 0.2006 & -11.2083 & -1.9859 & -1.9249
\end{pmatrix}    (11)
The eigenvalues of the matrix A are equal to

\{-0.1403 + 0.4064i,\; -0.1403 - 0.4064i,\; -0.7920 + 3.1925i,\; -0.7920 - 3.1925i\}.
Figures 2 and 3 show the trajectories of the state variables of the nonlinear model together with the trajectories of the state variables of the model obtained by identification. The comparison of both graphs shows that the linear model (10) with parameters (11) describes the behaviour of the given object quite well.
Figure 2. The trajectory of the state variable z of the nonlinear model and the trajectory of the corresponding state variable of the model after identification
After each measurement in the identification process, equation (3) is solved and, consequently, the matrix P_s^{-1} is computed. Its singular value decomposition is then performed and the coefficient ξ is calculated from equation (8). Figure 4 shows the trajectory of the coefficient ξ. The analysis of the rate of change of this coefficient (Figure 5) allows the time at which this rate reaches its maximum to be determined. The next identification experiment was conducted for the same initial conditions as before and lasted 2.44 s; this duration was determined from the time at which the rate of change of the coefficient ξ reached its maximum.
Figure 3. The trajectory of the state variable φ of the nonlinear model and the trajectory of the corresponding state variable of the model after identification
Figure 4. The trajectory of the coefficient ξ
Figure 5. The trajectory of the rate of change of the coefficient ξ
As a result of identification we obtain the matrix A1 in the form

A_1 = \begin{pmatrix}
 0.0021 & -0.0119 & 1.0009 & 0.0002 \\
-0.0028 & 0.1067 & 0.0140 & 1.0175 \\
-0.2006 & 1.2223 & -0.0521 & -0.0139 \\
 0.2034 & -11.2057 & -1.9665 & -1.9198
\end{pmatrix}
The eigenvalues of the matrix A1 are equal to

\{-0.1412 + 0.4063i,\; -0.1412 - 0.4063i,\; -0.7903 + 3.1918i,\; -0.7903 - 3.1918i\}
and they differ only marginally from the eigenvalues of the matrix A. Despite the shortened identification experiment, the trajectories of the state variables of the model \dot{X} = A_1 X do not differ from the trajectories of the state variables of the model \dot{X} = AX for the same initial conditions. It follows that the time at which the rate of change of the coefficient ξ reaches its maximum determines the stop time of the identification experiment, which substantially reduces the duration and the number of necessary computations.
4. Summary
Many publications deal with recurrent identification algorithms, but they do not address the problem of automatically determining the stop time of the identification process. In this paper the authors show that the stop time of the identification process can be determined automatically using the singular value decomposition. The results of this task make it possible to create efficient real-time identification algorithms. The presented material outlines, and points at the direction of, further research connected with the problem of automatically determining the stop time of a recurrent identification algorithm.
References
[1] Bielińska E., Finger J., Kasprzyk J., Jegierski T., Ogonowski Z., Pawełczyk M. Identification of processes. Wydawnictwo Politechniki Śląskiej, Gliwice, 2002 (in Polish).
[2] Kiełbasiński A., Schwetlick H. Numerical linear algebra. Introduction to numerical computation. Second edition. Wydawnictwa Naukowo-Techniczne, Warszawa, 1992 (in Polish).
[3] Ljung L. System Identification: Theory for the User. Prentice Hall PTR, Upper Saddle River, New Jersey, 1999.
[4] Popov O. Elements of systems theory – dynamic systems. Wydawnictwo Uczelniane Politechniki Szczecińskiej, Szczecin, 2005 (in Polish).
[5] Popov O. Investigation of structural qualities and informative density of dynamic processes. The method of quantitative estimations. International Conference of Control Problems, IPU, Moscow, 1999.
[6] Popov O., Tretyakov A. Structural properties and informative density of dynamic processes: The method of quantitative estimation at the control, management and identification problems. Proceedings of the 5th International Conference Advanced Computer Systems, part II, pp. 216–224, Szczecin, 1998.
[7] Popov O., Tretyakov A. Quantitative measures of systems structural qualities in control, management and identification problems. Proceedings of the Workshop on European Scientific and Industrial Collaboration WESIC'99, Newport, 1999.
A privilege management infrastructure using role-based access control integrated with the operating system kernel
Michał Słonina, Imed El Fray
Szczecin University of Technology, Faculty of Computer Science
Abstract:
Interconnectivity of computers in network environments increases the risk of security breaches in distributed computer systems. In many system architectures security is provided only in application space. This paper proposes an operating-system-enforced access control policy in which Role Based Access Control and a Privilege Management Infrastructure (PMI) based on X.509 attribute certificates are integrated into the operating system kernel. The resulting architecture aims to reduce the cost of maintaining the security policy by providing an easy way of managing security role assignments for users of the system.
Keywords:
UNIX, RBAC, PKI, PMI
1. Introduction
Access control [1,2,3,4,5] is an integral part of all popular operating systems. Role-based access control (RBAC) [6][7] has become a standard mechanism used in many enterprise IT system architectures. In today's IT systems, access control very often relies on a public key infrastructure to verify the identity of a subject in the system, and on a privilege management infrastructure for the dynamic management of that subject's privileges. Projects such as PERMIS [8][9] and PRIMA [10] provide tools that integrate RBAC access control with a PKI and PMI infrastructure. These systems, however, enforce access control only at the application layer.
The problem with security at the application layer stems from the complicated nature of today's operating systems and from the complexity of the environment in which application code is executed. With application-layer access control, a bug in program code that at first sight is not security-critical (e.g. in an image-loading library) makes it possible to completely bypass the security mechanism implemented in the application.
If the operating system executing the application is not additionally protected by a restrictive mandatory access control (MAC) policy, there is a high probability of information leakage. A compromised system can then be used as a vector for further attacks on the entire IT system.
In [11] the researchers argue that the threats faced by modern computing environments cannot be mitigated without support from the operating system, and they encourage further research in this field.
Most of today's operating systems support role-based access control. In particular, for systems with open-source kernels there are many patches supporting such an access control mechanism. What these systems lack, however, is the integration of the security control mechanisms with a PMI architecture.
In this article we attempt such an integration (an architecture of an RBAC-based access control module that integrates the public key infrastructure PKI and the privilege management infrastructure PMI with the system kernel, based on the PKI [12] and PMI [13] standards) and we try to explain why such an integration is needed and what its advantages are.
2. Requirements of the system architecture
2.1. The privilege management infrastructure PMI
The privilege management infrastructure (PMI) is an extension of the public key infrastructure (PKI). PMI allows arbitrary attributes to be assigned to users of the PKI through the use of attribute certificates. In a PKI, the document that binds a public key to the identity of its owner is called a public key certificate (PKC). In a PMI, the corresponding document is an attribute certificate (AC), which binds an attribute to an identity. Such an attribute may be, for example, the assignment of some role in the system or the granting of a specific access right.
Attribute certificates are needed because:
 an AC does not have to contain information about the holder's public key, as a PKC does,
 the validity period of an attribute assignment made by an AC is usually much shorter than the period for which a PKC binds a public key to an identity,
 revoking an AC does not require revoking the PKC,
 separation of duties: the user's identity is vouched for by a different authority than the one that grants privileges. Note that there may be several attribute authorities, each of which is entitled to assign a different set of attributes, which is very common in various kinds of organizations.
Attribute certificates let us manage role-to-user assignments in an easy way. Combining a PMI with role-based access control requires a well-defined privilege verification point, the Privilege Verification Subsystem (PVS). The PVS is responsible for verifying the user's identity and for authorizing the user to perform the actions, defined in the security policy, that are associated with the role granted by an attribute authority in an attribute certificate (AC). The PVS also verifies whether the role assignment complies with the role-granting policy.
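As an illustration of the PVS logic described above, the following minimal sketch models an attribute certificate as a plain record and checks the holder, the issuer's right to assign the role, and the validity period. It deliberately omits ASN.1 parsing and signature verification, and all names (AttributeCertificate, ROLE_POLICY, pvs_check) and policy values are illustrative assumptions of ours, not part of any real PMI implementation.

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass
class AttributeCertificate:
    """Abstract view of an X.509 AC; parsing and signature checks are left out."""
    serial: int
    holder: str            # identity the attribute is bound to
    issuer: str            # attribute authority that issued the certificate
    role: str
    not_before: datetime
    not_after: datetime

# role-granting policy: which attribute authority may assign which roles (example values)
ROLE_POLICY = {"AA-Medical": {"doctor", "nurse"}, "AA-IT": {"admin"}}

def pvs_check(ac, session_identity, now):
    """PVS-style verification of a role-assignment request."""
    if ac.holder != session_identity:                      # holder must match the session identity
        return False
    if ac.role not in ROLE_POLICY.get(ac.issuer, set()):   # issuer must be allowed to grant the role
        return False
    return ac.not_before <= now <= ac.not_after            # the AC must currently be valid
```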
2.2. Operating system requirements
An operating system is a broad term denoting the software responsible for managing, sharing and allocating resources. Most of today's general-purpose systems (NT, *BSD, Linux, Solaris) are built on foundations created by the UNIX operating system.
Each of these systems has the following concepts:
 UID (User Identifier) – the identifier of a user in the system,
 PID (Process Identifier) – the identifier of a process,
 a file system hierarchy – to simplify the model, let us assume that all objects subject to access control reside in the file system hierarchy.
To support the PMI RBAC model, the operating system must additionally have:
 the concept of a session identifier (SID) – the session identifier is a number assigned to a process which represents the user's session created when the user logs into the system. The session identifier can only be changed by logging in again. Child processes inherit the session identifier of their parents. The system should have a default session for system processes that are not associated with any real user identity,
 the concept of roles – roles are an abstract notion representing a set of certain privileges,
 role assignment – a function mapping a session identifier to roles; a role assignment may contain constraints (e.g. a validity period of the assignment) and the identifier of the attribute certificate that was used in the process of authorizing the user for the role,
 the assignment of a user identity to a session,
 the assignment of a UID to an X.509 identity.
Traditionally, UNIX systems supported only the discretionary access control (DAC) model [14], in which there was no concept of roles. The user identity in a UNIX system is a highly abstract concept that is not formally tied in any way to the real identity of the system's user. The best example illustrating the problem is the root user: not only does it have no identity, but in the implemented DAC mechanisms it has access to all resources of the system.
DAC access control is so deeply integrated with UNIX that it cannot be replaced by a completely new model. Integrating a PMI/RBAC-based access control model with a UNIX system therefore creates a certain dissonance related to the different meaning of identity in DAC and in PMI/RBAC. In the DAC model the old UNIX meaning of identity must be preserved, whereas the PMI/RBAC model must form an additional layer on top of the access control system – in this case the discretionary one – in which the identity of the system user is bound to his real identity. For this reason the DAC and MAC access control subsystems should be treated as separate concepts.
2.2.1. The concept of integrating the PMI model with the operating system
The PMI/RBAC model is a higher-level abstraction whose goal is more direct control over the user's access to system resources. To work around the problem of a certain identity incompatibility between the two models, the security system architecture must contain a function mapping the user's real identity to his UID. The range of this function consists only of those UID numbers that are assigned to a specific real user identity. In turn, tracking sessions by means of the SID number allows a user to be authorized for specific roles only within a given session. Such an approach allows the set of privileges required at a given moment to be managed more simply and enables the inheritance of privileges by processes in the system.
The introduction of the SID also allows a user of the system to make use of otherwise forbidden combinations of roles, provided that those roles are used in different sessions. A role assignment combines the concepts of the SID and of roles. A role assignment should contain the identifier of the attribute certificate that was used in the authorization for the role, the validity period, and the other constraints contained in that certificate. The attribute certificate identifier is required in order to make a certificate revocation mechanism possible: once the operating system learns that a certificate has been revoked, it can, using the certificate identifier, revoke the role assignments existing in the system that are associated with that certificate. The need to include validity period information in a role assignment follows from the very nature of an attribute certificate, which, as described above, is by definition a certificate with a short validity period. Additional time constraints may follow from real-world applications of the system (e.g. a physician should be able to access patient data only during working hours).
2.3. Role-based access control
The requirements for the RBAC subsystem (dictated by the characteristics of the PMI infrastructure, of the operating system, and by usability in production environments) are:
 a hierarchical RBAC model with constraints – the most advanced formal model of role-based access control, defined in [15],
 the temporal nature of role assignments – a role assignment is dynamic by nature, because its validity period is specified in the attribute certificate. A role assignment may expire because the PKC or the attribute certificate expires, or because either of them is revoked. The Temporal RBAC model [16] fits this scenario perfectly.
The requirements of temporal role assignments, hierarchy and constraints are easy to understand by looking at the security requirements of organizations. A role hierarchy is the natural result of the division of an organization according to responsibility for particular aspects of its activity; it allows the internal structure of the organization to be reflected in an abstract hierarchy of roles. The requirement of a role validity period, in turn, is related to certificate processing. Checking the validity of a role assignment every time the user needs a right derived from it would be computationally expensive. A better solution is for the system to remember the validity period of the role and to compare it with the system clock whenever the given privilege is needed.
3. System implementation
The implementation of the proposed system architecture can be achieved by modifying an existing operating system. Open-source operating systems are an ideal platform for experimenting with different security architectures, because of the availability of their code and documentation and the existence of other access control solutions.
3.1. Components of the system architecture
The following components of the proposed architecture can be distinguished:
 the security kernel – consisting of a Permission Decision Point (PDP) and a Permission Enforcement Point (PEP),
 the privilege verification subsystem – the component responsible for implementing the logic related to the PMI infrastructure. It analyses and verifies PKC and AC certificates and checks their correctness against the security policy. This component runs in user space,
 user-space tools – the tools and libraries used to manage the security context of a session (they allow the session identity and the role assignments to be manipulated).
Separating the PVS from the security kernel and from the user-space components is needed in order to avoid the risk involved in processing the complicated ASN.1 structures that encode PKC and AC certificates.
3.2. Basic operations in the system
To support the PMI RBAC model, the operating system must provide ways of manipulating the session identity and role membership. The basic operations are:
 request_identity_change (PKC) – a request to bind an identity to the session; the PKC certificate carries the information about the identity we want to bind to the given session,
 request_role_grant (AC) – a request to assign the role contained in the attribute certificate AC to the current session. The holder of the attribute certificate must match the current session identity,
 revoke_role (role) – revokes the given role in a specific session.
Identity verification in the proposed PMI/RBAC system should be a concept orthogonal to the traditional user identity represented by the UID in the traditional DAC system, because changing the UID in a UNIX system sometimes means only a change in the privilege level, whereas the identity in the PMI/RBAC model is strictly tied to the user's real identity (Figure 1).
Figure 1. Establishing the user's identity in the system
To establish the session identity, the user presents his PKC certificate to the system and then confirms his identity in the way specified in the security policy (e.g. by providing an access password). The PVS verifies this confirmation as well as the validity of the certificate, and then returns the verification result together with the X.509 certificate to the system kernel.
Role authorization (Figure 2), in turn, is performed by the user by presenting the appropriate attribute certificate to the operating system. The holder field of the attribute certificate must match the identity of the user creating the request, and the issuer of the certificate must be entitled, according to the security policy, to grant the specific role. The role and the constraints contained in the certificate determine the parameters of the role-granting operation. The attribute certificate is verified and analysed by the PVS. The PVS checks the validity of the certificate and obtains the information about the role assignment, its validity and its constraints. The PVS returns to the system kernel the information about which role is to be assigned to the given identity, the serial number of the attribute certificate used in the request, the validity period of the role assignment and its constraints.
Figure 2. Role authorization in the system
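A minimal sketch of the per-session security context and of the three basic operations listed in section 3.2 is given below. It builds on the AC record sketched in section 2.1, stores the AC serial number and the expiry time with every role assignment (so that revocation and the cheap clock-based validity check from section 2.3 are possible), and uses names and simplifications of our own; it is not the authors' kernel interface.

```python
from datetime import datetime

class SessionSecurityContext:
    """Security context of one session (SID): bound identity plus role assignments."""

    def __init__(self, sid):
        self.sid = sid
        self.identity = None        # real (X.509) identity bound to the session
        self.roles = {}             # role name -> (ac_serial, valid_until)

    def request_identity_change(self, pkc_subject, pvs_verified):
        """Bind a PVS-verified PKC identity to the session."""
        if pvs_verified:
            self.identity = pkc_subject

    def request_role_grant(self, ac, pvs_verified):
        """Assign the role carried by a PVS-verified AC; keep its serial and expiry."""
        if pvs_verified and ac.holder == self.identity:
            self.roles[ac.role] = (ac.serial, ac.not_after)

    def revoke_role(self, role):
        """Revoke the given role in this session."""
        self.roles.pop(role, None)

    def revoke_by_certificate(self, ac_serial):
        """Drop every role assignment authorized by a certificate that has been revoked."""
        self.roles = {r: v for r, v in self.roles.items() if v[0] != ac_serial}

    def has_role(self, role, now=None):
        """Check the stored expiry against the clock instead of re-verifying certificates."""
        now = now or datetime.utcnow()
        entry = self.roles.get(role)
        return entry is not None and now <= entry[1]
```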
3.3. Benefits of the system
The basic advantage of the proposed system is the increased security of every computer in the IT system, obtained by introducing, through the PMI/RBAC infrastructure, an additional access control layer that directly models the structure of the organization in which it is deployed. Access control is very tightly integrated with the operating system kernel, which significantly reduces the system's exposure to threats resulting from bugs in the applications running on it. In the worst case, if the protections of an application user are defeated, the implemented MAC system will not allow the damage to spread beyond the domain defined by the user's current privileges in the system.
4. Conclusions
Integrating PMI/RBAC access control with the operating system brings tangible benefits both in system security and in ease of administration. The implementation of such a system appears to be feasible using the LSM framework of the Linux kernel and open-source cryptographic tools. In the authors' opinion, the next step is to build such a system and analyse its applicability in the real world.
References
[1] Lampson, B.: Protection. In: Proceedings of the 5th Annual Princeton Conference
on Information Sciences and Systems, Princeton University (1971) 437–443
[2] Bell, D., La Padula, L.: Secure Computer Systems: Mathematical Foundations
(Volume 1). Technical report, ESD-TR-73-278, Mitre Corporation (1973)
[3] Lipton, R., Snyder, L.: A Linear Time Algorithm for Deciding Subject Security.
Journal of the ACM (JACM) 24(3) (1977) 455–464
[4] Harrison, M., Ruzzo, W., Ullman, J.: Protection in operating systems. Communications of the ACM 19(8) (1976) 461–471
[5] Denning, D.: A Lattice Model of Secure Information Flow. Communications (1976)
[6] Ferraiolo, D., Kuhn, R.: Role-based access controls. In: 15th NIST-NCSC National
Computer Security Conference. (1992) 554–563
[7] Ferraiolo, D., Sandhu, R., Gavrila, S., Kuhn, D., Chandramouli, R.: Proposed NIST
standard for role-based access control. ACM Transactions on Information and System Security (TISSEC) 4(3) (2001) 224–274
[8] Chadwick, D., Otenko, A., Ball, E.: Role-based access control with X.509 attribute certificates. Internet Computing, IEEE 7(2) (2003) 62–69
[9] Chadwick, D., Otenko, A.: The PERMIS X.509 role based privilege management infrastructure. Future Generation Computer Systems 19(2) (2003) 277–289
[10] Lorch, M., Adams, D., Kafura, D., Koneni, M., Rathi, A., Shah, S.: The PRIMA
system for privilege management, authorization and enforcement in grid environments. Grid Computing, 2003. Proceedings. Fourth International Workshop on
(2003) 109–116
[11] Loscocco, P., Smalley, S., Muckelbauer, P., Taylor, R., Turner, S., Farrell, J.: The
Inevitability of Failure: The Flawed Assumption of Security in Modern Computing
Environments. Proceedings of the 21st National Information Systems Security Conference 10 (1998) 303–314
[12] Housley, R., Polk, W., Ford, W., Solo, D.: RFC 3280: Internet X.509 Public Key Infrastructure Certificate and Certificate Revocation List (CRL) Profile. Internet RFCs (2002)
[13] Farrell, S., Housley, R.: RFC3281: An Internet Attribute Certificate Profile for
Authorization. Internet RFCs (2002)
[14] Department of Defense: Department of Defense Trusted Computer System Evaluation Criteria. (December 1985) DOD 5200.28-STD (supersedes CSC-STD-001-83).
[15] Sandhu, R., Coyne, E., Feinstein, H., Youman, C.: Role-based access control models. Computer 29(2) (1996) 38–47
[16] Bertino, E., Bonatti, P., Ferrari, E.: TRBAC: a temporal role-based access control
model. Proceedings of the fifth ACM workshop on Role-based access control
(2000) 21–30
Evolving weighted topologies for neural networks using
genetic programming
Marcin Suchorzewski
Szczecin University of Technology,
Faculty of Computer Science and Information Technology
Abstract:
NeuroEvolution of Augmenting Topologies (NEAT) is a method of evolving weighted topologies for neural networks which has been shown to be effective in solving double pole balancing problems. Its usability, however, is largely reduced by its high specificity and custom architecture. In this paper we propose a much more standardized framework for evolving neural networks, achieving comparable or even better performance on three
evolving neural networks, achieving comparable or even better performance in three
benchmark problems. We show that tree-based genetic programming (GP) requires only
minor modifications to represent and evolve neural networks.
Despite its high performance, however, we remain sceptical about the robustness of the proposed method. In a series of further experiments we demonstrate that the performance is sensitive to parameter settings and algorithm details. We therefore refrain from drawing conclusions about the performance of our method and only stress the improvement in its simplicity.
Keywords:
NeuroEvolution of Augmenting Topologies, NEAT, genetic programming, neural networks,
parameter setting
1. Introduction
Designing neural networks is a difficult task. Although practical guidelines exist on how to design a network for a typical problem, no proven methodology exists for synthesizing networks capable of solving arbitrary and complex tasks. One promising direction of research is to borrow the method from nature and employ evolution. But is it simpler to design an evolutionary algorithm sufficiently powerful to yield robust neural networks? Many researchers believe so.
NEAT [11, 12, 4] is a method of evolving weighted topologies for neural networks, i.e. hard-wired, non-learnable networks with homogeneous neurons. It has been shown to be more effective than several other methods of this kind, as well as than conventional reinforcement learning algorithms. As Stanley and Miikkulainen [11] claim, its increased effectiveness over previously proposed methods is due to:
1. Homologous crossover, originally called “a principled method of crossover of different topologies”, implemented using so-called historical markings. It
addresses the problem of variable length genomes and the problem of competing
conventions.
2. Minimal initialization – networks in the initial population have no hidden neurons and only then slowly build up, thus reducing the initial exploratory overhead.
3. Speciation of neural networks during the evolution, which protects promising innovations from premature elimination and helps to avoid the competing conventions problem.
The importance of these features was demonstrated in “ablation” experiments. Removing any of them caused a severe impairment in performance – by a factor of 1.5 in the case of the lack of crossover and even several times in the case of the other two.
NEAT has been shown to be successful in solving the XOR problem, pole balancing and double pole balancing with and without velocities. (It was further applied to several non-standard problems [12, 10, 9].) For all these benchmarks it performed very well. An analysis of NEAT, however, reveals that the method doesn't scale up well. Adding just one more input to the XOR problem (parity-3) makes it much more difficult for NEAT to solve. The method is only capable of randomly adding connections one by one, whereas the N-parity problem is hardly susceptible to slowly grown solutions – it requires a scalable pattern grown at once. NEAT can't exploit regularities in the problem, partly because it doesn't facilitate modularity, iteration or recursion.
The aim of this paper, however, isn't to improve the NEAT performance, but to question whether its sophistication really pays off. The essential features mentioned above are quite complicated and require considerable implementation effort. Besides them, there is a mutation operator in 3 variants, a mechanism for detecting recurrent connections, and a few more. Are these features really justified by performance?
In section 2 we present an alternative, considerably simpler method of evolving weighted topologies for neural networks. In section 3 we compare it against the original NEAT on 3 benchmark problems: double pole balancing with and without velocities, and parity-3. In section 4 we show that, unlike in NEAT, the performance of our solution doesn't depend on some high-level features but on rudimentary parameters and algorithm details, which is very common in evolutionary computation. We conclude the paper in section 5.
2. Evolving neural networks using GP
2.1. Neural network and its representation
We evolve feed-forward networks, except for the double pole balancing without velocities problem, where auto-recurrent connections are used. Following NEAT, only one type of neuron is allowed across all problems, namely a generalized McCulloch-Pitts neuron with the modified sigmoid transfer function φ(x) = 1 / (1 + exp(−4.9x)).
We represent networks with trees and evolve them using GP [5, 7], a general algorithm intended to evolve programs in functional languages, such as LISP expressions. Yet any feed-forward, single-output neural network can be represented with a tree using just one type of terminal node (input x_i, with x_0 = 1 being the bias input) and 3 types of function nodes: addition +, weight w, and transfer function φ().
The grammar of the tree is almost unrestricted, with one exception: the children of a weight node w must be either input nodes or transfer function nodes, i.e. addition can't appear directly below a weight node.1 All other node configurations in trees can be interpreted as valid neural network topologies. Figure 1 shows an example of a genetic tree solving the XOR problem and its corresponding neural network. The crucial feature of the tree representation is that neurons or network inputs having multiple outputs – as is often the case – are represented multiple times in the tree. Adding a connection between neurons requires duplicating the whole subtree containing the pre-synaptic neuron and adding it under the post-synaptic neuron. This fact has severe consequences for the pattern of variation and code reuse, though we can't yet decide whether these consequences foster or hinder the overall performance. An alternative to the tree representation would be a graph encoding, also used to evolve neural networks in conjunction with Parallel Distributed Genetic Programming [8, 6].
Figure 1. An example of evolved solution for the XOR problem. Tree representation (a) and its translation to neural network (b).
1 This is if we assume all neurons are homogeneous and use the φ() transfer function. If we allow linear neurons then such a restriction is not necessary.
The representation just proposed is capable of representing feed-forward networks only. Solving the double pole balancing problem without velocities, however, requires some kind of memory or recurrent connectivity. While this is an obvious shortcoming of the representation, we are able at least to circumvent it using another type of transfer function node, φ̂(), acting as if it had an auto-recurrent weighted connection:

y_t = φ̂(x + w y_{t−1}),

where y_t is the output of the neuron at time step t, x is the sum of the incoming connections and w is an internally stored weight of the auto-recurrent connection.
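To make the representation concrete, below is a minimal sketch of a possible node encoding and of a recursive evaluator, including the auto-recurrent φ̂ node. The list-based encoding, the node tags and the hand-made XOR-style tree at the end are our own illustrative choices (the tree is hand-constructed to respect the grammar and exercise the evaluator, not an evolved solution).

```python
import math

# node encodings (ours): ['x', i] input, ['w', value, child] weighted connection,
# ['+', [children]] summation, ['phi', child] neuron, ['phi_r', w, child] neuron
# with an internally stored auto-recurrent weight w

def phi(x):
    return 1.0 / (1.0 + math.exp(-4.9 * x))

def evaluate(node, inputs, state=None):
    """Evaluate a genetic tree bottom-up; `state` keeps y_{t-1} for phi_r nodes."""
    kind = node[0]
    if kind == 'x':
        return inputs[node[1]]                        # x_0 = 1 plays the role of the bias
    if kind == 'w':
        return node[1] * evaluate(node[2], inputs, state)
    if kind == '+':
        return sum(evaluate(c, inputs, state) for c in node[1])
    if kind == 'phi':
        return phi(evaluate(node[1], inputs, state))
    if kind == 'phi_r':                               # y_t = phi(x + w * y_{t-1})
        prev = state.get(id(node), 0.0) if state is not None else 0.0
        y = phi(evaluate(node[2], inputs, state) + node[1] * prev)
        if state is not None:
            state[id(node)] = y
        return y
    raise ValueError('unknown node kind: %s' % kind)

# a hand-made XOR-style network: two hidden neurons feeding one output neuron
h1 = ['phi', ['+', [['w', 6.0, ['x', 1]], ['w', 6.0, ['x', 2]], ['w', -3.0, ['x', 0]]]]]
h2 = ['phi', ['+', [['w', 6.0, ['x', 1]], ['w', 6.0, ['x', 2]], ['w', -9.0, ['x', 0]]]]]
out = ['phi', ['+', [['w', 6.0, h1], ['w', -12.0, h2], ['w', -3.0, ['x', 0]]]]]

for a in (0.0, 1.0):
    for b in (0.0, 1.0):
        print(a, b, round(evaluate(out, {0: 1.0, 1: a, 2: b}), 3))
```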
2.2. Initialization and operators
“Minimal initialization” was recognized as a very important factor in the performance of NEAT. It means that inputs are initially connected directly to the output and only the weights are randomly assigned. We confirm that such an initialization is a kick-start for the evolution and apply it as well, except that we don't require every input to actually be connected. Weights are drawn from a normal distribution with mean 0 and standard deviation 2 (N(0, 2)).
We employ three types of operators:
 point mutation – applies to weights (including the weights of auto-recurrent connections in φ̂ nodes) and inputs. The node to be mutated is drawn uniformly from all w and x_i nodes and is given a new random value, drawn from a normal (N(0, 2)) or a discrete uniform distribution, respectively.
 subtree mutation – replaces a randomly selected node with a new, random (but valid) subtree. The maximum depth of the new subtree is limited to 3 levels. The subtrees are generated with the “grow” method using probability 0.66.
 crossover – a non-standard, two-offspring operator swapping selected subtrees between 2 parents. The node to swap is selected randomly in the first parent, but in the second parent the selection is restricted to transfer function nodes only. Chances are that one offspring will not end up with a transfer function node in the root. While this is not a problem in general, it violates the formal requirements, and therefore such a tree is considered invalid, assigned some poor fitness and effectively eliminated. (A compact sketch of the initialization and point mutation follows this list.)
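The sketch below, using the list-based node encoding from the previous sketch, shows a possible minimal initialization and point mutation. The connection probability p_connect and the always-connected bias are our own assumptions (the text only says that not every input has to be connected), and the internal weight of φ̂ nodes is ignored here for brevity.

```python
import random

def minimal_init(n_inputs, p_connect=0.9):
    """Inputs wired directly to the output neuron with N(0, 2) weights."""
    terms = [['w', random.gauss(0.0, 2.0), ['x', 0]]]        # bias term (our assumption)
    for i in range(1, n_inputs + 1):
        if random.random() < p_connect:                      # connection is optional
            terms.append(['w', random.gauss(0.0, 2.0), ['x', i]])
    return ['phi', ['+', terms]]

def collect(node, kinds, out):
    """Pre-order walk gathering nodes of the requested kinds (here: w and x)."""
    if node[0] in kinds:
        out.append(node)
    if node[0] == 'w':
        collect(node[2], kinds, out)
    elif node[0] == '+':
        for child in node[1]:
            collect(child, kinds, out)
    elif node[0] in ('phi', 'phi_r'):
        collect(node[-1], kinds, out)
    return out

def point_mutation(tree, n_inputs):
    """Draw one w or x node uniformly and give it a new random value."""
    node = random.choice(collect(tree, ('w', 'x'), []))
    if node[0] == 'w':
        node[1] = random.gauss(0.0, 2.0)                     # new weight ~ N(0, 2)
    else:
        node[1] = random.randint(0, n_inputs)                # new input index (discrete uniform)
    return tree
```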
3. Evaluation
We evaluate the performance of our method on three benchmark problems:
 Double pole balancing with velocity information (DPV). The network has 6 inputs:
position and velocities for a cart and the two poles. The fitness is calculated as
log(T / t ) , where t is the number of time steps the poles are kept balanced.
Candidate neural network qualifies as a solution if t = T = 100 000 .
 Double pole balancing without velocities (DPNV, 3 inputs). The network must be
able to keep the poles balanced over T = 100 000 time steps and moreover to
generalize to at least 200 initial states (test cases) to be qualified as a solution.
Following description in [11], the fitness includes a penalty for “jiggling” and is
calculated as 0.1log(T / t ) + 0.9(2 / (1 + exp( − j / 100)) − 1) , where j is the amount
of jiggling in the control (see [11] for details).
 Parity-3 problem. We assume the network is a solution if the absolute error is less than 0.33 for each of the 8 test cases. We evaluate the performance of the original NEAT on this task with the jNeat package.2
GP parameters used in the evaluation are summarized in Table 1.
Table 1. GP-NEAT parameters tableau

Parameter        DPV                     DPNV                    Parity-3
Terminal nodes   x0 – x6                 x0 – x3                 x0 – x3
Function nodes   φ, +, w                 φ̂, +, w                 φ, +, w
Std fitness (J)  log(T/t)                see text                MAE
Final fitness    using parsimony: F = J(1 + p), p = 0.001 exp(0.01 tree_size)
Selection        generational, non-elitist, tournament selection of size:
                 7                       20                      7
Population       150                     1000                    1000
Operators        pc = 0.0, pm = 1.0      pc = 0.4, pm = 0.8      pc = 1, pm = 0.8
                 given mutation: point mut. ppm = 1.0, subtree mut. psm = 0.5
Termination      solution found or maximum no. of generations τ elapsed:
                 τ = 200                 τ = 100                 τ = 100
Table 2 compares the results reported in [11] with our reimplementation of the original NEAT and with the GP-NEAT proposed in this paper. We measure the performance using the success rate (SR) and the average number of evaluations needed to yield a solution (‘#E’). A single evolutionary run is considered successful if it delivers a solution within the simulation time (see the Termination parameter in Tab. 1). Following the original report, the results are averaged over 120 simulation runs in the case of DPV and only 20 in the case of DPNV. We chose 100 runs for parity-3. Only successful simulations are counted in the average number of evaluations (×1000, column ‘#E’) and its standard deviation (column ‘Std’).

2 See http://www.cs.ucf.edu/~kstanley/neat.html.
We reimplemented NEAT in two variants – using speciation and using an island model (demes). Despite considerable effort in parameter tuning, we were unable to reproduce the results reported in the original paper. We found many algorithm details hardly mentioned in the paper, e.g. the selection type or details about recurrent connections; so, apparently, the performance depends on such details to a high degree. On the other hand, two population structuring approaches as different as speciation and the island model produced very similar results.
Table 2. Performance comparison between several NEAT variants. See text for explanations.

NEAT variant      DPV                  DPNV                 Parity-3
                  #E    Std   SR       #E    Std   SR       #E    Std   SR
Original          3.60  2.70  1.0      33.2  21.8  1.0      57.7  28.7  0.59
Re. speciation    9.91  13.2  1.0      45.1  34.3  1.0      55.4  22.2  0.70
Re. demes         9.90  6.51  1.0      41.4  26.7  1.0      38.5  15.7  0.62
GP-NEAT           3.41  1.72  1.0      20.4  17.3  0.95     34.3  24.2  0.73
As can be seen, GP-NEAT slightly outperforms3 the original method, though only after parameter tuning by hand. We find the performance to be sensitive to parameter settings and algorithm details; this is illustrated in the next section.
An important observation is that the pole balancing problems can be solved using just a single neuron. In fact, the most effective way to solve them is to restrain the network growth and just mutate the weights. So, in our view, the problem is not a good benchmark for growing neural networks.
Quite the reverse situation occurs in the case of the parity-3 problem, which is solved only by a few neurons (a minimum of 3, as observed), and crossover is better at adding new neurons than mutation. Moreover, this is a highly epistatic problem, i.e. the fitness landscape is very rugged here. In such a scenario the evolutionary process is close to random search, and even very intense variation should not impair the performance.
4. Experiments
As already mentioned in the introduction, Stanley and Miikkulainen [11] identified 3 features essentially responsible for the improved effectiveness of NEAT, and their observations were confirmed in ablation experiments. We perform experiments in a similar vein, but instead of high-level features we consider rather primitive parameters. During extensive experimentation with the algorithms we observed how thin the border between success and failure is, and how much the effectiveness depends on parameters seemingly of secondary importance.

3 We don't perform a statistical analysis, first because of the lack of detailed data from the original research, and secondly because, in the light of the next section, there is little point in demonstrating the statistical significance of an improvement in peak performance.
The experiments are:
1. Simplification of the sigmoid transfer function to φ(x) = 1 / (1 + exp(−x)). This is almost equivalent to narrowing the normal distribution used in the construction and mutation of weight nodes by 5 times, i.e. using N(0, 0.4) instead of N(0, 2).
2. Increase of the maximum depth of the subtree mutation operator to 5.
3. Modification of the crossover operator so that, instead of restricting the set of candidates in the second parent tree to φ nodes, the set of candidates consists of all nodes of the same type as the type of the node selected in the first parent. This modified operator also preserves syntactic correctness.
4. Fixation of the probabilities of mutation and crossover to pm = 0.5 and pc = 0.5.
The results are shown in Table 3. In the first experiment the change concerned the phenotype of the neural network, so the effects are likely to be relevant not only to GP-NEAT but to NEAT as well. Particularly notable is the decrease of performance on the DPNV problem. The second modification affects the performance on all problems, though not so radically. Modifying the crossover operator in the third experiment turned out to be fatal for the parity-3 problem. (It doesn't affect DPV, because crossover is disabled in that case.) Fixing the probabilities of the operators (pc in particular) severely degraded the performance on DPV. Surprisingly, the results for parity-3 considerably improved, which means that our prior parameter tuning might not have been very accurate.
Table 3. Results of experiments

Setup        DPV                    DPNV                   Parity-3
             #E    Std   SR         #E    Std   SR         #E    Std   SR
GP-NEAT      3.41  1.72  1.0        20.4  17.3  0.95       34.3  24.7  0.73
Exp. 1.      4.93  2.83  0.925      –     –     0.0        62.1  19.7  0.66
Exp. 2.      4.64  4.12  0.942      34.0  25.1  0.95       38.2  23.0  0.63
Exp. 3.      3.41  1.72  1.0        29.4  11.6  1.0        49.3  13.6  0.03
Exp. 4.      5.95  6.37  0.5        24.3  13.9  0.95       28.7  18.4  0.79
The experiments showed that although the peak performance of GP-NEAT is high, it is sensitive to many details of the algorithm, and therefore its robustness is questionable. It's difficult to say how robust the original NEAT is. The experiments with our NEAT reimplementation suggest that it has its own peculiarities, but we don't explore this further. A fair comparison between evolutionary methods, covering not only their peak performance but also their robustness, is a widely recognized problem [1]. Setting the parameters of evolutionary algorithms in a systematic way is a major part of this problem [3]. In [2] Daida et al. vividly demonstrated how badly spoiled the comparison and conclusions can be when the performance is attributed to an alleged novelty in the algorithm, whereas it actually comes from specifically tuned parameters and algorithm details.
5. Summary and conclusions
In this paper we proposed a new method of evolving weighted topologies for neural networks. The GP-NEAT method compares favorably with the already known and effective NEAT [11], but is much more standardized. It makes use of the standard, tree-based GP framework, requiring only minor changes to the variation operators and the initialization procedure. In our opinion GP-NEAT is much easier to implement and use in practice, given the popularity and wide availability of GP software.
We showed that GP-NEAT performs favorably in terms of peak performance – the measure usually used in evaluation and comparison. In section 4, however, we showed that GP-NEAT is sensitive to parameter settings and algorithm details, and therefore peak performance results might be misleading as an overall measure of utility. Accounting for the effort of parameter tuning required to obtain peak performance would say more about the method's robustness. A new experimentation framework is necessary to evaluate and compare evolutionary methods fairly. This is an important area for future work.
References
[1] Bartz-Beielstein T., Preuss M. Experimental research in evolutionary computation.
In Proceedings of the 2007 GECCO conference companion on Genetic and
evolutionary computation, pages 3001-3020. ACM Press New York, NY, USA,
2007.
[2] Daida J., Ampy D., Ratanasavetavadhana M., Li H., Chaudhri O. Challenges with verification, repeatability, and meaningful comparison in genetic programming: Gibson's magic. In Proceedings of the Genetic and Evolutionary Computation Conference, volume 2, pages 1851-1858, 1999.
[3] Eiben A. E., Michalewicz Z., Schoenauer M., Smith J. E.. Parameter control in
evolutionary algorithms. In Parameter Setting in Evolutionary Algorithms, Studies
in Computational Intelligence. Springer, 2007.
[4] James D., Tucker P. A comparative analysis of simplification and complexification
in the evolution of neural network topologies. In Keijzer M. (ed.), Late Breaking
Papers at the 2004 Genetic and Evolutionary Computation Conference, Seattle,
Washington, USA, 26 July 2004.
[5] Koza J. R. Genetic Programming. MIT Press, 1992.
[6] Poli R. Parallel distributed genetic programming. In Corne D., Dorigo M., Glover
F. (eds.), New Ideas in Optimisation, chapter 27, pages 403-432. McGraw-Hill Ltd.,
Maidenhead, UK, 1999.
[7] Poli R., Langdon W., McPhee N. A Field Guide to Genetic Programming. Lulu
Press, 2008.
[8] Pujol J., Poli R. Evolving the topology and the weights of neural networks using a dual representation. Applied Intelligence, 8(1):73-84, 1998.
[9] Stanley K., Kohl N., Sherony R., Miikkulainen R. Neuroevolution of an automobile
crash warning system. In GECCO’05: Proceedings of the 2005 Conference on
Genetic and Evolutionary Computation, pages 1977-1984, New York, NY, USA,
2005. ACM.
[10] Stanley K. O., Bryant B. D., Miikkulainen R. Real-time evolution in the NERO
video game. In Proceedings of the IEEE 2005 Symposium on Computational
Intelligence and Games. IEEE, 2005.
[11] Stanley K. O., Miikkulainen R. Evolving neural networks through augmenting
topologies. Evolutionary Computation, 10(2):99-127, 2002.
[12] Stanley K. O., Miikkulainen R. Competitive coevolution through evolutionary
complexification. Journal of Artificial Intelligence Research, 21:63-100, 2004.
Efficient algorithms for computing expressions of the form Y = (A⊗B)X
Galina łariova, Alexandr łariov
Szczecin University of Technology, Faculty of Computer Science
Abstract:
Two algorithms for computing the expression Y = (A⊗B)X with a reduced number of arithmetic operations are proposed.
Keywords:
Matrix-vector multiplication, Kronecker product, fast algorithms, reduction of number of
arithmetical operations
1. Introduction
Multiplying a vector by a matrix that is a tensor product (in other words, a Kronecker product [1, 2]) of two other matrices is a typical "macro-operation" of many digital signal processing algorithms [3-5]. Its most common application is the implementation of fast algorithms for discrete orthogonal transforms [6].
This operation requires a large computational effort, because in the most general case, for matrices of dimensions M×N and K×L being the factors of the tensor product, it requires 2KLMN multiplications and KM(LN-1) additions. Currently there are no algorithms that realize this macro-operation with a smaller number of arithmetic operations for arbitrary matrices. It turns out, however, that the possibility of constructing such algorithms does exist. In this paper, therefore, two modifications of "fast" algorithms realizing this macro-operation will be synthesized.
Let us first state the basic expression precisely, taking into account the sizes of the vector and matrix factors. We assume that the following computation is to be realized:

Y_{KM \times 1} = (A_{M \times N} \otimes B_{K \times L}) X_{LN \times 1},    (1)

where the symbol "⊗", here and in the remainder of the article, denotes the tensor product operation, and the matrices A_{M×N} and B_{K×L} are the factors of the tensor product, of arbitrary sizes.
We also define X_{LN×1} = [x_0, x_1, ..., x_{LN−1}]^T as the input data vector and Y_{KM×1} = [y_0, y_1, ..., y_{KM−1}]^T as the output data vector, whose sizes follow from the form of the tensor product.
Let us also recall that the tensor product of two matrices is defined as follows:

A_{M \times N} \otimes B_{K \times L} =
\begin{pmatrix}
a_{0,0} B_{K \times L} & a_{0,1} B_{K \times L} & \cdots & a_{0,N-1} B_{K \times L} \\
a_{1,0} B_{K \times L} & a_{1,1} B_{K \times L} & \cdots & a_{1,N-1} B_{K \times L} \\
\vdots & \vdots & \ddots & \vdots \\
a_{M-1,0} B_{K \times L} & a_{M-1,1} B_{K \times L} & \cdots & a_{M-1,N-1} B_{K \times L}
\end{pmatrix}.    (2)
An analysis of expression (2) shows that this operation requires KLMN multiplications. Furthermore, the subsequent multiplication of this matrix by the vector X_{LN×1} according to formula (1) requires KLMN additional multiplications and KM(LN−1) additions. The total number of arithmetic operations needed to perform the macro-operation in question is therefore quite large, which causes considerable time losses during computation. We will try to reduce this number and show that an algorithm realizing this macro-operation with fewer multiplications and additions can be synthesized.
2. Synthesis of "fast" algorithms for multiplying a vector by a matrix that is a tensor product of two other matrices
Let us first build the matrix construction B_{KN×LN}, which defines the operations performed on the elements of the data vector in the first iteration of the computational process. We define it as the tensor product of the matrices I_N and B_{K×L}:

B_{KN \times LN} = (I_N \otimes B_{K \times L}),    (3)

where the matrix I_N is the identity matrix of order N.
Let us also introduce the matrix that shuffles the elements of the data vector computed in the first iteration:

P_{KMN \times KN} = I_N \otimes (1_{M \times 1} \otimes I_K),    (4)

where 1_{M×1} = [1, 1, ..., 1]^T is a matrix consisting of ones only, whose size is given by the subscript.
Let us introduce the diagonal matrix

D_{KMN} = \bigoplus_{n=0}^{N-1} D_{KM}^{(n)}, \qquad D_{KM}^{(n)} = \bigoplus_{m=0}^{M-1} (I_K \cdot a_{m,n}),    (5)

where the symbol ⊕ denotes the direct (tensor) sum of two matrices [6].
Let us now define one more matrix construction, which determines the rule for summing selected elements of the vector of processed data:

\Xi_{KM \times KMN} = 1_{1 \times N} \otimes I_{KM}.    (6)

The final computational procedure of the fast algorithm realizing the considered macro-operation then takes the following form:

Y_{KM \times 1} = \Xi_{KM \times KMN}\, D_{KMN}\, P_{KMN \times KN}\, B_{KN \times LN}\, X_{LN \times 1}.    (7)
A different way of organizing the computations for the considered macro-operation can also be proposed. In this case the individual matrix constructions look as follows:

\tilde{B}_{KM \times LM} = (I_M \otimes B_{K \times L}),    (8)

\tilde{P}_{LMN \times LN} = (1_{M \times 1} \otimes I_{LN}),    (9)

\tilde{D}_{LMN} = \bigoplus_{m=0}^{M-1} \tilde{D}_{LN}^{(m)}, \qquad \tilde{D}_{LN}^{(m)} = \bigoplus_{n=0}^{N-1} (I_L \cdot a_{m,n}),    (10)

\tilde{\Xi}_{LM \times LMN} = I_M \otimes (1_{1 \times N} \otimes I_L),    (11)

and the final procedure has the following form:

Y_{KM \times 1} = \tilde{B}_{KM \times LM}\, \tilde{\Xi}_{LM \times LMN}\, \tilde{D}_{LMN}\, \tilde{P}_{LMN \times LN}\, X_{LN \times 1}.    (12)
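The two factorizations can be checked numerically. The NumPy sketch below builds the factors of procedures (7) and (12) explicitly, so it only verifies correctness against the direct product and does not by itself realize the operation-count savings, which require exploiting the sparsity of the factors; the function names are ours.

```python
import numpy as np

def direct(A, B, x):
    """Reference computation Y = (A kron B) X."""
    return np.kron(A, B) @ x

def procedure_7(A, B, x):
    """Y = Xi * D * P * (I_N kron B) * X, cf. (3)-(7)."""
    M, N = A.shape
    K, L = B.shape
    t = np.kron(np.eye(N), B) @ x                                       # B_{KN x LN}
    t = np.kron(np.eye(N), np.kron(np.ones((M, 1)), np.eye(K))) @ t     # P_{KMN x KN}
    d = np.concatenate([np.repeat(A[:, n], K) for n in range(N)])       # diagonal of D_{KMN}
    t = d * t
    return np.kron(np.ones((1, N)), np.eye(K * M)) @ t                  # Xi_{KM x KMN}

def procedure_12(A, B, x):
    """Y = (I_M kron B) * Xi~ * D~ * P~ * X, cf. (8)-(12)."""
    M, N = A.shape
    K, L = B.shape
    t = np.kron(np.ones((M, 1)), np.eye(L * N)) @ x                     # P~_{LMN x LN}
    d = np.concatenate([np.repeat(A[m, :], L) for m in range(M)])       # diagonal of D~_{LMN}
    t = d * t
    t = np.kron(np.eye(M), np.kron(np.ones((1, N)), np.eye(L))) @ t     # Xi~_{LM x LMN}
    return np.kron(np.eye(M), B) @ t                                    # B~_{KM x LM}

rng = np.random.default_rng(0)
A, B, x = rng.standard_normal((2, 3)), rng.standard_normal((3, 2)), rng.standard_normal(6)
print(np.allclose(direct(A, B, x), procedure_7(A, B, x)),
      np.allclose(direct(A, B, x), procedure_12(A, B, x)))              # expected: True True
```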
3. Examples of constructing "fast" algorithms for multiplying a vector by a matrix that is a tensor product of two other matrices
Let us consider the synthesis of a fast algorithm for computing expression (1) for K = 3, L = 2, M = 2, N = 3. We then have:

X_{6 \times 1} = [x_0, x_1, x_2, x_3, x_4, x_5]^T, \qquad Y_{6 \times 1} = [y_0, y_1, y_2, y_3, y_4, y_5]^T;

A_{2 \times 3} = \begin{pmatrix} a_{0,0} & a_{0,1} & a_{0,2} \\ a_{1,0} & a_{1,1} & a_{1,2} \end{pmatrix}, \qquad B_{3 \times 2} = \begin{pmatrix} a_{00} & a_{01} \\ a_{10} & a_{11} \\ a_{20} & a_{21} \end{pmatrix};

Y_{6 \times 1} = (A_{2 \times 3} \otimes B_{3 \times 2}) X_{6 \times 1}.

The fast algorithm realizing the considered macro-operation according to procedure (7) for this example is described by the following vector-matrix transformation:

Y_{6 \times 1} = \Xi_{6 \times 18}\, D_{18}\, P_{18 \times 9}\, B_{9 \times 6}\, X_{6 \times 1},

and the individual matrix factors look as follows:
B_{9 \times 6} = I_3 \otimes B_{3 \times 2} = \begin{pmatrix} B_{3 \times 2} & 0_{3 \times 2} & 0_{3 \times 2} \\ 0_{3 \times 2} & B_{3 \times 2} & 0_{3 \times 2} \\ 0_{3 \times 2} & 0_{3 \times 2} & B_{3 \times 2} \end{pmatrix};

P_{18 \times 9} = I_3 \otimes (1_{2 \times 1} \otimes I_3) = \begin{pmatrix} I_3 & 0_{3 \times 3} & 0_{3 \times 3} \\ I_3 & 0_{3 \times 3} & 0_{3 \times 3} \\ 0_{3 \times 3} & I_3 & 0_{3 \times 3} \\ 0_{3 \times 3} & I_3 & 0_{3 \times 3} \\ 0_{3 \times 3} & 0_{3 \times 3} & I_3 \\ 0_{3 \times 3} & 0_{3 \times 3} & I_3 \end{pmatrix},

where 0 is the zero matrix whose dimensions are given by the subscript.
D_{18} = \bigoplus_{n=0}^{2} D_6^{(n)} = \mathrm{diag}\left( D_6^{(0)}, D_6^{(1)}, D_6^{(2)} \right),

D_6^{(0)} = (a_{0,0} \cdot I_3) \oplus (a_{1,0} \cdot I_3) = \mathrm{diag}(a_{0,0}, a_{0,0}, a_{0,0}, a_{1,0}, a_{1,0}, a_{1,0}),
D_6^{(1)} = (a_{0,1} \cdot I_3) \oplus (a_{1,1} \cdot I_3) = \mathrm{diag}(a_{0,1}, a_{0,1}, a_{0,1}, a_{1,1}, a_{1,1}, a_{1,1}),
D_6^{(2)} = (a_{0,2} \cdot I_3) \oplus (a_{1,2} \cdot I_3) = \mathrm{diag}(a_{0,2}, a_{0,2}, a_{0,2}, a_{1,2}, a_{1,2}, a_{1,2}),
\Xi_{6 \times 18} = 1_{1 \times 3} \otimes I_6 = \begin{pmatrix} I_6 & I_6 & I_6 \end{pmatrix}.
Figure 1 presents the graph-structural model representing the algorithmic structure of the process of computing expression (1) according to procedure (7) for the considered example. The model is oriented from left to right. Straight lines denote data transfer operations, trapezoids denote blocks that multiply the matrices written inside them by the corresponding data subvectors, and circles denote multiplications by the values written inside them.
Let us now synthesize another modification of the algorithm for the same example, but based on procedure (12). For this example, procedure (12) takes the form

Y_{6 \times 1} = \tilde{B}_{6 \times 4}\, \tilde{\Xi}_{4 \times 12}\, \tilde{D}_{12}\, \tilde{P}_{12 \times 6}\, X_{6 \times 1},

and the individual matrix factors look as follows:
\tilde{B}_{6 \times 4} = I_2 \otimes B_{3 \times 2} = \begin{pmatrix} B_{3 \times 2} & 0_{3 \times 2} \\ 0_{3 \times 2} & B_{3 \times 2} \end{pmatrix},

\tilde{\Xi}_{4 \times 12} = I_2 \otimes (1_{1 \times 3} \otimes I_2) = \begin{pmatrix} I_2 & I_2 & I_2 & 0_{2 \times 6} \\ 0_{2 \times 6} & I_2 & I_2 & I_2 \end{pmatrix},

\tilde{D}_{12} = \bigoplus_{m=0}^{1} \tilde{D}_6^{(m)} = \tilde{D}_6^{(0)} \oplus \tilde{D}_6^{(1)}, \qquad \tilde{D}_6^{(m)} = \bigoplus_{n=0}^{2} (a_{m,n} I_2),

\tilde{D}_6^{(0)} = (a_{0,0} \cdot I_2) \oplus (a_{0,1} \cdot I_2) \oplus (a_{0,2} \cdot I_2) = \mathrm{diag}(a_{0,0}, a_{0,0}, a_{0,1}, a_{0,1}, a_{0,2}, a_{0,2}),
\tilde{D}_6^{(1)} = (a_{1,0} \cdot I_2) \oplus (a_{1,1} \cdot I_2) \oplus (a_{1,2} \cdot I_2) = \mathrm{diag}(a_{1,0}, a_{1,0}, a_{1,1}, a_{1,1}, a_{1,2}, a_{1,2});
\tilde{P}_{12 \times 6} = 1_{2 \times 1} \otimes I_6 = \begin{pmatrix} I_6 \\ I_6 \end{pmatrix}.
[Figure: data-flow graph with inputs x0–x5, three B3×2 blocks, multiplications by the coefficients a_{m,n}, and outputs y0–y5.]
Figure 1. Graph illustrating the organization of the computational process according to the direct realization of procedure (7) for K=3, L=2, M=2, N=3
Figure 2 presents the graph-structural model representing the algorithmic structure of the process of computing expression (1) according to procedure (12) for the considered example.
[Figure: data-flow graph with inputs x0–x5, multiplications by the coefficients a_{m,n}, two B3×2 blocks, and outputs y0–y5.]
Figure 2. Graph illustrating the organization of the computational process according to the direct realization of procedure (12) for the example K=3, L=2, M=2, N=3
4. Estimation of the number of required operations
It is easy to notice that the realization of procedure (7) requires KN(L+M) multiplications and K[(L−1)N + (N−1)M] additions. Let us introduce computational gain coefficients, separately for multiplications and for additions, for each of the synthesized procedures. For procedure (7) these coefficients take the following values:

k_{\times}^{(1)} = \frac{2KLMN}{KN(L+M)} = \frac{2LM}{L+M},

k_{+}^{(1)} = \frac{(LN-1)KM}{(L-1)KN + (N-1)KM} = \frac{(LN-1)M}{(L-1)N + (N-1)M}.

The realization of procedure (12) requires LM(K+N) multiplications and M[(N−1)L + (L−1)K] additions. For this procedure the gain coefficients have the following form:

k_{\times}^{(2)} = \frac{2KLMN}{ML(N+K)} = \frac{2KN}{N+K},

k_{+}^{(2)} = \frac{(LN-1)KM}{(L-1)KM + (N-1)LM} = \frac{(LN-1)K}{(N-1)L + (L-1)K}.
An interesting property of the developed procedures is that, for the same sizes of the matrices forming the tensor product, they provide different degrees of reduction of arithmetic operations (different values of the gain coefficients). In some cases procedure (7) is more advantageous, in others procedure (12). For the example considered in this paper, procedure (12) gives better results, since it requires 24 multiplications and 14 additions instead of the 36 multiplications and 21 additions required by procedure (7). Note that the direct realization of expression (1) for this example requires 72 multiplications and 30 additions.
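The counts quoted above follow directly from the formulas of this section; a short helper (not from the original text) that evaluates them for arbitrary K, L, M, N:

```python
# Multiplication and addition counts for the direct realization of (1)
# and for procedures (7) and (12), as given by the formulas above.
def op_counts(K, L, M, N):
    direct = (2 * K * L * M * N, (L * N - 1) * K * M)
    proc7 = (K * N * (L + M), K * ((L - 1) * N + (N - 1) * M))
    proc12 = (L * M * (K + N), M * ((N - 1) * L + (L - 1) * K))
    return direct, proc7, proc12

print(op_counts(3, 2, 2, 3))   # ((72, 30), (36, 21), (24, 14)) for the example K=3, L=2, M=2, N=3
```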
5. Summary
The algorithms proposed in this paper require fewer arithmetic operations than the direct realization of formula (1). This makes them more competitive for application in real-time digital data (signal, image) processing systems, where all computational and control operations have to be completed within the time available for the undisturbed course of the ongoing process. It should be stressed that the choice between the two presented procedures should be made, in each specific case, on the basis of a preliminary estimate of the gain coefficients for both variants.
Formation of the contents of knowledge’s bases for local
intelligent decision support systems
Tatiana Tretyakova, Abdullah Zair
Szczecin University of Technology,
Faculty of Computer Science and Information Technology
Abstract:
The article considers the problem of creating intelligent decision support systems based on the use of hydrometeorological information. Part of this problem is the acquisition and structuring of knowledge in the corresponding problem areas. Formal and structural models of an intelligent warning and decision support system (IWDSS) at the regional level are presented. Local intelligent subsystems are included in the structure of the IWDSS; these subsystems warn of the threat of dangerous natural phenomena and support decision making in the management of a region whose economic objects are located in endangered territories. Formal models of the decision-making process and ontologies for the area “Threat of mudflow” are presented, and the role of ontology creation as one of the tools of knowledge acquisition, structuring and management is underlined. The presented models are the methodical basis for the phases of acquisition and structuring of knowledge for the local intelligent subsystems of the IWDSS. Fragments of the IWDSS database for the administration of the Dzungarian Ala Tau area in Kazakhstan, which is under constant threat of mudflow, are also shown.
Keywords:
intelligent warning and decision support system, local intelligent system, elicitation and structuring of knowledge, formal model, ontology, decision
1. Introduction
The task of decision support systems (DSS) is to integrate data and models in order to search for solutions to ill-structured problems [1,2]. Systems of this class have been used in management for many years, and their important role in decision making in economic activities is widely recognized. In recent years DSS have been evolving through the use of artificial intelligence methods, which considerably expands the sphere of their application. A necessary condition of their creation is the high quality of their main component, the knowledge base; here the knowledge engineering stages of elicitation and structuring of knowledge play the most important role. The features of the problem area essentially influence the choice of ways of solving tasks in this area, and through this choice unique methods of solving the tasks of a concrete problem area are formed in the process of knowledge elicitation and structuring.
The authors' research concerns the creation of knowledge bases for the local intelligent decision support subsystems of an IWDSS and the development of methodical approaches to knowledge elicitation and structuring for them, on the example of one of the areas of Kazakhstan, Dzungarian Ala Tau, which is under threat of mudflow. The objects of management in the Dzungarian Ala Tau area are agricultural, industrial, service, social, communication and other economic objects. The subject domains with which the functioning of these objects is connected, and the problems arising in their functioning under the influence of hydro- and meteorological factors, demand corresponding approaches to knowledge elicitation and structuring for DSS that can function as local intelligent systems within the IWDSS of a given object. The problem of decision making in the processes of functioning of economic objects is always connected with different subject domains. For this reason, formal models of the problem domains in which economic objects function, and of the decision-making task that takes hydrometeorological information into account, can serve as auxiliary tools in the process of knowledge elicitation and structuring for local intelligent subsystems of an IWDSS at different levels.
2. Formal model of the decision-making task as an auxiliary tool for knowledge elicitation and structuring for local intelligent subsystems of the IWDSS
Knowledge elicitation and structuring for any intelligent DSS is always connected with studying decision-making processes within the framework of the processes of functioning of economic objects. Any decision-making process should proceed according to the principles of system analysis [3,4] and should include the stages of identifying problems, searching for ways of solving them, and choosing the best decision in view of the chosen evaluation criterion and the existing restrictions. Taking into account that hydrometeorological information is not always reliable, the decision-making process has to consider the risk of a problem occurring under the influence of hydrometeorological factors. For this reason, the formal model of the decision-making task (MDEC) that takes hydrometeorological information into account, presented in [5], is given here with the addition of a risk factor R^J_n:
MDEC = {D^J, R^J_n, C^J_n, O^J_n, A^J_n, K^J_n},   (1)
where: D^J – the problem domain in which decisions are to be made; R^J_n – the risk of occurrence of a problem under the influence of hydrometeorological factors; C^J_n – the goals of the task, concerning problem solving in the given domain; O^J_n – the limitations of the given task; A^J_n – the set of alternative decision variants for solving the problem; K^J_n – the set of criteria according to which the decision is to be chosen.
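Purely as an illustration (not part of the original article), the components of model (1) can be captured in a simple data structure; the field names and sample values below are hypothetical:

```python
# Illustrative sketch of the decision-task model (1); names and values are hypothetical.
from dataclasses import dataclass, field
from typing import List

@dataclass
class DecisionTask:
    domain: str                                            # D^J   - problem domain
    risk: float                                            # R^J_n - risk of problem occurrence
    goals: List[str] = field(default_factory=list)         # C^J_n - task goals
    limitations: List[str] = field(default_factory=list)   # O^J_n - restrictions
    alternatives: List[str] = field(default_factory=list)  # A^J_n - decision variants
    criteria: List[str] = field(default_factory=list)      # K^J_n - evaluation criteria

task = DecisionTask(
    domain="Threat of mudflow",
    risk=0.3,
    goals=["minimize losses"],
    limitations=["available evacuation time"],
    alternatives=["evacuate", "reinforce protective dams", "continue monitoring"],
    criteria=["expected damage", "cost of action"],
)
```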
This formal model in principle presents the stages of decision making. These stages occur in practically every problematic situation that appears under the influence of dangerous natural phenomena, and they make it possible to partly delimit the fields that should be described at a given level of the ontology used to manage the knowledge of the HMDEC expert system. The decision-making process can be described as a decision tree; working out all elements of the decision tree requires constructing the necessary models and rules to be used in the knowledge base of HMDEC. It is worth noting that providing the possibility of inference with the help of deduction mechanisms, and creating the production part of the knowledge base, requires constructing cognitive models [6] based on studying decision making on the example of objects under threat of dangerous phenomena or natural disasters.
3. The structure of the intelligent warning and decision support
system (IWDSS) that includes local systems HMLevel and
HMDEC
The use of such systems as local intelligent DSS in territories subject to constant threat of mudflow is useful. The local systems HMLevel and HMDEC are included in the structure of the IWDSS. The HMLevel system analyzes and transfers on-line information about the water level in mountain lakes, reacts when the water level exceeds the mark of the normal level, and transfers a danger signal to the crisis situations control centre. The other local system, the expert system HMDEC, is intended to support decision making on the basis of hydrological and meteorological information in areas subject to the threat of mudflow. Figure 1 presents the structure of the intelligent warning and decision support system. As follows from Figure 1, monitoring of the water level in a mountain lake is carried out by an automatic system. The HMLevel system analyzes and registers the data received from this automatic system and transfers a signal about the threat of mudflow to the Crisis Situations Management Center (CSMC). The information received in the CSMC is registered in a corresponding database of the regional management information system. This information system contains the local expert system HMDEC, which serves to support decisions under an existing threat of mudflow. The model base, which should also be provided in the CSMC information system, is accessible to HMDEC. On the basis of modeling with these models, and taking into account simulation of the possible development of dangerous situations in the region during mudflow, risks are estimated and decisions on protective actions are made with the help of HMDEC.
The role of DSS in solving business problems is currently considered irreplaceable, which is connected with the considerable improvement in decision-making quality resulting from the use of this class of systems. However, such improvement is only possible when the quality of the main component of these systems, the knowledge base, is adequate. In [7] a concept of a three-level information system with a model base was suggested. Later, the interest of the authors of this paper turned towards elaborating a conceptual structure of an information system of the IIS class (Integrated Information Systems), with the addition of intelligent components such as expert systems [8,9]. It is possible to increase the quality of decisions by adding expert systems (ES) to the information system. We should note that improving the quality of decisions is based not only on simulation of various tasks, but also on decision support in non-standard situations and on preparing personnel within a framework of training with the assistance of suitable expert systems.
Figure 1. The structure of the intelligent warning and decision support system that includes local
systems HMLevel and HMDEC. Source: own study
Looking at the formal models of the software (Sfw) and knowledge (Iint) components of the DSS structure, we can see that the contents of the knowledge bases are essential for the ability to provide suitable solutions.
4. Formal models of the structure of software and knowledge in the IWDSS
The software components (Sfw) of IWDSS-class systems are presented as the following set:
Sfw = {Drep, Db, Sdb, Kb, Skb, Interf, Aint, ESloc, Bweb}   (2)
where: Drep – repository and data warehouse connected with OLAP technology; Db – relational databases; Sdb – database management system; Kb – knowledge base; Skb – knowledge management system; Interf – interfaces; Aint – intelligent agents; ESloc – local expert systems; Bweb – browsers.
The knowledge components (Iint) are presented as the set:
Iint = {d, m, r}   (3)
where: d – data; m – models; r – rules.
In the presented formal model of the Sfw component, local agents and expert systems are included, created taking into account the wishes of the participants of the decision process. We should note that nowadays it is possible to connect, by means of interfaces, a relational database with an expert system database, or to completely replace the knowledge base with the relational databases of the DSS. We should emphasize that the type of a local ES depends on the structure of the knowledge included in its knowledge base, which is to a considerable degree determined by the type of the problem domain [10]. In one case it can be used to support decisions in an economic process during unusual situations; in other cases it can serve for e-learning and e-training of personnel, in order to pass on knowledge and form the habits and skills essential for dealing with unusual situations.
5. Knowledge for IWDSS containing local HMDEC expert system
For a regional administration or a business enterprise located in naturally endangered territories, problem solving may be based on hydro- and meteorological information and on special knowledge from the domain in which a specific problem can develop. Taking this information into consideration is vitally important: it raises the adaptive potential of its users, allowing them to choose the best strategy of cooperation with nature. Its main advantage is that it allows its users to avoid or minimize losses from natural catastrophes. Hydro- and meteorological information is used to solve various problems in different domains: regional and city management, transport (aviation, navigation and others), agriculture, water management, the building industry, power engineering, etc. Some of these problems are [9]: regional and object management (process management); laying out plans and managing undertakings aimed at preventing losses which can occur as a result of natural disasters; planning tasks concerning the choice of the best location for building sites; ensuring the security of social objects, establishments and agricultural objects against the harmful influence of hydrometeorological factors; managing limited water resources; etc.
Within the framework of such problems the managers, according to the set goals, can solve problems that arise under the action of dangerous natural factors. The high level of uncertainty concerning natural disasters and the necessity of fast and correct reaction in case of a threat call for working out and making full use of proper methods and models. These are risk management methods, or models resolving possible issues concerning the realization of actions preventing losses caused by natural disasters. Technologies such as DSS-class systems and their intelligent components (ES) allow the knowledge included in a knowledge base to be used during decision-making processes in order to avoid possible negative effects caused by dangerous natural disasters. In the knowledge base of the information system (of a region, city or economic object) different models can be set up in order to make the right decision based on the current trends of the situation. By adding inference rules to the knowledge base, we create an ES which, on the basis of worked-out scenarios and of problem-solving models arising during such a development of the situation, can formulate conclusions supporting decision making. This expert system is called HMDEC; a concept of its structure was presented in [9]. In order to make use of the HMDEC model we need specified data, which should be located in its database and in the databases of the IWDSS and other systems, or entered operationally as input data. These are mostly data about economic, social and cultural objects of the region (region, city, economic object) which are located in endangered areas. As an example, a database containing data about objects located in one of Kazakhstan's regions can be used as the HMDEC expert system database. Figure 2 shows fragments of the database project of objects located in one of Kazakhstan's regions – Djungarski Alatau (river Talgar valley).
Figure 2. Fragments of the IWDSS database: objects located in one of Kazakhstan's regions – Djungarski Alatau (river Talgar valley). Source: own study on the basis of information from a research project [11]
On the basis of these data, and with the results of simulation on problem-solving models for the situations considered in the scenarios, conclusions are extracted. These conclusions are then taken into consideration when making different decisions about: the evacuation of people, tangible property and cultural values; the realization of undertakings preventing losses caused by the mudflow; and the location of new buildings, agricultural objects, social objects, etc.
DSS-class systems aiding decision-making processes at the various levels of regional management and in economic entities can collect and analyze information about the state of the environment from different outside sources. In the case of large amounts of analyzed data acquired from outside sources, OLAP technology (on-line analytical processing) can be applied; external data for the HMDEC expert system can come from different databases of the hydrometeorological services containing data about changes in the hydrological characteristics of hydro objects over a long period of time. It is worth noting that information about threats of natural disaster comes from hydro- and meteorological services. In the case of such threats, the management at the regional level sends the appropriate message to the special commissions of the region or cities, and they signal the threat to the management of economic, social and cultural objects.
The entire knowledge essential for extracting conclusions aiding decision-making processes is found in the process of knowledge elicitation, applying the ontological approach suitable for describing the problem domain. This is one of the tasks solved in the knowledge engineering process for DSS. For the expert system HMDEC this problem domain is “Threat of mudflow”, in which mainly weakly structured problems are solved. Using HMDEC as a personnel-training tool, one can acquire the ability and the habit of making the right decision quickly. The formal model of the structure of the decision-making task can be used as a tool aiding the description of this task when solving ill-structured or unstructured problems.
6. Formal model of the ontology of the domain “Threat of mudflow” in managing the knowledge of the HMDEC expert system
Concepts from the domain of the HMDEC expert system, their definitions and the connections between them are presented in the ontology of the domain “Threat of mudflow”. It is a well-known fact that ontologies can be used as instruments of knowledge management. The ontology of this domain presents its main concepts and the relationships between them; the models of problems which should be solved at the time of natural disaster threats affecting the functioning of local economic objects should also be presented. By an ontology we mean a structured specification of the domain in question [12]. Each ontology can be widened and improved depending on the process of knowledge elicitation in the domain in question. Using an ontology we can determine the hierarchical levels of knowledge, and it is also considered an easy-to-use tool for managing knowledge in knowledge bases. The basic concepts and the relationships between them presented in the ontology also create a tool for fast introduction to the domain in question, and thus make it easier to solve the problem of filling the knowledge base of any DSS-class system. While shaping the knowledge base of the HMDEC expert system, the quality of knowledge management depends on the accuracy of the described ontology. We consider the ontological approach on the example of the domain “Threat of mudflow”. Describing the ontology of this domain requires describing the entire hierarchy of ontologies. These are only some of the fields introduced by ontologies into this hierarchy: the managed system (region, city, economic entity); possible results of the influence of natural disaster threats (scenarios); actions taken to prevent losses caused by natural disasters; inference rules under the conditions of a natural disaster threat.
Usually different languages and systems are used for the description of ontologies. Graphs are a convenient form for the visual display and analysis of a field; such a form of display was demonstrated in [5]. A description of an ontology in formalized form is also used. We introduce this last approach here, by which the formal model of the ontology of the problem domain “Threat of mudflow” – MDMF – is the following:
MDMF = {MPA, PD, MS, MET}   (5)
where: MPA – ontology of the subject domain in which decisions are being taken (at the level of the region, the city, the economic entity); PD – ontology of the problems solved in the subject domain; MS – ontology of the domain “Scenarios of the development of the situation”; MET – ontology of the domain “Protective actions”.
Descending further within the hierarchy of the ontology MDMF, we introduce the model of the ontology of the domain “Tasks solved in the subject domain” – PD:
PD = {MPS1, MPS2, …, MPSk}   (6)
where: MPSk – models of solving problems.
The model of the ontology “Scenarios of the development of the situation” – MS:
MS = {MRisk, MDm, MPr},   (7)
where: MRisk – ontology of risk; MDm – ontology of potential damages; MPr – ontology of undertakings preventing losses.
The model of the ontology “Protective actions” contains ontologies of the next lower level:
MET = {MPinv, MEv},   (8)
where: MPinv – ontology of the domain “Actions taken to face threats caused by mudflow”; MEv – ontology of the domain “Evacuation of people, material and cultural values”.
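Purely as an illustration (not from the original article), the hierarchy (5)-(8) can be written down as a nested structure; only the structure is shown here, and the concrete concepts of each sub-ontology are omitted:

```python
# Illustrative sketch of the ontology hierarchy (5)-(8); descriptions are placeholders.
ontology_mdmf = {
    "MPA": "subject domain in which decisions are taken (region / city / economic entity)",
    "PD": {                      # (6) tasks solved in the subject domain
        "MPS1": "problem-solving model 1",
        "MPS2": "problem-solving model 2",
        # ... up to MPSk
    },
    "MS": {                      # (7) scenarios of the development of the situation
        "MRisk": "ontology of risk",
        "MDm": "ontology of potential damages",
        "MPr": "ontology of undertakings preventing losses",
    },
    "MET": {                     # (8) protective actions
        "MPinv": "actions taken to face threats caused by mudflow",
        "MEv": "evacuation of people, material and cultural values",
    },
}
```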
This is how the entire hierarchy of ontologies forming the ontology of the problem domain “Threat of mudflow” looks. Every ontology should contain the concepts of the described problem domain together with their definitions, the mechanism of interpretation of the concepts and the relationships between them, as well as the axioms used. Only then is it possible to say that the ontology has been completely described and that the stage of conceptualization of knowledge in creating the ES has been implemented. The results of knowledge elicitation and structuring on the basis of the ontological approach to problem-domain conceptualization give the basis for the further realization of knowledge engineering, that is, for choosing the models of knowledge representation in the knowledge base. The task of knowledge representation in the knowledge base of the HMDEC expert system is realized by presenting facts and rules (rule-based models). Rule models are used to introduce the knowledge of domain experts in tasks resolved logically (e.g. evacuation of people in case of mudflow). Using rules for knowledge representation is the approach most often practiced in modern expert systems. Working out a set of rules of the type “if <antecedent> then <conclusion>” lets us introduce into the base the known variants of solving a problem on the basis of information about dangerous natural phenomena. A large part of the rules can be presented in the form of a scenario, in which the state of the object is described with the help of predicates (the whole description of the state of the object then being a conjunction of those predicates), or in a general form containing the description of a scenario of dangerous situations. Frame models are the core of the description of static knowledge; they are convenient for describing the hierarchy of abstract and concrete concepts. This approach is similar to the object-oriented approach to knowledge representation.
7. Conclusion
Climate changes increasingly influence the economic activities of organizations and enterprises. In this connection, it is necessary to create decision support systems which promote fast adaptation to threats coming from nature. The structure and formal model of an intelligent warning and decision support system (IWDSS) has been presented, and the problem of knowledge engineering for this system has been considered. The article treats one of the tasks of knowledge engineering for intelligent decision support systems that use hydrometeorological information. The formal model of the ontology of the problem area “Decision making in a situation of mudflow”, presented in the article, is a methodical tool for the search and structuring of knowledge for the designed system. Together with the description of the ontology of this problem domain, a preliminary concept of the knowledge bases, worked out with the support of the introduced formal models, has been shown. Fragments of the database of objects located in the Djungarian Alatau area of Kazakhstan are given. The models of knowledge representation proposed for application in the HMDEC expert system have been briefly discussed. The formal model of the structure of the decision-making task and the formal model of the ontology of the domain “Decision making in a situation of mudflow” for managing the knowledge of the HMDEC expert system have also been presented.
References
[1] Keen P. G. W., Scott Morton M. S. Decision Support Systems: An Organizational Perspective. Addison-Wesley, 1978.
[2] Radosiński E. Information systems in the dynamic decision analysis. PWN, Warszawa-Wrocław, 2001.
[3] Rudwick B. Systems Analysis for Effective Planning: Principles and Cases. New York, 1969.
[4] Tretyakova T. The methodology of functional-structural analysis of decision processes and its role in the knowledge engineering for information systems. Materials
of the Polish Society of experts in the field of knowledge’s management, nr 8., Ed.
Waldemar Bojar, PSZW, Bydgoszcz, 2007.
[5] Tretyakova T., Zair A. Elicitation and structurization of knowledge for intelligence subsystems of DSS. Polish Journal of Environmental Studies, Vol. 17, No. 3B, 2008.
[6] Solso R. Cognitive Psychology. PITER, St Petersburg, 2002 [in Russian].
[7] Popov O., Sołdek J., Tretiakowa T. Adaptive Information Management System.
Collection of articles: Information Systems in Strategic management, Informa,
Szczecin, 1997.
[8] Popov O., Tretyakova T. Approach to an Evaluation of Efficiency of the Integrated
Information System’s Implementation at the Enterprise: Orientation to Processes,
Proceedings of the 8-th International Conference Advanced Computer Systems
ACS’2001, Szczecin, Poland, 2001.
[9] Tretyakova T. The knowledge base of expert system “HMDecision” in information
system of class DSS – the object approach to construction. Proceedings of IV International conference The Analysis, forecasting and management in complex systems, St. Petersburg 2005 [in Russian]
[10] Luger G. F. Artificial Intelligence: Structures and Strategies for Complex Problem Solving. Addison-Wesley, 2002.
[11] Tretyakova T., Radugin D., Kolobov V., Titova E. St.Petersburg State Hydrological Institute: Scientific report on results of the research project: Development of a
technique of a social and economic estimation of mudflow actions, St. Petersburg
1990 [in Russian].
[12] Gruber T. R. Toward Principles for the Design of Ontologies Used for Knowledge Sharing. International Workshop on Formal Ontology, March 1993, Padova, Italy.
Impact of the presence of linguistic data
on the decision aid process
Jarosław Wątróbski, Zbigniew Piotrowski
Szczecin University of Technology,
Faculty of Computer Science and Information Technology
Abstract:
Many decision situations are described using both numerical and linguistic values. Because of the presence of linguistic data, analysts are forced to apply appropriate measures. These measures are often applied arbitrarily, without consulting the Decision Maker, which creates an intangible gap between the DM's intentions and the final decision model. The paper analyses the impact of applying different conventions for utilising linguistic values in the decision aiding process. The considered measures include quantification, fuzzy modelling and applying linguistic versions of MCDA methods. The concluding remarks describe the advantages of aligning the decision aiding process with the Decision Maker's problem formulation.
Keywords:
linguistic values, decision aiding
1. Introduction
Making decisions is a fundamental activity of human existence. Most Decision Makers are focused on establishing potential actions in order to implement one of them. A DM carries out all activities connected with analysing the domain-specific aspect of the decision and with the methodological strategy for a given situation. Decision making is an activity, whereas decision aiding is a process of helping a DM to make his decisions. Furthermore, decision aiding assumes involving an analyst in the process, who provides the DM with a methodological framework and directs his expressions to best suit the chosen decision aiding approach. As a result, a synergetic effect is created between the DM's preferences and the structure of the decision situation [1].
Existing decision situations cover virtually all areas of human life and play a role in social phenomena, notably organisations. One of the main concerns of organisational science is decision theory. The high importance of decisions made within an organisation implies the significant role of aiding decision makers with their tasks. Supporting them in order to gain the maximal benefit for the organisation requires ensuring high quality of the process at all stages and from all points of view. This paper focuses on the data modelling aspect of decision aiding.
The subjects of decisions are frequently complex and detached from strict measurements, so a situation cannot always be described quantitatively. Furthermore, for some aspects it would be inappropriate, or would even falsify reality, to apply numerical grades. The most representative examples where applying qualitative measurements has been suggested are: choosing the appropriate strategy for a company [2], choosing an information system [3] and evaluating localisations [4].
Data describing the surrounding reality can be divided into two main categories: hard data (numerical properties) and soft data (linguistic data, graphic data, feelings). Accordingly, two main approaches are used to deal with these data: quantitative methods for hard data and qualitative methods for soft data. Furthermore, systems for solving "hard" problems operate on precise and crisp mathematical models, whereas "soft" problems are solved using linguistic reasoning and dealing with imprecise entities [5].
The quality of the input data used to describe a decision situation and to analyse the available alternatives is crucial for the overall reliability of the decision aiding process. The recommendation cannot be of better quality than the data used to create the model; hence, using inaccurate or approximate data results in an approximate recommendation. A common mistake made in specifying requirements is to ask for an explicit recommendation where the available data are imprecise or ambiguous.
Data handling has to reflect the characteristics of the data used to describe the problem and its possible outcomes. For that reason an analyst has to consider the source of the linguistic data used in the decision process, because the data can be input by humans or gathered from measuring equipment. Probabilistic characteristics of data gathered from instruments are often reflected by human-provided descriptions; however, if human input is the only source of a description of a particular feature, calculations should be done in a different manner, as different extension principles for linguistic sets have to be used [6].
2. Describing decision situations
Many real-life decision situations have their source in a human's need to choose the best option among a given set of options. The goal of a situation of that kind is to satisfy the decision maker, which can be achieved by providing him with maximal satisfaction, also recognised as "usefulness" or "utility value". The source of the decision situation lies inside the human brain; therefore the description of the problem is done in the way appropriate for "the human brain's data model", which is based on linguistic information.
The conditions mentioned above are especially important in situations where problems are unstructured and described mostly by qualitative criteria. Moreover, criteria and attribute values often cannot be described by numbers due to their specifics. Further difficulties which the DM faces are comparing objects that have similar values of parameters and comparing options described by multiple attributes. Such tasks are beyond human ability to make reliable statements [7].
Human-provided linguistic judgements are inconsistent, because different persons attach various opinions to the same label. The result of that ambiguity is that decision making involving multiple decision makers carries a high risk for decision quality when all decision makers use the same scale. Moreover, the value attached to a label is determined by the context of a situation, and as a result a wide range of values causing an inconsistency in a value scale may be created. A reason for such an inconsistency is that a human is forced to make judgements bounded by a fixed set of labels whose source is his vocabulary and common practice [8].
A common approach utilised by many widely applied decision aiding methods requires a decision maker to specify his/her preferences on a given scale. The scale is strictly defined and is usually a numerical scale with fixed distances. The scale can be described by labels; however, its values are treated as (or even converted into) numbers. The main data loss in such a transformation is that in human perception judgement labels are not distributed evenly. Another aspect is the practical meaning of numerical values in some categories. As an example, the accuracy of temperature is important when designing a chemical or technological process, whereas when describing the weather in leisure localisations it is enough to use one of the commonly used labels like "warm", "hot", "moderate".
Figure 1 presents the suitability of reflecting crisp values in linguistic descriptions. The main difference lies in the technical or behavioural genesis of a given factor. Many decision situations are directly related to human-specific activities and can be described only by humans themselves. Hence a human-specific language should be used to model such decisions. An example of such a language is the Precisiated Natural Language developed by L. Zadeh and described in [9].
Figure 1. Using linguistic values to describe numerical values (source: based on [9])
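As a simple illustration of this idea (a sketch only; the particular term set and membership functions are assumed, not taken from [9] or from the figure), a crisp temperature reading can be mapped onto linguistic labels:

```python
# Minimal sketch: describing a crisp temperature with linguistic labels,
# using assumed triangular membership functions (all values are illustrative).
def triangular(x, a, b, c):
    """Membership degree of x in a triangular fuzzy set (a, b, c)."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

labels = {                      # hypothetical term set for "weather"
    "moderate": (10, 18, 26),
    "warm":     (20, 27, 34),
    "hot":      (30, 38, 46),
}

t = 25.0                        # crisp value, e.g. measured temperature in deg C
degrees = {name: triangular(t, *abc) for name, abc in labels.items()}
best = max(degrees, key=degrees.get)
print(degrees, "->", best)      # the label with the highest membership degree
```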
Another source of non-crisp data is the descriptions of decision alternatives. Human-centric decisions concern subjects which are described subjectively by decision makers. Furthermore, in many situations it is more appropriate to assign a subjective label to a precise numerical value of a property than to consider the value itself. The most common approach is to utilise fuzzy set theory in comparing alternatives' performance on a given criterion. Such an approach helps to avoid judging the performance of an alternative on the basis of an insignificant value difference between alternatives. The paper [10] provides a description of common areas where linguistic decision analysis is applied.
An interesting and dynamically developing direction in the research on making ubiquitous computing more friendly for humans is Natural Language Processing. Utilising the results of this research would allow a decision situation to be recognised from a description in natural language. The text summarisation capability should make it possible to identify criteria and objectives from sentences in natural language. Moreover, it should be possible to create a software agent which could search the internet for promising decision alternatives; the search would be directed at analysing various web pages and performing text mining in order to extract possible decision alternatives. Further reading about extracting knowledge from text in natural language is available in [11]. The issue is interesting due to the possibility of using it to widen the scope of gathering possible decision alternatives. It is a common convention to search for data in the DM's language; however, NLP methodology can be used to search for options described in other languages, even ones as different from western languages as Thai, which is a matter of creating an appropriate ontology [12]. Applying NLP to a linguistically enabled decision aiding system allows a much greater number of decision alternatives to be considered, which increases the probability of achieving better results for the DM.
The description of a decision problem consists of the following elements (described in [13]):
– criteria, standing for tools for the evaluation and comparison of possible outcomes of an implemented decision,
– problem statement, one of the possible problematics, which are: choice, sorting, ranking and classification,
– potential actions or alternatives, which describe the subject of the choice made in a given scenario.
Dealing with the mentioned elements in linguistic form requires a specific approach to creating the decision model. After identifying a problem, the DM elaborates the elements mentioned above, and it is necessary to gather values for the provided hierarchy of properties along with appropriate scales. Expressing values on a scale can be done numerically, verbally or graphically. In the following stage, a judgement matrix is established and evaluations of criteria and alternatives are performed. Finally, the final ranking can be constructed and used to choose the best alternative [14].
3. Decision Makers' internal preferences
Linguistic judgements specified by a human include an amount of meta-information. This meta-data represents the DM's tacit knowledge and cannot be extracted directly. The only possible manner of not losing that knowledge is to propagate it throughout the decision aiding process and to elaborate it in the final recommendation. Such knowledge can include information about uncertainties, probabilities or approximations made in such an elaboration. Even when a particular term is associated with a triangular set, the DM can possess his own preferences independent of the membership function. Hence the only method of discovering such hidden assumptions is to confront the DM with them.
Furthermore, each human possesses his/her own internal value system, and for each decision situation a DM builds an internal scale. One of the purposes of linguistic reasoning and computation is to utilise such a scale instead of forcing the DM to express his preferences on a fixed scale unsuitable for him. An ideal decision aiding method would be transparent for the data used in the process and would provide a recommendation elaborated in the same ontology/language as the description of the decision situation provided by the DM. A fully linguistic decision aiding method would perform calculations with "granules" defined in [15] and distinguish among their basic types, which include: possibilistic, probabilistic, veristic and generalized [15].
Although currently developed MCDA methods utilise various approaches to include the meta-information (tacit knowledge) which comes with linguistic data, these approaches are still based on reflecting uncertain and linguistic data onto fuzzy sets with shapes defined a priori. The utilised approaches provide different levels of meta-data preservation; hence the decision aiding process is based on various layers of information depending on the chosen MCDA method.
The first step in the linguistic decision process is to choose the linguistic term set with its semantics. The meta-information of a decision has to be tied to a specific domain (a linguistic expression domain) in order to perform calculations. Further steps include choosing an aggregation operator and performing all the calculations [10].
4. Using linguistic data in decision aiding
A course of a decision making process with linguistic variables can look as follows (based on [8]):
1. Selecting linguistic terms by experts, defining fuzzy sets for the selected terms, and evaluating alternatives.
2. Synthesising judgements for alternatives' attributes.
3. Synthesising descriptions for all alternatives and formulating a decision.
In [8] the problem of choosing a new computer system for a company was solved using linguistic descriptions provided by a group of experts. First, a scale of linguistic variables was defined and encoded using triangular fuzzy sets. In the next step, domain experts from the appropriate departments assigned values to all alternatives, which were treated as the alternatives' criteria performances, because each department analysed the properties concerned with its area of expertise. Afterwards the experts' judgements were synthesised using a weighted-sum aggregation method and the best alternative was pointed out.
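A minimal sketch of this kind of processing (the term set, weights and scores below are assumed for illustration and do not come from [8]): linguistic scores are encoded as triangular fuzzy numbers, aggregated with a weighted sum and ranked by their centroids.

```python
# Sketch: weighted-sum aggregation of linguistic scores encoded as
# triangular fuzzy numbers (a, b, c); all numbers here are illustrative.
TERMS = {
    "poor":   (0.0, 0.0, 0.3),
    "medium": (0.2, 0.5, 0.8),
    "good":   (0.7, 1.0, 1.0),
}

def weighted_sum(labels, weights):
    """Aggregate linguistic labels into one triangular fuzzy number."""
    agg = (0.0, 0.0, 0.0)
    for label, w in zip(labels, weights):
        a, b, c = TERMS[label]
        agg = (agg[0] + w * a, agg[1] + w * b, agg[2] + w * c)
    return agg

def centroid(tfn):
    """Crisp ranking value of a triangular fuzzy number."""
    a, b, c = tfn
    return (a + b + c) / 3.0

alternatives = {"A1": ["good", "medium", "poor"], "A2": ["medium", "medium", "good"]}
weights = [0.5, 0.3, 0.2]   # criteria weights (sum to 1)

ranking = {name: centroid(weighted_sum(labels, weights)) for name, labels in alternatives.items()}
print(max(ranking, key=ranking.get), ranking)
```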
The fuzzy extension of the TOPSIS method was described in [16], where the weights of criteria and the values of qualitative properties were treated as linguistic variables. The DM used tables of defined variables and trapezoidal fuzzy numbers to specify his input. The calculations were then made according to the TOPSIS algorithm adapted to deal with fuzzy sets treated as four-point vectors.
A modified comparison of three approaches to solving decision problems is introduced to pinpoint the differences in impact between various methods of dealing with linguistic data. The methods described below are based on an "aggregation-and-ranking scheme" [17].
The first of the analysed methods is based on the Extension Principle; the issue connected with using this method is the linguistic approximation needed to match linguistic terms with fuzzy sets (in the described example, the Euclidean distance was applied). The next solution is based on the Symbolic Approach, where linguistic aggregation is used along with a weighting vector; however, the outcome lacks precision. The last considered method therefore avoids approximation of linguistic labels and uses the 2-tuple Fuzzy Linguistic Representation Model. In this model, information is stored as 2-tuples of the form (s, α), where s is a linguistic term and α is a value from the range [−0.5, 0.5) in which the linguistic information is encoded. Of the mentioned approaches, the result obtained with the last method was the most reliable and the only one of satisfactory accuracy [17].
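A minimal sketch of the 2-tuple representation (the linguistic term set below is an assumption used only for illustration):

```python
# Sketch of the 2-tuple fuzzy linguistic representation: a value beta computed on the
# index scale of the term set S is stored as (s_i, alpha), with alpha in [-0.5, 0.5).
S = ["none", "low", "medium", "high", "perfect"]   # assumed linguistic term set

def delta(beta):
    """Convert a numerical value beta in [0, len(S)-1] to a 2-tuple (term, alpha)."""
    i = int(round(beta))
    return S[i], beta - i

def delta_inv(term, alpha):
    """Convert a 2-tuple back to its numerical value."""
    return S.index(term) + alpha

# Example: aggregating three linguistic assessments by their mean
# without rounding the result to a single label.
assessments = ["low", "medium", "high"]
beta = sum(S.index(t) for t in assessments) / len(assessments)
print(delta(beta))      # ('medium', 0.0) for this symmetric example
print(delta(5 / 3))     # ('medium', -0.333...) for beta = 1.66...
```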
5. Conclusions
Among the decision situations met by humans, there are not many which can be described using only quantitative measurements. On the contrary, the vast majority of decisions involve some level of fuzziness or qualitative parameters. The first and still common approach to dealing with such situations is to force a DM to express his/her judgements on a defined numerical scale with verbal labels assigned to each value.
Performing numerical calculations about subjects which have a qualitative nature creates a knowledge gap between a DM's description of a problem and the computer model. Therefore various approaches to processing linguistic data have been introduced. The first important contribution in that direction was the application of fuzzy set theory to decision aiding, which allowed the fuzziness of human judgements to be modelled.
The paper describes approaches to linguistic decision aiding and supports its applications. The decision process was presented as a human-oriented process; such a process has to offer alignment to the DM's perception of the decision situation. Linguistic terms are one of the commonly applied methods of making communication with a computer more natural for a human.
Decision aiding supporting linguistic descriptions in a DM's preferences and in the descriptions of decision alternatives makes the process closer to Perception Based Information Processing, as described in [18]. Furthermore, perceptions are the only information available for many factors, especially subjective ones. Most current decision aiding approaches can be summarised as "database oriented", whereas information is "internet oriented". The differences are presented in Table 1.
Table 1. Database and Internet oriented models
Database          Internet
Distributed       Distributed
Controlled        Autonomous
Query (QL)        Browse (Search)
Precise           Fuzzy/Imprecise
Structured        Unstructured
Source: [18]
Further development of linguistic decision aiding resulted from including the natural characteristics of humans' elaborated judgements. These characteristics include the asymmetry of verbal expressions and the dependency of their meaning on a specific context. The contextual character of human judgements was a foundation of Natural Language Processing technologies, which can be used along with linguistic data mining to automate the process of gathering decision alternatives.
A recommendation for a DM has to be a response to his problem and has to be aligned with his expectations. Hence the accuracy and abstraction level of a recommendation, as well as the whole decision aiding process, should correspond to the precision and certainty of the problem formulation provided by the DM.
A variety of decision aiding methods is based on different approaches to including imprecise or linguistic information at all stages of the decision aiding process. These approaches include: linguistic criteria formulation, linguistic alternative descriptions, and linguistic preference elaboration. Furthermore, linguistic data are encoded in different forms and calculations are done with various levels of propagation of linguistic assessments.
The main issue connected with applying a method not aligned with a particular situation is a loss of data during the decision aiding process. Inaccuracy in data leads to an inaccurate recommendation, which makes the whole decision aiding process senseless. Therefore care should be taken to choose an appropriate approach to dealing with linguistic data, according to the descriptions available in the input to the process.
References
[1] A. Tsoukiàs, On the concept of decision aiding process: an operational perspective, Annals of Operations Research, 154 (2007), pp. 3-27.
[2] E. Ertugrul Karsak and E. Tolga, Fuzzy multi-criteria decision-making procedure
for evaluating advanced manufacturing system investments, International Journal
of Production Economics, 69 (2001), pp. 49-64.
[3] S.-L. Chang, R.-C. Wang and S.-Y. Wang, Applying a direct multi-granularity linguistic and strategy-oriented aggregation approach on the assessment of supply performance, European Journal of Operational Research, 177 (2007), pp. 1013-1025.
[4] J. Malczewski and C. Rinner, Exploring multicriteria decision strategies in GIS
with linguistic quantifiers: A case study of residential quality evaluation, Journal
of Geographical Systems, 7 (2005), pp. 249-268.
[5] V. A. Niskanen, A soft multi-criteria decision-making approach to assessing the
goodness of typical reasoning systems based on empirical data, Fuzzy Sets and
Systems, 131 (2002), pp. 79-100.
[6] A. Piegat, Are Linguistic Evaluations Used by People of Possibilistic or Probabilistic Nature?, in: J. G. Carbonell and J. Siekmann, eds., Artificial Intelligence and Soft Computing - ICAISC 2004, Springer, Berlin / Heidelberg, 2004, pp. 356-363.
[7] H. Moshkovich, A. Mechitov and D. Olson, Verbal Decision Analysis, Multiple
Criteria Decision Analysis: State of the Art Surveys, 2005, pp. 609-633.
[8] J. Ma, D. Ruan, Y. Xu and G. Zhang, A fuzzy-set approach to treat determinacy
and consistency of linguistic terms in multi-criteria decision making, International
Journal of Approximate Reasoning, 44 (2007), pp. 165-181.
[9] Zadeh, Precisiated Natural Language, Aspects of Automatic Text Analysis, 2006.
[10] F. Herrera and E. Herrera-Viedma, Linguistic decision analysis: steps for solving
decision problems under linguistic information, Fuzzy Sets Syst., 115 (2000), pp.
67-82.
[11] H. Mangassarian and H. Artail, A general framework for subjective information
extraction from unstructured English text, Data & Knowledge Engineering, 62
(2007), pp. 352-367.
[12] A. Imsombut and A. Kawtrakul, Automatic building of an ontology on the basis of
text corpora in Thai, Language Resources and Evaluation.
[13] B. Roy, Paradigms and Challenges, Multiple Criteria Decision Analysis: State of
the Art Surveys, 2005, pp. 3-24.
[14] M. S. Garcia-Cascales and M. T. Lamata, Solving a decision problem with linguistic information, Pattern Recognition Letters, 28 (2007), pp. 2284-2294.
[15] Zadeh, Some reflections on soft computing, granular computing and their roles in
the conception, design and utilization of information/intelligent systems, Soft
Computing - A Fusion of Foundations, Methodologies and Applications, 2 (1998),
pp. 23-25.
[16] İ. Ertuğrul and M. Güneş, Fuzzy Multi-criteria Decision Making Method for Machine Selection, Analysis and Design of Intelligent Systems using Soft Computing Techniques, 2007, pp. 638-648.
[17] V.-N. Huynh and Y. Nakamori, Multi-Expert Decision-Making with Linguistic Information: A Probabilistic-Based Model, Proceedings of the 38th Annual Hawaii International Conference on System Sciences (HICSS'05) - Track 3 - Volume 03 (2005), pp. 91.3.
[18] M. Nikravesh and D.-Y. Choi, Soft Computing for Perception Based Information
Processing, Soft Computing for Information Processing and Analysis, 2005, pp.
203-255.