Abstract

In recent years, the volume of data produced in academia and industry has grown drastically. Distributed computing systems, including Grids, use computer networks (e.g. the Internet) to share computing resources located around the world in order to improve processing. Because large data volumes are transferred between geographically dispersed computing nodes, the network aspects of such systems have become significant. In this article, we introduce a model of an overlay distributed computing system that can be used for multiple classifier systems. We formulate an Integer Programming optimization problem whose objective is to minimize the operating (OPEX) cost, which includes the costs of processing and data transfer. Next, an effective heuristic algorithm based on the Greedy Randomized Adaptive Search Procedure (GRASP) is developed and examined.
