Cluster Environment Observer (CEO)
A Tool for Monitoring and Administration of Heterogeneous Clusters
The problem
The use of clusters of computers as a high-performance and reliable
computing platform is becoming more popular every day. The usual approach
is to build a small cluster with a few nodes at first and to add more nodes
as computational requirements grow. This on-demand growth leads clusters
to become heterogeneous both in hardware (for instance, node processing
power or memory size) and in software configuration. It is also quite common
to find a cluster where some of the machines run the Unix
operating system while others run NT.
The existing monitoring tools target only one of these two systems, which
forces system administrators to use different tools for monitoring
different clusters, or even different nodes of the same cluster. This goes
against the idea that a cluster should offer a Single-System Image. The ideal
solution would be a single monitoring/administration tool that could be
used to monitor and administer both Unix (including Linux) and Windows clusters
in the same way. Our Cluster Environment Observer aims to offer an environment
that allows administrators to monitor and administer heterogeneous clusters
through a single interface.
Our approach
There are several possible approaches to solving this
problem. The first consists of developing a completely new tool that
is able to work on both systems. Although this is a feasible solution, it
requires too much effort. The other is to find existing
tools that work on one of the systems and port part of them so that
they can also run on the other system.
In our case, we have picked three tools that were
already working on Unix systems and implemented the code needed to run
them under NT as well, giving the image of a single system.
Server-based tools
After studying many monitoring/administration tools,
we found that most of them follow the client/server approach. There is
a process on each node to be monitored/administered that knows what happens
on that machine. This server informs another process (one per system) that
is in charge of interacting with the system administrator.
This architecture has simplified our task a great deal,
as we have only needed to implement the server that runs on and manages
each node. The user interfaces are the ones that already exist.
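The client/server split described above can be illustrated with a minimal sketch in Python. It is not the CEO implementation: the function and class names, the JSON line format, and the choice of metrics are all assumptions made for illustration. It only shows a per-node server that answers status queries using calls that behave the same way on Unix and Windows, plus the front-end side that queries it.

```python
import json
import os
import platform
import socket
import socketserver
import threading
import time


def collect_node_status():
    """Gather a few metrics using only portable standard-library calls.

    The chosen fields are illustrative, not the ones CEO reports.
    """
    return {
        "host": socket.gethostname(),
        "os": platform.system(),      # e.g. "Linux" or "Windows"
        "cpus": os.cpu_count() or 1,
        "timestamp": time.time(),
    }


class NodeHandler(socketserver.StreamRequestHandler):
    """Answer every connection with one JSON line describing this node."""

    def handle(self):
        payload = json.dumps(collect_node_status()) + "\n"
        self.wfile.write(payload.encode("utf-8"))


def start_node_server(host="127.0.0.1", port=0):
    """Start the per-node server in a background thread.

    Port 0 asks the operating system for any free port.
    """
    server = socketserver.TCPServer((host, port), NodeHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def query_node(host, port, timeout=2.0):
    """Front-end side: connect to a node server and read its status."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        with conn.makefile("r", encoding="utf-8") as reader:
            return json.loads(reader.readline())


if __name__ == "__main__":
    server = start_node_server()
    host, port = server.server_address
    print(query_node(host, port))
    server.shutdown()
    server.server_close()
```

Because the front end only depends on what arrives over the socket, the same user interface can talk to nodes running either operating system.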
Furthermore, as these servers usually offer the
same kind of information, we have been able to implement a single server
that is fully compatible with three existing monitoring/administration
tools.
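One way a single server could serve several front ends is to collect the node status once and encode it in whatever format each client expects. The sketch below assumes three hypothetical wire formats ("json", "key_value", "csv"); they are placeholders and do not correspond to the protocols of the three tools supported by CEO, which this page does not describe.

```python
import json


def encode_json(status):
    """Encode the status as a JSON object with sorted keys."""
    return json.dumps(status, sort_keys=True)


def encode_key_value(status):
    """Encode the status as one 'key=value' pair per line."""
    return "\n".join(f"{key}={status[key]}" for key in sorted(status))


def encode_csv(status):
    """Encode the status as a header row followed by one data row."""
    keys = sorted(status)
    header = ",".join(keys)
    row = ",".join(str(status[key]) for key in keys)
    return header + "\n" + row


# Hypothetical format names; each stands in for one front-end protocol.
ENCODERS = {
    "json": encode_json,
    "key_value": encode_key_value,
    "csv": encode_csv,
}


def render_status(status, client_format):
    """Return the same node status encoded for the requesting client."""
    try:
        encoder = ENCODERS[client_format]
    except KeyError:
        raise ValueError(f"unsupported client format: {client_format}") from None
    return encoder(status)


if __name__ == "__main__":
    sample = {"host": "node1", "cpus": 4}
    for name in ENCODERS:
        print(f"--- {name} ---")
        print(render_status(sample, name))
```

The metric collection stays in one place, so supporting another front end under this design only means registering another encoder.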
Tools that can NOW be used on both systems
(NT and Unix):
Download the tool
- STILL UNDER DEVELOPMENT
For more information please send an e-mail to Toni Cortes.
Publications
People
- Oriol Teixió, Universitat Politècnica de Catalunya, Barcelona, Spain
- Toni Cortes, Universitat Politècnica de Catalunya, Barcelona, Spain
- Rajkumar Buyya, Monash University, Melbourne, Australia
Project Homes