Parallel I/O in Practice
Rob Latham, Rob Ross, Rajeev Thakur,
and William Loewe
Tutorial Description
This is a one-day tutorial on parallel I/O for computational scientists.
The first half of the day focuses on parallel file systems in theory
and practice, including a discussion of the three most relevant parallel
file systems in use today: GPFS, Lustre, and PVFS2. The second half of
the day focuses on the I/O software stack as a whole, including POSIX
I/O, MPI-IO, and high-level I/O libraries. Combined, this material
prepares attendees to make informed decisions about how to approach
parallel I/O for their applications and gives them an understanding of
how the I/O stack works to enable high-performance I/O.
If preferred, we can present a half-day version of the tutorial
instead. The tutorial was presented at Cluster 2005, with 7 people
signing up and 14 attending. In a half-day format, only two of the
presenters would attend, and we would focus on the second half of the
material.
General Description
This tutorial
covers how to use the parallel I/O resources available on
today's computational science platforms to obtain high performance for
application I/O. To make the best use of these resources, one must
first understand parallel file systems and how they operate. This
provides the base knowledge necessary for understanding how
higher-level I/O components interact with the parallel file system.
The first half of our tutorial is dedicated to parallel file systems.
We first cover common architectures and features of parallel file
systems, then examine three relevant examples: GPFS, Lustre, and PVFS2.
We wrap up our coverage of parallel file systems by discussing the
results of benchmarking these systems, highlighting the relationships
between benchmarks and application access patterns and the strengths
and weaknesses of each file system.
Effective parallel I/O involves more than just parallel file systems.
Applications always interact with the parallel file system through some
application programming interface (API). There are a number of
interface options for parallel I/O, and in fact some of these are built
on one another (I/O software stacks). The second half of our tutorial
covers interfaces for parallel I/O. We cover the four most common APIs
(POSIX, MPI-IO, Parallel netCDF, and HDF5) in order of increasing
functionality (and complexity). For each interface we first discuss its
features and show a simple example. We follow this up with a more
practical and detailed example. We then discuss what happens below the
interface, leveraging our discussion from the first half of the day.
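To make the notion of layering concrete, here is a minimal illustrative sketch (ours, not taken from the tutorial materials) of the lowest layer of the stack: four simulated "ranks" each write one contiguous block into a shared file with positioned POSIX-style writes. This is the byte-level pattern that MPI-IO, Parallel netCDF, and HDF5 let applications express at progressively higher levels and coordinate across real processes.

```python
import os
import tempfile

# Each of four simulated "ranks" writes its own contiguous block into a
# shared file at offset rank * BLOCK. POSIX expresses this byte by byte;
# MPI-IO and the high-level libraries describe the same pattern with
# richer, collective abstractions.
BLOCK = 8
NRANKS = 4

path = os.path.join(tempfile.mkdtemp(), "shared.dat")
fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o600)
try:
    for rank in range(NRANKS):
        payload = bytes([rank]) * BLOCK       # this rank's data
        os.pwrite(fd, payload, rank * BLOCK)  # positioned write, no seek
    data = os.pread(fd, NRANKS * BLOCK, 0)    # read the whole file back
finally:
    os.close(fd)

# The file now holds all blocks in rank order.
assert data == b"".join(bytes([r]) * BLOCK for r in range(NRANKS))
```

In a real parallel program each rank would be a separate MPI process, and an MPI-IO collective write would let the library optimize these accesses jointly rather than issuing them independently.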
Goals
Attendees will come out of
this tutorial with an understanding of:
• how parallel file systems and I/O software stacks are constructed,
• the relative merits of some popular parallel file systems,
• the I/O interfaces available for applications and their features, and
• how the components in I/O stacks work together to manage parallel I/O.
This background will allow them to make educated decisions about how to
manage I/O in their applications. They will have an opportunity to
interact with experts in all of these components during the tutorial and
over lunch to discuss how to best approach parallel I/O in the context
of their particular applications.
Target Audience, Prerequisites, and Content Level
This tutorial is intended
for an audience interested in learning how to best use I/O resources on
parallel computers. We assume some proficiency in MPI, but no experience
with parallel file systems or any of the I/O interfaces or libraries
that we will cover. The content is approximately 30% beginner, 40%
intermediate, and 30% advanced.
Topic Relevance
Parallel I/O is a perennially
hot topic in parallel computing. Making the best use of available I/O
resources is important if applications are to avoid bottlenecks. New
machines such as the IBM BG/L and the Cray Red Storm provide a
significant increase in computational power, making proper use of I/O
resources an even more critical issue.
Ensuring Cohesive Content
The three Argonne
presenters have worked together for a number of years on projects such
as PVFS, PVFS2, Parallel netCDF, and ROMIO. All the presenters have met
numerous times for discussions related to parallel I/O and presented
this material at SC2005. The second half of the tutorial has been
presented a number of times by subsets of the presenters.
Demos and Exercises
It is impractical to
provide all attendees with access to a system running all of the
parallel file systems and libraries described in the tutorial. Instead,
we concentrate on practical examples, benchmarks, and knowledge that
can be applied across a wide range of application domains and file
systems. We will also provide pointers to the benchmarks, file systems,
and libraries used in the tutorial so that attendees can experiment on
their own.
Outline
1 Introduction
– I/O software stacks
– Parallel file systems
2 Parallel file systems in practice
– Example parallel file systems
– Benchmarking and parallel file system performance
-- Lunch --
3 Interfaces for parallel I/O
– POSIX
– MPI-IO
– Parallel netCDF (PnetCDF)
– HDF5
4 Conclusions
Grid Computing and Gridbus
Rajkumar Buyya
Audience
This tutorial should be of interest
to a large number of participants from academia, government, and
commercial organizations, as it focuses on both the theory and practice
of grid computing and its applications. The audience includes: (A)
students, researchers, and developers interested in creating
technologies and applications for next-generation Grids; (B)
participants from commercial organizations interested in creating
online Grid marketplaces and applications; and (C) policy makers in
Grid computing, as we will offer a live demonstration of current Grid
technologies and their applications.
Course Description
Grid computing,
one of the latest buzzwords in the ICT industry, is emerging as a new
paradigm for Internet-based parallel and distributed computing. It
enables the sharing, selection, and aggregation of geographically
distributed autonomous resources, such as computers (PCs, servers,
clusters, supercomputers), databases, and scientific instruments, for
solving large-scale problems in science, engineering, and commerce. It
leverages existing IT infrastructure to optimize compute resources and
manage data and computing workloads.
The developers of Grids and Grid applications need to address numerous
challenges: security, heterogeneity, dynamicity, scalability,
reliability, service creation and pricing, resource discovery, resource
management, application decomposition and service composition, and
quality of service. A number of projects around the world are
developing technologies that address one or more of these challenges.
To address some of these challenges, the Gridbus Project at the
University of Melbourne has developed grid middleware technologies that
support the rapid creation and deployment of eScience and eBusiness
applications on enterprise and global Grids.
The components of the Gridbus middleware are:
• a Grid application development environment for rapid creation of
distributed applications,
• a Grid service broker and application scheduler,
• a Grid workflow management engine,
• an SLA (service-level agreement) based scheduler for clusters,
• a Web-services-based Grid Market Directory (GMD),
• Grid accounting services,
• Gridscape, for creation of dynamic and interactive resource
monitoring portals,
• Portlets for creation of Grid portals that support web-based
management of Grid application execution, and
• the GridSim toolkit for performance evaluation.
In addition, Gridbus includes a widely used .NET-based enterprise
Grid technology and a Grid web services framework to support the
integration of both Windows and Unix-class resources for Grid computing.
The tutorial covers the
following topics:
1. Fundamental principles
of grid computing and emerging technologies that help in creation of
Grid infrastructure and applications.
2. A review of major international efforts in developing Grid software
systems and applications in academic, research, and commercial
settings.
3. A service-oriented Grid architecture for realising a utility
computing environment that supports resource sharing in research and
commercial environments, and the realization of this architecture by
leveraging standard computing technologies (such as Web services) and
building new services essential for constructing industrial-strength
Grid engines.
4. Gridbus middleware and technologies for creating enterprise and
global utility Grids.
5. Issues in setting up Grids that can scale from enterprise to global
and deploying applications on them.
6. Case studies on the use of Gridbus technologies in creating
applications in the areas of Drug Discovery, Neuroscience, High Energy
Physics, Natural Language Engineering, Environmental Modelling,
Medicine, and Portfolio and Investment Risk Analysis.
7. Live demonstration of Gridbus technologies and their use in creating
and deploying sample applications on the World Wide Grid (WWG).
8. Sociological and industrial implications of this new Internet-based
distributed computing paradigm and its impact on the marketplace.
The tutorial emphasizes the concepts of the Grid economy: how to design
and develop Grid technologies and applications capable of dynamically
leasing the services of distributed resources at runtime, depending on
their availability, capability, performance, and cost, and on users'
quality-of-service requirements.
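As a toy illustration of this economy-driven selection (all resource names, speeds, and prices below are invented for the example; the actual Gridbus broker algorithms are considerably more sophisticated), the following Python sketch picks the cheapest resource that can finish a job before the user's deadline:

```python
# Hypothetical resource catalog: processing rate and price per second.
resources = [
    {"name": "clusterA", "ops_per_sec": 200, "cost_per_sec": 5},
    {"name": "clusterB", "ops_per_sec": 100, "cost_per_sec": 2},
    {"name": "clusterC", "ops_per_sec": 400, "cost_per_sec": 9},
]

def select_resource(job_ops, deadline_sec, resources):
    """Return the cheapest resource that meets the deadline, or None."""
    feasible = []
    for r in resources:
        runtime = job_ops / r["ops_per_sec"]   # estimated completion time
        if runtime <= deadline_sec:            # deadline constraint
            feasible.append((runtime * r["cost_per_sec"], r))
    if not feasible:
        return None
    return min(feasible, key=lambda pair: pair[0])[1]  # minimize cost

choice = select_resource(job_ops=1000, deadline_sec=8, resources=resources)
print(choice["name"])  # clusterC: fast enough and cheaper overall than clusterA
```

A real broker would also weigh availability and reschedule as resource conditions change; the point here is only that deadline and cost jointly drive the selection.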