ScienceGrid
The nation has a wide range of unique resources: large experimental facilities such as LIGO and SNS, supercomputer centers, petabyte data archives, high-speed Internet, and, most importantly, the expertise of scientists at labs and universities across the country. “The easy science problems are all done,” says somebody important. “Tomorrow’s scientific breakthroughs in biology, nanotechnology, and physics require large multidisciplinary teams and the ability to effectively use the scientific resources distributed around the country.” In fact, all areas of science need to bring resources to the fingertips of their experts. The Science Grid connects scientists, instruments, computing, and data.
The Science Grid creates the environment that science needs to solve tomorrow’s problems. It enables innovative approaches to scientific computing through secure remote access to online facilities, high-speed Internet access, distance collaboration, shared petabyte datasets, and large-scale distributed computation.
Scientific communities need multiple users to be able to remotely access high-performance computing resources and large data archives to perform simulations or to analyze the results of experiments. These users need to collaborate with others involved in the simulations or experiments and to coordinate access to and use of the resources.
The complex and evolving nature of scientific discovery requires general services that can be combined in many different ways to support different problem-solving approaches and that can evolve along with the scientific understanding of the problem. Resource management for such dynamic and distributed environments requires global naming and authorization services, scalability, fault tolerance, data management, security, authentication, and protection of proprietary data.
The goal of the Science Grid is to provide a common and supported set of services across all the scientific resources so that scientists can easily access, use, and share these resources more efficiently with the larger scientific community.
The design and deployment of large, multi-site Grids are still evolving. The current state of the art in providing persistent and usable Grid services can be seen at the following sites.
Technical Progress
- Resource Usage Data Management and Accounting
To dynamically monitor, manage, and account for usage data of Grid resources, such as computational and storage resources, a distributed Resource Usage Data management and Accounting system (RUDA) has been designed and developed.
Slides and a draft paper provide more information about RUDA, and the source code is available for download.
- Fault Tolerance and Fault Monitoring
To support fault-tolerant distributed computation in Grid environments, we have developed the eXtended Virtual Machine (XVM) and Slave MONitor (SMON) software to coordinate available computing resources and to monitor software and hardware failures. Papers and further details are available.
Related Links
- Global Grid Forum – an international effort to define common interfaces and protocols for Grid software.
- DOE Science Grid supplies persistent Grid services needed by the Scientific Discovery through Advanced Computing (SciDAC) program.
- NASA Information Power Grid is NASA’s high performance computational Grid.
- NSF TeraGrid is a distributed infrastructure for open scientific research.
- Earth Systems Grid – A data grid linking producers and users of large-scale climate simulation data.
- Particle Physics Data Grid supplies data grid services to current and future high-energy and nuclear physics experiments such as ATLAS and BaBar.
- NEESgrid is a national-scale distributed virtual laboratory for advanced earthquake engineering.
- EdGrid promotes applications of modeling and visualization in science and mathematics education.