Cloud computing is the next stage in the Internet’s evolution, providing the means through which everything, from computing power and computing infrastructure to applications, business processes, and personal collaboration, can be delivered to you as a service wherever and whenever you need it. The “cloud” in cloud computing can be defined as the set of hardware, networks, storage, services, and interfaces that combine to deliver aspects of computing as a service. Cloud services include the delivery of software, infrastructure, and storage over the Internet, either as separate components or as a complete platform, based on user demand.
Cloud computing has four essential characteristics: elasticity and the ability to scale up and down; self-service provisioning and automatic deprovisioning; application programming interfaces (APIs); and billing and metering of service usage in a pay-as-you-go model. This flexibility is what attracts individuals and businesses to the cloud.
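To make the pay-as-you-go model concrete, here is a minimal sketch of how metered usage translates into a bill; the resource names, rates, and quantities are hypothetical and not tied to any particular provider.

```python
# Illustrative sketch of pay-as-you-go metering: each resource's usage is
# recorded and billed only for what was actually consumed. The resource
# names, rates, and quantities are hypothetical, not any provider's prices.

metered_usage = {
    "vm_hours":   {"quantity": 120,    "rate": 0.05},     # $ per VM-hour
    "storage_gb": {"quantity": 200,    "rate": 0.02},     # $ per GB-month
    "api_calls":  {"quantity": 50_000, "rate": 0.000001}, # $ per call
}

bill = sum(item["quantity"] * item["rate"] for item in metered_usage.values())
print(f"Monthly charge: ${bill:.2f}")
```

Because charges track actual consumption, scaling resources down immediately reduces cost, which is the flexibility described above.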
Cloud computing solutions come in many forms, from full software services to virtual machines with individually configurable operating systems. For example, Microsoft focuses its efforts on Azure, its cloud computing platform, which provides an easy-to-use environment for deploying software services. In contrast, Amazon’s EC2 service focuses on the needs of users who wish to manipulate lower-level operating system configurations.
Cyberinfrastructure describes the new research environments that support advanced data acquisition, data storage, data management, data integration, data mining, data visualization, and other computing and information processing services over the Internet. In scientific usage, cyberinfrastructure is a technological solution to the problem of efficiently connecting data, computers, and people, with the goal of enabling the derivation of novel scientific theories and knowledge.
Apache Hadoop is a framework for running applications on large clusters built of commodity hardware. The Hadoop framework transparently provides applications with both reliability and data motion. Hadoop implements a computational paradigm named Map/Reduce, in which the application is divided into many small fragments of work, each of which may be executed or re-executed on any node in the cluster. In addition, it provides a distributed file system (HDFS) that stores data on the compute nodes, providing very high aggregate bandwidth across the cluster. Both Map/Reduce and the distributed file system are designed so that node failures are handled automatically by the framework.
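A minimal sketch of the Map/Reduce paradigm using the classic word-count task: in a real Hadoop job the map and reduce functions would run as separate tasks on different cluster nodes (commonly written in Java or run via Hadoop Streaming), whereas here the two phases are chained in a single Python process purely to show the data flow. The sample input fragments are illustrative assumptions.

```python
# Minimal word-count sketch of the Map/Reduce paradigm. In a real Hadoop
# job, map and reduce tasks run on different cluster nodes and the
# framework shuffles the intermediate pairs between them; here the two
# phases are chained in one process only to illustrate the data flow.

from collections import defaultdict


def map_phase(fragment):
    """Map step: turn one small fragment of input into (key, value) pairs."""
    for word in fragment.split():
        yield word.lower(), 1


def reduce_phase(pairs):
    """Reduce step: combine all values that share the same key."""
    counts = defaultdict(int)
    for word, count in pairs:
        counts[word] += count
    return dict(counts)


if __name__ == "__main__":
    # Hypothetical input fragments; each could be processed on a different node.
    fragments = ["the quick brown fox", "the lazy dog", "the quick dog"]
    intermediate = [pair for frag in fragments for pair in map_phase(frag)]
    print(reduce_phase(intermediate))
    # {'the': 3, 'quick': 2, 'brown': 1, 'fox': 1, 'lazy': 1, 'dog': 2}
```

Because each fragment is processed independently, a failed map task can simply be re-executed on another node, which is how the framework tolerates node failures.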
High-performance computing (HPC) uses supercomputers and computer clusters to solve advanced computation problems. The term is most commonly associated with computing used for scientific research or computational science.
HUBzero™ is a platform used to create dynamic web sites for scientific research and educational activities. With HUBzero, you can easily publish your research software and related educational materials on the web. The original site to run on the HUBzero platform was nanoHUB.
In the broadest sense, an OptIPortal is a visualization cluster that can be deployed on a variety of hardware platforms. Functionally, OptIPortals are tiled displays, that is, many displays capable of acting as one or as many virtual displays. OptIPortals support a wide variety of visualization approaches: viewing high-definition static images, playing video, or streaming content across one or more OptIPortals. OptIPortals include both 2D and 3D environments.
Parallel computing is a form of computation in which many calculations are carried out simultaneously, operating on the principle that large problems can often be divided into smaller ones, which are then solved concurrently (“in parallel”).
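A small sketch of that divide-and-solve-concurrently idea, assuming a simple sum-of-squares problem and Python’s standard multiprocessing module; the problem, chunk size, and worker count are arbitrary choices for illustration.

```python
# Illustrative sketch of parallel computing: one large summation is split
# into independent chunks that worker processes evaluate concurrently,
# and the partial results are combined at the end.

from multiprocessing import Pool


def partial_sum(bounds):
    """Solve one small piece of the problem: the sum of squares over a range."""
    lo, hi = bounds
    return sum(i * i for i in range(lo, hi))


if __name__ == "__main__":
    n = 1_000_000
    chunk = 100_000
    pieces = [(lo, min(lo + chunk, n)) for lo in range(0, n, chunk)]
    with Pool(processes=4) as pool:
        partials = pool.map(partial_sum, pieces)  # chunks run in parallel
    print(sum(partials))                          # combine the partial results
```

The decomposition works because the chunks do not depend on one another; problems with such independent pieces parallelize most easily.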
In computing, petascale refers to a computer system capable of reaching performance in excess of one petaflop, i.e., one quadrillion floating-point operations per second. The standard benchmark tool is LINPACK, and Top500.org is the organization that tracks the fastest supercomputers. Some uniquely specialized petascale computers do not rank on the Top500 list because they cannot run LINPACK.
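As a rough worked example of what one petaflop per second means, the sketch below estimates how long such a machine would take on a dense LU solve of the kind LINPACK times; the (2/3)·n³ operation count is the standard estimate, the matrix size is an arbitrary assumption, and sustained performance on real systems is below peak.

```python
# Back-of-the-envelope arithmetic for "petascale": one petaflop/s is
# 10**15 floating-point operations per second. The (2/3) * n**3 operation
# count is the standard estimate for the dense LU factorization that
# LINPACK times; the matrix dimension n is an arbitrary example.

PETAFLOPS = 1e15            # floating-point operations per second
n = 1_000_000               # hypothetical n x n dense matrix

operations = (2.0 / 3.0) * n ** 3
seconds = operations / PETAFLOPS
print(f"{operations:.2e} operations -> {seconds:.0f} s "
      f"(about {seconds / 60:.0f} minutes) at a sustained 1 PFLOP/s")
```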
The TeraGrid project was launched by the National Science Foundation in August 2001. TeraGrid was an open scientific discovery infrastructure that combined leadership-class resources at eleven partner sites to create an integrated, persistent computational resource. Using high-performance network connections, the TeraGrid integrated high-performance computers, data resources and tools, and high-end experimental facilities around the country. More than 10,000 scientists used the TeraGrid to complete thousands of research projects, at no cost to the scientists.
In July 2011, the National Science Foundation launched the Extreme Science and Engineering Discovery Environment (XSEDE) project. XSEDE replaces and expands on the NSF TeraGrid project. Currently, XSEDE supports 16 supercomputers and high-end visualization and data analysis resources across the country. It also includes other specialized digital resources and services to complement these computers. These resources will be expanded throughout the lifetime of the project.