Supercomputing in the 21st century
Story posted May 9, 2007

Editorial by John Towns, Director of Persistent Infrastructure, NCSA
As information technology and the emerging concept of cyberinfrastructure permeate science and engineering research endeavors, the notion of a traditional supercomputing center becomes dated. NCSA is working to keep ahead of these changes and is evolving into an institution whose raison d'être is more than big iron. The supercomputing center of the 21st century will be the lynchpin in an enterprise focused on the concept of empowering scientists and engineers.
Cyberinfrastructure is still, and will remain, indispensable. This is true both for us as a center and for our users who need the incredible capacity and capability that cyberinfrastructure provides. But in a new world we have to think in new ways about the things that remain the same. At NCSA, the shift in how we think about ourselves has led to a shift in the way we deploy and manage even our longest-standing services.
The major usage mode of the past 25 years for high-end computing -- allocation of time to a single investigator, submission of jobs to a queue for execution, post-processing and analysis of the output later by the researchers -- is giving way to other usage modes. While this will remain a critical process for some users, as the community of scientists and engineers expands (both in number and into new disciplines) we find many users have new needs. Some need rapid turnaround, both to accelerate the science process through interactive science and to allow for timely progress in development activities at scale. Others need dedicated resources for extended periods to ensure the timely completion of their projects or to ensure that data can be handled when it becomes available from various data sources. Some even need on-demand computing to react to unforeseeable events. NCSA is working closely with the scientists and engineers we support to ensure that our cyber-resources and policies match these needs. The shift can already be seen along a number of dimensions.
First, we're working more closely with researchers in particular communities during the planning and acquisition process. Through our ongoing interactions, we continuously gather the evolving requirements; we then design, build, and operate the system expressly for them. The deployment of our newest resource, Abe, is a direct response to several requirements noted by the user community: Researchers will be able to reserve blocks of Abe for days or even weeks, allowing them to accelerate their research. We are hopeful about the currently submitted proposals for the NSF "Track 1" and "Track 2" solicitations, proposals born of the in-depth discussions we have had with both current and expected users of these resources. The systems are truly designed-to-order, reflecting users' technical needs better than systems of this magnitude ever have. They are intended to provide capability computing, given the increasing general availability of capacity computing resources.
We're also allocating our existing cyber-resources -- open to any peer-reviewed project that is awarded an account -- so that they are more conducive to real-world use. For example, our Abe system has a peak speed of about 90 teraflops, making it the largest system supported by the National Science Foundation to date. The operation of Abe will build on capabilities we have developed on Tungsten to provide what we call tailored allocations. In a tailored allocation, specific pieces of the machine are turned over to particular users for given periods of time. These periods are planned in advance so that users know when they're going to get access, how long they're going to have it, and how much computing oomph they're going to have available. These dedicated runs give users the capability they need to complete crucial computations that must be done in a specific time frame, that require an unusually large number of processors, or that otherwise give the queuing system fits. We are excited to be able to expand this service while still leaving room for the traditional capacity-oriented use, with smaller jobs passing through the queue and running unimpeded. Policy is the primary means by which we change the services provided to scientists and engineers; those changes are driven by their needs and by our desire to support their success.
NCSA's superior support of scientists and engineers is reflected in the 2005 and 2006 Cyberinfrastructure Partnership user surveys; an overwhelming majority of our users and users at other NSF-funded sites were pleased with the performance of our systems and the user support they receive, giving the highest marks in these areas to NCSA. But we won't rest there. We are increasing the services NCSA offers, not only to provide a superior level of service to our traditional users as they continue to scale their applications to higher levels, but also to address, via novel cyberenvironments, the needs of researchers with less proficiency in accessing cyber-resources. Both the complexity and the scale of the resources continue to increase dramatically. As a result, we are developing expertise in application scaling and performance optimization and in supporting cyberenvironments. Thus we can support the most demanding applications and deploy cyberenvironments that serve broad communities requiring a low threshold to access. To proceed in any other manner is unthinkable.
Cyber-resources will remain crucial to our success. They energize our ideas. They are the core that we build out from. They must be as powerful as our ingenuity can make them. But making them larger and faster is no longer our organizing principle. This fact pushes us outside our comfort zone and forces us to think in new ways. It's an exhilarating time at NCSA as we prepare to define this world turned upside down.