The Meaning of Performance:
An Interview with Daniel Reed

At SC99, Daniel Reed, University of Illinois, gave a talk entitled
"Performance: Myth, Hype, and Reality?".
HPCwire interviewed Reed to learn
more about his views. Following is a partial transcript of that discussion.
data link editor's note: Dan Reed is a professor and head of the computer
science department at the University of Illinois at Urbana-Champaign. He is a
member of the Alliance Executive Committee and leads the portal initiative
within the Alliance. Reed also serves on the NSF CISE directorate advisory committee.
HPCwire:
Could you start by summarizing your talk, "Performance: Myth, Hype,
and Reality?"
Reed: Well, it's going to touch on what some of the challenges are in
trying to achieve high performance on systems. They're a combination of
economic, political, and technical issues. Then, I'll do a bit of a
checkpoint about where we are in terms of assessing the performance of
systems - what the technical challenges are, what progress we've made over the
last 5 years or so. Then I'll try to conclude by looking at what the big
challenges are for the next few years.
HPCwire:
What of the HP benchmarks? Have they seen their day, so to
speak?
Reed: There are many reasons why one might do benchmarking. Certainly in the
workstation market, I think that the standard benchmarks have been successful
in providing the baseline for comparing systems. But, at the high end, there
are so many reasons why you buy a particular system. The high-level
application requirements span storage, computation, networking, and
visualization, and there are so many points in that space that people have
different expectations. The truth is that you really don't have much choice
except to use some subset of your own applications as a benchmark for the
system.
Parallel systems are very ill-conditioned with respect to performance; by
that, I mean you choose a set of applications and you go run them on different
systems. You rank them based on execution time. You won't get the same order,
much less the same times, because they're so sensitive to small variations.
One of the things I
think is a real problem for computing in general is that we don't have a good
systematic methodology for doing performance engineering: building systems of
hardware and software that meet known performance goals.
HPCwire:
What do you think about the goal of an accurate measure of
performance across platforms?
Reed: Well, the problem is not so much measuring the performance; there are
pretty good techniques for doing that. It's being able to predict what you
would get without having to go through that exercise. At the moment, we don't
have any very predictive techniques. If you walk up with an application,
and say, "how fast is it going to run on this architecture?", about all we
have are some rules of thumb and intuitions based on experience, but there's
no quantitative way to say, "You're going to get this performance," even, say,
plus or minus twenty percent.
HPCwire:
What do you see facilitating a way to set up an accurate
means of performance measurement?
Reed: It's more than just an engineering problem; there actually are some
hard research problems there that have to be solved. One of the things that
people have proposed is to look at composable components: components that
have known performance signatures, so that when you compose the building
blocks you build a system with a known signature. But people have tried that
in the past. The thing that has made it difficult is that, given the space of
high performance applications, it's only a slight exaggeration to say that
every application we run on our machines lies in a part of the execution space
that may never have been tested before.
So there are some interesting research problems about how to do compile-time
prediction, where you use compile-time information to try to estimate the
range of performance that you might see. There are efforts to do the same
thing for library components, where you could build a suite of library
components that would deliver known performance on platforms. I think you
could choose among those based on the characteristics of the problem. Several
groups are working on those things, but right now there aren't any good
solutions.
HPCwire:
Do you think a solution will ever be achieved? If so, which
possibilities seem the most promising?
Reed: Well, I think that, despite the fact that high performance computing
has the better part of a 50-year history, we're still in a period of very
rapid change. You start to see that level of maturity in performance during
periods when you have very stable markets. One of the things that's really
different between the PC market and the high-end market is that most PCs
don't have compilers on them; most people don't do application development.
They run standard applications that they buy from application developers.
People don't really care how fast their things work; they do in some abstract
sense, but they don't care in the sense that they'll go to great pains to
squeeze more performance out of it.
I'm not saying that performance isn't important; it is. But what is really
important is time to solution, not how fast your code runs, and that brings in
all of the issues about usability and infrastructure support. Because, as
expensive as high-end machines are, people's time is even more expensive.
HPCwire:
Many people say that the benchmarks have seen their day and that
most benchmarks today were achieved on a one-time-only basis. How do you see
this?
Reed: I think that most of the benchmarks which people have looked at in the
past tend to focus on a particular piece of what are now larger computations.
As a sweeping generalization, I think that the applications that people care
about now tend to be multidisciplinary; they tend to involve code from
multiple discipline groups. They often involve multiple libraries from
different sources, and they involve not just computation, but data management
and visualization; there may be a networking component and a distributed
collaboration aspect.
You may want to do that across multiple machines concurrently, if you have a
unique scientific instrument that you get your real-time data from to do
analysis. Benchmarks tend not to capture all of that richness and complexity,
and if any one of those components does not work well, then the complete
solution will fail. Benchmarks tend to focus on individual pieces. So, to get
good coverage, you really would have to have a benchmark suite that looks at
how all those things interact, and that's what's really hard. That's why
people tend to want to use applications as benchmarks, because they tend to be
richer and cover more of the space.
HPCwire:
Thank you for your time. Is there anything else that you would like
to add?
Reed: A couple things. If you look back at the history of applications on
high-end machines, there was a time when they tended to be much more regular.
Applications are becoming increasingly irregular in the sense that they are
dynamic, they are adaptive, not only in the numerical sense, but also
in having to adapt to a changing resource base. I think the real challenge
for performance analysis is going to be how to make robust applications run in
that environment.
If you have, for example, a remote instrument generating a real-time data
stream over a network, and you are doing real-time data reduction, you are
archiving it on a tertiary storage system and you want to do real-time
visualization, most of those resources are shared. Those things can change out from
under you in the midst of the application. So how do you build a system that
is resilient to change, that can itself adapt to a changing environment and
maximize performance? That's the real challenge that we're facing now,
and that's what we're trying to fix.
data link acknowledges the source of this article,
HPCwire, the electronic
news magazine for high-performance computing. HPCwire released this interview
for general distribution.