[linux-elitists] 32 essential computer books?
Thu Dec 18 19:03:43 PST 2003
On Thu, 2003-12-18 at 15:27, Jim Thompson wrote:
> Ben Woodard writes:
> > On Thu, 2003-12-18 at 13:25, Nick Moffitt wrote:
> > > begin Ben Woodard quotation:
> > > > On Thu, 2003-12-18 at 10:02, Jim Thompson wrote:
> > > > > The Art of Computer Programming, Volumes 1-3, Donald Knuth.
> > > >
> > > > I think that these are fine books but I'd have to argue with you
> > > > regarding them. I think it is important for any programmer to have good
> > > > algorithms and data structures experience but these books are not the
> > > > way to get it. Plus they are terribly out of date not necessarily with
> > > > regard to the information that they have but with regards to the place
> > > > that the information has in the wider realm of computer programming.
> > > >
> > >
> > > I would argue that they have other faaults as well. They were
> > > written during a time when memory accesses were equivalent to
> > > instruction execution, and before pipelining created elaborate cache
> > > hierarchies within a CPU. Knuth is only *now* re-writing them for
> > > RISC, which was So Over by 1998 that it's kind of embarassing.
> > >
> > > They're like good heavy leather-bound legal books: they look
> > > impressive on your bookshelf.
> > Thank you for bringing up this point, Nick. I had forgotten about that
> > one. It is rather amazing what happens to algorithms when you change
> > some of the assumptions and throw all the oddities of modern hardware
> > into the fray. For example sometimes it is worth it to have the
> > processor do twice as much work internally rather than doing an
> > additional memory access.
> Note, you haven't shown this example.
Correct, I haven't shown you the exact machine language. However, the
next time I'm reading through some optimized itanium assembly language
and I see a case of it, I will try to remember to send it onto this
list. One situation where this tends to come up is when:
1) you calculate a value
2) use it one way
3) then store it somewhere
4) the calculated number gets pushed out of the registers
5) you need the value again.
With no -O it fetches it from the memory.
With -O3 it recalculates it.
In C it might look like this:
<lots of code>
What actually gets generated with the optimizer turned on is something
<lots of code>
I've seen something like that recently when working with the intel
compiler on an itanium machine but I don't remember exactly where.
This kind of thing shows up ALL the time in ccNUMA and grid computing
codes. Think about it for a second, if the piece of information you need
could be halfway across the planet on another node in a grid, or if it
exists on other nodes's cache on a ccNUMA machine it is much faster to
recalculate the information rather than fetch it from wherever it
happens to be.
> > For example, in my current world of high performance computing, what I
> > am seeing is that most of the time you can't use data structures more
> > elaborate than an array. Even linked lists are verbotten. The problem is
> > if you have this massive amount of data and you need to do something to
> > it, and you have lots of processors the data pretty much has to be in an
> > array. There is no parallel algorithm to traverse a linked list, or most
> > kinds of trees or things like that.
> Bullshit. Here is one quick example.
Uh no. You and are are talking about a different scale of parallelism.
Our smallest machine is 512 CPUs. Two of our machines have over 2000
CPUs. The one we are currently building has 4000 CPUs and the one that
is scheduled for 2005 is 65,000 CPUs. The technique that the author used
to parallelize his merge sort doesn't scale.
What you find overall is that many if not most normal data structures
are exceptionally difficult to work with when you need to parallelize
the application. Arrays are good because you can have different
execution units work on different portions of the array. Some kinds of
trees work well because you send different processors down different
branches of the tree.
> Now, yes, thats merging, not traversing, but traversal is part of
> merging, etc.
> You must be talking to a bunch of "pornographic fortran" geeks.
> That all said, most scientific computing is based in matrix algebra,
> and as such, is well-represented by vectors (arrays).
> > Try to come up with a parallel algorithm that searches for something?
> > What you will find is that in today's world when performance really
> > matters and you are forced to do things in parallel, the computer
> > science just doesn't exist.
> There is a ton of reasearch on parallel search. Google for "Young
> Brothers Wait", for instance.
I note that you clipped out my sentence that said for "most kinds of
trees". There are certain operations on trees and certain kinds of trees
which parallelize very well.
> > I don't know if I understand the problem well enough to explain it but
> > what I gather is the fundamental problem is that all the mathematics of
> > computer science is based upon the assumption that you are executing on
> > a Von Neuman machine and parallel computers are not Von Neuman machines.
> > -ben
> Wow... are you ever wrong.
> I cut my teeth (first real unix kernel job) at Convex, which built
> SIMD (vector) machines, then parallel SIMD machines. That was 1986 or so.
Parallel computers today are not that simple. I will readily admit that
when I was talking about parallel computers, I was talking about the
parallel computers that I am used to working with. These are overgrown
beowulf clusters not even single system image ccNUMA machines.
I will stand behind my statement though. The computer science is lagging
very far behind the state of the art in the hardware we can build. We do
not have very many well researched and well understood algorithms and
data structures which work well on the kinds of parallel machines that
we are building. I've heard it repeated several times that much of the
problem is that the mathematics to express these algorithms are pushing
the limits of our understanding.
> Even then, the above statement was wrong.
I would argue that the statement was more wrong then than it is now.
> These days its cheap to build an N-way (for small values of 'N')
> parallel machine, and various CPUs have SIMD fp units attached.
> While vectorizing arrays is EZ (read: cheap to implement), arrays
> bring with them their own special issues. Accesses tend to saturate
> the memory system (and thrash the data cache)
I would say that with SIMD you are absolutely correct. However, what we
are dealing with here is not SIMD. We are dealing with these days are
distributed memory parallel execution environments.
I would argue that yours is a variation on a straw man attack.
More information about the linux-elitists