I'm building a workstation and want to get into some heavy CUDA programming. I don't want to go all out on Tesla cards, and I've pretty much narrowed it down to either the Quadro 4000 or the GeForce GTX 480, but I don't really understand the difference. On paper the 480 has more cores (480 vs. 256 for the 4000), yet the 4000 costs almost twice as much as the 480. Can someone explain the difference that justifies the higher price?
I will be doing scientific computing on it, so everything will be in double precision, if that makes a difference between them.
If you care about neither visualization nor rendering (drawing final results on screen, e.g. raytracing), then the answer to your question is somewhat simpler, but still not trivial.
I'm not going to go into detail about the differences between Quadro and GeForce cards; I will just underline the significant points that matter when choosing between them.
If you need lots of memory, then you need a Tesla or a Quadro. Consumer cards currently have at most 1.5 GB (GTX 480), while Teslas and Quadros go up to 6 GB.
GF10x series GeForce cards have their double precision (FP64) performance capped at 1/8 of the single precision (FP32) performance, while the architecture is capable of 1/2. This is yet another market segmentation trick, quite popular nowadays among hardware manufacturers. Crippling the GeForce line is meant to give the Tesla line an advantage in HPC; in peak FP32 throughput and memory bandwidth the GTX 480 is in fact faster than the Tesla 20x0: 1.34 TFlops vs 1.03 TFlops and 177.4 GB/s vs 144 GB/s.
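The peak figures quoted above can be reproduced with simple arithmetic. The following is a minimal sketch, not a benchmark: it assumes one fused multiply-add (2 flops) per CUDA core per shader cycle, and the shader clocks, bus widths and memory data rates are published specifications that are not stated in this answer, so treat them as assumptions to check against the data sheets. The function names are made up for illustration.

```python
def peak_fp32_gflops(cuda_cores, shader_clock_ghz):
    """Theoretical FP32 peak: each core retires one FMA (2 flops) per cycle."""
    return cuda_cores * shader_clock_ghz * 2


def peak_bandwidth_gb_s(bus_width_bits, data_rate_gt_s):
    """Theoretical memory bandwidth: bytes per transfer times transfers per second."""
    return bus_width_bits / 8 * data_rate_gt_s


# Assumed specifications (not taken from the answer itself).
cards = {
    "GTX 480": {"cores": 480, "shader_ghz": 1.401, "bus_bits": 384, "rate_gt_s": 3.696},
    "Tesla C2050": {"cores": 448, "shader_ghz": 1.15, "bus_bits": 384, "rate_gt_s": 3.0},
}

for name, spec in cards.items():
    fp32 = peak_fp32_gflops(spec["cores"], spec["shader_ghz"])
    bw = peak_bandwidth_gb_s(spec["bus_bits"], spec["rate_gt_s"])
    print(f"{name}: {fp32:.1f} GFlops FP32 peak, {bw:.1f} GB/s peak")
```

These are upper bounds only; real kernels rarely reach either limit.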
Tesla and Quadro cards are (supposed to be) more thoroughly tested and therefore less prone to producing errors. Such errors are pretty much irrelevant in gaming, but in scientific computing a single bit flip can trash the results. NVIDIA claims that Tesla cards are quality-checked for 24/7 use.
A recent paper (Haque and Pande, Hard Data on Soft Errors: A Large-Scale Assessment of Real-World Error Rates in GPGPU) suggests that Tesla is indeed less error prone.
My experience is that GeForce cards tend to be less reliable, especially under constant high load. Proper cooling is very important, as is avoiding overclocked cards, including factory-overclocked models (see Figure 1 of the previously mentioned paper).
For production HPC/scientific computing:
Quadro if you need FP64 and/or advanced rendering features (the new "Fermi" Teslas have rendering capabilities similar to a GeForce)
If you want to use FP64 intensively, forget about GeForce, otherwise
The two cards you mention are in entirely different leagues and therefore not directly comparable. If you need the Quadro's rendering features, get a Quadro. Otherwise a Quadro is not really worth it, especially not the 4000, which is even slower than a GTX 460 while costing ~3.5x more. I think you're better off with a GTX 470 or 480; just make sure that you buy one with standard frequencies.
Note that the crippled GeForce double precision performance is not an issue in this comparison, but let me elaborate. The Quadro 4000 is a low-end model with, as far as I recall, only 450 MHz shaders (I can't find the reference at the moment, but it should definitely be lower than the 5000, which is clocked at 513 MHz), which gives it around 115 GFlops FP64. At the same time, the capped GTX 480 is around 168 GFlops FP64 and even a GTX 460 is around 113 GFlops (peak).
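The FP64 estimates in this comparison follow from the FP32 peak multiplied by the FP64 rate of each card. Here is a rough sketch of that arithmetic. The 0.45 GHz Quadro 4000 clock is only the value recalled above, not a verified specification. The GTX 480 and GTX 460 shader clocks (1.401 GHz and 1.35 GHz), the GTX 460 core count (336) and the 1/8 ratio applied to the GTX 460 are assumptions chosen to match the numbers quoted. The function name is hypothetical.

```python
def peak_fp64_gflops(cuda_cores, shader_clock_ghz, fp64_ratio):
    """Theoretical FP64 peak: FP32 peak (2 flops per core per cycle) scaled by the FP64 rate."""
    fp32_peak = cuda_cores * shader_clock_ghz * 2
    return fp32_peak * fp64_ratio


estimates = {
    # Clock as recalled in the answer (unverified); uncapped 1/2 FP64 rate.
    "Quadro 4000": peak_fp64_gflops(256, 0.45, 1 / 2),
    # GeForce cards with the 1/8 FP64 cap the answer describes.
    "GTX 480": peak_fp64_gflops(480, 1.401, 1 / 8),
    "GTX 460": peak_fp64_gflops(336, 1.35, 1 / 8),
}

for name, gflops in estimates.items():
    print(f"{name}: ~{gflops:.0f} GFlops FP64 peak")
```

The result scales linearly with both the clock and the FP64 ratio, so the comparison is only as good as those inputs. It is worth re-running the numbers with data sheet values before buying.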
Both the FP32 performance and the memory bandwidth are much lower on the Quadro 4000 compared to the GTX 480 (86.9 vs 177.4 GB/s)!
Note that in terms of theoretical peak performance the GTX 480 (see the data sheet) is considerably faster than both the Tesla C2050/2070 and the Quadro 6000, and this is reflected in most applications.