Nvidia A100 GPUs...up to 19.5 FP64 TFLOPS


If and when you get a new Nvidia RTX 3000 series video card, you should be pleased with it on tasks that use double-precision calculations.

- From TechPowerUp: "The third-generation Tensor Cores in the NVIDIA Ampere architecture are beefier than prior versions. They support a larger matrix size — 8x8x4, compared to 4x4x4 for Volta — that lets users tackle tougher problems.
That's one reason why an A100 with a total of 432 Tensor Cores delivers up to 19.5 FP64 TFLOPS, more than double the performance of a Volta V100."

- From MC.AI: "...(interestingly, the new Ampere architecture has 3rd generation tensor cores with FP64 support, the A100 Tensor Core includes new IEEE-compliant FP64 processing that delivers 2.5x the FP64 performance of V100)."

- More info at Nvidia.

Happy crunching! By the way, 19.5 FP64 TFLOPS is 2.5 times the speed of my rig with four Radeon HD 7990 video cards.
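For anyone who wants to sanity-check the ratios quoted above, here is a rough sketch of the arithmetic. The 7.8 TFLOPS figure for the V100 is my assumption (the commonly listed FP64 spec for the SXM2 version) and is not stated in the quotes. The rig number is just what the 2.5x claim implies, not a measurement.

```python
# Figures quoted in the thread
A100_FP64_TENSOR_TFLOPS = 19.5   # A100 peak FP64 via Tensor Cores
CLAIMED_SPEEDUP = 2.5            # "2.5x the FP64 performance of V100"

# Assumption: V100 (SXM2) peak FP64 spec, not given in the quotes above
V100_FP64_TFLOPS = 7.8

# Ratio of A100 to V100 peak FP64 throughput
a100_vs_v100 = A100_FP64_TENSOR_TFLOPS / V100_FP64_TFLOPS

# Throughput implied for a rig that the A100 beats by 2.5x
implied_rig_tflops = A100_FP64_TENSOR_TFLOPS / CLAIMED_SPEEDUP

# Multiply-accumulates per Tensor Core matrix op: 8x8x4 (Ampere) vs 4x4x4 (Volta)
ampere_macs = 8 * 8 * 4
volta_macs = 4 * 4 * 4

print(f"A100 vs V100 peak FP64: {a100_vs_v100:.2f}x")
print(f"Implied rig throughput: {implied_rig_tflops:.1f} TFLOPS")
print(f"Matrix op size ratio:   {ampere_macs // volta_macs}x")
```

These are peak spec numbers, so real application throughput will be lower on all of the hardware involved.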
I'd love for this to be true, since Nvidia has intentionally nerfed double-precision performance on its consumer cards for years, but it looks to me like this only applies to the HPC data center cards. The 3xxx cards do have Tensor cores, but even if they have this same capability, apps will have to be written to take advantage of it. Right now both Tensor and RT cores are just wasted die space for BOINC use.