TPU's GPU Database Portal & Updates

Hello,

I would like to take this opportunity to add some info about compute throughput:

For AMD:

* GCN5: INT8 (4x FP32) - INT24 (= FP32) - INT32 (1/5 FP32) - INT64 (1/20 FP32)
* GCN4 / 3: INT8 (4x FP32) - INT24 (= FP32) - INT32 (1/5 FP32) - INT64 (1/20 FP32) - FP16 (= FP32)
* GCN2 / 1: INT24 (= FP32) - INT32 (1/5 FP32) - INT64 (1/20 FP32)
* TeraScale 3: INT24 (= FP32) - INT32 (1/5 FP32) - INT64 (1/20 FP32)
* TeraScale 2/1: INT24 (1/4 FP32) - INT32 (1/4 FP32) - INT64 (1/20 FP32)

For NVIDIA:

* Volta GV100: INT24-INT32 (= FP32) - INT64 (1/5 FP32)
* Pascal GeForce: FP16 (1/64) - INT8 (4x FP32) - INT24-INT32 (1/3 FP32) - INT64 (1/15 FP32)
* Maxwell: Same as Pascal, but without FP16 and INT8.
* Kepler: INT24-INT32 (1/5 FP32) - INT64 (1/20 FP32)
* Fermi GF100 / 110: INT24-INT32 (1/2 FP32) - INT64 (1/8 FP32)
* Other Fermi: INT24-INT32 (1/3 FP32) - INT64 (1/12 FP32)
* Tesla: INT24 (= FP32) - INT32 (1/5 FP32) - INT64 (1/24 FP32)

AIDA64 was a big help in finding the integer throughput values.

I hope this will enrich the database :)
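
To make the ratios in the lists above concrete, here is a minimal sketch of how a peak figure for another data type follows from an FP32 peak and its ratio. The ratios are copied from the Pascal GeForce row; the function name `peak_rate` and the FP32 peak value are hypothetical and only there to keep the arithmetic easy to follow.

```python
from fractions import Fraction

# Ratios relative to FP32, copied from the Pascal GeForce row above.
PASCAL_GEFORCE = {
    "FP16": Fraction(1, 64),
    "INT8": Fraction(4, 1),
    "INT32": Fraction(1, 3),
    "INT64": Fraction(1, 15),
}


def peak_rate(fp32_peak: float, ratio: Fraction) -> float:
    """Scale an FP32 peak figure by a per-type ratio (hypothetical helper)."""
    return fp32_peak * float(ratio)


# Hypothetical FP32 peak in Gop/s, not taken from any real card.
fp32_peak = 9000.0

for name, ratio in PASCAL_GEFORCE.items():
    print(f"{name}: {peak_rate(fp32_peak, ratio):.1f} Gop/s")
```

Storing the ratios as `Fraction` keeps values such as 1/3 and 1/15 exact until the final multiplication.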
 
Our 32-bit and 64-bit figures should be accurate for NVIDIA and most of AMD:

https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#maximize-instruction-throughput

Our Pascal FP16 is incorrect right now, but Volta GV100 is 1/2 FP32.

Can you link a similar AMD document?
 
It is strange that the AIDA64 test results do not match the figures published by NVIDIA. Unless it depends on which instructions are used, in which case there would not really be a single fixed ratio.
 
AIDA64 is wrong for that specific thing I mentioned.

Right now my concern is FP16 for Pascal. I know we are not correct there, but our 32-bit and 64-bit figures should be 100%.

As for AMD, I'll find out more, but I'm 80% sure it's OK now that you mention GCN 3 as 1:1 for FP16.
 
Do you know any benchmarking software for FP16?
 
I wish!! FP16 is my new project now. FP64 seems to be useless unless you have ECC memory, and I lost interest in it. But FP16 can improve games, and I'm interested.

Help me find out if GCN 3.0 really has 1:1 FP16. I need documents, though.
 
Indeed: all Pascal chips (except GP100) have FP16 at 1/64 FP32, but all Pascal chips have INT8 (4:1) and INT16 (2:1).
 
Yeah, it's going to take more time to fix, but I know the calculation now xD
 
More info for Intel HD Graphics :)

=>Each EU has a 128-bit wide FPU that natively executes eight 16-bit or four 32-bit operations per clock cycle (Clarkdale, Arrandale, Sandy Bridge, and after)
=>FP64 (1/4 FP32) (Bay Trail, Ivy Bridge, Haswell, Braswell, Broadwell, Skylake, Gemini Lake, Kaby Lake, Coffee Lake)
=>FP64 (1/8 FP32) (Apollo Lake)
=>FP16 (2:1 FP32) (Skylake, Gemini Lake, Kaby Lake, Coffee Lake)
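
The EU figures above can be sketched as simple arithmetic: a 128-bit FPU holds 128 / 16 = 8 lanes of 16-bit data or 128 / 32 = 4 lanes of 32-bit data. The snippet below is only an illustration under that assumption. It counts one operation per lane per clock, as the post does, and the EU count and clock speed are hypothetical values, not taken from a real part.

```python
def lanes_per_clock(fpu_width_bits: int, operand_bits: int) -> int:
    """Number of operands of a given width that fit in one FPU pass."""
    if operand_bits <= 0 or fpu_width_bits % operand_bits != 0:
        raise ValueError("operand width must evenly divide the FPU width")
    return fpu_width_bits // operand_bits


def peak_gops(eu_count: int, clock_ghz: float, operand_bits: int,
              fpu_width_bits: int = 128) -> float:
    """Peak Gop/s, assuming one operation per lane per clock per EU."""
    return eu_count * lanes_per_clock(fpu_width_bits, operand_bits) * clock_ghz


# Hypothetical configuration: 24 EUs at 1.0 GHz.
print(lanes_per_clock(128, 16), lanes_per_clock(128, 32))
print(peak_gops(24, 1.0, 32), peak_gops(24, 1.0, 16))
```

With these assumptions the 16-bit figure is exactly twice the 32-bit one, which is the 2:1 FP16 ratio listed for Skylake and later.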
 
I will add all of that, but please provide proof, buddy.
 
Wikipedia, for the moment.
 
So RX Vega M GH/GL are actually Polaris? A fake Vega? o_O
 
Yes, it's more Polaris than Vega, and the info under NDA says it's Polaris 22.

They thought that just because it has HBCC they could call it Vega. It's a GFX8 chip; Vega is GFX9.

RX Vega M
Graphics/Compute: GFX8 (gfx804)
Display Core Engine: 11.2
Unified Video Decoder: 6.3
Video Compression Engine: 3.4
ROCm Support

RX Vega 64
Graphics/Compute: GFX9 (gfx900)
Display Core Engine: 12.0
Unified Video Decoder: 7.0
Video Compression Engine: 4.0
ROCm Support
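
The comparison above comes down to reading the generation out of the `gfx` identifier rather than the marketing name. A minimal sketch, assuming the identifier is `gfx` followed by a decimal major version and two trailing characters for minor version and stepping (that layout is my assumption here, and `gfx_major` and `same_generation` are hypothetical helper names):

```python
import re


def gfx_major(target: str) -> int:
    """Extract the major GFX generation from an identifier such as 'gfx804'.

    Assumes 'gfx' + decimal major + one minor character + one stepping character.
    """
    match = re.fullmatch(r"gfx(\d+)([0-9a-f])([0-9a-f])", target.lower())
    if match is None:
        raise ValueError(f"unrecognised identifier: {target!r}")
    return int(match.group(1))


def same_generation(a: str, b: str) -> bool:
    """True when both identifiers share the same major GFX generation."""
    return gfx_major(a) == gfx_major(b)


# Identifiers taken from the two spec lists above.
print(gfx_major("gfx804"), gfx_major("gfx900"))
print(same_generation("gfx804", "gfx900"))
```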
 
Ok

Ah, for other GPUs:

Fermi GF110/GF100-GL (Quadro/Tesla): FP64 (1/2 FP32)
Fermi GF110/GF100 (GeForce): FP64 (1/8 FP32)
Fermi GF11x/GF10x (GeForce/Quadro): FP64 (1/12 FP32)
Tesla (GT200 only - GeForce/Quadro/Tesla): FP64 (1/8 FP32)

:)
 
All NVIDIA calculations are by CUDA version, so we should be all set for NVIDIA (besides Pascal FP16 at the moment).
 
Ok,

I see OpenCL listed for Radeon HD 2000 and HD 3000. These generations don't support OpenCL; support only starts with HD 4000. They use ATI CAL (which was used up to GCN1).

:)
 
Ah? Then why doesn't software like LuxMark detect the HD 3870 under OpenCL with the same driver as my HD 4890, which is detected? Strange.
 
It was my bad. ATI made software called Close to Metal (CTM) for the R600 series and switched to CL later on; I thought CTM was the beta name for CL.

I updated the Graphics IP page.

You're right, we don't have anything for CUDA 2.0 and below. CUDA 1.3 is 1/8, that's GT200 etc. You confused me with the GF100 stuff because you repeated it.

Find the CUDA version for the Fermis you mentioned; that gives the unified rate for that version.
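
As a rough sketch of that lookup, the FP64 ratios quoted earlier in this thread can be keyed by CUDA compute capability. The ratios come from the posts above; the mapping of GF100/GF110 to capability 2.0 and GF10x/GF11x to 2.1 is my assumption, and `fp64_ratio` is a hypothetical helper. The GeForce entry models the lower rate reported for GeForce GF100/GF110 cards.

```python
from fractions import Fraction

# FP64:FP32 ratios quoted in this thread, keyed by compute capability.
# Assumed mapping: 1.3 = GT200, 2.0 = GF100/GF110, 2.1 = GF10x/GF11x.
FP64_BY_CAPABILITY = {
    "1.3": Fraction(1, 8),
    "2.0": Fraction(1, 2),   # Quadro/Tesla rate from the thread
    "2.1": Fraction(1, 12),
}

# Lower rate reported in the thread for GeForce cards of the same capability.
GEFORCE_CAP = {
    "2.0": Fraction(1, 8),
}


def fp64_ratio(capability: str, geforce: bool = False) -> Fraction:
    """Look up the FP64:FP32 ratio, applying the GeForce cap when asked."""
    base = FP64_BY_CAPABILITY[capability]
    if geforce:
        return min(base, GEFORCE_CAP.get(capability, base))
    return base


print(fp64_ratio("2.0"), fp64_ratio("2.0", geforce=True), fp64_ratio("2.1"))
```

Keeping the cap in a separate table leaves one base rate per capability while still recording the product-line exception.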
 
Ok, i understand :)

I forgot the IGPs in the AMD Excavator APUs "Carrizo" and "Bristol Ridge": FP64 (1:2 FP32)
 

Attachment: aida_07dtub7.png
I fixed R600 to show no CL support, and R700 now has 1.0 --> 1.1.

FP16 fixed in Pascal, Carrizo etc. fixed
 
For NVIDIA, it is clearly easier to find the information, since they communicate a lot about their GPUs.

But AMD is really frustrating, especially since the same architecture generation can have different ratios, as with Bristol Ridge compared with others of its generation such as Fiji or Tonga.
 