
TPU's GPU Database Portal & Updates

For Nvidia, it is clear that the information is easier to find, since they communicate a lot about their GPUs.

But AMD is really heartbreaking, especially since within the same architecture generation we can have different ratios, as is the case with Bristol Ridge against others of its version such as Fiji or Tonga.
True, I really need more info on Fermi though; it's CUDA 2.0 for GF100/110 and CUDA 2.1 for the rest.
 
No double precision at all for Tesla G8x, G9x and GT215/216/218. Only GT200, at 1:8.

Two examples from me:
 

Attachments

  • GeForce GTX 280.png
  • GeForce 8800 ULTRA.png
For Fermi: (From me also)

1:8 for GF100/110 GeForce
1:2 for GF100/110 Quadro/Tesla
1:12 for other Fermi GeForce and Quadro
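To put those ratios in numbers, here is a minimal sketch of how the peak FP64 rate follows from the FP32 peak and the ratio. The shader counts and clocks below are the public specs of two GF110 cards, used here as illustrative inputs (they are not from the posts above):

```python
def peak_gflops(shaders, shader_clock_ghz, fp64_ratio):
    """Peak FP32 GFLOPS = shaders x 2 (FMA) x shader clock (GHz); FP64 = FP32 x ratio."""
    fp32 = shaders * 2 * shader_clock_ghz
    return fp32, fp32 * fp64_ratio

# GeForce GTX 580 (GF110): 512 shaders @ 1544 MHz, FP64 capped at 1:8
gf_fp32, gf_fp64 = peak_gflops(512, 1.544, 1 / 8)

# Tesla M2090 (same GF110 silicon): 512 shaders @ 1301 MHz, full 1:2
tesla_fp32, tesla_fp64 = peak_gflops(512, 1.301, 1 / 2)

print(f"GTX 580: {gf_fp32:.0f} FP32 / {gf_fp64:.0f} FP64 GFLOPS")
print(f"M2090:   {tesla_fp32:.0f} FP32 / {tesla_fp64:.0f} FP64 GFLOPS")
```

Same silicon, very different FP64 peaks (~198 vs ~666 GFLOPS), purely from the ratio.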
 

Attachments

  • GTX 560 Ti.png
  • GTX 580.png
  • Quadro 2000.png
Yes, but remember this isn't always correct. Nvidia seems to go by CUDA version, not desktop vs. workstation/server, since Fermi shares the same chips and the same CUDA versions.

GF100/110 is CUDA 2.0, and 2.0 is 1:2.

But if you can find a chart that says 2.0 is an exception, I'll change it xD
 

CUDA version, OK, but GeForce is for the public; there's no need to have FP64 at max speed. The situation of Fermi is the same as Hawaii. Or GK110: it's CUDA 3.5, but 1:3 for Quadro/Tesla and Titan, and 1:24 for GeForce GTX 780 (Ti). Same situation.
 
In the CUDA documents it shows 3.5 as 1:3 only; there is no 1:24.

I fixed it.
 
The CUDA version gives the general specifications of a chip series and the maximum of its potential. However, the functions of a particular card model are defined by the card's BIOS.

Physically, the GF100 and GK110 are identical between Quadro and GeForce, with the same number of FP64 units. But the BIOS defines how software can access the functions and instructions in the GPU. The Quadro and Tesla BIOSes give 100% access to the FP64 units, but the GeForce ones give half, if not less.

There are two methods for limiting FP64: either by disabling FP64-specific instructions, or by decreasing the frequency of the FP64 shaders when they are called. This is the case of the GTX 780/780 Ti, where the frequencies are simply decreased to the 30-40 MHz range for FP64.

The CUDA version gives the maximum possible capabilities, but it's the BIOS that sets the priorities, not just the frequencies, voltages or whatever (as some people think).

If we could examine and compare the BIOS of a Quadro 7000 with that of a GTX 580, we would see differences there.
 
I already fixed it.
 
I have information for old GPUs:

This is still a bit difficult (information about these generations is less complete than for current ones), but it seems that the GeForce FX series was capable of FP32 calculations (for 3D, of course). The true values remain to be found, since these are non-unified architectures. I know it's vector VLIW, but I don't know what type (like VLIW4 for pixel shaders and VLIW3 for vertex shaders). You also need to know how a GeForce 7900 GTX (650 MHz) can reach 300 GFLOP/s FP32 (source: Nvidia slide).

In any case, all this started with the GeForce FX 5000 series and the Radeon R300.
 
Look at my RSX GPU in the database to see how FP32 is calculated with pixel + vertex shaders.
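The ~300 GFLOP/s figure for the 7900 GTX can indeed be reconstructed from the split pixel/vertex shaders. A sketch, assuming the commonly cited G71 layout (24 pixel shaders with two vec4 MADD ALUs each, and 8 vertex shaders with a vec4 + scalar MADD each; these unit counts are assumptions of this example, not from the posts above):

```python
clock_ghz = 0.65  # GeForce 7900 GTX core clock

# A MADD (multiply-add) counts as 2 flops per lane.
pixel_flops_per_clk = 24 * 2 * 4 * 2    # 24 shaders x 2 ALUs x vec4 x MADD
vertex_flops_per_clk = 8 * (4 + 1) * 2  # 8 shaders x (vec4 + scalar) x MADD

total_gflops = (pixel_flops_per_clk + vertex_flops_per_clk) * clock_ghz
print(f"{total_gflops:.1f} GFLOPS FP32")  # lands right on Nvidia's ~300 figure
```

That gives 464 flops per clock, or about 301.6 GFLOPS at 650 MHz, so the slide's number is consistent with counting both shader types.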
 
I could also find the actual speed of GPUs in texture filtering (with the TMUs):

All Pascal/Maxwell/Kepler/Fermi: INT8-INT16 (1:1) - FP16 (1:2) - FP32 (1:4)
Tesla GT2xx/G9x/G8x: INT8-INT16 (1:1) - FP16 (1:2) - FP32 (1:4)
All AMD (GCN/TeraScale): INT8 (1:1) - INT16 (1:2) - FP16 (1:2) - FP32 (1:4)

(Source: techreport.com)
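Those ratios turn into filtered-texel rates like this. A sketch, with an illustrative Pascal-class configuration (160 TMUs at 1607 MHz; those two figures are assumptions of this example):

```python
# Filtered texel rate = TMU count x core clock x per-format ratio
# (ratios from the list above, Nvidia Fermi-and-later column).
FORMAT_RATIO = {"INT8": 1.0, "INT16": 1.0, "FP16": 0.5, "FP32": 0.25}

def filter_rate_gtexels(tmus, clock_ghz, fmt):
    return tmus * clock_ghz * FORMAT_RATIO[fmt]

for fmt in ("INT8", "FP16", "FP32"):
    rate = filter_rate_gtexels(160, 1.607, fmt)
    print(f"{fmt}: {rate:.1f} GTexel/s")
```

So the headline texture fill rate in spec sheets is the INT8 number; FP16 filtering runs at half that, FP32 at a quarter.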
 
Some generation sorting improvements will be coming soon! :)
 
Hi, new information for Nvidia Turing: FP16 is fast, 2:1 of FP32 :)
 
I was discussing Vega10 die size with someone in a thread. AMD news release contains official die sizes for both Vega10 as well as Vega20:
http://ir.amd.com/news-releases/news-release-details/amd-takes-high-performance-datacenter-computing-next-horizon said:
Radeon Instinct™ MI60 contains 13.2 billion transistors on a package size of 331.46mm2, while the previous generation Radeon Instinct™ MI25 had 12.5 billion transistors on a package size of 494.8mm2 – a 58% improvement in number of transistors per mm2.

(vs inner/outer die size of Vega10:
die-size physical vs official)
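The 58% figure checks out against the numbers quoted in the release; a quick sanity check:

```python
# Transistors per mm^2, using the package sizes quoted in AMD's release.
mi60_density = 13.2e9 / 331.46  # Vega 20 (Radeon Instinct MI60)
mi25_density = 12.5e9 / 494.8   # Vega 10 (Radeon Instinct MI25)

improvement = mi60_density / mi25_density - 1
print(f"{improvement:.0%} more transistors per mm^2")  # matches AMD's 58%
```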
 
So basically, if you say die size and don't specify, then 510 mm² technically isn't incorrect, since it's still part of the die?
 
Vega 20 already has 128 ROPs? o_O
 
I find myself going to that GPU Database site every week, lol; I love it. Props to the staff for making it, it comes in handy.
Question though: which is faster, my W5000 or the V7900?
https://www.techpowerup.com/gpu-specs/firepro-v7900.c580
https://www.techpowerup.com/gpu-specs/firepro-w5000.c588

Might be a dumb question because the W5000 is newer, but the specs on the V7900 are much higher. Just curious; I'm really aiming for the W7000 4GB version (if I can get a deal).
The W5000 is much more power efficient and still supported in drivers.

Added USB-C to the outputs picture.

Added AMD and Nvidia graphics IP.
 
Wait a second, you said the W5000 is more power efficient, but the V7900's specs are way higher than the W5000's. Which is faster in overall performance? Even though the W5000 probably runs cooler, I'm sure.
 
GCN vs. TeraScale, 28 nm vs. 40 nm, so it's more efficient by that alone.
 