NVIDIA Tesla P4
- Graphics Processor
- GP104
- Cores
- 2560
- TMUs
- 160
- ROPs
- 64
- Memory Size
- 8 GB
- Memory Type
- GDDR5
- Bus Width
- 256 bit
The Tesla P4 is a professional graphics card by NVIDIA, launched on September 13th, 2016. Built on the 16 nm process and based on the GP104 graphics processor in its GP104-895-A1 variant, the card supports DirectX 12. The GP104 graphics processor is a large chip with a die area of 314 mm² and 7,200 million transistors. It features 2560 shading units, 160 texture mapping units, and 64 ROPs. NVIDIA has paired 8 GB of GDDR5 memory with the Tesla P4, connected through a 256-bit memory interface. The GPU runs at a base frequency of 886 MHz and can boost up to 1114 MHz; the memory runs at 1502 MHz (6 Gbps effective).
The NVIDIA Tesla P4 is a single-slot card that does not require an additional power connector; its power draw is rated at 75 W maximum. The device has no display outputs, as it is not designed to have monitors connected to it. The Tesla P4 connects to the rest of the system through a PCI-Express 3.0 x16 interface. The card measures 168 mm in length and features a single-slot cooling solution.
Clock Speeds
- Base Clock
- 886 MHz
- Boost Clock
- 1114 MHz
- Memory Clock
- 1502 MHz (6 Gbps effective)
Memory
- Memory Size
- 8 GB
- Memory Type
- GDDR5
- Memory Bus
- 256 bit
- Bandwidth
- 192.3 GB/s
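The bandwidth figure can be reproduced from the memory clock and bus width listed above. The sketch below is illustrative only: the function name and the factor of 4 transfers per memory clock are assumptions (the factor is inferred from the listed 1502 MHz clock and 6 Gbps effective rate), and 1 GB is taken as 10^9 bytes.

```python
def memory_bandwidth_gb_s(memory_clock_mhz: float,
                          bus_width_bits: int,
                          transfers_per_clock: int = 4) -> float:
    """Peak memory bandwidth in GB/s (1 GB = 1e9 bytes).

    transfers_per_clock=4 is an assumption inferred from the listed
    1502 MHz memory clock and 6 Gbps effective data rate.
    """
    # Effective per-pin data rate in mega-transfers per second.
    effective_rate_mt_s = memory_clock_mhz * transfers_per_clock
    # Bytes moved across the whole bus in one transfer.
    bytes_per_transfer = bus_width_bits / 8
    return effective_rate_mt_s * 1e6 * bytes_per_transfer / 1e9


bandwidth = memory_bandwidth_gb_s(1502, 256)
print(f"{bandwidth:.1f} GB/s")  # 192.3 GB/s
```

With the rounded 6 Gbps figure instead of 4 × 1502 MHz, the same formula gives 192.0 GB/s, which is why the listed value is slightly above 192.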
Render Config
- Shading Units
- 2560
- TMUs
- 160
- ROPs
- 64
- SM Count
- 20
- L1 Cache
- 48 KB (per SM)
- L2 Cache
- 2 MB
Theoretical Performance
- Pixel Rate
- 71.30 GPixel/s
- Texture Rate
- 178.2 GTexel/s
- FP16 (half) performance
- 89.12 GFLOPS (1:64)
- FP32 (float) performance
- 5.704 TFLOPS
- FP64 (double) performance
- 178.2 GFLOPS (1:32)
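The theoretical figures above appear to follow from the unit counts and the 1114 MHz boost clock. The sketch below reproduces them under stated assumptions: one pixel per ROP per clock, one texel per TMU per clock, and two FP32 operations (one fused multiply-add) per shading unit per clock. The 1:64 and 1:32 ratios are taken from the listing; the function and key names are hypothetical.

```python
def theoretical_rates(shading_units: int, tmus: int, rops: int,
                      boost_clock_mhz: float,
                      fp16_ratio: int = 64, fp64_ratio: int = 32) -> dict:
    """Peak rates derived from unit counts and boost clock.

    Assumes 1 pixel/ROP/clock, 1 texel/TMU/clock and 2 FLOPs
    (one fused multiply-add) per shading unit per clock.
    """
    clock_ghz = boost_clock_mhz / 1000.0
    fp32_gflops = shading_units * 2 * clock_ghz
    return {
        "pixel_rate_gpixel_s": rops * clock_ghz,
        "texture_rate_gtexel_s": tmus * clock_ghz,
        "fp32_tflops": fp32_gflops / 1000.0,
        "fp16_gflops": fp32_gflops / fp16_ratio,  # listed as 1:64 of FP32
        "fp64_gflops": fp32_gflops / fp64_ratio,  # listed as 1:32 of FP32
    }


rates = theoretical_rates(shading_units=2560, tmus=160, rops=64,
                          boost_clock_mhz=1114)
for name, value in rates.items():
    print(f"{name}: {value:.3f}")
```

Under these assumptions the results round to the listed 71.30 GPixel/s, 178.2 GTexel/s, 5.704 TFLOPS, 89.12 GFLOPS, and 178.2 GFLOPS.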
Board Design
- Slot Width
- Single-slot
- Length
- 168 mm (6.6 inches)
- TDP
- 75 W
- Suggested PSU
- 250 W
- Outputs
- No outputs
- Power Connectors
- None
Graphics Features
- DirectX
- 12 (12_1)
- OpenGL
- 4.6
- OpenCL
- 3.0
- Vulkan
- 1.2
- CUDA
- 6.1
- Shader Model
- 6.4
