
DNA basecalling performance benchmarked, RTX 3080 vs GTX 1080 Ti

Joined
Mar 18, 2008
Messages
5,717 (0.97/day)
System Name Virtual Reality / Bioinformatics
Processor Undead CPU
Motherboard Undead TUF X99
Cooling Noctua NH-D15
Memory GSkill 128GB DDR4-3000
Video Card(s) EVGA RTX 3090 FTW3 Ultra
Storage Samsung 960 Pro 1TB + 860 EVO 2TB + WD Black 5TB
Display(s) 32'' 4K Dell
Case Fractal Design R5
Audio Device(s) BOSE 2.0
Power Supply Seasonic 850watt
Mouse Logitech Master MX
Keyboard Corsair K70 Cherry MX Blue
VR HMD HTC Vive + Oculus Quest 2
Software Windows 10 P
One of my collaborators recently acquired an RTX 3080 FE for his work. Before retiring his 1080 Ti FE, he decided to compare the basecalling performance of the two.

This is basecalling 4 GB of raw, current-based DNA signal data into DNA bases in FASTQ format. Exact same basecaller, exact same parameters.
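For anyone who hasn't seen FASTQ before, here is a minimal sketch of reading it in Python. It assumes the common four-line record layout (header, sequence, `+` separator, quality string) and Phred+33 quality encoding. The `read_fastq` helper and the example read are made up for illustration and are not part of the basecaller.

```python
import io
from typing import Iterator, NamedTuple, TextIO


class FastqRecord(NamedTuple):
    read_id: str
    sequence: str
    quality: str


def read_fastq(handle: TextIO) -> Iterator[FastqRecord]:
    """Yield records from a four-line-per-read FASTQ stream (hypothetical helper)."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return  # end of input
        sequence = handle.readline().rstrip("\n")
        handle.readline()  # the '+' separator line, ignored here
        quality = handle.readline().rstrip("\n")
        if not header.startswith("@") or len(sequence) != len(quality):
            raise ValueError(f"malformed FASTQ record near {header!r}")
        yield FastqRecord(header[1:], sequence, quality)


# Made-up example read, not real sequencer output.
example = "@read_1\nGATTACA\n+\nIIIIIII\n"

for record in read_fastq(io.StringIO(example)):
    # Assumes Phred+33: quality score = ASCII code minus 33.
    phred = [ord(c) - 33 for c in record.quality]
    print(record.read_id, record.sequence, phred)
```

Each base gets one quality character, which is why the sequence and quality lines must be the same length.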

The CPU is a Threadripper 3970X (32 cores / 64 threads) with 256 GB of DRAM.

The DNA sequencer in context here:



So about 31 minutes for the 1080 Ti and 6 minutes for the 3080.

The 1080 Ti has 3584 CUDA cores and the 3080 has 8704, about 2.4x as many. The roughly 5x speedup is clearly far more than the increase in CUDA core count alone would explain.
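A quick back-of-the-envelope check of those numbers in Python. The run times are the rounded 31 and 6 minute figures from above, so treat the ratios as approximate.

```python
# Rounded wall-clock times from the two runs above (minutes).
time_1080ti_min = 31
time_3080_min = 6

# CUDA core counts quoted above.
cores_1080ti = 3584
cores_3080 = 8704

speedup = time_1080ti_min / time_3080_min  # how many times faster the 3080 finished
core_ratio = cores_3080 / cores_1080ti     # how many times more CUDA cores it has
per_core_gain = speedup / core_ratio       # part of the speedup not explained by core count

print(f"speedup:       {speedup:.2f}x")
print(f"core ratio:    {core_ratio:.2f}x")
print(f"per-core gain: {per_core_gain:.2f}x")
```

That works out to roughly a 5.2x speedup from about 2.4x the cores, so a bit over 2x is coming from something other than the raw core count.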


I have processed over 10 TB of data on my 3090 so far. Things are noticeably faster than on my 2080 Ti, stock vs. stock.

It was always nice to think that I would just work fewer hours with more computing power at hand. What actually ends up happening is that more work piles up, simply because I can push it through analysis faster.

1080Ti.png



3080.png
 
Joined
Oct 17, 2020
Messages
64 (0.05/day)
Location
United States
I eyeball that at about 500% clock-watching time.
That's a huge number, especially if revenue is linked to the analysis.
Wasn't there something about them doing two backend processes per clock cycle with the 3080?
 

dgianstefani

TPU Proofreader
Staff member
Joined
Dec 29, 2017
Messages
4,241 (1.84/day)
Location
Swansea, Wales
System Name Silent
Processor Ryzen 7800X3D @ 5.15ghz BCLK OC, TG AM5 High Performance Heatspreader
Motherboard ASUS ROG Strix X670E-I, chipset fans removed
Cooling Optimus AMD Raw Copper/Plexi, HWLABS Copper 240/40+240/30, D5, 4x Noctua A12x25, Mayhems Ultra Pure
Memory 32 GB Dominator Platinum 6150 MHz 26-36-36-48, 56.6ns AIDA, 2050 FLCK, 160 ns TRFC
Video Card(s) RTX 3080 Ti Founders Edition, Conductonaut Extreme, 18 W/mK MinusPad Extreme, Corsair XG7 Waterblock
Storage Intel Optane DC P1600X 118 GB, Samsung 990 Pro 2 TB
Display(s) 32" 240 Hz 1440p Samsung G7, 31.5" 165 Hz 1440p LG NanoIPS Ultragear
Case Sliger SM570 CNC Aluminium 13-Litre, 3D printed feet, custom front panel with pump/res combo
Audio Device(s) Audeze Maxwell Ultraviolet, Razer Nommo Pro
Power Supply Corsair SF750 Platinum, transparent custom cables, Sentinel Pro 1500 Online Double Conversion UPS
Mouse Razer Viper Pro V2 Mercury White w/Tiger Ice Skates & Pulsar Supergrip tape
Keyboard Wooting 60HE+ module, TOFU Redux Burgundy w/brass weight, Prismcaps White & Jellykey, lubed/modded
Software Windows 10 IoT Enterprise LTSC 19053.3803
Benchmark Scores Legendary
Interesting. I'm tempted to use my rig for my studies, since I use medical and scientific software in my course at Swansea Medical School.
 
Joined
Mar 18, 2008
Messages
5,717 (0.97/day)
I eyeball that at about 500% clock-watching time.
That's a huge number, especially if revenue is linked to the analysis.
Wasn't there something about them doing two backend processes per clock cycle with the 3080?


Yeah hence the huge jump in CUDA counts

Interesting. I'm tempted to use my rig for my study since I use medical and scientific software in my course at Swansea Medical school.
Check out the NVIDIA developer site here: https://developer.nvidia.com/nvbio
 
Joined
Apr 24, 2020
Messages
2,563 (1.75/day)
Yeah hence the huge jump in CUDA counts

Not that the 3080 isn't impressive, but NVIDIA is very misleading about its CUDA core counts. They changed what a "CUDA core" means between the 20-series and the 30-series.

It used to be 64 CUDA cores per SM, and now it's 128 CUDA cores per SM, but each of the four partitions in an SM still only issues 32 threads per clock tick. The threads are now superscalar, though, so it's a bit like the Pentium vs. the 486 (the Pentium could run two instructions per clock tick, but the 486 could only run one). As such, it's more like one core that's faster sometimes, when the integer and floating-point pipelines can stay filled.

Switching to superscalar (going from 1 instruction/clock to 2 instructions/clock) isn't too big a jump in the grand scheme of things. In theory, it doubles speed. But in practice, you just end up waiting on RAM more often. Besides, 1 instruction/clock is already kind of ridiculous given what needs to happen (decoding, grabbing registers, figuring out dependencies, etc.). It takes a tightly pipelined core to reach a theoretical 1 instruction/clock, so a theoretical 2 instructions/clock isn't much of a difference. You're already taking advantage of instruction-level parallelism to get to 1 instruction/clock.
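To put a rough number on the "waiting on RAM" point, here's a toy Amdahl-style estimate in Python. It assumes dual issue only speeds up the part of the run time that isn't spent waiting on memory. The `memory_fraction` parameter and the sample values are made up for illustration, not measured on any GPU.

```python
def speedup_from_dual_issue(memory_fraction: float, issue_width: int = 2) -> float:
    """Amdahl-style estimate: only the non-memory share of run time shrinks."""
    if not 0.0 <= memory_fraction <= 1.0:
        raise ValueError("memory_fraction must be between 0 and 1")
    compute_fraction = 1.0 - memory_fraction
    # Memory wait time is unchanged; compute time is divided by the issue width.
    return 1.0 / (memory_fraction + compute_fraction / issue_width)


# Hypothetical shares of run time spent waiting on memory.
for m in (0.0, 0.25, 0.5, 0.75):
    print(f"memory-bound fraction {m:.2f}: {speedup_from_dual_issue(m):.2f}x")
```

With no memory waits you get the full 2x, but once half the time goes to memory the estimate drops to about 1.33x.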

Back in the 386 days, before pipelines were common, you'd need 12 clock ticks per instruction as you went through every stage of instruction execution (fetch, decode, execute, etc.). One clock tick per instruction became possible thanks to pipelines, but then you run into read/write hazards (which GPUs still have to resolve today). You can only reach 1 instruction/clock (or 2 instructions/clock) if you're free of read/write hazards. Same as always.
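A simplified cycle count shows why pipelining gets you close to 1 instruction/clock and why hazards eat into it. This is a textbook-style sketch, not a model of any real CPU or GPU: it reuses the 12-tick figure above as the pipeline depth, and the instruction and stall counts are made up.

```python
def cycles_unpipelined(n_instructions: int, stages: int) -> int:
    """Each instruction walks through every stage before the next one starts."""
    return n_instructions * stages


def cycles_pipelined(n_instructions: int, stages: int, stall_cycles: int = 0) -> int:
    """Fill the pipeline once, then retire one instruction per clock, plus hazard stalls."""
    if n_instructions == 0:
        return 0
    return stages + (n_instructions - 1) + stall_cycles


n, depth = 1000, 12  # hypothetical workload; depth taken from the 12-tick figure above
print("unpipelined:         ", cycles_unpipelined(n, depth) / n, "clocks/instruction")
print("pipelined, no stalls:", cycles_pipelined(n, depth) / n, "clocks/instruction")
print("pipelined, 200 stall:", cycles_pipelined(n, depth, stall_cycles=200) / n, "clocks/instruction")
```

The hazard-free case lands just above 1 clock per instruction, and every stall cycle pushes it further away from that.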
 
Joined
Jan 5, 2006
Messages
17,830 (2.67/day)
System Name AlderLake / Laptop
Processor Intel i7 12700K P-Cores @ 5Ghz / Intel i3 7100U
Motherboard Gigabyte Z690 Aorus Master / HP 83A3 (U3E1)
Cooling Noctua NH-U12A 2 fans + Thermal Grizzly Kryonaut Extreme + 5 case fans / Fan
Memory 32GB DDR5 Corsair Dominator Platinum RGB 6000MHz CL36 / 8GB DDR4 HyperX CL13
Video Card(s) MSI RTX 2070 Super Gaming X Trio / Intel HD620
Storage Samsung 980 Pro 1TB + 970 Evo 500GB + 850 Pro 512GB + 860 Evo 1TB x2 / Samsung 256GB M.2 SSD
Display(s) 23.8" Dell S2417DG 165Hz G-Sync 1440p / 14" 1080p IPS Glossy
Case Be quiet! Silent Base 600 - Window / HP Pavilion
Audio Device(s) Panasonic SA-PMX94 / Realtek onboard + B&O speaker system / Harman Kardon Go + Play / Logitech G533
Power Supply Seasonic Focus Plus Gold 750W / Powerbrick
Mouse Logitech MX Anywhere 2 Laser wireless / Logitech M330 wireless
Keyboard RAPOO E9270P Black 5GHz wireless / HP backlit
Software Windows 11 / Windows 10
Benchmark Scores Cinebench R23 (Single Core) 1936 @ stock Cinebench R23 (Multi Core) 23006 @ stock