
Should we have a "benchmarks" type thread for LLMs?

Poll: Local LLM Benchmark table (total voters: 21)

Easy Rhino

Linux Advocate
Staff member
Joined
Nov 13, 2006
Messages
15,675 (2.34/day)
Location
Mid-Atlantic
System Name Desktop
Processor i5 13600KF
Motherboard AsRock B760M Steel Legend Wifi
Cooling Noctua NH-U9S
Memory 4x 16 Gb Gskill S5 DDR5 @6000
Video Card(s) Gigabyte Gaming OC 6750 XT 12GB
Storage WD_BLACK 4TB SN850x
Display(s) Gigabyte M32U
Case Corsair Carbide 400C
Audio Device(s) On Board
Power Supply EVGA Supernova 650 P2
Mouse MX Master 3s
Keyboard Logitech G915 Wireless Clicky
Software Fedora KDE Spin
I can't seem to find any good data on tokens per second for local LLMs across different graphics cards.

I am proposing something like a table that includes: TPU username, graphics card and VRAM, graphics driver and version, platform (Ollama, LM Studio, etc.), CPU, RAM, operating system and version, model, tokens per second, and date of benchmark.
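
As a rough sketch of what one row of such a table might look like in code, assuming Python and field names invented here for illustration (nothing in the thread fixes them):

```python
from dataclasses import dataclass, asdict
from datetime import date


@dataclass
class BenchmarkEntry:
    """One row of the proposed table; all field names are hypothetical."""
    tpu_username: str
    gpu: str
    vram_gb: int
    driver: str              # graphics driver and version
    platform: str            # e.g. "ollama", "lm studio"
    cpu: str
    ram_gb: int
    os: str                  # operating system and version
    model: str
    tokens_per_second: float
    benchmark_date: date


def to_row(entry: BenchmarkEntry) -> dict:
    """Flatten an entry into a plain dict, with the date as ISO text."""
    row = asdict(entry)
    row["benchmark_date"] = entry.benchmark_date.isoformat()
    return row
```

Keeping the date as a real `date` until export makes sorting by benchmark date straightforward; any extra fields people suggest would just become more attributes.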

Are there any other relevant fields to add? Are other people interested in this kind of information?
 

Solaris17

Super Dainty Moderator
Staff member
Joined
Aug 16, 2005
Messages
27,498 (3.84/day)
Location
Alabama
System Name RogueOne
Processor Xeon W9-3495x
Motherboard ASUS w790E Sage SE
Cooling SilverStone XE360-4677
Memory 128gb Gskill Zeta R5 DDR5 RDIMMs
Video Card(s) MSI SUPRIM Liquid X 4090
Storage 1x 2TB WD SN850X | 2x 8TB GAMMIX S70
Display(s) 49" Philips Evnia OLED (49M2C8900)
Case Thermaltake Core P3 Pro Snow
Audio Device(s) Moondrop S8's on schitt Gunnr
Power Supply Seasonic Prime TX-1600
Mouse Razer Viper mini signature edition (mercury white)
Keyboard Monsgeek M3 Lavender, Moondrop Luna lights
VR HMD Quest 3
Software Windows 11 Pro Workstation
Benchmark Scores I dont have time for that.
I was thinking about doing something like this so I’m for it.
 

Easy Rhino

I think the challenge will be verifying the results. People could just post crap because reasons. It would be neat if projects like Ollama had a way to upload results to a database.
 

Solaris17

I think the challenge will be verifying the results. People could just post crap because reasons. It would be neat if projects like Ollama had a way to upload results to a database.

I was just thinking that. I was going to maybe make something bundled that just runs the benchmarks, but I'm legit stupid when it comes to DB APIs. We should def think-tank this though. I was going to start a thread of my own literally last week, specifically for the Intel cards using AI Playground, just for the lols, but I had my teeth yanked, so I'm on hella drugs right now and don't feel like the song and dance.

I think a good place to start is reading some of the GPU reviews where w1zz displays some AI stats to see what is important.

Off the top of my head (imo)

- Tokens total (some models and services charge by tokens total)
- Tokens/s (tokens/s determines readability, and whether you're talking to an adult at normal speech cadence or someone that barely knows how to read.)
- Time to completion
- VRAM Usage
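
The four numbers above are related: tokens/s is just total tokens divided by time to completion. A minimal sketch, assuming Python and hypothetical names (`RunMetrics`, `summarize`), with VRAM usage taken as a value measured elsewhere by the GPU vendor's tooling:

```python
from dataclasses import dataclass


@dataclass
class RunMetrics:
    tokens_total: int
    seconds_to_completion: float
    peak_vram_mib: float  # assumed to be measured outside this code

    @property
    def tokens_per_second(self) -> float:
        if self.seconds_to_completion <= 0:
            raise ValueError("seconds_to_completion must be positive")
        return self.tokens_total / self.seconds_to_completion


def summarize(runs: list[RunMetrics]) -> dict:
    """Average speed and time over repeated runs; report the highest VRAM seen."""
    if not runs:
        raise ValueError("need at least one run")
    n = len(runs)
    return {
        "runs": n,
        "mean_tokens_per_second": sum(r.tokens_per_second for r in runs) / n,
        "mean_seconds": sum(r.seconds_to_completion for r in runs) / n,
        "peak_vram_mib": max(r.peak_vram_mib for r in runs),
    }
```

Averaging over a few runs is a design choice here, not something the list requires; a single run would work with the same structure.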

I was also thinking "Accuracy". In my case I was going to throw it a known problem: have it make a script I already know how to write as a human, and see how well it does at replicating it, or at producing something that works, in a vanilla sense, i.e. not telling it to iterate on or improve my own.

Accuracy is a little harder though, and I was in the middle of thinking about it when I started taking painkillers. Love the idea though.
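
One possible way to put a number on "something that works", sketched in Python under the assumption that the known problem can be reduced to a function with known input/output pairs (the `pass_rate` helper is hypothetical, and real scripts would need a more careful harness):

```python
from typing import Any, Callable, Sequence


def pass_rate(candidate: Callable[..., Any],
              cases: Sequence[tuple[tuple, Any]]) -> float:
    """Fraction of known (args, expected) cases the candidate gets right."""
    if not cases:
        raise ValueError("need at least one test case")
    passed = 0
    for args, expected in cases:
        try:
            if candidate(*args) == expected:
                passed += 1
        except Exception:
            pass  # a crash simply counts as a failed case
    return passed / len(cases)
```

The model's output would be loaded as `candidate` and scored against cases taken from the human-written script, so the score reflects behavior rather than how closely the text matches.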
 
Joined
Jun 24, 2017
Messages
202 (0.07/day)
Local LLMs should be the way to go in most cases. TPU should encourage local testing on non-workstation PCs.
 

Solaris17

Local LLMs should be the way to go in most cases. TPU should encourage local testing on non-workstation PCs.

I don't think any of the testing that is done in the GPU reviews is on workstation PCs. We actually haven't done any testing on workstation-class gear since forever. I think Easy is talking about member-contributed results here, not a new way to review.
 
Joined
Jun 24, 2017
Messages
202 (0.07/day)
I don't think any of the testing that is done in the GPU reviews is on workstation PCs. We actually haven't done any testing on workstation-class gear since forever. I think Easy is talking about member-contributed results here, not a new way to review.

My bad, what I meant is TPU is the perfect place to do tests of local LLMs for non-workstation hardware.

Smaller sites don't have the resources or the variety of hardware TPU has.

For workstation-class hardware there are ServeTheHome, Level1Techs, or maybe the articles from Puget. I must say it's difficult to find good "workstation" review places, or at least ones as "wide" as TPU, since brands (HP, Dell, etc.) don't usually want their hardware to be exposed and non-branded workstations are such a niche market.
 

Solaris17

My bad, what I meant is TPU is the perfect place to do tests of local LLMs for non-workstation hardware.

Smaller sites don't have the resources or the variety of hardware TPU has.

For workstation-class hardware there are ServeTheHome, Level1Techs, or maybe the articles from Puget. I must say it's difficult to find good "workstation" review places, or at least ones as "wide" as TPU, since brands (HP, Dell, etc.) don't usually want their hardware to be exposed and non-branded workstations are such a niche market.

I always dreamed that one day TPU would expand their review market; I would jump on signing up to do WS/Enterprise gear. Maybe someday.

As far as doing more AI stuff, I hope that given its proliferation more tests will be included in things like GPU reviews. The fact that it was even added at all is a good sign; maybe one day there will be more targeted tests in an official capacity from w1zz.

We're just peons though, so humble threads are the best we can do.
 
Joined
Feb 21, 2006
Messages
2,315 (0.33/day)
Location
Toronto, Ontario
System Name The Expanse
Processor AMD Ryzen 7 9800X3D
Motherboard Asus Prime X670E-Pro Wifi BIOS 3222 AGESA to PI 1.2.0.3a
Cooling Corsair H150i Elite LCD XT
Memory 64GB G.SKILL Trident Z5 Neo RGB DDR5 6000 CL 30-40-40-96 1T
Video Card(s) XFX Radeon RX 7900 XTX Magnetic Air (25.3.1)
Storage WD SN850X 2TB / Corsair MP600 1TB / Samsung 860Evo 1TB x2 Raid 0 / Asus NAS AS1004T V2 20TB
Display(s) LG 34GP83A-B 34 Inch 21: 9 UltraGear Curved QHD (3440 x 1440) 1ms Nano IPS 160Hz
Case Fractal Design Meshify S2
Audio Device(s) Creative X-Fi + Logitech Z-5500 + HS80 Wireless
Power Supply Corsair AX850 Titanium
Mouse Corsair Dark Core RGB SE
Keyboard Corsair K100
Software Windows 10 Pro x64 22H2
Benchmark Scores https://valid.x86.fr/0412jp https://browser.geekbench.com/v6/cpu/11073923
Good idea, I would participate.
 
Joined
May 10, 2023
Messages
760 (1.12/day)
Location
Brazil
Processor 5950x
Motherboard B550 ProArt
Cooling Fuma 2
Memory 4x32GB 3200MHz Corsair LPX
Video Card(s) 2x RTX 3090
Display(s) LG 42" C2 4k OLED
Power Supply XPG Core Reactor 850W
Software I use Arch btw
Something like Vlad's benchmark for SD would be pretty handy and doable:

Maybe use llama-bench as the baseline and build a DB on top of it. Don't even need to get fancy with it: any NoSQL DB or just your regular RDBMS (if it's already in place) will work.
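
To show how little is needed on the DB side, here is a sketch using Python's built-in `sqlite3` as the "regular RDBMS". The table layout is made up for illustration, and the JSON keys (`model_filename`, `n_prompt`, `n_gen`, `avg_ts`) are what llama-bench's JSON output is assumed to contain; they should be checked against the version actually in use.

```python
import json
import sqlite3

# Hypothetical schema; column names are illustrative only.
SCHEMA = """
CREATE TABLE IF NOT EXISTS results (
    id INTEGER PRIMARY KEY,
    username TEXT NOT NULL,
    model TEXT NOT NULL,
    n_prompt INTEGER,
    n_gen INTEGER,
    tokens_per_second REAL NOT NULL
)
"""


def store_results(conn: sqlite3.Connection, username: str, bench_json: str) -> int:
    """Insert every test row from a benchmark JSON array; return the row count."""
    rows = json.loads(bench_json)
    conn.execute(SCHEMA)
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT INTO results (username, model, n_prompt, n_gen, tokens_per_second) "
            "VALUES (?, ?, ?, ?, ?)",
            [
                (username, r["model_filename"], r.get("n_prompt"),
                 r.get("n_gen"), r["avg_ts"])
                for r in rows
            ],
        )
    return len(rows)
```

Swapping SQLite for whatever database is already in place would mostly mean changing the connection and the placeholder syntax.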
 

bug

Joined
May 22, 2015
Messages
14,221 (3.96/day)
Processor Intel i5-12600k
Motherboard Asus H670 TUF
Cooling Arctic Freezer 34
Memory 2x16GB DDR4 3600 G.Skill Ripjaws V
Video Card(s) EVGA GTX 1060 SC
Storage 500GB Samsung 970 EVO, 500GB Samsung 850 EVO, 1TB Crucial MX300 and 2TB Crucial MX500
Display(s) Dell U3219Q + HP ZR24w
Case Raijintek Thetis
Audio Device(s) Audioquest Dragonfly Red :D
Power Supply Seasonic 620W M12
Mouse Logitech G502 Proteus Core
Keyboard G.Skill KM780R
Software Arch Linux + Win10
I can't seem to find any good data on tokens per second for local LLMs across different graphics cards.

I am proposing something like a table that includes: TPU username, graphics card and VRAM, graphics driver and version, platform (Ollama, LM Studio, etc.), CPU, RAM, operating system and version, model, tokens per second, and date of benchmark.

Are there any other relevant fields to add? Are other people interested in this kind of information?
All content on TPU is static (i.e. no animation, no dynamic HTML). How would you make a table with that many columns useful/readable?
 

Easy Rhino

Something like Vlad's benchmark for SD would be pretty handy and doable:

Maybe use llama-bench as the baseline and build a DB on top of it. Don't even need to get fancy with it: any NoSQL DB or just your regular RDBMS (if it's already in place) will work.

I did not know that program existed. If it could be extended to support output to a URL, I could set up a database and provide a RESTful endpoint to publish the benchmark results.
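
For a sense of scale, a receiving endpoint could be as small as the sketch below, which uses only Python's standard library. The `/results` path, the required field names, and the in-memory `RESULTS` list (standing in for the database) are all assumptions made for the example.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

REQUIRED = ("username", "gpu", "model", "tokens_per_second")  # hypothetical fields
RESULTS: list[dict] = []  # stand-in for a real database


def accept_submission(body: bytes) -> tuple[int, dict]:
    """Validate one JSON submission; return (HTTP status, response payload)."""
    try:
        data = json.loads(body)
    except ValueError:
        return 400, {"error": "body is not valid JSON"}
    if not isinstance(data, dict):
        return 400, {"error": "expected a JSON object"}
    missing = [k for k in REQUIRED if k not in data]
    if missing:
        return 422, {"error": "missing fields", "fields": missing}
    tps = data["tokens_per_second"]
    if isinstance(tps, bool) or not isinstance(tps, (int, float)) or tps <= 0:
        return 422, {"error": "tokens_per_second must be a positive number"}
    RESULTS.append(data)
    return 201, {"id": len(RESULTS)}


class ResultsHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/results":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        status, payload = accept_submission(self.rfile.read(length))
        out = json.dumps(payload).encode()
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(out)))
        self.end_headers()
        self.wfile.write(out)


def serve(port: int = 8000) -> None:
    """Run the endpoint locally until interrupted."""
    HTTPServer(("127.0.0.1", port), ResultsHandler).serve_forever()
```

Keeping the validation in `accept_submission`, separate from the HTTP handler, means the rules can be tested without starting a server.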
 
Joined
Jan 3, 2021
Messages
3,985 (2.59/day)
Location
Slovenia
Processor i5-6600K
Motherboard Asus Z170A
Cooling some cheap Cooler Master Hyper 103 or similar
Memory 16GB DDR4-2400
Video Card(s) IGP
Storage Samsung 850 EVO 250GB
Display(s) 2x Oldell 24" 1920x1200
Case Bitfenix Nova white windowless non-mesh
Audio Device(s) E-mu 1212m PCI
Power Supply Seasonic G-360
Mouse Logitech Marble trackball, never had a mouse
Keyboard Key Tronic KT2000, no Win key because 1994
Software Oldwin
All content on TPU is static (i.e. no animation, no dynamic HTML). How would you make a table with that many columns useful/readable?
TPU's CPU, GPU, and SSD databases are already close to what would be needed and usable. Their search pages are dynamic HTML, even if the lists are filtered on the server, not the browser.
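
Server-side filtering of a wide table can be sketched in a few lines of Python: filter the rows first, then show only the columns the viewer asked for. The helper names and row keys below are hypothetical and have nothing to do with how TPU's databases are actually built.

```python
def filter_rows(rows: list[dict], **criteria) -> list[dict]:
    """Keep rows matching every criterion (text is compared case-insensitively)."""
    def matches(row: dict) -> bool:
        for key, wanted in criteria.items():
            have = row.get(key)
            if isinstance(have, str) and isinstance(wanted, str):
                if have.lower() != wanted.lower():
                    return False
            elif have != wanted:
                return False
        return True
    return [r for r in rows if matches(r)]


def render_table(rows: list[dict], columns: list[str]) -> str:
    """Render only the chosen columns as an aligned plain-text table."""
    cells = [[str(r.get(c, "")) for c in columns] for r in rows]
    widths = [
        max([len(c)] + [len(row[i]) for row in cells])
        for i, c in enumerate(columns)
    ]
    lines = ["  ".join(c.ljust(w) for c, w in zip(columns, widths))]
    for row in cells:
        lines.append("  ".join(v.ljust(w) for v, w in zip(row, widths)))
    return "\n".join(line.rstrip() for line in lines)
```

With filtering done before rendering, a table with a dozen fields can still be shown as two or three columns at a time.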
 
Joined
May 10, 2023
Messages
760 (1.12/day)
I did not know that program existed. If it could be extended to support output to a URL, I could set up a database and provide a RESTful endpoint to publish the benchmark results.
That should be totally doable. Any way that you can think of to validate results with a TPU account or something like that?
 

Easy Rhino

That should be totally doable. Any way that you can think of to validate results with a TPU account or something like that?

Yes, if the project is forked then a custom binary could be created specifically for TPU. Unfortunately I do not know C++.
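
One way a TPU-specific build could tie a submission to an account is to sign the result with a per-user key, sketched here in Python with the standard `hmac` module. The per-user key and how it would be issued are assumptions. Note that this only shows who submitted a result and that it was not altered in transit; it cannot stop someone from feeding made-up numbers into the signer.

```python
import hashlib
import hmac
import json


def sign_result(result: dict, user_key: bytes) -> str:
    """HMAC-SHA256 over a canonical JSON encoding of the result."""
    payload = json.dumps(result, sort_keys=True, separators=(",", ":")).encode()
    return hmac.new(user_key, payload, hashlib.sha256).hexdigest()


def verify_result(result: dict, signature: str, user_key: bytes) -> bool:
    """Server-side check that the signature matches the submitted result."""
    return hmac.compare_digest(sign_result(result, user_key), signature)
```

Sorting the keys before signing means the client and server agree on the bytes being signed even if they build the dict in a different order.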
 
Joined
May 10, 2023
Messages
760 (1.12/day)
Yes, if the project is forked then a custom binary could be created specifically for TPU. Unfortunately I do not know C++.
I can lend a hand if you want.
Any language you or the TPU team would be comfortable with? The bench itself could also be just wrapped (in the sense of invoking it and storing the results) within another language to make it easier for others to work with.
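
Wrapping could look roughly like this in Python: invoke the binary, capture its output, and hand back parsed rows for storage. This is a sketch that assumes `llama-bench` is on the PATH and that `-m <model>` with `-o json` prints a JSON array to stdout; the `runner` parameter is a hypothetical hook so the wrapper can be exercised without the binary installed.

```python
import json
import subprocess
from typing import Callable, Sequence


def _run(cmd: Sequence[str]) -> str:
    """Run a command and return its stdout; raises if the exit code is non-zero."""
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout


def run_bench(model_path: str,
              binary: str = "llama-bench",
              runner: Callable[[Sequence[str]], str] = _run) -> list[dict]:
    """Invoke the benchmark once and return its parsed JSON rows."""
    output = runner([binary, "-m", model_path, "-o", "json"])
    rows = json.loads(output)
    if not isinstance(rows, list):
        raise ValueError("expected a JSON array of results")
    return rows
```

Since the wrapper never touches the benchmark's internals, no C++ is needed, and the same approach would work from any language that can spawn a process.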
 