
AMD FireStream 9250 Breaks the 1 Teraflop Barrier

FireStream seems to be a highly programmable processor - what I'm saying is it can be programmed for just about anything, while the CPU, which isn't doing that work, just sits there.
 
I sooo wanted the CELL to come to the desktop. If only Windows supported PPC, it would have been possible. After all, the CELL is based on the PowerPC machine architecture.

So did I. I wish they actually sold it to people along with the motherboards, because SUSE, Fedora and Ubuntu come with Cell development tools for programs etc. I might just pick up a PS3, mod it to have a huge HDD, and use it as a desktop running Linux.
 
Yes. After all, distros such as YDL are based on PPC-supporting kernels. Close to every kind of OS supports (or did support in the past) PPC. IIRC, early versions of Windows NT did support PPC; Microsoft scrapped the support after NT 4.0.
 
Cell isn't x86 (is it???) and isn't an out-of-order CPU, so performance would suck in day-to-day tasks like games and Windows :S......
 
No, CELL is not an x86 CPU. It's an in-order CPU based on the PowerPC machine architecture. Linux supports PPC (a distro supporting PPC ships a PPC build of the kernel).
 
RapidMind has reported a 55x speedup over the CPU alone on binomial options pricing calculators. The comparison is versus QuantLib running on a single core of a dual-core AMD Opteron™ 2352 processor on a Tyan S2915 w/ Win XP 32 (Palomar Workstation from Colfax).
The Neurala comparison is against a dual AMD Opteron 248 system (using only a single processor for comparison) w/ 2GB SDRAM DDR400 ECC dual channel and SUSE Linux 10 (custom kernel).
Mercury benchmark system details: Intel Core 2 6820 @ 2.13 GHz w/ 3GB of RAM, FireStream 9250 stream processor.
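For context on what's actually being benchmarked: binomial options pricing walks a recombining price tree backwards from expiry, doing the same multiply-add at every node. Here's a minimal CPU-only sketch in Python (a textbook Cox-Ross-Rubinstein pricer with made-up parameters; it is not RapidMind's or QuantLib's actual code):

```python
import math

def binomial_call(S, K, T, r, sigma, steps):
    """Price a European call on a Cox-Ross-Rubinstein binomial tree."""
    dt = T / steps
    u = math.exp(sigma * math.sqrt(dt))   # up factor
    d = 1.0 / u                           # down factor
    p = (math.exp(r * dt) - d) / (u - d)  # risk-neutral up probability
    disc = math.exp(-r * dt)

    # Option values at expiry for every terminal node
    values = [max(S * u**j * d**(steps - j) - K, 0.0) for j in range(steps + 1)]

    # Roll the tree back one level at a time
    for _ in range(steps):
        values = [disc * (p * values[j + 1] + (1 - p) * values[j])
                  for j in range(len(values) - 1)]
    return values[0]

# Example: $100 stock, $100 strike, 1 year, 5% rate, 20% volatility
print(binomial_call(100, 100, 1.0, 0.05, 0.20, steps=1000))
```

Every node within a tree level is independent of its neighbours, so the rollback is exactly the kind of wide, uniform arithmetic a 320-SP stream processor can chew through in parallel; hence the pitch to financial shops.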

You know, I wish the manufacturers would stop number-stacking and provide actual results. lol, it's nice to see an evolution toward the future at half the price of the previous generation, but choosing peak numbers to showcase it isn't going to impress the people who run the research projects; they're not your average Joe consumer.
 
The biggest evaluation of all this is the Folding@Home project. F@H has supported ATI GPUs for GPU computing for ages, and the numbers did translate to results.
 
So with the 9170 still costing more even after this announcement, does that mean this new 9250 isn't as powerful? Congrats AMD on the breakthrough, but that price is a tad steep for a co-processor. I can see pharmaceutical and DNA/RNA synthesis companies using these.
 
For anyone following this thread, read http://www.rapidmind.net/pdfs/FinancialDataSheet.pdf
Basically, the 55x speedup quoted by AMD is:

1>> A single-core Opteron running an open-source math library, COMPARED TO
2>> The FireStream running an optimized math library SPECIFICALLY designed for financial math by RapidMind.



REAL COMPARISON
1./ Single-core CPU running an inefficient C++ math library
2./ Replace the math library with RapidMind = 2x speedup
3./ Replace the "single core" Opteron with a "single core" Intel Core 2 = 2x speedup
4./ Replace single core with quad core = 4x speedup
(Cumulative: 2 x 2 x 4 = 16x from the CPU side alone.)



So, actually, the REAL COMPARISON should be 55/16 ≈ 3.4x speedup. At a price of $999.

OK, SO LET'S USE A DUAL XEON SYSTEM ALTERNATIVE

5./ Upgrade to a dual-socket mainboard with one extra Xeon, total $500 = 2x speedup

That would give the FireStream a net speedup of about 1.7x, but at a higher cost (the $999 card vs. the ~$500 upgrade), plus the development time associated with using the FireStream SDK, and then having code that could only run on the FireStream. (THERE ARE GOOD SECURITY REASONS TO DO THIS... ESPECIALLY FOR PROPRIETARY FINANCE SOFTWARE.)

IMO, 1.7x the speed of a dual-Xeon workstation is not all that impressive.
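A quick sanity check of that arithmetic, with the 2x/2x/4x factors taken from the list above (they are this post's estimates, not independent measurements):

```python
# Vendor headline: FireStream vs. a single-core Opteron on open-source QuantLib
headline = 55.0

# Factors a fairer CPU baseline would claw back (figures from the post above)
rapidmind_lib = 2.0     # optimized math library instead of plain C++
core2_vs_opteron = 2.0  # faster single core
quad_core = 4.0         # four cores instead of one

fair_cpu_baseline = rapidmind_lib * core2_vs_opteron * quad_core  # = 16x
print(headline / fair_cpu_baseline)        # ~3.4x vs. a quad-core box
print(headline / (fair_cpu_baseline * 2))  # ~1.7x vs. a dual-socket box
```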

******

From looking closer at the hardware of FireStream, it seems to be essentially a GPU card with the "Video" bits removed. You could probably get a regular gaming card to do exactly the same. But I'm sure AMD will "lock" features within the BIOS, just like they do with the FireGL GPUs.

Congrats AMD on the breakthrough, but that price is a tad steep for a co-processor. I can see pharmaceutical and DNA/RNA synthesis companies using these.
I agree, too expensive.
But it's not much of a breakthrough. It's a GPU in wolf's clothing, with an SDK not dissimilar to the CUDA concept.
Smoke and mirrors by AMD.
 
Sure, it looks a little blown out of proportion as it is; but look at the target audience for this capability as well - they've been listening to the blown-out-of-proportion claims of Intel and nVidia for how long now? AMD is coming along with something that does work better, they're just exaggerating it a bit - still, for its market, it's highly competitive, and I think it's great to see AMD being able to bring the goods in at least one field right now.


I'm curious, though, has anyone else noticed that AMD seems to have drastically changed their marketing strategies over the last 3-5 months? It seems to me that they've become a lot more aggressive in their marketing and claims, compared to how they used to be.

They're finally adopting the ruthless attitude of all the other financially successful and stable companies.
 
a lot more aggressive in their marketing and claims
Can you translate that to English please? Choose one of the following:

1./ Bullshit
2./ Lies
3./ Misrepresentation

They're finally adopting the ruthless attitude of all the other financially successful and stable companies
And that one too, please:

A./ No integrity
B./ No ethics
C./ Short-term profit before brand reputation and customer loyalty, a la, fool the customer with 1, 2, 3
 

IDK about all that - ATI still at least provide some kind of basis for their claims, some representation of what they've tested to help support their propaganda - more than I can say for Intel, nVidia, MS, Creative, or any other market leader in the industry.

Sure, recently they might be 'twisting' the truth and stretching it as far as they can, but we're still given some kind of base to look at as well; unlike other companies who spit out propaganda that looks like they waved their voodoo stick over a spreadsheet while swinging chickens.
 
Don't forget Intel and Nvidia were doing that shit for years until the other companies started to step on their toes.
 
AMD might as well retool their GPUs for CPU usage - GPUs have massive FP calc speeds, and they have an x86 license anyway.......

if AMD used their GPUs for CPUs... Intel would be screwed.
 
^ That would be interesting... my question is, "Can it be done?"

Yes. AMD Fusion is a CPU with a GPU embedded. GPU means stream processors.

Even if a GPU of the class of an HD2600 XT (120 SPs) were embedded, theoretically it means an added 50 GFLOPs at least.
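Back-of-the-envelope support for that figure, assuming each SP retires one MADD (2 FLOPs) per clock and using the HD2600 XT's roughly 800 MHz core clock (both assumptions; peak numbers are never reached in practice):

```python
# Peak single-precision throughput = SPs x FLOPs-per-MADD x clock (GHz)
sps = 120            # HD2600 XT shader count
flops_per_clock = 2  # one multiply-add per SP per cycle (assumed)
clock_ghz = 0.8      # HD2600 XT core clock, roughly

print(sps * flops_per_clock * clock_ghz, "peak GFLOPs")  # ~192
```

So "50 GFLOPs at least" is comfortably conservative as a lower bound.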
 
Interesting discussion http://www.simbiosys.ca/blog/2008/05/03/the-fast-and-the-furious-compare-cellbe-gpu-and-fpga/

They (quietly) point out that GPGPUs are fantastic for massively parallel calculations. But for general-purpose mixed math they are awful. Why? Because the processing power and benchmarks we keep reading about are based on calculations that are scalable via parallelization, so that, e.g., ALL 320 stream processors are put to good use.

If you were using the GPGPU to "re-calculate an EXCEL table", then divide performance by 320, since you won't get parallelization there. In such situations a CPU's FPU will PWN the GPGPU.

The GPGPU comes into its own ONLY when using the math library and SDK designed for it... AND when doing things like vector or matrix math of SIMPLE additions, subtractions and multiplications.

An FPU will PWN a GPGPU at trig math, for example.
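To make the "divide by 320" point concrete: parallel-friendly math applies one operation independently to every element, while spreadsheet-style recalculation is a dependency chain where each step waits on the previous one. A rough Python sketch of the two shapes (NumPy's elementwise operations standing in for the stream processors' wide execution):

```python
import numpy as np

a = np.random.rand(1_000_000)
b = np.random.rand(1_000_000)

# Parallel-friendly: an independent multiply-add per element, so
# every lane (stream processor) can work at once.
c = a * b + 1.0

# Serial: each value depends on the one before it, so extra lanes
# are useless. This is the "re-calculate an EXCEL table" case,
# where a CPU's FPU wins.
x = 0.0
for value in a:
    x = x * 0.5 + value
```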
 
The FPU is designed for maths anyway...
Let's hope Fusion will give Phenom the much-needed performance boost.
 
OK, new news. http://www.tgdaily.com/content/view/37970/135/

ClearSpeed's new math co-processor delivers 100 GFLOPs of double-precision math (compared to the FireStream's 200 GFLOPs DP) but at only 12W (compared to the FireStream's 150W).

The ClearSpeed CSX700 is the winner. It also has a better (faster) math library, due to the CSX700 being a much more capable FPU than a GPGPU (which is limited to simpler natives like plus, minus, multiply, etc.).

Downside: $3000
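On performance per watt the gap is stark (both figures are the vendor peak numbers quoted above):

```python
csx700 = 100 / 12       # ~8.3 DP GFLOPs per watt
firestream = 200 / 150  # ~1.3 DP GFLOPs per watt
print(csx700 / firestream)  # ClearSpeed is roughly 6x more efficient
```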
 
An FPU will PWN a GPGPU at trig math, for example.


That's where the specialised SPs that handle both MADD/MUL come into play. 1 in every 5 SPs in the ATI stream architecture is such. Of course, a GPU will never be able to perform out-of-order execution the way an x86 CPU does. A GPU requires you to send it instructions and data far more rapidly than you'd send them to a CPU (where the main memory and the CPU's staged caches pool data). We can put it this way: just as you have SIMD instruction sets (SSE and its successors), they might come up with an instruction set that lets apps exploit the stream processors on a Fusion. Of course, other apps will have to rely on the CPU's FPU.
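A sketch of why that 1-in-5 ratio matters for workload mix (illustrative shapes only, in NumPy; this is not ATI's actual scheduling, and the "one transcendental-capable unit per five SPs" split is as described in the post above):

```python
import numpy as np

a = np.random.rand(1_000_000).astype(np.float32)
b = np.random.rand(1_000_000).astype(np.float32)

# MADD-heavy work: every ALU in a 5-wide bundle can contribute,
# so throughput approaches the peak GFLOPs figure.
madd_heavy = a * b + a

# Transcendental-heavy work: if only 1 ALU in 5 handles sin/cos/exp,
# peak throughput drops to roughly a fifth, and a CPU's FPU closes
# the gap.
trig_heavy = np.sin(a) * np.cos(b)
```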
 
Sure, recently they might be 'twisting' the truth and stretching it as far as they can, but we're still given some kind of base to look at as well; unlike other companies who spit out propaganda that looks like they waved their voodoo stick over a spreadsheet while swinging chickens.
ROFLCOPTERS
 
Cell would be useless because you can't run Windows or Mac on it, and then you have no compatible motherboard with PCI-E slots for expansion, and even then there are things like memory controllers etc.

I think that the Cell would be useless because you'd only be able to run Linux, and what's the point in having a powerful CPU for Linux if all you can run is Doom 3 and Quake 4?
 
Using GPUs for CPUs is stupid; I don't know if they could compute everything quite like a CPU.

Either way, GPUs are a different architecture from CPUs; you'd have to totally redesign the GPU to include cache and memory controllers etc.

I'm not sure why you'd want a math co-processor. Co-processors are useless if you have multi-threading on a CPU and the software is programmed to use it fully.

I'd like to see physics done on a core of a CPU, or have a full single graphics card for physics but be able to add in a cheaper graphics card to take advantage.
 
This is interesting, from AMD:
http://ati.amd.com/technology/streamcomputing/faq.html#5

Will the AMD FireStream SDK work on previous generation hardware?
To run the CAL/Brook+ SDK, you need a platform based on the AMD R600 GPU or later. R600 and newer GPUs are found in ATI Radeon™ HD2400, HD2600, HD2900 and HD3800 graphics boards.

Which applications are best suited to Stream Computing?
Applications best suited to stream computing possess two fundamental characteristics:
A high degree of arithmetic computation per system memory fetch
Computational independence — arithmetic occurs on each processing unit without needing to be checked or verified by or with arithmetic occurring on any other processing unit.

Examples include:
Engineering — fluid dynamics
Mathematics — linear equations, matrix calculations
Simulations — Monte Carlo, molecular modeling, etc.
Financial — options pricing
Biological — protein structure calculations
Imaging — medical image processing
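Of those examples, Monte Carlo is the cleanest illustration of both characteristics: each sample does a pile of arithmetic per memory fetch and never looks at another sample. A minimal sketch (Monte Carlo pricing of the same kind of call option as the binomial example earlier; all parameters are made up):

```python
import numpy as np

S, K, T, r, sigma = 100.0, 100.0, 1.0, 0.05, 0.20
n = 1_000_000  # one independent path per sample

rng = np.random.default_rng(0)
z = rng.standard_normal(n)

# Each terminal price comes from its own random draw alone:
# heavy arithmetic per fetch, zero cross-sample dependencies.
ST = S * np.exp((r - 0.5 * sigma**2) * T + sigma * np.sqrt(T) * z)
payoff = np.maximum(ST - K, 0.0)
print(np.exp(-r * T) * payoff.mean())  # ~10.45, matching the tree pricer
```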
 
Try to read the complete thread and learn something about it all. As for the CELL BE part: stop regarding the CELL as "that which drives the PS3". CELL was/is touted to have general-purpose applications; driving a console is just a part of it. What do you think drives the Sony Bravia? CELL finds applications in several other devices such as display panels, etc. It's a PowerPC-based processor. Had Apple not ditched PPC for x86, you'd probably have the PowerMac (now Mac Pro) running a CELL BE.
 