Engineers Boost Computer Processor Performance By Over 20 Percent

btarunr · Feb 8, 2012

Researchers from North Carolina State University have developed a new technique that allows graphics processing units (GPUs) and central processing units (CPUs) on a single chip to collaborate - boosting processor performance by an average of more than 20 percent.

"Chip manufacturers are now creating processors that have a 'fused architecture,' meaning that they include CPUs and GPUs on a single chip," says Dr. Huiyang Zhou, an associate professor of electrical and computer engineering who co-authored a paper on the research. "This approach decreases manufacturing costs and makes computers more energy efficient. However, the CPU cores and GPU cores still work almost exclusively on separate functions. They rarely collaborate to execute any given program, so they aren't as efficient as they could be. That's the issue we're trying to resolve."

GPUs were initially designed to execute graphics programs, and they are capable of executing many individual functions very quickly. CPUs, or the "brains" of a computer, have less computational power - but are better able to perform more complex tasks.

"Our approach is to allow the GPU cores to execute computational functions, and have CPU cores pre-fetch the data the GPUs will need from off-chip main memory," Zhou says.

"This is more efficient because it allows CPUs and GPUs to do what they are good at. GPUs are good at performing computations. CPUs are good at making decisions and flexible data retrieval."

In other words, CPUs and GPUs fetch data from off-chip main memory at approximately the same speed, but GPUs can execute the functions that use that data more quickly. So, if a CPU determines what data a GPU will need in advance, and fetches it from off-chip main memory, that allows the GPU to focus on executing the functions themselves - and the overall process takes less time.
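As a rough illustration of why this helps, here is a toy latency model. It is a sketch, not the researchers' method: the latency figures, access counts, and the `kernel_cycles` helper are all assumptions made up for illustration. It also adds memory and compute time with no overlap, which overstates the gain on a real GPU that hides latency by switching threads.

```python
# Toy latency model. Every number below is an illustrative assumption,
# not a measurement from the paper.
MEM_LATENCY = 400   # assumed cycles for an off-chip main-memory access
L3_LATENCY = 40     # assumed cycles for a hit in the shared on-chip cache


def kernel_cycles(accesses, prefetched, compute_cycles):
    """Cycles for one kernel when `prefetched` of its `accesses` hit in cache.

    Simplification: memory time and compute time are summed, with no overlap.
    """
    misses = accesses - prefetched
    return compute_cycles + prefetched * L3_LATENCY + misses * MEM_LATENCY


# Same hypothetical kernel, without and with CPU prefetching.
baseline = kernel_cycles(accesses=1000, prefetched=0, compute_cycles=200_000)
assisted = kernel_cycles(accesses=1000, prefetched=900, compute_cycles=200_000)

print(f"baseline: {baseline} cycles, assisted: {assisted} cycles")
print(f"speedup: {baseline / assisted:.2f}x")
```

The size of the speedup here follows entirely from the assumed numbers and says nothing about the 21.4 percent figure reported by the researchers. The point is only that the more GPU accesses the CPU has already pulled on-chip, the fewer full main-memory round trips the GPU pays for.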

In preliminary testing, Zhou's team found that its new approach improved fused processor performance by an average of 21.4 percent.

This approach has not been possible in the past, Zhou adds, because CPUs and GPUs were located on separate chips.

The paper, "CPU-Assisted GPGPU on Fused CPU-GPU Architectures," will be presented Feb. 27 at the 18th International Symposium on High Performance Computer Architecture, in New Orleans. The paper was co-authored by NC State Ph.D. students Yi Yang and Ping Xiang, and by Mike Mantor of Advanced Micro Devices (AMD). The research was funded by the National Science Foundation and AMD.

The paper abstract follows.

"CPU-Assisted GPGPU on Fused CPU-GPU Architectures"

Authors: Yi Yang, Ping Xiang, Huiyang Zhou, North Carolina State University; Mike Mantor, Advanced Micro Devices

Presented: Feb. 27, 18th International Symposium on High Performance Computer Architecture, New Orleans

Abstract: This paper presents a novel approach to utilize the CPU resource to facilitate the execution of GPGPU programs on fused CPU-GPU architectures. In our model of fused architectures, the GPU and the CPU are integrated on the same die and share the on-chip L3 cache and off-chip memory, similar to the latest Intel Sandy Bridge and AMD accelerated processing unit (APU) platforms. In our proposed CPU-assisted GPGPU, after the CPU launches a GPU program, it executes a pre-execution program, which is generated automatically from the GPU kernel using our proposed compiler algorithms and contains memory access instructions of the GPU kernel for multiple threadblocks. The CPU pre-execution program runs ahead of GPU threads because (1) the CPU pre-execution thread only contains memory fetch instructions from GPU kernels and not floating-point computations, and (2) the CPU runs at higher frequencies and exploits higher degrees of instruction-level parallelism than GPU scalar cores. We also leverage the prefetcher at the L2-cache on the CPU side to increase the memory traffic from CPU. As a result, the memory accesses of GPU threads hit in the L3 cache and their latency can be drastically reduced. Since our pre-execution is directly controlled by user-level applications, it enjoys both high accuracy and flexibility. Our experiments on a set of benchmarks show that our proposed preexecution improves the performance by up to 113% and 21.4% on average.
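The pre-execution idea in the abstract can be sketched as a small simulation. This is a hedged illustration only: the vector-add kernel, the constants, and the functions `kernel_addresses`, `cpu_pre_execute`, and `gpu_run_block` are hypothetical, the shared L3 cache is modeled as an unbounded set of cache-line numbers, and the paper's compiler algorithms are not reproduced.

```python
LINE_BYTES = 64        # assumed cache-line size
ELEM_BYTES = 4         # assumed 32-bit float elements
BLOCK_THREADS = 256    # assumed threads per thread block


def kernel_addresses(tid, base_a, base_b):
    """Byte addresses loaded by thread `tid` of a hypothetical vector-add kernel."""
    return [base_a + tid * ELEM_BYTES, base_b + tid * ELEM_BYTES]


def cpu_pre_execute(blocks, base_a, base_b, l3):
    """Memory-access-only slice of the kernel: touch the lines, skip the math."""
    for block in blocks:
        first = block * BLOCK_THREADS
        # One access per cache line is enough, so step by elements per line.
        for tid in range(first, first + BLOCK_THREADS, LINE_BYTES // ELEM_BYTES):
            for addr in kernel_addresses(tid, base_a, base_b):
                l3.add(addr // LINE_BYTES)


def gpu_run_block(block, base_a, base_b, l3):
    """Return (hits, misses) in the shared cache for one thread block."""
    hits = misses = 0
    first = block * BLOCK_THREADS
    for tid in range(first, first + BLOCK_THREADS):
        for addr in kernel_addresses(tid, base_a, base_b):
            line = addr // LINE_BYTES
            if line in l3:
                hits += 1
            else:
                misses += 1
                l3.add(line)  # a miss brings the line into the cache
    return hits, misses


l3 = set()                       # shared L3, modeled as a set of line numbers
base_a, base_b = 0, 1 << 20      # assumed line-aligned array base addresses
cpu_pre_execute(range(4), base_a, base_b, l3)   # CPU runs ahead for blocks 0-3

warm = gpu_run_block(0, base_a, base_b, l3)     # block covered by pre-execution
cold = gpu_run_block(8, base_a, base_b, l3)     # block the CPU never touched
print("pre-executed block (hits, misses):", warm)
print("untouched block    (hits, misses):", cold)
```

In this model the pre-executed block never goes to main memory, while the untouched block misses once per cache line. A real cache has finite capacity, so a CPU thread that runs too far ahead could evict lines before the GPU uses them; the sketch ignores that.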


btarunr · Feb 8, 2012

Many Thanks to tigger for the tip.

FreedomEclipse · Feb 8, 2012

for a moment, i thought there was going to be hope for BD

/troll

NC37 · Feb 8, 2012

Saw this coming/predicted it even before APUs came out. When NV showcased using GPUs for CPU tasks years ago...it was like just one massive hint of where future tech was going. But can AMD capitalize on it? Curious to see. Intel can utilize the same idea but their GPU tech is so far behind that I could see them leveraging the CPU side even more to compensate. So then it is a matter of how far AMD can take it to offset their weakness on the x86.

Either way, forces both companies to innovate. Innovation is good!!

erixx · Feb 8, 2012

Ha! Wait a second! Didn't they say (and we believed it...) that we've had this feature since we installed our first PhysX-enabled video card? hohoho! HOHOHOHO!

TheoneandonlyMrK · Feb 8, 2012

NC37 said:
Saw this coming/predicted it even before APUs came out. When NV showcased using GPUs for CPU tasks years ago...it was like just one massive hint of where future tech was going. But can AMD capitalize on it? Curious to see. Intel can utilize the same idea but their GPU tech is so far behind that I could see them leveraging the CPU side even more to compensate. So then it is a matter of how far AMD can take it to offset their weakness on the x86.

Either way, forces both companies to innovate. Innovation is good!!

+1

but arm and nvidia imho make this more than a two-horse race from here on, so i am liking AMD's open standards policy regarding HSA, as hopefully most innovators will at least try and get those standards working across platforms, but i've a fiver says nvidia make up some more stuff only they can use.

naoan · Feb 8, 2012

This stuff would probably remain as an abstract unless AMD took the aggressive stance.

TheoneandonlyMrK · Feb 8, 2012

naoan said:
This stuff would probably remain as an abstract unless AMD took the aggressive stance.

hopefully, I prefer open standards that everybody works to; that way the devs have to work harder to make their chip better than others, rather than trying to differentiate with underused additional features that need to be specifically tailored for

and just when you start to think your pc might last a while and all, tut, by next year i'll be looking at mine with that hmmmm upgrade time eye

D4S4 · Feb 8, 2012

this looks very good for amd in the coming years.

xenocide · Feb 8, 2012

Keep in mind, they didn't actually physically accomplish anything yet. With the help of AMD engineers they modeled how a supposed performance gain could potentially occur, but have yet to get it functioning. When APUs were first introduced I figured they would find a way to have the GPU and CPU process simultaneously when a discrete GPU solution was present, but apparently they didn't care much for developing that idea. This entire study should be taken with a truckload of grains of salt.

D4S4 · Feb 9, 2012

well they can't use it now since there is no software support for something like this. but in 5-10 years... intel has nothing like this and i bet heterogeneous computing is going to gain some serious ground in the near future simply because amd made a chip that makes it commercially viable.

Static~Charge · Feb 9, 2012

What happens if the GPU is busy doing video-related work and the CPU throws a calculation request at it? Does the display stutter or freeze? Or does the GPU perform the calculations more slowly? In that case, the CPU might be able to perform the calculations faster just because it isn't bogged down with other work. Definitely an issue to take into consideration.
