
X-Silicon Startup Wants to Combine RISC-V CPU, GPU, and NPU in a Single Processor

Such a claim is disrespectful to everyone who ever was or is in that business, as if they were plain stupid.
Investor scam warning!
Build it, show it running a 3D-game with at least 60 FPS, then trigger a press release again.
 
It is because of the internal "pipelining" of a single architecture
It's got nothing to do with pipelining. No matter how deep the pipeline is, instructions still take the same amount of time to complete.
Nowadays no one does multiplication by recursively adding a single result in SW-like routines or microcode or whatever you are proposing; it would be too time-consuming!
I don't know if that's how it's literally done, probably not, but you cannot multiply two integers without carrying out multiple additions; that much is clear. Every integer op will block a processor's integer execution port until it's finished, whether it's a multiplication or addition, and that's because they're all using the same addition circuitry. If there were a separate block for multiplication like you're all convinced there is, then addition and multiplication could be done in parallel on the same execution port, but it doesn't work like that in any architecture.
 
It's got nothing to do with pipelining
I'm not talking about the processor pipeline that completes instructions. I'm referring to the practice of adding a register in the middle of a combinational circuit to break the critical path into two parts. It goes by the same name, so the two are easy to confuse.
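To put rough numbers on that, here is a minimal sketch. The delays (a 10 ns combinational path, split into two 5 ns halves with 0.5 ns of register overhead) are made up for illustration, and `min_clock_period` is a hypothetical helper, not taken from any real design.

```python
def min_clock_period(stage_delays_ns, register_overhead_ns=0.0):
    """Shortest usable clock period: the slowest stage plus register overhead."""
    return max(stage_delays_ns) + register_overhead_ns


# Hypothetical numbers: one 10 ns combinational block, no register in the middle.
unpipelined_period = min_clock_period([10.0])

# Same logic with a register in the middle: two 5 ns halves,
# plus an assumed 0.5 ns of setup/clock-to-q overhead.
pipelined_period = min_clock_period([5.0, 5.0], register_overhead_ns=0.5)

# One operation now crosses two stages, so its latency does not shrink.
pipelined_latency = 2 * pipelined_period

print(unpipelined_period, pipelined_period, pipelined_latency)
```

With these assumed figures the clock period drops from 10 ns to 5.5 ns, while a single operation still takes about as long end to end (11 ns). That is why both statements can hold: the critical path gets shorter, yet each individual operation does not finish sooner.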

Every integer op will block a processor's integer execution port until it's finished, whether it's a multiplication or addition and that's because they're all using the same addition circuitry
Also, this is not precise. What you are describing happens because either the core can only compute one operation at a time, so you have only one port, or, on a "Tomasulo-like" architecture, you have to dispatch and you share queues/entries/ports (whatever you want to call them), but the operations still do not go to the same unit. There is no point in that; as I said, it would be too slow.
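As a toy illustration of a shared issue port feeding separate units, here is a minimal sketch. It does not model any real core: the `schedule` function, the in-order one-op-per-cycle port, the unpipelined units, and the latencies (1 cycle for add, 3 for mul) are all assumptions chosen for the example.

```python
def schedule(ops, latency):
    """Issue ops in order through one shared port, one per cycle.

    Each op type has its own (unpipelined) unit, so an op only waits
    if the port is taken this cycle or its own unit is still busy.
    Returns a list of (op, start_cycle, done_cycle).
    """
    unit_free_at = {unit: 0 for unit in latency}
    next_issue_cycle = 0
    timeline = []
    for op in ops:
        start = max(next_issue_cycle, unit_free_at[op])
        done = start + latency[op]
        unit_free_at[op] = done          # this unit is busy until 'done'
        timeline.append((op, start, done))
        next_issue_cycle = start + 1     # the port accepts one op per cycle
    return timeline


# Assumed latencies for illustration only.
timeline = schedule(["mul", "add", "add"], {"add": 1, "mul": 3})
for entry in timeline:
    print(entry)
```

In this toy model the multiply occupies its unit for cycles 0 to 3, while both additions are issued and finish in the meantime. The port is shared, the execution units are not.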

I'll make a practical reference: just consider this, which is one of the simplest open-source architectures available.
Here is where the adder unit is defined, while this points to this, which is where the multiplier is defined.
This architecture targets very, very small devices, and there are still 2 separate units for these two operations.
Moreover, that multiplier takes more than 1 cycle (actually 3 cycles) to perform the computation. That is not because the operation inherently needs 3 cycles (for a 32-bit multiplication you would need 32 cycles of additions), but because that is the speed tradeoff they chose.
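For reference, the "32 cycles of additions" figure corresponds to a plain shift-and-add multiplier that handles one bit of the multiplier per cycle. The sketch below is a software model of that idea for unsigned operands, not the implementation used by the architecture linked above; `shift_add_multiply` and its cycle counter are hypothetical names.

```python
def shift_add_multiply(a, b, width=32):
    """Unsigned shift-and-add multiply, one multiplier bit per 'cycle'.

    Returns (product, cycles). A sequential hardware version would spend
    one cycle per bit of b, with at most one addition in each cycle.
    """
    mask = (1 << (2 * width)) - 1        # product fits in 2*width bits
    product = 0
    cycles = 0
    for i in range(width):
        if (b >> i) & 1:                 # add the shifted multiplicand if bit i is set
            product = (product + (a << i)) & mask
        cycles += 1
    return product, cycles


result, cycles = shift_add_multiply(6, 7)
print(result, cycles)
```

A dedicated multiplier that answers in 3 cycles is therefore doing far more work per cycle in parallel hardware than a single adder reused 32 times could.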
 
I have a feeling they're gonna be snapped up for crypto or AI, since they can do it all.
 
Needs more Processing Units.................
 
Such a claim is disrespectful to everyone who ever was or is in that business, as if they were plain stupid.
Investor scam warning!
Build it, show it running a 3D-game with at least 60 FPS, then trigger a press release again.
Big investors have no idea what a CPU even is, so I don't think they care. They just see big tech words, think of AI, and put money in.
 