Wednesday, March 20th 2024

Tiny Corp. Pauses Development of AMD Radeon GPU-based Tinybox AI Cluster

George Hotz and his Tiny Corporation colleagues were pinning their hopes on AMD delivering some good news earlier this month. Development of the "TinyBox" AI compute cluster hit major roadblocks a couple of weeks ago: at the time, Radeon RX 7900 XTX GPU firmware was not gelling with Tiny Corp.'s setup. Hotz expressed "70% confidence" that AMD would approve open-sourcing certain bits of firmware. At the time of writing, this has not transpired, and this week the Tiny Corp. social media account has once again switched to an "all guns blazing" mode. Hotz and Co. have publicly disclosed that they were dabbling with Intel Arc graphics cards as of a few weeks ago. NVIDIA hardware is another possible route, according to freshly posted open thoughts.

Yesterday, it was confirmed that the young startup had paused its use of XFX Speedster MERC310 RX 7900 XTX graphics cards: "the driver is still very unstable, and when it crashes or hangs we have no way of debugging it. We have no way of dumping the state of a GPU. Apparently it isn't just the MES causing these issues, it's also the Command Processor (CP). After seeing how open Tenstorrent is, it's hard to deal with this. With Tenstorrent, I feel confident that if there's an issue, I can debug and fix it. With AMD, I don't." The $15,000 TinyBox system relies on "cheaper" gaming-oriented GPUs rather than traditional enterprise solutions. This oddball approach has attracted a number of customers, but the latest announcements likely signal another delay. The same post continued: "we are exploring Intel, working on adding Level Zero support to tinygrad. We also added a $400 bounty for XMX support. We are also (sadly) exploring a 6x GeForce RTX 4090 GPU box. At least we know the software is good there. We will revisit AMD once we have an open and reproducible build process for the driver and firmware. We are willing to dive really deep into hardware to make it amazing. But without access, we can't."

Another post provided a behind-the-scenes look at Hotz's diplomatic approach: "I have spoken with AMD on multiple occasions, we have gotten through to top people, and they have been quite nice to us. I believe they want to be more open, and obviously they don't want their driver to have bugs. Unfortunately, this access and responses prolonged this decision, part of me wishes they just said it's a consumer card, you get what you pay for and we could have switched earlier. We probably tried too hard to make it work. We have an amazing team at tinygrad. Someday, we are going to make our own chips, and I figure if we can make our own chips, we better be able to make the 7900XTX software great. But we can't if we don't have access. The firmware is complex, undocumented, closed source, and signed, all struggles we wouldn't have with our own hardware. If and when the firmware is open and installable, if we aren't too far along with a different chip, we are down to put resources into writing fuzzers and rewriting whatever needs to be rewritten. The 7900XTX hardware seems great, but we aren't going to put resources into fixing a black box."