News Posts matching #AI


NVIDIA Issues Patches for ChatRTX AI Chatbot, Susceptible to Improper Privilege Management

Just a month after releasing the 0.1 beta preview of Chat with RTX, now called ChatRTX, NVIDIA has swiftly addressed critical security vulnerabilities discovered in its cutting-edge AI chatbot. The chatbot was found to be susceptible to cross-site scripting (CWE-79) and improper privilege management (CWE-269) in version 0.2 and all prior releases. The identified vulnerabilities posed significant risks to users' personal data and system security. Cross-site scripting could allow malicious actors to inject scripts into the chatbot's interface, potentially compromising sensitive information. The improper privilege management flaw could also enable attackers to escalate their privileges and gain administrative control over users' systems and files.

Upon becoming aware of these vulnerabilities, NVIDIA promptly released an updated version of ChatRTX 0.2, available for download from its official website. The latest iteration of the software addresses these security issues, providing users with a more secure experience. As ChatRTX utilizes retrieval-augmented generation (RAG) and NVIDIA TensorRT-LLM software to allow users to train the chatbot on their personal data, the presence of such vulnerabilities is particularly concerning. Users are strongly advised to update their ChatRTX software to the latest version to mitigate potential risks and protect their personal information. ChatRTX remains in beta, with no official release candidate timeline announced. As NVIDIA continues to develop and refine this innovative AI chatbot, the company must prioritize security and promptly address any vulnerabilities that may arise, ensuring a safe and reliable user experience.
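
As a generic illustration of the vulnerability class involved (CWE-79), and not NVIDIA's actual fix, here is a minimal Python sketch of escaping untrusted chat input before it is rendered as HTML; the `render_chat_message` helper is hypothetical:

```python
import html

def render_chat_message(user_text: str) -> str:
    """Escape untrusted input before embedding it in HTML markup.

    Without escaping, a message like '<script>...</script>' would be
    interpreted as live markup by the chat interface and executed.
    """
    return '<div class="message">' + html.escape(user_text) + "</div>"

print(render_chat_message("<script>alert('xss')</script>"))
# → <div class="message">&lt;script&gt;alert(&#x27;xss&#x27;)&lt;/script&gt;</div>
```

`html.escape` neutralizes angle brackets and quotes; real chat front-ends layer on output encoding per context (HTML body, attributes, JavaScript) rather than relying on a single helper.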

Microsoft Copilot to Run Locally on AI PCs with at Least 40 TOPS of NPU Performance

Microsoft, Intel, and AMD are attempting to jumpstart demand in the PC industry again, under the aegis of the AI PC—devices with native acceleration for AI workloads. Both Intel and AMD have mobile processors with on-silicon NPUs (neural processing units), which are designed to accelerate the first wave of AI-enhanced client experiences on Windows 11 23H2. Microsoft's flagship vehicle for democratizing AI has been Copilot, as a licensee of OpenAI's GPT-4, GPT-4 Turbo, DALL·E, and other generative AI tools from the OpenAI stable. Copilot is currently Microsoft's most heavily invested application, with substantial capital and the company's best minds mobilized to make it the most popular AI assistant. Microsoft has even pushed the AI PC designation onto PC OEMs, which requires them to include a dedicated Copilot key akin to the Start key (we'll see how anti-competition regulators deal with that).

The problem with Microsoft's tango with Intel and AMD to push AI PCs is that Copilot doesn't really use an NPU, not even at the edge—you input a query or a prompt, and Copilot hands it over to a cloud-based AI service. This is about to change, with Microsoft announcing that Copilot will be able to run locally on AI PCs. Microsoft identified several kinds of Copilot use cases that an NPU can handle on-device, which should speed up response times to Copilot queries—but this requires the NPU to deliver at least 40 TOPS of performance. That is a problem for the current crop of processors with NPUs. Intel's Core Ultra "Meteor Lake" has an AI Boost NPU with 10 TOPS on tap, while the Ryzen 8040 "Hawk Point" is only slightly faster, with a 16 TOPS Ryzen AI NPU. AMD has already revealed that the XDNA 2-based 2nd Generation Ryzen AI NPU in its upcoming "Strix Point" processors will deliver over 40 TOPS, and it stands to reason that the NPUs in Intel's "Arrow Lake" or "Lunar Lake" processors will be comparable in performance, which should enable on-device Copilot.
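
The reported threshold can be illustrated with a trivial sketch; the per-chip TOPS figures are the ones cited above, and the comparison itself is hypothetical (Microsoft has not published a formal qualification test):

```python
# Compare the NPUs named above against the 40 TOPS floor Microsoft
# reportedly requires for on-device Copilot.
COPILOT_LOCAL_TOPS = 40

npus = {
    "Intel Core Ultra 'Meteor Lake' (AI Boost)": 10,
    "AMD Ryzen 8040 'Hawk Point' (Ryzen AI)": 16,
    "AMD 'Strix Point' (XDNA 2, announced)": 40,  # "over 40 TOPS"
}

for name, tops in npus.items():
    verdict = "meets" if tops >= COPILOT_LOCAL_TOPS else "falls short of"
    print(f"{name}: {tops} TOPS, {verdict} the {COPILOT_LOCAL_TOPS} TOPS floor")
```

Only the announced "Strix Point" part clears the bar, which is why on-device Copilot is tied to the next processor generation rather than shipping silicon.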

Report Suggests Naver Siding with Samsung in $752 Million "Mach-1" AI Chip Deal

Samsung debuted its Mach-1 generation of AI processors during a recent shareholder meeting—the South Korean megacorp anticipates an early 2025 launch window. Their application-specific integrated circuit (ASIC) design is expected to "excel in edge computing applications," with a focus on low-power and efficiency-oriented operating environments. Naver Corporation was a key NVIDIA high-end AI customer in South Korea (and Japan), but the leading search platform firm and creator of the HyperCLOVA X LLM reportedly deliberated on adopting alternative hardware last October. The Korea Economic Daily believes that Naver's relationship with Samsung is set to grow, courtesy of a proposed $752 million investment: "the world's top memory chipmaker, will supply its next-generation Mach-1 artificial intelligence chips to Naver Corp. by the end of this year."

Reports from last December indicated that the two companies were deep into the process of co-designing power-efficient AI accelerators—Naver's main goal is to finalize a product that will offer eight times more energy efficiency than NVIDIA's H100 AI accelerator. Naver's alleged bulk order—of roughly 150,000 to 200,000 Samsung Mach-1 AI chips—appears to be a stopgap. Industry insiders reckon that Samsung's first-gen AI accelerator is much cheaper when compared to NVIDIA H100 GPU price points—a per-unit figure of $3756 is mentioned in the KED Global article. Samsung is speculated to be shopping its fledgling AI tech to Microsoft and Meta.
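
The reported numbers roughly reconcile: at the $3,756 per-unit figure, an order of 150,000 to 200,000 chips spans about $563M to $751M, with the upper bound landing almost exactly on the $752 million deal size. A quick check:

```python
# Figures cited above: KED Global's $3,756 per-unit price and an alleged
# bulk order of roughly 150,000 to 200,000 Mach-1 chips.
unit_price = 3756
order_low, order_high = 150_000, 200_000

low_total = unit_price * order_low     # 563,400,000
high_total = unit_price * order_high   # 751,200,000
print(f"${low_total / 1e6:.1f}M to ${high_total / 1e6:.1f}M")  # → $563.4M to $751.2M
```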

NVIDIA Hopper Leaps Ahead in Generative AI at MLPerf

It's official: NVIDIA delivered the world's fastest platform in industry-standard tests for inference on generative AI. In the latest MLPerf benchmarks, NVIDIA TensorRT-LLM—software that speeds and simplifies the complex job of inference on large language models—boosted the performance of NVIDIA Hopper architecture GPUs on the GPT-J LLM nearly 3x over their results just six months ago. The dramatic speedup demonstrates the power of NVIDIA's full-stack platform of chips, systems and software to handle the demanding requirements of running generative AI. Leading companies are using TensorRT-LLM to optimize their models. And NVIDIA NIM—a set of inference microservices that includes inferencing engines like TensorRT-LLM—makes it easier than ever for businesses to deploy NVIDIA's inference platform.

Raising the Bar in Generative AI
TensorRT-LLM running on NVIDIA H200 Tensor Core GPUs—the latest, memory-enhanced Hopper GPUs—delivered the fastest performance running inference in MLPerf's biggest test of generative AI to date. The new benchmark uses the largest version of Llama 2, a state-of-the-art large language model packing 70 billion parameters. The model is more than 10x larger than the GPT-J LLM first used in the September benchmarks. The memory-enhanced H200 GPUs, in their MLPerf debut, used TensorRT-LLM to produce up to 31,000 tokens/second, a record on MLPerf's Llama 2 benchmark. The H200 GPU results include up to 14% gains from a custom thermal solution. It's one example of innovations beyond standard air cooling that systems builders are applying to their NVIDIA MGX designs to take the performance of Hopper GPUs to new heights.
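
Backing the cooling contribution out of the record is straightforward: if the custom thermal solution contributed up to 14%, the standard-cooling baseline would sit near 27,000 tokens/second. A quick sanity check on the figures above:

```python
# Back the thermal-solution gain out of the MLPerf record figure.
record_tps = 31_000   # H200 + TensorRT-LLM on MLPerf's Llama 2 70B test
thermal_gain = 0.14   # "up to 14%" from the custom thermal solution

baseline_tps = record_tps / (1 + thermal_gain)
print(f"~{baseline_tps:,.0f} tokens/s with standard cooling")
```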

Intel Gaudi 2 Remains Only Benchmarked Alternative to NV H100 for Generative AI Performance

Today, MLCommons published results of the industry-standard MLPerf v4.0 benchmark for inference. Intel's results for Intel Gaudi 2 accelerators and 5th Gen Intel Xeon Scalable processors with Intel Advanced Matrix Extensions (Intel AMX) reinforce the company's commitment to bring "AI Everywhere" with a broad portfolio of competitive solutions. The Intel Gaudi 2 AI accelerator remains the only benchmarked alternative to Nvidia H100 for generative AI (GenAI) performance and provides strong performance-per-dollar. Further, Intel remains the only server CPU vendor to submit MLPerf results. Intel's 5th Gen Xeon results improved by an average of 1.42x compared with 4th Gen Intel Xeon processors' results in MLPerf Inference v3.1.

"We continue to improve AI performance on industry-standard benchmarks across our portfolio of accelerators and CPUs. Today's results demonstrate that we are delivering AI solutions that deliver to our customers' dynamic and wide-ranging AI requirements. Both Intel Gaudi and Xeon products provide our customers with options that are ready to deploy and offer strong price-to-performance advantages," said Zane Ball, Intel corporate vice president and general manager, DCAI Product Management.

SK Hynix Plans a $4 Billion Chip Packaging Facility in Indiana

SK Hynix is planning a large $4 billion chip-packaging and testing facility in Indiana, USA, though the decision to invest in the US has not been finalized. "[The company] is reviewing its advanced chip packaging investment in the US, but hasn't made a final decision yet," a company spokesperson told the Wall Street Journal. The primary product focus for this plant will be stacked HBM memory meant to be consumed by the AI GPU and self-driving automobile industries. The plant could also focus on other exotic memory types, such as high-density server memory, and perhaps even compute-in-memory. The plant is expected to start operations in 2028, and will create up to 1,000 skilled jobs. SK Hynix is counting on state and federal tax incentives, under government initiatives such as the CHIPS Act, to propel this investment. SK Hynix is a significant supplier of HBM to NVIDIA for its AI GPUs; its HBM3E features in the latest NVIDIA "Blackwell" GPUs.

Intel Announces New Program for AI PC Software Developers and Hardware Vendors

Intel Corporation today announced the creation of two new artificial intelligence (AI) initiatives as part of the AI PC Acceleration Program: the AI PC Developer Program and the addition of independent hardware vendors to the program. These are critical milestones in Intel's pursuit of enabling the software and hardware ecosystem to optimize and maximize AI on more than 100 million Intel-based AI PCs through 2025.

"We have made great strides with our AI PC Acceleration Program by working with the ecosystem. Today, with the addition of the AI PC Developer Program, we are expanding our reach to go beyond large ISVs and engage with small- and medium-sized players and aspiring developers. Our goal is to drive a frictionless experience by offering a broad set of tools including the new AI-ready Developer Kit," said Carla Rodriguez, Intel vice president and general manager of Client Software Ecosystem Enabling.

Report: China's PC Market Set for Return to Growth of 3% in 2024

Canalys anticipates that China's PC market (excluding tablets) will rebound to 3% growth in 2024 and 10% growth in 2025, primarily fueled by refresh demand from the commercial sector. The tablet market is expected to grow by 4% in both 2024 and 2025, benefiting from increasing penetration as digitalization deepens.

"2024 is expected to bring modest relief to a struggling PC market in China, but a challenging environment will remain," said Canalys Analyst Emma Xu. "Ongoing economic structural adjustments are a key priority as the government seeks new avenues for economic growth, with a core focus on technology-driven innovation. AI emerged as a central theme during the latest 'Two Sessions' in China, with enthusiasm for AI spanning commercial entities and government initiatives aimed at establishing a domestic AI ecosystem across industries. Significant opportunities for the PC industry are set to arise from this commercial push, especially as it coincides with the upcoming device refresh and the emergence of AI-capable PCs."

NVIDIA Modulus & Omniverse Drive Physics-informed Models and Simulations

A manufacturing plant near Hsinchu, Taiwan's Silicon Valley, is among facilities worldwide boosting energy efficiency with AI-enabled digital twins. A virtual model can help streamline operations, maximizing throughput for its physical counterpart, say engineers at Wistron, a global designer and manufacturer of computers and electronics systems. In the first of several use cases, the company built a digital copy of a room where NVIDIA DGX systems undergo thermal stress tests. Early results were impressive.

Making Smart Simulations
Using NVIDIA Modulus, a framework for building AI models that understand the laws of physics, Wistron created digital twins that let them accurately predict the airflow and temperature in test facilities that must remain between 27 and 32 degrees C. A simulation that would've taken nearly 15 hours with traditional methods on a CPU took just 3.3 seconds on an NVIDIA GPU running inference with an AI model developed using Modulus, a whopping 15,000x speedup. The results were fed into tools and applications built by Wistron developers with NVIDIA Omniverse, a platform for creating 3D workflows and applications based on OpenUSD.
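
The quoted speedup can be sanity-checked: a 15,000x gain over a 3.3-second GPU inference implies a CPU runtime of about 13.75 hours, consistent with the "nearly 15 hours" figure:

```python
# "Nearly 15 hours" on CPU vs. 3.3 seconds on GPU: the quoted 15,000x
# speedup corresponds to a CPU runtime of 13.75 hours.
cpu_seconds = 13.75 * 3600   # 49,500 s
gpu_seconds = 3.3

speedup = cpu_seconds / gpu_seconds
print(f"{speedup:,.0f}x")    # → 15,000x
```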

UL Announces the Procyon AI Image Generation Benchmark Based on Stable Diffusion

We're excited to announce we're expanding our AI Inference benchmark offerings with the UL Procyon AI Image Generation Benchmark, coming Monday, 25th March. AI has the potential to be one of the most significant new technologies hitting the mainstream this decade, and many industry leaders are competing to deliver the best AI Inference performance through their hardware. Last year, we launched the first of our Procyon AI Inference Benchmarks for Windows, which measured AI Inference performance with a workload using Computer Vision.

The upcoming UL Procyon AI Image Generation Benchmark provides a consistent, accurate and understandable workload for measuring the AI performance of high-end hardware, built with input from members of the industry to ensure fair and comparable results across all supported hardware.

UGREEN Unveils Its First Network-Attached Storage Solutions

UGREEN, a leading innovator in consumer electronics, is excited to announce the launch of its inaugural Network Attached Storage (NAS) series. The launch is scheduled for March 26 on the popular crowdfunding platform Kickstarter. This campaign is specifically aimed at users in the United States and Germany. As a special incentive, UGREEN is offering an early bird discount of 40%, with prices commencing at just $239.99. Visit Kickstarter.com to be among the first to experience UGREEN's innovative NASync series.

The UGREEN NASync series is a versatile range designed to cater to a variety of use scenarios. The NASync DXP2800, DXP4800, and DXP4800 Plus are tailored for personal and home users. For power users and business solutions, UGREEN offers the NASync DXP6800 Pro and NASync DXP8800 Plus. Lastly, the NASync DXP480T Plus is specifically designed to meet the needs of creative and media professionals.

Tiny Corp. Prepping Separate AMD & NVIDIA GPU-based AI Compute Systems

George Hotz and his startup operation (Tiny Corporation) appeared ready to completely abandon AMD Radeon GPUs last week, after experiencing a period of firmware-related headaches. The original plan involved the development of a pre-orderable $15,000 TinyBox AI compute cluster housing six XFX Speedster MERC310 RX 7900 XTX graphics cards, but software and driver issues prompted experimentation with alternative hardware routes. Much media coverage has focused on the unusual adoption of consumer-grade GPUs—Tiny Corp.'s struggles with RDNA 3 (rather than CDNA 3) were pushed further into public view after top AMD brass pitched in.

The startup's social media feed is very transparent about showcasing everyday tasks, problem-solving and important decision-making. Several Acer Predator BiFrost Arc A770 OC cards were purchased and promptly integrated into a colorfully-lit TinyBox prototype, but Hotz & Co. swiftly moved onto Team Green pastures. Tiny Corp. has begrudgingly adopted NVIDIA GeForce RTX 4090 GPUs. Earlier today, it was announced that work on the AMD-based system has resumed—although customers were forewarned about anticipated teething problems. The surprising message arrived in the early hours: "a hard to find 'umr' repo has turned around the feasibility of the AMD TinyBox. It will be a journey, but it gives us an ability to debug. We're going to sell both, red for $15,000 and green for $25,000. When you realize your pre-order you'll choose your color. Website has been updated. If you like to tinker and feel pain, buy red. The driver still crashes the GPU and hangs sometimes, but we can work together to improve it."

NVIDIA's Bryan Catanzaro Discusses Future of AI Personal Computing

Imagine a world where you can whisper your digital wishes into your device, and poof, it happens. That world may be coming sooner than you think. But if you're worried about AI doing your thinking for you, you might be waiting for a while. In a fireside chat Wednesday (March 20) at NVIDIA GTC, the global AI conference, Kanjun Qiu, CEO of Imbue, and Bryan Catanzaro, VP of applied deep learning research at NVIDIA, challenged many of the clichés that have long dominated conversations about AI. Launched in October 2022, Imbue made headlines with its Series B fundraiser last year, raising over $200 million at a $1 billion valuation.

The Future of Personal Computing
Qiu and Catanzaro discussed the role that virtual worlds will play in this, and how they could serve as interfaces for human-technology interaction. "I think it's pretty clear that AI is going to help build virtual worlds," said Catanzaro. "I think the maybe more controversial part is virtual worlds are going to be necessary for humans to interact with AI." People have an almost primal fear of being displaced, Catanzaro said, but what's much more likely is that our capabilities will be amplified as the technology fades into the background. Catanzaro compared it to the adoption of electricity. A century ago, people talked a lot about electricity. Now that it's ubiquitous, it's no longer the focus of broader conversations, even as it makes our day-to-day lives better.

Qualcomm Announces the Snapdragon 7+ Gen 3, Featuring Exceptional On-Device AI Capabilities

Qualcomm Technologies, Inc., unveiled today the Snapdragon 7+ Gen 3 Mobile Platform, bringing on-device generative AI into the Snapdragon 7 series. The Mobile Platform supports a wide range of AI models including large language models (LLMs) such as Baichuan-7B, Llama 2, and Gemini Nano. Fueling extraordinary entertainment capabilities, Snapdragon 7+ Gen 3 also brings new select Snapdragon Elite Gaming features to the 7-series including Game Post Processing Accelerator and Adreno Frame Motion Engine 2, enhancing game effects and upscaling gaming content for desktop-level visuals. Plus, this platform brings top-notch photography features with our industry-leading 18-bit cognitive ISP.

"Today, we embark on the latest expansion in the 7-series to create new levels of entertainment for consumers - integrating next-generation technologies for richer experiences," said Chris Patrick, senior vice president and general manager of mobile handsets, Qualcomm Technologies, Inc. "Snapdragon 7+ Gen 3 is packed with support for incredible on-device generative AI features and provides incredible performance and power efficiency, while bringing Wi-Fi 7 to the Snapdragon 7 Series for the first time."

MediaTek Licenses NVIDIA GPU IP for AI-Enhanced Vehicle Processors

NVIDIA has offered its GPU IP for licensing for more than a decade, ever since the introduction of the Kepler microarchitecture, though it has gained relatively little traction in third-party SoCs. That trend seems to be reaching an inflection point, as NVIDIA has licensed its GPU IP to MediaTek for the next generation of processors for the auto industry. The newest MediaTek Dimensity Auto Cockpit family consists of the CX-1, CY-1, CM-1, and CV-1, where the CX-1 targets premium vehicles, the CM-1 the mid-range, and the CV-1 lower-end vehicles, probably differentiated by their compute capabilities. The Dimensity Auto Cockpit family is brimming with the latest technology: the processor core of choice is an Armv9-based design paired with "next-generation" NVIDIA GPU IP, possibly referring to Blackwell, capable of ray tracing and DLSS 3, powered by RTX and DLA.

The SoC is supposed to integrate a lot of technology to lower BOM costs of auto manufacturing, and it includes silicon for controlling displays, cameras (advanced HDR ISP), audio streams (multiple audio DSPs), and connectivity (Wi-Fi networking). Interestingly, the SKUs can play movies with AI-enhanced video and support AAA gaming. MediaTek touts the Dimensity Auto Cockpit family as offering fully local AI processing, without requiring assistance from outside servers, plus 3D spatial sensing with driver and occupant monitoring, a gaze-aware UI, and natural controls. All of that fits into an SoC fabricated at TSMC on a 3 nm process and runs on the industry-established NVIDIA DRIVE OS.

Alibaba Unveils Plans for Server-Grade RISC-V Processor and RISC-V Laptop

Chinese e-commerce and cloud giant Alibaba announced its plans to launch a server-grade RISC-V processor later this year, and it showcased a RISC-V-powered laptop running an open-source operating system. The announcements were made by Alibaba's research division, the Damo Academy, at the recent Xuantie RISC-V Ecological Conference in Shenzhen. The upcoming server-class processor, called the Xuantie C930, is expected to be launched by the end of 2024. While specific details about the chip have not been disclosed, it is anticipated to cater to AI and server workloads. This development is part of Alibaba's ongoing efforts to expand its RISC-V portfolio and reduce reliance on foreign chip technologies amidst US export restrictions. To complement the C930, Alibaba is also preparing a Xuantie 907 matrix processing unit for AI, which could be an IP block inside an SoC like the C930 or an SoC of its own.

In addition to the C930, Alibaba showcased the RuyiBOOK, a laptop powered by the company's existing T-Head C910 processor. The C910, previously designed for edge servers, AI, and telecommunications applications, has been adapted for use in laptops. Strangely, the RuyiBOOK laptop runs on the openEuler operating system, an open-source version of Huawei's EulerOS, which is based on Red Hat Linux. The laptop also features Alibaba's collaboration suite, DingTalk, and the open-source office suite LibreOffice, demonstrating its potential to cater to the needs of Chinese knowledge workers and consumers without relying on foreign software. Zhang Jianfeng, president of the Damo Academy, emphasized the increasing demand for new computing power and the potential for RISC-V to enter a period of "application explosion." Alibaba plans to continue investing in RISC-V research and development and fostering collaboration within the industry to promote innovation and growth in the RISC-V ecosystem, lessening reliance on US-sourced technology.

Samsung Roadmaps UFS 5.0 Storage Standard, Predicts Commercialization by 2027

Mobile tech tipster Revegnus has highlighted an interesting Samsung presentation slide—according to machine translation, the company's electronics division is already responding to an anticipated growth of "client-side large language model" service development. This market trend will demand improved Universal Flash Storage (UFS) interface speeds—Samsung engineers are currently engaged in "developing a new product that uses UFS 4.0 technology, but increases the number of channels from the current 2 to 4." The upcoming "more advanced" UFS 4.0 storage chips could be beefy enough to be utilized alongside next-gen mobile processors in 2025. For example, Arm is gearing up "Blackhawk," the Cortex-X4's successor—industry watchers reckon that the new core is designed to deliver "great Large Language Model (LLM) performance" on future smartphones. Samsung's roadmap outlines another major R&D goal, but this prospect is far from finalization—their chart reveals an anticipated 2027 rollout. The slide's body of text included a brief teaser: "at the same time, we are also actively participating in discussions on the UFS 5.0 standard."
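
The bandwidth motivation for moving from two to four channels can be sketched with back-of-envelope arithmetic. The ~2,900 MB/s per-channel figure below is an assumption for illustration, loosely derived from UFS 4.0's quoted ~5,800 MB/s interface throughput over two channels, and is not a Samsung specification:

```python
# If per-channel throughput is held constant, doubling the channel
# count from 2 to 4 doubles the aggregate interface bandwidth.
per_channel_mbps = 2_900  # assumed per-channel rate, for illustration only

for channels in (2, 4):
    print(f"{channels} channels: ~{per_channel_mbps * channels:,} MB/s")
```

Real-world gains would depend on the NAND array and controller keeping all four channels fed, not just the interface width.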

Sony Semiconductor Solutions Selects Cutting-Edge AMD Adaptive Computing Tech

Yesterday, AMD announced that its cutting-edge adaptive computing technology was selected by Sony Semiconductor Solutions (SSS) for its newest automotive LiDAR reference design. SSS, a global leader in image sensor technology, and AMD joined forces to deliver a powerful and efficient LiDAR solution for use in autonomous vehicles. Using adaptive computing technology from AMD significantly extends the SSS LiDAR system capabilities, offering extraordinary accuracy, fast data processing, and high reliability for next-generation autonomous driving solutions.

In the rapidly evolving landscape of autonomous driving, the demand for precise and reliable sensor technology has never been greater. LiDAR (Light Detection and Ranging) technology plays a pivotal role in enabling depth perception and environmental mapping for various industries. LiDAR delivers image classification, segmentation, and object detection data that is essential for 3D vision perception enhanced by AI, which cannot be provided by cameras alone, especially in low-light or inclement weather. The dedicated LiDAR reference design addresses the complexities of autonomous vehicle development with a standardized platform to enhance safety in navigating diverse driving scenarios.

Altair SimSolid Transforms Simulation for Electronics Industry

Altair, a global leader in computational intelligence, announced the upcoming release of Altair SimSolid for electronics, bringing game-changing fast, easy, and precise multi-physics scenario exploration for electronics, from chips, PCBs, and ICs to full system design. "As the electronics industry pushes the boundaries of complexity and miniaturization, engineers have struggled with simulations that often compromise on detail for expediency. Altair SimSolid will empower engineers to capture the intricate complexities of PCBs and ICs without simplification," said James R. Scapa, founder and chief executive officer, Altair. "Traditional simulation methods often require approximations when analyzing PCB structures due to their complexity. Altair SimSolid eliminates these approximations to run more accurate simulations for complex problems with vast dimensional disparities."

Altair SimSolid has revolutionized conventional analysis in its ability to accurately predict complex structural problems with blazing-fast speed while eliminating the complexity of laborious hours of modeling. It eliminates geometry simplification and meshing, the two most time-consuming and expertise-intensive tasks done in traditional finite element analysis. As a result, it delivers results in seconds to minutes—up to 25x faster than traditional finite element solvers—and effortlessly handles complex assemblies. Having experienced fast adoption in the aerospace and automotive industries, two sectors that typically experience challenges associated with massive structures, Altair SimSolid is poised to play a significant role in the electronics market. The initial release, expected in Q2 2024, will support structural and thermal analysis for PCBs and ICs with full electromagnetics analysis coming in a future release.

Samsung Prepares Mach-1 Chip to Rival NVIDIA in AI Inference

During its 55th annual shareholders' meeting, Samsung Electronics announced its entry into the AI processor market with the upcoming launch of its Mach-1 AI accelerator chips in early 2025. The South Korean tech giant revealed its plans to compete with established players like NVIDIA in the rapidly growing AI hardware sector. The Mach-1 generation of chips is an application-specific integrated circuit (ASIC) design equipped with LPDDR memory that is envisioned to excel in edge computing applications. While Samsung does not aim to directly rival NVIDIA's ultra-high-end AI solutions like the H100, B100, or B200, the company's strategy focuses on carving out a niche in the market by offering unique features and performance enhancements at the edge, where low power and efficient computing is what matters the most.

According to SeDaily, the Mach-1 chips boast a groundbreaking feature that significantly reduces memory bandwidth requirements for inference to approximately 0.125x compared to existing designs, which is an 87.5% reduction. This innovation could give Samsung a competitive edge in terms of efficiency and cost-effectiveness. As the demand for AI-powered devices and services continues to soar, Samsung's foray into the AI chip market is expected to intensify competition and drive innovation in the industry. While NVIDIA currently holds a dominant position, Samsung's cutting-edge technology and access to advanced semiconductor manufacturing nodes could make it a formidable contender. The Mach-1 has been field-verified on an FPGA, while the final design is currently going through a physical design for SoC, which includes placement, routing, and other layout optimizations.
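
The arithmetic behind that claim is simple: cutting bandwidth to 0.125x of the baseline is an eightfold reduction:

```python
baseline_bw = 1.0
mach1_bw = baseline_bw * 0.125   # "approximately 0.125x" of existing designs

reduction_pct = (1 - mach1_bw / baseline_bw) * 100
print(f"{reduction_pct:.1f}% reduction")   # → 87.5% reduction, i.e. an 8x cut
```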

Tiny Corp. Pauses Development of AMD Radeon GPU-based Tinybox AI Cluster

George Hotz and his Tiny Corporation colleagues were pinning their hopes on AMD delivering some good news earlier this month. The development of a "TinyBox" AI compute cluster project hit some major roadblocks a couple of weeks ago—at the time, Radeon RX 7900 XTX GPU firmware was not gelling with Tiny Corp.'s setup. Hotz expressed "70% confidence" in AMD approving open-sourcing certain bits of firmware. At the time of writing this has not transpired—this week the Tiny Corp. social media account has, once again, switched to an "all guns blazing" mode. Hotz and Co. have publicly disclosed that they were dabbling with Intel Arc graphics cards, as of a few weeks ago. NVIDIA hardware is another possible route, according to freshly posted open thoughts.

Yesterday, it was confirmed that the young startup organization had paused its utilization of XFX Speedster MERC310 RX 7900 XTX graphics cards: "the driver is still very unstable, and when it crashes or hangs we have no way of debugging it. We have no way of dumping the state of a GPU. Apparently it isn't just the MES causing these issues, it's also the Command Processor (CP). After seeing how open Tenstorrent is, it's hard to deal with this. With Tenstorrent, I feel confident that if there's an issue, I can debug and fix it. With AMD, I don't." The $15,000 TinyBox system relies on "cheaper" gaming-oriented GPUs, rather than traditional enterprise solutions—this oddball approach has attracted a number of customers, but the latest announcements likely signal another delay. Yesterday's tweet continued to state: "we are exploring Intel, working on adding Level Zero support to tinygrad. We also added a $400 bounty for XMX support. We are also (sadly) exploring a 6x GeForce RTX 4090 GPU box. At least we know the software is good there. We will revisit AMD once we have an open and reproducible build process for the driver and firmware. We are willing to dive really deep into hardware to make it amazing. But without access, we can't."

NVIDIA CEO Jensen Huang: AGI Within Five Years, AI Hallucinations are Solvable

After giving a vivid GTC talk, NVIDIA CEO Jensen Huang took on a Q&A session that raised many interesting ideas for debate. Among them were the pressing concerns surrounding AI hallucinations and the future of Artificial General Intelligence (AGI). With a tone of confidence, Huang reassured the tech community that the phenomenon of AI hallucinations—where AI systems generate plausible yet unfounded answers—is a solvable issue. His solution emphasizes the importance of well-researched and accurate data feeding into AI systems to mitigate these occurrences. "The AI shouldn't just answer; it should do research first to determine which of the answers are the best," noted Huang, adding that for every question there should be a rule that makes the AI research the answer. This points to retrieval-augmented generation (RAG), where LLMs fetch data from external sources, such as additional databases, for fact-checking.
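
The RAG pattern alluded to here can be sketched in a few lines: retrieve supporting passages first, then condition the answer on them. Everything below—the toy corpus, the keyword-overlap `retrieve` function, and the `generate_answer` placeholder standing in for an LLM call—is a hypothetical illustration, not any vendor's API:

```python
# Toy corpus; a real deployment would index documents with embeddings.
documents = [
    "NVIDIA unveiled its Blackwell GPUs at GTC 2024.",
    "Retrieval-augmented generation grounds LLM answers in external data.",
]

def retrieve(query: str, corpus: list[str]) -> str:
    """Naive keyword-overlap retrieval; production systems use vector search."""
    terms = set(query.lower().split())
    return max(corpus, key=lambda text: len(terms & set(text.lower().split())))

def generate_answer(query: str, context: str) -> str:
    # Placeholder for the LLM call: the point is that the prompt now
    # carries retrieved evidence the model can check its answer against.
    return f"[context: {context}] [question: {query}]"

prompt = generate_answer("How does RAG reduce hallucinations?",
                         retrieve("grounds LLM answers", documents))
print(prompt)
```

The key design point is that the generation step never sees the raw question alone; it always receives retrieved evidence, which is what lets the model "research the answer" rather than improvise one.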

Another interesting comment from the CEO is that the pinnacle of AI evolution—Artificial General Intelligence—is just five years away. People working in AI are divided over the AGI timeline. While Mr. Huang predicted five years, some leading researchers, like Meta's Yann LeCun, think we are far from the AGI singularity threshold and will be stuck with dog/cat-level AI systems first. AGI has long been a topic of both fascination and apprehension, with debates often revolving around its potential to exceed human intelligence and the ethical implications of such a development. Critics worry about the unpredictability and uncontrollability of AGI once it reaches a certain level of autonomy, raising questions about aligning its objectives with human values and priorities. Timeline-wise, no one knows for certain; everyone has a prediction, and time will tell who was right.

Jensen Huang Discloses NVIDIA Blackwell GPU Pricing: $30,000 to $40,000

Jensen Huang has been talking to media outlets following the conclusion of his keynote presentation at NVIDIA's GTC 2024 conference—an NBC TV "exclusive" interview with the Team Green boss has caused a stir in tech circles. Jim Cramer's long-running "Squawk on the Street" trade segment hosted Huang for just under five minutes—NBC's presenter labelled the latest edition of GTC the "Woodstock of AI." NVIDIA's leader reckoned that around $1 trillion of industry was in attendance at this year's event—folks turned up to witness the unveiling of "Blackwell" B200 and GB200 AI GPUs. In the interview, Huang estimated that his company had invested around $10 billion into the research and development of its latest architecture: "we had to invent some new technology to make it possible."

Industry watchers have seized on a major revelation from the televised NBC report: Huang disclosed that his next-gen AI GPUs "will cost between $30,000 and $40,000 per unit." NVIDIA (and its rivals) are not known to publicly announce price ranges for AI and HPC chips—leaks from hardware partners and individuals within industry supply chains are the "usual" sources. An investment banking company has already delved into alleged Blackwell production costs—as shared by Tae Kim/firstadopter: "Raymond James estimates it will cost NVIDIA more than $6000 to make a B200 and they will price the GPU at a 50-60% premium to H100...(the bank) estimates it costs NVIDIA $3320 to make the H100, which is then sold to customers for $25,000 to $30,000." Huang's disclosure should be treated as an approximation, since his company (normally) deals with the supply of basic building blocks.
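Taking the reported figures at face value (the H100 price band and the 50-60% premium are the bank's estimates quoted above, not confirmed NVIDIA numbers), a quick back-of-the-envelope check of the implied B200 price band looks like this:

```python
# Reported figures (approximate, per the Raymond James estimate quoted above).
h100_price_low, h100_price_high = 25_000, 30_000   # H100 customer price range (USD)
premium_low, premium_high = 0.50, 0.60             # claimed B200 premium over H100

# Implied B200 price band if the premium applies to the H100 price range.
b200_low = h100_price_low * (1 + premium_low)      # 25,000 * 1.5 = 37,500
b200_high = h100_price_high * (1 + premium_high)   # 30,000 * 1.6 = 48,000
```

The implied $37,500-$48,000 band sits somewhat above Huang's quoted $30,000-$40,000, which is consistent with treating his figure as a rough approximation rather than a price list.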

Ultra Ethernet Consortium Experiences Exponential Growth in Support of Ethernet for High-Performance AI

Ultra Ethernet Consortium (UEC) is delighted to announce the addition of 45 new members to its thriving community since November 2023. This remarkable influx of members underscores UEC's position as a unifying force, bringing together industry leaders to build a complete Ethernet-based communication stack architecture for high-performance networking. As a testament to UEC's commitment and the vibrant growth of its community, members shared their excitement about the recent developments. The community testimonials, accessible on our Testimonial page, reflect the positive impact UEC is having on its members. These testimonials highlight the collaborative spirit and the shared vision for the future of high-performance networking.

In the four months since November 2023, when UEC began accepting new members, the consortium has experienced an impressive growth of 450%. In October 2023, UEC boasted a distinguished membership comprising 10 steering members, marking the initial steps towards fostering collaboration in the high-performance networking sector. Now, the community is flourishing with the addition of 45 new member companies, reflecting an extraordinary expansion that demonstrates the industry's recognition of UEC's commitment. With a total of 715 industry experts actively engaged in the eight working groups, UEC is positioned at the forefront of industry collaboration, driving advancements in Ethernet-based communication technologies.

MIPS Expands Global Footprint with New Design Center and Talent for Systems Architects and AI Compute

MIPS, a leading developer of efficient and configurable compute cores, today announced the company's global expansion with the launch of a new R&D center in Austin, TX, making this the second office expansion in Texas after Dallas. MIPS plans to tap into the growing AI engineering talent in Texas and continue to build deeper roots in the community by partnering with local universities and schools. In addition to creating new job opportunities within the local community, each location will support MIPS' RISC-V research and development efforts, while furthering the company's strategic focus on giving customers the freedom to innovate compute in the AI-centric automotive, data center and embedded markets.

"MIPS' global expansion marks a strategic step forward in the company's growth, especially given our focus on AI and the wide and diverse talent available in the cities where we operate," said Sameer Wasson, CEO of MIPS. "The acceleration of AI-based processing and rapid adoption of RISC-V is on an upward trajectory as engineers continue to seek solutions that deliver the ability to innovate and design without constraints. We are rapidly growing our team and accelerating product roadmaps to enable AI-based systems with better scalability, low power efficiency, real-time multi-threading processing and enhanced configurability, while reducing customers' time to market."