News Posts matching #AI


NVIDIA Issues Patches for ChatRTX AI Chatbot, Susceptible to Improper Privilege Management

Just a month after releasing the 0.1 beta preview of Chat with RTX, now called ChatRTX, NVIDIA has swiftly addressed critical security vulnerabilities discovered in its cutting-edge AI chatbot. The chatbot was found to be susceptible to cross-site scripting (CWE-79) and improper privilege management (CWE-269) in version 0.2 and all prior releases. The identified vulnerabilities posed significant risks to users' personal data and system security. Cross-site scripting could allow malicious actors to inject scripts into the chatbot's interface, potentially compromising sensitive information. The improper privilege management flaw could also enable attackers to escalate their privileges and gain administrative control over users' systems and files.

Upon becoming aware of these vulnerabilities, NVIDIA promptly released an updated version of ChatRTX 0.2, available for download from its official website. The latest iteration of the software addresses these security issues, providing users with a more secure experience. As ChatRTX utilizes retrieval-augmented generation (RAG) and NVIDIA TensorRT-LLM software to allow users to train the chatbot on their personal data, the presence of such vulnerabilities is particularly concerning. Users are strongly advised to update their ChatRTX software to the latest version to mitigate potential risks and protect their personal information. ChatRTX remains in beta, with no official release candidate timeline announced. As NVIDIA continues to develop and refine this innovative AI chatbot, the company must prioritize security and promptly address any vulnerabilities that may arise, ensuring a safe and reliable user experience.
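
As a generic illustration of the vulnerability class involved (CWE-79), and not NVIDIA's actual fix, here is a minimal Python sketch of escaping untrusted chat input before it is rendered as HTML; the `render_chat_message` helper is hypothetical:

```python
import html

def render_chat_message(user_text: str) -> str:
    """Escape untrusted input before embedding it in HTML markup.

    Without escaping, a message like '<script>...</script>' would be
    interpreted as live markup by the chat interface and executed.
    """
    return '<div class="message">' + html.escape(user_text) + "</div>"

print(render_chat_message("<script>alert('xss')</script>"))
# → <div class="message">&lt;script&gt;alert(&#x27;xss&#x27;)&lt;/script&gt;</div>
```

`html.escape` neutralizes angle brackets and quotes; real chat front-ends layer on output encoding per context (HTML body, attributes, JavaScript) rather than relying on a single helper.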

Microsoft Copilot to Run Locally on AI PCs with at Least 40 TOPS of NPU Performance

Microsoft, Intel, and AMD are attempting to jumpstart demand in the PC industry again, under the aegis of the AI PC—devices with native acceleration for AI workloads. Both Intel and AMD have mobile processors with on-silicon NPUs (neural processing units), which are designed to accelerate the first wave of AI-enhanced client experiences on Windows 11 23H2. Microsoft's flagship vehicle for democratizing AI has been Copilot, as a licensee of OpenAI's GPT-4, GPT-4 Turbo, DALL·E, and other generative AI tools from the OpenAI stable. Copilot is currently Microsoft's most heavily invested application, with substantial capital and the company's best minds mobilized to make it the most popular AI assistant. Microsoft has even pushed the AI PC designation onto PC OEMs, which requires them to include a dedicated Copilot key akin to the Start key (we'll see how anti-competition regulators deal with that).

The problem with Microsoft's tango with Intel and AMD to push AI PCs is that Copilot doesn't really use an NPU, not even at the edge—you input a query or a prompt, and Copilot hands it over to a cloud-based AI service. This is about to change, with Microsoft announcing that Copilot will be able to run locally on AI PCs. Microsoft identified several kinds of Copilot use cases that an NPU can handle on-device, which should speed up response times to Copilot queries—but this requires the NPU to deliver at least 40 TOPS of performance. That is a problem for the current crop of processors with NPUs. Intel's Core Ultra "Meteor Lake" has an AI Boost NPU with 10 TOPS on tap, while the Ryzen 8040 "Hawk Point" is only slightly faster, with a 16 TOPS Ryzen AI NPU. AMD has already revealed that the XDNA 2-based 2nd Generation Ryzen AI NPU in its upcoming "Strix Point" processors will deliver over 40 TOPS, and it stands to reason that the NPUs in Intel's "Arrow Lake" or "Lunar Lake" processors will be comparable in performance, which should enable on-device Copilot.
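
The reported threshold can be illustrated with a trivial sketch; the per-chip TOPS figures are the ones cited above, and the comparison itself is hypothetical (Microsoft has not published a formal qualification test):

```python
# Compare the NPUs named above against the 40 TOPS floor Microsoft
# reportedly requires for on-device Copilot.
COPILOT_LOCAL_TOPS = 40

npus = {
    "Intel Core Ultra 'Meteor Lake' (AI Boost)": 10,
    "AMD Ryzen 8040 'Hawk Point' (Ryzen AI)": 16,
    "AMD 'Strix Point' (XDNA 2, announced)": 40,  # "over 40 TOPS"
}

for name, tops in npus.items():
    verdict = "meets" if tops >= COPILOT_LOCAL_TOPS else "falls short of"
    print(f"{name}: {tops} TOPS, {verdict} the {COPILOT_LOCAL_TOPS} TOPS floor")
```

Only the announced "Strix Point" part clears the bar, which is why on-device Copilot is tied to the next processor generation rather than shipping silicon.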

Report Suggests Naver Siding with Samsung in $752 Million "Mach-1" AI Chip Deal

Samsung debuted its Mach-1 generation of AI processors during a recent shareholder meeting—the South Korean megacorp anticipates an early 2025 launch window. Their application-specific integrated circuit (ASIC) design is expected to "excel in edge computing applications," with a focus on low-power and efficiency-oriented operating environments. Naver Corporation was a key NVIDIA high-end AI customer in South Korea (and Japan), but the leading search platform firm and creator of the HyperCLOVA X LLM reportedly deliberated on adopting alternative hardware last October. The Korea Economic Daily believes that Naver's relationship with Samsung is set to grow, courtesy of a proposed $752 million investment: "the world's top memory chipmaker, will supply its next-generation Mach-1 artificial intelligence chips to Naver Corp. by the end of this year."

Reports from last December indicated that the two companies were deep into the process of co-designing power-efficient AI accelerators—Naver's main goal is to finalize a product that will offer eight times more energy efficiency than NVIDIA's H100 AI accelerator. Naver's alleged bulk order—of roughly 150,000 to 200,000 Samsung Mach-1 AI chips—appears to be a stopgap. Industry insiders reckon that Samsung's first-gen AI accelerator is much cheaper when compared to NVIDIA H100 GPU price points—a per-unit figure of $3756 is mentioned in the KED Global article. Samsung is speculated to be shopping its fledgling AI tech to Microsoft and Meta.
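
The reported numbers roughly reconcile: at the $3,756 per-unit figure, an order of 150,000 to 200,000 chips spans about $563M to $751M, with the upper bound landing almost exactly on the $752 million deal size. A quick check:

```python
# Figures cited above: KED Global's $3,756 per-unit price and an alleged
# bulk order of roughly 150,000 to 200,000 Mach-1 chips.
unit_price = 3756
order_low, order_high = 150_000, 200_000

low_total = unit_price * order_low     # 563,400,000
high_total = unit_price * order_high   # 751,200,000
print(f"${low_total / 1e6:.1f}M to ${high_total / 1e6:.1f}M")  # → $563.4M to $751.2M
```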

NVIDIA Hopper Leaps Ahead in Generative AI at MLPerf

It's official: NVIDIA delivered the world's fastest platform in industry-standard tests for inference on generative AI. In the latest MLPerf benchmarks, NVIDIA TensorRT-LLM—software that speeds and simplifies the complex job of inference on large language models—boosted the performance of NVIDIA Hopper architecture GPUs on the GPT-J LLM nearly 3x over their results just six months ago. The dramatic speedup demonstrates the power of NVIDIA's full-stack platform of chips, systems and software to handle the demanding requirements of running generative AI. Leading companies are using TensorRT-LLM to optimize their models. And NVIDIA NIM—a set of inference microservices that includes inferencing engines like TensorRT-LLM—makes it easier than ever for businesses to deploy NVIDIA's inference platform.

Raising the Bar in Generative AI
TensorRT-LLM running on NVIDIA H200 Tensor Core GPUs—the latest, memory-enhanced Hopper GPUs—delivered the fastest performance running inference in MLPerf's biggest test of generative AI to date. The new benchmark uses the largest version of Llama 2, a state-of-the-art large language model packing 70 billion parameters. The model is more than 10x larger than the GPT-J LLM first used in the September benchmarks. The memory-enhanced H200 GPUs, in their MLPerf debut, used TensorRT-LLM to produce up to 31,000 tokens/second, a record on MLPerf's Llama 2 benchmark. The H200 GPU results include up to 14% gains from a custom thermal solution. It's one example of innovations beyond standard air cooling that systems builders are applying to their NVIDIA MGX designs to take the performance of Hopper GPUs to new heights.
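
Backing the cooling contribution out of the record is straightforward: if the custom thermal solution contributed up to 14%, the standard-cooling baseline would sit near 27,000 tokens/second. A quick sanity check on the figures above:

```python
# Back the thermal-solution gain out of the MLPerf record figure.
record_tps = 31_000   # H200 + TensorRT-LLM on MLPerf's Llama 2 70B test
thermal_gain = 0.14   # "up to 14%" from the custom thermal solution

baseline_tps = record_tps / (1 + thermal_gain)
print(f"~{baseline_tps:,.0f} tokens/s with standard cooling")
```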

Intel Gaudi 2 Remains Only Benchmarked Alternative to NV H100 for Generative AI Performance

Today, MLCommons published results of the industry-standard MLPerf v4.0 benchmark for inference. Intel's results for Intel Gaudi 2 accelerators and 5th Gen Intel Xeon Scalable processors with Intel Advanced Matrix Extensions (Intel AMX) reinforce the company's commitment to bring "AI Everywhere" with a broad portfolio of competitive solutions. The Intel Gaudi 2 AI accelerator remains the only benchmarked alternative to Nvidia H100 for generative AI (GenAI) performance and provides strong performance-per-dollar. Further, Intel remains the only server CPU vendor to submit MLPerf results. Intel's 5th Gen Xeon results improved by an average of 1.42x compared with 4th Gen Intel Xeon processors' results in MLPerf Inference v3.1.

"We continue to improve AI performance on industry-standard benchmarks across our portfolio of accelerators and CPUs. Today's results demonstrate that we are delivering AI solutions that deliver to our customers' dynamic and wide-ranging AI requirements. Both Intel Gaudi and Xeon products provide our customers with options that are ready to deploy and offer strong price-to-performance advantages," said Zane Ball, Intel corporate vice president and general manager, DCAI Product Management.

SK Hynix Plans a $4 Billion Chip Packaging Facility in Indiana

SK Hynix is planning a large $4 billion chip-packaging and testing facility in Indiana, USA, though the decision to invest in the US has not been finalized. "[The company] is reviewing its advanced chip packaging investment in the US, but hasn't made a final decision yet," a company spokesperson told the Wall Street Journal. The primary product focus for this plant will be stacked HBM memory meant to be consumed by the AI GPU and self-driving automobile industries. The plant could also focus on other exotic memory types, such as high-density server memory, and perhaps even compute-in-memory. The plant is expected to start operations in 2028, and will create up to 1,000 skilled jobs. SK Hynix is counting on state and federal tax incentives, under government initiatives such as the CHIPS Act, to propel this investment. SK Hynix is a significant supplier of HBM to NVIDIA for its AI GPUs; its HBM3E features in the latest NVIDIA "Blackwell" GPUs.

Intel Announces New Program for AI PC Software Developers and Hardware Vendors

Intel Corporation today announced the creation of two new artificial intelligence (AI) initiatives as part of the AI PC Acceleration Program: the AI PC Developer Program and the addition of independent hardware vendors to the program. These are critical milestones in Intel's pursuit of enabling the software and hardware ecosystem to optimize and maximize AI on more than 100 million Intel-based AI PCs through 2025.

"We have made great strides with our AI PC Acceleration Program by working with the ecosystem. Today, with the addition of the AI PC Developer Program, we are expanding our reach to go beyond large ISVs and engage with small- and medium-sized players and aspiring developers. Our goal is to drive a frictionless experience by offering a broad set of tools including the new AI-ready Developer Kit," said Carla Rodriguez, Intel vice president and general manager of Client Software Ecosystem Enabling.

Report: China's PC Market Set for Return to Growth of 3% in 2024

Canalys anticipates that China's PC market (excluding tablets) will rebound to 3% growth in 2024 and 10% growth in 2025, primarily fueled by refresh demand from the commercial sector. The tablet market is expected to grow by 4% in both 2024 and 2025, benefiting from increasing penetration as digitalization deepens.

"2024 is expected to bring modest relief to a struggling PC market in China, but a challenging environment will remain," said Canalys Analyst Emma Xu. "Ongoing economic structural adjustments are a key priority as the government seeks new avenues for economic growth, with a core focus on technology-driven innovation. AI emerged as a central theme during the latest 'Two Sessions' in China, with enthusiasm for AI spanning commercial entities and government initiatives aimed at establishing a domestic AI ecosystem across industries. Significant opportunities for the PC industry are set to arise from this commercial push, especially as it coincides with the upcoming device refresh and the emergence of AI-capable PCs."

NVIDIA Modulus & Omniverse Drive Physics-informed Models and Simulations

A manufacturing plant near Hsinchu, Taiwan's Silicon Valley, is among facilities worldwide boosting energy efficiency with AI-enabled digital twins. A virtual model can help streamline operations, maximizing throughput for its physical counterpart, say engineers at Wistron, a global designer and manufacturer of computers and electronics systems. In the first of several use cases, the company built a digital copy of a room where NVIDIA DGX systems undergo thermal stress tests. Early results were impressive.

Making Smart Simulations
Using NVIDIA Modulus, a framework for building AI models that understand the laws of physics, Wistron created digital twins that let them accurately predict the airflow and temperature in test facilities that must remain between 27 and 32 degrees C. A simulation that would've taken nearly 15 hours with traditional methods on a CPU took just 3.3 seconds on an NVIDIA GPU running inference with an AI model developed using Modulus, a whopping 15,000x speedup. The results were fed into tools and applications built by Wistron developers with NVIDIA Omniverse, a platform for creating 3D workflows and applications based on OpenUSD.
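
The quoted speedup can be sanity-checked: a 15,000x gain over a 3.3-second GPU inference implies a CPU runtime of about 13.75 hours, consistent with the "nearly 15 hours" figure:

```python
# "Nearly 15 hours" on CPU vs. 3.3 seconds on GPU: the quoted 15,000x
# speedup corresponds to a CPU runtime of 13.75 hours.
cpu_seconds = 13.75 * 3600   # 49,500 s
gpu_seconds = 3.3

speedup = cpu_seconds / gpu_seconds
print(f"{speedup:,.0f}x")    # → 15,000x
```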

UL Announces the Procyon AI Image Generation Benchmark Based on Stable Diffusion

We're excited to announce we're expanding our AI Inference benchmark offerings with the UL Procyon AI Image Generation Benchmark, coming Monday, 25th March. AI has the potential to be one of the most significant new technologies hitting the mainstream this decade, and many industry leaders are competing to deliver the best AI Inference performance through their hardware. Last year, we launched the first of our Procyon AI Inference Benchmarks for Windows, which measured AI Inference performance with a workload using Computer Vision.

The upcoming UL Procyon AI Image Generation Benchmark provides a consistent, accurate and understandable workload for measuring the AI performance of high-end hardware, built with input from members of the industry to ensure fair and comparable results across all supported hardware.

UGREEN Unveils Its First Network-Attached Storage Solutions

UGREEN, a leading innovator in consumer electronics, is excited to announce the launch of its inaugural Network Attached Storage (NAS) series. The launch is scheduled for March 26 on the popular crowdfunding platform Kickstarter. This campaign is specifically aimed at users in the United States and Germany. As a special incentive, UGREEN is offering an early bird discount of 40%, with prices commencing at just $239.99. Visit Kickstarter.com to be among the first to experience UGREEN's innovative NASync series.

The UGREEN NASync series is a versatile range designed to cater to a variety of use scenarios. The NASync DXP2800, DXP4800, and DXP4800 Plus are tailored for personal and home users. For power users and business solutions, UGREEN offers the NASync DXP6800 Pro and NASync DXP8800 Plus. Lastly, the NASync DXP480T Plus is specifically designed to meet the needs of creative and media professionals.

Tiny Corp. Prepping Separate AMD & NVIDIA GPU-based AI Compute Systems

George Hotz and his startup operation (Tiny Corporation) appeared ready to completely abandon AMD Radeon GPUs last week, after experiencing a period of firmware-related headaches. The original plan involved the development of a pre-orderable $15,000 TinyBox AI compute cluster housing six XFX Speedster MERC310 RX 7900 XTX graphics cards, but software and driver issues prompted experimentation with alternative hardware routes. Much media coverage has focused on the unusual adoption of consumer-grade GPUs—Tiny Corp.'s struggles with RDNA 3 (rather than CDNA 3) were pushed further into public view after top AMD brass pitched in.

The startup's social media feed is very transparent about showcasing everyday tasks, problem-solving and important decision-making. Several Acer Predator BiFrost Arc A770 OC cards were purchased and promptly integrated into a colorfully-lit TinyBox prototype, but Hotz & Co. swiftly moved onto Team Green pastures. Tiny Corp. has begrudgingly adopted NVIDIA GeForce RTX 4090 GPUs. Earlier today, it was announced that work on the AMD-based system has resumed—although customers were forewarned about anticipated teething problems. The surprising message arrived in the early hours: "a hard to find 'umr' repo has turned around the feasibility of the AMD TinyBox. It will be a journey, but it gives us an ability to debug. We're going to sell both, red for $15,000 and green for $25,000. When you realize your pre-order you'll choose your color. Website has been updated. If you like to tinker and feel pain, buy red. The driver still crashes the GPU and hangs sometimes, but we can work together to improve it."

NVIDIA's Bryan Catanzaro Discusses Future of AI Personal Computing

Imagine a world where you can whisper your digital wishes into your device, and poof, it happens. That world may be coming sooner than you think. But if you're worried about AI doing your thinking for you, you might be waiting for a while. In a fireside chat Wednesday (March 20) at NVIDIA GTC, the global AI conference, Kanjun Qiu, CEO of Imbue, and Bryan Catanzaro, VP of applied deep learning research at NVIDIA, challenged many of the clichés that have long dominated conversations about AI. Launched in October 2022, Imbue made headlines with its Series B fundraiser last year, raising over $200 million at a $1 billion valuation.

The Future of Personal Computing
Qiu and Catanzaro discussed the role that virtual worlds will play in this, and how they could serve as interfaces for human-technology interaction. "I think it's pretty clear that AI is going to help build virtual worlds," said Catanzaro. "I think the maybe more controversial part is virtual worlds are going to be necessary for humans to interact with AI." People have an almost primal fear of being displaced, Catanzaro said, but what's much more likely is that our capabilities will be amplified as the technology fades into the background. Catanzaro compared it to the adoption of electricity. A century ago, people talked a lot about electricity. Now that it's ubiquitous, it's no longer the focus of broader conversations, even as it makes our day-to-day lives better.

Qualcomm Announces the Snapdragon 7+ Gen 3, Featuring Exceptional On-Device AI Capabilities

Qualcomm Technologies, Inc., unveiled today the Snapdragon 7+ Gen 3 Mobile Platform, bringing on-device generative AI into the Snapdragon 7 series. The Mobile Platform supports a wide range of AI models including large language models (LLMs) such as Baichuan-7B, Llama 2, and Gemini Nano. Fueling extraordinary entertainment capabilities, Snapdragon 7+ Gen 3 also brings new select Snapdragon Elite Gaming features to the 7-series including Game Post Processing Accelerator and Adreno Frame Motion Engine 2, enhancing game effects and upscaling gaming content for desktop-level visuals. Plus, this platform brings top-notch photography features with our industry-leading 18-bit cognitive ISP.

"Today, we embark on the latest expansion in the 7-series to create new levels of entertainment for consumers - integrating next-generation technologies for richer experiences," said Chris Patrick, senior vice president and general manager of mobile handsets, Qualcomm Technologies, Inc. "Snapdragon 7+ Gen 3 is packed with support for incredible on-device generative AI features and provides incredible performance and power efficiency, while bringing Wi-Fi 7 to the Snapdragon 7 Series for the first time."

MediaTek Licenses NVIDIA GPU IP for AI-Enhanced Vehicle Processors

NVIDIA has offered its GPU IP for licensing for more than a decade, ever since the introduction of the Kepler microarchitecture, though it has gained relatively little traction in third-party SoCs. That trend seems to be reaching an inflection point, as NVIDIA has licensed its GPU IP to MediaTek for the next generation of processors for the auto industry. The newest MediaTek Dimensity Auto Cockpit family consists of the CX-1, CY-1, CM-1, and CV-1, where the CX-1 targets premium vehicles, the CM-1 the mid-range, and the CV-1 lower-end vehicles, probably differentiated by their compute capabilities. The Dimensity Auto Cockpit family is brimming with the latest technology: the processor core of choice is an Armv9-based design paired with "next-generation" NVIDIA GPU IP, possibly referring to Blackwell, capable of ray tracing and DLSS 3, powered by RTX and DLA.

The SoC is supposed to integrate a lot of technology to lower BOM costs of auto manufacturing, and it includes silicon for controlling displays, cameras (advanced HDR ISP), audio streams (multiple audio DSPs), and connectivity (Wi-Fi networking). Interestingly, the SKUs can play movies with AI-enhanced video and support AAA gaming. MediaTek touts the Dimensity Auto Cockpit family as offering fully local AI processing, without requiring assistance from outside servers, plus 3D spatial sensing with driver and occupant monitoring, a gaze-aware UI, and natural controls. All of that fits into an SoC fabricated at TSMC on a 3 nm process and runs on the industry-established NVIDIA DRIVE OS.

Alibaba Unveils Plans for Server-Grade RISC-V Processor and RISC-V Laptop

Chinese e-commerce and cloud giant Alibaba announced its plans to launch a server-grade RISC-V processor later this year, and it showcased a RISC-V-powered laptop running an open-source operating system. The announcements were made by Alibaba's research division, the Damo Academy, at the recent Xuantie RISC-V Ecological Conference in Shenzhen. The upcoming server-class processor, called the Xuantie C930, is expected to be launched by the end of 2024. While specific details about the chip have not been disclosed, it is anticipated to cater to AI and server workloads. This development is part of Alibaba's ongoing efforts to expand its RISC-V portfolio and reduce reliance on foreign chip technologies amidst US export restrictions. To complement the C930, Alibaba is also preparing a Xuantie 907 matrix processing unit for AI, which could be an IP block inside an SoC like the C930 or an SoC of its own.

In addition to the C930, Alibaba showcased the RuyiBOOK, a laptop powered by the company's existing T-Head C910 processor. The C910, previously designed for edge servers, AI, and telecommunications applications, has been adapted for use in laptops. Strangely, the RuyiBOOK laptop runs on the openEuler operating system, an open-source version of Huawei's EulerOS, which is based on Red Hat Linux. The laptop also features Alibaba's collaboration suite, DingTalk, and the open-source office suite LibreOffice, demonstrating its potential to cater to the needs of Chinese knowledge workers and consumers without relying on foreign software. Zhang Jianfeng, president of the Damo Academy, emphasized the increasing demand for new computing power and the potential for RISC-V to enter a period of "application explosion." Alibaba plans to continue investing in RISC-V research and development and fostering collaboration within the industry to promote innovation and growth in the RISC-V ecosystem, lessening reliance on US-sourced technology.

Samsung Roadmaps UFS 5.0 Storage Standard, Predicts Commercialization by 2027

Mobile tech tipster Revegnus has highlighted an interesting Samsung presentation slide—according to machine translation, the company's electronics division is already responding to an anticipated growth of "client-side large language model" service development. This market trend will demand improved Universal Flash Storage (UFS) interface speeds—Samsung engineers are currently engaged in "developing a new product that uses UFS 4.0 technology, but increases the number of channels from the current 2 to 4." The upcoming "more advanced" UFS 4.0 storage chips could be beefy enough to be utilized alongside next-gen mobile processors in 2025. For example, Arm is gearing up "Blackhawk," the Cortex-X4's successor—industry watchers reckon that the new core is designed to deliver "great Large Language Model (LLM) performance" on future smartphones. Samsung's roadmap outlines another major R&D goal, but this prospect is far from finalization—their chart reveals an anticipated 2027 rollout. The slide's body of text included a brief teaser: "at the same time, we are also actively participating in discussions on the UFS 5.0 standard."
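
The bandwidth motivation for moving from two to four channels can be sketched with back-of-envelope arithmetic. The ~2,900 MB/s per-channel figure below is an assumption for illustration, loosely derived from UFS 4.0's quoted ~5,800 MB/s interface throughput over two channels, and is not a Samsung specification:

```python
# If per-channel throughput is held constant, doubling the channel
# count from 2 to 4 doubles the aggregate interface bandwidth.
per_channel_mbps = 2_900  # assumed per-channel rate, for illustration only

for channels in (2, 4):
    print(f"{channels} channels: ~{per_channel_mbps * channels:,} MB/s")
```

Real-world gains would depend on the NAND array and controller keeping all four channels fed, not just the interface width.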

Sony Semiconductor Solutions Selects Cutting-Edge AMD Adaptive Computing Tech

Yesterday, AMD announced that its cutting-edge adaptive computing technology was selected by Sony Semiconductor Solutions (SSS) for its newest automotive LiDAR reference design. SSS, a global leader in image sensor technology, and AMD joined forces to deliver a powerful and efficient LiDAR solution for use in autonomous vehicles. Using adaptive computing technology from AMD significantly extends the SSS LiDAR system capabilities, offering extraordinary accuracy, fast data processing, and high reliability for next-generation autonomous driving solutions.

In the rapidly evolving landscape of autonomous driving, the demand for precise and reliable sensor technology has never been greater. LiDAR (Light Detection and Ranging) technology plays a pivotal role in enabling depth perception and environmental mapping for various industries. LiDAR delivers image classification, segmentation, and object detection data that is essential for 3D vision perception enhanced by AI, which cannot be provided by cameras alone, especially in low-light or inclement weather. The dedicated LiDAR reference design addresses the complexities of autonomous vehicle development with a standardized platform to enhance safety in navigating diverse driving scenarios.

Altair SimSolid Transforms Simulation for Electronics Industry

Altair, a global leader in computational intelligence, announced the upcoming release of Altair SimSolid for electronics, bringing game-changing fast, easy, and precise multi-physics scenario exploration for electronics, from chips, PCBs, and ICs to full system design. "As the electronics industry pushes the boundaries of complexity and miniaturization, engineers have struggled with simulations that often compromise on detail for expediency. Altair SimSolid will empower engineers to capture the intricate complexities of PCBs and ICs without simplification," said James R. Scapa, founder and chief executive officer, Altair. "Traditional simulation methods often require approximations when analyzing PCB structures due to their complexity. Altair SimSolid eliminates these approximations to run more accurate simulations for complex problems with vast dimensional disparities."

Altair SimSolid has revolutionized conventional analysis in its ability to accurately predict complex structural problems with blazing-fast speed while eliminating the complexity of laborious hours of modeling. It eliminates geometry simplification and meshing, the two most time-consuming and expertise-intensive tasks done in traditional finite element analysis. As a result, it delivers results in seconds to minutes—up to 25x faster than traditional finite element solvers—and effortlessly handles complex assemblies. Having experienced fast adoption in the aerospace and automotive industries, two sectors that typically experience challenges associated with massive structures, Altair SimSolid is poised to play a significant role in the electronics market. The initial release, expected in Q2 2024, will support structural and thermal analysis for PCBs and ICs with full electromagnetics analysis coming in a future release.

Samsung Prepares Mach-1 Chip to Rival NVIDIA in AI Inference

During its 55th annual shareholders' meeting, Samsung Electronics announced its entry into the AI processor market with the upcoming launch of its Mach-1 AI accelerator chips in early 2025. The South Korean tech giant revealed its plans to compete with established players like NVIDIA in the rapidly growing AI hardware sector. The Mach-1 generation of chips is an application-specific integrated circuit (ASIC) design equipped with LPDDR memory that is envisioned to excel in edge computing applications. While Samsung does not aim to directly rival NVIDIA's ultra-high-end AI solutions like the H100, B100, or B200, the company's strategy focuses on carving out a niche in the market by offering unique features and performance enhancements at the edge, where low power and efficient computing is what matters the most.

According to SeDaily, the Mach-1 chips boast a groundbreaking feature that significantly reduces memory bandwidth requirements for inference to approximately 0.125x compared to existing designs, which is an 87.5% reduction. This innovation could give Samsung a competitive edge in terms of efficiency and cost-effectiveness. As the demand for AI-powered devices and services continues to soar, Samsung's foray into the AI chip market is expected to intensify competition and drive innovation in the industry. While NVIDIA currently holds a dominant position, Samsung's cutting-edge technology and access to advanced semiconductor manufacturing nodes could make it a formidable contender. The Mach-1 has been field-verified on an FPGA, while the final design is currently going through a physical design for SoC, which includes placement, routing, and other layout optimizations.
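
The arithmetic behind that claim is simple: cutting bandwidth to 0.125x of the baseline is an eightfold reduction:

```python
baseline_bw = 1.0
mach1_bw = baseline_bw * 0.125   # "approximately 0.125x" of existing designs

reduction_pct = (1 - mach1_bw / baseline_bw) * 100
print(f"{reduction_pct:.1f}% reduction")   # → 87.5% reduction, i.e. an 8x cut
```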

Tiny Corp. Pauses Development of AMD Radeon GPU-based Tinybox AI Cluster

George Hotz and his Tiny Corporation colleagues were pinning their hopes on AMD delivering some good news earlier this month. The development of a "TinyBox" AI compute cluster project hit some major roadblocks a couple of weeks ago—at the time, Radeon RX 7900 XTX GPU firmware was not gelling with Tiny Corp.'s setup. Hotz expressed "70% confidence" in AMD approving open-sourcing certain bits of firmware. At the time of writing this has not transpired—this week the Tiny Corp. social media account has, once again, switched to an "all guns blazing" mode. Hotz and Co. have publicly disclosed that they were dabbling with Intel Arc graphics cards, as of a few weeks ago. NVIDIA hardware is another possible route, according to freshly posted open thoughts.

Yesterday, it was confirmed that the young startup organization had paused its utilization of XFX Speedster MERC310 RX 7900 XTX graphics cards: "the driver is still very unstable, and when it crashes or hangs we have no way of debugging it. We have no way of dumping the state of a GPU. Apparently it isn't just the MES causing these issues, it's also the Command Processor (CP). After seeing how open Tenstorrent is, it's hard to deal with this. With Tenstorrent, I feel confident that if there's an issue, I can debug and fix it. With AMD, I don't." The $15,000 TinyBox system relies on "cheaper" gaming-oriented GPUs, rather than traditional enterprise solutions—this oddball approach has attracted a number of customers, but the latest announcements likely signal another delay. Yesterday's tweet continued to state: "we are exploring Intel, working on adding Level Zero support to tinygrad. We also added a $400 bounty for XMX support. We are also (sadly) exploring a 6x GeForce RTX 4090 GPU box. At least we know the software is good there. We will revisit AMD once we have an open and reproducible build process for the driver and firmware. We are willing to dive really deep into hardware to make it amazing. But without access, we can't."

NVIDIA CEO Jensen Huang: AGI Within Five Years, AI Hallucinations are Solvable

After giving a vivid GTC talk, NVIDIA CEO Jensen Huang took on a Q&A session that raised many interesting ideas for debate. Among them were the pressing concerns surrounding AI hallucinations and the future of Artificial General Intelligence (AGI). With a tone of confidence, Huang reassured the tech community that the phenomenon of AI hallucinations—where AI systems generate plausible yet unfounded answers—is a solvable issue. His solution emphasizes the importance of well-researched and accurate data feeding into AI systems to mitigate these occurrences. "The AI shouldn't just answer; it should do research first to determine which of the answers are the best," noted Huang, adding that for every question there should be a rule that makes the AI research the answer. This points to retrieval-augmented generation (RAG), where LLMs fetch data from external sources, such as additional databases, for fact-checking.
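
The RAG pattern alluded to here can be sketched in a few lines: retrieve supporting passages first, then condition the answer on them. Everything below—the toy corpus, the keyword-overlap `retrieve` function, and the `generate_answer` placeholder standing in for an LLM call—is a hypothetical illustration, not any vendor's API:

```python
# Toy corpus; a real deployment would index documents with embeddings.
documents = [
    "NVIDIA unveiled its Blackwell GPUs at GTC 2024.",
    "Retrieval-augmented generation grounds LLM answers in external data.",
]

def retrieve(query: str, corpus: list[str]) -> str:
    """Naive keyword-overlap retrieval; production systems use vector search."""
    terms = set(query.lower().split())
    return max(corpus, key=lambda text: len(terms & set(text.lower().split())))

def generate_answer(query: str, context: str) -> str:
    # Placeholder for the LLM call: the point is that the prompt now
    # carries retrieved evidence the model can check its answer against.
    return f"[context: {context}] [question: {query}]"

prompt = generate_answer("How does RAG reduce hallucinations?",
                         retrieve("grounds LLM answers", documents))
print(prompt)
```

The key design point is that the generation step never sees the raw question alone; it always receives retrieved evidence, which is what lets the model "research the answer" rather than improvise one.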

Another interesting comment from the CEO is that the pinnacle of AI evolution—Artificial General Intelligence—is just five years away. People working in AI are divided over the AGI timeline. While Mr. Huang predicted five years, some leading researchers, like Meta's Yann LeCun, think we are far from the AGI singularity threshold and will be stuck with dog/cat-level AI systems first. AGI has long been a topic of both fascination and apprehension, with debates often revolving around its potential to exceed human intelligence and the ethical implications of such a development. Critics worry about the unpredictability and uncontrollability of AGI once it reaches a certain level of autonomy, raising questions about aligning its objectives with human values and priorities. Timeline-wise, no one knows for certain; everyone has a prediction, and time will tell who was right.

Jensen Huang Discloses NVIDIA Blackwell GPU Pricing: $30,000 to $40,000

Jensen Huang has been talking to media outlets following the conclusion of his keynote presentation at NVIDIA's GTC 2024 conference—an NBC TV "exclusive" interview with the Team Green boss has caused a stir in tech circles. Jim Cramer's long-running "Squawk on the Street" trade segment hosted Huang for just under five minutes—NBC's presenter labelled the latest edition of GTC the "Woodstock of AI." NVIDIA's leader reckoned that around $1 trillion of industry was in attendance at this year's event—folks turned up to witness the unveiling of "Blackwell" B200 and GB200 AI GPUs. In the interview, Huang estimated that his company had invested around $10 billion into the research and development of its latest architecture: "we had to invent some new technology to make it possible."

Industry watchers have seized on a major revelation from the televised NBC report: Huang disclosed that his next-gen AI GPUs "will cost between $30,000 and $40,000 per unit." NVIDIA (and its rivals) are not known to publicly announce price ranges for AI and HPC chips—leaks from hardware partners and individuals within industry supply chains are the "usual" sources. An investment banking company has already delved into alleged Blackwell production costs—as shared by Tae Kim/firstadopter: "Raymond James estimates it will cost NVIDIA more than $6000 to make a B200 and they will price the GPU at a 50-60% premium to H100...(the bank) estimates it costs NVIDIA $3320 to make the H100, which is then sold to customers for $25,000 to $30,000." Huang's disclosure should be treated as an approximation, since his company (normally) deals with the supply of basic building blocks.
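Taking the reported figures at face value (the H100 price band and the 50-60% premium are the bank's estimates quoted above, not confirmed NVIDIA numbers), a quick back-of-the-envelope check of the implied B200 price band looks like this:

```python
# Reported figures (approximate, per the Raymond James estimate quoted above).
h100_price_low, h100_price_high = 25_000, 30_000   # H100 customer price range (USD)
premium_low, premium_high = 0.50, 0.60             # claimed B200 premium over H100

# Implied B200 price band if the premium applies to the H100 price range.
b200_low = h100_price_low * (1 + premium_low)      # 25,000 * 1.5 = 37,500
b200_high = h100_price_high * (1 + premium_high)   # 30,000 * 1.6 = 48,000
```

The implied $37,500-$48,000 band sits somewhat above Huang's quoted $30,000-$40,000, which is consistent with treating his figure as a rough approximation rather than a price list.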

Ultra Ethernet Consortium Experiences Exponential Growth in Support of Ethernet for High-Performance AI

Ultra Ethernet Consortium (UEC) is delighted to announce the addition of 45 new members to its thriving community since November 2023. This remarkable influx of members underscores UEC's position as a unifying force, bringing together industry leaders to build a complete Ethernet-based communication stack architecture for high-performance networking. As a testament to UEC's commitment and the vibrant growth of its community, members shared their excitement about the recent developments. The community testimonials, accessible on our Testimonial page, reflect the positive impact UEC is having on its members. These testimonials highlight the collaborative spirit and the shared vision for the future of high-performance networking.

In the four months since November 2023, when UEC began accepting new members, the consortium has experienced an impressive growth of 450%. In October 2023, UEC boasted a distinguished membership comprising 10 steering members, marking the initial steps towards fostering collaboration in the high-performance networking sector. Now, the community is flourishing with the addition of 45 new member companies, reflecting an extraordinary expansion that demonstrates the industry's recognition of UEC's commitment. With a total of 715 industry experts actively engaged in the eight working groups, UEC is positioned at the forefront of industry collaboration, driving advancements in Ethernet-based communication technologies.

MIPS Expands Global Footprint with New Design Center and Talent for Systems Architects and AI Compute

MIPS, a leading developer of efficient and configurable compute cores, today announced the company's global expansion with the launch of a new R&D center in Austin, TX, making this the second office expansion in Texas after Dallas. MIPS plans to tap into the growing AI engineering talent in Texas and continue to build deeper roots in the community by partnering with local universities and schools. In addition to creating new job opportunities within the local community, each location will support MIPS' RISC-V research and development efforts, while furthering the company's strategic focus on giving customers the freedom to innovate compute in the AI-centric automotive, data center and embedded markets.

"MIPS' global expansion marks a strategic step forward in the company's growth, especially given our focus on AI and the wide and diverse talent available in the cities where we operate," said Sameer Wasson, CEO of MIPS. "The acceleration of AI-based processing and rapid adoption of RISC-V is on an upward trajectory as engineers continue to seek solutions that deliver the ability to innovate and design without constraints. We are rapidly growing our team and accelerating product roadmaps to enable AI-based systems with better scalability, low power efficiency, real-time multi-threading processing and enhanced configurability, while reducing customers' time to market."