News Posts matching #AI

Microsoft Investment in Mistral Attracts Possible Investigation by EU Regulators

Tech giant Microsoft and Paris-based startup Mistral AI, an innovator in open-source AI model development, have announced a new multi-year partnership to accelerate AI innovation and expand access to Mistral's state-of-the-art models. The collaboration will leverage Azure's cutting-edge AI infrastructure to propel Mistral's research and bring its innovations to more customers globally. The partnership focuses on three core areas. First, Microsoft will provide Mistral with Azure AI supercomputing infrastructure to power advanced AI training and inference for Mistral's flagship models like Mistral-Large. Second, the companies will collaborate on AI research and development to push the boundaries of AI models. And third, Azure's enterprise capabilities will give Mistral additional opportunities to promote, sell, and distribute its models to Microsoft customers worldwide.

However, an investment in a European startup rarely escapes the watchful eye of European Union authorities and regulators. According to Bloomberg, an EU spokesperson said on Tuesday that regulators will analyze Microsoft's investment in Mistral after receiving a copy of the agreement between the two parties. While there is no formal investigation yet, continued probing of Microsoft's deal and intentions could escalate into a full formal investigation, which could lead to the termination of Microsoft's plans. For now the matter remains a preliminary review, but investing in EU startups might become unattractive for American tech giants if EU regulators continue to scrutinize every investment made in companies based on EU soil.

Samsung Electronics Joins AI-RAN Alliance as a Founding Member

Samsung Electronics announced that it is participating in the AI-RAN Alliance as a founding member, with the goal of promoting 6G innovation by combining AI and wireless communication technologies. Officially launched today at Mobile World Congress (MWC) Barcelona 2024, the AI-RAN Alliance is an organization aimed at accelerating the convergence of AI and wireless communication and leading technology innovation through cooperation with member companies. A total of eleven organizations—including Samsung, Arm, Ericsson, Microsoft, Nokia, NVIDIA, SoftBank and Northeastern University—are participating as founding members. The new alliance will collaborate on the development of innovative new technologies, as well as the application of these technologies to commercial products in preparation for the upcoming 6G era.

"Emerging services in the 6G era will revolutionize the way people interact with technology, and AI will be an integral part of this trend," said Charlie Zhang, Senior Vice President at Samsung Research America. "The AI-RAN Alliance will foster collaboration, drive innovation and usher in a new era of transformation around AI and 6G networks. We believe this coalition will create new value for end users and operators through AI-based use cases and innovations."

Lenovo Unveils Trailblazing Products and Solutions Designed to Power AI for All at MWC 2024

Today at MWC 2024, Lenovo unveiled its latest portfolio of purpose-built AI devices, software, and infrastructure solutions, and showcased two proof-of-concept devices that challenge traditional PC and smartphone form factors. The company also outlined its vision of hybrid AI fueling multi-device, software, and service offerings for greater personalization, collaboration, and efficiency.

"Lenovo's suite of AI-enabled, AI-ready, and AI-optimized devices, infrastructure, solutions, and services at MWC provides a wider look at our vision for "AI for All"," said Lenovo Chairman and CEO, Yuanqing Yang. "Lenovo's AI technology benefits organizations of all sizes, driving intelligent transformation across all industries while reinforcing our commitment to sustainability."

Intel Announces New Edge Platform for Scaling AI Applications

At MWC 2024, Intel announced its new Edge Platform, a modular, open software platform enabling enterprises to develop, deploy, run, secure, and manage edge and AI applications at scale with cloud-like simplicity. Together, these capabilities will accelerate time-to-scale deployment for enterprises, contributing to improved total cost of ownership (TCO).

"The edge is the next frontier of digital transformation, being further fueled by AI. We are building on our strong customer base in the market and consolidating our years of software initiatives to the next level in delivering a complete edge-native platform, which is needed to enable infrastructure, applications and efficient AI deployments at scale. Our modular platform is exactly that, driving optimal edge infrastructure performance and streamlining application management for enterprises, giving them both improved competitiveness and improved total cost of ownership," said Pallavi Mahajan, Intel corporate vice president and general manager of Network and Edge Group Software.

LG Announces US Pricing and Availability of 2024 Gram Pro Notebooks

LG Electronics USA (LG) today announced pricing and availability of its 2024 premium lineup of laptops: the LG gram Pro and LG gram Pro 2-in-1. The 16- and 17-inch LG gram Pro models retail for $2,399 (16Z90SP-A.ADB9U1) and $2,499 (17Z90SP-E.ADB9U1) respectively, while the CES 2024 Innovation Award-winning 16-inch gram Pro 2-in-1 (16T90SP-K.ADB9U1) retails for $2,099. For a limited time, customers shopping on LG.com can pre-order the 16- and 17-inch LG gram Pro and 16-inch LG gram Pro 2-in-1.

During the pre-order period, which runs from February 21 to March 10, 2024, customers can purchase the 32 GB RAM/2 TB SSD LG gram Pro laptop for the price of a 16 GB RAM/1 TB SSD model of the same screen size. They'll also receive an LG gram +view IPS portable monitor (16MR70.ASDU) and expedited shipping at no additional cost. All standard terms of purchase apply.

HONOR Introduces HONOR MagicBook Pro 16

Global technology brand HONOR today unveiled the HONOR MagicBook Pro 16, a revolutionary AI-powered laptop set to redefine the traditional laptop landscape. Building on HONOR's platform-level AI capabilities and joint efforts with technology partners such as Microsoft, Intel and NVIDIA, HONOR is bringing PCs into the AI PC era, marking a significant milestone in computing that offers users an unparalleled AI experience and transforms how they interact with their devices.

"At HONOR, our commitment lies in embracing open collaboration with industry partners to foster a flourishing ecosystem. We firmly believe in the transformative power of collaborative synergy, especially in the era of AI. By leveraging the collective expertise of top industry players, we are dedicated to crafting exceptional products and delivering unparalleled experiences to consumers worldwide," said George Zhao, CEO at HONOR.

SmartCow introduces Uranus Plus AI Fanless Embedded System Powered by NVIDIA Jetson Orin

SmartCow, an AI engineering company specializing in building complex hardware and software solutions for artificial intelligence at the edge, announces the launch of its latest product, Uranus Plus, an AI fanless embedded system powered by the latest NVIDIA Jetson Orin NX and Jetson Orin Nano system-on-modules. With its thermally efficient design and compact form factor, Uranus Plus is suitable for various smart applications. Uranus Plus comes with options for 5G, 4G, and Wi-Fi connectivity and includes a 256 GB NVMe SSD, enabling the simultaneous operation of multiple neural networks and the processing of high-resolution images, setting a new benchmark for AI-driven capabilities at the edge with support for up to 100 TOPS of AI compute.

Uranus Plus supercharges vision AI application development at the edge with NVIDIA Metropolis Microservices for Jetson through app stack modernization. Uranus Plus developers now get access to the latest generative AI capabilities through simple API calls, along with a far faster path to development and cloud-native deployment of vision AI applications at the far edge.

Intel Brings AI Everywhere Across Network, Edge, Enterprise

At MWC 2024, Intel announced new platforms, solutions and services spanning network and edge AI, Intel Core Ultra processors and the AI PC, and more. In an era where technological advancements are integral to staying competitive, Intel is delivering products and solutions for its customers, partners and expansive ecosystem to capitalize on the emerging opportunities of artificial intelligence and built-in automation, to improve total cost of ownership (TCO) and operational efficiency, and to deliver new innovations and services.

Across today's announcements, Intel is focused on empowering the industry to further modernize and monetize 5G, edge and enterprise infrastructures and investments, and to take advantage of bringing AI Everywhere. For more than a decade, and alongside Intel's customers and partners, the company has been transforming today's network infrastructure from fixed-function to a software-defined platform and driving success at the edge with more than 90,000 real-world deployments.

Intel Optimizes PyTorch for Llama 2 on Arc A770, Higher Precision FP16

Intel just announced optimizations for PyTorch (IPEX) to take advantage of the AI acceleration features of its Arc "Alchemist" GPUs. PyTorch is a popular machine learning library that is often associated with NVIDIA GPUs, but it is actually platform-agnostic. It can be run on a variety of hardware, including CPUs and GPUs. However, performance may not be optimal without specific optimizations. Intel offers such optimizations through the Intel Extension for PyTorch (IPEX), which extends PyTorch with optimizations specifically designed for Intel's compute hardware.

Intel released a blog post detailing how to run Meta AI's Llama 2 large language model on its Arc "Alchemist" A770 graphics card. The model requires 14 GB of GPU RAM, so a 16 GB version of the A770 is recommended. This development could be seen as a direct response to NVIDIA's Chat with RTX tool, which lets GeForce users with RTX 30-series "Ampere" and RTX 40-series "Ada" GPUs carrying more than 8 GB of VRAM run PyTorch LLM models on their graphics cards. NVIDIA achieves lower VRAM usage by distributing INT4-quantized versions of the models, while Intel uses a higher-precision FP16 version. In theory, this should not have a significant impact on the results. The blog post provides instructions on how to set up Llama 2 inference with PyTorch (IPEX) on the A770.
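In broad strokes, the setup follows the usual IPEX pattern: load the model, move it to the "xpu" device that IPEX registers, and let the extension optimize it. The sketch below is a minimal illustration of that pattern, not Intel's exact recipe; the model ID, prompt, and generation parameters are assumptions, and Intel's blog post should be followed for a working install.

```python
# Minimal sketch: FP16 Llama 2 inference on an Intel Arc GPU via IPEX.
# Assumes intel_extension_for_pytorch (built with XPU support), transformers,
# and access to the Llama 2 weights on the Hugging Face Hub.
import torch
import intel_extension_for_pytorch as ipex  # registers the "xpu" device
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-2-7b-chat-hf"  # assumed model ID
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Load in FP16, matching the higher-precision approach described above.
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16)
model = model.eval().to("xpu")                     # move to the Arc GPU
model = ipex.optimize(model, dtype=torch.float16)  # apply IPEX optimizations

prompt = "Explain what a large language model is in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to("xpu")
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```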

NVIDIA Prepared to Offer Custom Chip Designs to AI Clients

NVIDIA is reported to be setting up an AI-focused semi-custom chip design business unit, according to inside sources known to Reuters—it is believed that Team Green leadership is adapting to demands leveraged by key data-center customers. Many companies are seeking cheaper alternatives, or have devised their own designs (budget/war chest permitting)—NVIDIA's current range of AI GPUs are simply off-the-shelf solutions. OpenAI has generated the most industry noise—its alleged early 2024 fund-raising pursuits have attracted plenty of interest, both speculative and serious, from notable semiconductor industry figures.

Team Green is seemingly reacting to emerging market trends—Jensen Huang (CEO, president and co-founder) has hinted that NVIDIA custom chip design services are on the cusp. Stephen Nellis—a Reuters reporter specializing in tech industry developments—has highlighted select quotes from an upcoming interview with the NVIDIA boss: "We're always open to do that. Usually, the customization, after some discussion, could fall into system reconfigurations or recompositions of systems." The Team Green chief teased that his engineering team is prepared to take on the challenge of meeting exact requests: "But if it's not possible to do that, we're more than happy to do a custom chip. And the benefit to the customer, as you can imagine, is really quite terrific. It allows them to extend our architecture with their know-how and their proprietary information." The rumored NVIDIA semi-custom chip design business unit could be introduced in an official capacity at next month's GTC 2024 conference.

NVIDIA Expects Upcoming Blackwell GPU Generation to be Capacity-Constrained

NVIDIA is anticipating supply issues for its upcoming Blackwell GPUs, which are expected to significantly improve artificial intelligence compute performance. "We expect our next-generation products to be supply constrained as demand far exceeds supply," said Colette Kress, NVIDIA's chief financial officer, during a recent earnings call. This prediction of scarcity comes just days after an analyst noted much shorter lead times for NVIDIA's current flagship Hopper-based H100 GPUs tailored to AI and high-performance computing. The eagerly anticipated Blackwell architecture and B100 GPUs built on it promise major leaps in capability—likely spurring NVIDIA's existing customers to place pre-orders already. With skyrocketing demand in the red-hot AI compute market, NVIDIA appears poised to capitalize on the insatiable appetite for ever-greater processing power.

However, the scarcity of NVIDIA's products may present an excellent opportunity for major rivals like AMD and Intel. If either company can offer a product that beats NVIDIA's current H100 and provide a suitable software stack, customers may be willing to jump to their offerings rather than wait through months-long lead times. Intel is preparing the next-generation Gaudi 3 and working on the Falcon Shores accelerator for AI and HPC. AMD is shipping its Instinct MI300 accelerator, a highly competitive product, while already working on the MI400 generation. It remains to be seen whether AI companies will begin adopting non-NVIDIA hardware or remain loyal customers and accept the longer lead times of the new Blackwell generation. However, capacity constraints should only be a problem at launch, with availability improving from quarter to quarter. As TSMC expands CoWoS packaging capacity and 3 nm production, NVIDIA's allocation of 3 nm wafers will likely grow over time as the company shifts its priority from H100 to B100.

SK Hynix VP Reveals HBM Production Volumes for 2024 are Sold Out

SK Hynix Vice President Kitae Kim presides over the company's HBM Sales & Marketing (S&M) department—an official leadership blog profile reveals that the executive played a key role in making the South Korean supplier's high bandwidth memory (HBM) product line "a superstar of the semiconductor memory industry in 2023." Growing demand for powerful AI processors has placed SK Hynix in a more comfortable position following a recessionary spell that included a major sales downturn in 2022. NVIDIA is the market leader in AI processing chips, and many of its flagship enterprise designs are fitted with cutting-edge SK Hynix memory modules. Kim noted that his firm has many notable international clients: "HBM is a revolutionary product which has challenged the notion that semiconductor memory is only one part of an overall system...in particular, SK Hynix's HBM has outstanding competitiveness. Our advanced technology is highly sought after by global tech companies."

The VP outlined how artificial intelligence industries are fueling innovation: "With the diversification and advancement of generative AI services, demand for HBM, an AI memory solution, has also exploded. HBM, with its high-performance and high-capacity characteristics, is a monumental product that shakes the conventional wisdom that memory semiconductors are only a part of the overall system. In particular, SK Hynix HBM's competitiveness is outstanding." Business is booming, so much so that nothing can be added to this year's HBM order books: "Proactively securing customer purchase volumes and negotiating more favorable conditions for our high-quality products are the basics of semiconductor sales operations. With excellent products in hand, it's a matter of speed. Our planned production volume of HBM this year has already sold out. Although 2024 has just begun, we've already started preparing for 2025 to stay ahead of the market."

US Commerce Chief: Nation Requires Additional Chip Funding

US Commerce Secretary Gina Raimondo was a notable guest speaker during yesterday's Intel Foundry Direct Connect keynote—she was invited on (via a video link) to discuss the matter of strengthening the nation's semiconductor industry and staying competitive with global rivals. During discussions, Pat Gelsinger (Intel CEO) cheekily asked whether a "CHIPS Act Part Two" was in the pipeline. Raimondo responded by stating that she is still busy with the original $52 billion tranche: "I'm out of breath running as fast as I can implementing CHIPS One." Earlier this week, her department revealed a $1.5 billion planned direct fund for GlobalFoundries: "this investment will enable GF to expand and create new manufacturing capacity and capabilities to securely produce more essential chips for automotive, IoT, aerospace, defense, and other vital markets."

Intel is set to receive a large grant courtesy of the US government's 2022-launched CHIPS and Science Act—exact figures have not been revealed to the public, but a Nikkei Asia report suggests that Team Blue will benefit significantly in the near future: "While the Commerce Department has not yet announced how much of the funding package's $52 billion it would grant Intel, the American chipmaker is expected to get a significant portion, according to analysts and officials close to the situation." Raimondo stated: "Intel is an American champion company and has a very huge role to play in this revitalization." The US Commerce Chief also revealed that she had spoken with artificial intelligence industry leaders, including OpenAI's Sam Altman, about the ever-growing demand for AI-crunching processors, accelerators, and GPUs. The country's semiconductor production efforts could be bolstered once more in order to preserve a competitive edge—Raimondo addressed Gelsinger's jokey request for another batch of subsidies: "I suspect there will have to be—whether you call it Chips Two or something else—continued investment if we want to lead the world...We fell pretty far. We took our eye off the ball."

Cadence Digital and Custom/Analog Flows Certified for Latest Intel 18A Process Technology

Cadence's digital and custom/analog flows are certified on the Intel 18A process technology. Cadence design IP supports this node from Intel Foundry, and the corresponding process design kits (PDKs) are delivered to accelerate the development of a wide variety of low-power consumer, high-performance computing (HPC), AI and mobile computing designs. Customers can now begin using the production-ready Cadence design flows and design IP to achieve design goals and speed up time to market.

"Intel Foundry is very excited to expand our partnership with Cadence to enable key markets for the leading-edge Intel 18A process technology," said Rahul Goyal, Vice President and General Manager, Product and Design Ecosystem, Intel Foundry. "We will leverage Cadence's world-class portfolio of IP, AI design technologies, and advanced packaging solutions to enable high-volume, high-performance, and power-efficient SoCs in Intel Foundry's most advanced process technology. Cadence is an indispensable partner supporting our IDM2.0 strategy and the Intel Foundry ecosystem."

NVIDIA Announces Q4 and Fiscal 2024 Results, Clocks 126% YoY Revenue Growth, Gaming Just 1/6th of Data Center Revenues

NVIDIA (NASDAQ: NVDA) today reported revenue for the fourth quarter ended January 28, 2024, of $22.1 billion, up 22% from the previous quarter and up 265% from a year ago. For the quarter, GAAP earnings per diluted share was $4.93, up 33% from the previous quarter and up 765% from a year ago. Non-GAAP earnings per diluted share was $5.16, up 28% from the previous quarter and up 486% from a year ago.

For fiscal 2024, revenue was up 126% to $60.9 billion. GAAP earnings per diluted share was $11.93, up 586% from a year ago. Non-GAAP earnings per diluted share was $12.96, up 288% from a year ago. "Accelerated computing and generative AI have hit the tipping point. Demand is surging worldwide across companies, industries and nations," said Jensen Huang, founder and CEO of NVIDIA.

Arm Launches Next-Generation Neoverse CSS V3 and N3 Designs for Cloud, HPC, and AI Acceleration

Last year, Arm introduced its Neoverse Compute Subsystem (CSS) for the N2 and V2 series of data center processors, providing a reference platform for the development of efficient Arm-based chips. Major cloud service providers—AWS with Graviton 4 and Trainium 2, Microsoft with Cobalt 100 and Maia 100, and even NVIDIA with the Grace CPU and BlueField DPUs—are already utilizing custom Arm server CPU and accelerator designs based on the CSS foundation in their data centers. The CSS allows hyperscalers to optimize Arm processor designs specifically for their workloads, focusing on efficiency rather than outright performance. Today, Arm unveiled the next-generation CSS N3 and V3 for even greater efficiency and AI inferencing capability. The N3 design provides up to 32 high-efficiency cores per die with improved branch prediction and larger caches to boost AI performance by 196%, while the V3 design scales up to 64 cores and is 50% faster overall than previous generations.

Both the N3 and V3 leverage advanced features like DDR5, PCIe 5.0, CXL 3.0, and chiplet architecture, continuing Arm's push to make chiplets the standard for data center and cloud architectures. The chiplet approach enables customers to connect their own accelerators and other chiplets to the Arm cores via UCIe interfaces, reducing costs and time-to-market. Looking ahead, Arm has a clear roadmap for its Neoverse platform. The upcoming CSS V4 "Adonis" and N4 "Dionysus" designs will build on the improvements in the N3 and V3, advancing Arm's goal of greater efficiency and performance using optimized chiplet architectures. As more major data center operators introduce custom Arm-based designs, the Neoverse CSS aims to provide a flexible, efficient foundation to power the next generation of cloud computing.

Microsoft Auto-updating Eligible Windows 11 PCs to Version 23H2

Windows 11 version 23H2 started rolling out last October, but many users of Microsoft's flagship operating system opted out of an upgrade, thanks to a handy "optional" toggle. News outlets have latched onto a freshly published (February 20) Windows 11 "Release Health" notice—the official Microsoft dashboard alert states that the Windows 11 2023 Update "is now entering a new rollout phase." Fastidious users will not be happy to discover that "eligible Windows 11 devices" are now subject to an automatic bump up to version 23H2. Very passive-aggressive tactics have been utilized in the past—Microsoft is seemingly eager to get its audience upgraded onto its latest and greatest feature-rich experience.

According to NeoWin, an official announcement from last week alerted users to an "impending end of optional preview updates on Windows 11 22H2." Yesterday's "23H2" dashboard confessional provided a little bit more context—unsurprisingly involving artificial intelligence: "This automatic update targets Windows 11 devices that have reached or are approaching end of servicing, and it follows the machine learning-based (ML) training we have utilized so far. We will continue to train our intelligent ML model to safely roll out this new Windows version in phases to deliver a smooth update experience."

Google's Gemma Optimized to Run on NVIDIA GPUs, Gemma Coming to Chat with RTX

NVIDIA, in collaboration with Google, today launched optimizations across all NVIDIA AI platforms for Gemma—Google's state-of-the-art new lightweight 2 billion- and 7 billion-parameter open language models that can be run anywhere, reducing costs and speeding innovative work for domain-specific use cases.

Teams from the companies worked closely together to accelerate the performance of Gemma—built from the same research and technology used to create the Gemini models—with NVIDIA TensorRT-LLM, an open-source library for optimizing large language model inference, when running on NVIDIA GPUs in the data center, in the cloud and on PCs with NVIDIA RTX GPUs. This allows developers to target the installed base of over 100 million NVIDIA RTX GPUs available in high-performance AI PCs globally.
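Because Gemma ships as open weights, the same models can also be loaded through standard frameworks where TensorRT-LLM is not part of the toolchain. The sketch below is a minimal illustration of that "runs anywhere" point using Hugging Face transformers on a CUDA GPU, rather than the optimized TensorRT-LLM path the article describes; the model ID and prompt are assumptions.

```python
# Minimal sketch: running Gemma 2B with Hugging Face transformers on an
# NVIDIA GPU. This is the plain PyTorch route, not the TensorRT-LLM path
# described above. Model ID and prompt are assumptions for illustration.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/gemma-2b-it"  # instruction-tuned 2B variant
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16
).to("cuda")

inputs = tokenizer("Write a haiku about GPUs.", return_tensors="pt").to("cuda")
output = model.generate(**inputs, max_new_tokens=48)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```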

Sony Announces Launch of PlayStation Pulse Elite Wireless Headset

Launching today, our latest wireless headset, Pulse Elite, brings crisp, immersive audio to the gaming experience on the PS5 console; to PlayStation Link supported devices including PS5, PC, Mac, and the PlayStation Portal remote player; and to Bluetooth compatible devices such as smartphones and tablets. Pulse Elite follows the launch of our first wireless earbuds, Pulse Explore, with both audio devices featuring planar magnetic drivers to further enhance the PS5 console's Tempest 3D AudioTech. When combined with PlayStation Link, the planar drivers precisely deliver the output of the 3D audio algorithms directly to the player's ear without loss and with nearly no distortion or delay. Here's our quick-start guide on setting up and using the Pulse Elite wireless headset, along with the Pulse Explore wireless earbuds.

Set up and use sidetone and 3D audio features on PS5
A tour of the headset appears when you first connect the Pulse Elite wireless headset or Pulse Explore wireless earbuds to your PS5 console via the included PlayStation Link USB adapter. Before diving into a game, I recommend personalizing 3D audio settings and adjusting sidetone volume (changing this adjusts how loudly you hear your own voice in your ear when you talk). It's also possible to create a custom name for the headset, with standard letters, symbols, and even emoji. After the tour, you can change settings at any time while the headset is connected by navigating to the Settings menu and selecting Accessories, followed by Pulse Elite wireless headset.

Supermicro Unveils New Edge AI Systems

Supermicro, Inc., a Total IT Solution Manufacturer for AI, Cloud, Storage, and 5G/Edge, is expanding its portfolio of AI solutions, allowing customers to leverage the power and capability of AI in edge locations, such as public spaces, retail stores, or industrial infrastructure. Using Supermicro application-optimized servers with NVIDIA GPUs makes it easier to fine-tune pre-trained models and for AI inference solutions to be deployed at the edge where the data is generated, improving response times and decision-making.

"Supermicro has the broadest portfolio of Edge AI solutions, capable of supporting pre-trained models for our customers' edge environments," said Charles Liang, president and CEO of Supermicro. "The Supermicro Hyper-E server, based on the dual 5th Gen Intel Xeon processors, can support up to three NVIDIA H100 Tensor Core GPUs, delivering unparalleled performance for Edge AI. With up to 8 TB of memory in these servers, we are bringing data center AI processing power to edge locations. Supermicro continues to provide the industry with optimized solutions as enterprises build a competitive advantage by processing AI data at their edge locations."

Jensen Huang to Unveil Latest AI Breakthroughs at GTC 2024 Conference

NVIDIA today announced it will host its flagship GTC 2024 conference at the San Jose Convention Center from March 18-21. More than 300,000 people are expected to register to attend in person or virtually. NVIDIA founder and CEO Jensen Huang will deliver the keynote from the SAP Center on Monday, March 18, at 1 p.m. Pacific time. It will be livestreamed and available on demand. Registration is not required to view the keynote online. Since Huang first highlighted machine learning in his 2014 GTC keynote, NVIDIA has been at the forefront of the AI revolution. The company's platforms have played a crucial role in enabling AI across numerous domains including large language models, biology, cybersecurity, data center and cloud computing, conversational AI, networking, physics, robotics, and quantum, scientific and edge computing.

The event's 900 sessions and over 300 exhibitors will showcase how organizations are deploying NVIDIA platforms to achieve remarkable breakthroughs across industries, including aerospace, agriculture, automotive and transportation, cloud services, financial services, healthcare and life sciences, manufacturing, retail and telecommunications. "Generative AI has moved to center stage as governments, industries and organizations everywhere look to harness its transformative capabilities," Huang said. "GTC has become the world's most important AI conference because the entire ecosystem is there to share knowledge and advance the state of the art. Come join us."

AMD Ryzen 8040 NPU Monitoring Coming to Windows Task Manager

AMD's first-generation XDNA-based Neural Processing Unit (NPU) arrived last year as an onboard aspect of the "Phoenix" Ryzen 7040 mobile processor series, followed many months later by Intel's similarly NPU-laden Core Ultra "Meteor Lake" generation. It was recently revealed that a Windows 11 DirectML preview grants preliminary support for Core Ultra NPUs—Microsoft's software engineering department seems to be prioritizing Intel AI tech. Team Red has already released XDNA on desktop platforms—with its Ryzen 8000G APU family—and the "Hawk Point" 8040 series is nearing a retail launch, but these processors (plus the 7040 series) remain unsupported by Microsoft's DirectML API. An interesting AMD community blog entry was posted two weeks ago—news outlets have been slow to pick up on its relevance.

Intel NPU activity can already be monitored in Windows Task Manager, and an upcoming update will add competing AMD parts to the mix. Joel Hruska's Team Red community blog post reveals that NPU monitoring for Ryzen 8040 series processors is due soon: "As AI PCs become more popular, there's a growing need for system monitoring tools that can track the performance of the new NPUs (Neural Processing Units) available on select Ryzen 8040 Series mobile processors. A neural processing unit—also sometimes referred to as an integrated or on-die AI engine—can improve battery life by offloading AI tasks that would otherwise be performed on the CPU or GPU. AMD has been working with Microsoft to enable MCDM (Microsoft Compute Driver Model) infrastructure on the AMD NPU (Neural Processing Unit)-enabled Ryzen 8040 Series of mobile processors. MCDM is a derivative of the Windows Display Driver Model (WDDM) that targets non-GPU compute devices, such as the NPU. MCDM enables NPUs to make use of the existing GPU device management infrastructure, including scheduling, power management, memory management, and performance debugging with tools such as the Task Manager. MCDM serves as a fundamental layer, ensuring the smooth execution of AI workloads on NPU devices."

ASUS New Vivobook S Series Also Comes With AI-Enabled AMD Ryzen 8040 Series CPUs

ASUS today announced brand-new ASUS Vivobook S series laptops for 2024, designed for a sleek and lightweight lifestyle. These laptops - all featuring ASUS Lumina OLED display options - are driven by up to the latest AI-enabled processors from AMD, and offer exceptional performance. The series includes the 14.0-inch ASUS Vivobook S 14 OLED M5406, the 15.6-inch ASUS Vivobook S 15 OLED M5506, and the 16.0-inch ASUS Vivobook S 16 OLED M5606. ASUS Vivobook S series laptops are not only powerful but also lightweight, making them perfect for individuals who need both productivity and entertainment while on the move. They come in contemporary color options and feature a minimalist, high-end design, striking a balance between mobility and performance.

The latest 2024 ASUS Vivobook S series laptops are equipped with up to AMD Ryzen 8040 Series processors, boasting a TDP of up to 50 watts and built-in AMD Ryzen AI acceleration for efficient performance in modern AI applications. A dedicated Copilot key on the keyboard allows users to effortlessly dive into Windows 11's AI-powered tools with just one press. The laptops provide lifelike visuals through ASUS Lumina OLED displays, offering resolutions of up to 3.2K (M5606), a 120 Hz refresh rate, a 100% DCI-P3 color gamut, and VESA DisplayHDR True Black 600 certification. The ASUS ErgoSense keyboard, known for its style and comfort, now features customizable single-zone RGB backlighting, and there's an extra-large ErgoSense touchpad. Prioritizing user experience, these ASUS Vivobook S models include a lay-flat 180° hinge, an IR camera with a physical shutter, a full range of I/O ports, and immersive Dolby Atmos audio from the powerful Harman Kardon-certified stereo speakers.

Groq LPU AI Inference Chip is Rivaling Major Players like NVIDIA, AMD, and Intel

AI workloads split into two categories: training and inference. While training requires large compute and memory capacity, memory access speed is not a significant contributor; inference is another story. With inference, the AI model must run extremely fast to serve the end-user with as many tokens (words) per second as possible, giving the user answers to their prompts faster. An AI chip startup, Groq, which was in stealth mode for a long time, has been making major moves in providing ultra-fast inference speeds using its Language Processing Unit (LPU), designed for large language models (LLMs) like GPT, Llama, and Mistral. The Groq LPU is a single-core unit based on the Tensor-Streaming Processor (TSP) architecture, which achieves 750 TOPS at INT8 and 188 TeraFLOPS at FP16, with 320x320 fused dot product matrix multiplication, in addition to 5,120 Vector ALUs.

The Groq LPU pairs massive concurrency and 80 TB/s of on-chip bandwidth with 230 MB of local SRAM. All of this works together to deliver outstanding performance, which has been making waves on the internet over the past few days. Serving the Mixtral 8x7B model at 480 tokens per second, the Groq LPU delivers some of the leading inference numbers in the industry. In models like Llama 2 70B with a 4096-token context length, Groq can serve 300 tokens/s, while in the smaller Llama 2 7B with 2048 tokens of context, the LPU can output 750 tokens/s. According to the LLMPerf Leaderboard, the Groq LPU is beating GPU-based cloud providers at inferencing Llama models in configurations from 7 to 70 billion parameters. In token throughput (output) and time to first token (latency), Groq leads the pack, achieving the highest throughput and the second-lowest latency.
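Those throughput figures are consistent with the memory-bandwidth-bound view of single-stream LLM decoding, where each generated token requires streaming the model weights once. The sketch below is a rough back-of-envelope estimate under that assumption; it ignores batching, KV-cache traffic, compute limits, and the fact that Groq pipelines a large model across many LPUs, so it is a bound for intuition rather than a prediction of real numbers.

```python
# Back-of-envelope ceiling on single-stream decode speed when inference
# is memory-bandwidth bound: every generated token streams all weights once.
def max_tokens_per_second(params_billion: float, bytes_per_param: float,
                          bandwidth_tb_s: float) -> float:
    model_bytes = params_billion * 1e9 * bytes_per_param
    return bandwidth_tb_s * 1e12 / model_bytes

# Llama 2 70B at FP16 (2 bytes/param) against 80 TB/s of SRAM bandwidth:
print(round(max_tokens_per_second(70, 2, 80)))  # ~571 tokens/s ceiling
# The same model against ~3 TB/s of HBM on a single high-end GPU:
print(round(max_tokens_per_second(70, 2, 3)))   # ~21 tokens/s ceiling
```

Groq's reported 300 tokens/s on Llama 2 70B sits comfortably under the SRAM-bandwidth ceiling, while a single HBM-based GPU would be capped far lower, which is the core of the LPU's inference pitch.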

GlobalFoundries and Biden-Harris Administration Announce CHIPS and Science Act Funding for Essential Chip Manufacturing

The U.S. Department of Commerce today announced $1.5 billion in planned direct funding for GlobalFoundries (Nasdaq: GFS) (GF) as part of the U.S. CHIPS and Science Act. This investment will enable GF to expand and create new manufacturing capacity and capabilities to securely produce more essential chips for automotive, IoT, aerospace, defense, and other vital markets.

New York-headquartered GF, celebrating its 15th year of operations, is the only U.S.-based pure play foundry with a global manufacturing footprint including facilities in the U.S., Europe, and Singapore. GF is the first semiconductor pure play foundry to receive a major award (over $1.5 billion) from the CHIPS and Science Act, designed to strengthen American semiconductor manufacturing, supply chains and national security. The proposed funding will support three GF projects.