News Posts matching #AI


Intel Sets 100 Million CPU Supply Goal for AI PCs by 2025

Intel has been hyping up its artificial intelligence-augmented processor products since late last year—the "AI Everywhere" marketing push started with the official launch of Intel Core Ultra mobile CPUs, a.k.a. the much-delayed Meteor Lake processor family. CEO Pat Gelsinger stated (mid-December 2023): "AI innovation is poised to raise the digital economy's impact up to as much as one-third of global gross domestic product...Intel is developing the technologies and solutions that empower customers to seamlessly integrate and effectively run AI in all their applications—in the cloud and, increasingly, locally at the PC and edge, where data is generated and used." Team Blue's presence at this week's MWC Barcelona 2024 event introduced "AI Everywhere Across Network, Edge, Enterprise."

Nikkei Asia sat down with Intel's David Feng—Vice President of Client Computing Group and General Manager of Client Segments. The impressively job-titled executive discussed the "future of AI PCs" and set some lofty sales goals for his firm. According to the Nikkei report, Intel leadership expects to "deliver 40 million AI PCs" this year and a further 60 million units next year—representing "more than 20% of the projected total global PC market in 2025." Feng and his colleagues predict that mainstream customers will prefer to use local "on-device" AI solutions (equipped with NPUs) rather than rely on remote cloud services. Significant Edge AI improvements are expected to arrive with the next-generation Lunar Lake and Arrow Lake processor families; the latter will bring Team Blue's NPU technologies to desktop platforms. By comparison, AMD's Ryzen 8000G series of AM5 APUs launched with XDNA engines last month.

Tiny Corp. Builds AI Platform with Six AMD Radeon RX 7900 XTX GPUs

Tiny Corp., a neural network framework specialist, has revealed intimate details about the ongoing development and building of its "tinybox" system: "I don't think there's much value in secrecy. We have the parts to build 12 boxes and a case that's pretty close to final. Beating back all the PCI-E AER errors was hard, as anyone knows who has tried to build a system like this. Our BOM cost is around $10k, and we are selling them for $15k. We've put a year of engineering into this, it's a lot harder than it first seemed. You are welcome to believe me or not, but unless you are building in huge quantity, you are getting a great deal for $15k." The startup has taken the unusual step of integrating Team Red's current flagship gaming GPU into its AI-crunching platform. Tiny Corp. founder George Hotz has documented his past rejections of NVIDIA AI hardware on social media, but TinyBox will not be running AMD's latest Instinct MI300X accelerators. RDNA 3.0 is seemingly favored over CDNA 3.0—perhaps because surging industry demand has made enterprise-grade GPUs scarce and expensive.

The rack-mounted 12U TinyBox build houses an AMD EPYC 7532 processor with 128 GB of system memory. Five 1 TB SN850X SSDs take care of storage duties (four in RAID, one for boot), and an unoccupied 16x OCP 3.0 slot is designated for networking tasks. Two 1600 W PSUs provide necessary electrical juice. The Tiny Corp. social media picture feed indicates that they have acquired a pile of XFX Speedster MERC310 RX 7900 XTX graphics cards—six units are hooked up inside of each TinyBox system. Hotz's young startup has ambitious plans: "The system image shipping with the box will be Ubuntu 22.04. It will only include tinygrad out of the box, but PyTorch and JAX support on AMD have come a long way, and your hardware is your hardware. We make money either way, you are welcome to buy it for any purpose. The goal of the tiny corp is to commoditize the petaflop, and we believe tinygrad is the best way to do it. Solving problems in software is cheaper than in hardware. tinygrad will elucidate the deep structure of what neural networks are. We have 583 preorders, and next week we'll place an order for 100 sets of parts. This is $1M in outlay. We will also ship five of the 12 boxes we have to a few early people who I've communicated with. For everyone else, they start shipping in April. The production line started running yesterday."
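
Hotz's "commoditize the petaflop" goal can be put in rough numbers. Assuming the commonly cited ~123 TFLOPS peak FP16 throughput per RX 7900 XTX (a vendor spec sheet figure, not a measured result), a six-GPU TinyBox lands within striking distance of a petaflop:

```python
# Back-of-envelope aggregate FP16 compute for a TinyBox-style build.
# The per-GPU figure is an assumed vendor peak, not a benchmark result.
per_gpu_tflops_fp16 = 123.0  # assumed RX 7900 XTX peak FP16 throughput
gpu_count = 6

total_tflops = per_gpu_tflops_fp16 * gpu_count
print(f"Theoretical peak: {total_tflops:.0f} TFLOPS (~{total_tflops / 1000:.2f} PFLOPS)")
```

Real-world training throughput will sit well below this theoretical peak, but it illustrates why commodity gaming GPUs are attractive at a $15k price point.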

Global Server Shipments Expected to Increase by 2.05% in 2024, with AI Servers Accounting for Around 12.1%

TrendForce underscores that the primary momentum for server shipments this year remains with American CSPs. However, due to persistently high inflation and elevated corporate financing costs curtailing capital expenditures, overall demand has not yet returned to pre-pandemic growth levels. Global server shipments are estimated to reach approximately 13.654 million units in 2024, an increase of about 2.05% YoY. Meanwhile, the market continues to focus on the deployment of AI servers, with their shipment share estimated at around 12.1%.
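
TrendForce's percentages translate into concrete unit volumes. A quick sanity check on the figures above, using only numbers from the report:

```python
# Derive implied unit counts from TrendForce's 2024 projections.
total_servers_m = 13.654  # projected 2024 global server shipments, millions
yoy_growth = 0.0205       # ~2.05% year-over-year growth
ai_share = 0.121          # AI servers' estimated shipment share

ai_servers_m = total_servers_m * ai_share            # ~1.65 million AI servers
implied_2023_m = total_servers_m / (1 + yoy_growth)  # ~13.38 million in 2023

print(f"Implied 2024 AI server shipments: ~{ai_servers_m:.2f} million units")
print(f"Implied 2023 total shipments: ~{implied_2023_m:.2f} million units")
```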

Foxconn is expected to see the highest growth rate, with an estimated annual increase of about 5-7%. This growth includes significant orders such as Dell's 16G platform, AWS Graviton 3 and 4, Google Genoa, and Microsoft Gen9. In terms of AI server orders, Foxconn has made notable inroads with Oracle and has also secured some AWS ASIC orders.

AAEON BOXER-8653AI & BOXER-8623AI Expand Vertical Market Potential in a More Compact Form

Leading provider of embedded PC solutions, AAEON, is delighted to announce the official launch of two new additions to its rich line of embedded AI systems, the BOXER-8653AI and BOXER-8623AI, which are powered by the NVIDIA Jetson Orin NX and Jetson Orin Nano, respectively. Measuring just 180 mm x 136 mm x 75 mm, both systems are compact and easily wall-mounted for discreet deployment, which AAEON indicates makes them ideal for use in both indoor and outdoor settings such as factories and parking lots. Adding to this is the systems' environmental resilience, with the BOXER-8653AI sporting a wide -15°C to 60°C temperature tolerance and the BOXER-8623AI able to operate between -15°C and 65°C, with both supporting a 12 V ~ 24 V power input range via a 2-pin terminal block.

The BOXER-8653AI benefits from the NVIDIA Jetson Orin NX module, offering up to 70 TOPS of AI inference performance for applications that require extremely fast analysis of vast quantities of data. Meanwhile, the BOXER-8623AI utilizes the more efficient, yet still powerful NVIDIA Jetson Orin Nano module, capable of up to 40 TOPS. Both systems consequently make use of the 1024-core NVIDIA Ampere architecture GPU with 32 Tensor Cores.

ServiceNow, Hugging Face & NVIDIA Release StarCoder2 - a New Open-Access LLM Family

ServiceNow, Hugging Face, and NVIDIA today announced the release of StarCoder2, a family of open-access large language models for code generation that sets new standards for performance, transparency, and cost-effectiveness. StarCoder2 was developed in partnership with the BigCode Community, managed by ServiceNow, the leading digital workflow company making the world work better for everyone, and Hugging Face, the most-used open-source platform, where the machine learning community collaborates on models, datasets, and applications. Trained on 619 programming languages, StarCoder2 can be further trained and embedded in enterprise applications to perform specialized tasks such as application source code generation, workflow generation, text summarization, and more. Developers can use its code completion, advanced code summarization, code snippets retrieval, and other capabilities to accelerate innovation and improve productivity.

StarCoder2 offers three model sizes: a 3-billion-parameter model trained by ServiceNow; a 7-billion-parameter model trained by Hugging Face; and a 15-billion-parameter model built by NVIDIA with NVIDIA NeMo and trained on NVIDIA accelerated infrastructure. The smaller variants provide powerful performance while saving on compute costs, as fewer parameters require less computing during inference. In fact, the new 3-billion-parameter model matches the performance of the original StarCoder 15-billion-parameter model. "StarCoder2 stands as a testament to the combined power of open scientific collaboration and responsible AI practices with an ethical data supply chain," emphasized Harm de Vries, lead of ServiceNow's StarCoder2 development team and co-lead of BigCode. "The state-of-the-art open-access model improves on prior generative AI performance to increase developer productivity and provides developers equal access to the benefits of code generation AI, which in turn enables organizations of any size to more easily meet their full business potential."

MiTAC Unleashes Revolutionary Server Solutions, Powering Ahead with 5th Gen Intel Xeon Scalable Processors Accelerated by Intel Data Center GPUs

MiTAC Computing Technology, a subsidiary of MiTAC Holdings Corp., proudly reveals its groundbreaking suite of server solutions that deliver unsurpassed capabilities with the 5th Gen Intel Xeon Scalable Processors. MiTAC's cutting-edge signature platforms seamlessly integrate Intel Data Center GPUs—both the Intel Max Series and the Intel Flex Series—unleashing an unparalleled leap in computing performance targeting HPC and AI applications.

MiTAC Announces its Full Array of Platforms Supporting the Latest 5th Gen Intel Xeon Scalable Processors
Last year, Intel transitioned the right to manufacture and sell products based on Intel Data Center Solution Group designs to MiTAC. MiTAC confidently announces a transformative upgrade to its product offerings, unveiling advanced platforms that epitomize the future of computing. Featuring up to 64 cores, expanded shared cache, increased UPI links, and DDR5 support, the latest 5th Gen Intel Xeon Scalable Processors deliver remarkable performance-per-watt gains across various workloads. MiTAC's Intel Server M50FCP Family and Intel Server D50DNP Family fully support the latest 5th Gen Intel Xeon Scalable Processors via a quick BIOS update and straightforward technical resource revisions, providing unsurpassed performance to diverse computing environments.

IBM Intros AI-enhanced Data Resilience Solution - a Cyberattack Countermeasure

Cyberattacks are an existential risk, with 89% of organizations ranking ransomware as one of the top five threats to their viability, according to a November 2023 report from TechTarget's Enterprise Strategy Group, a leading analyst firm. And this is just one of many risks to corporate data—insider threats, data exfiltration, hardware failures, and natural disasters also pose significant danger. Moreover, as the just-released 2024 IBM X-Force Threat Intelligence Index states, as the generative AI market becomes more established, it could trigger the maturity of AI as an attack surface, mobilizing even further investment in new tools from cybercriminals. The report notes that enterprises should also recognize that their existing underlying infrastructure is a gateway to their AI models that doesn't require novel tactics from attackers to target.

To help clients counter these threats with earlier and more accurate detection, IBM is announcing new AI-enhanced versions of its FlashCore Module technology, available inside new IBM Storage FlashSystem products, along with a new version of IBM Storage Defender software to help organizations improve their ability to detect and respond to ransomware and other cyberattacks that threaten their data. The newly available fourth generation of FlashCore Module (FCM) technology enables artificial intelligence capabilities within the IBM Storage FlashSystem family. FCM works with Storage Defender to provide end-to-end data resilience across primary and secondary workloads, with AI-powered sensors designed for earlier notification of cyber threats to help enterprises recover faster.

Qualcomm AI Hub Introduced at MWC 2024

Qualcomm Technologies, Inc. unveiled its latest advancements in artificial intelligence (AI) at Mobile World Congress (MWC) Barcelona. From the new Qualcomm AI Hub, to cutting-edge research breakthroughs and a display of commercial AI-enabled devices, Qualcomm Technologies is empowering developers and revolutionizing user experiences across a wide range of devices powered by Snapdragon and Qualcomm platforms.

"With Snapdragon 8 Gen 3 for smartphones and Snapdragon X Elite for PCs, we sparked commercialization of on-device AI at scale. Now with the Qualcomm AI Hub, we will empower developers to fully harness the potential of these cutting-edge technologies and create captivating AI-enabled apps," said Durga Malladi, senior vice president and general manager, technology planning and edge solutions, Qualcomm Technologies, Inc. "The Qualcomm AI Hub provides developers with a comprehensive AI model library to quickly and easily integrate pre-optimized AI models into their applications, leading to faster, more reliable and private user experiences."

TSMC Customers Request Construction of Additional AI Chip Fabs

Morris Chang, TSMC's founder and semiconductor industry icon, was present at the opening ceremony of his company's new semiconductor fabrication plant in Kumamoto Prefecture, Japan. According to a Nikkei Asia article, Chang predicted that the nation will experience "a chip renaissance" during his February 24 commencement speech. The Japanese government also announced that it will supply an additional ¥732 billion ($4.86 billion) in subsidies for Taiwan Semiconductor Manufacturing Co. to expand semiconductor operations on the island of Kyūshū. Economy Minister Ken Saito stated: "TSMC is the most important partner for Japan in realizing digital transformation, and its Kumamoto factory is an important contributor for us to stably procure cutting-edge logic chips that is extremely essential for the future of industries in Japan."

Chang disclosed some interesting insights during last weekend's conference segment—according to Nikkei's report, he revealed that unnamed TSMC customers had made some outlandish requests: "They are not talking about tens of thousands of wafers. They are talking about fabs, (saying): 'We need so many fabs. We need three fabs, five fabs, 10 fabs.' Well, I can hardly believe that one." The Taiwanese chip manufacturing giant reportedly has the resources to create a new "Gigafab" within reasonable timeframes, but demands for (up to) ten new plants are extremely fanciful. Chang set expectations at a reasonable level—he predicted that demand for AI processors would lie somewhere in the middle ground: "between tens of thousands of wafers and tens of fabs." Past insider reports suggested that OpenAI has been discussing the formation of a proprietary fabrication network, with proposed investments of roughly $5 to $7 trillion. OpenAI CEO, Sam Altman, reportedly engaged in talks with notable contract chip manufacturers—The Wall Street Journal posited that TSMC would be an ideal partner.

JPR: Total PC GPU Shipments Increased by 6% From Last Quarter and 20% Year-to-Year

Jon Peddie Research reports that global PC-based graphics processor unit (GPU) shipments reached 76.2 million units in Q4'23, while PC CPU shipments increased an astonishing 24% year over year—the biggest year-to-year increase in two and a half decades. Overall, GPUs are projected to show a compound annual growth rate of 3.6% during 2024-2026, reaching an installed base of almost 5 billion units at the end of the forecast period. Over the next five years, the penetration of discrete GPUs (dGPUs) in PCs will be 30%.

AMD's overall market share decreased by 1.4% from last quarter, Intel's market share increased by 2.8%, and NVIDIA's market share decreased by 1.36%, as indicated in the following chart.
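
JPR's 3.6% CAGR forecast can be inverted to estimate the installed base entering the forecast period—a minimal sketch that assumes the "almost 5 billion" endpoint and three compounding years (2024-2026):

```python
# Back out the implied starting installed base from JPR's forecast.
cagr = 0.036       # forecast compound annual growth rate, 2024-2026
years = 3
end_base_bn = 5.0  # "almost 5 billion units" at the end of the period

implied_start_bn = end_base_bn / (1 + cagr) ** years
print(f"Implied installed base entering 2024: ~{implied_start_bn:.2f} billion units")
```

The arithmetic implies roughly 4.5 billion units installed today, so the forecast amounts to adding about half a billion GPUs over three years.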

LG and Meta Forge Collaboration to Accelerate XR Business

LG Electronics (LG) is ramping up its strategic collaboration with global tech powerhouse Meta Platforms, Inc. (Meta) to expedite its extended reality (XR) ventures. The collaboration aims to combine the strengths of both companies across products, content, services, and platforms to drive innovation in customer experiences within the burgeoning virtual space.

Forging an XR Collaboration With Meta
On February 28, LG's top management, including CEO William Cho and Park Hyoung-sei, president of the Home Entertainment Company, met with Meta Founder and CEO Mark Zuckerberg at LG Twin Towers in Yeouido, Seoul. This meeting coincided with Zuckerberg's tour of Asia. The two-hour session saw discussions on business strategies and considerations for next-gen XR device development. CEO Cho, while experiencing the Meta Quest 3 headset and Ray-Ban Meta smart glasses, expressed a keen interest in Meta's advanced technology demonstrations, notably focusing on Meta's large language models and its potential for on-device AI integration.

NVIDIA Accused of Acting as "GPU Cartel" and Controlling Supply

NVIDIA, the world's most important fuel of the AI frenzy, is facing accusations of acting as a "GPU cartel" and controlling supply in the data center market, according to statements made by executives at rival chipmaker Groq and former AMD executive Scott Herkelman. In an interview with the Wall Street Journal, Groq CEO Jonathan Ross alleged that some of NVIDIA's data center customers are afraid to even meet with rival AI chipmakers out of fear that NVIDIA will retaliate by delaying shipments of already ordered GPUs. This is despite NVIDIA's claims that it is trying to allocate supply fairly during global shortages. "This happens more than you expect, NVIDIA does this with DC customers, OEMs, AIBs, press, and resellers. They learned from GPP to not put it into writing. They just don't ship after a customer has ordered. They are the GPU cartel, and they control all supply," said former Senior Vice President and General Manager at AMD Radeon, Scott Herkelman, in response to the accusations on X/Twitter.

NVIDIA AI GPU Customers Reportedly Selling Off Excess Hardware

The NVIDIA H100 Tensor Core GPU was last year's hot item for HPC and AI industry segments—the largest purchasers were reported to have acquired up to 150,000 units each. Demand grew so much that lead times of 36 to 52 weeks became the norm for H100-based server equipment. The latest rumblings indicate that things have stabilized—so much so that some organizations are "offloading chips" as the supply crunch cools off. Apparently it is more cost-effective to rent AI processing sessions through cloud service providers (CSPs)—the big three being Amazon Web Services, Google Cloud, and Microsoft Azure.

According to a mid-February Seeking Alpha report, wait times for the NVIDIA H100 80 GB GPU model have been reduced to around three to four months. The Information believes that some companies have already reduced their order counts, while others have hardware sitting around, completely unused. Maintenance complexity and costs are reportedly cited as main factors in "offloading" unneeded equipment and turning to renting server time from CSPs. Despite improved supply conditions, AI GPU demand is still growing—driven mainly by organizations dealing with LLMs. A prime example is OpenAI—as pointed out by The Information, insider murmurings have Sam Altman & Co. seeking out alternative solutions and production avenues.
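
The rent-versus-own trade-off driving these sell-offs comes down to utilization. The sketch below illustrates the break-even logic; every dollar figure is a hypothetical placeholder for illustration, not a number from the report:

```python
# Rough break-even between owning an H100 server and renting GPU time
# from a CSP. All cost figures below are hypothetical placeholders.
purchase_cost = 250_000.0  # assumed cost of an 8x H100 server, USD
annual_upkeep = 40_000.0   # assumed power, hosting, and maintenance per year
rental_rate_hr = 2.5       # assumed per-GPU-hour cloud rate, USD
gpus = 8

def owning_cost(years: float) -> float:
    return purchase_cost + annual_upkeep * years

def renting_cost(years: float, utilization: float) -> float:
    hours = years * 365 * 24 * utilization
    return hours * rental_rate_hr * gpus

# At low utilization renting wins; near-constant use favors owning.
for util in (0.1, 0.5, 1.0):
    own, rent = owning_cost(2), renting_cost(2, util)
    cheaper = "rent" if rent < own else "own"
    print(f"2 years @ {util:.0%} utilization: own ${own:,.0f} vs rent ${rent:,.0f} -> {cheaper}")
```

Under these assumed rates, an organization keeping its GPUs busy only a fraction of the time comes out far ahead renting—which is exactly the calculus reportedly pushing some H100 owners to offload hardware.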

Microsoft Investment in Mistral Attracts Possible Investigation by EU Regulators

Tech giant Microsoft and Paris-based startup Mistral AI, an innovator in open-source AI model development, have announced a new multi-year partnership to accelerate AI innovation and expand access to Mistral's state-of-the-art models. The collaboration will leverage Azure's cutting-edge AI infrastructure to propel Mistral's research and bring its innovations to more customers globally. The partnership focuses on three core areas. First, Microsoft will provide Mistral with Azure AI supercomputing infrastructure to power advanced AI training and inference for Mistral's flagship models like Mistral-Large. Second, the companies will collaborate on AI research and development to push the boundaries of AI models. And third, Azure's enterprise capabilities will give Mistral additional opportunities to promote, sell, and distribute their models to Microsoft customers worldwide.

However, an investment in a European startup rarely proceeds without the constant oversight of European Union authorities and regulators. According to Bloomberg, an EU spokesperson claimed on Tuesday that EU regulators will analyze Microsoft's investment in Mistral after receiving a copy of the agreement between the two parties. While there is no formal investigation yet, continued probing of Microsoft's deal and intentions could lead to a full formal investigation, which in turn could force the termination of Microsoft's plans. If EU regulators continue to scrutinize every investment made in companies based on EU soil, investing in EU startups might become unfeasible for American tech giants.

Samsung Electronics Joins AI-RAN Alliance as a Founding Member

Samsung Electronics announced that it is participating in the AI-RAN Alliance as a founding member, with the goal of promoting 6G innovation by combining AI technology and wireless communication technology. Officially launched at Mobile World Congress (MWC) Barcelona 2024 today, the AI-RAN Alliance is an organization aimed at revitalizing the convergence of AI and wireless communication and leading technology innovation through cooperation with related companies. A total of eleven organizations—including Samsung, Arm, Ericsson, Microsoft, Nokia, NVIDIA, SoftBank and Northeastern University—are participating as founding members. This new alliance will collaborate on the development of innovative new technologies, as well as the application of these technologies to commercial products in preparation for the upcoming 6G era.

"Emerging services in the 6G era will revolutionize the way people interact with technology, and AI will be an integral part of this trend," said Charlie Zhang, Senior Vice President at Samsung Research America. "The AI-RAN Alliance will foster collaboration, drive innovation and usher in a new era of transformation around AI and 6G networks. We believe this coalition will create new value for end users and operators through AI-based use cases and innovations."

Lenovo Unveils Trailblazing Products and Solutions Designed to Power AI for All at MWC 2024

Today at MWC 2024, Lenovo unveiled its latest portfolio of purpose-built AI devices, software, and infrastructure solutions, and showcased two proof-of-concept devices that challenge the traditional PC and smartphone form factors. The company also revealed its vision of hybrid AI fueling multi-device, software, and service offerings for more personalization, collaboration, and efficiency.

"Lenovo's suite of AI-enabled, AI-ready, and AI-optimized devices, infrastructure, solutions, and services at MWC provides a wider look at our vision for 'AI for All'," said Lenovo Chairman and CEO Yuanqing Yang. "Lenovo's AI technology benefits organizations of all sizes, driving intelligent transformation across all industries while reinforcing our commitment to sustainability."

Intel Announces New Edge Platform for Scaling AI Applications

At MWC 2024, Intel announced its new Edge Platform, a modular, open software platform enabling enterprises to develop, deploy, run, secure, and manage edge and AI applications at scale with cloud-like simplicity. Together, these capabilities will accelerate time-to-scale deployment for enterprises, contributing to improved total cost of ownership (TCO).

"The edge is the next frontier of digital transformation, being further fueled by AI. We are building on our strong customer base in the market and consolidating our years of software initiatives to the next level in delivering a complete edge-native platform, which is needed to enable infrastructure, applications and efficient AI deployments at scale. Our modular platform is exactly that, driving optimal edge infrastructure performance and streamlining application management for enterprises, giving them both improved competitiveness and improved total cost of ownership," said Pallavi Mahajan, Intel corporate vice president and general manager of Network and Edge Group Software.

LG Announces US Pricing and Availability of 2024 Gram Pro Notebooks

LG Electronics USA (LG) today announced pricing and availability of its 2024 premium lineup of laptops - the LG gram Pro and LG gram Pro 2-in-1. The 16- and 17-inch LG gram Pro models retail for $2399 (16Z90SP-A.ADB9U1) and $2499 (17Z90SP-E.ADB9U1) respectively, while the CES 2024 Innovation Award-winning 16-inch gram Pro 2-in-1 (16T90SP-K.ADB9U1) retails for $2099. For a limited time, customers shopping on LG.com can pre-order the 16- and 17-inch LG gram Pro and 16-inch LG gram Pro 2-in-1.

Throughout the duration of the pre-order period from February 21, 2024, to March 10, 2024, customers will be able to purchase the 32 GB-RAM/2 TB-SSD LG gram Pro laptop for the price of a 16 GB-RAM/1 TB-SSD model of the same screen size. They'll also receive an LG gram +view IPS portable monitor (16MR70.ASDU) and expedited shipping at no additional cost. All standard terms of purchase apply.

Huawei Introduces HONOR MagicBook Pro 16

Global technology brand HONOR today unveiled the HONOR MagicBook Pro 16, a revolutionary AI-powered laptop that is set to redefine the traditional laptop landscape. Based on HONOR's platform-level AI capabilities and joint efforts with technology partners such as Microsoft, Intel and NVIDIA, HONOR is bringing PCs into the AI PC era—a significant milestone in computing that offers users an unparalleled AI experience and transforms their device interactions.

"At HONOR, our commitment lies in embracing open collaboration with industry partners to foster a flourishing ecosystem. We firmly believe in the transformative power of collaborative synergy, especially in the era of AI. By leveraging the collective expertise of top industry players, we are dedicated to crafting exceptional products and delivering unparalleled experiences to consumers worldwide," said George Zhao, CEO at HONOR.

SmartCow Introduces Uranus Plus AI Fanless Embedded System Powered by NVIDIA Jetson Orin

SmartCow, an AI engineering company specializing in building complex hardware and software solutions for artificial intelligence at the edge, announces the launch of its latest product, Uranus Plus, an AI fanless embedded system powered by the latest NVIDIA Jetson Orin NX and Jetson Orin Nano system-on-modules. With its thermally efficient design and compact form factor, Uranus Plus is suitable for various smart applications. Uranus Plus comes with options for 5G, 4G, and Wi-Fi connectivity and includes a 256 GB NVMe SSD, enabling the simultaneous operation of multiple neural networks and the processing of high-resolution images—setting a groundbreaking benchmark in AI-driven capabilities at the edge with support for up to 100 TOPS of AI compute.

Uranus Plus supercharges vision AI application development at the edge with NVIDIA Metropolis Microservices for Jetson through app stack modernization. Uranus Plus developers now get access to the latest generative AI capabilities through simple API calls, along with a far faster path to development and cloud-native deployment of vision AI applications at the far edge.

Intel Brings AI Everywhere Across Network, Edge, Enterprise

At MWC 2024, Intel announced new platforms, solutions and services spanning network and edge AI, Intel Core Ultra processors and the AI PC, and more. In an era where technological advancements are integral to staying competitive, Intel is delivering products and solutions for its customers, partners and expansive ecosystem to capitalize on the emerging opportunities of artificial intelligence and built-in automation, to improve total cost of ownership (TCO) and operational efficiency, and to deliver new innovations and services.

Across today's announcements, Intel is focused on empowering the industry to further modernize and monetize 5G, edge and enterprise infrastructures and investments, and to take advantage of bringing AI Everywhere. For more than a decade, and alongside Intel's customers and partners, the company has been transforming today's network infrastructure from fixed-function to a software-defined platform and driving success at the edge with more than 90,000 real-world deployments.

Intel Optimizes PyTorch for Llama 2 on Arc A770, Higher Precision FP16

Intel just announced optimizations for PyTorch (IPEX) to take advantage of the AI acceleration features of its Arc "Alchemist" GPUs. PyTorch is a popular machine learning library that is often associated with NVIDIA GPUs, but it is actually platform-agnostic. It can be run on a variety of hardware, including CPUs and GPUs. However, performance may not be optimal without specific optimizations. Intel offers such optimizations through the Intel Extension for PyTorch (IPEX), which extends PyTorch with optimizations specifically designed for Intel's compute hardware.

Intel released a blog post detailing how to run Meta AI's Llama 2 large language model on its Arc "Alchemist" A770 graphics card. The model requires 14 GB of GPU RAM, so the 16 GB version of the A770 is recommended. This development could be seen as a direct response to NVIDIA's Chat with RTX tool, which allows GeForce users with 8 GB or more of VRAM on RTX 30-series "Ampere" and RTX 40-series "Ada" GPUs to run PyTorch LLMs on their graphics cards. NVIDIA achieves lower VRAM usage by distributing INT4-quantized versions of the models, while Intel uses a higher-precision FP16 version; in theory, this should not have a significant impact on the results. Intel's blog post provides instructions on how to set up Llama 2 inference with PyTorch (IPEX) on the A770.
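
The 14 GB requirement and the FP16-versus-INT4 gap both follow from simple arithmetic on parameter count and precision—a rough estimate for weights only, ignoring activation and KV-cache overhead:

```python
# Estimate memory needed to hold model weights at a given precision.
def weight_memory_gb(params_billions: float, bits_per_param: int) -> float:
    """Decimal GB for weights alone; runtime overhead is extra."""
    return params_billions * 1e9 * bits_per_param / 8 / 1e9

llama2_7b = 7.0  # Llama 2 7B parameter count, in billions
print(f"FP16 weights: ~{weight_memory_gb(llama2_7b, 16):.1f} GB")  # ~14.0 GB
print(f"INT4 weights: ~{weight_memory_gb(llama2_7b, 4):.1f} GB")   # ~3.5 GB
```

This is why Intel's FP16 approach calls for the 16 GB A770, while NVIDIA's INT4-quantized distributions fit comfortably on 8 GB cards.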

NVIDIA Prepared to Offer Custom Chip Designs to AI Clients

NVIDIA is reportedly setting up an AI-focused semi-custom chip design business unit, according to inside sources cited by Reuters—it is believed that Team Green leadership is adapting to demands from key data-center customers. Many companies are seeking cheaper alternatives, or have devised their own designs (budget/war chest permitting)—NVIDIA's current range of AI GPUs are simply off-the-shelf solutions. OpenAI has generated the most industry noise—its alleged early-2024 fund-raising pursuits have attracted plenty of speculative (and semi-serious) interest from notable semiconductor personalities.

Team Green is seemingly reacting to emerging market trends—Jensen Huang (CEO, president, and co-founder) has hinted that NVIDIA custom chip design services are on the cusp of launching. Stephen Nellis—a Reuters reporter specializing in tech industry developments—has highlighted select quotes from an upcoming interview with the NVIDIA boss: "We're always open to do that. Usually, the customization, after some discussion, could fall into system reconfigurations or recompositions of systems." The Team Green chief teased that his engineering team is prepared to take on the challenge of meeting exact requests: "But if it's not possible to do that, we're more than happy to do a custom chip. And the benefit to the customer, as you can imagine, is really quite terrific. It allows them to extend our architecture with their know-how and their proprietary information." The rumored NVIDIA semi-custom chip design business unit could be introduced in an official capacity at next month's GTC 2024 conference.

NVIDIA Expects Upcoming Blackwell GPU Generation to be Capacity-Constrained

NVIDIA is anticipating supply issues for its upcoming Blackwell GPUs, which are expected to significantly improve artificial intelligence compute performance. "We expect our next-generation products to be supply constrained as demand far exceeds supply," said Colette Kress, NVIDIA's chief financial officer, during a recent earnings call. This prediction of scarcity comes just days after an analyst noted much shorter lead times for NVIDIA's current flagship Hopper-based H100 GPUs tailored to AI and high-performance computing. The eagerly anticipated Blackwell architecture and B100 GPUs built on it promise major leaps in capability—likely spurring NVIDIA's existing customers to place pre-orders already. With skyrocketing demand in the red-hot AI compute market, NVIDIA appears poised to capitalize on the insatiable appetite for ever-greater processing power.

However, the scarcity of NVIDIA's products may present an excellent opportunity for significant rivals like AMD and Intel. If either company can offer a product that beats NVIDIA's current H100 and comes with a suitable software stack, customers may be willing to jump to its offerings rather than wait many months through the anticipated high lead times. Intel is preparing the next-generation Gaudi 3 and working on the Falcon Shores accelerator for AI and HPC. AMD is shipping its Instinct MI300 accelerator, a highly competitive product, while already working on the MI400 generation. It remains to be seen whether AI companies will begin adopting non-NVIDIA hardware or remain loyal customers and accept the longer lead times of the new Blackwell generation. However, capacity constraints should only be a problem at launch, and availability should improve from quarter to quarter. As TSMC improves CoWoS packaging capacity and 3 nm production, NVIDIA's allocation of 3 nm wafers will likely improve over time as the company shifts its priority from H100 to B100.

SK Hynix VP Reveals HBM Production Volumes for 2024 are Sold Out

SK Hynix Vice President Kitae Kim presides over the company's HBM Sales & Marketing (S&M) department—an official leadership blog profile reveals that the executive played a key role in making the South Korean supplier's high bandwidth memory (HBM) product line "a superstar of the semiconductor memory industry in 2023." Growing demand for powerful AI processors has placed SK Hynix in a more comfortable position, following recessionary spells—including a major sales downturn in 2022. NVIDIA is the market leader in AI processing chips, and many of its flagship enterprise designs are fitted with cutting-edge SK Hynix memory modules. Kim noted that his firm has many notable international clients: "HBM is a revolutionary product which has challenged the notion that semiconductor memory is only one part of an overall system...in particular, SK Hynix's HBM has outstanding competitiveness. Our advanced technology is highly sought after by global tech companies."

The VP outlined how artificial intelligence industries are fuelling innovations: "With the diversification and advancement of generative AI services, demand for HBM, an AI memory solution, has also exploded. HBM, with its high-performance and high-capacity characteristics, is a monumental product that shakes the conventional wisdom that memory semiconductors are only a part of the overall system. In particular, SK Hynix HBM's competitiveness is outstanding." Business is booming, so much so that nothing can be added to this year's HBM order books: "Proactively securing customer purchase volumes and negotiating more favorable conditions for our high-quality products are the basics of semiconductor sales operations. With excellent products in hand, it's a matter of speed. Our planned production volume of HBM this year has already sold out. Although 2024 has just begun, we've already started preparing for 2025 to stay ahead of the market."