News Posts matching #GPU


Interview with AMD's Senior Vice President and Chief Software Officer Andrej Zdravkovic: UDNA, ROCm for Radeon, AI Everywhere, and Much More!

A few days ago, we reported on AMD's newest expansion plans for Serbia. The company opened two new engineering design centers with offices in Belgrade and Nis. We were invited to join the opening ceremony and got an exclusive interview with one of AMD's top executives, Andrej Zdravkovic, the company's senior vice president and Chief Software Officer. Previously, we reported on AMD's transition toward becoming a software company. The company has recently tripled its software engineering workforce and is moving some of its best people to support these teams. AMD's plan, spread over a three-to-five-year timeframe, is to improve its software ecosystem and accelerate hardware development, allowing it to launch new products more frequently and react to changes in software demand. AMD determined that opening new design centers in Serbia would greatly help these expansion efforts.

We sat down with Andrej Zdravkovic to discuss the purpose of AMD's establishment in Serbia and the future of some products. Zdravkovic is actually an engineer from Serbia, where he completed his Bachelor's and Master's degrees in electrical engineering from Belgrade University. In 1998, Zdravkovic joined ATI and quickly rose through the ranks, eventually becoming a senior director. During his decade-long tenure, Zdravkovic witnessed a significant industry shift as AMD acquired ATI in 2006. After a brief stint at another company, Zdravkovic returned to AMD in 2015, bringing with him a wealth of experience and a unique perspective on the evolution of the graphics and computing industry.
Here is the full interview:

NVIDIA GeForce 561.09 WHQL Game Ready Drivers Released

NVIDIA has released its latest GeForce WHQL Game Ready drivers, version 561.09, which bring optimizations for FINAL FANTASY XVI and God of War Ragnarök, including support for DLSS 3 technology. The release also adds launch support for EA SPORTS FC 25 and Frostpunk 2, as well as new GeForce Optimal Settings for Black Myth: Wukong, F1 Manager 2024, and Star Wars Outlaws. Unfortunately, it does not fix any previously reported gaming bugs, but it does include several general bug fixes: a GeForce Experience issue where the Performance Monitoring overlay could stop refreshing GPU information, an NVIDIA App issue where game filters could be missing when invoking the NVIDIA Overlay in-game, and performance issues in some workloads with Chaos V-Ray. Interestingly, there are no listed open issues with the latest driver either.

DOWNLOAD: NVIDIA GeForce 561.09 WHQL Game Ready

Micron Announces 12-high HBM3E Memory, Bringing 36 GB Capacity and 1.2 TB/s Bandwidth

As AI workloads continue to evolve and expand, memory bandwidth and capacity are increasingly critical for system performance. The latest GPUs in the industry need the highest performance high bandwidth memory (HBM), significant memory capacity, as well as improved power efficiency. Micron is at the forefront of memory innovation to meet these needs and is now shipping production-capable HBM3E 12-high to key industry partners for qualification across the AI ecosystem.

Micron's industry-leading HBM3E 12-high 36 GB delivers significantly lower power consumption than our competitors' 8-high 24 GB offerings, despite having 50% more DRAM capacity in the package
Micron HBM3E 12-high boasts an impressive 36 GB capacity, a 50% increase over current HBM3E 8-high offerings, allowing larger AI models like Llama 2 with 70 billion parameters to run on a single processor. This capacity increase allows faster time to insight by avoiding CPU offload and GPU-GPU communication delays. Micron HBM3E 12-high 36 GB delivers significantly lower power consumption than competitors' HBM3E 8-high 24 GB solutions. Micron HBM3E 12-high 36 GB offers more than 1.2 terabytes per second (TB/s) of memory bandwidth at a pin speed greater than 9.2 gigabits per second (Gb/s). These combined advantages of maximum throughput and minimal power consumption help ensure optimal outcomes for power-hungry data centers. Additionally, Micron HBM3E 12-high incorporates fully programmable MBIST that can run system-representative traffic at full spec speed, providing improved test coverage for expedited validation, enabling faster time to market, and enhancing system reliability.
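The quoted bandwidth figure follows directly from the pin speed: HBM exposes a 1,024-bit data interface per stack, so peak bandwidth is simply pin speed times bus width. A quick back-of-the-envelope sketch (the 1,024-bit interface width is the standard HBM figure, not something stated in Micron's announcement):

```python
def hbm_bandwidth_tbps(pin_speed_gbps: float, bus_width_bits: int = 1024) -> float:
    """Peak bandwidth of one HBM stack in TB/s.

    HBM stacks expose a 1,024-bit data interface; each pin transfers
    pin_speed_gbps gigabits per second. Divide by 8 (bits -> bytes)
    and by 1000 (GB -> TB).
    """
    return pin_speed_gbps * bus_width_bits / 8 / 1000

# At the quoted >9.2 Gb/s pin speed, one stack already approaches the headline number:
print(hbm_bandwidth_tbps(9.2))  # ≈ 1.18 TB/s; slightly higher pin speeds clear 1.2 TB/s
```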

Intel Core Ultra 300 Series "Panther Lake-H" Leaks: 18 CPU Cores, 12 Xe3 GPU Cores, and up to 45 Watt TDP

Details have emerged about Intel's upcoming "Panther Lake" processors, set to be the third generation of Core Ultra mobile chips. Called the Core Ultra 300 series, these CPUs are expected to succeed "Lunar Lake". According to recent leaks, Panther Lake-H will be manufactured using Intel's cutting-edge 18A process node. The chips are said to feature a combination of Cougar Cove P-Cores, Skymont E-Cores, and Xe3 (Celestial) integrated graphics. This architecture builds upon Intel's hybrid core design, refining it for even better performance on mobile devices. The leaked information suggests a range of configurations for Panther Lake-H, the high-performance variant of the lineup. These include models with varying core counts and power envelopes, from efficient 25 W parts to more interesting 45 W options. Notably, some SKUs reportedly feature up to 18 cores in total, combining P-cores, E-cores, and LP E-cores in a five-tile package. This is an increase from the previously believed 16 cores.

NVIDIA DLSS Comes to Supermoves, Spectre Divide, and Gori Cuddly Carnage

More than 600 games and applications feature RTX technologies, and each week new games integrating NVIDIA DLSS, NVIDIA Reflex and advanced ray-traced effects are released or announced, delivering the definitive PC experience for GeForce RTX players. Read on to learn about this week's new integrations of DLSS.

Supermoves Out Now, Featuring DLSS 3
Makea Games' Supermoves is a competitive parkour showdown for you and your friends. Soar through skylines in both first and third person. Grind on your feet across rails, trapeze on high wires, wall run, scramble, and pull off epic backflips and frontflips on trampolines. Go head to head in Bomb Tag, reach the finish line before the Rising Tide, or race up to 40 players in Royale. Make your own games and levels with the included tools. Check out the single-player Career Mode. And never stop running. If you're rocking a GeForce RTX GPU in your PC or laptop, ensure you enable DLSS Super Resolution and Frame Generation in Supermoves to accelerate frame rates, making your parkour adventures even better.

LG gram Ready to Define the Next-Gen AI Laptop With New Intel Core Ultra Processors

LG Electronics (LG) is excited to announce that its newest LG gram laptop featuring the Intel Core Ultra processor (Series 2) will be showcased at the Intel Core Ultra Global Launch Event from September 3-8. Renowned for its powerful performance and ultra-lightweight design, the LG gram series now integrates advanced AI capabilities powered by the latest Intel Core Ultra processor. The LG gram 16 Pro, the first model to feature these new Intel processors, will be unveiled before its release at the end of 2024.

As the first on-device AI laptop from the LG gram series, it offers up to an impressive 48 neural processing unit (NPU) tera operations per second (TOPS), setting a new standard for AI PCs and providing the exceptional performance required for Copilot experiences. Powered by the latest Intel Core Ultra processor, the LG gram 16 Pro is now more efficient thanks to advanced AI functionalities such as productivity assistants, text and image creation and collaboration tools. What's more, its extended battery life helps users handle tasks without worry.

Chinese GPU Maker XCT Faces Financial Crisis and Legal Troubles

Xiangdixian Computing Technology (XCT), once hailed as China's answer to NVIDIA at its peak, is now grappling with severe financial difficulties and legal challenges. The company, which has developed its own line of GPUs based on the Tianjun chips, recently admitted that its progress in "development of national GPU has not yet fully met the company's expectations and is facing certain market adjustment pressures." Despite producing two desktop and one workstation GPU model, XCT has been forced to address rumors of its closure. The company has undergone significant layoffs, but it claims to have retained key research and development staff essential for GPU advancement. Adding to XCT's woes, investors have initiated legal proceedings against the company's founder, Tang Zhimin, claiming he failed to deliver on his commitment to raising 500 million Yuan in Series B funding.

Among the complainants is the state-owned Jiangsu Zhongde Services Trade Industry Investment Fund, which has filed a lawsuit against three companies under Zhimin's control. Further complicating matters, Capitalonline Data Service is reportedly suing XCT for unpaid debts totaling 18.8 million Yuan. There are also claims that the company's bank accounts have been frozen, potentially impeding its ability to continue operations. The situation is further complicated by allegations of corruption within China's semiconductor sector, with reports of executives misappropriating investment funds. With XCT fighting for survival through restructuring efforts, its fate hangs in the balance. Without securing additional funding soon, the company may be forced to close its doors, which would deal a serious blow to China's GPU aspirations.

Samsung Announces New Galaxy Book5 Pro 360

Samsung Electronics today announced the Galaxy Book5 Pro 360, a Copilot+ PC and the first in the all-new Galaxy Book5 series. Performance upgrades made possible by the Intel Core Ultra processors (Series 2) bring next-level computing power, with up to 47 total NPU TOPS - and more than 300 AI-accelerated features across 100+ creativity, productivity, gaming and entertainment apps. Microsoft Phone Link provides access to your Galaxy phone screen on a larger, more immersive PC display, enabling use of fan-favorite Galaxy AI features like Circle to Search with Google, Chat Assist, Live Translate and more. And with the Intel Arc GPU, graphics performance is improved by 17%. When paired with stunning features like the Dynamic AMOLED 2X display with Vision Booster and 10-point multi-touchscreen, Galaxy Book5 Pro 360 allows creation anytime, anywhere.

"The Galaxy Book5 series brings even more cutting-edge AI experiences to Galaxy users around the world who want to enhance and simplify their everyday tasks - a vision made possible by our continued collaboration with longtime industry partners," said Dr. Hark-Sang Kim, EVP & Head of New Computing R&D Team, Mobile eXperience Business at Samsung Electronics. "As one of our most powerful PCs, Galaxy Book5 Pro 360 brings together top-tier performance with Galaxy's expansive mobile AI ecosystem for the ultimate AI PC experience."

NVIDIA Resolves "Blackwell" Yield Issues with New Photomask

During its Q2 2024 earnings call, NVIDIA confirmed that its upcoming Blackwell-based products are facing low-yield challenges. However, the company announced that it has implemented design changes to improve the production yields of its B100 and B200 processors. Despite these setbacks, NVIDIA remains optimistic about its production timeline. The tech giant plans to commence the production ramp of Blackwell GPUs in Q4 2024, with expected shipments worth several billion dollars by the end of the year. In an official statement, NVIDIA explained, "We executed a change to the Blackwell GPU mask to improve production yield." The company also reaffirmed that it had successfully sampled Blackwell GPUs with customers in the second quarter.

However, NVIDIA acknowledged that meeting demand required producing "low-yielding Blackwell material," which impacted its gross margins. During the earnings call, NVIDIA's CEO Jensen Huang assured investors that supply of the B100 and B200 GPUs would be available. He expressed confidence in the company's ability to mass-produce these chips starting in the fourth quarter. The Blackwell B100 and B200 GPUs use TSMC's CoWoS-L packaging technology and a complex design, which prompted rumors about the company facing yield issues. Reports suggest that initial challenges arose from mismatched thermal expansion coefficients among various components, leading to warping and system failures. The company now says that a new GPU photomask resolved these problems, bringing yields back to normal levels.

ASUS Announces ESC N8-E11 AI Server with NVIDIA HGX H200

ASUS today announced the latest marvel in the groundbreaking lineup of ASUS AI servers - ESC N8-E11, featuring the intensely powerful NVIDIA HGX H200 platform. With this AI titan, ASUS has secured its first industry deal, showcasing the exceptional performance, reliability and desirability of ESC N8-E11 with HGX H200, as well as the ability of ASUS to move first and fast in creating strong, beneficial partnerships with forward-thinking organizations seeking the world's most powerful AI solutions.

Shipments of the ESC N8-E11 with NVIDIA HGX H200 are scheduled to begin in early Q4 2024, marking a new milestone in the ongoing ASUS commitment to excellence. ASUS has been actively supporting clients by assisting in the development of cooling solutions to optimize overall PUE, guaranteeing that every ESC N8-E11 unit delivers top-tier efficiency and performance - ready to power the new era of AI.

Intel Dives Deep into Lunar Lake, Xeon 6, and Gaudi 3 at Hot Chips 2024

Demonstrating the depth and breadth of its technologies at Hot Chips 2024, Intel showcased advancements across AI use cases - from the data center, cloud and network to the edge and PC - while covering the industry's most advanced and first-ever fully integrated optical compute interconnect (OCI) chiplet for high-speed AI data processing. The company also unveiled new details about the Intel Xeon 6 SoC (code-named Granite Rapids-D), scheduled to launch during the first half of 2025.

"Across consumer and enterprise AI usages, Intel continuously delivers the platforms, systems and technologies necessary to redefine what's possible. As AI workloads intensify, Intel's broad industry experience enables us to understand what our customers need to drive innovation, creativity and ideal business outcomes. While more performant silicon and increased platform bandwidth are essential, Intel also knows that every workload has unique challenges: A system designed for the data center can no longer simply be repurposed for the edge. With proven expertise in systems architecture across the compute continuum, Intel is well-positioned to power the next generation of AI innovation." -Pere Monclus, chief technology officer, Network and Edge Group at Intel.

AMD Radeon RX 8000 "RDNA 4" GPU Spotted on Geekbench

AMD's upcoming Radeon RX 8000 "RDNA 4" GPU has been spotted on Geekbench, revealing some of its core specifications. These early benchmark appearances indicate that AMD is now testing the new GPUs internally, preparing for a launch expected next year. The leaked GPU, identified as "GFX1201", is believed to be the Navi 48 SKU - the larger of two dies planned for the RDNA 4 family.

It features 28 Compute Units in the Geekbench listing, which in this case refers to Work Group Processors (WGPs). This likely translates to 56 Compute Units, positioning it between the current RX 7700 XT (54 CU) and RX 7800 XT (60 CU) models. The clock speed is listed at 2.1 GHz, which seems low compared to current RDNA 3 GPUs that can boost to 2.5-2.6 GHz. However, this is likely due to the early nature of the samples, and we can expect higher frequencies closer to launch. Memory specifications show 16 GB of VRAM, matching current high-end models and suggesting a 256-bit bus interface. Some variants may feature 12 GB of VRAM with a 192-bit bus. While not confirmed, previous reports indicate AMD will use GDDR6 memory.
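Geekbench reports RDNA GPUs by WGP count, and each WGP contains two CUs. A rough sketch of that conversion, plus a peak FP32 estimate using the usual 64 shader lanes per CU and 2 ops per cycle (FMA); these are standard RDNA rules of thumb applied to the leaked numbers, not leaked figures themselves:

```python
def rdna_cu_count(wgp_count: int) -> int:
    """Each RDNA Work Group Processor packs two Compute Units."""
    return wgp_count * 2

def fp32_tflops(cu_count: int, clock_ghz: float) -> float:
    """Peak FP32 throughput: 64 lanes per CU, 2 ops/cycle via FMA.

    Ignores RDNA 3-style dual-issue, so this is a conservative estimate.
    """
    return cu_count * 64 * 2 * clock_ghz / 1000

cus = rdna_cu_count(28)       # 28 WGPs -> 56 CUs, as inferred in the listing
print(fp32_tflops(cus, 2.1))  # ≈ 15.1 TFLOPS at the listed 2.1 GHz early-sample clock
```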

Minisforum Announces New G7 Ti Mini-PC With Core i9-14900HX and RTX 4070

Minisforum is thrilled to announce the launch of the new G7 Ti Mini-PC, a marvel of engineering designed specifically for professionals in AI development, video rendering, 3D design, and AI-driven creative fields. This ultra-compact yet extraordinarily powerful system is set to revolutionize the market, offering top-tier performance in a sleek, space-saving design.

Unleashing Power with Intel Core i9-14900HX
At the heart of the G7 Ti Mini-PC lies the Intel Core i9-14900HX processor, a dynamic powerhouse that brings desktop-caliber performance to a mini-PC format. With its advanced architecture, the i9-14900HX is optimized for high-speed computing tasks and multitasking, making it an ideal choice for professionals who demand efficiency and speed in their workflow.

Arm to Dip its Fingers into Discrete GPU Game, Plans on Competing with Intel, AMD, and NVIDIA

According to a recent report from Globes, Arm, the chip design giant and maker of the Arm ISA, is reportedly developing a new discrete GPU at its Ra'anana development center in Israel. This development signals Arm's intention to compete directly with industry leaders like Intel, AMD, and NVIDIA in the massive discrete GPU market. Sources close to the matter reveal that Arm has assembled a team of approximately 100 skilled chip and software development engineers at its Israeli facility. The team is focused on creating GPUs primarily aimed at the video game market. However, industry insiders speculate that this technology could potentially be adapted for AI processing in the future, mirroring the trajectory of NVIDIA, which slowly integrated AI hardware accelerators into its lineup.

The Israeli development center is playing a crucial role in this initiative. The hardware teams are overseeing the development of key components for these GPUs, including the flagship Immortalis and Mali GPUs. Meanwhile, the software teams are creating interfaces for external graphics engine developers, working with both established game developers and startups. Arm is already entering the PC market through partners like Qualcomm with its Snapdragon X chips. However, these chips use an integrated GPU, and Arm wants to provide discrete GPUs and compete in that space as well. While details are still scarce, Arm could make GPUs to accompany Arm-based Copilot+ PCs and some desktop builds. The final execution plan has yet to be revealed, and it remains unclear what stage Arm's discrete GPU project has reached.

Geekbench AI Hits 1.0 Release: CPUs, GPUs, and NPUs Finally Get AI Benchmarking Solution

Primate Labs, the developer behind the popular Geekbench benchmarking suite, has launched Geekbench AI—a comprehensive benchmark tool designed to measure the artificial intelligence capabilities of various devices. Geekbench AI, previously known as Geekbench ML during its preview phase, has now reached version 1.0. The benchmark is available on multiple operating systems, including Windows, Linux, macOS, Android, and iOS, making it accessible to many users and developers. One of Geekbench AI's key features is its multifaceted approach to scoring. The benchmark utilizes three distinct precision levels: single-precision, half-precision, and quantized data. This evaluation aims to provide a more accurate representation of AI performance across different hardware designs.

In addition to speed, Geekbench AI places a strong emphasis on accuracy. The benchmark assesses how closely each test's output matches the expected results, offering insights into the trade-offs between performance and precision. The release of Geekbench AI 1.0 brings support for new frameworks, including OpenVINO, ONNX, and Qualcomm QNN, expanding its compatibility across various platforms. Primate Labs has also implemented measures to ensure fair comparisons, such as enforcing minimum runtime durations for each workload. The company noted that Samsung and NVIDIA are already utilizing the software to measure their chip performance in-house, showing strong early adoption. While the benchmark provides valuable insights, real-world AI applications are still limited, and reliance on a few benchmarks may paint a partial picture. Nevertheless, Geekbench AI represents a significant step forward in standardizing AI performance measurement, potentially influencing future consumer choices in the AI-driven tech market. Results from the benchmark runs can be seen here.
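To illustrate the speed-versus-accuracy trade-off that Geekbench AI's quantized tests probe, here is a toy symmetric INT8 quantization round trip. This is a generic illustration of why quantized inference loses some fidelity, not Geekbench's actual methodology:

```python
def quantize_int8(values):
    """Symmetric per-tensor INT8 quantization: scale to [-127, 127] and round."""
    scale = max(abs(v) for v in values) / 127
    return [round(v / scale) for v in values], scale

def dequantize(q_values, scale):
    """Map INT8 codes back to floats; the rounding error never comes back."""
    return [q * scale for q in q_values]

weights = [0.82, -1.53, 0.004, 2.10]
q, s = quantize_int8(weights)
restored = dequantize(q, s)
# The round trip introduces small errors - the "accuracy" axis such benchmarks measure
errors = [abs(a - b) for a, b in zip(weights, restored)]
print(max(errors))  # small but nonzero quantization error
```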

Huawei Reportedly Developing New Ascend 910C AI Chip to Rival NVIDIA's H100 GPU

Amidst escalating tensions in the U.S.-China semiconductor industry, Huawei is reportedly working on a new AI chip called the Ascend 910C. This development appears to be the Chinese tech giant's attempt to compete with NVIDIA's AI processors in the Chinese market. According to a Wall Street Journal report, Huawei has begun testing the Ascend 910C with various Chinese internet and telecom companies to evaluate its performance and capabilities. Notable firms such as ByteDance, Baidu, and China Mobile are said to have received samples of the chip.

Huawei has reportedly informed its clients that the Ascend 910C can match the performance of NVIDIA's H100 chip. The company has been conducting tests for several weeks, suggesting that the new processor is nearing completion. The Wall Street Journal indicates that Huawei could start shipping the chip as early as October 2024. The report also mentions that Huawei and potential customers have discussed orders for over 70,000 chips, potentially worth $2 billion.

ASUS Presents Comprehensive AI Server Lineup

ASUS today announced its ambitious All in AI initiative, marking a significant leap into the server market with a complete AI infrastructure solution designed to meet the evolving demands of AI-driven applications, from edge inference and generative AI to the new, unparalleled wave of AI supercomputing. ASUS has proven expertise in striking the perfect balance between hardware and software, including infrastructure and cluster architecture design, server installation, testing, onboarding, remote management and cloud services - positioning the ASUS brand and AI server solutions to lead the way in driving innovation and enabling the widespread adoption of AI across industries.

Meeting diverse AI needs
In partnership with NVIDIA, Intel and AMD, ASUS offers comprehensive AI-infrastructure solutions with robust software platforms and services, from entry-level AI servers and machine-learning solutions to full racks and data centers for large-scale supercomputing. At the forefront is the ESC AI POD with NVIDIA GB200 NVL72, a cutting-edge rack designed to accelerate trillion-token LLM training and real-time inference operations. Complemented by the latest NVIDIA Blackwell GPUs, NVIDIA Grace CPUs and 5th Gen NVIDIA NVLink technology, ASUS servers ensure unparalleled computing power and efficiency.

SUNON Unveils a Two-Phase Liquid Cooling Solution for Advanced Workstations

In the age of AI, computing power has become a vital component for driving innovation. For most industries, using professional-grade workstations as the computing engine enables efficient computing and infinite creativity. A workstation is a multi-purpose computer that supports high-performance computing in a distributed network setting. It excels at graphics processing and task parallelism, making it suitable for a wide range of AI applications as well as common visual design tasks.

For example, the workstation can fully meet the requirements for multi-task processing, such as 3D modeling, large-scale industrial drawing, advertising rendering output, non-linear video editing, file rendering production and acceleration, and so on. The computer can also perform effectively in a wide range of model loadings and professional software, as well as remote system maintenance and monitoring in unsupervised settings, which has the potential to revolutionize the application domain.

Intel Ships 0x129 Microcode Update for 13th and 14th Generation Processors with Stability Issues

Intel has officially started shipping the "0x129" microcode update for its 13th and 14th generation "Raptor Lake" and "Raptor Lake Refresh" processors. This critical update is currently being pushed to all OEM/ODM partners to address the stability issues that Intel's processors have been facing. According to Intel, this microcode update fixes "incorrect voltage requests to the processor that are causing elevated operating voltage." Intel's analysis shows that the root cause of the stability problems is excessive voltage during processor operation. These voltage increases cause degradation that raises the minimum voltage required for stable operation. Intel calls this "Vmin"; it is a theoretical construct rather than an actual voltage, analogous to the minimum speed an airplane needs to stay airborne. The latest 0x129 microcode patch limits the processor's voltage to no higher than 1.55 V, which should avoid further degradation. Overclocking is still supported; however, enthusiasts will have to disable the eTVB setting in their BIOS to push the processor beyond the 1.55 V threshold. The company's internal testing shows that the new default settings with limited voltages, under standard run-to-run variations, have minimal performance impact, with only a single game (Hitman 3: Dartmoor) showing degradation. For a full statement from Intel, see the quote below.
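As a toy model of what the cap means in practice (not Intel's actual microcode mechanism, and the eTVB interaction is simplified here), voltage requests above 1.55 V are clamped unless the user opts out by disabling eTVB:

```python
VMAX = 1.55  # volts; the ceiling Intel says the 0x129 update enforces

def clamp_vid_request(requested_v: float, etvb_enabled: bool = True) -> float:
    """Illustrative model: requests above VMAX are clamped.

    Disabling eTVB in the BIOS (per Intel's statement) lets enthusiasts
    push past the threshold at their own risk.
    """
    if etvb_enabled:
        return min(requested_v, VMAX)
    return requested_v  # cap bypassed for overclocking

print(clamp_vid_request(1.60))  # clamped to 1.55
print(clamp_vid_request(1.40))  # below the cap, passed through unchanged
```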

Intel Announces Arc A760A Automotive-grade GPU

In a strategic move to empower automakers with groundbreaking opportunities, Intel unveiled its first discrete graphics processing unit (dGPU), the Intel Arc Graphics for Automotive, at its AI Cockpit Innovation Experience event. To advance automotive AI, the product will be commercially deployed in vehicles as soon as 2025, accelerating automobile technology and unlocking a new era of AI-driven cockpit experiences and enhanced personalization for manufacturers and drivers alike.

Intel's entry into automotive discrete GPUs addresses growing demand for compute power in increasingly sophisticated vehicle cockpits. By adding the Intel Arc graphics for Automotive to its existing portfolio of AI-enhanced software-defined vehicle (SDV) system-on-chips (SoCs), Intel offers automakers an open, flexible and scalable platform solution that brings next-level, high-fidelity experiences to the vehicle.

NVIDIA's New B200A Targets OEM Customers; High-End GPU Shipments Expected to Grow 55% in 2025

Despite recent rumors speculating on NVIDIA's supposed cancellation of the B100 in favor of the B200A, TrendForce reports that NVIDIA is still on track to launch both the B100 and B200 in 2H24 as it aims to target CSP customers. Additionally, a scaled-down B200A is planned for other enterprise clients, focusing on edge AI applications.

TrendForce reports that NVIDIA will prioritize the B100 and B200 for CSP customers with higher demand due to the tight production capacity of CoWoS-L. Shipments are expected to commence after 3Q24. In light of yield and mass production challenges with CoWoS-L, NVIDIA is also planning the B200A for other enterprise clients, utilizing CoWoS-S packaging technology.

Lossless Scaling Frame Generation Boosts Frame Rate by 4x in All PC Games, Update Arrives This Week

Lossless Scaling, an all-in-one paid gaming utility for scaling and frame generation, is set to introduce an outstanding 4x FPS mode to its Lossless Scaling Frame Generation (LSFG) technology. Officially announced in the Lossless Scaling Discord and showcased by the YouTube user Vyathaen, the upcoming 4x FPS mode is expected to arrive in the upscaler's frame generation option within this week. While YouTube videos may not fully capture the experience and benefit of this improvement, beta testers have reported that the latency remains consistent with the current 2x FPS option, ensuring that most games will remain perfectly playable given a sufficiently high base framerate. For those seeking a more comprehensive demonstration, the Lossless Scaling official Discord server features a Cyberpunk 2077 video that better illustrates the capabilities of the 4x FPS mode.

The journey of Lossless Scaling has been marked by continuous innovation since its initial release. Version 2.1, launched in June, introduced a 3x FPS mode, effectively tripling framerates. Additionally, it brought performance optimizations that enhanced the speed of the 2x FPS mode compared to previous iterations. The update also included refinements for scenarios where the final frame rate surpasses the monitor's refresh rate. The software is universally compatible with all GPUs and PC games, including emulated titles, requiring only windowed mode and Windows 10 1903 or newer. While the LSFG frame generation technology and LS1 upscaler are proprietary, for upscaling, users can choose one of many underlying options depending on their GPU, such as AMD FidelityFX Super Resolution, NVIDIA Image Scaling, Integer Scaling, Nearest Neighbor, xBR, Anime4K, Sharp Bilinear, and Bicubic CAS. Below, you can check out a YouTube video with a 4x frame generation example.
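The arithmetic behind the multipliers is straightforward: displayed frame rate scales with the mode, but input latency stays tied to the base frame time, which is why testers stress a sufficiently high base framerate. A simplified model of interpolation-style frame generation in general (not LSFG's internals):

```python
def frame_gen(base_fps: float, multiplier: int = 4):
    """Simplified model of interpolation-based frame generation.

    Displayed FPS scales with the multiplier, but latency is still
    governed by the real (base) frame time: the tool must hold a frame
    until the next rendered one arrives before it can interpolate,
    adding at least one base frame of delay.
    """
    output_fps = base_fps * multiplier
    base_frame_ms = 1000 / base_fps
    min_latency_ms = 2 * base_frame_ms  # current frame + one held frame
    return output_fps, min_latency_ms

print(frame_gen(60, 4))  # 240 displayed FPS, ~33 ms latency floor
print(frame_gen(30, 4))  # 120 displayed FPS, ~67 ms floor: why low base FPS feels sluggish
```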

Apple Trained its Apple Intelligence Models on Google TPUs, Not NVIDIA GPUs

Apple has disclosed that its newly announced Apple Intelligence features were developed using Google's Tensor Processing Units (TPUs) rather than NVIDIA's widely adopted hardware accelerators like H100. This unexpected choice was detailed in an official Apple research paper, shedding light on the company's approach to AI development. The paper outlines how systems equipped with Google's TPUv4 and TPUv5 chips played a crucial role in creating Apple Foundation Models (AFMs). These models, including AFM-server and AFM-on-device, are designed to power both online and offline Apple Intelligence features introduced at WWDC 2024. For the training of the 6.4 billion parameter AFM-server, Apple's largest language model, the company utilized an impressive array of 8,192 TPUv4 chips, provisioned as 8×1024 chip slices. The training process involved a three-stage approach, processing a total of 7.4 trillion tokens. Meanwhile, the more compact 3 billion parameter AFM-on-device model, optimized for on-device processing, was trained using 2,048 TPUv5p chips.
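The scale of the AFM-server run can be sanity-checked with the common estimate of roughly 6 FLOPs per parameter per token for dense transformer training. The rule of thumb is our assumption, not a figure from Apple's paper:

```python
def training_flops(params: float, tokens: float) -> float:
    """Rough compute for one pass of dense transformer training:
    ~6 FLOPs per parameter per token (2 forward + 4 backward)."""
    return 6 * params * tokens

# AFM-server: 6.4 billion parameters trained over 7.4 trillion tokens
flops = training_flops(6.4e9, 7.4e12)
print(f"{flops:.2e} FLOPs")  # ≈ 2.84e+23
```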

Apple's training data came from various sources, including the Applebot web crawler and licensed high-quality datasets. The company also incorporated carefully selected code, math, and public datasets to enhance the models' capabilities. Benchmark results shared in the paper suggest that both AFM-server and AFM-on-device excel in areas such as Instruction Following, Tool Use, and Writing, positioning Apple as a strong contender in the AI race despite its relatively late entry. However, Apple's approach to entering the AI market is more complex than that of its competitors. Given Apple's massive user base and the millions of devices compatible with Apple Intelligence, the AFM has the potential to permanently change how users interact with their devices, especially for everyday tasks. Hence, refining AI models for these tasks is critical before massive deployment. Another unexpected aspect is the transparency from Apple, a company typically known for its secrecy. The AI boom is changing some of Apple's ways, and revealing these inner workings is always interesting.

NVIDIA Plans RTX 3050 A with Ada Lovelace AD106 Silicon

NVIDIA may be working on a new RTX 3050 A laptop GPU using an AD106 (Ada Lovelace) die, moving away from the Ampere chips used in other RTX 30-series GPUs. While not officially announced, the GPU is included in NVIDIA's latest driver release and the PCI ID database as GeForce RTX 3050 A Laptop GPU. The AD106 die choice is notable, as it has more transistors and CUDA cores than the GA107 in current RTX 3050s and the AD107 in RTX 4050 laptops. The AD106, used in RTX 4060 Ti desktop and RTX 4070 laptop GPUs, boasts 22.9 billion transistors and 4,608 CUDA cores, compared to GA107's 8.7 billion transistors and 2,560 CUDA cores, and AD107's 18.9 billion transistors and 3,072 CUDA cores.

While this could potentially improve performance, it's likely that NVIDIA will use a cut-down version of the AD106 chip for the RTX 3050 A. The exact specifications and features, such as support for DLSS 3, remain unknown. The use of TSMC's 4N node in AD106, instead of Samsung's 8N node used in Ampere, could potentially improve power efficiency and battery life. The performance of the RTX 3050 A compared to existing RTX 3050 and RTX 4050 laptops remains to be seen, however, the RTX 3050 A will likely perform similarly to existing Ampere-based parts as NVIDIA tends to use similar names for comparable performance levels. It's unclear if NVIDIA will bring this GPU to market, but adding new SKUs late in a product's lifespan isn't unprecedented.

Samsung Electro-Mechanics Collaborates with AMD to Supply High-Performance Substrates for Hyperscale Data Center Computing

Samsung Electro-Mechanics (SEMCO) today announced a collaboration with AMD to supply high-performance substrates for hyperscale data center compute applications. These substrates are made at SEMCO's key technology hub in Busan and its newly built, state-of-the-art factory in Vietnam. Market research firm Prismark predicts that the semiconductor substrate market will grow at an average annual rate of about 7%, increasing from 15.2 trillion KRW in 2024 to 20 trillion KRW in 2028. SEMCO's substantial investment of 1.9 trillion KRW in the FCBGA factory underscores its commitment to advancing substrate technology and manufacturing capabilities to meet the highest industry standards and future technology needs.

SEMCO's collaboration with AMD focuses on meeting the unique challenges of integrating multiple semiconductor chips (Chiplets) on a single large substrate. These high-performance substrates, essential for CPU/GPU applications, offer significantly larger surface areas and higher layer counts, providing the dense interconnections required for today's advanced data centers. Compared to standard computer substrates, data center substrates are ten times larger and feature three times more layers, ensuring efficient power delivery and lossless signal integrity between chips. Addressing these challenges, SEMCO's innovative manufacturing processes mitigate issues like warpage to ensure high yields during chip mounting.
Sep 17th, 2024 17:17 EDT