News Posts matching #Machine Learning


Ceremorphic Exits Stealth Mode; Unveils Technology Plans to Deliver a New Architecture Specifically Designed for Reliable Performance Computing

Armed with more than 100 patents and leveraging multi-decade expertise in creating industry-leading silicon systems, Ceremorphic Inc. today announced its plans to deliver a complete silicon system that provides the performance needed for next-generation applications such as AI model training, HPC, automotive processing, drug discovery, and metaverse processing. Designed in advanced silicon geometry (TSMC 5 nm node), this new architecture was built from the ground up to solve today's high-performance computing problems in reliability, security and energy consumption to serve all performance-demanding market segments.

Ceremorphic was founded in April 2020 by industry veteran Dr. Venkat Mattela, the founding CEO of Redpine Signals, which sold its wireless assets to Silicon Labs, Inc. in March 2020 for $308 million. Under his leadership, the team at Redpine Signals delivered breakthrough innovations and industry-first products, culminating in an ultra-low-power wireless solution that outperformed products from industry giants in the wireless space by as much as 26 times in energy efficiency. Ceremorphic combines its patented multi-threaded processor technology, ThreadArch, with new technology developed by the silicon, algorithm, and software engineers it currently employs. This team is leveraging its deep expertise and patented technology to design an ultra-low-power training supercomputing chip.

AMD Files Patent for Chiplet Machine Learning Accelerator to be Paired With GPU, Cache Chiplets

AMD has filed a patent describing an MLA (Machine Learning Accelerator) chiplet design that can be paired with a GPU unit (such as RDNA 3) and a cache unit (likely a GPU-excised version of the Infinity Cache design AMD debuted with RDNA 2) to create what AMD is calling an "APD" (Accelerated Processing Device). The design would give AMD a chiplet whose sole function is to accelerate machine learning workloads - specifically, matrix multiplication - enabling capabilities not unlike those offered by NVIDIA's Tensor cores.

This could give AMD a modular way to add machine-learning capabilities to several of its designs simply by including such a chiplet, and might be AMD's route to hardware acceleration of a DLSS-like feature. It would avoid the shortcomings of implementing the logic in the GPU die itself - larger overall die area, and thus higher cost and lower yields - while letting AMD deploy the accelerator in products other than GPU packages. The patent describes the possibility of different manufacturing technologies being employed across the chiplets - harking back to the I/O dies in Ryzen CPUs, manufactured on a 12 nm process rather than the 7 nm one used for the core chiplets. The patent also describes acceleration of cache requests from the GPU die to the cache chiplet, as well as using that chiplet on the fly either as actual cache or as directly-addressable memory.
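As a rough illustration of the workload such an accelerator chiplet would offload, the snippet below sketches a mixed-precision matrix multiply: low-precision inputs multiplied and accumulated at higher precision, the same pattern NVIDIA's Tensor cores accelerate in hardware. The shapes, dtypes, and NumPy implementation are illustrative assumptions, not details from AMD's patent.

```python
# Hypothetical sketch of the core workload a matrix-multiplication accelerator targets:
# FP16 inputs, FP32 accumulation (mixed-precision GEMM). Purely illustrative.
import numpy as np

activations = np.random.rand(64, 128).astype(np.float16)  # low-precision inputs
weights = np.random.rand(128, 64).astype(np.float16)      # low-precision weights

# Multiply-accumulate in FP32 to preserve accuracy, as tensor-style units do in hardware.
result = activations.astype(np.float32) @ weights.astype(np.float32)
print(result.shape, result.dtype)  # (64, 64) float32
```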

GIGABYTE Joins MLCommons to Accelerate the Machine Learning Community

GIGABYTE Technology, an industry leader in high-performance servers and workstations, today announced that it is one of the founding members of MLCommons, an open engineering consortium whose goal is to accelerate machine learning through community-driven benchmarking, large-scale open data sets, and best practices.

In 2018, a group from Google, Baidu, Harvard, and Stanford created a benchmarking suite for machine learning called MLPerf. Its purpose was to evaluate the performance of a new generation of accelerators on neural-network workloads. With common benchmarks in hand, companies and universities can design hardware and software optimized for training and inference of machine learning workloads.

CXL Consortium Releases Compute Express Link 2.0 Specification

The CXL Consortium, an industry standards body dedicated to advancing Compute Express Link (CXL) technology, today announced the release of the CXL 2.0 specification. CXL is an open industry-standard interconnect offering coherency and memory semantics using high-bandwidth, low-latency connectivity between host processor and devices such as accelerators, memory buffers, and smart I/O devices. The CXL 2.0 specification adds support for switching for fan-out to connect to more devices; memory pooling for increased memory utilization efficiency and providing memory capacity on demand; and support for persistent memory - all while preserving industry investments by supporting full backwards compatibility with CXL 1.1 and 1.0.

"Datacenter architectures continue to evolve rapidly to support the growing demands of emerging workloads for Artificial Intelligence and Machine Learning, with CXL technology keeping pace to meet the performance and latency demands," said Barry McAuliffe, president, CXL Consortium. "Designed with breakthrough performance and easy adoption as guiding principles, the CXL 2.0 specification is a significant achievement from our dedicated technical work group members."

NVIDIA Surpasses Intel in Market Cap Size

Yesterday, after the stock market closed, NVIDIA officially surpassed Intel in market capitalization. In after-hours trading, NVIDIA (ticker: NVDA) stock stood at $411.20, putting the company's market cap at $251.31 billion. It marks a historic day for NVIDIA, which has historically been the smaller company - to the point that some once speculated Intel could acquire it. Intel's (ticker: INTC) market cap now stands at $248.15 billion, a bit lower than NVIDIA's. Market cap is not the whole picture, however: NVIDIA's stock is fueled by the hype around machine learning and AI, while Intel's valuation does not rest on any potential bubble.

Comparing revenues, Intel performs far better: it booked $71.9 billion in 2019, while NVIDIA booked $11.72 billion. NVIDIA has nonetheless done a remarkable job, nearly doubling its revenue from $6.91 billion in 2017 to $11.72 billion in 2019, and market forecasts suggest the growth is not stopping. With the recent acquisition of Mellanox, the company now has much bigger opportunities for expansion and growth.

Arm Announces New IP Portfolio with Cortex-A78 CPU

During this unprecedented global health crisis, we have experienced rapid societal changes in how we interact with and rely on technology to connect, aid, and support us. As a result of this we are increasingly living our lives on our smartphones, which have been essential in helping feed our families through application-based grocery or meal delivery services, as well as virtually seeing our colleagues and loved ones daily. Without question, our Arm-based smartphones are the computing hub of our lives.

However, even before this increased reliance on our smartphones, there was already growing interest among users in exploring the limits of what is possible. The combination of these factors with the convergence of 5G and AI is generating greater demand for more performance and efficiency in the palm of our hands.

TSMC and Broadcom Enhance the CoWoS Platform with World's First 2X Reticle Size Interposer

TSMC today announced it has collaborated with Broadcom to enhance the Chip-on-Wafer-on-Substrate (CoWoS) platform to support the industry's first and largest 2X-reticle-size interposer. With an area of approximately 1,700 mm², this next-generation CoWoS interposer technology significantly boosts computing power for advanced HPC systems by supporting more SoCs, and it is ready to support TSMC's next-generation five-nanometer (N5) process technology.

This new generation of CoWoS technology can accommodate multiple logic system-on-chip (SoC) dies and up to 6 cubes of high-bandwidth memory (HBM), offering as much as 96 GB of memory. It also provides bandwidth of up to 2.7 terabytes per second, 2.7 times that of the CoWoS solution TSMC offered in 2016. With higher memory capacity and bandwidth, this CoWoS solution is well-suited for memory-intensive workloads such as deep learning, as well as workloads for 5G networking, power-efficient datacenters, and more. In addition to offering additional area to increase compute, I/O, and HBM integration, this enhanced CoWoS technology provides greater design flexibility and yield for complex ASIC designs in advanced process nodes.
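A quick back-of-envelope check of the quoted figures, assuming the capacity and bandwidth totals are spread evenly across the six HBM cubes (an assumption; TSMC does not break the numbers down per cube):

```python
# Per-cube figures implied by the CoWoS totals (illustrative arithmetic only).
hbm_cubes = 6
total_capacity_gb = 96
total_bandwidth_tb_s = 2.7

print(total_capacity_gb / hbm_cubes)            # 16.0 GB per HBM cube
print(total_bandwidth_tb_s / hbm_cubes * 1000)  # ~450 GB/s per HBM cube
```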

Intel in Negotiations for Habana Labs Acquisition

Intel is currently in negotiations to acquire Israeli AI chip startup Habana Labs, according to a person who spoke to Calcalist anonymously. If the deal materializes, Intel will pay between one and two billion USD, making it Intel's second-largest acquisition of an Israeli company. When asked about the potential deal, an Intel spokesperson stated that the company does not respond to rumors, while Habana Labs has yet to respond to a request for comment made by Calcalist.

Founded in 2016 by Israeli entrepreneur Avigdor Willenz, who also founded Galileo Technologies and Annapurna Labs, Habana Labs develops processors for training and inference of machine learning models. The acquisition would allow Intel to compete better in the AI processor market and gain customers that were previously exclusive to Habana Labs.

NVIDIA Leads the Edge AI Chipset Market but Competition is Intensifying: ABI Research

Diversity is the name of the game when it comes to the edge Artificial Intelligence (AI) chipset industry. In 2019, the AI industry is witnessing the continued migration of AI workloads, particularly AI inference, to edge devices, including on-premise servers, gateways, and end devices and sensors. Based on AI developments in 17 vertical markets, ABI Research, a global tech market advisory firm, estimates that the edge AI chipset market will grow from US$2.6 billion in 2019 to US$7.6 billion by 2024, with no vendor commanding more than 40% of the market.

The frontrunner of this market is NVIDIA, with a 39% revenue share in the first half of 2019. The GPU vendor has a strong presence in key AI verticals that are currently leading in AI deployments, such as automotive, camera systems, robotics, and smart manufacturing. "In the face of different use cases, NVIDIA chooses to release GPU chipsets with different computational and power budgets. In combination with its large developer ecosystem and partnerships with academic and research institutions, the chipset vendor has developed a strong foothold in the edge AI industry," said Lian Jye Su, Principal Analyst at ABI Research.

NVIDIA is facing stiff competition from Intel with its comprehensive chipset portfolio, from Xeon CPU to Mobileye and Movidius Myriad. At the same time, FPGA vendors, such as Xilinx, QuickLogic, and Lattice Semiconductor, are creating compelling solutions for industrial AI applications. One missing vertical from NVIDIA's wide footprint is consumer electronics, specifically smartphones. In recent years, AI processing in smartphones has been driven by smartphone chipset manufacturers and smartphone vendors, such as Qualcomm, Huawei, and Apple. In smart home applications, MediaTek and Amlogic are making their presence known through the widespread adoption of voice control front ends and smart appliances.

Compute Express Link Consortium (CXL) Officially Incorporates

Today, Alibaba, Cisco, Dell EMC, Facebook, Google, Hewlett Packard Enterprise, Huawei, Intel Corporation, and Microsoft announced the incorporation of the Compute Express Link (CXL) Consortium and unveiled the newly elected members of its Board of Directors. The core group of key industry partners announced their intent to incorporate in March 2019, and they remain dedicated to advancing the CXL standard, a new high-speed CPU-to-device and CPU-to-memory interconnect that accelerates next-generation data center performance.

The five new CXL board members are as follows: Steve Fields, Fellow and Chief Engineer of Power Systems, IBM; Gaurav Singh, Corporate Vice President, Xilinx; Dong Wei, Standards Architect and Fellow at ARM Holdings; Nathan Kalyanasundharam, Senior Fellow at AMD Semiconductor; and Larrie Carr, Fellow, Technical Strategy and Architecture, Data Center Solutions, Microchip Technology Inc.

India First Country to Deploy AI Machine Learning to Fight Income Tax Evasion

India is building a large AI/machine-learning data center that can crunch through trillions of financial transactions per hour to process income tax returns for India's billion-strong income tax assessee base. India's Income Tax Department has relied on human tax assessment officers, randomly assigned by a computer, to assess tax returns filed by individuals - an increasingly inefficient system that is prone to both evasion and corruption. India has already been using machine learning since 2017 to flag cases of tax evasion for further human scrutiny. The AI now replaces human assessment officers, moving them up the escalation matrix instead.

The AI/ML assessment system is a logical next step to two big policy decisions the Indian government has taken in recent years: mandating 100% data localization for foreign entities conducting commerce in India, and pushing India's vast population toward electronic payment instruments and away from paper money by demonetizing high-value currency and replacing it with a scarce supply of newer banknotes, effectively forcing people to transact electronically. Contributing to these efforts are some of the lowest 4G mobile data prices in the world (as low as $1.50 for 40 GB of 4G LTE data) and low-cost smartphone handsets. It is also free to open a basic bank account with no minimum balance requirements.

Logic Supply Unveils Karbon 300 Compact Rugged PC, Built For IoT

Global industrial and IoT hardware manufacturer Logic Supply has combined the latest vision processing, security protocols, wireless communication technologies, and proven cloud architectures to create the Karbon 300 rugged fanless computer. The system has been engineered to help innovators overcome the limitations of deploying reliable computer hardware in challenging environments.

"Computing at the edge is increasingly at the core of today's Industry 4.0 and Industrial IoT solutions," says Logic Supply VP of Products Murat Erdogan. "These devices are being deployed in environments that would quickly destroy traditional computer hardware. The builders and creators we work with require a careful combination of connectivity, processing and environmental protections. With Karbon 300, we're providing the ideal mix of capabilities to help make the next generation of industry-shaping innovation a reality, and enable innovators to truly challenge what's possible."

QNAP Officially Launches the TS-2888X AI-Ready NAS for Machine Learning

QNAP Systems, Inc. today officially launched the TS-2888X, an AI-ready NAS specifically optimized for AI model training. Built around powerful Intel Xeon W processors with up to 18 cores and employing a flash-optimized hybrid storage architecture for IOPS-intensive workloads, the TS-2888X also supports installing up to four high-end graphics cards and runs QNAP's AI developer package "QuAI". The TS-2888X packs everything required for machine learning, greatly reducing latency, accelerating data transfer, and eliminating performance bottlenecks caused by network connectivity to expedite AI implementation.

Google Cloud Introduces NVIDIA Tesla P4 GPUs, for $430 per Month

Today, we are excited to announce a new addition to the Google Cloud Platform (GCP) GPU family that's optimized for graphics-intensive applications and machine learning inference: the NVIDIA Tesla P4 GPU.

We've come a long way since we introduced our first-generation compute accelerator, the K80 GPU, adding along the way P100 and V100 GPUs that are optimized for machine learning and HPC workloads. The new P4 accelerators, now in beta, provide a good balance of price/performance for remote display applications and real-time machine learning inference.

Khronos Group Releases NNEF 1.0 Standard for Neural Network Exchange

The Khronos Group, an open consortium of leading hardware and software companies creating advanced acceleration standards, announces the release of the Neural Network Exchange Format (NNEF) 1.0 Provisional Specification for universal exchange of trained neural networks between training frameworks and inference engines. NNEF reduces machine learning deployment fragmentation by enabling a rich mix of neural network training tools and inference engines to be used by applications across a diverse range of devices and platforms. The release of NNEF 1.0 as a provisional specification enables feedback from the industry to be incorporated before the specification is finalized - comments and feedback are welcome on the NNEF GitHub repository.

AMD also Announces Radeon Instinct MI8 and MI6 Machine Learning Accelerators

AMD also announced the Radeon Instinct MI8 and MI6 machine learning GPUs, based on Fiji and Polaris cores respectively. These parts make up the more "budget" end of what is still a decidedly non-consumer, high-end machine learning lineup. Still, with all parts using fairly modern cores, they aim to make an impact in their respective segments.

Starting with the Radeon Instinct MI8, we have a Fiji-based core with the familiar 4 GB of HBM1 memory and 512 GB/s of total memory bandwidth. It delivers 8.2 TFLOPS of either single-precision or half-precision floating-point performance (so performance does not double when going half precision, unlike its bigger Vega-based brother, the MI25). It features 64 compute units.

The Radeon Instinct MI6 is a Polaris-based card and slightly slower than the MI8, despite having four times the memory at 16 GB of GDDR5. The likely reason is lower memory bandwidth, at only 224 GB/s. It also has fewer compute units - 36 in total, for 2,304 stream processors. This all works out to a still-respectable 5.7 TFLOPS of half- or single-precision floating-point performance (which, again, does not double at half-precision rate like Vega).
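The quoted peak figures follow from the usual GCN arithmetic: each compute unit holds 64 stream processors, and each stream processor retires one fused multiply-add (two FLOPs) per clock. A small sanity check below; the clock speeds are assumptions typical of Fiji and Polaris boards, not figures from AMD's announcement.

```python
# Peak TFLOPS = compute units x 64 SPs x 2 FLOPs (FMA) x clock, for GCN-style GPUs.
def peak_tflops(compute_units, clock_ghz, sps_per_cu=64, flops_per_clock=2):
    return compute_units * sps_per_cu * flops_per_clock * clock_ghz / 1000

print(round(peak_tflops(64, 1.000), 1))  # MI8 (Fiji, assumed ~1.0 GHz clock):    ~8.2 TFLOPS
print(round(peak_tflops(36, 1.237), 1))  # MI6 (Polaris, assumed ~1.24 GHz clock): ~5.7 TFLOPS
```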

AMD Announces the Radeon Instinct MI25 Deep Learning Accelerator

AMD's EPYC launch presentation focused mainly on its line of datacenter processors, but fans of AMD's new Vega GPU lineup may be interested in another high-end product announced during the presentation. The Radeon Instinct MI25 is a deep learning accelerator, and as such is hardly intended for consumers, but it is Vega-based and potentially a very potent part of the company's portfolio all the same. Claiming a massive 24.6 TFLOPS of half-precision floating-point performance (12.3 TFLOPS single precision) from its 64 "next-gen" compute units, this machine is well suited to deep learning and machine intelligence applications. It comes with no less than 16 GB of HBM2 memory, and has 484 GB/s of memory bandwidth to play with.
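The half-precision figure follows from Vega's packed FP16 math, which issues two FP16 operations per FP32 lane per clock; the memory bandwidth follows from a 2,048-bit HBM2 interface. A quick check below - the ~945 MHz memory clock (1.89 Gbps per pin) is an assumption, not a figure from AMD's presentation:

```python
# FP16 peak on Vega is twice the FP32 peak thanks to packed math.
fp32_tflops = 12.3
print(2 * fp32_tflops)                                # 24.6 TFLOPS half precision

# Bandwidth of a 2,048-bit HBM2 interface at an assumed 945 MHz (DDR).
bus_width_bits, mem_clock_mhz = 2048, 945
print(bus_width_bits / 8 * mem_clock_mhz * 2 / 1000)  # ~484 GB/s
```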

ARM Reveals Its Plan for World Domination: Announces DynamIQ Technology

ARM processors have been making forays into hitherto shallow markets, with the company's technology and processor architectures racking up an ever-increasing number of design wins. Most recently, Microsoft itself announced a platform meant to use ARM processors in a server environment. Now, ARM has put forward its plan to ship a grand total of 100 billion chips in the 2017-2021 time frame.

To put that goal in perspective, ARM is looking to ship as many ARM-powered processors in the 2017-2021 time frame as it did between 1991 and 2017. This is no easy task - at least if ARM were to stay in its known markets, where it has already achieved almost total saturation. The plan: widen the appeal of its processor designs with big bets on AI, automotive, and XR (which encompasses the virtual reality, augmented reality, and mixed reality markets), leveraging what ARM does best: hyper-efficient processors.

"Zoom and Enhance" to Become a Reality Thanks to Machine Learning

The one phrase from television that makes IT people and creative professionals cringe the most is "zoom and enhance" - the notion that you zoom into a digital image and, at the push of a button, it converts a pixellated image into something with details - which lets CSI catch the bad guys. Up until now, this has been laughably impossible. Images are made up of dots called pixels, and the more pixels you have, the more details you can have in your image (resolution). Zooming into images eventually shows you a colorful checkerboard that's proud of its identity.

Google is tapping into machine learning in an attempt to change this. The company has reportedly come up with a machine-learning technique that attempts to reconstruct details in low-resolution images. Google calls it RAISR (Rapid and Accurate Image Super-Resolution). The technique works by having the software learn the "edges" of a picture (portions of the image with drastic changes in color and brightness gradients) and attempt to reconstruct them. What makes it different from conventional upscaling methods is the machine-learning component: the machine studies a low-resolution image and derives the upscaling approach most effective for that particular image, in situ. While its application in law enforcement is tricky, and will only become a reality if a reasonably high court of law sets a spectacular precedent, the technology could have commercial applications in upscaling low-resolution movies to newer formats such as 4K Ultra HD, and perhaps even 8K.
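To make the idea concrete, here is a minimal sketch of filter-learning super-resolution in the same spirit: learn, by least squares, a small filter that maps cheaply-upscaled patches to the true high-resolution pixels. This is a toy illustration under our own assumptions (a single global filter, nearest-neighbor upscaling, a synthetic image) and not Google's RAISR, which learns many filters bucketed by local edge angle, strength, and coherence.

```python
# Toy filter-learning super-resolution sketch (illustrative only, not Google's RAISR).
import numpy as np

def patches(img, size=5):
    """Collect flattened size x size patches and their center coordinates."""
    r, (h, w) = size // 2, img.shape
    out, centers = [], []
    for y in range(r, h - r):
        for x in range(r, w - r):
            out.append(img[y - r:y + r + 1, x - r:x + r + 1].ravel())
            centers.append((y, x))
    return np.array(out), centers

# Synthetic "ground truth" image, its 2x downsample, and a cheap 2x re-upscale.
hi = np.add.outer(np.sin(np.linspace(0, 6, 64)), np.cos(np.linspace(0, 6, 64)))
lo = hi[::2, ::2]
up = np.kron(lo, np.ones((2, 2)))             # nearest-neighbor upscale (stand-in for bilinear)

A, centers = patches(up)                      # cheap-upscale patches (inputs)
b = np.array([hi[y, x] for y, x in centers])  # true high-resolution pixels (targets)
filt, *_ = np.linalg.lstsq(A, b, rcond=None)  # learn one 5x5 filter by least squares

pred = A @ filt                               # apply the learned filter per patch
print("MSE, cheap upscale:  ", np.mean((A[:, A.shape[1] // 2] - b) ** 2))
print("MSE, learned filter: ", np.mean((pred - b) ** 2))
```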