Thursday, April 11th 2024

Meta Announces New MTIA AI Accelerator with Improved Performance to Ease NVIDIA's Grip

Meta has announced the next generation of its Meta Training and Inference Accelerator (MTIA), a chip designed to train and run inference on AI models at scale. The newest MTIA is the second generation of Meta's custom AI silicon and is built on TSMC's 5 nm process. Running at 1.35 GHz, the new chip's TDP rises to 90 Watts per package, up from just 25 Watts for the first-generation design. Basic Linear Algebra Subprograms (BLAS) processing is where the chip shines, covering both matrix multiplication and vector/SIMD work. In GEMM matrix processing, each chip delivers 708 TeraFLOPS at INT8 (presumably FP8 was meant in the spec) with sparsity, 354 TeraFLOPS without, 354 TeraFLOPS at FP16/BF16 with sparsity, and 177 TeraFLOPS without.
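The quoted GEMM figures follow a simple pattern: sparsity support doubles the effective throughput, and stepping up from FP8 to FP16/BF16 halves it. A quick back-of-envelope check of the numbers above (plain Python, figures taken straight from the spec):

```python
# MTIA v2 GEMM throughput figures quoted above, in TeraFLOPS
fp8_sparse = 708

fp8_dense = fp8_sparse / 2     # 354 TFLOPS -- FP8, no sparsity
fp16_sparse = fp8_sparse / 2   # 354 TFLOPS -- FP16/BF16 with sparsity
fp16_dense = fp16_sparse / 2   # 177 TFLOPS -- FP16/BF16, no sparsity

print(fp8_dense, fp16_sparse, fp16_dense)  # 354.0 354.0 177.0
```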

Classical vector/SIMD processing is a bit slower at 11.06 TeraFLOPS at INT8 (FP8), 5.53 TeraFLOPS at FP16/BF16, and 2.76 TeraFLOPS at single-precision FP32. The MTIA chip is designed specifically to run AI training and inference on Meta's PyTorch framework, with an open-source Triton backend that compiles code for optimal performance. Meta uses this stack for all of its Llama models, and with Llama 3 just around the corner, it could be trained on these chips. To package the chips into a system, Meta places two of them on a board and pairs them with 128 GB of LPDDR5 memory. Each board connects via PCIe Gen 5 to a chassis holding 12 densely packed boards, and six such chassis fill a single rack, for 72 boards and 144 chips per rack and a total of 101.95 PetaFLOPS at INT8 (FP8) precision, assuming linear scaling. Of course, linear scaling is not quite achievable in scale-out systems, which could bring the figure down to under 100 PetaFLOPS per rack.
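As a rough sketch of how that rack-level figure works out (plain Python; the 95% scaling efficiency at the end is an assumed illustration of scale-out losses, not a Meta number):

```python
# Back-of-envelope rack math from the figures above
chips_per_board = 2
boards_per_chassis = 12
chassis_per_rack = 6

chips_per_rack = chips_per_board * boards_per_chassis * chassis_per_rack  # 144
tflops_per_chip = 708  # INT8 (FP8) with sparsity, per the spec above

peak_pflops = chips_per_rack * tflops_per_chip / 1000  # ~101.95 PFLOPS, linear scaling
scaling_efficiency = 0.95  # assumed for illustration only; real scale-out losses vary
realistic_pflops = peak_pflops * scaling_efficiency    # ~96.9 PFLOPS

print(f"{peak_pflops:.2f} PFLOPS peak, ~{realistic_pflops:.1f} PFLOPS at {scaling_efficiency:.0%} scaling")
```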
Below, you can see images of the chip floorplan, specifications compared with the prior generation, and the system.
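Since the software stack described above centers on PyTorch with an open-source Triton backend, a minimal generic Triton kernel gives a feel for the kind of code such a backend compiles. This is only an illustrative sketch, the standard vector-add example from the Triton tutorials; nothing in it is MTIA-specific, and the block size is arbitrary:

```python
import torch
import triton
import triton.language as tl

@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    # Each program instance handles one BLOCK_SIZE-wide slice of the vectors
    pid = tl.program_id(axis=0)
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements  # guard the ragged last block
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x + y, mask=mask)

def add(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    out = torch.empty_like(x)
    n = out.numel()
    grid = lambda meta: (triton.cdiv(n, meta["BLOCK_SIZE"]),)
    add_kernel[grid](x, y, out, n, BLOCK_SIZE=1024)
    return out
```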

Source: Meta

19 Comments on Meta Announces New MTIA AI Accelerator with Improved Performance to Ease NVIDIA's Grip

#1
Wirko
Thanks for correcting Meta's "TFLOPS/s" from their AI-generated specifications list to TFLOPS.
#2
AleksandarK
News Editor
Wirko: Thanks for correcting Meta's "TFLOPS/s" from their AI-generated specifications list to TFLOPS.
Yeah, I noticed that as well. The S literally stands for Second, so not sure why they'd do it again. ¯\_(ツ)_/¯
#3
Daven
AleksandarK: Yeah, I noticed that as well. The S literally stands for Second, so not sure why they'd do it again. ¯\_(ツ)_/¯
Marketing team ‘gone wild’ I guess.
#4
not_my_real_name
AleksandarK: Yeah, I noticed that as well. The S literally stands for Second, so not sure why they'd do it again. ¯\_(ツ)_/¯
AI acceleration, you know...
#5
Daven
not_my_real_name: AI acceleration, you know...
Lol, I get it, nice joke. Meters per second per second and all that.
#6
Owen1982
Anyone notice from the spec sheet: nearly 4x the power for 'only' 2x the performance of the 1st gen?

Edit: I was looking at the instance FLOPS, maybe that's a bad comparison ¯\_(ツ)_/¯
#7
Wirko
Daven: Lol, I get it, nice joke. Meters per second per second and all that.
But if they are actually right... then we better run... and run fast!
#8
konga
Owen1982: Anyone notice from the spec sheet: nearly 4x the power for 'only' 2x the performance of the 1st gen?

Edit: I was looking at the instance FLOPS, maybe that's a bad comparison ¯\_(ツ)_/¯
I don't know much about AI compute specs, but it's definitely over 3x faster in several metrics in their spec sheet. This is a card designed primarily for their own use, so it only really needs to be faster in the ways that matter to them, anyway.
#9
Steevo
I too RPMs at that.
#10
ThrashZone
Hi,
More worrying is Meta's grip, they're a known bad actor hehe
#11
the54thvoid
Intoxicated Moderator
ThrashZone: Hi,
More worrying is Meta's grip, they're a known bad actor hehe
They're a bad actor for all sides - that makes them chaotic evil, I believe.

And with a bit of PR, they think they can swipe away Nvidia's marketshare? Don't think it works that way.
#12
Wirko
konga: This is a card designed primarily for their own use, so it only really needs to be faster in the ways that matter to them, anyway.
They have published detailed specs, so it looks like they are going to sell the card to others too.
#13
Onasi
the54thvoid: And with a bit of PR, they think they can swipe away Nvidia's marketshare? Don't think it works that way.
It’s Meta, the company who apparently thought that saying “metaverse” enough times and even rebranding themselves as it would inevitably lead to said stupid idea becoming a reality and making them bank. They are Chaotic Evil alright, also in the sense that any rational thought had left the building a while ago.
#14
Wirko
ThrashZone: Hi,
More worrying is Meta's grip, they're a known bad actor hehe
At least they've released some interesting stuff as open source - maybe they'll release the software ecosystem for this chip.
#15
persondb
Wirko: Thanks for correcting Meta's "TFLOPS/s" from their AI-generated specifications list to TFLOPS.
It's even funnier because they put it as
Classical vector/SIMD processing is a bit slower at 11.06 TeraFLOPS at INT8 (...)
INT8 isn't 'FLOPS', as FLOPS stands for 'FLoating point OPerations per Second'.
There is no floating point in int8...
#16
AleksandarK
News Editor
persondb: It's even funnier because they put it as (...)
INT8 isn't 'FLOPS', as FLOPS stands for 'FLoating point OPerations per Second'.
There is no floating point in int8...
Which is true. I assume they meant FP8, which is the hot new low-precision format everyone is pushing.
#17
ToTTenTranz
Unless Meta is going to be selling these as AIB products, it's not really going to ease Nvidia's grip.

Nvidia has plenty of competition in cloud services regardless of the hardware. What the market is lacking is competition in hardware that can be bought to run in the clients' installations.
#18
Solaris17
Super Dainty Moderator
ToTTenTranz: Unless Meta is going to be selling these as AIB products, it's not really going to ease Nvidia's grip.
It eases the grip Nvidia has on them.
#19
Minus Infinity
Honestly, if it came to a choice, I would choose Nvidia over Meta any day of the week. Fcukerberg is one of the three biggest scumbags on the planet. Huang is an amateur compared to this clown. I put Meta and Google in the same category. Nvidia is the next tier down.