Apple to use its own server chips

Cmaier

Site Master
Staff Member
Site Donor
Posts
5,561
Reaction score
9,081
I wonder if the servers are boxes of blades.
I would think so. I think these are M2 Ultras, or something very close to those, and given the thermals they should be able to achieve fairly high density in a rack, assuming custom cooling.
 

theorist9

Site Champ
Posts
645
Reaction score
604
If this application could also benefit from Extreme-class chips, that would reduce or eliminate one of the main barriers to seeing these in the Mac Pro: Insufficient volume to justify the development costs.
 
Last edited:

Jimmyjames

Site Champ
Posts
823
Reaction score
930
I’m struggling to make sense of some of the rumours surrounding this initiative. There was a report today by The Information:
https://9to5mac.com/2024/05/29/apple-ai-confidential-computing-ios-18/ (The Information link unavailable at the time of writing)

It states that Apple has a way to perform more demanding AI tasks in the cloud while maintaining privacy. That’s all great, but what perplexes me is:

A) Can they really provide enough compute for these very demanding tasks using M2 Ultras? These are nearly beaten by an M3 Max now, and big iron from Nvidia will certainly destroy them. Can they really support their user base of ~1 billion users?

B) If not just M2 Ultras, then what? The article states that later in the year Apple will move to M4 chips; I assume the Ultra. Gurman has repeatedly said no M4 Ultra until mid-2025. He could certainly be wrong, but he has doubled down on it recently. Surely the M4 Ultra has to come sooner if they are to be even remotely competitive with their own chips?

C) Perhaps the rumours of their own servers for AI are just wrong. That is, they are making them, but they handle only a small component, and that detail is lost in the rumour.
 

Yoused

up
Posts
5,797
Reaction score
9,328
Location
knee deep in the road apples of the 4 horsemen
If not just M2 Ultras, then what? The article states that later in the year Apple will move to M4 chips. I assume the Ultra. Gurman has repeatedly said no M4 Ultra until mid-2025.

The obvious answer (well, obvious to me) is that each blade has a pair of M2 Ultras with a memory bridge between them (each can access the other's package RAM over the bridge), and the SoCs are clamped into sockets that will be pin-compatible with M4 Ultras (the clamp maintains solid pin contact better than the old style seating sockets and also affords a good waste heat path).
 

Jimmyjames

Site Champ
Posts
823
Reaction score
930
The obvious answer (well, obvious to me) is that each blade has a pair of M2 Ultras with a memory bridge between them (each can access the other's package RAM over the bridge), and the SoCs are clamped into sockets that will be pin-compatible with M4 Ultras (the clamp maintains solid pin contact better than the old style seating sockets and also affords a good waste heat path).
Do you think this would provide enough compute for a user base the size of Apple’s?
 

Yoused

up
Posts
5,797
Reaction score
9,328
Location
knee deep in the road apples of the 4 horsemen
Do you think this would provide enough compute for a user base the size of Apple’s?
Depends on the size of the installation. The largest PFlop/EFlop-class SCs have millions of cores (and also draw many megawatts). If one 4U chassis can comfortably hold 16 blades and 1∞L has a room with 30 racks full of them, that would be a shitload of cores. They might have to contract with Cray for the high-performance interconnect, though.
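Back-of-envelope, in Python, with every number below being my guess rather than anything reported (blades per chassis, chassis per rack, SoCs per blade, 76 GPU cores for a fully enabled M2 Ultra):

```python
# Back-of-envelope core count for a hypothetical server room.
# Every figure here is an assumption for illustration, not a reported spec.

racks = 30                 # racks in the room
chassis_per_rack = 10      # 4U chassis per 42U rack, leaving space for power/switching
blades_per_chassis = 16    # blades per 4U chassis, as speculated above
socs_per_blade = 2         # the "pair of M2 Ultras with a memory bridge" idea
gpu_cores_per_soc = 76     # GPU cores in a fully enabled M2 Ultra

total_socs = racks * chassis_per_rack * blades_per_chassis * socs_per_blade
total_gpu_cores = total_socs * gpu_cores_per_soc

print(f"SoCs: {total_socs:,}")            # 9,600
print(f"GPU cores: {total_gpu_cores:,}")  # 729,600
```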

(Remember when Steve told us the G4 was classified as a SC?)
 
Last edited:

Altaic

Power User
Posts
166
Reaction score
208
Pretty sure Apple’s ML infrastructure will be devoted to R&D, not inference. They may collect non-opt-out, on-device data with differential privacy, but that’s different. This guy gets it:

[attached image: IMG_3203.jpeg]
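(For anyone wondering what collecting data “with differential privacy” can look like, here is a toy local-DP sketch using randomized response. Purely illustrative; Apple’s actual mechanisms are more elaborate.)

```python
import random

def randomized_response(truth: bool, p_truth: float = 0.75) -> bool:
    """Report the true bit with probability p_truth, otherwise a fair coin flip.
    Any single report is deniable, but population-level rates stay estimable."""
    if random.random() < p_truth:
        return truth
    return random.random() < 0.5

def estimate_true_rate(reports: list[bool], p_truth: float = 0.75) -> float:
    """Invert the noise: observed = p_truth * true_rate + (1 - p_truth) * 0.5."""
    observed = sum(reports) / len(reports)
    return (observed - (1 - p_truth) * 0.5) / p_truth

# Simulate 100k users, 30% of whom actually use some feature.
reports = [randomized_response(random.random() < 0.30) for _ in range(100_000)]
print(f"Estimated usage rate: {estimate_true_rate(reports):.3f}")  # ≈ 0.30
```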
 
Last edited:

dada_dave

Elite Member
Posts
2,363
Reaction score
2,381
Pretty sure Apple’s ML infrastructure will be devoted to R&D, not inference. They may collect non-opt-out, on-device data with differential privacy, but that’s different. This guy gets it:

[attached image: IMG_3203.jpeg]
I’m not sure I understand what API vs local means in this context. Or rather, I get local, but I’m unsure what API is here. Is there another meaning beyond Application Programming Interface, or some way it applies here that I’m missing?
 

casperes1996

Power User
Posts
232
Reaction score
257
I’m not sure I understand what API vs local means in this context. Or rather, I get local, but I’m unsure what API is here. Is there another meaning beyond Application Programming Interface, or some way it applies here that I’m missing?
Took me a while to get that too. They mean using cloud provider APIs, like the one behind ChatGPT. I think the AI crowd uses it as shorthand because the model-running programs either let you run locally or paste in an API access token, so "API" ends up meaning hosted/online models.
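In code terms, the split the chart is drawing looks roughly like this. A sketch only: the first function assumes the openai>=1.0 Python client, and run_local_model is a made-up stand-in for whatever local runtime (llama.cpp, MLX, etc.) you would actually plug in.

```python
# "API" usage: the model runs on a provider's servers; you send prompts over HTTPS
# with an access token. (openai>=1.0 style client shown as one example.)
from openai import OpenAI

def ask_via_api(prompt: str) -> str:
    client = OpenAI(api_key="sk-...")  # the API token is what marks this as "API" usage
    resp = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

# "Local" usage: the weights sit on your own disk and run on your own silicon.
def run_local_model(weights_path: str, prompt: str) -> str:
    """Hypothetical stand-in for a local runtime (llama.cpp, MLX, ...)."""
    raise NotImplementedError("plug in your local inference runtime here")

def ask_locally(prompt: str) -> str:
    return run_local_model("~/models/llama-8b.gguf", prompt)
```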
 

throAU

Site Champ
Posts
285
Reaction score
310
Location
Perth, Western Australia
I’m struggling to make sense of some of the rumours surrounding this initiative. There was a report today by The Information:
https://9to5mac.com/2024/05/29/apple-ai-confidential-computing-ios-18/ (The Information link unavailable at the time of writing)

It states that Apple has a way to perform more demanding AI tasks in the cloud while maintaining privacy. That’s all great, but what perplexes me is:

A) Can they really provide enough compute for these very demanding tasks using M2 Ultras? These are nearly beaten by an M3 Max now, and big iron from Nvidia will certainly destroy them. Can they really support their user base of ~1 billion users?

B) If not just M2 Ultras, then what? The article states that later in the year Apple will move to M4 chips; I assume the Ultra. Gurman has repeatedly said no M4 Ultra until mid-2025. He could certainly be wrong, but he has doubled down on it recently. Surely the M4 Ultra has to come sooner if they are to be even remotely competitive with their own chips?

C) Perhaps the rumours of their own servers for AI are just wrong. That is, they are making them, but they handle only a small component, and that detail is lost in the rumour.

Server hardware is usually last generation, or late in a generation, because that’s when the bugs have been sorted out.

Whilst the M2 Ultra may not be the latest and greatest, it is tested, and like anyone else Apple will be scaling out.

It’s not like one server is going to use one of these chips. Think probably 4 or 8 per blade, say 40 blades per rack, and hundreds of racks in a datacentre…

Whilst your modern iDevice might compete with a single M2 Ultra, it certainly won’t be as fast as a cluster of even 300 of them, and Apple will no doubt have hundreds of thousands or millions of them in their cloud.
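Rough numbers to put that in perspective; the ~27 TFLOPS FP32 figure is the commonly quoted one for a 76-core M2 Ultra GPU, and the cluster dimensions are just my guesses:

```python
# Very rough scale comparison: a cluster of M2 Ultras vs. one chip.
# ~27.2 TFLOPS FP32 is the commonly quoted figure for a 76-core M2 Ultra GPU;
# the cluster dimensions below are assumptions for illustration.

tflops_per_chip = 27.2
chips_per_blade = 8
blades_per_rack = 40
racks = 300                  # "hundreds of racks"

cluster_chips = chips_per_blade * blades_per_rack * racks
cluster_pflops = cluster_chips * tflops_per_chip / 1000

print(f"Chips in cluster: {cluster_chips:,}")      # 96,000
print(f"Aggregate FP32: ~{cluster_pflops:,.0f} PFLOPS "
      f"(vs ~{tflops_per_chip} TFLOPS for one chip)")  # ~2,611 PFLOPS
```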
 

dada_dave

Elite Member
Posts
2,363
Reaction score
2,381
How much of a GPU core's logic is graphic-specific? If you shave out shading and RT to get a PEU (Parallel Embarrassing Unit), is it notably smaller and more efficient at the kind of work a server would be doing?
GPU cores are really designed for FP-heavy loads: they have no branch prediction, and the threads in a warp all have to take the same branch to stay fully parallel. So they’re not like little CPU cores that would be useful for general server workloads.

If the primary purpose is ML training, and maybe large-scale inference, then obviously the GPUs are good for that, though it would be better if they had dedicated matmul accelerators for the purpose.
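A NumPy analogy for the divergence point, since it’s easier to see in code: SIMT hardware doesn’t really branch per lane; when lanes of a warp disagree, in effect both sides of the branch get evaluated and a mask selects each lane’s result, so divergent code pays for both paths. Conceptual sketch only, not real GPU code:

```python
import numpy as np

x = np.random.randn(32)  # think of this as one 32-thread warp

# Scalar CPU code would branch per element:
#     y = sqrt(x) if x > 0 else 2 * x
# A warp instead predicates: both paths are effectively computed for every
# lane, and a per-lane mask picks which result is kept.
mask = x > 0
then_path = np.sqrt(np.where(mask, x, 1.0))  # "then" side, evaluated for all lanes
else_path = 2.0 * x                          # "else" side, evaluated for all lanes
y = np.where(mask, then_path, else_path)
```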
 

The Hardcard

New member
Posts
4
Reaction score
7
I’m struggling to make sense of some of the rumours surrounding this initiative. There was a report today by The Information:
https://9to5mac.com/2024/05/29/apple-ai-confidential-computing-ios-18/ (The Information link unavailable at the time of writing)

It states that Apple has a way to perform more demanding AI tasks in the cloud while maintaining privacy. That’s all great, but what perplexes me is:

A) Can they really provide enough compute for these very demanding tasks using M2 Ultras? These are nearly beaten by an M3 Max now, and big iron from Nvidia will certainly destroy them. Can they really support their user base of ~1 billion users?

B) If not just M2 Ultras, then what? The article states that later in the year Apple will move to M4 chips; I assume the Ultra. Gurman has repeatedly said no M4 Ultra until mid-2025. He could certainly be wrong, but he has doubled down on it recently. Surely the M4 Ultra has to come sooner if they are to be even remotely competitive with their own chips?

C) Perhaps the rumours of their own servers for AI are just wrong. That is, they are making them, but they handle only a small component, and that detail is lost in the rumour.
I’m just another tech nerd speculator here, no special or professional insight:

1. The bottleneck for large language model token generation is memory bandwidth. For this part of inference both Ultras are about the same speed, and nearly double the speed of each Max (rough sketch below). The M1 generation, at least, appears not to use its theoretical maximum bandwidth; later generations come closer.

The other important factor for running the largest, most useful models is memory capacity, where the M2 Ultras also crush all the Max variations.

I would hope the datacenter team would arrange to have the M2 Ultras fully populated with faster RAM than the consumer Ultras.

2. If these rumors are true, it would make sense for the M4 Max and Ultra to be available to consumers later. Serving their user base with datacenter AI is going to require many hundreds of thousands of Ultras. They might need two to three quarters of M4 Ultra production before they can spare any to sell.
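The rough sketch I mentioned: for batch-1 token generation, essentially the whole set of (active) weights has to stream through memory once per token, so tokens/s is roughly bandwidth divided by model size in bytes. The 400/800 GB/s figures are the advertised M2 Max/Ultra numbers; the 70B model at 8 bits per weight is my own assumption:

```python
# Batch-1 decode estimate: tokens/s ≈ memory bandwidth / bytes streamed per token.
# 400 and 800 GB/s are the advertised M2 Max / M2 Ultra bandwidths;
# the 70B model at 8-bit quantization is an assumption for illustration.

def tokens_per_second(bandwidth_gb_s: float, params_b: float, bytes_per_param: float) -> float:
    model_gb = params_b * bytes_per_param  # weights read once per generated token
    return bandwidth_gb_s / model_gb

for name, bw in [("M2 Max  (~400 GB/s)", 400.0), ("M2 Ultra (~800 GB/s)", 800.0)]:
    rate = tokens_per_second(bw, params_b=70, bytes_per_param=1.0)
    print(f"{name}: ~{rate:.0f} tok/s on a 70B model @ 8-bit")
# M2 Max:   ~6 tok/s
# M2 Ultra: ~11 tok/s -- roughly double, matching point 1 above
```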

According to Semianalysis:

”The other indication that Cupertino is serious about their AI hardware and infrastructure strategy is they made a number of major hires a few months ago. This includes Sumit Gupta who joined to lead cloud infrastructure at Apple in March. He’s an impressive hire. He was at Nvidia from 2007 to 2015, and involved in the beginning of Nvidia's foray into accelerated computing. After working on AI at IBM, he then joined Google’s AI infrastructure team in 2021 and eventually was the product manager for all Google infrastructure including the Google TPU and Arm based datacenter CPUs.

He’s been heavily involved in AI hardware at Nvidia and Google who are both the best in the business and are the only companies that are deploying AI infrastructure at scale today. This is the perfect hire.”

Hopefully he is in discussions with the chip team about optimizing Apple Silicon’s capabilities in this arena.
 