A16 Bionic

Andropov

Site Champ
Posts
620
Reaction score
780
Location
Spain
A dedicated thread to talk about Apple's latest chip :)

Seems like this leaked Geekbench result (https://browser.geekbench.com/v5/cpu/17141095) is more in line with expectations: 1887 in single core, 5455 in multicore (+9.2%, +13.4%). Interestingly, the biggest improvement in multicore is in AES-XTS, which has jumped quite a bit (+36%), despite not improving as much in single core (+8.7%). Could it be that the new E cores are significantly better at this particular subtest? Or maybe it's a result of the increased memory bandwidth?
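
For reference, back-computing the implied A15 baselines from those deltas (a quick Swift one-off; the only assumption is that the percentages are measured against iPhone 13 Pro scores):

Code:
// Implied A15 baselines, derived purely from the leaked A16 scores and deltas
let a16Single = 1887.0, a16Multi = 5455.0
let singleGain = 1.092, multiGain = 1.134   // +9.2%, +13.4%
print(a16Single / singleGain)   // ~1728 single core
print(a16Multi / multiGain)     // ~4811 multicore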

No Geekbench compute results yet, sadly.
 

exoticspice1

Site Champ
Posts
298
Reaction score
101
I expect the GPU to be a 4-5% improvement due to LPDDR5, and also because the GPU is largely untouched from the A15; otherwise Apple would have said so in their marketing.
 

Andropov

Site Champ
Posts
620
Reaction score
780
Location
Spain
I expect the GPU to be a 4-5% improvement due to LPDDR5, and also because the GPU is largely untouched from the A15; otherwise Apple would have said so in their marketing.
Hard to know, it'll depend on the workload. I took a frame capture of Apple's Modern Rendering With Metal sample code, which is the closest thing I have at hand that resembles a 'typical' game engine, to see how much of a limitation memory bandwidth (which has improved by 50%) is during a typical frame. Xcode's GPU profiling tools show the main memory limiter at ~29% on average during a frame on an iPhone 13 Pro. Hence I'd expect the same frame on an A16 Bionic with +50% memory bandwidth to be around (gross estimate) ~9% faster due to memory bandwidth improvements alone.
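
Back-of-envelope, assuming only the bandwidth-bound fraction of the frame shrinks and everything else stays fixed (that's the gross part of the estimate):

Code:
import Foundation

// ~29% of the frame is bound on main memory (Xcode frame capture, iPhone 13 Pro)
// and the A16's memory bandwidth is 1.5x the A15's.
let memoryBoundFraction = 0.29
let bandwidthGain = 1.5
let newFrameTime = (1 - memoryBoundFraction) + memoryBoundFraction / bandwidthGain
print(String(format: "new frame time ~%.1f%% of the old one", newFrameTime * 100))
// ~90.3%, i.e. roughly 9-10% faster from the extra bandwidth alone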

And while the Modern Rendering With Metal sample code may look like a traditional game engine, it's highly optimized for TBDR, which comes with lower memory bandwidth usage than less optimized renderers, so 3rd party games may saturate the memory bandwidth more often and therefore benefit even more from the increased bandwidth.

Even then, that's assuming that Apple didn't change the GPU cores *at all* this year, which still seems unlikely to me. Unprecedented, even. But I'll grant that it's odd that Apple didn't mention anything about the GPU at the keynote.
 

Cmaier

Site Master
Staff Member
Site Donor
Posts
5,350
Reaction score
8,558
Interestingly, the biggest improvement in multicore is in AES-XTS, which has jumped quite a bit (+36%), despite not improving as much in single core (+8.7%). Could it be that the new E cores are significantly better at this particular subtest? Or maybe it's a result of the increased memory bandwidth?

Hard for me to figure out why AES-XTS would have such a jump in MP vs. SP, given the description of the test here: https://www.geekbench.com/doc/geekbench5-cpu-workloads.pdf. Shouldn’t be a lot of dependency on memory bandwidth. I think your first guess may be right - the E cores got better at it.
 

Andropov

Site Champ
Posts
620
Reaction score
780
Location
Spain
I think your first guess may be right - the E cores got better at it.
Hmm. If that's the case, the E cores must be an order of magnitude faster at it now. I was trying to infer how much of the A15 Bionic's multicore AES-XTS score was due to the E cores by subtracting 2x the single-core result from the total multicore score. But it seems going from 1P to the full 2P + 4E cores doesn't even double performance on the A15: it goes from 4.82GB/s to 8.28GB/s, ~1.71x the single core result. Not great scaling. So we can't tell how much of that (if any) is due to the E cores.

On the other hand, the A16 Bionic goes from 5.24GB/s on 1P core to 14.2GB/s on 2P + 4E cores, ~2.71x the single core result. So the E cores must now account for at least 3.72GB/s of the multicore result, potentially more (since we know the scaling wasn't 2x on the A15), up to ~5GB/s. Quite impressive if true, given how little the A15's E cores contributed to the AES-XTS multicore score.
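
Spelling out the arithmetic (the bounds hinge on how well the two P cores scale, which is the shaky assumption here):

Code:
// Geekbench 5 AES-XTS throughput, GB/s, from the results quoted above
let a15 = (single: 4.82, multi: 8.28)
let a16 = (single: 5.24, multi: 14.2)

let a15Scaling = a15.multi / a15.single   // ~1.71x for the whole 2P+4E cluster
// Lower bound: both P cores sustain full single-core throughput,
// so everything beyond 2x single-core must come from the E cores.
let eCoreLower = a16.multi - 2 * a16.single            // ~3.72 GB/s
// Loose upper bound: the A16's two P cores scale no better than the
// A15's entire cluster did, leaving more headroom for the E cores.
let eCoreUpper = a16.multi - a15Scaling * a16.single   // ~5.2 GB/s
print(eCoreLower, eCoreUpper)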
 

Andropov

Site Champ
Posts
620
Reaction score
780
Location
Spain
Whoa, +28% on GPU. New GPU core design confirmed? Seems too high to be just from increased memory bandwidth and better process node.

+17% multicore CPU score would already be great on its own, but Apple claims to have '20% lower power' on top of that. Did they mean 20% less power at the same performance level, or something like that? +17% performance at 80% of the power sounds too good to be true.
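
Taken at face value (and that's a big 'if' about what Apple actually meant), the two claims compound:

Code:
// +17% multicore at 80% of the power would be ~1.46x perf/W in one generation
let perfGain = 1.17
let powerRatio = 0.80
print(perfGain / powerRatio)   // ~1.46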
 

Cmaier

Site Master
Staff Member
Site Donor
Posts
5,350
Reaction score
8,558
Whoa, +28% on GPU. New GPU core design confirmed? Seems too high to be just from increased memory bandwidth and better process node.

+17% multicore CPU score would already be great on its own, but Apple claims to have '20% lower power' on top of that. Did they mean 20% less power at the same performance level, or something like that? +17% performance at 80% of the power sounds too good to be true.

Lots of good questions. It does seem that, over the last couple of iterations on the A-series side, Apple has stopped trying to talk up its performance improvements and is letting customers be surprised (in a good way) by what they find.

As for the 20%, I think they did NOT mean ”at the same performance,” because in the same sentence they talked about something else as being ”x% less power at the same performance” - I can’t remember what they were talking about.
 

Joelist

Power User
Posts
177
Reaction score
168
Is Geekbench the one that needs to be patched? Remember one of the big benchmarks was WAY off on Apple Silicon performance.
 

Colstan

Site Champ
Posts
822
Reaction score
1,124
Whoa, +28% on GPU. New GPU core design confirmed?
Obviously, we're many steps away from the A16's architecture making it into an M-series product, but I'm hoping this is true, because GPU performance seems to be the one area where Apple Silicon Macs still fall short compared to PCs. I'm sure we all remember this infamous graph:

dubious.png


I think it's still unclear exactly what Apple was attempting to communicate. I had assumed they were trying to say that the M1 Ultra performs about the same as an RTX 3090 if both are using the same wattage, but they communicated it in the most ham-fisted way possible. It gave off the impression that the Ultra had nearly identical performance to the 3090, which was immediately proven untrue once the Mac Studio was independently benchmarked.

It's a shame, because the M1 Ultra is an otherwise extremely impressive SoC, yet Apple needlessly gave it a self-inflicted black eye. According to Geekbench, the top-end M1 Ultra with a 20-core CPU, 64-core GPU, and 128GB of unified memory blows away the competition in single-core, multi-core, and tasks that take advantage of specialized co-processors on the SoC.

Here are the CPU scores for the 28-core Intel Xeon W-3275M inside the top-end 2019 Mac Pro:

28core.jpg


Here are the results for the M1 Ultra inside the high-end Mac Studio:

m1ultra.jpg


That's remarkable, and keep in mind that the 28-core Mac Pro starts at $13,000 while a similar Mac Studio is about $6,200.

However, this same configuration of the Mac Studio, specifically upgraded to the 64-core GPU, falls short on Geekbench's GPU tasks compared to similar PCs, which detracts from the otherwise stellar performance of high-end Apple Silicon:

ultracompared.jpg


The M1 Ultra is no slouch, and that's impressive considering this is Apple's first desktop GPU, but they shouldn't be claiming any sort of parity with Nvidia's top-performing part. The M1 Ultra destroyed the Mac Pro on CPU, but it barely bested the 6600 XT, which just so happens to be the MPX module I recently ordered for my Mac Pro.

I'm hoping that @Andropov is right and that Apple has refocused considerably on Apple Silicon's GPU performance, the area that seems to need the most work, and that @Cmaier is also correct that Apple is going to let the chips do the talking, rather than vague, occasionally bizarre charts spawned from Apple's marketing department.

Regardless, if the GPU improvements prove correct with the A16, this bodes well for all Apple Silicon products moving forward.
 

Cmaier

Site Master
Staff Member
Site Donor
Posts
5,350
Reaction score
8,558
I'm hoping that @Andropov is right and that Apple has refocused considerably on Apple Silicon's GPU performance, the area that seems to need the most work, and that @Cmaier is also correct that Apple is going to let the chips do the talking, rather than vague, occasionally bizarre charts spawned from Apple's marketing department.

Regardless, if the GPU improvements prove correct with the A16, this bodes well for all Apple Silicon products moving forward.

I think Apple is going to have to have a decent answer for at least GPU compute for the “extreme” chip for Mac Pro. They also need astounding performance/watt for what they hope to do with glasses. They have to be working very hard on the GPU side.
 

Yoused

up
Posts
5,639
Reaction score
8,976
Location
knee deep in the road apples of the 4 horsemen
Here are the CPU scores for the 28-core Intel Xeon W-3275M inside the top-end 2019 Mac Pro:

28core.jpg


Here are the results for the M1 Ultra inside the high-end Mac Studio:

m1ultra.jpg


That's remarkable, and keep in mind that the 28-core Mac Pro starts at $13,000 while a similar Mac Studio is about $6,200.
But remember, the 28-core Xeon in the Mac Pro had a base clock of 2.5GHz (if you are running more than about a third of the cores, count on base clock) while the Ultra runs at 3.23GHz. 2.5 x 28 (or 1.1 x 56, being generous) should still be better than 3.23 x 16 (though add in 4 x 2.0 and that may make the difference).
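
A crude clocks-times-cores tally of that (GHz x cores deliberately ignores IPC, which is presumably where the Ultra makes up the gap):

Code:
// Numbers from the post above; GHz x cores is only a proxy for throughput
let xeonBase = 2.5 * 28               // 70.0 at all-core base clock
let xeonGenerous = 1.1 * 56           // 61.6 counting all 56 threads generously
let ultraPCores = 3.23 * 16           // ~51.7 for the 16 P cores
let ultraTotal = 3.23 * 16 + 2.0 * 4  // ~59.7 with the 4 E cores at ~2.0GHz
print(xeonBase, xeonGenerous, ultraPCores, ultraTotal)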
 

Colstan

Site Champ
Posts
822
Reaction score
1,124
But remember, the 28-core Xeon in the Mac Pro had a base clock of 2.5GHz (if you are running more than about a third of the cores, count on base clock)
I can't speak to the M1 Ultra, but I've been running some stress tests on my Mac Pro with a Xeon W-3245. While this is all preliminary, I started out with the Intel Power Gadget utility. First off, as far as I can tell, Turbo Boost 3.0 isn't enabled on the Mac Pro, because I was never able to push it to 4.6GHz, just the normal max boost of 4.4GHz. The only way I was able to sustain 4.4GHz was with a single thread at 100%; bump that up to two threads and it drops to 4.2GHz. Increase it to four threads and it goes down again to 4.1GHz.

Here's where it gets weird. This Xeon is 16C/32T; when maxing out all 32 threads at 100%, all cores run at 3.9GHz. It isn't until I enable AVX that it starts taking a major toll on the CPU. With AVX-256, it drops down to 3.6GHz. Finally, when running AVX-512, it sits at the rated 3.2GHz.

As I said, this is just my first test, and my initial assumption is that something seems off. However, coming from a Core i3, I'm hardly in a position to be certain of that, and I've been using Intel's own stress test. Temps never get above 70C, while the Tcase is 77C, so there's a bit of thermal headroom. My initial conclusion is that Intel's boost clocks are fairy tales composed of unicorn tears, smoke powder, and liquid joy. However, at least in the Mac Pro, which is very much a quality thermal solution, Xeons are able to stay well above their rated speeds for sustained workloads, as long as you keep them away from AVX-512, which is used by very few users or applications. Unless you are emulating a PlayStation 3, which apparently gets a 30% boost from AVX-512, not that I plan on doing that.

Even with 32 threads at 100%, the fans stay reasonably silent, only ramping somewhat, but not annoyingly so. Putting my hand behind the case, I can feel a substantial increase in heat output, so Intel Power Gadget is definitely working it hard. It does make me wonder how much effort Apple is going to put into cooling the Apple Silicon Mac Pro. I think it's more likely that the SoC itself is going to be the limiting factor, rather than thermal constraints, unless they decide to follow Intel and AMD into power consumption crazy town. Even though I'm certainly not in the market for one, I'm very much looking forward to what form the next Mac Pro takes, because it's perhaps going to be the most exotic Apple Silicon Mac, and therefore the most interesting.
 

Andropov

Site Champ
Posts
620
Reaction score
780
Location
Spain
Obviously, we're many steps away from the A16's architecture making it into an M-series product, but I'm hoping this is true, because GPU performance seems to be the one area where Apple Silicon Macs still fall short compared to PCs. I'm sure we all remember this infamous graph:

I think it's still unclear exactly what Apple was attempting to communicate. I had assumed they were trying to say that the M1 Ultra performs about the same as an RTX 3090 if both are using the same wattage, but they communicated it in the most ham-fisted way possible. It gave off the impression that the Ultra had nearly identical performance to the 3090, which was immediately proven untrue once the Mac Studio was independently benchmarked.
At the same wattage, the M1 Ultra is much faster than the 3090. The lowest power consumption of the 3090 is higher than the highest power consumption of the M1 Ultra. I believe the graph must reflect some kind of internal test where the M1 Ultra is rasterizing the same scene as the 3090. TBDR GPUs excel at rasterization if the graphics pipeline is designed with TBDR in mind. Maybe the problem is that no one is doing that kind of optimization (yet?). I doubt they made the graph out of thin air.

In any case, the best thing for Apple to do is what they're already doing: helping out in open source applications (like Blender) so they can have a properly designed backend that takes advantage of their GPUs.

I'm hoping that @Andropov is right and that Apple has refocused considerably on Apple Silicon's GPU performance, the area that seems to need the most work, and that @Cmaier is also correct that Apple is going to let the chips do the talking, rather than vague, occasionally bizarre charts spawned from Apple's marketing department.
They probably didn't want to hype the A16 Bionic's performance too much at this year's keynote, since the iPhone 14 is not getting it. I'm sure they'll go back to roasting the competition when they release the M2 Pro/Max or the M3.
 

Jimmyjames

Site Champ
Posts
680
Reaction score
768
The M1 Ultra gets 260fps in GFXBench 4K Aztec Ruins.

The Nvidia 3090 gets 235fps in the same test.

It does seem like in pure raster performance the Ultra is as fast as or faster than the 3090; not sure why the reviews say otherwise (although, having read some lately, they mostly lack competence). The Ultra does fall behind in pure compute performance, though. Apple should have been clearer, but they weren't misleading.
 

Joelist

Power User
Posts
177
Reaction score
168
I found it - it's Geekbench, specifically that GB Compute is too short and bursty - it doesn't let Apple Silicon ramp up all the way before quitting. It's why other benchmarks and real-world scenarios show better performance. It was referenced on AnandTech (by Andrei Frumusanu specifically).
 

Cmaier

Site Master
Staff Member
Site Donor
Posts
5,350
Reaction score
8,558
I found it - it's Geekbench, specifically that GB Compute is too short and bursty - it doesn't let Apple Silicon ramp up all the way before quitting. It's why other benchmarks and real-world scenarios show better performance. It was referenced on AnandTech (by Andrei Frumusanu specifically).
Ah, thought you were referring to the CPU tests in GB, not the GPU.
 