M2 Low power mode and the future.

jbailey · Jul 27, 2022

casperes1996 said:
I think "something bad" is a safe bet regardless of what you remember

A modern architecture would probably throw some sort of exception but I don't think the 8051/8052 did anything sensible. Probably the SP just rolled over to 0 and kept writing on memory until it crashed.

Cmaier · Jul 27, 2022

jbailey said:
A modern architecture would probably throw some sort of exception but I don't think the 8051/8052 did anything sensible. Probably the SP just rolled over to 0 and kept writing on memory until it crashed.

When you are limited in how many transistors you can fit on there, you make dumb architectural decisions, thinking “we’ll ditch this architecture and move onto something better later” and then you are stuck with it for a few decades.

jbailey · Jul 27, 2022

Cmaier said:
When you are limited in how many transistors you can fit on there, you make dumb architectural decisions, thinking “we’ll ditch this architecture and move onto something better later” and then you are stuck with it for a few decades.

The C compiler we tried to use didn't make matters any better if I remember correctly (it was a long time ago). It didn't make any attempt at compile time to do a sanity check on how much data was getting pushed on the stack. Obviously it was early in the project when we discovered that C really wasn't going to cut it so we just used the C compiler to produce Assembly that we then hand optimized with hardcoded external memory parameter blocks. It was ugly. Reading and writing to external memory was very slow and painful. All of our comments was the C code that we started with before we changed it to ASM51. Lots of ugly macros.

OK, now I need go back to something less painful. Javascript.

casperes1996 · Jul 27, 2022

jbailey said:
A modern architecture would probably throw some sort of exception but I don't think the 8051/8052 did anything sensible. Probably the SP just rolled over to 0 and kept writing on memory until it crashed.

And there's probably some tacky code out there intentionally abusing rollover behaviour. *Starts thinking of the A20 line of x86 IBM Compatibles*

jbailey said:
The C compiler we tried to use didn't make matters any better if I remember correctly (it was a long time ago). It didn't make any attempt at compile time to do a sanity check on how much data was getting pushed on the stack. Obviously it was early in the project when we discovered that C really wasn't going to cut it so we just used the C compiler to produce Assembly that we then hand optimized with hardcoded external memory parameter blocks. It was ugly. Reading and writing to external memory was very slow and painful. All of our comments was the C code that we started with before we changed it to ASM51. Lots of ugly macros.

OK, now I need go back to something less painful. Javascript.

Only marginally less painful

Cmaier · Jul 27, 2022

casperes1996 said:
And there's probably some tacky code out there intentionally abusing rollover behaviour. *Starts thinking of the A20 line of x86 IBM Compatibles*

Only marginally less painful

Ah, the infamous A20 gate. God that was a pain.

Yoused · Jul 27, 2022

Someone started a thread over elsewhere on how M2 has lower performance/W than M1, which rather early on generated

Xiao_Xi said:
…
I wonder where cmaier is when you need him most.

Sydde said:
…
He got banned, AAUI, for, reasons.

Romain_H said:
What did he do this time?

Sydde said:
His side of the story is something like "I looked under the bridge and became angry that it had been allowed to reach such a state". If you were to ask, say, weaselbot, you would probably get a completely different narrative, about snowflakes and hurt fee-fees and how I should be reprimanded for even mentioning it.

It is not a very active thread – it looks like the fud-baiters are losing traction.

Cmaier · Jul 27, 2022

FWIW, a higher voltage IS required to hit a higher frequency, for a given design with a given circuit, etc. The rate of the voltage ramp on the output of a CMOS logic gate - that is, the slope - is a function of the voltage difference Vdd and Vss. So what happens is that if you speed up the clocks but don’t speed up those ramps, then signals can’t switch and propagate through the path quickly enough. You also get cross-coupling between signals which can cause up to a 2x delay if you have too big a difference in the slopes between neighboring wires. So for a bunch of reasons, when you are “turbo boosting” or whatever you want to call it, you typically modulate the voltage along with the clock. This all stems from the fact that the source—drain current is a function of the voltage difference between the gate and the source. Current is just how much charge you move per second. So the more charge you move per second, the faster you can put charge on the output of the gate. And the more charge you have on the output of the gate, the higher its voltage.

Cmaier · Jul 28, 2022

This is fun.

https://www.twitter.com/i/web/status/1552797680723628032/

theorist9 · Jul 28, 2022

casperes1996 said:
But yeah it's wild with how the old chips stick around. One of the very recent space rovers (probably Mars) was powered by a PowerPC G3 IIRC. Lots of monitors with old 80186, 80286 like chips too.

They use those old chips because their large feature size makes them more reistant to the relatively high levels of radiation they encounter in space. Plus NASA has lots of experience with those chips, so they know how well they'll hold up in that environment. The actual chip used in Perseverance is a RAD750*, which is a radiation-hardened version of the PowerPC 750. And it's quite expensive—about $200k – $300k (nearly as much as Apple will be charging for the 7k Pro Display XDR). Though NewEgg does occasionally run sales. It's been used in about 150 missions, including the Webb Space Telescope.

*https://en.wikipedia.org/wiki/RAD750

casperes1996 · Jul 28, 2022

theorist9 said:
They use those old chips because their large feature size makes them more reistant to the relatively high levels of radiation they encounter in space. Plus NASA has lots of experience with those chips, so they know how well they'll hold up in that environment. The actual chip used in Perseverance is a RAD750*, which is a radiation-hardened version of the PowerPC 750. And it's quite expensive—about $200k – $300k (nearly as much as Apple will be charging for the 7k Pro Display XDR). Though NewEgg does occasionally run sales. It's been used in about 150 missions, including the Webb Space Telescope.

*https://en.wikipedia.org/wiki/RAD750

That's it yeah. I understand the reasons and it all makes sense. It's just another interesting case of old chip designs still being used, albeit in an altered form, here the radiation resistance and whatnot to make it survive the harsh environments

Jimmyjames · Jul 29, 2022

Cmaier said:
This is fun.

https://www.twitter.com/i/web/status/1552797680723628032/

Indeed. It's from the video I posted.

Yoused · Jul 29, 2022

The Voyagers have been transitting the most hostile regions of local space, where cosmic ray flux is really, really high. Their computers were designed over 48 years ago, and one of them finally seems to be showing signs of psychosis.

leman · Aug 5, 2022

Cmaier said:
I think it more likely that someday apple might go to three levels of cores, with performance cores, intermediate cores, and efficiency cores, and that the current efficiency cores would make decent intermediate cores. That said, while i find that MORE likely than them scaling up the efficiency cores into p-cores, i DON’T find it VERY likely.

Wouldn't it make more sense to use the current crop of P-cores as intermediate cores and then have a "real" P-core with higher power consumption (think desktop use)?

Cmaier · Aug 5, 2022

leman said:
Wouldn't it make more sense to use the current crop of P-cores as intermediate cores and then have a "real" P-core with higher power consumption (think desktop use)?

Maybe. I don’t think 3 levels makes a lot of sense either way, though.

jbailey · Aug 5, 2022

Cmaier said:
Maybe. I don’t think 3 levels makes a lot of sense either way, though.

ARM Inc seems to disagree. Is it just that they don't have enough performance in their Performance cores do you think? The Arm trend seems to be three or more levels of performance (more than three on embedded).

Cmaier · Aug 5, 2022

jbailey said:
ARM Inc seems to disagree. Is it just that they don't have enough performance in their Performance cores do you think? The Arm trend seems to be three or more levels of performance (more than three on embedded).

It’s a design trade off, I guess. But if you can make performance cores as energy-efficient as Apple does, I don’t see why you would need another level below that (performance-wise). And it’s not clear to me that if Apple wanted a higher level of performance for a third level that it would make sense (given the performance already in Apple’s p-cores) - apple hasn’t really left much on the table where they can pull out a bunch more performance and throw energy to the wind.

Yoused · Aug 5, 2022

Cmaier said:
if you can make performance cores as energy-efficient as Apple does, I don’t see why you would need another level below that

I still contend that they are working toward a hybrid, where the E/H cores will function quite well at very low power but have a wart on the side that will allow them to ramp up to P level. Already Apple's E cores have improved significantly, by maybe as much as 3~4x compared to M1. Apple will probably reach practical core convergence by M4 or 5.

Cmaier · Aug 5, 2022

Yoused said:
I still contend that they are working toward a hybrid, where the E/H cores will function quite well at very low power but have a wart on the side that will allow them to ramp up to P level. Already Apple's E cores have improved significantly, by maybe as much as 3~4x compared to M1. Apple will probably reach practical core convergence by M4 or 5.

Could be. I don’t know enough about real-life Apple workloads to know what makes the most sense. In general, I would always assume I could benefit from a “low power core” which is missing some stuff that isn’t needed to eke out the highest possible performance; simply disabling that stuff when not needed will not be as power-efficient, because there’s a coarseness to the power grid. Disabling clocks is easy and good, but disabling the power rail is better, and it’s hard to do that in a very fine-grained way.

M2 Low power mode and the future.

jbailey

Power User

Cmaier

Site Master

jbailey

Power User

casperes1996

Power User

Cmaier

Site Master

Yoused

up

Cmaier

Site Master

Cmaier

Site Master

theorist9

Site Champ

casperes1996

Power User

Jimmyjames

Site Champ

Yoused

up

leman

Site Champ

Cmaier

Site Master

jbailey

Power User

Cmaier

Site Master

Yoused

up

Cmaier

Site Master

Similar threads