Google Tensor CPU

Cmaier

Finally a non-Apple post by me :)

So it looks like Tensor has 3 levels of core, with the highest-performance tier being a pair of Cortex-X1s? That puts them behind the A14 (Firestorm), let alone the A15 (Avalanche). 8-wide MOP dispatch instead of 7 would seem to be the only advantage, but the X1 only has 4 integer ALUs (plus 4 FP pipes, instead of 3), so it's unlikely to dispatch 8 very often. Not to mention that fetch is only 5-wide anyway (or 8 MOPs), so it's likely tough to keep the register renamer/scheduler busy unless the instruction stream is very "complex."

Major cache differences, too, but I would tend to think Apple's use of a large shared cache is more efficient than the X1's smaller dedicated caches (since some cores may be running very memory-intensive threads while others aren't; it's unlikely that they are all pegging memory at the same time).

Also curious whether Google did their own physical design, or just used hard IP from Arm.
 

thekev

What's the CPU for? Typically when I read "tensor cores," I think of machine learning applications, and Google is heavily invested in that area. A lot of machine learning workloads fetch data in power-of-2-sized chunks, which are prone to cache conflicts between cores over a shared cache. Multiple caches should help avoid having to worry as much about eviction policies on data where reads greatly outnumber writes.
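To sketch the set-aliasing effect described above with a toy example (the cache geometry here — 32 KiB, 64-byte lines, 8-way — is an assumption for illustration, not any real Tensor cache):

```python
# Toy illustration: power-of-2 strides alias onto very few cache sets.
# Assumed (hypothetical) cache geometry: 32 KiB, 64 B lines, 8-way associative.
LINE_BYTES = 64
NUM_SETS = 32 * 1024 // LINE_BYTES // 8  # 64 sets

def set_index(addr: int) -> int:
    """Which cache set a byte address maps to."""
    return (addr // LINE_BYTES) % NUM_SETS

# A 4 KiB (power-of-2) stride lands every access in the same set:
pow2_sets = {set_index(i * 4096) for i in range(16)}
# Padding the stride by one cache line spreads accesses across sets:
padded_sets = {set_index(i * (4096 + LINE_BYTES)) for i in range(16)}
print(len(pow2_sets), len(padded_sets))  # 1 vs 16
```

Sixteen power-of-2-strided accesses all compete for one set (and its 8 ways), which is the thrashing behavior power-of-2 ML fetch patterns can trigger; per-core caches or padded strides sidestep the worst of it.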
 

Cmaier

thekev said: What's the CPU for? Typically when I read "tensor cores," I think of machine learning applications, and Google is heavily invested in that area. A lot of machine learning workloads fetch data in power-of-2-sized chunks, which are prone to cache conflicts between cores over a shared cache. Multiple caches should help avoid having to worry as much about eviction policies on data where reads greatly outnumber writes.

Why would the CPU core caches matter to the ML cores? Surely the ML cores have their own bus access to the SLC and memory?
 

thekev

Cmaier said: Why would the CPU core caches matter to the ML cores? Surely the ML cores have their own bus access to the SLC and memory?

Given that you referred to a "tensor cpu," I thought you were referring to something machine-learning focused, akin to the FPGA-based co-processor alternatives that have shown up in recent years.
 

Cmaier

thekev said: Given that you referred to a "tensor cpu," I thought you were referring to something machine-learning focused, akin to the FPGA-based co-processor alternatives that have shown up in recent years.

No, sorry. Google named its chip "Tensor" — that's the name of the whole SoC, not just the ML blocks.
 

Cmaier

Looking more and more like Tensor is a rebranded Exynos variant.
 