Exciting unique chips

Cmaier · Aug 31, 2022

Yoused said:
I would not be at all surprised to learn that the register file itself is actually just a rack of 64 nine-bit pointers that identify the current location of the selected register in the 512-entry rename array. In other words, are there, in fact, any actual fixed registers (perhaps r30 and r31) or is the register file just an abstraction? In support of the abstraction idea, I point to SVE/SVE2, which can handle vectors of indeterminate width: this would be greatly simplified by implementing a register rename bouillabaisse and assigning GPRs in ways that facilitate wide-vector instances for SVE.

In every design I worked on there was an actual register file that was the “source of truth.” The renaming just determines which entry in the register file corresponds to which architectural register.

Of course there are lots of other bypass registers floating around, so that the output of a computational unit can feed its results back to the input when appropriate.

Yoused · Aug 31, 2022

Cmaier said:
In every design I worked on there was an actual register file that was the “source of truth.” The renaming just determines which entry in the register file corresponds to which architectural register.

Except, of course, the truth-state of a core is highly non-static. If there are six hundred plus instructions in flight at any given time (or half that for an E-core), the need for a "source of truth" becomes somewhat less obvious. You have 3 or 4 r31s and a LR in r30 (only as a BL target – a return could use any GPR, other than r31). You do need a clean state for exception handling, but exceptions are handled on-core, so the renamer needs only keep sight of where an entry resides relative to the exception boundary, which is not hugely different to determining where an entry resides relative to the code stream. I am just not seeing an advantage to a fixed register file that outweighs the disadvantages, in terms of modern computing.

Cmaier · Sep 1, 2022

Yoused said:
Except, of course, the truth-state of a core is highly non-static. If there are six hundred plus instructions in flight at any given time (or half that for an E-core), the need for a "source of truth" becomes somewhat less obvious. You have 3 or 4 r31s and a LR in r30 (only as a BL target – a return could use any GPR, other than r31). You do need a clean state for exception handling, but exceptions are handled on-core, so the renamer needs only keep sight of where an entry resides relative to the exception boundary, which is not hugely different to determining where an entry resides relative to the code stream. I am just not seeing an advantage to a fixed register file that outweighs the disadvantages, in terms of modern computing.

You have a big physical register file and when you want, say, the contents of R3 you index into the RF’s address CAM to find which register in the register file is the current true R3. True, in this case, means the R3 that is usable as an input to currently issued instructions. There’s a translation on the addressing of the RF that takes place between architectural and physical registers.
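A rough way to picture that lookup, as a toy Python sketch rather than anything resembling real hardware: the class name, the `tag`/`current` fields, and the `map_row` helper are all made up for illustration, and a real CAM compares every row in parallel instead of looping.

```python
class PhysicalRegisterFile:
    """Toy model: data rows with a CAM of architectural tags alongside them."""

    def __init__(self, num_rows):
        self.data = [0] * num_rows         # the stored register values
        self.tag = [None] * num_rows       # architectural register tagged on each row
        self.current = [False] * num_rows  # True if this row is the live copy of its tag

    def lookup(self, arch_reg):
        """Content-addressed search for the row that is the current arch_reg.

        Hardware would match all rows at once; the loop is only a stand-in.
        """
        for row in range(len(self.data)):
            if self.current[row] and self.tag[row] == arch_reg:
                return row
        raise KeyError(f"architectural register {arch_reg} is not mapped")

    def read(self, arch_reg):
        """Read an architectural register through the translation."""
        return self.data[self.lookup(arch_reg)]

    def map_row(self, row, arch_reg, value):
        """Hypothetical helper: make `row` the current copy of arch_reg."""
        for r in range(len(self.data)):
            if self.tag[r] == arch_reg:
                self.current[r] = False    # older copies stop being "current"
        self.tag[row] = arch_reg
        self.current[row] = True
        self.data[row] = value


rf = PhysicalRegisterFile(8)
rf.map_row(5, 3, 111)   # row 5 becomes the current R3
rf.map_row(2, 3, 222)   # a later write lands in row 2; row 5 keeps the stale value
print(rf.lookup(3), rf.read(3))
```

Note that the old value is still physically sitting in row 5 after the remap; it just is no longer what a read of R3 resolves to.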

The register renamer’s job is to figure out, for the target registers of incoming instructions, what physical memory row to map (typically the row corresponding to the least recently used architectural register if the architectural register isn’t already mapped, otherwise potentially an existing entry for that architectural register, assuming everything that needs that register is retired or has its own copy already, etc.). The renamer doesn’t need to think about source registers since they should be already mapped. (Yes, there can be an exception for code that reads a register it never wrote, and that is handled, of course.)
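To make the destination-only bookkeeping concrete, here is a minimal Python sketch. It deliberately swaps in a plain free list for row selection instead of the least-recently-used choice described above, and the `Renamer` class, `rename`, and `retire` names are assumptions for illustration only. It also does not handle the read-before-write case; an unmapped source simply raises `KeyError`.

```python
from collections import deque


class Renamer:
    """Toy renamer: picks a physical row for each destination register."""

    def __init__(self, num_physical):
        self.free = deque(range(num_physical))  # rows not holding live state
        self.map_table = {}                     # architectural reg -> physical row

    def rename(self, dest, sources):
        """Rename one instruction.

        Sources only read the existing mapping; the destination gets a new row.
        Returns (new_row, source_rows, old_row), where old_row is the row that
        previously held `dest` (or None) and can be freed once nothing needs it.
        """
        source_rows = [self.map_table[s] for s in sources]  # already mapped
        new_row = self.free.popleft()
        old_row = self.map_table.get(dest)
        self.map_table[dest] = new_row
        return new_row, source_rows, old_row

    def retire(self, old_row):
        """Return a superseded row to the free list once it is safe to reuse."""
        if old_row is not None:
            self.free.append(old_row)


r = Renamer(4)
first = r.rename(3, [])     # first write to R3
second = r.rename(3, [3])   # R3 = f(R3): reads the old row, writes a new one
r.retire(second[2])         # the old R3 row becomes reusable
print(first, second, list(r.free))
```

The second rename is the interesting one: the instruction reads the row that was R3 and writes a different row that becomes R3, which is why several physical copies of one architectural register can exist at once.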

So the renamer builds up this mapping, which is stored in a content-addressable memory (CAM) that is built into the addressing of the register file, in a separate block alongside the RF.

That mapping may also be needed by the scheduler, depending on implementation details, so we don’t really think of it as the register file itself, just like we don’t think of the TLB as RAM. (As a random example, some people put retirement or status flags in the CAM, which the scheduler may need to see.)

I think maybe we just have a nomenclature misunderstanding. You want to call the CAM the register file, if I am understanding what you are saying. The way you describe the operation seems more or less consistent with my designs, but we use different terminology.
