Discussion:
[gem5-dev] Review Request: X86: Runtime read, write conditions for CCFlagBits register
Nilay Vaish
2012-04-21 20:20:54 UTC
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/1160/
-----------------------------------------------------------

Review request for Default.


Description
-------

Changeset 8962:1c063dca6858
---------------------------
X86: Runtime read, write conditions for CCFlagBits register
This patch introduces runtime read and write conditions for the CCFlagBits register.
The main idea is that each microop, while being used in the microcode,
should specify the condition code flags it would read and write. On the
basis of this info, it would be decided at runtime which condition code
registers should be read and written. This would help in reducing the extent
of RAW dependence while using the o3 cpu with x86. Currently, the flags that
would be read and written are specified together. The next couple of patches would
try to look into splitting the flags into read and write sets and splitting the
ccFlagBits register.


Diffs
-----

src/arch/x86/insts/micromediaop.hh 0bba1c59b4d1
src/arch/x86/insts/microop.hh 0bba1c59b4d1
src/arch/x86/insts/microregop.hh 0bba1c59b4d1
src/arch/x86/isa/microops/debug.isa 0bba1c59b4d1
src/arch/x86/isa/microops/mediaop.isa 0bba1c59b4d1
src/arch/x86/isa/microops/regop.isa 0bba1c59b4d1
src/arch/x86/isa/microops/seqop.isa 0bba1c59b4d1
src/arch/x86/isa/microops/specop.isa 0bba1c59b4d1
src/arch/x86/isa/operands.isa 0bba1c59b4d1

Diff: http://reviews.gem5.org/r/1160/diff/


Testing
-------


Thanks,

Nilay Vaish
Gabe Black
2012-04-22 03:13:34 UTC
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/1160/#review2571
-----------------------------------------------------------


I don't understand what this patch is doing. It looks like you're adding members to keep track of flag bits and that sort of thing. If we didn't keep track of that before, we probably don't need to keep track of it now. Could you please explain what you're implementing here?


src/arch/x86/insts/microop.hh
<http://reviews.gem5.org/r/1160/#comment2969>

What are these functions for?


- Gabe Black
Nilay Vaish
2012-04-22 04:43:47 UTC
We did keep track of the bits before. But we need to keep track of the bits that
need to be read and the bits that need to be written separately. To me it seems that unless
we do that, we would never be able to execute two instructions in parallel. Consider
the case when two add instructions are executed back to back. If both of them
write to the CC register, then as of now they will read the register first. This
means that the second add instruction is dependent on the first. To break this
dependence, we need to figure out which bits written by the first instruction are
read by the second. If there are none, then the two can go in parallel. This requires
that the instructions specify their read and write bits separately. Based on these bits,
a runtime check decides whether or not the CC register needs to be read / written.
This patch introduces that runtime check for the x86 ISA.
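To make the dependence argument concrete, here is a minimal sketch of the overlap test described above. It is not code from the patch: the one-bit-per-flag encoding, the FlagUse struct and the hasFlagRaw() helper are all hypothetical names chosen for illustration.

```cpp
#include <cstdint>

// Hypothetical encoding: one bit per x86 condition-code flag.
constexpr std::uint8_t CF = 0x01;
constexpr std::uint8_t PF = 0x02;
constexpr std::uint8_t AF = 0x04;
constexpr std::uint8_t ZF = 0x08;
constexpr std::uint8_t SF = 0x10;
constexpr std::uint8_t OF = 0x20;

// Flags produced by an ordinary add.
constexpr std::uint8_t kAddFlags = CF | PF | AF | ZF | SF | OF;

// Per-microop flag usage, kept as two separate sets.
struct FlagUse {
    std::uint8_t readSet;   // flags the microop consumes
    std::uint8_t writeSet;  // flags the microop produces
};

// True if `younger` consumes at least one flag that `older` produces,
// i.e. there is a genuine read-after-write dependence through the flags.
inline bool
hasFlagRaw(const FlagUse &older, const FlagUse &younger)
{
    return (older.writeSet & younger.readSet) != 0;
}
```

With these definitions, two plain adds (empty read set, full write set) have no overlap, while an add followed by an add-with-carry (which reads CF) does. Only the second pair really has to be serialized.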
Post by Gabe Black
src/arch/x86/insts/microop.hh, line 125
<http://reviews.gem5.org/r/1160/diff/1/?file=26105#file26105line125>
What are these functions for?
These functions are the runtime checks that decide whether or not
the CC register needs to be read/written.


- Nilay


Gabe Black
2012-04-22 11:20:08 UTC
I understand why we want to get rid of the unnecessary read; I was asking about what this patch is doing to help do that.

Neither "cc" nor "ext" is specific to reading or writing, and they shouldn't be given special magical meaning like that. I suspect it's not actually necessary to store them separately either.
I thought there were going to be a bunch of cc registers.


- Gabe


Nilay Vaish
2012-04-22 14:02:26 UTC
I have given this some thought, and I feel that we need the read and write
sets to be separate. Otherwise it does not seem possible to decide at runtime
whether or not a register needs to be read. As far as naming is concerned,
I can change that to something more descriptive.

In one of the previous patches, I had allowed for introducing functions that
decide (at runtime) whether or not a certain register can be read/written. This
patch uses that functionality for the x86 ISA. It modifies the isa description so
that each instruction that would need to read/write the CC register specifies
the read and the write flags. These two sets will be used for deciding whether
or not the CC register should be read.
Ultimately, there will be. But splitting up the CC register will not solve
the problem completely. Take the case of add instructions I outlined above.
Even after splitting the CC register, these instructions cannot execute in
parallel because they are still going to read those split registers, thus
creating a Read-After-Write dependence. Therefore we need some functionality
that prevents the registers from being read.


- Nilay


Gabe Black
2012-04-22 20:28:34 UTC
If it's a microop that reads, then they're interpreted as reads. If it's a microop that writes, they're interpreted as writes with the necessary reads thrown in automatically. Any required reads would be added in because we know ahead of time that they are needed, without having to store them in the instruction object explicitly.


- Gabe


Gabe Black
2012-04-22 20:39:15 UTC
I'm not following you. An add instruction will write to the condition code bits unless it's been told to partially write to the one bunch of condition code bits that are updated together. If that happens and the other add instruction is doing the same thing, the normal dependency tracking mechanisms will take care of it. This function is called needToRead, takes two sets of flags (which should be the same type for consistency btw, just noticed that) and returns a bool, hardcoded to true at the moment. needToRead what? How do the sets of flags help you figure that out? Why does it always return true?


- Gabe


Nilay Vaish
2012-04-23 02:20:09 UTC
The way we currently recognise that the CC register needs to be read is that
it appears on the RHS of some assignment statement. It seems to me that
this would remain true even after the register is split. The expression
will just change to something like
zaps = genZaps(zaps, <something>);

The isa parser would then mark zaps as both source and destination. And the
two add instructions would still not execute in parallel, as the second one
would be dependent on the first for the value of the zaps register.
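As a toy illustration of the rule described above (an operand that appears on the RHS is treated as a source), the sketch below classifies one operand in a single assignment statement. It is not the gem5 ISA parser; OperandUse, containsIdentifier() and classifyOperand() are hypothetical names, and the "lhs = rhs;" statement shape is an assumption.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

struct OperandUse {
    bool isSource = false;
    bool isDest = false;
};

// Hypothetical helper: true if `name` occurs in `text` as a whole identifier.
inline bool
containsIdentifier(const std::string &text, const std::string &name)
{
    if (name.empty())
        return false;
    auto isIdChar = [](char c) {
        return std::isalnum(static_cast<unsigned char>(c)) || c == '_';
    };
    for (std::size_t pos = text.find(name); pos != std::string::npos;
         pos = text.find(name, pos + 1)) {
        const std::size_t end = pos + name.size();
        const bool startOk = pos == 0 || !isIdChar(text[pos - 1]);
        const bool endOk = end == text.size() || !isIdChar(text[end]);
        if (startOk && endOk)
            return true;
    }
    return false;
}

// Classify `operand` in one statement of the assumed form "lhs = rhs;".
inline OperandUse
classifyOperand(const std::string &stmt, const std::string &operand)
{
    OperandUse use;
    const std::size_t eq = stmt.find('=');
    if (eq == std::string::npos)
        return use;
    use.isDest = containsIdentifier(stmt.substr(0, eq), operand);
    use.isSource = containsIdentifier(stmt.substr(eq + 1), operand);
    return use;
}
```

For the statement "zaps = genZaps(zaps, src1);" this marks zaps as both source and destination, which is exactly the situation that makes the second add wait for the first.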

We can either --
a. drop the default assumption that we need to make partial updates, and handle
partial update as a special case.
b. keep the default assumption that we need to make partial updates, and handle
full update as a special case.

Current patches are along the lines of the second option.

As far as needToRead() is concerned, the function is for deciding whether or not
the CC register should be read. It takes in the read and write sets of CC
bits that this microop is going to read and write respectively. The types are different
because the function check_condition() has a different encoding for CC bits compared
to, say, genFlags(). It currently returns true because it has not been decided yet what
the final condition would be, in part because of the difference in encodings.

The condition would be, I think, something like this --
if (read_set is not empty || write_set is not full) return true;
return false;
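Spelled out in C++, that condition might look like the sketch below. It assumes both sets use the same one-bit-per-flag encoding and that kAllCCBits (a hypothetical constant) covers every bit kept in the register. Neither assumption holds for the patch as posted, where the two sets are encoded differently.

```cpp
#include <cstdint>

// Assumed mask of every condition-code bit held in the register
// (six flags here; the real encoding may differ).
constexpr std::uint8_t kAllCCBits = 0x3F;

// The old register value is needed if the microop reads any flag, or if
// it leaves at least one flag unwritten (a partial update has to merge
// the new bits into the old value).
inline bool
needToRead(std::uint8_t readSet, std::uint8_t writeSet)
{
    const bool readsSomething = readSet != 0;
    const bool writesEverything = (writeSet & kAllCCBits) == kAllCCBits;
    return readsSomething || !writesEverything;
}
```

Taken literally, the condition also returns true for a microop whose write set is empty. Whether that case should be special-cased is not settled in this thread.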


- Nilay


Steve Reinhardt
2012-04-23 02:54:31 UTC
I'm jumping in partially informed, but can we just have two functions, like:

zaps = setZaps(<something>);

and

zaps = modifyZaps(zaps, <something>);

and then let the isa parser do its stuff naturally?
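A rough sketch of how the two variants could differ is given below. The bodies and the extra mask parameter are assumptions made for illustration, since the message only gives the call shapes.

```cpp
#include <cstdint>

// Hypothetical full update: the result depends only on the new flag
// values, so the old register value never has to be read.
inline std::uint8_t
setZaps(std::uint8_t newFlags)
{
    return newFlags;
}

// Hypothetical partial update: bits outside `mask` keep their old value,
// so the old register is a genuine source operand.
inline std::uint8_t
modifyZaps(std::uint8_t oldFlags, std::uint8_t newFlags, std::uint8_t mask)
{
    return static_cast<std::uint8_t>((oldFlags & ~mask) | (newFlags & mask));
}
```

Because setZaps() never mentions the old value, a parser that derives sources from the right-hand side would see zaps only as a destination in the first form, and as both source and destination in the second.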

Steve
Nilay Vaish
2012-04-23 03:15:51 UTC
I don't think that is possible. This code will appear in the .isa file. In
the .isa file, we cannot decide which version to use, as the CC bits to be
written vary with the context in which the microop is used. So we need a
runtime condition that evaluates whether the register
needs to be read.

--
Nilay
Steve Reinhardt
2012-04-23 03:42:51 UTC
Sorry for being way behind on this, but I'm curious just how many microops
there are that have different impacts on the flags depending on their
context, and how many different contexts there are. I got the impression
before that there would be this huge explosion of microops if we actually
had a different ADD micro-op (for example) for each set of bits that it
could possibly write. However, looking at Appendix E of Vol 3 of the AMD
ISA manual, it looks like the set of bits written by each macro-instruction
(at least) is pretty well defined. I can believe that it's also valuable
to have an ADD micro-op that doesn't affect flags for use in microcode
sequences. But is there a 3rd version of ADD we need that modifies some
but not all the flags that the ADD macroinstruction does?

Basically if (hypothetically exaggerating) 80% of the macro-ops can modify
five different combinations of flags depending on context, then this
complex mechanism makes sense. On the other hand, if there are a small
number of microops that need two versions (one that modifies a certain set
of flags and one that doesn't), and maybe an even smaller number that
legitimately decide which flags to look at at runtime, then this is
starting to feel like overkill. I'm sure the truth is somewhere in the
middle, but I just don't understand the code well enough to know which
extreme it's closer to... and if it's close to the former, I'd like to
understand why.

Steve
Gabe Black
2012-04-23 05:32:20 UTC
On Sun, Apr 22, 2012 at 7:20 PM, Nilay Vaish
The way we currently recognise the CC register needs to be read, is that
it appears on the RHS of some assignment statement. It seems to me that
this would remain true even after the register is split. The expression
will just change to something like
zaps = genZaps(zaps, <something>);
The isa parser would then mark zaps as both source and destination. And the
two add instructions would still not execute in parallel, as the second one
would be dependent on the first for the value of the zaps register.
We can either --
a. drop the default assumption that we need to make partial updates, and handle
partial update as a special case.
b. keep the default assumption that we need to make partial updates, and handle
full update as a special case.
Current patches are along the lines of the second option.
I'm jumping in partially informed, but can we just have two functions,
zaps = setZaps(<something>);
and
zaps = modifyZaps(zaps, <something>);
and then let the isa parser do its stuff naturally?
Steve
I don't think that is possible. This code will appear in the .isa
file. In the .isa file, we cannot decide which version to use, as
the CC bits to be written vary with the context in which the
microop is used. So, we need a run-time condition that evaluates
whether the register needs to be read.
Sorry for being way behind on this, but I'm curious just how many
microops there are that have different impacts on the flags depending
on their context, and how many different contexts there are. I got
the impression before that there would be this huge explosion of
microops if we actually had a different ADD micro-op (for example)
for each set of bits that it could possibly write. However, looking
at Appendix E of Vol 3 of the AMD ISA manual, it looks like the set of
bits written by each macro-instruction (at least) is pretty well
defined. I can believe that it's also valuable to have an ADD
micro-op that doesn't affect flags for use in microcode sequences.
But is there a 3rd version of ADD we need that modifies some but not
all the flags that the ADD macroinstruction does?
Basically if (hypothetically exaggerating) 80% of the macro-ops can
modify five different combinations of flags depending on context, then
this complex mechanism makes sense. On the other hand, if there are a
small number of microops that need two versions (one that modifies a
certain set of flags and one that doesn't), and maybe an even smaller
number that legitimately decide which flags to look at at runtime,
then this is starting to feel like overkill. I'm sure the truth is
somewhere in the middle, but I just don't understand the code well
enough to know which extreme it's closer to... and if it's close to
the former, I'd like to understand why.
Steve
Being well defined and being consistent are not the same things. I
looked through the instruction reference one instruction at a time a
year or two ago tabulating what flags they set, and it was not apparent
that there were some small number of combinations. You are welcome to
repeat the process, but I'll pass.

Also, we're not setting flags at the macroop level, we're setting them
at the microop level. Besides the fact that I don't think such a small
set exists, I don't want to have to live with that small set forever,
or redo all the existing macroops so they use the right version of the
microops.

Gabe
Steve Reinhardt
2012-04-23 14:50:26 UTC
Post by Steve Reinhardt
I'm jumping in partially informed, but can we just have two functions,
Post by Nilay Vaish
Post by Steve Reinhardt
zaps = setZaps(<something>);
and
zaps = modifyZaps(zaps, <something>);
and then let the isa parser do its stuff naturally?
Steve
I don't think that is possible. This code will appear in the .isa file.
In the .isa file, we cannot decide which version to use, as the CC bits to
be written vary with the context in which the microop is used. So, we need
a run-time condition that evaluates whether the register needs to be read.
Sorry for being way behind on this, but I'm curious just how many
microops there are that have different impacts on the flags depending on
their context, and how many different contexts there are. I got the
impression before that there would be this huge explosion of microops if we
actually had a different ADD micro-op (for example) for each set of bits
that it could possibly write. However, looking at Appendix E of Vol 3 of
the AMD ISA manual, it looks like the set of bits written by each
macro-instruction (at least) is pretty well defined. I can believe that
it's also valuable to have an ADD micro-op that doesn't affect flags for
use in microcode sequences. But is there a 3rd version of ADD we need that
modifies some but not all the flags that the ADD macroinstruction does?
Basically if (hypothetically exaggerating) 80% of the macro-ops can
modify five different combinations of flags depending on context, then this
complex mechanism makes sense. On the other hand, if there are a small
number of microops that need two versions (one that modifies a certain set
of flags and one that doesn't), and maybe an even smaller number that
legitimately decide which flags to look at at runtime, then this is
starting to feel like overkill. I'm sure the truth is somewhere in the
middle, but I just don't understand the code well enough to know which
extreme it's closer to... and if it's close to the former, I'd like to
understand why.
Steve
Being well defined and being consistent are not the same things. I looked
through the instruction reference one instruction at a time a year or two
ago tabulating what flags they set, and it was not apparent that there were
some small number of combinations. You are welcome to repeat the process,
but I'll pass.
Are you saying that if I looked closely, I would get a different result
than the table that's already provided in Appendix E of Vol. 3?
Post by Gabe Black
Also, we're not setting flags at the macroop level, we're setting them at
the microop level.
Yeah, I understand that, and already mentioned that above. To quote: "I can
believe that it's also valuable to have an ADD micro-op that doesn't affect
flags for use in microcode sequences. But is there a 3rd version of ADD we
need that modifies some but not all the flags that the ADD macroinstruction
does?".
Post by Gabe Black
Besides the fact that I don't think such a small set exists, I don't want
to have to live with that small set for forever, or redo all the existing
macroops so they use the right version of the microops.
I'm not arguing for or against anything right now, I'm just trying to
understand the situation a little better. So far all the design
discussions have implied that there are just crazy scads of microops that
could decide to read or write any arbitrary subset of flags at any time,
and that seems a little suspicious to me. I can believe that there is
enough diversity to make all this mechanism worthwhile, I'd just like to
get more specific examples and a better handle on the scope of the problem.

Steve
Gabe Black
2012-04-23 18:54:33 UTC
Post by Gabe Black
Post by Steve Reinhardt
I'm jumping in partially informed, but can we just have two functions,
zaps = setZaps(<something>);
and
zaps = modifyZaps(zaps, <something>);
and then let the isa parser do its stuff naturally?
Steve
I don't think that is possible. This code will appear in the
.isa file. In the .isa file, we cannot decide which version
to use, as the CC bits to be written vary with the context in
which the microop is used. So, we need a run-time condition
that evaluates whether the register needs to be read.
Sorry for being way behind on this, but I'm curious just how many
microops there are that have different impacts on the flags
depending on their context, and how many different contexts there
are. I got the impression before that there would be this huge
explosion of microops if we actually had a different ADD
micro-op (for example) for each set of bits that it could
possibly write. However, looking at Appendix E of Vol 3 of the
AMD ISA manual, it looks like the set of bits written by each
macro-instruction (at least) is pretty well defined. I can
believe that it's also valuable to have an ADD micro-op that
doesn't affect flags for use in microcode sequences. But is
there a 3rd version of ADD we need that modifies some but not all
the flags that the ADD macroinstruction does?
Basically if (hypothetically exaggerating) 80% of the macro-ops
can modify five different combinations of flags depending on
context, then this complex mechanism makes sense. On the other
hand, if there are a small number of microops that need two
versions (one that modifies a certain set of flags and one that
doesn't), and maybe an even smaller number that legitimately
decide which flags to look at at runtime, then this is starting
to feel like overkill. I'm sure the truth is somewhere in the
middle, but I just don't understand the code well enough to know
which extreme it's closer to... and if it's close to the former,
I'd like to understand why.
Steve
Being well defined and being consistent are not the same things. I
looked through the instruction reference one instruction at a time
a year or two ago tabulating what flags they set, and it was not
apparent that there were some small number of combinations. You
are welcome to repeat the process, but I'll pass.
Are you saying that if I looked closely, I would get a different
result than the table that's already provided in Appendix E of Vol. 3?
Well that would have made life easier back then. That's basically the
information I was gathering. There are trends, but I don't think it's
consistent. That table also looks pretty short. I'm not sure it has all
the instructions in it, although I don't have time to look through it
very carefully right now.
Post by Gabe Black
Also, we're not setting flags at the macroop level, we're setting
them at the microop level.
"I can believe that it's also valuable to have an ADD micro-op that
doesn't affect flags for use in microcode sequences. But is there a
3rd version of ADD we need that modifies some but not all the flags
that the ADD macroinstruction does?".
I'd be pretty surprised if the answer wasn't yes, although I don't have
time right now to dig through the microcode to find an example. You
should be able to grep and find all the adds fairly easily. You might
want to just peruse the microcode anyway since there are probably other
microops which end up being used differently than add but which would
have to follow the same scheme.
Post by Gabe Black
Besides the fact that I don't think such a small set exists, I
don't want to have to live with that small set for forever, or
redo all the existing macroops so they use the right version of
the microops.
I'm not arguing for or against anything right now, I'm just trying to
understand the situation a little better. So far all the design
discussions have implied that there are just crazy scads of microops
that could decide to read or write any arbitrary subset of flags at
any time, and that seems a little suspicious to me. I can believe
that there is enough diversity to make all this mechanism worthwhile,
I'd just like to get more specific examples and a better handle on the
scope of the problem.
Right. Also keep in mind that we haven't implemented everything with x86
yet, so what we've used now isn't necessarily representative of
everything we'll need in the future.

Gabe
Gabe Black
2012-04-23 18:58:31 UTC
Post by Gabe Black
Post by Gabe Black
Also, we're not setting flags at the macroop level, we're setting
them at the microop level.
"I can believe that it's also valuable to have an ADD micro-op that
doesn't affect flags for use in microcode sequences. But is there a
3rd version of ADD we need that modifies some but not all the flags
that the ADD macroinstruction does?".
I'd be pretty surprised if the answer wasn't yes, although I don't have
time right now to dig through the microcode to find an example. You
should be able to grep and find all the adds fairly easily. You might
want to just peruse the microcode anyway since there are probably other
microops which end up being used differently than add but which would
have to follow the same scheme.
Oh, there would also need to be at least one more version: one that does
write the flags, but writes the invisible microcode-only versions for
internal use in the microcode.

Gabe
Nilay Vaish
2012-04-23 21:45:07 UTC
It seems like this is getting back to the original solution that I
proposed. The idea in that solution was that we would have three different
variants of the same microop -

(a) one that does not write CC at all ==> no need to read CC

(b) one that partially updates CC ==> need to read CC to perform the
merge

(c) one that completely updates CC ==> no need to read CC if no bits are
being read explicitly.

We already have different microop classes for cases (a) and (b), and (b)
currently also covers case (c). In the .isa file, we can detect that the
microop in the current context writes all the CC bits and reads none, and
therefore should not read the CC register. Such microops would then get
mapped to case (c). There may be several issues involved here -

1. Is it possible to generate different microop classes for (b) and (c)?
It seems we would have to template flag_code, so that for (c), we can
replace the first read of the CC register with 0.

2. Suppose isa_parser gets as input the following statements --
CC = 0;
CC = CC | Carry;
Will the isa_parser recognize that there is no need to read the CC register?

3. What happens when we split the CC register? It seems there will be five
parts of the register ([ZAPS], [O], [C], [rest], [ECF,EZF]). If we have
three cases for each of these five registers, that means we will have
3^5 = 243 different combinations in all. Can we somehow figure out which
microop types actually need to be generated?
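
To make the count in point 3 concrete, here is a small sketch in Python that enumerates the variants and shows which parts each one would have to read. The part names come from the list above; the case labels and helper names are made up for illustration.

```python
from itertools import product

# The five proposed parts of the split CC register.
PARTS = ("ZAPS", "O", "C", "rest", "ECF_EZF")
# Cases (a), (b) and (c) from above, under hypothetical labels.
CASES = ("none", "partial", "full")


def must_read(case: str, explicitly_read: bool = False) -> bool:
    """A part is a source if it is read explicitly or only partially updated."""
    return explicitly_read or case == "partial"


def source_parts(variant):
    """Return the parts a given variant has to read before executing."""
    return [part for part, case in zip(PARTS, variant) if must_read(case)]


# One variant per way of assigning a case to each part.
all_variants = list(product(CASES, repeat=len(PARTS)))
print(len(all_variants))  # 3 ** 5 variants

example = ("full", "none", "partial", "none", "none")
print(source_parts(example))  # only the partially updated part is a source
```

Only the "partial" parts become sources here, so the number of variants that matter for dependence tracking may be much smaller than the full enumeration if few microops really need partial updates.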

--
Nilay
Gabe Black
2012-04-24 08:20:31 UTC
The point of my email is that we're *not* ending up back there, and that
we *do* still need to do things differently. I'm confident the number of
microops we'll end up with will be prohibitive, and going the other way,
not letting it get that way, will make writing new microcode restrictive.
We get pinched between those two options and don't have a good place to
be in the middle. I may be wrong, but that's definitely what I expect,
and I did write almost all of the microcode (granted, a while ago), so
I'm decently familiar with it. Also, [ECF] and [EZF] should be separate.
I'm 90% sure I remember places where those are set by different microops
and need to exist independently.

Gabe

Gabe Black
2012-04-23 05:46:13 UTC
Post by Nilay Vaish
Post by Steve Reinhardt
Post by Nilay Vaish
The way we currently recognise the CC register needs to be read, is that
it appears on the RHS of some assignment statement. It seems to me that
this would remain true even after the register is split. The expression
will just change to something like
zaps = genZaps(zaps, <something>);
The isa parser would then mark zaps as both source and destination. And the
two add instructions would still not execute in parallel, as the second one
would be dependent on the first for the value of the zaps register.
We can either --
a. drop the default assumption that we need to make partial updates, and handle
partial update as a special case.
b. keep the default assumption that we need to make partial updates, and handle
full update as a special case.
Current patches are along the lines of the second option.
I'm jumping in partially informed, but can we just have two functions,
zaps = setZaps(<something>);
and
zaps = modifyZaps(zaps, <something>);
and then let the isa parser do its stuff naturally?
Steve
I don't think that is possible. This code will appear in the .isa
file. In the .isa file, we cannot decide which version to use, as the
CC bits to be written vary with the context in which the microop is
used. So, we need a run-time condition that evaluates whether the
register needs to be read.
--
Nilay
One of the main challenges of this effort is going to be figuring out a
better way for this mechanism to work. I don't have a concrete plan, but
we should figure out how to make the parser smarter, or give it better
tools, rather than try to hack around its limitations. I was thinking
there would somehow still be a single condition codes register, and it
(or the Python operand class representing it, or something similar) would
figure out which flags to read and stuff into it, and which flags to pick
out and write back. When I said before that I took some steps towards
composite operands, this is one of the things I had in mind.
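
A rough sketch of how such a composite operand might behave, in Python. This is not the isa_parser's operand API: the class, its methods, and the three-way split are all hypothetical, and the masks simply follow the x86 RFLAGS bit positions.

```python
class CompositeCCOperand:
    """Hypothetical operand that presents several flag registers as one value."""

    # (register name, mask of the flag bits it holds); illustrative split only.
    PARTS = (("zaps", 0x0D4), ("cf", 0x001), ("of", 0x800))

    def __init__(self, read_mask, write_mask):
        self.read_mask = read_mask
        self.write_mask = write_mask

    def _partially_written(self, mask):
        written = mask & self.write_mask
        return written != 0 and written != mask

    def sources(self):
        """Registers that must be read: explicitly read or partially written."""
        return [name for name, mask in self.PARTS
                if (mask & self.read_mask) or self._partially_written(mask)]

    def dests(self):
        """Registers that receive at least one written flag."""
        return [name for name, mask in self.PARTS if mask & self.write_mask]

    def compose(self, regs):
        """Build the single CC value the microop code sees from its sources."""
        wanted = set(self.sources())
        value = 0
        for name, mask in self.PARTS:
            if name in wanted:
                value |= regs[name] & mask
        return value

    def decompose(self, value, regs):
        """Write back only the flags the microop declared it writes."""
        for name, mask in self.PARTS:
            written = mask & self.write_mask
            if written:
                regs[name] = (regs[name] & ~written) | (value & written)


# An adc-like microop: reads CF, fully overwrites ZAPS and OF.
op = CompositeCCOperand(read_mask=0x001, write_mask=0x0D4 | 0x800)
regs = {"zaps": 0x080, "cf": 0x001, "of": 0x000}
print(op.sources(), op.dests())
op.decompose(0x840, regs)
print(regs)
```

The microop code would still be written against one condition-code value, while the dependence tracking sees only the underlying registers returned by `sources()` and `dests()`.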

Gabe