Discussion:
[gem5-dev] Review Request 3192: ruby: split CPU and GPU latency stats
(too old to reply)
Tony Gutierrez
2015-10-30 21:51:33 UTC
Permalink
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/3192/
-----------------------------------------------------------

Review request for Default.


Repository: gem5


Description
-------

Changeset 11198:28396a1046f4
---------------------------
ruby: split CPU and GPU latency stats


Diffs
-----

src/mem/ruby/profiler/Profiler.hh 4daf60db14d794e2344a6c86a93bdd8273bc5bb6
src/mem/ruby/profiler/Profiler.cc 4daf60db14d794e2344a6c86a93bdd8273bc5bb6

Diff: http://reviews.gem5.org/r/3192/diff/


Testing
-------


Thanks,

Tony Gutierrez
Jason Lowe-Power
2015-11-02 17:59:09 UTC
Permalink
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/3192/#review7434
-----------------------------------------------------------

Ship it!


Ship It!

- Jason Lowe-Power
Post by Tony Gutierrez
-----------------------------------------------------------
http://reviews.gem5.org/r/3192/
-----------------------------------------------------------
(Updated Oct. 30, 2015, 9:51 p.m.)
Review request for Default.
Repository: gem5
Description
-------
Changeset 11198:28396a1046f4
---------------------------
ruby: split CPU and GPU latency stats
Diffs
-----
src/mem/ruby/profiler/Profiler.hh 4daf60db14d794e2344a6c86a93bdd8273bc5bb6
src/mem/ruby/profiler/Profiler.cc 4daf60db14d794e2344a6c86a93bdd8273bc5bb6
Diff: http://reviews.gem5.org/r/3192/diff/
Testing
-------
Thanks,
Tony Gutierrez
Joel Hestness
2015-11-03 22:07:43 UTC
Permalink
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/3192/#review7463
-----------------------------------------------------------



src/mem/ruby/profiler/Profiler.hh (line 121)
<http://reviews.gem5.org/r/3192/#comment6315>

I don't see where this guy is used. Can you please verify that this is used in another patch, so we don't add dead code?


- Joel Hestness
Post by Tony Gutierrez
-----------------------------------------------------------
http://reviews.gem5.org/r/3192/
-----------------------------------------------------------
(Updated Oct. 30, 2015, 9:51 p.m.)
Review request for Default.
Repository: gem5
Description
-------
Changeset 11198:28396a1046f4
---------------------------
ruby: split CPU and GPU latency stats
Diffs
-----
src/mem/ruby/profiler/Profiler.hh 4daf60db14d794e2344a6c86a93bdd8273bc5bb6
src/mem/ruby/profiler/Profiler.cc 4daf60db14d794e2344a6c86a93bdd8273bc5bb6
Diff: http://reviews.gem5.org/r/3192/diff/
Testing
-------
Thanks,
Tony Gutierrez
John Alsop
2015-11-09 22:51:32 UTC
Permalink
Post by Tony Gutierrez
src/mem/ruby/profiler/Profiler.hh, line 121
<http://reviews.gem5.org/r/3192/diff/1/?file=51173#file51173line121>
I don't see where this guy is used. Can you please verify that this is used in another patch, so we don't add dead code?
The histograms added will be populated when using GPUCoalescer objects added in patch 3189. Is that what you're asking?


- John


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/3192/#review7463
-----------------------------------------------------------
Post by Tony Gutierrez
-----------------------------------------------------------
http://reviews.gem5.org/r/3192/
-----------------------------------------------------------
(Updated Nov. 9, 2015, 1:16 p.m.)
Review request for Default.
Repository: gem5
Description
-------
Changeset 11199:5d5bd1b0d332
---------------------------
ruby: split CPU and GPU latency stats
Diffs
-----
src/mem/ruby/profiler/Profiler.hh 4daf60db14d794e2344a6c86a93bdd8273bc5bb6
src/mem/ruby/profiler/Profiler.cc 4daf60db14d794e2344a6c86a93bdd8273bc5bb6
src/mem/ruby/slicc_interface/AbstractController.hh 4daf60db14d794e2344a6c86a93bdd8273bc5bb6
src/mem/ruby/slicc_interface/AbstractController.cc 4daf60db14d794e2344a6c86a93bdd8273bc5bb6
src/mem/slicc/symbols/StateMachine.py 4daf60db14d794e2344a6c86a93bdd8273bc5bb6
Diff: http://reviews.gem5.org/r/3192/diff/
Testing
-------
Thanks,
Tony Gutierrez
Joel Hestness
2015-11-10 22:39:18 UTC
Permalink
Post by John Alsop
src/mem/ruby/profiler/Profiler.hh, line 121
<http://reviews.gem5.org/r/3192/diff/1/?file=51173#file51173line121>
I don't see where this guy is used. Can you please verify that this is used in another patch, so we don't add dead code?
The histograms added will be populated when using GPUCoalescer objects added in patch 3189. Is that what you're asking?
Apologies for the confusion. In the first diff, it appeared that either the old histogram or one of the new ones was unnecessary. That is clarified in the new diff.


- Joel


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/3192/#review7463
-----------------------------------------------------------
Post by John Alsop
-----------------------------------------------------------
http://reviews.gem5.org/r/3192/
-----------------------------------------------------------
(Updated Nov. 9, 2015, 9:16 p.m.)
Review request for Default.
Repository: gem5
Description
-------
Changeset 11199:5d5bd1b0d332
---------------------------
ruby: split CPU and GPU latency stats
Diffs
-----
src/mem/ruby/profiler/Profiler.hh 4daf60db14d794e2344a6c86a93bdd8273bc5bb6
src/mem/ruby/profiler/Profiler.cc 4daf60db14d794e2344a6c86a93bdd8273bc5bb6
src/mem/ruby/slicc_interface/AbstractController.hh 4daf60db14d794e2344a6c86a93bdd8273bc5bb6
src/mem/ruby/slicc_interface/AbstractController.cc 4daf60db14d794e2344a6c86a93bdd8273bc5bb6
src/mem/slicc/symbols/StateMachine.py 4daf60db14d794e2344a6c86a93bdd8273bc5bb6
Diff: http://reviews.gem5.org/r/3192/diff/
Testing
-------
Thanks,
Tony Gutierrez
John Alsop
2015-11-12 19:55:12 UTC
Permalink
Post by John Alsop
src/mem/ruby/profiler/Profiler.hh, line 121
<http://reviews.gem5.org/r/3192/diff/1/?file=51173#file51173line121>
I don't see where this guy is used. Can you please verify that this is used in another patch, so we don't add dead code?
The histograms added will be populated when using GPUCoalescer objects added in patch 3189. Is that what you're asking?
Apologies for the confusion. In the first diff, it appeared that either the old histogram or one of the new ones was unnecessary. That is clarified in the new diff.
No worries- yeah, the new changes were needed to make the stats work with GPUCoalescer. Did you mean to give this a ship-it or are you still reviewing the newer diff?


- John


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/3192/#review7463
-----------------------------------------------------------
Post by John Alsop
-----------------------------------------------------------
http://reviews.gem5.org/r/3192/
-----------------------------------------------------------
(Updated Nov. 9, 2015, 1:16 p.m.)
Review request for Default.
Repository: gem5
Description
-------
Changeset 11199:5d5bd1b0d332
---------------------------
ruby: split CPU and GPU latency stats
Diffs
-----
src/mem/ruby/profiler/Profiler.hh 4daf60db14d794e2344a6c86a93bdd8273bc5bb6
src/mem/ruby/profiler/Profiler.cc 4daf60db14d794e2344a6c86a93bdd8273bc5bb6
src/mem/ruby/slicc_interface/AbstractController.hh 4daf60db14d794e2344a6c86a93bdd8273bc5bb6
src/mem/ruby/slicc_interface/AbstractController.cc 4daf60db14d794e2344a6c86a93bdd8273bc5bb6
src/mem/slicc/symbols/StateMachine.py 4daf60db14d794e2344a6c86a93bdd8273bc5bb6
Diff: http://reviews.gem5.org/r/3192/diff/
Testing
-------
Thanks,
Tony Gutierrez
Tony Gutierrez
2015-11-09 21:16:59 UTC
Permalink
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/3192/
-----------------------------------------------------------

(Updated Nov. 9, 2015, 1:16 p.m.)


Review request for Default.


Repository: gem5


Description (updated)
-------

Changeset 11199:5d5bd1b0d332
---------------------------
ruby: split CPU and GPU latency stats


Diffs (updated)
-----

src/mem/ruby/profiler/Profiler.hh 4daf60db14d794e2344a6c86a93bdd8273bc5bb6
src/mem/ruby/profiler/Profiler.cc 4daf60db14d794e2344a6c86a93bdd8273bc5bb6
src/mem/ruby/slicc_interface/AbstractController.hh 4daf60db14d794e2344a6c86a93bdd8273bc5bb6
src/mem/ruby/slicc_interface/AbstractController.cc 4daf60db14d794e2344a6c86a93bdd8273bc5bb6
src/mem/slicc/symbols/StateMachine.py 4daf60db14d794e2344a6c86a93bdd8273bc5bb6

Diff: http://reviews.gem5.org/r/3192/diff/


Testing
-------


Thanks,

Tony Gutierrez
Joel Hestness
2015-11-12 20:08:37 UTC
Permalink
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/3192/#review7540
-----------------------------------------------------------


I can give a "Ship it" here if we decide a reasonable way to tackle the issue below.


src/mem/slicc/symbols/StateMachine.py (line 313)
<http://reviews.gem5.org/r/3192/#comment6472>

Not sure if this should be fixed before commit, but I feel it should be fixed at some point:

It seems like we should be using better inheritance here. For the interface with a cache state machine, is there a difference between a Sequencer and a Coalescer? If not, these getCPUSequencer and getGPUCoalescer functions should be merged to return a type from which both Sequencer and GPUCoalescer descend (e.g. in review request 3189, GPUCoalescer copies much of the Sequencer functionality, so it seems like GPUCoalescer should descend from Sequencer and just overload functionality). Elsewhere in this patch where you need to decide whether to accumulate sequencer or coalescer stats, we could use an accessor function in the common ancestor type that indicates whether the attached RubyPort/Sequencer is a "sequencer" or "coalescer". The interfaces for these types appear nearly identical, and this would keep the modifications to SLICC to a minimum (e.g. these changes would not be necessary).


- Joel Hestness
Post by Tony Gutierrez
-----------------------------------------------------------
http://reviews.gem5.org/r/3192/
-----------------------------------------------------------
(Updated Nov. 9, 2015, 9:16 p.m.)
Review request for Default.
Repository: gem5
Description
-------
Changeset 11199:5d5bd1b0d332
---------------------------
ruby: split CPU and GPU latency stats
Diffs
-----
src/mem/ruby/profiler/Profiler.hh 4daf60db14d794e2344a6c86a93bdd8273bc5bb6
src/mem/ruby/profiler/Profiler.cc 4daf60db14d794e2344a6c86a93bdd8273bc5bb6
src/mem/ruby/slicc_interface/AbstractController.hh 4daf60db14d794e2344a6c86a93bdd8273bc5bb6
src/mem/ruby/slicc_interface/AbstractController.cc 4daf60db14d794e2344a6c86a93bdd8273bc5bb6
src/mem/slicc/symbols/StateMachine.py 4daf60db14d794e2344a6c86a93bdd8273bc5bb6
Diff: http://reviews.gem5.org/r/3192/diff/
Testing
-------
Thanks,
Tony Gutierrez
Brad Beckmann
2015-11-18 19:43:42 UTC
Permalink
Post by Tony Gutierrez
src/mem/slicc/symbols/StateMachine.py, line 313
<http://reviews.gem5.org/r/3192/diff/2/?file=51487#file51487line313>
It seems like we should be using better inheritance here. For the interface with a cache state machine, is there a difference between a Sequencer and a Coalescer? If not, these getCPUSequencer and getGPUCoalescer functions should be merged to return a type from which both Sequencer and GPUCoalescer descend (e.g. in review request 3189, GPUCoalescer copies much of the Sequencer functionality, so it seems like GPUCoalescer should descend from Sequencer and just overload functionality). Elsewhere in this patch where you need to decide whether to accumulate sequencer or coalescer stats, we could use an accessor function in the common ancestor type that indicates whether the attached RubyPort/Sequencer is a "sequencer" or "coalescer". The interfaces for these types appear nearly identical, and this would keep the modifications to SLICC to a minimum (e.g. these changes would not be necessary).
There is a difference between the GPUCoalescer and Sequencer APIs. Though they have similar named functions, they require different arguments and behave very differently. Also note that it GPUCoalescer and/or Sequencer pointers are passed into the machines, not RubyPort pointers. I think it would be very hard to change that.

The GPUCoalescer behaves very differently than the Sequencer. If you actually diff the cc files, you will notice that well over 90% of the lines are different and that no two functions are the same. There is some Alpha LLSC cruft in the GPUCoalescer that doesn't make much sense to be there. Perhaps that is why you think there is a lot of copying between the two files? Would you feel better about this patch if we removed the LLSC cruft from the GPUCoalescer in 3189?


- Brad


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/3192/#review7540
-----------------------------------------------------------
Post by Tony Gutierrez
-----------------------------------------------------------
http://reviews.gem5.org/r/3192/
-----------------------------------------------------------
(Updated Nov. 9, 2015, 9:16 p.m.)
Review request for Default.
Repository: gem5
Description
-------
Changeset 11199:5d5bd1b0d332
---------------------------
ruby: split CPU and GPU latency stats
Diffs
-----
src/mem/ruby/profiler/Profiler.hh 4daf60db14d794e2344a6c86a93bdd8273bc5bb6
src/mem/ruby/profiler/Profiler.cc 4daf60db14d794e2344a6c86a93bdd8273bc5bb6
src/mem/ruby/slicc_interface/AbstractController.hh 4daf60db14d794e2344a6c86a93bdd8273bc5bb6
src/mem/ruby/slicc_interface/AbstractController.cc 4daf60db14d794e2344a6c86a93bdd8273bc5bb6
src/mem/slicc/symbols/StateMachine.py 4daf60db14d794e2344a6c86a93bdd8273bc5bb6
Diff: http://reviews.gem5.org/r/3192/diff/
Testing
-------
Thanks,
Tony Gutierrez
Joel Hestness
2015-11-20 20:10:27 UTC
Permalink
Post by Brad Beckmann
src/mem/slicc/symbols/StateMachine.py, line 313
<http://reviews.gem5.org/r/3192/diff/2/?file=51487#file51487line313>
It seems like we should be using better inheritance here. For the interface with a cache state machine, is there a difference between a Sequencer and a Coalescer? If not, these getCPUSequencer and getGPUCoalescer functions should be merged to return a type from which both Sequencer and GPUCoalescer descend (e.g. in review request 3189, GPUCoalescer copies much of the Sequencer functionality, so it seems like GPUCoalescer should descend from Sequencer and just overload functionality). Elsewhere in this patch where you need to decide whether to accumulate sequencer or coalescer stats, we could use an accessor function in the common ancestor type that indicates whether the attached RubyPort/Sequencer is a "sequencer" or "coalescer". The interfaces for these types appear nearly identical, and this would keep the modifications to SLICC to a minimum (e.g. these changes would not be necessary).
There is a difference between the GPUCoalescer and Sequencer APIs. Though they have similar named functions, they require different arguments and behave very differently. Also note that it GPUCoalescer and/or Sequencer pointers are passed into the machines, not RubyPort pointers. I think it would be very hard to change that.
The GPUCoalescer behaves very differently than the Sequencer. If you actually diff the cc files, you will notice that well over 90% of the lines are different and that no two functions are the same. There is some Alpha LLSC cruft in the GPUCoalescer that doesn't make much sense to be there. Perhaps that is why you think there is a lot of copying between the two files? Would you feel better about this patch if we removed the LLSC cruft from the GPUCoalescer in 3189?
Apologies for the push here, but I don't feel you're representing these differences accurately. It looks trivial to align and unify these C++ classes. Let's break it down:

1) Their current common APIs: Both Sequencers and GPUCoalescers are RubyPorts, so they implement that common API. Nothing to do here.

2) Beyond that, Sequencers implement read and write callbacks from SLICC machines, and GPUCoalescers implement atomic callbacks. Sure, the parameters for GPUCoaleser callbacks are different from Sequencer callbacks, but it's because no effort has been made to unify them; Specifically, all of the read AND write callbacks in GPUCoalescer just call the callback with the most generic parameter list. The same effect can be achieved using default parameters, unifying these callbacks with the Sequencer callbacks. A Sequencer may implement the atomicCallback in the future (e.g. we could use it with the current organization of gem5-gpu), so it can be added to the common API.

3) Finally, for now, the recordCP*CallBacks can be left as members of GPUCoaleser, though they are pretty hacky: They record stats in coalescers when data is returned through sequencers. However, if there was better Sequencer/GPUCoalescer abstraction, you would only need to connect a sequencer OR a coalescer to those state machines (rather than both), AND you could eliminate the strange use_seq_not_coal variable.

Overall, I feel strongly that Sequencers and GPUCoalescers need to be interchangeable at the interface to cache controllers, because it is likely that users will want flexibility to use either (your GPU protocols already suggest that you're hacking around unifying their interfaces - e.g. MachineType:TCP). It would be nice if some more thought could go into how and when to make this happen.


- Joel


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/3192/#review7540
-----------------------------------------------------------
Post by Brad Beckmann
-----------------------------------------------------------
http://reviews.gem5.org/r/3192/
-----------------------------------------------------------
(Updated Nov. 9, 2015, 9:16 p.m.)
Review request for Default.
Repository: gem5
Description
-------
Changeset 11199:5d5bd1b0d332
---------------------------
ruby: split CPU and GPU latency stats
Diffs
-----
src/mem/ruby/profiler/Profiler.hh 4daf60db14d794e2344a6c86a93bdd8273bc5bb6
src/mem/ruby/profiler/Profiler.cc 4daf60db14d794e2344a6c86a93bdd8273bc5bb6
src/mem/ruby/slicc_interface/AbstractController.hh 4daf60db14d794e2344a6c86a93bdd8273bc5bb6
src/mem/ruby/slicc_interface/AbstractController.cc 4daf60db14d794e2344a6c86a93bdd8273bc5bb6
src/mem/slicc/symbols/StateMachine.py 4daf60db14d794e2344a6c86a93bdd8273bc5bb6
Diff: http://reviews.gem5.org/r/3192/diff/
Testing
-------
Thanks,
Tony Gutierrez
Brad Beckmann
2015-12-04 00:54:06 UTC
Permalink
Post by Brad Beckmann
src/mem/slicc/symbols/StateMachine.py, line 313
<http://reviews.gem5.org/r/3192/diff/2/?file=51487#file51487line313>
It seems like we should be using better inheritance here. For the interface with a cache state machine, is there a difference between a Sequencer and a Coalescer? If not, these getCPUSequencer and getGPUCoalescer functions should be merged to return a type from which both Sequencer and GPUCoalescer descend (e.g. in review request 3189, GPUCoalescer copies much of the Sequencer functionality, so it seems like GPUCoalescer should descend from Sequencer and just overload functionality). Elsewhere in this patch where you need to decide whether to accumulate sequencer or coalescer stats, we could use an accessor function in the common ancestor type that indicates whether the attached RubyPort/Sequencer is a "sequencer" or "coalescer". The interfaces for these types appear nearly identical, and this would keep the modifications to SLICC to a minimum (e.g. these changes would not be necessary).
There is a difference between the GPUCoalescer and Sequencer APIs. Though they have similar named functions, they require different arguments and behave very differently. Also note that it GPUCoalescer and/or Sequencer pointers are passed into the machines, not RubyPort pointers. I think it would be very hard to change that.
The GPUCoalescer behaves very differently than the Sequencer. If you actually diff the cc files, you will notice that well over 90% of the lines are different and that no two functions are the same. There is some Alpha LLSC cruft in the GPUCoalescer that doesn't make much sense to be there. Perhaps that is why you think there is a lot of copying between the two files? Would you feel better about this patch if we removed the LLSC cruft from the GPUCoalescer in 3189?
1) Their current common APIs: Both Sequencers and GPUCoalescers are RubyPorts, so they implement that common API. Nothing to do here.
2) Beyond that, Sequencers implement read and write callbacks from SLICC machines, and GPUCoalescers implement atomic callbacks. Sure, the parameters for GPUCoaleser callbacks are different from Sequencer callbacks, but it's because no effort has been made to unify them; Specifically, all of the read AND write callbacks in GPUCoalescer just call the callback with the most generic parameter list. The same effect can be achieved using default parameters, unifying these callbacks with the Sequencer callbacks. A Sequencer may implement the atomicCallback in the future (e.g. we could use it with the current organization of gem5-gpu), so it can be added to the common API.
3) Finally, for now, the recordCP*CallBacks can be left as members of GPUCoaleser, though they are pretty hacky: They record stats in coalescers when data is returned through sequencers. However, if there was better Sequencer/GPUCoalescer abstraction, you would only need to connect a sequencer OR a coalescer to those state machines (rather than both), AND you could eliminate the strange use_seq_not_coal variable.
Overall, I feel strongly that Sequencers and GPUCoalescers need to be interchangeable at the interface to cache controllers, because it is likely that users will want flexibility to use either (your GPU protocols already suggest that you're hacking around unifying their interfaces - e.g. MachineType:TCP). It would be nice if some more thought could go into how and when to make this happen.
1) Yes, they implement the same APIs, but their implementations are very different
2) Unifying the parameters is a lot of work. Even if we unified the parameters, the callback functionality is different. For instance Seq->hitcallback only process one pkt, where Coal->hitcallback processes several packet responses using a two phase approach (hitcallback calls completeHitCallback).
3) I'm not against what you suggest, but don't let that hang up this patch. There is a lot of subtle issues trying to accomplish interchangeablity. As I recall, we tried to do that but we determined it would not be easy and it wasn't worth the effort.

Can you please just let us check in this patch?


- Brad


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/3192/#review7540
-----------------------------------------------------------
Post by Brad Beckmann
-----------------------------------------------------------
http://reviews.gem5.org/r/3192/
-----------------------------------------------------------
(Updated Nov. 9, 2015, 9:16 p.m.)
Review request for Default.
Repository: gem5
Description
-------
Changeset 11199:5d5bd1b0d332
---------------------------
ruby: split CPU and GPU latency stats
Diffs
-----
src/mem/ruby/profiler/Profiler.hh 4daf60db14d794e2344a6c86a93bdd8273bc5bb6
src/mem/ruby/profiler/Profiler.cc 4daf60db14d794e2344a6c86a93bdd8273bc5bb6
src/mem/ruby/slicc_interface/AbstractController.hh 4daf60db14d794e2344a6c86a93bdd8273bc5bb6
src/mem/ruby/slicc_interface/AbstractController.cc 4daf60db14d794e2344a6c86a93bdd8273bc5bb6
src/mem/slicc/symbols/StateMachine.py 4daf60db14d794e2344a6c86a93bdd8273bc5bb6
Diff: http://reviews.gem5.org/r/3192/diff/
Testing
-------
Thanks,
Tony Gutierrez
Joel Hestness
2015-12-04 16:43:19 UTC
Permalink
Post by Brad Beckmann
src/mem/slicc/symbols/StateMachine.py, line 313
<http://reviews.gem5.org/r/3192/diff/2/?file=51487#file51487line313>
It seems like we should be using better inheritance here. For the interface with a cache state machine, is there a difference between a Sequencer and a Coalescer? If not, these getCPUSequencer and getGPUCoalescer functions should be merged to return a type from which both Sequencer and GPUCoalescer descend (e.g. in review request 3189, GPUCoalescer copies much of the Sequencer functionality, so it seems like GPUCoalescer should descend from Sequencer and just overload functionality). Elsewhere in this patch where you need to decide whether to accumulate sequencer or coalescer stats, we could use an accessor function in the common ancestor type that indicates whether the attached RubyPort/Sequencer is a "sequencer" or "coalescer". The interfaces for these types appear nearly identical, and this would keep the modifications to SLICC to a minimum (e.g. these changes would not be necessary).
There is a difference between the GPUCoalescer and Sequencer APIs. Though they have similar named functions, they require different arguments and behave very differently. Also note that it GPUCoalescer and/or Sequencer pointers are passed into the machines, not RubyPort pointers. I think it would be very hard to change that.
The GPUCoalescer behaves very differently than the Sequencer. If you actually diff the cc files, you will notice that well over 90% of the lines are different and that no two functions are the same. There is some Alpha LLSC cruft in the GPUCoalescer that doesn't make much sense to be there. Perhaps that is why you think there is a lot of copying between the two files? Would you feel better about this patch if we removed the LLSC cruft from the GPUCoalescer in 3189?
1) Their current common APIs: Both Sequencers and GPUCoalescers are RubyPorts, so they implement that common API. Nothing to do here.
2) Beyond that, Sequencers implement read and write callbacks from SLICC machines, and GPUCoalescers implement atomic callbacks. Sure, the parameters for GPUCoaleser callbacks are different from Sequencer callbacks, but it's because no effort has been made to unify them; Specifically, all of the read AND write callbacks in GPUCoalescer just call the callback with the most generic parameter list. The same effect can be achieved using default parameters, unifying these callbacks with the Sequencer callbacks. A Sequencer may implement the atomicCallback in the future (e.g. we could use it with the current organization of gem5-gpu), so it can be added to the common API.
3) Finally, for now, the recordCP*CallBacks can be left as members of GPUCoaleser, though they are pretty hacky: They record stats in coalescers when data is returned through sequencers. However, if there was better Sequencer/GPUCoalescer abstraction, you would only need to connect a sequencer OR a coalescer to those state machines (rather than both), AND you could eliminate the strange use_seq_not_coal variable.
Overall, I feel strongly that Sequencers and GPUCoalescers need to be interchangeable at the interface to cache controllers, because it is likely that users will want flexibility to use either (your GPU protocols already suggest that you're hacking around unifying their interfaces - e.g. MachineType:TCP). It would be nice if some more thought could go into how and when to make this happen.
1) Yes, they implement the same APIs, but their implementations are very different
2) Unifying the parameters is a lot of work. Even if we unified the parameters, the callback functionality is different. For instance Seq->hitcallback only process one pkt, where Coal->hitcallback processes several packet responses using a two phase approach (hitcallback calls completeHitCallback).
3) I'm not against what you suggest, but don't let that hang up this patch. There is a lot of subtle issues trying to accomplish interchangeablity. As I recall, we tried to do that but we determined it would not be easy and it wasn't worth the effort.
Can you please just let us check in this patch?
Sure. I was merely suggesting that we should have a plan for how this gets addressed. It would be my preference that AMD plan to merge the Sequencer and GPUCoalescer at a later date. Is that agreeable?


- Joel


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/3192/#review7540
-----------------------------------------------------------
Post by Brad Beckmann
-----------------------------------------------------------
http://reviews.gem5.org/r/3192/
-----------------------------------------------------------
(Updated Nov. 9, 2015, 9:16 p.m.)
Review request for Default.
Repository: gem5
Description
-------
Changeset 11199:5d5bd1b0d332
---------------------------
ruby: split CPU and GPU latency stats
Diffs
-----
src/mem/ruby/profiler/Profiler.hh 4daf60db14d794e2344a6c86a93bdd8273bc5bb6
src/mem/ruby/profiler/Profiler.cc 4daf60db14d794e2344a6c86a93bdd8273bc5bb6
src/mem/ruby/slicc_interface/AbstractController.hh 4daf60db14d794e2344a6c86a93bdd8273bc5bb6
src/mem/ruby/slicc_interface/AbstractController.cc 4daf60db14d794e2344a6c86a93bdd8273bc5bb6
src/mem/slicc/symbols/StateMachine.py 4daf60db14d794e2344a6c86a93bdd8273bc5bb6
Diff: http://reviews.gem5.org/r/3192/diff/
Testing
-------
Thanks,
Tony Gutierrez
Brad Beckmann
2015-12-04 18:39:58 UTC
Permalink
Post by Brad Beckmann
src/mem/slicc/symbols/StateMachine.py, line 313
<http://reviews.gem5.org/r/3192/diff/2/?file=51487#file51487line313>
It seems like we should be using better inheritance here. For the interface with a cache state machine, is there a difference between a Sequencer and a Coalescer? If not, these getCPUSequencer and getGPUCoalescer functions should be merged to return a type from which both Sequencer and GPUCoalescer descend (e.g. in review request 3189, GPUCoalescer copies much of the Sequencer functionality, so it seems like GPUCoalescer should descend from Sequencer and just overload functionality). Elsewhere in this patch where you need to decide whether to accumulate sequencer or coalescer stats, we could use an accessor function in the common ancestor type that indicates whether the attached RubyPort/Sequencer is a "sequencer" or "coalescer". The interfaces for these types appear nearly identical, and this would keep the modifications to SLICC to a minimum (e.g. these changes would not be necessary).
There is a difference between the GPUCoalescer and Sequencer APIs. Though they have similar named functions, they require different arguments and behave very differently. Also note that it GPUCoalescer and/or Sequencer pointers are passed into the machines, not RubyPort pointers. I think it would be very hard to change that.
The GPUCoalescer behaves very differently than the Sequencer. If you actually diff the cc files, you will notice that well over 90% of the lines are different and that no two functions are the same. There is some Alpha LLSC cruft in the GPUCoalescer that doesn't make much sense to be there. Perhaps that is why you think there is a lot of copying between the two files? Would you feel better about this patch if we removed the LLSC cruft from the GPUCoalescer in 3189?
1) Their current common APIs: Both Sequencers and GPUCoalescers are RubyPorts, so they implement that common API. Nothing to do here.
2) Beyond that, Sequencers implement read and write callbacks from SLICC machines, and GPUCoalescers implement atomic callbacks. Sure, the parameters for GPUCoaleser callbacks are different from Sequencer callbacks, but it's because no effort has been made to unify them; Specifically, all of the read AND write callbacks in GPUCoalescer just call the callback with the most generic parameter list. The same effect can be achieved using default parameters, unifying these callbacks with the Sequencer callbacks. A Sequencer may implement the atomicCallback in the future (e.g. we could use it with the current organization of gem5-gpu), so it can be added to the common API.
3) Finally, for now, the recordCP*CallBacks can be left as members of GPUCoaleser, though they are pretty hacky: They record stats in coalescers when data is returned through sequencers. However, if there was better Sequencer/GPUCoalescer abstraction, you would only need to connect a sequencer OR a coalescer to those state machines (rather than both), AND you could eliminate the strange use_seq_not_coal variable.
Overall, I feel strongly that Sequencers and GPUCoalescers need to be interchangeable at the interface to cache controllers, because it is likely that users will want flexibility to use either (your GPU protocols already suggest that you're hacking around unifying their interfaces - e.g. MachineType:TCP). It would be nice if some more thought could go into how and when to make this happen.
1) Yes, they implement the same APIs, but their implementations are very different
2) Unifying the parameters is a lot of work. Even if we unified the parameters, the callback functionality is different. For instance Seq->hitcallback only process one pkt, where Coal->hitcallback processes several packet responses using a two phase approach (hitcallback calls completeHitCallback).
3) I'm not against what you suggest, but don't let that hang up this patch. There is a lot of subtle issues trying to accomplish interchangeablity. As I recall, we tried to do that but we determined it would not be easy and it wasn't worth the effort.
Can you please just let us check in this patch?
Sure. I was merely suggesting that we should have a plan for how this gets addressed. It would be my preference that AMD plan to merge the Sequencer and GPUCoalescer at a later date. Is that agreeable?
I agree at a later date we will revamp the entire Sequencer/Coalescer design. In particular, as I described to you a couple months ago, I want to decouple the storage between pending requests and issued requests. I think that will resolve a lot of confusion.


- Brad


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/3192/#review7540
-----------------------------------------------------------
Post by Brad Beckmann
-----------------------------------------------------------
http://reviews.gem5.org/r/3192/
-----------------------------------------------------------
(Updated Nov. 9, 2015, 9:16 p.m.)
Review request for Default.
Repository: gem5
Description
-------
Changeset 11199:5d5bd1b0d332
---------------------------
ruby: split CPU and GPU latency stats
Diffs
-----
src/mem/ruby/profiler/Profiler.hh 4daf60db14d794e2344a6c86a93bdd8273bc5bb6
src/mem/ruby/profiler/Profiler.cc 4daf60db14d794e2344a6c86a93bdd8273bc5bb6
src/mem/ruby/slicc_interface/AbstractController.hh 4daf60db14d794e2344a6c86a93bdd8273bc5bb6
src/mem/ruby/slicc_interface/AbstractController.cc 4daf60db14d794e2344a6c86a93bdd8273bc5bb6
src/mem/slicc/symbols/StateMachine.py 4daf60db14d794e2344a6c86a93bdd8273bc5bb6
Diff: http://reviews.gem5.org/r/3192/diff/
Testing
-------
Thanks,
Tony Gutierrez
Joel Hestness
2015-12-04 18:44:25 UTC
Permalink
Post by Brad Beckmann
src/mem/slicc/symbols/StateMachine.py, line 313
<http://reviews.gem5.org/r/3192/diff/2/?file=51487#file51487line313>
It seems like we should be using better inheritance here. For the interface with a cache state machine, is there a difference between a Sequencer and a Coalescer? If not, these getCPUSequencer and getGPUCoalescer functions should be merged to return a type from which both Sequencer and GPUCoalescer descend (e.g. in review request 3189, GPUCoalescer copies much of the Sequencer functionality, so it seems like GPUCoalescer should descend from Sequencer and just overload functionality). Elsewhere in this patch where you need to decide whether to accumulate sequencer or coalescer stats, we could use an accessor function in the common ancestor type that indicates whether the attached RubyPort/Sequencer is a "sequencer" or "coalescer". The interfaces for these types appear nearly identical, and this would keep the modifications to SLICC to a minimum (e.g. these changes would not be necessary).
There is a difference between the GPUCoalescer and Sequencer APIs. Though they have similar named functions, they require different arguments and behave very differently. Also note that it GPUCoalescer and/or Sequencer pointers are passed into the machines, not RubyPort pointers. I think it would be very hard to change that.
The GPUCoalescer behaves very differently than the Sequencer. If you actually diff the cc files, you will notice that well over 90% of the lines are different and that no two functions are the same. There is some Alpha LLSC cruft in the GPUCoalescer that doesn't make much sense to be there. Perhaps that is why you think there is a lot of copying between the two files? Would you feel better about this patch if we removed the LLSC cruft from the GPUCoalescer in 3189?
1) Their current common APIs: Both Sequencers and GPUCoalescers are RubyPorts, so they implement that common API. Nothing to do here.
2) Beyond that, Sequencers implement read and write callbacks from SLICC machines, and GPUCoalescers implement atomic callbacks. Sure, the parameters for GPUCoaleser callbacks are different from Sequencer callbacks, but it's because no effort has been made to unify them; Specifically, all of the read AND write callbacks in GPUCoalescer just call the callback with the most generic parameter list. The same effect can be achieved using default parameters, unifying these callbacks with the Sequencer callbacks. A Sequencer may implement the atomicCallback in the future (e.g. we could use it with the current organization of gem5-gpu), so it can be added to the common API.
3) Finally, for now, the recordCP*CallBacks can be left as members of GPUCoaleser, though they are pretty hacky: They record stats in coalescers when data is returned through sequencers. However, if there was better Sequencer/GPUCoalescer abstraction, you would only need to connect a sequencer OR a coalescer to those state machines (rather than both), AND you could eliminate the strange use_seq_not_coal variable.
Overall, I feel strongly that Sequencers and GPUCoalescers need to be interchangeable at the interface to cache controllers, because it is likely that users will want flexibility to use either (your GPU protocols already suggest that you're hacking around unifying their interfaces - e.g. MachineType:TCP). It would be nice if some more thought could go into how and when to make this happen.
1) Yes, they implement the same APIs, but their implementations are very different
2) Unifying the parameters is a lot of work. Even if we unified the parameters, the callback functionality is different. For instance Seq->hitcallback only process one pkt, where Coal->hitcallback processes several packet responses using a two phase approach (hitcallback calls completeHitCallback).
3) I'm not against what you suggest, but don't let that hang up this patch. There is a lot of subtle issues trying to accomplish interchangeablity. As I recall, we tried to do that but we determined it would not be easy and it wasn't worth the effort.
Can you please just let us check in this patch?
Sure. I was merely suggesting that we should have a plan for how this gets addressed. It would be my preference that AMD plan to merge the Sequencer and GPUCoalescer at a later date. Is that agreeable?
I agree at a later date we will revamp the entire Sequencer/Coalescer design. In particular, as I described to you a couple months ago, I want to decouple the storage between pending requests and issued requests. I think that will resolve a lot of confusion.
Ok. Thank you!


- Joel


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/3192/#review7540
-----------------------------------------------------------
Post by Brad Beckmann
-----------------------------------------------------------
http://reviews.gem5.org/r/3192/
-----------------------------------------------------------
(Updated Nov. 9, 2015, 9:16 p.m.)
Review request for Default.
Repository: gem5
Description
-------
Changeset 11199:5d5bd1b0d332
---------------------------
ruby: split CPU and GPU latency stats
Diffs
-----
src/mem/ruby/profiler/Profiler.hh 4daf60db14d794e2344a6c86a93bdd8273bc5bb6
src/mem/ruby/profiler/Profiler.cc 4daf60db14d794e2344a6c86a93bdd8273bc5bb6
src/mem/ruby/slicc_interface/AbstractController.hh 4daf60db14d794e2344a6c86a93bdd8273bc5bb6
src/mem/ruby/slicc_interface/AbstractController.cc 4daf60db14d794e2344a6c86a93bdd8273bc5bb6
src/mem/slicc/symbols/StateMachine.py 4daf60db14d794e2344a6c86a93bdd8273bc5bb6
Diff: http://reviews.gem5.org/r/3192/diff/
Testing
-------
Thanks,
Tony Gutierrez
Joel Hestness
2015-12-04 18:44:31 UTC
Permalink
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/3192/#review7693
-----------------------------------------------------------

Ship it!


Ship It!

- Joel Hestness
Post by Tony Gutierrez
-----------------------------------------------------------
http://reviews.gem5.org/r/3192/
-----------------------------------------------------------
(Updated Nov. 9, 2015, 9:16 p.m.)
Review request for Default.
Repository: gem5
Description
-------
Changeset 11199:5d5bd1b0d332
---------------------------
ruby: split CPU and GPU latency stats
Diffs
-----
src/mem/ruby/profiler/Profiler.hh 4daf60db14d794e2344a6c86a93bdd8273bc5bb6
src/mem/ruby/profiler/Profiler.cc 4daf60db14d794e2344a6c86a93bdd8273bc5bb6
src/mem/ruby/slicc_interface/AbstractController.hh 4daf60db14d794e2344a6c86a93bdd8273bc5bb6
src/mem/ruby/slicc_interface/AbstractController.cc 4daf60db14d794e2344a6c86a93bdd8273bc5bb6
src/mem/slicc/symbols/StateMachine.py 4daf60db14d794e2344a6c86a93bdd8273bc5bb6
Diff: http://reviews.gem5.org/r/3192/diff/
Testing
-------
Thanks,
Tony Gutierrez
Loading...