Discussion:
[gem5-dev] Review Request 2776: ruby: cleaner ruby tester support
Anthony Gutierrez
2015-05-11 22:17:52 UTC
Permalink
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2776/
-----------------------------------------------------------

Review request for Default.


Repository: gem5


Description
-------

Changeset 10833:10eaaf461483
---------------------------
ruby: cleaner ruby tester support

This patch allows the ruby random tester to use ruby ports that may only
support instr or data requests. This patch is similar to a previous changeset
(8932:1b2c17565ac8) that was unfortunately broken by subsequent changesets.
The current patch implements the support in a more straightforward way.
The patch also includes better DPRINTFs and generalizes the retry behavior
needed by the ruby tester so that other testers/cpu models can use it as well.


Diffs
-----


Diff: http://reviews.gem5.org/r/2776/diff/


Testing
-------


Thanks,

Anthony Gutierrez
Anthony Gutierrez
2015-05-11 22:28:30 UTC
Permalink
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2776/
-----------------------------------------------------------

(Updated May 11, 2015, 3:28 p.m.)


Review request for Default.


Repository: gem5


Description (updated)
-------

Changeset 10833:e624796bae17
---------------------------
ruby: cleaner ruby tester support

This patch allows the ruby random tester to use ruby ports that may only
support instr or data requests. This patch is similar to a previous changeset
(8932:1b2c17565ac8) that was unfortunately broken by subsequent changesets.
The current patch implements the support in a more straightforward way.
The patch also includes better DPRINTFs and generalizes the retry behavior
needed by the ruby tester so that other testers/cpu models can use it as well.


Diffs (updated)
-----

configs/example/ruby_random_test.py fbdaa08aaa426b9f4660c366f934ccb670d954ec
configs/ruby/MESI_Three_Level.py fbdaa08aaa426b9f4660c366f934ccb670d954ec
configs/ruby/MESI_Two_Level.py fbdaa08aaa426b9f4660c366f934ccb670d954ec
configs/ruby/MI_example.py fbdaa08aaa426b9f4660c366f934ccb670d954ec
configs/ruby/MOESI_CMP_directory.py fbdaa08aaa426b9f4660c366f934ccb670d954ec
configs/ruby/MOESI_CMP_token.py fbdaa08aaa426b9f4660c366f934ccb670d954ec
configs/ruby/MOESI_hammer.py fbdaa08aaa426b9f4660c366f934ccb670d954ec
src/cpu/testers/rubytest/Check.cc fbdaa08aaa426b9f4660c366f934ccb670d954ec
src/cpu/testers/rubytest/CheckTable.cc fbdaa08aaa426b9f4660c366f934ccb670d954ec
src/cpu/testers/rubytest/RubyTester.hh fbdaa08aaa426b9f4660c366f934ccb670d954ec
src/cpu/testers/rubytest/RubyTester.cc fbdaa08aaa426b9f4660c366f934ccb670d954ec
src/cpu/testers/rubytest/RubyTester.py fbdaa08aaa426b9f4660c366f934ccb670d954ec
src/mem/packet_queue.cc fbdaa08aaa426b9f4660c366f934ccb670d954ec
src/mem/ruby/system/RubyPort.hh fbdaa08aaa426b9f4660c366f934ccb670d954ec
src/mem/ruby/system/RubyPort.cc fbdaa08aaa426b9f4660c366f934ccb670d954ec
src/mem/ruby/system/Sequencer.py fbdaa08aaa426b9f4660c366f934ccb670d954ec

Diff: http://reviews.gem5.org/r/2776/diff/


Testing
-------


Thanks,

Anthony Gutierrez
Andreas Hansson
2015-05-12 05:45:48 UTC
Permalink
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2776/#review6152
-----------------------------------------------------------



src/mem/packet_queue.cc (line 117)
<http://reviews.gem5.org/r/2776/#comment5343>

I strongly disagree with this change. This buffer should not exist, and even 100 packets is a stretch. Any module hitting this needs to reconsider how it deals with flow control.
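
For readers following along: the check in question is a size sanity check on the port's internal transmitList. The toy C++ below sketches its shape (an illustrative stand-in, not the actual gem5 source; names and the exact limit are assumptions):

#include <cstdio>
#include <cstdlib>
#include <list>
#include <string>

struct Packet {};  // stand-in for gem5's packet type

class TransmitQueue
{
    std::list<Packet *> transmitList;  // the "invisible" buffer being debated
    std::string portName;

  public:
    explicit TransmitQueue(const std::string &name) : portName(name) {}

    void schedSend(Packet *pkt)
    {
        transmitList.push_back(pkt);
        // Sanity check: a deep backlog here means the sender is issuing
        // packets without seeing any flow control, since this buffer is
        // invisible to it.
        if (transmitList.size() > 100) {
            std::fprintf(stderr,
                         "panic: packet queue %s has grown beyond 100 packets\n",
                         portName.c_str());
            std::abort();
        }
    }
};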


- Andreas Hansson
Brad Beckmann
2015-05-12 19:19:06 UTC
Permalink
I would agree with you that the buffer should not exist, but removing the buffer goes well beyond this patch. The worst part about the buffer is that its size is not visible to the sender. It is a really bad design, but for now we need to get around the immediate problem. Later we can discuss how, and by whom, this will be fixed properly.

We've discussed the proposal to increase the size in the past, but unfortunately we have not come to a resolution. We absolutely need to resolve this now. We cannot use the ruby tester with our upcoming GPU model without this change.

The important thing to keep in mind is that the RubyTester is designed to stress the protocol logic, and it creates contention scenarios that would not exist in a real system. The RubyTester finds bugs in a matter of seconds that may not be encountered in hours of real workload simulation. It does this by sending large bursts of racy requests. Bottom line: the RubyTester needs this large value.


- Brad


Andreas Hansson
2015-05-13 09:30:35 UTC
Permalink
I would argue that if you need flow control, you should not use the QueuedPorts, and rather use a "bare-metal" MasterPort and deal with the flow control. There is no point in adding flow control (or buffer visibility) to the QueuedPort, in my opinion.

I'd suggest switching to a MasterPort either as part of this patch, or in a patch before it. That way you have no implicit buffer, and no need to create kilobytes of invisible buffering in the system.
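
To illustrate the alternative being suggested here, explicit flow control instead of an implicit queue, the pattern looks roughly like the toy C++ below. The method names echo gem5's timing-port protocol (a send that can be rejected plus a retry callback), but this is only a sketch, not the RubyTester or RubyPort code:

struct Packet {};

// A sender that respects back-pressure: if the send is rejected it holds
// the packet and stalls until the receiver signals a retry, instead of
// piling packets into an unbounded queue.
class FlowControlledSender
{
    Packet *retryPkt = nullptr;  // at most one packet waiting for a retry

  public:
    void trySend(Packet *pkt)
    {
        if (!sendTimingReq(pkt)) {
            retryPkt = pkt;  // receiver busy: remember the packet and stall
        }
    }

    // Called by the receiver when it can accept a packet again.
    void recvReqRetry()
    {
        if (retryPkt && sendTimingReq(retryPkt)) {
            retryPkt = nullptr;  // resend succeeded; issuing can resume
        }
    }

  private:
    // Stand-in for the real port call; returns false when the receiver
    // cannot accept the packet this cycle.
    bool sendTimingReq(Packet *) { return true; }
};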


- Andreas


Brad Beckmann
2015-05-13 16:16:29 UTC
Permalink
The RubyPort and the Ruby Tester predate MasterPorts and QueuedPorts. Your suggestion goes far beyond this patch. Reimplementing RubyPort's m5 ports to inherit from a different base port is a very significant change with many, many implications beyond the Ruby Tester. That is a pretty unreasonable request.

Please keep in mind that I was not the one who decided that RubyPort should use QueuedPorts. That decision was made back in 2012 with patches from your group: 8914:8c3bd7bea667 and 8922:17f037ad8918.


- Brad


Nilay Vaish
2015-05-13 16:46:53 UTC
Permalink
Brad, I think you are forgetting one thing here. If the packets / memory requests are sitting in the packet queue of the QueuedPort, then they have not yet been seen by the ruby memory system. So, if your memory system cannot process requests at the rate at which the tester is issuing them, you are not going to see the desired behavior anyway. In that case, buffering them in the tester should also be OK.

--
Nilay
Beckmann, Brad
2015-05-13 17:34:08 UTC
Permalink
The reason why the requests cannot be processed is not a resource issue; it is usually an address contention issue. When the address contention is resolved, a large racy burst of requests is issued at once. That is exactly the type of scenario that finds protocol bugs. The request buffering you are referring to is not in the queued port; rather, the buffering occurs in the Sequencer (or other similar objects). The queued port buffering happens on the response path back to the RubyTester. The RubyTester itself can deal with a very large number of responses at once. The queued port limit is reached simply because a large number of responses come back at once.

Brad


Jason Power
2015-05-13 21:14:56 UTC
Permalink
Is the queued port really supposed to be a high-fidelity model of anything in the "real" world? I was under the impression that it was a courtesy wrapper around the port so that we don't have 10 different implementations of the same general thing.

IMO, if you are trying to model an on-chip network at high fidelity you need something more than what is provided by this model anyway. It seems to me that in this case the queued port is a gem5 interface decision, not a hardware system design decision. Therefore, infinite buffering is not a major problem.

What if this were turned into a "warn" or a "warn_once"? It seems like a good idea to let the user know that the queue is growing wildly, as that might be an indication of a deadlock.

(Note: IIRC, with our high-bandwidth GPU model we've run into this 100-packet limit as well.)
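
A sketch of the warn-once idea, purely illustrative: gem5 provides warn() and warn_once() macros for this, while this stand-alone version just uses a static flag so it compiles on its own:

#include <cstdio>
#include <cstddef>

// Report a runaway queue once instead of panicking, so the run continues
// but the user is told the backlog may indicate a deadlock.
void checkQueueDepth(std::size_t depth, const char *portName)
{
    static bool warned = false;
    if (depth > 100 && !warned) {
        std::fprintf(stderr,
                     "warn: packet queue %s has grown beyond 100 packets; "
                     "this may indicate a deadlock\n", portName);
        warned = true;
    }
}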


- Jason


Andreas Hansson
2015-05-13 21:24:38 UTC
Permalink
I think:

1) The 100-packet sanity check should definitely not be changed as part of this patch. If we do want to change it, it should be a patch of its own.

2) If there are somehow 10,000 responses queued up, then something is truly _fundamentally_ broken, and having a panic telling the user it is broken is helpful, IMHO. Do you not agree?

The QueuedPort is an evolution of the SimpleTimingPort that was used earlier, and I believe the RubyPort has always had this issue (it borrowed a lot of not-so-great design decisions from the Bus, for example). The QueuedPort should really only be used as a "lazy" way of implementing non-performance-critical devices. I've been meaning to remove it from the classic Caches as well; I just have not gotten around to it.

In the case of the RubyPort, I do not see a big issue with moving away from QueuedPorts. We have already done so in the Bridge, SimpleMemory, etc. The only implication is that we need to deal with the port flow control (retry) and pass that downstream/upstream. Since the RubyPort really does not correspond to any real physical object with any buffers, I think the case is clear: it should not do any buffering behind the user's back. Do you not think this is reasonable?


- Andreas


Brad Beckmann
2015-05-15 22:20:11 UTC
Permalink
Yes, if there are 10,000 responses queued up, then there might be something broken. However, 100 responses can queue up rather easily when running with the Ruby Tester or with a GPU. That is why this patch changes the sanity check from 100 to 10,000.

RubyPort is used by a lot of users, with many different scenarios/protocols. Moving it away from QueuedPorts will fundamentally change its behavior. That should be done carefully, and I don't want to delay this bug fix with a project that large.

Would you be happy if we introduced a patch that modifies the QueuedSlavePort, RespPacketQueue, and PacketQueue constructors so that the RubyPort can set the transmitList threshold to 10,000 while the default remains 100?
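
A sketch of what that compromise could look like, with the threshold threaded through the constructor so one owner (e.g. a RubyPort-like object) can raise it while everyone else keeps the default. Illustrative only; the parameter name and the actual gem5 constructor signatures are assumptions:

#include <cstdio>
#include <cstdlib>
#include <list>
#include <string>

struct Packet {};

class TransmitQueue
{
    std::list<Packet *> transmitList;
    std::string portName;
    const std::size_t sanityLimit;  // visible knob instead of a magic number

  public:
    TransmitQueue(const std::string &name, std::size_t limit = 100)
        : portName(name), sanityLimit(limit) {}

    void schedSend(Packet *pkt)
    {
        transmitList.push_back(pkt);
        if (transmitList.size() > sanityLimit) {
            std::fprintf(stderr,
                         "panic: packet queue %s has grown beyond %zu packets\n",
                         portName.c_str(), sanityLimit);
            std::abort();
        }
    }
};

// Usage: a Ruby-facing port raises the limit to absorb tester bursts,
// while other ports keep the default of 100.
TransmitQueue cpuQueue("cpu.dcache_port");
TransmitQueue rubyTesterQueue("ruby_port", 10000);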


- Brad


Andreas Hansson
2015-05-16 09:57:58 UTC
Permalink
Can we simply leave this change out of this patch? I think the right solution here is to fix the dodgy assumptions about implicit buffering, and I'm happy to have a go at it myself (given that I've done something similar for a range of other modules).


- Andreas


Brad Beckmann
2015-05-18 05:02:09 UTC
Permalink
The RubyTester and our soon-to-be-posted HSA-compatible GPU will not work without this change. I believe many Ruby users, especially Ruby users evaluating GPUs, have had this type of change in their personal patch queues for a very long time. We need to get this problem out of our personal patch queues and into the main line.

I appreciate your offer to redesign the RubyPort yourself, but even if that were to happen, I would prefer to get this change in first. A redesign of the RubyPort will require a lot of downstream work to verify things still work as expected.

Why are you so against changing this somewhat arbitrary value, or at least allowing someone to change it? As Jason pointed out, the queued port is an infinite buffer and doesn't represent a real hardware structure. Why so much resistance? This small change really benefits Ruby users, and if we make it configurable, it will have no impact on Classic users.


- Brad


Andreas Hansson
2015-05-18 07:00:26 UTC
Permalink
Post by Brad Beckmann
src/mem/packet_queue.cc, line 117
<http://reviews.gem5.org/r/2776/diff/1/?file=45138#file45138line117>
I strongly disagree with this change. This buffer should not exist, and even 100 packets is a stretch. Any module hitting this needs to reconsider how it deals with flow control.
I would agree with you that the buffer should not exist, but removing the buffer goes well beyond this patch. The worst part about the buffer is that its size is not visible to the sender. It is a really bad design, but for now we need to get around the immediate problem. Later we can discuss how and who will fix this right.
We've discussed the proposal in the past to increase the size, but unfortunately we have not come to a resolution. We absolutely need to resolve this now. We cannot use the ruby tester with our upcoming GPU model without this change.
The important thing to keep in mind is that the RubyTester is designed to stress the protocol logic and it creates contention scenarios that would not exist in a real system. The RubyTester finds bugs in a matter of seconds that may not be encountered in hours of real workload simulation. It does this by sending large bursts of racy requests. Bottom line: the RubyTester needs this large value.
I would argue that if you need flow control, you should not use the QueuedPorts, and rather use a "bare-metal" MasterPort, and deal with the flow control. There is no point in adding flow control (or buffer visibility) to the QueuePort in my opinion.
I fundamentally object to the change since it effectively adds an implicit infinite buffer (in quite some places), thus changing the very behaviour of the system and potentially masking problems.

The RubyPort should really just be a glue-logic adaptor. Agreed? In a system architecture, there is no such thing as a RubyPort, and there should not be kilobytes of "magic" buffering imposed on the user. If this buffer is needed, then we should instantiate a buffer, cache, etc. I do not understand the statement that this is needed to make things work. If you need an infinite buffer to make things work, then the solution is broken. Am I missing something?

I would propose to simply remove the QueuedPort, make it inherit from the Master/SlavePort directly, remove the implicit buffer altogether, and simply relay the retry to/from Ruby. Surely Ruby understands flow control...does it not?
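(To make the "relay the retry" idea concrete, here is a self-contained toy of the timing-port flow control being described, not RubyPort code: the adapter holds at most one packet that its peer refused and resends it when the peer signals a retry; everything beyond that single packet is back-pressure pushed upstream. The class names are invented, and the method names only loosely follow gem5's sendTimingResp/recvRespRetry convention.)

    #include <cassert>
    #include <cstdio>

    // Toy receiver that models back-pressure: it accepts one packet and
    // then refuses further ones until it is explicitly given room again.
    class ToyPeer {
      public:
        bool recvTimingResp(int pkt) {
            if (busy_) return false;        // "please retry later"
            std::printf("peer accepted packet %d\n", pkt);
            busy_ = true;
            return true;
        }
        void makeRoom() { busy_ = false; }
      private:
        bool busy_ = false;
    };

    // Toy adapter with no internal queue: a refused packet is kept as the
    // single pending packet, the stall is reported upstream, and the send
    // completes when the peer's retry arrives.
    class ToyAdapterPort {
      public:
        explicit ToyAdapterPort(ToyPeer &peer) : peer_(peer) {}

        // Returns false if the caller (the protocol side) must stall.
        bool sendTimingResp(int pkt) {
            if (havePending_)
                return false;               // still waiting for a retry
            if (!peer_.recvTimingResp(pkt)) {
                havePending_ = true;
                pendingPkt_ = pkt;
                return false;               // back-pressure goes upstream
            }
            return true;
        }

        // Called when the peer can accept packets again.
        void recvRespRetry() {
            assert(havePending_);
            if (peer_.recvTimingResp(pendingPkt_))
                havePending_ = false;       // upstream may now be unblocked
        }

      private:
        ToyPeer &peer_;
        bool havePending_ = false;
        int pendingPkt_ = 0;
    };

    int main() {
        ToyPeer peer;
        ToyAdapterPort port(peer);
        port.sendTimingResp(1);             // accepted
        if (!port.sendTimingResp(2)) {      // refused: peer is busy
            peer.makeRoom();
            port.recvRespRetry();           // packet 2 goes through now
        }
        return 0;
    }

The point of the sketch is only that the adapter buffers at most one refused packet and otherwise propagates the stall, which is the behaviour being argued for above; whether Ruby's side can absorb that back-pressure is exactly the open question in this thread.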


- Andreas


Steve Reinhardt
2015-05-18 16:40:38 UTC
Permalink
Getting rid of implicit infinite buffering is a noble and appropriate long-term goal, I agree. However, in the near term, we already have implicit infinite buffering, and the only thing we're debating here is at what arbitrary point we warn the user that the use of that infinite buffer is potentially getting abusive. To me it seems obvious that the fundamental near-term problem is that we have this totally arbitrary constant embedded in the code and no way to adjust it based on different circumstances in which different values would be appropriate. I think Brad's suggestion that we make this threshold configurable and then find some way to propagate the value downward is the right way to go for now; we can leave the threshold at 100 for CPU-based systems and crank it up for GPU-based systems, with maybe a third, even higher threshold specifically for when we run the RubyTester.


- Steve


Andreas Hansson
2015-05-18 16:52:40 UTC
Permalink
I'd rather have one arbitrary constant than three :-), and the lower the better.

Can we just proceed with this patch without this change? At least then I can perhaps get an appreciation for the issue (if you can point me to an example that fails). Right now I really do not understand how this could ever be a problem in any sensible setup.


- Andreas


Joel Hestness
2015-05-18 17:00:04 UTC
Permalink
I agree with Andreas on this one. I don't feel that an "arbitrary" constant should be changed to another "arbitrary" constant in a patch that is trying to fix up the RubyTester. There must be configurations of the RubyTester that will work without increasing the threshold to 10000, so the RubyTester changes will still work for some configurations. These are largely orthogonal issues.

Split the threshold change into a separate patch where we can focus on debating the merits of changing between so-called "arbitrary" buffering limits (e.g., gem5-gpu doesn't require any changes to the RubyPort or QueuedPort buffering thresholds).


- Joel


Steve Reinhardt
2015-05-18 17:27:46 UTC
Permalink
We can certainly split this change into a separate patch. It's true that 10,000 is just another arbitrary value; it may make more sense to add a parameter that lets us turn off this threshold check altogether rather than create another arbitrary constant. Note that the RubyTester is not trying to be a "sensible setup"; it's a stress tester that pushes the protocol implementation beyond the limits of what a "sensible" configuration might ever reach.


- Steve


Brad Beckmann
2015-05-18 23:04:13 UTC
Permalink
Post by Brad Beckmann
src/mem/packet_queue.cc, line 117
<http://reviews.gem5.org/r/2776/diff/1/?file=45138#file45138line117>
I have no problem moving this change to a separate patch, but we still need the flexibility to either set this arbitrary value or turn the check off. Leaving it as-is will not work for us. We have important configurations of the RubyTester and other internal testers that hit the 100-packet threshold.

I don't want to create a separate patch only to spawn another week of high-level resistance. Can you confirm that you will approve a patch that allows an object to configure the threshold or turn it off? Do you have a preference between those two options?
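For concreteness, here is a minimal, self-contained C++ sketch of the second option: the queue takes its sanity-check limit as a constructor argument, and a value of 0 disables the check. The names (BoundedTransmitList, sanityLimit) are purely illustrative and are not gem5's actual PacketQueue interface; in gem5 the value would presumably be threaded through the QueuedSlavePort/RespPacketQueue constructors from a Python parameter.

    #include <cstdio>
    #include <cstdlib>
    #include <deque>
    #include <string>

    // Illustrative stand-in for a queued response path; not gem5 code.
    struct BoundedTransmitList {
        // limit == 0 means the sanity check is disabled (tester-only use).
        explicit BoundedTransmitList(unsigned limit = 100) : sanityLimit(limit) {}

        void push(const std::string &pkt) {
            if (sanityLimit != 0 && queue.size() >= sanityLimit) {
                std::fprintf(stderr,
                             "panic: transmit list has %zu packets (limit %u); "
                             "possible deadlock or missing flow control\n",
                             queue.size(), sanityLimit);
                std::abort();
            }
            queue.push_back(pkt);
        }

        std::deque<std::string> queue;
        unsigned sanityLimit;
    };

    int main()
    {
        BoundedTransmitList cpuLike;        // keeps the default limit of 100
        BoundedTransmitList testerLike(0);  // check turned off for a stress tester
        for (int i = 0; i < 1000; ++i)
            testerLike.push("response");    // fine: the check is off
        cpuLike.push("response");           // fine: well under the limit
        return 0;
    }

Either flavor (a raised limit or a disable switch) would keep classic CPU configurations at the current 100-packet default.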


- Brad


Joel Hestness
2015-05-19 00:29:40 UTC
Permalink
Post by Brad Beckmann
src/mem/packet_queue.cc, line 117
<http://reviews.gem5.org/r/2776/diff/1/?file=45138#file45138line117>
Can you confirm that you will approve a patch that allows an object to configure the threshold or turn it off? Do you have a preference between those two options?
Sure.

The RubyTester (or other testers) is certainly an exceptional case for coherence and consistency incantations aimed at finding bugs. From what I've read about the incoming GPU memory hierarchies in these reviews, however, I'm unsure whether I can be convinced that a GPU core's sequencer should be allowed to queue more than order(100) packets. Hitting this limit with real system components seems like a major red flag that your simulated system is very unbalanced and needs to be reconfigured. We definitely need to see the GPU code to understand that more deeply.

As a separate review request, I would prefer that we add an option to allow very large but finite buffering in all PacketQueue instances of a single simulation. Allowing infinite buffering is likely a bad idea for debugging (imagine a Ruby livelock that just fills a RubyPort response buffer that never unblocks). I'd also prefer comments on that option that strongly suggest it should only be used for exceptional cases, such as testers. Such an option should *not* be exposed on the command line due to obvious potential misuse by a naive user, but instead should be set by the RubyTester simulation config file. I would likely oppose a per-instance PacketQueue buffering limit, because that would just encourage strange, improper use of fake buffering. Until we can actually see your GPU code, I feel that the option should only be allowed to be used with the RubyTester in mainline gem5 code.
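For contrast with a per-instance knob, here is an equally rough sketch of the simulation-wide variant: a single, still-finite limit shared by every queue, raised exactly once by a tester-style configuration. The static member and the numbers are hypothetical; they are not existing gem5 parameters.

    #include <cstdio>
    #include <cstdlib>
    #include <deque>

    // Illustrative only: one limit shared by every queue in the simulation,
    // kept finite even for testers so a livelock still trips the panic.
    struct GlobalLimitQueue {
        static unsigned limit;

        void push(int pkt) {
            if (buf.size() >= limit) {
                std::fprintf(stderr,
                             "panic: %zu queued packets exceed the "
                             "simulation-wide limit of %u\n",
                             buf.size(), limit);
                std::abort();
            }
            buf.push_back(pkt);
        }

        std::deque<int> buf;
    };

    unsigned GlobalLimitQueue::limit = 100;  // default for ordinary CPU runs

    int main()
    {
        GlobalLimitQueue::limit = 10000;     // raised once by the tester config
        GlobalLimitQueue a, b;
        for (int i = 0; i < 500; ++i) {
            a.push(i);                       // well under the raised limit
            b.push(i);
        }
        return 0;
    }

Because the limit is global, no individual component can quietly grant itself extra fake buffering, which seems to be the point of opposing a per-instance knob.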


- Joel


Brad Beckmann
2015-05-19 04:26:17 UTC
Permalink
Post by Brad Beckmann
src/mem/packet_queue.cc, line 117
<http://reviews.gem5.org/r/2776/diff/1/?file=45138#file45138line117>
Thanks Joel for voicing your preference for the finite buffering case. Once Andreas approves as well, I'll implement and post that patch.

As far as your concern about an unbalanced GPU system, please keep in mind that this buffer holds responses. Also, in our implementation each work-item issues a separate request packet, and those packets are coalesced in the RubyPort. All it takes is 2 wavefronts (2x64 work-items) waiting on the same few cache lines: when those lines arrive at the GPU L1 cache, 128 response packets are sent back toward the GPU, and the 100-packet threshold is exceeded. I do not think this is a result of an unbalanced system.
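To make the arithmetic explicit, here is a tiny self-contained illustration; the 64-lane wavefront width and the 100-packet default are the numbers from this thread, and the rest is purely illustrative:

    #include <cstdio>

    int main()
    {
        const unsigned wavefronts = 2;            // just two wavefronts...
        const unsigned lanesPerWavefront = 64;    // ...of 64 work-items each
        const unsigned defaultLimit = 100;        // current transmitList sanity check

        // Each work-item's load becomes its own response packet once the
        // coalesced cache line comes back and is fanned out again.
        const unsigned responsePackets = wavefronts * lanesPerWavefront;

        std::printf("%u response packets vs. a limit of %u -> %s\n",
                    responsePackets, defaultLimit,
                    responsePackets > defaultLimit ? "panic" : "ok");
        return 0;
    }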


- Brad


Andreas Hansson
2015-05-19 07:28:09 UTC
Permalink
Post by Brad Beckmann
src/mem/packet_queue.cc, line 117
<http://reviews.gem5.org/r/2776/diff/1/?file=45138#file45138line117>
I would like to understand the architecture a bit better before simply increasing the limit. Again, I propose we push this patch without that change, since the discussion has clearly not yet converged.

From the brief description it sounds like the issue arises from a rather unfortunate use of packets. Perhaps I am missing something, but why not simply add masking, or do the coalescing before the "real" port? Are the GPU L1 caches not taking care of this already (or are the L1s in Ruby)?
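As a rough illustration of the kind of coalescing being suggested (a sketch of the idea with made-up types, not the RubyPort's or the GPU model's actual logic): many per-work-item loads to the same line collapse into one outstanding line request, and the single response satisfies every waiter instead of fanning out into one packet per work-item.

    #include <cstdint>
    #include <cstdio>
    #include <unordered_map>
    #include <vector>

    // Sketch: collapse per-work-item loads to one request per cache line,
    // and satisfy all waiters from the single response.
    struct Coalescer {
        static constexpr uint64_t kLineBytes = 64;

        // Returns true only if a new line request must actually be issued.
        bool request(uint64_t addr, int workItem) {
            auto &waiters = pending[addr / kLineBytes];
            waiters.push_back(workItem);
            return waiters.size() == 1;   // only the first miss goes to the port
        }

        // One response from the memory side wakes every coalesced work-item.
        void respond(uint64_t line) {
            for (int wi : pending[line])
                std::printf("line %llu -> work-item %d\n",
                            (unsigned long long)line, wi);
            pending.erase(line);
        }

        std::unordered_map<uint64_t, std::vector<int>> pending;
    };

    int main()
    {
        Coalescer c;
        unsigned issued = 0;
        // 128 work-items (2 wavefronts) all touching the same cache line.
        for (int wi = 0; wi < 128; ++wi)
            if (c.request(0x1000, wi))
                ++issued;
        std::printf("issued %u line request(s) for 128 work-items\n", issued);
        c.respond(0x1000 / Coalescer::kLineBytes);
        return 0;
    }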

I had a look at the RubyPort, and it seems all cases of QueuedPort can easily be removed and replaced simply by forwarding retries from one side to the other. The one exception I cannot figure out is the "hit_callback". How can we tell Ruby that a response needs to wait?
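To sketch the retry-forwarding idea in isolation: the sender holds at most one blocked response and resends it when the receiver signals a retry. This is a toy model of the handshake, not gem5's port code; the method names only loosely echo the port interface, and the open question above is exactly the precondition marked in the comment (Ruby must be able to hold back hit_callback responses while one is blocked).

    #include <cstdio>
    #include <optional>
    #include <string>

    // Toy model of port-style flow control with no hidden queue: the sender
    // keeps at most one blocked response and resends it on retry.
    struct Receiver {
        bool busy = true;                        // pretend the CPU side is stalled
        bool recvTimingResp(const std::string &pkt) {
            if (busy)
                return false;                    // "try again later"
            std::printf("delivered: %s\n", pkt.c_str());
            return true;
        }
    };

    struct Sender {
        Receiver *peer = nullptr;
        std::optional<std::string> blocked;      // the one response being held

        // Precondition (the hit_callback question): the producer must not
        // call this again while a response is still blocked.
        bool trySend(const std::string &pkt) {
            if (peer->recvTimingResp(pkt))
                return true;
            blocked = pkt;                       // hold it, no invisible buffering
            return false;
        }

        // Called when the receiver frees up (the "retry").
        void recvRespRetry() {
            if (blocked && peer->recvTimingResp(*blocked))
                blocked.reset();
        }
    };

    int main()
    {
        Receiver cpuSide;
        Sender rubySide;
        rubySide.peer = &cpuSide;
        rubySide.trySend("response A");          // receiver busy -> held back
        cpuSide.busy = false;                    // receiver drains
        rubySide.recvRespRetry();                // the held response goes out now
        return 0;
    }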


- Andreas


Brad Beckmann
2015-05-19 15:40:11 UTC
Permalink
Post by Brad Beckmann
src/mem/packet_queue.cc, line 117
<http://reviews.gem5.org/r/2776/diff/1/?file=45138#file45138line117>
I strongly disagree with this change. This buffer should not exist, and even 100 packets is a stretch. Any module hitting this needs to recondiser how it deals with flow control
I would agree with you that the buffer should not exist, but removing the buffer goes well beyond this patch. The worst part about the buffer is the size is not visible to the sender. It is a really bad design, but for now we need to get around the immediate problem. Later we can discuss how and who will fix this right.
We've discussed the proposal in the past to increase the size, but unfortunately we have not came to a resolution. We absolutely need to resolve this now. We cannot use the ruby tester with our upcoming GPU model without this change.
The important thing to keep in mind is that the RubyTester is designed to stress the protocol logic and it creates contention scenarios that would not exist in a real system. The RubyTester finds bugs in the matter of seconds that may not be encountered in hours of real workload simulation. It does this by sending large bursts of racey requests. Bottomline: the RubyTester needs this large value.
I would argue that if you need flow control, you should not use the QueuedPorts, but rather use a "bare-metal" MasterPort and deal with the flow control yourself. There is no point in adding flow control (or buffer visibility) to the QueuedPort in my opinion.
I'd suggest switching to a MasterPort either as part of this patch, or in a patch before it. That way you have no implicit buffer, and no need to create kilobytes of invisible buffering in the system.
The RubyPort and the Ruby Tester predate MasterPorts and QueuedPorts. Your suggestion goes far beyond this patch. Reimplementing RubyPort's m5 ports to inherit from a different base port is a very significant change with many, many implications beyond the Ruby Tester. That is a pretty unreasonable request.
Please keep in mind that I was not the one who decided that RubyPort should use QueuedPorts. That decision was made back in 2012 with patches from your group, 8914:8c3bd7bea667 and 8922:17f037ad8918.
Is the queued port really supposed to be a high-fidelity model of anything in the "real" world? I was under the impression that it was a courtesy wrapper around the port so that we don't have 10 different implementations of the same general thing.
IMO, if you are trying to model an on-chip network at high-fidelity you need something more than what is provided by this model anyway. It seems to me that in this case, the queued port is a gem5 interface decision, not a hardware system design decision. Therefore, infinite buffering is not a major problem.
What if this were turned into a "warn" or a "warn_once"? It seems like a good idea to let the user know that the queue is growing wildly, as that might be an indication of a deadlock.
(Note: IIRC with our high-bandwidth GPU model we've run into this 100 packet limit as well.)
1) The 100 packet sanity check should definitely not be changed as part of this patch. If we do want to change it, it should be a patch on its own.
2) If there are somehow 10000 responses queued up, then something is truly _fundamentally_ broken, and having a panic telling the user it is broken is helpful imho. Do you not agree?
The QueuedPort is an evolution of the SimpleTimingPort that was used earlier, and I believe the RubyPort has always had this issue (it borrowed a lot of not-so-great design decisions from the Bus for example). The QueuedPort should really only be used as a "lazy" way of implementing non-performance critical devices. I've been meaning to remove it from the classic Caches as well, just not gotten around to it.
In the case of the RubyPort, I do not see the big issue of moving away from QueuedPorts. We have already done so in the Bridge, SimpleMemory etc. The only implication is that we need to deal with the port flow control (retry), and pass that downstream/upstream. Since the RubyPort really does not correspond to any real physical object, with any buffers, I think the case is clear: It should not do any buffering behind the user's back. Do you not think this is reasonable?
Yes, if there are 10000 responses queued up, then there might be something broken. However, 100 responses can be queued up rather easily when running with the Ruby Tester or with a GPU. That is why this patch changes the sanity check from 100 to 10000.
RubyPort is used by a lot of users, with many different scenarios/protocols. Moving it away from QueuedPorts will fundamentally change its behavior. That should be done carefully, and I don't want to delay this bug fix with a project that large.
Would you be happy if we introduced a patch that modifies the QueuedSlavePort, RespPacketQueue, and PacketQueue constructors so that the RubyPort can set the transmitList threshold to 10000 and the default can still be set to 100?
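A minimal sketch of that proposal, with made-up names standing in for the QueuedSlavePort/RespPacketQueue/PacketQueue plumbing the real patch would have to thread the value through:

    #include <cassert>
    #include <cstddef>
    #include <list>

    struct Packet {};

    // Hypothetical queue whose owner picks the sanity limit at construction
    // time, while everyone else keeps the old default of 100.
    class ConfigurableToyQueue {
        std::list<Packet*> transmitList;
        const std::size_t maxDeferredPackets;

      public:
        explicit ConfigurableToyQueue(std::size_t max_deferred = 100)
            : maxDeferredPackets(max_deferred) {}

        void schedSend(Packet* pkt) {
            transmitList.push_back(pkt);
            assert(transmitList.size() <= maxDeferredPackets &&
                   "deferred-packet list grew beyond the configured limit");
        }
    };

    int main() {
        ConfigurableToyQueue classic_side;       // keeps the default of 100
        ConfigurableToyQueue ruby_tester(10000); // a tester/GPU port opts in
        Packet p;
        classic_side.schedSend(&p);
        ruby_tester.schedSend(&p);
    }

Classic-side users would never notice the extra parameter; only objects that explicitly opt in would see a larger limit.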
Can we simply leave this change out of this patch? I think the right solution here is to fix the dodgy assumptions on implicit buffering, and I'm happy to have a go myself (given that I've done something similar for a range of other modules).
The RubyTester and our soon-to-be-posted HSA-compatible GPU will not work without this change. I believe many Ruby users, especially Ruby users evaluating GPUs, have had this type of change in their personal patch queues for a very long time. We need to get this problem out of our personal patch queues and into the main line.
I appreciate your offer to redesign the RubyPort yourself, but even if that were to happen, I would prefer to get this change in first. A redesign of the RubyPort will require a lot of downstream work to verify things still work as expected.
Why are you so against changing this somewhat arbitrary value, or at least allowing someone to change it? As Jason pointed out, the queued port is an infinite buffer and doesn't represent a real hardware structure. Why so much resistance? This small change really benefits Ruby users, and if we make it configurable, it will have no impact on Classic users.
I fundamentally object to the change since it effectively adds an implicit infinite buffer (in quite a few places), thus changing the very behaviour of the system and potentially masking problems.
The RubyPort should really just be a glue-logic adaptor. Agreed? In a system architecture, there is no such thing as a RubyPort, and there should not be kilobytes of "magic" buffering imposed on the user. If this buffer is needed, then we should instantiate a buffer, cache, etc. I do not understand the statement that this is needed to make things work. If you need an infinite buffer to make things work, then the solution is broken. Am I missing something?
I would propose to simply remove the QueuedPort, make it inherit from the Master/SlavePort directly, remove the implicit buffer altogether, and simply relay the retry to/from Ruby. Surely Ruby understands flow control...does it not?
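In miniature, the retry-based alternative being described looks something like the following; this is a self-contained sketch of the handshake, not the actual MasterPort/SlavePort API, whose method names and timing semantics differ in detail:

    #include <cassert>
    #include <cstddef>
    #include <queue>

    struct Packet {};

    class Sender;

    // Receiver with a small, explicit buffer: it may refuse a packet, and it
    // promises to wake the refused sender once space frees up.
    class Receiver {
        std::queue<Packet*> buffer;
        const std::size_t capacity = 4;  // finite, visible buffering
        Sender* waiting = nullptr;

      public:
        bool recvTiming(Packet* pkt, Sender* from);
        void drainOne();  // models the consumer making forward progress
    };

    // Sender that holds back at most one refused packet and resends it on
    // retry, instead of parking an unbounded backlog inside the port.
    class Sender {
        Packet* stalled = nullptr;

      public:
        void send(Receiver& r, Packet* pkt) {
            assert(!stalled && "must wait for a retry before sending again");
            if (!r.recvTiming(pkt, this))
                stalled = pkt;  // refused: wait for the receiver's retry
        }

        void recvRetry(Receiver& r) {
            Packet* pkt = stalled;
            stalled = nullptr;
            if (pkt)
                send(r, pkt);
        }
    };

    bool Receiver::recvTiming(Packet* pkt, Sender* from) {
        if (buffer.size() >= capacity) {  // no room: refuse, remember who asked
            waiting = from;
            return false;
        }
        buffer.push(pkt);
        return true;
    }

    void Receiver::drainOne() {
        if (!buffer.empty())
            buffer.pop();
        if (waiting) {
            Sender* s = waiting;
            waiting = nullptr;
            s->recvRetry(*this);
        }
    }

    int main() {
        Receiver r;
        Sender s;
        Packet p;
        for (int i = 0; i < 5; ++i)
            s.send(r, &p);  // the fifth send is refused and held back
        r.drainOne();       // frees a slot; the receiver retries the sender
    }

The open question raised later in the thread is whether Ruby's hit_callback path can tolerate the "refuse and wait for a retry" half of this handshake.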
Getting rid of implicit infinite buffering is a noble and appropriate long-term goal, I agree. However, in the near term we already have implicit infinite buffering, and the only thing we're debating here is at what arbitrary point we warn the user that the use of that infinite buffer is potentially getting abusive. To me it seems obvious that the fundamental near-term problem is that we have this totally arbitrary constant embedded in the code and no way to adjust it for the different circumstances in which different values would be appropriate. I think Brad's suggestion that we make this threshold configurable and then find some way to propagate the value downward seems like the right way to go for now; we can leave the threshold at 100 for CPU-based systems and crank it up for GPU-based systems, with maybe a third, even higher threshold specifically for when we run the RubyTester.
I'd rather have one arbitrary constant than three :-), and the lower the better.
Can we just proceed with this patch without this change? At least then I can perhaps get an appreciation for the issue (if you can point me to an example that fails). Right now I really do not understand how this could ever be a problem in any sensible setup.
I agree with Andreas on this one. I don't feel that an "arbitrary" constant should be changed to another "arbitrary" constant in a patch that is trying to fix up the RubyTester. There must be configurations of the RubyTester that will work without increasing the threshold to 10000, so the RubyTester changes will still work for some configurations. These are largely orthogonal issues.
Split the threshold change into a separate patch where we can focus on debating the merits of changes between so-called "arbitrary" buffering limits (e.g. gem5-gpu doesn't require any changes to the RubyPort or QueuedPort buffering thresholds).
We can certainly split this change into a separate patch. It's true that 10,000 is just another arbitrary value; it may make more sense to just add a parameter that lets us turn off this threshold check altogether rather than create another arbitrary constant. Note that the RubyTester is not trying to be a "sensible setup", it's a stress tester that is stressing the protocol implementation beyond the limits of what a "sensible" configuration might be able to do.
I have no problem moving this change to a separate patch, but we still need to provide flexibility on either setting this arbitrary value or turning it off. Leaving it as is will not work for us. We have important configurations of the RubyTester and other internal testers that reach the 100 packet threshold.
I don't want to create a separate patch only to spawn off another week of high-level resistance. Can you confirm that if we create a patch that allows an object to configure the threshold or turn it off, that you will approve it? Do you have a preference between those two options?
Sure.
The RubyTester (or other testers) is certainly an exceptional case for coherence and consistency incantations aimed at finding bugs. From what I've read about the incoming GPU memory hierarchies in these reviews, however, I'm unsure whether I can be convinced that a GPU core's sequencer should be allowed to queue more than order(100) packets. Hitting this limit with real system components seems like a major red flag that your simulated system is very unbalanced and needs to be reconfigured. We definitely need to see the GPU code to understand that more deeply.
As a separate review request, I would prefer that we add an option to allow very large but finite buffering in all PacketQueue instances of a single simulation. Allowing infinite buffering is likely a bad idea for debugging (imagine a Ruby livelock that just fills a RubyPort response buffer that never unblocks). I'd also prefer comments on that option that strongly suggest it should only be used for exceptional cases, such as testers. Such an option should *not* be exposed on the command line due to obvious potential misuse by a naive user, but instead should be set by the RubyTester simulation config file. I would likely oppose a per-instance PacketQueue buffering limit, because that would just encourage strange, improper use of fake buffering. Until we can actually see your GPU code, I feel that the option should only be allowed to be used with the RubyTester in mainline gem5 code.
Thanks, Joel, for voicing your preference for the finite buffering case. Once Andreas approves as well, I'll implement and post that patch.
As far as your concern about an unbalanced GPU system, please keep in mind that this buffer is for responses. Also, in our implementation each work-item issues a separate packet request that is coalesced in the RubyPort. All it takes is 2 wavefronts (2x64 work-items) waiting on the same few cache lines: when those lines arrive at the GPU L1 cache, 128 packets will be sent back toward the GPU, and the threshold will be exceeded. I do not think this is the result of an unbalanced system.
I would like to understand the architecture a bit better before simply increasing the limit. Again, I propose we get this patch pushed without this change, since I envision that the discussion has not quite converged.
From the brief description it sounds like the issue arises due to a rather unfortunate use of packets. Perhaps I am missing something, but why not simply add masking or do the coalescing before the "real" port? Are the GPU L1 caches not taking care of this already (or are the L1's in Ruby?).
I had a look at the RubyPort, and it seems all cases of QueuedPort can easily be removed and replaced simply by forwarding retries from one side to the other. The one exception I cannot figure out is the "hit_callback". How can we tell Ruby that a response needs to wait?
Yes, this discussion has not quite converged, but I think that is mostly due to your concerns. You say "rather unfortunate use of packets". Perhaps we used packets differently than you would have expected, but I do not believe that makes the use improper or unfortunate. Please be open to uses of gem5 beyond the ones you have previously considered.

The GPU L1 caches are modeled in Ruby. Memory requests are sequenced and coalesced in the RubyPort. This is no different from the other Ruby protocols.

The "hit_callback" is pretty fundamental to Ruby. It has been designed to always "sink" responses because the protocols are set up to reserve the resources ahead of time. I believe that is fairly representative of real hardware, and please note that resources like register ports are outside the realm of Ruby. Thus responses will stall inside the RubyPort.

Please let's not stall this necessary change on a fundamental redesign of Ruby.


- Brad


Andreas Hansson
2015-05-19 15:55:09 UTC
Permalink
I think you misunderstood my comment. I think the patch should go ahead...just without the change to the queue limit. No further changes are needed at this point.

I used the word "unfortunate" since the packet is a pretty heavy-weight data structure for performing byte strobes. It seems to me we should be able to come up with a much better way to solve this issue (if I have understood what it is you are trying to do). Again, I only see part of the overall picture of your use case, but we have had related cases internally, and I think we can avoid the overhead of having a packet per byte. Makes sense?

It is fine to assume that responses sink (eventually), and for deadlock reasons it makes sense to reserve space in advance. However, it is still important to model throughput in the ports, caches and interconnect. For this reason I do not find it too unreasonable to be able to rate-regulate responses (which we do in quite a lot of places). If Ruby is unable to do this, then I agree that we need an infinite buffer somewhere in the interaction with the "classic" MemObjects.

Again, please push the patch without the queue limit change, and we can revisit this point when we have a better picture of what the options are. Is that ok?


- Andreas


Joel Hestness
2015-05-19 15:55:55 UTC
Permalink
My first instinct is that it's a very bad idea to do GPU coalescing in the RubyPort. The RubyPort is a thin shim that already does too many things (and a couple of them poorly). However, without seeing the GPU code, I expect it would be hard for you to communicate the constraints on where to do coalescing (e.g. checkpointing, address translation, etc.).

Another aspect of this buffering problem is that you're changing the L1/L0 and Sequencer clock domains back to Ruby's clock. By default, Ruby's clock is 2GHz, and most GPUs have a lower frequency than this. If the GPU's eject width is, say, 1 packet per lane per GPU cycle, of course you're going to pile up packets within the RubyPort on the response path. This sounds unbalanced to me, especially on a response path.

@Andreas: The hit_callback is problematic: all current coherence protocols call into their RubyPort/Sequencer from their top-level cache controller (L0 or L1) when a memory access is complete. This is a direct call into the RubyPort rather than handing the accesses through a port-like interface. Because it is done this way, it is assumed that the RubyPort will not stall these calls. IMO, there should be a response buffer in each L0/L1 cache controller that applies the L0/L1 latency, and the RubyPort should consume from it like all other Ruby components. However, this would be a lot of work to change.
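A sketch of the kind of bounded response buffer Joel describes, purely illustrative and not how Ruby is structured today: the controller may only complete a transaction when there is room, so backpressure reaches the protocol instead of an invisible queue in the port.

    #include <cstddef>
    #include <deque>

    struct Response {};

    // Hypothetical bounded response buffer owned by the L0/L1 controller.
    class ResponseBuffer {
        std::deque<Response> entries;
        const std::size_t capacity;

      public:
        explicit ResponseBuffer(std::size_t cap) : capacity(cap) {}

        bool full() const { return entries.size() >= capacity; }

        // What the cache controller would call in place of an unconditional
        // hit_callback; returning false means the controller must stall.
        bool push(const Response& r) {
            if (full())
                return false;
            entries.push_back(r);
            return true;
        }

        // What the port/sequencer side calls when it is ready to hand a
        // response back to the core (for example, one per port cycle).
        bool pop(Response& out) {
            if (entries.empty())
                return false;
            out = entries.front();
            entries.pop_front();
            return true;
        }
    };

    int main() {
        ResponseBuffer buf(4);
        Response r;
        while (buf.push(Response())) {}  // the controller stalls once full
        buf.pop(r);                      // the port drains at its own pace...
        buf.push(Response());            // ...which lets the controller resume
    }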


- Joel


Nilay Vaish
2015-05-19 16:08:11 UTC
Permalink
In my opinion those changes to the clock domains are incorrect. As I see it, AMD is trying to fix a problem with the ruby tester by changing unrelated code.

--
Nilay
Brad Beckmann
2015-05-19 16:36:21 UTC
Permalink
Post by Brad Beckmann
src/mem/packet_queue.cc, line 117
<http://reviews.gem5.org/r/2776/diff/1/?file=45138#file45138line117>
I strongly disagree with this change. This buffer should not exist, and even 100 packets is a stretch. Any module hitting this needs to recondiser how it deals with flow control
I would agree with you that the buffer should not exist, but removing the buffer goes well beyond this patch. The worst part about the buffer is the size is not visible to the sender. It is a really bad design, but for now we need to get around the immediate problem. Later we can discuss how and who will fix this right.
We've discussed the proposal in the past to increase the size, but unfortunately we have not came to a resolution. We absolutely need to resolve this now. We cannot use the ruby tester with our upcoming GPU model without this change.
The important thing to keep in mind is that the RubyTester is designed to stress the protocol logic and it creates contention scenarios that would not exist in a real system. The RubyTester finds bugs in the matter of seconds that may not be encountered in hours of real workload simulation. It does this by sending large bursts of racey requests. Bottomline: the RubyTester needs this large value.
I would argue that if you need flow control, you should not use the QueuedPorts, and rather use a "bare-metal" MasterPort, and deal with the flow control. There is no point in adding flow control (or buffer visibility) to the QueuePort in my opinion.
I'd suggest to switch to a MasterPort either as part of this patch, or a patch before it. That way you have no implicit buffer, and no need to create kilobytes of invisible buffering in the system.
The RubyPort and the Ruby Tester predate MasterPorts and QueuedPorts. Your suggestion goes far beyond this patch. Reimplementing RubyPort's m5 ports to inherit from a different base port is a very signficant change with many, many implications beyond the Ruby Tester. That is a pretty unreasonable request.
Please keep in mind that I was not the one who decided RubyPort to use QueuedPorts. That decision was made back in 2012 with patches from your group 8914:8c3bd7bea667, and 8922:17f037ad8918.
Is the queued port really supposed to be a high-fidelity model of anything in the "real" world? I was under the impression that it was a courtesy wrapper around the port so that we don't have 10 different implementations of the same general thing.
IMO, if you are trying to model an on-chip network at high-fidelity you need something more than what is provided by this model anyway. It seems to me that in this case, the queued port is a gem5 interface decision, not a hardware system design decision. Therefore, infinite buffering is not a major problem.
What if this was turned into a "warn" or a "warn_once"? It seems like a good idea to let the user know that the queue is growing wildly as that might be an indication of a deadlock.
(Note: IIRC with our high-bandwidth GPU model we've run into this 100 packet limit as well.)
1) The 100 packet sanity check should definitely not be changed as part of this patch. If we do want to change it, it should be a patch on its own.
2) If there are somehow 10000 responses queued up, then something is truly _fundamentally_ broken, and having a panic telling the user it is broken is helpful imho. Do you not agree?
The QueuedPort is an evolution of the SimpleTimingPort that was used earlier, and I believe the RubyPort has always had this issue (it borrowed a lot of not-so-great design decisions from the Bus for example). The QueuedPort should really only be used as a "lazy" way of implementing non-performance critical devices. I've been meaning to remove it from the classic Caches as well, just not gotten around to it.
In the case of the RubyPort, I do not see the big issue of moving away from QueuedPorts. We have already done so in the Bridge, SimpleMemory etc. The only implication is that we need to deal with the port flow control (retry), and pass that downstream/upstream. Since the RubyPort really does not correspond to any real physical object, with any buffers, I think the case is clear: It should not do any buffering behind the user's back. Do you not think this is reasonable?
Yes, if there is 10000 responses queued up, then there might be something broken. However, 100 responses can be queued up rather easily when running with the Ruby Tester or with a GPU. That is why this patch changes the sanity check from 100 to 10000.
RubyPort is used by a lot of users, with many different scenarios/protocols. Moving it away from QueuedPorts will fundamentally change its behavior. That should be done carefully and I don't want to delay this bug fix with a project that large.
Would you be happy if we introduced a patch that modifies the QueuedSlavePort, RespPacketQueue, and PacketQueue constructors so that the RubyPort can set the transmitList threshold to 10000 and the default can still be set to 100?
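To make that concrete, here is roughly the shape I have in mind (an illustrative stand-in, not the actual PacketQueue/RubyPort code; the class name, the parameter, and the "0 disables the check" convention are placeholders for discussion, not part of the patch):

#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <deque>

struct FakePacket { int id; };

// Stand-in for a packet queue whose sanity-check threshold is a constructor
// argument instead of a hard-coded 100. A threshold of 0 disables the check,
// which is the "tester-only" escape hatch discussed in this thread.
class BoundedTransmitList
{
  public:
    explicit BoundedTransmitList(std::size_t threshold = 100)
        : sanityThreshold(threshold) {}

    void push(FakePacket pkt)
    {
        if (sanityThreshold != 0 && transmitList.size() > sanityThreshold) {
            std::fprintf(stderr, "packet queue exceeded %zu packets\n",
                         sanityThreshold);
            std::abort(); // stands in for gem5's panic()
        }
        transmitList.push_back(pkt);
    }

    std::size_t size() const { return transmitList.size(); }

  private:
    std::deque<FakePacket> transmitList;
    std::size_t sanityThreshold;
};

int main()
{
    BoundedTransmitList cpuQueue;           // default: complain above 100
    BoundedTransmitList testerQueue(10000); // RubyTester-style configuration

    cpuQueue.push(FakePacket{0});
    for (int i = 0; i < 200; ++i)
        testerQueue.push(FakePacket{i});    // fine: well under 10000

    assert(cpuQueue.size() == 1);
    assert(testerQueue.size() == 200);
    return 0;
}

The default would stay at 100 for everything else; only the RubyTester (or similarly exceptional) configurations would pass a larger value or disable the check.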
Can we simply leave this change out of this patch? I think the right solution here is to fix the dodgy assumptions on implicit buffering, and I'm happy to have a go myself (given that I've done something similar for a range of other modules).
The RubyTester and our soon-to-be-posted HSA-compatible GPU will not work without this change. I believe many Ruby users, especially Ruby users evaluating GPUs, have had this type of change in their personal patch queues for a very long time. We need to get this problem out of our personal patch queues and into the main line.
I appreciate your offer to redesign the RubyPort yourself, but even if that were to happen, I would prefer to get this change in first. A redesign of the RubyPort will require a lot of downstream work to verify things still work as expected.
Why are you so against changing this somewhat arbitrary value, or at least allowing someone to change it? As Jason pointed out, the queued port is an infinite buffer and doesn't represent a real hardware structure. Why so much resistance? This small change really benefits Ruby users, and if we make it configurable, it will have no impact on Classic users.
I fundamentally object to the change since it effectively adds an implicit infinite buffer (in quite some places), thus changing the very behaviour of the system and potentially masking problems.
The RubyPort should really just be a glue-logic adaptor. Agreed? In a system architecture, there is no such thing as a RubyPort, and there should not be kilobytes of "magic" buffering imposed on the user. If this buffer is needed, then we should instantiate a buffer, cache, etc. I do not understand the statement that this is needed to make things work. If you need an infinite buffer to make things work, then the solution is broken. Am I missing something?
I would propose to simply remove the QueuedPort, make it inherit from the Master/SlavePort directly, remove the implicit buffer altogether, and simply relay the retry to/from Ruby. Surely Ruby understands flow control...does it not?
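To illustrate the shape of what I mean, a toy sketch (this is not gem5 code; the class and method names only mimic the port interface for the sake of discussion): the adaptor attempts to send the response, and if the peer refuses, it holds exactly one stalled packet and waits for a retry callback instead of accumulating an invisible queue.

#include <cstdio>

struct Packet { int id; };

// The receiving side: may refuse a response when it has no space.
struct Peer
{
    bool busy = false;

    bool recvTimingResp(const Packet &pkt)
    {
        if (busy)
            return false; // no hidden buffering: tell the sender to wait
        std::printf("peer accepted response %d\n", pkt.id);
        return true;
    }
};

// The adaptor (a "RubyPort-like" shim): relays back-pressure instead of
// queueing responses behind the user's back.
class Adaptor
{
  public:
    explicit Adaptor(Peer &p) : peer(p) {}

    // Called when Ruby completes an access (stand-in for the hit callback).
    void respond(const Packet &pkt)
    {
        if (!peer.recvTimingResp(pkt)) {
            stalledPkt = pkt;   // hold exactly one response...
            hasStalled = true;  // ...and Ruby must hold off until the retry
        }
    }

    // Called by the peer when space frees up (stand-in for recvRespRetry()).
    void recvRespRetry()
    {
        if (hasStalled && peer.recvTimingResp(stalledPkt))
            hasStalled = false;
    }

  private:
    Peer &peer;
    Packet stalledPkt{};
    bool hasStalled = false;
};

int main()
{
    Peer peer;
    Adaptor adaptor(peer);
    peer.busy = true;
    adaptor.respond(Packet{1});  // refused: held as the single stalled response
    peer.busy = false;
    adaptor.recvRespRetry();     // retried and accepted
    return 0;
}

The one thing such an adaptor needs is a way to tell Ruby to hold off until the retry arrives.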
Getting rid of implicit infinite buffering is a noble and appropriate long-term goal, I agree. However, in the near term we already have implicit infinite buffering, and the only thing we're debating here is at what arbitrary point we warn the user that use of that infinite buffer is potentially getting abusive. To me it seems obvious that the fundamental near-term problem is that we have this totally arbitrary constant embedded in the code and no way to adjust it based on the different circumstances in which different values would be appropriate. I think Brad's suggestion that we make this threshold configurable and then find some way to propagate the value downward seems like the right way to go for now; we can leave the threshold at 100 for CPU-based systems and crank it up for GPU-based systems, with maybe a third, even higher threshold specifically for when we run the RubyTester.
I'd rather have one arbitrary constant than three :-), and the lower the better.
Can we just proceed with this patch without this change? At least then I can perhaps get an appreciation for the issue (if you can point me to an example that fails). Right now I really do not understand how this could ever be a problem in any sensible setup.
I agree with Andreas on this one. I don't feel that an "arbitrary" constant should be changed to another "arbitrary" constant in a patch that is trying to fix up the RubyTester. There must be configurations of the RubyTester that will work without increasing the threshold to 10000, so the RubyTester changes will still work for some configurations. These are largely orthogonal issues.
Split the threshold change into a separate patch where we can focus on debating the merits of changes between so-called "arbitrary" buffering limits (e.g. gem5-gpu doesn't require any changes to the RubyPort or QueuedPort buffering thresholds).
We can certainly split this change into a separate patch. It's true that 10,000 is just another arbitrary value; it may make more sense to just add a parameter that lets us turn off this threshold check altogether rather than create another arbitrary constant. Note that the RubyTester is not trying to be a "sensible setup", it's a stress tester that is stressing the protocol implementation beyond the limits of what a "sensible" configuration might be able to do.
I have no problem moving this change to a separate patch, but we still need to provide flexibility on either setting this arbitrary value or turning it off. Leaving it as is will not work for us. We have important configurations of the RubyTester and other internal testers that reach the 100 packet threshold.
I don't want to create a separate patch only to spawn off another week of high-level resistance. Can you confirm that if we create a patch that allows an object to configure the threshold or turn it off, that you will approve it? Do you have a preference between those two options?
Sure.
The RubyTester (or other testers) is certainly an exceptional case for coherence and consistency incantations aimed at finding bugs. From what I've read about the incoming GPU memory hierarchies in these reviews, however, I'm unsure whether I can be convinced that a GPU core's sequencer should be allowed to queue more than order(100) packets. Hitting this limit with real system components seems like a major red flag that your simulated system is very unbalanced and needs to be reconfigured. We definitely need to see the GPU code to understand that more deeply.
As a separate review request, I would prefer that we add an option to allow very large but finite buffering in all PacketQueue instances of a single simulation. Allowing infinite buffering is likely a bad idea for debugging (imagine a Ruby livelock that just fills a RubyPort response buffer that never unblocks). I'd also prefer comments on that option that strongly suggest it should only be used for exceptional cases, such as testers. Such an option should *not* be exposed on the command line due to obvious potential misuse by a naive user, but instead should be set by the RubyTester simulation config file. I would likely oppose a per-instance PacketQueue buffering limit, because that would just encourage strange, improper use of fake buffering. Until we can actually see your GPU code, I feel that the option should only be allowed to be used with the RubyTester in mainline gem5 code.
Thanks Joel for voicing your preference for the finite buffering case. Once Andreas approves as well, I'll implement and post that patch.
As far as your concern with an unbalanced GPU system, please keep in mind this buffer is for responses. Also, in our implementation each work-item issues a separate packet request that is coalesced in the RubyPort. All it takes is 2 wavefronts (2x64 work-items) to be waiting on the same few cache lines. When those cache lines arrive at the GPU L1 cache, 128 packets will be sent back to the GPU, and thus the threshold will be exceeded. I do not think this is a result of an unbalanced system.
I would like to understand the architecture a bit better before simply increasing the limit. Again, I propose we get this patch pushed without this change, since I envision that the discussion has not quite converged.
From the brief description it sounds like the issue arises due to a rather unfortunate use of packets. Perhaps I am missing something, but why not simply add masking or do the coalescing before the "real" port? Are the GPU L1 caches not taking care of this already (or are the L1s in Ruby)?
I had a look at the RubyPort, and it seems all cases of QueuedPort can easily be removed and replaced simply by forwarding retries from one side to the other. The one exception I cannot figure out is the "hit_callback". How can we tell Ruby that a response needs to wait?
Yes, this discussion has not quite converged, but I think that is mostly due to your concerns. You say "rather unfortunate use of packets". Perhaps we used packets differently than you would have expected, but I do not believe that makes the use improper or unfortunate. Please be open to uses of gem5 beyond the ones you have previously considered.
The GPU L1 caches are modeled in Ruby. Memory requests are sequenced and coalesced in the RubyPort. This is no different than the other Ruby protocols.
The "hit_callback" is pretty fundamental to Ruby. It has been designed to always "sink" responses because the protocols are set up to reserve the resources ahead of time. I believe that is fairly representative of real hardware and please note resources like register ports are outside of the realm of Ruby. Thus responses will stall inside the RubyPort.
Please let's not stall this necessary change on a fundamental redesign of Ruby.
I think you misunderstood my comment. I think the patch should go ahead...just without the change to the queue limit. No further changes needed at this point.
I used the word "unfortunate" since the packet is a pretty heavy-weight data structure for performing byte strobes. It seems to me we should be able to come up with a much better way to solve this issue (if I have understood what it is you are trying to do). Again, I only see part of the overall picture of your use-case, but we have had related cases internally, and I think we can avoid the overheads of having a packet per byte. Makes sense?
It is fine to assume that responses sink (eventually), and for deadlock reasons it makes sense to reserve space in advance. However, it is still important to model throughput in the ports, caches and interconnect. For this reason I do not find it too unreasonable to be able to rate-regulate responses (which we do in quite a lot of places). If Ruby is unable to do this then I agree that we need an infinite buffer somewhere in the interaction with the "classic" MemObjects.
Again, please push the patch without the queue limit change, and we can revisit this point when we have a better picture of what the options are. Is that ok?
My first instinct is that it's a very bad idea to do GPU coalescing in the RubyPort. The RubyPort is a thin shim that already does too many things (and poorly in a couple cases). However, without seeing the GPU code, I expect it would be hard for you to communicate the constraints on where to do coalescing (e.g. checkpointing, address translation, etc.).
Another aspect to this buffering problem is that you're changing the L1/L0 and Sequencer clock domains back to Ruby's clock. By default, Ruby's clock is 2GHz and most GPUs have lower frequency than this. If the GPU's eject width is, say, 1 packet per lane per GPU cycle, of course you're going to pile up packets within the RubyPort on the response path. This sounds unbalanced to me, especially on a response path.
@Andreas: The hit_callback is problematic: All current coherence protocols call into their RubyPort/Sequencer from their top-level cache controller (L0 or L1) when a memory access is complete. This is a direct call into RubyPort rather than handing the accesses through a port-like interface. Because it is done this way, it is assumed that the RubyPort will not stall these calls. IMO, there should be a response buffer in each L0/L1 cache controller that assesses the L0/L1 latency, and the RubyPort should consume from it like all other Ruby components. However, this would be a lot of work to change.
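In other words, the direction I have in mind looks roughly like this (purely illustrative; no current protocol or controller is written this way, and the names are made up): the controller owns a bounded response buffer and can only complete an access if there is room, while the port drains the buffer at whatever rate the downstream side allows.

#include <cstddef>
#include <cstdio>
#include <deque>

struct Response { int id; };

// Illustrative bounded response buffer owned by an L0/L1-like controller.
class ControllerResponseBuffer
{
  public:
    explicit ControllerResponseBuffer(std::size_t capacity)
        : capacity(capacity) {}

    // The controller can only complete an access if there is room for the
    // response; otherwise it must stall the completion itself.
    bool tryPush(const Response &r)
    {
        if (buffer.size() >= capacity)
            return false;
        buffer.push_back(r);
        return true;
    }

    // The port consumes responses as the downstream side permits.
    bool tryPop(Response &r)
    {
        if (buffer.empty())
            return false;
        r = buffer.front();
        buffer.pop_front();
        return true;
    }

  private:
    std::deque<Response> buffer;
    std::size_t capacity;
};

int main()
{
    ControllerResponseBuffer buf(2);
    buf.tryPush(Response{1});
    buf.tryPush(Response{2});
    if (!buf.tryPush(Response{3}))
        std::printf("controller must stall: buffer full\n");

    Response r;
    while (buf.tryPop(r))
        std::printf("port delivered response %d\n", r.id);
    return 0;
}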
Without changing the queue threshold or providing a feature that allows one to at least adjust or disable it, we are not resolving the problem, we are just delaying it.

The RubyPort is not a thin shim. I do not believe the Sequencer has ever been a thin shim, even in the GEMS days. It has always been a key component for handling synchronization and gluing cache-block memory requests to instruction-granular operations. I do not agree that GPU coalescing in the RubyPort is a "very bad idea". If we were to implement it the other way, we would be arguing over a lot of extensive changes to packet and request. Furthermore, we would not be able to implement per-protocol synchronization mechanisms without having to customize the GPU core for every protocol.

This behavior has very little to do with the ratio of frequencies. If you are concerned with passing the Ruby clock frequency to the L1 caches, would you be happy if I added an L1 cache frequency parameter to create_system, a separate L1 freq value to ruby_system, or added logic that checks how many CPUs are created before trying to reference the cpu's clk domain value? I'm happy to do any of those options. I'm just trying to fix the current bug when more than one CPU is specified when using the RubyTester.


- Brad


Nilay Vaish
2015-05-20 14:40:19 UTC
Permalink
Brad, I am surprised that you are proposing that we create another special case here to handle the problem that exists with the tester. I am not in favor of it. It is the same approach that AMD is taking with SLICC as well: almost everything that AMD wants to do ends up as a new special case in the compiler.

Again, I am not in favor of putting special cases in the config files. I don't think I can convince AMD that this would lead to further problems in the future. I leave it to AMD to do whatever they want to do. I don't want to continue with the discussion.

--
Nilay
Joel Hestness
2015-05-19 18:18:26 UTC
Permalink
Post by Brad Beckmann
src/mem/packet_queue.cc, line 117
<http://reviews.gem5.org/r/2776/diff/1/?file=45138#file45138line117>
I strongly disagree with this change. This buffer should not exist, and even 100 packets is a stretch. Any module hitting this needs to reconsider how it deals with flow control.
I would agree with you that the buffer should not exist, but removing the buffer goes well beyond this patch. The worst part about the buffer is that its size is not visible to the sender. It is a really bad design, but for now we need to get around the immediate problem. Later we can discuss how, and by whom, this will be fixed properly.
We've discussed the proposal to increase the size in the past, but unfortunately we have not come to a resolution. We absolutely need to resolve this now. We cannot use the ruby tester with our upcoming GPU model without this change.
Without changing the queue threshold or providing a feature that allows one to at least adjust or disable it, we are not resolving the problem, we are just delaying it.
The RubyPort is not a thin shim. I do not believe the Sequencer has ever been a thin shim, even in the GEMS days. It has always been a key component for handling synchronization and gluing cache-block memory requests to instruction-granular operations. I do not agree that GPU coalescing in the RubyPort is a "very bad idea". If we were to implement it the other way, we would be arguing over a lot of extensive changes to packet and request. Furthermore, we would not be able to implement per-protocol synchronization mechanisms without having to customize the GPU core for every protocol.
This behavior has very little to do with the ratio of frequencies. If you are concerned with passing the Ruby clock frequency to the L1 caches, would you be happy if I added an L1 cache frequency parameter to create_system, a separate L1 freq value to ruby_system, or added logic that checks how many CPUs are created before trying to reference the cpu's clk domain value? I'm happy to do any of those options. I'm just trying to fix the current bug when more than one CPU is specified when using the RubyTester.
The RubyPort/Sequencer *absolutely is* a thin shim. It is very clearly structured as just a set of ports that communicate (i.e., translate messages) between compute cores and Ruby controllers. Its three primary responsibilities are:
1) Translate gem5 packetized requests to Ruby's internal message structure and transmit them to L1 controllers
2) Block accesses as appropriate for coherence/synchronization (e.g. currently, multiple requests to a single line, and LL/SC)
3) Translate messages (e.g. evicts) and responses from Ruby controllers back to gem5 packets for the cores
Outside of these three things, the RubyPort only does simulation management (e.g. stats handling, checkpointing Ruby caches). In other words, it's just the interface between cores and Ruby caches. Further, it's not clear that the RubyPort's activities exist in real systems, or, if they do, that there is a special component for them. Coalescers, on the other hand, are very real components of GPU hardware, so on its face it sounds like improper organization to do coalescing in the RubyPort.

Now, I hesitated raising this concern previously, because the debate about RubyPort responsibility is way outside the scope of this patch. For instance, I can't make any sense of the claims you make about implementing coalescing another way, because I don't even know how you've implemented it in the first place. We *MUST* see GPU code before we can have a legitimate debate about appropriate responsibilities of the RubyPort when GPUs exist in gem5.

Back to the buffering issue: It's quite clear in the code that the RubyPort (and Sequencer) does not adequately account for frequency differences between itself and the requesting core. The Sequencer decrements outstanding requests as soon as the hit callback is called, but then queues the responses in the Sequencer::MemSlavePort. This means that the requesting core can issue further accesses even if there isn't adequate buffering for their responses; The RubyPort won't block on too many outstanding accesses (i.e. not enough reserved response buffers). The Sequencer frequency change almost certainly changes the amount of required buffering in the Sequencer::MemSlavePort.

For setting the Sequencer/L0/L1 frequencies, I might be OK with logic in the protocol config files that checks whether you're simulating with a tester vs. CPU/GPU cores. My only strong opinion is that, when simulating a real system, the L1 caches should be clocked at core frequency rather than uncore (Ruby) frequency (i.e. like basically all existing hardware). However, Nilay seemed to have strong feelings on how to do this, and I have not reviewed that full conversation. I'd prefer to hear his opinion.


- Joel


Jason Power
2015-05-12 15:24:38 UTC
Permalink
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2776/#review6171
-----------------------------------------------------------



configs/ruby/MESI_Three_Level.py (line 106)
<http://reviews.gem5.org/r/2776/#comment5361>

Should these changes be in a different patch since they are orthogonal to the Ruby tester?

I would be OK to keep them here, but you should update the commit message to say you change the clock domain for all of the controllers.



src/mem/ruby/system/RubyPort.cc (line 291)
<http://reviews.gem5.org/r/2776/#comment5362>

I don't understand why the random tester doesn't want to retry failed requests. Could you explain why this feature is necessary? It's been a few years since I've looked deeply at the random tester. I'll hold off on broadcasting my opinion on this until I understand it better :).

Again, this may be more appropriate as a separate patch, but I'm not going to push hard on this.



src/mem/ruby/system/RubyPort.cc (line 373)
<http://reviews.gem5.org/r/2776/#comment5360>

Did you mean to call the new sendRetries function here?



src/mem/ruby/system/RubyPort.cc (line 400)
<http://reviews.gem5.org/r/2776/#comment5358>

Does this function ever get called?

Is this in one of the (many many) patches I haven't gotten to yet, or should it be called from above?



src/mem/ruby/system/RubyPort.cc (line 406)
<http://reviews.gem5.org/r/2776/#comment5359>

Does it make sense to call this function when the retryList is empty? Why not have an assert instead of an if statement which guards the entire function?

I suggest either changing the name (trySendRetries), or moving the if statement outside of this function and using an assert within the function instead.
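For concreteness, the two variants could look roughly like this (a sketch only; the actual retryList type and member names in the patch may well differ):

#include <cassert>
#include <cstdio>
#include <vector>

struct Port
{
    int id;
    void sendRetryReq() { std::printf("retry sent to port %d\n", id); }
};

class RetryBookkeeping
{
  public:
    void rememberFailedSender(Port *p) { retryList.push_back(p); }

    // Variant 1: the name says "try", so calling it with an empty list
    // is a legal no-op and the guard lives inside the function.
    void trySendRetries()
    {
        for (Port *p : retryList)
            p->sendRetryReq();
        retryList.clear();
    }

    // Variant 2: the caller is responsible for checking emptiness, and the
    // function asserts that it was called correctly.
    bool haveRetries() const { return !retryList.empty(); }
    void sendRetries()
    {
        assert(!retryList.empty());
        for (Port *p : retryList)
            p->sendRetryReq();
        retryList.clear();
    }

  private:
    std::vector<Port *> retryList;
};

int main()
{
    RetryBookkeeping rb;
    rb.trySendRetries();      // fine even when nothing is pending

    Port p0{0};
    rb.rememberFailedSender(&p0);
    if (rb.haveRetries())
        rb.sendRetries();     // guarded call, assert inside
    return 0;
}

Either way works; the point is just that the name and the guard should agree.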


- Jason Power
Brad Beckmann
2015-05-12 19:19:04 UTC
Permalink
Post by Anthony Gutierrez
configs/ruby/MESI_Three_Level.py, line 106
<http://reviews.gem5.org/r/2776/diff/1/?file=45127#file45127line106>
Should these changes be in a different patch since they are orthogonal to the Ruby tester?
I would be OK to keep them here, but you should update the commit message to say you change the clock domain for all of the controllers.
Without this change, the Ruby tester does not work when multiple CPUs are specified. I believe this is the bug that Nilay discussed a few weeks ago.

We can update the commit message if you'd like.
Post by Anthony Gutierrez
src/mem/ruby/system/RubyPort.cc, line 298
<http://reviews.gem5.org/r/2776/diff/1/?file=45140#file45140line298>
I don't understand why the random tester doesn't want to retry failed requests. Could you explain why this feature is necessary? It's been a few years since I've looked deeply at the random tester. I'll hold off on broadcasting my opinion on this until I understand it better :).
Again, this may be more appropriate as a separate patch, but I'm not going to push hard on this.
The ruby tester was never designed to handle failed requests. The ruby tester is designed to send large amounts of racy requests to stress the protocol logic. It does this by avoiding flow control and retries.
Post by Anthony Gutierrez
src/mem/ruby/system/RubyPort.cc, line 380
<http://reviews.gem5.org/r/2776/diff/1/?file=45140#file45140line380>
Did you mean to call the new sendRetries function here?
Yes. Thanks for pointing that out. That change must have gotten lost in one of the patch merges/rebases.
Post by Anthony Gutierrez
src/mem/ruby/system/RubyPort.cc, line 407
<http://reviews.gem5.org/r/2776/diff/1/?file=45140#file45140line407>
Does this function ever get called?
Is this in one of the (many many) patches I haven't gotten to yet, or should it be called from above?
The call to this function does not exist in the set of patches that Tony posted yesterday, but the fix you mentioned above will call it. Also there will be upcoming patches that call it.
Post by Anthony Gutierrez
src/mem/ruby/system/RubyPort.cc, line 413
<http://reviews.gem5.org/r/2776/diff/1/?file=45140#file45140line413>
Does it make sense to call this function when the retryList is empty? Why not have an assert instead of an if statement which guards the entire function?
I suggest either changing the name (trySendRetries), or moving the if statement outside of this function and using an assert within the function instead.
Sure, we can change the name to trySendRetries.


- Brad


Jason Power
2015-05-12 22:17:56 UTC
Permalink
Post by Brad Beckmann
configs/ruby/MESI_Three_Level.py, line 106
<http://reviews.gem5.org/r/2776/diff/1/?file=45127#file45127line106>
Should these changes be in a different patch since they are orthogonal to the Ruby tester?
I would be OK to keep them here, but you should update the commit message to say you change the clock domain for all of the controllers.
Without this change, the Ruby tester does not work when multiple CPUs are specified. I believe this is the bug that Nilay discussed a few weeks ago.
We can update the commit message if you'd like.
Thanks!
Post by Brad Beckmann
src/mem/ruby/system/RubyPort.cc, line 298
<http://reviews.gem5.org/r/2776/diff/1/?file=45140#file45140line298>
I don't understand why the random tester doesn't want to retry failed requests. Could you explain why this feature is necessary? It's been a few years since I've looked deeply at the random tester. I'll hold off on broadcasting my opinion on this until I understand it better :).
Again, this may be more appropriate as a separate patch, but I'm not going to push hard on this.
The ruby tester was never designed to handle failed requests. The ruby tester is designed to send large amounts of racy requests to stress the protocol logic. It does this by avoiding flow control and retries.
Ok, I understand this a little better now. One more question, though. Why is it that the RubyPort must be set to ignore retries and it's not the responsibility of the tester to ignore retries? I think this may be just an issue with the way the RubyPort is designed, but I'm trying to get a good understanding of it.

Overall, it's not very important to me. This code is only used when testing, and it's explicitly noted in the code. So if it's difficult to move this logic into the tester instead of the port this code is OK with me.


- Jason


Brad Beckmann
2015-05-13 16:25:43 UTC
Permalink
Post by Brad Beckmann
src/mem/ruby/system/RubyPort.cc, line 298
<http://reviews.gem5.org/r/2776/diff/1/?file=45140#file45140line298>
I don't understand why the random tester doesn't want to retry failed requests. Could you explain why this feature is necessary? It's been a few years since I've looked deeply at the random tester. I'll hold off on broadcasting my opinion on this until I understand it better :).
Again, this may be more appropriate as a separate patch, but I'm not going to push hard on this.
The ruby tester was never designed to handle failed requests. The ruby tester is designed to send large amounts of racy requests to stress the protocol logic. It does this by avoiding flow control and retries.
Ok, I understand this a little better now. One more question, though. Why is it that the RubyPort must be set to ignore retries and it's not the responsibility of the tester to ignore retries? I think this may be just an issue with the way the RubyPort is designed, but I'm trying to get a good understanding of it.
Overall, it's not very important to me. This code is only used when testing, and it's explicitly noted in the code. So if it's difficult to move this logic into the tester instead of the port this code is OK with me.
The RubyTester will not work if there is a retry. Every request it issues needs to succeed. Therefore it doesn't make sense for the RubyTester to ignore a retry. Instead it will panic...that is how it is currently implemented.


- Brad


Jason Power
2015-05-13 16:42:52 UTC
Permalink
Post by Brad Beckmann
src/mem/ruby/system/RubyPort.cc, line 298
<http://reviews.gem5.org/r/2776/diff/1/?file=45140#file45140line298>
I don't understand why the random tester doesn't want to retry failed requests. Could you explain why this feature is necessary? It's been a few years since I've looked deeply at the random tester. I'll hold off on broadcasting my opinion on this until I understand it better :).
Again, this may be more appropriate as a separate patch, but I'm not going to push hard on this.
The ruby tester was never designed to handle failed request. The ruby tester is designed to send large amount of racey requests to stress the protocol logic. It does this by avoiding flow control and retries.
Ok, I understand this a little better now. One more question, though. Why is it that the RubyPort must be set to ignore retries and it's not the responsibility of the tester to ignore retries? I think this may be just an issue with the way the RubyPort is designed, but I'm trying to get a good understanding of it.
Overall, it's not very important to me. This code is only used when testing, and it's explicitly noted in the code. So if it's difficult to move this logic into the tester instead of the port this code is OK with me.
The RubyTester will not work if there is a retry. Every request it issues needs to succeed. Therefore it doesn't make sense for the RubyTester to ignore a retry. Instead it will panic...that is how it is currently implemented.
I think my issue here is very similar to Andreas's above. I believe the better solution is to make the change in the RubyTester rather than in the RubyPort. You're right, the RubyTester will panic, but what is your reason for not changing the RubyTester to simply ignore retries? It seems simple to just remove the panic in line 79(ish) of RubyTester.hh.

I would much rather see changes to the RubyTester to fix this problem instead of changes to RubyPort. Changes to the RubyPort are much more invasive than changes to the tester. Is it impossible to fix this in the RubyTester?
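Concretely, the choice is between a retry hook that treats any retry as a fatal bug and one that accepts the callback and quietly drops it. Here is a toy sketch of the two options in plain C++; these are not the actual RubyTester.hh classes or gem5 port signatures, just an illustration of the alternatives:

    #include <cstdio>
    #include <cstdlib>

    struct TesterPortThatPanics {
        // Current behaviour: any retry is treated as a protocol bug.
        void recvRetry() {
            std::fprintf(stderr, "tester port does not expect a retry\n");
            std::abort();
        }
    };

    struct TesterPortThatIgnores {
        // The alternative under discussion: accept the callback, drop it.
        void recvRetry() { /* intentionally empty */ }
    };

    int main() {
        TesterPortThatIgnores quiet;
        quiet.recvRetry();   // harmless no-op
        return 0;
    }

Either variant satisfies the callback interface; the disagreement is only about whether a retry should ever be considered legal while the tester is driving the system.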


- Jason


Brad Beckmann
2015-05-13 16:55:51 UTC
Permalink
Post by Brad Beckmann
src/mem/ruby/system/RubyPort.cc, line 298
<http://reviews.gem5.org/r/2776/diff/1/?file=45140#file45140line298>
I don't understand why the random tester doesn't want to retry failed requests. Could you explain why this feature is necessary? It's been a few years since I've looked deeply at the random tester. I'll hold off on broadcasting my opinion on this until I understand it better :).
Again, this may be more appropriate as a separate patch, but I'm not going to push hard on this.
The ruby tester was never designed to handle failed request. The ruby tester is designed to send large amount of racey requests to stress the protocol logic. It does this by avoiding flow control and retries.
Ok, I understand this a little better now. One more question, though. Why is it that the RubyPort must be set to ignore retries and it's not the responsibility of the tester to ignore retries? I think this may be just an issue with the way the RubyPort is designed, but I'm trying to get a good understanding of it.
Overall, it's not very important to me. This code is only used when testing, and it's explicitly noted in the code. So if it's difficult to move this logic into the tester instead of the port this code is OK with me.
The RubyTester will not work if there is a retry. Every request it issues needs to succeed. Therefore it doesn't make sense for the RubyTester to ignore a retry. Instead it will panic...that is how it is currently implemented.
I think my issue here is very similar to Andreas's above. I believe that a better solution than making the change in the RubyPort is to instead make the change in the RubyTester. You're right, the RubyTester will panic, but what is your reason for not changing the RubyTester to just ignore retries? It seems simple to just remove the panic in line 79(ish) of RubyTester.hh.
I would much rather see changes to the RubyTester to fix this problem instead of changes to RubyPort. Changes to the RubyPort are much more invasive than changes to the tester. Is it impossible to fix this in the RubyTester?
The RubyTester will not work if it gets a retry. It is not as simple as you suggest. The RubyTester is designed to stress the memory system. It is not designed to throttle back its requests. That defeats the purpose of the RubyTester.

The RubyTester is not broken and it does not need to be fixed. The RubyPort needs to allow large bursts of requests if we are going to stress the protocol logic.

I would argue these changes to the RubyPort are pretty minimal. They just provide more flexibility in when and how it handles retries. Eventually we will post additional patches that build on this flexibility.


- Brad


Brad Beckmann
2015-05-13 17:32:35 UTC
Permalink
Post by Brad Beckmann
src/mem/ruby/system/RubyPort.cc, line 298
<http://reviews.gem5.org/r/2776/diff/1/?file=45140#file45140line298>
I don't understand why the random tester doesn't want to retry failed requests. Could you explain why this feature is necessary? It's been a few years since I've looked deeply at the random tester. I'll hold off on broadcasting my opinion on this until I understand it better :).
Again, this may be more appropriate as a separate patch, but I'm not going to push hard on this.
The ruby tester was never designed to handle failed request. The ruby tester is designed to send large amount of racey requests to stress the protocol logic. It does this by avoiding flow control and retries.
Ok, I understand this a little better now. One more question, though. Why is it that the RubyPort must be set to ignore retries and it's not the responsibility of the tester to ignore retries? I think this may be just an issue with the way the RubyPort is designed, but I'm trying to get a good understanding of it.
Overall, it's not very important to me. This code is only used when testing, and it's explicitly noted in the code. So if it's difficult to move this logic into the tester instead of the port this code is OK with me.
The RubyTester will not work if there is a retry. Every request it issues needs to succeed. Therefore it doesn't make sense for the RubyTester to ignore a retry. Instead it will panic...that is how it is currently implemented.
I think my issue here is very similar to Andreas's above. I believe that a better solution than making the change in the RubyPort is to instead make the change in the RubyTester. You're right, the RubyTester will panic, but what is your reason for not changing the RubyTester to just ignore retries? It seems simple to just remove the panic in line 79(ish) of RubyTester.hh.
I would much rather see changes to the RubyTester to fix this problem instead of changes to RubyPort. Changes to the RubyPort are much more invasive than changes to the tester. Is it impossible to fix this in the RubyTester?
The RubyTester will not work if it gets a retry. It is not as simple as you suggest. The RubyTester is designed to stress the memory system. It is not designed to throttle back it's requests. That defeats the purpose of the RubyTester.
The RubyTester is not broken and it does not need to be fixed. The RubyPort needs to allow large bursts of requests if we are going to stress the protocol logic.
I would agrue these changes to the RubyPort are pretty minimal. They just provide more flexibility of when and how it handles retries. Eventually we will post additional patches that build on this flexibility.
Also I want to point out that the issue with how the RubyTester handles retries is not related to Andreas's question about increasing the number of buffered packets in the packet_queue.

The retry question deals with how the RubyTester generates requests. The invisible packet limit is reached on the response path from the RubyPort back to the RubyTester.
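To make the "invisible packet limit" concrete: the queued response path between the RubyPort and the tester buffers packets without any backpressure, and, as I recall, it is only a sanity threshold in src/mem/packet_queue.cc that eventually complains when too many pile up. A toy model of that kind of bounded invisible buffering, with an illustrative limit and made-up names:

    #include <cassert>
    #include <cstddef>
    #include <deque>
    #include <iostream>

    struct Packet { int id; };

    static const std::size_t kSanityLimit = 100;   // illustrative threshold only

    class RespQueue {
        std::deque<Packet> q;
      public:
        void push(const Packet &p) {
            q.push_back(p);
            // A burst-heavy tester can legitimately pile up many in-flight
            // responses here, which is what trips a fixed limit like this.
            assert(q.size() <= kSanityLimit && "response queue grew suspiciously large");
        }
        bool pop(Packet &p) {
            if (q.empty())
                return false;
            p = q.front();
            q.pop_front();
            return true;
        }
    };

    int main() {
        RespQueue rq;
        for (int i = 0; i < 50; ++i)
            rq.push(Packet{i});
        Packet p;
        std::size_t delivered = 0;
        while (rq.pop(p))
            ++delivered;                            // deliver to the tester
        std::cout << "delivered " << delivered << " responses\n";
        return 0;
    }

That limit can be reached on the response path even though every individual request was accepted, which is why the first version of this patch touched packet_queue.cc.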


- Brad


Nilay Vaish
2015-05-12 15:52:11 UTC
Permalink
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2776/#review6175
-----------------------------------------------------------



configs/ruby/MESI_Three_Level.py (line 106)
<http://reviews.gem5.org/r/2776/#comment5363>

Can you explain the rationale behind this change?


- Nilay Vaish
Brad Beckmann
2015-05-12 19:18:59 UTC
Permalink
Post by Anthony Gutierrez
configs/ruby/MESI_Three_Level.py, line 106
<http://reviews.gem5.org/r/2776/diff/1/?file=45127#file45127line106>
Can you explain the rationale behind this change?
As I mentioned in response to Jason's review, this code is broken when running the tester with multiple CPUs. I believe this is the same issue you mentioned a couple of weeks ago, correct?


- Brad


Nilay Vaish
2015-05-12 19:35:40 UTC
Permalink
Post by Brad Beckmann
configs/ruby/MESI_Three_Level.py, line 106
<http://reviews.gem5.org/r/2776/diff/1/?file=45127#file45127line106>
Can you explain the rationale behind this change?
As I mentioned in response to Jason's review, this code is broken when
running the tester with multiple CPUs. I believe this is the same issue
you mentioned a couple weeks ago correct?
I am going to ask the same question again. Why is setting the clock of a controller to that of the CPU that talks to it incorrect?


--
Nilay
Andreas Hansson
2015-05-13 21:33:37 UTC
Permalink
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2776/#review6249
-----------------------------------------------------------



src/cpu/testers/rubytest/Check.cc (line 100)
<http://reviews.gem5.org/r/2776/#comment5395>

Please use the Mersenne twister (random_mt). It took a significant effort to unify how we generate random numbers, and I'd rather we did not regress on that point.
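For anyone following along, the point is to funnel every random draw through the one shared, seeded generator (gem5's Random wrapper in base/random.hh, instanced as random_mt, if I remember the names right) so that runs stay reproducible. The same idea, sketched with std::mt19937_64 from the standard library rather than the gem5 class:

    #include <cstdint>
    #include <iostream>
    #include <random>

    // One generator for the whole program, seeded once, so a given seed
    // reproduces the same sequence of "random" tester actions.
    static std::mt19937_64 rng(12345);

    static std::uint64_t randomRange(std::uint64_t lo, std::uint64_t hi) {
        std::uniform_int_distribution<std::uint64_t> dist(lo, hi);
        return dist(rng);
    }

    int main() {
        // e.g. picking a random data value for a tester check, instead of rand():
        std::cout << "random check value: " << randomRange(0, 255) << "\n";
        return 0;
    }

A stray rand() call draws from a separate, separately seeded stream, which is exactly the kind of regression being flagged here.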



src/mem/ruby/system/RubyPort.cc (line 291)
<http://reviews.gem5.org/r/2776/#comment5396>

I'd suggest always sending a retry, simply to comply with the timing protocol. This is extra important these days with bridging to SST and SystemC/TLM.

The tester can simply ignore the retry if it does not care.



src/mem/ruby/system/RubyPort.cc (line 416)
<http://reviews.gem5.org/r/2776/#comment5397>

In the crossbar we only send a single retry at a time. It used to look like this, but imho it makes no sense. Once one port goes ahead the RubyPort will be busy, will it not?


- Andreas Hansson
Brad Beckmann
2015-05-15 22:42:41 UTC
Permalink
Post by Anthony Gutierrez
src/cpu/testers/rubytest/Check.cc, line 100
<http://reviews.gem5.org/r/2776/diff/1/?file=45133#file45133line100>
Please use the Mersenne twister (random_mt). It took a significant effort to unify how we generate random numbers, and I'd rather see we do not regress on that point.
Thanks for catching that. Fixed!
Post by Anthony Gutierrez
src/mem/ruby/system/RubyPort.cc, line 298
<http://reviews.gem5.org/r/2776/diff/1/?file=45140#file45140line298>
I'd suggest to always send a retry, simply to comply with the timing protocol. This is extra important these days with bridging to SST and SystemC/TLM.
The tester can simply ignore the retry if it does not care.
Not adding ports to the retry list while running the ruby tester is behavior that already exists. This change simply added a generic variable to turn on this behavior so that other testers can leverage the feature as well, rather than assuming it is only used by the Ruby Tester. What you are suggesting is to modify the Ruby Tester's basic assumptions. That goes beyond this patch and would require significant modifications and testing of both the public Ruby Tester and our internal testers. That is a lot of work that goes well beyond this patch.
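In other words, the generic knob just gates whether a stalled port gets remembered for a later retry. A toy sketch of that logic; the names noRetryOnStall and retryList are placeholders, not necessarily the identifiers used in the patch:

    #include <cstddef>
    #include <iostream>
    #include <vector>

    struct Port { int id; };

    class RubyPortModel {
        bool noRetryOnStall;                 // true when driven by the random tester
        std::vector<Port *> retryList;
      public:
        explicit RubyPortModel(bool testerMode) : noRetryOnStall(testerMode) {}

        // Called when an incoming request had to be rejected because of a stall.
        void rememberForRetry(Port *p) {
            if (noRetryOnStall)
                return;                      // tester mode: never promise a retry
            retryList.push_back(p);          // normal mode: wake this port later
        }

        std::size_t pendingRetries() const { return retryList.size(); }
    };

    int main() {
        RubyPortModel normalPort(false), testerPort(true);
        Port p{0};
        normalPort.rememberForRetry(&p);
        testerPort.rememberForRetry(&p);
        std::cout << normalPort.pendingRetries() << " vs "
                  << testerPort.pendingRetries() << " pending retries\n";
        return 0;
    }

With the flag off, nothing changes for normal CPU models; with it on, the port never schedules a retry, which is the behavior the tester has always relied on.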
Post by Anthony Gutierrez
src/mem/ruby/system/RubyPort.cc, line 423
<http://reviews.gem5.org/r/2776/diff/1/?file=45140#file45140line423>
In the crossbar we only send a single retry at a time. It used to look like this, but imho it makes no sense. Once one port goes ahead the RubyPort will be busy, will it not?
Actually, when we attach a GPU to the RubyPort we create one port per work-item (lane of execution). Thus it is very common to retry many ports in a single cycle.
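And when resources free up in that configuration, the natural behavior is to drain the whole retry list in one cycle rather than waking a single port at a time. Another toy sketch with placeholder names, not the patch's actual code:

    #include <iostream>
    #include <vector>

    struct Port {
        int id;
        void sendRetry() { std::cout << "retrying port " << id << "\n"; }
    };

    struct RubyPortModel {
        std::vector<Port *> retryList;

        void rememberForRetry(Port *p) { retryList.push_back(p); }

        // When the blocking condition clears, wake every stalled port this
        // cycle, e.g. one port per GPU lane that blocked on the same event.
        void retryAllWaiters() {
            std::vector<Port *> toRetry;
            toRetry.swap(retryList);   // swap first: a retried port may stall
                                       // again and re-register itself
            for (Port *p : toRetry)
                p->sendRetry();
        }
    };

    int main() {
        Port a{0}, b{1}, c{2};
        RubyPortModel rp;
        rp.rememberForRetry(&a);
        rp.rememberForRetry(&b);
        rp.rememberForRetry(&c);
        rp.retryAllWaiters();          // all three ports get a retry this cycle
        return 0;
    }

Swapping the list out before issuing the retries matters because a retried port may immediately stall and queue itself again.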


- Brad


Andreas Hansson
2015-05-16 09:57:50 UTC
Permalink
Post by Brad Beckmann
src/mem/ruby/system/RubyPort.cc, line 298
<http://reviews.gem5.org/r/2776/diff/1/?file=45140#file45140line298>
I'd suggest to always send a retry, simply to comply with the timing protocol. This is extra important these days with bridging to SST and SystemC/TLM.
The tester can simply ignore the retry if it does not care.
Not adding ports to the retry list while running the ruby tester is behavior that already exists. This change simply added a generic variable to turn on this behavior so that other testers can leverage the feature as well, rather than assuming it is only used by the Ruby Tester. What you are suggesting is to modify the Ruby Tester's basic assumptions. That goes beyond this patch and would require significant modificaitons and testing of both the public Ruby Tester and our internal testers. That is a lot of work that goes well beyond this patch.
I am not suggesting changing any assumptions, merely that we have a recvRetry in the tester port. It can be thrown away/empty. All I'm asking is that we stick to the normal port conventions. It should be trivial to add.
Post by Brad Beckmann
src/mem/ruby/system/RubyPort.cc, line 423
<http://reviews.gem5.org/r/2776/diff/1/?file=45140#file45140line423>
In the crossbar we only send a single retry at a time. It used to look like this, but imho it makes no sense. Once one port goes ahead the RubyPort will be busy, will it not?
Actually, when we attach a GPU to the RubyPort we create on port per work-item (lane of execution). Thus it is very common to retry many ports in a single cycle.
Many ports with different destinations make sense; I think my confusion here is because the RubyPort is also an implicit crossbar. Actually, should we not make the RubyPort 1:1 and rely on the interconnect (Ruby or a CrossBar) to do the multiplexing? To me it seems this would solve a lot of the issues.


- Andreas


Brad Beckmann
2015-05-18 05:02:35 UTC
Permalink
Post by Brad Beckmann
src/mem/ruby/system/RubyPort.cc, line 298
<http://reviews.gem5.org/r/2776/diff/1/?file=45140#file45140line298>
I'd suggest to always send a retry, simply to comply with the timing protocol. This is extra important these days with bridging to SST and SystemC/TLM.
The tester can simply ignore the retry if it does not care.
Not adding ports to the retry list while running the ruby tester is behavior that already exists. This change simply added a generic variable to turn on this behavior so that other testers can leverage the feature as well, rather than assuming it is only used by the Ruby Tester. What you are suggesting is to modify the Ruby Tester's basic assumptions. That goes beyond this patch and would require significant modificaitons and testing of both the public Ruby Tester and our internal testers. That is a lot of work that goes well beyond this patch.
I am not suggesting to change any assumptions, merely that we have a recvRetry in the tester port. It can be thrown away/empty. All I'm asking is that we stick to the normal port conventions. It should be trivial to add.
As I said in my previous comment, not adding ports to the retry list while running the ruby tester is behavior that already exists. I do not think the bar for allowing this patch to be checked in should be fixing other bugs beyond the scope of this patch. What you are suggesting should be a different patch, one that requires additional testing.

Why are you putting up so much resistance to this change?
Post by Brad Beckmann
src/mem/ruby/system/RubyPort.cc, line 423
<http://reviews.gem5.org/r/2776/diff/1/?file=45140#file45140line423>
In the crossbar we only send a single retry at a time. It used to look like this, but imho it makes no sense. Once one port goes ahead the RubyPort will be busy, will it not?
Actually, when we attach a GPU to the RubyPort we create on port per work-item (lane of execution). Thus it is very common to retry many ports in a single cycle.
May ports with different destinations make sense, I think my confusion here is because the RubyPort is also an implicit crossbar. Actually, should we not make the RubyPort 1:1 and rely on the interconnect Ruby or a CrossBar to do the multiplexing. To me it seems this would solve a lot of the issues.
We have a lot of code that relies on the RubyPort and its associated child objects to coalesce requests. The RubyPort is a natural place to coalesce because the port-to-RubyPort interface is a per-request/address, packet-based interface and Ruby's memory system is cache-block based.
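Since the port side hands over individual per-address packets and Ruby works on whole cache blocks, the coalescing itself amounts to bucketing outstanding packets by line address and issuing one Ruby request per line. A toy sketch, with an illustrative 64-byte line and placeholder names:

    #include <cstdint>
    #include <iostream>
    #include <unordered_map>
    #include <vector>

    struct Packet { std::uint64_t addr; };

    static const std::uint64_t kLineBytes = 64;     // illustrative line size

    static std::uint64_t lineAddr(std::uint64_t a) { return a & ~(kLineBytes - 1); }

    class Coalescer {
        // line address -> packets waiting on that line's single Ruby request
        std::unordered_map<std::uint64_t, std::vector<Packet>> table;
      public:
        // Returns true if this packet needs a new Ruby request for its line;
        // otherwise it piggybacks on the request already outstanding.
        bool insert(const Packet &pkt) {
            std::vector<Packet> &bucket = table[lineAddr(pkt.addr)];
            bucket.push_back(pkt);
            return bucket.size() == 1;
        }

        // When the line's data comes back, every coalesced packet completes.
        std::vector<Packet> complete(std::uint64_t addr) {
            std::vector<Packet> done;
            auto it = table.find(lineAddr(addr));
            if (it != table.end()) {
                done = it->second;
                table.erase(it);
            }
            return done;
        }
    };

    int main() {
        Coalescer c;
        std::cout << c.insert(Packet{0x1000}) << c.insert(Packet{0x1008})
                  << c.insert(Packet{0x2000}) << "\n";        // prints 101
        std::cout << c.complete(0x1000).size()
                  << " packets complete together\n";          // prints 2
        return 0;
    }

In this sketch the table is per RubyPort, i.e. packets are only combined within one port's own stream of requests.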


- Brad


Andreas Hansson
2015-05-18 07:02:21 UTC
Permalink
Post by Brad Beckmann
src/mem/ruby/system/RubyPort.cc, line 423
<http://reviews.gem5.org/r/2776/diff/1/?file=45140#file45140line423>
In the crossbar we only send a single retry at a time. It used to look like this, but imho it makes no sense. Once one port goes ahead the RubyPort will be busy, will it not?
Actually, when we attach a GPU to the RubyPort we create on port per work-item (lane of execution). Thus it is very common to retry many ports in a single cycle.
May ports with different destinations make sense, I think my confusion here is because the RubyPort is also an implicit crossbar. Actually, should we not make the RubyPort 1:1 and rely on the interconnect Ruby or a CrossBar to do the multiplexing. To me it seems this would solve a lot of the issues.
We have a lot of code that relies on the RubyPort and its associated children objects to coalesce requests. The RubyPort is a natural place to coalesce because the port-RubyPort interface is a per request/address packet-based interface and Ruby's memory is cache-blocked based.
Ok. I still assume you do coalescing per port, not across ports?

In any case, we can save this one for later. I just find it unfortunate that we effectively end up maintaining two crossbar models, and that one is embedded in the RubyPort.


- Andreas


Brad Beckmann
2015-05-18 23:03:32 UTC
Permalink
Post by Brad Beckmann
src/mem/ruby/system/RubyPort.cc, line 423
<http://reviews.gem5.org/r/2776/diff/1/?file=45140#file45140line423>
In the crossbar we only send a single retry at a time. It used to look like this, but imho it makes no sense. Once one port goes ahead the RubyPort will be busy, will it not?
Actually, when we attach a GPU to the RubyPort we create on port per work-item (lane of execution). Thus it is very common to retry many ports in a single cycle.
May ports with different destinations make sense, I think my confusion here is because the RubyPort is also an implicit crossbar. Actually, should we not make the RubyPort 1:1 and rely on the interconnect Ruby or a CrossBar to do the multiplexing. To me it seems this would solve a lot of the issues.
We have a lot of code that relies on the RubyPort and its associated children objects to coalesce requests. The RubyPort is a natural place to coalesce because the port-RubyPort interface is a per request/address packet-based interface and Ruby's memory is cache-blocked based.
Ok. I still assume you do coalescing per port, not across ports?
In any case, we can save this one for later. I just find it unfortunate that we effectively end up maintaining two crossbar models, and that one is embedded in the RubyPort.
Do you mean coalescing per RubyPort, not across RubyPorts? If so, yes.

Thanks for allowing this to pass. I assume this issue is closed. If it makes you feel better, please note that the coalescing logic implemented inside a GPU is often very different from the logic implemented in a NOC crossbar. I don't think we should expect to use the same logic in our model.


- Brad


Tony Gutierrez
2015-05-26 19:19:29 UTC
Permalink
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2776/
-----------------------------------------------------------

(Updated May 26, 2015, 12:19 p.m.)


Review request for Default.


Repository: gem5


Description (updated)
-------

Changeset 10806:c2d686d64b51
---------------------------
ruby: cleaner ruby tester support

This patch allows the ruby random tester to use ruby ports that may only
support instr or data requests. This patch is similar to a previous changeset
(8932:1b2c17565ac8) that was unfortunately broken by subsequent changesets.
This current patch implements the support in a more straight-forward way.
The patch also includes better DPRINTFs and generalizes the retry behavior
needed by the ruby tester so that other testers/cpu models can use it as well.


Diffs (updated)
-----

src/cpu/testers/rubytest/Check.cc df2aa91dba5b0f0baa351039f0802baad9ed8f1d
src/cpu/testers/rubytest/CheckTable.cc df2aa91dba5b0f0baa351039f0802baad9ed8f1d
src/cpu/testers/rubytest/RubyTester.hh df2aa91dba5b0f0baa351039f0802baad9ed8f1d
src/cpu/testers/rubytest/RubyTester.cc df2aa91dba5b0f0baa351039f0802baad9ed8f1d
src/cpu/testers/rubytest/RubyTester.py df2aa91dba5b0f0baa351039f0802baad9ed8f1d
src/mem/ruby/system/RubyPort.hh df2aa91dba5b0f0baa351039f0802baad9ed8f1d
src/mem/ruby/system/RubyPort.cc df2aa91dba5b0f0baa351039f0802baad9ed8f1d
src/mem/ruby/system/Sequencer.py df2aa91dba5b0f0baa351039f0802baad9ed8f1d
configs/ruby/MOESI_hammer.py df2aa91dba5b0f0baa351039f0802baad9ed8f1d
configs/example/ruby_random_test.py df2aa91dba5b0f0baa351039f0802baad9ed8f1d
configs/ruby/MESI_Three_Level.py df2aa91dba5b0f0baa351039f0802baad9ed8f1d
configs/ruby/MESI_Two_Level.py df2aa91dba5b0f0baa351039f0802baad9ed8f1d
configs/ruby/MI_example.py df2aa91dba5b0f0baa351039f0802baad9ed8f1d
configs/ruby/MOESI_CMP_directory.py df2aa91dba5b0f0baa351039f0802baad9ed8f1d
configs/ruby/MOESI_CMP_token.py df2aa91dba5b0f0baa351039f0802baad9ed8f1d

Diff: http://reviews.gem5.org/r/2776/diff/


Testing
-------


Thanks,

Tony Gutierrez
Brad Beckmann
2015-05-28 23:01:40 UTC
Permalink
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2776/#review6412
-----------------------------------------------------------


The modifications to the packet queue have been removed from this patch. Can the rest of this patch be checked in as is? There seems to be no consensus on the clk domain issue. Once we check in this patch, perhaps someone who feels passionate about the clk domain settings for the L1 cache can fix it in a better way.

- Brad Beckmann
Andreas Hansson
2015-05-29 07:25:41 UTC
Permalink
Post by Anthony Gutierrez
Post by Brad Beckmann
The modifications to packet queue have been removed from this patch. Can the rest of this patch be checked in as is? There seems to be no consensus on the clk domain issue. Once we check in this patch, perhaps someone who feels passionate on the clk domain settings for the L1 cache can fix it in a better way.
Thanks Brad. No objections from my side. Please go ahead.

I agree with Jason that we should follow up with some RubyPort soul searching. How can we make sure it is truly just a glue-logic shim and nothing else...


- Andreas


Jason Power
2015-05-29 00:44:54 UTC
Permalink
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2776/#review6422
-----------------------------------------------------------

Ship it!


I'm going to voice my concern again about modifying the RubyPort to ignore retries instead of the RubyTester. But I'll take the criticism that this is a case of asking the patch author to totally change their design to what I want.

We should fix the RubyPort abstraction at some point. Either it should be a transparent wrapper from "ports" to the "MandatoryQueues", or it should be clearer exactly what it's modeling that the classic networking components (ports, crossbars, etc.) don't already model.

- Jason Power
Joel Hestness
2015-05-29 15:14:51 UTC
Permalink
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2776/#review6431
-----------------------------------------------------------



configs/ruby/MESI_Three_Level.py (line 106)
<http://reviews.gem5.org/r/2776/#comment5524>

I think it's pretty clear from reviews that Nilay and I would prefer not to change the clock domains when simulating regular CPU cores. Most existing systems have L1s clocked at the CPU frequency, and gem5 should stay that way.

Can we just leave these changes out and fix the clock domains for the testers separately? The testers are the least common use case for gem5+Ruby.


- Joel Hestness
Tony Gutierrez
2015-07-07 15:58:37 UTC
Permalink
Post by Anthony Gutierrez
configs/ruby/MESI_Three_Level.py, line 106
<http://reviews.gem5.org/r/2776/diff/2/?file=45426#file45426line106>
I think it's pretty clear from reviews that Nilay and I would prefer not to change the clock domains when simulating regular CPU cores. Most existing systems have L1s clocked at the CPU frequency, and gem5 should stay that way.
Can we just leave these changes out and fix the clock domains for the testers separately? The testers are the least common use case for gem5+Ruby.
A check has been added so that we don't overrun the end of the system.cpu list when using the tester.


- Tony


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2776/#review6431
-----------------------------------------------------------
Post by Anthony Gutierrez
-----------------------------------------------------------
http://reviews.gem5.org/r/2776/
-----------------------------------------------------------
(Updated July 7, 2015, 8:55 a.m.)
Review request for Default.
Repository: gem5
Description
-------
Changeset 10877:58e937f1077b
---------------------------
ruby: cleaner ruby tester support
This patch allows the ruby random tester to use ruby ports that may only
support instr or data requests. This patch is similar to a previous changeset
(8932:1b2c17565ac8) that was unfortunately broken by subsequent changesets.
This current patch implements the support in a more straight-forward way.
The patch also includes better DPRINTFs and generalizes the retry behavior
needed by the ruby tester so that other testers/cpu models can use it as well.
Diffs
-----
src/mem/ruby/system/Sequencer.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
configs/example/ruby_random_test.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
configs/ruby/MESI_Three_Level.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
configs/ruby/MESI_Two_Level.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
configs/ruby/MI_example.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
configs/ruby/MOESI_CMP_directory.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
configs/ruby/MOESI_CMP_token.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
configs/ruby/MOESI_hammer.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
src/cpu/testers/rubytest/Check.cc ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
src/cpu/testers/rubytest/CheckTable.cc ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
src/cpu/testers/rubytest/RubyTester.hh ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
src/cpu/testers/rubytest/RubyTester.cc ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
src/cpu/testers/rubytest/RubyTester.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
src/mem/ruby/system/RubyPort.hh ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
src/mem/ruby/system/RubyPort.cc ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
Diff: http://reviews.gem5.org/r/2776/diff/
Testing
-------
Thanks,
Tony Gutierrez
Tony Gutierrez
2015-07-07 15:53:19 UTC
Permalink
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2776/
-----------------------------------------------------------

(Updated July 7, 2015, 8:53 a.m.)


Review request for Default.


Repository: gem5


Description (updated)
-------

Changeset 10877:58e937f1077b
---------------------------
ruby: cleaner ruby tester support

This patch allows the ruby random tester to use ruby ports that may only
support instr or data requests. This patch is similar to a previous changeset
(8932:1b2c17565ac8) that was unfortunately broken by subsequent changesets.
This current patch implements the support in a more straight-forward way.
The patch also includes better DPRINTFs and generalizes the retry behavior
needed by the ruby tester so that other testers/cpu models can use it as well.


Diffs (updated)
-----

configs/example/ruby_random_test.py 0684c3a6cee6ba44f22280e3defd3aca35bd3c18
configs/ruby/MESI_Three_Level.py 0684c3a6cee6ba44f22280e3defd3aca35bd3c18
configs/ruby/MESI_Two_Level.py 0684c3a6cee6ba44f22280e3defd3aca35bd3c18
configs/ruby/MI_example.py 0684c3a6cee6ba44f22280e3defd3aca35bd3c18
configs/ruby/MOESI_CMP_directory.py 0684c3a6cee6ba44f22280e3defd3aca35bd3c18
configs/ruby/MOESI_CMP_token.py 0684c3a6cee6ba44f22280e3defd3aca35bd3c18
configs/ruby/MOESI_hammer.py 0684c3a6cee6ba44f22280e3defd3aca35bd3c18
src/cpu/testers/rubytest/Check.cc 0684c3a6cee6ba44f22280e3defd3aca35bd3c18
src/cpu/testers/rubytest/CheckTable.cc 0684c3a6cee6ba44f22280e3defd3aca35bd3c18
src/cpu/testers/rubytest/RubyTester.hh 0684c3a6cee6ba44f22280e3defd3aca35bd3c18
src/cpu/testers/rubytest/RubyTester.cc 0684c3a6cee6ba44f22280e3defd3aca35bd3c18
src/cpu/testers/rubytest/RubyTester.py 0684c3a6cee6ba44f22280e3defd3aca35bd3c18
src/mem/ruby/system/RubyPort.hh 0684c3a6cee6ba44f22280e3defd3aca35bd3c18
src/mem/ruby/system/RubyPort.cc 0684c3a6cee6ba44f22280e3defd3aca35bd3c18
src/mem/ruby/system/Sequencer.py 0684c3a6cee6ba44f22280e3defd3aca35bd3c18

Diff: http://reviews.gem5.org/r/2776/diff/


Testing
-------


Thanks,

Tony Gutierrez
Tony Gutierrez
2015-07-07 15:55:25 UTC
Permalink
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2776/
-----------------------------------------------------------

(Updated July 7, 2015, 8:55 a.m.)


Review request for Default.


Repository: gem5


Description
-------

Changeset 10877:58e937f1077b
---------------------------
ruby: cleaner ruby tester support

This patch allows the ruby random tester to use ruby ports that may only
support instr or data requests. This patch is similar to a previous changeset
(8932:1b2c17565ac8) that was unfortunately broken by subsequent changesets.
This current patch implements the support in a more straight-forward way.
The patch also includes better DPRINTFs and generalizes the retry behavior
needed by the ruby tester so that other testers/cpu models can use it as well.


Diffs (updated)
-----

src/mem/ruby/system/Sequencer.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
configs/example/ruby_random_test.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
configs/ruby/MESI_Three_Level.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
configs/ruby/MESI_Two_Level.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
configs/ruby/MI_example.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
configs/ruby/MOESI_CMP_directory.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
configs/ruby/MOESI_CMP_token.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
configs/ruby/MOESI_hammer.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
src/cpu/testers/rubytest/Check.cc ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
src/cpu/testers/rubytest/CheckTable.cc ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
src/cpu/testers/rubytest/RubyTester.hh ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
src/cpu/testers/rubytest/RubyTester.cc ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
src/cpu/testers/rubytest/RubyTester.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
src/mem/ruby/system/RubyPort.hh ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
src/mem/ruby/system/RubyPort.cc ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18

Diff: http://reviews.gem5.org/r/2776/diff/


Testing
-------


Thanks,

Tony Gutierrez
Andreas Hansson
2015-07-07 16:16:51 UTC
Permalink
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2776/#review6728
-----------------------------------------------------------



configs/ruby/MESI_Three_Level.py (line 103)
<http://reviews.gem5.org/r/2776/#comment5818>

Why is this needed? It seems the generic version using [i] should be equivalent.

This comment applies to all the config script changes.



src/cpu/testers/rubytest/CheckTable.cc (line 67)
<http://reviews.gem5.org/r/2776/#comment5819>

Not overly descriptive :-)



src/mem/ruby/system/RubyPort.hh (line 14)
<http://reviews.gem5.org/r/2776/#comment5820>

2009, 2013?



src/mem/ruby/system/RubyPort.cc (line 14)
<http://reviews.gem5.org/r/2776/#comment5821>

2009, 2013?



src/mem/ruby/system/RubyPort.cc (line 280)
<http://reviews.gem5.org/r/2776/#comment5822>

Surely that should only happen when returning true? In this case we know that we are not done.



src/mem/ruby/system/RubyPort.cc (line 291)
<http://reviews.gem5.org/r/2776/#comment5823>

I'd still suggest keeping consistent with the rest of gem5 and sending the retry (and leaving it to the RubyTester port to ignore it).
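
A standalone toy of that split (plain Python, not gem5 code; names invented): the responder always sends the retry after rejecting a request, and each requestor port decides for itself whether to resend or, as the tester would, to ignore it.

    class Responder:
        def __init__(self):
            self.busy = True
            self.waiting = []

        def recv_timing_req(self, port):
            if self.busy:
                self.waiting.append(port)
                return False      # rejected; a retry will be sent later
            return True

        def free_up(self):
            self.busy = False
            waiting, self.waiting = self.waiting, []
            for port in waiting:  # always send the retry...
                port.recv_req_retry()

    class ResendingPort:
        def recv_req_retry(self):
            print("normal requestor: resend the blocked request")

    class TesterPort:
        def recv_req_retry(self):
            # ...and the tester port simply drops it, since the tester
            # re-issues requests on its own schedule.
            print("tester port: retry ignored")

    resp = Responder()
    resp.recv_timing_req(ResendingPort())
    resp.recv_timing_req(TesterPort())
    resp.free_up()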


- Andreas Hansson
Post by Anthony Gutierrez
-----------------------------------------------------------
http://reviews.gem5.org/r/2776/
-----------------------------------------------------------
(Updated July 7, 2015, 3:55 p.m.)
Review request for Default.
Repository: gem5
Description
-------
Changeset 10877:58e937f1077b
---------------------------
ruby: cleaner ruby tester support
This patch allows the ruby random tester to use ruby ports that may only
support instr or data requests. This patch is similar to a previous changeset
(8932:1b2c17565ac8) that was unfortunately broken by subsequent changesets.
This current patch implements the support in a more straight-forward way.
The patch also includes better DPRINTFs and generalizes the retry behavior
needed by the ruby tester so that other testers/cpu models can use it as well.
Diffs
-----
src/mem/ruby/system/Sequencer.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
configs/example/ruby_random_test.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
configs/ruby/MESI_Three_Level.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
configs/ruby/MESI_Two_Level.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
configs/ruby/MI_example.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
configs/ruby/MOESI_CMP_directory.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
configs/ruby/MOESI_CMP_token.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
configs/ruby/MOESI_hammer.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
src/cpu/testers/rubytest/Check.cc ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
src/cpu/testers/rubytest/CheckTable.cc ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
src/cpu/testers/rubytest/RubyTester.hh ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
src/cpu/testers/rubytest/RubyTester.cc ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
src/cpu/testers/rubytest/RubyTester.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
src/mem/ruby/system/RubyPort.hh ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
src/mem/ruby/system/RubyPort.cc ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
Diff: http://reviews.gem5.org/r/2776/diff/
Testing
-------
Thanks,
Tony Gutierrez
Tony Gutierrez
2015-07-07 16:28:04 UTC
Permalink
Post by Anthony Gutierrez
configs/ruby/MESI_Three_Level.py, line 103
<http://reviews.gem5.org/r/2776/diff/4/?file=47827#file47827line103>
Why is this needed? It seems the generic version using [i] should be equivalent.
This comment applies to all the config script changes.
This is necessary because when using the tester there is always only one cpu object, so system.cpu has length 1. However, the number of ports connected to the tester piggybacks off the -n option, which causes a mismatch between len(system.cpu) and options.num_cpus. This is the easiest option for fixing this bug without a more extensive refactoring of the tester and the Ruby configs.


- Tony


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2776/#review6728
-----------------------------------------------------------
Post by Anthony Gutierrez
-----------------------------------------------------------
http://reviews.gem5.org/r/2776/
-----------------------------------------------------------
(Updated July 7, 2015, 8:55 a.m.)
Review request for Default.
Repository: gem5
Description
-------
Changeset 10877:58e937f1077b
---------------------------
ruby: cleaner ruby tester support
This patch allows the ruby random tester to use ruby ports that may only
support instr or data requests. This patch is similar to a previous changeset
(8932:1b2c17565ac8) that was unfortunately broken by subsequent changesets.
This current patch implements the support in a more straight-forward way.
The patch also includes better DPRINTFs and generalizes the retry behavior
needed by the ruby tester so that other testers/cpu models can use it as well.
Diffs
-----
src/mem/ruby/system/Sequencer.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
configs/example/ruby_random_test.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
configs/ruby/MESI_Three_Level.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
configs/ruby/MESI_Two_Level.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
configs/ruby/MI_example.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
configs/ruby/MOESI_CMP_directory.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
configs/ruby/MOESI_CMP_token.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
configs/ruby/MOESI_hammer.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
src/cpu/testers/rubytest/Check.cc ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
src/cpu/testers/rubytest/CheckTable.cc ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
src/cpu/testers/rubytest/RubyTester.hh ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
src/cpu/testers/rubytest/RubyTester.cc ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
src/cpu/testers/rubytest/RubyTester.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
src/mem/ruby/system/RubyPort.hh ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
src/mem/ruby/system/RubyPort.cc ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
Diff: http://reviews.gem5.org/r/2776/diff/
Testing
-------
Thanks,
Tony Gutierrez
Andreas Hansson
2015-07-07 16:31:12 UTC
Permalink
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2776/#review6730
-----------------------------------------------------------



configs/ruby/MESI_Three_Level.py (line 103)
<http://reviews.gem5.org/r/2776/#comment5825>

As far as I can tell the code is identical, with i replaced by 0, and i would be 0 for the first item. Am I missing something? I see no mention of any ports in this block of code.


- Andreas Hansson
Post by Anthony Gutierrez
-----------------------------------------------------------
http://reviews.gem5.org/r/2776/
-----------------------------------------------------------
(Updated July 7, 2015, 3:55 p.m.)
Review request for Default.
Repository: gem5
Description
-------
Changeset 10877:58e937f1077b
---------------------------
ruby: cleaner ruby tester support
This patch allows the ruby random tester to use ruby ports that may only
support instr or data requests. This patch is similar to a previous changeset
(8932:1b2c17565ac8) that was unfortunately broken by subsequent changesets.
This current patch implements the support in a more straight-forward way.
The patch also includes better DPRINTFs and generalizes the retry behavior
needed by the ruby tester so that other testers/cpu models can use it as well.
Diffs
-----
src/mem/ruby/system/Sequencer.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
configs/example/ruby_random_test.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
configs/ruby/MESI_Three_Level.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
configs/ruby/MESI_Two_Level.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
configs/ruby/MI_example.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
configs/ruby/MOESI_CMP_directory.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
configs/ruby/MOESI_CMP_token.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
configs/ruby/MOESI_hammer.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
src/cpu/testers/rubytest/Check.cc ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
src/cpu/testers/rubytest/CheckTable.cc ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
src/cpu/testers/rubytest/RubyTester.hh ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
src/cpu/testers/rubytest/RubyTester.cc ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
src/cpu/testers/rubytest/RubyTester.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
src/mem/ruby/system/RubyPort.hh ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
src/mem/ruby/system/RubyPort.cc ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
Diff: http://reviews.gem5.org/r/2776/diff/
Testing
-------
Thanks,
Tony Gutierrez
Tony Gutierrez
2015-07-07 17:15:19 UTC
Permalink
Post by Anthony Gutierrez
configs/ruby/MESI_Three_Level.py, line 103
<http://reviews.gem5.org/r/2776/diff/4/?file=47827#file47827line103>
As far as I can tell the code is identical, with i replaced by 0, and i would be 0 for the first item. Am I missing something? I see no mention of any ports in this block of code.
The problem is with the tester run script reusing the generic Ruby configs. options.num_cpus has different semantics for the tester; basically, it is used to specify the number of cpu ports that are connected to the tester. You can also think of it as the number of virtual CPUs the tester represents. The tester, however, has only one "cpu", i.e., system.cpu = tester, and thus len(system.cpu) = 1. The loop code is reused by the tester to implicitly set the number of L1s and sequencers equal to options.num_cpus, but for the tester this causes simulation to fail if you specify -n > 1. So we need this special case to set the clk_domain of the L1 controllers and sequencers to the tester's, which is system.cpu[0].

This is the quickest and least intrusive fix; otherwise I think some significant refactoring would be necessary.
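
A standalone toy of that special case (plain Python, not the actual config code; the values are invented): -n still decides how many L1s/sequencers the loop builds, but the index into the cpu list is clamped so it never runs past the single tester object.

    num_cpus = 4                # value passed with -n
    cpus = ["ruby_tester"]      # with the tester, len(system.cpu) == 1

    for i in range(num_cpus):
        # Fall back to the tester (index 0) when there is no per-CPU object.
        cpu_idx = i if i < len(cpus) else 0
        print("L1 controller / sequencer %d takes the clk_domain of cpu[%d] (%s)"
              % (i, cpu_idx, cpus[cpu_idx]))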


- Tony


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2776/#review6730
-----------------------------------------------------------
Post by Anthony Gutierrez
-----------------------------------------------------------
http://reviews.gem5.org/r/2776/
-----------------------------------------------------------
(Updated July 7, 2015, 8:55 a.m.)
Review request for Default.
Repository: gem5
Description
-------
Changeset 10877:58e937f1077b
---------------------------
ruby: cleaner ruby tester support
This patch allows the ruby random tester to use ruby ports that may only
support instr or data requests. This patch is similar to a previous changeset
(8932:1b2c17565ac8) that was unfortunately broken by subsequent changesets.
This current patch implements the support in a more straight-forward way.
The patch also includes better DPRINTFs and generalizes the retry behavior
needed by the ruby tester so that other testers/cpu models can use it as well.
Diffs
-----
src/mem/ruby/system/Sequencer.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
configs/example/ruby_random_test.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
configs/ruby/MESI_Three_Level.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
configs/ruby/MESI_Two_Level.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
configs/ruby/MI_example.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
configs/ruby/MOESI_CMP_directory.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
configs/ruby/MOESI_CMP_token.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
configs/ruby/MOESI_hammer.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
src/cpu/testers/rubytest/Check.cc ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
src/cpu/testers/rubytest/CheckTable.cc ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
src/cpu/testers/rubytest/RubyTester.hh ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
src/cpu/testers/rubytest/RubyTester.cc ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
src/cpu/testers/rubytest/RubyTester.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
src/mem/ruby/system/RubyPort.hh ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
src/mem/ruby/system/RubyPort.cc ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
Diff: http://reviews.gem5.org/r/2776/diff/
Testing
-------
Thanks,
Tony Gutierrez
Andreas Hansson
2015-07-07 22:48:35 UTC
Permalink
Post by Tony Gutierrez
configs/ruby/MESI_Three_Level.py, line 103
<http://reviews.gem5.org/r/2776/diff/4/?file=47827#file47827line103>
As far as I can tell the code is identical, with i replaced by 0, and i would be 0 for the first item. Am I missing something? I see no mention of any ports in this block of code.
The problem is with the tester run script reusing the generic Ruby configs. options.num_cpus has different semantics for the tester; basically, it is used to specify the number of cpu ports that are connected to the tester. You can also think of it as the number of virtual CPUs the tester represents. The tester, however, has only one "cpu", i.e., system.cpu = tester, and thus len(system.cpu) = 1. The loop code is reused by the tester to implicitly set the number of L1s and sequencers equal to options.num_cpus, but for the tester this causes simulation to fail if you specify -n > 1. So we need this special case to set the clk_domain of the L1 controllers and sequencers to the tester's, which is system.cpu[0].
This is the quickest and least intrusive fix; otherwise I think some significant refactoring would be necessary.
Thanks for the clarification. Definitely not pretty (or maintainable), but please go ahead if you think this is the best way forward.

By the way, can we close out the issues that are indeed fixed so that it is easier to track? Thanks


- Andreas


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2776/#review6730
-----------------------------------------------------------
Post by Tony Gutierrez
-----------------------------------------------------------
http://reviews.gem5.org/r/2776/
-----------------------------------------------------------
(Updated July 7, 2015, 10:44 p.m.)
Review request for Default.
Repository: gem5
Description
-------
Changeset 10877:19a630782b82
---------------------------
ruby: cleaner ruby tester support
This patch allows the ruby random tester to use ruby ports that may only
support instr or data requests. This patch is similar to a previous changeset
(8932:1b2c17565ac8) that was unfortunately broken by subsequent changesets.
This current patch implements the support in a more straight-forward way.
The patch also includes better DPRINTFs and generalizes the retry behavior
needed by the ruby tester so that other testers/cpu models can use it as well.
Diffs
-----
configs/example/ruby_random_test.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
configs/ruby/MESI_Three_Level.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
configs/ruby/MESI_Two_Level.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
configs/ruby/MI_example.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
configs/ruby/MOESI_CMP_directory.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
configs/ruby/MOESI_CMP_token.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
configs/ruby/MOESI_hammer.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
src/cpu/testers/rubytest/Check.cc ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
src/cpu/testers/rubytest/CheckTable.cc ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
src/cpu/testers/rubytest/RubyTester.hh ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
src/cpu/testers/rubytest/RubyTester.cc ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
src/cpu/testers/rubytest/RubyTester.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
src/mem/ruby/system/RubyPort.hh ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
src/mem/ruby/system/RubyPort.cc ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
src/mem/ruby/system/Sequencer.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
Diff: http://reviews.gem5.org/r/2776/diff/
Testing
-------
Thanks,
Tony Gutierrez
Jason Power
2015-07-07 21:55:30 UTC
Permalink
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2776/#review6732
-----------------------------------------------------------



configs/ruby/MESI_Three_Level.py (line 103)
<http://reviews.gem5.org/r/2776/#comment5827>

I see why this is (unfortunately) needed. Can you at least add a comment in the code about why you have this check? I agree that this is not a good long-term solution, but no need to solve all our ruby problems in a single patch ;).



src/mem/ruby/system/RubyPort.hh (line 201)
<http://reviews.gem5.org/r/2776/#comment5828>

This should change to an if statement. See http://repo.gem5.org/gem5/rev/60eb3fef9c2d
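
Roughly the shape of that change, as a standalone toy (Python here rather than the actual RubyPort.hh code; the condition and message are invented, and the rationale is in the linked changeset):

    def handle_with_assert(port_connected):
        assert port_connected    # aborts (or is compiled out) rather than reporting

    def handle_with_if(port_connected):
        if not port_connected:
            raise RuntimeError("request arrived on an unconnected port")
        # ... normal handling ...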


- Jason Power
Post by Anthony Gutierrez
-----------------------------------------------------------
http://reviews.gem5.org/r/2776/
-----------------------------------------------------------
(Updated July 7, 2015, 3:55 p.m.)
Review request for Default.
Repository: gem5
Description
-------
Changeset 10877:58e937f1077b
---------------------------
ruby: cleaner ruby tester support
This patch allows the ruby random tester to use ruby ports that may only
support instr or data requests. This patch is similar to a previous changeset
(8932:1b2c17565ac8) that was unfortunately broken by subsequent changesets.
This current patch implements the support in a more straight-forward way.
The patch also includes better DPRINTFs and generalizes the retry behavior
needed by the ruby tester so that other testers/cpu models can use it as well.
Diffs
-----
src/mem/ruby/system/Sequencer.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
configs/example/ruby_random_test.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
configs/ruby/MESI_Three_Level.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
configs/ruby/MESI_Two_Level.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
configs/ruby/MI_example.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
configs/ruby/MOESI_CMP_directory.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
configs/ruby/MOESI_CMP_token.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
configs/ruby/MOESI_hammer.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
src/cpu/testers/rubytest/Check.cc ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
src/cpu/testers/rubytest/CheckTable.cc ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
src/cpu/testers/rubytest/RubyTester.hh ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
src/cpu/testers/rubytest/RubyTester.cc ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
src/cpu/testers/rubytest/RubyTester.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
src/mem/ruby/system/RubyPort.hh ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
src/mem/ruby/system/RubyPort.cc ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
Diff: http://reviews.gem5.org/r/2776/diff/
Testing
-------
Thanks,
Tony Gutierrez
Tony Gutierrez
2015-07-07 22:38:37 UTC
Permalink
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2776/
-----------------------------------------------------------

(Updated July 7, 2015, 3:38 p.m.)


Review request for Default.


Repository: gem5


Description (updated)
-------

Changeset 10877:663554ecf21e
---------------------------
ruby: cleaner ruby tester support

This patch allows the ruby random tester to use ruby ports that may only
support instr or data requests. This patch is similar to a previous changeset
(8932:1b2c17565ac8) that was unfortunately broken by subsequent changesets.
This current patch implements the support in a more straight-forward way.
The patch also includes better DPRINTFs and generalizes the retry behavior
needed by the ruby tester so that other testers/cpu models can use it as well.


Diffs (updated)
-----

configs/example/ruby_random_test.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
configs/ruby/MESI_Three_Level.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
configs/ruby/MESI_Two_Level.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
configs/ruby/MI_example.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
configs/ruby/MOESI_CMP_directory.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
configs/ruby/MOESI_CMP_token.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
configs/ruby/MOESI_hammer.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
src/cpu/testers/rubytest/Check.cc ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
src/cpu/testers/rubytest/CheckTable.cc ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
src/cpu/testers/rubytest/RubyTester.hh ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
src/cpu/testers/rubytest/RubyTester.cc ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
src/cpu/testers/rubytest/RubyTester.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
src/mem/ruby/system/RubyPort.hh ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
src/mem/ruby/system/RubyPort.cc ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
src/mem/ruby/system/Sequencer.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18

Diff: http://reviews.gem5.org/r/2776/diff/


Testing
-------


Thanks,

Tony Gutierrez
Tony Gutierrez
2015-07-07 22:44:50 UTC
Permalink
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2776/
-----------------------------------------------------------

(Updated July 7, 2015, 3:44 p.m.)


Review request for Default.


Repository: gem5


Description (updated)
-------

Changeset 10877:19a630782b82
---------------------------
ruby: cleaner ruby tester support

This patch allows the ruby random tester to use ruby ports that may only
support instr or data requests. This patch is similar to a previous changeset
(8932:1b2c17565ac8) that was unfortunately broken by subsequent changesets.
This current patch implements the support in a more straight-forward way.
The patch also includes better DPRINTFs and generalizes the retry behavior
needed by the ruby tester so that other testers/cpu models can use it as well.


Diffs (updated)
-----

configs/example/ruby_random_test.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
configs/ruby/MESI_Three_Level.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
configs/ruby/MESI_Two_Level.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
configs/ruby/MI_example.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
configs/ruby/MOESI_CMP_directory.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
configs/ruby/MOESI_CMP_token.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
configs/ruby/MOESI_hammer.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
src/cpu/testers/rubytest/Check.cc ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
src/cpu/testers/rubytest/CheckTable.cc ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
src/cpu/testers/rubytest/RubyTester.hh ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
src/cpu/testers/rubytest/RubyTester.cc ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
src/cpu/testers/rubytest/RubyTester.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
src/mem/ruby/system/RubyPort.hh ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
src/mem/ruby/system/RubyPort.cc ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
src/mem/ruby/system/Sequencer.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18

Diff: http://reviews.gem5.org/r/2776/diff/


Testing
-------


Thanks,

Tony Gutierrez
Nilay Vaish
2015-07-08 16:42:00 UTC
Permalink
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2776/#review6735
-----------------------------------------------------------



configs/ruby/MESI_Three_Level.py (lines 103 - 120)
<http://reviews.gem5.org/r/2776/#comment5833>

The comment + 3 more lines should be enough. No need to repeat the whole thing.


- Nilay Vaish
Post by Anthony Gutierrez
-----------------------------------------------------------
http://reviews.gem5.org/r/2776/
-----------------------------------------------------------
(Updated July 7, 2015, 10:44 p.m.)
Review request for Default.
Repository: gem5
Description
-------
Changeset 10877:19a630782b82
---------------------------
ruby: cleaner ruby tester support
This patch allows the ruby random tester to use ruby ports that may only
support instr or data requests. This patch is similar to a previous changeset
(8932:1b2c17565ac8) that was unfortunately broken by subsequent changesets.
This current patch implements the support in a more straight-forward way.
The patch also includes better DPRINTFs and generalizes the retry behavior
needed by the ruby tester so that other testers/cpu models can use it as well.
Diffs
-----
configs/example/ruby_random_test.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
configs/ruby/MESI_Three_Level.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
configs/ruby/MESI_Two_Level.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
configs/ruby/MI_example.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
configs/ruby/MOESI_CMP_directory.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
configs/ruby/MOESI_CMP_token.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
configs/ruby/MOESI_hammer.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
src/cpu/testers/rubytest/Check.cc ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
src/cpu/testers/rubytest/CheckTable.cc ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
src/cpu/testers/rubytest/RubyTester.hh ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
src/cpu/testers/rubytest/RubyTester.cc ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
src/cpu/testers/rubytest/RubyTester.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
src/mem/ruby/system/RubyPort.hh ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
src/mem/ruby/system/RubyPort.cc ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
src/mem/ruby/system/Sequencer.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18
Diff: http://reviews.gem5.org/r/2776/diff/
Testing
-------
Thanks,
Tony Gutierrez