Discussion:
[gem5-dev] Review Request: PacketQueue: Add maxQueueDepth parameter and setter
Joel Hestness
2013-03-18 01:05:35 UTC
Permalink
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/1777/
-----------------------------------------------------------

Review request for Default.


Description
-------

Changeset 9587:b989f6228833
---------------------------
PacketQueue: Add maxQueueDepth parameter and setter

The default maximum PacketQueue depth of 100 can cause assertion failures when
the PacketQueue is used in a setting where the maximum queue depth is known to
be greater than 100 (e.g., modeling the queuing of outstanding requests to the
cache from aggressive GPU core implementations). Parameterize this maximum
depth and add a method to set it.
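For illustration, a minimal sketch of the kind of change described
(hypothetical; the real diff is at the review link below, and the names
BoundedPacketQueue/setMaxQueueDepth are assumptions based on the changeset
title):

    #include <cassert>
    #include <cstddef>
    #include <deque>

    // Hypothetical stand-in for gem5's PacketQueue depth limit; not the
    // actual diff. The hard-coded 100 becomes a constructor parameter
    // with a setter for owners that know their worst-case occupancy.
    class BoundedPacketQueue {
      public:
        explicit BoundedPacketQueue(std::size_t maxDepth = 100)
            : maxQueueDepth(maxDepth) {}

        // The added setter: e.g. an aggressive GPU core model can raise
        // the limit to match its number of outstanding requests.
        void setMaxQueueDepth(std::size_t depth) { maxQueueDepth = depth; }

        void schedSend(int pkt) {
            // Stands in for the assertion/panic that fires when the
            // queue grows beyond the configured depth.
            assert(transmitList.size() < maxQueueDepth);
            transmitList.push_back(pkt);
        }

      private:
        std::deque<int> transmitList;
        std::size_t maxQueueDepth;
    };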


Diffs
-----

src/mem/packet_queue.hh 3c62e3b7f658
src/mem/packet_queue.cc 3c62e3b7f658

Diff: http://reviews.gem5.org/r/1777/diff/


Testing
-------


Thanks,

Joel Hestness
Jason Power
2013-03-18 02:43:28 UTC
Permalink
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/1777/#review4135
-----------------------------------------------------------

Ship It!

- Jason Power
Andreas Hansson
2013-03-18 07:32:01 UTC
Permalink
Sorry to cause problems here, guys, but I think we need to fix this another way. The queued port is not supposed to be a "free" storage resource. If it runs out, we should rather block the cache.

Overall, I have been thinking about how we can turn the queued port into something that can signal back that it is blocking. It gets a bit tricky, as you soon end up removing all the functionality of the queued port and are left with the send/retry style interface.

In short: I don't think we should ship this, and I'm keen to hear what people think is a reasonable way to make the queued port "block" its MemObject (and start it again without polling).
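
For reference, a schematic of that send/retry handshake (simplified; these are not the exact gem5 Port signatures):

    struct Packet {};

    // Schematic of send/retry flow control: the receiver refuses packets
    // instead of buffering them invisibly, and the sender holds on to a
    // refused packet until it is told to retry.
    class Receiver {
      public:
        // Return false to apply back pressure.
        bool recvTimingReq(Packet*) {
            if (busy)
                return false;
            busy = true;   // model a single transaction in flight
            return true;
        }
        void complete() { busy = false; /* then signal a retry */ }
      private:
        bool busy = false;
    };

    class Sender {
      public:
        void send(Receiver& r, Packet* pkt) {
            if (!r.recvTimingReq(pkt))
                waiting = pkt;   // hold the packet; no hidden queue
        }
        // Invoked on retry: re-send the previously refused packet.
        void recvRetry(Receiver& r) {
            if (waiting && r.recvTimingReq(waiting))
                waiting = nullptr;
        }
      private:
        Packet* waiting = nullptr;
    };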


- Andreas


Beckmann, Brad
2013-10-08 00:12:25 UTC
Permalink
Hi Andreas,

I know it has been several months since this topic was brought up, but I think it is worth trying to come up with a solution here. I also believe this relates to the gem5-users thread "PacketQueue sanity check on its invisible buffer size" started by Lichen Yu back in May.

Can you clarify why you are pushing back on making this size configurable? The size of 100 seems pretty arbitrary and is far too small if you consider massively threaded requestors, such as GPUs. Also, it doesn't make sense for these queued ports to be finite in size while keeping that size invisible to the requestors.

Ideally, from the Ruby perspective, I would prefer us to simply remove the panic message from packet_queue.cc. There are other mechanisms in place to model finite buffering, and the PacketQueue does not represent the type of hardware buffering we are envisioning.
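
(For context, the check in question looks roughly like this; paraphrased from memory, see src/mem/packet_queue.cc for the exact form:)

    // Approximate form of the sanity check being debated, as found in
    // PacketQueue::schedSendTiming():
    if (transmitList.size() > 100)
        panic("Packet queue %s has grown beyond 100 packets\n", name());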

Does that work for you?

Brad




Andreas Hansson
2013-10-08 17:13:04 UTC
Permalink
Hi Brad,

I am actually inclined to say that the packet queue should only be used in simple devices to model pipeline latency, and that the value 100 is already too large. The only times I've hit this limit are when something in the model is broken and not reacting properly. Recently, we switched the simple fixed-latency memory model (SimpleMemory) to not use a queued port for this very reason. Instead of "magically" storing lots of transactions, it now slows things down through back pressure on requests.
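
In the spirit of that change (not the actual SimpleMemory code), the idea looks something like:

    #include <cstdint>

    using Tick = std::uint64_t;
    struct Packet {};

    // Sketch of a fixed-latency memory exerting back pressure instead of
    // queueing internally: while busy, new requests are refused and the
    // sender must retry, so no hidden buffer ever builds up.
    class FixedLatencyMemory {
      public:
        bool recvTimingReq(Packet*, Tick now) {
            if (now < busyUntil)
                return false;           // refuse; sender retries later
            busyUntil = now + latency;  // occupy the device
            return true;
        }
      private:
        Tick busyUntil = 0;
        static constexpr Tick latency = 30;  // arbitrary access latency
    };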

I suppose the issue you are seeing is in the cache? I am tempted to say we should do something similar to what we did with SimpleMemory for the cache and avoid using the queued ports. If that sounds reasonable to you, I will try to find some time to get this done.

If I'm misunderstanding the use case causing the problem, I'd be keen to know more.

Andreas
Beckmann, Brad
2013-10-09 00:06:47 UTC
Permalink
Yes, Andreas, the Ruby interface uses the PacketQueue for response packets after your changesets 8742 and 8914 (note the modifications to RubyPort.hh). Before you checked in those changes, the RubyPort used a simple timing port for response packets. I believe you made the change so that retries would be avoided.

So what would it take to revert these changes and use a simple slave port for responses? The code modifications appear fairly straightforward, but I'm worried that the CPU models are not prepared to always "sink" responses. I guess that begs the second question of why a CPU or GPU model would not always "sink" a response packet.

Brad



Andreas Hansson
2013-10-09 07:43:14 UTC
Permalink
Hi Brad,

The SimpleTimingPort is a packet queue, so the functionality would have been the same (just with an infinite buffer). Hence, there is no point in reverting any code. Instead, we have to decide what is sensible to model and then change the code to reflect that.

What is the RubyPort actually modelling? Is it a network interface? If so, and the master (CPU/GPU/device/etc.) stalls the response, is there any chance of propagating that stall? If the answer is no, then can we stall new requests coming from the master port instead (so that we know the maximum number of responses, probably below 100 :-))? If the answer to both is no, then we need an infinite queue in the RubyPort, which does feel rather unfortunate (and unrealistic).

Is it hitCallback in RubyPort that is causing the issue? Is there any easy way for me to reproduce it?

Thanks,

Andreas
Beckmann, Brad
2013-10-10 05:20:22 UTC
Permalink
Thanks, Andreas, for the responses. I think we will eventually determine the best solution here if we keep this thread going. Let's keep the information flowing.

So your response brings up several thoughts on my end:

- SimpleTimingPort may be a packet queue today, but back when the RubyPort was created (see 6882), SimpleTimingPort looked more like a "raw" point-to-point port with no internal, invisible, multi-packet buffering. I think that is where we want to get back to, but we need to figure out either how to handle deferred response packets or how to eliminate the possibility of cores rejecting response packets. I would prefer the latter solution, but I'm hoping you or someone else can help explain why cores currently reject response packets, or at least why there is an expectation that responses may be delayed.

- RubyPort's master request port and its slave response port are not symmetric. It is only RubyPort's slave response port that uses the packet queues. RubyPort's master port is a raw port that does return false under certain situations, including resource stalls.

- The RubyPort is used in realistic finite-buffering scenarios, as well as in unrealistic scenarios that must use infinite buffering. The unrealistic, infinite-buffering scenarios are needed for a couple of reasons: 1) When a protocol is first developed, it is often helpful to focus first on getting the logic working correctly; once the initial logic is fleshed out, performance tuning with finite buffering is done. 2) More importantly, we use testers that stress our protocols far beyond what can be achieved by realistic scenarios. In particular, we allow for infinite buffering so that request latencies can be extended and bandwidth bursts can be extreme. This allows us to find protocol logic bugs early and makes supporting multiple protocols under a single infrastructure practical. Writing protocol logic is very difficult, and these aggressive, unrealistic testers are essential.

- The RubyPort is just an interface; it doesn't represent anything in hardware. Ruby's current assumption with hitCallback is that the requestor will not stall a response. The idea is that the requestor should have allocated resources ahead of time, before issuing the request to the memory system.

- Yes, multiple hitCallbacks occurring at the same time cause the packet queue within the RubyPort slave port to panic. Unfortunately, it is not easy for me to provide you with a simple way to reproduce the error. However, I don't know that reproducing it is essential; instead, I think we need to focus on how to resolve the bigger design inconsistencies in the RubyPort - M5 port interface.

Going forward, I see a few options:

- Simply remove the panic message in the packet queue. That is essentially the quick fix we've done at AMD, and I suspect it is what the Wisconsin folks have effectively done as well.
- Remove any possibility for cores to stall/reject response packets, and keep the same hitCallback behavior as today.
- Modify hitCallback to support stalled responses. This will certainly take a lot of work; I think we need to better understand the assumptions that the cores have regarding retries of the same versus different packets.

Thoughts? I hope that made sense to you. Let me know if anything was unclear.

Thanks,

Brad


Andreas Hansson
2013-10-10 07:59:48 UTC
Permalink
Hi Brad,

I'm sure we'll get there :-)

1) When it comes to the SimpleTimingPort, I am not sure I follow. Even in the changeset you mention, it has an unbounded transmit list. Perhaps I'm missing something, but the only way it can simplify flow control is to stick an infinite buffer somewhere...

2) To make life a bit simpler, I tidied up the RubyPort flow control and routing towards the normal gem5 ports: http://reviews.gem5.org/r/2039/. I hope this patch can be the start of a more targeted effort to fix your issue.

3) What requestor is it that is stalling your responses? Is it not simply an issue of clock speeds or similar causing responses to queue up? I could also imagine that even if a CPU/$ or similar had space for a response, it might not sink it straight away. Typically, reserving response space only guarantees the consumption assumption: a promise that the response will eventually be consumed without needing any further requests to go out. Agreed?

Concerning your suggestions:

4) I'm strongly against removing the panic, as it has already unearthed unintentional behaviour and bugs in other places.

5) See (3) concerning removing the possibility to stall responses. I don't think this is a desirable or achievable assumption.

6) Ultimately, I think the ports have to deal with flow control. A simple fix after my aforementioned patch would be to control the number of outstanding transactions in the RubyPort. I guess 100 would already be a very large value?
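
A minimal sketch of that kind of limit (hypothetical; not the patch itself):

    struct Packet {};

    // Hypothetical sketch of capping outstanding transactions at the
    // RubyPort boundary, as suggested in (6); not actual gem5 code.
    class OutstandingLimiter {
      public:
        explicit OutstandingLimiter(unsigned max) : maxOutstanding(max) {}

        // Refuse new requests once the limit is reached; the sender then
        // falls back on the normal send/retry handshake.
        bool recvTimingReq(Packet*) {
            if (outstanding >= maxOutstanding)
                return false;
            ++outstanding;
            return true;
        }

        // Each response frees a slot, so the number of responses that
        // can ever queue up is bounded by construction.
        void recvResponse() { --outstanding; }

      private:
        unsigned outstanding = 0;
        const unsigned maxOutstanding;
    };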


Andreas