Joel Hestness
2016-02-01 23:24:04 UTC
Hi Andreas,
I'd like to circle back on the thread about removing the QueuedSlavePort
response queue from DRAMCtrl. I've been working to shift over to DRAMCtrl
from the RubyMemoryControl, but nearly all of my simulations now crash
on the DRAMCtrl's response queue. Since I need the DRAMCtrl to work, I'll
be looking into this now. However, based on my inspection of the code, it
looks pretty non-trivial to remove the QueuedSlavePort, so I'm hoping you
can at least help me work through the changes.
To reproduce the issue, I've put together a slim gem5 patch (attached) to
use the memtest.py script to generate accesses. Here's the command line I
used:
% build/X86/gem5.opt --debug-flag=DRAM --outdir=$outdir \
    configs/example/memtest.py -u 100
If you're still willing to take a stab at it, let me know if/how I can
help. Otherwise, I'll start working on it. It seems the trickiest thing is
going to be modeling the arbitrary frontendLatency and backendLatency while
still counting all of the accesses that are in the controller when it needs
to block back to the input queue. These latencies are currently assessed
with scheduling in the port response queue. Any suggestions you could give
would be appreciated.
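
To make the accounting concrete, here is a minimal sketch of the idea in plain Python rather than gem5 code. Every name in it (BoundedController, max_in_flight, recv_request, send_responses, peer_can_accept) is hypothetical and chosen for illustration only. It assumes a single combined delay of frontend plus backend latency per access, and it counts an access as in flight from acceptance until its response actually leaves, so the limit also covers responses that are ready but stuck behind a blocked peer.

```python
import heapq


class BoundedController:
    """Toy model: refuse new requests once too many accesses are in flight."""

    def __init__(self, frontend_latency, backend_latency, max_in_flight):
        self.frontend_latency = frontend_latency
        self.backend_latency = backend_latency
        self.max_in_flight = max_in_flight
        self.in_flight = 0         # accepted accesses whose response has not left
        self.pending = []          # min-heap of (ready_tick, request_id)
        self.retry_needed = False  # a sender was refused and is waiting

    def recv_request(self, now, request_id):
        """Return True if accepted, False if the sender must retry later."""
        if self.in_flight >= self.max_in_flight:
            self.retry_needed = True
            return False
        self.in_flight += 1
        ready = now + self.frontend_latency + self.backend_latency
        heapq.heappush(self.pending, (ready, request_id))
        return True

    def send_responses(self, now, peer_can_accept=True):
        """Send every due response unless the peer is blocked.

        Returns (sent_ids, retry), where retry tells the caller that a
        previously refused sender may now try again.
        """
        sent = []
        while self.pending and self.pending[0][0] <= now and peer_can_accept:
            _, request_id = heapq.heappop(self.pending)
            self.in_flight -= 1
            sent.append(request_id)
        retry = self.retry_needed and self.in_flight < self.max_in_flight
        if retry:
            self.retry_needed = False
        return sent, retry
```

With a limit of two, a third request is refused until one response has been sent, and a blocked peer keeps ready responses counted against the limit instead of letting them pile up in an unbounded queue.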
Thanks!
Joel
Below here is our conversation from the email thread "[gem5-dev] Review
Request 3116: ruby: RubyMemoryControl delete requests"
On Wed, Sep 23, 2015 at 3:51 PM, Andreas Hansson <***@arm.com>
wrote:
> Great. Thanks Joel.
>
> If anything pops up on our side I’ll let you know.
>
> Andreas
>
> From: Joel Hestness <***@gmail.com>
> Date: Wednesday, 23 September 2015 20:29
>
> To: Andreas Hansson <***@arm.com>
> Cc: gem5 Developer List <gem5-***@gem5.org>
> Subject: Re: [gem5-dev] Review Request 3116: ruby: RubyMemoryControl
> delete requests
>
>
>
>> I don’t think there is any big difference in our expectations, quite the
>> contrary :-). GPUs are very important to us (and so is throughput computing
>> in general), and we run plenty of simulations with lots of memory-level
>> parallelism from non-CPU components. Still, we haven’t run into the issue.
>>
>
> Ok, cool. Thanks for the context.
>
>
>> If you have practical examples that run into problems let me know, and
>> we’ll get it fixed.
>>
>
> I'm having trouble assembling a practical example (with or without using
> gem5-gpu). I'll keep you posted if I find something reasonable.
>
> Thanks!
> Joel
>
>
>
>> From: Joel Hestness <***@gmail.com>
>> Date: Tuesday, 22 September 2015 19:58
>>
>> To: Andreas Hansson <***@arm.com>
>> Cc: gem5 Developer List <gem5-***@gem5.org>
>> Subject: Re: [gem5-dev] Review Request 3116: ruby: RubyMemoryControl
>> delete requests
>>
>> Hi Andreas,
>>
>>
>>> If it is a real problem affecting end users I am indeed volunteering to
>>> fix the DRAMCtrl use of QueuedSlavePort. In the classic memory system there
>>> are enough points of regulation (LSQs, MSHR limits, crossbar layers etc)
>>> that having a single memory channel with >100 queued up responses waiting
>>> to be sent is extremely unlikely. Hence, until now the added complexity has
>>> not been needed. If there is regulation on the number of requests in Ruby,
>>> then I would argue that it is equally unlikely there…I could be wrong.
>>>
>>
>> Ok. I think a big part of the difference between our expectations is just
>> the cores that we're modeling. AMD and gem5-gpu can model aggressive GPU
>> cores with potential to expose, perhaps, 4-32x more memory-level parallel
>> requests than a comparable number of multithreaded CPU cores. I feel that
>> this difference warrants different handling of accesses in the memory
>> controller.
>>
>> Joel
>>
>>
>>
>>> From: Joel Hestness <***@gmail.com>
>>> Date: Tuesday, 22 September 2015 17:48
>>>
>>> To: Andreas Hansson <***@arm.com>
>>> Cc: gem5 Developer List <gem5-***@gem5.org>
>>> Subject: Re: [gem5-dev] Review Request 3116: ruby: RubyMemoryControl
>>> delete requests
>>>
>>> Hi Andreas,
>>>
>>> Thanks for the "ship it!"
>>>
>>>
>>>> Do we really need to remove the use of QueuedSlavePort in DRAMCtrl? It
>>>> will make the controller more complex, and I don’t want to do it “just in
>>>> case”.
>>>>
>>>
>>> Sorry, I misread your email as offering to change the DRAMCtrl. I'm not
>>> sure who should make that change, but I think it should get done. The
>>> memory access response path starts at the DRAMCtrl and ends at the
>>> RubyPort. If we add flow control to the RubyPort, packets will probably
>>> back up more quickly on the response path back to where there are open
>>> buffers. I expect the DRAMCtrl QueuedPort problem becomes more prevalent as
>>> Ruby adds flow control, unless we add a limitation on outstanding requests
>>> to memory from directory controllers.
>>>
>>> How does the classic memory model deal with this?
>>>
>>> Joel
>>>
>>>
>>>
>>>> From: Joel Hestness <***@gmail.com>
>>>> Date: Tuesday, 22 September 2015 17:30
>>>> To: Andreas Hansson <***@arm.com>
>>>> Cc: gem5 Developer List <gem5-***@gem5.org>
>>>>
>>>> Subject: Re: [gem5-dev] Review Request 3116: ruby: RubyMemoryControl
>>>> delete requests
>>>>
>>>> Hi guys,
>>>> Thanks for the discussion here. I had quickly tested other memory
>>>> controllers, but hadn't connected the dots that this might be the same
>>>> problem Brad/AMD are running into.
>>>>
>>>> My preference would be that we remove the QueuedSlavePort from the
>>>> DRAMCtrls. That would at least eliminate DRAMCtrls as a potential source of
>>>> the QueuedSlavePort packet overflows, and would allow us to more closely
>>>> focus on the RubyPort problem when we get to it.
>>>>
>>>> Can we reach resolution on this patch though? Are we okay with
>>>> actually fixing the memory leak in mainline?
>>>>
>>>> Joel
>>>>
>>>>
>>>> On Tue, Sep 22, 2015 at 11:19 AM, Andreas Hansson <
>>>> ***@arm.com> wrote:
>>>>
>>>>> Hi Brad,
>>>>>
>>>>> We can remove the use of QueuedSlavePort in the memory controller and
>>>>> simply not accept requests if the response queue is full. Is this
>>>>> needed? If so we’ll make sure someone gets this in place. The only
>>>>> reason we haven’t done it is because it hasn’t been needed.
>>>>>
>>>>> The use of QueuedPorts in the Ruby adapters is a whole different
>>>>> story. I think most of these can be removed and actually use flow
>>>>> control. I’m happy to code it up, but there is such a flux at the
>>>>> moment that I didn’t want to post yet another patch changing the Ruby
>>>>> port. I really do think we should avoid having implicit buffers for
>>>>> 1000s of kilobytes to the largest extent possible. If we really need a
>>>>> constructor parameter to make it “infinite” for some quirky Ruby
>>>>> use-case, then let’s do that...
>>>>>
>>>>> Andreas
>>>>>
>>>>>
>>>>> On 22/09/2015 17:14, "gem5-dev on behalf of Beckmann, Brad"
>>>>> <gem5-dev-***@gem5.org on behalf of ***@amd.com> wrote:
>>>>>
>>>>> >From AMD's perspective, we have deprecated our usage of
>>>>> >RubyMemoryControl and we are using the new Memory Controllers with
>>>>> >the port interface.
>>>>> >
>>>>> >That being said, I completely agree with Joel that the packet queue
>>>>> >finite invisible buffer limit of 100 needs to go! As you know, we
>>>>> >tried very hard several months ago to essentially make this an
>>>>> >infinite buffer, but Andreas would not allow us to check it in. We
>>>>> >are going to post that patch again in a few weeks when we post our
>>>>> >GPU model. Our GPU model will not work unless we increase that limit.
>>>>> >
>>>>> >Andreas, you keep arguing that if you exceed that limit, something
>>>>> >is fundamentally broken. Please keep in mind that there are many uses
>>>>> >of gem5 beyond what you use it for. Also, this is a research simulator
>>>>> >and we should not restrict ourselves to what we think is practical in
>>>>> >real hardware. Finally, the fact that the finite limit is invisible to
>>>>> >the producer is just bad software engineering.
>>>>> >
>>>>> >I beg you to please allow us to remove this finite invisible limit!
>>>>> >
>>>>> >Brad
>>>>> >
>>>>> >
>>>>> >
>>>>> >-----Original Message-----
>>>>> >From: gem5-dev [mailto:gem5-dev-***@gem5.org] On Behalf Of Andreas Hansson
>>>>> >Sent: Tuesday, September 22, 2015 6:35 AM
>>>>> >To: Andreas Hansson; Default; Joel Hestness
>>>>> >Subject: Re: [gem5-dev] Review Request 3116: ruby: RubyMemoryControl
>>>>> >delete requests
>>>>> >
>>>>> >
>>>>> >
>>>>> >> On Sept. 21, 2015, 8:42 a.m., Andreas Hansson wrote:
>>>>> >> > Can we just prune the whole RubyMemoryControl rather? Has it not
>>>>> >> > been deprecated long enough?
>>>>> >>
>>>>> >> Joel Hestness wrote:
>>>>> >> Unless I'm overlooking something, for Ruby users, I don't see other
>>>>> >> memory controllers that are guaranteed to work. Besides
>>>>> >> RubyMemoryControl, all others use a QueuedSlavePort for their input
>>>>> >> queues. Given that Ruby hasn't added complete flow control,
>>>>> >> PacketQueue size restrictions can be exceeded (triggering the
>>>>> >> panic). This occurs infrequently/irregularly with aggressive GPUs
>>>>> >> in gem5-gpu, and appears difficult to fix in a systematic way.
>>>>> >>
>>>>> >> Regardless of the fact we've deprecated RubyMemoryControl, this is
>>>>> >> a necessary fix.
>>>>> >
>>>>> >No memory controller is using QueuedSlavePort for any _input_ queues.
>>>>> >The DRAMCtrl class uses it for the response _output_ queue, that's
>>>>> >all. If that is really an issue we can move away from it and enforce
>>>>> >an upper bound on responses by not accepting new requests. That said,
>>>>> >if we hit the limit I would argue something else is fundamentally
>>>>> >broken in the system and should be addressed.
>>>>> >
>>>>> >In any case, the discussion whether to remove RubyMemoryControl or not
>>>>> >should be completely decoupled.
>>>>> >
>>>>> >
>>>>> >- Andreas
>>>>>
>>>>
--
Joel Hestness
PhD Candidate, Computer Architecture
Dept. of Computer Science, University of Wisconsin - Madison
http://pages.cs.wisc.edu/~hestness/