Discussion:
GEM5 Issues
Beckmann, Brad
2009-07-31 00:46:37 UTC
Hi All,

Tushar and I have noticed a few issues with GEM5 and we were wondering
if anyone had plans to fix them. If not, we'll go ahead and check in
fixes to the global repository. I just want to make sure we're not
stepping on anyone's toes.

- Fix SLICC so that protocol generated files are created in separate
protocol specific folders. Currently they are created in the same
mem/protocol folder, overwriting older files (e.g.
build/ALPHA_SE/mem/protocol//ControllerFactory.cc).
- Fully integrate SLICC parsing errors into scons. Right now the error
messages don't give any useful info. Should we hold off on fixing this
until SLICCer is added to GEM5?
- Support broadcast protocols with NetDest ruby functions. I noticed
that one of Derek's recent checkins removed all the NetDest functions in
RubySlicc_ComponentMapping.hh. Most of them were old, protocol
specific, and needed to go. However, there were a few general ones that
are needed by any broadcast based protocol. Are there plans to add
these back into the tree?
- I appreciate that a lot of the controller assumptions on the ruby side
have been removed. However, it appears that Ruby now assumes a DMA
controller is present. Are there plans to clean this up?

Again, we're happy to fix this stuff up ourselves. I just want to be
sure there aren't plans already in progress.

Thanks,

Brad
nathan binkert
2009-07-31 00:55:27 UTC
I have no plans to fix any of this, but please do not just check in
fixes without sending out patches for review. I'm particularly
interested in the first two. For the generated protocol code, are you
trying to simultaneously compile multiple protocols into the same
binary, or just trying to make it so you don't have to recompile if you
switch between the two? I don't know if you can do the latter without
doing the former because of how scons does dependencies. I can say
that I do know how to do the former.

For the error messages, I see no reason not to fix this now.

Nate
_______________________________________________
m5-dev mailing list
http://m5sim.org/mailman/listinfo/m5-dev
Derek Hower
2009-07-31 05:34:29 UTC
---------- Forwarded message ----------
From: Derek Hower <derek.hower-***@public.gmane.org>
To: M5 Developer List <m5-dev-***@public.gmane.org>
Date: Thu, 30 Jul 2009 23:24:08 -0500
Subject: Re: [m5-dev] GEM5 Issues
Post by Beckmann, Brad
Hi All,
Tushar and I have noticed a few issues with GEM5 and we were wondering
if anyone had plans to fix them.  If not, we'll go ahead and check in
fixes to the global repository.  I just want to make sure we're not
stepping on anyone's toes.
Glad to get some help!!
Post by Beckmann, Brad
- Fix SLICC so that protocol generated files are created in separate
protocol specific folders. Currently they are created in the same
mem/protocol folder overwriting older files (eg.
build/ALPHA_SE/mem/protocol//ControllerFactory.cc).
Go ahead with this one.
Post by Beckmann, Brad
- Fully integrate SLICC parsing errors into scons.  Right now the error
messages don't give any useful info.  Should we hold off on fixing this
until SLICCer is added to GEM5?
I wouldn't hold off.  SLICCer is still indefinitely postponed, at
least on my end.  I don't expect to work on it until November-ish.
Even after SLICCer is around, I think it would be wise to maintain
SLICC support for a while so this would be a good fix.
Post by Beckmann, Brad
- Support broadcast protocols with NetDest ruby functions.  I noticed
that one of Derek's recent checkins removed all the NetDest functions in
RubySlicc_ComponentMapping.hh.  Most of them were old, protocol
specific, and needed to go.  However, there were a few general ones that
are needed by any broadcast based protocol.  Are there plans to add
these back into the tree?
Those weren't just removed because they are old.  A lot of them were
removed because they call a function similar to
"mapL2toDirectoryNode," which assumes that there *is* an L2 in the
system.  We can put these functions back in (and I was planning on
doing it as protocols are revisited), but we need to do it in a more
generic way.

For example, one of the things we are doing to get MESI_CMP_directory
working again is removing map_L1CacheMachId_to_L2cache and replacing
it with mapToAddressRange(low_bit, high_bit).  Low bit and high bit
are set at configuration time, and correspond to the address bits that
select the L2 bank.
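
To make the bit-selection idea concrete, here is a minimal Python sketch (not the actual Ruby code) of how a helper along the lines of mapToAddressRange could pick an L2 bank from an address. The function name `map_to_address_range`, the inclusive treatment of both bit positions, and the example block size and bank count are assumptions made for illustration.

```python
def map_to_address_range(addr, low_bit, high_bit):
    """Return the bank index encoded in address bits low_bit..high_bit.

    Assumption: both bit positions are inclusive and bit 0 is the least
    significant bit; the real Ruby helper may define the range differently.
    """
    if low_bit < 0 or high_bit < low_bit:
        raise ValueError("need 0 <= low_bit <= high_bit")
    width = high_bit - low_bit + 1
    return (addr >> low_bit) & ((1 << width) - 1)


# Hypothetical configuration: 64-byte blocks (offset bits 0-5) and 8 L2 banks
# selected by the next three address bits (6-8).
bank = map_to_address_range(0x1C0, low_bit=6, high_bit=8)
```

Because the bit positions are plain parameters, nothing in the helper itself assumes that a particular cache level exists; the configuration decides which bits matter.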

If you could be more specific about which functions you're concerned
about, we could brainstorm and come up with a new way to do the
mapping.
Post by Beckmann, Brad
- I appreciate that a lot of the controller assumptions on the ruby side
have been removed.  However, it appears that Ruby now assumes a DMA
controller is present.  Are there plans to clean this up?
I don't think Ruby actually assumes that one is present (if it does
it shouldn't).  The existing configuration files certainly do, though,
mostly because Rocks can't run without one.  If you create a
configuration that doesn't instantiate a DMA controller, I think
you'll be OK.

The reason Rocks needs a DMA controller is mostly due to the fact that
Ruby handles data now and can't just ignore DMA accesses like it used
to.  I don't know enough about what M5 does with DMA to know if that
is a problem or not.
Beckmann, Brad
2009-08-01 00:40:04 UTC
Thanks Derek and Nate for the replies. Overall, it sounds like no one is directly working on the four issues I outlined below, so I'll put them on my to do list. The NetDest and DMA controller issues are most pressing, so I'll work on those first. Later on I'll work on the SLICC errors and protocol folder issues. Below is a more thorough response to your questions.

- Supporting multiple protocol folders.

To answer Nate's question, it is the latter reason. Because SLICC generates different classes with the same name depending on the protocol, I don't think it is possible to do the former. Could we generate all protocol files outside of the build directory and then tell scons to copy only the necessary protocol files to the appropriate build directory? For example, have SLICC generate the MI_example and MESI_CMP_directory protocols in their respective directories: src/mem/protocol/MI_example and src/mem/protocol/MESI_CMP_directory. Then have scons copy the MI_example directory to the build directory build/ALPHA_SE_MI_example/mem/protocol and copy the MESI_CMP_directory directory to build/ALPHA_SE_MESI_CMP_directory/mem/protocol, without copying the other protocol's generated files. That way, scons can maintain that the m5 executables under build/ALPHA_SE_MI_example/ and build/ALPHA_SE_MESI_CMP_directory/ depend only on the files listed in those directories. Does that make sense? Will that work with scons?

- SLICC generation parsing error support in scons

Of the four issues I listed, this is definitely the lowest priority, so I'm happy to put it on the back burner. I just wanted to point out that right now the ParseError just provides a file name and the LexToken that describes the error. If you know that the third parameter of the LexToken is the line number, then that is all the information an experienced SLICC programmer needs to debug. However, I'm concerned for novice SLICC programmers who may need more information.
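
As a rough illustration of what a friendlier message could look like, the Python sketch below turns a file name and a token into a compiler-style error line. It assumes the token exposes the same fields a PLY LexToken prints (type, value, lineno, lexpos); the `Token` stand-in, the `format_parse_error` name, and the example file name and values are hypothetical and are not taken from the SLICC sources.

```python
from collections import namedtuple

# Stand-in for a PLY LexToken, which prints as LexToken(type,value,lineno,lexpos).
Token = namedtuple("Token", ["type", "value", "lineno", "lexpos"])


def format_parse_error(filename, token):
    """Build a compiler-style message from a file name and a token."""
    return "%s:%d: syntax error at %r (token type %s)" % (
        filename, token.lineno, token.value, token.type)


# Hypothetical file name and token values, for illustration only.
message = format_parse_error("MI_example-cache.sm", Token("IDENT", "foo", 42, 1034))
```

Putting the line number directly after the file name means a novice does not need to know which LexToken field holds it.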

- NetDest support in RubySlicc_ComponentMapping.hh

The functionality I'm particularly concerned about is setting the destination of a message so that it broadcasts to all L1 caches or all directories. Before, all of those helper functions were in RubySlicc_ComponentMapping.hh. If we don't want those helper functions on the ruby side any more, we could add them to the generated MachineType.cc file by hacking the "if (m_isMachineType)" statement in slicc/symbols/Type.cc. It wouldn't be the first hack added to that file or the ugliest hack in SLICC. :) If you have a suggestion for a more elegant solution, I'm happy to hear it.
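
One possible shape for a generic broadcast helper is sketched below in Python. This is not the NetDest API: the `broadcast` function, the dictionary of machine counts, and the machine type names are all assumptions for illustration. The point is only that the destination set is built from whatever machine types the configuration declares, so the helper does not assume an L2 or a DMA controller exists.

```python
def broadcast(machine_counts, machine_type):
    """Return the set of (machine_type, index) pairs for every machine of a type.

    machine_counts maps a machine type name to how many instances exist;
    a missing type (or a count of zero) yields an empty destination set.
    """
    return {(machine_type, i) for i in range(machine_counts.get(machine_type, 0))}


# Hypothetical system: four L1 caches, two directories, no L2 or DMA controller.
counts = {"L1Cache": 4, "Directory": 2}
all_l1s = broadcast(counts, "L1Cache")
no_l2s = broadcast(counts, "L2Cache")
```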

- DMA assumptions in Ruby.

If you look at line 84 in my version of ruby/slicc_interface/RubySlicc_ComponentMapping.hh, you'll see a direct reference to DMA. I am confused about whether we're still sticking to the strategy of using "#define"s for machine types to handle these errors, or whether you have plans to deprecate the "#define"s. I thought part of the motivation for removing the NetDest functions in RubySlicc_ComponentMapping.hh was that they used the "#define"s? Even if you add a DMA #define, ruby/system/DMASequencer.cc relies on the DMARequestMsg type and various other DMA classes. Could the DMASequencer use the CacheMsg class instead? We could rename CacheMsg to SequencerMsg. Or we could just move DMARequest and the other DMA support to RubySlicc_Exports.sm?

Brad


nathan binkert
2009-08-01 01:18:19 UTC
Post by Beckmann, Brad
To answer Nate's question, it is the latter reason.  Because SLICC generates
different classes with the same name depending on the protocol, I don't think
it is possible to do the former.
The difficulty with the former is not the filenames, but symbols. It
would be easy enough to do src/mem/protocol/MI_example and
src/mem/protocol/MESI_CMP_directory and pick which set of
autogenerated files to link into the binary based on the PROTOCOL
variable. What I don't know is whether SCons will force a
recompile of those files anyway when you switch PROTOCOL. We could
try. This would not give you two different m5.opt binaries when you
compile, but would potentially save time when you switched between
them by not recompiling stuff.
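
The selection step described here can be sketched in a few lines of Python. This is only an illustration of choosing one protocol's generated files by directory; it says nothing about whether SCons would recompile them. The function names and the example file listing are hypothetical.

```python
import posixpath


def generated_protocol_dir(protocol, base="src/mem/protocol"):
    """Return the directory that would hold one protocol's generated files."""
    if not protocol or "/" in protocol:
        raise ValueError("protocol must be a simple name")
    return posixpath.join(base, protocol)


def sources_to_link(protocol, generated):
    """Pick only the generated files that live under the chosen protocol's directory."""
    prefix = generated_protocol_dir(protocol) + "/"
    return sorted(path for path in generated if path.startswith(prefix))


# Hypothetical listing of generated files for two protocols.
files = [
    "src/mem/protocol/MI_example/ControllerFactory.cc",
    "src/mem/protocol/MESI_CMP_directory/ControllerFactory.cc",
]
chosen = sources_to_link("MI_example", files)
```

Both protocols can keep a file with the same name because each lives in its own directory; the symbol conflicts mentioned above would still prevent linking both sets into one binary.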
Post by Beckmann, Brad
Could we generate all protocol files outside of the build directory and
then tell scons to copy only the necessary protocol files to the
appropriate build directory?
I don't think so. SCons tends to be smart about things and it gets in
your way when you try to play tricks. Right now, we do have a similar
problem when we build, say, both ALPHA_SE and ALPHA_FS, but what
actually happens under the covers is that every file that is shared
between the two (without any differing #defines) is still compiled
twice. I'm nearly done with a big change to SCons that fixes this
problem. I can work the protocol into this framework as well.

The real solution in the long run is to allow multiple protocols to be
compiled into a single M5 binary. This isn't hugely difficult, but
would require some work because as things stand now, there will be
symbol conflicts. The solution is to make generated code and code
that uses generated code take the PROTOCOL as a template parameter.
I'll have to look through ruby/slicc to see how much work this will
be.
Post by Beckmann, Brad
- SLICC generation parsing error support in scons
Of the four issues I listed, this is definitely the lowest priority, so I'm happy to
put it on the back burner.  I just wanted to point out that right now the ParseError
just provides a file name and the LexToken that describes the error.  If you
know that the third parameter of the LexToken is the line number, then that
is all the information an experienced SLICC programmer needs to debug.
However, I'm concerned for novice SLICC programmers who may need
more information.
This has to do with the fact that there are actually two slicc
parsers. One written in python used by SCons, and a second which is
the actual slicc parser. I created the first because SCons needs to
know the names of the files that slicc will generate and I couldn't
actually get that information without running slicc. I tried to hack
it in, but in the end, it was just easier to do what I did.
Generally, syntax errors are the only thing that should give you these
sorts of error messages and I can improve them. If there are worse
errors, slicc itself will run and give you the more detailed error
message. Is this situation really bad? Does SLICC give better
messages (other than just reformatting them to include the line and
column number) on syntax errors?

I was intending to go back and look at slicc itself and see if I could
modularize it a bit better to do what the python code does. I could
also see what I can do about generating code that uses templates.

Nate
Beckmann, Brad
2009-08-04 19:30:27 UTC
Now that the other GEM5 issues are mostly resolved (thanks everyone!), I wanted to discuss the protocol directory issue in more detail. I'm completely fine with recompiling all files when changing protocols. I would just like the ability to have two different M5 binaries, using different protocols, co-exist in the same M5 tree. A solution similar to the current ALPHA_SE and ALPHA_FS binaries would be great.

I want to better understand the motivation to compile multiple protocols into the same M5 binary. In my opinion, this comes down to the tradeoff of compile complexity and time vs. binary flexibility and testing. From my perspective, there is a lot of work involved to change the Ruby/SLICC interface to support a protocol template parameter, though I'm sure with enough effort it could be done. I assume that each protocol could be selectively compiled into the m5 binary and there would be some way to identify what specific protocols an M5 binary supports. Is that correct? So I'm curious to understand why we want to treat protocols differently from the compile time options like the ISA and SE vs. FS. All of these features seem to involve a lot of specific code used only by that feature. To me that seems to best match making them all compile time features, but I probably don't understand all the issues involved.

Brad


nathan binkert
2009-08-04 20:33:46 UTC
Post by Beckmann, Brad
Now that the other GEM5 issues are mostly resolved (thanks everyone!), I wanted to discuss the protocol directory issue in more detail. I'm completely fine with recompiling all files when changing protocols. I would just like the ability to have two different M5 binaries, using different protocols, co-exist in the same M5 tree. A solution similar to the current ALPHA_SE and ALPHA_FS binaries would be great.
This can be done right now by using a different build directory for
each protocol. You're used to having a scons target of
"build/ALPHA_FS/..."; you can, however, do something like
build/<protocol>/build/ALPHA_FS and that will cause things to be built
in a different directory. You can even specify a full path that is
not a subdirectory of the m5 dir.

I should note that there is a limitation in the current SCons system
in that the directory that ALPHA_FS, etc. lives in must be called
build. (The new version of our stuff will have this limitation removed.)
Post by Beckmann, Brad
I want to better understand the motivation to compile multiple protocols into the same M5 binary. In my opinion, this comes down to the tradeoff of compile complexity and time vs. binary flexibility and testing. From my perspective, there is a lot of work involved to change the Ruby/SLICC interface to support a protocol template parameter, though I'm sure with enough effort it could be done. I assume that each protocol could be selectively compiled into the m5 binary and there would be some way to identify what specific protocols an M5 binary supports. Is that correct? So I'm curious to understand why we want to treat protocols differently from the compile time options like the ISA and SE vs. FS. All of these features seem to involve a lot of specific code used only by that feature. To me that seems to best match making them all compile time features, but I probably don't understand all the issues involved.
I actually want to get rid of ISA and SE/FS as compile time
parameters. It's a long term goal, but I'd prefer not to do things
that make it worse. Basically, the problem is that m5 currently
rebuilds everything if any compile time parameter changes. Because of
this, as we add more compile time parameters, less of the simulator
will be properly tested. (I'm pretty sure that few people compile all
combinations of ISA and emulation.) As protocols multiply, this will
get worse. Another downside of having multiple binaries is that it's
easier to accidentally have one that is out of date. This can be
particularly troublesome when running hundreds of simulations.
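
The growth in build variants is easy to see with a quick count. The Python sketch below uses illustrative values only: ALPHA, SE/FS, and the three protocol names appear in this thread, while OTHER_ISA is a placeholder for any additional ISA.

```python
from itertools import product

# Illustrative values only; OTHER_ISA is a placeholder, not a real build option.
isas = ["ALPHA", "OTHER_ISA"]
modes = ["SE", "FS"]
protocols = ["MI_example", "MESI_CMP_directory", "MOESI_CMP_directory"]

# Each compile-time parameter multiplies the number of distinct binaries.
without_protocol = ["%s_%s" % combo for combo in product(isas, modes)]
with_protocol = ["%s_%s_%s" % combo for combo in product(isas, modes, protocols)]

print(len(without_protocol), len(with_protocol))  # prints: 4 12
```

With these assumed lists, adding the protocol as a compile-time parameter triples the number of binaries that would need to be built and tested.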

I'm nearly done with a big overhaul of our SCons code that at least
will not recompile files that don't depend on ISA or emulation or
protocol. (The #defines will have to be explicitly #included, and
options to the Source directive in scons determine which #defines are
available.) This will significantly improve compile time and will
probably solve most of the problem. We could stop here, but once this
is done, it might not be hugely difficult to compile in multiple
protocols if we just wrap the code in a protocol dependent namespace.
Similarly, this might be a short cut for getting rid of ISA and FS/SE.

Nate
Steve Reinhardt
2009-08-04 21:03:32 UTC
Post by nathan binkert
This can be done right now by using a different build directory for
each protocol.  You're used to having a scons target of
"build/ALPHA_FS/...", you can however do something like
build/<protocol>/build/ALPHA_FS and that will cause things to be built
in a different directory.  You can even specify a full path that is
not a subdirectory of the m5 dir.
I don't think even this is necessary... if you say something like

scons default=ALPHA_FS build/ALPHA_FS_P1/m5.opt PROTOCOL=P1
scons default=ALPHA_FS build/ALPHA_FS_P2/m5.opt PROTOCOL=P2

... that should set up ALPHA_FS_P1 and ALPHA_FS_P2 to do just what you
want (without needing the extra args after the first time, i.e., just
say "scons build/ALPHA_FS_P1/m5.opt").

This is all off the top of my head, so the syntax may be slightly off
(esp. the protocol specification part), but the basic idea is
supported.
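
For anyone scripting several protocol builds, the two commands above can be generated rather than typed. The Python sketch below only assembles the argument lists and does not run scons; the `scons_command` helper is hypothetical, and the argument syntax simply mirrors the commands quoted above, which may themselves be slightly off.

```python
def scons_command(base_config, variant, protocol, binary="m5.opt"):
    """Assemble the argument list for building one protocol-specific variant."""
    return [
        "scons",
        "default=%s" % base_config,          # lowercase: names the base configuration
        "build/%s/%s" % (variant, binary),   # target inside the new build directory
        "PROTOCOL=%s" % protocol,            # uppercase: m5 configuration switch
    ]


# P1 and P2 are the placeholder protocol names used in the commands above.
commands = [
    " ".join(scons_command("ALPHA_FS", "ALPHA_FS_%s" % proto, proto))
    for proto in ("P1", "P2")
]
```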

There's a little more detail on this page; look for the bullet that
starts "You can define your own configurations":

http://m5sim.org/wiki/index.php/SCons_build_system


Steve
Derek Hower
2009-08-05 17:11:40 UTC
I tried this with the command:

scons -j 12 DEFAULT=ALPHA_SE build/LIBRUBY_MOESI_CMP_directory/libm5_opt.so PROTOCOL=MOESI_CMP_directory RUBY=True USE_MYSQL=No

but it fails with the following:

Error: cannot find variables file /afs/cs.wisc.edu/p/multifacet/users/drh5/gem5/build/variables/LIBRUBY_MOESI_CMP_directory or build_opts/LIBRUBY_MOESI_CMP_directory

Is there something else that needs to be done with this variables file?

-Derek
Steve Reinhardt
2009-08-05 17:20:23 UTC
SCons arguments are case-sensitive, and "default" should be lower case.

I don't know that it's an official convention, but loosely speaking
the all-uppercase arguments are m5 configuration switches, while the
lowercase ones control scons (or the build process) itself (the only
other one I can think of is update_ref, which updates the regression
outputs).
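
A small check along these lines can catch the mistake before a long build starts. The Python sketch below sorts key=value arguments by the loose case convention just described; the `split_scons_args` helper is hypothetical and is not part of scons or the m5 build scripts.

```python
def split_scons_args(args):
    """Separate arguments by the loose case convention described above.

    Returns (build_controls, config_switches, targets): lowercase keys,
    all-uppercase keys, and everything that is not key=value.
    """
    build_controls, config_switches, targets = {}, {}, []
    for arg in args:
        key, sep, value = arg.partition("=")
        if not sep:
            targets.append(arg)
        elif key.isupper():
            config_switches[key] = value
        else:
            build_controls[key] = value
    return build_controls, config_switches, targets


controls, switches, targets = split_scons_args([
    "DEFAULT=ALPHA_SE",
    "build/LIBRUBY_MOESI_CMP_directory/libm5_opt.so",
    "PROTOCOL=MOESI_CMP_directory",
])
# "DEFAULT" lands among the config switches, so no "default" control was given.
missing_default = "default" not in controls
```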

Steve
Derek Hower
2009-08-05 17:25:04 UTC
Great, thanks Steve.
Post by Steve Reinhardt
Scons arguments are case-sensitive, and "default" should be lower case.
I don't know that it's an official convention, but loosely speaking
the all-uppercase arguments are m5 configuration switches, while the
lowercase ones control scons (or the build process) itself (the only
other one I can think of is update_ref, which updates the regression
outputs).
Steve
Post by Derek Hower
scons -j 12 DEFAULT=ALPHA_SE
build/LIBRUBY_MOESI_CMP_directory/libm5_opt.so
PROTOCOL=MOESI_CMP_directory RUBY=True USE_MYSQL=No
Error: cannot find variables file
/afs/cs.wisc.edu/p/multifacet/users/drh5/gem5/build/variables/LIBRUBY_MOESI_CMP_directory
or build_opts/LIBRUBY_MOESI_CMP_directory
Is there something else that needs to be done with this variables file?
-Derek
Post by Steve Reinhardt
Post by nathan binkert
Now that the other GEM5 issues are mostly resolved (thanks everyone!), I wanted to discuss the protocol directory issue in more detail.  I'm completely fine with recompiling all files when changing protocols.  I would just like the ability to have two different M5 binaries, using different protocols, co-exist in the same M5 tree.  A solution similar to the current ALPHA_SE and ALPHA_FS binaries would be great.
This can be done right now by using a different build directory for
each protocol.  You're used to having a scons target of
"build/ALPHA_FS/...", you can however do something like
build/<protocol>/build/ALPHA_FS and that will cause things to be built
in a different directory.  You can even specify a full path that is
not a subdirectory of the m5 dir.
I don't think even this is necessary... if you say something like
scons default=ALPHA_FS build/ALPHA_FS_P1/m5.opt PROTOCOL=P1
scons default=ALPHA_FS build/ALPHA_FS_P2/m5.opt PROTOCOL=P2
... that should set up ALPHA_FS_P1 and ALPHA_FS_P2 to do just what you
want (without needing the extra args after the first time, i.e., just
say "scons build/ALPHA_FS_P1/m5.opt").
This is all off the top of my head, so the syntax may be slightly off
(esp. the protocol specification part), but the basic idea is
supported.
There's a little more detail on this page; look for the bullet that
http://m5sim.org/wiki/index.php/SCons_build_system
Steve
Derek Hower
2009-08-03 18:16:46 UTC
Permalink
Post by Beckmann, Brad
Thanks Derek and Nate for the replies. Overall, it sounds like no one is directly working on the four issues I outlined below, so I'll put them on my to-do list. The NetDest and DMA controller issues are most pressing, so I'll work on those first. Later on I'll work on the SLICC errors and protocol folder issues. Below is a more thorough response to your questions.
- Supporting multiple protocol folders.
To answer Nate's question, it is the latter reason. Because SLICC generates different classes with the same name depending on the protocol, I don't think it is possible to do the former. Could we generate all protocol files outside of the build directory and then tell scons to copy only the necessary protocol files to the appropriate build directory? For example, have SLICC generate the MI_example and MESI_CMP_directory protocols in the respective directories: src/mem/protocol/MI_example and src/mem/protocol/MESI_CMP_directory. Then have scons copy the MI_example directory to the build directory build/ALPHA_SE_MI_example/mem/protocol and copy the MESI_CMP_directory directory to build/ALPHA_SE_MESI_CMP_directory/mem/protocol, without copying the other protocol's generated files. That way, scons can maintain that the m5 executables under build/ALPHA_SE_MI_example/ and build/ALPHA_SE_MESI_CMP_directory/ depend only on the files listed in those directories. Does that make sense? Will that work with scons?
- SLICC generation parsing error support in scons
Of the four issues I listed, this is definitely the lowest priority, so I'm happy to put it on the back burner. I just wanted to point out that right now the ParseError provides only a file name and the LexToken that describes the error. If you know that the third parameter of the LexToken is the line number, then that is all the information an experienced SLICC programmer needs to debug. However, I'm concerned for novice SLICC programmers who may need more information.
I should point out that I've also run into this problem, and it is
annoying enough that I'd really like to see it fixed sooner rather
than later. Right now I usually have to copy my changes to GEMS and
compile there to get useful error messages.
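
For reference, here is a minimal sketch of turning the token Brad describes into a GEMS-style "file:line" message. It assumes the token exposes PLY's usual type, value, lineno, and lexpos attributes; the Token stand-in and the format_parse_error helper are hypothetical names used only for illustration, not part of SLICC.

```python
from collections import namedtuple

# Hypothetical stand-in for a PLY LexToken; a real token has the same four fields.
Token = namedtuple("Token", ["type", "value", "lineno", "lexpos"])


def format_parse_error(filename, token):
    """Build a 'file:line: ...' message from the offending token."""
    return "%s:%d: syntax error, unexpected %s at %s" % (
        filename, token.lineno, token.type, token.value)


msg = format_parse_error("MOESI_CMP_directory-L1cache.sm",
                         Token("IDENT", "out_msg", 483, 0))
print(msg)
```

The point is only that the line number is already present in the token, so a readable message needs no extra parser state.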
Post by Beckmann, Brad
- NetDest support in RubySlicc_ComponentMapping.hh
The functionality I'm particularly concerned about is setting the destination of a message so that it broadcasts to all L1 caches or all directories. Before, all of those helper functions were in RubySlicc_ComponentMapping.hh. If we don't want those helper functions on the ruby side any more, we could add them to the generated MachineType.cc file by hacking the "if (m_isMachineType)" statement in slicc/symbols/Type.cc. It wouldn't be the first hack added to that file or the ugliest hack in SLICC. :) If you have a suggestion for a more elegant solution, I'm happy to hear it.
I think the right way to do this would be to add a broadcast(MachineType)
method to RubySlicc_ComponentMapping.hh that looks like:

NetDest broadcast(MachineType type) {
  NetDest dest;
  for (int i=0; i<MachineType_base_count(type); i++) {
    dest.add(MachineID(type, i));
  }
  return dest;
}

Then in SLICC have a line that looks like:

out_msg.Destination := broadcast(MachineType::L1Cache);

The problem is that MachineType::L1Cache is not legal SLICC code.
Adding this would require a change to parser as well as the code
generator. It would be a bit of work, and the question is, is it
worth it to make this fix given that SLICCer will (eventually) happen?
I'm not sure.

Perhaps a compromise between right and fast would be to pass something
to broadcast that SLICC does know about. For example, you could pass
a string to broadcast and then use string_to_MachineType to get the
MachineType. Obviously, this is slow and you probably don't want to
do it, but something similar could work. (use
MachineType_from_base_level?).

Other ideas?
Post by Beckmann, Brad
- DMA assumptions in Ruby.
If you look at line 84 in my version of ruby/slicc_interface/RubySlicc_ComponentMapping.hh, you'll see a direct reference to DMA. I am unsure whether we're still sticking to the strategy of using "#define"s for machine types to handle these errors, or whether you plan to deprecate the "#define"s. I thought part of the motivation to remove the NetDest functions in RubySlicc_ComponentMapping.hh was that they used the "#define"s? Even if you add a DMA #define, ruby/system/DMASequencer.cc relies on the DMARequestMsg type and various other DMA classes. Could the DMASequencer use the CacheMsg class instead? We could rename CacheMsg to SequencerMsg. Or we could just move DMARequest and the other DMA support to RubySlicc_Exports.sm?
Yes, you're right. I'll work on getting those DMA assumptions out.
I plan to deprecate the #defines in favor of something more universal.

As for DMARequestMsg, those messages are used in the network as well
as in DMASequencer. Thus, they are really a combination of a normal
RequestMsg and a CacheMsg (e.g., they have a destination field), and
so at least for now I think should be left where they are.

I do like the idea of a generic SequencerMsg class, but am a little
worried that it would ultimately be confusing. We can certainly split
DMARequestMsg into two different types, one for the request to the DMA
sequencer and one for the request to the network, like CacheMsg and
RequestMsg. In fact, this is probably the right thing to do to get rid
of the DMA assumption. The request to the DMA Sequencer wouldn't look
exactly like a CacheMsg, though, because it needs an Offset and Len
field to account for block-unaligned accesses. (The cache Sequencer
doesn't need these because it updates data locally after a request
completes and handles alignment then. The DMA controller doesn't have
that luxury because the update happens at memory, not in the DMA
itself, and so it must propagate those fields to the protocol). Thus,
a generic SequencerMsg would have to have Offset and Len fields as
well, even though they wouldn't be used by the cache Sequencer. The
cache Sequencer *could* use those fields if it let the protocol do the
data update, but that would require changing all the protocols. The
change would be very minor, though. Make sense?
Brad
-----Original Message-----
Sent: Thursday, July 30, 2009 10:34 PM
To: M5 Developer List
Subject: [m5-dev] Fwd: GEM5 Issues
---------- Forwarded message ----------
Date: Thu, 30 Jul 2009 23:24:08 -0500
Subject: Re: [m5-dev] GEM5 Issues
Post by Beckmann, Brad
Hi All,
Tushar and I have noticed a few issues with GEM5 and we were wondering
if anyone had plans to fix them.  If not, we'll go ahead and check in
fixes to the global repository.  I just want to make sure we're not
stepping on anyone's toes.
Glad to get some help!!
Post by Beckmann, Brad
- Fix SLICC so that protocol generated files are created in separate
protocol specific folders. Currently they are created in the same
mem/protocol folder overwriting older files (eg.
build/ALPHA_SE/mem/protocol//ControllerFactory.cc).
Go ahead with this one.
Post by Beckmann, Brad
- Fully integrate SLICC parsing errors into scons.  Right now the error
messages don't give any useful info.  Should we hold off on fixing this
until SLICCer is added to GEM5?
I wouldn't hold off.  SLICCer is still indefinitely postponed, at
least on my end.  I don't expect to work on it until November-ish.
Even after SLICCer is around, I think it would be wise to maintain
SLICC support for a while so this would be a good fix.
Post by Beckmann, Brad
- Support broadcast protocols with NetDest ruby functions. I noticed
that one of Derek's recent checkins removed all the NetDest functions in
RubySlicc_ComponentMapping.hh. Most of them were old, protocol
specific, and needed to go. However there were a few general ones that
are needed by any broadcast based protocol. Are there plans to add
these back into the tree?
Those weren't just removed because they are old.  A lot of them were
removed because they call a function similar to
"mapL2toDirectoryNode," which assumes that there *is* an L2 in the
system.  We can put these functions back in (and I was planning on
doing it as protocols are revisited), but we need to do it in a more
generic way.
For example, one of the things we are doing to get MESI_CMP_directory
working again is removing map_L1CacheMachId_to_L2cache and replacing
it with mapToAddressRange(low_bit, high_bit).  Low bit and high bit
are set at configuration time, and correspond to the address bits that
select the L2 bank.
If you could be more specific about which functions you're concerned
about, we could brainstorm and come up with a new way to do the
mapping.
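
As a rough illustration of the bit-range idea, here is a sketch that assumes mapToAddressRange simply extracts address bits low_bit through high_bit, inclusive, and uses the result as the bank index. The thread does not spell out the exact semantics, so the inclusive range, the function name, and the example bit positions below are all assumptions.

```python
def map_to_address_range(addr, low_bit, high_bit):
    """Return the integer formed by address bits low_bit..high_bit (inclusive)."""
    assert 0 <= low_bit <= high_bit
    width = high_bit - low_bit + 1
    return (addr >> low_bit) & ((1 << width) - 1)


# Assumed layout: 64-byte blocks (offset bits 0-5), four L2 banks picked by bits 6-7.
bank_a = map_to_address_range(0xC0, 6, 7)   # bits 7..6 are 0b11
bank_b = map_to_address_range(0x140, 6, 7)  # bits 7..6 are 0b01
print(bank_a, bank_b)
```

Because the bit positions are plain parameters, the same helper works whether or not the system has an L2, which is the kind of generality being asked for.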
Post by Beckmann, Brad
- I appreciate that a lot of the controller assumptions on the ruby side
have been removed. However, it appears that Ruby now assumes a DMA
controller is present. Are there plans to clean this up?
I don't think Ruby actually assumes that one is present (if it does
it shouldn't).  The existing configuration files certainly do, though,
mostly because Rocks can't run without one.  If you create a
configuration that doesn't instantiate a DMA controller, I think
you'll be OK.
The reason Rocks needs a DMA controller is mostly due to the fact that
Ruby handles data now and can't just ignore DMA accesses like it used
to.  I don't know enough about what M5 does with DMA to know if that
is a problem or not.
Post by Beckmann, Brad
Again, we're happy to fix this stuff up ourselves.  I just want to be
sure there aren't plans already in progress.
Thanks,
Brad
nathan binkert
2009-08-03 19:29:49 UTC
Permalink
Post by Derek Hower
I should point out that I've also run into this problem, and it is
annoying enough that I'd really like to see it fixed sooner rather
than later. Right now I usually have to copy my changes to GEMS and
compile there to get useful error messages.
Can you give me an example of an error that I can introduce that
should give a better error message? I can try this out and try to get
something in there that helps out.

Nate
Derek Hower
2009-08-03 19:39:21 UTC
Permalink
Just remove a semicolon from an action statement.

GEM5 prints

TypeError: exceptions must be classes, instances, or strings
(deprecated), not type:
File "/afs/cs.wisc.edu/p/multifacet/users/drh5/gem5/SConstruct", line 898:
exports = 'env')
File "/s/scons-1.1.0/amd64_rhel5/lib/scons-1.1.0.d20081207/SCons/Script/SConscript.py",
line 612:
return apply(method, args, kw)
File "/s/scons-1.1.0/amd64_rhel5/lib/scons-1.1.0.d20081207/SCons/Script/SConscript.py",
line 549:
return apply(_SConscript, [self.fs,] + files, subst_kw)
File "/s/scons-1.1.0/amd64_rhel5/lib/scons-1.1.0.d20081207/SCons/Script/SConscript.py",
line 259:
exec _file_ in call_stack[-1].globals
File "/afs/cs.wisc.edu/p/multifacet/users/drh5/gem5/build/ALPHA_SE/SConscript",
line 252:
SConscript(joinpath(root, 'SConscript'), build_dir=build_dir)
File "/s/scons-1.1.0/amd64_rhel5/lib/scons-1.1.0.d20081207/SCons/Script/SConscript.py",
line 612:
return apply(method, args, kw)
File "/s/scons-1.1.0/amd64_rhel5/lib/scons-1.1.0.d20081207/SCons/Script/SConscript.py",
line 549:
return apply(_SConscript, [self.fs,] + files, subst_kw)
File "/s/scons-1.1.0/amd64_rhel5/lib/scons-1.1.0.d20081207/SCons/Script/SConscript.py",
line 259:
exec _file_ in call_stack[-1].globals
File "/afs/cs.wisc.edu/p/multifacet/users/drh5/gem5/build/ALPHA_SE/mem/protocol/SConscript",
line 82:
hh, cc = scan([s.srcnode().abspath for s in sm_files])
File "/afs/cs.wisc.edu/p/multifacet/users/drh5/gem5/src/mem/slicc/parser/parser.py",
line 547:
raise type(e), tuple([filename] + [ i for i in e ])

Where GEMS would print

../protocols/MOESI_CMP_directory-L1cache.sm:483: syntax error,
unexpected IDENT at out_msg
make[1]: *** [generated/MOESI_CMP_directory_m/generated] Error 1
make[1]: Leaving directory `/afs/cs.wisc.edu/u/d/r/drh5/gems-2.0/ruby'

Which doesn't exactly say "you're missing a semicolon," but at least
points you to the right file and line.

-Derek
Derek Hower
2009-08-04 15:44:45 UTC
Permalink
I was able to get better messages out of the parser by changing the
exception handler. It seems that the version of python (2.4.6)
installed at Wisconsin doesn't like the way it's currently handled.
If you change the yacc.parse call to:

try:
    results = yacc.parse(file(filename, 'r').read())
except (ParseError, TokenError), e:
    print "File ", filename, " ", e
    raise e

You get much more helpful output. It seems the problem is trying to
raise type(e) rather than just e itself. Anyone know why this would
be?

-Derek
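
One likely explanation, offered as an assumption rather than something confirmed in the thread: before Python 2.5 the built-in exceptions were old-style classes, so type(e) on an exception instance yields the generic instance type rather than the exception class, and raising that produces exactly the "exceptions must be classes, instances, or strings" TypeError. The sketch below shows one way to attach the file name without calling type(e). It uses current Python syntax rather than 2.4 syntax, and ParseError and parse_with_filename are hypothetical stand-ins, not the SLICC parser's own code.

```python
class ParseError(Exception):
    """Hypothetical stand-in for the parser's ParseError."""


def parse_with_filename(filename, text, parse):
    """Run parse(text); on failure re-raise with the file name prepended."""
    try:
        return parse(text)
    except ParseError as e:
        # Use e.__class__, which names the exception class even for
        # old-style instances, instead of type(e).
        raise e.__class__("%s: %s" % (filename, e))


def failing_parse(text):
    # Simulates the parser rejecting its input.
    raise ParseError("line 483: syntax error, unexpected IDENT at out_msg")


try:
    parse_with_filename("MOESI_CMP_directory-L1cache.sm", "", failing_parse)
except ParseError as err:
    message = str(err)
    print(message)
```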
Steve Reinhardt
2009-08-03 19:49:28 UTC
Permalink
Post by Derek Hower
I do like the idea of a generic SequencerMsg class, but am a little
worried that it would ultimately be confusing.  We can certainly split
DMARequestMsg into two different types, one for the request to the DMA
sequencer and one for the request to the network, like CacheMsg and
RequestMsg. In fact, this is probably the right thing to do to get rid
of the DMA assumption.  The request to the DMA Sequencer wouldn't look
exactly like a CacheMsg, though, because it needs an Offset and Len
field to account for block-unaligned accesses.  (The cache Sequencer
doesn't need these because it updates data locally after a request
completes and handles alignment then.  The DMA controller doesn't have
that luxury because the update happens at memory, not in the DMA
itself, and so it must propagate those fields to the protocol).  Thus,
a generic SequencerMsg would have to have Offset and Len fields as
well, even though they wouldn't be used by the cache Sequencer.  The
cache Sequencer *could* use those fields if it let the protocol do the
data update, but that would require changing all the protocols.  The
change would be very minor, though.  Make sense?
Just for reference, the current M5 memory system solves these issues
as follows--not that it's the best way to do it, but I'd be curious
about the motivation (other than history/inertia) for Ruby to do it
differently:

- All requests have a byte address and a length. Cache coherence
messages all set the length to the block size, and insure/assert that
the offset bits of the byte address are zero (or possibly ignore
them). It's necessary for caches to be able to generate requests with
unaligned addresses and other lengths for when they forward
uncacheable requests between CPUs and memory/devices. Having separate
block number and offset fields is functionally equivalent, but seems
slightly more awkward to me, and would probably make your struct a
little bigger (not huge, but every little bit helps).

- A system configuration with coherent caches needs to have a cache
between any DMA devices and the rest of the memory system to translate
partial-block DMA writes into coherent requests. When you say "the
update happens at memory" for DMA, does this mean that the memory
controller is responsible for the coherence side effects, e.g.,
invalidating other cached copies and potentially merging partial block
writes with outstanding modified copies? Or are these side effects
not modeled? Note that the latter is OK for a stopgap but
unacceptable long term for us...
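
To make the first point concrete, here is a small sketch of why a byte address plus a length carries the same information as a separate block number and offset. The 64-byte block size and all function names are assumptions chosen for illustration; neither M5 nor Ruby is being quoted here.

```python
BLOCK_SIZE = 64  # bytes per cache block (assumed for illustration)


def split_address(byte_addr):
    """Split a byte address into (block_number, offset_within_block)."""
    return divmod(byte_addr, BLOCK_SIZE)


def join_address(block_number, offset):
    """Rebuild the byte address from a block number and an offset."""
    return block_number * BLOCK_SIZE + offset


def is_coherence_request(byte_addr, length):
    """A coherence message covers exactly one aligned block."""
    return byte_addr % BLOCK_SIZE == 0 and length == BLOCK_SIZE


block, offset = split_address(0x1234)
print(block, offset, hex(join_address(block, offset)))
```

The round trip is lossless, so the choice between the two layouts is about convenience and struct size rather than expressiveness.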

Steve
Derek Hower
2009-08-03 20:03:14 UTC
Permalink
Post by Steve Reinhardt
Post by Derek Hower
I do like the idea of a generic SequencerMsg class, but am a little
worried that it would ultimately be confusing.  We can certainly split
DMARequestMsg into two different types, one for the request to the DMA
sequencer and one for the request to the network, like CacheMsg and
RequestMsg. In fact, this is probably the right thing to do to get rid
of the DMA assumption.  The request to the DMA Sequencer wouldn't look
exactly like a CacheMsg, though, because it needs an Offset and Len
field to account for block-unaligned accesses.  (The cache Sequencer
doesn't need these because it updates data locally after a request
completes and handles alignment then.  The DMA controller doesn't have
that luxury because the update happens at memory, not in the DMA
itself, and so it must propagate those fields to the protocol).  Thus,
a generic SequencerMsg would have to have Offset and Len fields as
well, even though they wouldn't be used by the cache Sequencer.  The
cache Sequencer *could* use those fields if it let the protocol do the
data update, but that would require changing all the protocols.  The
change would be very minor, though.  Make sense?
Just for reference, the current M5 memory system solves these issues
as follows--not that it's the best way to do it, but I'd be curious
about the motivation (other than history/inertia) for Ruby to do it
A lot of Ruby is inertia :)
Post by Steve Reinhardt
- All requests have a byte address and a length.  Cache coherence
messages all set the length to the block size, and insure/assert that
the offset bits of the byte address are zero (or possibly ignore
them).  It's necessary for caches to be able to generate requests with
unaligned addresses and other lengths for when they forward
uncacheable requests between CPUs and memory/devices.  Having separate
block number and offset fields is functionally equivalent, but seems
slightly more awkward to me, and would probably make your struct a
little bigger (not huge, but every little bit helps).
Good point.
Post by Steve Reinhardt
- A system configuration with coherent caches needs to have a cache
between any DMA devices and the rest of the memory system to translate
partial-block DMA writes into coherent requests.  When you say "the
update happens at memory" for DMA, does this mean that the memory
controller is responsible for the coherence side effects, e.g.,
invalidating other cached copies and potentially merging partial block
writes with outstanding modified copies?  Or are these side effects
not modeled?  Note that the latter is OK for a stopgap but
unacceptable long term for us...
Yes, the directory handles the side effects. Here's the sequence for
a DMA write:

1. DMA Controller issues DMA_WRITE request to directory
2. Directory invalidates sharers/owner if needed
3. Directory updates the data
4. Directory writes to DRAM.
5. Directory sends Ack to DMA

The DMA controller never sees the original copy of the data, so can't
do the merge itself on a partial write.
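
The five steps above can be sketched as a toy model. This is not Ruby code: the Directory class, its fields, and the 8-byte block size are all hypothetical, chosen only to show why the merge of a partial write has to happen where the full block is visible.

```python
BLOCK_SIZE = 8  # deliberately tiny block, for illustration only


class Directory:
    """Toy directory that owns the memory image and the sharer lists."""

    def __init__(self):
        self.memory = {}   # block address -> bytearray of BLOCK_SIZE bytes
        self.sharers = {}  # block address -> set of cache ids holding a copy

    def dma_write(self, addr, data):
        """Handle a (possibly partial) DMA write and return (ack, invalidated)."""
        block_addr = addr - addr % BLOCK_SIZE
        offset = addr - block_addr
        assert offset + len(data) <= BLOCK_SIZE, "write must stay in one block"
        # Step 2: invalidate any sharers/owner of the block.
        invalidated = self.sharers.pop(block_addr, set())
        # Steps 3-4: merge the new bytes into the block and store it to memory.
        block = self.memory.setdefault(block_addr, bytearray(BLOCK_SIZE))
        block[offset:offset + len(data)] = data
        # Step 5: acknowledge the DMA controller.
        return "ACK", invalidated


directory = Directory()
directory.memory[8] = bytearray(b"AAAAAAAA")
directory.sharers[8] = {0, 1}
ack, invalidated = directory.dma_write(10, b"xy")
print(ack, sorted(invalidated), bytes(directory.memory[8]))
```

Here the offset and length travel with the request, which is the reason the DMA message needs those fields while the cache Sequencer's message does not.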
Beckmann, Brad
2009-08-03 20:14:10 UTC
Permalink
Thanks Derek.

I think we should be consistent with how GEMS separates sequencer requests from network requests and thus we should split them into two different types. The DMA network requests should be specified in the protocol message file since they interact with that particular protocol's state machines. However, the DMA sequencer messages need to be in Ruby_Exports because the DMA sequencer lives on the Ruby side. That should work well with Steve's suggestion to use byte addresses and lengths.

I'll reply to the other issues on separate threads.

Brad


Beckmann, Brad
2009-08-03 20:53:04 UTC
Permalink
I was hoping that there was a better way to solve this...I'm glad you found it! However, I'm confused about why you don't think "MachineType::L1Cache" is legal SLICC code. SLICC can reference a particular machine type by simply removing a colon: "MachineType:L1Cache". This will translate to "MachineType_L1Cache" in the generated code.

Simple broadcasts in systems with little hierarchy can use this broadcast function without too much complexity. However, systems with more hierarchy will require a more comprehensive solution. For example, you may want to broadcast only to those L2 caches that are on the same chip as you. I know you recently changed the configuration infrastructure to support heterogeneity, and I'm not quite clear how the new configuration infrastructure specifies hierarchy in the ruby-language files. Could you elaborate or point me to an example? It would be great if we could get the new configuration infrastructure to generate functions such as "NetDest getLocalL2Caches(MachineID machID)". That is what I was trying to get at with adding a hack to slicc/symbols/Type.cc.

Thanks,

Brad


Derek Hower
2009-08-03 21:38:54 UTC
Permalink
Post by Beckmann, Brad
I was hoping that there was a better way to solve this...I'm glad you found it! However, I'm confused about why you don't think "MachineType::L1Cache" is legal SLICC code. SLICC can reference a particular machine type by simply removing a colon: "MachineType:L1Cache". This will translate to "MachineType_L1Cache" in the generated code.
Well then, problem solved!

Post by Beckmann, Brad
Simple broadcasts in systems with little hierarchy can use this
broadcast function without too much complexity. However, systems with
more hierarchy will require a more comprehensive solution. For
example, you may want to broadcast only to those L2 caches that are
on the same chip as you. I know you recently changed the configuration
infrastructure to support heterogeneity, and I'm not quite clear how
the new configuration infrastructure specifies hierarchy in the
ruby-language files. Could you elaborate or point me to an example?
It would be great if we could get the new configuration infrastructure
to generate functions such as "NetDest getLocalL2Caches(MachineID
machID)". That is what I was trying to get at with adding a hack to
slicc/symbols/Type.cc.

I see. Unfortunately this isn't possible at the moment. In the new
configuration, there really isn't any notion of "local" because there
is no longer a generated chip object. Now Ruby (network included)
doesn't care where a controller is located and treats them all the
same. You configure a multi-chip system by just adjusting latencies
to reflect off-chip communication.

This of course complicates finding on-chip components. Here's one
idea of how to fix this that follows the "keep it generic" philosophy:

1. Create a Chip class that basically just contains pointers to
AbstractControllers, e.g.,

class Chip {
public:
  void addController(AbstractController* c) { ... }
  Vector<AbstractController*> getControllers(MachineType type) { ... }

private:
  // one list of controllers per machine type
  Vector<AbstractController*> m_controllers[MachineType_NUM];
};

2. Also add a getChip() function to AbstractController.

3. Then add a function in RubySlicc_ComponentMapping called
getLocalBroadcast(AbstractController*, MachineType). You'll need the
ability to reference "this" in SLICC, which I don't think you can
currently do.

4. Modify the generated Controller init functions and add lines like:
m_chip = RubySystem::getChip(m_cfg["chip"]);
m_chip->addController(this);

5. Here's the only tricky part - you'll need to add chips to the
Ruby-lang configuration. This will involve adding a new attribute to
NetPort called chip, adding a chip argument to all NetPort subclasses,
and printing out the actual chip configuration.
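Steps 1 through 3 can be sketched in standalone C++ roughly as follows. This is only an illustration under stated assumptions: the MachineType enum and AbstractController struct are simplified stand-ins for Ruby's real classes, std::vector replaces Ruby's Vector, and getLocalBroadcast returns a plain list of controllers rather than a NetDest. The configuration side (steps 4 and 5) is not modeled.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins for Ruby's types so the sketch compiles on its own.
enum MachineType {
    MachineType_L1Cache,
    MachineType_L2Cache,
    MachineType_Directory,
    MachineType_NUM
};

class Chip;

struct AbstractController {
    MachineType type;
    int version;   // index of this controller within its machine type
    Chip* chip;    // filled in when the controller is added to a Chip

    AbstractController(MachineType t, int v)
        : type(t), version(v), chip(nullptr) {}

    // Step 2: let a controller find the chip it lives on.
    Chip* getChip() const { return chip; }
};

// Step 1: a Chip is just a per-type list of controller pointers.
class Chip {
  public:
    void addController(AbstractController* c) {
        assert(c != nullptr && c->type < MachineType_NUM);
        c->chip = this;
        m_controllers[c->type].push_back(c);
    }

    const std::vector<AbstractController*>&
    getControllers(MachineType type) const {
        return m_controllers[type];
    }

  private:
    std::vector<AbstractController*> m_controllers[MachineType_NUM];
};

// Step 3: every controller of `type` on the same chip as `self`.
// Note that `self` is included if it is itself of that type.
std::vector<AbstractController*>
getLocalBroadcast(const AbstractController* self, MachineType type) {
    assert(self != nullptr && self->getChip() != nullptr);
    return self->getChip()->getControllers(type);
}
```

With two chips registered, calling getLocalBroadcast(&l1, MachineType_L2Cache) on an L1 controller yields only the L2 controllers that were added to that L1's chip.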

Thoughts? If you like this approach I can help you navigate through
the configuration changes.
Thanks,
Brad
-----Original Message-----
Sent: Monday, August 03, 2009 11:17 AM
To: M5 Developer List
Subject: Re: [m5-dev] Fwd: GEM5 Issues
Post by Beckmann, Brad
- NetDest support in RubySlicc_ComponentMapping.hh
The functionality I'm particularly concerned about is setting the destination of a message so that it broadcasts to all L1 caches or all directories.  Before, all of those helper functions were in RubySlicc_ComponentMapping.hh.  If we don't want those helper functions on the ruby side any more, we could add them to the generated MachineType.cc file by hacking the "if (m_isMachineType)" statement in slicc/symbols/Type.cc.  It wouldn't be the first hack added to that file or the ugliest hack in SLICC.  :)  If you have a suggestion for a more elegant solution, I'm happy to hear it.
I think the right way to do this would be to add a broadcast(MachineType) function:
NetDest broadcast(MachineType type) {
  NetDest dest;
  for (int i = 0; i < MachineType_base_count(type); i++) {
    dest.add(MachineID(type, i));
  }
  return dest;
}
out_msg.Destination := broadcast(MachineType::L1Cache);
The problem is that MachineType::L1Cache is not legal SLICC code.
Adding this would require a change to parser as well as the code
generator.  It would be a bit of work, and the question is, is it
worth it to make this fix given that SLICCer will (eventually) happen?
 I'm not sure.
Perhaps a compromise between right and fast would be to pass something
to broadcast that SLICC does know about.  For example, you could pass
a string to broadcast and then use string_to_MachineType to get the
MachineType.  Obviously, this is slow and you probably don't want to
do it, but something similar could work. (use
MachineType_from_base_level?).
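To make the two variants concrete, here is a self-contained C++ sketch of the broadcast helper plus the slower string-based overload. Everything other than the function names mentioned above is an assumption for illustration: the NetDest, MachineID, MachineType_base_count and string_to_MachineType definitions below are minimal stand-ins, not the real Ruby or SLICC-generated code, and the per-type counts are made up.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <utility>

// Hypothetical stand-ins for the generated MachineType code.
enum MachineType { MachineType_L1Cache, MachineType_Directory, MachineType_NUM };

struct MachineID {
    MachineType type;
    int num;
};

// Stand-in for the generated count function; the counts are invented.
int MachineType_base_count(MachineType type) {
    switch (type) {
      case MachineType_L1Cache:   return 4;
      case MachineType_Directory: return 2;
      default:                    return 0;
    }
}

// Stand-in for the generated string conversion.
MachineType string_to_MachineType(const std::string& s) {
    if (s == "L1Cache")   return MachineType_L1Cache;
    if (s == "Directory") return MachineType_Directory;
    assert(false && "unknown machine type name");
    return MachineType_NUM;
}

// Minimal destination set; the real NetDest is more elaborate.
class NetDest {
  public:
    void add(MachineID id) {
        m_ids.insert(std::make_pair(int(id.type), id.num));
    }
    bool isElement(MachineID id) const {
        return m_ids.count(std::make_pair(int(id.type), id.num)) != 0;
    }
    int count() const { return int(m_ids.size()); }

  private:
    std::set<std::pair<int, int> > m_ids;
};

// The "right" version: add every machine of the given type.
NetDest broadcast(MachineType type) {
    NetDest dest;
    for (int i = 0; i < MachineType_base_count(type); i++) {
        MachineID id = { type, i };
        dest.add(id);
    }
    return dest;
}

// The compromise: take a name SLICC can already pass and convert it,
// paying for a string comparison on every call.
NetDest broadcast(const std::string& type_name) {
    return broadcast(string_to_MachineType(type_name));
}
```

The string overload avoids any parser change, at the cost of a lookup on every message; caching the converted MachineType once per controller would be one way to soften that.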
Other ideas?
Steve Reinhardt
2009-08-03 21:51:03 UTC
Permalink
I see.  Unfortunately this isn't possible at the moment.  In the new
configuration, there really isn't any notion of "local" because there
is no longer a generated chip object.  Now Ruby (network included)
doesn't care where a controller is located and treats them all the
same.  You configure a multi-chip system by just adjusting latencies
to reflect off-chip communication.
This of course complicates finding on-chip components.  Here's one
idea of how to fix this that follows the "keep it generic" philosophy:
1. Create a Chip class that basically just contains pointers to
AbstractControllers, e.g.,
class Chip {
  void addController(AbstractController* c) { ... }
  void getControllers(MachineType type) { ... }
  Vector<AbstractController*>* m_controllers = new Vector<AbstractController*>[MachineType_NUM];
};
2. Also add a getChip() function to AbstractController.
3. Then add a function in RubySlicc_ComponentMapping called
getLocalBroadcast(AbstractController*, MachineType). You'll need the
ability to reference "this" in SLICC, which I don't think you can
currently do.
4. Modify the generated Controller init functions and add lines like:
m_chip = RubySystem::getChip(m_cfg["chip"]);
m_chip->addController(this);
5. Here's the only tricky part - you'll need to add chips to the
Ruby-lang configuration.  This will involve adding a new attribute to
NetPort called chip, adding a chip argument to all NetPort subclasses,
and printing out the actual chip configuration.
Thoughts? If you like this approach I can help you navigate through
the configuration changes.
I like the fact that Ruby no longer enforces the notion of a "chip",
and I'm hesitant to see us add one back in, even if it's optional.
I'd prefer to see something that at least has the appearance of
providing support for an arbitrarily deep generic hierarchy, even if
it's just a matter of changing the name Chip to Domain or Neighborhood
or something like that.
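One way to picture such a generic hierarchy is a tree of nested domains, where a broadcast names how many levels up to go before collecting targets. The C++ below is only a sketch under assumptions: Domain, ancestor, collect and broadcastWithin are hypothetical names, MachineID and MachineType are simplified stand-ins, and nothing here reflects existing Ruby code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for Ruby's machine identifiers.
enum MachineType { MachineType_L1Cache, MachineType_L2Cache, MachineType_NUM };

struct MachineID {
    MachineType type;
    int num;
};

// A Domain can hold machines directly and can nest other Domains,
// so "chip" is just one possible level of the tree.
class Domain {
  public:
    explicit Domain(Domain* parent = nullptr) : m_parent(parent) {
        if (parent) parent->m_children.push_back(this);
    }

    void addMember(MachineID id) { m_members.push_back(id); }

    // Enclosing domain `levels` steps up; stops at the root.
    const Domain* ancestor(int levels) const {
        const Domain* d = this;
        while (levels-- > 0 && d->m_parent) d = d->m_parent;
        return d;
    }

    // Append every machine of `type` in this domain and all nested ones.
    void collect(MachineType type, std::vector<MachineID>& out) const {
        for (std::size_t i = 0; i < m_members.size(); i++)
            if (m_members[i].type == type) out.push_back(m_members[i]);
        for (std::size_t i = 0; i < m_children.size(); i++)
            m_children[i]->collect(type, out);
    }

  private:
    Domain* m_parent;
    std::vector<Domain*> m_children;
    std::vector<MachineID> m_members;
};

// Broadcast to all machines of `type` within the domain that encloses
// `home` at the requested number of levels (0 means `home` itself).
std::vector<MachineID>
broadcastWithin(const Domain& home, int levels, MachineType type) {
    assert(levels >= 0);
    std::vector<MachineID> out;
    home.ancestor(levels)->collect(type, out);
    return out;
}
```

With a root domain containing two chip-level domains, levels = 0 reaches only the sender's own chip and levels = 1 reaches the whole system; deeper trees work the same way without any notion of "chip" being built in.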

Steve
Beckmann, Brad
2009-08-03 23:24:38 UTC
Permalink
I agree. Whatever we develop long-term should allow multiple levels of arbitrary hierarchy. In the end, it would be ideal if M5's front-end configuration removed the .slicc files and directly glued the state machines together. In particular, it would be great if there was a "domain" super class that could encapsulate some number of state machines and generate domain mapping functions that were used by the .sm files. Of course this is the end goal and will require a large amount of work and coordination to do so.

In the meantime, what do we do to support hierarchies? Here are a few options, let me know if you see others:
- The simplest thing is to do what GEMS has done in the past and create specific routing functions in RubySlicc_ComponentMapping.hh that use the MachineType "#define"s.
- Follow Derek's suggestion to add a simple Chip class that encapsulates a chip's state machines.
- Bite off the major tasks of removing the .slicc files and adding a new domain class to the .rb files that auto-generates domain mapping functions.
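
For comparison, the first option (a protocol-specific routing function built on fixed constants) might look something like the C++ sketch below. The constants, the chipOf helper, and the assumption that machines are numbered contiguously chip by chip are all hypothetical; the function also returns a plain vector instead of a NetDest to keep the example standalone.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins for Ruby's machine identifiers.
enum MachineType { MachineType_L1Cache, MachineType_L2Cache };

struct MachineID {
    MachineType type;
    int num;
};

// Hypothetical protocol-specific constants, in the spirit of the
// old "#define"s; the values are made up.
const int L1_CACHES_PER_CHIP = 2;
const int L2_BANKS_PER_CHIP = 4;

// Assumes machines of each type are numbered contiguously per chip.
int chipOf(MachineID id) {
    assert(id.num >= 0);
    int per_chip = (id.type == MachineType_L1Cache) ? L1_CACHES_PER_CHIP
                                                    : L2_BANKS_PER_CHIP;
    return id.num / per_chip;
}

// All L2 banks on the same chip as machID.
std::vector<MachineID> getLocalL2Caches(MachineID machID) {
    std::vector<MachineID> dest;
    int first = chipOf(machID) * L2_BANKS_PER_CHIP;
    for (int i = 0; i < L2_BANKS_PER_CHIP; i++) {
        MachineID bank = { MachineType_L2Cache, first + i };
        dest.push_back(bank);
    }
    return dest;
}
```

This is quick to write but bakes one fixed level of hierarchy and one numbering scheme into the protocol, which is the limitation the other two options try to remove.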

My vote is that for the short-term we don't completely deprecate the "#define"s in RubySlicc_ComponentMapping.hh and allow for protocol specific routing functions. Later on, when we have the time to fully integrate the GEMS and M5 configurations, creating a domain class in the .rb files could be a nice intermediate step before migrating the ruby configuration to M5's python files. I like Derek's suggestion, but as Steve points out this may be too much work for only an intermediate solution and it may not solve everyone's problems. For instance, I can imagine some situations will call for more levels of hierarchy than just at the chip level.

What is your opinion?

Brad

