Thanks for the feedback. After a bit of thinking, the fundamental problem with doing this "implicitly" as you suggest is that it has to be done per port, and not per MemObject. Thus, the enable/disable would have to be moved to each and every port, and the ports are parameters in the Python classes, not SimObjects/MemObjects, so it is not straight forward to enable/disable the stats in a convenient way (unless we start turning the flags in to vector params etc). Similarly for the stats, they would have to be stats implemented in each port, or some for of multi-dimensional stats on the MemObjects.
I think the functionality and configuration interface ends up being rather cluttered, but haven't actually attempted to implement it.
In conclusion, I think there are compelling reasons to keep the monitor the way it is.
From: Steve Reinhardt [mailto:email@example.com]
Sent: 30 April 2012 01:08
To: Andreas Hansson
Subject: Re: Review Request: MEM: Add the communication monitor
(switching to email since this is getting a bit detailed)
These are valid concerns. I'd like to explore them a little more. I'm not necessarily picking sides, just playing devil's advocate a bit.
On Sun, Apr 29, 2012 at 5:05 AM, Andreas Hansson <firstname.lastname@example.org<mailto:email@example.com>> wrote:
This is an automatically generated e-mail. To reply, visit: http://reviews.gem5.org/r/1159/
On April 28th, 2012, 9:08 a.m., Steve Reinhardt wrote:
I like the idea of having a common way of collecting these stats (and probably more importantly, common names and semantics for the output). Does it make sense to have this in a separate object though? It seems like that just adds complexity to the configuration, and the possibility of not putting one of these where you wish you had. Did you consider putting all this code in a base class like MemObject so that everyone automatically collects all these stats?
A very valid comment, and this option was under serious consideration. The benefits of having it behaviourally encapsulated (as you now propose) rather than structurally encapsulated (as the current patch does) is indeed that:
1) it does not involve changing any configuration scripts, and
2) any communicating object would have a uniform set of stats relating to the communication
The issues are:
1) how to "hide" it in the interface timing calls, as we must catch the case where sendTiming returns false and undo the changes to the senderState
If we make this common to all MemObjects, we could take the monitor state (which is just a tick value) and put it in the base SenderState class. I bet there are some MemObjects that already record the transmit time in the sender state anyway so we could get rid of that double recording. Eliminating the memory allocation and chaining of the separate SenderState object could possibly be a noticeable performance win when you're actually collecting the stats.
2) the rather extensive set of disable flags in the object parameters
This doesn't bother me much. If we define the flags on the base MemObject and provide reasonable defaults, they'll be invisible to most people.
3) the potential negative performance impact caused by collecting more stats than we really want
How expensive are these stats? A timing-mode transaction is not cheap to begin with, so the percentage slowdown may not be that bad. Plus I find that, when you're running long simulations, you're better off collecting all the stats you can (within reason), since having them and not needing them is less costly than wanting them and not having them.
Plus, it looks like all the stats can be disabled, so if performance is a concern, they could just be turned off, right? This would eliminate the first-order overheads (notwithstanding #5 below).
4) more "noise" in the stats output
The stats output really isn't designed to be read casually anyway... if you're using grep or some other script, then the excess data won't be noticeable.
5) potential negative performance impact even when not collecting stats due to the large number of if-statements
True. As with #3, we don't know how big this impact is though. Branch predictors can be pretty good, so although a sequence of load/compare/branch instructions isn't free, it's not necessarily as bad as you think.
Plus, there may be some existing stats in some MemObjects that are redundant with respect to these (like bytes read and bytes written maybe?) and if we go back and get rid of those then we'd save a little overhead there.
The current CommMonitor may not be ideal in that it requires config script changes, and it is indeed up to the user to determine which points in the memory system are of interest. However, I would expect no more than a handful monitors in a typical system, and the stats in the monitor could become multi-dimensional and indexed on the masterId in a future patch.
How many masters are there in a typical system anyway, and how many of them (1) issue a lot of requests and (2) are ones we're not likely to care about stats on? These are the only cases where it would matter to not collect stats, IMO. My guess is that the only things that might fall into this category are fabric components like the Bus and *maybe* the Bridge. I think the CPUs are the things you would definitely want to be capturing stats on, and other things like DMA devices probably don't issue enough requests typically to matter (from a simulation performance perspective).
As a minor side note, the structural CommMonitor also more closely resembles the bus monitors used in actual devices.
Only because in silicon it's too expensive to automatically put one in every bus master ;-). In software, it's not so bad...
As I said, I'm not convinced either way at this point, just trying to explore the issues a little more deeply. Thanks for bearing with me.
-- IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.