9/18 I have an embedded C++ application that passes around a lot of
68-byte structs-- the struct is a wrapper around a binary message,
and just about every function call includes one of them (usually
passed by copy). Today I changed the struct into a class with
an empty constructor and destructor, and a Configure() method
that initializes every field. Basically nothing else was changed:
the 68-byte struct became a 72-byte object. No inheritance, no
virtual functions, everything is allocated on the stack. My test
suite is taking 300% longer to run, even with -O3. Any ideas
about what might be causing this? I'm going to look into gprof,
but if there are any hints and tips I'd like to hear them. Thanks.
\_ WAG: From your description, I gather you changed the definition
from: struct Foo { ... }; to: class Foo { ... };
Try changing "class" back to "struct" and see if that changes
anything.
\_ It does-- it cuts the running time by 2/3. I've got 2
parallel (except for the class stuff) directories and I'm
running the tests side by side.
\_ Well, are you passing these structs by value? Are you
constructing and destroying tons and tons of objects
now? Calling all those empty constructors and destructors
can get quite expensive. Also do you have RTTI turned on?
That could explain the extra bytes.
\_ I don't have RTTI on explicitly (how would I check?),
and I'm not using any template stuff. The compiler is
gcc 3.2. Is there a way to optimize the (de|con)structor
calls to nothing? I had figured that -O2 would take
care of that for me. Also, I think that extra 4 bytes
is just a pointer to the dispatch table, which is
completely expected. --op
\_ If you have no virtual functions you don't have a
dispatch table, and you shouldn't have a pointer.
Trust me when your basic math objects are classes
and you have umpty millions of them, doubling the
size for a dispatch pointer would be really annoying.
\_ Aaah! You're totally right. Out of habit I had
put "virtual ~Foo() { };". I removed the virtual
and the code is now about 5-10% *faster* than the
struct version. Thank you thank you thank you.
\_ Two more suggestions: Did you define Configure() in the
header file? If so, move it to the implementation file
and see if that makes a difference. Also, try compiling
with -Os, which optimizes for size (at least on my
gcc). - struct guy
\_ Configure is defined in the .cc file; I'll give -Os
a shot now.
\_ Try swapping the definition of the ctor/dtor
and Configure from .cc to .h or vice-versa.
\_ Why did you change working code in the first place? That's where
your real problem is.
\_ Obviously so he could put C/C++ on his resume!