TITLE: C vs C++ performance-wise
(Newsgroup: comp.lang.c++,rec.games.programmer, 31 Jan 99)

HSIEH: qed@pobox.com (Paul Hsieh)

> Ok, I know in the past I've said that C and C++ should be roughly the
> same in terms of speed, but I have just come back from a marathon
> optimization exercise in C++ and I have some bad news ...
>
> While straight C in modern compilers will usually do reasonably versus
> hand coded assembly in most cases, C++ used in its most natural way
> with its powerful abstraction features will hurt performance
> *SIGNIFICANTLY*.

STROUSTRUP: Bjarne Stroustrup - http://www.research.att.com/~bs

Since you don't say which "most natural way" you talk about and don't
say what you consider significant, there is no way to disprove that
statement. However, for *some* natural and elegant styles and *many*
applications that statement is false.

HSIEH:

> It is now clear to me that while C++ can make your coding somewhat
> more maintainable and reusable, the price paid in performance is a
> hefty one. These are my experiences with MS VC++ 6.0:
>
> 1. Because of the existence of "this" and re-instantiatable "classes",
> ordinary call overhead has gone up significantly. This will hurt all
> processors (to a different degree). Virtual function calls which have
> these penalties are used too pervasively, while their flexibility
> features are typically seldom used in practice.

STROUSTRUP:

Indeed, if you make functions virtual that do not need to be virtual,
you pay. Virtual function calls are about as efficient as any indirect
function call, but indirect function calls are slower than direct
function calls on many architectures, and - sometimes much more
significantly - making a function virtual disables most inlining.

Virtual functions exist for a reason. Where needed, they are no more
expensive to use than alternatives. Where used appropriately, they are
one of the most powerful mechanisms for writing elegant, maintainable,
and efficient code.
However, if you misuse or overuse virtual functions, if you write
overly general code, you pay in C++ exactly as you would in any other
language. That's why good teachers of design and C++ style recommend
using virtual functions where the flexibility they offer is needed and
only there (I know that proponents of some other languages supporting
OO give different advice, but that's their problem - except when
novices take their advice, apply it blindly in a C++ program, obtain
the predictably poor performance, and proceed to blame C++ and/or me;
*then*, it becomes my problem).

Consider:

    class X {
        int x;
        static int st;
    public:
        virtual void f(int a);
        void g(int a);
        static void h(int a);
        void k(int i) { x+=i; }
    };

and

    struct S {
        int x;
    };

    int glob = 0;

    void f(S* p, int a);
    void g(S* p, int a);
    void h(int a);

    typedef void (*PF)(S* p, int a);
    PF p[2] = { g , f };    // to call f() indirectly

    #define K(p,i) ((p)->x+=(i))

Now call each 10,000,000 times or so. What do you find? That X::f and
p[1] run about equally fast, that X::g and g run equally fast, that
X::h and h run equally fast, and X::k and K run equally fast. If not,
make sure that you have set your compiler options equivalently and that
you have timed your tests properly. If differences persist, complain to
your compiler vendor and/or switch to a better compiler. C++ was
designed to make that equivalence of performance easy for an
implementor to achieve and easy for a programmer to understand.

I tried and found this:

    p->f(1)   3.82s      f(p,1)    3.02s
    p.f(1)    2.78s      f(&s,1)   3.02s
    p->g(1)   2.73s      g(&s,1)   2.94s
    x.g(1)    3.02s      g(p,1)    2.98s
    X::h(1)   2.93s      h(1)      2.94s
    p->k(1)   2.69s      K(p,1)    1.37s
    x.k(1)    2.74s      K(&s,1)   1.17s

Not bad, but not excellent, so I turned on the optimizer:

    p->f(1)   1.72s      f(p,1)    1.67s
    p.f(1)    1.47s      f(&s,1)   1.67s
    p->g(1)   1.47s      g(&s,1)   1.46s
    x.g(1)    1.47s      g(p,1)    1.47s
    X::h(1)   1.72s      h(1)      1.72s
    p->k(1)   0.66s      K(p,1)    0.67s
    x.k(1)    0.46s      K(&s,1)   0.46s

That's good enough for anyone who can live with C.
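
For readers who want to repeat the experiment, here is a minimal sketch
of what such a timing harness might look like. It is not the code that
produced the figures above: the post gives declarations only, so the
function bodies, the value() accessor, the virtual destructor, and the
helpers seconds() and report() are all assumptions made for
illustration, and it uses std::chrono, which postdates the post.

```cpp
// Sketch only. The function bodies, value(), seconds(), and report()
// are assumptions added for illustration; the post shows declarations only.
#include <cassert>
#include <chrono>
#include <cstdio>

class X {
    int x = 0;
    static int st;
public:
    virtual ~X() = default;
    virtual void f(int a);          // dispatched through the vtable
    void g(int a);                  // ordinary member function
    static void h(int a);           // static member: no 'this'
    void k(int i) { x += i; }       // defined in class, so implicitly inline
    int value() const { return x; } // accessor added for the sanity check
};
int X::st = 0;
void X::f(int a) { x += a; }
void X::g(int a) { x += a; }
void X::h(int a) { st += a; }

struct S { int x; };
int glob = 0;
void f(S* p, int a) { p->x += a; }
void g(S* p, int a) { p->x += a; }
void h(int a) { glob += a; }
typedef void (*PF)(S* p, int a);
PF p[2] = { g, f };                 // p[1] calls f() indirectly
#define K(p, i) ((p)->x += (i))

// Time n executions of body using a monotonic clock.
template <class Body>
double seconds(long n, Body body) {
    auto start = std::chrono::steady_clock::now();
    for (long i = 0; i < n; ++i) body();
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}

// Print one row per C++/C pair. Keep 3*n within the range of int.
void report(long n) {
    X x;          X* xp = &x;
    S s = { 0 };  S* sp = &s;
    double cf = seconds(n, [&] { xp->f(1); });
    double sf = seconds(n, [&] { p[1](sp, 1); });
    double cg = seconds(n, [&] { xp->g(1); });
    double sg = seconds(n, [&] { g(sp, 1); });
    double ch = seconds(n, [&] { X::h(1); });
    double sh = seconds(n, [&] { h(1); });
    double ck = seconds(n, [&] { xp->k(1); });
    double sk = seconds(n, [&] { K(sp, 1); });
    std::printf("xp->f(1)  %.2fs    p[1](sp,1)  %.2fs\n", cf, sf);
    std::printf("xp->g(1)  %.2fs    g(sp,1)     %.2fs\n", cg, sg);
    std::printf("X::h(1)   %.2fs    h(1)        %.2fs\n", ch, sh);
    std::printf("xp->k(1)  %.2fs    K(sp,1)     %.2fs\n", ck, sk);
    assert(x.value() == s.x);       // both sides did the same work
}
```

Calling report(10000000) from main(), once with and once without the
optimizer, should give a table of the same shape as the ones above. One
caveat: an optimizer that sees every definition in a single file may
inline or devirtualize these calls, so moving the out-of-line function
bodies into a separate source file is one way to keep the comparison
honest.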
(The figures above are from a single run, so they are subject to a bit
of random variation. I ran the tests repeatedly, though, and there were
no significant differences between runs. I ran the tests on an SGI
Challenge box. The figures for other platforms and other compilers will
vary. In my experience, virtual function calls are relatively more
efficient compared to non-virtual calls on an SGI than on most other
architectures.)

HSIEH:

> 2. C++ had added the additional placebo directive "inline" (much like
> "register" was added to C). Unfortunately MSVC++ has adopted it as the
> primary mechanism for inlining code. Although, MSVC has an option for
> always attempting to inline code when it can, it turns out MSVC *can't*
> do this very often. Specifically, it cannot pull code out of one class
> and inject it into another. I don't know if this is a language
> restriction or just a C++ restriction -- all I know is that it sucks.
> This hurts the very typical practice of hiding ordinary structure entry
> accesses through access functions.

STROUSTRUP:

"inline" is not "placebo" (and neither was register during the first
ten years or so of C). In the example above, a fairly typical case
where one would use inlining (a small, frequently-called function), the
gain in speed was about a factor of two (and if you examine the code,
you'll find that there is also a minor space improvement).

I do not know what you mean by "pull code out of one class and inject
it into another." However, inlining simple functions is designed to be
trivial in C++ and has worked from the earliest compilers. I find it
hard to believe that any C++ compiler fails to do it. I have seen
examples where MSC++ did a very decent job of inlining.

I know that there is a line of thought that insists that a compiler
should do inlining without the help of the programmer and that the
compiler can do better.
However, I observed almost 20 years ago that commercial compilers do
not know how to do better, and except in very rare cases they still do
much worse than a minimally competent programmer. When optimizers get
smart enough to outperform programmers - and not before - "inline" will
become a relic of the past just like "register."

HSIEH:

> 3. The C++ => object code name mangling mechanisms make it impossible
> to call C++ code from inline assembly (illegal C++ characters are used.)
> The fact that you can use the exact same name more than once (just with
> different parameters) works against tools like VTune, which identify
> functions by name alone.

STROUSTRUP:

I have not tried inline assembly with the compiler mentioned, but I
have never seen a compiler where you could not do at least as well from
C++ as you could from C. If a tool is designed for C only, you cannot
seriously expect it to work for C++ without problems.

HSIEH:

> 4. Operator overloading makes it very difficult to back track through
> someone else's sources to where the definition of that operator
> overload is. MSVC++'s browsers and tools are pretty good at this sort
> of thing, except in the case of operator overloading.

STROUSTRUP:

Good use of overloading can greatly increase readability of code
(witness the almost universal overloading of basic mathematical
operations, assignment, etc.). Poorly chosen overloading and excessive
overloading are among the thousands of ways a poor programmer can
create a mess for the next programmer who comes along. I rate overuse
of macros a more significant problem in C and C++. Also, overloading is
key to generic programming, which in turn is key to efficient, general,
and typesafe containers.

HSIEH:

> So what can be done? Unfortunately, I don't have really good
> suggestions right now. It seems as though the features of C++ clash
> directly with getting good performance.
> So simply using extern "C" {}
> around the critical parts of your code, may itself be beneficial, but
> not too useful if you need to call C++ code from it.

STROUSTRUP:

``Using extern "C" {} around the critical parts of your code'' is
meaningless. You apply ``extern "C"'' to declarations to disable C++'s
typesafe linkage so that you can link to C code. There is no
performance impact.

HSIEH:

> In the optimization exercise that I've just gone through, it turns out
> that I spent a lot of time undoing a lot of the abstractions as opposed
> to actually optimizing code. Perhaps the folks at id software knew
> something of this which would explain their decision not to move to
> C++.

STROUSTRUP:

It sounds to me as if the most likely cause of the problem is poor
design, poor use of C++, and probably lack of experience with C++. If
so, the solution is better understanding/education. There may also be
weaknesses in the programming environment used, but an experienced C++
programmer can usually compensate for such weaknesses given the
soundness of the basic tool (C++) and sound programming concepts.

HSIEH:

> I think the Bjarne Strousoupe has sold us all some Silicon Snake Oil
> here folks. By building it into the infrastructure of C, he makes it seem
> as though it should be a language that is good for performance, but as in
> typical bait and switch tactics, you need to give up that performance
> to get at the new goodies of the language.

STROUSTRUP:

You think wrongly.

Selling snake oil and using "bait and switch" are despicable activities
implying rather nasty personal traits. I do not indulge in them. People
who do lack intellectual and personal integrity. Do not try to cover
lack of experience and understanding by ascribing odious personal
habits to others.

By now, C++ has delivered on essentially all the promises I have made
for it.
I have consistently described C++ accurately and fairly - documenting
both its strengths and weaknesses (for example, see "The Design and
Evolution of C++"). For people who want to reach a professional level
of C++ use, I recommend "The C++ Programming Language (3rd Edition)."

__________________________________________________________
See the C++ Tip-of-the-day home page for more information.
http://www.ses.com/~clarke/TableOfContents.html