Nice article
I feel the need to post a link to this old site:
http://www.graficaobscura.com/future/
Last time round, we looked at the way an unnamed developer had used Cocoa routines to chop up a simple C-string in order to determine whether or not it contained a particular, named OpenGL extension name. A few folks pointed out that if, hypothetically speaking, some new extension name gets devised that happens to be a …
Great article, but it's tradition. We have to pick nits.
No argument regarding caching [NSUserDefaults standardUserDefaults], but for entirely different reasons. That is, readability is key.
Str1 = [[NSUserDefaults standardUserDefaults] stringForKey: @"myStr"];
is nowhere near as useful as
NSUserDefaults * ourUserDefaults = [NSUserDefaults standardUserDefaults];
ourDefaultDocumentNameString = [ourUserDefaults stringForKey: RDKeyDefaultDocumentName];
with a header file defining:
extern NSString * const RDKeyDefaultDocumentName;
RD standing for Reg Developer, of course. That way you save space and CPU, because the constant string is only defined once instead of several times over; and not only that, it reduces the chance of typos. There's no symbol checking on @"myStr" - misspell it at one call site and the compiler won't say a word, whereas a misspelled RDKeyDefaultDocumentName is caught at compile time.
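For completeness, the matching definition would live in one .m file, something like this (the key string itself is illustrative):

// RDConstants.m - hypothetical file; the literal key is only spelled once, here
NSString * const RDKeyDefaultDocumentName = @"RDDefaultDocumentName";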
In regards to the final quiz question, it's the same reason the last stack entry of a printf call is the format string: it's a variable-length argument list, so you don't know where the first argument would be if it were pushed first. Last in, first out, and all that. After self and the selector, you've got however many arguments the selector may or may not have.
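It's the same constraint you hit writing your own variadic function in C: va_start can only anchor on the last fixed parameter because that parameter sits at a known offset, with the unknown-length tail stacked beyond it (a minimal sketch; sum_ints is made up):

#include <stdarg.h>

/* sum_ints takes `count` ints after the fixed parameter; va_start(ap, count)
   only works because count is at a known position relative to the stack pointer. */
static int sum_ints(int count, ...)
{
    va_list ap;
    int i, total = 0;

    va_start(ap, count);
    for (i = 0; i < count; i++)
        total += va_arg(ap, int);
    va_end(ap);
    return total;
}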
Fun fact: On the PPC implementation, this isn't done the same way. Arguments are placed in registers when they can be, and stack space is set aside but not filled in. Only if an argument is too long to fit in registers, or falls at the tail end of the argument list, is it actually written to the stack.
http://developer.apple.com/documentation/DeveloperTools/Conceptual/LowLevelABI/Articles/32bitPowerPC.html#//apple_ref/doc/uid/TP40002438-SW17
Here's the problem. NSInsetRect takes an NSRect, not an NSRect *. That is, the actual rect, not its address, is placed on the stack. And similarly, [self frame] returns the rect itself, not a pointer, on the stack. So we have:
{Beginning}{Local Vars} {NSRect} *esp
when we return from [self frame]. The problem is that there are two more arguments, the two 2.0 floats, at the end of NSInsetRect's argument list. Which means we need:
{Beginning}{Local Vars} {2.0}{2.0}{NSRect} {linkage area}*esp
on the stack. Remember the quizzy bit? And since this is all passed on the stack, not the heap, we can't just let things sit wherever they landed. So what is it doing?
2902-291b is [self frame]; since NSRects are passed around by value, the result isn't sitting in a local variable.
2923-2938 is copying from the returned NSRect into a safe spot.
293b-2952 is moving both 2.0s into position, overwriting our old NSRect.
2956-296e is restoring NSRect back onto the stack, but shifted 8 bytes over because of the floats.
And then, of course, after calling NSInsetRect, we're moving the resulting NSRect back into insetRect.
Yeah, GCC could have done it better, but it's not as bad as it first looks.
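For reference, the source line all this shuffling implements is presumably something like:

NSRect insetRect = NSInsetRect([self frame], 2.0, 2.0);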
Viable Imagination Shared the Word 4Visions2.
"Finally, here's a little teaser for you experienced Cocoa-heads. Going back to Listing One, you may notice that the "self" pointer (either a pointer to an object instance, or to a class if we're dealing with a class method such as [NSCursor crosshairCursor]) is always passed last in the EAX register before the call. Can you figure out why? "
A FailSafe S.M.A.R.T.er cookie ....... to keep IT Leading, Dave?
And that is not really a question, at all.
Hi Blain. You said "In regards to the final quiz question, it's the same reason the last stack entry of a printf call is the format string: it's a variable-length argument list, so you don't know where the first argument would be if it were pushed first. Last in, first out, and all that. After self and the selector, you've got however many arguments the selector may or may not have."
That's certainly true as far as it goes, but not quite the angle I was coming from. One of the oddities/conveniences (delete as appropriate) of Objective-C is that --- as I'm sure you know --- it's possible to send a message to a nil object. In the PPC implementation, the implicit "self" pointer is passed in the r3 register, which (surprise, surprise!) just happens to be the register that's used to return scalar function results. The effect is that if you send a message to a nil object using a selector that's expected to return a boolean, int, float, etc., you'll effectively get NO, 0, 0.0 (respectively) back *because* the appropriate value is already in r3.
The effect is the same with the Intel implementation. 'Self' is passed in the EAX register, and EAX is used for the return value. This is -- I believe -- the reason why GCC ensures that Self is the last value that's placed in EAX immediately before the dispatcher call.
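In other words (a minimal sketch; fooString is a hypothetical variable, and this register-level behaviour is an implementation detail, not something to rely on):

NSString *fooString = nil;
unsigned int len = [fooString length]; // dispatcher bails out with nil still in EAX/r3, so len == 0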
Dave
Doesn't look like this compiler has a very good optimizer. Is this the debug version? I'd expect a VC++ compiler doing a release build to do much better, even on miserable Intel instruction sets.
You get different looking source code depending on how you were trained. Some people are trained to write what looks readable and expect the optimizer to make it efficient. Other people are trained to assume the optimizer is asleep and do it all in the source code. And yes, some people are not trained at all. It's hard to guess the skill of the developer by looking at the generated assembly listing.
Real optimization comes from better algorithms, from better multithreading, and from avoiding 1000-cycle naps in the memory allocator.
I think that equating unintentionally bloated code with maliciously peppering your code with sleep() calls is a bit of a puzzling comparison to draw.
There is not much point in a high-level language if you have to keep looking at the ASM listings because you don't trust it.
Firstly, you should always look at the release output - the simple fix for bloated debug code is to build for release.
One reason for debug output being bloaty is that a debugger needs to be able to work back from any stretch of asm to the source line that originated it; this not only means many optimisations are not on the cards, but also that things like inline functions may have a different effect.
So the fact that the article spends most of its time picking apart the output of a debug build is a bit bogus.
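If you want to compare for yourself, getting both flavours of assembly out of GCC is a one-liner each way (a sketch; MyView.m is a stand-in for whatever source file you're inspecting):

cc -O0 -g -S MyView.m -o MyView_debug.s      # debug-style: no optimisation, debug info
cc -O2 -S MyView.m -o MyView_release.s       # release-style: optimised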
Multiple calls to [NSUserDefaults standardUserDefaults] are just bad software engineering. Readability is improved by refactoring the code to use one instance.
Thanks for pointing out the potential cost of the dispatch, though - it's good to know.
In companies that I have worked in, patterns of code which produced code bloat were documented as things to avoid as part of the coding standard, which was then applied during peer review.
Your arguments are not very robust. I think the bigger message you are trying to send is "pay more attention to what you are doing"; your main point of avoiding code bloat is sort of a side note. Why? Very simply because most of the code your process will be executing WILL NOT BE YOUR OWN! Between the Objective-C runtime and the Cocoa frameworks, there is enough code bloat for everyone. Nit-picking a high-level language's output in assembly does little to make your point, as you detailed yourself. That's compiler territory, and it is no secret that GCC doesn't offer up the tightest code.
Finally, an extra 100 bytes here or there or everywhere will make only a small change in the overall memory footprint of your program when you're looking at it from the standpoint of the total number of pages required.
I think the focus of the article should be more on pointing out obvious things not to do in Objective-C code (and perhaps some reasons why), while shying away from the "this costs an extra 80 bytes!!" argument.
" The effect is that if you send a message to a nil object using a selector that's expected to return a boolean, int, float, etc, you'll effectively get NO, 0, 0.0 (respectively) back *because* the appropriate value is already in r3. The effect is the same with the Intel implementation. 'Self' is passed in the EAX register, and EAX is used for the return value. This is -- I believe -- the reason why GCC ensures that Self is the last value that's placed in EAX immediately before the dispatcher call."
Aha. I hadn't thought about that. Honestly, I try to avoid depending on this behavior, to a degree. No, I don't check for nil before [fooObject release]; but I'd avoid {if ([fooString length])}, using {if ((fooString != nil) && ([fooString length] != 0))} instead. The latter, while chattier, is actually faster, safer, etc. True, it uses a couple more bytes. But the checks against 0, if I'm not mistaken, simplify out to the same assembly. And catching nil up front saves another trip through objc_msgSend. More importantly, here's where you're fighting resistance.
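Spelled out as a sketch (fooString as above):

// Relies on messaging nil returning 0:
if ([fooString length]) {
    // ...
}

// Explicit check: skips the objc_msgSend trip entirely when fooString is nil.
if ((fooString != nil) && ([fooString length] != 0)) {
    // ...
}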
This bit about handling nil sounded familiar. Indeed, it's a really nasty design tradeoff. You don't always get 0 back from a nil call.
http://ridiculousfish.com/blog/archives/2005/05/29/nil/
r3 is 0, true, but r4 (the lower half of a long long return value) is the selector, and fpr1 is both the first floating-point argument in and the floating-point result out. And none of these registers are changed by a message to nil. In other words, for PPC (sketched below):
[nil fooInt] (returning UInt32) gives 0.
[nil fooLongLong] (returning UInt64) gives (long long)@selector(fooLongLong) on 32-bit systems, but 0 on 64-bit systems.
[nil fooFloat: barArg] (returning float) gives barArg.
[nil fooFloat] (returning float) is undefined.
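As a compilable sketch of the above (Foo and its selectors are hypothetical; the commented results describe the 32-bit PPC runtime only, and are exactly the sort of thing not to rely on):

#import <Foundation/Foundation.h>

// Foo is a made-up class; none of these methods are ever implemented,
// which is fine, because we only ever message nil.
@interface Foo : NSObject
- (UInt32) fooInt;
- (UInt64) fooLongLong;
- (float) fooFloat: (float) barArg;
@end

static void nilReturnDemo(void)
{
    Foo *foo = nil;
    UInt32 a = [foo fooInt];          // 0: r3 still holds the nil receiver
    UInt64 b = [foo fooLongLong];     // low word (r4) is the untouched selector
    float  c = [foo fooFloat: 2.0f];  // 2.0f: fpr1 still holds the argument
    NSLog(@"%lu %llu %f", (unsigned long)a, b, c);
}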
There needs to be a happy medium between tight code and flexible code. And there is a lot of bloated code out there, regardless of language. But if you over-optimize, bad things happen. This isn't to diminish your message. Honestly, I enjoy your articles, despite playing the devil's advocate so often.