[Note: This was originally written for "FYI", the internal newsletter for the worldwide software engineering departments of Dendrite International]
Recently, Ive heard some complaints that Im too rough in these articles. Actually, Ive always thought that I was being gentle -- I never reveal the author of the code Ive dissected on these pages, and when I used recognizable Dendrite code, its usually from someone who has left the company. Just to show you what I could do if I "let loose", this month well study a snippet of code from someone completely outside of Dendrite.
This function was recently published in the "Letters" section of the "C/C++ Users Journal". The author had seen a function with handled a similar task in an earlier issue, and was offering this "better" way to do the job (adding commas to a number):
void FormatOutput(char * String) { int x,c,d,i; char Work[20]; x = strlen(String)-1; for (i=1,c=x+(x/3)+1,d=x; c>0; --c,--d,++i) { if(i%4==0) { *(Work+(--c)) = ','; i=1; } *(Work+(c-1)) = *(String+d); } Work[x+(x/3)+1] = '\0'; strcpy(String, Work); } void main(int argc, char *argv[]) { FormatOutput(argv[1]); printf("String format output=[ %s ]\n", argv[1]); }
The author billed this method as "shorter" than that given in the original article, but it achieves it's small size mostly by sticking several statements on a single line. The line with the "for()" actually contains seven separate actions, for example. If we were to "unroll" these to their equivalent individual statements, the lines of code increase by 50% but they generate byte-for-byte identical objects code (at least with Borland C++ 3.1 & Microsoft Visual C++):
#include <stdio.h> #include <string.h> void FormatOutput(char * String) { int x; int c; int d; int i; char Work[20]; x = strlen(String)-1; i=1; c=x+(x/3)+1; d=x; while (c > 0) { if(i%4==0) { --c; Work[c] = ','; i=1; } Work[c-1] = String[d]; --c; --d; ++i; } Work[x+(x/3)+1] = '\0'; strcpy(String, Work); }
More important then the fact that the two versions produce identical output on two particular compilers is that they are logical equivalent, so if they failed to produce the same output a some compiler, I'd have to say that the problem is with the compiler, not the source code.
Also, note that Ive replaced the array accesses in the form of "*(Work+(c-1))" with the form "Work[c-1]". I imagine the author was once told that pointer access is more efficient then array access, which is partly true -- but only in the sense that:
x=*ptr; ptr++;
is more efficient than:
x = ptr[i]; i++;
This is particularly true if ptr is pointer to something larger than a char. But once you index off the ptr ("*(ptr+i)") the two formats are identical. I've never seen (nor can I conceive of) a compiler generating different code for the two forms. The Array style just seems easier to read.
However, now that we've got the function laid out in front of us, other optimizations become more obvious. First, the value "x+(x/3)+1" is calculated twice. And what kind of a name is "x" for a variable? Also, note that "i" starts at 1, and is reset to 1 every time i%4 ==0, which means i will never get higher than 4. This means that the "i%4==0" can be replaced by "i==4" which saves us the divide needed to generate the modulo. This may seem like a minor point, but remember that a divide is one of the most expensive instructions in clock-cycles. In fact, that divide, by itself, will take nearly as long as every other instruction inside the inner loop. This one change by itself will nearly double the speed of this function. When I emailed the author (The Info Highway at work!), he said on most projects they profile an early version of system to determine where they should spend time optimizing. In other words, they slap it together as fast as they can, and then, when they realize its too slow, they tried to figure out what they can patch to speed it up. Sound familiar? Wont it be simpler just to do it right in the first place ?
Next, note that first we subtract one from c in the line "Work[c-1]", and throw away that value after using it. And then, subtract one from c in the very next line ("--c;") and keep than value. This work can be halved by moving the "--c" above the other line and making that "Work[c] =..."
But this leads us to another discovery. Every time the function wants to use "c", we subtract one from it first. Clearly, "c" is one too high. Not surprising, since we add one to it when we assign it a value. By moving that "+1" to the bottom (where we use that value again), we make our code a little clearer.
Finally, we have the issue of internationalization! As hopefully any Dendriter knows, some countries have some funny ideas about how numbers should be separated. Since there are about a thousand different theories on the best way to handle this, all of which are beyond the scope if this article, well just try something quick and utilitarian, a couple #defines.
Assigning "," to a mnemonic is simple enough, but in the original design, the number of digits in the group is designated by a "3" in two spots (which Ive already reduced to one), and by a "4" in another. But this is easy to fix: "i" can go from 0 to 3, just as easily as it goes from 1 to 4.
Putting all of this together, we get:
#define SEP_CHAR ',' #define GROUP 3 void FormatOutput(char *String) { int SrcLen; int ExtLen; int c; int d; int grp; char Work[20]; SrcLen = strlen(String)-1; ExtLen = SrcLen+(SrcLen/GROUP); grp = 0; c = ExtLen; d = SrcLen; while (c > 0) { if (grp == GROUP) { Work[c] = SEP_CHAR; --c; grp = 0; } Work[c] = String[d]; --c; --d; ++grp; } Work[ExtLen+1] = '\0'; strcpy(String, Work); }
This produces an object code which is slightly (5%) smaller then the original, but with an inner loop which is 33% smaller.
However, since I limited my changes to those inside the "blackbox", as to not change the function's interface, I could not address what I think is the most significant problem with this routine - the fact that it copies the lengthened string back into the space of the original. This assumes that the buffer allocated for the original string is big enough not only for itself, but for what it will later become. While this is not necessarily a bad thing, it requires very explicit documentation, or disaster can occur. For an example of this, one need only take the very demo code that the original author provided, and extend it slightly....
void main(int argc, char *argv[]) { FormatOutput(argv[1]); FormatOutput(argv[2]); printf("String format output=[%s]\n", argv[1]); printf("String format output=[%s]\n", argv[2]); }
Here, the compilers run-time library allocates space for command-line parameters, but gives no extra for expansion. Running this with a command of :
TEST 1234567 12345
one would expect a response of
String format output = [1,234,567] String format output = [12,345]
However, what you would get would be:
String format output = [1,234,567] String format output = [7]
as the expanded first string overwrote the second.
Copyright © 1994, James M. Curran