[intermediate-high level]
Hi All,
Today I want to consider the following case:
In my application I want to use some Fibonacci numbers (this is just a hypothetical example :)). I want to see if I can make the JIT/C++ Compiler to put the actual number instead of calling the calculation method. I made two implementations of the calculation but I will show only one, recursive and horribly slow implementation (don't ever use it or even consider of using it!):
static int FibonacciRecursive(int level)
{
if (level == 0)
return 0;
if (level == 1)
return 1;
return FibonacciRecursive(level - 1) + FibonacciRecursive(level - 2);
}
And my code for the tests is:
static void Main(string[] args)
{
Console.ReadLine();
int result = FibonacciRecursive(45);
Console.Out.WriteLine(result);
}
I added the ReadLine so that I will have time to attach a debugger when I run the code. The 45th Fibonacci number, if you were wondering, is 1134903170 and not long after that it overflows int :).
I will save you the reading time and not try my code in debug mode. I will first run my code from Visual Studio by pressing "Run" and here is the result (I added comments in case your assembly is a little rusty):
; int result = FibonacciRecursive(45);
00000022 mov ecx,2Dh ;put 45 in the ecx register (this is how methods in .net expect their first argument)
00000027 call dword ptr ds:[006C1F34h] ;call the FibonacciRecursive method
0000002d mov dword ptr [ebp-0Ch],eax ;get the return value from eax register and put it in the "result" variable
00000030 mov eax,dword ptr [ebp-0Ch]
00000033 mov dword ptr [ebp-8],eax
As you can see there was no optimization in this case, in fact we can see the two lines
0000002d mov dword ptr [ebp-0Ch],eax
00000030 mov eax,dword ptr [ebp-0Ch]
Which copy the return value of the function (stored in eax) to the result variable and then copy it back to eax.
Now I will run the same code but attach the debugger at a later stage (after JIT generated the code) here is the result (again I added some comments):
00000013 mov ecx,2Ch ;put 44 in ecx register
00000018 call dword ptr ds:[001A37F8h] ;call the FibonacciRecursive method
0000001e mov esi,eax ;put the result in esi register
00000020 mov ecx,2Bh ;put 43 in the ecx register
00000025 call dword ptr ds:[001A37F8h] ;call the FibonacciRecursive method
0000002b add eax,esi ;add the results of the two calls into eax register
As you can see the code is much more optimized. Which means JIT generates different code if it sees a debugger attached so even when you compile in Release with optimization and run in VS you don't see the most optimized code.
You can also see that the JIT compiler unwinded the recursion and put its code inline but again the JIT compiler called the method and did not generate the result directly as I wanted. If you are beginning to doubt this is at all possible, keep reading.
Now I will change the call to
int result = FibonacciRecursive(1);
and run it without VS. Here is the result:
00000019 mov edx,1
Just as we wanted! Instead of calling the method it simply put 1 as the result.
Lets try running it with
int result = FibonacciRecursive(2);
The result is quite interesting:
00000013 mov ecx,1 ;put 1 in the ecx register
00000018 call dword ptr ds:[001937F8h] ;call FibonacciRecursive
0000001e mov esi,eax ;put the result in esi register
00000020 mov ecx,0 ;put 0 in the ecx register
00000025 call dword ptr ds:[001937F8h] ;call FibonacciRecursive
0000002b add eax,esi ;add the results of the two calls
This is really nice. Now we have two calls which are non recursive. This makes me wonder why it didn't run the same logic as it did in the previous run and replaced the method calls with the actual values. My guess would be to speed up the compilation time as it is done in runtime.
These are all my tries in .NET and as you can see I failed to achieve my goal. One could say that this is not possible but I want to look at a little C++ trick which can achieve just what I want. This trick is taken from a programming paradigm known as Generic Programming. This is the C++ code:
#include <iostream>
template <int N>
struct Fibonacci
{
static const long result = Fibonacci<N-1>::result+Fibonacci<N-2>::result;
};
template <>
struct Fibonacci<0>
{
static const long result = 0;
};
template <>
struct Fibonacci<1>
{
static const long result = 1;
};
int main()
{
std::cout << Fibonacci<45>::result;
return 0;
}
and when run, it compiles into this:
01001000 mov ecx,dword ptr [__imp_std::cout (1002048h)]
01001006 push 43A53F82h ; 1134903170 decimal == 43A53F82 hexadecimal
0100100B call dword ptr [__imp_std::basic_ostream<char,std::char_traits<char> >::operator<< (1002044h)]
This is exactly what I wanted. The compiler simply pushes the 45th Fibonacci element onto the stack and calls cout. (In C++ arguments are passed using stdcall convention where all the arguments are pushed to the stack.)
That's it for this week.
Thank you for reading.
Boris.