Saturday, December 03, 2005 

PInvoke and IJW

One of the additions to C++/CLI was a new way to call unmanaged code from managed code. Microsoft called it IJW, short for "It Just Works".
The older way of invoking unmanaged code from managed code, as used in C#, is called "Platform Invoke", or simply PInvoke.

I was very interested when I read in an article that the performance of IJW is much higher than that of PInvoke. The writer wrote two small programs: one in C# that used PInvoke to call an unmanaged function, and one in C++ that used IJW to call the same unmanaged function.
But in one of the comments on the article, someone noted that the C# code carried a performance overhead for converting a StringBuilder into a char array. When this overhead was removed by using a data type other than StringBuilder, the C# program using PInvoke became the faster one!

I decided to try it myself. I modified the two programs to make them as similar as possible. The input to the unmanaged function was an array of TCHAR. In C# I used a char array instead of a StringBuilder or even a string, to keep the C++ and the C# code as close as possible. I used the QueryPerformanceCounter function to measure the elapsed time as accurately as possible.
Here are the two programs. I am sorry that this blogging application doesn't have C++ or C# syntax highlighting, so the code may be somewhat difficult to read.

The C++ program:


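#include <windows.h>
#include <tchar.h>

using namespace System;

// Calls the unmanaged GetWindowsDirectory function x times through IJW.
void Test(int x)
{
    TCHAR Buffer[512];
    for (int i = 0; i < x; i++)
        GetWindowsDirectory(Buffer, 511);
}

// Times x calls and prints the elapsed performance-counter value.
void DoTest(int x)
{
    LARGE_INTEGER Initial;
    if (!QueryPerformanceCounter(&Initial))
    {
        Console::WriteLine("Failed");
        return;
    }
    Test(x);
    LARGE_INTEGER Final;
    if (QueryPerformanceCounter(&Final))
    {
        // Counter ticks; divide by the counter frequency to get seconds.
        __int64 ElapsedTime = Final.QuadPart - Initial.QuadPart;
        Console::WriteLine("It Took {0} cycles to perform {1} iterations", ElapsedTime, x);
    }
    else
    {
        Console::WriteLine("Failed");
    }
}

int main()
{
    LARGE_INTEGER Frequency;
    QueryPerformanceFrequency(&Frequency);
    // Use a format string: "literal" + integer would be pointer arithmetic here.
    Console::WriteLine("Counter frequency = {0}", Frequency.QuadPart);
    DoTest(100);
    DoTest(1000);
    DoTest(100000);
    DoTest(1000000);
    Console::Read();
    return 0;
}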

The C# Program:

using System;
using System.Runtime.InteropServices;
using System.Security;

class Program
{
    [SuppressUnmanagedCodeSecurity]
    [DllImport("Kernel32.dll")]
    static extern bool QueryPerformanceCounter(out long PerformanceCount);

    [SuppressUnmanagedCodeSecurity]
    [DllImport("Kernel32.dll")]
    static extern bool QueryPerformanceFrequency(out long PerformanceFrequency);

    [SuppressUnmanagedCodeSecurity]
    [DllImport("kernel32.dll", SetLastError = true, CharSet = CharSet.Auto)]
    static extern uint GetWindowsDirectory([Out] char[] Buffer, uint uSize);

    // Calls the unmanaged GetWindowsDirectory function x times through PInvoke.
    static void Test(int x)
    {
        char[] Buffer = new char[512];
        for (int i = 0; i < x; i++)
        {
            GetWindowsDirectory(Buffer, 511);
        }
    }

    // Times x calls and prints the elapsed performance-counter value.
    static void DoTest(int x)
    {
        long Initial;
        if (!QueryPerformanceCounter(out Initial))
        {
            Console.WriteLine("Failed");
            return;
        }
        Test(x);
        long Final;
        if (QueryPerformanceCounter(out Final))
        {
            // Counter ticks; divide by the counter frequency to get seconds.
            long ElapsedTime = Final - Initial;
            Console.WriteLine("It Took {0} cycles to perform {1} iterations", ElapsedTime, x);
        }
        else
        {
            Console.WriteLine("Failed");
        }
    }

    static void Main()
    {
        long PerformanceFrequency;
        QueryPerformanceFrequency(out PerformanceFrequency);
        Console.WriteLine("Performance Frequency = {0}", PerformanceFrequency);
        DoTest(100);
        DoTest(10000);
        DoTest(100000);
        DoTest(1000000);
        Console.Read();
    }
}

The result was that the C# program was faster by a small margin. This was strange to me, because I expected the C++ program to be faster.
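Both programs print raw counter differences, which only become times once divided by the counter frequency printed at startup. The following is a minimal sketch of that conversion in portable C++; the helper names TicksToMilliseconds and NanosecondsPerCall are hypothetical and are not part of either program above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: converts an elapsed performance-counter difference
// to milliseconds. `frequency` is the ticks-per-second value reported by
// QueryPerformanceFrequency.
double TicksToMilliseconds(std::int64_t elapsedTicks, std::int64_t frequency)
{
    assert(frequency > 0);
    return static_cast<double>(elapsedTicks) * 1000.0 /
           static_cast<double>(frequency);
}

// Hypothetical helper: average cost of a single call in nanoseconds,
// given the elapsed ticks for `iterations` calls.
double NanosecondsPerCall(std::int64_t elapsedTicks,
                          std::int64_t frequency,
                          std::int64_t iterations)
{
    assert(frequency > 0 && iterations > 0);
    return static_cast<double>(elapsedTicks) * 1e9 /
           static_cast<double>(frequency) /
           static_cast<double>(iterations);
}
```

As an illustration with made-up numbers (not measurements from this post): at a frequency of 1,000,000 ticks per second, 1,000 elapsed ticks over 1,000 iterations would mean 1 millisecond in total, or 1,000 nanoseconds per call.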

I used the ILdasm tool to look at the IL of the two generated exe files. Something caught my attention: in the C++ application, the call to the unmanaged function GetWindowsDirectory is made using the stdcall calling convention. This was the IL for the GetWindowsDirectory call:

uint32 modopt([mscorlib/*23000001*/]System.Runtime.CompilerServices.CallConvStdcall/*01000001*/) GetWindowsDirectoryW(char*, uint32) /* 06000058 */

On the other hand, in the IL of the C# program the stdcall calling convention was not used for the PInvoke calls; the winapi calling convention was used instead. This is the IL of one of the PInvoke declarations in the C# application (the one shown is QueryPerformanceCounter):

.method /*06000001*/ private hidebysig static pinvokeimpl("Kernel32.dll" winapi) bool QueryPerformanceCounter([out] int64& PerformanceCount) cil managed preservesig

I searched the documentation and found that stdcall is supposed to be the default calling convention when using PInvoke. Judging by the IL, this was not true in our case. I forced the function call to use the stdcall calling convention by specifying it explicitly in the DllImport attribute, like this:

[DllImport("kernel32.dll", SetLastError = true, CharSet = CharSet.Auto, CallingConvention = CallingConvention.StdCall)]

I tested the two programs again and found that the C++ application was the faster one this time, but again by a small margin.
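With margins this small, a single run may be dominated by noise. One possible way to check, sketched below in portable C++, is to repeat each test several times, take the median tick count for each program, and express the difference as a percentage. The helper names MedianTicks and RelativeMarginPercent are hypothetical; this is not the procedure used for the results above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: median of the tick counts from repeated runs of
// the same test. For an even number of samples the upper median is used.
std::int64_t MedianTicks(std::vector<std::int64_t> samples)
{
    assert(!samples.empty());
    std::sort(samples.begin(), samples.end());
    return samples[samples.size() / 2];
}

// Hypothetical helper: difference of `candidate` relative to `baseline`,
// in percent. A negative value means the candidate took fewer ticks.
double RelativeMarginPercent(std::int64_t baseline, std::int64_t candidate)
{
    assert(baseline > 0);
    return 100.0 * static_cast<double>(candidate - baseline) /
           static_cast<double>(baseline);
}
```

If the sign of the relative margin stays the same across repeated batches, the difference is more likely to be real than an artifact of one run.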

Here are some notes to take into consideration when comparing PInvoke and IJW:

  1. Theoretically, a managed data type should be marshaled to the same unmanaged type under PInvoke as under IJW. This does not hold in all cases: Arjun Bijanki from the Visual C++ team noted in a discussion that some data types can be marshaled to different types under IJW than under PInvoke.
  2. The MSIL generated for PInvoke is not the same as that generated for IJW, even for very similar code.
  3. Not everything mentioned in the docs is true!
  4. Syntactically, the IJW method is easier and more elegant. It is strange to find something that can be done more easily and conveniently in C++ than in C#, but it is true. Microsoft has played it well this time!

I seem to recall that the WINAPI calling convention is actually stdcall - at least, that's what it is in C++.

Search for the definition of WINAPI (It's a #define) in the windows headers, and you'll find it's defined to be stdcall.

You are right. The WinAPI calling convention resolves to the default platform calling convention, which is StdCall on Windows and Cdecl on Windows CE.NET.

Anyway, as I said, in both cases the performance margin was so small that it can be ignored, but the results are repeatable: when using StdCall in C#, the C++ application is the faster one, and when using the WinAPI convention, the C# application is the faster one.
Maybe there's something I don't know, or something I didn't notice in the disassembly of the two applications.

Well, there's plenty of room to experiment :)

Why don't you try contacting your ISV-contact or whatever he was called? Or try GameDev's .NET forum; there are highly knowledgeable people there - most notably the moderator, Washu.

I have asked about IJW, PInvoke and the differences between them on Microsoft's forum, and no one answered me except for one member of the VC++ team. He answered my questions partially and said that he would ask the compiler team about them, but he didn't reply after that.
One of the problems with Microsoft's forums is that you cannot send a personal message to someone, so I couldn't send him a message reminding him of my questions.
And regarding my ISV buddy at Microsoft, he answers late, sometimes three weeks after I send the question; he always seems busy.
But recently he sent me an email saying that he cannot answer my questions.
Anyway, it is not very important to me. I was just trying both techniques to find the difference in performance myself, because it could have affected my choice of programming language for my DBMS project. The performance margin (fractions of a second over a million calls) is not very large. It will not affect my decision.
I have already made my choice, and I plan to use C++ (mixed code) in my application, God willing.

About me

  • I'm Mohammad Adel
  • From Cairo, Egypt
  • I am an Egyptian developer @ optimize software company.