Even though today’s CPUs can’t execute IL instructions directly, CPUs of the future might have this capability. To execute a method, its IL must first be converted to native CPU instructions. This is the job of the CLR’s JIT (just-in-time) compiler.
Standardizing the .NET Framework
In October 2000, Microsoft (along with Intel and Hewlett-Packard as co-sponsors) proposed a large subset of the .NET Framework to the ECMA (the European Computer Manufacturers Association) for the purpose of standardization. The ECMA accepted this proposal and created a technical committee (TC39) to oversee the standardization process. The technical committee is charged with the following duties:
Group 1 Develop a dynamic scripting language standard (ECMAScript). Microsoft’s implementation of ECMAScript is JScript.
Group 2 Develop a standardized version of the C# programming language.
Group 3 Develop a Common Language Infrastructure (CLI) based on a subset of the functionality offered by the .NET Framework’s CLR and class library. Specifically, the CLI will define a file format, a common type system, an extensible metadata system, an intermediate language (IL), and access to the underlying platform (P/Invoke). In addition, the CLI will define a factorable (to allow for small hardware devices) base class library designed for use by multiple programming languages.
Once the standardization is complete, these standards will be contributed to ISO/IEC JTC 1 (Information Technology). At this time, the technical committee will also investigate further directions for CLI, C#, and ECMAScript as well as entertain proposals for any complementary or additional technology. For more information about ECMA, see http://www.ECMA.ch and http://MSDN.Microsoft.com/Net/ECMA.
With the standardization of the CLI, C#, and ECMAScript, Microsoft won’t “own” any of these technologies. Microsoft will simply be one company of many (hopefully) that are producing implementations of these technologies. Certainly Microsoft hopes that its implementation will be the best in terms of performance and customer-demand-driven features. This is what will help sales of Windows, since the Microsoft “best of breed” implementation will run only on Windows. However, other companies may implement these standards, compete against Microsoft, and possibly win.
Figure 1-4 shows what happens the first time a method is called.
Just before the Main method executes, the CLR detects all the types that are referenced by
Main’s code. This causes the CLR to allocate an internal
data structure that is used to manage access to the referenced type. In Figure 1-4, the Main
method refers to a single type, Console, causing the CLR to allocate a single
internal structure. This internal data structure contains an entry for each
method defined by the type. Each entry holds the address where the method’s
implementation can be found. When initializing this structure,
the CLR sets each entry to an internal, undocumented function contained inside the CLR itself. I call this function JITCompiler.
When Main makes its first call to WriteLine, the JITCompiler function is called. The
JITCompiler function is responsible for compiling a method’s IL code into
native CPU instructions. Because the IL is being compiled "just in
time," this component of the CLR is frequently referred to as a JITter or
a JIT compiler.
When called, the JITCompiler function knows what method is being called and what type defines this method. The JITCompiler function then searches the defining assembly’s metadata for the called method’s IL. JITCompiler next verifies and compiles the IL code into native CPU instructions. The native CPU instructions are saved in a dynamically allocated block of memory. Then, JITCompiler goes back to the type’s internal data structure and replaces the address of the called method with the address of the block of memory containing the native CPU instructions. Finally, JITCompiler jumps to the code in the memory block. This code is the implementation of the WriteLine method (the version that takes a String parameter). When this code returns, it returns to the code in Main, which continues execution as normal.
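The sequence just described can be modeled in a few lines of C#. This is only a toy sketch, not how the CLR is implemented: the real CLR patches native code addresses, whereas this model uses delegates, and the names TypeMethodTable, Define, Call, and CompileCount are hypothetical names invented for the illustration. Each entry starts out pointing at a JITCompiler stub; the first call "compiles" the method, replaces the entry, and then jumps to the compiled code.

```csharp
using System;
using System.Collections.Generic;

// Toy model of a type's internal data structure (hypothetical; the real
// CLR stores native code addresses, not delegates).
public sealed class TypeMethodTable
{
    // One entry per method: the "address" that a call is dispatched to.
    private readonly Dictionary<string, Action<string>> entries =
        new Dictionary<string, Action<string>>();

    // Stands in for the method's IL: a factory that produces the "native" code.
    private readonly Dictionary<string, Func<Action<string>>> il =
        new Dictionary<string, Func<Action<string>>>();

    // Counts how many times the JITCompiler stub has run.
    public int CompileCount { get; private set; }

    public void Define(string methodName, Func<Action<string>> compile)
    {
        il[methodName] = compile;
        // Initially, every entry points at the JITCompiler stub.
        entries[methodName] = arg => JITCompiler(methodName, arg);
    }

    private void JITCompiler(string methodName, string arg)
    {
        CompileCount++;
        Action<string> native = il[methodName](); // "verify and compile" the IL
        entries[methodName] = native;             // replace the entry's address
        native(arg);                              // jump to the compiled code
    }

    public void Call(string methodName, string arg)
    {
        entries[methodName](arg);
    }
}

public static class Program
{
    public static void Main()
    {
        TypeMethodTable console = new TypeMethodTable();
        console.Define("WriteLine", () => s => Console.WriteLine(s));

        console.Call("WriteLine", "Hello");      // goes through the JITCompiler stub
        console.Call("WriteLine", "Goodbye");    // goes straight to the compiled code
        Console.WriteLine(console.CompileCount); // prints 1
    }
}
```

Because the stub overwrites its own entry, the second call never reaches JITCompiler, which is why the compile count stays at 1.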
Figure 1-5 shows what the situation looks like when WriteLine is called the second time.
A performance hit is incurred only the first time a method is called. All subsequent calls to the method execute at the full speed of the native code: verification and compilation to native code are not performed again.
The JIT compiler stores the native CPU instructions in dynamic memory. This means that the compiled code is discarded when the application terminates. So, if you run the application again in the future or if you run two instances of the application simultaneously (in two different operating system processes), the JIT compiler will have to compile the IL to native instructions again.
For most applications, the performance hit incurred by JIT compiling isn’t significant. Most applications tend to call the same methods over and over again. These methods will take the performance hit only once while the application executes. It’s also likely that more time is spent inside the method than calling the method.
Figure 1-5: Calling a method for the second time
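If you want to observe the first-call cost yourself, a rough sketch along the following lines may help. The class FirstCallTiming and the method SumOfSquares are hypothetical names chosen for this illustration, and the NoInlining attribute is there only so that the method stays a separate method to be compiled. The measured numbers depend on the machine, the runtime version, and timer resolution, so treat them as an indication rather than a benchmark.

```csharp
using System;
using System.Diagnostics;
using System.Runtime.CompilerServices;

public static class FirstCallTiming
{
    // Hypothetical workload; NoInlining keeps it a separately compiled method.
    [MethodImpl(MethodImplOptions.NoInlining)]
    private static double SumOfSquares(int n)
    {
        double total = 0;
        for (int i = 1; i <= n; i++)
        {
            total += (double)i * i;
        }
        return total;
    }

    // Times a single call to SumOfSquares and returns the elapsed timer ticks.
    public static long TimeOneCall()
    {
        Stopwatch watch = Stopwatch.StartNew();
        SumOfSquares(1000);
        watch.Stop();
        return watch.ElapsedTicks;
    }

    public static void Main()
    {
        // The first measurement includes the time to JIT compile SumOfSquares.
        long first = TimeOneCall();
        // Later measurements run the already compiled native code.
        long second = TimeOneCall();
        long third = TimeOneCall();

        Console.WriteLine("First call:  {0} ticks", first);
        Console.WriteLine("Second call: {0} ticks", second);
        Console.WriteLine("Third call:  {0} ticks", third);
    }
}
```

On most runs the first figure should be noticeably larger than the others, although nothing in the sketch guarantees it.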
You should also be aware that the CLR’s JIT compiler optimizes the native code just as the back-end of an unmanaged C++ compiler does. Again, it may take more time to produce the optimized code, but the code will execute with much better performance than if it hadn’t been optimized.
If you’re a developer coming from an unmanaged C or C++ background, you’re probably thinking about the performance ramifications of all this. After all, unmanaged code is compiled for a specific CPU platform and, when invoked, the code can simply execute. In this managed environment, compiling the code is accomplished in two phases. First, the compiler passes over the source code, doing as much work as possible in producing IL. But to execute the code, the IL itself must be compiled into native CPU instructions at run time, requiring more memory to be allocated and requiring additional CPU time to do the work.
Believe me, since I approached the CLR from a C/C++ background myself, I was quite skeptical and concerned about this additional overhead. The truth is that this second compilation stage that occurs at run time does hurt performance and it does allocate dynamic memory. However, Microsoft has done a lot of performance work to keep this additional overhead to a minimum.
If you too are skeptical, you should certainly build some applications and test the performance for yourself. In addition, you should run some nontrivial managed applications Microsoft or others have produced and measure their performance. I think you’ll be surprised at how good the performance actually is. In fact, you’ll probably find this hard to believe, but many people (including me) think that managed applications could actually outperform unmanaged applications. There are many reasons to believe this. For example, when the JIT compiler compiles the IL code into native code at run time, the compiler knows more about the execution environment than an unmanaged compiler would know. Here are some ways that managed code could outperform unmanaged code:
A JIT compiler could detect that the application is running on a Pentium 4 and produce native code that takes advantage of any special instructions offered by the Pentium 4. Usually, unmanaged applications are compiled for the lowest-common-denominator CPU and avoid using special instructions that would give the application a performance boost on newer CPUs.
A JIT compiler could detect that a certain test is always false on the machine that it is running on. For example, consider a method with code like this:
if (numberOfCPUs > 1) {
   ...
}
This code could cause the JIT compiler not to generate any CPU instructions for the test or its body if the host machine has only one CPU. In this case, the native code has been fine-tuned for the host machine: the code is smaller and executes faster.
The CLR could profile the code’s execution and recompile the IL into native code while the application runs. The recompiled code could be reorganized to reduce incorrect branch predictions depending on the observed execution patterns.
These are only a few of the reasons why you should expect future managed code to execute better than today’s unmanaged code. As I said, the performance is currently quite good for most applications, and it promises to improve as time goes on.
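To make the numberOfCPUs case concrete, here is a small C# sketch. The class HostTuning and the method ChooseStrategy are hypothetical names used only for illustration. The value is read once from Environment.ProcessorCount into a static readonly field, which is the kind of value a JIT compiler could, in principle, treat as fixed for the life of the process. Whether a particular JIT compiler actually removes the untaken branch is an implementation detail that this sketch does not demonstrate or guarantee.

```csharp
using System;

public static class HostTuning
{
    // Read once per process; the value cannot change after initialization.
    private static readonly int numberOfCPUs = Environment.ProcessorCount;

    // On a single-CPU host, the test below is always false, so a JIT compiler
    // that knows the value could omit the branch entirely (an assumption
    // about a possible optimization, not a documented behavior).
    public static string ChooseStrategy()
    {
        if (numberOfCPUs > 1)
        {
            return "parallel";
        }
        return "sequential";
    }

    public static void Main()
    {
        Console.WriteLine("CPUs: {0}, strategy: {1}",
            Environment.ProcessorCount, ChooseStrategy());
    }
}
```

An unmanaged compiler cannot make this decision at build time, because it does not know which machine the program will eventually run on.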
If your experiments show that the CLR’s JIT compiler doesn’t offer your application the kind of performance it requires, you may want to take advantage of the NGen.exe tool that ships with the .NET Framework SDK. This tool compiles all of an assembly’s IL code into native code and saves the resulting native code to a file on disk. At run time, when an assembly is loaded, the CLR automatically checks to see whether a precompiled version of the assembly also exists, and if it does, the CLR loads the precompiled code so that no compilation at run time is required.