C# Compiler Emits call IL Instruction for Instance Methods Called on the Reference Returned by a Constructor

Some time ago, I wrote about how the call instruction can actually call an instance method on a null reference, and that inside this instance method, the this keyword refers to null.

I find that very interesting, so I kept on disassembling some sample code to see the generated IL and try to grasp some of the compiler’s logic.

Here is some simple code used to see the IL generated by the C# compiler:

class Hello1
{
    internal String GetHello()
    {
        return "Hello1";
    }
}

sealed class Hello2
{
    internal String GetHello()
    {
        return "Hello2";
    }
}

static void Main(string[] args)
{
    var h1 = new Hello1();
    Console.WriteLine(h1.GetHello());

    var h2 = new Hello2();
    Console.WriteLine(h2.GetHello());

    Console.WriteLine(new Hello1().GetHello());

    Console.WriteLine(new Hello2().GetHello());

    Console.ReadLine();
}

In the first two calls, we call the GetHello method on a local variable. In the last two calls, we instantiate the object and call GetHello directly on the reference returned by the constructor, a reference that we don’t keep.

Here’s the IL generated for the Main method:

[Image: CallCallvirtIL]

We can see that in the first two calls, the compiler emits the callvirt instruction. GetHello is not virtual, so the runtime type of the object does not change which method runs; the compiler uses callvirt here mainly because it includes a null check, and a local variable could be null (the compiler is not “smart” enough to detect that h1 and h2 were just assigned a new object).

In the two subsequent calls, however, the method is called directly on the reference returned by the constructor, which can never be null, so the instruction emitted is call, which is slightly more performant than callvirt.

For more information on call and callvirt instructions, see ECMA 335 12.4.1.2.

Catching all Exceptions

These days, I’ve seen a lot of code like this in the code base I’m working on:

try
{
    //Do some parsing or any dangerous operation
}
catch (Exception)
{
    //Return something, as if nothing happened
}

Now, I can understand why this is done. It feels safe. After all, if you are in a method that is supposed to return something, you can simply return a default value if an exception shows up, right?

Wrong. Very wrong.

Please, only catch the exceptions you know you can recover from. Let’s say you parse something in the try block. First, check whether there is a TryParse method to do that instead, so no exception is thrown in the first place. If that’s not an option, only catch the exceptions that are caused by the parsing process, not all exceptions.
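To make the two options concrete, here is a minimal sketch using int parsing as the example. The class and method names (ParsingExample, ParseOrDefault, ParseNarrowCatch) are made up for illustration; int.TryParse, int.Parse and the exception types are the real framework ones.

```csharp
using System;

static class ParsingExample
{
    // Preferred: TryParse reports failure through its return value,
    // so no exception is thrown for bad input.
    public static int ParseOrDefault(string text, int fallback)
    {
        int value;
        return int.TryParse(text, out value) ? value : fallback;
    }

    // If no TryParse exists, catch only what the parsing itself can throw.
    public static int ParseNarrowCatch(string text, int fallback)
    {
        try
        {
            return int.Parse(text);
        }
        catch (FormatException)
        {
            return fallback; // input is not a number
        }
        catch (OverflowException)
        {
            return fallback; // input does not fit in an int
        }
        // Anything else (for example ArgumentNullException for a null input)
        // is deliberately not caught and bubbles up to the caller.
    }

    static void Main()
    {
        Console.WriteLine(ParseOrDefault("42", -1));
        Console.WriteLine(ParseOrDefault("forty-two", -1));
        Console.WriteLine(ParseNarrowCatch("99999999999", -1));
    }
}
```

Whether a null input should count as a recoverable parsing failure is a design decision; in this sketch it is treated as a bug in the caller and left to propagate.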

Let the exceptions you can’t deal with bubble up the call stack. In the end, they will reach a caller that can handle them, or the process will fail and log them. That is the way to go in all situations.

For an excellent reference on exception guidelines, please see Framework Design Guidelines: Conventions, Idioms, and Patterns for Reusable .NET Libraries (2nd Edition).

SharePoint, SPWeb Objects and Dispose()

If you have worked a bit with SharePoint, you surely know how important it is to dispose of SPSite objects, SPWeb objects and such.

However, these objects are sometimes retrieved by using another object’s property. A very common one is SPContext.Current, which has properties that return the current SPSite and SPWeb.

Needless to say, calling dispose on an object that you got via another object’s property seems utterly rude. I mean, you are basically disposing something that was “borrowed” from another object. When I see code doing this, I get very suspicious.

Here is a common example:

using (SPWeb web = SPContext.Current.Web)
{
    //Do something with web
}

Using Reflector, here is what SPContext.Web does:

public SPWeb Web
{
    get
    {
        if (this.m_web == null)
        {
            this.m_web = SPControl.GetContextWeb(this.m_context);
        }
        return this.m_web;
    }
}

As we can see, this returns a private field of type SPWeb. Calling Dispose() on that (via the using statement) is clearly not very nice to our beloved SPContext. This kind of coding is known to bring unexpected errors. In fact, FxCop will raise a warning if it is set to check for SharePoint Best Practices Rules.
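The problem can be reproduced without SharePoint. The sketch below uses two hypothetical stand-ins, FakeWeb and FakeContext, that only mimic the caching pattern shown above; they are not SharePoint types, and a real SPWeb may misbehave differently after being disposed.

```csharp
using System;

// Hypothetical stand-in for SPWeb: refuses to work once disposed.
sealed class FakeWeb : IDisposable
{
    private bool disposed;

    public string Title
    {
        get
        {
            if (disposed) throw new ObjectDisposedException("FakeWeb");
            return "Home";
        }
    }

    public void Dispose() { disposed = true; }
}

// Hypothetical stand-in for SPContext: caches the web in a private field,
// like the decompiled Web property does.
sealed class FakeContext
{
    private FakeWeb web;

    public FakeWeb Web
    {
        get
        {
            if (web == null) web = new FakeWeb();
            return web;
        }
    }
}

static class BorrowedDisposeExample
{
    // Returns true if a later user of the context gets a disposed object.
    public static bool SecondCallerFails()
    {
        var context = new FakeContext();

        using (FakeWeb web = context.Web) // disposes an object we do not own
        {
            Console.WriteLine(web.Title);
        }

        try
        {
            Console.WriteLine(context.Web.Title); // same cached instance
            return false;
        }
        catch (ObjectDisposedException)
        {
            return true;
        }
    }

    static void Main()
    {
        Console.WriteLine(SecondCallerFails());
    }
}
```

The second access fails because the property hands out the same cached instance every time, so whoever disposes it first breaks it for everyone else.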

While this is a general rule, there are of course exceptions, I believe due to some inconsistencies in the API. Here is one of them:

SPWeb web = SPContext.Current.Web;

foreach (SPWeb subWeb in web.Webs)
{
    //Do something with subWeb

    subWeb.Dispose();
}

In this particular case, we iterate over a collection of SPWeb objects through the SPWeb.Webs property. As said before, at first sight, this looks bad.

Again, using Reflector, this is what the Webs property does:

public SPWebCollection Webs
{
    get
    {
        if (this.m_Webs == null)
        {
            this.m_Webs = new SPWebCollection(new SPWebCollectionProvider(this), this.m_guidId);
        }
        return this.m_Webs;
    }
}

It is very similar to the previous code, so it feels very unnatural to call Dispose() on all the objects that are held in that collection.

However, this is the way to go, otherwise FxCop will raise a warning saying that SPWeb objects were not disposed of. In my opinion, this shouldn’t be a property, but a method called GetWebs() or something similar, as explained in Framework Design Guidelines.

Instance Methods Called on null References

In a previous post, I wrote about how you can call extension methods on null references, as they are in fact static methods with one more parameter, the extended object itself.
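As a quick reminder of that behaviour, here is a minimal sketch. The extension method IsNullOrBlank and the class names are invented for this example.

```csharp
using System;

static class StringExtensions
{
    // An extension method is a static method: 'text' is an ordinary
    // parameter, so it is allowed to be null.
    public static bool IsNullOrBlank(this string text)
    {
        return text == null || text.Trim().Length == 0;
    }
}

static class ExtensionOnNullExample
{
    static void Main()
    {
        string nothing = null;

        // Compiles to StringExtensions.IsNullOrBlank(nothing),
        // so no NullReferenceException is thrown here.
        Console.WriteLine(nothing.IsNullOrBlank());
        Console.WriteLine("hello".IsNullOrBlank());
    }
}
```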

I’m currently reading CLR via C# (which is a fascinating read), and I was surprised to learn in chapter 6 how the CIL instructions call and callvirt actually work.

What is amazing is that for methods called with the call instruction, the CLR does not check if the referenced object is null. The method call will succeed, but the this reference will be null in the instance method. Actually, in both cases, the reference to the object that the method was called on is passed as a hidden parameter to the method.

Before examining this, another interesting fact is that the C# compiler mostly emits callvirt instructions when calling a method, and callvirt checks whether the reference is null. To test the call instruction easily, we will have to disassemble, modify, then reassemble the following code:

public class SomeClass
{
    public String GetHello()
    {
        if (this == null)
        {
            return "Amazing!";
        }

        return "Hello";
    }
}

class Program
{
    static void Main(string[] args)
    {
        var o = null as SomeClass;
        var hello = o.GetHello();

        Console.WriteLine(hello);
    }
}

Pretty dumb, right? Especially the if statement where we check if this is null. It seems logical to most of us that this will throw a NullReferenceException. However, this is just to get the compiler to build us code that is very close to what we want to achieve, so we don’t have to write IL ourselves.

After running ILDasm.exe on the assembly, this is what we have in the Main method:

  .method private hidebysig static void  Main(string[] args) cil managed
  {
    .entrypoint
    // Code size       18 (0x12)
    .maxstack  1
    .locals init ([0] class Sandbox09.SomeClass o,
             [1] string hello)
    IL_0000:  nop
    IL_0001:  ldnull
    IL_0002:  stloc.0
    IL_0003:  ldloc.0
    IL_0004:  callvirt   instance string Sandbox09.SomeClass::GetHello()
    IL_0009:  stloc.1
    IL_000a:  ldloc.1
    IL_000b:  call       void [mscorlib]System.Console::WriteLine(string)
    IL_0010:  nop
    IL_0011:  ret
  } // end of method Program::Main

As we can see, the call to GetHello is done with the callvirt instruction. As this instruction checks if the object is null (and in this case, it is), this will fail at runtime.

Just to make sure, I used ILasm.exe to build the assembly and ran it. Here is what it outputs:

Unhandled Exception: System.NullReferenceException: Object reference not set to an instance of an object.

   at Pvle.Program.Main(String[] args)

Now, let’s try to replace the callvirt by call to see how it behaves.

    IL_0004:  call       instance string Sandbox09.SomeClass::GetHello()

Now, reassemble it with ILasm.exe and run it again. Here’s what it outputs:

Amazing!

The actual difference between call and callvirt is that call calls the method on the compile time type of the object, so there is no need to check if the reference is null. The object will be passed as a hidden parameter to the method and will be referenced as this. It’s very similar to extension methods.

Callvirt, on the other hand, will resolve the method that is to be called at runtime, depending on the runtime type of the object, so the object cannot be null. The CLR enforces this check at runtime.
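If you would rather not go through ILDasm.exe and ILasm.exe, the same experiment can be sketched with System.Reflection.Emit. This is an alternative I am swapping in, not what was done above: a DynamicMethod emits ldnull followed by either call or callvirt on GetHello. The names CallVersusCallvirt and BuildNullCall are made up for the example.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

public class SomeClass
{
    public string GetHello()
    {
        if (this == null)
        {
            return "Amazing!";
        }

        return "Hello";
    }
}

static class CallVersusCallvirt
{
    // Builds a small method whose body is:
    //   ldnull
    //   <opcode> instance string SomeClass::GetHello()
    //   ret
    public static Func<string> BuildNullCall(OpCode opcode)
    {
        MethodInfo getHello = typeof(SomeClass).GetMethod("GetHello");
        var method = new DynamicMethod(
            "CallOnNull", typeof(string), Type.EmptyTypes, typeof(SomeClass).Module);

        ILGenerator il = method.GetILGenerator();
        il.Emit(OpCodes.Ldnull);
        il.Emit(opcode, getHello);
        il.Emit(OpCodes.Ret);

        return (Func<string>)method.CreateDelegate(typeof(Func<string>));
    }

    static void Main()
    {
        // call: no null check, so GetHello runs with this == null.
        Console.WriteLine(BuildNullCall(OpCodes.Call)());

        // callvirt: the runtime checks the reference first.
        try
        {
            Console.WriteLine(BuildNullCall(OpCodes.Callvirt)());
        }
        catch (NullReferenceException)
        {
            Console.WriteLine("NullReferenceException");
        }
    }
}
```

Based on the behaviour described above, the call variant should print Amazing! and the callvirt variant should end up in the catch block.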

What About Value Types?

For value types, it’s a bit different. As they are implicitly sealed, the only methods that are virtual are the ones that are defined in System.Object. Oh wait, there is another case: if the value type is cast to an interface it implements, calls to methods on that variable will be using callvirt, as the value type will have to be boxed.

Here is some sample code that demonstrates this:

public interface ISomeInterface
{
    String GetHelloFromInterface();
}

public struct SomeClass : ISomeInterface
{
    public String GetHello()
    {
        return "Hello";
    }

    public override string ToString()
    {
        return "Hello";
    }

    public String GetHelloFromInterface()
    {
        return "Hello from interface";
    }
}

class Program
{
    static void Main(string[] args)
    {
        var o = new SomeClass();
        var hello = o.GetHello();

        o.ToString();
        o.GetHelloFromInterface();
        ((ISomeInterface)o).GetHelloFromInterface();
        o.GetHashCode();
    }
}

And here is the corresponding IL for the Main method:

  .method private hidebysig static void  Main(string[] args) cil managed
  {
    .entrypoint
    // Code size       66 (0x42)
    .maxstack  1
    .locals init ([0] valuetype Pvle.SomeClass o,
             [1] string hello)
    IL_0000:  nop
    IL_0001:  ldloca.s   o
    IL_0003:  initobj    Pvle.SomeClass
    IL_0009:  ldloca.s   o
    IL_000b:  call       instance string Pvle.SomeClass::GetHello()
    IL_0010:  stloc.1
    IL_0011:  ldloca.s   o
    IL_0013:  constrained. Pvle.SomeClass
    IL_0019:  callvirt   instance string [mscorlib]System.Object::ToString()
    IL_001e:  pop
    IL_001f:  ldloca.s   o
    IL_0021:  call       instance string Pvle.SomeClass::GetHelloFromInterface()
    IL_0026:  pop
    IL_0027:  ldloc.0
    IL_0028:  box        Pvle.SomeClass
    IL_002d:  callvirt   instance string Pvle.ISomeInterface::GetHelloFromInterface()
    IL_0032:  pop
    IL_0033:  ldloca.s   o
    IL_0035:  constrained. Pvle.SomeClass
    IL_003b:  callvirt   instance int32 [mscorlib]System.Object::GetHashCode()
    IL_0040:  pop
    IL_0041:  ret
  } // end of method Program::Main

We can see that when calling the method through the interface, the value type is boxed.
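Boxing is also observable from C# without looking at the IL, because the box holds a copy of the value. Here is a minimal sketch; ICounter, Counter and BoxingExample are hypothetical names used only for this illustration.

```csharp
using System;

interface ICounter
{
    void Increment();
}

struct Counter : ICounter
{
    public int Count;

    public void Increment()
    {
        Count++;
    }
}

static class BoxingExample
{
    public static int CountAfterInterfaceCall()
    {
        var c = new Counter();

        // The cast boxes a copy of c; the copy is incremented, then discarded.
        ((ICounter)c).Increment();

        return c.Count;
    }

    public static int CountAfterDirectCall()
    {
        var c = new Counter();

        // Called on the variable itself, no boxing involved.
        c.Increment();

        return c.Count;
    }

    static void Main()
    {
        Console.WriteLine(CountAfterInterfaceCall()); // 0
        Console.WriteLine(CountAfterDirectCall());    // 1
    }
}
```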

I find this very interesting for understanding how method calls actually work. Getting your nose in IL is always a good idea when you want to see what’s happening under the hood, but I have to admit that this is the first time that I have modified it and reassembled it.

SelectMany, Sorting and Grouping Objects

So here is the problem: I have a list of items that looks like this:

var collection = new[]
{
    new { Title = "One", References = "1;3" },
    new { Title = "Two", References = "2;3" },
    new { Title = "Three", References = "1;4" },
    new { Title = "Four", References = "4"}
};

The References field of these objects is some kind of category list. What I want to do here is to have a list for each different reference (in this example: 1, 2, 3 and 4) containing all the items that are in that reference. Items will be duplicated if they are in more than one category.

To sum it up, the expected output would be: One, Three, Two, One, Two, Three, Four

After fooling around a bit, here is the query I came up with:

var query = from c in collection
            from d in c.References.Split(';')
            orderby d
            group c by d into groups
            select groups;

This does exactly what I want and produces the output I expected from the input data.

However, when I use LINQ, I generally use extension methods directly and not the pretty query syntax. This is mostly because I want to understand what happens behind the scenes, and I have to admit that this query was quite a beast.

First, as there are two from clauses, there is a SelectMany somewhere. You probably know that SelectMany is a bit of a beast and that understanding it fully is quite a challenge compared to the other operators/extension methods. Also, I thought that the GroupBy clause was going to be tough, as we group the c items by d, which comes from the other collection.

I couldn’t figure out by myself how to write that query using extension methods, so I fell back on the good old Reflector that gave me a straight answer:

var query = collection.SelectMany(delegate (<>f__AnonymousType0 c) {
    return c.References.Split(new char[] { ';' });
}, delegate (<>f__AnonymousType0 c, string d) {
    return new { c = c, d = d };
}).OrderBy(delegate (<>f__AnonymousType1<<>f__AnonymousType0, string> <>h__TransparentIdentifier0) {
    return <>h__TransparentIdentifier0.d;
}).GroupBy(delegate (<>f__AnonymousType1<<>f__AnonymousType0, string> <>h__TransparentIdentifier0) {
    return <>h__TransparentIdentifier0.d;
}, delegate (<>f__AnonymousType1<<>f__AnonymousType0, string> <>h__TransparentIdentifier0) {
    return <>h__TransparentIdentifier0.c;
}).Select(delegate (IGrouping<string, <>f__AnonymousType0> groups) {
    return groups;
});

After reading that, it made much more sense. Here is what I came up with when writing it on my own:

var p = collection
    .SelectMany(c => c.References.Split(';'), (c, d) => new { c, d })
    .OrderBy(t => t.d)
    .GroupBy(t => t.d, t => t.c);

Much more readable. The idea here is that the SelectMany call outputs a sequence of anonymous objects, each pairing an item (c) with one of its references (d). This sequence is then sorted with OrderBy, and finally fed through a GroupBy that uses the d property as the grouping key and the c property as the projected element in the resulting groups. Not that difficult after all…

Here is another version that is probably a bit clearer:
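To see the pipeline stage by stage, here is a small self-contained sketch over the sample data. The class and method names (SelectManyExample, GroupedTitles) are made up, and the element selector projects the Title instead of the whole item to keep the output short.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class SelectManyExample
{
    public static List<string> GroupedTitles()
    {
        var collection = new[]
        {
            new { Title = "One", References = "1;3" },
            new { Title = "Two", References = "2;3" },
            new { Title = "Three", References = "1;4" },
            new { Title = "Four", References = "4" }
        };

        // Step 1: flatten into (item, reference) pairs:
        // (One,1) (One,3) (Two,2) (Two,3) (Three,1) (Three,4) (Four,4)
        var pairs = collection.SelectMany(
            c => c.References.Split(';'),
            (c, d) => new { c, d });

        // Step 2: sort the pairs by reference (OrderBy is a stable sort).
        // Step 3: group by reference, keeping only the title of each item.
        var groups = pairs
            .OrderBy(t => t.d)
            .GroupBy(t => t.d, t => t.c.Title);

        // Flatten the groups again, in key order, to compare with the expected output.
        return groups.SelectMany(g => g).ToList();
    }

    static void Main()
    {
        // Prints: One, Three, Two, One, Two, Three, Four
        Console.WriteLine(string.Join(", ", GroupedTitles()));
    }
}
```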

var q = collection
    .SelectMany(c => c.References.Split(';'), (c, d) => new { Title = c.Title, Reference = d })
    .GroupBy(c => c.Reference, c => c.Title)
    .OrderBy(g => g.Key);

Note that this is a simplified version of the original issue. The issue itself was to do this with some ListItems retrieved from SharePoint. Objects were a bit more complicated, but logic is the same.
