Modify System.Exception in SetILFunctionBody

I am wondering if I can get some help on this.

We have a working "Proxy" that is invoked via IL inserted into methods; it gathers all the calling arguments, etc. and passes them to a handler.

We are using SetILFunctionBody during JIT compilation. It works fine for 1.1, but in 2.0 it ONLY works when using a LoaderOptimization other than SingleDomain when hooking corlib methods such as the System.Exception .ctor. We are hooking many methods, and all NON-core methods work well.

I have come to the conclusion that it has to do with Domain Neutrality of mscorlib (only) being in the shared domain.

So, my questions are these: other than the LoaderOptimizationAttribute OR the args to CorBindToRuntimeEx when loading the CLR, how can one override the LoaderOptimization via the registry, a .config file, etc.?

Second, if we immediately "sandboxed" the Proxy via CreateInstance into another domain and marshalled the parameters, do you think that would get around the domain-neutral issues? We just want the call and the args.

It seems like something really changed in 2.0 as 1.1 lets us do as we please.

And of course, any other suggestions would be appreciated...
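
For readers following the SetILFunctionBody approach described above, here is a minimal sketch of the byte-level step it implies: prepending a few IL bytes to an existing method body and rebuilding the header. It assumes the ECMA-335 tiny/fat method header layout; the names `PrependToMethodBody` and `il_sketch` are made up for illustration, and methods with exception-handling sections are deliberately rejected because their clause offsets would need to be shifted. A real profiler would allocate the result with the profiler's method allocator before handing it to SetILFunctionBody, which is not shown here.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

namespace il_sketch {

constexpr std::uint8_t  kTinyFormat    = 0x2;  // CorILMethod_TinyFormat
constexpr std::uint8_t  kFatFormat     = 0x3;  // CorILMethod_FatFormat
constexpr std::uint16_t kMoreSects     = 0x8;  // CorILMethod_MoreSects (EH tables follow the code)
constexpr std::size_t   kFatHeaderSize = 12;   // fat header is 3 DWORDs

inline std::uint32_t ReadU32(const std::vector<std::uint8_t>& b, std::size_t at) {
    return static_cast<std::uint32_t>(b[at]) |
           (static_cast<std::uint32_t>(b[at + 1]) << 8) |
           (static_cast<std::uint32_t>(b[at + 2]) << 16) |
           (static_cast<std::uint32_t>(b[at + 3]) << 24);
}

inline void WriteU32(std::vector<std::uint8_t>& b, std::size_t at, std::uint32_t v) {
    for (int i = 0; i < 4; ++i) b[at + i] = static_cast<std::uint8_t>(v >> (8 * i));
}

// Returns a new fat-format body: 12-byte header + prefix + original code.
// minMaxStack is the stack depth the injected prefix needs (caller's estimate).
inline std::vector<std::uint8_t> PrependToMethodBody(
    const std::vector<std::uint8_t>& body,
    const std::vector<std::uint8_t>& prefix,
    std::uint16_t minMaxStack) {
    if (body.empty()) throw std::invalid_argument("empty method body");

    std::uint16_t flags = kFatFormat;
    std::uint16_t maxStack = 8;  // tiny methods have an implicit max stack of 8
    std::uint32_t codeSize = 0, localSig = 0;
    std::size_t codeStart = 0;

    const std::uint8_t format = body[0] & 0x3;
    if (format == kTinyFormat) {
        codeSize = static_cast<std::uint32_t>(body[0] >> 2);  // upper 6 bits
        codeStart = 1;
    } else if (format == kFatFormat) {
        if (body.size() < kFatHeaderSize) throw std::invalid_argument("truncated fat header");
        const unsigned flagsAndSize = body[0] | (body[1] << 8);
        flags = static_cast<std::uint16_t>(flagsAndSize & 0x0FFF);
        if (flags & kMoreSects) throw std::runtime_error("EH sections are not handled in this sketch");
        maxStack = static_cast<std::uint16_t>(body[2] | (body[3] << 8));
        codeSize = ReadU32(body, 4);
        localSig = ReadU32(body, 8);
        codeStart = kFatHeaderSize;
    } else {
        throw std::invalid_argument("unknown method header format");
    }
    if (body.size() - codeStart < codeSize) throw std::invalid_argument("truncated code");

    std::vector<std::uint8_t> out(kFatHeaderSize);
    const unsigned newFlagsAndSize = flags | (3u << 12);  // header size = 3 DWORDs
    out[0] = static_cast<std::uint8_t>(newFlagsAndSize & 0xFF);
    out[1] = static_cast<std::uint8_t>(newFlagsAndSize >> 8);
    const std::uint16_t newMax = std::max(maxStack, minMaxStack);
    out[2] = static_cast<std::uint8_t>(newMax & 0xFF);
    out[3] = static_cast<std::uint8_t>(newMax >> 8);
    WriteU32(out, 4, codeSize + static_cast<std::uint32_t>(prefix.size()));
    WriteU32(out, 8, localSig);

    // Branch targets in IL are relative, so the original code can be copied unchanged.
    const auto first = body.begin() + static_cast<std::ptrdiff_t>(codeStart);
    out.insert(out.end(), prefix.begin(), prefix.end());
    out.insert(out.end(), first, first + static_cast<std::ptrdiff_t>(codeSize));
    return out;
}

}  // namespace il_sketch
```

The sketch always emits a fat header, which avoids checking whether the enlarged body would still fit the 63-byte tiny limit.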



  • sherry326

    Thanks for your response...

    We are hooking the .ctor in System.Exception...

    Nothing wrong with the IL because it works fine in MultiDomain or MultiDomainHost mode, it works fine with SingleDomain as long as you don't specify a new Domain...

    The hook "fixes" up tokens, and instantiates a .DLL in the GAC...that .DLL dispatches to "substitutes" that do the handling for Exceptions, ASP.NET events (hooking the HttpApplication class), SQL client calls, etc....

    The crash occurs immediately on the new xxx() call (newobj) that is injected...it causes "rejits" of the Exception which in turn causes our code to be invoked...which in turn another rejit...eventually, stack overflow.

    It is only for 2.0, apps work fine in 1.1. For some reason, something changed in 2.0. I am sure it has to do with the LoaderOptimization and domain neutrality. I believe that mscorlib in SingleDomain mode absolutely will not call outside of itself...

    Any clues?

  • Carel Greaves

    Hi,

    Could you specify what kind of error you get? There are a number of scenarios in which CLR-internal errors are raised that you can't even catch, but in your case it is a managed exception (you said it leads to the stack overflow), so I think it may simply be bad IL that was inserted. You can find the type of the exception in the profiler's exception callbacks.

    Or am I misunderstanding something?


  • nyforever

    David,

    What is wrong is that the newly jitted code crashes as soon as it is executed, by throwing an Exception. Since it is the Exception class itself that has the problem, it goes into an infinite JIT loop and crashes with a stack overflow. We crash during execution, as soon as it does the new xxx() on the proxy, not during modification.

    We have put NO reference in mscorlib. However, we do put all the tokens into the method. We are hooking the .ctor methods. Our proxy is in the GAC, and we use the newobj opcode to load the class (and assume it'll be found in the GAC). This code works well in 1.1. It also works fine in 2.x with no additional AppDomains. With more than 1 AppDomain and SingleDomain set, it crashes like this. If the Exception is first thrown in the default domain, then it's fine in SingleDomain, but then the first call to CreateDomain generates an ExecutionEngineException.

    As for Enter/Leave: we tried that. We had a COM proxy and it worked fine for calling. However, we had hell trying to get the parameters: going from the FunctionID to a PCCOR_SIGNATURE, then parsing the signature and working out the marshalling from a native type to a managed type. Not to mention, everybody told us not to call managed code from the unmanaged code in Function Enter/Leave. Well, as long as you realize that you could have a recursive callback problem and manage it, it can work; we didn't have any problems. But we abandoned it because we got tired of playing around with VARIANTs, SAFEARRAYs, and other very unfriendly types.

    Any way we could talk on the phone, or does that violate the rules of this blog?
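
The "recursive callback problem" mentioned in the post above is usually handled with a per-thread re-entrancy flag. The following is a minimal sketch of that idea only; `ReentrancyGuard` and `RunHookOnce` are hypothetical names, and it says nothing about whether calling managed code from Enter/Leave is safe (the replies in this thread advise against it).

```cpp
#include <functional>

namespace guard_sketch {

// One flag per thread: true while this thread is already inside the hook.
thread_local bool t_insideHook = false;

class ReentrancyGuard {
public:
    ReentrancyGuard() : entered_(!t_insideHook) {
        if (entered_) t_insideHook = true;
    }
    ~ReentrancyGuard() {
        if (entered_) t_insideHook = false;  // also runs if the hook throws
    }
    ReentrancyGuard(const ReentrancyGuard&) = delete;
    ReentrancyGuard& operator=(const ReentrancyGuard&) = delete;

    // False when the hook was re-entered on this thread.
    bool Entered() const { return entered_; }

private:
    bool entered_;
};

// Runs the hook unless this thread is already executing it.
// Returns true if the hook ran, false if the call was suppressed.
inline bool RunHookOnce(const std::function<void()>& hook) {
    ReentrancyGuard guard;
    if (!guard.Entered()) return false;
    hook();
    return true;
}

}  // namespace guard_sketch
```

A hook that indirectly triggers itself (for example, instrumentation code that constructs the very type being instrumented) is then executed once per outermost call instead of recursing until the stack overflows.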

  • Corby

    Phone won't work well for me, but I think we can make good progress on this forum. As I understand it (and please correct me if I'm wrong), you prefer all your inspection code (which sits in what you're calling your "proxy") to be managed. And yes, I agree, if you go the FunctionEnter2 route, you must not be running managed code from there!

    Your managed proxy resides in its own assembly (that you created) in the GAC, and you rewrite IL of ctors from mscorlib so that they call your GAC'd proxy's ctor so that your proxy's ctor may do something with the hooked mscorlib object's ctor parameters. The part I'm still trying to understand is how you make the rewritten IL of the mscorlib object ctor call your proxy ctor. It sounds like you're using the "newobj" IL instruction, which means you need to include in that instruction a metadata token representing the target of the call (which will be your proxy's ctor). This token would probably be a MemberRef, which must have been composed in part by using an AssemblyRef that references your proxy's assembly. That right there constitutes a reference from the assembly you're instrumenting (i.e., mscorlib) to your proxy. (I was a little vague when I used the term "reference" in my earlier post, so I apologize if I was unclear.)

    If that's indeed what's going on, you may simply be lucky that you're only hitting problems when mscorlib is loaded domain neutral. In general, such a scenario can't be assumed to work--everyone's allowed to reference mscorlib, but mscorlib isn't allowed to reference anyone. The standard fix we recommend for this is not to create your own assembly to house the proxy. Instead, dynamically add your proxy to mscorlib directly using the metadata API at runtime. Literally add your proxy's class and methods inside mscorlib at runtime (say from ModuleLoadFinished() for mscorlib), so that any tokens you use to reference your proxy are Defs instead of Refs.

    Then again, I could easily be misinterpreting how your proxy fits into the equation. :-) If that's so, please let me know how your proxy actually gets called.
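
To make the Def-versus-Ref distinction in the post above concrete, here is a small sketch of how a `newobj` instruction carries its constructor token and how the token's table can be read from its top byte. The opcode value and token-type constants follow ECMA-335 and corhdr.h; the helper names and the example token values are invented for illustration. Note that a MemberRef is not guaranteed to leave the module, but one built on a TypeRef and AssemblyRef, as described above, does.

```cpp
#include <array>
#include <cstdint>

namespace token_sketch {

constexpr std::uint8_t kNewobj = 0x73;  // IL opcode for newobj

// Metadata token types live in the top byte (CorTokenType values).
constexpr std::uint32_t kTypeRef   = 0x01000000;  // mdtTypeRef
constexpr std::uint32_t kTypeDef   = 0x02000000;  // mdtTypeDef
constexpr std::uint32_t kMethodDef = 0x06000000;  // mdtMethodDef
constexpr std::uint32_t kMemberRef = 0x0A000000;  // mdtMemberRef

inline std::uint32_t TokenType(std::uint32_t token) {
    return token & 0xFF000000u;
}

// True for tokens that name something defined in the module itself.
inline bool IsDefToken(std::uint32_t token) {
    const std::uint32_t type = TokenType(token);
    return type == kTypeDef || type == kMethodDef;
}

// newobj is a one-byte opcode followed by the 4-byte token, little-endian.
inline std::array<std::uint8_t, 5> EncodeNewobj(std::uint32_t ctorToken) {
    return {{kNewobj,
             static_cast<std::uint8_t>(ctorToken),
             static_cast<std::uint8_t>(ctorToken >> 8),
             static_cast<std::uint8_t>(ctorToken >> 16),
             static_cast<std::uint8_t>(ctorToken >> 24)}};
}

}  // namespace token_sketch
```

With a proxy housed in a separate GAC assembly, the injected constructor token would typically be a MemberRef (0x0A......); with the proxy added to mscorlib's own metadata at runtime, it would be a MethodDef (0x06......).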


  • Valst

    Sergey,

    I am familiar with the AppDomainManager. I actually attempted to use it to change the LoaderOptimization on the app (since I do not have access to the app). Even though I intercepted the init and create of the domain, it was too late to change the global domain, and although I changed the Setup, SetData("LOADER_OPTIMIZATION"), etc., it didn't matter to the runtime.

    So, you are saying: load the assembly up front, early on in AppDomain creation, and the Domain Manager is given some kind of "special pass", so that corlib would somehow be "ok" with the assembly because it was already loaded?


  • ddstevens

    It's possible to merge everything into mscorlib.dll without having to re-sign the assembly (and thus without needing a key), so long as you're flexible on your definition of "merge". :-) You can do this dynamically at runtime, inside the ModuleLoadFinished callback from mscorlib.dll. At that point, you can use GetModuleMetadata() to get pointers to metadata interfaces on mscorlib.dll, including those that can WRITE to the metadata like IMetaDataEmit. This allows you to add whatever classes, properties, functions, etc. you like to mscorlib.dll.

    Of course, none of this is persisted to disk, it must be done each time the app runs, you're forced to use the jitted (not ngen'd) version of mscorlib, and all your additions will be jitted as well. But at least you don't need a key! :-)

    I'm not sure-- you might already have known about this. I wanted to mention it here explicitly just in case. In spite of the disadvantages of this approach, it is generally considered the safest to use.


  • hypo

    Hi,

    I think this will be interesting. We have the same problem when instrumenting during the early loading phase. I know a way to solve this problem; I've tested it, but we haven't implemented it yet =(.

    In case you don't know, there is a class System.AppDomainManager in Framework 2.0. You can set your own AppDomain manager. Your managed assembly containing the AppDomainManager will then be loaded before any call to the mscorlib methods.

    You can read about AppDomainManager here: http://blogs.msdn.com/shawnfa/archive/2004/11/12/256550.aspx

    The AppDomainManager can in turn load any assembly you want, and you will not get the exceptions that occur because of incorrect token resolution, since your assembly is already loaded.

    Please let us know about your experience.

    To David: thanks a lot for your blog. I like it very much.


  • Larry.Dugger

    >>
    Ok, assuming we use the ModuleLoadFinished (and do you have any links to any good examples here?) and put our proxy in there...
    how do we ever get out of mscorlib?
    <<

    There is an old IL rewriting sample on MSDN at
    http://msdn.microsoft.com/msdnmag/issues/03/09/NETProfilingAPI/,
    and one of my blog entries talks a bit about IL rewriting (with a note on the dangers of having mscorlib reference outside itself):

    http://blogs.msdn.com/davbr/archive/2006/02/27/540280.aspx

    but there isn't really a whole lot out there to guide you through what you want to do. If your instrumented code absolutely must escape mscorlib.dll somehow, you could try writing IL that does a P/Invoke outside of the instrumented mscorlib function into native code you write. That native code would then call back into managed code (presumably a non-mscorlib assembly) via reverse p/invoke.

    That's a bit intrusive to the app if you're doing this to every mscorlib function. But if you're hand-picking a few that aren't called every microsecond, this is worth exploring. Any mscorlib instrumentation is pretty dangerous stuff, though, and if you try this with mscorlib functions that are called very early on startup, all sorts of bad things can happen.

    The crux is that the CLR and mscorlib have insider knowledge about each other. The loader saw pretty substantial revs from 1.x to 2.0, and it has all sorts of assumptions about what mscorlib looks like (mscorlib is loaded first, it references no one, etc.). I understand how frustrating this is, particularly since things worked fine for you in 1.1, and they appear to work fine in 2.0 as long as mscorlib is not loaded domain-neutral (though I'd be wary of any conclusions you draw from observed behavior, as there might be circumstances other than domain-neutral that could spell trouble when you instrument mscorlib).

    One of the perils of writing profilers is that, since they're so low to the ground, major revs of the CLR often require changes to the profilers as well just to keep up. Still, you are not the only one who really needs to instrument mscorlib and is running into problems doing so. So this is something we need to do some more thinking about.

    Regarding lack of documentation--yes, I agree, more would be better! We are at least taking steps in the right direction. Now we at least have the API documented on MSDN, as opposed to 1.x which had a Word doc and comments in the corprof.idl file. :-)
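
The P/Invoke escape route suggested in the post above needs a small native shim in between. Below is a minimal sketch of what such a shim could look like, under several assumptions: the export names `RegisterHandler` and `NotifyCall` and the callback signature are invented, and a real Windows DLL would also need export declarations and an agreed calling convention, which are omitted here. The idea is that a non-mscorlib managed assembly hands the shim a delegate (marshalled as a function pointer, i.e. reverse P/Invoke), and the instrumented mscorlib code only ever P/Invokes into `NotifyCall`.

```cpp
#include <atomic>

namespace shim_sketch {

// Hypothetical callback shape: the managed side would pass a delegate with a
// matching signature, which arrives here as a plain function pointer.
using CallNotification = void (*)(const char* methodName, int argCount);

std::atomic<CallNotification> g_handler{nullptr};

extern "C" {

// Called once by the managed handler assembly to register its callback.
void RegisterHandler(CallNotification handler) {
    g_handler.store(handler);
}

// Called (via P/Invoke) from the IL injected into the instrumented method.
// Returns 1 if a handler was registered and invoked, 0 otherwise, so that
// calls made before registration are dropped instead of crashing.
int NotifyCall(const char* methodName, int argCount) {
    const CallNotification handler = g_handler.load();
    if (handler == nullptr) return 0;
    handler(methodName, argCount);
    return 1;
}

}  // extern "C"

}  // namespace shim_sketch
```

Dropping notifications until a handler is registered matters for the early-startup case the post warns about, where instrumented mscorlib functions can run before any other assembly is loaded.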


  • raghu_grdr

    David, thanks, we are considering this...

    We changed the injection to target a different class this time so we would not get the infinite loop; we picked IndexOutOfRangeException. It tells me:

    Injected into System.IndexOutOfRangeException::.ctor 18a5f98
    EXCEPTION: multidomain.exe ERROR: System.IO.FileLoadException: Could not load file or assembly 'CLRGatewayProxyWrapper, Version=0.0.0.0, Culture=neutral, PublicKeyToken=229f8702f35954dd' or one of its dependencies. The parameter is incorrect. (Exception from HRESULT: 0x80070057 (E_INVALIDARG))
    File name: 'CLRGatewayProxyWrapper, Version=0.0.0.0, Culture=neutral, PublicKeyToken=229f8702f35954dd'
    at TestIt.Called.doit(Object TheObject)

    No doubt it is in the GAC, and I even copied it into the app directory. It works fine on anything BUT stuff out of mscorlib...
    It would be nice if the guys at Microsoft would give a more descriptive message, or at least say in plain English WHY an assembly would not load. I would have to say assembly loading is one of the top 5 error spots for any .NET application, and there's not a lot of information when it goes bad.


  • Newbie Kam

    Hi,

    Our problem was that we were instrumenting methods during AppDomain loading, between the AppDomainCreationStarted and AppDomainCreationFinished callbacks. During this period we got correct tokens, but these tokens couldn't be resolved. I don't actually know the real reason for this, but this behavior can be fixed using AppDomainManager.

    I set the environment variables as described in the AppDomainManager documentation. My AppDomainManager in turn loads the assembly I want to call, so the tokens for that assembly resolve correctly.


  • karen reyes

    Hi! Let me ask a couple of questions so I understand better what you're seeing. What exactly is going wrong with SetILFunctionBody when mscorlib is loaded domain neutral? Are there problems with the SetILFunctionBody call itself, or do you see problems shortly afterward when the function you instrumented is JITted (or executed)? Also, do you modify the metadata of mscorlib at runtime to add any references from mscorlib to any other assemblies? If so, you will definitely see problems, as the loader assumes mscorlib references nothing else. Finally, have you looked into the Enter2 function hook? If all you want is a notification of a call and its arguments, that might be all you need (rather than having to do any IL rewriting at all).


  • Webbert

    Well, I'm going to try some things and get back to you and let you know how it goes...

  • stombiztalker

    David,

    Everything you said is correct, we add the reference, we use the newobj, and we put calls on entry into the method and any exit...

    Ok, assuming we use the ModuleLoadFinished (and do you have any links to any good examples here?) and put our proxy in there...
    how do we ever get out of mscorlib?

    The Proxy eventually reads an XML file, which contains "handlers" that use the arguments and use LoadFrom to invoke special code which is designed to understand and do things with the method.

    Right now, we hook all of the sqlclient classes...and calculate response time for executeQuery, etc. - we use ThreadLocal storage to track all the way back to the user identity and track the queries.

    On ASP.NET we have hooked the HttpRuntime class (Global.asax) and we intercept all of the callbacks for Begin/End Session/Application/Request and errors.

    Now, both of those work fine in 2.0 as they must not be in mscorlib...it's only the core classes...

    So, if we stick our proxy in mscorlib, how do we ever get out of there? We cannot put our entire instrumentation in it...and we actually start TCP servers (so we can control the process) and all kinds of other stuff.

    I wrote a transparent "proxy" that allows me to move between AppDomains seamlessly as we do not know what AppDomain we will be in, but we have to be able to get back to our SubSystem (which lives in the Default Domain so it won't be unloaded).

    So, I guess you confirm what we are seeing, although we've never had a problem on 1.1. Do you have a good example for loading into mscorlib, and if we do that, how do we ever get out of it (can we CreateInstanceAndUnwrap and marshal our way out)?

    Thanks for your help...as you know, there's almost no documentation on this kind of stuff (you get maybe 1 page on Google and 1/2 of it is Japanese, but we still try to read it :))

  • scottre

    Nope. Even though I combined everything into 1 DLL, which also contained the AppDomainManager classes, AND I successfully loaded the DLL in the initialize every single time, mscorlib throws exceptions when executing newobj. I can set the app to MultiHost load...and it works fine...

    If it was an instruction issue, then wouldn't it always fail, not just for SingleDomain (mscorlib domain neutral)?

    Hell, I'd just merge everything into mscorlib, but I don't have the key and I'd have to merge every single runtime version...

    Any suggestions?
