Saturday, October 24, 2009

"I thought you were still there..."


NTSTATUS: STATUS_IS_NOT_WIN32



This has to do with the .NET CLR loading assemblies dynamically. Sure, it's not a Win32 thing, but it came up when I tried to achieve what I normally would in native Win32 apps.

The story begins quite simply, with implementing a basic plug-in architecture. In regular (native) Win32 apps written in C/C++, you'd just use LoadLibrary() and FreeLibrary() to load and unload the DLLs that function as plug-ins.

To achieve the same in .NET, the Assembly class methods LoadFrom() and LoadFile() can be used to load an assembly at runtime. However, what's obviously missing is any method to unload those assemblies. After a little searching, we come across this. Apparently, there are plenty of reasons not to provide one. So, this clearly raises the question of why it can be done in native Win32, but not in the .NET CLR. From what I can tell, it's basically a case of trying to stop people from shooting themselves in the foot. In my case, I know exactly what I'm doing, and it's exactly what I intend to do. But too bad, everyone gets the same treatment.

It seems the closest you can get is to read the entire assembly binary into memory and pass it to Assembly.Load(). Since the file on disk is then not locked, it can be replaced and loaded again, which effectively gets the job done. So, problem solved. However, it wasn't so for me. The problem lay in WPF. This 'bug' is related to user controls and assembly loading, and I will write about that in a subsequent post.
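A minimal sketch of that approach. PluginLoader and LoadFromBytes are names made up for illustration; File.ReadAllBytes() and Assembly.Load(byte[]) are the standard framework calls.

```csharp
using System.IO;
using System.Reflection;

static class PluginLoader
{
    // Reads the whole assembly file into memory and loads it from the
    // byte array, so the file on disk is not kept open afterwards.
    public static Assembly LoadFromBytes(string assemblyPath)
    {
        byte[] rawAssembly = File.ReadAllBytes(assemblyPath);
        return Assembly.Load(rawAssembly);
    }
}
```

Note that this does not unload anything: every call loads another copy, and the older copies stay in memory.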

So anyway, I wanted to see what would happen if I manually unloaded a .NET assembly. Having the assembly's path, I did the obvious thing:

IntPtr hModule = Win32.GetModuleHandleW(assemblyPath);

// IntPtr is a value type, so compare against IntPtr.Zero rather than null
if (hModule != IntPtr.Zero)
{
    if (Win32.FreeLibrary(hModule))
    {
        Debug.WriteLine("DLL released...");
    }
    else
    {
        // failed
        Debug.WriteLine("Failed to release DLL...");
    }
}


Win32 in the above code is a class in which I declared the Win32 API functions. In any case, the assembly is unloaded from the process, piece of cake. So, the question now is, does anybody know about it? Obviously, the answer is NO. I suppose it would not be reasonable for the .NET CLR to check whether the assembly is still there every time it wants something from it. Heck, I wouldn't check either if nobody but me was supposed to be able to unload it.
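For reference, a sketch of what such a Win32 class might contain. It assumes the usual P/Invoke declarations for the two kernel32.dll functions used above, and is not necessarily identical to the class I actually used.

```csharp
using System;
using System.Runtime.InteropServices;

static class Win32
{
    // Returns the handle of a module already loaded in this process,
    // or IntPtr.Zero if it is not loaded. Does not add a reference.
    [DllImport("kernel32.dll", CharSet = CharSet.Unicode, ExactSpelling = true, SetLastError = true)]
    public static extern IntPtr GetModuleHandleW(string lpModuleName);

    // Removes one reference from the module; the DLL is unmapped from
    // the process when its reference count reaches zero.
    [DllImport("kernel32.dll", SetLastError = true)]
    [return: MarshalAs(UnmanagedType.Bool)]
    public static extern bool FreeLibrary(IntPtr hModule);
}
```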

Calling the code to load the assembly again does not result in the assembly being loaded again. Checking this is simply a matter of replacing the assembly file with one with different content/code, and loading it again. As far as the CLR is concerned, it's already loaded. Fair enough. So the next time you call anything in the assembly, BOOM! It crashes, as expected.

Oh, do note that there would still be a handle to the assembly (the DLL) in the process. It is no easy task to find the handle and close it, so I didn't bother, as it wouldn't make any difference.

Freeing the .NET DLL manually was really just a case of 'let's see what happens'. Not recommended for .NET assemblies, but it's routine for native Win32 DLLs if you're working with them.

Thursday, October 8, 2009

"Who's your daddy, and what does he do?"


NTSTATUS: STATUS_NO_SUCH_PROCESS



Recently, I stumbled upon some bizarre behaviour relating to Windows processes. Here's how it went...

I noticed that a process (of an app I wrote) had been running for a couple of minutes and still had not completed. It usually takes only a matter of seconds to complete (this is on Windows Vista). So I killed it (with Sysinternals' Process Explorer), and ran it again. Same behaviour...

Before going further, the basic operation of this app is to run some checks on processes in certain branches of the system process tree. For example, to only perform a task on processes that are not services.

To dig into the problem, I rebuilt the app with debug output and watched it with Sysinternals' DebugView. According to the output, the app never got past iterating over the processes in the system.

The next thing I did was run the app in the debugger to see what would happen. Well, it just kept going, and going. Trying my luck, I hit the 'pause' button to break into the process. Happily, it actually stopped in my code. It stopped in a piece of code that traverses up the process tree from any particular process. This is pretty straightforward:

I had already built a map of all the processes in the system. I got the processes via ToolHelp32. So the basic idea is (simplified):

For a process:
  1. Stop if this is an ancestor I am looking for
  2. Get the parent process of currently examined process
  3. If there are no more processes, stop
  4. Loop back to 1.

So basically, it just goes up a process's ancestry to see if it can find a specific process. The problem was that this somehow ended up as an infinite loop! And how can that happen?!?

Well, here's how it happened... Two processes had each other as parent! In my case, this was how it looked:

devenv.exe[PID:3824] has parent [PID:3364]
explorer.exe[PID:3364] has parent [PID:3824]

"This can't be right?!?", I thought. Taking a look at explorer.exe in Process Explorer, it did show that devenv.exe was its parent, which I know is wrong because I did not use devenv.exe to start the shell. It was the other way round, as it normally is. However, Process Explorer, or rather, Mark Russinovich (We're not worthy!!!), knows it's wrong, and indicates that the parent process is a 'Non-existent Process'.

So, I reworked the code to account for such a situation, and it works as expected again. In retrospect, I think it's fair that I did not write the code to expect such a situation, or even to think of it.
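A rough sketch of both halves, in C# to match the other post: building the PID-to-parent map through ToolHelp32, then walking up it with a guard against loops. The names ProcessTree, BuildParentMap and HasAncestor are made up for illustration, and this is not the app's actual code. The P/Invoke declarations assume the Unicode ToolHelp32 entry points in kernel32.dll.

```csharp
using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Runtime.InteropServices;

static class ProcessTree
{
    const uint TH32CS_SNAPPROCESS = 0x00000002;
    static readonly IntPtr INVALID_HANDLE_VALUE = new IntPtr(-1);

    [StructLayout(LayoutKind.Sequential, CharSet = CharSet.Unicode)]
    struct PROCESSENTRY32
    {
        public uint dwSize;
        public uint cntUsage;
        public uint th32ProcessID;
        public IntPtr th32DefaultHeapID;
        public uint th32ModuleID;
        public uint cntThreads;
        public uint th32ParentProcessID;
        public int pcPriClassBase;
        public uint dwFlags;
        [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 260)]
        public string szExeFile;
    }

    [DllImport("kernel32.dll", SetLastError = true)]
    static extern IntPtr CreateToolhelp32Snapshot(uint dwFlags, uint th32ProcessID);

    [DllImport("kernel32.dll", EntryPoint = "Process32FirstW", CharSet = CharSet.Unicode, SetLastError = true)]
    static extern bool Process32First(IntPtr hSnapshot, ref PROCESSENTRY32 lppe);

    [DllImport("kernel32.dll", EntryPoint = "Process32NextW", CharSet = CharSet.Unicode, SetLastError = true)]
    static extern bool Process32Next(IntPtr hSnapshot, ref PROCESSENTRY32 lppe);

    [DllImport("kernel32.dll", SetLastError = true)]
    static extern bool CloseHandle(IntPtr hObject);

    // Maps every PID in a ToolHelp32 snapshot to its recorded parent PID.
    public static Dictionary<uint, uint> BuildParentMap()
    {
        var parentOf = new Dictionary<uint, uint>();
        IntPtr snapshot = CreateToolhelp32Snapshot(TH32CS_SNAPPROCESS, 0);
        if (snapshot == INVALID_HANDLE_VALUE)
            throw new Win32Exception(Marshal.GetLastWin32Error());
        try
        {
            var entry = new PROCESSENTRY32();
            entry.dwSize = (uint)Marshal.SizeOf(typeof(PROCESSENTRY32));
            if (Process32First(snapshot, ref entry))
            {
                do
                {
                    parentOf[entry.th32ProcessID] = entry.th32ParentProcessID;
                }
                while (Process32Next(snapshot, ref entry));
            }
        }
        finally
        {
            CloseHandle(snapshot);
        }
        return parentOf;
    }

    // Walks up the parent links from 'pid' and reports whether 'ancestorPid'
    // is reached. Stops when a PID has no entry or has already been visited.
    public static bool HasAncestor(IDictionary<uint, uint> parentOf, uint pid, uint ancestorPid)
    {
        var visited = new HashSet<uint>();
        visited.Add(pid);
        uint current = pid;
        uint parent;
        while (parentOf.TryGetValue(current, out parent))
        {
            if (parent == ancestorPid)
                return true;
            if (!visited.Add(parent))
                return false; // already seen this PID: the links form a loop
            current = parent;
        }
        return false; // ran out of known processes
    }
}
```

The snapshot only records the parent's PID as a number, so nothing guarantees that the links form a tree. With the two entries above in the map, HasAncestor(map, 3824, 4) gives up with false instead of spinning forever.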

It was late and I was tired, so I turned in right after that. Later, I realized I should have taken some screenshots as I still have no idea how that whole fiasco happened...