Uh... my server spontaneously rebooted...



LarryBoy
08-24-2011, 02:46 PM
So, I installed IS 2012 SAB on my Server 2008 R2 box, commented out the calls to my arbiter (see the prior thread discussing parallel builds), and started three builds.

All was going well until things broke on account of my not making a change to a specific helper binary (forgot to use the latest auto library) -- accordingly, I fixed the code and restarted the builds.

Upon doing so, two of the three builds locked up while attempting to extract COM info while building -- once I noticed they locked up, I killed those builds (and the one that had successfully continued) and restarted all three.

Not good -- all three locked up! At that point, I went to the task manager looking for a rogue process (or two) that needed axing... I found 5 ISReg<something> processes (can't remember the exact process name) -- I killed one, I went to kill the second, and BAM -- no more machine!

ARGH -- trying to kill that hung process appeared to cause the server to crash and restart -- NOT GOOD.

Needless to say, I'm feeling a wee bit nervous about moving to IS 2012 -- as a further experiment, I put the arbiter calls back around the two merge module builds that use extract COM at build time and re-ran all three builds. But this decidedly gives me the willies...

Help!

Thanks,
John

LarryBoy
08-24-2011, 03:04 PM
I just noticed on one of the marketing pages that IS 2012 now uses a kernel driver (or something to that effect) for COM extraction? Granted, I've not done kernel-level development in about 13 years or so, but I gotta admit I'm a wee bit nervous about having my installation tool working at the kernel level, threatening the stability of my server -- seems risky for a tool that simply builds installs...

joshstechnij
08-24-2011, 04:18 PM
If a bugcheck occurred, a memory dump should have been saved in C:\Windows (as MEMORY.DMP) or C:\Windows\Minidump (as Mini*.dmp). If these files are available, please attach them to this thread or, if needed, we can provide an FTP location they can be uploaded to. The dumps contain information about the state of the system at the time of a bugcheck and can be useful in debugging this issue.
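
For anyone who would rather script the check than browse for the files, here is a minimal sketch in Python. It assumes the Windows default locations, a full dump named MEMORY.DMP in the Windows directory and small dumps in a Minidump subfolder; the function name find_dumps is made up for illustration, and a machine with customized crash-dump settings may store the files elsewhere.

```python
from pathlib import Path


def find_dumps(windows_dir):
    """Return crash dump files found under the given Windows directory.

    Assumes the default layout: MEMORY.DMP in the directory itself and
    *.dmp files in its Minidump subfolder.
    """
    root = Path(windows_dir)
    found = []

    full_dump = root / "MEMORY.DMP"
    if full_dump.is_file():
        found.append(full_dump)

    mini_dir = root / "Minidump"
    if mini_dir.is_dir():
        # Sorted so the result order is stable between runs.
        found.extend(sorted(mini_dir.glob("*.dmp")))

    return found


if __name__ == "__main__":
    for dump in find_dumps(r"C:\Windows"):
        print(dump)
```

Reading C:\Windows\Minidump usually requires an elevated prompt, so an empty result may only mean the script lacked permission to see the folder.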

You can also revert to using the COM extraction method provided by IS 2011 and older by adding/changing the following value:
HKLM\Software\InstallShield\RegSpy
UseAPIRegistryHooks (REG_DWORD): 1

Possible values for UseAPIRegistryHooks:
0 - Use old (pre-IS 12) user mode API hooking
1 - Use registry redirection (IS 12+)
2 - Use kernel mode registry filtering (IS 2012+)
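
If the value needs to be set on more than one build machine, a small script can generate a .reg file for it. The following is a sketch only: the key path, value name and the three modes are taken from the post above, while the function name reg_file_text and the MODES table are made up for illustration. The thread does not say whether a 32-bit registry view (Wow6432Node) applies on 64-bit Windows, so the key is written exactly as quoted.

```python
# Key and value name as given in the post above.
KEY_PATH = r"HKEY_LOCAL_MACHINE\Software\InstallShield\RegSpy"
VALUE_NAME = "UseAPIRegistryHooks"

# Descriptions copied from the list of possible values in the post.
MODES = {
    0: "old (pre-IS 12) user mode API hooking",
    1: "registry redirection (IS 12+)",
    2: "kernel mode registry filtering (IS 2012+)",
}


def reg_file_text(mode):
    """Build the text of a .reg file that sets UseAPIRegistryHooks to mode."""
    if mode not in MODES:
        raise ValueError(f"mode must be one of {sorted(MODES)}, got {mode!r}")
    return (
        "Windows Registry Editor Version 5.00\n"
        "\n"
        f"[{KEY_PATH}]\n"
        # REG_DWORD values are written as eight hex digits in .reg files.
        f'"{VALUE_NAME}"=dword:{mode:08x}\n'
    )


if __name__ == "__main__":
    # Mode 1 switches back to the registry redirection method.
    print(reg_file_text(1))
```

Save the output to a file such as regspy.reg and import it from an elevated prompt; writing under HKLM requires administrator rights.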

LarryBoy
08-25-2011, 07:02 AM
Thanks for the tip on how to change the method used!

I've attached the minidump analysis (using WinDbg) -- let me know if it's sufficient or if you still need me to attach the original minidump.

As an aside, I find that the DRIVER_UNLOADED_WITHOUT_CANCELLING_PENDING_OPERATIONS code makes sense, given I was walking along killing (seemingly) hung IsRegSpy processes (re: my original post, where I note that each new build process was hanging during COM extraction, hence my desire to ax things).

So in a sense I caused my own grief -- however, when dealing with hung processes it's not unusual behavior to track down (seemingly) orphaned processes to nuke them before trying again, and to have such actions result in an OS crash is less than desirable...

My 2 cents...

Thanks,
John

LarryBoy
08-25-2011, 07:07 AM
I feel like I should add one positive thing, and that is that every project I converted built without a hitch (trying again after the server crashed) -- so I do appreciate that!

Thanks,
John

joshstechnij
08-25-2011, 11:25 AM
Thanks for your reply.

Would it be possible to attach a zipped copy of the dump file? Since we have symbols for the driver, loading the dump itself in WinDbg would give us a more meaningful stack trace showing which function was being called, and it would be faster and simpler than working from the text analysis. Based on the stack in the attached file, it appears that the filter callback was not unregistered before the driver unloaded (which should not be possible, but likely indicates a race condition).

LarryBoy
08-25-2011, 02:48 PM
Ok - here's the minidump...

Thanks!
John

joshstechnij
08-29-2011, 06:17 PM
We're currently investigating this issue. The behavior here appears to be a race condition that results in the registry filter driver unloading before unregistering its registry callback with the system. Since this is a timing issue, determining the exact scenario that causes this and timing it just right is a bit difficult.

Would it be possible for you to test your scenario with an updated copy of the driver if we were to provide an update?
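
To make the race described in this post concrete, here is a toy model in Python. It is not the InstallShield driver code and says nothing about the actual fix; the class FilterDriver and its methods are hypothetical. It only illustrates the general ordering such a component needs: stop accepting new callbacks, wait for the ones already running, and only then unload.

```python
import threading


class FilterDriver:
    """Toy stand-in for a component that receives callbacks until unloaded."""

    def __init__(self):
        self._lock = threading.Lock()
        self._idle = threading.Condition(self._lock)
        self._registered = False
        self._in_flight = 0
        self.loaded = True

    def register(self):
        with self._lock:
            self._registered = True

    def callback(self, work):
        """Run work() if still registered; return False once unregistered."""
        with self._lock:
            if not self._registered:
                return False
            self._in_flight += 1
        try:
            work()
            return True
        finally:
            with self._idle:
                self._in_flight -= 1
                if self._in_flight == 0:
                    self._idle.notify_all()

    def unload(self):
        """Unregister first, then wait for running callbacks before unloading."""
        with self._idle:
            self._registered = False      # no new callbacks get in
            while self._in_flight:        # drain the ones already running
                self._idle.wait()
            self.loaded = False


if __name__ == "__main__":
    driver = FilterDriver()
    driver.register()
    driver.callback(lambda: None)
    driver.unload()
    print(driver.loaded)
```

Skipping either step in unload() reproduces the shape of the bug: a callback can still be running, or can still be invoked, after the thing it belongs to is gone.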

LarryBoy
08-30-2011, 07:01 AM
Good question -- on one hand, I can certainly appreciate the desire to reproduce issues on the systems where they originally occurred.

However -- that would mean reproducing the issue on our production build machine -- _not_ gonna happen (I'd hate to find out if they really would bring back being drawn and quartered...).

So... I apologize, but until I have the opportunity to try this on a different machine, I won't be able to try any fixes for you...

Thanks,
John