Blue Screens Suck

OK, blue screens suck. Before this week, I had not seen one in quite a long time, and even when I do see them, I usually get no more than one or two per year.

 

BUT, don't blame MS straight off the bat. Keep in mind that there is code running on your machine that is capable of initiating a kernel panic (blue screen) without our involvement. (And before you blame THAT problem on MS, imagine if PCs were like game consoles and every app had to be licensed and approved by MS. That is not the case, and that is what makes the PC ecosystem so great, but of course also riskier.) Every device attached to your machine can do so, as can every device driver. In MANY cases the device drivers should be signed either by Microsoft directly (if the driver came with Windows) or by the Windows Hardware Quality Labs, WHQL (if the company submitted the driver sometime later to be signed).

 

Let's talk a bit about signatures and signature types, as there are several kinds. First there is what is called an "unqualified" signature. This is where a device manufacturer submits a driver of some kind (usually a kernel mode driver, but it can be a UMDF (User Mode Driver Framework) driver as well), but either does not say what type of device it is for, or the device does not attempt to conform to a specific type of device for which there is a specific program. Currently 3G cards are a good example: they emulate either Ethernet or PPP devices, but do not conform 100% to the implementations of either of those profiles, so really the device is unclassified. In these cases it is still possible to get a signature, but all that the signature represents is that, to the best of our knowledge, this driver generally does not behave badly. (I don't have knowledge of the exact testing process for this, but you can read more on the websites below.) When a driver receives an unqualified signature, the user will not be warned during driver installation, and the driver itself will always show that it has been signed by WHQL (and of course, the vendor cannot replace the driver binaries with a non-signed version and keep the signature).

There are also various specific "logo programs" (you can find hardware labeled either "Certified for Windows Vista" or "Works with Windows Vista") where the drivers submitted must adhere to very strict behaviors and support very specific standards and practices. In these cases we know to an even BETTER degree that the device will behave well in the machine, even in not-so-good conditions, because the type of device is well known and well specified, which lets us test much more extreme behaviors of the driver.

(You can read even more details about all of this here: http://www.microsoft.com/whdc/default.mspx AND here: https://winqual.microsoft.com/)

Participation in any of the above programs is usually a good sign that a device will not cause a blue screen (not saying that it is impossible, just improbable). Devices that do not participate will of course not be "blocked" (unless you have an x64 machine, in which case I believe that SOME type of signature (maybe not WHQL, but from someone) is required just so that it's not completely anonymous kernel code), but users generally get a stern warning when installing unsigned drivers. These days you mostly get a blue screen only in the following cases:

  1. Hardware goes bad. (If your memory goes, which happens a lot, or a hard drive starts going, then bits start flipping and corruption can occur, causing a blue screen.)
  2. An unsigned driver was installed, and there is a bug in that driver. (1 & 2 are the MOST COMMON.)
  3. An unclassified but signed driver fails (a bug) while doing a specific action related to its custom purpose, which could not have been tested during the signature process.
  4. (Least common) A logo'd driver fails in some way, uncovering a bug in the logo'd driver.

Of course all of these cases are possible (and MORE are possible as well, but they are less likely these days; almost any kernel mode code in the OS will cause a blue screen when it fails).
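If you want to turn that ordering into a rough "who do I suspect first" list, here is a minimal Python sketch. It only encodes the driver-related cases (2 through 4) and says nothing about failing hardware. The `Signing` categories, the `rank_suspects` helper, and the driver names are all hypothetical, made up for illustration; nothing here queries Windows for real signature information.

```python
from enum import Enum


class Signing(Enum):
    """Hypothetical labels for the three driver categories discussed above."""
    UNSIGNED = "unsigned"
    UNQUALIFIED = "signed, but unclassified"
    LOGO = "signed under a logo program"


# Lower number = suspect first, following the ordering of cases 2, 3 and 4.
_SUSPICION_ORDER = {
    Signing.UNSIGNED: 0,
    Signing.UNQUALIFIED: 1,
    Signing.LOGO: 2,
}


def rank_suspects(drivers):
    """Return driver names sorted from most to least suspicious.

    `drivers` maps a driver name to its Signing category. Ties are broken
    alphabetically so the result is deterministic.
    """
    return sorted(drivers, key=lambda name: (_SUSPICION_ORDER[drivers[name]], name))


if __name__ == "__main__":
    # Made-up driver names, purely for illustration.
    installed = {
        "logo_nic.sys": Signing.LOGO,
        "mystery_usb.sys": Signing.UNSIGNED,
        "modem3g.sys": Signing.UNQUALIFIED,
    }
    for name in rank_suspects(installed):
        print(f"{name}: {installed[name].value}")
```

This is only a heuristic for where to start looking; the debugger is what actually tells you which driver was on the call stack.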

 

And in case you want to know what a blue screen is, it's basically any condition from which the kernel cannot recover (think kernel panic). This can be because of hardware behavior or because of driver/software behavior. In general, in my experience, if you see an IRQL_NOT_LESS_OR_EQUAL, you most likely have a driver problem. Think back to whether you recently installed any new devices, or any software that prompted you to install drivers.

Of course, it is never impossible that the fault is Microsoft's either. We have our share of problems as well. But I can truly say that in my experience this is never out of negligence or reckless oversight. In every case I have seen, it's just some corner case nobody thought of, some corner case that becomes a mainstream case via some new hardware or software, or many times just a random issue stemming from how complicated software is today.

 

Finally, I have said this before and will say it again: EVERYONE out there has the power to stop and investigate a blue screen before it happens. Just download the debugging tools (http://www.microsoft.com/whdc/DevTools/Debugging/default.mspx), connect a null modem cable to another machine, use "bcdedit /debug on" (in Vista; boot.ini in XP) to enable debugging on the next reboot, download symbols for the OS (or just point the debugger at the symbol server website), and you too can be staring at the offending call stack the next time your machine's kernel is about to crash. It's EXACTLY the same process and tools that we use here to investigate the kernel. We have additional information in our symbols regarding source files and line numbers, but that's the only difference. So if you ARE inclined to want to know the WHY and the WHAT of your blue screens, just give it a shot. Then you will at least know whether you are cursing at the correct offender. (Not to get your hopes up, though: even with the debugger attached, if your machine was going to blue screen, chances are it eventually will after you hit 'g', unless you are a master of Intel assembly and can correct the wayward bits ;) )
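To make the Vista-side steps a bit more concrete, here is a small Python sketch that just assembles the command lines involved; it does not run anything. The COM port number, the baud rate, and the local symbol cache folder are assumptions you would adjust for your own setup, and the helper names (`target_setup_commands`, `symbol_path`, `host_debugger_command`) are made up for this example. The `bcdedit /dbgsettings` line and the `-k` / `-y` switches for WinDbg are, as far as I know, the standard ones, but check them against the documentation for your tools version.

```python
def target_setup_commands(com_port=1, baud=115200):
    """Commands to run from an elevated prompt on the machine being debugged.

    Assumes Vista or later (bcdedit) and a null modem cable on the given COM port.
    """
    return [
        "bcdedit /debug on",
        f"bcdedit /dbgsettings serial debugport:{com_port} baudrate:{baud}",
    ]


def symbol_path(cache_dir=r"C:\symbols"):
    """Symbol path that downloads from the public Microsoft symbol server.

    cache_dir is an assumed local folder where downloaded symbols are kept.
    """
    return f"srv*{cache_dir}*https://msdl.microsoft.com/download/symbols"


def host_debugger_command(com_port=1, baud=115200, cache_dir=r"C:\symbols"):
    """Argument list to start WinDbg on the second machine as a kernel debugger."""
    return [
        "windbg",
        "-k", f"com:port=com{com_port},baud={baud}",  # kernel connection over serial
        "-y", symbol_path(cache_dir),                 # where to find symbols
    ]


if __name__ == "__main__":
    print("On the machine you want to debug:")
    for cmd in target_setup_commands():
        print("  " + cmd)
    print("On the machine running the debugger:")
    print("  " + " ".join(host_debugger_command()))
```

After the target reboots with debugging enabled, the debugger on the other machine should break in when the kernel hits the failure, which is when you get to look at the call stack.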

 

Enjoy.
