I have been using a stable Win XP platform daily for many years for business purposes, to which it is ideally suited. There have been no updates and almost no new programs, and certainly none in the period in which the following problem unexpectedly arose.
Problem No. 1: Following a normal extended period of continual use of hibernation (reboot maybe once in six weeks), one day the computer would no longer resume. Never had this problem before.
Behavior: Press power button to turn on, hardware presents Sony splash screen, computer goes to blinking cursor that usually precedes starting of the resume bar, but stops dead there. Hard drive OK (tested with SpinRite 6, no problems reported). All files present and accessible via a USB auxiliary drive.
The fix: Clone the hard drive using EZGig II at the “as is” bit/sector level, the way I have made backups for years, and replace the production drive with the cloned image; the computer boots right up. I continued using the replacement drive while experimenting on the hung original drive. (SpinRite, and bootcfg, fixboot and fixmbr in the Recovery Console: none of these corrected the problem.)
The problem repeated a month later. I concluded it would be wise to discontinue Hibernation and Resume at least temporarily, so I disabled Hibernation and did a full boot and shutdown for every subsequent use of the machine, multiple times a day. That worked fine, but the underlying problem remained unsolved.
Problem No. 2: A month later, problem repeats, this time from cold boot.
Normal behavior is that after a brief flashing cursor, the multi-boot menu defined in the BOOT.INI file appears, offering “Windows XP Home” (Service Pack 3) or “Windows Recovery Console” (which no longer works; I think it disappeared years ago when upgrading to SP3). The computer then boots normally to the desktop via the default Win XP selection.
New behavior is, again it stops at the flashing cursor.
The temporary fix: Same as above. Clone hard drive, swap in, computer boots normally, resume work. However, this solution could get old very quickly, as it takes 2 hours to clone the drive.
Analysis:
Careful study of the Windows XP Resource Kit, 3rd Edition, 2005, updated through SP2, shows that the bootup sequence involved here is as follows:
A. On Resumption from Hibernation: “NTLDR uses firmware calls to locate the startup disk. If it finds a HIBERFIL.SYS file on the systemdrive root, the information is read back into memory and the computer resumes exactly where it left off without going through a full startup sequence. If the Windows loader cannot locate the HIBERFIL.SYS file, it processes the BOOT.INI file and proceeds with normal startup.” (Page 1238)
My problem is that the HIBERFIL.SYS file is there: I can see it, and when I install the clone backup in the machine, the machine actively rejects the HIBERFIL.SYS as a failed resume and asks whether I want to do a normal start instead. But when the problem is occurring, no error message is generated, and the cursor just sits there indefinitely, flashing.
B. Cold Boot: The nitty gritty of the startup from cold can be summarized as follows:
. . BIOS calls for Power On Self Test — does hardware checks, necessary devices present, retrieve configuration from CMOS
. . Add-on adapters such as video and hard drive controllers perform their own self checks
. . As there is no floppy disk, the computer reads the boot code from the Master Boot Record residing in the first sector of data on the hard disk (contains boot code and partition table)
. . Boot code searches the partition table for the active partition
. . First sector of the active partition contains additional boot code that determines the file system used, and locates the OS loader file, NTLDR
. . NTLDR loads startup files from the boot partition, and then switches the processor to 32-bit mode, starts the NTFS file system, and reads the BOOT.INI file. (See pp. 1182-1186.)
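The MBR step in the list above can be checked by hand. The following is a minimal sketch, assuming the first 512-byte sector of the disk has already been saved to a file or read into memory with some imaging tool. It uses the standard MBR layout: four 16-byte partition entries starting at offset 446, and the 0x55 0xAA signature at offset 510. The function name `parse_mbr` and the demo values are made up for illustration.

```python
import struct

SECTOR_SIZE = 512
PARTITION_TABLE_OFFSET = 446   # standard MBR layout
ENTRY_SIZE = 16


def parse_mbr(sector: bytes) -> dict:
    """Report the boot signature and the non-empty partition entries."""
    if len(sector) != SECTOR_SIZE:
        raise ValueError("an MBR sector is exactly 512 bytes")

    # The last two bytes of a valid MBR are 0x55 0xAA.
    signature_ok = sector[510:512] == b"\x55\xaa"

    partitions = []
    for i in range(4):
        start = PARTITION_TABLE_OFFSET + ENTRY_SIZE * i
        entry = sector[start:start + ENTRY_SIZE]
        status = entry[0]            # 0x80 marks the active (bootable) partition
        ptype = entry[4]             # e.g. 0x07 for NTFS
        start_lba, num_sectors = struct.unpack_from("<II", entry, 8)
        if ptype == 0:               # type 0 means the slot is unused
            continue
        partitions.append({
            "index": i,
            "active": status == 0x80,
            "type": ptype,
            "start_lba": start_lba,
            "sectors": num_sectors,
        })
    return {"signature_ok": signature_ok, "partitions": partitions}


if __name__ == "__main__":
    # Synthetic sector for demonstration only: one active NTFS partition.
    demo = bytearray(SECTOR_SIZE)
    demo[446:462] = struct.pack("<B3sB3sII", 0x80, b"\0\0\0", 0x07,
                                b"\0\0\0", 63, 1000)
    demo[510:512] = b"\x55\xaa"
    print(parse_mbr(bytes(demo)))
```

Running this against sector 0 of both the hung drive and the working clone would show whether each has a valid signature and exactly one active partition, which is what the boot code looks for before handing off to NTLDR.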
My problem here is that in either of my failure modes, the computer appears to be hanging just before it reads the BOOT.INI file.
AND, that when I do a bit-for-bit clone of the drive, the problem goes away! (But remains on the source disk, and could recur at any time on the cloned drive now in production.)
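One way to test whether the clone really is identical to the source is to compare the two drives sector by sector and list where they differ. The sketch below assumes both drives have been dumped to image files (or are otherwise readable as binary streams). The function name `differing_sectors` and the in-memory demo data are hypothetical. If no sectors differ, the data is the same and the difference in behavior would have to come from somewhere other than the bytes on disk.

```python
import io
from typing import BinaryIO, List

SECTOR_SIZE = 512


def differing_sectors(a: BinaryIO, b: BinaryIO, limit: int = 20) -> List[int]:
    """Return the indexes of the first `limit` sectors that differ."""
    diffs: List[int] = []
    index = 0
    while len(diffs) < limit:
        sector_a = a.read(SECTOR_SIZE)
        sector_b = b.read(SECTOR_SIZE)
        if not sector_a and not sector_b:
            break                    # both streams exhausted
        if sector_a != sector_b:     # also catches one image being shorter
            diffs.append(index)
        index += 1
    return diffs


if __name__ == "__main__":
    # In-memory stand-ins for two disk images, for demonstration only.
    original = bytearray(SECTOR_SIZE * 4)
    clone = bytearray(original)
    clone[SECTOR_SIZE * 2 + 5] = 0xFF    # alter one byte in sector 2
    print(differing_sectors(io.BytesIO(bytes(original)),
                            io.BytesIO(bytes(clone))))
```

With real images, the two streams would come from `open(path, "rb")`. Differences in sector 0 or in the first sector of the active partition would point at the boot code examined in the steps above.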
Now, in reality, there really isn’t a whole lot going on here in the boot sequence before this problem crops up, in terms of the usual Windows complexities of Registry, etc.
Does anyone see a clue in the fact that cloning fixes it?
Is this a behavior that could be caused by a virus? And if so, why would cloning remove it?
Where do I look next?
I’m really puzzled.
Thanks for any help.
— WSRon