【Case Sharing】Sun Fire V890 Troubleshooting Case



Fault Background


Antute received a report from a customer: a Sun Fire V890 server would not start, the dual-node (HA) pair had not failed over normally, and business was interrupted.


After arriving on site, Antute engineers learned that the system had stopped responding after the customer adjusted its configuration, so the customer powered the device off physically. When it was powered back on, the device could not boot and hung at the message "Boot device: /pci@8,600000/SUNW,qlc@2/fp@0,0/disk@w21000014c31799fd,0:a File and args:". The hardware self-test passed without errors, so the preliminary diagnosis was a problem with the operating system rather than the hardware.


Antute engineers tried several recovery methods, such as repairing the file system, adding boot files, and breaking the mirror to boot from a single disk, but none of them worked. Because the device held very important data that had not been exported, reinstalling the system was not an option. After further analysis, the system was finally made bootable again by updating the boot archive.




Procedure



1. Boot the failsafe archive. At the ok prompt, type the following command:

ok boot -F failsafe

2. Mount the root (/) file system on /a

# mount /dev/dsk/c0t0d0s0 /a

3. Copy the md.conf file to the /kernel/drv directory

# cp /a/kernel/drv/md.conf /kernel/drv/

4. Unmount /a

# umount /a

5. Load the md driver

# update_drv -f md

Running this command causes the md configuration to be read and the necessary devices to be created.


6. Use the metasync command to ensure synchronization of the root (/) file system. Example:


# metasync d0

7. Mount the root mirror metadevice on the /a directory

# mount /dev/md/dsk/d0 /a

8. Update the boot archive of the device mounted in the previous step

# bootadm update-archive -v -R /a

forced update of archive requested
cannot find: /a/etc/cluster/nodeid: No such file or directory
cannot find: /a/etc/mach: No such file or directory
Creating boot_archive for /a
updating /a/platform/sun4u/boot_archive


If updating the boot archive fails or an error message appears, do the following:


a. Update the timestamp of the md.conf file under /a, which forces the boot archive to be updated

# touch /a/kernel/drv/md.conf

b. Run the bootadm command again to update the boot archive

# bootadm update-archive -v -R /a

9. Unmount /a

# umount /a

10. Reboot the system



After the reboot, the long-awaited login screen finally appeared.
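The steps above can be summarized in a short dry-run sketch. It only prints the commands in order and does not execute anything, so it can be reviewed before being typed in at the failsafe shell. The function name print_recovery_plan is hypothetical, and the defaults (c0t0d0s0, d0, /a) are simply the values used in this case; the submirror slice and metadevice name on another system will very likely differ. The fallback from step 8 (touch md.conf, then retry) is folded into a single line.

```shell
# Dry-run sketch: print the recovery commands for a root file system
# on an SVM mirror. Nothing is executed; review the output first.
# Arguments (all optional, defaults are the values from this case):
#   $1  submirror slice   $2  root mirror metadevice   $3  mount point
print_recovery_plan() {
    submirror=${1:-/dev/dsk/c0t0d0s0}
    metadev=${2:-d0}
    mnt=${3:-/a}

    cat <<EOF
mount $submirror $mnt
cp $mnt/kernel/drv/md.conf /kernel/drv/
umount $mnt
update_drv -f md
metasync $metadev
mount /dev/md/dsk/$metadev $mnt
bootadm update-archive -v -R $mnt || { touch $mnt/kernel/drv/md.conf && bootadm update-archive -v -R $mnt; }
umount $mnt
EOF
}

# Print the plan with the defaults from this case.
print_recovery_plan
```

Calling it with other arguments, for example print_recovery_plan /dev/dsk/c1t0d0s0 d10, prints the same sequence for a different submirror and metadevice.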



Experience Summary



1. When Solaris is installed, the bootadm command creates a boot archive on the system. The boot archive is a subset of the root (/) file system: it contains all kernel modules, the driver.conf files, and several configuration files. These configuration files are located in the /etc directory.


2. The kernel reads the files in the boot archive before the root file system is mounted. Once the root file system is mounted, the kernel discards the boot archive from memory, and all further file I/O is performed against the root device.


3. The bootadm command handles the details of updating and verifying the boot archive. During a normal system shutdown, the shutdown process compares the contents of the boot archive with the root file system. If files on the root file system have changed (for example, drivers or configuration files), the boot archive is rebuilt to include those changes, so that the boot archive and the root file system are in sync at the next boot.


4. In this case the device had been running for a long time without a reboot, and the unexpected cold shutdown gave the system no chance to synchronize the boot archive, so the next boot failed. Rebooting a device that has been running for a long time, and especially power-cycling it, should therefore be done with care. It is recommended to complete the corresponding backups and preparations before the operation, just in case.


Note: The files in the SPARC boot archive are located in the /platform directory. You can use the bootadm list-archive command to list the contents of the boot archive.
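As a rough illustration of the comparison described in point 3, the sketch below lists files that were modified more recently than the boot archive. It is not how bootadm itself performs the check. The function name stale_files is hypothetical, and the input is assumed to be one root-relative path per line, similar to what bootadm list-archive prints (that output format is an assumption here). It also assumes a POSIX-style shell such as ksh or bash.

```shell
# Sketch: print files that are newer than the boot archive.
# usage: stale_files ARCHIVE [ROOT] < list-of-root-relative-paths
#   ARCHIVE  path to the boot archive file
#   ROOT     alternate root such as /a (empty means the running system)
stale_files() {
    archive=$1
    root=${2:-}
    while IFS= read -r rel; do
        # Skip empty lines in the list.
        [ -n "$rel" ] || continue
        # find descends into directories; -newer compares modification
        # times. Paths that do not exist are silently ignored.
        find "$root/$rel" -type f -newer "$archive" -print 2>/dev/null
    done
    return 0
}
```

With the root mirror mounted on /a as in the procedure above, the list could be supplied with something like bootadm list-archive -R /a | stale_files /a/platform/sun4u/boot_archive /a. Any path printed would suggest that the archive is out of date. The bootadm(1M) man page also documents an -n option for update-archive that checks the archive without updating it, which is probably the better choice where it is available.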


For more information, please visit the official Antute website: www.antute.com.cn
