How I crashed Solaris!
Well, I’ve been working on the Linux platform for quite some time, and I’ve been using Solaris for the past few months. By mistake I discovered a major flaw that can be dangerous to the system and its data, so I started a detailed investigation into how the system can be crashed and to what extent it can be recovered.
Hardware:
Processor: AMD Athlon(tm) 64 X2 Dual Core Processor 4400+
Hard disk: 40 GB IDE
RAM: 4 GB, plus other standard hardware.
Distributions used for testing:
Belenix 0.7.1
Problems faced:
GRUB error
Boot archive error
ZFS pool degraded
Methodology:
Simply switch off the system (please note: this means switching off or cutting the power, not a proper shutdown).
The mistake that taught me the lesson:
I had installed Belenix on my system and had been doing R&D with it for quite some time. One day the power went down while the Solaris system was running, and my UPS ran dry, so the system simply switched off. The next time I started the system, I saw just the word “GRUB” on the screen and nothing else; there was no GRUB menu. I realized something was wrong and started investigating the problem.
Kindly note that these tests were performed many times. Sometimes GRUB is lost on the first attempt; sometimes the machine has to be switched off more than twice before GRUB is lost.
Well, let’s get started.
Test 1: Steps performed.
- I reinstalled Belenix on my system and confirmed that everything was working fine.
- Then I simply switched off the power.
- Restarted the system, and GRUB was lost (very strange). Sometimes it takes 2-3 repetitions of the previous step to reach this stage.
I’ve read and heard about the stability and security of Solaris, which is why I thought of using it in the first place. But what is this flaw that can compromise the entire system? At first I thought it was specific to Belenix, so I decided to give OpenSolaris a try.
Test 2: Steps
- Installed OpenSolaris on my system.
- 40 GB divided into 2 primary partitions.
- /dev/rdsk/c4d0s1 as swap (1.2 GB)
- /dev/rdsk/c4d0s0 as zfs root filesystem on which os2008 was installed (18.5 GB)
- /dev/rdsk/c4d0s2 was left for creating zfs pool (20.5 GB)
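The slice layout above can be verified from the running system; here is a minimal sketch, assuming a Solaris box and the `c4d0` disk name used in this post (the commands are skipped gracefully on other systems):

```shell
# Print the VTOC (slice table) of the disk used in this post.
# DISK is an assumption matching the device names above.
DISK=c4d0
if command -v prtvtoc >/dev/null 2>&1; then
    prtvtoc /dev/rdsk/${DISK}s2   # s2 conventionally covers the whole disk
    swap -l                       # confirm that s1 really holds the swap
else
    echo "prtvtoc not available; run this on the Solaris system itself"
fi
```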
After checking that everything was working fine, I switched off the power again, and found that GRUB was lost again. Now this is a serious issue. Anyway, I did a lot of googling about recovering GRUB on OpenSolaris and found some good solutions.
To learn more about how to recover GRUB in OpenSolaris, read my other post:
http://www.linuxguy.in/?p=3
I recovered GRUB using the methods provided at that link, and things were normal again. Up to this point, if you happen to lose GRUB, you can recover it. But I was not satisfied, so I thought of performing some more stunts.
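For reference, the recovery essentially boils down to reinstalling the GRUB stages onto the root slice from a failsafe boot. A hedged sketch, assuming the `c4d0s0` root slice from Test 2 (see the linked post for the full procedure):

```shell
# Reinstall GRUB stage1/stage2 on the ZFS root slice.
# Run from a failsafe boot; ROOTSLICE matches the layout in Test 2.
ROOTSLICE=c4d0s0
if command -v installgrub >/dev/null 2>&1; then
    installgrub /boot/grub/stage1 /boot/grub/stage2 /dev/rdsk/${ROOTSLICE}
else
    echo "installgrub not found; boot the Solaris failsafe image first"
fi
```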
Test 3: Steps
- I created zfs pool on the second hard disk.
- This time I was copying some huge data to the new pool over ssh when I switched off the computer. As expected, GRUB was lost, so I recovered it. But this time, after recovering GRUB, three different scenarios played out:
- Scenario 1: A boot archive issue. Check the last entry in the mailing list thread to know more.
- I tried to recover the boot archive, and even deleted and recreated it, but the system still would not start. I had to reinstall the system.
- Scenario 2: The system booted. I could see the other pool, and when I imported it, its status showed as degraded. I cleared the pool’s errors with the command “zpool clear poolname”.
- Restarted the system and lost GRUB again. I restored GRUB once more, but the system would not boot even after restoring it.
- Scenario 3: Restored GRUB and started the system. The GRUB menu shows up and then the system reboots, in a continuous loop. The only solution was to reinstall the entire system.
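For Scenario 1, the usual repair attempt from a failsafe boot is regenerating the boot archive. A sketch, assuming failsafe has mounted the broken root filesystem at /a (in my case even this did not always save the system):

```shell
# Rebuild the boot archive on the mounted root filesystem.
# ALTROOT=/a is where a Solaris failsafe boot normally mounts root.
ALTROOT=/a
if command -v bootadm >/dev/null 2>&1; then
    bootadm update-archive -R ${ALTROOT}
else
    echo "bootadm not found; run this from a Solaris failsafe boot"
fi
```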
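The Scenario 2 sequence of importing the degraded pool and clearing its errors can be sketched as follows; “tank” is a placeholder pool name, and a scrub is added here as a reasonable extra step to re-verify the data against ZFS checksums:

```shell
# Import a pool that survived the power cut, inspect it, and clear
# its error counters. POOL is a placeholder name.
POOL=tank
if command -v zpool >/dev/null 2>&1; then
    zpool import ${POOL}       # bring the pool back online
    zpool status -x ${POOL}    # show why it is marked DEGRADED
    zpool clear ${POOL}        # reset the device error counters
    zpool scrub ${POOL}        # re-check all data against checksums
else
    echo "zpool not found; run this on the Solaris/ZFS system"
fi
```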
I found this thread on the mailing list in which others had faced a similar issue earlier:
http://opensolaris.org/jive/thread.jspa?messageID=268456
Result: None of the methods suggested in the thread worked for me. I was able to recover the system when GRUB was lost, but if you end up in the situation the last poster in the thread describes (which I did as well), I still don’t have a solution for recovering the system from that stage.
Outcome: Please be very careful while switching off your Solaris system. Do a proper shutdown, or you may end up losing your valuable data.
Updates on 2.11.2008
Test 4: Steps
- Installed os2008 on the entire disk (chose the whole disk during installation), rather than on a partition.
- Confirmed that the OS was working perfectly and switched off the machine.
- Restarted the system, and it works. (Great news.)
Result: If you install os2008 on the entire disk, then switch the power off and on again, GRUB is not lost.

Comments:
November 1st, 2008 at 8:06 pm
OpenSolaris and Belenix are not what I would call “ready for production”. With that said, I run OpenSolaris on my laptop, and throughout the earlier releases I have done unclean shutdowns and never seen what you went through. If you are looking for stability, give Solaris x86 (download from Sun) a try; since this is the commercial version, Sun ensures stability in its releases.
November 1st, 2008 at 10:05 pm
This is very weird. I have had occasions in the past, due to one issue or another, where I had to switch off the system without shutting down OpenSolaris (mostly due to hangs with earlier Nvidia drivers). I have yet to face this problem.
November 3rd, 2008 at 6:56 am
True. Fully true.
Living in a tropical area with a lot of power outages from thunderstorms, I can partially confirm your observation. I never lost GRUB, but I get fairly regular corruptions of the boot archive that require a failsafe reboot.
What amazes me most is that atomic (reads and) writes are the sales argument. Somewhere I read that corruption didn’t happen in a simulated one million power cut-offs. In a real-world situation, it looks more like 1 in 5 ending up with data corruption. I have been running various versions (usually close to the most recent) of Nevada.
November 8th, 2008 at 6:57 pm
This is strange. I have cut the power several times on my Solaris Nevada computer when I tried different device drivers etc. I never once had a problem, neither with ZFS nor with anything else.