ZEVO resilience to untimely loss of L2ARC

ZEVO resilience to untimely loss of L2ARC

Post by grahamperrin » Sun Nov 04, 2012 1:59 pm

Testing the ability of ZEVO Community Edition 1.1.1 on OS X 10.8.2 to cope with user error and some types of hardware problem.

I backed up, then aimed to mimic failure of a cache vdev by physically removing the device (a USB flash drive) from my MacBookPro5,2.

Results at http://www.wuala.com/grahamperrin/publi ... de=gallery

In brief:

  • notification of an Unexpected Disk Condition
  • I probably opted to reconnect the cache vdev to the laptop
  • towards the end of the OS shutdown or restart routine, force was required.

For a ZFS pool, not bad :)
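
For anyone wanting to reproduce the test, a rough sketch of the setup beforehand. The pool and GPTE device names here are examples borrowed from later in this thread, not a prescription:

Code: Select all
sh-3.2$ sudo zpool add zhandy cache GPTE_EC9A371E-C089-4E64-A8AA-F270CB9FB4B6
sh-3.2$ sudo zpool status zhandy
# confirm that the cache vdev is ONLINE, then physically remove
# the USB flash drive whilst the pool is in use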

Comparison

I should not expect a pure Fusion Drive configuration to cope as gracefully with loss of an SSD from the CoreStorage pool.

Cross reference

L2ARC: dynamic import before cache vdev is present

Re: ZEVO resilience to untimely loss of L2ARC

Post by TomUnderhill » Sun Nov 04, 2012 11:11 pm

I've been making so many changes to my hardware setup lately that I've been running my Mac Pro with the side panel off. I know it's not the best for proper airflow, but the ambient temperatures haven't been bad lately. The newest addition is a second 2.5" SSD placed in service as L2ARC. I had my boot SSD securely mounted with foam blocks in the lower drive bay... securely, that is, until I jury-rigged the second SSD into the bay.

I guess the L2ARC's SATA cable detached from my drive when cable tension pushed beyond the limits of the connector's hold.

I didn't see any error messages at boot, or while running the system for a couple of hours. I finally noticed the little black tail on the red cable when it vibrated against the optical drive housing as the DVD spun up.

Powered down, reconnected the SATA cable, rebooted. Scrubbed the pool; no errors reported.

Not the most scientific of tests or conditions, but it adds to my confidence in the integrity of the data.
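
For reference, the check amounted to something like this, with 'tank' standing in for my pool's name:

Code: Select all
sh-3.2$ sudo zpool scrub tank
sh-3.2$ sudo zpool status tank
# the 'scan:' line shows progress, then e.g. 'scrub repaired 0 ... with 0 errors'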

Re: ZEVO resilience to untimely loss of L2ARC

Post by grahamperrin » Sun Nov 04, 2012 11:59 pm

Tom's tale is very reassuring. +1 to ZFS and +1 to Don/GreenBytes for an implementation for OS X that allows uninterrupted use without the cache vdev.

TomUnderhill wrote:… I didn't see any error messages at boot, or while running the system for a couple of hours. …


Do you have Growl, with enough history to see whether a notification occurred?

My own example below (cropped from the screenshot in Wuala):

example.png


If no notification, I wonder why not. Maybe the cable was detached whilst the computer was off.
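
One after-the-fact way to check whether the kernel noticed anything, for example:

Code: Select all
sh-3.2$ syslog -k Sender kernel | grep -i zfs
# any ZFSLabelScheme or vdev messages around the time in question would be telling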

In any case I'd wish for a notification of some sort – maybe recurring, but not too frequent – whilst a cache vdev is missing for a mounted file system.

Re: ZEVO resilience to untimely loss of L2ARC

Post by TomUnderhill » Mon Nov 05, 2012 12:55 am

grahamperrin wrote:Do you have Growl, with enough history to see whether a notification occurred?


I will check Growl's history tomorrow morning and see what I find.

But definitely +1 for a filesystem that not only didn't hiccup at the loss of its read cache, but picked right back up when the cache was restored to the pool.

Re: ZEVO resilience to untimely loss of L2ARC

Post by si-ghan-bi » Wed Nov 07, 2012 10:47 am

Yesterday I unplugged the L2ARC's USB 2 cable by mistake: no issues whatsoever. "zpool status -v" notified me of "too many" read errors and the consequent dropping of the cache device. I noticed half an hour later.

Re: ZEVO resilience to untimely loss of L2ARC

Post by grahamperrin » Thu Nov 08, 2012 1:57 am

+1

Looking good – 

Code: Select all
sh-3.2$ date
Thu  8 Nov 2012 06:42:29 GMT
sh-3.2$ sudo zpool status -v zhandy
  pool: zhandy
 state: ONLINE
 scan: scrub repaired 0 in 13h23m with 0 errors on Sat Nov  3 07:30:53 2012
config:

   NAME                                         STATE     READ WRITE CKSUM
   zhandy                                       ONLINE       0     0     0
     GPTE_1928482A-7FE4-482D-B692-3EC6B03159BA  ONLINE       0     0     0  at disk5s2
   cache
     GPTE_EC9A371E-C089-4E64-A8AA-F270CB9FB4B6  FAULTED      0     0     0  too many errors

errors: No known data errors
sh-3.2$


– and I'm aggressively writing to, and reading from, the pool (adding various versions of an 8 GB .vdi file, repeatedly, then starting a VirtualBox VM from that virtual disk image).
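
(If anyone cares to generate comparable load, a sketch; the path assumes the pool mounts at /Volumes/zhandy, and the file name is a placeholder:)

Code: Select all
sh-3.2$ mkfile 8g /Volumes/zhandy/filler.vdi
sh-3.2$ dd if=/Volumes/zhandy/filler.vdi of=/dev/null bs=1m
# an 8 GB write, followed by a sequential read back through the ARC/L2ARC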

In greater detail –

Code: Select all
sh-3.2$ sudo zdb -h zhandy | grep 2012-11-08
2012-11-08.02:22:17 [internal pool import txg:2302042] pool spa 28; zfs spa 28; zpl 5; uts macbookpro08-centrim.home@1B4C77AE-B80A-59F9-B5CB-7A86B7437D40 12.2.0 Darwin Kernel Version 12.2.0: Sat Aug 25 00:48:52 PDT 2012; root:xnu-2050.18.24~1/RELEASE_X86_64 MacBookPro5,2
2012-11-08.02:45:40 zpool online zhandy GPTE_EC9A371E-C089-4E64-A8AA-F270CB9FB4B6


– with a kernel message in the midst:

Code: Select all
2012-11-08 02:45:32.000 kernel[0]: ZFSLabelScheme:willTerminate: this 0xffffff80310cec00 provider 0xffffff802eba8e00 '%noformat%'


So far everything seems more than adequate for this second release (1.1.1) of the Community Edition.

For a bells, whistles and polish edition of ZEVO I might wish for a notification to both:

  • recur; and
  • use plain English (or its equivalent in other locales).

On the other hand, if it's perfectly acceptable to be without a cache vdev: notify just once, in plain English.

