how does one identify GPTE back to /dev/diskn when faulted?

Post by tangles » Fri Oct 05, 2012 3:21 am

Hi,

My pool just became degraded:

Code:
mrq:~ sadmin$ zpool list
NAME    SIZE   ALLOC    FREE     CAP  HEALTH  ALTROOT
Data  19.1Ti  5.59Ti  13.5Ti     29%  DEGRADED  -
mrq:~ sadmin$


First of all, I'm not quite sure how I'm exceeding the 16TB limit of Zevo CE, but that's another post for later.

zpool status gives:

Code:
 pool: Data
 state: DEGRADED
status: One or more devices are faulted in response to persistent errors.
   Sufficient replicas exist for the pool to continue functioning in a
   degraded state.
action: Replace the faulted device, or use 'zpool clear' to mark the device
   repaired.
 scan: scrub repaired 0 in 2h23m with 0 errors on Fri Oct  5 09:04:07 2012
config:

   NAME                                           STATE     READ WRITE CKSUM
   Data                                           DEGRADED     0     0     0
     raidz1-0                                     ONLINE       0     0     0
       GPTE_7FF8B766-B40D-427E-B4FB-6D06BE2D5C24  ONLINE       0     0     0  at disk2s2
       GPTE_266D6073-2A1E-44C0-A3ED-1414A16F23CF  ONLINE       0     0     0  at disk0s2
       GPTE_312E1FF8-5AD6-46A5-B349-CD198BB7650F  ONLINE       0     0     0  at disk6s2
     raidz1-1                                     ONLINE       0     0     0
       GPTE_7ED54770-69A5-47DF-A88C-DB968EDCACE0  ONLINE       0     0     0  at disk3s2
       GPTE_55B4FCD7-A340-48EF-A07B-6BAD50EA5B03  ONLINE       0     0     0  at disk1s2
       GPTE_C780F366-DACC-4353-A97E-70F543677D15  ONLINE       0     0     0  at disk4s2
     raidz1-2                                     ONLINE       0     0     0
       GPTE_2F013EDA-524B-42D9-A596-33561E3F0CB6  ONLINE       0     0     0  at disk8s2
       GPTE_BA31AC30-F529-4912-85E8-1295F4A0B581  ONLINE       0     0     0  at disk14s2
       GPTE_5380700A-E849-46C6-AA3F-026249B0AAB4  ONLINE       0     0     0  at disk9s2
     raidz1-3                                     DEGRADED     0     0     0
       GPTE_53D7F6A4-0D57-4A11-AA7A-5384E1B698EB  ONLINE       0     0     0  at disk11s2
       GPTE_7783397E-406F-4827-8840-852AB933499C  FAULTED      0     0     0  too many errors
       GPTE_D6722D1F-94C8-4197-8BAB-7084B3C0AA22  ONLINE       0     0     0  at disk12s2
     raidz1-4                                     ONLINE       0     0     0
       GPTE_B8D4FD7B-665D-49A4-9B37-1DC548BE4EB8  ONLINE       0     0     0  at disk15s2
       GPTE_CCCBF756-1A5E-4CB0-B5B7-7D006B669870  ONLINE       0     0     0  at disk13s2
       GPTE_540E62A6-ABE9-413B-912E-4A9CEFD938B6  ONLINE       0     0     0  at disk16s2

errors: No known data errors
mrq:~ sadmin$


So "GPTE_7783397E-406F-4827-8840-852AB933499C" is the faulted disk, but how do I relate it back to a "diskn"?

I ask because ls -l /var/zfs/dsk gives:

Code:
mrq:~ sadmin$ ls -l /var/zfs/dsk
total 112
lrwxr-xr-x  1 root  wheel  12 28 Sep 18:28 GPTE_266D6073-2A1E-44C0-A3ED-1414A16F23CF -> /dev/disk0s2
lrwxr-xr-x  1 root  wheel  12 28 Sep 18:28 GPTE_2F013EDA-524B-42D9-A596-33561E3F0CB6 -> /dev/disk8s2
lrwxr-xr-x  1 root  wheel  12 28 Sep 18:28 GPTE_312E1FF8-5AD6-46A5-B349-CD198BB7650F -> /dev/disk6s2
lrwxr-xr-x  1 root  wheel  12 28 Sep 18:28 GPTE_5380700A-E849-46C6-AA3F-026249B0AAB4 -> /dev/disk9s2
lrwxr-xr-x  1 root  wheel  13 28 Sep 18:28 GPTE_53D7F6A4-0D57-4A11-AA7A-5384E1B698EB -> /dev/disk11s2
lrwxr-xr-x  1 root  wheel  13 28 Sep 18:28 GPTE_540E62A6-ABE9-413B-912E-4A9CEFD938B6 -> /dev/disk16s2
lrwxr-xr-x  1 root  wheel  12 28 Sep 18:28 GPTE_55B4FCD7-A340-48EF-A07B-6BAD50EA5B03 -> /dev/disk1s2
lrwxr-xr-x  1 root  wheel  12 28 Sep 18:28 GPTE_7ED54770-69A5-47DF-A88C-DB968EDCACE0 -> /dev/disk3s2
lrwxr-xr-x  1 root  wheel  12 28 Sep 18:28 GPTE_7FF8B766-B40D-427E-B4FB-6D06BE2D5C24 -> /dev/disk2s2
lrwxr-xr-x  1 root  wheel  13 28 Sep 18:28 GPTE_B8D4FD7B-665D-49A4-9B37-1DC548BE4EB8 -> /dev/disk15s2
lrwxr-xr-x  1 root  wheel  13 28 Sep 18:28 GPTE_BA31AC30-F529-4912-85E8-1295F4A0B581 -> /dev/disk14s2
lrwxr-xr-x  1 root  wheel  12 28 Sep 18:28 GPTE_C780F366-DACC-4353-A97E-70F543677D15 -> /dev/disk4s2
lrwxr-xr-x  1 root  wheel  13 28 Sep 18:28 GPTE_CCCBF756-1A5E-4CB0-B5B7-7D006B669870 -> /dev/disk13s2
lrwxr-xr-x  1 root  wheel  13 28 Sep 18:28 GPTE_D6722D1F-94C8-4197-8BAB-7084B3C0AA22 -> /dev/disk12s2
mrq:~ sadmin$


As you can see, the disk "GPTE_7783397E-406F-4827-8840-852AB933499C" is missing from the output ...

Please advise: I wanted to run dd if=/dev/diskn of=/dev/null to light up the drive's activity LED, so that I could find the drive and replace it.
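
A minimal sketch of that dd idea, assuming the candidate device node still exists and is readable (run it as root for raw disks). The function name blink_disk and the default of 1024 blocks are made up for illustration; the check up front reports the case where the node has gone missing.

```shell
#!/bin/sh
# Read from a disk so that its activity LED lights up.
# Fails early if the device node is missing or unreadable.
blink_disk() {
    dev=$1            # e.g. /dev/disk10 (hypothetical; use your own candidate)
    blocks=${2:-1024} # number of 1 MiB blocks to read
    if [ ! -r "$dev" ]; then
        echo "cannot read $dev: node missing, or insufficient permission" >&2
        return 1
    fi
    # bs is given in bytes because the 1m/1M suffix differs between BSD and GNU dd.
    dd if="$dev" of=/dev/null bs=1048576 count="$blocks" 2>/dev/null
}

# Usage: blink_disk /dev/disk10 4096
```

Reading only a bounded number of blocks keeps the LED busy for a while without having to interrupt dd by hand.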

Using zpool iostat -v Data, I can tell that the drive belongs to one of my 2TB raidz sets, but I don't know which of the six 2TB disks it is. Help!

I guess I'll have to determine diskn by elimination (I think it's disk10), but this is not ideal.
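
The elimination can be sketched as a small shell function, assuming the links in the directory look like GPTE_<uuid> -> /dev/diskNsM as in the listing above. The name unreferenced_disks and its two arguments are hypothetical. It prints every disk number up to a chosen maximum that no link points at.

```shell
#!/bin/sh
# List whole-disk numbers that no GPTE_* symlink in a directory points at.
# Assumption: link targets have the form /dev/diskNsM.
unreferenced_disks() {
    dir=$1   # directory of GPTE_* symlinks, e.g. /var/zfs/dsk
    max=$2   # highest disk number to consider
    # Collect the disk numbers that are referenced by some link.
    used=$(for link in "$dir"/GPTE_*; do
        [ -L "$link" ] || continue
        readlink "$link"
    done | sed -n 's|^/dev/disk\([0-9][0-9]*\)s[0-9][0-9]*$|\1|p')
    n=0
    while [ "$n" -le "$max" ]; do
        # Unquoted $used collapses the newline-separated list into one line.
        case " $(echo $used) " in
            *" $n "*) ;;               # referenced, skip
            *) echo "disk$n" ;;        # not referenced by any link
        esac
        n=$((n + 1))
    done
}

# Usage: unreferenced_disks /var/zfs/dsk 16
```

With the listing above and a maximum of 16, the unreferenced numbers are 5, 7 and 10. That list also covers any disk that is not part of the pool (presumably the boot drive, for one), so it narrows the search rather than settling it.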

Is it no longer being referenced because the OS can no longer see the drive (in which case I can't use dd anyway)?

FYI
I'm running Zevo CE ZFS 5/28 on a Hackintosh:
Intel X48BT2 mobo
16GB DDR3
boot is an 18GB SCSI U320

The ZFS pool consists of five raidz1 sets of three drives each, built from 1TB or 2TB drives.

uname:
Code:
mrq:~ sadmin$ uname -a
Darwin mrq.langford.lan 11.3.0 Darwin Kernel Version 11.3.0: Thu Jan 12 18:47:41 PST 2012; root:xnu-1699.24.23~1/RELEASE_X86_64 x86_64
mrq:~ sadmin$


ZFSDriver:

Version: 2012.09.14
Last Modified: 15/09/12 5:22 AM
Kind: Intel
Architectures: x86_64
64-Bit (Intel): Yes
Location: /System/Library/Extensions/ZFSDriver.kext
Kext Version: 2012.09.14
Load Address: 0xffffff7f80de0000
Valid: Yes
Authentic: Yes
Dependencies: Satisfied

ZFSFilesystem:

Version: 2012.09.14
Last Modified: 15/09/12 5:22 AM
Kind: Intel
Architectures: x86_64
64-Bit (Intel): Yes
Location: /System/Library/Extensions/ZFSFilesystem.kext
Kext Version: 2012.09.14
Load Address: 0xffffff7f80c43000
Valid: Yes
Authentic: Yes
Dependencies: Satisfied

Re: how does one identify GPTE back to /dev/diskn when fault

Post by BrianDieckman » Fri Oct 05, 2012 9:07 am

Use diskutil to find your drive.

Code:
diskutil list


This lists all your disks with their diskn identifiers.
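
To go from a diskn slice to a UUID, one possible sketch uses diskutil info rather than diskutil list. Two assumptions, neither confirmed in this thread: the field is labelled "Disk / Partition UUID:" in your version's output, and the GPTE name is built from that partition UUID. The function names are made up, and this only works for slices that are still visible to the OS.

```shell
#!/bin/sh
# Extract the partition UUID from `diskutil info` text on stdin.
# Assumption: the field is labelled "Disk / Partition UUID:"; check your own output.
partition_uuid() {
    sed -n 's/^[[:space:]]*Disk \/ Partition UUID:[[:space:]]*//p'
}

# Print every slice among the arguments whose partition UUID equals $1.
find_slice_by_uuid() {
    want=$1; shift
    for dev in "$@"; do
        got=$(diskutil info "$dev" 2>/dev/null | partition_uuid)
        [ "$got" = "$want" ] && echo "$dev"
    done
    return 0
}

# Usage (on the Mac itself):
#   find_slice_by_uuid 7783397E-406F-4827-8840-852AB933499C /dev/disk*s2
```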

Re: how does one identify GPTE back to /dev/diskn when fault

Post by tangles » Fri Oct 05, 2012 8:40 pm

Hi Brian,
Sure, diskutil will give me diskn, but it can't link that back to the GPTE labels.

That is the crux of the issue.

UUID of a slice of a disk in IODeviceTree.txt (sysdiagnose)

Post by grahamperrin » Sat Oct 06, 2012 12:19 am

On Lion or later, sysdiagnose may help.

Suggested actions

  1. run sysdiagnose
  2. when Finder presents the result, expand the archive
  3. from within the expanded results, open IODeviceTree.txt
  4. in the tree you might find a match for the GPT UUID 7783397E-406F-4827-8840-852AB933499C of your device with too many errors.

Hint: for the allmemory part of the routine, ignore Apple's suggestion of two minutes. Be patient for longer.
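
Step 4 can be done from Terminal instead of by eye. A rough sketch, assuming the entry that names the BSD device sits within a few dozen lines of the UUID in IODeviceTree.txt; the function name uuid_context and the default of 20 context lines are made up for illustration.

```shell
#!/bin/sh
# Show the lines around each occurrence of a UUID in a text file.
# Assumption: the device's BSD name appears within the context window.
uuid_context() {
    uuid=$1         # e.g. 7783397E-406F-4827-8840-852AB933499C
    file=$2         # path to the expanded IODeviceTree.txt
    lines=${3:-20}  # context lines before and after each match
    # -i ignores case, -n prefixes line numbers, -C prints surrounding lines.
    grep -i -n -C "$lines" -- "$uuid" "$file"
}

# Usage: uuid_context 7783397E-406F-4827-8840-852AB933499C IODeviceTree.txt
```

grep exits non-zero when the UUID does not appear at all, which is itself a useful answer.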

(For tangles's case in particular, there may be swifter approaches to identification. More generally, I encourage awareness of sysdiagnose.)

Re: how does one identify GPTE back to /dev/diskn when fault

Post by dbrady » Sat Oct 06, 2012 12:51 am

If the disk is available in /dev, you can try using zdb to find it:

Code:
$ sudo su
# ls /dev/disk*s2 | xargs -n 1 -t zdb -l | grep GPTE_7783397E-406F-4827-8840-852AB933499C
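
The same search can be written as a loop that prints only the matching device, which avoids reading the interleaved xargs -t output. This is a sketch under the same assumption as the one-liner, namely that the label dumped by zdb -l mentions the GPTE name. The function name find_vdev and the LABEL_CMD variable are made up; LABEL_CMD exists only so the loop can be tried on plain files.

```shell
#!/bin/sh
# Print each device whose label dump mentions the given vdev name.
# LABEL_CMD defaults to "zdb -l"; override it (e.g. LABEL_CMD=cat) to dry-run on files.
LABEL_CMD=${LABEL_CMD:-zdb -l}

find_vdev() {
    name=$1; shift
    for dev in "$@"; do
        # $LABEL_CMD is unquoted on purpose so "zdb -l" splits into command and flag.
        if $LABEL_CMD "$dev" 2>/dev/null | grep -q -- "$name"; then
            echo "$dev"
        fi
    done
}

# Usage (as root):
#   find_vdev GPTE_7783397E-406F-4827-8840-852AB933499C /dev/disk*s2
```

No output means no readable slice carries that label, which is what to expect if the device node has disappeared from /dev.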

Re: how does one identify GPTE back to /dev/diskn when fault

Post by tangles » Sat Oct 06, 2012 4:03 am

Thanks for the tips, guys.

Fortunately, my RocketRAID 2744 knew the drive was still there, and it displays the serial numbers of all the disks.

I simply shut down, worked out which of the 2TB disks it was by serial number, and swapped it for a fresh one.

The faulted disk was no longer visible in /dev, but the RocketRaid still knew about it so I was lucky there.

Is this why the BSD crew have that alias feature in their ZFS port, so that they can "label" disks, e.g. according to bay/slot numbers?

Once I knew which disk it was, zpool replace Data /dev/dsk/GPTE_7783397E-406F-4827-8840-852AB933499C /dev/disk10 got the resilver happening.

I'm really liking Zevo's new feature of not having to "prepare" drives anymore for use with ZFS! It reminds me of Bill Moore's talk years ago, in which he described how the ZFS command-line tools just "know" what you want done and so "take care" of all the low-level stuff for you.

Thank you very much for adding this feature.

R.

Raoul.

