ZFS Help (if possible)


#1

This is what happened:

  • I had two new 1TB drives on which I installed TrueOS as a ZFS mirror (zmirror).

  • I pulled the drives (because I got some SSDs to install on…)

  • I had two additional new 1TB drives, so I used all 4 to create a RAIDZ1 pool.

    • I did not dd (erase) the previously used drives before using them to create the RAIDZ1 pool.

  • I restored about 140GB of data to the new RAIDZ1 pool. (zfs receive)

  • Then … a failure occurred:

    ] ~% zpool status jpool
    pool: jpool
    state: UNAVAIL
    status: One or more devices could not be opened. There are insufficient
    replicas for the pool to continue functioning.
    action: Attach the missing device and online it using 'zpool online'.
    see: http://illumos.org/msg/ZFS-8000-3C
    scan: none requested
    config:

          NAME                      STATE     READ WRITE CKSUM
          jpool                     UNAVAIL      0     0     0
            raidz1-0                UNAVAIL      0     0     0
              4275143066245817878   UNAVAIL      0     0     0  was /dev/ada1
              10708658390192165714  UNAVAIL      0     0     0  was /dev/ada0
              ada0                  ONLINE       0     0     0
              ada1                  ONLINE       0     0     0
    

    ada0: ATA8-ACS SATA 3.x device
    ada0: Serial Number JR10004M2YR8KE
    ada0: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
    ada0: Command Queueing enabled
    ada0: 953869MB (1953525168 512 byte sectors)
    ada1 at siisch1 bus 0 scbus3 target 0 lun 0
    ada1: ATA8-ACS SATA 3.x device
    ada1: Serial Number JR1020BN0XG3AM
    ada1: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
    ada1: Command Queueing enabled
    ada1: 953869MB (1953525168 512 byte sectors)
    ada2 at siisch2 bus 0 scbus4 target 0 lun 0
    ada2: ATA8-ACS SATA 3.x device
    ada2: Serial Number JR100XD30GW59E
    ada2: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
    ada2: Command Queueing enabled
    ada2: 953869MB (1953525168 512 byte sectors)
    ada3 at siisch3 bus 0 scbus5 target 0 lun 0
    ada3: ATA8-ACS SATA 3.x device
    ada3: Serial Number JR1000BN1U7P7E
    ada3: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
    ada3: Command Queueing enabled
    ada3: 953869MB (1953525168 512 byte sectors)

    GEOM: ada2: the primary GPT table is corrupt or invalid.
    GEOM: ada2: using the secondary instead -- recovery strongly advised.
    GEOM: ada3: the primary GPT table is corrupt or invalid.
    GEOM: ada3: using the secondary instead -- recovery strongly advised.

I am not sure how to recover these devices, or whether they can be recovered at all.

-Ben


#2

In my opinion you cannot recover your pool: you have two missing devices in a RAIDZ1 pool, and RAIDZ1 can only survive the loss of one.

IMHO this is the problem. When I recycle an HDD for ZFS without completely erasing it with dd, I always use the zpool labelclear command on every partition/slice on the disk to remove old labels. For example, if I wanted to recycle my laptop HDD:

$ gpart show  ada0
=>       34  976773101  ada0  GPT  (466G)
         34          6        - free -  (3.0K)
         40        512     5  freebsd-boot  (256K)
        552       1496        - free -  (748K)
       2048    1021952     1  ms-recovery  (499M)
    1024000     614400     2  efi  (300M)
    1638400     262144     3  ms-reserved  (128M)
    1900544  442443777     4  ms-basic-data  (211G)
  444344321          7        - free -  (3.5K)
  444344328  478773248     6  freebsd-zfs  (228G)
  923117576    4194304     9  freebsd-swap  (2.0G)
  927311880      94201        - free -  (46M)
  927406081   47269888     7  ms-recovery  (23G)
  974675969    2097152     8  ms-recovery  (1.0G)
  976773121         14        - free -  (7.0K)

I would use ten zpool labelclear commands:

$ sudo zpool labelclear -f /dev/ada0
$ sudo zpool labelclear -f /dev/ada0p1
$ sudo zpool labelclear -f /dev/ada0p2
...
$ sudo zpool labelclear -f /dev/ada0p9

Probably some of these commands will fail (on partitions that never held a ZFS label), but that isn't a problem.
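
If you don't want to type all ten by hand, a small loop does the same thing. This is just a sketch, assuming a Bourne-style shell (sh/bash) and the ada0 disk from the example above:

# Clear any old ZFS label on the whole disk and on every partition node it has.
# Iterations on partitions that never held a ZFS label will fail harmlessly.
for dev in /dev/ada0 /dev/ada0p*; do
    sudo zpool labelclear -f "$dev"
done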


#3

@bforest I’m going to have to agree with @maurizio on this. Michael W Lucas has the clearest statement on it in one of his books:
“A RAIDZ1 VDEV can withstand the failure of any single storage provider. If a second provider fails before the first failed drive is replaced, all data is lost”.

Clearing the labels and the partitioning will help: the gpart destroy command should work. Something like gpart destroy -F ada0, to use @maurizio's example above (the -F forces destruction even though the table still contains partitions). If you do that you shouldn't need the individual zpool labelclear commands, but if you're a belt-and-suspenders guy, you could do both.
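
If you do want both, the order matters: clear the labels while the partition device nodes still exist, then destroy the table. A rough sketch, again using the ada0 disk from the example above:

sudo zpool labelclear -f /dev/ada0p6   # repeat for each partition gpart shows
sudo gpart destroy -F ada0             # then drop the partition table itself
sudo zpool labelclear -f /dev/ada0     # and clear any label on the raw disk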

Now, @bforest, about your fourth bullet, restoring the data to the new pool via zfs receive: if you still have that snapshot available on another system, you can simply rebuild everything and redo the send.
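
That resend is just the reverse of what you did the first time. A sketch with made-up dataset and host names (adjust to your setup), run on the machine that still has the snapshot:

# srcpool/data@backup and newbox are placeholders; jpool is the rebuilt pool.
zfs send -R srcpool/data@backup | ssh newbox zfs receive -F jpool/data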

RAIDZ1 can only tolerate a single device failure, no matter how many disks are in the VDEV. It's good for creating a big pool: one disk's worth of capacity goes to parity and the rest is available for data, which means your four 1TB disks give about 3TB of usable storage. Read/write performance is basically the same as a single device.

Other configurations will let you lose more than one device at the cost of overall space, so maybe another layout would work better? Something like a pool of mirrored pairs (striped mirrors)?
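
For comparison, creating each of those layouts with four disks looks roughly like this (pool and device names are just examples):

zpool create tank raidz1 ada0 ada1 ada2 ada3          # ~3TB usable, survives 1 failure
zpool create tank raidz2 ada0 ada1 ada2 ada3          # ~2TB usable, survives 2 failures
zpool create tank mirror ada0 ada1 mirror ada2 ada3   # ~2TB usable, striped mirrors (1 failure per pair)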


#4

Wow… I did NOT know this, and I was thinking of setting up a 4-disk (1TB each) ZFS pool in the near future.

So then, would it not be better to create two separate mirrors of 2 disks each?

J.


#5

It all depends on what you need and what you want to accomplish.

Mirrors offer no space improvement, but good reliability and typically better read performance. ZFS lets you have more than 2 disks in a mirror (which is a quick and dirty trick to back things up :) ).
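
That trick is basically: attach a third disk to an existing mirror, let it resilver, then detach it and put it on a shelf. A sketch with assumed pool and device names:

zpool attach tank ada0 ada2    # ada2 becomes a third member of ada0's mirror
zpool status tank              # wait until the resilver finishes
zpool detach tank ada2         # then detach the disk and store it offline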

With 4 drives you can do RAIDZ2 (2 parity drives), which lets you lose 2, but you get 50% space efficiency (with 4x1TB drives you have 2TB of storage).
With 5 drives you can do RAIDZ3, which would let you lose up to 3 drives.
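
The five-drive RAIDZ3 case would be created like this (pool and device names assumed):

zpool create tank raidz3 ada0 ada1 ada2 ada3 ada4   # ~2TB usable from five 1TB drives, survives 3 failures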

It's a tradeoff of data integrity vs. storage space and performance. Don't forget there may be some configurations you can't boot from.

This is true of any other kind of RAID, not unique to ZFS.


#6

Aha! It's a surprise a minute around here.

So I don’t have to make two separate mirrors with the 4 disks,
I can make one big mirror with all 4 of them.

Am I correct that 3 could fail and I would still get to keep the data (quickly getting new drives and resilvering, of course)?

If that's correct, then I don't quite see the advantage of RAIDZ1, but that's really a ZFS question rather than a TrueOS question, and I don't want to distract from the main show.

J.


#7

In theory, yes, you could do that: put all 4 drives in a single mirror. But if you did that with 4 1TB drives, you would have only 1TB of space.
RAIDZ1 with those same 4 drives would give you 3TB of space, and one could fail and it would still work. As long as you replace the single failed drive quickly, you're fine.
RAIDZ2 gives you 2TB of space and you can have 2 fail.
Add one more drive (total of 5) and you can do RAIDZ3: a total of 2TB of space, but you could lose 3.
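
Putting those numbers side by side (all with 1TB drives):

Layout         Drives  Usable  Failures tolerated
4-way mirror   4       1TB     3
RAIDZ1         4       3TB     1
RAIDZ2         4       2TB     2
RAIDZ3         5       2TB     3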

That’s why you need to figure out what is right for you.


#8

A very good document at the FreeNAS forum:
Slideshow explaining VDev, zpool, ZIL and L2ARC and other newbie mistakes!


#9

Thanks!!

J.