[OT] ZFS on Linux

Well, thanks to Andrew G and others who mentioned ZFS in the SATA disks thread.

I've been playing with it and - wow, it is impressive.

Installing on Debian is a breeze.

It seems perfectly happy on partitions, which is just as well, because it would be insane to run / on it - primarily because few rescue USB images would cope.

So I went to my old favourite:

/dev/sd[abcd]1 -> md-raid1 -> ext2 -> /boot
/dev/sd[abcd]2 -> md-raid1 -> ext4 -> /
/dev/sd[abcd]3 -> md-raid1 -> SWAP

and /dev/sd[abcd]4 for ZFS as a RAIDZ1 setup.
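For the record, creating that pool is a one-liner - a sketch only, with the four disk IDs abbreviated to placeholders rather than the full scsi-SATA_... names:

  zpool create tank1 raidz1 \
    /dev/disk/by-id/DISK-A-part4 \
    /dev/disk/by-id/DISK-B-part4 \
    /dev/disk/by-id/DISK-C-part4 \
    /dev/disk/by-id/DISK-D-part4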

After the manual step of making sure grub was installed on all 4 disks, I have "destroyed"[1] disk 1 and re-added it, then "destroyed" disk 2, repaired it, pulled another disk and just re-added that.

[1] Boot from rescue, zero 1GB of the ZFS partition, then zero 1GB of the front of the disk.
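In case it's useful, the zeroing is just dd - a sketch, assuming the victim disk is /dev/sda (substitute the real device):

  dd if=/dev/zero of=/dev/sda4 bs=1M count=1024   # zero 1GB of the ZFS partition
  dd if=/dev/zero of=/dev/sda bs=1M count=1024    # zero 1GB of the front of the disk (partition table, grub, md superblocks)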

I must admit, the re-adding was a little weird:

zpool offline tank1 scsi-SATA_WDC_WD20EFRX-68_WD-WMC4M3224833-part4
zpool online tank1 scsi-SATA_WDC_WD20EFRX-68_WD-WMC4M3224833-part4
zpool scrub tank1

I was expecting:

zpool replace tank1 scsi-SATA_WDC_WD20EFRX-68_WD-WMC4M3224833-part4

but it did not like that.
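(For what it's worth, my understanding is that the single-argument form of replace is for a genuinely new disk in the same slot, and there is a two-argument form for swapping in a different device - a sketch, NEW-DISK being a hypothetical placeholder:

  zpool replace tank1 scsi-SATA_WDC_WD20EFRX-68_WD-WMC4M3224833-part4 /dev/disk/by-id/NEW-DISK-part4

Since my "destroyed" disk was the same device coming back, offline/online/scrub presumably made more sense to it.)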

However, I had rsync'd a load of (expendable) stuff onto it beforehand, so a re-rsync showed me that the data had survived.
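(For the curious, one way to do that check is a checksum-forcing dry run - paths hypothetical:

  rsync -avc --dry-run /srv/stuff/ /tank1/stuff/

If nothing is listed for transfer, the two copies still match byte-for-byte.)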

Very very cool.

Nice that I can divvy up lots of "filesystems", add flexible quotas to each, and not in effect have lots of bits of wasted space all over the place as with LVM.
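For anyone following along, the divvying-up is just this - dataset names hypothetical:

  zfs create tank1/home
  zfs create tank1/media
  zfs set quota=100G tank1/home    # cap this dataset; the rest of the pool stays shared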

Reply to
Tim Watts

Google 'ZFS horror' before you get too happy.

The incidence of people saying 'never ever use ZFS because...' is enough for me to have noticed it.

Reply to
The Natural Philosopher

Ok will do - I am not yet committed...

Reply to
Tim Watts

To be honest the only one I can find that's vaguely worrying is:

formatting link

and that's old.

Googling and limiting to the last 12 months - I really can't find anything that worries me. And I do have real automated backups.

Bear in mind I have gone through 2 full disk replacement cycles - sadly I cannot hot-pull a disk as the HP doesn't support that (don't want to mash the electronics).

If you want true horror it was ReiserFS circa 2003 - loads of lab PCs and the odd server coming up (or rather not) with bits of /etc/passwd in /sbin/init !!

Reply to
Tim Watts

In article , Tim Watts writes

Colleague has just lost several TB of data in a ZFS pool. Machine hung, needed power cycling; on coming back up the zpool had vanished. No backup.

It went to a data recovery firm, after looking at it they quoted £5k /per hour/ and they cannot guarantee to get anything back.

Google for 'zpool vanished' - seems to be a common problem.

Too bleeding edge for me.

Reply to
Mike Tomlinson

You don't want to try btrfs then!

Reply to
Tim Watts

Yes. Look, on the face of it ZFS is bleeding-edge brilliant.

What they are trying to do makes huge sense, BUT because they are using an abstraction layer above the disk to make all the good stuff happen, if you lose that, it's a helluva job to get the data back.

The theory is that you won't ever lose it and can plug disks in and out at will. And that makes a lotta sense on a 24-disk array with four gigabit Ethernet links in it, coupled to 20 Sun CPUs grinding away at a database.

I looked at RAID and various novel filesystems here, and decided that actually I didn't need high AVAILABILITY or massive performance beyond my ageing 100Mbps switch anyway: what I needed was a simple way to handle the inevitable loss of a single disk.

So I picked a trad system and two disks, one of which mirrors the other every night.

So by all means play with ZFS, BUT I would recommend you mirror the whole ZFS array every night onto something slow and conventional that is likely NOT to be writing data when the power goes down and wipes the ZFS array clean.

The key point is that RAID is an availability solution, not a backup solution, and ZFS seems to be the ultimate logical extension of RAID.

If you want a backup solution, don't bother with RAID; just mirror the data with e.g. rsync under cron once a day (or night).
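The crontab entry for that is one line - a sketch, with the paths and time made up:

  # mirror the data tree to the backup disk at 02:30 every night
  30 2 * * * rsync -a --delete /data/ /mnt/backup/data/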

Reply to
The Natural Philosopher

All good points. I have a backup that runs rsnapshot 4 times a day - so I'm pretty bombproof :) I also mirror my most critical data to my laptop ('cos I use it there).
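(For reference, the schedule is just cron driving rsnapshot - a sketch, the interval names being whatever your rsnapshot.conf defines:

  0 */6 * * * /usr/bin/rsnapshot hourly   # every 6 hours = 4 times a day
  30 23 * * * /usr/bin/rsnapshot daily    # one daily rotation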

Reply to
Tim Watts

I'm assuming all these horror stories are on Linux (or rather, non-Solaris)?

We've run ZFS for all disks for many years, and have many many TB (if not PB) of data on it and I don't think we've ever had a failure. It has saved us many times over though :-) We often take snapshots - frequently on machines with churn we'll take them automatically every 20 mins or similar.
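(The periodic snapshots are a one-liner from cron - a sketch, dataset name hypothetical; note that % must be escaped as \% in a crontab:

  */20 * * * * /usr/sbin/zfs snapshot tank/data@auto-$(date +\%Y\%m\%d-\%H\%M)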

This is all on Sun machines, and to be fair, much is backed onto large HDS disk arrays which are resilient in their own right.

I'm interested in ZFS on Linux though - we have a couple of large Solaris systems that we are migrating to RedHat (we are moving off Solaris entirely) and these particular systems have many TB of files and, more of an issue, many many millions of files. Migrating those will be awkward... hence my interest in importing a zpool from the Solaris machine onto RHEL if it's an option. Sounds like it might be a bit hairy on something other than Solaris though :-(
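(The mechanics would presumably be export/import - a sketch, pool name hypothetical, and this assumes the pool version/features on the Solaris side are ones ZFS-on-Linux understands, which is exactly the hairy bit:

  zpool export bigpool        # on the Solaris box
  zpool import bigpool        # on the RHEL box, once it can see the disks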

Darren

Reply to
D.M.Chapman

Oddly enough I can find very few on Linux, and a good few of those are "sysadmin f***ed up".

A fair few are on Mac OS X and others on some BSD NAS variant.

Reply to
Tim Watts

I believe some of those files are mine! :-)

But I mirror them all offsite..

Reply to
Bob Eager

Look m8, I really don't know. I am not speaking from experience, but from hearsay. As I said, it didn't enter my thinking because it's not summat that offered me more than what I have.

BUT there have been troubles. Somehow it seems you can lose all your data irrecoverably. IF you have a proper non-ZFS backup in place or all the UPS stuff, then fine.

I just wanted to flag that up so you could do your own research etc etc.

Reply to
The Natural Philosopher

Heh, maybe :-)

The majority of them (10s of millions) form the British Cartoon Archive

formatting link

Each hi-res image is tiled into many small images. There are many thousands of huge images...

So do we, but we let the disk array do that for us :)

Reply to
D.M.Chapman

I believe so. And likely reverse engineered.

Isn't everyone? :o)

Reply to
Huge

What's performance like? Last time I tried btrfs, on SSD it was slower than ext3 on HDD. Doing apt-get upgrade was an absolute killer. I'd be interested to know how ZFS on Linux compares.

Theo

Reply to
Theo Markettos

"bonnie++ -f" tests I did last week:

formatting link

You want to look at the "godzilla" machine tests - same hardware, lots of configurations, various ZFS setups vs LVM+MDRAID+XFS/EXT2/4

All "HDD" tests were using WD RED SATA drives (2TB). "USB3" tests were a pair of Sandisk Extreme USB3 pendrives.

One high-power VMware ESXi system and one Linode.com VM for comparison with "real man's systems".

The numbers mean sod all - it's the relative numbers you should consider.
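(For anyone wanting to reproduce: bonnie++'s -f skips the slow per-character tests - a sketch, with directory, size and user made up:

  bonnie++ -f -d /tank1/bench -s 16384 -u nobody   # 16GB test file, i.e. twice the RAM of an 8GB box

with -s at least twice RAM so the page cache can't flatter the numbers.)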

Reply to
Tim Watts

Very useful, thanks. So about a factor of 2 drop in IOPS, but a factor of 10 on the write/read latency (I think).

It's random write that tends to be the tricky one - on HDDs it causes lots of seeking about, while flash devices (SSDs, SD cards etc) get their knickers in a twist trying to erase blocks. Obviously it depends on your workload, but it tends to match apt-get upgrade and big compile runs, which is where I start feeling the performance hit.

Theo

Reply to
Theo Markettos

Have a look at NexentaStor, which is a storage appliance distro based on Illumos, the open-source continuation of what was Sun Solaris/OpenSolaris, which has had many of the key ZFS and Solaris kernel developers working on it for well over 3 years now. There's also a server distro, OmniOS, again based on Illumos.

With these, you have the inherited goodness of Sun Solaris, and full support available for some of the products/distros if you need it, without having to work with Oracle or be tied to Oracle hardware. You can build out your own Software Defined Storage architecture.

[Disclaimer - I work for Nexenta]
Reply to
Andrew Gabriel

Heh, aware of it. Trying to consolidate on RHEL and VMware with management from a single Satellite instance, so would rather stick to the one OS where possible but...

Ah, I'd missed that. Congrats on the escape ;-)

Darren

Reply to
D.M.Chapman

This falls between two stools for me, since it's way OTT for home and I have little or no input to the decision-making process at work. I am thinking about what to upgrade my very elderly Ubuntu to, though.

Reply to
Huge
