Reputation: 4950
Am I misinterpreting iostat results, or is it really writing just 3.06 MB per second?
# zpool iostat -v 60
               capacity     operations    bandwidth
pool        alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
zfs-backup   356G   588G    465     72  1.00M  3.11M
  xvdf       356G   588G    465     72  1.00M  3.11M
----------  -----  -----  -----  -----  -----  -----
               capacity     operations    bandwidth
pool        alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
zfs-backup   356G   588G    568     58  1.26M  3.06M
  xvdf       356G   588G    568     58  1.26M  3.06M
----------  -----  -----  -----  -----  -----  -----
Currently rsync is writing files from another HDD (ext4). Given our file characteristics (~50 KB files), the math seems to check out: 3.06 * 1024 / 58 ≈ 54 KB per write operation.
For the record, these are the relevant dataset properties (they can be verified as shown right after the list):
primarycache=metadata
compression=lz4
dedup=off
checksum=on
relatime=on
atime=off
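A minimal sketch of how these settings can be double-checked with zfs get, assuming they are set on the pool's root dataset (adjust the name if a child dataset is used):
# zfs get primarycache,compression,dedup,checksum,relatime,atime zfs-backup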
The server is on EC2, currently 1 core and 2 GB RAM (t2.small); the HDD is the cheapest volume Amazon offers. The OS is Debian Jessie, with zfs-dkms installed from the Debian testing repository.
If it's really that slow, then why? Is there a way to improve performance without moving everything to SSD and adding 8 GB of RAM? Can ZFS perform well on a VPS at all, or was it designed with bare metal in mind?
EDIT
I've added a 5 GB general-purpose SSD volume to be used as a ZIL, as suggested in the answers. That didn't help much, as the ZIL doesn't seem to be used at all. 5 GB should be more than enough in my use case, since according to the Oracle documentation the log device needs to be at most half the size of physical RAM (here, 1 GB).
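For reference, a sketch of how such a log device is typically attached (assuming the SSD shows up as xvdg, as it does in the output below):
# zpool add zfs-backup log xvdg    # attach the SSD as a separate log (SLOG) device
After that, the device is listed under a separate logs section in the zpool iostat output: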
               capacity     operations    bandwidth
pool        alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
zfs-backup   504G   440G     47     36   272K  2.74M
  xvdf       504G   440G     47     36   272K  2.74M
logs            -      -      -      -      -      -
  xvdg          0  4.97G      0      0      0      0
----------  -----  -----  -----  -----  -----  -----
               capacity     operations    bandwidth
pool        alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
zfs-backup   504G   440G     44     37   236K  2.50M
  xvdf       504G   440G     44     37   236K  2.50M
logs            -      -      -      -      -      -
  xvdg          0  4.97G      0      0      0      0
----------  -----  -----  -----  -----  -----  -----
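One thing worth checking is whether the workload issues synchronous writes at all, since asynchronous writes bypass the log device entirely; a sketch, assuming default settings on the pool's root dataset:
# zfs get sync,logbias zfs-backup
With sync=standard (the default), only explicit O_SYNC/fsync writes go through the ZIL/SLOG, which would explain xvdg staying idle for a plain rsync workload.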
EDIT
A dd test shows pretty decent speed.
# dd if=/dev/zero of=/mnt/zfs/docstore/10GB_test bs=1M count=10240
10240+0 records in
10240+0 records out
10737418240 bytes (11 GB) copied, 29.3561 s, 366 MB/s
However, the zpool iostat output hasn't changed much bandwidth-wise. Note the higher number of write operations.
# zpool iostat -v 10
               capacity     operations    bandwidth
pool        alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
zfs-backup   529G   415G      0     40  1.05K  2.36M
  xvdf       529G   415G      0     40  1.05K  2.36M
logs            -      -      -      -      -      -
  xvdg          0  4.97G      0      0      0      0
----------  -----  -----  -----  -----  -----  -----
               capacity     operations    bandwidth
pool        alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
zfs-backup   529G   415G      2    364  3.70K  3.96M
  xvdf       529G   415G      2    364  3.70K  3.96M
logs            -      -      -      -      -      -
  xvdg          0  4.97G      0      0      0      0
----------  -----  -----  -----  -----  -----  -----
               capacity     operations    bandwidth
pool        alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
zfs-backup   529G   415G      0    613      0  4.48M
  xvdf       529G   415G      0    613      0  4.48M
logs            -      -      -      -      -      -
  xvdg          0  4.97G      0      0      0      0
----------  -----  -----  -----  -----  -----  -----
               capacity     operations    bandwidth
pool        alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
zfs-backup   529G   415G      0    490      0  3.67M
  xvdf       529G   415G      0    490      0  3.67M
logs            -      -      -      -      -      -
  xvdg          0  4.97G      0      0      0      0
----------  -----  -----  -----  -----  -----  -----
               capacity     operations    bandwidth
pool        alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
zfs-backup   529G   415G      0    126      0  2.77M
  xvdf       529G   415G      0    126      0  2.77M
logs            -      -      -      -      -      -
  xvdg          0  4.97G      0      0      0      0
----------  -----  -----  -----  -----  -----  -----
               capacity     operations    bandwidth
pool        alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
zfs-backup   529G   415G      0     29    460  1.84M
  xvdf       529G   415G      0     29    460  1.84M
logs            -      -      -      -      -      -
  xvdg          0  4.97G      0      0      0      0
----------  -----  -----  -----  -----  -----  -----
Upvotes: 0
Views: 2625
Reputation: 617
Can it perform well on VPS at all, or was ZFS designed with bare metal in mind?
Yes to both.
Originally it was designed for bare metal, and that is where you naturally get the best performance and the full feature set (otherwise you have to trust the underlying storage, for example that writes are really committed to disk when synchronized writes are requested). It is quite flexible, though, as your vdevs can consist of any files or devices you have available; of course, performance can only be as good as the underlying storage.
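As an illustration of that flexibility, a throwaway pool can even be built on a plain file; a sketch for testing only (path and pool name are placeholders):
# truncate -s 1G /tmp/vdev0.img          # create a 1 GB sparse file to act as a vdev
# zpool create testpool /tmp/vdev0.img   # file-backed pools are fine for experiments, not production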
To be more precise: in general, all small sync writes below a certain size are additionally collected in the ZIL before being written to disk from RAM, which happens either every five seconds or after about 4 GB of data, whichever comes first (all of those parameters can be modified). This is done because the sync write can be acknowledged as soon as it is safely recorded in the ZIL, while the actual data is later flushed from RAM to the pool in large, efficient transaction groups; after a crash, the ZIL is replayed to recover any acknowledged sync writes that had not yet reached the pool.
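Which writes take that path can be steered per dataset via the sync property; a sketch (the dataset name is just an example):
# zfs get sync zfs-backup              # standard: only O_SYNC/fsync writes use the ZIL
# zfs set sync=always zfs-backup       # treat every write as synchronous (all data goes through the ZIL/SLOG)
# zfs set sync=disabled zfs-backup     # never use the ZIL (fast, but unsafe for applications that rely on fsync)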
Normally the ZIL resides on the pool itself, which should be protected by using redundant vdevs, making the whole operation very resilient against power loss, disk crashes, bit errors and so on. The downside is that the pool disks have to perform those small random writes before they can flush the same data in a more efficient, continuous transfer; therefore it is recommended to move the ZIL onto another device, usually called an SLOG device (Separate LOG device). This can be another disk, but an SSD performs much better for this workload (and will wear out pretty fast, as most transactions go through it). If you never experience a crash, your SSD will never be read, only written to.
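A sketch of attaching an SLOG, here as a mirrored pair so the log itself is redundant (pool and device names are placeholders):
# zpool add tank log mirror sdb sdc    # dedicated, mirrored log devices
# zpool status tank                    # they show up under a separate "logs" section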
Upvotes: 2
Reputation: 19573
This particular problem may be due to a noisy neighbor. Since it's a t2 instance, you will end up with the lowest priority; in that case you can stop and start your instance to get a new host.
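A sketch of that stop/start cycle with the AWS CLI (the instance ID is a placeholder; a stop/start, unlike a reboot, typically lands the instance on different hardware):
# aws ec2 stop-instances --instance-ids i-0123456789abcdef0
# aws ec2 wait instance-stopped --instance-ids i-0123456789abcdef0
# aws ec2 start-instances --instance-ids i-0123456789abcdef0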
Unless you are using instance storage (which is not really an option for t2 instances anyway), all disk writes go to what are essentially SAN volumes. The network interface to the EBS system is shared by all instances on the same host, and the size of the instance determines its priority.
If you are writing from one volume to another, you are passing all read and write traffic over the same interface.
There may be other factors at play, depending on which volume types you use and whether you have any CPU credits left on your t2 instance.
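A sketch of checking the remaining credit balance via CloudWatch (the instance ID and time range below are placeholders):
# instance ID and times are placeholders
# aws cloudwatch get-metric-statistics --namespace AWS/EC2 \
    --metric-name CPUCreditBalance \
    --dimensions Name=InstanceId,Value=i-0123456789abcdef0 \
    --start-time 2016-06-01T00:00:00Z --end-time 2016-06-01T06:00:00Z \
    --period 300 --statistics Average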
Upvotes: 1