Trying to figure out if there is a way to do this without having to zfs send a ton of data. I have:
s/test1, inside it are folders: folder1, folder2
I have this pool backed up remotely by sending snapshots.
I’d like to split this up into:
s/test1, inside is folder: folder1
s/test2, inside is folder: folder2
I’m trying to figure out if there is some combination of clone and promote that would limit the amount of data needed to be sent over the network.
Or maybe there is some record/replay method I could do on snapshots that I’m not aware of.
Thoughts?
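For reference, one way the clone idea could look, as a rough and untested sketch rather than a confirmed answer. It relies on the fact that a clone can be sent incrementally from its origin snapshot (zfs send -i pool/fs@origin pool/clone@snap). The snapshot names, the ssh host backuphost, the remote parent dataset backup, and the default mountpoints are all assumptions made up for illustration. The script only prints the commands, it does not run them.

```shell
#!/bin/sh
# Dry-run sketch: only PRINTS the commands, nothing is executed.
# Assumptions (not from the thread): snapshot names "split" and "post-split",
# ssh host "backuphost", remote parent dataset "backup", and default
# mountpoints (/s/test1, /s/test2).
plan_split() {
    src=$1      # existing dataset, e.g. s/test1
    new=$2      # dataset to create as a clone, e.g. s/test2
    keep=$3     # folder that stays only in $src
    move=$4     # folder that ends up only in $new
    host=$5     # hypothetical ssh target
    rparent=$6  # hypothetical remote parent dataset

    echo "zfs snapshot ${src}@split"
    echo "# replicate ${src}@split to ${host} with your usual incremental send here"
    # A clone shares all blocks with its origin snapshot, so no data is copied.
    echo "zfs clone ${src}@split ${new}"
    # Each dataset drops the folder it should no longer carry.
    echo "rm -rf /${new}/${keep}"
    echo "rm -rf /${src}/${move}"
    echo "zfs snapshot ${new}@post-split"
    echo "zfs snapshot ${src}@post-split"
    # Incremental from the origin snapshot: only the deletions should travel.
    echo "zfs send -i ${src}@split ${new}@post-split | ssh ${host} zfs receive ${rparent}/${new##*/}"
    echo "zfs send -i ${src}@split ${src}@post-split | ssh ${host} zfs receive ${rparent}/${src##*/}"
}

plan_split s/test1 s/test2 folder1 folder2 backuphost backup
```

One caveat with this approach: the space held by the deleted folders is not freed while the @split snapshot exists, and that snapshot cannot be destroyed while the clone depends on it. As far as I know, zfs promote only swaps which dataset owns the origin snapshot, it does not remove the dependency.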
I can’t think of a way offhand to match your scenario, but I’ve heard ideas suggested that come close. This is exactly the type of question you should ask at practicalzfs.com.
If you don’t know it, that’s Jim Salter’s forum (author of sanoid and syncoid) and there are some sharp ZFS experts hanging out there.
Great, thank you.
what is your goal with this?
do you still want to keep all the data in a single pool?
if so, you could make datasets in the pool, and move the top directories into the datasets. datasets are basically dirs that can have special settings on how they are handled
ninja edit: now that I think about it, moving across datasets probably causes that data to be resent.
it would be easier to give advice by knowing why you want to do this
Yea, your edit is the problem unfortunately. Moving across datasets would incur disk reads/writes and sending of terabytes of data.
The goal in separating them out is that I want to be able to independently zfs send folder 1 somewhere without including folder 2. Poor choice of dataset layout when I built the array.
hmm I see. and why do you want that? balancing storage usage between backup sites? one of them is too small for the whole pool?
for now I don’t have a better idea, sorry. maybe this is the second best time to think up a structure for the datasets, and move everything into it.
but if the reason is the latter, one backup site can’t hold the whole pool, you may need to reorganize it again in the future. and that’s not an easy thing, because now you’ll have the same data (files of the same category) scattered around the FS tree even locally. maybe you could ease that with something like mergerfs, and having it write each file to the dataset with lower storage usage.
if you are ready to reorganize, think about what kinds (and subkinds) of files you are likely to store in larger amounts, like media/video, media/image, and don’t forget to take advantage of per-dataset storage settings, like compression, recordsize, maybe caching. not everything needs its own custom recordsize, but for contiguously read files a higher value might be better, also if the data is not accessed too often and you want a better compression ratio, as compression (and checksumming!) happens per record. video is sometimes compressible, or rather some larger data blob inside the container
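To make the per-dataset settings concrete, here is a small sketch of what such a layout could look like. The dataset names follow the media/video and media/image example above. The property values (recordsize=1M, zstd, lz4) are only illustrative assumptions, not recommendations from this thread, and zstd assumes OpenZFS 2.0 or newer. It prints the commands by default instead of running them.

```shell
#!/bin/sh
# Dry-run by default: ZFS expands to "echo zfs", so commands are only printed.
# Run with ZFS=zfs in the environment to execute them for real.
ZFS=${ZFS:-echo zfs}

create_layout() {
    pool=$1   # pool name, "s" in this thread

    # Parent dataset; -p creates missing parents and succeeds if it exists.
    $ZFS create -p "${pool}/media"
    # Large, contiguously read files: bigger records, stronger compression.
    $ZFS create -o recordsize=1M -o compression=zstd "${pool}/media/video"
    # Smaller files: default recordsize with cheap compression.
    $ZFS create -o compression=lz4 "${pool}/media/image"
    # Show the resulting settings for the whole subtree.
    $ZFS get -r recordsize,compression "${pool}/media"
}

create_layout s
```

Keep in mind that recordsize and compression only apply to data written after the property is set, so files already on disk keep their old layout until they are rewritten.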