Rsync difference between --checksum and --ignore-times options
Member August 21, 2020 at 4:57 pm
Can anyone clarify the differences between the --checksum and --ignore-times options of rsync?
My understanding is as follows:
--checksum: If the file size and time match, it will do a checksum at both ends to see if the files are really identical.
--ignore-times: ‘Transfer’ every file, regardless of whether the file time is the same at both ends. Since it will still use the delta-transfer algorithm, if a file actually is identical, nothing gets transferred.
That’s the technical difference, but as far as I can tell, they are semantically the same thing.
So, what I’m wondering is:
- What is the practical difference between the two options?
- In what cases would you use one rather than the other?
- Is there any performance difference between them?
- Member August 22, 2020 at 10:46 am
By default, rsync skips files when the files have identical sizes and times on the source and destination sides. This is a heuristic which is usually a good idea, as it prevents rsync from having to examine the contents of files that are very likely identical on the source and destination sides.
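To make the default heuristic concrete, here is a minimal Python sketch of the size-and-time check. It is only an illustration, not rsync's actual code: the function name quick_check_would_skip is hypothetical, it compares two local paths instead of metadata exchanged between sender and receiver, and it compares modification times in whole seconds as a simplification.

```python
import os


def quick_check_would_skip(src_path, dst_path):
    """Return True when size and mtime match on both sides.

    Hypothetical helper that models the default heuristic only.
    """
    if not os.path.exists(dst_path):
        # Nothing at the destination, so the file has to be sent.
        return False
    src = os.stat(src_path)
    dst = os.stat(dst_path)
    same_size = src.st_size == dst.st_size
    # Assumption: whole-second comparison, as a simplification.
    same_time = int(src.st_mtime) == int(dst.st_mtime)
    return same_size and same_time
```

Note that the file contents are never opened here, which is why the check is cheap and also why it can miss a file whose bytes changed without its size or time changing.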
--ignore-times tells rsync to turn off the file-times-and-sizes heuristic, and thus unconditionally transfer ALL files from source to destination. rsync will then proceed to read every file on the source side, since it will need to either use its delta-transfer algorithm, or simply send every file in its entirety, depending on whether the --whole-file option was specified.
--checksum also modifies the file-times-and-sizes heuristic, but here it ignores times and examines only sizes. Files on the source and destination sides that differ in size are transferred, since they are obviously different. Files with the same size are checksummed (with MD5 in rsync version 3.0.0+, or with MD4 in earlier versions), and those found to have differing sums are also transferred.
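The --checksum decision described above can be sketched in Python as follows. This is a rough model under stated assumptions, not rsync's implementation: md5_of and checksum_mode_would_transfer are hypothetical names, both files are read locally rather than hashed on separate machines, and MD5 is used because that is the algorithm named above.

```python
import hashlib
import os


def md5_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files are not loaded into memory at once."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_mode_would_transfer(src_path, dst_path):
    """Model the --checksum decision: times are ignored, sizes are checked first."""
    if not os.path.exists(dst_path):
        return True
    if os.path.getsize(src_path) != os.path.getsize(dst_path):
        # Sizes differ, so the files are obviously different: no hashing needed.
        return True
    # Same size: both sides must be read in full and hashed.
    return md5_of(src_path) != md5_of(dst_path)
```

The last line is where the cost comes from: every same-sized pair is read from start to end on both sides, even when the files turn out to be identical.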
In cases where the source and destination sides are mostly the same, --checksum will result in most files being checksummed on both sides. This could take a long time, but the upshot is that the barest minimum of data will actually be transferred over the wire, especially if the delta-transfer algorithm is used. Of course, this is only a win if you have very slow networks, and/or very fast CPUs.
--ignore-times, on the other hand, will send more data over the network, and it will cause all source files to be read, but at least it will not impose the additional burden of computing many cryptographically-strong hashsums on the source and destination CPUs. I would expect this option to perform better than --checksum when your networks are fast, and/or your CPU relatively slow.
I think I would only ever use --checksum or --ignore-times if I were transferring files to a destination where it was suspected that the contents of some files were corrupted, but whose modification times were not changed. I can’t really think of any other good reason to use either option, although there are probably other use-cases.
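As an illustration of that corruption scenario, the following Python sketch compares how the three modes would treat a destination file whose contents changed while its size and modification time stayed the same. FileState and selected_for_transfer are hypothetical names, and the files are modeled in memory; this only shows which files each mode hands to the transfer stage, not how much data then crosses the network.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class FileState:
    """Hypothetical in-memory stand-in for one file on one side."""
    mtime: int
    content: bytes


def selected_for_transfer(mode, src, dst):
    """Model whether a file is handed to the transfer stage in each mode."""
    same_size = len(src.content) == len(dst.content)
    if mode == "default":
        # Size-and-time heuristic: skip when both match.
        return not (same_size and src.mtime == dst.mtime)
    if mode == "checksum":
        # Times ignored; same-sized files are compared by hash.
        if not same_size:
            return True
        return hashlib.md5(src.content).digest() != hashlib.md5(dst.content).digest()
    if mode == "ignore-times":
        # Heuristic switched off: every file is selected.
        return True
    raise ValueError(f"unknown mode: {mode}")


good = FileState(mtime=1598000000, content=b"hello world")
corrupt = FileState(mtime=1598000000, content=b"hellO world")

for mode in ("default", "checksum", "ignore-times"):
    print(mode, selected_for_transfer(mode, good, corrupt))
```

In this model the default mode skips the corrupted file, while both options select it for transfer, which is why either one can repair silent corruption.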