svn hotcopy incremental overwrites existing revisions in backup


lumi
I use the "svnadmin hotcopy --incremental" command to create backups (Subversion 1.9.5). I discovered that this command re-copies revision files in db/revs/ that are already in the backup when their size exceeds roughly 120 KB. Backup log (the first backup was made to an empty folder, the following ones to the same folder):
C:\Users\Администратор.WIN-DBM2OE9OJ54>svnadmin hotcopy D:\Repositories\Sandbox D:\Test --incremental
* Copied revision 0.
* Copied revision 1.
* Copied revision 2.
* Copied revision 3.
* Copied revision 4.
* Copied revision 5.
* Copied revision 6.
* Copied revision 7.
* Copied revision 8.
* Copied revision 9.
* Copied revision 10.
* Copied revision 11.
* Copied revision 12.
* Copied revision 13.
* Copied revision 14.
* Copied revision 15.
* Copied revision 16.
* Copied revision 17.
* Copied revision 18.
* Copied revision 19.
* Copied revision 20.
* Copied revision 21.
* Copied revision 22.
* Copied revision 23.
* Copied revision 24.
* Copied revision 25.
* Copied revision 26.
* Copied revision 27.
* Copied revision 28.
* Copied revision 29.
* Copied revision 30.
* Copied revision 31.
* Copied revision 32.
* Copied revision 33.
* Copied revision 34.
* Copied revision 35.
* Copied revision 36.
* Copied revision 37.
* Copied revision 38.
* Copied revision 39.
* Copied revision 40.
* Copied revision 41.
* Copied revision 42.
* Copied revision 43.
* Copied revision 44.
* Copied revision 45.
* Copied revision 46.
* Copied revision 47.
* Copied revision 48.
* Copied revision 49.
* Copied revision 50.
* Copied revision 51.
* Copied revision 52.
* Copied revision 53.
* Copied revision 54.
* Copied revision 55.

C:\Users\Администратор.WIN-DBM2OE9OJ54>svnadmin hotcopy D:\Repositories\Sandbox D:\Test --incremental
* Copied revision 14.
* Copied revision 21.
* Copied revision 22.

C:\Users\Администратор.WIN-DBM2OE9OJ54>svnadmin hotcopy D:\Repositories\Sandbox D:\Test --incremental
* Copied revision 14.
* Copied revision 21.
* Copied revision 22.


And so on with each subsequent hotcopy --incremental command. A binary comparison of the revision 14, 21, and 22 files in the original repository and in the backup shows that they are identical. What is the reason for this strange behaviour?

Re: svn hotcopy incremental overwrites existing revisions in backup

Daniel Shahaf-2
lumi wrote on Tue, May 16, 2017 at 06:21:06 -0700:

> C:\Users\Администратор.WIN-DBM2OE9OJ54>svnadmin hotcopy
> D:\Repositories\Sandbox D:\Test --incremental
> * Copied revision 14.
> * Copied revision 21.
> * Copied revision 22.
>
> C:\Users\Администратор.WIN-DBM2OE9OJ54>svnadmin hotcopy
> D:\Repositories\Sandbox D:\Test --incremental
> * Copied revision 14.
> * Copied revision 21.
> * Copied revision 22.
>
> And so on with each subsequent hotcopy --incremental command. A binary
> comparison of the revision 14, 21, and 22 files in the original repository
> and in the backup shows that they are identical. What is the reason for
> this strange behaviour?

I can't reproduce this:

% rm -rf r d
% svnadmin create r
% repeat 100 svnmucc put -mm -U file://$PWD/r =(dd if=/dev/urandom bs=1k count=200 2>/dev/null) f$RANDOM.$RANDOM  >/dev/null
% svnadmin hotcopy --incremental r d  >/dev/null
% svnadmin hotcopy --incremental r d
% svnadmin hotcopy --incremental r d
% svnadmin hotcopy --incremental r d
% svnadmin hotcopy --incremental r d
%

If you delete D:\Test and run the 'hotcopy' command three more times,
does it say 14, 21, 22 in those times too?

What filesystem is D:?  Is it NTFS, or a network drive, or…?

Re: svn hotcopy incremental overwrites existing revisions in backup

Stefan Sperling
In reply to this post by lumi
On Tue, May 16, 2017 at 06:21:06AM -0700, lumi wrote:
> And so on with each subsequent hotcopy --incremental command. A binary
> comparison of the revision 14, 21, and 22 files in the original repository
> and in the backup shows that they are identical. What is the reason for
> this strange behaviour?

The only possible reasons are a size mismatch or a timestamp mismatch
on the affected files.

Re: svn hotcopy incremental overwrites existing revisions in backup

lumi
In reply to this post by Daniel Shahaf-2
NTFS with deduplication enabled (Windows Server 2016). The problem files have the APL attributes (Archive, SparseFile, ReparsePoint), which I guess means that the files take part in deduplication. A hotcopy of the repository to a WebDAV network drive gives exactly the same result, which means the problem lies in the source files.

Re: svn hotcopy incremental overwrites existing revisions in backup

lumi
In reply to this post by Stefan Sperling
A size mismatch definitely takes place. The actual size is normal, but the size on disk is unrealistic, again because of deduplication.
[Screenshot: FileSize.png]

Re: svn hotcopy incremental overwrites existing revisions in backup

Stefan Sperling
On Tue, May 16, 2017 at 10:57:44AM -0700, lumi wrote:
> A size mismatch definitely takes place. The actual size is normal, but the
> size on disk is unrealistic, again because of deduplication.
> <http://subversion.1072662.n5.nabble.com/file/n198986/FileSize.png>

The whole point of a hotcopy is to have a 1-to-1 bit-identical backup.
If the NTFS filesystem which stores the backup is de-duplicating files
in a way that makes their filesize change, then incremental hotcopy
cannot work. By design, incremental hotcopy compares the size and
timestamp to see if a revision file must be copied again.
So if you really want to use svnadmin hotcopy you should disable the
de-duplication feature on the target filesystem.
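
As a rough illustration of the size-and-timestamp comparison described above, the check can be sketched as follows. This is not Subversion's actual code (Subversion is written in C and gets file information through APR); the function names `needs_recopy` and `stale_revisions` are hypothetical, and Python's `os.stat` merely stands in for the APR call.

```python
from __future__ import annotations

import os
from pathlib import Path


def needs_recopy(src: Path, dst: Path) -> bool:
    """Return True if dst is missing or differs from src in size or mtime."""
    if not dst.exists():
        return True
    s, d = os.stat(src), os.stat(dst)
    # Only metadata is compared; the file contents are never read.
    return s.st_size != d.st_size or s.st_mtime_ns != d.st_mtime_ns


def stale_revisions(src_dir: Path, dst_dir: Path) -> list[str]:
    """List the file names in src_dir that the metadata check would copy again."""
    return sorted(
        p.name
        for p in src_dir.iterdir()
        if p.is_file() and needs_recopy(p, dst_dir / p.name)
    )
```

With a check of this kind, a file whose reported size or timestamp never matches its copy is flagged on every run, even when the contents are byte-for-byte identical, which would be consistent with the repeated "Copied revision 14/21/22" lines in the log.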

But there are other tools you could use for backup purposes instead,
such as svnadmin dump/load and svnsync (e.g. with file:// URLs).
These should work fine with NTFS de-duplication enabled on backup storage.
See http://svnbook.red-bean.com/nightly/en/svn.reposadmin.maint.html#svn.reposadmin.maint.migrate
and http://svnbook.red-bean.com/nightly/en/svn.reposadmin.maint.html#svn.reposadmin.maint.replication

Re: svn hotcopy incremental overwrites existing revisions in backup

lumi
Isn't it a mistake in the method of getting the file size on a deduplicated volume? What I showed in the screenshots is what Windows Explorer reports. Other applications show only one size value; for example, PowerShell's Get-ItemProperty returns only the actual size, whether or not the volume is deduplicated, and this size is always the same.
PS C:\Users\Администратор.WIN-DBM2OE9OJ54> Get-ItemProperty -Path D:\Repositories\Sandbox\db\revs\0\14


    Directory: D:\Repositories\Sandbox\db\revs\0


Mode                LastWriteTime         Length Name                                                                                    
----                -------------         ------ ----                                                                                    
-a---l       16.03.2017     13:00         126748 14                                                                                      



PS C:\Users\Администратор.WIN-DBM2OE9OJ54> Get-ItemProperty -Path Y:\RepositoriesBackup\Daily\Sandbox\db\revs\0\14


    Directory: Y:\RepositoriesBackup\Daily\Sandbox\db\revs\0


Mode                LastWriteTime         Length Name                                                                                    
----                -------------         ------ ----                                                                                    
-a----       16.05.2017     11:08         126748 14    
                                                                                 

The first is the deduplicated file on the local drive; the second is the hotcopied file on the WebDAV network drive.
I would like to clarify that deduplication is applied on the source file system, not on the target.
For now I will try to temporarily (I hope temporarily) disable deduplication on the volume that holds the repositories. But it is quite a nice feature to use, especially with FSFS repositories, where revision files never change once they are created.
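
To make the two notions of size in this post concrete, here is a minimal sketch that reads both the logical length of a file and the storage allocated to it. The helper names `logical_size` and `allocated_size` are hypothetical. On Windows the sketch calls the Win32 function `GetCompressedFileSizeW`, which reports the allocated size of compressed and sparse files; whether that matches the "size on disk" Explorer showed for a deduplicated file is an assumption, not something confirmed in this thread.

```python
import ctypes
import os
import sys


def logical_size(path: str) -> int:
    """Length of the file's data in bytes, as reported by stat."""
    return os.stat(path).st_size


def allocated_size(path: str) -> int:
    """Bytes of storage actually allocated to the file."""
    if sys.platform == "win32":
        kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
        fn = kernel32.GetCompressedFileSizeW
        fn.argtypes = [ctypes.c_wchar_p, ctypes.POINTER(ctypes.c_uint32)]
        fn.restype = ctypes.c_uint32
        high = ctypes.c_uint32(0)
        ctypes.set_last_error(0)
        low = fn(path, ctypes.byref(high))
        # 0xFFFFFFFF means failure only if the last error is also non-zero.
        if low == 0xFFFFFFFF and ctypes.get_last_error() != 0:
            raise ctypes.WinError(ctypes.get_last_error())
        return (high.value << 32) | low
    # POSIX: st_blocks counts 512-byte units.
    return os.stat(path).st_blocks * 512


if __name__ == "__main__":
    for name in sys.argv[1:]:
        print(name, logical_size(name), allocated_size(name))
```

Run with one or more file paths as arguments, it prints both values per file, so the two numbers can be compared for a revision file on the source volume and for its copy in the backup.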

Re: svn hotcopy incremental overwrites existing revisions in backup

Stefan Sperling
On Tue, May 16, 2017 at 12:02:38PM -0700, lumi wrote:
> Isn't it a mistake in the method of getting the file size on a deduplicated volume?

Subversion asks APR (a portability library) for the filesize.
APR does something to find that size. Subversion uses the value reported
by APR, and Subversion does not care about how APR figured it out.

So if there is a problem with how the size is determined on Windows with
NTFS de-duplication enabled, then this problem is probably located in APR
and should be fixed there. The APR project is at https://apr.apache.org

That said, if you know of a way to find the correct size with the win32 API
we could probably patch Subversion to bypass APR for this specific case.
But APR would have to be fixed anyway.

Re: svn hotcopy incremental overwrites existing revisions in backup

Branko Čibej
On 16.05.2017 21:28, Stefan Sperling wrote:

> On Tue, May 16, 2017 at 12:02:38PM -0700, lumi wrote:
>> Isn't it a mistake in the method of getting the file size on a deduplicated volume?
> Subversion asks APR (a portability library) for the filesize.
> APR does something to find that size. Subversion uses the value reported
> by APR, and Subversion does not care about how APR figured it out.
>
> So if there is a problem with how the size is determined on Windows with
> NTFS de-duplication enabled, then this problem is probably located in APR
> and should be fixed there. The APR project is at https://apr.apache.org
>
> That said, if you know of a way to find the correct size with the win32 API
> we could probably patch Subversion to bypass APR for this specific case.
> But APR would have to be fixed anyway.

I suspect the ReparsePoint attribute on the file is what actually makes
APR hiccup. A "reparse point" is distantly related to a unix symlink ...
it tells the file-system path resolver to restart with a different path.
I bet that what APR reports is the size of the reparse-point record instead
of the size of the target file, but when we open the file we get the
actual file contents.

-- Brane
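
The Unix symlink analogy above can be tried directly. The sketch below is only an illustration of that analogy on a Unix-like system and says nothing about what APR actually does on Windows; the helper name `sizes` and the file names are made up for the example. `os.lstat` describes the link record itself, while `os.stat` follows the link to the target.

```python
from __future__ import annotations

import os
import tempfile


def sizes(link_path: str) -> tuple[int, int]:
    """Return (size of the link record itself, size of the file it points to)."""
    return os.lstat(link_path).st_size, os.stat(link_path).st_size


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "14")
    link = os.path.join(tmp, "14.link")
    with open(target, "wb") as f:
        # Same length as the revision file listed earlier in the thread.
        f.write(b"\0" * 126748)
    os.symlink(target, link)
    link_size, target_size = sizes(link)
    print("link record:", link_size, "target file:", target_size)
```

The two numbers differ because they describe different objects. A tool that compares a size obtained one way with a size obtained the other way would see a mismatch even though the data it reads after opening the file is the same.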