[PATCH] improved behaviour of tools/backup/hot-backup.py

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
13 messages Options
Reply | Threaded
Open this post in threaded view
|

[PATCH] improved behaviour of tools/backup/hot-backup.py

Roberto Reale
Hi everybody,

enclosed is a puny patch I wrote against the script
tools/backup/hot-backup.py.in.

When creating a backup archive of a live repository, the "legacy"
behaviour of the script is simply to call tar.add() on a hotdump copy.
This has, however, an inconvenience when the original repository is on
a POSIX filesystem and the hotcopy dump is written e.g. on a CIFS
filesystem, thus leading to a possible "permission leakage".

My patch changes this behaviour by letting the script build up the
archive file by file, and paying special care in copying the
permissions for each file from the repository itself rather than from
the hotcopy dump.

Do you think it would be useful?

Regards,
Roberto

hot-backup.py.in.diff (2K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] improved behaviour of tools/backup/hot-backup.py

Roberto Reale
Do you think the patch can be improved in any regards?

I would like to "breed it up", so to say, to an acceptable status.

Regards,
Roberto

On Fri, Oct 2, 2015 at 5:44 PM, Roberto Reale <[hidden email]> wrote:

> Hi everybody,
>
> enclosed is a puny patch I wrote against the script
> tools/backup/hot-backup.py.in.
>
> When creating a backup archive of a live repository, the "legacy"
> behaviour of the script is simply to call tar.add() on a hotdump copy.
> This has, however, an inconvenience when the original repository is on
> a POSIX filesystem and the hotcopy dump is written e.g. on a CIFS
> filesystem, thus leading to a possible "permission leakage".
>
> My patch changes this behaviour by letting the script build up the
> archive file by file, and paying special care in copying the
> permissions for each file from the repository itself rather than from
> the hotcopy dump.
>
> Do you think it would be useful?
>
> Regards,
> Roberto
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] improved behaviour of tools/backup/hot-backup.py

Philip Martin-4
Roberto Reale <[hidden email]> writes:

> Do you think the patch can be improved in any regards?
>
> I would like to "breed it up", so to say, to an acceptable status.

Your patch assumes that all the files in the backup are still present in
the source repository and that is not always true.  For example pack on
a FSFS repository could remove revision files, a commit to a BDB
repository could delete a log file.  Any such files will cause archive
creation to fail with your patch.

--
Philip Martin
WANdisco
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] improved behaviour of tools/backup/hot-backup.py

Roberto Reale
Which behaviour would you deem correct?

Just assigning a default permission set when the original file cannot
be found, or inferring it from the file type (e.g., revision files
should not be executable)?

Regards,
Roberto

On Thu, Oct 22, 2015 at 1:12 PM, Philip Martin
<[hidden email]> wrote:

> Roberto Reale <[hidden email]> writes:
>
>> Do you think the patch can be improved in any regards?
>>
>> I would like to "breed it up", so to say, to an acceptable status.
>
> Your patch assumes that all the files in the backup are still present in
> the source repository and that is not always true.  For example pack on
> a FSFS repository could remove revision files, a commit to a BDB
> repository could delete a log file.  Any such files will cause archive
> creation to fail with your patch.
>
> --
> Philip Martin
> WANdisco
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] improved behaviour of tools/backup/hot-backup.py

Roberto Reale
I've endeavoured to improve on my previous patch, by adopting a
heuristic approach as follows:

1) First and foremost, for each file in the backup copy I try and
assign permissions based on those of the corresponding file in the
source repository, if the latter does indeed exist.
2) Should this fail, I infer the permissions from the context, i.e. I
choose one "typical file" from the same folder.
3) Should the folder itself be empty, I assign permissions based on
the current umask.

Do you deem such an approach to be correct?

Regards,
Roberto

On Thu, Oct 22, 2015 at 4:20 PM, Roberto Reale
<[hidden email]> wrote:

> Which behaviour would you deem correct?
>
> Just assigning a default permission set when the original file cannot
> be found, or inferring it from the file type (e.g., revision files
> should not be executable)?
>
> Regards,
> Roberto
>
> On Thu, Oct 22, 2015 at 1:12 PM, Philip Martin
> <[hidden email]> wrote:
>> Roberto Reale <[hidden email]> writes:
>>
>>> Do you think the patch can be improved in any regards?
>>>
>>> I would like to "breed it up", so to say, to an acceptable status.
>>
>> Your patch assumes that all the files in the backup are still present in
>> the source repository and that is not always true.  For example pack on
>> a FSFS repository could remove revision files, a commit to a BDB
>> repository could delete a log file.  Any such files will cause archive
>> creation to fail with your patch.
>>
>> --
>> Philip Martin
>> WANdisco

hot-backup.py.in.diff (4K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] improved behaviour of tools/backup/hot-backup.py

Daniel Shahaf-2
Roberto Reale wrote on Mon, Oct 26, 2015 at 20:56:30 +0100:

> I've endeavoured to improve on my previous patch, by adopting a
> heuristic approach as follows:
>
> 1) First and foremost, for each file in the backup copy I try and
> assign permissions based on those of the corresponding file in the
> source repository, if the latter does indeed exist.
> 2) Should this fail, I infer the permissions from the context, i.e. I
> choose one "typical file" from the same folder.
> 3) Should the folder itself be empty, I assign permissions based on
> the current umask.
>
> Do you deem such an approach to be correct?

A bit late, but:

I think (3) should fall back to $REPOS_DIR/db/fs-type (which is
guaranteed to exist) rather than to umask.  (The umask has the right
dimension — permission bits — but using it might surprise users.)

Of academic interest [if you accept the change in the above paragraph]:
the os.listdir() call had an O(N²) cost where N is the number of files
in a directory, aka, was inefficient for unsharded FSFS repositories.

I don't have any other comments, but I haven't read the patch closely
enough to be comfortable committing it, either.  Please feel free to
file an issue in our bug tracker about this patch, so we don't forget
it: https://issues.apache.org/jira/browse/SVN

Cheers,

Daniel

Reply | Threaded
Open this post in threaded view
|

RE: [PATCH] improved behaviour of tools/backup/hot-backup.py

Bert Huijben-5


> -----Original Message-----
> From: Daniel Shahaf [mailto:[hidden email]]
> Sent: maandag 21 december 2015 15:58
> To: Roberto Reale <[hidden email]>
> Cc: Philip Martin <[hidden email]>;
> [hidden email]
> Subject: Re: [PATCH] improved behaviour of tools/backup/hot-backup.py

>
> I think (3) should fall back to $REPOS_DIR/db/fs-type (which is
> guaranteed to exist) rather than to umask.  (The umask has the right
> dimension — permission bits — but using it might surprise users.)

Are you sure this 'must exist'?

When I accidentally tried to use a wrong directory some weeks ago (while working on libsvn_fs_git) I found that when that file is missing we assume that the filesystem is BDB.

        Bert


Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] improved behaviour of tools/backup/hot-backup.py

Daniel Shahaf-2
On Mon, Dec 21, 2015, at 15:16, Bert Huijben wrote:

>
>
> > -----Original Message-----
> > From: Daniel Shahaf [mailto:[hidden email]]
> > Sent: maandag 21 december 2015 15:58
> > To: Roberto Reale <[hidden email]>
> > Cc: Philip Martin <[hidden email]>;
> > [hidden email]
> > Subject: Re: [PATCH] improved behaviour of tools/backup/hot-backup.py
>
> >
> > I think (3) should fall back to $REPOS_DIR/db/fs-type (which is
> > guaranteed to exist) rather than to umask.  (The umask has the right
> > dimension — permission bits — but using it might surprise users.)
>
> Are you sure this 'must exist'?
>
> When I accidentally tried to use a wrong directory some weeks ago (while
> working on libsvn_fs_git) I found that when that file is missing we assume
> that the filesystem is BDB.

Without checking the code, I suspect you're right, and that file only exists
for repositories created by svn 1.1 or later...

Good catch :-)

Daniel
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] improved behaviour of tools/backup/hot-backup.py

Roberto Reale
Of course I would be glad to modify the patch, but... shall we content ourselves with the somewhat elusive fs-type? :)

Roberto

On Mon, Dec 21, 2015 at 4:23 PM, Daniel Shahaf <[hidden email]> wrote:
On Mon, Dec 21, 2015, at 15:16, Bert Huijben wrote:
>
>
> > -----Original Message-----
> > From: Daniel Shahaf [mailto:[hidden email]]
> > Sent: maandag 21 december 2015 15:58
> > To: Roberto Reale <[hidden email]>
> > Cc: Philip Martin <[hidden email]>;
> > [hidden email]
> > Subject: Re: [PATCH] improved behaviour of tools/backup/hot-backup.py
>
> >
> > I think (3) should fall back to $REPOS_DIR/db/fs-type (which is
> > guaranteed to exist) rather than to umask.  (The umask has the right
> > dimension — permission bits — but using it might surprise users.)
>
> Are you sure this 'must exist'?
>
> When I accidentally tried to use a wrong directory some weeks ago (while
> working on libsvn_fs_git) I found that when that file is missing we assume
> that the filesystem is BDB.

Without checking the code, I suspect you're right, and that file only exists
for repositories created by svn 1.1 or later...

Good catch :-)

Daniel

Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] improved behaviour of tools/backup/hot-backup.py

Daniel Shahaf-2
Roberto Reale wrote on Mon, Dec 21, 2015 at 16:25:10 +0100:
> Of course I would be glad to modify the patch, but... shall we content
> ourselves with the somewhat elusive fs-type? :)

Using fs-type would break backwards compatibility with BDB repositories
created by svnadmin 1.0.x (or earlier) that have never been dump/load-ed.

Now, since hot-backup.py was in tools/ in 1.0, it's formally covered by
the same backwards compatibility promise as everything else.

However, I don't think there are many people who use a repository
created by 1.0 that has never been dump/load-ed by any later version of
svnadmin; that is, I think few if any people would be affected by
breaking compatibility in this specific case.  (And those who *would* be
affected would receive an error message, as opposed to silent misbehaviour.)

So, how about we (a) make the patch use fs-type, (b) issue an erratum¹
stating the breakage and providing the workaround?

Alternatively, we could just find some file that's guaranteed to exist
in all 1.0 BDB repositories and use that file if fs-type is missing.

Cheers,

Daniel

¹ See notes/api-errata/.
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] improved behaviour of tools/backup/hot-backup.py

Roberto Reale
To begin with, I edited the patch to implement Daniel's (a)-proposal.

Cheers,
Roberto

On Wed, Dec 23, 2015 at 10:11 AM, Daniel Shahaf <[hidden email]> wrote:

> Roberto Reale wrote on Mon, Dec 21, 2015 at 16:25:10 +0100:
>> Of course I would be glad to modify the patch, but... shall we content
>> ourselves with the somewhat elusive fs-type? :)
>
> Using fs-type would break backwards compatibility with BDB repositories
> created by svnadmin 1.0.x (or earlier) that have never been dump/load-ed.
>
> Now, since hot-backup.py was in tools/ in 1.0, it's formally covered by
> the same backwards compatibility promise as everything else.
>
> However, I don't think there are many people who use a repository
> created by 1.0 that has never been dump/load-ed by any later version of
> svnadmin; that is, I think few if any people would be affected by
> breaking compatibility in this specific case.  (And those who *would* be
> affected would receive an error message, as opposed to silent misbehaviour.)
>
> So, how about we (a) make the patch use fs-type, (b) issue an erratumน
> stating the breakage and providing the workaround?
>
> Alternatively, we could just find some file that's guaranteed to exist
> in all 1.0 BDB repositories and use that file if fs-type is missing.
>
> Cheers,
>
> Daniel
>
> น See notes/api-errata/.

hot-backup.py.in.diff (4K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] improved behaviour of tools/backup/hot-backup.py

Daniel Shahaf-2
Roberto Reale wrote on Wed, Dec 23, 2015 at 11:22:14 +0100:
> To begin with, I edited the patch to implement Daniel's (a)-proposal.

> +      # "typical file" from the same folder (case C).  Should the folder itself
> +      # be empty, we assign permissions based on those of a file which is
> +      # guaranteed to exist (case F).
...
> +    # (T) Assign permissions based on those of the file db/fs-type in
> +    #     the source repository.

"(case F)" and "(T)" refer to the same thing, one of them is a typo.

Could you please submit the patch to the issue tracker?

Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] improved behaviour of tools/backup/hot-backup.py

Roberto Reale
Hi folks,

I am going to submit the patch to the issue tracker. Am I right?

Roberto

On Mon, Dec 28, 2015 at 4:01 PM, Daniel Shahaf <[hidden email]> wrote:

> Roberto Reale wrote on Wed, Dec 23, 2015 at 11:22:14 +0100:
>> To begin with, I edited the patch to implement Daniel's (a)-proposal.
>
>> +      # "typical file" from the same folder (case C).  Should the folder itself
>> +      # be empty, we assign permissions based on those of a file which is
>> +      # guaranteed to exist (case F).
> ...
>> +         # (T) Assign permissions based on those of the file db/fs-type in
>> +         #     the source repository.
>
> "(case F)" and "(T)" refer to the same thing, one of them is a typo.
>
> Could you please submit the patch to the issue tracker?
>