wildcard authz docs question


Daniel Shahaf-2
From the 1.10 draft release notes:

> All wildcards apply to full path segments only, i.e. * never matches
> /, except for the case where /**/ matches zero or more path segments.
> For example, /*/**/* will match any path which contains at least
> 2 segments and is equivalent to /**/*/* as well as /*/*/**.

Are «/*/**/*» «/**/*/*» «/*/*/**» really equivalent?  I would have
expected the first two to match any node except / and /'s immediate
children, but I wouldn't expect the third form to match /trunk/iota
where iota is a file, since the pattern has a trailing slash after the
non-optional second component.

Testing this in
    cd $(mktemp -d)
    mkdir -p foo/bar
, I see that neither vim nor zsh finds any matches for */*/**, meaning
they don't interpret ** as "zero or more" path components in this
pattern.  I suppose they only treat ** in this way when it appears with
slashes immediately before and after it.
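
The release-note wording can be checked against a small model. The following is only a sketch in Python of the quoted rule, assuming `*` matches exactly one path segment and `**` matches zero or more whole segments. It is not Subversion's matcher, and `segment_match` is a name made up for this illustration.

```python
from fnmatch import fnmatchcase


def _split(path):
    """Split a path or pattern into its non-empty segments."""
    return [seg for seg in path.split("/") if seg]


def _match(pat, segs):
    if not pat:
        return not segs
    head, rest = pat[0], pat[1:]
    if head == "**":
        # "**" may swallow zero, one, or many whole segments.
        return any(_match(rest, segs[i:]) for i in range(len(segs) + 1))
    # Any other pattern segment must match exactly one path segment.
    return bool(segs) and fnmatchcase(segs[0], head) and _match(rest, segs[1:])


def segment_match(pattern, path):
    """True if path matches pattern under the segment-wise model above."""
    return _match(_split(pattern), _split(path))


if __name__ == "__main__":
    for pattern in ("/*/**/*", "/**/*/*", "/*/*/**"):
        print(pattern,
              segment_match(pattern, "/trunk/iota"),
              segment_match(pattern, "/trunk"))
```

Under this reading all three patterns accept /trunk/iota and reject /trunk, which is the equivalence the release notes claim. The model has no notion of file versus directory, which is exactly the point in question.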

Cheers,

Daniel

Re: wildcard authz docs question

Doug Robinson
Daniel:

The shells all treat ** as * and require that it match something.  So
"mkdir -p foo/bar/baz" would match.

I would expect "/*/**/*", "/**/*/*" and "/*/*/**" to all match exactly the
same sets of components.

No command shell that I know of (sh, bash, zsh, tcsh, csh, ksh) has a
moral equivalent to "zero or more path components".  Perl, Python,
et al. do.

Cheers,

Doug




--
DOUGLAS B. ROBINSON SENIOR PRODUCT MANAGER




Learn how WANdisco Fusion solves Hadoop data protection and scalability challenges

Listed on the London Stock Exchange: WAND

THIS MESSAGE AND ANY ATTACHMENTS ARE CONFIDENTIAL, PROPRIETARY, AND MAY BE PRIVILEGED.  If this message was misdirected, WANdisco, Inc. and its subsidiaries, ("WANdisco") does not waive any confidentiality or privilege.  If you are not the intended recipient, please notify us immediately and destroy the message without disclosing its contents to anyone.  Any distribution, use or copying of this e-mail or the information it contains by other than an intended recipient is unauthorized.  The views and opinions expressed in this e-mail message are the author's own and may not reflect the views and opinions of WANdisco, unless the author is authorized by WANdisco to express such views or opinions on its behalf.  All email sent to or from this address is subject to electronic storage and review by WANdisco.  Although WANdisco operates anti-virus programs, it does not accept responsibility for any damage whatsoever caused by viruses being passed.


Re: wildcard authz docs question

Daniel Shahaf-2
Doug Robinson wrote on Tue, Mar 21, 2017 at 11:40:50 -0400:
> Daniel:
>
> The shells all treat ** as * and require that it match something.  So
> "mkdir -p foo/bar/baz" would match.
>
> No command shell that I know of (sh, bash, zsh, tcsh, csh, ksh) has a
> moral equivalent to "zero or more path components".  Perl, Python,
> et al. do.

zsh interprets ** as meaning "zero or more path components" when it's
followed by a slash:

    % mkdir -p foo/bar
    % echo */**
    foo/bar
    % echo */**/
    foo/ foo/bar/

I looked up the Python and Perl equivalents, but the Python one has
a bug (the pattern '*/*/**' finds 'trunk/iota/' — with a trailing
slash — even if trunk/iota is a file) and I found no Perl equivalent in
its stdlib's File::Glob, so I couldn't compare against either of them.

> I would expect "/*/**/*", "/**/*/*" and "/*/*/**" to all match exactly the
> same sets of components.

Then our expectations are different as to what */*/** should mean.  Can
you give an example of a tool where ./*/*/** matches ./trunk/iota when
iota is a file (not a directory)?  As I said in my previous mail,
neither vim nor zsh (which, to clarify, both support a ** recursion
operator) matches ./trunk/iota in that situation.

Thanks for jumping in.

Cheers,

Daniel



Re: wildcard authz docs question

Doug Robinson
Daniel:

Sorry for the delay - I missed the post.

And, I'm going to recant my original conclusion - my apologies for not treating this with sufficient rigor the first time around.

Wow - it's been a long time since I played with zsh.  Yep, I see the reference to "**/" being equivalent to "(*/)#".  So, playing with this, the zsh man page is a bit loose with the term "file" where it is really talking about a "directory entry":

    # ls -dl **/foo
    drwxr-xr-x 2 root root 4096 Mar 28 14:32 a/b/c/foo
    # ls -dl **/bar
    -rw-r--r-- 1 root root 0 Mar 28 14:32 a/b/c/bar
    # ls -dl **/baz
    lrwxrwxrwx 1 root root 10 Mar 28 14:36 a/b/c/baz -> a/b/c/biff

That said, in discussions I've had I think about the SVN regex "**" differently than the zsh construct.  The way that I interpret "/**" is "everything below and including slash" - so "**" is the moral equivalent of Perl's ".*" wildcard.  It need not be followed by any terminal pattern to match anything - since it matches them all.  If it was followed by something then that something would be required.

So let me break the 3 patterns down:

/*/*/**   This requires 2 directories. It will match all directories 2 levels down - and then everything in all of the rest of those trees however deep.  It should not, however, match a file or symlink in a directory, e.g. "/dirA/fileB".  Whereas it will match "/dirA/dirB" along with "/dirA/dirB/fileC", etc.

/*/**/*   This requires 1 directory and then something else.  It will match "/dirA/fileB" or "/dirA/symlinkX" since "/**" can simply go to nothing.  Or perhaps a different way to look at it is that "/**" can match "/" which, in its simplest will mean "/*/**/*" becomes "/*//*" and given that multiple '/' always collapse to a single '/' in "path arithmetic" becomes "/*/*" for its shortest match.

/**/*/*   This requires 1 directory and then something else.  Pretty much the same as the prior example and for the same reasons.
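
The three breakdowns above can be written down as a sketch. This is a Python illustration of the directory-aware reading only, not Subversion code: the caller is assumed to know whether the queried path is a directory (files and symlinks both pass False), and `match_with_kind` is a made-up name.

```python
def match_with_kind(pattern, path, path_is_dir):
    """Segment-wise match where a trailing "/**" requires a directory."""
    pat = [seg for seg in pattern.split("/") if seg]
    segs = [seg for seg in path.split("/") if seg]

    def rec(pat, segs):
        if not pat:
            return not segs
        head, rest = pat[0], pat[1:]
        if head == "**":
            if not rest and not segs:
                # A trailing "/**" that consumes nothing leaves the path
                # itself in front of the slash, so it must be a directory.
                return path_is_dir
            return any(rec(rest, segs[i:]) for i in range(len(segs) + 1))
        # "*" matches any single segment; anything else is compared literally.
        return bool(segs) and head in ("*", segs[0]) and rec(rest, segs[1:])

    return rec(pat, segs)


if __name__ == "__main__":
    print(match_with_kind("/*/*/**", "/dirA/fileB", False))
    print(match_with_kind("/*/*/**", "/dirA/dirB", True))
    print(match_with_kind("/*/**/*", "/dirA/fileB", False))
```

With this reading "/*/*/**" rejects "/dirA/fileB" but accepts "/dirA/dirB" and everything below it, while "/*/**/*" and "/**/*/*" accept "/dirA/fileB" because the "**" can match nothing.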

Is this more along the lines of what you were thinking?

Thank you.

Doug







Re: wildcard authz docs question

Daniel Shahaf-2
Doug Robinson wrote on Tue, Mar 28, 2017 at 09:05:53 -0400:
> Daniel:
>
> Sorry for the delay - I missed the post.

No worries.

> That said, in discussions I've had I think about the SVN regex "**"
> differently than the zsh construct.  The way that I interpret "/**" is
> "everything below and including slash" - so "**" is the moral equivalent of
> Perl's ".*" wildcard.  It need not be followed by any terminal pattern to
> match anything - since it matches them all.  If it was followed by
> something then that something would be required.
>

Note that your terminology is backwards: "**" is a wildcard and ".*" is
a regex.

> So let me break the 3 patterns down:
>
> /*/*/**   This requires 2 directories. It will match all directories 2
> levels down - and then everything in all of the rest of those trees however
> deep.  It should not, however, match a file or symlink in a directory, e.g.
> "/dirA/fileB".  Whereas it will match "/dirA/dirB" along with
> "/dirA/dirB/fileC", etc.

That's an interesting one.  Neither vim nor zsh matches dirA/dirB here —
they only match dirents _under_ it — but it's certainly defensible to
match it, exactly as you say.

To clarify, if foo/bar is a symlink then it is not a directory, no
matter what its target is and what else exists in the repository.  (In
particular, if its target is "baz" and foo/baz/ exists, foo/bar is still
not a directory.)

So, for example, [foo/bar/**] would apply to foo/bar/, iff it exists and
is a directory.  That sounds good.

> /*/**/*   This requires 1 directory and then something else.  It will match
> "/dirA/fileB" or "/dirA/symlinkX" since "/**" can simply go to nothing.  Or
> perhaps a different way to look at it is that "/**" can match "/" which, in
> its simplest will mean "/*/**/*" becomes "/*//*" and given that multiple
> '/' always collapse to a single '/' in "path arithmetic" becomes "/*/*" for
> its shortest match.

Agreed.

> /**/*/*   This requires 1 directory and then something else.  Pretty much
> the same as the prior example and for the same reasons.

Agreed.

Thanks,

Daniel

Re: wildcard authz docs question

Doug Robinson
Daniel:

On Tue, Mar 28, 2017 at 5:43 PM, Daniel Shahaf <[hidden email]> wrote:
> Doug Robinson wrote on Tue, Mar 28, 2017 at 09:05:53 -0400:
> > That said, in discussions I've had I think about the SVN regex "**"
> > differently than the zsh construct.  The way that I interpret "/**" is
> > "everything below and including slash" - so "**" is the moral equivalent of
> > Perl's ".*" wildcard.  It need not be followed by any terminal pattern to
> > match anything - since it matches them all.  If it was followed by
> > something then that something would be required.
>
> Note that your terminology is backwards: "**" is a wildcard and ".*" is
> a regex.

Agreed - sort of.

I've been using them interchangeably: the longer you use them the more that
you come to consider them the same.  I know that the shell calls them wildcards
and that awk/sed/perl/et al. call them regular expressions.  That said, they're
doing the same thing in different contexts.

> > So let me break the 3 patterns down:
> >
> > /*/*/**   This requires 2 directories. It will match all directories 2
> > levels down - and then everything in all of the rest of those trees however
> > deep.  It should not, however, match a file or symlink in a directory, e.g.
> > "/dirA/fileB".  Whereas it will match "/dirA/dirB" along with
> > "/dirA/dirB/fileC", etc.
>
> That's an interesting one.  Neither vim nor zsh matches dirA/dirB here
> (they only match dirents _under_ it), but it's certainly defensible to
> match it, exactly as you say.
>
> To clarify, if foo/bar is a symlink then it is not a directory, no
> matter what its target is and what else exists in the repository.  (In
> particular, if its target is "baz" and foo/baz/ exists, foo/bar is still
> not a directory.)

Agreed: symlinks are their own form of fun given their semantics on
various system calls.  But in the end they are just another type of directory
entry and for the purposes of matching act more like a file than a directory.
And I very much agree that it does not matter what they are pointing at.

> So, for example, [foo/bar/**] would apply to foo/bar/, iff it exists and
> is a directory.  That sounds good.
>
> > /*/**/*   This requires 1 directory and then something else.  It will match
> > "/dirA/fileB" or "/dirA/symlinkX" since "/**" can simply go to nothing.  Or
> > perhaps a different way to look at it is that "/**" can match "/" which, in
> > its simplest will mean "/*/**/*" becomes "/*//*" and given that multiple
> > '/' always collapse to a single '/' in "path arithmetic" becomes "/*/*" for
> > its shortest match.
>
> Agreed.
>
> > /**/*/*   This requires 1 directory and then something else.  Pretty much
> > the same as the prior example and for the same reasons.
>
> Agreed.

Excellent!

Cheers!

Doug


Re: wildcard authz docs question

Stefan Fuhrmann
In reply to this post by Daniel Shahaf-2
On 15.03.2017 10:55, Daniel Shahaf wrote:

> From the 1.10 draft release notes:
>
> > All wildcards apply to full path segments only, i.e. * never matches
> > /, except for the case where /**/ matches zero or more path segments.
> > For example, /*/**/* will match any path which contains at least
> > 2 segments and is equivalent to /**/*/* as well as /*/*/**.
>
> Are «/*/**/*» «/**/*/*» «/*/*/**» really equivalent?  I would have
> expected the first two to match any node except / and /'s immediate
> children, but I wouldn't expect the third form to match /trunk/iota
> where iota is a file, since the pattern has a trailing slash after the
> non-optional second component.
How do you know that /trunk/iota is a file?

The problem is that the authz callback does not provide
enough context information to make that distinction.
We might extend the interface in the future - allowing
to restrict rules to exclusively match files or dirs only.

But making that backward compatible adds quite a bit
of complexity that I don't want to pile on there in 1.10.

-- Stefan^2.

Re: wildcard authz docs question

Daniel Shahaf-2
Stefan Fuhrmann wrote on Mon, Apr 17, 2017 at 22:22:33 +0200:

> On 15.03.2017 10:55, Daniel Shahaf wrote:
> > From the 1.10 draft release notes:
> >
> > > All wildcards apply to full path segments only, i.e. * never matches
> > > /, except for the case where /**/ matches zero or more path segments.
> > > For example, /*/**/* will match any path which contains at least
> > > 2 segments and is equivalent to /**/*/* as well as /*/*/**.
> >
> > Are «/*/**/*» «/**/*/*» «/*/*/**» really equivalent?  I would have
> > expected the first two to match any node except / and /'s immediate
> > children, but I wouldn't expect the third form to match /trunk/iota
> > where iota is a file, since the pattern has a trailing slash after the
> > non-optional second component.
>
> How do you know that /trunk/iota is a file?

I was reviewing the API docs as a black box, i.e., from a user
(repository admin) perspective, not from an implementation perspective.

From that perspective, I would say that having a [/trunk/iota/**]
stanza to apply to a /trunk/iota file violates the principle of least
surprise.

> The problem is that the authz callback does not provide
> enough context information to make that distinction.
> We might extend the interface in the future - allowing
> to restrict rules to exclusively match files or dirs only.

Are you referring to svn_repos_authz_check_access()?  [which doesn't
have an svn_fs_t handle or the information to open one]

> But making that backward compatible adds quite a bit
> of complexity that I don't want to pile on there in 1.10.

I don't understand this sentence at all.  Why do we need to be backwards
compatible (this is a new feature), and why is being back compat in
this case necessarily expensive?

Moreover, implementation considerations aside, there is still the
question of what the documentation should say about this situation.

Cheers,

Daniel

Re: wildcard authz docs question

Doug Robinson
Daniel:

On Mon, Apr 17, 2017 at 21:13 Daniel Shahaf <[hidden email]> wrote:
> Stefan Fuhrmann wrote on Mon, Apr 17, 2017 at 22:22:33 +0200:
> > On 15.03.2017 10:55, Daniel Shahaf wrote:
> > > From the 1.10 draft release notes:
> > >
> > > > All wildcards apply to full path segments only, i.e. * never matches
> > > > /, except for the case where /**/ matches zero or more path segments.
> > > > For example, /*/**/* will match any path which contains at least
> > > > 2 segments and is equivalent to /**/*/* as well as /*/*/**.
> > >
> > > Are «/*/**/*» «/**/*/*» «/*/*/**» really equivalent?  I would have
> > > expected the first two to match any node except / and /'s immediate
> > > children, but I wouldn't expect the third form to match /trunk/iota
> > > where iota is a file, since the pattern has a trailing slash after the
> > > non-optional second component.
> >
> > How do you know that /trunk/iota is a file?
>
> I was reviewing the API docs as a black box, i.e., from a user
> (repository admin) perspective, not from an implementation perspective.
>
> From that perspective, I would say that having a [/trunk/iota/**]
> stanza to apply to a /trunk/iota file violates the principle of least
> surprise.

From a very critical point of view I agree.

However, the point of wildcards is to easily reserve a complete namespace.  Not applying that stanza to the file means requiring 2 stanzas to cover the space entirely.  That's both expensive and brittle (twice as many stanzas, and you have to remember to treat them in pairs, both when adding and when removing).

And I think the "surprise" will be very short-lived if at all.

From a cost/benefit standpoint I think it is extremely positive.

Doug






Re: wildcard authz docs question

Daniel Shahaf-2
Doug Robinson wrote on Mon, May 01, 2017 at 14:20:16 +0000:

> Daniel:
>
> On Mon, Apr 17, 2017 at 21:13 Daniel Shahaf <[hidden email]> wrote:
>
> > Stefan Fuhrmann wrote on Mon, Apr 17, 2017 at 22:22:33 +0200:
> > > On 15.03.2017 10:55, Daniel Shahaf wrote:
> > > >>From the 1.10 draft release notes:
> > > >
> > > >>All wildcards apply to full path segments only, i.e. * never matches
> > > >>/, except for the case where /**/ matches zero or more path segments.
> > > >>For example, /*/**/* will match any path which contains at least
> > > >>2 segments and is equivalent to /**/*/* as well as /*/*/**.
> > > >Are «/*/**/*» «/**/*/*» «/*/*/**» really equivalent?  I would have
> > > >expected the first two to match any node except / and /'s immediate
> > > >children, but I wouldn't expect the third form to match /trunk/iota
> > > >where iota is a file, since the pattern has a trailing slash after the
> > > >non-optional second component.
> > > How do you know that /trunk/iota is a file?
> >
> > I was reviewing the API docs as a black box, i.e., from a user
> > (repository admin) perspective, not from an implementation perspective.
> >
> >  From that perspective, I would say that having a [/trunk/iota/**]
> > stanza to apply to a /trunk/iota file violates the principle of least
> > surprise.
>
>
> From a very critical point of view I agree.
>
> However, the point of wildcards is to easily reserve a complete namespace.

Sure, that's a valid use-case.

I was envisioning that, if a [/trunk/iota/**] stanza were present, then
an authz query for a /trunk/iota file would return either "No access" or
a parse error.  This would reserve the namespace, wouldn't it?  Referring
to your next paragraph, this logic would neither leak the contents of
the file nor require multiple stanzas.

> If we do not apply that stanza apply to the file means requiring 2 stanzas
> to cover the space entirely. That's both expensive and brittle (2X stanzas
> and requires remembering to treat them in pairs - both when adding and when
> removing).
>
> And I think the "surprise" will be very short-lived if at all.
>
> From a cost/benefit standpoint I think it is extremely positive.

Well, if a common task requires two stanzas, then _of course_ we'll find
an easier way for users to spell it.  For example, we could invent some
new "reserve prefix" stanza syntax, or pass to svn_repos_authz_check_access()
the svn_node_kind_t of the path it checks access to, or any number of
other solutions.

In short: there might well be a design that meets both of our criteria:
principle of least surprise _and_ namespace reservation.
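
To make the node-kind option concrete, here is a purely hypothetical sketch in Python. It is not the signature or behaviour of svn_repos_authz_check_access(); `Kind`, `prefix_stanza_access` and the "rw" access string are invented for illustration. It assumes the caller may or may not know the node kind, and that an unknown kind is treated as "could be a directory".

```python
from enum import Enum


class Kind(Enum):
    FILE = "file"
    DIR = "dir"


def prefix_stanza_access(prefix, granted, path, kind=None):
    """Access that a hypothetical [prefix/**] stanza yields for path.

    Returns the granted string, "" for no access, or None when the
    stanza does not apply to path at all.
    """
    prefix = prefix.rstrip("/")
    path = path.rstrip("/")
    if path.startswith(prefix + "/"):
        return granted  # strictly below the prefix
    if path == prefix:
        if kind is Kind.FILE:
            return ""  # a file sits where a directory is expected
        return granted  # a directory, or the caller does not know the kind
    return None


if __name__ == "__main__":
    print(repr(prefix_stanza_access("/trunk/iota", "rw", "/trunk/iota", Kind.FILE)))
    print(repr(prefix_stanza_access("/trunk/iota", "rw", "/trunk/iota/x")))
```

In this sketch a single stanza still covers the whole namespace: a file at /trunk/iota gets "no access" rather than falling through to some other rule, and callers that cannot supply a kind see the stanza applied as if the path were a directory.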

Cheers,

Daniel

Re: wildcard authz docs question

Doug Robinson
Daniel:

On Mon, May 1, 2017 at 2:05 PM, Daniel Shahaf <[hidden email]> wrote:
> Doug Robinson wrote on Mon, May 01, 2017 at 14:20:16 +0000:
> > On Mon, Apr 17, 2017 at 21:13 Daniel Shahaf <[hidden email]> wrote:
> > > Stefan Fuhrmann wrote on Mon, Apr 17, 2017 at 22:22:33 +0200:
> > > > On 15.03.2017 10:55, Daniel Shahaf wrote:
> > > > > From the 1.10 draft release notes:
> > > > >
> > > > > > All wildcards apply to full path segments only, i.e. * never matches
> > > > > > /, except for the case where /**/ matches zero or more path segments.
> > > > > > For example, /*/**/* will match any path which contains at least
> > > > > > 2 segments and is equivalent to /**/*/* as well as /*/*/**.
> > > > >
> > > > > Are «/*/**/*» «/**/*/*» «/*/*/**» really equivalent?  I would have
> > > > > expected the first two to match any node except / and /'s immediate
> > > > > children, but I wouldn't expect the third form to match /trunk/iota
> > > > > where iota is a file, since the pattern has a trailing slash after the
> > > > > non-optional second component.
> > > >
> > > > How do you know that /trunk/iota is a file?
> > >
> > > I was reviewing the API docs as a black box, i.e., from a user
> > > (repository admin) perspective, not from an implementation perspective.
> > >
> > > From that perspective, I would say that having a [/trunk/iota/**]
> > > stanza to apply to a /trunk/iota file violates the principle of least
> > > surprise.
> >
> > From a very critical point of view I agree.
> >
> > However, the point of wildcards is to easily reserve a complete namespace.
>
> Sure, that's a valid use-case.
>
> I was envisioning that, if a [/trunk/iota/**] stanza were present, then
> an authz query for a /trunk/iota file would return either "No access" or
> a parse error.  This would reserve the namespace, wouldn't it?  Referring
> to your next paragraph, this logic would neither leak the contents of
> the file nor require multiple stanzas.

For an AuthZ check the answer is either Yes or No, not "parser error", right?

And it really can't be a "parser error" (invalidating the AuthZ file entirely) since
in some other revision that "file" could be a "directory".  So either the stanza
gets skipped as "not applicable" (and therefore not reserving the namespace)
or it gets entered as if the file were a directory and we're back to the behavior
that I am expecting.

> > If we do not apply that stanza apply to the file means requiring 2 stanzas
> > to cover the space entirely. That's both expensive and brittle (2X stanzas
> > and requires remembering to treat them in pairs - both when adding and when
> > removing).
> >
> > And I think the "surprise" will be very short-lived if at all.
> >
> > From a cost/benefit standpoint I think it is extremely positive.
>
> Well, if a common task requires two stanzas, then _of course_ we'll find
> an easier way for users to spell it.  For example, we could invent some
> new "reserve prefix" stanza syntax, or pass to svn_repos_authz_check_access()
> the svn_node_kind_t of the path it checks access to, or any number of
> other solutions.
>
> In short: there might well be a design that meets both of our criteria:
> principle of least surprise _and_ namespace reservation.

Not seeing it - at least not yet.  In Perl the RE needed to handle this would
be one of the duals, e.g. "/trunk/iota(|/.*)" - the either/or with nothing on the left
and "/.*" on the right.  It really is a dual case.  I know of no better syntax.  Since
we're working on this as a wildcard I don't see an alternative.
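
For what it's worth, the dual can be tried out directly. Below is a small sketch in Python, whose `re` module accepts this same expression; the names `RESERVE_IOTA` and `reserved` are only illustrative.

```python
import re

# The "dual" written out as a regular expression: the bare path, or the
# path followed by a slash and anything else.
RESERVE_IOTA = re.compile(r"/trunk/iota(|/.*)")


def reserved(path):
    """True if path is /trunk/iota itself or anything below it."""
    return RESERVE_IOTA.fullmatch(path) is not None


if __name__ == "__main__":
    for path in ("/trunk/iota", "/trunk/iota/a/b", "/trunk/iotab"):
        print(path, reserved(path))
```

The empty left-hand alternative covers /trunk/iota whether it is a file or a directory, and the right-hand one covers everything beneath it, while a sibling such as /trunk/iotab is left alone.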

As I said, I think the surprise, if any (none if we document it well) will be
very short-lived.

Cheers.

Doug



Re: wildcard authz docs question

Daniel Shahaf-2
Doug Robinson wrote on Wed, May 03, 2017 at 15:54:50 -0400:

> Daniel:
>
> On Mon, May 1, 2017 at 2:05 PM, Daniel Shahaf <[hidden email]> wrote:
>
> > Doug Robinson wrote on Mon, May 01, 2017 at 14:20:16 +0000:
> > > On Mon, Apr 17, 2017 at 21:13 Daniel Shahaf <[hidden email]> wrote:
> > > > Stefan Fuhrmann wrote on Mon, Apr 17, 2017 at 22:22:33 +0200:
> > > > > On 15.03.2017 10:55, Daniel Shahaf wrote:
> > > > > >>From the 1.10 draft release notes:
> > > > > >
> > > > > >>All wildcards apply to full path segments only, i.e. * never matches
> > > > > >>/, except for the case where /**/ matches zero or more path segments.
> > > > > >>For example, /*/**/* will match any path which contains at least
> > > > > >>2 segments and is equivalent to /**/*/* as well as /*/*/**.
> > > > > >Are «/*/**/*» «/**/*/*» «/*/*/**» really equivalent?  I would have
> > > > > >expected the first two to match any node except / and /'s immediate
> > > > > >children, but I wouldn't expect the third form to match /trunk/iota
> > > > > >where iota is a file, since the pattern has a trailing slash after the
> > > > > >non-optional second component.
> > > > > How do you know that /trunk/iota is a file?
> > > >
> > > > I was reviewing the API docs as a black box, i.e., from a user
> > > > (repository admin) perspective, not from an implementation perspective.
> > > >
> > > >  From that perspective, I would say that having a [/trunk/iota/**]
> > > > stanza to apply to a /trunk/iota file violates the principle of least
> > > > surprise.
> > >
> > >
> > > From a very critical point of view I agree.
> > >
> > > However, the point of wildcards is to easily reserve a complete namespace.
> >
> > Sure, that's a valid use-case.
> >
> > I was envisioning that, if a [/trunk/iota/**] stanza were present, then
> > an authz query for a /trunk/iota file would return either "No access" or
> > a parse error.  This would reserve the namespace, wouldn't it?  Referring
> > to your next paragraph, this logic would neither leak the contents of
> > the file nor require multiple stanzas.
> >
>
> For an AuthZ check the answer is either Yes or No, not "parser error",
> right?

Wrong.  An authz check can return an error.  For example, `svnauthz
accessof` exits with a non-zero status when the authz file fails to parse.

> And it really can't be a "parser error" (invalidating the AuthZ file
> entirely) since in some other revision that "file" could be
> a "directory".  So either the stanza gets skipped as "not applicable"
> (and therefore not reserving the namespace) or it gets entered as if
> the file were a directory and we're back to the behavior that I am
> expecting.

You are correct: it will not be a *parse* error since the grammar of
authz files does not depend on the contents of the repository.  That
just means it will be a different kind of error — a semantic error — and
will occur at authz query time, not at authz file load time.

That would still break checkouts of /trunk, though, so it might be
better to just default /trunk/iota to "No access" and log a warning to
the server log.  (Using, say, svn_repos_fs(repos)->warning().)
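
A rough sketch of that idea, in Python for brevity. Everything here is hypothetical: `NodeKind` merely stands in for svn_node_kind_t, `check_access` is not svn_repos_authz_check_access(), and the sketch only handles a single stanza of the form "<base>/**".

```python
import logging
from enum import Enum
from typing import Optional

log = logging.getLogger("authz-sketch")


class NodeKind(Enum):
    """Stand-in for svn_node_kind_t (assumption, not the real type)."""
    FILE = "file"
    DIR = "dir"


def check_access(stanza: str, path: str, kind: NodeKind,
                 granted: bool) -> Optional[bool]:
    """Kind-aware check of one '<base>/**' stanza.

    Returns the stanza's access value if it applies, False if the stanza
    names a subtree but the path is a file, and None if the stanza does
    not apply to the path at all.
    """
    if not stanza.endswith("/**"):
        raise ValueError("sketch only handles '<base>/**' stanzas")
    base = stanza[:-3]
    if path == base and kind is NodeKind.FILE:
        # Semantic mismatch found at query time: deny and warn, do not fail.
        log.warning("stanza [%s] expects a directory, but %s is a file",
                    stanza, path)
        return False
    if path == base or path.startswith(base + "/"):
        return granted
    return None


print(check_access("/trunk/iota/**", "/trunk/iota", NodeKind.FILE, True))
print(check_access("/trunk/iota/**", "/trunk/iota", NodeKind.DIR, True))
```

Denying with a warning keeps a checkout of /trunk working, whereas raising an error at this point would abort it.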

>
> > > If we do not apply that stanza to the file, that means requiring 2
> > > stanzas to cover the space entirely. That's both expensive and brittle
> > > (2X stanzas, and requires remembering to treat them in pairs, both
> > > when adding and when removing).
> > >
> > > And I think the "surprise" will be very short-lived if at all.
> > >
> > > From a cost/benefit standpoint I think it is extremely positive.
> >
> > Well, if a common task requires two stanzas, then _of course_ we'll
> > find an easier way for users to spell it.  For example, we could
> > invent some new "reserve prefix" stanza syntax, or pass to
> > svn_repos_authz_check_access() the svn_node_kind_t of the path it
> > checks access to, or any number of other solutions.
> >
> > In short: there might well be a design that meets both of our
> > criteria: principle of least surprise _and_ namespace reservation.
> >
>
> Not seeing it - at least not yet.  In Perl the RE needed to handle
> this would be one of the duals, e.g. "/trunk/iota(|/.*)" - the
> either/or with nothing on the left and "/.*" on the right.  It really
> is a dual case.  I know of no better syntax.  Since we're working on
> this as a wildcard I don't see an alternative.

Off the top of my head, we could have [/trunk/iota/***] and
[/trunk/iota/**] with different meanings (the former applies to
a /trunk/iota file, the latter doesn't).  Does anyone else (besides Doug
and I) have ideas here?

Cheers,

Daniel

> As I said, I think the surprise, if any (none if we document it well)
> will be very short-lived.

Re: wildcard authz docs question

Johan Corveleyn-3
On Thu, May 4, 2017 at 10:16 AM, Daniel Shahaf <[hidden email]> wrote:
> Doug Robinson wrote on Wed, May 03, 2017 at 15:54:50 -0400:
...

>> Not seeing it - at least not yet.  In Perl the RE needed to handle
>> this would be one of the duals, e.g. "/trunk/iota(|/.*)" - the
>> either/or with nothing on the left and "/.*" on the right.  It really
>> is a dual case.  I know of no better syntax.  Since we're working on
>> this as a wildcard I don't see an alternative.
>
> Off the top of my head, we could have [/trunk/iota/***] and
> [/trunk/iota/**] with different meanings (the former applies to
> a /trunk/iota file, the latter doesn't).  Does anyone else (besides Doug
> and I) have ideas here?

Hmm, /*** doesn't look like something I'd remember easily, if I wanted
to use that feature as an svn admin.

I have only followed this discussion from a distance. If I understand
correctly the remaining point is whether or not /iota/** would match
with the file /iota or not. Speaking purely from my own intuition, I
would say "no". I feel this pattern is intended to apply to the
_subtree_ below iota, including iota itself (which is thus implied to
be a directory, because we're talking about subtrees). In practice I
think the admin configuring this rule will know whether iota is ever
intended to be a file or a directory. A rule like that to me always
implies that "the guy who configured it" expects iota to be a
directory (why else would he put a "subtree rule" for it).

TBH, I also don't really see the use case of "I want this rule to
apply to the _namespace_ iota, i.e. to the file iota (if it's a file)
and to directory iota and its subtree (if it's a directory)". In
context, you always know whether it's meant to be a file or a
directory.

Maybe we should just follow what most other implementations do?
I've done a quick check in Atlassian FishEye / Crucible (searching for
files). There /iota/** does not match file /iota (but it does match
directory /iota).

--
Johan

Re: wildcard authz docs question

Daniel Shahaf-2
Johan Corveleyn wrote on Thu, May 04, 2017 at 13:26:30 +0200:

> On Thu, May 4, 2017 at 10:16 AM, Daniel Shahaf <[hidden email]> wrote:
> > Doug Robinson wrote on Wed, May 03, 2017 at 15:54:50 -0400:
> ...
> >> Not seeing it - at least not yet.  In Perl the RE needed to handle
> >> this would be one of the duals, e.g. "/trunk/iota(|/.*)" - the
> >> either/or with nothing on the left and "/.*" on the right.  It really
> >> is a dual case.  I know of no better syntax.  Since we're working on
> >> this as a wildcard I don't see an alternative.
> >
> > Off the top of my head, we could have [/trunk/iota/***] and
> > [/trunk/iota/**] with different meanings (the former applies to
> > a /trunk/iota file, the latter doesn't).  Does anyone else (besides Doug
> > and I) have ideas here?
>
> Hmm, /*** doesn't look like something I'd remember easily, if I wanted
> to use that feature as an svn admin.

I cribbed the syntax from zsh and rsync, which both define a "***" token
in their glob expressions.

(In zsh, *** is like ** but recurses into symlinks-to-directories as
well.  In rsync, *** is similar to ** but can match zero path components
in the construct "foo/***".)

Re: wildcard authz docs question

Doug Robinson
In reply to this post by Johan Corveleyn-3
Johan:

On Thu, May 4, 2017 at 7:26 AM, Johan Corveleyn <[hidden email]> wrote:
On Thu, May 4, 2017 at 10:16 AM, Daniel Shahaf <[hidden email]> wrote:
> Doug Robinson wrote on Wed, May 03, 2017 at 15:54:50 -0400:
...
>> Not seeing it - at least not yet.  In Perl the RE needed to handle
>> this would be one of the duals, e.g. "/trunk/iota(|/.*)" - the
>> either/or with nothing on the left and "/.*" on the right.  It really
>> is a dual case.  I know of no better syntax.  Since we're working on
>> this as a wildcard I don't see an alternative.
>
> Off the top of my head, we could have [/trunk/iota/***] and
> [/trunk/iota/**] with different meanings (the former applies to
> a /trunk/iota file, the latter doesn't).  Does anyone else (besides Doug
> and I) have ideas here?

Hmm, /*** doesn't look like something I'd remember easily, if I wanted
to use that feature as an svn admin.

I have only followed this discussion from a distance. If I understand
correctly the remaining point is whether or not /iota/** would match
with the file /iota or not. Speaking purely from my own intuition, I
would say "no". I feel this pattern is intended to apply to the
_subtree_ below iota, including iota itself (which is thus implied to
be a directory, because we're talking about subtrees). In practice I
think the admin configuring this rule will know whether iota is ever
intended to be a file or a directory. A rule like that to me always
implies that "the guy who configured it" expects iota to be a
directory (why else would he put a "subtree rule" for it).

TBH, I also don't really see the use case of "I want this rule to
apply to the _namespace_ iota, i.e. to the file iota (if it's a file)
and to directory iota and its subtree (if it's a directory)". In
context, you always know whether it's meant to be a file or a
directory.
 

Maybe we should just follow what most other implementations do?
I've done a quick check in Atlassian FishEye / Crucible (searching for
files). There /iota/** does not match file /iota (but it does match
directory /iota).

--
Johan





Re: wildcard authz docs question

Doug Robinson
In reply to this post by Johan Corveleyn-3
Johan:

(sorry for the empty message - dwim failed)

On Thu, May 4, 2017 at 7:26 AM, Johan Corveleyn <[hidden email]> wrote:
> On Thu, May 4, 2017 at 10:16 AM, Daniel Shahaf <[hidden email]> wrote:
> > Doug Robinson wrote on Wed, May 03, 2017 at 15:54:50 -0400:
> ...
> >> Not seeing it - at least not yet.  In Perl the RE needed to handle
> >> this would be one of the duals, e.g. "/trunk/iota(|/.*)" - the
> >> either/or with nothing on the left and "/.*" on the right.  It really
> >> is a dual case.  I know of no better syntax.  Since we're working on
> >> this as a wildcard I don't see an alternative.
> >
> > Off the top of my head, we could have [/trunk/iota/***] and
> > [/trunk/iota/**] with different meanings (the former applies to
> > a /trunk/iota file, the latter doesn't).  Does anyone else (besides Doug
> > and I) have ideas here?
>
> Hmm, /*** doesn't look like something I'd remember easily, if I wanted
> to use that feature as an svn admin.
>
> I have only followed this discussion from a distance. If I understand
> correctly the remaining point is whether or not /iota/** would match
> with the file /iota or not. Speaking purely from my own intuition, I
> would say "no". I feel this pattern is intended to apply to the
> _subtree_ below iota, including iota itself (which is thus implied to
> be a directory, because we're talking about subtrees). In practice I
> think the admin configuring this rule will know whether iota is ever
> intended to be a file or a directory. A rule like that to me always
> implies that "the guy who configured it" expects iota to be a
> directory (why else would he put a "subtree rule" for it).
>
> TBH, I also don't really see the use case of "I want this rule to
> apply to the _namespace_ iota, i.e. to the file iota (if it's a file)
> and to directory iota and its subtree (if it's a directory)". In
> context, you always know whether it's meant to be a file or a
> directory.

The use case is exactly that some administrator wants to reserve
the namespace.  They do not want some sly person to create a file
where they will, at some point in the future, create a directory.  It will
be sad that we can't have a simple way to make this reservation, but,
as I noted above, short of the current "[:glob:/iota/**]" doing the job it
will take 2 stanzas.

> Maybe we should just follow what most other implementations do?
> I've done a quick check in Atlassian FishEye / Crucible (searching for
> files). There /iota/** does not match file /iota (but it does match
> directory /iota).

The FishEye reference I found does not have a "**" operator - just a "*"
operator (https://confluence.atlassian.com/jiracoreserver073/search-syntax-for-text-fields-861257223.html).

For all cases where a tool has a "*" operator this is semantically going
to "not match" this use case since the "*" operator that has been
implemented in SVN (at least so far) does not span past a single
directory entry.

Doug


Re: wildcard authz docs question

Johan Corveleyn-3
On Fri, May 5, 2017 at 12:49 AM, Doug Robinson
<[hidden email]> wrote:

>
> Johan:
>
> (sorry for the empty message - dwim failed)
>
> On Thu, May 4, 2017 at 7:26 AM, Johan Corveleyn <[hidden email]> wrote:
>>
>> On Thu, May 4, 2017 at 10:16 AM, Daniel Shahaf <[hidden email]> wrote:
>> > Doug Robinson wrote on Wed, May 03, 2017 at 15:54:50 -0400:
>> ...
>> >> Not seeing it - at least not yet.  In Perl the RE needed to handle
>> >> this would be one of the duals, e.g. "/trunk/iota(|/.*)" - the
>> >> either/or with nothing on the left and "/.*" on the right.  It really
>> >> is a dual case.  I know of no better syntax.  Since we're working on
>> >> this as a wildcard I don't see an alternative.
>> >
>> > Off the top of my head, we could have [/trunk/iota/***] and
>> > [/trunk/iota/**] with different meanings (the former applies to
>> > a /trunk/iota file, the latter doesn't).  Does anyone else (besides Doug
>> > and I) have ideas here?
>>
>> Hmm, /*** doesn't look like something I'd remember easily, if I wanted
>> to use that feature as an svn admin.
>>
>> I have only followed this discussion from a distance. If I understand
>> correctly the remaining point is whether or not /iota/** would match
>> with the file /iota or not. Speaking purely from my own intuition, I
>> would say "no". I feel this pattern is intended to apply to the
>> _subtree_ below iota, including iota itself (which is thus implied to
>> be a directory, because we're talking about subtrees). In practice I
>> think the admin configuring this rule will know whether iota is ever
>> intended to be a file or a directory. A rule like that to me always
>> implies that "the guy who configured it" expects iota to be a
>> directory (why else would he put a "subtree rule" for it).
>>
>> TBH, I also don't really see the use case of "I want this rule to
>> apply to the _namespace_ iota, i.e. to the file iota (if it's a file)
>> and to directory iota and its subtree (if it's a directory)". In
>> context, you always know whether it's meant to be a file or a
>> directory.
>
>
> The use case is exactly that some administrator wants to reserve
> the namespace.  They do not want some sly person to create a file
> where they will, at some point in the future, create a directory.  It will
> be sad that we can't have a simple way to make this reservation, but,
> as I noted above, short of the current "[:glob:/iota/**]" doing the job it
> will take 2 stanzas.
>
>>
>> Maybe we should just follow what most other implementations do?
>> I've done a quick check in Atlassian FishEye / Crucible (searching for
>> files). There /iota/** does not match file /iota (but it does match
>> directory /iota).
>
>
> The FishEye reference I found does not have a "**" operator - just a "*"
> operator (https://confluence.atlassian.com/jiracoreserver073/search-syntax-for-text-fields-861257223.html).
>
> For all cases where a tool has a "*" operator this is semantically going
> to "not match" this use case since the "*" operator that has been
> implemented in SVN (at least so far) does not span past a single
> directory entry.

Ah. No, I'm referring to this syntax in FishEye:
https://confluence.atlassian.com/fisheye/pattern-matching-guide-298976797.html

Unfortunately the document does not specify the cases we're interested
in here. But I've tested them on our own FishEye instance :-). In this
case "/dir/**" does return /dir, but "/file/**" does not return /file.
But okay, it's just one example.

In the FishEye doc they say they're doing their pattern matching "same
as the pattern matching in Apache Ant". So I've checked ant as well.
On this page:

    https://ant.apache.org/manual/dirtasks.html#patterns

at the bottom of the table they say:
    **/test/** - Matches all files that have a test element in their
path, including test as a filename.

So I've done a little test in ant (see [1]): apparently "**/test/**"
will match the file test, but "/test/**" doesn't! Weird. Apparently
the same goes for FishEye, if I put "**/" at the beginning of the
pattern, it does match the file.

Now, getting back to your use case: "reserving a namespace for future
use" (i.e. for now we don't know whether "iota" will ever be a file or
a directory, but in any case we don't want anyone to put anything
there). To me it sounds like a very special use case. It seems to be
something specific for authorization syntaxes, but much less
applicable to searching existing filesystems (like glob patterns for
shells or tools like FishEye). So maybe it's not such a good idea to
look at those tools for inspiration anyway :-).

Doug: do you think this is a common use case? Do other authorization
systems offer this functionality in an easily configurable manner? I
accept this is a valid use case, but it's not one that I would think
of using (wearing my hat of svn admin) -- I focus on authorization of
the existing files / directories.

Come to think of it: if reserving a namespace for future use, and
"/iota" doesn't exist yet, can't you just block the name "/iota"
without glob pattern? It doesn't exist anyway, so if you'd like to
create some subtree under it, you first have to create /iota, right?

Now, in the end, I don't want this issue to be blocked forever :-). I
think in practice the confusion will be minimal, because either the
administrator knows what kind of item "iota" is (a file or a
directory), or the item doesn't exist yet and he'll be doing the
"reserve namespace" use case. So for me it's fine if "/iota/**"
effectively matches both the "directory iota and its subtree" and "the
file iota". As long as it's documented that way then :-).

If Daniel insists, I'm fine with using "/***" as well, if we want to
have this special "reserve namespace" meaning.


[1] Steps to reproduce with ant (you'll need ant and java):
* Create a file build.xml with this content:
[[[
<project name="test" default="test" basedir=".">
    <target name="test">
        <echo>Concatenating:</echo>
        <concat>
            <fileset dir="." includes="test/**,dir/**"/>
        </concat>
    </target>
</project>
]]]

* Create a file "test" with some content, and a directory "dir" with
another file with other content below it.

* Run "ant". You'll see that the content of "test" is not catted.

* If you change the includes pattern to "**/test/**,dir/**", the file
is effectively catted

--
Johan

Re: wildcard authz docs question

Branko Čibej
On 05.05.2017 11:09, Johan Corveleyn wrote:

> On Fri, May 5, 2017 at 12:49 AM, Doug Robinson
> <[hidden email]> wrote:
>> Johan:
>>
>> (sorry for the empty message - dwim failed)
>>
>> On Thu, May 4, 2017 at 7:26 AM, Johan Corveleyn <[hidden email]> wrote:
>>> On Thu, May 4, 2017 at 10:16 AM, Daniel Shahaf <[hidden email]> wrote:
>>>> Doug Robinson wrote on Wed, May 03, 2017 at 15:54:50 -0400:
>>> ...
>>>>> Not seeing it - at least not yet.  In Perl the RE needed to handle
>>>>> this would be one of the duals, e.g. "/trunk/iota(|/.*)" - the
>>>>> either/or with nothing on the left and "/.*" on the right.  It really
>>>>> is a dual case.  I know of no better syntax.  Since we're working on
>>>>> this as a wildcard I don't see an alternative.
>>>> Off the top of my head, we could have [/trunk/iota/***] and
>>>> [/trunk/iota/**] with different meanings (the former applies to
>>>> a /trunk/iota file, the latter doesn't).  Does anyone else (besides Doug
>>>> and I) have ideas here?
>>> Hmm, /*** doesn't look like something I'd remember easily, if I wanted
>>> to use that feature as an svn admin.
>>>
>>> I have only followed this discussion from a distance. If I understand
>>> correctly the remaining point is whether or not /iota/** would match
>>> with the file /iota or not. Speaking purely from my own intuition, I
>>> would say "no". I feel this pattern is intended to apply to the
>>> _subtree_ below iota, including iota itself (which is thus implied to
>>> be a directory, because we're talking about subtrees). In practice I
>>> think the admin configuring this rule will know whether iota is ever
>>> intended to be a file or a directory. A rule like that to me always
>>> implies that "the guy who configured it" expects iota to be a
>>> directory (why else would he put a "subtree rule" for it).
>>>
>>> TBH, I also don't really see the use case of "I want this rule to
>>> apply to the _namespace_ iota, i.e. to the file iota (if it's a file)
>>> and to directory iota and its subtree (if it's a directory)". In
>>> context, you always know whether it's meant to be a file or a
>>> directory.
>>
>> The use case is exactly that some administrator wants to reserve
>> the namespace.  They do not want some sly person to create a file
>> where they will, at some point in the future, create a directory.  It will
>> be sad that we can't have a simple way to make this reservation, but,
>> as I noted above, short of the current "[:glob:/iota/**]" doing the job it
>> will take 2 stanzas.
>>
>>> Maybe we should just follow what most other implementations do?
>>> I've done a quick check in Atlassian FishEye / Crucible (searching for
>>> files). There /iota/** does not match file /iota (but it does match
>>> directory /iota).
>>
>> The FishEye reference I found does not have a "**" operator - just a "*"
>> operator (https://confluence.atlassian.com/jiracoreserver073/search-syntax-for-text-fields-861257223.html).
>>
>> For all cases where a tool has a "*" operator this is semantically going
>> to "not match" this use case since the "*" operator that has been
>> implemented in SVN (at least so far) does not span past a single
>> directory entry.
> Ah. No, I'm referring to this syntax in FishEye:
> https://confluence.atlassian.com/fisheye/pattern-matching-guide-298976797.html
>
> Unfortunately the document does not specify the cases we're interested
> in here. But I've tested them on our own FishEye instance :-). In this
> case "/dir/**" does return /dir, but "/file/**" does not return /file.
> But okay, it's just one example.
>
> In the FishEye doc they say they're doing their pattern matching "same
> as the pattern matching in Apache Ant". So I've checked ant as well.
> On this page:
>
>     https://ant.apache.org/manual/dirtasks.html#patterns
>
> at the bottom of the table they say:
>     **/test/** - Matches all files that have a test element in their
> path, including test as a filename.
>
> So I've done a little test in ant (see [1]): apparently "**/test/**"
> will match the file test, but "/test/**" doesn't! Weird. Apparently
> the same goes for FishEye, if I put "**/" at the beginning of the
> pattern, it does match the file.


Before we go too far down this rabbit hole of possible semantics, I'd
like to remind you that in Subversion's authz implementation, the path
matching happens before we know anything about the structure of the
repository. Specifically, we don't know if a particular name in the
pattern even exists in the repository and if it does, whether it's a
file or a directory.

So, any discussion about whether something should match a file or not is
nice in theory, but a waste of time in practice. IMO there is absolutely
no chance of the authz check actually looking at the repository to
determine what kind of object we're looking at: not only would this
require a significant change in the authz architecture, it would also
slow authz checks down to probably an order of magnitude worse than they
were _before_ the current rewrite. A non-starter.

It would be a lot better to discuss if the current behaviour is sensible
given what we know when the authz check occurs.
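
To make "what we know when the authz check occurs" concrete, here is a minimal sketch of a matcher that sees only the pattern and the path string, never the repository. It is written in Python, is not Subversion's implementation, and assumes a simplified grammar in which only whole-segment `*` and `**` are wildcards; the name `match_segments` is made up for this example.

```python
from functools import lru_cache


def match_segments(pattern: str, path: str) -> bool:
    """Match a path against a pattern segment by segment.

    '*' matches exactly one segment, '**' matches zero or more segments,
    anything else must equal the segment literally.
    """
    pat = [s for s in pattern.split("/") if s]
    segs = [s for s in path.split("/") if s]

    @lru_cache(maxsize=None)
    def go(i: int, j: int) -> bool:
        if i == len(pat):
            return j == len(segs)
        if pat[i] == "**":
            # Either match zero segments, or swallow one and stay on '**'.
            return go(i + 1, j) or (j < len(segs) and go(i, j + 1))
        if j == len(segs):
            return False
        if pat[i] == "*" or pat[i] == segs[j]:
            return go(i + 1, j + 1)
        return False

    return go(0, 0)


patterns = ["/*/**/*", "/**/*/*", "/*/*/**"]
for path in ["/", "/trunk", "/trunk/iota", "/trunk/A/B/file"]:
    print(path, [match_segments(p, path) for p in patterns])
```

Under these assumptions the three patterns from the release notes accept exactly the same paths, and "/iota/**" accepts "/iota" whether that name turns out to be a file, a directory, or nothing at all, because the matcher has no way to tell.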

-- Brane

Re: wildcard authz docs question

Daniel Shahaf-2
In reply to this post by Johan Corveleyn-3
Johan Corveleyn wrote on Fri, May 05, 2017 at 11:09:33 +0200:

> Come to think of it: if reserving a namespace for future use, and
> "/iota" doesn't exist yet, can't you just block the name "/iota"
> without glob pattern? It doesn't exist anyway, so if you'd like to
> create some subtree under it, you first have to create /iota, right?
>
> Now, in the end, I don't want this issue to be blocked forever :-). I
> think in practice the confusion will be minimal, because either the
> administrator knows what kind of item "iota" is (a file or a
> directory), or the item doesn't exist yet and he'll be doing the
> "reserve namespace" use case. So for me it's fine if "/iota/**"
> effectively matches both the "directory iota and its subtree" and "the
> file iota". As long as it's documented that way then :-).
>
> If Daniel insists, I'm fine with using "/***" as well, if we want to
> have this special "reserve namespace" meaning.

I hope I'm not coming across as "insisting", Johan.  I certainly didn't
intend to.

Regarding "reserve a namespace", I agree with you that "[/trunk/iota]
*=" appears to serve that purpose.  Let's wait to hear from Doug.
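
A small sketch of why the plain stanza may already be enough. It assumes the usual behaviour that a non-glob path rule also governs everything beneath that path unless a more specific rule overrides it. The function name `covered_by` is hypothetical, and the sketch ignores rule precedence entirely.

```python
def covered_by(rule_path: str, path: str) -> bool:
    """Return True if a non-glob rule for rule_path governs path.

    The rule covers the path itself and every descendant, regardless of
    whether the path is (or will become) a file or a directory.
    """
    rule = rule_path.rstrip("/")
    return path == rule or path.startswith(rule + "/")


for p in ["/trunk/iota", "/trunk/iota/sub/file", "/trunk/iotas", "/trunk"]:
    print(p, covered_by("/trunk/iota", p))
```

Because the comparison is purely on the path string, the same stanza blocks creating /trunk/iota as a file and creating anything under it as a directory.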

Cheers,

Daniel

Re: wildcard authz docs question

Doug Robinson
In reply to this post by Johan Corveleyn-3
Johan:

Sorry for my sporadic replies... it's been a bit hectic here.

Reply buried deep below.

On Fri, May 5, 2017 at 5:09 AM, Johan Corveleyn <[hidden email]> wrote:
On Fri, May 5, 2017 at 12:49 AM, Doug Robinson
<[hidden email]> wrote:
>
> Johan:
>
> (sorry for the empty message - dwim failed)
>
> On Thu, May 4, 2017 at 7:26 AM, Johan Corveleyn <[hidden email]> wrote:
>>
>> On Thu, May 4, 2017 at 10:16 AM, Daniel Shahaf <[hidden email]> wrote:
>> > Doug Robinson wrote on Wed, May 03, 2017 at 15:54:50 -0400:
>> ...
>> >> Not seeing it - at least not yet.  In Perl the RE needed to handle
>> >> this would be one of the duals, e.g. "/trunk/iota(|/.*)" - the
>> >> either/or with nothing on the left and "/.*" on the right.  It really
>> >> is a dual case.  I know of no better syntax.  Since we're working on
>> >> this as a wildcard I don't see an alternative.
>> >
>> > Off the top of my head, we could have [/trunk/iota/***] and
>> > [/trunk/iota/**] with different meanings (the former applies to
>> > a /trunk/iota file, the latter doesn't).  Does anyone else (besides Doug
>> > and I) have ideas here?
>>
>> Hmm, /*** doesn't look like something I'd remember easily, if I wanted
>> to use that feature as an svn admin.
>>
>> I have only followed this discussion from a distance. If I understand
>> correctly the remaining point is whether or not /iota/** would match
>> with the file /iota or not. Speaking purely from my own intuition, I
>> would say "no". I feel this pattern is intended to apply to the
>> _subtree_ below iota, including iota itself (which is thus implied to
>> be a directory, because we're talking about subtrees). In practice I
>> think the admin configuring this rule will know whether iota is ever
>> intended to be a file or a directory. A rule like that to me always
>> implies that "the guy who configured it" expects iota to be a
>> directory (why else would he put a "subtree rule" for it).
>>
>> TBH, I also don't really see the use case of "I want this rule to
>> apply to the _namespace_ iota, i.e. to the file iota (if it's a file)
>> and to directory iota and its subtree (if it's a directory)". In
>> context, you always know whether it's meant to be a file or a
>> directory.
>
>
> The use case is exactly that some administrator wants to reserve
> the namespace.  They do not want some sly person to create a file
> where they will, at some point in the future, create a directory.  It will
> be sad that we can't have a simple way to make this reservation, but,
> as I noted above, short of the current "[:glob:/iota/**]" doing the job it
> will take 2 stanzas.
>
>>
>> Maybe we should just follow what most other implementations do?
>> I've done a quick check in Atlassian FishEye / Crucible (searching for
>> files). There /iota/** does not match file /iota (but it does match
>> directory /iota).
>
>
> The FishEye reference I found does not have a "**" operator - just a "*"
> operator (https://confluence.atlassian.com/jiracoreserver073/search-syntax-for-text-fields-861257223.html).
>
> For all cases where a tool has a "*" operator this is semantically going
> to "not match" this use case since the "*" operator that has been
> implemented in SVN (at least so far) does not span past a single
> directory entry.

Ah. No, I'm referring to this syntax in FishEye:
https://confluence.atlassian.com/fisheye/pattern-matching-guide-298976797.html

Unfortunately the document does not specify the cases we're interested
in here. But I've tested them on our own FishEye instance :-). In this
case "/dir/**" does return /dir, but "/file/**" does not return /file.
But okay, it's just one example.

In the FishEye doc they say they're doing their pattern matching "same
as the pattern matching in Apache Ant". So I've checked ant as well.
On this page:

    https://ant.apache.org/manual/dirtasks.html#patterns

at the bottom of the table they say:
    **/test/** - Matches all files that have a test element in their
path, including test as a filename.

So I've done a little test in ant (see [1]): apparently "**/test/**"
will match the file test, but "/test/**" doesn't! Weird. Apparently
the same goes for FishEye, if I put "**/" at the beginning of the
pattern, it does match the file.

Now, getting back to your use case: "reserving a namespace for future
use" (i.e. for now we don't know whether "iota" will ever be a file or
a directory, but in any case we don't want anyone to put anything
there). To me it sounds like a very special use case. It seems to be
something specific for authorization syntaxes, but much less
applicable to searching existing filesystems (like glob patterns for
shells or tools like FishEye). So maybe it's not such a good idea to
look at those tools for inspiration anyway :-).

Doug: do you think this is a common use case? Do other authorization
systems offer this functionality in an easily configurable manner? I
accept this is a valid use case, but it's not one that I would think
of using (wearing my hat of svn admin) -- I focus on authorization of
the existing files / directories.

It's a very common use case.  Think of it in terms of allocating all release
branches to the release team.  Or all Quality Assurance tags to the QA team.
 
Come to think of it: if reserving a namespace for future use, and
"/iota" doesn't exist yet, can't you just block the name "/iota"
without glob pattern? It doesn't exist anyway, so if you'd like to
create some subtree under it, you first have to create /iota, right?

There are 2 problems with this:

1. You're not trying to block the name "/iota"; you're giving the right
team (and nobody else) the privileges to create it.

2. The "**" operator is very special in that it does a "direct match" of
everything at or below the path.  That "direct match", in terms of wildcards,
means that there is no "recursing upwards" to find a parent rule.  It's
matched immediately.

Consider multiple repositories in an organization (perhaps they have code
that cannot go to vendors with which they share some repos so they cannot
keep all of their projects within a single repo - or similar use case).  A global
policy would have identical rules for all repositories.  They can't know when
or if some subset of the repositories have the specific artifact or not.
It would be nice/handy/convenient if a single rule could do the reservation
rather than a pair.

Now, in the end, I don't want this issue to be blocked forever :-). I
think in practice the confusion will be minimal, because either the
administrator knows what kind of item "iota" is (a file or a
directory), or the item doesn't exist yet and he'll be doing the
"reserve namespace" use case. So for me it's fine if "/iota/**"
effectively matches both the "directory iota and its subtree" and "the
file iota". As long as it's documented that way then :-).

My document does that since that is the way that the branched
implementation for SVN 1.8 and 1.9 works today.
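
As a rough illustration of the semantics discussed above (not the actual
Subversion implementation), here is a minimal Python sketch of a segment-wise
matcher in which "*" matches exactly one path segment and "**" matches zero or
more. The names split_segments and glob_match are hypothetical, and partial
wildcards inside a segment (such as "io*") are deliberately not handled.

```python
def split_segments(path):
    """Split '/a/b' into ['a', 'b']; the root '/' becomes []."""
    return [seg for seg in path.split("/") if seg]


def glob_match(pattern, path):
    """Segment-wise match: '*' is exactly one segment, '**' is zero or more.

    Assumption for this sketch: wildcards only ever stand for whole segments.
    """
    pat = split_segments(pattern)
    segs = split_segments(path)

    def match(i, j):
        # i indexes the pattern segments, j indexes the path segments.
        if i == len(pat):
            return j == len(segs)
        if pat[i] == "**":
            # Let '**' absorb 0, 1, 2, ... of the remaining path segments.
            return any(match(i + 1, k) for k in range(j, len(segs) + 1))
        if j == len(segs):
            return False
        if pat[i] == "*" or pat[i] == segs[j]:
            return match(i + 1, j + 1)
        return False

    return match(0, 0)


if __name__ == "__main__":
    print(glob_match("/iota/**", "/iota"))       # True: '**' matches zero segments
    print(glob_match("/iota/**", "/iota/a/b"))   # True: the subtree below iota
    print(glob_match("/iota/**", "/iotab"))      # False: different segment
```

Under these assumptions "/iota/**" covers the node /iota itself, whether it
turns out to be a file or a directory, plus everything below it, which is the
"reserve namespace" reading. A matcher that required "**" to consume at least
one segment would give the other reading, where the file /iota is not matched.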

If Daniel insists, I'm fine with using "/***" as well, if we want to
have this special "reserve namespace" meaning.

If so then we'll need to make sure to document the required changes for
our users who are using the feature now.  It's not a big deal but will be
critical when our users upgrade to SVN 1.10.  So I'll continue to watch
this space carefully.

Thank you.

Doug

[1] Steps to reproduce with ant (you'll need ant and java):
* Create a file build.xml with this content:
[[[
<project name="test" default="test" basedir=".">
    <target name="test">
        <echo>Concatenating:</echo>
        <concat>
            <fileset dir="." includes="test/**,dir/**"/>
        </concat>
    </target>
</project>
]]]

* Create a file "test" with some content, and a directory "dir" with
another file with other content below it.

* Run "ant". You'll see that the content of "test" is not catted.

* If you change the includes pattern to "**/test/**,dir/**", the file
is effectively catted
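
For anyone who wants to repeat the steps above without creating the files by
hand, here is a small Python sketch that builds the same layout. The build.xml
content is copied from the steps above; the function name make_fixture, the
file name "inner.txt" and the file contents are assumptions made up for this
example. It only creates the files; running "ant" is still a manual step.

```python
import tempfile
from pathlib import Path

# build.xml as given in the reproduction steps.
BUILD_XML = """<project name="test" default="test" basedir=".">
    <target name="test">
        <echo>Concatenating:</echo>
        <concat>
            <fileset dir="." includes="test/**,dir/**"/>
        </concat>
    </target>
</project>
"""


def make_fixture(root):
    """Create build.xml, a file named 'test', and a directory 'dir' with one file."""
    root = Path(root)
    (root / "build.xml").write_text(BUILD_XML)
    # 'test' is a plain file, which is the interesting case for "test/**".
    (root / "test").write_text("content of the file named test\n")
    (root / "dir").mkdir()
    # 'inner.txt' is a hypothetical name; any file below 'dir' will do.
    (root / "dir" / "inner.txt").write_text("content of a file below dir\n")
    return root


if __name__ == "__main__":
    fixture = make_fixture(tempfile.mkdtemp())
    print("Fixture created in", fixture)
```

To try the second variant, edit the includes attribute in the generated
build.xml to "**/test/**,dir/**" and run "ant" again in the same directory.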

--
Johan



--
DOUGLAS B ROBINSON SENIOR PRODUCT MANAGER

