[PATCH] svnadmin: Print LOCK_PATH's correctly.

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
5 messages Options
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

[PATCH] svnadmin: Print LOCK_PATH's correctly.

Daniel Shahaf-2
Patch + log message attached.  It changes the expected output, and I'm not
sure how strictly compatible we want to be about that, hence posting here
first.

If I don't hear anything I'll go ahead and commit.

(It seems that the code being changed is passing an APR-encoded path to
a printing function that expects a UTF-8 argument?  This might even be
user-visible as mangled output in non-UTF-8 environments, I suppose,
but I don't have such an environment to test with.)

Cheers,

Daniel

svnadmin-lock-path-printing-v1.diff (6K) Download Attachment
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [PATCH] svnadmin: Print LOCK_PATH's correctly.

Julian Foad-5

On 27/05/17 17:23, Daniel Shahaf wrote:

> Patch + log message attached.  It changes the expected output, and I'm not
> sure how strictly compatible we want to be about that, hence posting here
> first.
>
> If I don't hear anything I'll go ahead and commit.
>
> (It seems that the code being changed is passing an APR-encoded path to
> a printing function that expects a UTF-8 argument?  This might even be
> user-visible as mangled output in non-UTF-8 environments, I suppose,
> but I don't have such an environment to test with.)

Hi Daniel.

Your patch looks correct, as far as it goes. The 'canonical path' change
(where the main effect is it adds a leading slash) is OK as it makes for
greater consistency. The conversion to UTF-8 is also correct, but this
makes me notice the following issue.

It looks like lots of the argument parsing in 'svnadmin' lacks UTF-8
conversion. I think this should be done in the function 'parse_args'
instead of everywhere else. There are also a few places where it is
missing in our other command-line programs.

To avoid more instances of the same error in the future, I suggest that
rather than continuing to use this pattern:

   const char *arg, *arg_utf8;
   arg = os->argv[os->ind++];
   svn_utf_cstring_to_utf8(&arg_utf8, arg);
   do_stuff(arg_utf8);
   do_more_stuff(arg);  /* BUG: should be arg_utf8 */

we could use this pattern which is more robust:

   const char *arg;
   svn_utf_cstring_to_utf8(&arg, os->argv[os->ind++]);
   do_stuff(arg);
   do_more_stuff(arg);

I see some other instances of this same problem elsewhere in the
codebase, following uses of svn_opt_parse_num_args() and
svn_opt_parse_all_args() which are two more functions that don't convert
the command-line arguments to UTF-8 before returning them, leaving their
callers to do it (wrongly).

I also noticed the 'unlock' regression tests use patterns ending with
"unlocked." which looks like the dot was intended to match literally but
actually matches a space, as it is interpreted as a regex and the output
looks like "unlocked by user 'xxx'."

You could commit your patch anyway, or switch it to the more robust
pattern first. I might follow up with these other issues if no-one else
volunteers.

- Julian
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [PATCH] svnadmin: Print LOCK_PATH's correctly.

Daniel Shahaf-2
Julian Foad wrote on Tue, 30 May 2017 15:19 +0100:
> It looks like lots of the argument parsing in 'svnadmin' lacks UTF-8
> conversion. I think this should be done in the function 'parse_args'
> instead of everywhere else.

+1: transcoding is an aspect of argv parsing, not of the business logic.
The subcommand_*() functions should be passed UTF-8 arguments.

> There are also a few places where it is
> missing in our other command-line programs.
>
> we could use this pattern which is more robust:
>
>    const char *arg;
>    svn_utf_cstring_to_utf8(&arg, os->argv[os->ind++]);
>    do_stuff(arg);
>    do_more_stuff(arg);

+1 to this idiom as well.  In general, the non-UTF-8 variables should
have as little visibility as possible.  (And if we ever have to store one
of them in a named variable, we ought to name that variable to reflect
its value's encoding --- i.e., Hungarian notation.)

> I also noticed the 'unlock' regression tests use patterns ending with
> "unlocked." which looks like the dot was intended to match literally but
> actually matches a space, as it is interpreted as a regex and the output
> looks like "unlocked by user 'xxx'."

I think the output didn't always include the "by user 'xxx'" portion, so the
regex used to match the whole line.  I'll extend the expectations to match
the username.

> You could commit your patch anyway, or switch it to the more robust
> pattern first. I might follow up with these other issues if no-one else
> volunteers.
>

Thanks for the review, Julian.  I'll commit the patch as-is when I have
a moment, and probably add something to the release notes about the
output change as well.  (In case people are regex-matching the output...)
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [PATCH] svnadmin: Print LOCK_PATH's correctly.

Daniel Shahaf-2
Daniel Shahaf wrote on Tue, 30 May 2017 22:07 +0000:
> Thanks for the review, Julian.  I'll commit the patch as-is when I have
> a moment, and probably add something to the release notes about the
> output change as well.  (In case people are regex-matching the output...)

Done in r1797222, r1797223, r1797226.  I'd have nominated the former for
backport too (it depends on r1796426), but ran out of time sorting out the
test suite's expectation regarding leading slashes.
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [PATCH] svnadmin: Print LOCK_PATH's correctly.

Julian Foad-5
On 2017-05-31, Daniel Shahaf wrote:
> Daniel Shahaf wrote on Tue, 30 May 2017 22:07 +0000:
>> Thanks for the review, Julian.  I'll commit the patch as-is when I have
>> a moment, and probably add something to the release notes about the
>> output change as well.  (In case people are regex-matching the output...)
>
> Done in r1797222, r1797223, r1797226.

Ah, you mean r1797122, r1797123, r1797126.

I made a few follow-up fixes to UTF-8 conversions in other places, in
subsequent revisions.

> I'd have nominated the former for
> backport too (it depends on r1796426), but ran out of time sorting out the
> test suite's expectation regarding leading slashes.

I don't see this as important to backport, just a nice-to-have.

- Julian
Loading...