[swig-py3][patch] interfacing bytes object instead of str

classic Classic list List threaded Threaded
26 messages Options
12
Reply | Threaded
Open this post in threaded view
|

[swig-py3][patch] interfacing bytes object instead of str

Yasuhito FUTATSUKI
Hi,

I've read BRANCH-README on swig-py3 branch and thead starting from
https://mail-archives.apache.org/mod_mbox/subversion-dev/201803.mbox/%3c20180323081026.cqgbhdzfxv2lwm24@....local2%3e

so I've tried to fix it.

Here is a patch against typemap file core.i, try to change
svn_stream_read*() and svn_stream_write() interface to use
bytes object.

As a result of this, it is also need to modify Stream.read(),
and perhaps test for file contents also need to modify.
(It is expected that the result object read from file is
a bytes object, isn't it?)

At least after patched version returns OK for "make check-swig-py"
on both of Python 2.7 and Python 3.7.

diff --git a/subversion/bindings/swig/core.i b/subversion/bindings/swig/core.i
--- a/subversion/bindings/swig/core.i
+++ b/subversion/bindings/swig/core.i
@@ -420,7 +422,7 @@
 
  #ifdef SWIGPYTHON
  %typemap(argout) (char *buffer, apr_size_t *len) {
-  %append_output(PyStr_FromStringAndSize($1, *$2));
+  %append_output(PyBytes_FromStringAndSize($1, *$2));
    free($1);
  }
  #endif
@@ -447,7 +449,9 @@
                          "expecting a string for the buffer");
          SWIG_fail;
      }
-    $1 = PyStr_AsUTF8AndSize($input, &temp);
+    if (PyBytes_AsStringAndSize($input, &$1, &temp) == -1) {
+        SWIG_fail;
+    }
      $2 = ($2_ltype)&temp;
  }
  #endif
diff --git a/subversion/bindings/swig/python/svn/core.py b/subversion/bindings/swig/python/svn/core.py
--- a/subversion/bindings/swig/python/svn/core.py
+++ b/subversion/bindings/swig/python/svn/core.py
@@ -185,7 +185,7 @@ class Stream:
          if not data:
            break
          chunks.append(data)
-      return ''.join(chunks)
+      return b''.join(chunks)
 
      # read the amount specified
      return svn_stream_read(self._stream, int(amt))
diff --git a/subversion/bindings/swig/python/tests/trac/versioncontrol/tests/svn_fs.py b/subversion/bindings/swig/python/tests/trac/versioncontrol/tests/svn_fs.py
--- a/subversion/bindings/swig/python/tests/trac/versioncontrol/tests/svn_fs.py
+++ b/subversion/bindings/swig/python/tests/trac/versioncontrol/tests/svn_fs.py
@@ -173,7 +173,7 @@ class SubversionRepositoryTestCase(unittest.TestCase):
          node = self.repos.get_node('/trunk/README.txt')
          self.assertEqual(8, node.content_length)
          self.assertEqual('text/plain', node.content_type)
-        self.assertEqual('A test.\n', node.get_content().read())
+        self.assertEqual(b'A test.\n', node.get_content().read())
 
      def test_get_dir_properties(self):
          f = self.repos.get_node('/trunk')

--
Yasuhito FUTATSUKI
Reply | Threaded
Open this post in threaded view
|

Re: [swig-py3][patch] interfacing bytes object instead of str

Yasuhito FUTATSUKI
On 11/22/18 3:22 AM, Yasuhito FUTATSUKI wrote:
> Here is a patch against typemap file core.i, try to change
> svn_stream_read*() and svn_stream_write() interface to use
> bytes object.

Even with this patch, svn_stream_readline() doesn't work well.
It claims 2nd argment is not bytes object, and returns str object.

I think it is better that svn_stream_readlin() returns bytes too.

I'll try to fix, though I'm not sure I can.
--
Yasuhito FUTATSUKI
Reply | Threaded
Open this post in threaded view
|

Re: [swig-py3][patch] interfacing bytes object instead of str

Daniel Shahaf-2
Yasuhito FUTATSUKI wrote on Thu, 22 Nov 2018 08:05 +0900:

> On 11/22/18 3:22 AM, Yasuhito FUTATSUKI wrote:
> > Here is a patch against typemap file core.i, try to change
> > svn_stream_read*() and svn_stream_write() interface to use
> > bytes object.
>
> Even with this patch, svn_stream_readline() doesn't work well.
> It claims 2nd argment is not bytes object, and returns str object.
>
> I think it is better that svn_stream_readlin() returns bytes too.
>

+1.  That function should return bytes, and as no swig-py consumers are
written in py3 there are no compatibility concerns preventing us from
making it so.

> I'll try to fix, though I'm not sure I can.

You're welcome to ask here or on IRC (#svn-dev on freenode) if you
need pointers.

Thanks for looking into it.

Cheers,

Daniel
Reply | Threaded
Open this post in threaded view
|

Re: [swig-py3][patch] interfacing bytes object instead of str

Yasuhito FUTATSUKI
Thank you to respond to me.

It seems svn_stream_readline() works as I expected, eol argment accept
bytes, and returns bytes. (the patch below contains fix for
svn_stream_write() which is incomplete in my previous patch)

On 11/22/18 9:57 AM, Daniel Shahaf wrote:
> Yasuhito FUTATSUKI wrote on Thu, 22 Nov 2018 08:05 +0900:

>> Even with this patch, svn_stream_readline() doesn't work well.
>> It claims 2nd argment is not bytes object, and returns str object.
>>
>> I think it is better that svn_stream_readlin() returns bytes too.
>>
>
> +1.  That function should return bytes, and as no swig-py consumers are
> written in py3 there are no compatibility concerns preventing us from
> making it so.

Now ViewVC (un-official branch) works with swig-py on py3 :-)
(https://github.com/futatuki/viewvc/tree/python3_support_with_altsvn)


By the way, while I've read headers, I found some other APIs handle
'eol' argment, in svn_subst.h and svn_diff.h. I think some of them
might need to handle some argment as bytes, because str on py3 normalize
eol style.


diff --git a/subversion/bindings/swig/core.i b/subversion/bindings/swig/core.i
index be5434dc20..0af5faa4ee 100644
--- a/subversion/bindings/swig/core.i
+++ b/subversion/bindings/swig/core.i
@@ -444,9 +444,9 @@
  */
  #ifdef SWIGPYTHON
  %typemap(in) (const char *data, apr_size_t *len) ($*2_type temp) {
-    if (!PyStr_Check($input)) {
+    if (!PyBytes_Check($input)) {
          PyErr_SetString(PyExc_TypeError,
-                        "expecting a string for the buffer");
+                        "expecting a bytes for the buffer");
          SWIG_fail;
      }
      if (PyBytes_AsStringAndSize($input, &$1, &temp) == -1) {
@@ -488,6 +488,39 @@
  }
  #endif
 
+/* -----------------------------------------------------------------------
+   fix up the svn_stream_readline() eol argument
+*/
+#ifdef SWIGPYTHON
+%typemap(in) (const char *eol) {
+    if (!PyBytes_Check($input)) {
+        PyErr_SetString(PyExc_TypeError,
+                        "expecting a bytes for the eol");
+        SWIG_fail;
+    }
+    $1 = PyBytes_AsString($input);
+}
+#endif
+
+/* -----------------------------------------------------------------------
+   fix up the svn_stream_readline() return value
+*/
+#ifdef SWIGPYTHON
+%typemap(argout) (svn_stringbuf_t **stringbuf, const char *eol) {
+    PyObject *s;
+    if (*$1 == NULL) {
+        Py_INCREF(Py_None);
+        s = Py_None;
+    }
+    else {
+        s = PyBytes_FromStringAndSize((*$1)->data, (*$1)->len);
+        if (s == NULL)
+            SWIG_fail;
+    }
+    %append_output(s);
+}
+#endif
+
  /* -----------------------------------------------------------------------
     auth parameter set/get
  */

--
Yasuhito FUTATSUKI
Reply | Threaded
Open this post in threaded view
|

Re: [swig-py3][patch] interfacing bytes object instead of str

Daniel Shahaf-2
Yasuhito FUTATSUKI wrote on Thu, 22 Nov 2018 20:16 +0900:
> Now ViewVC (un-official branch) works with swig-py on py3 :-)
> (https://github.com/futatuki/viewvc/tree/python3_support_with_altsvn)

Ah, so _that's_ why Mike asked about py3 support the other day :-)

> By the way, while I've read headers, I found some other APIs handle
> 'eol' argment, in svn_subst.h and svn_diff.h. I think some of them
> might need to handle some argment as bytes, because str on py3 normalize
> eol style.

Thanks for the patch.  I'm afraid I'm a bit disoriented, though; could
you kindly clarify a few things?

Is the patch destined for trunk, for branches/swig-py3@HEAD, or for
branches/swig-py3 after a sync-merge from trunk?  Does it replace your
previous patch, or are the two patches meant to be applied on top of
each other?  How do we test the patch to see the problem that it
corrects --- I assume that we should just run check-swig-py in a py3
build and see that it has fewer failures with the patch than without; is
that correct?

Thanks,

Daniel
Reply | Threaded
Open this post in threaded view
|

Re: [swig-py3][patch] interfacing bytes object instead of str

Yasuhito FUTATSUKI
On 11/23/18 11:21 AM, Daniel Shahaf wrote:
> Thanks for the patch.  I'm afraid I'm a bit disoriented, though; could
> you kindly clarify a few things?
>
> Is the patch destined for trunk, for branches/swig-py3@HEAD, or for
> branches/swig-py3 after a sync-merge from trunk?

It's destined for branches/swig-py3@HEAD, but actually I did sync-merge
from trunk@1846855 and tag 1.11.0, then I confirm via diff summary that
the change in my patches doesn't affect merge.
(I'm sorry, I did it on git, https://github.com/futatuki/subversion/tree/swig-py3)

>                                                   Does it replace your
> previous patch, or are the two patches meant to be applied on top of
> each other?

This is the latter.  My secondary patch doesn't replace previous patch,
but assuming first patch applied.


>              How do we test the patch to see the problem that it
> corrects --- I assume that we should just run check-swig-py in a py3
> build and see that it has fewer failures with the patch than without; is
> that correct?

It is partly correct, because my patch may affect some place where curent
test doesn't check.  Actually I run check-swig-py before apply secondary
patch, it returns ok without failure, so I think it doen't test
svn_stream_write() and svn_stream_readline(), though I'm not see whole
of the test.

So I did also bulding subversion/bindings/swig/python/
both of before apply patch and after apply, then diff them, to make sure
the patch doesn't affect other functions other than functions related
to svn_stream_read*() and svn_stream_write().

(I confess I made a diff only for subversion/bindings/swig/python/core.c
on my previous post, and now I found the patches affects
subversion/bindings/swig/python/core.py and
subversion/bindings/swig/python/libsvn/core.py, with incorrect function
annotation for svn_stream_readline() and svn_stream_invoke_readline_fn().)

Of course, I think it is not sufficient and I think it need tests for
svn_stream_readline() and svn_stream_write().

I ran ViewVC as a test for svn_stream_readline(), with branch merged
branch swig-py3 and 1.11.0 tree (for my own convenient),
but I know it is not appropriate here, and it isn't use svn_stream_write().

Thanks,

--
Yasuhito FUTATSUKI
Reply | Threaded
Open this post in threaded view
|

Re: [swig-py3][patch] interfacing bytes object instead of str

Daniel Shahaf-2
Yasuhito FUTATSUKI wrote on Fri, Nov 23, 2018 at 16:43:24 +0900:
> On 11/23/18 11:21 AM, Daniel Shahaf wrote:
> > Thanks for the patch.  I'm afraid I'm a bit disoriented, though; could
> > you kindly clarify a few things?
> >
> > Is the patch destined for trunk, for branches/swig-py3@HEAD, or for
> > branches/swig-py3 after a sync-merge from trunk?
>
> It's destined for branches/swig-py3@HEAD,

Okay.

> but actually I did sync-merge from trunk@1846855 and tag 1.11.0, then I
> confirm via diff summary that the change in my patches doesn't affect merge.

I see, but note that even if the changes don't conflict textually, they
might still interact in other ways.  In this example, one of the
revisions to be merged, r1824410, adds a call to svn_diff_file_output_unified4(),
which takes an «svn_stream_t *output_stream» parameter.

> (I'm sorry, I did it on git, https://github.com/futatuki/subversion/tree/swig-py3)
>

Well, until INFRA-7524 gets implemented, we'll just have to get you
commit access to ^/subversion/branches ☺

> >                                                   Does it replace your
> > previous patch, or are the two patches meant to be applied on top of
> > each other?
>
> This is the latter.  My secondary patch doesn't replace previous patch,
> but assuming first patch applied.
>

Okay.  Generally we prefer a second iteration of the patch to be sent as
a diff against HEAD, in order for the submissions to be self-contained.

> >              How do we test the patch to see the problem that it
> > corrects --- I assume that we should just run check-swig-py in a py3
> > build and see that it has fewer failures with the patch than without; is
> > that correct?
>
> It is partly correct, because my patch may affect some place where curent
> test doesn't check.  Actually I run check-swig-py before apply secondary
> patch, it returns ok without failure, so I think it doen't test
> svn_stream_write() and svn_stream_readline(), though I'm not see whole
> of the test.
>

'check-swig-py3' is the entire test suite, as far as I know.

> So I did also bulding subversion/bindings/swig/python/
> both of before apply patch and after apply, then diff them, to make sure
> the patch doesn't affect other functions other than functions related
> to svn_stream_read*() and svn_stream_write().
>

Good idea.

> (I confess I made a diff only for subversion/bindings/swig/python/core.c
> on my previous post, and now I found the patches affects
> subversion/bindings/swig/python/core.py and
> subversion/bindings/swig/python/libsvn/core.py, with incorrect function
> annotation for svn_stream_readline() and svn_stream_invoke_readline_fn().)

> Of course, I think it is not sufficient and I think it need tests for
> svn_stream_readline() and svn_stream_write().
>

Yes, please.  Tests would help us review the change.  You can use
svn_stream_empty() or svn_stream_from_stringbuf() for the tests.

> I ran ViewVC as a test for svn_stream_readline(), with branch merged
> branch swig-py3 and 1.11.0 tree (for my own convenient),
> but I know it is not appropriate here, and it isn't use svn_stream_write().

Yes, running ViewVC is a good smoke test, though one that doesn't
necessarily have good coverage of the relevant interfaces.  Then again,
I'm not sure that check-swig-py has good coverage either…

Cheers,

Daniel
Reply | Threaded
Open this post in threaded view
|

Re: [swig-py3][patch] interfacing bytes object instead of str

Yasuhito FUTATSUKI
I've revised typemaps and APIs using svn_stringbuf_t *, then I found
almost all those APIs include svn_stream_readline() use svn_stringbuf_t
for file contents.  So I've modified typemaps again so that those APIs
use bytes for svn_stringbuf_t interface.

The patch below destined for branches/swig-py affects those API wrappers.
-- affected wrapper functions --
_core(svn_utf.h):
svn_utf_stringbuf_to_utf8(src, pool)   bytes, apr_pool_t -> bytes
svn_utf_stringbuf_from_utf8(src, pool) bytes, apr_pool_t -> bytes
svn_utf_cstring_from_utf8_stringbuf(src, pool) bytes, apr_pool_t -> str

_core(svn_io.h):
svn_stringbuf_from_stream(stream, len, pool)
         svn_stream_t, int, apr_pool_t -> bytes
svn_stream_from_stringbuf(stringbuf, pool)
         bytes, apr_pool_t -> svn_stream_t
svn_stream_read_full(stream, len) svn_stream_t, int -> bytes
svn_stream_read2(stream, len) svn_stream_t, int -> bytes
svn_stream_read(stream, len) svn_stream_t, int -> bytes
svn_stream_write(stream, data) svn_stream_t, bytes -> int
svn_stream_readline(stream, eol) svn_stream_t, bytes -> [bytes, int(bool)]
svn_stringbuf_from_file2(filename, pool) str, apr_pool_t -> bytes
svn_stringbuf_from_file(filename, pool) str, apr_pool_t -> bytes
svn_stringbuf_from_aprfile(file, pool) apr_file_t, apr_pool_t -> bytes
svn_io_file_readline(file, max_len, result_pool, scratch_pool)
         apr_file_t, int, apr_pool_t, apr_pool_t -> bytes, bytes, int(bool)
svn_read_invoke_fn(_obj, baton, buffer) svn_read_fn_t, object, int -> bytes
svn_write_invoke_fn(_obj, baton, data) svn_write_fn_t, object, bytes -> int
svn_stream_invoke_readline_fn(_obj, baton, eol, pool)
         svn_stream_readline_fn_t, object, bytes, apr_pool_t
         -> [bytes, int(bool)]

_diff(svn_diff.h):
svn_diff_hunk_readline_diff_text(hunk, pool)
         svn_diff_hunk_t, apr_pool_t -> [bytes, bytes]
svn_diff_hunk_readline_original_text(hunk, pool)
         svn_diff_hunk_t, apr_pool_t -> [bytes, bytes]
svn_diff_hunk_readline_modified_text(hunk, pool)
         svn_diff_hunk_t, apr_pool_t -> [bytes, bytes]
-- end affected wrapper functions --

svn_fs_print_modules, svn_ra_print_modules, and svn_print_ra_libraries
are also uses svn_stringbuf_t, but as it seems to be preferred to stay
using str, they are excluded from changes.

For svn_stream_functions, there wasn't unit test in check-swig-py, so
I've added them into subversion/bindings/swig/python/tests/core.py.
(The patch also include it)

diff --git a/subversion/bindings/swig/core.i b/subversion/bindings/swig/core.i
index 7e250e5519..281ec14982 100644
--- a/subversion/bindings/swig/core.i
+++ b/subversion/bindings/swig/core.i
@@ -130,6 +130,8 @@
  /* svn_stream_checksummed would require special attention to wrap, because
   * of the read_digest and write_digest parameters. */
  %ignore svn_stream_checksummed;
+// svn_stream_read_full
+// svn_stream_read2
  // svn_stream_read
  // svn_stream_write
  // svn_stream_close
@@ -420,7 +422,7 @@
 
  #ifdef SWIGPYTHON
  %typemap(argout) (char *buffer, apr_size_t *len) {
-  %append_output(PyStr_FromStringAndSize($1, *$2));
+  %append_output(PyBytes_FromStringAndSize($1, *$2));
    free($1);
  }
  #endif
@@ -442,12 +444,14 @@
  */
  #ifdef SWIGPYTHON
  %typemap(in) (const char *data, apr_size_t *len) ($*2_type temp) {
-    if (!PyStr_Check($input)) {
+    if (!PyBytes_Check($input)) {
          PyErr_SetString(PyExc_TypeError,
-                        "expecting a string for the buffer");
+                        "expecting a bytes for the buffer");
+        SWIG_fail;
+    }
+    if (PyBytes_AsStringAndSize($input, &$1, &temp) == -1) {
          SWIG_fail;
      }
-    $1 = PyStr_AsUTF8AndSize($input, &temp);
      $2 = ($2_ltype)&temp;
  }
  #endif
@@ -484,6 +488,20 @@
  }
  #endif
 
+/* -----------------------------------------------------------------------
+   fix up the svn_stream_readline() eol argument
+*/
+#ifdef SWIGPYTHON
+%typemap(in) (const char *eol) {
+    if (!PyBytes_Check($input)) {
+        PyErr_SetString(PyExc_TypeError,
+                        "expecting a bytes for the eol");
+        SWIG_fail;
+    }
+    $1 = PyBytes_AsString($input);
+}
+#endif
+
  /* -----------------------------------------------------------------------
     auth parameter set/get
  */
diff --git a/subversion/bindings/swig/include/svn_string.swg b/subversion/bindings/swig/include/svn_string.swg
index 8cb9b0c544..5ac0eef50a 100644
--- a/subversion/bindings/swig/include/svn_string.swg
+++ b/subversion/bindings/swig/include/svn_string.swg
@@ -28,11 +28,13 @@ typedef struct svn_string_t svn_string_t;
 
  /* -----------------------------------------------------------------------
     generic OUT param typemap for svn_string(buf)_t. we can share these
-   because we only refer to the ->data and ->len values.
+   except in case of Python, because we only refer to the ->data
+   and ->len values. in case of Python, svn_stringbut_t represents
+   raw bytes and svn_string_t represents string in most case, so it is
+   convenient to distinguish them.
  */
  #ifdef SWIGPYTHON
-
-%typemap(argout) RET_STRING {
+%typemap(argout) svn_string_t ** {
      PyObject *s;
      if (*$1 == NULL) {
          Py_INCREF(Py_None);
@@ -45,6 +47,19 @@ typedef struct svn_string_t svn_string_t;
      }
      %append_output(s);
  }
+%typemap(argout) svn_stringbuf_t ** {
+    PyObject *s;
+    if (*$1 == NULL) {
+        Py_INCREF(Py_None);
+        s = Py_None;
+    }
+    else {
+        s = PyBytes_FromStringAndSize((*$1)->data, (*$1)->len);
+        if (s == NULL)
+            SWIG_fail;
+    }
+    %append_output(s);
+}
  #endif
  #ifdef SWIGPERL
  %typemap(argout) RET_STRING {
@@ -65,10 +80,12 @@ typedef struct svn_string_t svn_string_t;
  }
  #endif
 
+#ifndef SWIGPYTHON
  %apply RET_STRING {
    svn_string_t **,
    svn_stringbuf_t **
  };
+#endif
 
  /* -----------------------------------------------------------------------
     TYPE: svn_stringbuf_t
@@ -76,6 +93,22 @@ typedef struct svn_string_t svn_string_t;
 
  #ifdef SWIGPYTHON
  %typemap(in) svn_stringbuf_t * {
+    if (!PyBytes_Check($input)) {
+        PyErr_SetString(PyExc_TypeError, "not a bytes object");
+        SWIG_fail;
+    }
+    {
+      Py_ssize_t strBufLen;
+      const char *strBufChar;
+      if (-1 == PyBytes_AsStringAndSize($input, &strBufChar, &strBufLen)) {
+        SWIG_fail;
+      }
+      $1 = svn_stringbuf_ncreate(strBufChar, strBufLen,
+                                 /* ### gah... what pool to use? */
+                                 _global_pool);
+    }
+}
+%typemap(in) svn_stringbuf_t *output {
      if (!PyStr_Check($input)) {
          PyErr_SetString(PyExc_TypeError, "not a string");
          SWIG_fail;
@@ -264,6 +297,19 @@ typedef struct svn_string_t svn_string_t;
      }
      %append_output(s);
  }
+%typemap(argout) const char **eol {
+    PyObject *s;
+    if (*$1 == NULL) {
+        Py_INCREF(Py_None);
+        s = Py_None;
+    }
+    else {
+        s = PyBytes_FromString(*$1);
+        if (s == NULL)
+            SWIG_fail;
+    }
+    %append_output(s);
+}
  #endif
 
  #ifdef SWIGPERL
diff --git a/subversion/bindings/swig/python/svn/core.py b/subversion/bindings/swig/python/svn/core.py
index 11713c8def..7f346c1b6c 100644
--- a/subversion/bindings/swig/python/svn/core.py
+++ b/subversion/bindings/swig/python/svn/core.py
@@ -185,7 +185,7 @@ class Stream:
          if not data:
            break
          chunks.append(data)
-      return ''.join(chunks)
+      return b''.join(chunks)
 
      # read the amount specified
      return svn_stream_read(self._stream, int(amt))
diff --git a/subversion/bindings/swig/python/tests/core.py b/subversion/bindings/swig/python/tests/core.py
index b41654357c..87d7f37e9b 100644
--- a/subversion/bindings/swig/python/tests/core.py
+++ b/subversion/bindings/swig/python/tests/core.py
@@ -18,7 +18,7 @@
  # under the License.
  #
  #
-import unittest
+import unittest, os, tempfile
 
  import svn.core, svn.client
  import utils
@@ -171,6 +171,126 @@ class SubversionCoreTestCase(unittest.TestCase):
      self.assertEqual(
        svn.core.svn_config_enumerate_sections2(cfg, enumerator), 1)
 
+  def test_stream_from_bufstring(self):
+    stream = svn.core.svn_stream_from_stringbuf(b'')
+    svn.core.svn_stream_close(stream)
+    del stream
+    with self.assertRaises(TypeError):
+        stream = svn.core.svn_stream_from_stringbuf(b''.decode())
+        svn.core.svn_stream_close(stream)
+        del stream
+
+  def test_stream_read_full(self):
+    in_str = (b'Python\x00'
+              b'\xa4\xd1\xa4\xa4\xa4\xbd\xa4\xf3\r\n'
+              b'Subversion\x00'
+              b'\xa4\xb5\xa4\xd6\xa4\xd0\xa1\xbc\xa4\xb8\xa4\xe7\xa4\xf3\n'
+              b'swig\x00'
+              b'\xa4\xb9\xa4\xa6\xa4\xa3\xa4\xb0\r'
+              b'end')
+    stream = svn.core.svn_stream_from_stringbuf(in_str)
+    self.assertEqual(svn.core.svn_stream_read_full(stream, 4096), in_str)
+    svn.core.svn_stream_seek(stream, None)
+    self.assertEqual(svn.core.svn_stream_read_full(stream, 10), in_str[0:10])
+    svn.core.svn_stream_seek(stream, None)
+    svn.core.svn_stream_skip(stream, 20)
+    self.assertEqual(svn.core.svn_stream_read_full(stream, 4096), in_str[20:])
+    self.assertEqual(svn.core.svn_stream_read_full(stream, 4096), b'')
+    svn.core.svn_stream_close(stream)
+    del stream
+
+  def test_stream_read2(self):
+    # as we can't create non block stream by using swig-py API directly,
+    # we only test svn_stream_read2() behaves just same as
+    # svn_stream_read_full()
+    in_str = (b'Python\x00'
+              b'\xa4\xd1\xa4\xa4\xa4\xbd\xa4\xf3\r\n'
+              b'Subversion\x00'
+              b'\xa4\xb5\xa4\xd6\xa4\xd0\xa1\xbc\xa4\xb8\xa4\xe7\xa4\xf3\n'
+              b'swig\x00'
+              b'\xa4\xb9\xa4\xa6\xa4\xa3\xa4\xb0\r'
+              b'end')
+    stream = svn.core.svn_stream_from_stringbuf(in_str)
+    self.assertEqual(svn.core.svn_stream_read2(stream, 4096), in_str)
+    svn.core.svn_stream_seek(stream, None)
+    self.assertEqual(svn.core.svn_stream_read2(stream, 10), in_str[0:10])
+    svn.core.svn_stream_seek(stream, None)
+    svn.core.svn_stream_skip(stream, 20)
+    self.assertEqual(svn.core.svn_stream_read2(stream, 4096), in_str[20:])
+    self.assertEqual(svn.core.svn_stream_read2(stream, 4096), b'')
+    svn.core.svn_stream_close(stream)
+    del stream
+
+  def test_stream_write_exception(self):
+    ostr_unicode = b'Python'.decode()
+    stream = svn.core.svn_stream_empty()
+    with self.assertRaises(TypeError):
+        svn.core.svn_stream_write(stream, ostr_unicode)
+    svn.core.svn_stream_close(stream)
+    del stream
+
+  def test_stream_write(self):
+    o1_str = b'Python\x00\xa4\xd1\xa4\xa4\xa4\xbd\xa4\xf3\r\n'
+    o2_str = (b'subVersioN\x00'
+              b'\xa4\xb5\xa4\xd6\xa4\xd0\xa1\xbc\xa4\xb8\xa4\xe7\xa4\xf3\n')
+    o3_str =  b'swig\x00\xa4\xb9\xa4\xa6\xa4\xa3\xa4\xb0\rend'
+    out_str = o1_str + o2_str + o3_str
+    rewrite_str = b'Subversion'
+    fd, fname = tempfile.mkstemp()
+    os.close(fd)
+    try:
+      stream = svn.core.svn_stream_from_aprfile2(fname, False)
+      self.assertEqual(svn.core.svn_stream_write(stream, out_str),
+                       len(out_str))
+      svn.core.svn_stream_seek(stream, None)
+      self.assertEqual(svn.core.svn_stream_read_full(stream, 4096), out_str)
+      svn.core.svn_stream_seek(stream, None)
+      svn.core.svn_stream_skip(stream, len(o1_str))
+      self.assertEqual(svn.core.svn_stream_write(stream, rewrite_str),
+                       len(rewrite_str))
+      svn.core.svn_stream_seek(stream, None)
+      self.assertEqual(
+                svn.core.svn_stream_read_full(stream, 4096),
+                o1_str + rewrite_str + o2_str[len(rewrite_str):] + o3_str)
+      svn.core.svn_stream_close(stream)
+      del stream
+    finally:
+      try:
+        os.remove(fname)
+      except OSError:
+        pass
+
+  def test_stream_readline(self):
+    o1_str = b'Python\t\xa4\xd1\xa4\xa4\xa4\xbd\xa4\xf3\r\n'
+    o2_str = (b'Subversion\t'
+              b'\xa4\xb5\xa4\xd6\xa4\xd0\xa1\xbc\xa4\xb8\xa4\xe7\xa4\xf3\n')
+    o3_str =  b'swig\t\xa4\xb9\xa4\xa6\xa4\xa3\xa4\xb0\rend'
+    in_str = o1_str + o2_str + o3_str
+    stream = svn.core.svn_stream_from_stringbuf(in_str)
+    self.assertEqual(svn.core.svn_stream_readline(stream, b'\n'),
+                     [o1_str[:-1], 0])
+    self.assertEqual(svn.core.svn_stream_readline(stream, b'\n'),
+                     [o2_str[:-1], 0])
+    self.assertEqual(svn.core.svn_stream_readline(stream, b'\n'),
+                     [o3_str, 1])
+    self.assertEqual(svn.core.svn_stream_readline(stream, b'\n'),
+                     [b'', 1])
+    svn.core.svn_stream_seek(stream, None)
+    self.assertEqual(svn.core.svn_stream_readline(stream, b'\r\n'),
+                     [o1_str[:-2], 0])
+    self.assertEqual(svn.core.svn_stream_readline(stream, b'\r\n'),
+                     [o2_str + o3_str, 1])
+    svn.core.svn_stream_write(stream, b'\r\n')
+    svn.core.svn_stream_seek(stream, None)
+    self.assertEqual(svn.core.svn_stream_readline(stream, b'\r\n'),
+                     [o1_str[:-2], 0])
+    self.assertEqual(svn.core.svn_stream_readline(stream, b'\r\n'),
+                     [o2_str + o3_str, 0])
+    self.assertEqual(svn.core.svn_stream_readline(stream, b'\r\n'),
+                     [b'', 1])
+    svn.core.svn_stream_close(stream)
+    del stream
+
  def suite():
      return unittest.defaultTestLoader.loadTestsFromTestCase(
        SubversionCoreTestCase)
diff --git a/subversion/bindings/swig/python/tests/trac/versioncontrol/tests/svn_fs.py b/subversion/bindings/swig/python/tests/trac/versioncontrol/tests/svn_fs.py
index a4c39c2783..1cafccc4bf 100644
--- a/subversion/bindings/swig/python/tests/trac/versioncontrol/tests/svn_fs.py
+++ b/subversion/bindings/swig/python/tests/trac/versioncontrol/tests/svn_fs.py
@@ -173,7 +173,7 @@ class SubversionRepositoryTestCase(unittest.TestCase):
          node = self.repos.get_node('/trunk/README.txt')
          self.assertEqual(8, node.content_length)
          self.assertEqual('text/plain', node.content_type)
-        self.assertEqual('A test.\n', node.get_content().read())
+        self.assertEqual(b'A test.\n', node.get_content().read())
 
      def test_get_dir_properties(self):
          f = self.repos.get_node('/trunk')
diff --git a/subversion/bindings/swig/svn_ra.i b/subversion/bindings/swig/svn_ra.i
index e1925f8a47..595aa1cdb1 100644
--- a/subversion/bindings/swig/svn_ra.i
+++ b/subversion/bindings/swig/svn_ra.i
@@ -75,6 +75,25 @@
  /* FIXME: svn_ra_callbacks_t ? */
  #endif
 
+#ifdef SWIGPYTHON
+/* -----------------------------------------------------------------------
+   handle svn_ra_print_ra_libraries().
+*/
+%typemap(argout) svn_stringbuf_t **descriptions {
+    PyObject *s;
+    if (*$1 == NULL) {
+        Py_INCREF(Py_None);
+        s = Py_None;
+    }
+    else {
+        s = PyStr_FromStringAndSize((*$1)->data, (*$1)->len);
+        if (s == NULL)
+            SWIG_fail;
+    }
+    %append_output(s);
+}
+#endif
+
  #ifdef SWIGPYTHON
  %callback_typemap(const svn_ra_reporter2_t *reporter, void *report_baton,
                    svn_swig_py_get_ra_reporter2(),

Regards,
--
Yasuhito FUTATSUKI
Reply | Threaded
Open this post in threaded view
|

Re: [swig-py3][patch] interfacing bytes object instead of str

Daniel Shahaf-2
Yasuhito FUTATSUKI wrote on Tue, Nov 27, 2018 at 06:50:46 +0900:
> I've revised typemaps and APIs using svn_stringbuf_t *, then I found
> almost all those APIs include svn_stream_readline() use svn_stringbuf_t
> for file contents.  So I've modified typemaps again so that those APIs
> use bytes for svn_stringbuf_t interface.
>
> The patch below destined for branches/swig-py affects those API wrappers.

Thanks for the patch, Yasuhito.  It looks good on a quick skim.  Troy, you've
worked on this branch before; would you perchance be able to review this?
(If you can, great; but no worries if you can't)
Reply | Threaded
Open this post in threaded view
|

Re: [swig-py3][patch] interfacing bytes object instead of str

Troy Curtis Jr-2


On Wed, Nov 28, 2018, 10:40 AM Daniel Shahaf <[hidden email] wrote:
Yasuhito FUTATSUKI wrote on Tue, Nov 27, 2018 at 06:50:46 +0900:
> I've revised typemaps and APIs using svn_stringbuf_t *, then I found
> almost all those APIs include svn_stream_readline() use svn_stringbuf_t
> for file contents.  So I've modified typemaps again so that those APIs
> use bytes for svn_stringbuf_t interface.
>
> The patch below destined for branches/swig-py affects those API wrappers.

Thanks for the patch, Yasuhito.  It looks good on a quick skim.  Troy, you've
worked on this branch before; would you perchance be able to review this?
(If you can, great; but no worries if you can't)

I'll definitely try to do so. I had even starred this patch to review already, but obviously haven't actually looked over it yet. ;)

Troy
Reply | Threaded
Open this post in threaded view
|

Re: [swig-py3][patch] interfacing bytes object instead of str

Troy Curtis Jr-2
In reply to this post by Daniel Shahaf-2
On Wed, Nov 28, 2018 at 10:40 AM Daniel Shahaf <[hidden email]> wrote:
Yasuhito FUTATSUKI wrote on Tue, Nov 27, 2018 at 06:50:46 +0900:
> I've revised typemaps and APIs using svn_stringbuf_t *, then I found
> almost all those APIs include svn_stream_readline() use svn_stringbuf_t
> for file contents.  So I've modified typemaps again so that those APIs
> use bytes for svn_stringbuf_t interface.
>
> The patch below destined for branches/swig-py affects those API wrappers.

Thanks for the patch, Yasuhito.  It looks good on a quick skim.  Troy, you've
worked on this branch before; would you perchance be able to review this?
(If you can, great; but no worries if you can't)

Yes the patch is looking good, especially having the svn_stream Python APIs actually tested!  I updated the patch so that it works with the latest swig-py3 (@r1849027) and attached it.

But there was one item I wanted to talk about related to the patch.  I agree that "svn_stringbuf_t" => "bytes" and "svn_string_t" => "str" is the right general path, but after the patch there is now a "svn_stringbuf_t *output" typemap that is still using "PyStr" instead of "PyBytes".  Further, the only usage of "svn_stringbuf_t *output" is in "svn_fs_print_modules" and "svn_ra_print_modules", where the function parameter is actually an output parameter, so the appropriate typemap is probably "argout" instead of "in".

Otherwise, the patch LGTM.  Yasuhito, if you want to review my changes to your patch and perhaps address the issue in the last paragraph then I can commit your patch to swig-py3. I'll also start working more on the swig-py3 branch to get it to the point that we can actually get this branch merged into trunk finally.

Troy

yasuhito-swig-py3-revised.patch (20K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [swig-py3][patch] interfacing bytes object instead of str

Daniel Shahaf-2
Troy Curtis Jr wrote on Sun, 16 Dec 2018 09:59 -0500:
> But there was one item I wanted to talk about related to the patch.  I
> agree that "svn_stringbuf_t" => "bytes" and "svn_string_t" => "str" is the
> right general path,

Are you sure about this?  Property values can be binary data (so not
representable as 'str') but are usually represented by «svn_string_t *»;
for example, svn_repos_parse_fns3_t::set_node_property().

> but after the patch there is now a "svn_stringbuf_t
> *output" typemap that is still using "PyStr" instead of "PyBytes".
> Further, the only usage of "svn_stringbuf_t *output" is in
> "svn_fs_print_modules" and "svn_ra_print_modules", where the function
> parameter is actually an output parameter, so the appropriate typemap is
> probably "argout" instead of "in".
>
> Otherwise, the patch LGTM.  Yasuhito, if you want to review my changes to
> your patch and perhaps address the issue in the last paragraph then I can
> commit your patch to swig-py3. I'll also start working more on the swig-py3
> branch to get it to the point that we can actually get this branch merged
> into trunk finally.

Great. :-)

Cheers,

Daniel
Reply | Threaded
Open this post in threaded view
|

Re: [swig-py3][patch] interfacing bytes object instead of str

Troy Curtis Jr-2

On Sun, Dec 16, 2018 at 11:28 AM Daniel Shahaf <[hidden email]> wrote:
Troy Curtis Jr wrote on Sun, 16 Dec 2018 09:59 -0500:
> But there was one item I wanted to talk about related to the patch.  I
> agree that "svn_stringbuf_t" => "bytes" and "svn_string_t" => "str" is the
> right general path,

Are you sure about this?  Property values can be binary data (so not
representable as 'str') but are usually represented by «svn_string_t *»;
for example, svn_repos_parse_fns3_t::set_node_property().

Ah yes, I suppose this also has bearing on the discussion in the other thread relating
to returning property values.  With that in mind, perhaps your suggestion of effectively
defaulting to always being Bytes for char * , svn_string_t, and svn_stringbuf_t unless there
is a specific circumstance not to makes the most sense as the general principle.

Yasuhito, I suppose that means we should probably tweak the typemaps to follow this
principle.  Have you already started down that path based on the properties discussion?
 
> but after the patch there is now a "svn_stringbuf_t
> *output" typemap that is still using "PyStr" instead of "PyBytes".
> Further, the only usage of "svn_stringbuf_t *output" is in
> "svn_fs_print_modules" and "svn_ra_print_modules", where the function
> parameter is actually an output parameter, so the appropriate typemap is
> probably "argout" instead of "in".
>
> Otherwise, the patch LGTM.  Yasuhito, if you want to review my changes to
> your patch and perhaps address the issue in the last paragraph then I can
> commit your patch to swig-py3. I'll also start working more on the swig-py3
> branch to get it to the point that we can actually get this branch merged
> into trunk finally.

Great. :-)

Cheers,

Daniel

Troy
Reply | Threaded
Open this post in threaded view
|

Re: [swig-py3][patch] interfacing bytes object instead of str

Yasuhito FUTATSUKI
On 12/17/18 2:08 AM, Troy Curtis Jr wrote:

> On Sun, Dec 16, 2018 at 11:28 AM Daniel Shahaf <[hidden email]>
> wrote:
>
>> Troy Curtis Jr wrote on Sun, 16 Dec 2018 09:59 -0500:
>>> But there was one item I wanted to talk about related to the patch.  I
>>> agree that "svn_stringbuf_t" => "bytes" and "svn_string_t" => "str" is
>> the
>>> right general path,
>>
>> Are you sure about this?  Property values can be binary data (so not
>> representable as 'str') but are usually represented by «svn_string_t *»;
>> for example, svn_repos_parse_fns3_t::set_node_property().
>>
>
> Ah yes, I suppose this also has bearing on the discussion in the other
> thread relating
> to returning property values.  With that in mind, perhaps your suggestion
> of effectively
> defaulting to always being Bytes for char * , svn_string_t, and
> svn_stringbuf_t unless there
> is a specific circumstance not to makes the most sense as the general
> principle.

To treat all those values as bytes in wrapper makes the wrapper is lower
level, keeping raw access like device drivers. On the other hand,
to make contexts, to make semantic differences in wrapper makes it more
higher level. So I think those are not the general principle but just a
policy of the wrapper design. I, as a consumer of the Python bindings,
prefer the latter if it is clear that what are bytes and what are str,
however I also agree with former if the distinction is too complex for
both of consumers and developers.

> Yasuhito, I suppose that means we should probably tweak the typemaps to
> follow this
> principle.  Have you already started down that path based on the properties
> discussion?

Yes, but there is almost nothing to output, so you don't need to care :)
I've also read swig document again, especially about default typemaps,
and also there is no result :)

>>> but after the patch there is now a "svn_stringbuf_t
>>> *output" typemap that is still using "PyStr" instead of "PyBytes".
>>> Further, the only usage of "svn_stringbuf_t *output" is in
>>> "svn_fs_print_modules" and "svn_ra_print_modules", where the function
>>> parameter is actually an output parameter, so the appropriate typemap is
>>> probably "argout" instead of "in".

I think you are right. And svn_stringbuf_t * type don't have proxy class,
it is no mean to pass string prepend to the result. The consumer of those
APIs can write a code like (using application_pool):

    modules += svn.fs.svn_fs_print_modules()

However, this seems to be not only the issue swig-py3 but also trunk.
Actually both of svn.fs.svn_fs_print_module('') and
svn.ra.svn_ra_print_module('') return None type, on FreeBSD package for
Subversion 1.8.8 and 1.11.0, without patch to tweak swig codes, so
it seems those API wrappers are broken.

>>> Otherwise, the patch LGTM.  Yasuhito, if you want to review my changes to
>>> your patch and perhaps address the issue in the last paragraph then I can
>>> commit your patch to swig-py3.

I found exception error messages isn't unified for
('a bytes', 'a bytes object', 'a Bytes object'),
and nothing but these.

>>> I'll also start working more on the swig-py3
>>> branch to get it to the point that we can actually get this branch merged
>>> into trunk finally.

Oh, thank you!

Cheers,
--
Yasuhito FUTATSUKI
Reply | Threaded
Open this post in threaded view
|

Re: [swig-py3][patch] interfacing bytes object instead of str

Daniel Shahaf-2
Yasuhito FUTATSUKI wrote on Mon, Dec 17, 2018 at 05:50:03 +0900:
> On 12/17/18 2:08 AM, Troy Curtis Jr wrote:
> > On Sun, Dec 16, 2018 at 11:28 AM Daniel Shahaf <[hidden email]>
> > wrote:
> >
> > > Troy Curtis Jr wrote on Sun, 16 Dec 2018 09:59 -0500:
> > > > But there was one item I wanted to talk about related to the patch.  I
> > > > agree that "svn_stringbuf_t" => "bytes" and "svn_string_t" => "str" is the
> > > > right general path,

Would svn_stringbuf_t map better to bytearray?  Sorry, that didn't occur
to me yesterday.

> > > Are you sure about this?  Property values can be binary data (so not
> > > representable as 'str') but are usually represented by «svn_string_t *»;
> > > for example, svn_repos_parse_fns3_t::set_node_property().
> > >
> >
> > Ah yes, I suppose this also has bearing on the discussion in the other
> > thread relating
> > to returning property values.  With that in mind, perhaps your suggestion
> > of effectively
> > defaulting to always being Bytes for char * , svn_string_t, and
> > svn_stringbuf_t unless there
> > is a specific circumstance not to makes the most sense as the general
> > principle.
>
> To treat all those values as bytes in wrapper makes the wrapper is lower
> level, keeping raw access like device drivers. On the other hand,
> to make contexts, to make semantic differences in wrapper makes it more
> higher level. So I think those are not the general principle but just a
> policy of the wrapper design.

In general, you're right.  The swig bindings are not Pythonic; to use
them requires using C idioms while writing Python.  There's certainly
room for higher-level abstractions — I think the javahl and
python-ctypes bindings are examples — but the trade-off is that they
can't be autogenerated like the swig bindings can.

However, all that aside, I think this general discussion doesn't apply
here.  The value of the svn_string_t* parameter of svn_repos_parse_fns3_t::set_node_property()
may be *unrepresentable* as a py3 str object.  It isn't a question of
C-like idiom versus a Pythonic idiom; the bytes class can can represent
all possible values, the str class cannot.

Consider «svn propset -F /bin/sh foo iota».  How would you represent
that value as str?

> I, as a consumer of the Python bindings, prefer the latter if it is
> clear that what are bytes and what are str, however I also agree with
> former if the distinction is too complex for both of consumers and
> developers.

Cheers,

Daniel
Reply | Threaded
Open this post in threaded view
|

Re: [swig-py3][patch] interfacing bytes object instead of str

Yasuhito FUTATSUKI
On 12/17/18 18:35, Daniel Shahaf wrote:

> Yasuhito FUTATSUKI wrote on Mon, Dec 17, 2018 at 05:50:03 +0900:
>> On 12/17/18 2:08 AM, Troy Curtis Jr wrote:
>>> On Sun, Dec 16, 2018 at 11:28 AM Daniel Shahaf <[hidden email]>
>>> wrote:
>>>
>>>> Troy Curtis Jr wrote on Sun, 16 Dec 2018 09:59 -0500:
>>>>> But there was one item I wanted to talk about related to the patch.  I
>>>>> agree that "svn_stringbuf_t" => "bytes" and "svn_string_t" => "str" is the
>>>>> right general path,
>
> Would svn_stringbuf_t map better to bytearray?  Sorry, that didn't occur
> to me yesterday.

I think if proxy object for svn_stringbuf_t can support buffer protocol,
no need to translate to bytes object for return values, however,
for input values, it is still need to translation to svn_stringbuf_t
with entity copy except proxy objects.

(I already agreed that propety values should be mapped to bytes objects,
so I ommit latter half.)

Cheers,
--
Yasuhito FUTATSUKI
Reply | Threaded
Open this post in threaded view
|

Re: [swig-py3][patch] interfacing bytes object instead of str

Yasuhito FUTATSUKI
In reply to this post by Yasuhito FUTATSUKI
On 12/17/18 05:50, Yasuhito FUTATSUKI wrote:
> I found exception error messages isn't unified for
> ('a bytes', 'a bytes object', 'a Bytes object'),
> and nothing but these.

I've updated the patch including fix above (unified to
'a bytes objects') and type casting between signed and
unsigned for len argment in svn_stream_write(),
for swig-py3@1849073

Cheers,
--
Yasuhito FUTATSUKI

yasuhito-swig-py3-revised-update.patch (14K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [swig-py3][patch] interfacing bytes object instead of str

Yasuhito FUTATSUKI
In reply to this post by Yasuhito FUTATSUKI
On 12/17/18 5:50 AM, Yasuhito FUTATSUKI wrote:
> On 12/17/18 2:08 AM, Troy Curtis Jr wrote:

>> Ah yes, I suppose this also has bearing on the discussion in the other
>> thread relating
>> to returning property values.  With that in mind, perhaps your suggestion
>> of effectively
>> defaulting to always being Bytes for char * , svn_string_t, and
>> svn_stringbuf_t unless there
>> is a specific circumstance not to makes the most sense as the general
>> principle.
>>
>> Yasuhito, I suppose that means we should probably tweak the typemaps to
>> follow this
>> principle.  Have you already started down that path based on the properties
>> discussion?
>
> Yes, but there is almost nothing to output, so you don't need to care :)
> I've also read swig document again, especially about default typemaps,
> and also there is no result :)
Status update: I've done to make patch that default to map bytes
from char *, svn_string_t, and svn_stringbuf_t, for swig-py3@1849037.


buildi system change:

* build/ac-macros/swig.m4: add "-DPY3" option to SWIG_PY_OPTS for py3
  build to define symbol to switch typemap between py2 and py3

typemap changes:

* subversion/bindings/swig/core.i
  (%typemap(argout) (char *buffer, apr_size_t *len):
   map buffer arg to Bytes insted of Str
  (%typemap(in) (const char *data, apr_size_t *len) ($*2_type temp):
   + map data arg to Bytes instead of Str
   + passing Py_ssize_t variable as length argment of
   PyBytes_AsStringAndSize for for explicit type conversion.
  (%typemap(in) const void *value):
   map value arg to Bytes instead of Str (for svn_auth_set_parameter())

* subversion/bindings/swig/include/svn_global.swg:
  x define SWIG_PYTHON_STRICT_BYTE_CHAR symbol for swig context for
   Python3
  x switch "in" type typemap for char *, char const *, char * const,
   and char const * const "", to use parse parameter "y" for py3 and
   "s" for py2

* subversion/bindings/swig/include/svn_string.swg
  (%typemap(argout) RET_STRING, %typemap(in) svn_stringbuf_t *,
   %typemap(out) svn_stringbuf_t *, %typemap(in) const svn_string_t *,
   %typemap(out) svn_string_t *, %typemap(argout) const char **OUTPUT):
   map args to Bytes instead of Str

* subversion/bindings/swig/include/svn_swigcompat.swg
  (%set_constant(name, value): use PyDict_SetItem and PyBytes_FromString
  instead of PyDict_SetItemString, for SWIG <= 1.3.27 (not tested)

* subversion/bindings/swig/include/svn_types.swg
  (%typemap(in) const char *MAY_BE_NULL): don't use "z" conversion
   format and convert to Bytes if it is not NULL
  (%typemap(in) (const char *PTR, apr_size_t LEN): use PyBytes_Check and
   PyBytes_AsStringAndSize instead of PyStr_Check and PyStr_AsUTF8AndSize

* subversion/bindings/swig/python/libsvn_swig_py/swigutil_py.c
  (make_string_from_ob(), make_svn_string_from_ob(),
   svn_swig_py_make_file(), svn_swig_py_get_commit_log_func()):
   use PyBytes_Check and PyBytes_AsString instead of
   PyStr_Check and PyStr_AsString
  (convert_hash()): use Dict_SetItem and PyBytes_FromString instead of
   DictItemString
  (convert_svn_string_t(), svn_swig_py_proparray_to_dict(),
   ra_callbacks_get_wc_prop()):
   use PyBytes_AsStringAndSize instead of PyStr_AsUTF8AndSize
  (cstring_to_pystring(), convert_string(),
   svn_swig_py_propinheriteditemarray_to_dict(),
   svn_swig_py_proparray_to_dict(),
   svn_swig_py_locationhash_to_dict(),
   svn_swig_py_c_strings_to_list(),
   svn_swig_py_changed_path_hash_to_dict(),
   svn_swig_py_changed_path2_hash_to_dict(),
   svn_swig_py_unwrap_string(),
   svn_swig_py_array_to_list()): use PyBytes_FromString instead of
   PyStr_FromString
  (delete_entry(), add_directory(), open_directory(), change_dir_prop(),
   add_file(), open_file(), apply_textdelta(), change_file_prop(),
   close_file(), parse_fn3_uuid_record(), parse_fn3_set_revision_property(),
   parse_fn3_set_node_property(), parse_fn3_delete_node_property(),
   svn_swig_py_notify_func(), svn_swig_py_status_func(),
   svn_swig_py_delta_path_driver_cb_func(), svn_swig_py_status_func2(),
   svn_swig_py_fs_lock_callback(), svn_swig_py_repos_authz_func(),
   svn_swig_py_repos_history_func(), svn_swig_py_proplist_receiver2(),
   svn_swig_py_log_receiver(), svn_swig_py_info_receiver_func(),
   svn_swig_py_client_blame_receiver_func(),
   svn_swig_py_changelist_receiver_func(),
   svn_swig_py_auth_gnome_keyring_unlock_prompt_func(),
   svn_swig_py_auth_simple_prompt_func(),
   svn_swig_py_auth_username_prompt_func(),
   svn_swig_py_auth_ssl_server_trust_prompt_func(),
   svn_swig_py_auth_ssl_client_cert_prompt_func(),
   svn_swig_py_auth_ssl_client_cert_pw_prompt_func(),
   svn_swig_py_config_auth_walk_func(), ra_callbacks_get_wc_prop(),
   ra_callbacks_push_or_set_wc_prop(), ra_callbacks_invalidate_wc_props(),
   svn_swig_py_commit_callback(), svn_swig_py_ra_file_rev_handler_func(),
   svn_swig_py_ra_lock_callback(), reporter_set_path(),
   reporter_delete_path(), reporter_link_path(),
   wc_diff_callbacks2_file_changed_or_added(),
   wc_diff_callbacks2_file_deleted(), wc_diff_callbacks2_dir_added(),
   wc_diff_callbacks2_dir_deleted(),
   wc_diff_callbacks2_dir_props_changed(),
   svn_swig_py_config_enumerator2(),
   svn_swig_py_config_section_enumerator2()): use 's' for py2 and use 'y'
   for py3 in argment format string
  (ra_callbacks_push_or_set_wc_prop()): use PyBytes_FromStringAndSize
  instead of PyStr_FromStringAndSize
  (ra_callbacks_get_client_string()): use PyBytes_FromString instaed of
  PyStr_AsString

* subversion/bindings/swig/svn_client.i
  (%typemap(argout) apr_array_header_t **props):
   use PyBytes_FromStringAndSize() instead of PyStr_FromStringAndSize

helper python codes changes:

* subversion/bindings/swig/python/svn/core.py
  (svn_path_compare_paths): not to use cmp for py3
  (Stream::read()): chunks is now list of bytes, not list of str

* subversion/bindings/swig/python/svn/fs.py (FileDiff::get_pipe()):
  x make sure self.tempfile1 and self.self.tempfile2 are bytes
  x pass bytes to header_encoding arg in _svndiff.file_output_unified4()

* subversion/bindings/swig/python/svn/repos.py (Callbacks::__doc__):
  treat path as bytes in sample code

* subversion/bindings/swig/python/svn/repos.py
  (ChangeCollector::_make_base_path()): treat path as bytes
  (ChangeCollector::open_root(): path and basepath should be bytes in
   dir_baton
  (RevisionChangeCollector::_make_base_path()): treat path as bytes


test unit changes:

* subversion/bindings/swig/python/tests/auth.py: replace all str literals
  into bytes literals except docstring and module name

* subversion/bindings/swig/python/tests/checksum.py
  (ChecksumTestCases::test_checksum()): check if type of check_val is
   bytes instead of str

* subversion/bindings/swig/python/tests/client.py:
  x replace all str literals into bytes literals except mode arg of open(),
   args to utils.Temper methods, docstrings, and module name
  x (SubversionClientTestCase::test_uuid_from_url): check if return value
   type client.uuid_from_url() is bytes instead of str
  x (SubversionClientTestCase::test_uuid_from_path): check if return value
   type client.uuid_from_path() is bytes instead of str
  x (SubversionClientTestCase::test_merge_peg3()): open result file in
   raw mode
  x (SubversionClientTestCase::test_update4()): convert os.path.sep into
   bytes if it is str in py3

* subversion/bindings/swig/python/tests/core.py:
  x replace all str literals into bytes literals except exception messages,
   args to utils.Temper methods, docstring and module name
  x add new tests for svn_stream_*()
   (new)(SubversionCoreTestCase::test_stream_from_stringbuf()):
    test for svn_stream_stringbuf()
   (new)(SubversionCoreTestCase::test_stream_read_full()):
    test for svn_stream_read_full()
   (new)(SubversionCoreTestCase::test_stream_read2()):
    test for svn_stream_read2()
   (new)(SubversionCoreTestCase::test_stream_write_exception()):
    test for svn_stream_write() to raise exception if try to write unicode
   (new)(SubversionCoreTestCase::test_stream_write()):
    test for svn_stream_write()
   (new)(SubversionCoreTestCase::test_stream_readline()):
    test for svn_stream_readline()

* subversion/bindings/swig/python/tests/delta.py
  (DeltaTestCase::estTxWindowHandler_stream_IF(),
   DeltaTestCase::estTxWindowHandler_stream_IF()):
   x use bytes literals to make stream by using
    svn.core.svn_stream_from_stringbuf()
   x use bytes file name to make stream by using
    svn.core.svn_stream_from_aprfile2()

* subversion/bindings/swig/python/tests/fs.py
  (SubversionFSTestCase::log_message_func(): log_msg_func3 callback now
   expect bytes to return
  (SubversionFSTestCase::setUp()): pass bytes path and url to
   client.import2()
  (SubversionFSTestCase::test_diff_repos_paths_internal(),
   SubversionFSTestCase::test_diff_repos_paths_external()): pass bytes
   path1 to fs.FileDiff::__init__()

* subversion/bindings/swig/python/tests/mergeinfo.py
  (SubversionMergeinfoTestCase::TEXT_MERGEINFO1,
   SubversionMergeinfoTestCase::TEXT_MERGEINFO2,
   SubversionMergeinfoTestCase::MERGEINFO_SRC): use bytes instead of str
  (SubversionMergeinfoTestCase::test_mergeinfo_get()):
   x pass list of bytes for paths arg to repos.fs_get_mergeinfo()
   x replace str in expected_mergeinfo into bytes

* subversion/bindings/swig/python/tests/ra.py
  x replace all str literals into bytes literals except assersion message,
   args to utils.Temper methods, docstring and module name
  x (SubversionRepositoryAccessTestCase::test_get_file): compare file
   content as bytes

* subversion/bindings/swig/python/tests/repository.py
  x replace all str literals to pass into Subversion API or to compare
   with their return values into bytes literals except exception message.
  x (SubversionRepositoryTestCase:: test_dump_fs2()): treat dump and
   feedback as bytes as they are, and compose expected_feedback as bytes

* subversion/bindings/swig/python/tests/utils.py
  (Temper::alloc_empty_dir(),alloc_empty_repo) convert return value of
   tempfile.mkdtemp() into bytes
  (Temper::file_uri_for_path()): return uri as bytes

* subversion/bindings/swig/python/tests/wc.py
  x replace all str literals to pass into Subversion API or to compare
   with their return values into bytes literals except exception message.
  x (SubversionWorkingCopyTestCase::test_get_adm_dir()): check type as
   bytes
  x (SubversionWorkingCopyTestCase::test_get_pristine_copy_path(),
     test_diff_editor4()): open file as raw mode
  x (SubversionWorkingCopyTestCase::test_commit()): use bytes literal
    instead of encoded str

* subversion/bindings/swig/python/tests/trac/versioncontrol/main.py
  (Node::DIRECTORY, Node::FILE): replace to bytes
  (Node::__init__): not to convert path to str
  (Node::get_name): treat self.path as bytes
  (Changeset::ADD, Changeset::COPY, Changeset::DELETE, Changeset::EDIT,
   Changeset::MOVE): replace to bytes

* subversion/bindings/swig/python/tests/trac/versioncontrol/svn_fs.py
  (SubversionRepository::__init__(), get_oldest_rev(), youngest_rev(),
   next_rev()): treat self.path and self.scope as bytes
  (SubversionRepository::normalize_path(), get_node()): treat path as
   bytes
  (SubversionRepository::get_path_history()): replace 'unknown' Changeset
   value to bytes
  (SubversionRepository::get_deltas()): replace 'entry' arg as bytes for
   repos.svn_repos_dir_delta()
  (SubversionNode::__init__()):
   x treat self.path and self.scope as bytes
   x decode path into str for exception message on py3
  (SubversionNode::get_entries()): treat self.path as bytes
  (SubversionNode::get_properties()): property values are bytes, not str
  (DiffChangeEditor::open_root()): return bytes value

* subversion/bindings/swig/python/tests/trac/versioncontrol/tests/svn_fs.py
  x replace all str literals to pass into Subversion API or to compare
   with their return values into bytes literals except exception message.
  x (REPOS_URL) build REPOS_URL as bytes


Cheers,
--
Yasuhito FUTATSUKI

map-char-ptr-to-bytes-except-exception-patch.txt (201K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [swig-py3][patch] interfacing bytes object instead of str

Troy Curtis Jr-2


On Mon, Dec 24, 2018 at 1:11 PM Yasuhito FUTATSUKI <[hidden email]> wrote:
On 12/17/18 5:50 AM, Yasuhito FUTATSUKI wrote:
> On 12/17/18 2:08 AM, Troy Curtis Jr wrote:

[snip]
 
Status update: I've done to make patch that default to map bytes
from char *, svn_string_t, and svn_stringbuf_t, for swig-py3@1849037.

Awesome! I'm really glad you are helping work on this, helps to keep me motivated as well ;)

So I wonder, as a guiding principle if we could support either Str or Bytes as an input? That would be akin to most
of the standard module interfaces.  I think that would be fairly easy to support, and would help ensure that the
behavior is much closer to the typical expectation. it would also reduce the number of b"" that are necessary :P Of
course, in many cases for the standard library interfaces, the input type is used as a hint to indicate what type to
return, which is much more difficult/impossible to fully support in these bindings.  That seems like it would provide
a way to make using the bindings more familiar, though we'd still have to have the policy of just always returning Bytes
objects as output. That input/output disconnect might undo any ease of use advantage than just always using Bytes
...I'm not sure exactly what my opinion on that is...

[ snip ] 

Cheers,
--
Yasuhito FUTATSUKI

The patch inline for comments:

diff --git a/build/ac-macros/swig.m4 b/build/ac-macros/swig.m4
index d865689de5..c797ae69d8 100644
--- a/build/ac-macros/swig.m4
+++ b/build/ac-macros/swig.m4
@@ -154,7 +154,7 @@ AC_DEFUN(SVN_FIND_SWIG,
           ])
 
           if test "$ac_cv_python_is_py3" = "yes"; then
-             SWIG_PY_OPTS="-python -py3"
+             SWIG_PY_OPTS="-python -py3 -DPY3"
           else
              SWIG_PY_OPTS="-python -classic"
           fi

We shouldn't define a new preprocessor value here, but rather make use of the existing IS_PY3 value defined by py3c.

diff --git a/subversion/bindings/swig/core.i b/subversion/bindings/swig/core.i
index e5a234c5e2..c309d48070 100644
--- a/subversion/bindings/swig/core.i
+++ b/subversion/bindings/swig/core.i

[snip]

@@ -442,16 +442,18 @@
 */
 #ifdef SWIGPYTHON
 %typemap(in) (const char *data, apr_size_t *len) ($*2_type temp) {
-    if (!PyStr_Check($input)) {
+    char *tmpdata;
+    Py_ssize_t length;
+    if (!PyBytes_Check($input)) {
         PyErr_SetString(PyExc_TypeError,
-                        "expecting a string for the buffer");
+                        "expecting a bytes object for the buffer");
         SWIG_fail;
     }
-    /* Explicitly cast away const from PyStr_AsUTF8AndSize (in Python 3). The
-       swig generated argument ($1) here will be "char *", but since its only
-       use is to pass directly into the "const char *" argument of the wrapped
-       function, this is safe to do. */
-    $1 = (char *)PyStr_AsUTF8AndSize($input, &temp);
 
+    if (PyBytes_AsStringAndSize($input, &tmpdata, &length) == -1) {

Why not just use "&$1" here instead of introducing a new temp variable?

+        SWIG_fail;
+    }
+    temp = ($*2_type)length;
+    $1 = tmpdata;
     $2 = ($2_ltype)&temp;
 }
 #endif
@@ -504,8 +506,8 @@
        SWIG_fail;
     }
 
-    if (PyStr_Check($input)) {
-        const char *value = PyStr_AsString($input);
+    if (PyBytes_Check($input)) {
+        const char *value = PyBytes_AsString($input);
         $1 = apr_pstrdup(_global_pool, value);
     }
     else if (PyLong_Check($input)) {

Since this function does a whole list of checks and conversions, perhaps it makes sense to support Str here in
addition to bytes? Of course in Py2 the "PyStr_Check()" would effectively be performed twice if we did that, but
I don't expect that to be a tremendous impact, and with the "cost" of an addition "IS_PY3" preprocessor check,
that could be eliminated from the Py2 build.
 
diff --git a/subversion/bindings/swig/include/svn_global.swg b/subversion/bindings/swig/include/svn_global.swg
index 191bbdd531..f7364be28f 100644
--- a/subversion/bindings/swig/include/svn_global.swg
+++ b/subversion/bindings/swig/include/svn_global.swg
@@ -31,6 +31,12 @@
 #define SVN_DEPRECATED
 #endif
 
+#ifdef SWIGPYTHON
+%begin %{
+#define SWIG_PYTHON_STRICT_BYTE_CHAR
+%}
+#endif
+
 %include typemaps.i
 %include constraints.i
 %include exception.i
@@ -136,9 +142,15 @@ static PyObject * _global_py_pool = NULL;
 /* Python format specifiers. Use Python instead of SWIG to parse
    these basic types, because Python reports better error messages
    (with correct argument numbers). */
+#if defined(PY3)

This is where we'd want to use py3c's "IS_PY3" value.
 
+%typemap (in, parse="y")
+  char *, char const *, char * const, char const * const "";
+#else
 %typemap (in, parse="s")
   char *, char const *, char * const, char const * const "";
+#endif
 %typemap (in, parse="c") char "";
+
 %typemap (in, fragment=SWIG_As_frag(long)) long
 {
   $1 = ($1_ltype)SWIG_As(long)($input);

[snip]


diff --git a/subversion/bindings/swig/python/libsvn_swig_py/swigutil_py.c b/subversion/bindings/swig/python/libsvn_swig_py/swigutil_py.c
index 2301d66770..44ec8662b7 100644
--- a/subversion/bindings/swig/python/libsvn_swig_py/swigutil_py.c
+++ b/subversion/bindings/swig/python/libsvn_swig_py/swigutil_py.c
@@ -547,23 +547,23 @@ static char *make_string_from_ob(PyObject *ob, apr_pool_t *pool)

Probably a reasonable place to support either Bytes or Str.

 {
   if (ob == Py_None)
     return NULL;
-  if (! PyStr_Check(ob))
+  if (! PyBytes_Check(ob))
     {
-      PyErr_SetString(PyExc_TypeError, "not a string");
+      PyErr_SetString(PyExc_TypeError, "not a bytes object");
       return NULL;
     }
-  return apr_pstrdup(pool, PyStr_AsString(ob));
+  return apr_pstrdup(pool, PyBytes_AsString(ob));
 }
 static svn_string_t *make_svn_string_from_ob(PyObject *ob, apr_pool_t *pool)
 {
   if (ob == Py_None)
     return NULL;
-  if (! PyStr_Check(ob))
+  if (! PyBytes_Check(ob))
     {
-      PyErr_SetString(PyExc_TypeError, "not a string");
+      PyErr_SetString(PyExc_TypeError, "not a bytes object");
       return NULL;
     }
-  return svn_string_create(PyStr_AsString(ob), pool);
+  return svn_string_create(PyBytes_AsString(ob), pool);
 }
 
 
@@ -598,7 +598,7 @@ static PyObject *convert_hash(apr_hash_t *hash,
             return NULL;
           }
         /* ### gotta cast this thing cuz Python doesn't use "const" */
-        if (PyDict_SetItemString(dict, (char *)key, value) == -1)
+        if (PyDict_SetItem(dict, PyBytes_FromString((char *)key), value) == -1)
           {
             Py_DECREF(value);
             Py_DECREF(dict);
@@ -623,10 +623,10 @@ static PyObject *convert_svn_string_t(void *value, void *ctx,
 
   const svn_string_t *s = value;
 
-  return PyStr_FromStringAndSize(s->data, s->len);
+  return PyBytes_FromStringAndSize(s->data, s->len);
 }
 
-/* Convert a C string into a Python String object (or a reference to
+/* Convert a C string into a Python Bytes object (or a reference to
    Py_None if CSTRING is NULL). */
 static PyObject *cstring_to_pystring(const char *cstring)

Doesn't returning PyBytes here make this function incorrectly named now? If this were renamed to
"cstring_to_pybytes" it would be better, since that also matches the Python2.7 idiom of bytes being alias of str.
And this is a static function, so we are safe to change it from an ABI standpoint.
 
 {
@@ -636,7 +636,7 @@ static PyObject *cstring_to_pystring(const char *cstring)
       Py_INCREF(Py_None);
       return retval;
     }
-  return PyStr_FromString(cstring);
+  return PyBytes_FromString(cstring);
 }
 
 static PyObject *convert_svn_client_commit_item3_t(void *value, void *ctx)

[snip]

@@ -1321,7 +1321,7 @@ svn_swig_py_unwrap_string(PyObject *source,
                           void *baton)

Requiring "source" to be a Bytes object, makes this function name be incorrect, since giving it a string will
cause an error.  Probably a reasonable place to support both Bytes/str. This is an exported function, so changing
the name isn't a good option.

 {
     const char **ptr_dest = destination;
-    *ptr_dest = PyStr_AsString(source);
+    *ptr_dest = PyBytes_AsString(source);
 
     if (*ptr_dest != NULL)
         return 0;
@@ -1440,7 +1440,7 @@ PyObject *svn_swig_py_array_to_list(const apr_array_header_t *array)
     for (i = 0; i < array->nelts; ++i)
       {
         PyObject *ob =
-          PyStr_FromString(APR_ARRAY_IDX(array, i, const char *));
+          PyBytes_FromString(APR_ARRAY_IDX(array, i, const char *));
         if (ob == NULL)
           goto error;
         PyList_SET_ITEM(list, i, ob);
@@ -1748,7 +1748,8 @@ static svn_error_t *delete_entry(const char *path,
 
   /* ### python doesn't have 'const' on the method name and format */
   if ((result = PyObject_CallMethod(ib->editor, (char *)"delete_entry",
-                                    (char *)"slOO&", path, revision, ib->baton,
+                                    (char *)SVN_SWIG_BYTES_FMT "lOO&",
+                                    path, revision, ib->baton,
                                     make_ob_pool, pool)) == NULL)
     {
       err = callback_exception_error();
@@ -1779,7 +1780,11 @@ static svn_error_t *add_directory(const char *path,
 
   /* ### python doesn't have 'const' on the method name and format */
   if ((result = PyObject_CallMethod(ib->editor, (char *)"add_directory",
+#if PY_VERSION_HEX >= 0x03000000
+                                    (char *)"yOylO&", path, ib->baton,
+#else
                                     (char *)"sOslO&", path, ib->baton,
+#endif

We definitly want to use the existing "IS_PY3" preprocesser value here. And yes I think the
definition SVN_SWIG_BYTES_FMT should be changed to use IS_PY3 instead of the version check as well
(which is the precedence I assume you are following).  The same applies to all the future usages
of PY_VERSION_HEX below as well.

I also think that using SVN_SWIG_BYTES_FMT here would better communicate purpose. However, I recognize
that that will really chop up the format string, so the "#if #else" seems reasonable enough. But perhaps we should *only*
include the format string the definition, just to make it more clear what piece we are doing the conditional compile on?

                                     copyfrom_path, copyfrom_revision,
                                     make_ob_pool, dir_pool)) == NULL)
     {
@@ -1810,8 +1815,8 @@ static svn_error_t *open_directory(const char *path,
 
   /* ### python doesn't have 'const' on the method name and format */
   if ((result = PyObject_CallMethod(ib->editor, (char *)"open_directory",
-                                    (char *)"sOlO&", path, ib->baton,
-                                    base_revision,
+                                    (char *)SVN_SWIG_BYTES_FMT "OlO&",
+                                    path, ib->baton, base_revision,
                                     make_ob_pool, dir_pool)) == NULL)
     {
       err = callback_exception_error();
 
[snip]


@@ -2508,10 +2542,10 @@ apr_file_t *svn_swig_py_make_file(PyObject *py_file,
   if (py_file == NULL || py_file == Py_None)
     return NULL;
 

Since this function is doing type checking between Bytes/File object anyway, perhaps it'd
be a good place to also support Str?

-  if (PyStr_Check(py_file))
+  if (PyBytes_Check(py_file))
     {
       /* input is a path -- just open an apr_file_t */
-      const char* fname = PyStr_AsString(py_file);
+      const char* fname = PyBytes_AsString(py_file);
       apr_err = apr_file_open(&apr_file, fname,
                               APR_CREATE | APR_READ | APR_WRITE,
                               APR_OS_DEFAULT, pool);

[snip]


@@ -3053,16 +3098,16 @@ svn_error_t *svn_swig_py_get_commit_log_func(const char **log_msg,
       *log_msg = NULL;
       err = SVN_NO_ERROR;
     }
-  else if (PyStr_Check(result))

If we were going to special-case places that are clearly expected to be string, this log_func place would
probably one of them.

+  else if (PyBytes_Check(result))
     {
-      *log_msg = apr_pstrdup(pool, PyStr_AsString(result));
+      *log_msg = apr_pstrdup(pool, PyBytes_AsString(result));
       Py_DECREF(result);
       err = SVN_NO_ERROR;
     }
   else
     {
       Py_DECREF(result);
-      err = callback_bad_return_error("Not a string");
+      err = callback_bad_return_error("Not a bytes object");
     }
 
  finished:
 
[snip]


diff --git a/subversion/bindings/swig/python/tests/trac/versioncontrol/main.py b/subversion/bindings/swig/python/tests/trac/versioncontrol/main.py
index 1b8a87c3f4..0d5d0618de 100644
--- a/subversion/bindings/swig/python/tests/trac/versioncontrol/main.py
+++ b/subversion/bindings/swig/python/tests/trac/versioncontrol/main.py
@@ -154,12 +154,12 @@ class Node(object):
     Represents a directory or file in the repository.
     """
 
-    DIRECTORY = "dir"
-    FILE = "file"
+    DIRECTORY = b"dir"
+    FILE = b"file"
 
     def __init__(self, path, rev, kind):
         assert kind in (Node.DIRECTORY, Node.FILE), "Unknown node kind %s" % kind
-        self.path = str(path)
+        self.path = path

Dropping use of "str()" changes the Py2 interface since in the current incarnation you could pass a number, or some object which
defines __str__ and it will get used correctly.So I think here we have to do an isinstanceof check so that we don't try to run "str"
on a bytes object, but otherwise continuing to use str()
 
         self.rev = rev
         self.kind = kind
 
[snip]

diff --git a/subversion/bindings/swig/python/tests/trac/versioncontrol/svn_fs.py b/subversion/bindings/swig/python/tests/trac/versioncontrol/svn_fs.py
index e0aef2df0d..fe92bb70b7 100644
--- a/subversion/bindings/swig/python/tests/trac/versioncontrol/svn_fs.py
+++ b/subversion/bindings/swig/python/tests/trac/versioncontrol/svn_fs.py
@@ -55,9 +55,12 @@ import os.path
 import time
 import weakref
 import posixpath
+import sys
 
 from svn import fs, repos, core, delta
 
+IS_PY3 = sys.version_info[0] >= 3
+
 _kindmap = {core.svn_node_dir: Node.DIRECTORY,
             core.svn_node_file: Node.FILE}

@@ -290,7 +293,7 @@ class SubversionNode(Node):
     def __init__(self, path, rev, authz, scope, fs_ptr):
         self.authz = authz
         self.scope = scope
-        if scope != '/':
+        if scope != b'/':
             self.scoped_path = scope + path
         else:
             self.scoped_path = path
@@ -300,7 +303,11 @@ class SubversionNode(Node):
         self.root = fs.revision_root(fs_ptr, rev)
         node_type = fs.check_path(self.root, self.scoped_path)
         if not node_type in _kindmap:
-            raise TracError("No node at %s in revision %s" % (path, rev))
+            if IS_PY3:
+                raise TracError("No node at %s in revision %s"
+                                % (path.decode('UTF-8'), rev))
+            else:
+                raise TracError("No node at %s in revision %s" % (path, rev))

Instead of having an explicit IS_PY3 check here, we should just store a unicode string in the exception object.
 
         self.created_rev = fs.node_created_rev(self.root, self.scoped_path)
         self.created_path = fs.node_created_path(self.root, self.scoped_path)
         # Note: 'created_path' differs from 'path' if the last change was a copy,
@@ -323,7 +330,7 @@ class SubversionNode(Node):
             return
         entries = fs.dir_entries(self.root, self.scoped_path)
         for item in entries:
-            path = '/'.join((self.path, item))
+            path = b'/'.join((self.path, item))
             if not self.authz.has_permission(path):
                 continue
             yield SubversionNode(path, self._requested_rev, self.authz,
@@ -350,7 +357,7 @@ class SubversionNode(Node):
     def get_properties(self):
         props = fs.node_proplist(self.root, self.scoped_path)
         for name,value in core._as_list(props.items()):
-            props[name] = str(value) # Make sure the value is a proper string
+            props[name] = value

Dropping str() here changes the API of the Py2 interface, just like the other str() drop mentioned above.
 
         return props
 
     def get_content_length(self):

 
[snip]
 
Reply | Threaded
Open this post in threaded view
|

Re: [swig-py3][patch] interfacing bytes object instead of str

Daniel Shahaf-2
Troy Curtis Jr wrote on Mon, Dec 24, 2018 at 22:20:40 -0500:

> On Mon, Dec 24, 2018 at 1:11 PM Yasuhito FUTATSUKI <[hidden email]>
> wrote:
> @@ -1321,7 +1321,7 @@ svn_swig_py_unwrap_string(PyObject *source,
> >                            void *baton)
> >
>
> Requiring "source" to be a Bytes object, makes this function name be
> incorrect, since giving it a string will cause an error.  Probably a
> reasonable place to support both Bytes/str.  This is an exported function, so
> changing the name isn't a good option.

We don't promise ABI compatibility for all exported symbols; only for
those that are part of the public API.  It's not clear to me that this
function qualifies: is it expected that anyone other than Subversion's
own code will call this function?  After all, the header file that
declares it (swigutil_py.h) isn't even installed by install-swig-py.

If it doesn't qualify as a public API function, then we're free to
change its semantics however we wish, with or without renaming it (there
are precedents for both options).

Cheers,

Daniel
12