Re: Error E140001: Sum of subblock sizes larger than total block content length

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
5 messages Options
Reply | Threaded
Open this post in threaded view
|

Re: Error E140001: Sum of subblock sizes larger than total block content length

Julian Foad-5
(dropping users@)

Julian Foad wrote:
> The attached patch should fix it; not yet tested.

I have opened https://issues.apache.org/jira/browse/SVN-4707
and attached the patch there. (Patch v2 is same as v1 but with tweaked
log message.)

We briefly discussed testing. It's a pretty obvious fix, but ideally we
would write a regression test. We don't want to store 2 GB* of temporary
data during a test run (that would make testing onerous) so we would
need to write a test that generates data, streams it to the rdump 'dump'
function, pipes that into the rdump 'load', and checks that it parses
without throwing errors, without storing the loaded data.

Anyone interested in writing such a test?

   * Actually we should test with over 4 GB because as well as the "%ld"
bug I found and fixed a "%lu" bug in nearby code at the same time.

- Julian
Reply | Threaded
Open this post in threaded view
|

Re: Error E140001: Sum of subblock sizes larger than total block content length

Julian Foad-5
Julian Foad wrote:
> (dropping users@)
>
> Julian Foad wrote:
>> The attached patch should fix it; not yet tested.

Proposed for backport to 1.8.x.

- Julian

> I have opened https://issues.apache.org/jira/browse/SVN-4707
> and attached the patch there. (Patch v2 is same as v1 but with tweaked
> log message.)
>
> We briefly discussed testing. It's a pretty obvious fix, but ideally we
> would write a regression test. We don't want to store 2 GB* of temporary
> data during a test run (that would make testing onerous) so we would
> need to write a test that generates data, streams it to the rdump 'dump'
> function, pipes that into the rdump 'load', and checks that it parses
> without throwing errors, without storing the loaded data.
>
> Anyone interested in writing such a test?
>
>    * Actually we should test with over 4 GB because as well as the "%ld"
> bug I found and fixed a "%lu" bug in nearby code at the same time.
>
> - Julian
Reply | Threaded
Open this post in threaded view
|

Re: Error E140001: Sum of subblock sizes larger than total block content length

Ronald Taneza
Hi Julian,

Thank you for your quick response and patch. I hope that this is fixed in the next 1.8.x release and that CollabNet will also release an update to Subversion Edge.

I'm still not so clear how this is exactly a problem with svnrdump 1.8.19 (from Subversion Edge). This is my first time browsing through the svn source code, and I hope you'll indulge me with my questions below.

We are using Subversion Edge 5.2.2 (Windows 64-bit) on a Windows Server 2012 (64-bit) OS.
I also verified that the httpd.exe, svn.exe, and svnrdump.exe binaries are 64-bit. (I ran the "file xxx" command in Cygwin and also checked their PE signatures as described here: https://superuser.com/questions/358434/how-to-check-if-a-binary-is-32-or-64-bit-on-windows).

You mentioned:

> SVN_ERR(svn_stream_printf(eb->stream, pool,
                               SVN_REPOS_DUMPFILE_CONTENT_LENGTH
                              ": %ld\n\n",
                              (unsigned long)info->size +
                                propstring->len));

> info->size is apr_off_t ... probably 64 bits on most systems.
> propstring->len is apr_size_t ... probably 64 bits on most systems.

> It uses "%lu" for the text content, which thus work OK up to 4 GB, and "%ld" for the overall content length which thus only works up to 2 GB.

On a 64-bit system where unsigned long is uint64:
If info->size is int64 and propstring->len is uint64, then the expression "(unsigned long)info->size + propstring->len" will produce uint64. This result should be printed with "%lu" (as in your patch). However, printing the result with "%ld" will still print a positive value (signed int64), I believe.

If for some reason, info->size is just int32, then it can only store max int32 (2 GB). So the earlier call to apr_file_info_get should already have failed if the file size is greater than 2 GB.

      err = apr_file_info_get(info, APR_FINFO_SIZE, eb->delta_file);
      if (err)
        SVN_ERR(svn_error_wrap_apr(err, NULL));

However, apr_file_info_get did not return an error; so it just somehow returned a negative value in info->size?

Regards,
Ronald


On Wed, Nov 22, 2017 at 6:05 PM, Julian Foad <[hidden email]> wrote:
Julian Foad wrote:
(dropping users@)

Julian Foad wrote:
The attached patch should fix it; not yet tested.

Proposed for backport to 1.8.x.

- Julian


I have opened https://issues.apache.org/jira/browse/SVN-4707
and attached the patch there. (Patch v2 is same as v1 but with tweaked log message.)

We briefly discussed testing. It's a pretty obvious fix, but ideally we would write a regression test. We don't want to store 2 GB* of temporary data during a test run (that would make testing onerous) so we would need to write a test that generates data, streams it to the rdump 'dump' function, pipes that into the rdump 'load', and checks that it parses without throwing errors, without storing the loaded data.

Anyone interested in writing such a test?

   * Actually we should test with over 4 GB because as well as the "%ld" bug I found and fixed a "%lu" bug in nearby code at the same time.

- Julian

Reply | Threaded
Open this post in threaded view
|

Re: Error E140001: Sum of subblock sizes larger than total block content length

Julian Foad-5
Ronald Taneza wrote:
> Hi Julian,
>
> Thank you for your quick response and patch. I hope that this is fixed
> in the next 1.8.x release and that CollabNet will also release an update
> to Subversion Edge.

It should be.

> I'm still not so clear how this is exactly a problem with svnrdump
> 1.8.19 (from Subversion Edge). This is my first time browsing through
> the svn source code, and I hope you'll indulge me with my questions below.
>
> We are using Subversion Edge 5.2.2 (Windows 64-bit) on a Windows Server
> 2012 (64-bit) OS.

This bug occurred because the "%ld" format and/or the 'long' data type
were 32-bit. This is indeed the case:

On 64-bit Windows, 'long' is a 32-bit type (and 'long long' is 64-bit;
on most other platforms both are 64-bit.)

(
https://stackoverflow.com/questions/384502/what-is-the-bit-size-of-long-on-64-bit-windows#384672 
)

> I also verified that the httpd.exe, svn.exe, and svnrdump.exe binaries
> are 64-bit. (I ran the "file xxx" command in Cygwin and also checked
> their PE signatures as described here:
> https://superuser.com/questions/358434/how-to-check-if-a-binary-is-32-or-64-bit-on-windows).
>
> You mentioned:
>
>  > SVN_ERR(svn_stream_printf(eb->stream, pool,
>                                 SVN_REPOS_DUMPFILE_CONTENT_LENGTH
>                                ": %ld\n\n",
>                                (unsigned long)info->size +
>                                  propstring->len));
>
>  > info->size is apr_off_t ... probably 64 bits on most systems.
>  > propstring->len is apr_size_t ... probably 64 bits on most systems.

I was wrong about that: apr_size_t would be 32-bit on 32-bit
architectures. File sizes (apr_off_t) are more likely to be 64-bit,
although there may be some platforms where they are 32-bit, maybe
depending on compile-time options or something that is configurable.

Therefore my initial patch (changing to use APR_SIZE_T_FMT) was wrong
(would not have worked on 32-bit architectures). I have now corrected
that to use APR_OFF_T_FMT.

>  > It uses "%lu" for the text content, which thus work OK up to 4 GB,
> and "%ld" for the overall content length which thus only works up to 2 GB.

When I wrote that, I was assuming a 32-bit architecture.

> On a 64-bit system where unsigned long is uint64:

On a standard Windows-64 system, it's not.

I hope that clears up the issue.

- Julian
Reply | Threaded
Open this post in threaded view
|

Re: Error E140001: Sum of subblock sizes larger than total block content length

Ronald Taneza
Hi Julian,

> This bug occurred because the "%ld" format and/or the 'long' data type were 32-bit. This is indeed the case:
> On 64-bit Windows, 'long' is a 32-bit type (and 'long long' is 64-bit; on most other platforms both are 64-bit.)

Ok, that explains it. I did not know that 'long' is 32 bits on 64-bit Windows. Most of my experience with C is on embedded systems where we use specific integer sizes (like C99 stdint.h types).

Thanks!

Regards,
Ronald


On Thu, Nov 23, 2017 at 12:17 PM, Julian Foad <[hidden email]> wrote:
Ronald Taneza wrote:
Hi Julian,

Thank you for your quick response and patch. I hope that this is fixed in the next 1.8.x release and that CollabNet will also release an update to Subversion Edge.

It should be.

I'm still not so clear how this is exactly a problem with svnrdump 1.8.19 (from Subversion Edge). This is my first time browsing through the svn source code, and I hope you'll indulge me with my questions below.

We are using Subversion Edge 5.2.2 (Windows 64-bit) on a Windows Server 2012 (64-bit) OS.

This bug occurred because the "%ld" format and/or the 'long' data type were 32-bit. This is indeed the case:

On 64-bit Windows, 'long' is a 32-bit type (and 'long long' is 64-bit; on most other platforms both are 64-bit.)

( https://stackoverflow.com/questions/384502/what-is-the-bit-size-of-long-on-64-bit-windows#384672 )

I also verified that the httpd.exe, svn.exe, and svnrdump.exe binaries are 64-bit. (I ran the "file xxx" command in Cygwin and also checked their PE signatures as described here: https://superuser.com/questions/358434/how-to-check-if-a-binary-is-32-or-64-bit-on-windows).

You mentioned:

 > SVN_ERR(svn_stream_printf(eb->stream, pool,
                                SVN_REPOS_DUMPFILE_CONTENT_LENGTH
                               ": %ld\n\n",
                               (unsigned long)info->size +
                                 propstring->len));

 > info->size is apr_off_t ... probably 64 bits on most systems.
 > propstring->len is apr_size_t ... probably 64 bits on most systems.

I was wrong about that: apr_size_t would be 32-bit on 32-bit architectures. File sizes (apr_off_t) are more likely to be 64-bit, although there may be some platforms where they are 32-bit, maybe depending on compile-time options or something that is configurable.

Therefore my initial patch (changing to use APR_SIZE_T_FMT) was wrong (would not have worked on 32-bit architectures). I have now corrected that to use APR_OFF_T_FMT.

 > It uses "%lu" for the text content, which thus work OK up to 4 GB, and "%ld" for the overall content length which thus only works up to 2 GB.

When I wrote that, I was assuming a 32-bit architecture.

On a 64-bit system where unsigned long is uint64:

On a standard Windows-64 system, it's not.

I hope that clears up the issue.

- Julian