Progress on SHA-1 fixes in patch releases?

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
27 messages Options
12
Reply | Threaded
Open this post in threaded view
|

Progress on SHA-1 fixes in patch releases?

Julian Foad-5
Hi, devs. Re. the SHA-1 fixes contemplated or in-the-works for SVN 1.8
and 1.9: Any idea on how those are coming along and when they might see
the light of day (in patch releases)?

- Julian

Reply | Threaded
Open this post in threaded view
|

Re: Progress on SHA-1 fixes in patch releases?

Stefan Sperling
On Thu, Mar 30, 2017 at 09:30:57AM +0100, Julian Foad wrote:
> Hi, devs. Re. the SHA-1 fixes contemplated or in-the-works for SVN 1.8 and
> 1.9: Any idea on how those are coming along and when they might see the
> light of day (in patch releases)?
>
> - Julian
>

As far as I can tell, these changes touch FSFS only. So while the server
can now store the offending PDF files, nothing has been done yet regarding
other areas which have problems with these files such as the working copy
(pristine store) and ra_serf (avoids fetching files already stored in
pristine store).

And perhaps there are more areas which have problems? I am not aware
of any effort to figure that out.
Reply | Threaded
Open this post in threaded view
|

Re: Progress on SHA-1 fixes in patch releases?

Paul Hammant-3
No patch releases though, right?

    http://svn.apache.org/repos/asf/subversion/trunk/CHANGES

Is that because, "the server can store the offending PDFs" isn't enough of a releasable in its own right?  Or because the "the server can store the offending PDFs" specifically is imperfect and unreleasable in some way?

- Paul

On Thu, Mar 30, 2017 at 4:42 AM, Stefan Sperling <[hidden email]> wrote:
On Thu, Mar 30, 2017 at 09:30:57AM +0100, Julian Foad wrote:
> Hi, devs. Re. the SHA-1 fixes contemplated or in-the-works for SVN 1.8 and
> 1.9: Any idea on how those are coming along and when they might see the
> light of day (in patch releases)?
>
> - Julian
>

As far as I can tell, these changes touch FSFS only. So while the server
can now store the offending PDF files, nothing has been done yet regarding
other areas which have problems with these files such as the working copy
(pristine store) and ra_serf (avoids fetching files already stored in
pristine store).

And perhaps there are more areas which have problems? I am not aware
of any effort to figure that out.

Reply | Threaded
Open this post in threaded view
|

Re: Progress on SHA-1 fixes in patch releases?

Branko Čibej
On 30.03.2017 12:01, Paul Hammant wrote:
> No patch releases though, right?
>
>     http://svn.apache.org/repos/asf/subversion/trunk/CHANGES

Eh? Patch release changelogs get written when the patch is released.



Reply | Threaded
Open this post in threaded view
|

Re: Progress on SHA-1 fixes in patch releases?

Paul Hammant-3
Eh? Patch release changelogs get written when the patch is released.

I'm not following what you're saying there.

Reply | Threaded
Open this post in threaded view
|

Re: Progress on SHA-1 fixes in patch releases?

Branko Čibej
On 30.03.2017 12:26, Paul Hammant wrote:
>> Eh? Patch release changelogs get written when the patch is released.
> I'm not following what you're saying there.


The fact that there's no entry for 1.9.6 in CHANGES does not mean that
1.9.6 will not be released, nor does it mean that the hash collision
related fixes won't be backported. The changelog is updated as part of
the patch release process, not before.

-- Brane
Reply | Threaded
Open this post in threaded view
|

Re: Progress on SHA-1 fixes in patch releases?

Paul Hammant-3
My apologies. When I wrote "No patch releases though, right?" my intention was to communicate "No patch releases YET though, right?

-ph

On Thu, Mar 30, 2017 at 6:28 AM, Branko Čibej <[hidden email]> wrote:
On 30.03.2017 12:26, Paul Hammant wrote:
>> Eh? Patch release changelogs get written when the patch is released.
> I'm not following what you're saying there.


The fact that there's no entry for 1.9.6 in CHANGES does not mean that
1.9.6 will not be released, nor does it mean that the hash collision
related fixes won't be backported. The changelog is updated as part of
the patch release process, not before.

-- Brane

Reply | Threaded
Open this post in threaded view
|

Re: Progress on SHA-1 fixes in patch releases?

Stefan Sperling
On Thu, Mar 30, 2017 at 06:30:55AM -0400, Paul Hammant wrote:
> My apologies. When I wrote "No patch releases though, right?" my intention
> was to communicate "No patch releases YET though, right?

There are no patch releases yet, no.
Reply | Threaded
Open this post in threaded view
|

RE: Progress on SHA-1 fixes in patch releases?

Bert Huijben-5

The server side fixes currently need a third vote for backporting. These fixes are nominated for 1.8.x and 1.9.x.

 

I don’t think we can really do much more on the released versions without a backwards incompatible change to the working copy… And I’m still not convinced that we should really change the WC code just to support using Subversion to explicitly store collisions.

 

 

 

On trunk there are even more problems, as Ivan committed a change that made us rely on sha-1 even more than before to check for local changes. I already pinged that thread a few times, but there is no progress there.

 

If no progress is reported on the discussed points I would like to see that change to notice differences (at least temporarily) reverted.

 

     Bert

 

Sent from Mail for Windows 10

 

From: [hidden email]
Sent: donderdag 30 maart 2017 13:23
To: [hidden email]
Cc: [hidden email]
Subject: Re: Progress on SHA-1 fixes in patch releases?

 

On Thu, Mar 30, 2017 at 06:30:55AM -0400, Paul Hammant wrote:

> My apologies. When I wrote "No patch releases though, right?" my intention

> was to communicate "No patch releases YET though, right?

 

There are no patch releases yet, no.

 

Reply | Threaded
Open this post in threaded view
|

Re: Progress on SHA-1 fixes in patch releases?

Stefan Sperling
On Thu, Mar 30, 2017 at 07:04:29PM +0200, Bert Huijben wrote:
> The server side fixes currently need a third vote for backporting. These
> fixes are nominated for 1.8.x and 1.9.x.
>
> I don’t think we can really do much more on the released versions without a
> backwards incompatible change to the working copy… And I’m still not
> convinced that we should really change the WC code just to support using
> Subversion to explicitly store collisions.

Storing the collisions in the WC without a format change is not my goal
either. I don't think anyone is trying to convince you of that.
But I think it would be good if the client did not throw errors we do
not anticipate. It could at least try to detect the situation and error
out in a reasonable way.

Has anyone tested what happens on the client side now with the FSFS
fixed applied? I did not yet find time to do so.  

And does 'svnadmin dump/load' work now or is it still broken if the
SHA1-colliding PDF files are in the repository?

Should we disable ra_serf's callback for fetching content from the
pristine store instead of from the repository when SHA1 matches?
This could be done without a format change.

> On trunk there are even more problems, as Ivan committed a change that made
> us rely on sha-1 even more than before to check for local changes. I already
> pinged that thread a few times, but there is no progress there.

Yes, I agree this should be looked at.
Reply | Threaded
Open this post in threaded view
|

Re: Progress on SHA-1 fixes in patch releases?

Daniel Shahaf-2
Let's use jira or moinmoin to track all the different issues that need
looking into?  I count at least fsfs, fsx, svnadmin load, libsvn_wc, and
zhakov's change that Bert mentioned.

(We can use SVN-4673 as a parent issue.)
Reply | Threaded
Open this post in threaded view
|

Re: Progress on SHA-1 fixes in patch releases?

Stefan Fuhrmann
On 30.03.2017 21:38, Daniel Shahaf wrote:
> Let's use jira or moinmoin to track all the different issues that need
> looking into?  I count at least fsfs, fsx, svnadmin load, libsvn_wc, and
> zhakov's change that Bert mentioned.

What is the problem with 'svnadmin load'?

It uses SHA1 and MD5 to verify that the data is valid
but will not detect intended changes that transformed
the dump stream from one valid / consistent state to
another. Apart from that, the FS backend should now
handle collisions just fine.

With FSFS and FSX fixed in /trunk (FSX not with the
backport IMO), there is only the wc issue left. Not sure
how much efficiency we want to sacrifice there. For
instance, switching between branches will become
expensive again.

-- Stefan^2.
Reply | Threaded
Open this post in threaded view
|

Re: Progress on SHA-1 fixes in patch releases?

Daniel Shahaf-2
Stefan Fuhrmann wrote on Mon, Apr 17, 2017 at 22:17:46 +0200:
> On 30.03.2017 21:38, Daniel Shahaf wrote:
> >Let's use jira or moinmoin to track all the different issues that need
> >looking into?  I count at least fsfs, fsx, svnadmin load, libsvn_wc, and
> >zhakov's change that Bert mentioned.
>
> What is the problem with 'svnadmin load'?

It's not possible to load a dumpfile that contains a sha1 collision:

[[[
% wget https://shattered.it/static/shattered-1.pdf https://shattered.it/static/shattered-2.pdf
% svnadmin create r
% svn co file://$PWD/r wc
% cd wc
wc% cp ../shattered-* .
wc% svn add *
wc% svn ci -madd
wc% cd ../
% svnadmin create r2
% svnadmin dump -r > dump
% svnadmin load r2 < dump
<<< Started new transaction, based on original revision 1
     * editing path : shattered-1.pdf ... done.
     * editing path : shattered-2.pdf ...svnadmin: E200014: Checksum mismatch for '/shattered-2.pdf':
   expected:  5bd9d8cabc46041579a311230539b8d1
     actual:  ee4aa52b139d925f8d8884402b0a750c

zsh: exit 1     svnadmin load r2 < dump
%
]]]

That's with 1.9.5.  Is it fixed on trunk now?  I'm not sure whether
r1785754 addresses that.

Incidentally: the error code added in r1785754 is misspelled:
SVN_ERR_FS_AMBIUGOUS_CHECKSUM_REP should be
SVN_ERR_FS_AMBIGUOUS_CHECKSUM_REP.
---------------^^

Not only in the error message, also in the code.

> It uses SHA1 and MD5 to verify that the data is valid
> but will not detect intended changes that transformed
> the dump stream from one valid / consistent state to
> another. Apart from that, the FS backend should now
> handle collisions just fine.
>
> With FSFS and FSX fixed in /trunk (FSX not with the
> backport IMO), there is only the wc issue left. Not sure
> how much efficiency we want to sacrifice there. For
> instance, switching between branches will become
> expensive again.
>
> -- Stefan^2.

Cheers,

Daniel
Reply | Threaded
Open this post in threaded view
|

Re: Progress on SHA-1 fixes in patch releases?

Daniel Shahaf-2
Daniel Shahaf wrote on Tue, Apr 18, 2017 at 00:54:20 +0000:
> Incidentally: the error code added in r1785754 is misspelled:
> SVN_ERR_FS_AMBIUGOUS_CHECKSUM_REP should be
> SVN_ERR_FS_AMBIGUOUS_CHECKSUM_REP.
> ---------------^^
>
> Not only in the error message, also in the code.

s/error message/log message/
Reply | Threaded
Open this post in threaded view
|

Re: Progress on SHA-1 fixes in patch releases?

Stefan Sperling
In reply to this post by Daniel Shahaf-2
On Tue, Apr 18, 2017 at 12:54:20AM +0000, Daniel Shahaf wrote:

> % svnadmin load r2 < dump
> <<< Started new transaction, based on original revision 1
>      * editing path : shattered-1.pdf ... done.
>      * editing path : shattered-2.pdf ...svnadmin: E200014: Checksum mismatch for '/shattered-2.pdf':
>    expected:  5bd9d8cabc46041579a311230539b8d1
>      actual:  ee4aa52b139d925f8d8884402b0a750c
>
> zsh: exit 1     svnadmin load r2 < dump
> %
> ]]]
>
> That's with 1.9.5.  Is it fixed on trunk now?  I'm not sure whether
> r1785754 addresses that.

Yes, this is fixed on trunk :-)

<<< Started new transaction, based on original revision 3
     * editing path : trunk/shattered-1.pdf ... done.
svnadmin: warning: apr_err=SVN_ERR_FS_AMBIUGOUS_CHECKSUM_REP
svnadmin: warning: W160067: SHA1 of reps '-1 3 381130 422435 5bd9d8cabc46041579a311230539b8d1 38762cf7f55934b34d179ae6a4c80cadccbb7f0a 2-2/_5' and '-1 0 381130
422435 5bd9d8cabc46041579a311230539b8d1 38762cf7f55934b34d179ae6a4c80cadccbb7f0a 2-2/_5' matches (38762cf7f55934b34d179ae6a4c80cadccbb7f0a) but contents differ
     * editing path : trunk/shattered-2.pdf ... done.

------- Committed revision 3 >>>

Of course, the working copy and RA protocols are still broken.
Editing both files and attempting a commit results in an error:

Sending        shattered-1.pdf
Sending        shattered-2.pdf
Transmitting file data ..subversion/svn/commit-cmd.c:185,
subversion/libsvn_client/commit.c:992,
subversion/libsvn_client/commit.c:156: (apr_err=SVN_ERR_CHECKSUM_MISMATCH)
svn: E200014: Commit failed (details follow):
subversion/libsvn_client/commit.c:904,
subversion/libsvn_client/commit_util.c:1933,
subversion/libsvn_wc/adm_crawler.c:1105,
subversion/libsvn_repos/commit.c:585,
subversion/libsvn_fs/fs-loader.c:1595,
subversion/libsvn_fs_fs/tree.c:3060,
subversion/libsvn_subr/checksum.c:658: (apr_err=SVN_ERR_CHECKSUM_MISMATCH)
svn: E200014: Base checksum mismatch on '/trunk/shattered-2.pdf':
   expected:  ee4aa52b139d925f8d8884402b0a750c
     actual:  5bd9d8cabc46041579a311230539b8d1

I suppose that is acceptable for now. The repository's health has priority.
Reply | Threaded
Open this post in threaded view
|

Re: Progress on SHA-1 fixes in patch releases?

Stefan Sperling
On Tue, May 09, 2017 at 11:38:51AM +0200, Stefan Sperling wrote:

> On Tue, Apr 18, 2017 at 12:54:20AM +0000, Daniel Shahaf wrote:
> > % svnadmin load r2 < dump
> > <<< Started new transaction, based on original revision 1
> >      * editing path : shattered-1.pdf ... done.
> >      * editing path : shattered-2.pdf ...svnadmin: E200014: Checksum mismatch for '/shattered-2.pdf':
> >    expected:  5bd9d8cabc46041579a311230539b8d1
> >      actual:  ee4aa52b139d925f8d8884402b0a750c
> >
> > zsh: exit 1     svnadmin load r2 < dump
> > %
> > ]]]
> >
> > That's with 1.9.5.  Is it fixed on trunk now?  I'm not sure whether
> > r1785754 addresses that.
>
> Yes, this is fixed on trunk :-)

I have now voted for and backported whatever I could to 1.9.x.

Please add your votes to the STATUS file as well so we can finally
release these fixes (or improve them, if there are still concerns).

Thanks!
Reply | Threaded
Open this post in threaded view
|

Re: Progress on SHA-1 fixes in patch releases?

Daniel Shahaf-2
In reply to this post by Stefan Sperling
Stefan Sperling wrote on Thu, Mar 30, 2017 at 19:40:10 +0200:
> Should we disable ra_serf's callback for fetching content from the
> pristine store instead of from the repository when SHA1 matches?
> This could be done without a format change.

On IRC today, Johan and I both think that that optimisation should be
disabled, since correctness has priority over performance.
Reply | Threaded
Open this post in threaded view
|

[PATCH] reject SHA1 collisions (was: Re: Progress on SHA-1 fixes in patch releases?)

Stefan Sperling
In reply to this post by Stefan Sperling
On Tue, May 09, 2017 at 01:39:49PM +0200, Stefan Sperling wrote:

> On Tue, May 09, 2017 at 11:38:51AM +0200, Stefan Sperling wrote:
> > On Tue, Apr 18, 2017 at 12:54:20AM +0000, Daniel Shahaf wrote:
> > > % svnadmin load r2 < dump
> > > <<< Started new transaction, based on original revision 1
> > >      * editing path : shattered-1.pdf ... done.
> > >      * editing path : shattered-2.pdf ...svnadmin: E200014: Checksum mismatch for '/shattered-2.pdf':
> > >    expected:  5bd9d8cabc46041579a311230539b8d1
> > >      actual:  ee4aa52b139d925f8d8884402b0a750c
> > >
> > > zsh: exit 1     svnadmin load r2 < dump
> > > %
> > > ]]]
> > >
> > > That's with 1.9.5.  Is it fixed on trunk now?  I'm not sure whether
> > > r1785754 addresses that.
> >
> > Yes, this is fixed on trunk :-)
>
> I have now voted for and backported whatever I could to 1.9.x.
>
> Please add your votes to the STATUS file as well so we can finally
> release these fixes (or improve them, if there are still concerns).

On IRC, Branko and Johan raised concerns about the proposed backport.

The proposed backport allows files with SHA1 collisions into the repository
and avoids de-duplication of such content by the rep-cache. It fixes the
integrity problem with the rep-cache but other problems remain.

Clients will still suffer from the consequences of having such files in
the repository. The RA layer cannot de-duplicate content, the working
copy's pristine store is broken by such content, etc.

So the question is whether we want to even store such content before all
parts of the system are ready to handle it. Some users will likely keep
asking for a way to reject this content because of the ripple effects
it can cause. Many users may perceive a risk even if "only" working copies
can be rendered unusable by committing such content, because dealing with
fallout can be highly disruptive to the actual work people have to get done.

We can block such content from entering the repository with a patch such
as the one below. This check will only be done if rep-sharing is active.
But since rep-sharing is active by default this patch will work for most
of our users.

This could be further extended by the config knob to give users a choice.
I don't see a good way of adding such a knob in a patch release, though.
Perhaps that could be done for 1.10.

Index: subversion/libsvn_fs_fs/transaction.c
===================================================================
--- subversion/libsvn_fs_fs/transaction.c (revision 1794541)
+++ subversion/libsvn_fs_fs/transaction.c (working copy)
@@ -2430,12 +2430,10 @@ get_shared_rep(representation_t **old_rep,
                                       scratch_pool);
 
       /* A mismatch should be extremely rare.
-       * If it does happen, log the event and don't share the rep. */
+       * If it does happen, reject the commit. */
       if (!same || err)
         {
-          /* SHA1 collision or worse.
-             We can continue without rep-sharing, but warn.
-            */
+          /* SHA1 collision or worse. */
           svn_stringbuf_t *old_rep_str
             = svn_fs_fs__unparse_representation(*old_rep,
                                                 ffd->format, FALSE,
@@ -2449,15 +2447,11 @@ get_shared_rep(representation_t **old_rep,
           const char *checksum__str
             = svn_checksum_to_cstring_display(&checksum, scratch_pool);
 
-          err = svn_error_createf(SVN_ERR_FS_AMBIUGOUS_CHECKSUM_REP,
-                                  err, "SHA1 of reps '%s' and '%s' "
-                                  "matches (%s) but contents differ",
-                                  old_rep_str->data, rep_str->data,
-                                  checksum__str);
-
-          (fs->warning)(fs->warning_baton, err);
-          svn_error_clear(err);
-          *old_rep = NULL;
+          return svn_error_createf(SVN_ERR_FS_AMBIUGOUS_CHECKSUM_REP,
+                                   err, "SHA1 of reps '%s' and '%s' "
+                                   "matches (%s) but contents differ",
+                                   old_rep_str->data, rep_str->data,
+                                   checksum__str);
         }
 
       /* Restore FILE's read / write position. */
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] reject SHA1 collisions (was: Re: Progress on SHA-1 fixes in patch releases?)

Mark Phippard-3
On Tue, May 9, 2017 at 9:25 AM, Stefan Sperling <[hidden email]> wrote:
On IRC, Branko and Johan raised concerns about the proposed backport.

The proposed backport allows files with SHA1 collisions into the repository
and avoids de-duplication of such content by the rep-cache. It fixes the
integrity problem with the rep-cache but other problems remain.

This approach makes a lot of sense to me.



--
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] reject SHA1 collisions (was: Re: Progress on SHA-1 fixes in patch releases?)

Jacek Materna
+1 on rejection.

On Tue, May 9, 2017 at 3:37 PM, Mark Phippard <[hidden email]> wrote:

> On Tue, May 9, 2017 at 9:25 AM, Stefan Sperling <[hidden email]> wrote:
>>
>> On IRC, Branko and Johan raised concerns about the proposed backport.
>>
>> The proposed backport allows files with SHA1 collisions into the
>> repository
>> and avoids de-duplication of such content by the rep-cache. It fixes the
>> integrity problem with the rep-cache but other problems remain.
>
>
> This approach makes a lot of sense to me.
>
>
>
> --
> Thanks
>
> Mark Phippard
> http://markphip.blogspot.com/



--

Jacek Materna
CTO

Assembla
210-410-7661
12