LZ4 compression

Julian Foad-5
Evgeny Kotkov wrote (in the 'Proposal: new fsfs.conf properties' thread):
> To improve the situation with slow commits of large binary and, possibly,
> incompressible files I committed a patch (http://svn.apache.org/r1801940)
> that adds initial support for LZ4 compression in the backend.

This sounds amazing!

and in the log message:
> [...] The interoperability is implemented by bumping the format of
> svndiff to 2 and the repository file system format to 8. [...]

Can you state simply what facets of Subversion this benefits so far? It
looks like it is the compression/deltification internal to FSFS, stored
in rev files, and not exposed outside FSFS, so it affects server speed
(and thus commit speed) and server storage space. Any significant effect
on speed of reading the repo (decompression)?

What parts of Subversion could this usefully be extended to affect?
(Speed of compression/deltification performed on the client for commits?
Size reduction of data sent over the wire to/from new enough clients?)

> Currently, LZ4 compression is enabled if the fsfs.conf file specifies
> compression-level=1, and all other levels still use zlib for compression

Is this just a 'safe' starting point for testing and will we likely
change this to use LZ4 for all 'compression-level' settings?

- Julian


Re: LZ4 compression

Evgeny Kotkov
LZ4 offers much faster decompression than zlib, and read operations should benefit from this change as well.  (Don't have the exact numbers available here on my phone, sorry for that.)

https://svn.apache.org/r1801974 adds negotiation and support for LZ4 in mod_dav_svn and in ra_serf.  Except for the PUT requests, which require a couple of tweaks to the negotiation scheme, with this change svndiff2 with LZ4 will be used in the repository on-disk data and as the wire format for http://, if the corresponding compression level is set to 1.

Speaking of only using it with compression level 1, that's not a starting point.  While LZ4 offers superior speeds, it is not a substitute for any zlib compression level > 1, including our current default of 5, as the latter gives better compression ratio.

I was thinking that it might make sense to make the compression level 1 our new default, so that new installations would benefit from the increased speed, while still keeping a decent compression ratio.  (If necessary, that could still be tweaked for better compression.)
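
The mapping described above, as of r1801940, can be sketched as follows. This is only an illustration, not code from the Subversion source: the function name `choose_codec` is hypothetical, and treating level 0 as "no compression" is an assumption carried over from how earlier FSFS formats document compression-level.

```python
def choose_codec(compression_level: int) -> str:
    """Return the codec implied by an fsfs.conf compression-level value.

    Hypothetical helper for illustration only.
    Assumption: level 0 disables compression entirely.
    """
    if not 0 <= compression_level <= 9:
        raise ValueError("compression-level must be in the range 0..9")
    if compression_level == 0:
        return "none"
    if compression_level == 1:
        # Level 1 is the only setting that selects LZ4 (svndiff2).
        return "lz4"
    # Every other level keeps using zlib at that level.
    return f"zlib-{compression_level}"


if __name__ == "__main__":
    for level in (0, 1, 5, 9):
        print(level, choose_codec(level))
```

Under this scheme the current default of 5 still means zlib; only an explicit compression-level=1 (or a changed default) switches a repository to LZ4.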


Regards,
Evgeny Kotkov

[On vacation, from the mobile; please excuse top-posting and brevity.]

On Monday, July 17, 2017, Julian Foad <[hidden email]> wrote:
> Evgeny Kotkov wrote (in the 'Proposal: new fsfs.conf properties' thread):
>> To improve the situation with slow commits of large binary and, possibly,
>> incompressible files I committed a patch (http://svn.apache.org/r1801940)
>> that adds initial support for LZ4 compression in the backend.
>
> This sounds amazing!
>
> and in the log message:
>> [...] The interoperability is implemented by bumping the format of
>> svndiff to 2 and the repository file system format to 8. [...]
>
> Can you state simply what facets of Subversion this benefits so far? It
> looks like it is the compression/deltification internal to FSFS, stored
> in rev files, and not exposed outside FSFS, so it affects server speed
> (and thus commit speed) and server storage space. Any significant effect
> on speed of reading the repo (decompression)?
>
> What parts of Subversion could this usefully be extended to affect?
> (Speed of compression/deltification performed on the client for commits?
> Size reduction of data sent over the wire to/from new enough clients?)
>
>> Currently, LZ4 compression is enabled if the fsfs.conf file specifies
>> compression-level=1, and all other levels still use zlib for compression
>
> Is this just a 'safe' starting point for testing and will we likely
> change this to use LZ4 for all 'compression-level' settings?
>
> - Julian


Re: LZ4 compression

Stefan Sperling
On Mon, Jul 17, 2017 at 02:54:59PM +0200, Evgeny Kotkov wrote:

> LZ4 offers much faster decompression than zlib, and read operations should
> benefit from this change as well.  (Don't have the exact numbers available
> here on my phone, sorry for that.)
>
> https://svn.apache.org/r1801974 adds negotiation and support for LZ4 in
> mod_dav_svn and in ra_serf.  Except for the PUT requests, which require a
> couple of tweaks to the negotiation scheme, with this change svndiff2 with
> LZ4 will be used in the repository on-disk data and as the wire format for
> http://, if the corresponding compression level is set to 1.
>
> Speaking of only using it with compression level 1, that's not a starting
> point.  While LZ4 offers superior speeds, it is not a substitute for any
> zlib compression level > 1, including our current default of 5, as the
> latter gives better compression ratio.
>
> I was thinking that it might make sense to make the compression level 1 our
> new default, so that new installations would benefit from the increased
> speed, while still keeping a decent compression ratio.  (If necessary, that
> could still be tweaked for better compression.)

This feature would be a nice addition to 1.10.

Is 1.10 your target release? Given that it's on trunk already,
it would seem so :)

Is there any reason to wait before rolling a 1.10 alpha release?

Re: LZ4 compression

Julian Foad-5
Stefan Sperling wrote:
> On Mon, Jul 17, 2017 at 02:54:59PM +0200, Evgeny Kotkov wrote:
[...]
>> LZ4 offers much faster decompression than zlib, and read operations should
>> benefit from this change as well. [...]

Thanks for the summary, Evgeny.

>> https://svn.apache.org/r1801974 adds negotiation and support for LZ4 in
>> mod_dav_svn and in ra_serf.  Except for the PUT requests, [...]

It would be nice to see the same negotiation added for svnserve
protocol, if anyone is willing to do that.

>> Speaking of only using it with compression level 1, that's not a starting
>> point.  While LZ4 offers superior speeds, it is not a substitute for any
>> zlib compression level > 1, including our current default of 5, as the
>> latter gives better compression ratio.

>> I was thinking that it might make sense to make the compression level 1 our
>> new default, so that new installations would benefit from the increased
>> speed, while still keeping a decent compression ratio.  (If necessary, that
>> could still be tweaked for better compression.)
>
> This feature would be a nice addition to 1.10.
>
> Is 1.10 your target release? Given that it's on trunk already,
> it would seem so :)
>
> Is there any reason to wait before rolling a 1.10 alpha release?

danielsh made suggestions about the config option, in reply to the
commit email, which want opinions/follow-up, but need not necessarily
delay an alpha.

- Julian


Re: LZ4 compression

Daniel Shahaf-5
Julian Foad wrote on Mon, 17 Jul 2017 16:09 +0100:
> Stefan Sperling wrote:
> > On Mon, Jul 17, 2017 at 02:54:59PM +0200, Evgeny Kotkov wrote:
> > Is there any reason to wait before rolling a 1.10 alpha release?
>
> danielsh made suggestions about the config option, in reply to the
> commit email, which want opinions/follow-up, but need not necessarily
> delay an alpha.

I can roll an alpha tomorrow.  We'll want to write some release note text
about FSFS f8, reiterating the usual "No upgrade path is promised" warning.

Re: LZ4 compression

Paul Hammant-3
In reply to this post by Evgeny Kotkov

Evgeny wrote:

> Speaking of only using it with compression level 1, that's not a starting
> point.  While LZ4 offers superior speeds, it is not a substitute for any
> zlib compression level > 1, including our current default of 5, as the
> latter gives better compression ratio.
>
> I was thinking that it might make sense to make the compression level 1 our
> new default, so that new installations would benefit from the increased
> speed, while still keeping a decent compression ratio.  (If necessary, that
> could still be tweaked for better compression.)

What if we stepped back from the numerical series?  This new compression scheme could go in as 'simple-lz4' rather than as a number, and any schemes added in the future would be named too.

-ph  

Re: LZ4 compression

Stefan Sperling
In reply to this post by Evgeny Kotkov
On Mon, Jul 17, 2017 at 02:54:59PM +0200, Evgeny Kotkov wrote:

> https://svn.apache.org/r1801974 adds negotiation and support for LZ4 in
> mod_dav_svn and in ra_serf.  Except for the PUT requests, which require a
> couple of tweaks to the negotiation scheme, with this change svndiff2 with
> LZ4 will be used in the repository on-disk data and as the wire format for
> http://, if the corresponding compression level is set to 1.
>
> Speaking of only using it with compression level 1, that's not a starting
> point.  While LZ4 offers superior speeds, it is not a substitute for any
> zlib compression level > 1, including our current default of 5, as the
> latter gives better compression ratio.
>
> I was thinking that it might make sense to make the compression level 1 our
> new default, so that new installations would benefit from the increased
> speed, while still keeping a decent compression ratio.  (If necessary, that
> could still be tweaked for better compression.)

I am not sure where we are at in the discussion of the new option knobs,
so I'll add my opinion here. Bear in mind I have not looked at the
implementation so I might have missed something. I'll just express
what I would like to see as somebody who occasionally has to debug
servers with plain config files and already has to deal with mountains
of existing config settings in Subversion and HTTPD.

FSFS8: Default to lz4.
       Allow use of zlib only with an option (+ compression-levels)
       or disallow zlib entirely with Format 8 and deprecate/ignore
       compression options in fsfs.conf.
       
RA protocols: Use client<->server negotiation for lz4; fall back to zlib.
              Do not require me to set an option for this.
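
The negotiation idea can be sketched as picking the highest svndiff version that both sides understand, with no user-visible option involved. This is a rough illustration under stated assumptions, not the mechanism used by mod_dav_svn or ra_serf: the names `SVNDIFF_CODECS` and `negotiate_svndiff` are hypothetical, and the way each side advertises its versions is left out.

```python
# svndiff versions and the compression each implies: svndiff0 is
# uncompressed, svndiff1 uses zlib, svndiff2 (new in this work) uses LZ4.
SVNDIFF_CODECS = {0: "uncompressed", 1: "zlib", 2: "lz4"}


def negotiate_svndiff(client_supports, server_supports):
    """Return the highest svndiff version known to both peers.

    Hypothetical helper: how the peers exchange their supported
    versions (headers, capabilities, ...) is not modelled here.
    Falls back to svndiff0 if there is no common version.
    """
    common = set(client_supports) & set(server_supports) & set(SVNDIFF_CODECS)
    return max(common, default=0)


if __name__ == "__main__":
    # New client, new server: LZ4 is selected automatically.
    print(SVNDIFF_CODECS[negotiate_svndiff([0, 1, 2], [0, 1, 2])])
    # Old client, new server: both sides fall back to zlib.
    print(SVNDIFF_CODECS[negotiate_svndiff([0, 1], [0, 1, 2])])
```

With a scheme like this, an older peer that never advertises svndiff2 simply keeps getting zlib, which matches the "fall back to zlib, no option required" behaviour asked for above.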