Checkout without storing two copies


Robert Hickman
Hello,

I tend to work on projects with a large amount of binary data along
with source code and need to track them together. To this date
Subversion is the only tool that I've used which handles this
dependably. That being said I have one major issue with it - the last
time I used SVN it stored two copies of every file in a checkout. For
what I am doing this additional data is useless. I frequently add new
binary files but rarely modify them in place. It would be extremely
useful for me if there was an option to store only one copy and lose
delta uploads. It's all new data so there is nothing to delta anyway.

As I have not used SVN for several years I realize that this feature
may have been added. If not, has it been considered?

Robert.

Re: Checkout without storing two copies

Ryan Schmidt-8

On Sep 26, 2017, at 13:13, Robert Hickman wrote:

> I tend to work on projects with a large amount of binary data along
> with source code and need to track them together. To this date
> Subversion is the only tool that I've used which handles this
> dependably. That being said I have one major issue with it - the last
> time I used SVN it stored two copies of every file in a checkout. For
> what I am doing this additional data is useless. I frequently add new
> binary files but rarely modify them in place. It would be extremely
> useful for me if there was an option to store only one copy and loose
> delta uploads. It's all new data so there is nothing to delta anyway.
>
> As I have not used SVN for several years I realize that this feature
> may have been added. If not has it been considered?

The feature hasn't been added yet.

https://issues.apache.org/jira/browse/SVN-525



Re: Checkout without storing two copies

Stefan Sperling-9
On Wed, Sep 27, 2017 at 06:01:33AM -0500, Ryan Schmidt wrote:

>
> On Sep 26, 2017, at 13:13, Robert Hickman wrote:
>
> > I tend to work on projects with a large amount of binary data along
> > with source code and need to track them together. To this date
> > Subversion is the only tool that I've used which handles this
> > dependably. That being said I have one major issue with it - the last
> > time I used SVN it stored two copies of every file in a checkout. For
> > what I am doing this additional data is useless. I frequently add new
> > binary files but rarely modify them in place. It would be extremely
> > useful for me if there was an option to store only one copy and loose
> > delta uploads. It's all new data so there is nothing to delta anyway.
> >
> > As I have not used SVN for several years I realize that this feature
> > may have been added. If not has it been considered?
>
> The feature hasn't been added yet.
>
> https://issues.apache.org/jira/browse/SVN-525
>

I suspect the only problem with this feature request is that nobody
has time to work on it :-/

Re: Checkout without storing two copies

Paul Hammant-3
In reply to this post by Robert Hickman


> As I have not used SVN for several years I realize that this feature
> may have been added. If not has it been considered?

I have a file-sync agent that uses a non-standard Subversion install as a backing store over WebDAV. It keeps only one copy on the client side, and shuttles all saves around the team that is subscribed to the same directory in the repo. It obeys permissions, of course. It happily moves 10GB files and, like Svn itself, can go up to multiple TB in the backend (I've tested it to 3.4TB). The tech is written in Python. It suits people who use MS Office as a tool, rather than devs doing development.

Can you say more about your usage patterns, the number of people who'd use it, the frequency of change, and where the users are on the source-control-savvy spectrum?

Regards,

- Paul
 

Re: Checkout without storing two copies

Robert Hickman
In reply to this post by Stefan Sperling-9
@Ryan Schmidt @Stefan Sperling. I guess that the difficulty of
implementing this depends on how much of the client code depends on
the existence of those files. From the linked bug tracker item, the
answer appears to be 'quite a lot', though I don't know anything about
this codebase.

@Paul Hammant This is used by myself only. I am familiar with
working with SVN and Git on the command line, but am in no way an
'expert'. I prefer tools which are simple and developer focused, use
Linux exclusively, and prefer text file configuration.

I mainly work on a desktop but sometimes need to move part of the
file-system onto a laptop and then merge it back again. By file size
most of this data is DSLR raw files and source video, most of which is
related to a website with associated source code. Additionally I have
multiple unrelated personal projects from the past 7 years which need
to be in their own repositories, and miscellaneous 'stuff' which also
needs to stay separate. Some files are interdependent, some are not.

I too have developed a tool to fit my needs, having become
sufficiently frustrated with other tools. However I feel that I'm just
reimplementing part of Subversion, hence the question.

https://github.com/robehickman/simple-http-file-sync

The implementation of this system is very naive, for example storing
its file manifest as JSON. It also has a number of problems that I
haven't fixed yet. However I've been quite surprised at how well it
works. It handles 16,000 individual files in one of my projects
without difficulty, the biggest bottleneck being the network.
Currently I'm using this to manage the binary stuff and git for code.
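
To illustrate what a JSON file manifest can look like, here is a minimal sketch. It is not taken from simple-http-file-sync: the function names (build_manifest, save_manifest, load_manifest, changed_paths) and the choice of size plus SHA-256 per file are assumptions made for the example.

```python
import hashlib
import json
import os


def build_manifest(root):
    """Map each file under root (by relative path) to its size and SHA-256."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root).replace(os.sep, "/")
            digest = hashlib.sha256()
            with open(full, "rb") as fh:
                # Hash in 1 MiB chunks so large binaries are never fully in memory.
                for chunk in iter(lambda: fh.read(1 << 20), b""):
                    digest.update(chunk)
            manifest[rel] = {
                "size": os.path.getsize(full),
                "sha256": digest.hexdigest(),
            }
    return manifest


def save_manifest(manifest, path):
    """Write the manifest as plain, human-readable JSON."""
    with open(path, "w") as fh:
        json.dump(manifest, fh, indent=2, sort_keys=True)


def load_manifest(path):
    """Read a manifest previously written by save_manifest."""
    with open(path) as fh:
        return json.load(fh)


def changed_paths(old, new):
    """Return the sorted paths that were added, modified, or deleted."""
    return sorted(p for p in set(old) | set(new) if old.get(p) != new.get(p))
```

Comparing the manifest saved at the last sync with one built from the current tree gives the list of paths that need to go over the network, without keeping a second copy of any file.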

I was surprised how easy it was to implement this. The above system is
just over 1000 lines of Python and a good chunk of that is a
journaling file-system interface.


Re: Checkout without storing two copies

Paul Hammant-3
> * HTTP(S) based sync protocol.   

Mine uses Subversion's WebDAV as is.

> * All files, both on the client and the server, are stored as plain files with there original names. 

Mine too, or plain binary 'as is'

> * Stores limited version history on the server only. Has limited support for file versioning. 

Mine: only the current version is stored on the client. It works out if a 'clash' is about to happen.

> * No web/graphical UI Designed to perform a single function only, provides a minimal command line interface. 

Yup, though I'll have a tray piece in time like https://www.sparkleshare.org/ and DropBox.

> * No database dependency Stores file manifest information as regular JSON. 

I've metadata stored client side in JSON too.

> * Atomic file system operations through journaling. 

I've no server side beyond Subversion.

> * Supports partial checkouts

Got that too :)

For your README, you'd be better to move the rationale to a separate page, and concentrate on hooking the potential user in.

- Paul

Re: Checkout without storing two copies

Robert Hickman
> Mine uses Subversions WebDAV as is.

What is Subversion's WebDAV interface like to work with?

>> * Atomic file system operations through journaling.
> I've no server side beyond Subversion.

The journaling system was mostly needed on the client. During a
'checkout' any file being placed in the local file system also has to
be added to the manifest. If this is not atomic the FS could be left
in an inconsistent state. The journal goes some way towards solving
this as it can detect if something caused a failure between the two
operations and roll back.

This does nothing to help if another process is modifying the
file-system at the same time. The only way of addressing that would be
to lock the whole file-system during that change. I have not found
this to be an issue in practice.
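
Short of locking the whole file-system, a weaker substitute is an advisory lock on a lock file. The sketch below is an assumption for illustration, not something the tool does: the name working_copy_lock and the lock file name sync.lock are invented, and it only protects against other processes that take the same lock. It uses fcntl.flock, so it is Unix-only.

```python
import contextlib
import fcntl
import os


@contextlib.contextmanager
def working_copy_lock(meta_dir, name="sync.lock"):
    """Hold an exclusive advisory lock for the duration of the with-block.

    Raises BlockingIOError at once if another holder already has the lock.
    """
    path = os.path.join(meta_dir, name)
    fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o644)
    try:
        # LOCK_NB makes the call fail immediately instead of waiting.
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        try:
            yield path
        finally:
            fcntl.flock(fd, fcntl.LOCK_UN)
    finally:
        os.close(fd)
```

This stops two runs of the same sync client from interleaving, but an editor or any other program writing into the tree ignores the lock entirely, so the limitation described above still applies.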

> Yup, though I'll have a tray piece in time

Personally I'm happy with scrolling text in a terminal. I use Xmonad
almost stock, no system tray or any kind of status-bar.
