Shelving / Checkpointing thoughts

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
22 messages Options
12
Reply | Threaded
Open this post in threaded view
|

Shelving / Checkpointing thoughts

Johan Corveleyn-3
First of all: thanks for working on shelving and checkpointing,
Julian. These could become very important and big features. Daunting
to take them on, but it's good to see someone having a go at it.

I've tried to read through all the docs and recent mail threads. Below
are a couple of thoughts.

TL;DR: I'd suggest first going for a very good Shelving
implementation, and not rushing for Checkpointing. Shelving can
already solve some checkpointing use-cases (see below), and
Checkpointing as a full-blown feature needs very careful thought (even
if limited, it's risky to paint ourselves into a corner), and will
quickly raise expectations of "local commits" or "local branching"
(which still seem far away).


== Shelving ==

Looks great so far. Of course a lot of challenges remain for all the
cases which are not yet (correctly) covered by 'svn diff' and 'svn
patch' (property changes, tree operations, binary files, unresolved
conflicts, etc.). Attaching a log message to a shelf is key, and the
association with changelists looks like a good approach.

- Shelves should (eventually) support directories as "versioned
items". Changelists currently don't support directories.

- Suggestion: 'svn shelve --keep', to create a shelf (patch) in the
"shelf storage" but not revert it. That would enable some crude way of
checkpointing your work, through simple patches (which can be applied
later by fuzzy patching, or ...):

    work on feature A
    svn shelve --keep --name "feature A"
    continue work on feature A
    turns out badly, lets go back
    svn revert; svn unshelve "feature A"


== Checkpointing ==

I think only "option 3" looks viable / interesting in the long run
(option 1, storing patches, looks a lot like simple shelving, so not
much more value imho). Or even a completely different approach which
is implemented in working copy storage (option 4? I haven't thought
this through, but I'm afraid of the disk space requirements, and init
I/O cost, of option 3 for large multi-GB working copies).

It's very hard for me to not think of checkpoints as local branches of
some sort. And my users will immediately want to use them in that way.
In all honesty, I think we should aim for powerful local branches (and
think of an architecture / design / ui for that), and then think about
how we can perhaps start with something simpler and more limited as a
first step, but which goes in that direction. I.e. a more holistic
design around "local branches", "local commits", "checkpointing".
What's the big picture?

- After reading the "Terminology" section of
https://wiki.apache.org/subversion/SavePoints, I agree with that
document that "Savepoints" might be a better name. But don't want to
bikeshed over it ...

- In a prior mail-thread between you and Nathan Hartman the "rebasing"
problem was mentioned. In [1] you said:

> Performing an 'update' with a checkpoint series is a bigger ask than it
> might at first seem. In effect, it requires rebasing the series of
> checkpoints on the new base, which gets ugly because of the need to
> handle conflicts (which is ugly enough already in the existing
> single-depth WC).

However, that seems to only sane way to go for me. Rebasing the
checkpoints one by one, and resolving conflicts along the way. Don't
know how you'd do that though, if the checkpoints are revisions 1, 2,
3 in a local repository (with immutable history etc). This is really
something where the "local repository" technique breaks down IMHO. In
contract with DVCS's, in SVN history is immutable. But mutability is
quite important for local branches / commits.

In that sense, a series of patches is more flexible: you can still
apply them with fuzz even if applying them to a different BASE state,
and often that will "just work" (and conflicts "just" need to be
resolved).

In that light, Nathan's latest post in that thread ([2]) was also
interesting, where he suggested to store the BASE together with the
WORK for every checkpoint. I'm not sure if that's the way to go (maybe
you only need a "description of BASE", not the pristines etc), but it
made me realize: don't expect checkpoints to work for "undo 'svn
update'" if you can't restore the original BASE tree (down to the
exact mixed-revisionness).

[1] https://svn.haxx.se/dev/archive-2017-07/0302.shtml
[2] https://svn.haxx.se/dev/archive-2017-08/0064.shtml

--
Johan
Reply | Threaded
Open this post in threaded view
|

Re: Shelving / Checkpointing thoughts

Julian Foad-5
Johan, thank you very much for these considered thoughts!

Johan Corveleyn wrote:

> First of all: thanks for working on shelving and checkpointing,
> Julian. These could become very important and big features. Daunting
> to take them on, but it's good to see someone having a go at it.
>
> I've tried to read through all the docs and recent mail threads. Below
> are a couple of thoughts.
>
> TL;DR: I'd suggest first going for a very good Shelving
> implementation, and not rushing for Checkpointing. Shelving can
> already solve some checkpointing use-cases (see below), and
> Checkpointing as a full-blown feature needs very careful thought (even
> if limited, it's risky to paint ourselves into a corner), and will
> quickly raise expectations of "local commits" or "local branching"
> (which still seem far away).

Agreed.

> == Shelving ==
>
> Looks great so far. Of course a lot of challenges remain for all the
> cases which are not yet (correctly) covered by 'svn diff' and 'svn
> patch' (property changes, tree operations, binary files, unresolved
> conflicts, etc.). Attaching a log message to a shelf is key, and the
> association with changelists looks like a good approach.
>
> - Shelves should (eventually) support directories as "versioned
> items". Changelists currently don't support directories.

I agree with all this too. A log message is supported in the prototype
(but not yet integrated with changelists, as discussed in another thread).

> - Suggestion: 'svn shelve --keep', to create a shelf (patch) in the
> "shelf storage" but not revert it. That would enable some crude way of
> checkpointing your work, through simple patches (which can be applied
> later by fuzzy patching, or ...):
>
>     work on feature A
>     svn shelve --keep --name "feature A"
>     continue work on feature A
>     turns out badly, lets go back
>     svn revert; svn unshelve "feature A"

Yes; implemented now in r1806168.

In fact this usage aligns very well with how I use patch files myself. I
often save a series of patches to checkpoint my development of a
feature, with names like 'foo-1.patch' ... 'foo-N.patch'.

One thing I like to do in that case is to copy the log message from the
top of the previous patch into the top of the new patch and update it.
If Subversion could make that step easier, that would help.

At present the prototype 'svn shelve' accepts a log message with '-m'
or'-F' but doesn't provide an easy way to invoke an editor, assuming you
don't want a log message if you don't specify one. One way to solve that
would be by adopting the same convention as 'commit': pop up an editor
by default and allow quitting with an empty message if desired. Another
way could be a command-line option if we think no-log-message use cases
should be the default. But that's a UI detail that we can leave till later.

The Checkpoint feature could add the copy-and-modify facility for the
log message.


> == Checkpointing ==
>
> I think only "option 3" looks viable / interesting in the long run
> (option 1, storing patches, looks a lot like simple shelving, so not
> much more value imho). Or even a completely different approach which
> is implemented in working copy storage (option 4? I haven't thought
> this through, but I'm afraid of the disk space requirements, and init
> I/O cost, of option 3 for large multi-GB working copies).
>
> It's very hard for me to not think of checkpoints as local branches of
> some sort. And my users will immediately want to use them in that way.
> In all honesty, I think we should aim for powerful local branches (and
> think of an architecture / design / ui for that), and then think about
> how we can perhaps start with something simpler and more limited as a
> first step, but which goes in that direction. I.e. a more holistic
> design around "local branches", "local commits", "checkpointing".
> What's the big picture?

Good question. While I am continuing to think about ways in which some
larger scope towards "local branching" could play out, I don't currently
see that ever working well for Subversion, at least not the current
generation 1.x.

Given your comments about shelving being almost enough, I now wonder if
we should design simple checkpointing as a simple extension to shelving,
whereby we:

  * automatically increment the patch suffix number,
    if the latest shelved patch was made with '--keep-local';

  * also transfer the log message transfer from the previous numbered
    patch and offer it for editing;

  * offer a UI shortcut to revert to the previous version in the series;

and any other similar things.

WDYT?


> - After reading the "Terminology" section of
> https://wiki.apache.org/subversion/SavePoints, I agree with that
> document that "Savepoints" might be a better name. But don't want to
> bikeshed over it ...

Could be. (Also, 'checkpoint' is similar to 'checkout' which makes
command-line completion not work well, and the possible abbreviation
'cp' clashes with 'copy'.)

> - In a prior mail-thread between you and Nathan Hartman the "rebasing"
> problem was mentioned. In [1] you said:
>
>> Performing an 'update' with a checkpoint series is a bigger ask than it
>> might at first seem. In effect, it requires rebasing the series of
>> checkpoints on the new base, which gets ugly because of the need to
>> handle conflicts (which is ugly enough already in the existing
>> single-depth WC).
>
> However, that seems to only sane way to go for me. Rebasing the
> checkpoints one by one, and resolving conflicts along the way. Don't
> know how you'd do that though, if the checkpoints are revisions 1, 2,
> 3 in a local repository (with immutable history etc). This is really
> something where the "local repository" technique breaks down IMHO. In
> contract with DVCS's, in SVN history is immutable. But mutability is
> quite important for local branches / commits.

Yes, I have realized that it is going to be tricky to make storage in a
local repository effectively mutable. There are various options, the
most plausible being to leave old data in place and keep committing the
modified versions of changes, tracking them as needed; but it's not simple.

> In that sense, a series of patches is more flexible: you can still
> apply them with fuzz even if applying them to a different BASE state,
> and often that will "just work" (and conflicts "just" need to be
> resolved).

I don't buy that as an easier solution to updating a patch series. Any
conflicts would be handled worse this way.

> In that light, Nathan's latest post in that thread ([2]) was also
> interesting, where he suggested to store the BASE together with the
> WORK for every checkpoint. I'm not sure if that's the way to go (maybe
> you only need a "description of BASE", not the pristines etc), but it
> made me realize: don't expect checkpoints to work for "undo 'svn
> update'" if you can't restore the original BASE tree (down to the
> exact mixed-revisionness).

Yes... I haven't digested and replied to that yet but I intend to.

> [1] https://svn.haxx.se/dev/archive-2017-07/0302.shtml
> [2] https://svn.haxx.se/dev/archive-2017-08/0064.shtml

Very productive. Please don't hold back on further thoughts!

- Julian


Reply | Threaded
Open this post in threaded view
|

Re: Shelving / Checkpointing thoughts

Johan Corveleyn-3
On Fri, Aug 25, 2017 at 3:33 PM, Julian Foad <[hidden email]> wrote:
...

>> == Shelving ==
>>
>> Looks great so far. Of course a lot of challenges remain for all the
>> cases which are not yet (correctly) covered by 'svn diff' and 'svn
>> patch' (property changes, tree operations, binary files, unresolved
>> conflicts, etc.). Attaching a log message to a shelf is key, and the
>> association with changelists looks like a good approach.
>>
>> - Shelves should (eventually) support directories as "versioned
>> items". Changelists currently don't support directories.
>
> I agree with all this too. A log message is supported in the prototype
> (but not yet integrated with changelists, as discussed in another thread).

We'll need to have the discussion about what changelists should do
with directories though. Not sure what prior discussions were already
held about this, when the changelist feature was designed ...

>> - Suggestion: 'svn shelve --keep', to create a shelf (patch) in the
>> "shelf storage" but not revert it. That would enable some crude way of
>> checkpointing your work, through simple patches (which can be applied
>> later by fuzzy patching, or ...):
>>
>>     work on feature A
>>     svn shelve --keep --name "feature A"
>>     continue work on feature A
>>     turns out badly, lets go back
>>     svn revert; svn unshelve "feature A"
>
> Yes; implemented now in r1806168.

Great!

> In fact this usage aligns very well with how I use patch files myself. I
> often save a series of patches to checkpoint my development of a
> feature, with names like 'foo-1.patch' ... 'foo-N.patch'.
>
> One thing I like to do in that case is to copy the log message from the
> top of the previous patch into the top of the new patch and update it.
> If Subversion could make that step easier, that would help.
>
> At present the prototype 'svn shelve' accepts a log message with '-m'
> or'-F' but doesn't provide an easy way to invoke an editor, assuming you
> don't want a log message if you don't specify one. One way to solve that
> would be by adopting the same convention as 'commit': pop up an editor
> by default and allow quitting with an empty message if desired. Another
> way could be a command-line option if we think no-log-message use cases
> should be the default. But that's a UI detail that we can leave till later.
>
> The Checkpoint feature could add the copy-and-modify facility for the
> log message.

Yes, maybe we'll need to have some grouping structure / namespacing in
the shelves for this. A "rack" or something :-). The rack carries a
name ("savepoints", "feature A"); a single shelf in a rack is just
'svn shelve --rack "feature A"'; If I add more shelves to a rack, they
get numbered.

svn shelve --keep-local --rack savepoints -> shelf "savepoints-0" (or
just "savepoints"?)
svn shelve --keep-local --rack savepoints -> shelf "savepoints-1"
svn revert
svn unshelve savepoints-0

Just thinking out loud here ...

>
>> == Checkpointing ==
>>
>> I think only "option 3" looks viable / interesting in the long run
>> (option 1, storing patches, looks a lot like simple shelving, so not
>> much more value imho). Or even a completely different approach which
>> is implemented in working copy storage (option 4? I haven't thought
>> this through, but I'm afraid of the disk space requirements, and init
>> I/O cost, of option 3 for large multi-GB working copies).
>>
>> It's very hard for me to not think of checkpoints as local branches of
>> some sort. And my users will immediately want to use them in that way.
>> In all honesty, I think we should aim for powerful local branches (and
>> think of an architecture / design / ui for that), and then think about
>> how we can perhaps start with something simpler and more limited as a
>> first step, but which goes in that direction. I.e. a more holistic
>> design around "local branches", "local commits", "checkpointing".
>> What's the big picture?
>
> Good question. While I am continuing to think about ways in which some
> larger scope towards "local branching" could play out, I don't currently
> see that ever working well for Subversion, at least not the current
> generation 1.x.
>
> Given your comments about shelving being almost enough, I now wonder if
> we should design simple checkpointing as a simple extension to shelving,
> whereby we:
>
>   * automatically increment the patch suffix number,
>     if the latest shelved patch was made with '--keep-local';
>
>   * also transfer the log message transfer from the previous numbered
>     patch and offer it for editing;
>
>   * offer a UI shortcut to revert to the previous version in the series;
>
> and any other similar things.
>
> WDYT?

Yes! I think that would be enough for a good release 1.11, with nice
"shelving" and basic checkpointing (actually mainly UI lipstick with
options / conventions on the "shelve / unshelve" commands) on top of
it. Leaves the door open for more fundamental things in the future ...

>> - After reading the "Terminology" section of
>> https://wiki.apache.org/subversion/SavePoints, I agree with that
>> document that "Savepoints" might be a better name. But don't want to
>> bikeshed over it ...
>
> Could be. (Also, 'checkpoint' is similar to 'checkout' which makes
> command-line completion not work well, and the possible abbreviation
> 'cp' clashes with 'copy'.)
>
>> - In a prior mail-thread between you and Nathan Hartman the "rebasing"
>> problem was mentioned. In [1] you said:
>>
>>> Performing an 'update' with a checkpoint series is a bigger ask than it
>>> might at first seem. In effect, it requires rebasing the series of
>>> checkpoints on the new base, which gets ugly because of the need to
>>> handle conflicts (which is ugly enough already in the existing
>>> single-depth WC).
>>
>> However, that seems to only sane way to go for me. Rebasing the
>> checkpoints one by one, and resolving conflicts along the way. Don't
>> know how you'd do that though, if the checkpoints are revisions 1, 2,
>> 3 in a local repository (with immutable history etc). This is really
>> something where the "local repository" technique breaks down IMHO. In
>> contract with DVCS's, in SVN history is immutable. But mutability is
>> quite important for local branches / commits.
>
> Yes, I have realized that it is going to be tricky to make storage in a
> local repository effectively mutable. There are various options, the
> most plausible being to leave old data in place and keep committing the
> modified versions of changes, tracking them as needed; but it's not simple.
>
>> In that sense, a series of patches is more flexible: you can still
>> apply them with fuzz even if applying them to a different BASE state,
>> and often that will "just work" (and conflicts "just" need to be
>> resolved).
>
> I don't buy that as an easier solution to updating a patch series. Any
> conflicts would be handled worse this way.

Okay, updating the patches when the "new base" comes in might indeed
be better. OTOH: if the user has discarded / forgotten about his
checkpoints, any "early conflict resolution of checkpoints / patches"
might be a waste of time. It's a tradeoff I guess (perhaps updating
them should be done "during rebasing", but the user can postpone
conflicts ...). And what if rebasing checkpoint-2 has a conflict, but
that conflict would not occur when rebasing checkpoint-3 (i.e. a
change was made locally which eliminated the conflict)? Hmmm ...

--
Johan
Reply | Threaded
Open this post in threaded view
|

Re: Shelving / Checkpointing thoughts

Julian Foad-5
Johan Corveleyn wrote:
> On Fri, Aug 25, 2017 at 3:33 PM, Julian Foad <[hidden email]> wrote:
[...]>> The Checkpoint feature could add the copy-and-modify facility
for the
>> log message.
>
> Yes, maybe we'll need to have some grouping structure / namespacing in
> the shelves for this. A "rack" or something :-). The rack carries a
> name ("savepoints", "feature A"); a single shelf in a rack is just
> 'svn shelve --rack "feature A"'; If I add more shelves to a rack, they
> get numbered. [...]
I think the terminology works best, and most in line with other tools
(p4, hg, bzr) like this:

  * "Shelving" or "to shelve" means putting something on a shelf. There
is one "shelf" per WC.

  * The thing we put on the shelf is called a "patch" or a "shelved
change", and is analogous to a book or a paper placed on the shelf. A
numbered version of a patch can be called a "checkpoint".

  * A series of checkpoint patches is a series of "patch versions" or a
"checkpoint series". I think this is simpler than introducing a new term.

- Julian
Reply | Threaded
Open this post in threaded view
|

Re: Shelving / Checkpointing thoughts

Johan Corveleyn-3
On Fri, Aug 25, 2017 at 4:59 PM, Julian Foad <[hidden email]> wrote:

> Johan Corveleyn wrote:
>> On Fri, Aug 25, 2017 at 3:33 PM, Julian Foad <[hidden email]> wrote:
> [...]>> The Checkpoint feature could add the copy-and-modify facility
> for the
>>> log message.
>>
>> Yes, maybe we'll need to have some grouping structure / namespacing in
>> the shelves for this. A "rack" or something :-). The rack carries a
>> name ("savepoints", "feature A"); a single shelf in a rack is just
>> 'svn shelve --rack "feature A"'; If I add more shelves to a rack, they
>> get numbered. [...]
> I think the terminology works best, and most in line with other tools
> (p4, hg, bzr) like this:
>
>   * "Shelving" or "to shelve" means putting something on a shelf. There
> is one "shelf" per WC.
>
>   * The thing we put on the shelf is called a "patch" or a "shelved
> change", and is analogous to a book or a paper placed on the shelf. A
> numbered version of a patch can be called a "checkpoint".
>
>   * A series of checkpoint patches is a series of "patch versions" or a
> "checkpoint series". I think this is simpler than introducing a new term.

Ah yes, of course. Sorry, no need to invent a new term.

Just wondering then, when we create a "series of patches" that belong
together, that have some ordering, how do we organize that?

Still only one shelf per WC (*the* shelf)? Grouping them through
naming ("savepoint-1", "savepoint-2" are two shelved patches belonging
to the same series, but "featureA" (which was reverted) is separate
because it doesn't have the same prefix)? Or do we need multiple
shelves with some name too?

Just one more thought: in the namespace of shelved changes, we might
want to reserve "svn:" or some such prefix, for internal use, to give
us possibilities for features built upon the shelving infrastructure.

--
Johan
Reply | Threaded
Open this post in threaded view
|

Re: Shelving / Checkpointing thoughts

Julian Foad-5
Johan Corveleyn wrote:

>>   * "Shelving" or "to shelve" means putting something on a shelf. There
>> is one "shelf" per WC.
>>
>>   * The thing we put on the shelf is called a "patch" or a "shelved
>> change", and is analogous to a book or a paper placed on the shelf. A
>> numbered version of a patch can be called a "checkpoint".
>>
>>   * A series of checkpoint patches is a series of "patch versions" or a
>> "checkpoint series". I think this is simpler than introducing a new term.
>
> Ah yes, of course. Sorry, no need to invent a new term.
>
> Just wondering then, when we create a "series of patches" that belong
> together, that have some ordering, how do we organize that?

Let's clarify. We can mean two possible things when we say 'a series of
patches':

  1. "patch versions": a series of successively better patches, all
attempting the same logical thing, all from the same base, and only one
of which is applied at any time;

  2. a series of patches, each providing a different logical change,
where each patch is based on the result of applying the previous one.
("quilt" is a tool for managing path series of this kind. My 'option 3'
(local repository) design for checkpointing could also be used in this
way, in a primitive way, but would not support revising earlier patches
in the series which is a key strength of what "quilt" can do.)

I am talking about definition 1 ("patch versions").

I propose patches in a series of patch versions be named "featureA-1",
"featureA-2", ... (This is what I do already, manually, in my own work.)

I propose that we should not attempt to provide any special support for
definition 2 within this "shelving" feature; users can manage that
themselves by simply remembering which feature names depend on which
other ones, or by including some other numbering system within the names.

> Still only one shelf per WC (*the* shelf)? Grouping them through
> naming ("savepoint-1", "savepoint-2" are two shelved patches belonging
> to the same series, but "featureA" (which was reverted) is separate
> because it doesn't have the same prefix)? Or do we need multiple
> shelves with some name too?
>
> Just one more thought: in the namespace of shelved changes, we might
> want to reserve "svn:" or some such prefix, for internal use, to give
> us possibilities for features built upon the shelving infrastructure.

Good thought.

- Julian

Reply | Threaded
Open this post in threaded view
|

Re: Shelving / Checkpointing thoughts

Julian Foad-5
Forgot the conclusion...

Julian Foad wrote:
> Johan Corveleyn wrote:
>> Still only one shelf per WC (*the* shelf)? Grouping them through
>> naming ("savepoint-1", "savepoint-2" are two shelved patches belonging
>> to the same series, but "featureA" (which was reverted) is separate
>> because it doesn't have the same prefix)?

Absolutely yes! I thought I had previously indicated so.

>> Or do we need multiple shelves with some name too?

Absolutely not. Not for any reason that I can see.

- Julian
Reply | Threaded
Open this post in threaded view
|

Re: Shelving / Checkpointing thoughts

Daniel Shahaf-2
In reply to this post by Julian Foad-5
Julian Foad wrote on Fri, 25 Aug 2017 21:42 +0100:

> Let's clarify. We can mean two possible things when we say 'a series of
> patches':
>
>   1. "patch versions": a series of successively better patches, all
> attempting the same logical thing, all from the same base, and only one
> of which is applied at any time;
>
>   2. a series of patches, each providing a different logical change,
> where each patch is based on the result of applying the previous one.
> ("quilt" is a tool for managing path series of this kind. My 'option 3'
> (local repository) design for checkpointing could also be used in this
> way, in a primitive way, but would not support revising earlier patches
> in the series which is a key strength of what "quilt" can do.)
>
> I am talking about definition 1 ("patch versions").
>
> I propose patches in a series of patch versions be named "featureA-1",
> "featureA-2", ... (This is what I do already, manually, in my own work.)
>
> I propose that we should not attempt to provide any special support for
> definition 2 within this "shelving" feature; users can manage that
> themselves by simply remembering which feature names depend on which
> other ones, or by including some other numbering system within the names.

Could you explain how you see this working?  In use-case #2, later patches
in the series typically depend on earlier patches.  I don't see how that use-case
can be addressed if the patches are all implemented against deltas against
the same base (for example, because if patch #2 in a 5-patch series is edited,
all later patches would have to be regenerated).

If that's not clear enough I'd be happy to elaborate.

Cheers,

Daniel

P.S. I think brane's suggestion would make the "edit patch #2 in a 5-patch series"
use-case a lot easier: diff3 is exactly the right tool for rebasing patch #3 on top
of the edited #2.
Reply | Threaded
Open this post in threaded view
|

Re: Shelving / Checkpointing thoughts

Branko Čibej
On 27.08.2017 02:04, Daniel Shahaf wrote:

> Julian Foad wrote on Fri, 25 Aug 2017 21:42 +0100:
>> Let's clarify. We can mean two possible things when we say 'a series of
>> patches':
>>
>>   1. "patch versions": a series of successively better patches, all
>> attempting the same logical thing, all from the same base, and only one
>> of which is applied at any time;
>>
>>   2. a series of patches, each providing a different logical change,
>> where each patch is based on the result of applying the previous one.
>> ("quilt" is a tool for managing path series of this kind. My 'option 3'
>> (local repository) design for checkpointing could also be used in this
>> way, in a primitive way, but would not support revising earlier patches
>> in the series which is a key strength of what "quilt" can do.)
>>
>> I am talking about definition 1 ("patch versions").
>>
>> I propose patches in a series of patch versions be named "featureA-1",
>> "featureA-2", ... (This is what I do already, manually, in my own work.)
>>
>> I propose that we should not attempt to provide any special support for
>> definition 2 within this "shelving" feature; users can manage that
>> themselves by simply remembering which feature names depend on which
>> other ones, or by including some other numbering system within the names.
> Could you explain how you see this working?  In use-case #2, later patches
> in the series typically depend on earlier patches.  I don't see how that use-case
> can be addressed if the patches are all implemented against deltas against
> the same base (for example, because if patch #2 in a 5-patch series is edited,
> all later patches would have to be regenerated).
>
> If that's not clear enough I'd be happy to elaborate.
>
> Cheers,
>
> Daniel
>
> P.S. I think brane's suggestion would make the "edit patch #2 in a 5-patch series"
> use-case a lot easier: diff3 is exactly the right tool for rebasing patch #3 on top
> of the edited #2.

Yes, that's one of the reasons I proposed storing whole files instead of
patches. Rebasing and reordering would become much simpler (although not
exactly "simple" if we take tree restructuring into account).

-- Brane
Reply | Threaded
Open this post in threaded view
|

Re: Shelving / Checkpointing thoughts

Daniel Shahaf-2
Branko Čibej wrote on Sun, 27 Aug 2017 02:09 +0200:
> On 27.08.2017 02:04, Daniel Shahaf wrote:
> > P.S. I think brane's suggestion would make the "edit patch #2 in a 5-patch series"
> > use-case a lot easier: diff3 is exactly the right tool for rebasing patch #3 on top
> > of the edited #2.
>
> Yes, that's one of the reasons I proposed storing whole files instead of
> patches. Rebasing and reordering would become much simpler (although not
> exactly "simple" if we take tree restructuring into account).

The part with tree restructuring would "simply" require a working diff3
implementation for tree changes.  Isn't that exactly what the tree
conflicts work is supposed to produce?
Reply | Threaded
Open this post in threaded view
|

Re: Shelving / Checkpointing thoughts

Branko Čibej
On 27.08.2017 06:00, Daniel Shahaf wrote:

> Branko Čibej wrote on Sun, 27 Aug 2017 02:09 +0200:
>> On 27.08.2017 02:04, Daniel Shahaf wrote:
>>> P.S. I think brane's suggestion would make the "edit patch #2 in a 5-patch series"
>>> use-case a lot easier: diff3 is exactly the right tool for rebasing patch #3 on top
>>> of the edited #2.
>> Yes, that's one of the reasons I proposed storing whole files instead of
>> patches. Rebasing and reordering would become much simpler (although not
>> exactly "simple" if we take tree restructuring into account).
> The part with tree restructuring would "simply" require a working diff3
> implementation for tree changes.  Isn't that exactly what the tree
> conflicts work is supposed to produce?

I was wondering about that, yes. Not sure how one would combine
(un)shelving and conflict resoiltion, but clearly it must be possible
somehow.

-- Brane
Reply | Threaded
Open this post in threaded view
|

Re: Shelving / Checkpointing thoughts

Paul Hammant-3
Delegating to libgit2 to invisibly handle: shelves, local-branches and pull-requests could yield a workable and flexible implementation.

Pull-requests flow back to the central repo, of course) and there interactive and automatic workflows include: code review, classic CI, linting and other badge-style data-points.  Those workflows of course are the purview of other pieces of software that are triggered by Subversion.

- Paul
Reply | Threaded
Open this post in threaded view
|

Re: Shelving / Checkpointing thoughts

Julian Foad-5
In reply to this post by Daniel Shahaf-2
Daniel Shahaf wrote:

> Julian Foad wrote on Fri, 25 Aug 2017 21:42 +0100:
>> Let's clarify. We can mean two possible things when we say 'a series of
>> patches':
>>
>>   1. "patch versions": a series of successively better patches, all
>> attempting the same logical thing, all from the same base, and only one
>> of which is applied at any time;
>>
>>   2. a series of patches, each providing a different logical change,
>> where each patch is based on the result of applying the previous one.
>> ("quilt" is a tool for managing path series of this kind. My 'option 3'
>> (local repository) design for checkpointing could also be used in this
>> way, in a primitive way, but would not support revising earlier patches
>> in the series which is a key strength of what "quilt" can do.)
>>
>> I am talking about definition 1 ("patch versions").
>> [...]
>>
>> I propose that we should not attempt to provide any special support for
>> definition 2 within this "shelving" feature; users can manage that
>> themselves by simply remembering which feature names depend on which
>> other ones, or by including some other numbering system within the names.
>
> Could you explain how you see this working?  In use-case #2, later patches
> in the series typically depend on earlier patches.  I don't see how that use-case
> can be addressed if the patches are all implemented against deltas against
> the same base (for example, because if patch #2 in a 5-patch series is edited,
> all later patches would have to be regenerated).

This kind of shelving (with simple checkpointing add-ons) would *not*
help with use case #2: in particular it would not provide any support
for editing an earlier patch and rebasing the later patches. All I meant
is that you would still be free to use (manually) any naming/numbering
system you wish to help you remember how the patches that you shelve are
related.

- Julian

Reply | Threaded
Open this post in threaded view
|

Re: Shelving / Checkpointing thoughts

Julian Foad-5
In reply to this post by Paul Hammant-3
Paul Hammant wrote:
> Delegating to libgit2 to invisibly handle: shelves, local-branches and
> pull-requests could yield a workable and flexible implementation.
> [...]

It will definitely be worth me looking into the possibilities. Thanks
for the suggestion.

- Julian
Reply | Threaded
Open this post in threaded view
|

Re: Shelving / Checkpointing thoughts

Paul Hammant-3
Perhaps easy to prototype in Python, too.

Reply | Threaded
Open this post in threaded view
|

Re: Shelving / Checkpointing thoughts

Paul Hammant-3
Here's a prototype in Python2 that makes a git repo in a 'shelve' folder. Two commits - one with the starting position, and one with the ending position. The 'svn info' for the resource is copied in too (same file name with a '.info' suffix).
import sh
import os
from stat import S_IWUSR, S_IREAD


import fileinput

sh.rm("-rf", "maven-gpg-plugin")
sh.rm("-rf", "shelve")

sh.svn("co", "https://svn.apache.org/repos/asf/maven/plugins/trunk/maven-gpg-plugin/")

# Change two files.
for line in fileinput.input("maven-gpg-plugin/pom.xml", inplace=True):
print "%d: %s" % (fileinput.filelineno(), line),
for line in fileinput.input("maven-gpg-plugin/src/main/java/org/apache/maven/plugin/gpg/SigningBundle.java", inplace=True):
print "%d: %s" % (fileinput.filelineno(), line),

# While files changed?
files = sh.svn("st", "maven-gpg-plugin")

sh.mkdir("shelve")
sh.git("init", "shelve")

# Copy originals in.
for line in files.splitlines():
file_name = line[1:].strip()
info = sh.svn("info", file_name)
for iLine in info:
if iLine.startswith("Checksum:"):
checksum = iLine.split(" ")[1].strip()
dir = checksum[0:2]
sh.mkdir("-p", "shelve/" + "/".join(file_name.split('/')[:-1]))
sh.cp("maven-gpg-plugin/.svn/pristine/" + dir + "/" + checksum + ".svn-base", "shelve/" + file_name)
os.chmod("shelve/" + file_name, S_IWUSR | S_IREAD) # make writable
f = open("shelve/" + file_name + ".info", 'w')
f.writelines(info)
f.close()

# Do a commit
sh.cd("shelve")
sh.git("add", ".")
sh.git("commit", "-m", "start")
sh.cd("..")

# Copy changed versions in.
for line in files.splitlines():
file_name = line[1:].strip()
sh.cp(file_name, "shelve/" + file_name)

# Do a commit
sh.cd("shelve")
sh.git("add", ".")
sh.git("commit", "-m", "finish")
sh.cd("..")

Reply | Threaded
Open this post in threaded view
|

Re: Shelving / Checkpointing thoughts

Daniel Shahaf-2
In reply to this post by Paul Hammant-3
Paul Hammant wrote on Sun, Aug 27, 2017 at 07:04:03 -0400:
> Delegating to libgit2 to invisibly handle: shelves, local-branches and
> pull-requests could yield a workable and flexible implementation.

Apache products aren't allowed to depend on GPL'd code for mandatory /
core features.

What is the advantage of linking to libgit2 over simply telling users to
use git-svn with 'git stash' and local branches?
Reply | Threaded
Open this post in threaded view
|

Re: Shelving / Checkpointing thoughts

Stefan Kueng
In reply to this post by Branko Čibej


On 27.08.2017 02:09, Branko Čibej wrote:

> On 27.08.2017 02:04, Daniel Shahaf wrote:
>> Julian Foad wrote on Fri, 25 Aug 2017 21:42 +0100:
>>> Let's clarify. We can mean two possible things when we say 'a series of
>>> patches':
>>>
>>>    1. "patch versions": a series of successively better patches, all
>>> attempting the same logical thing, all from the same base, and only one
>>> of which is applied at any time;
>>>
>>>    2. a series of patches, each providing a different logical change,
>>> where each patch is based on the result of applying the previous one.
>>> ("quilt" is a tool for managing path series of this kind. My 'option 3'
>>> (local repository) design for checkpointing could also be used in this
>>> way, in a primitive way, but would not support revising earlier patches
>>> in the series which is a key strength of what "quilt" can do.)
>>>
>>> I am talking about definition 1 ("patch versions").
>>>
>>> I propose patches in a series of patch versions be named "featureA-1",
>>> "featureA-2", ... (This is what I do already, manually, in my own work.)
>>>
>>> I propose that we should not attempt to provide any special support for
>>> definition 2 within this "shelving" feature; users can manage that
>>> themselves by simply remembering which feature names depend on which
>>> other ones, or by including some other numbering system within the names.
>> Could you explain how you see this working?  In use-case #2, later patches
>> in the series typically depend on earlier patches.  I don't see how that use-case
>> can be addressed if the patches are all implemented against deltas against
>> the same base (for example, because if patch #2 in a 5-patch series is edited,
>> all later patches would have to be regenerated).
>>
>> If that's not clear enough I'd be happy to elaborate.
>>
>> Cheers,
>>
>> Daniel
>>
>> P.S. I think brane's suggestion would make the "edit patch #2 in a 5-patch series"
>> use-case a lot easier: diff3 is exactly the right tool for rebasing patch #3 on top
>> of the edited #2.
>
> Yes, that's one of the reasons I proposed storing whole files instead of
> patches. Rebasing and reordering would become much simpler (although not
> exactly "simple" if we take tree restructuring into account).

Another reason to store whole files: patches don't work with non-text
files. Which is a major problem especially on Windows where files often
are encoded in utf16.

Stefan

Reply | Threaded
Open this post in threaded view
|

Re: Shelving / Checkpointing thoughts

Julian Foad-5
Stefan Kueng wrote:
> Another reason to store whole files: patches don't work with non-text
> files. Which is a major problem especially on Windows where files often
> are encoded in utf16.

Making 'svn diff' and 'svn patch' support non-text files is already a
dependency of shelving-by-patch-files.

At present, 'svn diff --git' supports non-text files by writing into the
patch file (and encoded into ASCII), the whole of the old file and the
whole of the new file. So that already does what you are asking: it
'stores whole files'. And 'svn patch' handles this as input. Just like
'svn merge', it chooses either one side change or the other, and if both
sides changed that is a conflict. (I haven't tested it, I assume it
works like that.)

Storing whole files will not magically make merging UTF-16 text changes
work. For that, the merge will need to be taught how to process UTF-16
text. And if we do that, then 'svn diff' and 'svn patch' could process
UTF-16 as text too. That would be a great enhancement to Subversion, but
is an orthogonal issue.

- Julian
Reply | Threaded
Open this post in threaded view
|

Re: Shelving / Checkpointing thoughts

Julian Foad-5
Julian Foad wrote:
> Stefan Kueng wrote:
>> Another reason to store whole files: patches don't work with non-text
>> files. Which is a major problem especially on Windows where files often
>> are encoded in utf16.
>
> Making 'svn diff' and 'svn patch' support non-text files is already a
> dependency of shelving-by-patch-files.

See the "Existing Issues to Overcome" section,
https://docs.google.com/document/d/1PVgw0BdPF7v67oxIK7B_Yjmr3p28ojabP5N1PfZTsHk/edit#heading=h.4sj99cgksge9

> At present, 'svn diff --git' supports non-text files by writing into the
> patch file (and encoded into ASCII), the whole of the old file and the
> whole of the new file. So that already does what you are asking: it
> 'stores whole files'. And 'svn patch' handles this as input. Just like
> 'svn merge', it chooses either one side change or the other, and if both
> sides changed that is a conflict. (I haven't tested it, I assume it
> works like that.)

Which implies that binary file support might "just work" if I simply add
the "--git" option in to the "diff" call in the prototype. I plan to try
that.

And before anyone raises their hand to say this design will be too slow
for use cases involving very large binary files, yes, I know, it almost
certainly will be. This is called a prototype for good reason.

- Julian


> Storing whole files will not magically make merging UTF-16 text changes
> work. For that, the merge will need to be taught how to process UTF-16
> text. And if we do that, then 'svn diff' and 'svn patch' could process
> UTF-16 as text too. That would be a great enhancement to Subversion, but
> is an orthogonal issue.

12