[PATCH] release.py: write-changelog function (POC)

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
11 messages Options
Reply | Threaded
Open this post in threaded view
|

[PATCH] release.py: write-changelog function (POC)

Johan Corveleyn-3
As promised on IRC last Friday, here is a POC implementation of the
"generate changelog from commit messages" functionality, added to
release.py (I'm not very experienced in Python; I mainly depend on
google, SO, copy-paste, ... so don't focus on coding style etc).

This patch is intended to enable further discussion of this idea
(which was already discussed on this list 4 years ago [1]).

The idea is that we agree on some structured syntax for adding
(optionally) a changelog text to commit messages, to make it easier
for RMs to generate the text for CHANGES (and push the responsibility
for writing a good CHANGES entry first and foremost to the original
committer; and keep the relevant info coupled with the commit). RMs
can use the output of write-changelog as a draft, they can still edit
it, summarize items, shuffle things around, ... But it gives them a
rough version to start from.

Proposal for changelog syntax in commit messages (as implemented in this patch):
[[[
Changelog lines are lines with the following format:
  '['<visibility>:<section>']' <message>
where <visibility> = U (User-visible) or D (Developer-visible)
      <section> =
general|major|minor|client|server|client-server|other|api|bindings
                  (section is treated case-insensitively)
      <message> = the actual text for CHANGES

Examples:
  [U:major] Better interactive conflict resolution for tree conflicts
  [U:minor] ra_serf: Adjustments for serf versions with HTTP/2 support
  [U:client] Fix 'svn diff URL@REV WC' wrongly looks up URL@HEAD (issue #4597)
  [U:client-server] Fix bug with canonicalizing Window-specific
drive-relative URL
  [D:api] New svn_ra_list() API function
  [D:bindings] JavaHL: Allow access to constructors of a couple JavaHL classes

Q: Shorter prefix syntax ([U:C], [U:CS], [U:MJ], ...) to keep lines
short, or longer (and put message on next line) to make it more
human-readable? While making it easily rememberable for devs so they
don't have to look it up all the time when they just want to commit
...

Q: What to do with merged revisions? Use 'log -g', or make sure
relevant changelog entry is part of the commit message of the merge? I
vote for the latter, it's simpler and has less surprises. In case of
feature branches, a generic "Add feature foo" message on the
reintegrate commit usually suffices. In case of backports perhaps our
backport script can scrape the relevant changelog entries from the
revisions-to-be-merged and add them to the commit message of the
merge.
]]]


To get a rough idea: since we don't have any commit messages
containing such lines in our history, I've added a --pocfirstlines
option, which just takes the first line of every log message (ignoring
lines with 'STATUS', 'CHANGES', 'bump', or starting with '*'), putting
them in the "User -> General" section.

Here is the usage output:
[[[
$ ./release.py write-changelog -h
usage: release.py write-changelog [-h] [--pocfirstlines] branch previous

positional arguments:
  branch           The branch (or tag or trunk), relative to ^/subversion/, of
                   which to generate the changelog, when compared to
                   "previous".
  previous         The "previous" branch or tag, relative to ^/subversion/, to
                   compare "branch" against.

optional arguments:
  -h, --help       show this help message and exit
  --pocfirstlines  Proof of concept: just take the first line of each relevant
                   commit messages (except if it contains 'STATUS', 'CHANGES'
                   or 'bump' or starts with '*'), and put it in User:General.
]]]


Example output:
[[[
$ ./release.py write-changelog --pocfirstlines branches/1.9.x tags/1.9.7
 User-visible changes:
  - General:
    * Merge r1804691 and r1804692 from trunk: (r1804698)
  - Client-side bugfixes:
    (none)
  - Server-side bugfixes:
    (none)
  - Bindings bugfixes:
    (none)

 Developer-visible changes:
  - General:
    (none)
  - API changes:
    (none)

$ ./release.py write-changelog --pocfirstlines branches/1.8.x tags/1.8.19
 User-visible changes:
  - General:
    * Merge r1804691 from trunk: (r1804723)
    * On the 1.8.x branch: Merge r1804692 from trunk. (r1804737)
  - Client-side bugfixes:
    (none)
  - Server-side bugfixes:
    (none)
  - Bindings bugfixes:
    (none)

 Developer-visible changes:
  - General:
    (none)
  - API changes:
    (none)

$ ./release.py write-changelog --pocfirstlines trunk tags/1.9.7
 User-visible changes:
  - General:
    * A bug fix and minor tweaks in 'svnmover'. (r1715781)
    * A cosmetic tweak: add a final comma to lists of tests in a few
test files (r1706965)
    * A few FSFS-only tests apply to FSX just as well.  So, run them
for (r1667524)
    * A follow-up to r1715354. (r1715358)
    * A minor code cleanup in FSFS. (r1740722)
    * A minor tweak in 'svnmover'. (r1717793)
    * A small step towards making 'svnmover merge' operate into a new
temporary (r1717957)
    * Abbreviate the potentially rather long list of revisions shown
for tree (r1736063)
    * Actually use some helpful error messaging that we bother to
create in (r1683161)
    * Add "merge_" prefix to the names of conflict resolver merge test
sandboxes. (r1749457)
    * Add '--include' and '--exclude' options to 'svnadmin dump'. (r1811992)
    * Add '--search' option support to 'svnbench null-list'. (r1767202)
    * Add 'http-compression=auto' mode on the client, now used by
default. (r1803899)
...
]]]


Thoughts?

[1] https://lists.apache.org/thread.html/c80dd19a7bbafc4f535382b3f361f76ba6535ab3d447a8b988594bfc@1377814810@%3Cdev.subversion.apache.org%3E

--
Johan

release.py.patch-write-changelog.txt (9K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] release.py: write-changelog function (POC)

Stefan Sperling-9
On Mon, Dec 04, 2017 at 12:16:37PM +0100, Johan Corveleyn wrote:
> Examples:
>   [U:major] Better interactive conflict resolution for tree conflicts
>   [U:minor] ra_serf: Adjustments for serf versions with HTTP/2 support
>   [U:client] Fix 'svn diff URL@REV WC' wrongly looks up URL@HEAD (issue #4597)
>   [U:client-server] Fix bug with canonicalizing Window-specific
> drive-relative URL
>   [D:api] New svn_ra_list() API function
>   [D:bindings] JavaHL: Allow access to constructors of a couple JavaHL classes

> To get a rough idea: since we don't have any commit messages
> containing such lines in our history, I've added a --pocfirstlines
> option, which just takes the first line of every log message (ignoring
> lines with 'STATUS', 'CHANGES', 'bump', or starting with '*'), putting
> them in the "User -> General" section.

If we're gonna add some form of structured information to our log messages,
I'd prefer the git model with a simple heuristic which builds an uncategorized
list:

The first line of the message is what will end up in CHANGES. If the line
mentions an issue number, that will appear in CHANGES as (issue #N), else
the revision number will appear in changes as (rN). If the line matches
'[Rr]evert rN' then a prior entry for rN is removed again. If the log message
matches '[Ff]ollow(ing)-up to' the entry is dropped. If the log message
contains '[Nn]o functional change' anywhere, the entry is dropped.

That should suffice for the most tedious part of the task: Getting a list
of eligible entries. Categorizing the resulting list can be done by hand,
and remaining unimportant changes must be removed by hand.

I would refrain from trying to over-automate this task. It's not a lot of work
compared to the time we spend writing code. I think I have added most of the
1.10 entries in CHANGES myself just by reading through the log, and even that
wasn't too much effort. It took about a day and a half, and it also forced me
to read through the entries in detail which helped understand the impact of
each change. This was a much bigger problem from the project's origins through
to 1.8, when we had a lot of active developers. For 1.9/1.10 we had a lot less
activity overall and we can expect it to decrease further in the future.

Even if we tag log messages as you suggest, we'll need some mechanism to
deal with tags which are wrong or missing, and that's also tedious and
requires checking each entry anyway. Also, it is unclear how tagging as
suggested helps with features developed on trunk over long periods of time.
For example, the conflict resolver probably got a couple hundred commits
but it only requires one CHANGES entry. Should these all be tagged 'U:major'?
Or do we now enforce branch-based development so that only the final merge
commit is tagged, with the downsides that implies, such as no testing during
development by anyone except the original developer? Or do we add feature tags
which our script can recognize, such as [U:conflict-resolver]? Could we now
end up needing multiple tags for some messages? Will it become too complicated?

I believe that whatever we do, somebody will still have to read the full log
and check each entry in CHANGES to avoid listing a lot of trivial stuff,
and to make sure the most impactful changes appear at the top.
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] release.py: write-changelog function (POC)

Johan Corveleyn-3
On Mon, Dec 4, 2017 at 1:45 PM, Stefan Sperling <[hidden email]> wrote:

> On Mon, Dec 04, 2017 at 12:16:37PM +0100, Johan Corveleyn wrote:
>> Examples:
>>   [U:major] Better interactive conflict resolution for tree conflicts
>>   [U:minor] ra_serf: Adjustments for serf versions with HTTP/2 support
>>   [U:client] Fix 'svn diff URL@REV WC' wrongly looks up URL@HEAD (issue #4597)
>>   [U:client-server] Fix bug with canonicalizing Window-specific
>> drive-relative URL
>>   [D:api] New svn_ra_list() API function
>>   [D:bindings] JavaHL: Allow access to constructors of a couple JavaHL classes
>
>> To get a rough idea: since we don't have any commit messages
>> containing such lines in our history, I've added a --pocfirstlines
>> option, which just takes the first line of every log message (ignoring
>> lines with 'STATUS', 'CHANGES', 'bump', or starting with '*'), putting
>> them in the "User -> General" section.
>
> If we're gonna add some form of structured information to our log messages,
> I'd prefer the git model with a simple heuristic which builds an uncategorized
> list:
>
> The first line of the message is what will end up in CHANGES. If the line
> mentions an issue number, that will appear in CHANGES as (issue #N), else
> the revision number will appear in changes as (rN). If the line matches
> '[Rr]evert rN' then a prior entry for rN is removed again. If the log message
> matches '[Ff]ollow(ing)-up to' the entry is dropped. If the log message
> contains '[Nn]o functional change' anywhere, the entry is dropped.

That implies that every commit has something useful to add to CHANGES
(except reverts, followups and non-functional changes). That's not the
case. In my proposal the changelog annotation is entirely optional.
Committers need to think whether their commit merits a changelog entry
(and the phrasing of the changelog entry can be different from the
first line of the commit message, which is more aimed at fellow
developers trying to understand a particular commit / how / why /
...).

> That should suffice for the most tedious part of the task: Getting a list
> of eligible entries. Categorizing the resulting list can be done by hand,
> and remaining unimportant changes must be removed by hand.
>
> I would refrain from trying to over-automate this task. It's not a lot of work
> compared to the time we spend writing code. I think I have added most of the
> 1.10 entries in CHANGES myself just by reading through the log, and even that
> wasn't too much effort. It took about a day and a half, and it also forced me
> to read through the entries in detail which helped understand the impact of
> each change. This was a much bigger problem from the project's origins through
> to 1.8, when we had a lot of active developers. For 1.9/1.10 we had a lot less
> activity overall and we can expect it to decrease further in the future.
>
> Even if we tag log messages as you suggest, we'll need some mechanism to
> deal with tags which are wrong or missing, and that's also tedious and
> requires checking each entry anyway. Also, it is unclear how tagging as
> suggested helps with features developed on trunk over long periods of time.
> For example, the conflict resolver probably got a couple hundred commits
> but it only requires one CHANGES entry. Should these all be tagged 'U:major'?

No, none of those individual commits requires a changelog entry. I'd
say for big new features it makes little sense to add tags for
incremental steps in functionality. Though entries such as "Add
interactive tree conflict resolver", "TC resolver: handle incoming
move vs. local edit", etc... (I'm just making these up) might be
useful.

> Or do we now enforce branch-based development so that only the final merge
> commit is tagged, with the downsides that implies, such as no testing during
> development by anyone except the original developer?

No, I'm not suggesting that.

> Or do we add feature tags
> which our script can recognize, such as [U:conflict-resolver]? Could we now
> end up needing multiple tags for some messages? Will it become too complicated?

No specific feature tags I'd say (but "sections", for the existing
sections in CHANGES). Multiple changelog entries in one commit message
would be allowed, but it'd be rather exceptional (for instance, if one
commit adds something User-visible and Developer-visible at the same
time, or a new feature and a corresponding "API change" -- i.e. you
want a changelog entry in both sections).

> I believe that whatever we do, somebody will still have to read the full log
> and check each entry in CHANGES to avoid listing a lot of trivial stuff,
> and to make sure the most impactful changes appear at the top.

I don't think so. Not if the changelog annotations are used well (as
with writing log message in general, this probably requires practice
to get right -- it'd be a part of reviewing each other's changes to
potentially give feedback on the (optional) changelog annotation).

The intention is that the RM doesn't need to go through all the full
log messages in detail, but that he can trust the output of
write-changelog, combined with trust in the quality of the log
messages involved. Basically CHANGES is built up incrementally as we
go in the commit messages themselves (and reviewed as part of our
normal workflow while the work is being done and everything is fresh
in everyone's memory).

Of course, the RM might see in the output of write-changelog that some
entries can be combined or summarized further, rephrased or reordered.
That's fine (in some cases he might want to correct the changelog
annotation in the underlying log message). But he shouldn't have to go
read all the details, unless something's wrong.

--
Johan
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] release.py: write-changelog function (POC)

Stefan Sperling-9
On Mon, Dec 04, 2017 at 02:17:59PM +0100, Johan Corveleyn wrote:

> > I believe that whatever we do, somebody will still have to read the full log
> > and check each entry in CHANGES to avoid listing a lot of trivial stuff,
> > and to make sure the most impactful changes appear at the top.
>
> I don't think so. Not if the changelog annotations are used well (as
> with writing log message in general, this probably requires practice
> to get right -- it'd be a part of reviewing each other's changes to
> potentially give feedback on the (optional) changelog annotation).
>
> The intention is that the RM doesn't need to go through all the full
> log messages in detail, but that he can trust the output of
> write-changelog, combined with trust in the quality of the log
> messages involved.

OK, in that light it all makes sense and I agree this is worth a try.
And if we don't manage to pull it off well enough, nothing is lost
compared to the status quo.

Thanks for driving this forward!
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] release.py: write-changelog function (POC)

Julian Foad-5
Stefan Sperling wrote:
> Johan Corveleyn wrote:
>> The intention is that the RM doesn't need to go through all the full
>> log messages in detail, but that he can trust the output of
>> write-changelog, combined with trust in the quality of the log
>> messages involved.
>
> OK, in that light it all makes sense and I agree this is worth a try.
> And if we don't manage to pull it off well enough, nothing is lost
> compared to the status quo.

Johan, you told us at the hackathon (and mentioned in the older linked
thread) that your team has been using a system like this for some time
at your workplace. For me, that gives the suggestion a lot more credibility.

The 'contribulyzer' syntax we use in log messages is the same sort of
scheme. I am slightly surprised it was as successful and long-lasting as
it is. When it was introduced I think a lot of gentle reminders were
issued until most people got accustomed to writing it.

I like the idea of something that encourages us to put more user-facing
descriptions in log messages, for changes that have a user-facing
impact. Sometimes I check the change messages for system security
updates I am installing, and I notice the huge difference between those
that say "change sys.foo.bar back to 18 to fix #4321" and those that say
"disable auto-switching the wifi channel because it wasn't reliable
(#4321)".

In the case where a user-facing but very simple change is made, I think
we would only need to write the tagged user-facing line in the log
message -- we should never need to write two lines that say the same
thing. Example log message:

[[[
* subversion/svn/svn.c:
   [U:client] Add a 'reintegrate merge' section to the 'svn help merge'.
]]]

I think one of the keys to making this successful is to keep the burden
low. For example, if "[U:section]" is the general tag for a user-visible
change then just "[U]" should be allowed so the dev can go ahead with
the commit when a suitable section is not defined or not obvious.

- Julian
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] release.py: write-changelog function (POC)

Johan Corveleyn-3
On Mon, Dec 4, 2017 at 3:15 PM, Julian Foad <[hidden email]> wrote:

> Stefan Sperling wrote:
>>
>> Johan Corveleyn wrote:
>>>
>>> The intention is that the RM doesn't need to go through all the full
>>> log messages in detail, but that he can trust the output of
>>> write-changelog, combined with trust in the quality of the log
>>> messages involved.
>>
>>
>> OK, in that light it all makes sense and I agree this is worth a try.
>> And if we don't manage to pull it off well enough, nothing is lost
>> compared to the status quo.
>
>
> Johan, you told us at the hackathon (and mentioned in the older linked
> thread) that your team has been using a system like this for some time at
> your workplace. For me, that gives the suggestion a lot more credibility.

Yes, absolutely. We're using a similar method for over 15 years now,
with weekly releases of a big software system (healthcare software,
being used in around 20 Belgian hospitals). It's more strict /
structured than what I've proposed here, but that's normal I guess ...
corporate world vs. open source. And it's not perfect of course, but
it has served us well and saves us a lot of time. I'm trying to find a
pragmatic way to apply the same core idea and get the benefits of "all
committers writing their own changelog entries at the time they're
writing the code".

...
> In the case where a user-facing but very simple change is made, I think we
> would only need to write the tagged user-facing line in the log message --
> we should never need to write two lines that say the same thing. Example log
> message:
>
> [[[
> * subversion/svn/svn.c:
>   [U:client] Add a 'reintegrate merge' section to the 'svn help merge'.
> ]]]

Hm, yes, I agree with the "don't write the same thing twice". But
perhaps the "general description" above the list of affected files
would be a better place:

[[[
[U:client] Add a 'reintegrate merge' section to the 'svn help merge'.

* subversion/svn/svn.c:
  (blah): Add help text.
]]]

Though, indeed, we're not required to always have a "general
description", and can just start with the affected files, if the
change is simple. So ... not sure.

That's the thing I'm most uncertain of at the moment: how to fit this
scheme precisely into our current log message style, without
interfering too much, keeping them as readable as possible for human
readers.

Maybe a syntax with '@' would be better, like annotations in Java or
doxygen. Like:
[[[
@U:client Add a 'reintegrate merge' section to the 'svn help merge'.

* subversion/svn/svn.c:
  (blah): Add help text.
]]]

or
[[[
* subversion/svn/svn.c:
  @U:client Add a 'reintegrate merge' section to the 'svn help merge'.
]]]

or as a suffix:
[[[
* subversion/svn/svn.c:
  Add a 'reintegrate merge' section to the 'svn help merge'. [U:client]
]]]

Just thinking out loud here ...
Or we put it at the bottom as an "extra", like the contribulyzer
lines, but then we have the problem of potentially having to write the
same line twice:
[[[
* subversion/svn/svn.c:
  (blah): Add a 'reintegrate merge' section to the 'svn help merge'.

[U:client] Add a 'reintegrate merge' section to the 'svn help merge'
]]]

Hmmmm

> I think one of the keys to making this successful is to keep the burden low.
> For example, if "[U:section]" is the general tag for a user-visible change
> then just "[U]" should be allowed so the dev can go ahead with the commit
> when a suitable section is not defined or not obvious.

Yes, good idea. Anything with [U] would become a simple '*' bullet
below "User-visible changes", and the RM can decide where it goes.

--
Johan
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] release.py: write-changelog function (POC)

Julian Foad-5
Johan Corveleyn wrote:
> [...]
> Hm, yes, I agree with the "don't write the same thing twice". But
> perhaps the "general description" above the list of affected files
> would be a better place:
[...]

> Though, indeed, we're not required to always have a "general
> description", and can just start with the affected files, if the
> change is simple. So ... not sure.
>
> That's the thing I'm most uncertain of at the moment: how to fit this
> scheme precisely into our current log message style, without
> interfering too much, keeping them as readable as possible for human
> readers.
>
> Maybe a syntax with '@' would be better, like annotations in Java or
> doxygen. Like:
[...]
> or as a suffix:
[...]
> Just thinking out loud here ...
[...]> Hmmmm

Now you're over-thinking it. What you said first, what you use at work,
is fine. Run with it!

Thanks,
- Julian

Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] release.py: write-changelog function (POC)

Johan Corveleyn-3
On Mon, Dec 4, 2017 at 11:01 PM, Julian Foad <[hidden email]> wrote:

> Johan Corveleyn wrote:
>>
>> [...]
>> Hm, yes, I agree with the "don't write the same thing twice". But
>> perhaps the "general description" above the list of affected files
>> would be a better place:
>
> [...]
>>
>> Though, indeed, we're not required to always have a "general
>> description", and can just start with the affected files, if the
>> change is simple. So ... not sure.
>>
>> That's the thing I'm most uncertain of at the moment: how to fit this
>> scheme precisely into our current log message style, without
>> interfering too much, keeping them as readable as possible for human
>> readers.
>>
>> Maybe a syntax with '@' would be better, like annotations in Java or
>> doxygen. Like:
>
> [...]
>>
>> or as a suffix:
>
> [...]
>>
>> Just thinking out loud here ...
>
> [...]> Hmmmm
>
> Now you're over-thinking it. What you said first, what you use at work, is
> fine. Run with it!

Hehe, maybe :-).

OTOH: Subversion also has a 15+ year old log message style that has
served it well. Before giving this system a try (if we agree we
should), I think we should think carefully how to fit this into the
existing style, without breaking it. It's especially important to get
some buy-in from the people who create the most commits, and that's
certainly not me :-).

At work we have no such strong log message style as SVN. We limit
ourselves to a couple of lines, and every line is *required* to have
such a "tagged" prefix (which is enforced by a pre-commit hook, which
on error gives a reminder of the precise syntax). It also looks a
little different, with square brackets around the different parts:

   [U][General][SVN-1111] Fix crash in 'svn co'.

(the issue annotation is optional, the other two are mandatory).

    [D][API] Add new api svn__blah() as entry point to the blahing feature.

--
Johan
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] release.py: write-changelog function (POC)

Daniel Shahaf-2
Johan Corveleyn wrote on Tue, 05 Dec 2017 00:04 +0100:

> On Mon, Dec 4, 2017 at 11:01 PM, Julian Foad <[hidden email]> wrote:
> > Johan Corveleyn wrote:
> >> Just thinking out loud here ...
> >
> > [...]> Hmmmm
> >
> > Now you're over-thinking it. What you said first, what you use at work, is
> > fine. Run with it!
>
> Hehe, maybe :-).
>
> OTOH: Subversion also has a 15+ year old log message style that has
> served it well. Before giving this system a try (if we agree we
> should),

Let's go ahead and backfill log messages using the new format for trunk since
1.10.0-alpha3 (excepting changes already released in 1.9.x patch releases).
That would (a) get our feet wet with the new format, and make any shortcomings
or redundancies in it obvious (as well as whether their frequency warrants
discussing them and implementing special syntaxes for them), (b) be technical
credit¹ towards writing the 1.10.0-rc1 changelog.

So we're all on the same page, I propose we start by using the [U:foo]/[D:bar]
format from jcorvel's original message, plus the [U]/[D] fallback Julian
proposed.  We can add further extensions and solve edge cases (e.g., rfc822
line unfolding) as we run across them.

Personally, I wouldn't worry about duplication for now; we can always invent
additional syntaxes later that will reduce duplication in log messages.

Cheers,

Daniel

¹ "technical credit" is the opposite of technical debt.

> I think we should think carefully how to fit this into the
> existing style, without breaking it. It's especially important to get
> some buy-in from the people who create the most commits, and that's
> certainly not me :-).
>
> At work we have no such strong log message style as SVN. We limit
> ourselves to a couple of lines, and every line is *required* to have
> such a "tagged" prefix (which is enforced by a pre-commit hook, which
> on error gives a reminder of the precise syntax). It also looks a
> little different, with square brackets around the different parts:
>
>    [U][General][SVN-1111] Fix crash in 'svn co'.
>
> (the issue annotation is optional, the other two are mandatory).
>
>     [D][API] Add new api svn__blah() as entry point to the blahing feature.
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] release.py: write-changelog function (POC)

Branko Čibej
In reply to this post by Julian Foad-5
On 04.12.2017 23:01, Julian Foad wrote:

> Johan Corveleyn wrote:
>> [...]
>> Hm, yes, I agree with the "don't write the same thing twice". But
>> perhaps the "general description" above the list of affected files
>> would be a better place:
> [...]
>> Though, indeed, we're not required to always have a "general
>> description", and can just start with the affected files, if the
>> change is simple. So ... not sure.
>>
>> That's the thing I'm most uncertain of at the moment: how to fit this
>> scheme precisely into our current log message style, without
>> interfering too much, keeping them as readable as possible for human
>> readers.
>>
>> Maybe a syntax with '@' would be better, like annotations in Java or
>> doxygen. Like:
> [...]
>> or as a suffix:
> [...]
>> Just thinking out loud here ...
> [...]> Hmmmm
>
> Now you're over-thinking it. What you said first, what you use at
> work, is fine. Run with it!


IMO the important thing is to make the tagging in the log messages as
human-friendly as possible. That's what made the contribulyzer work so
well: there were no $[@foo;\\} syntactical quirks to remember and the
syntax is permissive.

Therefore I think stsp's suggestion of just taking the summary line in
the log message as the CHANGES entry is more likely to produce useful
results than any magic tagging. I don't think it's all that important to
classify the change into user-facing, etc., at commit time. All we need,
IMO, is a way to signal whether the change is or is not important.

So I'd suggest we use stsp's proposal with a small difference: by
default, every log message that has a summary line is important. If it's
not, start the first line with a # to tell the script to ignore it. That
way, it takes extra thought and effort to exclude a commit from the
CHANGES summary, which is IMO preferable to having to put in extra
thought and effort to have it included. It's easier to prune unnecessary
entries from the summary than it is to dig into commit history to find
the missing ones.

-- Brane
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] release.py: write-changelog function (POC)

Johan Corveleyn-3
On Tue, Dec 5, 2017 at 1:00 PM, Branko Čibej <[hidden email]> wrote:

> On 04.12.2017 23:01, Julian Foad wrote:
>> Johan Corveleyn wrote:
>>> [...]
>>> Hm, yes, I agree with the "don't write the same thing twice". But
>>> perhaps the "general description" above the list of affected files
>>> would be a better place:
>> [...]
>>> Though, indeed, we're not required to always have a "general
>>> description", and can just start with the affected files, if the
>>> change is simple. So ... not sure.
>>>
>>> That's the thing I'm most uncertain of at the moment: how to fit this
>>> scheme precisely into our current log message style, without
>>> interfering too much, keeping them as readable as possible for human
>>> readers.
>>>
>>> Maybe a syntax with '@' would be better, like annotations in Java or
>>> doxygen. Like:
>> [...]
>>> or as a suffix:
>> [...]
>>> Just thinking out loud here ...
>> [...]> Hmmmm
>>
>> Now you're over-thinking it. What you said first, what you use at
>> work, is fine. Run with it!
>
>
> IMO the important thing is to make the tagging in the log messages as
> human-friendly as possible. That's what made the contribulyzer work so
> well: there were no $[@foo;\\} syntactical quirks to remember and the
> syntax is permissive.
>
> Therefore I think stsp's suggestion of just taking the summary line in
> the log message as the CHANGES entry is more likely to produce useful
> results than any magic tagging. I don't think it's all that important to
> classify the change into user-facing, etc., at commit time. All we need,
> IMO, is a way to signal whether the change is or is not important.
>
> So I'd suggest we use stsp's proposal with a small difference: by
> default, every log message that has a summary line is important. If it's
> not, start the first line with a # to tell the script to ignore it. That
> way, it takes extra thought and effort to exclude a commit from the
> CHANGES summary, which is IMO preferable to having to put in extra
> thought and effort to have it included. It's easier to prune unnecessary
> entries from the summary than it is to dig into commit history to find
> the missing ones.

I'm not sure. I think we're very much accustomed to a certain style
for CHANGES, and another style and information-density for the summary
of commit messages. If these could converge, then sure, this could be
a good idea. If not, I think we haven't gained much (the RM would have
to rephrase them all anyway).

For example, CHANGES for 1.10 contains (just taking a couple of
entries under User-visible "Minor new features and improvements"):

    * New '--file' option for several svnadmin subcommands (r1738021)
    * ra_serf: Adjustments for serf versions with HTTP/2 support (r1716400)
    * windows: Correctly check result from LoadLibrary() call (r1755983)
    * FSFS: Open transaction's proto revision in write-only mode (r1759135)

And the corresponding lines out of "./release.py write-changelog
--pocfirstlines trunk tags/1.9.7" are (just grepping on the revision
numbers):

    * Introduce `--file' option for svnadmin dump, dump-revprops, load
and (r1738021)
    * Apply some minor tweaks to libsvn_ra_serf to handle some http/2
cases. (r1716400)
    * Correctly check result from LoadLibrary() call: LoadLibrary()
returns NULL (r1755983)
    * FSFS: Open transaction's proto revision in write-only mode.
Read/write mode (r1759135)

That last one is actually the same, nice :-).

I think these could converge, i.e. we *can* get used to write our
summaries in the CHANGES style (for instance: put the relevant
"component" in front, like "ra_serf:", "windows:", ...; and make the
first line of the summary shorter). But it'll take getting used to
:-).

Another thing is that there are a lot more commits than entries in CHANGES.
* CHANGES for 1.10 is currently 244 lines.
* output of "./release.py write-changelog --pocfirstlines trunk
tags/1.9.7" is 1633 lines.

That means we'd have to use the "#" (or other symbol) escape a lot.
That, or a lot of those commits are reverts (we'd need to handle these
well in any case), or are summarized with "(... et al)".

The flip side would be, if we could get the hang of this, we'd need no
special syntax, no duplication of "almost the same text", and our
"commit message summaries" are automatically fit for CHANGES (and it's
safer to forget to exclude a commit). Which also means an option for
"svn log" showing only the first line would be interesting, and would
approximately yield CHANGES plus the "#-escaped" summaries.

Hmmm, red pill or blue pill.

--
Johan