Problem: Past communications are scattered across systems and storage locations, with no consistent archives or permalinks, so cross-references are hard to create and do not stay valid. Our issue tracker and wiki provide only links that are tied to their current provider technology.
The types of information include:
* Short-form chat (IRC) archives: Matrix-Static, irclogger, ASF Wilderness IRC logs.
* Email/forum archives: haxx.se, ASF Pony Mail, the old Tigris lists. (The old Tigris mailing-lists system used to provide a permalink to the archived version at the bottom of each mail delivered.)
* Issue Tracker: the current Jira issues, the old issuezilla.
* Wiki pages: the current Confluence, the old MoinMoin.
An important step is to develop a URL "permalink" scheme to refer to our various resources. These would be technology-ignorant URLs, all under subversion.apache.org, like "/issue/1234".
A baby step is the '.message-ids.tsv' file in our web site directory, holding a mapping from haxx archive URLs used in our web pages to email message ids, with (in the commit log message) a script to generate it. There is, as yet, no automation to use the mapping in any way.
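As a first piece of automation, the mapping file could be loaded into a lookup table. The following is only a minimal sketch: it assumes each line holds an archive URL, a tab, and then the message id (the column order is an assumption, not something stated here), and the sample rows are invented placeholders rather than real archive entries.

```python
import csv
import io


def load_message_ids(tsv_text):
    """Parse TSV text into a dict of archive URL -> message id.

    Assumes two tab-separated columns per line: URL first, message id
    second. Blank lines and lines with too few columns are skipped.
    """
    mapping = {}
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t",
                        quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) < 2 or not row[0].strip():
            continue
        mapping[row[0].strip()] = row[1].strip()
    return mapping


# Hypothetical sample rows, not real archive entries.
SAMPLE = (
    "https://archive.example.invalid/dev/0001.shtml\t<msg-1@example.invalid>\n"
    "\n"
    "https://archive.example.invalid/dev/0002.shtml\t<msg-2@example.invalid>\n"
)

message_ids = load_message_ids(SAMPLE)
```

With such a table in hand, a link checker or redirect generator could translate each archive URL found in the web pages into its message id.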
* start documenting a URL-space map for our resources
* populate one entry, e.g. "/issue/<number> → issue <number>"
* implement some simple automated handling (e.g. redirects) for that
  - we already have this in our .htaccess, which covers that exact case along with some aliases:
    "RedirectMatch ^/issue[^A-Za-z0-9]?(\d+)$ https://issues.apache.org/jira/browse/SVN-$1"
* start using it: update existing direct links to point here instead; publicize it
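To see which paths the existing RedirectMatch rule accepts, the same pattern can be exercised outside the web server. This is an illustrative sketch only: it reuses the regular expression and target prefix from the .htaccess line quoted above, while the function name is hypothetical and Python's regex engine stands in for the one Apache uses.

```python
import re

# Pattern and target prefix copied from the RedirectMatch rule.
ISSUE_RE = re.compile(r"^/issue[^A-Za-z0-9]?(\d+)$")
JIRA_PREFIX = "https://issues.apache.org/jira/browse/SVN-"


def resolve_issue_permalink(path):
    """Return the Jira URL for an issue permalink path, or None."""
    match = ISSUE_RE.match(path)
    if match is None:
        return None
    return JIRA_PREFIX + match.group(1)


# One optional non-alphanumeric separator is allowed, so these are aliases.
for candidate in ("/issue/1234", "/issue-1234", "/issue1234"):
    print(candidate, "->", resolve_issue_permalink(candidate))

# Paths outside the scheme are not redirected.
print("/issues/1234", "->", resolve_issue_permalink("/issues/1234"))
```

The optional separator is what provides the aliases: "/issue/1234", "/issue-1234" and "/issue1234" all resolve to the same issue.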
Deeper integration: A permalink URL should not merely redirect the user to its technology-specific target URL, but present the target in such a way that other inbound and outbound URLs also use the permalink form. With a big third-party system like Jira or Confluence the feasibility of that is going to depend entirely on whether the system has built-in support for that usage.
> Julian Foad wrote on Fri, 07 Dec 2018 11:14 +0000:
> > https://cwiki.apache.org/confluence/x/U4rQBQ
>
> Will this link be valid in a year or three when the cwiki installation
> has been migrated to another host? Or will it be invalidated upon the
> next server reboot, or in 90 days, or something like that?
> Is there a permalink format that includes the title of the linked page?
> A link that includes only a database identifier is poor from a usability
> point of view.
Just as a data point for the discussion: links that include page names,
where the page has since been renamed, are handled quite gracefully by
Confluence. There is no automatic redirect, but a "Page not found" page
says "The page you were looking for may have been renamed to the following:
<link to new page>".
> An important step is to develop a URL "permalink" scheme to refer to our
> various resources. These would be technology-ignorant URLs, all under
> subversion.apache.org, like "/issue/1234".
> A baby step is the '.message-ids.tsv' file in our web site directory,
> holding a mapping from haxx archive URLs used in our web pages to email
> message ids, with (in the commit log message) a script to generate it.
Possible next steps:
* run this mapping extractor not just on the web site but also on the:
  - svn source code
  - mail archives
  - IRC archives
  - issue tracker
* request a copy of the archive so we can recover if it goes away
* implement some sort of interface to look up messages in the haxx archive via their message-id
* implement a URL scheme in our own URL space, map it to one archive (not necessarily this one), and then we're somewhat future-proofed against changing to different archives
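
The last step could be sketched roughly as follows. Everything specific here is an assumption made for illustration: the "/mail/" path prefix, the archive URL template, and the function names are hypothetical and do not describe any existing archive's lookup interface. The point is only that the permalink is derived from the message id, while the archive behind it is a single swappable setting.

```python
from urllib.parse import quote, unquote

# Hypothetical values: neither the "/mail/" prefix nor this archive URL
# template comes from an existing service.
PERMALINK_PREFIX = "/mail/"
ARCHIVE_TEMPLATE = "https://archive.example.invalid/lookup?mid={mid}"


def permalink_for(message_id):
    """Return the site-relative permalink path for a message id."""
    bare = message_id.strip().strip("<>")
    return PERMALINK_PREFIX + quote(bare, safe="@")


def archive_url_for(permalink_path, template=ARCHIVE_TEMPLATE):
    """Resolve a permalink path to a URL in the currently chosen archive.

    Returns None if the path is not in the mail permalink space.
    """
    if not permalink_path.startswith(PERMALINK_PREFIX):
        return None
    bare = unquote(permalink_path[len(PERMALINK_PREFIX):])
    if not bare:
        return None
    return template.format(mid=quote(bare, safe=""))
```

Switching to a different archive would then mean changing only the template; permalinks already published in web pages, commit messages, and mails would stay the same.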