Questions on migrating SVN (and Trac) to a Google Compute Engine instance

Questions on migrating SVN (and Trac) to a Google Compute Engine instance

James H. H. Lampert
Greetings.

My employer has put me on a project of moving our SVN and Trac servers
from the old Windows Server 2003 box on which they're currently running
over to a Google Compute Engine instance.

To that end, I've set up the instance using Bitnami's canned Trac image,
which includes SVN 1.9.5 (r1770682) and Trac 1.0.15 (our old SVN server
is 1.5.0, r31699, and our old Trac server is 1.0).

I've got a test repository set up, and I've arranged access via both
https: and svn+ssh: protocols, which I then spent a few hours testing
from Eclipse.

But I'm not the one who set up the original SVN and Trac environments in
the first place, and so what little I know about administration on these
products is what I've picked up over the past few weeks.

Now, Trac's wiki page on the process of a dual migration,
    https://trac.edgewall.org/wiki/TracMigrate
seems to be pretty straightforward on the subject of migrating Trac, but
the section on migrating SVN is not so.

They recommend setting up a "pre-revprop-change" script with nothing in
it but the initial "shebang", for each target repository, and then using
"svnsync" to migrate the repositories. It also assumes the existence of
an "svnsync" user-ID on the target system, which (at least assuming it's
an operating system user-ID) we don't currently have.

Everything else I've read, especially The SVN Book, says to use
"svnsync" only for mirroring, and instead migrate using some combination
of "svnadmin dump," "svnadmin load," "svnrdump," and "svnrload."

I'm not seeing a lot about copying configuration files or hook scripts.
Is that just a matter of sending them over?

And I don't quite understand how this whole business impacts the authors
of commits. Does SVN care whether the author of a commit is a user known
to SVN or to the operating system? I've already copied an "authz" file
from one of the existing repositories into the test repository, and
given the current users Apache user-IDs and passwords, but that's all,
so far.

--
James H. H. Lampert
Touchtone Corporation

Re: Questions on migrating SVN (and Trac) to a Google Compute Engine instance

Ryan Schmidt-8

On Jul 27, 2017, at 14:23, James H. H. Lampert wrote:
>
> They recommend setting up a "pre-revprop-change" script with nothing in it but the initial "shebang", for each target repository, and then using "svnsync" to migrate the repositories.

Sounds plausible. An empty pre-revprop-change hook script would allow any revprop change, which you may not want. It's probably possible to write a more-specific script that would allow only the changes svnsync needs and disallow others.
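
For what it's worth, a more specific hook could look something like the sketch below. It assumes the sync runs under a dedicated account, here given the hypothetical name svnsync, and the function name revprop_change_allowed is only illustrative. Subversion calls the hook with five arguments: repository path, revision, user, property name, and action.

```shell
#!/bin/sh
# pre-revprop-change hook sketch. Subversion invokes it as:
#   pre-revprop-change REPOS-PATH REVISION USER PROPNAME ACTION
# SYNC_USER is a hypothetical account name; substitute the name that
# svnsync actually runs or authenticates as.
SYNC_USER="svnsync"

# Succeed only when the given user is the sync account.
revprop_change_allowed() {
    [ "$1" = "$SYNC_USER" ]
}

# With no arguments (for example when sourced for testing) only the
# function above is defined. Subversion always passes arguments.
if [ "$#" -gt 0 ]; then
    if revprop_change_allowed "$3"; then
        exit 0
    fi
    echo "Only $SYNC_USER may change revision properties" >&2
    exit 1
fi
```

The file would need to be named pre-revprop-change (no extension), live in the target repository's hooks directory, and be executable.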

> It also assumes the existence of an "svnsync" user-ID on the target system, which (at least assuming it's an operating system user-ID) we don't currently have.

svnsync doesn't care what system user account you use. You would probably want to pick the username that the server process will use. If you're serving with Apache, that'll be a username like nobody or httpd or apache.

> Everything else I've read, especially The SVN Book, says to use "svnsync" only for mirroring, and instead migrate using some combination of "svnadmin dump," "svnadmin load," "svnrdump," and "svnrload."

svnsync is probably best, since it allows you to easily incrementally mirror a live read/write repository to another server. It can be slow but once it's done it makes it very quick to switch from the old server to the new one with minimal downtime. Some of the other methods require you to make the source repository read-only or take it offline for the duration of the migration, which could take hours or days depending on how large your repository is.
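
Roughly, the mirroring could be scripted like the sketch below. The function name mirror_repo and the example URL and path are placeholders, not anything from your setup, and the permissive hook is the "allow everything" variant discussed above.

```shell
# Sketch: create an empty target repository, allow revprop changes,
# then initialize and run the first sync.
mirror_repo() {
    src_url=$1      # e.g. https://old-server.example/svn/project (placeholder)
    dest_path=$2    # absolute path for the new repository on the target

    svnadmin create "$dest_path" || return 1

    # svnsync keeps its bookkeeping in revision properties, so the
    # target needs a pre-revprop-change hook that exits 0.
    printf '#!/bin/sh\nexit 0\n' > "$dest_path/hooks/pre-revprop-change" || return 1
    chmod +x "$dest_path/hooks/pre-revprop-change" || return 1

    svnsync init "file://$dest_path" "$src_url" || return 1
    svnsync sync "file://$dest_path"
}
```

Running "svnsync sync" against the same target again later should pick up only the revisions committed since the previous run, which is what keeps the final switch short.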

> I'm not seeing a lot about copying configuration files or hook scripts. Is that just a matter of sending them over?

Pretty much. You may need to edit the files if the setup of the new server differs from the old one. New versions of Subversion may also offer more features than old versions, which may affect your scripts or configuration.

> And I don't quite understand how this whole business impacts the authors of commits. Does SVN care whether the author of a commit is a user known to SVN or to the operating system? I've already copied an "authz" file from one of the existing repositories into the test repository, and given the current users Apache user-IDs and passwords, but that's all, so far.

If you're using Apache to serve the Subversion repository, on both the old and new systems, then you're right, Subversion users don't care about server system user accounts; they only care about user accounts as defined in whatever authentication you've set up for the repository in Apache.

Re: Questions on migrating SVN (and Trac) to a Google Compute Engine instance

Arwin Arni
In reply to this post by James H. H. Lampert



> They recommend setting up a "pre-revprop-change" script with nothing in it but the initial "shebang", for each target repository, and then using "svnsync" to migrate the repositories. It also assumes the existence of an "svnsync" user-ID on the target system, which (at least assuming it's an operating system user-ID) we don't currently have.

This doesn't refer to a Unix user ID, just the username and password used to authenticate to the source repository. It can be any username (not necessarily 'svnsync'). Just make sure that it has read access to the entire repository that you are svnsync'ing.

> Everything else I've read, especially The SVN Book, says to use "svnsync" only for mirroring, and instead migrate using some combination of "svnadmin dump," "svnadmin load," "svnrdump," and "svnrload."

While svnadmin dump/load is the recommended path for migrating repos to a newer version, svnsync does pretty much the same thing. You just need to use the newer versions of the svnadmin and svnsync binaries (which are installed in your target system).

> I'm not seeing a lot about copying configuration files or hook scripts. Is that just a matter of sending them over?

Neither svnsync nor svnadmin dump/load will take care of things like hook scripts/configuration files. You will have to copy these over manually and place them in their appropriate paths.

> And I don't quite understand how this whole business impacts the authors of commits. Does SVN care whether the author of a commit is a user known to SVN or to the operating system?

While svnsync'ing you don't need to worry about authz at all, because I see that the document you posted suggests init over file://. This will not involve authz. You will however need to set up and configure authentication and authorization just like it is in the old system when you want to start using the new system.

Regards,
Arwin

Re: Questions on migrating SVN (and Trac) to a Google Compute Engine instance

Nico Kadel-Garcia-2
In reply to this post by James H. H. Lampert
On Thu, Jul 27, 2017 at 3:23 PM, James H. H. Lampert wrote:

> Greetings.
>
> My employer has put me on a project of moving our SVN and Trac servers from
> the old Windows Server 2003 box on which they're currently running over to a
> Google Compute Engine instance.
>
> To that end, I've set up the instance using Bitnami's canned Trac image,
> which includes SVN 1.9.5 (r1770682) and Trac 1.0.15 (our old SVN server is
> 1.5.0, r31699, and our old Trac server is 1.0).
>
> I've got a test repository set up, and I've arranged access via both https:
> and svn+ssh: protocols, which I then spent a few hours testing from Eclipse.
>
> But I'm not the one who set up the original SVN and Trac environments in the
> first place, and so what little I know about administration on these
> products is what I've picked up over the past few weeks.
>
> Now, Trac's wiki page on the process of a dual migration,
>    https://trac.edgewall.org/wiki/TracMigrate
> seems to be pretty straightforward on the subject of migrating Trac, but the
> section on migrating SVN is not so.

That page is good stuff.

> They recommend setting up a "pre-revprop-change" script with nothing in it
> but the initial "shebang", for each target repository, and then using
> "svnsync" to migrate the repositories. It also assumes the existence of an
> "svnsync" user-ID on the target system, which (at least assuming it's an
> operating system user-ID) we don't currently have.

That is just the account name of the user who has access to the
upstream repository. If you don't have access to that upstream
repository via Subversion https://, or some CIFS mounted filesystem
access to the filesystem, or a local filesystem copy or *something*,
it's going to be very difficult to copy the repository. And https://
access or svn+ssh:// or a CIFS mount gets you access to the live
upstream repository for updates.

> Everything else I've read, especially The SVN Book, says to use "svnsync"
> only for mirroring, and instead migrate using some combination of "svnadmin
> dump," "svnadmin load," "svnrdump," and "svnrload."

svnsync has gotten popular because it lets you keep the new repo
up-to-date until you're ready to switch. svnadmin dump, etc. are more
useful when you want to make an offline backup, or when you want to
filter out content. Note that this is about the *only* chance you're
going to get to clear out old content when switching to a new repository.
If you have a cluttered "branch" layout, or bulky iso images someone
accidentally committed, or old passwords embedded in files you want to
clear, here is your chance with dump, filter, and load operations.
I'm not sure how much that kind of filtering would do to Trac, just
saying.
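
A dump, filter, and load pass could be sketched like this. The function name filter_and_load and the example paths in the usage line are made up for illustration.

```shell
# Sketch: rebuild a repository while leaving out selected paths.
# Usage: filter_and_load OLD_REPO NEW_REPO PATH_PREFIX...
filter_and_load() {
    old_repo=$1
    new_repo=$2
    shift 2     # the remaining arguments are path prefixes to exclude

    svnadmin create "$new_repo" || return 1

    # Stream the full history through svndumpfilter into the new repo.
    svnadmin dump --quiet "$old_repo" \
        | svndumpfilter exclude "$@" \
        | svnadmin load --quiet "$new_repo"
}

# Example (hypothetical paths):
#   filter_and_load /srv/svn/old /srv/svn/new trunk/isos branches/abandoned
```

Two caveats as far as I know: svndumpfilter works on path prefixes and can fail when a kept path was copied from an excluded one, and unless you ask it to renumber, it keeps the emptied revisions so that revision numbers stay the same, which should matter for anything that refers to revisions by number.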

> I'm not seeing a lot about copying configuration files or hook scripts. Is
> that just a matter of sending them over?

Going from Windows 2003 to a Google Compute Engine? You *wish*. In
theory, yes, but in practice, if they've been locally customized, they
may have hardcoded dependencies on particular scripting languages. One
step that may help is if you have access to the old box and can run
"svnadmin hotcopy", to get a copy to play with containing all the old
scripts so you can set it aside and play with it separately.

> And I don't quite understand how this whole business impacts the authors of
> commits. Does SVN care whether the author of a commit is a user known to SVN
> or to the operating system? I've already copied an "authz" file from one of
> the existing repositories into the test repository, and given the current
> users Apache user-IDs and passwords, but that's all, so far.

It Depends(tm). For HTTPS access, the author of a commit is known to
the httpd daemon as an authenticated user. The httpd daemon needs
write access to the file system of the server. For svn:// access,
ditto, the author is known to the svnserve protocol, not the local
filesystem, and the svnserve daemon user needs write access. For
svn+ssh://, the author is typically *set* in the configuration for the
SSH key, and the user designated for the SSH access or SSH key access
is local and needs write access. For file:/// access, the user would
need to exist in some way and have write access to the filesystem.

What you have seems quite correct. The httpd daemon needs write
access, and httpd cares about their credentials for https:// and for
Trac software. (I'm picky about Apache being apache-1.x, and release
2.x being renamed httpd, which is why I don't call it Apache.)

Re: Questions on migrating SVN (and Trac) to a Google Compute Engine instance

James H. H. Lampert
In reply to this post by Ryan Schmidt-8
On 7/27/17, 11:15 PM, Ryan Schmidt wrote:
> Sounds plausible. An empty pre-revprop-change hook script would allow
> any revprop change, which you may not want. It's probably possible to
> write a more-specific script that would allow only the changes
> svnsync needs and disallow others.
. . .
> svnsync is probably best, since it allows you to easily incrementally
> mirror a live read/write repository to another server. It can be slow
> but once it's done it makes it very quick to switch from the old
> server to the new one with minimal downtime. Some of the other
> methods require you to make the source repository read-only or take
> it offline for the duration of the migration, which could take hours
> or days depending on how large your repository is.
. . .
and Arwin and Nico said similar things.

Thanks, Ryan, Arwin, and Nico.

It took a bit of futzing around, but as I type this, I'm replicating a
repository (the smallest and currently least-active one).

It took me a while to realize that the hooks with .tmpl extensions were
templates, not live hooks, and I was right on the verge of asking for
help when I looked at the instructions again, and realized that instead
of setting up the required null hook, I'd overwritten a template (DUH!)

--
JHHL

Re: Questions on migrating SVN (and Trac) to a Google Compute Engine instance

Johan Corveleyn-3
On Sat, Jul 29, 2017 at 1:16 AM, James H. H. Lampert wrote:

> On 7/27/17, 11:15 PM, Ryan Schmidt wrote:
>>
>> Sounds plausible. An empty pre-revprop-change hook script would allow
>> any revprop change, which you may not want. It's probably possible to
>> write a more-specific script that would allow only the changes
>> svnsync needs and disallow others.
>
> . . .
>>
>> svnsync is probably best, since it allows you to easily incrementally
>> mirror a live read/write repository to another server. It can be slow
>> but once it's done it makes it very quick to switch from the old
>> server to the new one with minimal downtime. Some of the other
>> methods require you to make the source repository read-only or take
>> it offline for the duration of the migration, which could take hours
>> or days depending on how large your repository is.
>
> . . .
> and Arwin and Nico said similar things.
>
> Thanks, Ryan, Arwin, and Nico.
>
> It took a bit of futzing around, but as I type this, I'm replicating a
> repository (the smallest and currently least-active one).
>
> It took me a while to realize that the hooks with .tmpl extensions were
> templates, not live hooks, and I was right on the verge of asking for help
> when I looked at the instructions again, and realized that instead of
> setting up the required null hook, I'd overwritten a template (DUH!)

Indeed, you only need to copy the files under $REPOS/hooks that do
not end in .tmpl :-). And for the custom hook scripts you have:
maybe review them and compare the comments section at the beginning
with the new corresponding template to see if there are new features
(perhaps copy over the new comment block to your custom hook script).

Two other things which are also not transferred by svnsync (nor by dump/load):
- config files (under $REPOS/conf), for instance authz
- locks (server-side locks, of the svn:needs-lock type): these should
be copied from $OLDREPOS/db/locks to $NEWREPOS/db/locks
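
As a rough sketch, copying the hooks and config files could look like the following. It assumes both repository directories are reachable from the same machine (for example after transferring a copy of the old one), and the function name copy_repo_extras is just illustrative. Locks are left out here on purpose.

```shell
# Sketch: copy what svnsync and dump/load leave behind.
copy_repo_extras() {
    old_repo=$1
    new_repo=$2

    # Live hooks only: *.tmpl files are the templates that
    # "svnadmin create" generates, so skip those.
    for f in "$old_repo"/hooks/*; do
        [ -f "$f" ] || continue
        case $f in
            *.tmpl) continue ;;
        esac
        cp -p "$f" "$new_repo/hooks/" || return 1
    done

    # Per-repository configuration, for instance authz.
    for f in "$old_repo"/conf/*; do
        [ -f "$f" ] || continue
        cp -p "$f" "$new_repo/conf/" || return 1
    done
}
```

Note that this overwrites same-named files in the target, including a pre-revprop-change hook set up for svnsync, and that hooks written for a Windows server will likely need rewriting before they run on the new system.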

FWIW, I've documented some of these things in an extended answer in
our FAQ, about dump/load (some of these things also apply to svnsync):
http://subversion.apache.org/faq.html#dumpload

For the record, you can also perform an incremental dump/load with
minimal downtime (I explained this procedure in the FAQ). But svnsync
is still a lot easier, because it normalizes properties automatically.
'svnadmin load' does not do that at the moment: it will error out on
non-LF line endings in svn:log messages and things like that, possibly
giving a lot of headaches (see FAQ).
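
The catch-up step of such an incremental dump/load might be sketched as below. This is not the FAQ text, just an illustration: it assumes both repositories are reachable as local paths, and the function name catch_up is made up.

```shell
# Sketch: load only the revisions the new repository is still missing.
catch_up() {
    old_repo=$1
    new_repo=$2

    loaded=$(svnlook youngest "$new_repo") || return 1
    latest=$(svnlook youngest "$old_repo") || return 1

    # Nothing to do when the new repository is already current.
    [ "$loaded" -lt "$latest" ] || return 0

    svnadmin dump --quiet --incremental \
        -r "$((loaded + 1)):$latest" "$old_repo" \
        | svnadmin load --quiet "$new_repo"
}
```

In a real migration between two machines the dump would more likely be written to a file on the old server and transferred, with the old repository made read-only before the last run.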

--
Johan

Re: Questions on migrating SVN (and Trac) to a Google Compute Engine instance

Daniel Shahaf-2
Johan Corveleyn wrote on Sat, 29 Jul 2017 11:39 +0200:
> Two other things which are also not transferred by svnsync (nor by dump/load):
> - locks (server-side locks, of the svn:needs-lock type): these should
> be copied from $OLDREPOS/db/locks to $NEWREPOS/db/locks

It would be best to do this under 'svnadmin freeze'.  Currently, the
failure mode of a non-atomic copy is harmless, but we don't promise
it'll always be this way.