Searching entire repository for a file (fastsvncrawler)

Kenneth Porter
I need to locate a file in a client's large repository. I found
fastsvncrawler which uses svn_ra_do_status2 to rapidly dump the entire
repository as a directory listing. Has anyone built a Windows binary? Or
perhaps it's made it into the distribution? (I access the repo over a Cisco
VPN from Windows, or I'd just build it on Linux. I fear I'll have to learn
how to build Subversion on Windows, which looks daunting.)

<https://github.com/mithro/fastsvncrawler>

How it works:

<http://vcs.atspace.co.uk/2012/07/15/subversion-remote-api-listing-repository-with-status-request/>

Re: Searching entire repository for a file (fastsvncrawler)

Stefan Sperling
On Wed, Jul 22, 2020 at 03:36:36PM -0700, Kenneth Porter wrote:

> I need to locate a file in a client's large repository. I found
> fastsvncrawler which uses svn_ra_do_status2 to rapidly dump the entire
> repository as a directory listing. Has anyone built a Windows binary? Or
> perhaps it's made it into the distribution? (I access the repo over a Cisco
> VPN from Windows, or I'd just build it on Linux. I fear I'll have to learn
> how to build Subversion on Windows, which looks daunting.)
>
> <https://github.com/mithro/fastsvncrawler>
>
> How it works:
>
> <http://vcs.atspace.co.uk/2012/07/15/subversion-remote-api-listing-repository-with-status-request/>
>

Are you aware of the built-in svn list --search feature, which has
been available since SVN 1.10.0?

For example:

$ svn list --depth=infinity --search svn.c ^/subversion/trunk
subversion/svn/svn.c
$

Also as of SVN 1.10 the server supports a special-purpose 'list' request
to speed this up.

In any case, the fastest way to search will likely be with a file:// URL,
assuming you can get direct access to the repository for this purpose.

Regards,
Stefan
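
For scripted use, the search described above could be wrapped roughly as follows. This is a minimal sketch, assuming an `svn` client of version 1.10 or newer is on the PATH; the function names are made up for illustration, and any repository URL passed in is the caller's own.

```python
import subprocess


def build_search_command(pattern, target):
    """Return the argv for a recursive `svn list --search` call."""
    # The pattern is passed as its own argv element, so no shell is
    # involved and no extra quoting is needed here.
    return ["svn", "list", "--depth=infinity", "--search", pattern, target]


def search_repository(pattern, target):
    """Run the search and return the matching paths as a list of strings."""
    result = subprocess.run(
        build_search_command(pattern, target),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if svn exits non-zero
    )
    return [line for line in result.stdout.splitlines() if line]
```

Calling `search_repository("svn.c", url)` with a repository URL would return the matching paths. The `^/` shorthand from the example above is only resolved inside a working copy, so a full URL is the safer argument from a script.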

Re: Searching entire repository for a file (fastsvncrawler)

Stefan Sperling
On Thu, Jul 23, 2020 at 11:54:52AM +0200, Stefan Sperling wrote:

> On Wed, Jul 22, 2020 at 03:36:36PM -0700, Kenneth Porter wrote:
> > I need to locate a file in a client's large repository. I found
> > fastsvncrawler which uses svn_ra_do_status2 to rapidly dump the entire
> > repository as a directory listing. Has anyone built a Windows binary? Or
> > perhaps it's made it into the distribution? (I access the repo over a Cisco
> > VPN from Windows, or I'd just build it on Linux. I fear I'll have to learn
> > how to build Subversion on Windows, which looks daunting.)
> >
> > <https://github.com/mithro/fastsvncrawler>
> >
> > How it works:
> >
> > <http://vcs.atspace.co.uk/2012/07/15/subversion-remote-api-listing-repository-with-status-request/>
> >
>
> Are you aware of the built-in svn list --search feature, which has
> been available since SVN 1.10.0?
>
> For example:
>
> $ svn list --depth=infinity --search svn.c ^/subversion/trunk
> subversion/svn/svn.c
> $

I forgot to mention that this feature supports pattern matching,
and that the pattern argument may need quoting. From 'svn help list':

  --search ARG             : use ARG as search pattern (glob syntax, case-
                             and accent-insensitive, may require quotation marks
                             to prevent shell expansion)
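
To illustrate the two points in that help text, here is a rough sketch in Python. It only approximates the behaviour: `fnmatch` stands in for Subversion's own glob matcher, lower-casing mimics case-insensitivity but not accent-insensitivity, and `shlex.quote` produces POSIX shell quoting, which differs from Windows cmd.exe rules. The function names are hypothetical.

```python
import fnmatch
import shlex


def matches(pattern, name):
    """Case-insensitive glob match, approximating the --search semantics."""
    # Lower-casing both sides gives case-insensitivity only; accent
    # folding is not reproduced in this sketch.
    return fnmatch.fnmatchcase(name.lower(), pattern.lower())


def quoted_command(pattern, target):
    """Build a POSIX shell command line with the arguments safely quoted."""
    return "svn list --depth=infinity --search {} {}".format(
        shlex.quote(pattern), shlex.quote(target)
    )
```

Without the quoting, a shell would expand a pattern such as `*.c` against the local directory before `svn` ever sees it.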

> Also as of SVN 1.10 the server supports a special-purpose 'list' request
> to speed this up.
>
> In any case, the fastest way to search will likely be with a file:// URL,
> assuming you can get direct access to the repository for this purpose.
>
> Regards,
> Stefan

Re: Searching entire repository for a file (fastsvncrawler)

Kenneth Porter
In reply to this post by Stefan Sperling
--On Thursday, July 23, 2020 12:54 PM +0200 Stefan Sperling <[hidden email]>
wrote:

> Are you aware of the built-in svn list --search feature, which has
> been available since SVN 1.10.0?
>
> For example:
>
> $ svn list --depth=infinity --search svn.c ^/subversion/trunk
> subversion/svn/svn.c
> $
>
> Also as of SVN 1.10 the server supports a special-purpose 'list' request
> to speed this up.
>
> In any case, the fastest way to search will likely be with a file:// URL,
> assuming you can get direct access to the repository for this purpose.

I was not. Very nice, particularly the pattern matching. Can I determine
the server version remotely to see if this is supported? (Alas, I don't
have direct access.) Is the globbing done on the server with the 1.10
support? That would make it as fast as direct access.

Re: Searching entire repository for a file (fastsvncrawler)

Stefan Sperling
On Thu, Jul 23, 2020 at 08:05:12AM -0700, Kenneth Porter wrote:

> --On Thursday, July 23, 2020 12:54 PM +0200 Stefan Sperling <[hidden email]>
> wrote:
>
> > Are you aware of the built-in svn list --search feature, which has
> > been available since SVN 1.10.0?
> >
> > For example:
> >
> > $ svn list --depth=infinity --search svn.c ^/subversion/trunk
> > subversion/svn/svn.c
> > $
> >
> > Also as of SVN 1.10 the server supports a special-purpose 'list' request
> > to speed this up.
> >
> > In any case, the fastest way to search will likely be with a file:// URL,
> > assuming you can get direct access to the repository for this purpose.
>
> I was not. Very nice, particularly the pattern matching. Can I determine the
> server version remotely to see if this is supported?

Some servers will advertise the SVN version on pages which can be visited
with a web browser. But this depends on the server's configuration. You may
have to ask the administrator to be sure about the server's exact version.

If the server supports the feature it will send an "svn/list" DAV header.
The full capability string to look for in a trace of an SVN HTTP session
would be: "http://subversion.tigris.org/xmlns/dav/svn/list"
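
A check for that capability string might be sketched like this. It assumes the server answers a plain OPTIONS request on the repository URL without authentication and puts the capability in its DAV response headers; a real server may require credentials or behave differently, and the function names are invented for illustration.

```python
import urllib.request

LIST_CAPABILITY = "http://subversion.tigris.org/xmlns/dav/svn/list"


def advertises_list(dav_header_values):
    """Return True if any DAV header value contains the list capability."""
    for value in dav_header_values:
        # A DAV header carries a comma-separated list of tokens.
        tokens = [token.strip() for token in value.split(",")]
        if LIST_CAPABILITY in tokens:
            return True
    return False


def server_advertises_list(repo_url):
    """Send an OPTIONS request and inspect the DAV response headers."""
    request = urllib.request.Request(repo_url, method="OPTIONS")
    with urllib.request.urlopen(request, timeout=30) as response:
        # get_all() returns None when the header is absent.
        return advertises_list(response.headers.get_all("DAV") or [])
```

Keeping the header parsing separate from the network call makes it easy to run the same check against headers copied out of a captured HTTP trace.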

I suppose the easiest way forward might be to simply try it out and see
how long it takes.

> (Alas, I don't have
> direct access.) Is the globbing done on the server with the 1.10 support?

Yes, globbing would be done on the server side.

Re: Searching entire repository for a file (fastsvncrawler)

Kenneth Porter
--On Thursday, July 23, 2020 6:22 PM +0200 Stefan Sperling <[hidden email]>
wrote:

> Some servers will advertise the SVN version on pages which can be visited
> with a web browser. But this depends on the server's configuration. You
> may have to ask the administrator to be sure about the server's exact
> version.

Drat. My own server is on CentOS 7 and is still back at 1.7. (The "joy" of
using a conservative operating system.) So I need to look for someone
packaging the latest Subversion release for RHEL for when I start using
that with my own work. The server I need to search is a Windows server so
it may be quite a bit newer.


Re: Searching entire repository for a file (fastsvncrawler)

Daniel Shahaf-2
In reply to this post by Stefan Sperling
Stefan Sperling wrote on Thu, 23 Jul 2020 17:22 +0200:
> Some servers will advertise the SVN version on pages which can be visited
> with a web browser. But this depends on the server's configuration. You may
> have to ask the administrator to be sure about the server's exact version.
>
> If the server supports the feature it will send an "svn/list" DAV header.
> The full capability string to look for in a trace of an SVN HTTP session
> would be: "http://subversion.tigris.org/xmlns/dav/svn/list"
>

I'm guessing the equivalent svnserve capability is called just "list".
svnserve capabilities are printed as soon as a TCP channel is
established, in the server's greeting.
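
Reading that greeting could look roughly like the sketch below. It assumes the greeting has the shape `( success ( minver maxver ( mechs ) ( capabilities ) ) )`, that the capability list is the last parenthesised group, and that the whole greeting arrives in the first read; none of this is confirmed in this thread, and the capability name "list" is the guess made above. Port 3690 is the svnserve default.

```python
import re
import socket


def parse_capabilities(greeting):
    """Return the capability words from an svnserve greeting string."""
    # Collect every innermost parenthesised group; the capability list is
    # assumed to be the last such group in the greeting.
    groups = re.findall(r"\(([^()]*)\)", greeting)
    return groups[-1].split() if groups else []


def read_greeting(host, port=3690, timeout=10.0):
    """Connect to svnserve and return the first chunk it sends."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        return conn.recv(4096).decode("ascii", errors="replace")
```

With these, `"list" in parse_capabilities(read_greeting(host))` would give a quick yes or no for a server reachable over svn://.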

> I suppose the easiest way forward might be to simply try it out and see
> how long it takes.

Re: Searching entire repository for a file (fastsvncrawler)

Yasuhito FUTATSUKI
In reply to this post by Kenneth Porter
Hi,

On 2020/07/24 0:46, Kenneth Porter wrote:
> --On Thursday, July 23, 2020 6:22 PM +0200 Stefan Sperling <[hidden email]> wrote:
>
>> Some servers will advertise the SVN version on pages which can be visited
>> with a web browser. But this depends on the server's configuration. You
>> may have to ask the administrator to be sure about the server's exact
>> version.
>
> Drat. My own server is on CentOS 7 and is still back at 1.7. (The "joy" of using a conservative operating system.) So I need to look for someone packaging the latest Subversion release for RHEL for when I start using that with my own work. The server I need to search is a Windows server so it may be quite a bit newer.
>
 
I made (libserf 1.3.9 and) subversion 1.14.0 RPMs for CentOS 7
and put them on <https://mm.poem.co.jp/misc.html#subversion>.
Although I didn't build/test some features, the source RPM
shows how I built it with base/update and EPEL packages.

JFYI.

Cheers,
--
Yasuhito FUTATSUKI <[hidden email]>