Monday, May 29, 2017

Re: OpenBSD Package Manager

On 2017/05/29 14:16, H. Ishikawa wrote:
> Hello, I'd like to ask about some specific areas of the pkg_add tools.
>
> 1. Why Perl instead of C?
> Perl is comparatively slow, and I think this limits who can contribute
> to the source code. How many developers in OpenBSD are actively doing
> any review of the pkg_add tools code? Would there be any interest in
> porting pkgsrc or pkgng from another BSD, or rewriting it in C?

No, not really.

> 2. Why no package database file?
> Other package managers like apt-get can fetch a single file that has
> all the package versions/info in it. When I update my packages on
> OpenBSD Current, it is a very slow process. Each package must be
> individually checked for updates, rather than comparing a list of
> what I have to a single list of the newest versions. This makes
> doing updates very painful and I avoid doing it sometimes.

This is intentional, to reduce the problems caused by fetching from
a mirror which is partway through an update. The OpenBSD build
procedure is to do a full package build frequently (often 2-3x/week
for i386 and amd64), producing 30-odd GB of files per build. Even with
many mirrors using rsync's --delay-updates --delete-delay, we don't
always get the files updated atomically.
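
The failure mode can be sketched with a toy model. This is only an
illustration, not how pkg_add is implemented: the mirror is modelled as
a set of file names, the package versions are made up, and both helper
functions are hypothetical.

```python
def stale_index_failures(index, mirror):
    """Names that a single index file advertises but the mirror cannot serve yet."""
    return sorted(name for name in index if name not in mirror)


def per_package_candidates(stem, mirror):
    """Files actually present on the mirror for one package stem."""
    return sorted(name for name in mirror if name.startswith(stem + "-"))


# Hypothetical mid-sync state: the index from the new build has already
# arrived, but only one of the two new package files has been copied.
new_index = ["curl-7.54.0.tgz", "vim-8.0.562.tgz"]
mirror_mid_sync = {"curl-7.54.0.tgz", "vim-8.0.388.tgz"}

# An index-driven client is told about a file that is not there yet.
missing = stale_index_failures(new_index, mirror_mid_sync)

# A per-package check only ever sees files that really exist.
available = per_package_candidates("vim", mirror_mid_sync)

print(missing)    # ['vim-8.0.562.tgz']
print(available)  # ['vim-8.0.388.tgz']
```

In this model the index-driven client would try to fetch a file that is
missing, while the per-package check simply finds the older vim package
and carries on until the mirror catches up.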

> 3. Why so many connections?
> When I tried to investigate why the update was so slow, I saw that
> pkg_add was making one HTTPS connection per package! Tools like
> wget from Linux can reuse a single connection for many downloads.
> Could this be added to pkg_add in OpenBSD?

There is a plan for some connection caching in the future, maybe
via an external process (though you can do this yourself today for
the connection going over the internet by using a reverse HTTP
proxy; I've used nginx for this before).

Past attempts at persistent connections ran into problems with stalling
and weren't overall an improvement. Persistent connections would be
desirable, but their absence hasn't been too big of a problem in
real-world use.

Personally I wouldn't really recommend HTTPS for pkg_add at this point.
Packages are signed anyway (and in recent versions the signature is
verified *before* passing to the decompressor) so I don't see much
real safety advantage. Multiple HTTP connections, while slower than
persistent connections, are considerably faster than doing a key
exchange (kex) for HTTPS many times over.
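
As a rough back-of-the-envelope model of that comparison, the sketch
below counts network round trips only. The costs are assumptions, not
measurements: 1 round trip for the TCP handshake, 2 for a full TLS 1.2
handshake, and 1 per request/response. It ignores the CPU cost of the
key exchange, TLS session resumption and bandwidth, and the function
name is hypothetical.

```python
def round_trips(packages, tls, persistent):
    """Estimate round trips needed to fetch `packages` files.

    Assumed costs: 1 RTT for TCP setup, 2 more RTTs for a full
    TLS 1.2 handshake, and 1 RTT for each request/response.
    """
    setup = 1 + (2 if tls else 0)
    connections = 1 if persistent else packages
    return connections * setup + packages


# One connection per package, 100 packages:
http_per_package = round_trips(100, tls=False, persistent=False)
https_per_package = round_trips(100, tls=True, persistent=False)

# A single reused plain-HTTP connection, for comparison:
http_persistent = round_trips(100, tls=False, persistent=True)

print(http_per_package, https_per_package, http_persistent)  # 200 400 101
```

Under these assumptions, one HTTPS connection per package costs about
twice as many round trips as one HTTP connection per package, and a
reused connection roughly halves the plain-HTTP figure again.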
