A few months ago I criticised the Qt team for not following QA directions regarding the installation of documentation. Now I wish to apologize and thank them: they were tremendously useful in identifying one area of Gentoo that needs fixes in both the QA policy and its actual application.
The problem: the default QA rules, and the policies encoded in both the Ebuild HOWTO (which should be deprecated!) and Portage itself, say to install the documentation of packages into /usr/share/doc/${PF}, a path that changes between different revisions of the same package. Some packages currently don’t respect that: some because nobody thought about it; some because they were mistakenly bound to ${P} on a zero-revision ebuild; some because they need not respect that path.
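To see why binding to ${P} goes unnoticed on a zero-revision ebuild, here is a minimal sketch in plain shell of how the two variables relate. The helper name compute_pf is hypothetical and this is not Portage code; it only mirrors the rule that ${PF} equals ${P} when the revision is r0 and gains a -rN suffix otherwise.

```shell
#!/bin/sh
# Hypothetical helper (not Portage code): mimics how ${PF} is derived
# from the package name, version and revision.
# usage: compute_pf PN PV [PR]
compute_pf() {
    pn=$1
    pv=$2
    pr=${3:-r0}          # a missing revision is treated as r0
    p="${pn}-${pv}"      # this is what ${P} would be
    if [ "${pr}" = "r0" ]; then
        # zero revision: ${PF} is identical to ${P}
        printf '%s\n' "${p}"
    else
        # any other revision: ${PF} carries the -rN suffix, ${P} does not
        printf '%s\n' "${p}-${pr}"
    fi
}

compute_pf foo 1.2 r0   # prints foo-1.2
compute_pf foo 1.2 r1   # prints foo-1.2-r1
```

A package that installs into /usr/share/doc/${P} therefore looks correct until its first revision bump, at which point its path and the policy path diverge.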
When I started reporting wrongly-installed documentation, I wasn’t expecting any packages in the last category; it turns out there are many examples of it, plus a further number of examples and use cases that call for changing that policy altogether:
- package foo needs to know where package bar installs its documentation, so that it can load it, for whatever reason; this requires bar to either symlink its documentation somewhere or break the current policy, otherwise you’d have to rebuild foo each time bar changes revision to find the correct path, which is not feasible;
- package baz needs to know where its own documentation is installed to be able to access it at runtime; this requires the path either to be hardcoded in the sources or to be written in a configuration file requiring a semi-manual merge through etc-config; this is the case of Postfix, for instance;
- probably most important for users, API documentation bookmarks currently cannot be made stable unless you use symlinks; this is very annoying for the people who use those packages to develop (and you might guess that my main target here would be Ruby gems).
The solution: I’m not sure I can say I have a solution, but Samuli and Ulrich proposed a number of possible alternatives to solve the problem; from their suggestions I’d say we have to encode exactly three pieces of information: the category of the package, the package name, and the slot of the package itself. The category is needed because there are a number of packages with the same name in different categories, sometimes even with the same version (the dev-php5 and the old dev-php4 categories are a good example of those, and they were systematically breaking the policy stated above).
One solution proposed to me was /usr/share/doc/${CATEGORY}_${PN}-${SLOT}, which wouldn’t be bad, but it would have a -0 appended to most of the directories; my preferred solution would be to omit -${SLOT} if it’s 0. You’d have stable API documentation links, most of the intra-package and inter-package paths would be stable, and all in all you could drop the need for the documentation symlinking feature we currently have.
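As a rough sketch of that layout in plain shell: the helper name stable_docdir is hypothetical, it is not an existing Portage function, and the package names in the calls are only illustrative.

```shell
#!/bin/sh
# Hypothetical helper sketching the proposed layout; not part of Portage.
# usage: stable_docdir CATEGORY PN [SLOT]
stable_docdir() {
    category=$1
    pn=$2
    slot=${3:-0}
    dir="/usr/share/doc/${category}_${pn}"
    # Append the slot only when it is not the default "0"
    [ "${slot}" = "0" ] || dir="${dir}-${slot}"
    printf '%s\n' "${dir}"
}

stable_docdir dev-ruby rake 0        # prints /usr/share/doc/dev-ruby_rake
stable_docdir dev-lang python 2.6    # prints /usr/share/doc/dev-lang_python-2.6
```

Neither the version nor the revision appears in the result, so the path stays the same across updates within a slot.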
Unfortunately I expect this either to require an EAPI bump or to take a number of years before it can be properly implemented; I’ll probably have to author a GLEP myself, or find someone to author it for me, to suggest changing dodoc. At the same time we should consider finding a better solution for compression, which is another problem we hit. Right now only the documentation installed with dodoc is compressed with the chosen compression program, which might be gzip, bzip2 or lzma, the same as for the man pages. But while man pages get processed after install and before the binpkg/livefs merge steps, documentation does not.
But not all documentation needs to be compressed in the first place: HTML files (API documentation first of all), PDF files and code examples need to be accessible without compression. While we have a dohtml command to install web pages without compressing them, there is no equivalent for the others, and we have to rely on insinto/doins pairs. Furthermore, with more and more autotools-based packages moving to autoconf 2.6x and supporting the --docdir option, we’re going to install more and more documentation directly into the directory, be it the current ${PF} or some other form; as things stand right now, these files won’t be compressed.
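To make the distinction concrete, here is a minimal sketch in plain shell of the kind of per-file decision a post-install compression pass would have to take. The function name may_compress_doc and the exact pattern list are assumptions made for illustration; nothing like this exists in Portage under that name.

```shell
#!/bin/sh
# Hypothetical policy check, not part of Portage: decide from the path
# whether a documentation file may be compressed.
# Returns 0 if the file may be compressed, 1 if it should be left alone.
may_compress_doc() {
    case "$1" in
        html/*|*/html/*|*.html|*.htm) return 1 ;;  # web pages, API docs
        *.pdf)                        return 1 ;;  # PDF files
        examples/*|*/examples/*)      return 1 ;;  # code examples
        *.gz|*.bz2|*.lzma)            return 1 ;;  # already compressed
        *)                            return 0 ;;  # plain text docs
    esac
}

# Walk over a few sample paths and report the decision for each
for f in README ChangeLog html/index.html manual.pdf examples/demo.rb; do
    if may_compress_doc "${f}"; then
        echo "compress: ${f}"
    else
        echo "keep:     ${f}"
    fi
done
```

Running such a check over the whole documentation directory after install would cover files placed there by the build system through --docdir as well as those installed with dodoc.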
So, thanks again to Ben for actually challenging the status quo; his insights here were the spark that made me think about this for a long time.
According to app-doc/pms:3, the problem with compression will be solved in EAPI=4. For autotools-installed documentation, there is currently the hack of explicitly calling “prepalldocs” in src_install, which excludes the html subdirectory, thus providing some minimal control over the compression. Hoping that EAPI=4 will be released soon, so that such hacks are not needed anymore. It would perhaps not be bad for some EAPI to provide the appropriate directory in a variable: it would be terrible to see in almost every ebuild code like
DOCDIR="${EPREFIX}/usr/share/doc/${CATEGORY}_${PN}";[ "${SLOT}" = "0" ] || DOCDIR="${DOCDIR}:${SLOT}"
Having this directory in a variable has the advantage that --htmldir="${DOCDIR}/html" can also be used if appropriate. Perhaps it might even be reasonable to add --docdir="${DOCDIR}" by default to econf in some future EAPI (probably together with some way to omit this parameter for the few packages not supporting it).
There’s another concern to consider here: usability for the end user. When users are looking for documentation, they expect to find it in the standard location of /usr/share/doc/${PN}* rather than under some other prefix. Beyond that, there is a real benefit to being consistent with how other distributions handle it, for purposes of portability and lowering the barrier to entry. Rather than this, I suggest that it could be better to avoid package-name overlaps, as this would also enable a future flat package namespace.
There’s not going to be room for a “flat namespace” anytime soon, and if you can’t see that, you definitely don’t see how upstream is naming things. Just take Ruby: we have a number of name collisions with non-Ruby packages; should we prefix all of them with ruby-? It’d be stupid to do so just to drop the dev-ruby category… especially because we can map the upstream name to a number of things both within the ebuilds/eclass and outside of it. End-user usability comes after having the system work; and name collisions are unavoidable nowadays, with over twelve thousand packages; adding prefixes to work around those collisions is just idiotic. Fun fact: while we have two-tier categories we still have a flat mirror system; on the other hand, FreeBSD ports and pkgsrc (both older than Portage, in one form or another, both having more packages than Portage, and both having a mix of one-tier categories and prefixes on package names) have a category-based mirroring system. Think about that: do you _really_ feel that making a mess of a flat namespace is any good at all?
Yeah, categories are probably the main reason why it’s much more pleasant to search through packages in Gentoo than in Debian. Please never ever flatten that namespace.
I feel it would be nice to have ALL the documentation as browsable HTML, with an index.html that is bookmarked in a default browser install. The index would just link to the docs for installed packages. Unfortunately browsers don’t seem to handle anything compressed very well. And as you mention here, not everything adheres to one set of rules or structure when installing.
After yet another round of “blast, dev/python/${PN} got updated and all the docs moved”, I recalled this blog post. Diego, maybe it’s worth opening a bug on that or something?