DOC: extend the description of dependency pitfalls and intricacies
Still not complete, but a complete outline now
parent b5de8523b1
commit 073efdf6a4
1 changed file with 97 additions and 22 deletions
@ -25,8 +25,8 @@ specified, coherent operational structure.
Within this _definition of a coded structure_, there is an inherent tension
between the _absoluteness_ of a definition (a definition in mathematical sense
can not be changed, once given) and the _order of spelling out_ this definition.
When put in such an abstract way, all of this might seem self evident and trivial,
but let's just consider the following complications in practice...
When described in such an abstract way, these observations might be deemed self evident
and trivial, but let's just consider the following complications in practice...

- Headers are included into multiple translation units. Which means, they appear
in several disjoint contexts, and must be written in a way independent of the
@ -34,8 +34,8 @@ but let's just consider the following complications in practice...
- Macros, from the point of their definition onwards, change the way the compiler
``sees'' the actual code.
- Namespaces are ``open'' -- meaning they can be re-opened several times and
populated with further definitions. Generally speaking, the actual contents of
any given namespace will be different in each and every translation unit.
populated with further definitions. The actual contents of any given namespace
will be slightly different in each and every translation unit.
- a Template is not in itself code, but a constructor function for actual code.
It needs to be instantiated with concrete type arguments to produce code.
And when this happens, the template instantiation picks up definitions
@ -70,10 +70,13 @@ Since it is really hard to reconcile all these conflicting goals, we are bound
to rely on *patterns of construction*, which are known to work out well in
this regard.

[yellow-background]#to be written#
Import order, forward decls, placement of ctors, wrappers, PImpl


Code size and Code Bloat
~~~~~~~~~~~~~~~~~~~~~~~~
Each piece of code incurs costs of various kinds
Each piece of code incurs cost of various kinds

- it needs to be understood by the reader. Otherwise it will die
sooner or later and from then on haunt the code base as a zombie.
@ -86,28 +89,28 @@ Each piece of code incurs costs of various kinds
- and since we're developing with debug builds, each and every definition
produces debug information in each and every translation unit referring to it.

Thus, for every piece of code we must ask ourselves how _visible_ this code
is. And we must consider the dependencies the code incurs. It pays off to
Thus, for every piece of code we must ask ourselves how much _visible_ this
code is. And we must consider the dependencies the code incurs. It pays off to
turn something into a detail and ``push it into the backyard''. This explains
why we're using the frontend - backend split so frequently.
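
One common way to ``push a detail into the backyard'' is the PImpl idiom mentioned in the outline above. The following is only a minimal sketch, not code taken from the code base: the class `Renderer` and its members are hypothetical names chosen for illustration. The upper half stands for what would be placed into a header, the lower half for what would stay within a single translation unit.

```cpp
#include <memory>
#include <string>
#include <utility>

// --- would go into the public header ("frontend") ---
class Renderer                     // hypothetical name, for illustration only
{
    struct Impl;                   // declared only: the details stay hidden
    std::unique_ptr<Impl> impl_;

public:
    explicit Renderer(std::string name);
   ~Renderer();                    // defined below, where Impl is complete
    std::string describe() const;
};

// --- would go into a single translation unit ("backend") ---
struct Renderer::Impl
{
    std::string name;
    int framesDone;
};

Renderer::Renderer(std::string name)
    : impl_{new Impl{std::move(name), 0}}
{ }

Renderer::~Renderer() = default;

std::string Renderer::describe() const
{
    return impl_->name + ": " + std::to_string(impl_->framesDone) + " frames";
}
```

Code including only the upper part never gets to see `Impl`, so changes to the hidden details do not force recompilation of client code, and the definitions within the implementation part are parsed only once.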


Source dependencies vs binary dependencies
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Source and binary dependencies
------------------------------
To _use_ stuff while writing code, a definition or at least a declaration needs to
be brought into scope. This is fine as long as definitions are rather cheap,
omitting and hiding the details of implementation. The user does not need to understand
these details, and the compiler does not need to parse them.

The situation is somewhat different when it comes to _binary dependencies_ though.
At execution time, there are just pieces of data, and functions able to process this
specific data. Thus, whenever a specific piece of data is to be used, the corresponding
functions need to be loaded and made available. Most of the time we're linking dynamically,
and thus the above means that a dynamic library providing those functions needs to be loaded.
At execution time, all we get is pieces of data, and functions able to process specific
data. Thus, whenever some piece of data is to be used, the corresponding functions need
to be loaded and made available. Most of the time we're linking dynamically,
and thus the above means that a _dynamic library_ providing those functions needs to be loaded.
This other dynamic library becomes a dependency of our executable or library; it is recorded
in the 'dynamic' section of the headers of our ELF binary (executable or library). Such a
'needed' dependency is recorded there in the form of a ``SONAME'': this is a unique, symbolic
ID denoting the library we're depending on. At runtime, its the responsibility of the system's
ID denoting the library we're depending on. At runtime, it is the responsibility of the system's
dynamic linker to translate these SONAMEs into actual libraries installed somewhere on the system,
to load those libraries and to map the respective memory pages into our current process' address
space, and finally to _relocate_ the references in our assembly code to point properly to the
@ -121,8 +124,8 @@ As long as all we had to deal with was code in upper layers using and invoking s
layers, there would not be much to worry about. Yet to produce any tangible value, software has to
collaborate on shared data. So the naive ``natural'' form of architecture would be to build
everything around shared knowledge about the layout of this data. Unfortunately such an approach
endangers the most central property of software, namely to be ``soft'', to be able to adapt to
change. Inevitably, data centric architectures either grow into a rigid immobile structure,
endangers the most central property of software, namely to be ``soft'', to adapt to change.
Inevitably, data centric architectures either grow into a rigid immobile structure,
or they breed an intangible insider culture with esoteric knowledge and obscure conventions
and incantations. The only known solution to this problem (incidentally a solution known
since millennia), is to rely on subsidiarity. ``Tell, don't ask''
@ -130,13 +133,85 @@ since millennia), is to rely on subsidiarity. ``Tell, don't ask''
This gets us into a tricky situation regarding binary dependencies. Subsidiarity leads to an
interaction pattern based on handshakes and exchanges, which leads to mutual dependency. One
side places a contract for offering some service, the other side reshapes its internal entities
to comply to that contract superficially. Generally speaking, to handle the entities involved
in each handshake, effectively we need the internal functions of both sides. Which is in
contradiction to a ``clean'' layer hierarchy.
to comply with that contract superficially. Dealing with the entities involved in such a handshake
effectively involves the internal functions of both sides. Which is in contradiction to a
``clean'' layer hierarchy.

For a tangible example, lets assume the our backend has to do some work on behalf of the GUI;
For a more tangible example, let's assume our backend has to do some work on behalf of the GUI;
so the backend offers a contract to outline the properties of stuff it can work on. In compliance
with this contract, the GUI hands some data entities to the backend to work on -- but by their
very nature, these data entities are and remain GUI entities. When the backend invokes compliant
with this contract, the GUI hands over some data entities to the backend to work on -- but by their
very nature, these data entities are and always remain GUI entities. When the backend invokes compliant
operations on these entities, it effectively invokes functionality implemented in the GUI. Which
makes the backend _binary dependent on the GUI_.

While this problem can not be resolved in principle, there are ways to work around it, to the degree
necessary to get hierarchically ordered binary dependencies -- which is what we need to make a lower
layer operative, standalone, without the upper layer(s). The key is to introduce an _abstraction_,
and then to _segregate_ along the realm of this abstraction, which needs to be chosen large enough
in scope to cast the service and its contract entirely in terms of this abstraction, but at the same
time it needs to be kept tight enough to prevent details of the client from leaking into the abstraction.
When this is achieved (which is the hard part), then any operations dealing solely with the abstraction
can be migrated into the entity offering the service, while the client hides the extended knowledge about
the nature of the manipulated data behind a builder function footnote:[frequently this leads to the
``type erasure'' pattern, where specific knowledge about the nature of the fabricated entities -- thus
a specific type -- is relinquished and dropped once fabrication is complete], but retains ownership
of these entities, passing just a reference to the service implementation. This move ties the binary
dependency on the client implementation to this factory function -- as long as _this factory_ remains
within the client, the decoupling works and eliminates binary cross dependencies.
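
The segregation described above might be sketched as follows. This is an illustration under assumptions, not the actual code: `Workable`, `process`, `gui::Widget` and `gui::buildWorkItem` are hypothetical names, and a real setup would distribute the two parts over separate shared libraries instead of a single file.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

// --- backend (service): defines the abstraction and thus the contract ---
struct Workable                    // hypothetical abstraction
{
    virtual ~Workable() = default;
    virtual std::string payload() const = 0;
};

// deals solely with the abstraction, so it can live within the backend
inline std::size_t process(Workable const& item)
{
    return item.payload().size();
}

// --- GUI (client): knows the concrete type and keeps ownership ---
namespace gui {

    class Widget : public Workable
    {
        std::string label_;
    public:
        explicit Widget(std::string label) : label_{std::move(label)} { }
        std::string payload() const override { return "widget:" + label_; }
    };

    // factory: the only place naming the concrete type; the result is
    // handed out "type erased", known just through the abstraction
    inline std::unique_ptr<Workable> buildWorkItem(std::string label)
    {
        return std::make_unique<Widget>(std::move(label));
    }
}
```

Here the client would call `gui::buildWorkItem("ok")`, retain the resulting object and pass just a reference to `process`. The backend part names no symbol from the `gui` namespace; the GUI code is reached only through virtual dispatch at runtime.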

This solution pattern can be found at various places within the code base; in support we link with
strict dependency checking (link flag `--no-undefined`), so every violation of the predefined
hierarchical dependency order of our shared modules is spotted immediately during build.


Finding dependencies at start-up
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[yellow-background]#to be written#


Transitive binary dependencies
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Binary dependencies can be recursive:
When our code depends on some library, this library might in turn depend on other libraries.
At runtime, the dynamic linker/loader will detect all these transitive dependencies and try to load
all the required shared libraries; thus our binary is unable to start, unless all these dependencies
are already present on the target system. It is the job of the packager to declare all necessary dependencies
in the software package definition, so users can install them through the package manager of the distribution.
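
To get an impression of what has actually been loaded, direct and transitive dependencies alike, a running process can enumerate the objects mapped into its address space. The sketch below assumes a Linux system with glibc, since it relies on the non-portable function `dl_iterate_phdr` from `<link.h>`; the helper names `collectName` and `loadedObjects` are made up for this illustration.

```cpp
#include <link.h>      // dl_iterate_phdr (glibc extension, not portable)
#include <cstddef>
#include <string>
#include <vector>

// Callback, invoked once for each object loaded into the process;
// stores the path name of that object into the given vector.
static int collectName(dl_phdr_info* info, size_t, void* data)
{
    auto* names = static_cast<std::vector<std::string>*>(data);
    names->emplace_back(info->dlpi_name ? info->dlpi_name : "");
    return 0;          // zero means: continue with the next object
}

// Path names of all objects currently mapped into this process.
// The main program itself typically shows up with an empty name.
std::vector<std::string> loadedObjects()
{
    std::vector<std::string> names;
    dl_iterate_phdr(collectName, &names);
    return names;
}
```

Printing the result of `loadedObjects()` from within a dynamically linked program will typically list more libraries than the ones named directly on the link line, which is precisely the transitive effect described above.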

There is a natural tendency to define those installation requirements too widely. For one, it is better
to be on the safe side, otherwise users won't be able to run the executable at all. And on top of that,
there is the general tendency towards frameworks, toolkit sets and library collections -- basically
a setup which is known to work under a wide range of conditions. Using any of these typically means
adding a _standard set of dependencies_, which is often way more than actually required to load and
execute our code. One way to fight this kind of ``distribution dependency bloat'' is to link `--as-needed`.
In this mode, the linker silently drops any binary dependency not necessary for _this concrete piece
of code_ to work. This is just awesome, and indeed we set this toggle by default in our build process.
But there are some issues to be aware of.

Static registration magic
^^^^^^^^^^^^^^^^^^^^^^^^^
[yellow-background]#to be written#

Relative dependency location
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Locating binary dependencies relative to the executable (as described above) is complicated when several
of _our own dynamically linked modules_ depend on each other transitively. For example, a plug-in might
depend on `liblumierabackend.so`, which in turn depends on `liblumierasupport.so`. Now, when we link
`--as-needed`, the linker will add the direct dependency, but omit the transitive dependency on the
support library. Which means, at runtime, that we'd need to find the support library _when we are
about to load the backend library_. With the typical, external libraries already installed to the
system this works, since the linker has built-in ``magic'' knowledge about the standard installation
locations of system libraries. Not so for our own loadable components -- recall, the idea was to provide
a self-contained directory tree, which can be relocated in the file system as appropriate, without the
need to ``install'' the package officially. The GNU dynamic linker can handle this requirement, though,
if we supply additional, relative search information _with the library pulling in the transitive
dependency_. In our example, `liblumierabackend.so` needs an additional search path to locate
`liblumierasupport.so` _relative_ to the backend lib (and not relative to the executable). For this
reason, our build system by default supplies such a search hint with every Lumiera lib or dynamic
module -- assuming that our own shared libraries are installed into a subdirectory `modules` below
the location of the executable; other dynamic modules (plug-ins) may be placed in sibling directories.
So, to summarise, the build defines the following `RPATH` and `RUNPATH` specs:

for executables:: `$ORIGIN/modules`
for libs and modules:: `$ORIGIN/../modules`
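
To see why both specs end up at the same directory, the expansion of `$ORIGIN` can be mimicked by a small helper. This is merely an illustration of the path arithmetic, not the way the dynamic linker is implemented (the linker does not normalise the path lexically); the function `expandOrigin` and the installation prefix `/opt/lumiera` are assumptions made up for this example.

```cpp
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Replace each "$ORIGIN" within the search path spec by the directory
// holding the binary object (executable or library) carrying that spec,
// then clean up "." and ".." components lexically.
fs::path expandOrigin(std::string spec, fs::path const& object)
{
    const std::string token{"$ORIGIN"};
    const std::string origin = object.parent_path().string();

    for (auto pos = spec.find(token);
         pos != std::string::npos;
         pos = spec.find(token, pos + origin.size()))
    {
        spec.replace(pos, token.size(), origin);
    }
    return fs::path{spec}.lexically_normal();
}
```

With the hypothetical layout assumed here, `expandOrigin("$ORIGIN/modules", "/opt/lumiera/lumiera")` and `expandOrigin("$ORIGIN/../modules", "/opt/lumiera/modules/liblumierabackend.so")` both yield `/opt/lumiera/modules`, and the same holds for a plug-in placed in a sibling directory of `modules`.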