A deeper investigation revealed that we can show the result of glitches
for each relevant situation, simply by scrutinising the produced distribution.
Even the 64-bit-Variant shows a skewed distribuion, in spite of all numbers
being within definition range.
So the conclusion is: we can expect tilted results, but in many cases
this might not be an issue, if the result range is properly wrapped / clipped.
Notably this is the case if we just want to inject a randomised sleep into a multithreaded test setup
Build a self-contained test case to document these findings.
Further investigation shows that the ''data type used for computation'' plays a crucial role.
The (recommended) 64bit mersenne twister uses the full value range of the working data type,
which on a typical 64bit system is also `uint64_t`. In this case, values corrupted by concurrency
go unnoticed. This can be **verified empirically** : the distribution
of shifts from the theoretical mean value is in the expected low range < 2‰
However, when using the 32bit mersenne engine, the working data type is still uint64_t.
In this case a **significant number of glitches** can be shown empricially.
When drawing 1 Million values, in 80% of all runs at least one glitch and up to 5 glitches
can happen, and the mean values are **significantly skewed**
''In theory,'' the random number generators are in no way threadsafe,
neither the old `rand()`, nor the mersenne twister of the C++ standard.
However, since all we want is some arbitrarily diffused numbers,
chances are that this issue can be safely ignored; because a random
number computation broken by concurrency will most likely generate --
well, a garbled number or "randomly" corrupted internal state.
Validating this reasoning by an empiric investigation seems advisable though.
- SchedulerStress_test simply takes too long to complete (~4 min)
and is thus aborted by the testrunner. Add a switch to allow for
a quick smoke test.
- SchedulerCommutator_test aborts due to an unresolved design problem,
which I marked as failure
- add some convenience methods for passing arguments to tests
...which turn out not to be due to the PRNG changes
* the SchedulerCommutator_test was inadvertently broken 2024-04-10
* SchedulerStress_test simply runs for 4min, which is not tolerated by our Testsuite setup
see also:
5b62438eb
We use the memory address to detect reference to ''the same language object.''
While primarily a testing tool, this predicate is also used in the
core application at places, especially to prevent self-assignment
and to handle custom allocations.
It turns out that actually we need two flavours for convenient usage
- `isSameObject` uses strict comparison of address and accepts only references
- `isSameAdr` can also accept pointers and even void*, but will dereference pointers
This leads to some further improvements of helper utilities related to memory addresses...
Problems in `Rational_test` were caused by `#include' reorderings regarding ''rational'' and ''intgral'' numbers.
The actual root cause is the fact that `FSecs` is only a typedef,
which prevents us from providing a string conversion for rational numbers without ambiguity
* most usages are drop-in replacements
* occasionally the other convenience functions can be used
* verify call-paths from core code to identify usages
* ensure reseeding for all tests involving some kind of randomness...
__Note__: some tests were not yet converted,
since their usage of randomness is actually not thread-safe.
This problem existed previously, since also `rand()` is not thread safe,
albeit in most cases it is possible to ignore this problem, as
''garbled internal state'' is also somehow „random“
As it turns out, by far margin we mostly use rand() to generate
test values within a limited interval, using the ''modulo trick''
and thus excluding the upper bound.
Looking into the implementation of the distributions in the
libStdC++ shows that ''constructing'' a distribution on-the-fly
is cheap and boils down to checking and then storing the bounds;
so basically there is no need to keep ''cached distribution objects''
around, because for all practical purposes these behave like free functions
What is required occasionally is a non-zero HashValue, and sometimes
an interval of floating-point number or a normal distribution seem useful.
Providing these as free-standing convenience functions,
implicitly accessing the default PRNG.
* add new option to the commandline option parser
* pass this as std::optional to the test-suite constructor
* use this value optionally to inject a fixed value on re-seeding
* provide diagnostic output to show the actual seed value used
...to the base-class of all tests
* `seedRand()` shall be invoked by every test using randomisation
* it will draw a new seed for the implicit default-PRNG
* it will document this seed value
* but when a seed was given via cmdline, it will inject that instead
* `makeRandGen()` will create a new dedicated generator instance,
attached (by seeding) to the current default-PRNG
It is not clear yet how to pass the actual `SeedNucleus`, which
for obvious reasons must be maintained by the `test::Suite`
Using random or pseudo-random numbers as input for tests
can be a very effective tool to spot unintended behaviour in
corner cases, and also helps writing more principled test verifications.
However, investigating failures in randomised tests can be challenging.
A well-proven solution is to exploit the **determinism** of pseudo-random-numbers
by documenting a randomly generated seed, that can be re-injected for investigation.
Up to now, most tests rely on the old library function `rand()`, while
at some places already the C++ standard framework for random number generation
is used, packaged into a custom wrapper. Adding adequate support for
documented seed values seems to be easy to achieve, after switching
existing usages of `rand()` to a suitable drop-in replacement.
After some consideration, I decided ''against'' wiring random generator instances
explicitly, while allowing to do so on occasion, when necessary. Thus
the planned seeding mechanism will rather re-seed a ''implicit default''
generator, which could then be used to construct explicit generator instances
when required (e.g. for multithreaded tests)
As a starting point, this changeset replaces the `randomise()` API call
by a direct access to the ''reseeding functionality'' exposed by the
C++ framework and all default generators. Since we already provide a
dedicated static instance of the plattform entropy source, re-randomisation
can be achieved by seeding from there.
NOTE: there was extended debate in the net, questioning the viability
of the `std::random_seq` -- these arguments, while valid from a theoretical
point of view, seem rather moot when placed into a practical context,
where even 2^32 different generation-paths(cycles) are more than enough
to provide sufficient diffusion of results (unless the goal is really to
engage into Monte-Carlo simulations for scientific research or large model
simulations).
Notable most of the more catchy reprovals raised by Melissa O'Neill
have been refuted by experts of the field, even while being still propagated
at various places in the net, often combined with promoting PCG-Random.
Originally, this helper was called `IterIndex`, thereby following a
common naming scheme of iteration-related facilities in Lumiera, e.g.
* `IterAdapter`
* `IterExplorer`
* `IterSource`
However, I myself was not able to recall this name, and found myself
now for the second time unable to find this piece of code, even while
still able to recall vaguely that I had written something of this kind.
(and unable to find it by a text search for "index", for obvious reasons)
So, on a second thought, the original name is confusing: we do not create
an index of / for iterators; rather we are iterating an index. So this
is what it should be called...
This is the first step towards a »Test Domain Ongology« #1372,
which is a systematic arrangement of test-dummy functionality assumed
to mirror the actual media processing functionality present in external libs.
Each media-processing library not only provides functions to crunch data,
but also establishes a framework of entities and classification to determine
what »media« is an how it is structured and can be generated, transformed
and qualified. Since a essential goal for Lumiera is to be **library agnostic,**
it is important to avoid naïvely to take some popular library's choices
as universal truth regarding structure and nature of »media« as such.
Rather, the architecture of the Lumiera Render Engine must be kept
sufficiently open to accommodate the working style of various libraries,
even ones not known today.
To validate this architectural openness, we use a set of test functions
unrelated to any existing library to validate access to and usage of
rendering functionality — followed by further steps to adopt existing
popular libraries like **FFmpeg** or **Gstreamer**, without tilting
the basic structure of the Render Engine one way or the other.
showing the Node-symbol and a reduced rendering of
either the predecessor or a collection of source nodes.
For this we need functionality to traverse the node graph depth-first
and collect all leaf nodes (which are the source nodes without predecessor);
such can be implemented with the help of the expandAll() functionality
of `lib::IterExplorer`. In addition we need to collect, sort and deduplicate
all the source-node specs; since this is a common requirement, a new
convenience builder was added to `lib::IterExplorer`
...taking into account the prospecive usage context
where the builder expressions will be invoked from within
a media-library plug-in, using std::string_view to pass
the symbolic information seems like a good fit, because
the given spec will typically be assembled from some
building blocks, and thus in itself not be literal data.
Building a precise Frame Cache is a tough job, and is doomed to fail
when attempting to tie cache invalidation to state changes. The only
viable path is to create a system of systematic tagging of processing
steps, and use this as foundation for chained hash values, linked
in accordance to the actual processing structure.
This is complicated by the secondary concern of maintaining memory efficacy
for the render node model, which can be expected to grow to massive scale.
And even while this invocation can not be fully devised right now,
an attempt can be made to build a foundation that is not outright
wasteful, by detaching the logical information from the specific
weaving pattern used for implementation, and by minimising the
representation in memory and computing the compound information
on-demand....
The immediate next goal is to verify properties of render nodes
generated by the builder framework; two kinds of validations
can be distinguished
* structural aspects of the wiring
* the fact that processing functionality is invoked in proper order
Looking into the structural aspects brings about the necessity
to identify the actual processing function bound into some functor.
Some recapitulation of goals and requirements revealed, that this
can not be a merely technical identity record — because the intention
is to base the ''cache key'' on chained processing node identities,
so that the key is stable as long as the user-visible results will be
equivalent. And while structural data can be aggregated, at the
core this information must be provided by the scheme embedded
into the domain ontology, which is tasked with invoking the
builder in order to implement a ''specific processing-asset''
Review the achievements from the last days and map out the further path
for test-driven build-up of a render-node network and invocation.
Notably ''several layers of prototyping'' are in the works now;
it is important to understand the purpose of each such round of
prototyping and to draw the necessary conclusions after closing out.
The next topic to investigate relates to the ''identity'' of nodes and
ports within nodes; this entails to generate a ''symbolic spec'' that
can be verified and used as base for a systematic hash-ID and cache-key...
Since it would in fact be possible to access and write beyond the configured storage,
simply by using the builder API without considering consistency,
it seems advisable to use explicit runtime checks here, instead of
only assertions, and to throw an exception when violating bounds.
Moreover, unsuccessfully attempted to better arrange the functionality
between PortBuilder and WeavingBuilder; seemingly we have an rather tight
coupling here, and also the expectations regarding the processing function
seem to be too tight (but that's the reason why it's an prototype...)
...which then also allow to fill in the missing parts for the
default 1:1 wiring scheme, which connects each »input slot«
of the processing function with the corresponding ''lead node''
- the chaining constructor is picked reliably when the
slicing is done by a direct static_cast
- the function definition can be passed reliably in all cases
after it has been ''decayed,'' which is done here simply by
taking it by-value. This is adequate, since the function
definition must be copied / inlined for each invocation.
With these fixes, the simplest test case now for the first time
**runs through without failure**
This change allows to disentangle the usages of `lib::SeveralBuilder`,
so that at any time during the build process only a single instance is
actively populated, all in one row — and thus the required storage can
either be pre-allocated, or dynamically extended and shrinked (when
filling elements into the last `SeveralBuilder` currently activated)
By packaging into a λ-closure, the building of the actual `Port`
implementation objects (≙ `Turnout` instances) is delayed until the
very end of the build process, and then unloaded into yet another
`lib::Several` in one strike. Temporarily, those building functor
objects are „hidden“ in the current stack frame, as a new `NodeBuilder`
instance is dropped off with an adapted type parameter (embedding the
λ-type produced by the last nested `PortBuilder` invocation, while
inheriting from previous ones.
However, defining a special constructor to cause this »chaining«
poses some challenge (regarding overload resolution). Moreover,
since the actual processing function shall be embedded directly
(as opposed to wrapping it into a `std::function`), further problems
can arise when this function is given as a ''function reference''
...and as expected, this turns up quite some inconsistencies,
especially regarding usage of the »buffer types«.
Basically, the `PortBuilder` is responsible for the high-level functionality
and thus must ensure the nested `WiringBuilder` is addressed and parameterised
properly to connect all »slots« of the processing function.
- can use a helper function in the WiringBuilder to fill in connections
- but the actual buffer types passed over these connectinos are totally
unchecked at that level, and can not see yet how this danger can be
mitigated one level above, where the PortBuilder is used.
- it is still unclear what a »buffer type« actually means; it could
be the pointer type, but it could also imply a class or struct type
to be emplaced into the buffer, which is a special extension to the
`BufferProvider` protocol, yet seems to be used here rather to transport
specific data types required by the actual media handling library (e.g. FFmpeg)
__Analysis__: what kind of verifications are sensible to employ
to cover building, wiring and invocation of render nodes?
Notably, a test should cover requirements and observable functionality,
while ''avoiding direct hard coupling to implementation internals...''
__Draft__: the most simple node builder invocation conceivable...
Code clean-up: mark all buffers with a dedicated tagging type
The point in question is: if we work the LocalTag into the type-hash,
could it be possible to miss an existing entry in the metadata registry?
This could cause two entries to be locked for a single buffer address,
leading to data corruption.
As far as I can see, in the current usage this would not happen,
but unfortunately this problem can not be ruled out, since the BufferProvider
API and protocol is designed to be open for various usage patterns.
However, the same potentially disastrous pattern could also materialise
when registering two different buffer types, and then locking each
for the same buffer location.
...this is a surprisingly tricky issue, since it undercuts the
generic and recursive implementation of buffer handling;
fortunately I've foreseen such demands may arise down the road
and I've reserved an »Local Key« (now renamed into `LocalTag`),
whose meaning is implementation defined and interpreted by
the specific `BufferProvider`
It became clear that a secondary system of connections must be added,
running top-down from a global model context, and thus contrary to the
regular orientation of the node network, which connects upwards from
predecessor to successor, in accordance with the pull principle.
If we accept this wiring as part of the primary structure, it can be
established immediately while building the nodes, thus adding a preconfigured
''pattern of Buffer Descriptors'' to each node, since there is no further
''moving part'' — beyond the wiring to the `BufferProvider`, which thus
becomes part of a global `ModelContext`
As an immediate consequence, the storage for this configuraion should
also be switched to `lib::Several` and handled similar to the primary
node wiring in the Builder...
* conduct analysis regarding allocator handling in the Builder
* turns out we'll have to keep around two different allocators while building
* ⟹ establish the goal to confine usage of the Node allocator to the lower Levels
* consequently must open up the `lib::SeveralBuilder` to be usable
as an intermediary data structure, while building up the target data
* in the initial design, the `SeveralBuilder` was kept opaque, since
contents can be expected to be re-located frequently and thus exposing
elements and taking references could be dangerous — yet this is also
true for `std::vector` however, so people are assumed to know
when they want to shoot themselves into their own foot
...especially what is necessary to represent at this level and what information
is implicit; notably there will be an implicit default wiring, but we allow
for case-by-case deviations
To escape a possible deadlock in analysis, I resort to developing
some kind of free-wheeling presupposition how the **Builder** could
be implemented — a centrepiece of the Lumiera architecture envisioned
thus far — which ''unfortunately'' can only be planned and developed
in a more solid way ''after'' the current »Vertical Slice« is completed.
Thus I find myself in the uncomfortable situation of having to work towards
a core piece, which can not yet be built, since it relies heavily on
the very structures to be built...
...and this line of analysis brings us deep into the ''Buffer Provider''
concept developed in 2012 — which appears to be very well to the point
and stands the test of time.
Adding some ''variadic arguments'' at the right place surprisingly leads
to an ''extension point'' — which in turn directly taps into the
still quite uncharted territory interfacing to a **Domain Ontology**;
the latter is assumed to define how to deal with entities and relationships
defined by some media handling library like e.g. FFmpeg.
So what we're set to do here is actually ''ontology mapping....''
The immediate next step is to build some render nodes directly
in a test setting, without using any kind of ''node factory.''
Getting ahead with this task requires to identify the constituents
to be represented on the first code layer for the reworked code
(here ''first layer'' means any part that are ''not'' supplied
by generic, templated building blocks).
Notably we need to build a descriptor for the `FeedManifold` —
which in turn implies we have to decide on some fundamental aspects
of handling buffers in the render process.
To allow rework of the `ProcNode` connectivity, a lot of presumably obsoleted
draft code from 2011 has to be detached, to be able to keep it in-tree
for further reference (until the rework and refactoring is settled).
As outlined in #1367, the integration effort requires some rework
of existing code, which will be driven ahead by the `NodeLinkage_test`
* redefine Node Connectivity
* build simple `ProcNode` directly in scope
* create an `TurnoutSystem` instance
* perform a ''dummy Node-Invocation''
As a replacement for the `RefArray` a new generic container
has been implemented and tested, in interplay with `AllocationCluster`
* the front-end container `lib::Several<I>` exposes only a reference
to the ''interface type'' `I`, while hiding any storage details
* data can only be populated through the `lib::SeveralBuilder`
* a lot of flexibility is allowed for the actual element data types
* element storage is maintained in a storage extent, managed through
a custom allocator (defaulting to `std::allocator` ⟹ heap storage)
The `SeveralBuilder` employs the same tactic as `std::vector`,
by over-allocating a reserve buffer, which grows in exponential
increments, to amortise better the costs of re-allocation.
This tactic does not play well with space limited allocators
like `AllocationCluster` however; it is thus necessary to provide
an extension point where the actuall allocator's limitation can be
queried, allowing to use what is available as reserve, but not more.
With these adaptations, a full usage cycle backed by `AllocationCluster`
can be demonstrated, including variations of dynamic allocation adjustment.
...identified as part of bug investigation
* make clear that reserve() prepares for an absolute capacity
* clarify that, to the contrary, ensureStorageCapaciy() means the delta
Moreover, it turns out that the assertion regarding storage limits
triggers frequently while writing the test code; so we can conclude
that the `AllocationCluster` interface lures into allocating without
previous check. Consequently, this check now throws a runtime exception.
As an aside, the size limitation should be accessible on the interface,
similar to `std::vector::max_size()`
- decided to allow creating empty lib::Several;
no need to be overly rigid in this point,
since it is move-assignable anyway...
- populate with enough elements to provoke several reallocations
with copying over the existing elements
- precisely calculate and verify the expected allocation size
- verify the use-count due to dedicated allocator instances
being embedded into both the builder and hidden in the deleter
- move-assign data
- all checksums go to zero at end
The setup for `ArrayBucket` is special, insofar it shell de-allocate itself,
which creates the danger of re-entrant calls, or to the contrary, the danger
to invoke this clean-up function without actually invoking the destructor.
These problems become relevant once the destructor function itself is statefull,
as is the case when embedding a non-trivial, instance bound allocator
to be used for the clean-up work. Using the new `lib::TrackingAllocator`
highlighted this potential problem, since the allocator maintains a use-count.
Thus I decided to move the »destruction mechanics« one level down into
a dedicated and well encapsulated base class; invoking ArrayBucket's destructor
thereby becomes the only way to trigger the clean-up, and even ElementFactory::destroy()
can now safely check if the destructor was already invoked, and otherwise
re-invoke itself through this embedded destructor function. Moreover,
as an additional safety measure, the actual destructor function is now
moved into the local stack frame of the object's destructor call, removing
any possibility for the de-allocation to interfere with the destructor
invokation itself
part of the observed deviation stems form bugs in logging and checksum calculation;
but there seems to be a real problem hidden in the allocator usage of the
new component, since the use-cnt of the handle does not drop to zero
While there might be the possibility to use the magic of the standard library,
it seems prudent rather to handle this insidious problem explicitly,
to make clear what is going on here.
To allow for such explicit alignment handling, I have now changed the
scheme of the storage definition; the actual buffer now starts ''behind''
the `ArrayBucket<I>` object, which thereby becomes a metadata managing header.
__To summarise the problem__: since we are maintaining a dynamically sized buffer,
and since we do not want to expose the actual element type through the
front-end object, we're necessarily bound to perform a raw-memory allocation.
This is denoted in bytes, and thus the allocator can no longer manage
the proper alignment automatically. Rather, we get a storage buffer with
just ''some accidental'' alignment, and we must care to request a sufficient
overhead to be able to shift the actual storage area forward to the next
proper alignment boundary. Obviously this also implies that we must
store this individual padding adjustment somewhere in the metadata,
in order to be able to report the correct size of the block later
on de-allocation.
The solution implemented thus far turns out to be not sufficient
for ''over-aligned-data'', as the raw-allocator can not perform the
''magic work'' because we're exposing only `std::byte` data.
This adaptor works in concert with the generic allocator
building blocks (prospective ''Concepts'') and automatically
registers a either static or dynamic back-link to the factory
for clean-up.
Use this wrapper fore more in-depth test of the new `TrackingAllocator`
and verify proper behaviour through the `EventLog`