This solution checks only the minimal precondition,
which is that a type supports `std::tuple_size<T>`.
A more complete implementation turns out to be surprisingly complex,
since a direct check likely requires compile-time reflection capabilities
at the level of at least C++23
- `std::tuple_element<i,T>` typically implements limits checks,
which interfere with the detection of empty structured types
- the situation regarding `std::get<i>()` is even more complicated,
since we might have to probe for ADL-based solutions, or member templates
The check for minimal necessary precondition however allows us to
single out std::tuple, std::array and our own structured types in
compilation branching, which suffices to fulfils actual needs.
It seemed like we're doomed...
Yet we barely escaped our horrid fate, because the C++ structured bindings happen to look also for get<i> member functions!
Any other solution involving a free function `get<i>(h)` would not work, since the `std::tuple` used as base class would inevitably drag in std::get via ADL
Obviously, the other remedy would be to turn the `StorageFrame` into a member; yet doing so is not desirable, as makes the actual storage layout more obscure (and also more brittle)
Actually it is the implementation of `std::get` from our STL implementation
which causes the problems; our new custom implementation works as intended an
would also be picked by the compiler's overload resolution. But unfortunately,
the bounds checking assertion built into std::tuple_element<I,T> triggers
immediately when instantiated with out-of-bounds argument, which happens
during the preparation of overload resolution, even while the compiler
would pick another implementation in the following routine.
So we're out of luck and need to find a workaround...
Why is our specialisation of `std::get` not picked up by the compiler?
* it must somehow be related to the fact that `std::tuple` itself is a base class of lib::HeteroData
* if we remove this inheritance relation, our specialisation is used by the compiler and works as intended
* however, this strange behaviour can not be reproduced in a simple synthetic setup
It must be some further subtlety which marks the tuple case as preferrable
Seems like low hanging fruit and would especially allow to use
those storage blocks with ''structural bindings''
Providing the necessary specialisations for `std::get` however turns out to be difficult;
the compiler insists on picking the direct tuple specialisation, since std::tuple is a
protected base class; yet still surprising -- I was under the impression
that the direct overload should be the closest match
This basically solves this implementation challenge:
It was possible to construct a ''compile-time type-safe'' overlay,
while using force-casts ''without any metadata'' at runtime.
Obviously this is a dangerous setup, but ''should be resonably safe'' when used within the defined scheme...
* now yields an instance of the full `HeteroData<X,X,Z>` template
* work around problems with std::tuple_element_t for derived classes
Can now default-create and direct-init a front-End data block,
access and modify its elements — and the API looks ok-ish for me
Decision to use the generic case as short-hand for the first data block,
and thus ''hide the more technical Loki-List specialisations''
With that, I'm finally able to write the first test case...
This was a tough nut to crack, but recalling the actual usage situation was helpful
* the ''constructor type'' must be created / picked-up beforehand
* we are about to build a ''parameter-computation node''
* so this constructor presumably is passed to a type parameter of a specific weaving pattern
* the constructor must be invoked directly to drop-off the new data frame into the local scope
* it is preferable to attach it only in a second step to the existing HeteroData-Chain (residing in `TurnoutSystem`)
What would be ''desirable'' though is to have some additional safeguard in the type system
to prevent attaching the newly constructed block to a chain with a non-fitting layout,
i.e. the case when several constructors or types get mixed up (because without any further
safe-guard this would lead to uncoordinated out-of-bounds memory access)
...as part of the rendering process, executed on top of the
low-level-model (Render Node network) as conceived thus far.
Parameter handling could be ''encoded'' into render nodes altogether,
or, at the other extreme, an explicit parameter handling could be specified
as part of the Render Node execution. As both extremes will lead to some
unfavourable consequences, I am aiming at a middle ground: largely, the
''automation computation'' will be encoded and hidden within the network,
implying that this topic remains to be addressed as part of defining
the Builder semantics and implementation. Yet in part the required
processing structure can be foreseen at an abstract level, and thus
the essential primitive operations are specified explicitly as part
of the Render Node definition. Notably the ''standard Weaving Pattern''
will include a ''parameter tuple'' into each `FeedManifold` and require
a binding function, which accepts this tuple as first argument.
Moreover — at implementation level, a library facility must be provided
to support handling of ''arbitrary heterogeneous data values'' embedded
directly into stack frame memory, together with a type-safe compile-time
overlay, which allows the builder to embed specific ''accessor handles''
into functor bindings, even while the actual storage location is not
yet known at that time (obviously, as being located on the stack).
__Note__: a recurring topic is how to return descriptor objects from builder functions; for this purpose, I am adjusting the semantics of `lib/nocopy.hpp` to be more specific...
As follow-up from the preceding refactorings,
it is now possible to drastically simplify several type signatures.
Generally speaking, iterator pipelines can now pass-through the result type,
and thus it is no longer necessary to handle this result type explicitly
In the case of `IterStateWrapper`, the result type parameter was retained,
but moved to the second position and defaulted; sometimes it can be relevant
to force a specific type; this is especially useful when defining an
`iterator` and a `const_iterator` based on the same »state-core«
For sake of completeness, since the `IterExplorer` supports building extended
search- and evaluation patterns, a tuple-zipping adapter can be expected
to handle these extended usages transparently.
While the idea is simple, making this actually happen had several ramifications
and required to introduce additional flexibility within the adaptor-framework
to cope better with those cases were some iterator must return a value, not a ref.
In the end, this could be solved with a bit of metaprogramming based on `std::common_type`
...and indeed, this is all quite nasty stuff — in hindsight, my initial intuition
to shy away from this topic was spot-on....
This involves some quite tricky changes in the way types are composed to form an iterator-pipeline.
Some wrappers are added as adaptors or for additional safety-checks, and to provide a builder-API.
Unfortunately, when building a new `IterExplorer` iterator pipeline from an existing pipeline naively,
composing all those types will add several unecessary intermediary wrapper-layers.
Worse even, the handling of `BaseAdapter` prevents the new tuple-zipping iterator
actually to pass-through any `expandChildren()` call.
These issues are a consequence of using templated types, instead of fixed types with an interface;
we can not just determine if some wrapper is present — unless the wrapper itself ''helps by exposing a tag.''
Even while I must admit that the whole packaging and adaptation machinery of `IterExplorer`
looks dangerously complex already, using dedicated type tags for this single purpose
seems like a tenable soulution.
and yes ... this revealed a **long standing bug**
The `Filter::pullFilter()` invocation in the ctor may produce dangling refs,
whenever an underlying source-iterator generates a reference that points
into the iterator itself.
The reason is: due to the »onion shell« design of the iterator pipeline,
we are bound to move a source iterator into the next layer constructor.
With this minor change, the internal result-tuple may now also hold references,
in case a source iterator exposes a reference (which is in fact the standard case).
Under the right circumstances, source-manipulation through the iterator becomes possible.
Moreover, the optimiser should now be able to elide the result-value tuple in many cases.
and access the iterator internals directly instead.
Obviously this is an advanced and possibly dangerous feature, and only possible
when no additional transformer functions are interspersed; moreover this prompted
a review of some long standing type definitions to more precisely reflect the intention.
Note: most deliberately, the Transformer element in IterExplorer must expose a reference type,
and capture the results into an internal ItemWrapper. This is the only way we can support arbitrary functions.
Indeed the solution worked out yesterday could be extracted and turned generic.
Some in-depth testing is necessary though, and possibly some qualifications to allow pass-through of references...
Moreover, last days I started collecting notes regarding problem solving patterns,
which I tend to use frequently, but which might not be obvious and thus can easily
be forgotten. In fact, I had encountered several cases, where I did invent some
roughly similar solution repeatedly, having forgotten about already settled matters.
Hopefully the habit of collecting notes and hints at a central location serves to remedy
Basically I am sick of writing for-loops in those cases
where the actual iteration is based on one or several data sources,
and I just need some damn index counter. Nothing against for-loops
in general — they have their valid uses — sometimes a for-loop is KISS
But in these typical cases, an iterator-based solution would be a
one-liner, when also exploiting the structured bindings of C++17
''I must admit that I want this for a loooooong time —''
...but always got intimidated again when thinking through the fine points.
Basically it „should be dead simple“ — as they say
Well — — it ''is'' simple, after getting the nasty aspects of tuple binding
and reference data types out of the way. Yesterday, while writing those
`TestFrame` test cases (which are again an example where you want to iterate
over two word sequences simultaneously and just compare them), I noticed that
last year I learned about the `std::apply`-to-fold-expression trick, and
that this solution pattern could be adapted to construct a tuple directly,
thereby circumventing most of the problems related to ''perfect forwarding''
So now we have a new util function `mapEach` (defined in `tuple-helper.hpp`)
and I have learned how to make this application completely generic.
As a second step, I implemented a proof-of-concept in `IterZip_test`,
which indeed was not really challenging, because the `IterExplorer`
is so very sophisticated by now and handles most cases with transparent
type-driven adaptors. A lot of work went into `IterExplorer` over the years,
and this pays off now.
The solution works as follows:
* apply the `lib::explore()` constructor function to the varargs
* package the resulting `IterExplorer` instantiations into a tuple
* build a »state core« implementation which just lifts out the three
iterator primitives onto this ''product type'' (i.e. the tuple)
* wrap it in yet another `IterExplorer`
* add a transformer function on top to extract a value-tuple for each ''yield'
As expected, works out-of-the-box, with all conceivable variants and wild
mixes of iterators, const, pointers, references, you name it....
PS: I changed the rendering of unsigned types in diagnostic output
to use the short notation, e.g. `uint` instead of `unsigned int`.
This dramatically improves the legibility of verification strings.
I changed the rendering of unsigned types in diagnostic output
to use the short notation, e.g. `uint` instead of `unsigned int`.
This dramatically improves the legibility of verification strings.
Moreover, I took the opportunigy to look through the existing page
with codeing style guides to explicitly write down some conventions
formed over years of usage.
I did not just »make up« those light heartedly, rather these conventions
are the result of a craftsman's ''attentive observation and self-reflection.''
* Lumiera source code always was copyrighted by individual contributors
* there is no entity "Lumiera.org" which holds any copyrights
* Lumiera source code is provided under the GPL Version 2+
== Explanations ==
Lumiera as a whole is distributed under Copyleft, GNU General Public License Version 2 or above.
For this to become legally effective, the ''File COPYING in the root directory is sufficient.''
The licensing header in each file is not strictly necessary, yet considered good practice;
attaching a licence notice increases the likeliness that this information is retained
in case someone extracts individual code files. However, it is not by the presence of some
text, that legally binding licensing terms become effective; rather the fact matters that a
given piece of code was provably copyrighted and published under a license. Even reformatting
the code, renaming some variables or deleting parts of the code will not alter this legal
situation, but rather creates a derivative work, which is likewise covered by the GPL!
The most relevant information in the file header is the notice regarding the
time of the first individual copyright claim. By virtue of this initial copyright,
the first author is entitled to choose the terms of licensing. All further
modifications are permitted and covered by the License. The specific wording
or format of the copyright header is not legally relevant, as long as the
intention to publish under the GPL remains clear. The extended wording was
based on a recommendation by the FSF. It can be shortened, because the full terms
of the license are provided alongside the distribution, in the file COPYING.
After augmenting our `lib/random.hpp` abstraction framework to add the necessary flexibility,
a common seeding scheme was ''built into the Test-Runner.''
* all tests relying on some kind of randomness should invoke `seedRand()`
* this draws a seed from the `entropyGen` — which is also documented in the log
* individual tests can now be launched with `--seed` to force a dedicated seed
* moreover, tests should build a coherent structure of linked generators,
especially when running concurrently. The existing tests were adapted accordingly
All usages of `rand()` in the code base were investigated and replaced
by suitable calls to our abstraction framework; the code base is thus
isolated from the actual implementation, simplifying further adaptation.
A deeper investigation revealed that we can show the result of glitches
for each relevant situation, simply by scrutinising the produced distribution.
Even the 64-bit-Variant shows a skewed distribuion, in spite of all numbers
being within definition range.
So the conclusion is: we can expect tilted results, but in many cases
this might not be an issue, if the result range is properly wrapped / clipped.
Notably this is the case if we just want to inject a randomised sleep into a multithreaded test setup
Build a self-contained test case to document these findings.
Further investigation shows that the ''data type used for computation'' plays a crucial role.
The (recommended) 64bit mersenne twister uses the full value range of the working data type,
which on a typical 64bit system is also `uint64_t`. In this case, values corrupted by concurrency
go unnoticed. This can be **verified empirically** : the distribution
of shifts from the theoretical mean value is in the expected low range < 2‰
However, when using the 32bit mersenne engine, the working data type is still uint64_t.
In this case a **significant number of glitches** can be shown empricially.
When drawing 1 Million values, in 80% of all runs at least one glitch and up to 5 glitches
can happen, and the mean values are **significantly skewed**
''In theory,'' the random number generators are in no way threadsafe,
neither the old `rand()`, nor the mersenne twister of the C++ standard.
However, since all we want is some arbitrarily diffused numbers,
chances are that this issue can be safely ignored; because a random
number computation broken by concurrency will most likely generate --
well, a garbled number or "randomly" corrupted internal state.
Validating this reasoning by an empiric investigation seems advisable though.
- SchedulerStress_test simply takes too long to complete (~4 min)
and is thus aborted by the testrunner. Add a switch to allow for
a quick smoke test.
- SchedulerCommutator_test aborts due to an unresolved design problem,
which I marked as failure
- add some convenience methods for passing arguments to tests
We use the memory address to detect reference to ''the same language object.''
While primarily a testing tool, this predicate is also used in the
core application at places, especially to prevent self-assignment
and to handle custom allocations.
It turns out that actually we need two flavours for convenient usage
- `isSameObject` uses strict comparison of address and accepts only references
- `isSameAdr` can also accept pointers and even void*, but will dereference pointers
This leads to some further improvements of helper utilities related to memory addresses...
Problems in `Rational_test` were caused by `#include' reorderings regarding ''rational'' and ''intgral'' numbers.
The actual root cause is the fact that `FSecs` is only a typedef,
which prevents us from providing a string conversion for rational numbers without ambiguity
* most usages are drop-in replacements
* occasionally the other convenience functions can be used
* verify call-paths from core code to identify usages
* ensure reseeding for all tests involving some kind of randomness...
__Note__: some tests were not yet converted,
since their usage of randomness is actually not thread-safe.
This problem existed previously, since also `rand()` is not thread safe,
albeit in most cases it is possible to ignore this problem, as
''garbled internal state'' is also somehow „random“
As it turns out, by far margin we mostly use rand() to generate
test values within a limited interval, using the ''modulo trick''
and thus excluding the upper bound.
Looking into the implementation of the distributions in the
libStdC++ shows that ''constructing'' a distribution on-the-fly
is cheap and boils down to checking and then storing the bounds;
so basically there is no need to keep ''cached distribution objects''
around, because for all practical purposes these behave like free functions
What is required occasionally is a non-zero HashValue, and sometimes
an interval of floating-point number or a normal distribution seem useful.
Providing these as free-standing convenience functions,
implicitly accessing the default PRNG.
* add new option to the commandline option parser
* pass this as std::optional to the test-suite constructor
* use this value optionally to inject a fixed value on re-seeding
* provide diagnostic output to show the actual seed value used
...to the base-class of all tests
* `seedRand()` shall be invoked by every test using randomisation
* it will draw a new seed for the implicit default-PRNG
* it will document this seed value
* but when a seed was given via cmdline, it will inject that instead
* `makeRandGen()` will create a new dedicated generator instance,
attached (by seeding) to the current default-PRNG
It is not clear yet how to pass the actual `SeedNucleus`, which
for obvious reasons must be maintained by the `test::Suite`
Using random or pseudo-random numbers as input for tests
can be a very effective tool to spot unintended behaviour in
corner cases, and also helps writing more principled test verifications.
However, investigating failures in randomised tests can be challenging.
A well-proven solution is to exploit the **determinism** of pseudo-random-numbers
by documenting a randomly generated seed, that can be re-injected for investigation.
Up to now, most tests rely on the old library function `rand()`, while
at some places already the C++ standard framework for random number generation
is used, packaged into a custom wrapper. Adding adequate support for
documented seed values seems to be easy to achieve, after switching
existing usages of `rand()` to a suitable drop-in replacement.
After some consideration, I decided ''against'' wiring random generator instances
explicitly, while allowing to do so on occasion, when necessary. Thus
the planned seeding mechanism will rather re-seed a ''implicit default''
generator, which could then be used to construct explicit generator instances
when required (e.g. for multithreaded tests)
As a starting point, this changeset replaces the `randomise()` API call
by a direct access to the ''reseeding functionality'' exposed by the
C++ framework and all default generators. Since we already provide a
dedicated static instance of the plattform entropy source, re-randomisation
can be achieved by seeding from there.
NOTE: there was extended debate in the net, questioning the viability
of the `std::random_seq` -- these arguments, while valid from a theoretical
point of view, seem rather moot when placed into a practical context,
where even 2^32 different generation-paths(cycles) are more than enough
to provide sufficient diffusion of results (unless the goal is really to
engage into Monte-Carlo simulations for scientific research or large model
simulations).
Notable most of the more catchy reprovals raised by Melissa O'Neill
have been refuted by experts of the field, even while being still propagated
at various places in the net, often combined with promoting PCG-Random.
Originally, this helper was called `IterIndex`, thereby following a
common naming scheme of iteration-related facilities in Lumiera, e.g.
* `IterAdapter`
* `IterExplorer`
* `IterSource`
However, I myself was not able to recall this name, and found myself
now for the second time unable to find this piece of code, even while
still able to recall vaguely that I had written something of this kind.
(and unable to find it by a text search for "index", for obvious reasons)
So, on a second thought, the original name is confusing: we do not create
an index of / for iterators; rather we are iterating an index. So this
is what it should be called...
showing the Node-symbol and a reduced rendering of
either the predecessor or a collection of source nodes.
For this we need functionality to traverse the node graph depth-first
and collect all leaf nodes (which are the source nodes without predecessor);
such can be implemented with the help of the expandAll() functionality
of `lib::IterExplorer`. In addition we need to collect, sort and deduplicate
all the source-node specs; since this is a common requirement, a new
convenience builder was added to `lib::IterExplorer`
...which then also allow to fill in the missing parts for the
default 1:1 wiring scheme, which connects each »input slot«
of the processing function with the corresponding ''lead node''
__Analysis__: what kind of verifications are sensible to employ
to cover building, wiring and invocation of render nodes?
Notably, a test should cover requirements and observable functionality,
while ''avoiding direct hard coupling to implementation internals...''
__Draft__: the most simple node builder invocation conceivable...
* conduct analysis regarding allocator handling in the Builder
* turns out we'll have to keep around two different allocators while building
* ⟹ establish the goal to confine usage of the Node allocator to the lower Levels
* consequently must open up the `lib::SeveralBuilder` to be usable
as an intermediary data structure, while building up the target data
* in the initial design, the `SeveralBuilder` was kept opaque, since
contents can be expected to be re-located frequently and thus exposing
elements and taking references could be dangerous — yet this is also
true for `std::vector` however, so people are assumed to know
when they want to shoot themselves into their own foot
As a replacement for the `RefArray` a new generic container
has been implemented and tested, in interplay with `AllocationCluster`
* the front-end container `lib::Several<I>` exposes only a reference
to the ''interface type'' `I`, while hiding any storage details
* data can only be populated through the `lib::SeveralBuilder`
* a lot of flexibility is allowed for the actual element data types
* element storage is maintained in a storage extent, managed through
a custom allocator (defaulting to `std::allocator` ⟹ heap storage)
The `SeveralBuilder` employs the same tactic as `std::vector`,
by over-allocating a reserve buffer, which grows in exponential
increments, to amortise better the costs of re-allocation.
This tactic does not play well with space limited allocators
like `AllocationCluster` however; it is thus necessary to provide
an extension point where the actuall allocator's limitation can be
queried, allowing to use what is available as reserve, but not more.
With these adaptations, a full usage cycle backed by `AllocationCluster`
can be demonstrated, including variations of dynamic allocation adjustment.
...identified as part of bug investigation
* make clear that reserve() prepares for an absolute capacity
* clarify that, to the contrary, ensureStorageCapaciy() means the delta
Moreover, it turns out that the assertion regarding storage limits
triggers frequently while writing the test code; so we can conclude
that the `AllocationCluster` interface lures into allocating without
previous check. Consequently, this check now throws a runtime exception.
As an aside, the size limitation should be accessible on the interface,
similar to `std::vector::max_size()`
- decided to allow creating empty lib::Several;
no need to be overly rigid in this point,
since it is move-assignable anyway...
- populate with enough elements to provoke several reallocations
with copying over the existing elements
- precisely calculate and verify the expected allocation size
- verify the use-count due to dedicated allocator instances
being embedded into both the builder and hidden in the deleter
- move-assign data
- all checksums go to zero at end
The setup for `ArrayBucket` is special, insofar it shell de-allocate itself,
which creates the danger of re-entrant calls, or to the contrary, the danger
to invoke this clean-up function without actually invoking the destructor.
These problems become relevant once the destructor function itself is statefull,
as is the case when embedding a non-trivial, instance bound allocator
to be used for the clean-up work. Using the new `lib::TrackingAllocator`
highlighted this potential problem, since the allocator maintains a use-count.
Thus I decided to move the »destruction mechanics« one level down into
a dedicated and well encapsulated base class; invoking ArrayBucket's destructor
thereby becomes the only way to trigger the clean-up, and even ElementFactory::destroy()
can now safely check if the destructor was already invoked, and otherwise
re-invoke itself through this embedded destructor function. Moreover,
as an additional safety measure, the actual destructor function is now
moved into the local stack frame of the object's destructor call, removing
any possibility for the de-allocation to interfere with the destructor
invokation itself
part of the observed deviation stems form bugs in logging and checksum calculation;
but there seems to be a real problem hidden in the allocator usage of the
new component, since the use-cnt of the handle does not drop to zero
While there might be the possibility to use the magic of the standard library,
it seems prudent rather to handle this insidious problem explicitly,
to make clear what is going on here.
To allow for such explicit alignment handling, I have now changed the
scheme of the storage definition; the actual buffer now starts ''behind''
the `ArrayBucket<I>` object, which thereby becomes a metadata managing header.
__To summarise the problem__: since we are maintaining a dynamically sized buffer,
and since we do not want to expose the actual element type through the
front-end object, we're necessarily bound to perform a raw-memory allocation.
This is denoted in bytes, and thus the allocator can no longer manage
the proper alignment automatically. Rather, we get a storage buffer with
just ''some accidental'' alignment, and we must care to request a sufficient
overhead to be able to shift the actual storage area forward to the next
proper alignment boundary. Obviously this also implies that we must
store this individual padding adjustment somewhere in the metadata,
in order to be able to report the correct size of the block later
on de-allocation.
The solution implemented thus far turns out to be not sufficient
for ''over-aligned-data'', as the raw-allocator can not perform the
''magic work'' because we're exposing only `std::byte` data.
This adaptor works in concert with the generic allocator
building blocks (prospective ''Concepts'') and automatically
registers a either static or dynamic back-link to the factory
for clean-up.
Use this wrapper fore more in-depth test of the new `TrackingAllocator`
and verify proper behaviour through the `EventLog`