After an extended break due to "real life issues"....
Pick up the investigation, with the goal to ascertain a valid definition
and understanding of all test parameters. A first step is to establish
a baseline ''without using a computational load''; this might be some kind
of base overhead of the scheduler.
However -- the way the test scaffolding was built, it is difficult to
create a feedback loop for the statistical test setup with binary search,
since it is not really clear how the single control parameter of the test algorithm,
the so called "stress factor", shall be interpreted and how it can be
combined with a base load.
An extended series of tests, while watching the observed value patterns qualitatively,
seems to corroborate the former results, indicating that the base expense
in my test setup (using a debug build) is at ~200µs / Node / core.
Yet the difficulty to interpret this result and arrive at a logical and generic model
prevents me from translating this into a measurement scheme, which can
be executed independently from a specific test setup and hardware
...so IterExplorer got yet another processing layer,
which uses the grouping mechanics developed yesterday,
but is freely configurable through λ-Functions.
At actual usage sit in TestChainLoad, now only the actual
aggregation computation must be supplied, and follow-up computations
can now be chained up easily as further transformation layers.
...during development of the Chain-Load, it became clear that we'll often
need a collection of small trees rather than one huge graph. Thus a rule
for pruning nodes and finishing graphs was added. This has the consequence
that there might now be several exit nodes scattered all over the graph;
we still want one single global hash value to verify computations,
thus those exit hashes must now be picked up from the nodes and
combined into a single value.
All existing hash values hard coded into tests must be updated
Introduced as remedy for a long standing sloppiness:
Using a `char[]` together with `reinterpret_cast` in storage management helpers
bears danger of placing objects with wrong alignment; moreover, there are increasing
risks that modern code optimisers miss the ''backdoor access'' and might apply too
aggressive rewritings.
With C++17, there is a standard conformant way to express such a usage scheme.
* `lib::UninitialisedStorage` can now be used in a situation (e.g. as in `ExtentFamily`)
where a complete block of storage is allocated once and then subsequently used
to plant objects one by one
* moreover, I went over the code base and adapted the most relevant usages of
''placement-new into buffer'' to also include the `std::launder()` marker
It seams indicated to verify the generated connectivity
and the hash calculation and recalculation explicitly
at least for one example topology; choosing a topology
comprised of several sub-graphs, to also verify the
propagation of seed values to further start-nodes.
In order to avoid addressing nodes directly by index number,
those sub-graphs can be processed by ''grouping of nodes'';
all parts are congruent because topology is determined by
the node hashes and thus a regular pattern can be exploited.
To allow for easy processing of groups, I have developed a
simplistic grouping device within the IterExplorer framework.
- with the new pruning option, start-Nodes can now be anywhere
- introduce predicates to detect start-Nodes and exit-Nodes
- ensure each new seed node gets the global seed on graph construction
- provide functionality to re-propagate a seed and clear hashes
- provide functionality to recalculate the hashes over the graph
Using the same building blocks, this operation can be generalised even more,
leading to a much cleaner implementation (also with better type deduction).
The feature actually used here, namely summing up all values,
can then be provided as a convenience shortcut, filling in std::plus
as a default reduction operator.
...first used as part of the test harness;
seemingly this is a generic and generally useful shortcut,
similar to algorithm::reduce (or some kind of fold-left operation)
Intended as replacement for the Mutex/ConditionVar based barrier
built into the exiting Lumiera thread handling framework and used
to ensure safe hand-over of a bound functor into the starting new
thread. The standard requires a comparable guarantee for the C++17
concurrency framework, expressed as a "synchronizes_with" assertion
along the lines of the Atomics framework.
While in most cases dedicated synchronisation is thus not required
anymore when swtiching to C++17, some special extended use cases
remain to be addressed, where the complete initialisation of
further support framework must be ensured.
With C++20 this would be easy to achieve with a std::latch, so we
need a simple workaround for the time being. After consideration of
the typical use case, I am aiming at a middle ground in terms of
performance, by using a yield-wait until satisfying the latch condition.
The EventLog seems to provide all the building blocks, but we need
some higher level special matchers (and maybe we also want to hide
some of the basic EventLog matchers). A soulution might be to wrap
the EventMatcher and delegate all follow-up builder calls.
This seems adequate, since the EventLog-Matcher is basically used as black box,
building up more elaborate matchers from the provided basic matchers...
Spent some time again to understand how EventLog matching works.
My feelings towards this piece of code are always the same: it is
somewhat too "tricky", but I am not aware of any other technique
to get this degree of elaborate chained matching on structured records,
short of building a dedicated matching engine from scratch.
The other alternative would be to use a flat textual log (instead of
the structured log records from EventLog), but then we'd have to
generate quite intricate regular expressions from the builder,
and I'm really doubtful it would be easier and clearer....
- fix a bug in IterExplorer: when iterating a »state core« directly,
the helper CoreYield passed the detected type through ValueTypeBindings.
This is logically wrong, because we never want to pick up some typedefs,
rather we always want to use the type directly returned from CORE::yield()
Here the iterator returns an Epoch&, which itself is again iterable
(it inherits from std::array<Activity, N>). However, it is clear
that we must not descent into such a "flatMap" style recursive expansion
- draft a simple scheme how to regulate Epoch lengths dynamically
- add diagnostics to pinpoint a given Activity and find out into which
Epoch it has been allocated; used to cover the allocator behaviour
Especially for the BlockFlow allocator, sanity checks are elided
for performance reasons; yet, generally speaking, it can be a very bad idea
to "optimise" away sanity checks. Thus an additional adaptor is provided
to layer such checks on top of an existing core; and IterEplorer now
always wires in this additional adaptor, and so the original behaviour
is now restored in this respect (and for the largest part of the code base)
While at first sight just a superficial variation of the existing IterStateWrapper,
it became clear with the evolution of the IterExplorer framework that
this setup represents a distinct concept, and especially lends itself
for complex and cohesive collaboration in a layered pipeline. Which
may, or may not be a good idea, depending on the circumstances.
Now, for the implementation of the scheduler memory allocation scheme,
another twist is added to the picture: we can not effort the sanity checks
on each access, even more so when layering / adapting iterators, where
it is essential that the optimiser can remove all unnecessary warts.
Library: add "obvious" utility to the IterExplorer, allowing to
materialise all contents of the Pipeline into a container
...use this to take a snapshot of all currently active Extent addresses
The second design from 2017, based on a pipeline builder,
is now renamed `TreeExplorer` ⟼ `IterExplorer` and uses
the memorable entrance point `lib::explore(<seq>)`
✔
after completing the recent clean-up and refactoring work,
the monad based framework for recursive tree expansion
can be abandoned and retracted.
This approach from functional programming leads to code,
which is ''cool to write'' yet ''hard to understand.''
A second design attempt was based on the pipeline and decorator pattern
and integrates the monadic expansion as a special case, used here to
discover the prerequisites for a render job. This turned out to be
more effective and prolific and became standard for several exploring
and backtracking algorithms in Lumiera.
- decision: the Monad-style iteration framework will be abandoned
- the job-planning will be recast in terms of the iter-tree-explorer
- job-planning and frame dispatch will be disentangled
- the Scheduler will deliberately offer a high-level interface
- on this high-level, Scheduler will support dependency management
- the low-level implementation of the Scheduler will be based on Activity verbs
Considering the fact that we are bound to introduce yet another iteration control function,
because there is literally no other way to cause a refresh within the IterTreeExplorer-Layers,
it is indicated to reconsider the way how IterStateWrapper attaches to the
iteration control API.
As it turns out, we'll never need an ADL-free function here;
and it seems fully adequate to require all "state core" objects to expose
the API as argument less member function. Because these reflect precisely
the contract of a "state core", so why not have them as member functions.
And as a nice extra, the implementation becomes way more concise in
all the cases refactored with this changeset!
Yet still, we stick to the basic design, *not* relying on virtual functions.
So this is a typical example of a Type Class (or "Concept" in C++ terminology)
This can be seen as a side track, but the hope is
by relying on some kind of monadic evaluation pattern, we'll be
able to to reconcile the IterExplorer draft from 2012 with the requirement
to keep the implementation of "tree position" entirely opaque.
The latter is mandatory in the use case here, since we must not intermingle
the algorithm to resolve UI-coordinates in any way with the code actually
navigating and accessing GTK widgets. Thus, we're forced to build some kind
of abstraction barrier, and this turns out to be surprisingly difficult.
after looking into our various iterator tools,
it seems obvious that our filtering iterator implementation
has almost all of the required behaviour; we only need to
add a hook to rewrite and extend the filtering functor,
which can now nicely done with a lambda closure.
This means all memory management, if necessary, is
pushed into std::function and the automated memory
management for closures provided by the runtime.
Since C++ is not a real functional programming language and
has unsafe unmanaged pointers, it is not difficult to produce
dangling references within an extended evaluation pipeline
involving transient objects and pass-by-reference.
In the initial implementation, I built in a safeguard copy
into the signature of the Explorer function, to make sure even
a transiently dressed-up input value gets materialised before
proceeding with the source sequence. Unfortunately this safeguard
turns out as a roadblock now; we might as well take the input
by reference and return an "expanded" state by value. We might
even want to do the full "expansion" on referred state, when
we're able to ensure the source values remain in memory
until consumption.
Thus now the full power of decision is placed on the signature
of the explorer function. The expansion strategies of IterExplorer
will no longer attempt to "sanitise" the signature of the passed-in
function to prevent desaster; I've added some warnings into the
documentation to highlight that danger. Basically, if you want
to be clever, then you're bound to read and understand inticacies
of the implementation.
If in doubt, use values and copying. C++ is optimised for that.
remembered that some years ago I had to deal with a very similar problem
for planning the frame rendering jobs. It turned out, that the
iterator monad developed for this looks promising for our task at hand
...for the very specific situation when we want
to explore an existing data structure, and the
exploration assumes value semantics.
The workaround then is to use pointers as values.
...attempt to build it based on the monadic iterator primitives.
Only problem is: need to find out relation between nodes
after the fact. In the real usage situation, this
is not a problem, since we have a state object
there, which can track the relation as it is established
especially the exploration stack is pushed down
first successful definition of all the JobPlanning classes
just the framework of classes necessary to pass the compiler;
all implementation is still stubbed
brainstorming how to implement the job planning stage
the idea is to built on top of the IterExplorer,
but have the "stack" of re-evaluation integrated
into a custom type, which exploits the static
node network structure to avoid heap allocations
solution idea: again use a builder function?
the template _Fun started as an internal helper
for function-closure, but seems to be of
general use. Thus move it into meta/function.hpp
(function-closure.hpp is heavyweight)
this enables expansion of a (functional) data structure
until exhaustion -- which is what we need to
build job functors by traversing and expanding
an arbitrarily nested job definition structure
the intention is to use this to simplify
generating render jobs based on the elaborated
dependency network of the render nodes. The key
challenge is to overcome the necessity to
store partially done evaluations as
continuation
the tricky part seems to be how to combine the
source iterators into a new monad instance, while
keeping this "Combinator" Strategy configurable
...just passes the compiler, while still lacking
even the generic implementation of joining
together the source iterators
The idea is to avoid building a data structure
for intermediary results, while still being able
to process a variably sized and arbitrary shaped
set of source data