- add preliminary deadline-check (directly instead of using the Activity)
- with this shortcut, now able to implement discarding obsoleted Epochs
- Iteration and use of the underlying `ExtentFamily` is also settled by now
💡 ''Implementation concept for the allocation scheme complete and validated''
...with the preceding IterableDecorator refactoring,
the navigation and access to the storage extents can now be
organised into a clear progression
Allocator::iterator -> EpochIter -> Epoch&
Convenience management and support functions can then be
pushed down into Epoch, while iteration control can be done
high-level in BlockFlow, based on the helpers in Epoch
Especially for the BlockFlow allocator, sanity checks are elided
for performance reasons; yet, generally speaking, it can be a very bad idea
to "optimise" away sanity checks. Thus an additional adaptor is provided
to layer such checks on top of an existing core; and IterEplorer now
always wires in this additional adaptor, and so the original behaviour
is now restored in this respect (and for the largest part of the code base)
While at first sight just a superficial variation of the existing IterStateWrapper,
it became clear with the evolution of the IterExplorer framework that
this setup represents a distinct concept, and especially lends itself
for complex and cohesive collaboration in a layered pipeline. Which
may, or may not be a good idea, depending on the circumstances.
Now, for the implementation of the scheduler memory allocation scheme,
another twist is added to the picture: we can not effort the sanity checks
on each access, even more so when layering / adapting iterators, where
it is essential that the optimiser can remove all unnecessary warts.
..this is the most simple case, where no Epochs are opened yet
..add diagnostics to inspect alloc count and deadlines
..add accessors for the first/last underlying Extent
...continue to proceed test-driven
...scheduler internals turn out to be intricate and cohesive,
and thus the only hope is to adhere to strict testing discipline
Library: add "obvious" utility to the IterExplorer, allowing to
materialise all contents of the Pipeline into a container
...use this to take a snapshot of all currently active Extent addresses
- use a checksum to prove that ctor / dtor of "content" is not invoked
- let the usage of active extents "wrap around" so that the mem block is re-used
- verify that the same data is still there
The low-level allocator is basically implemented now,
but we still need to check thoroughly that the tricky
wrap-around and expansion logic behaves sane...
(see #1311)
Using a Storage* within a wrapper as "pos" will work,
but is borderline trickery, since it amounts to subverting
the idea behind IterAdapter (which is to encapsulate a target
pointer with some control-logic in the managing container).
Using the same storage size and implementation overhead,
it is much more straight-forward to package the complete
iteration logic into a »State Core«, which in this case
however maintains a back-link to the ExtentFamily.
Iteration should just yield an Reference to an Extent,
thereby hiding all details of the actual raw storage (char[]).
This can be achieved by usind a wrapper type around a pointer
into the managing vector; from this pointer we may convert
into a vector::iterator with the trick described here
https://stackoverflow.com/a/37101607/444796
Furthermore, continued planning of the Activity-Language,
basically clarified the complete usage scenario for now;
seems all implementable right away without further difficulties
..with the ability to grow on demand..
..possibly add the new extents in the middle, by first allocating at the end
and then using the std::rotate() algo to bring them to the point
in the middle where new extents are required
- the idea is to use slot-0 in each extent for administrative metadata
- to that end, a specialised GATE-Activity is placed into slot-0
- decision to use the next-pointer for managing the next free slot
- thus we need the help of the underlying ExtentFamily for navigating Extents
Decision to refrain from any attempt to "fix" excessive memory usage,
caused by Epochs still blocked by pending IO operations. Rather, we
assume the engine uses sane parametrisation (possibly with dynamic adjustment)
Yet still there will be some safety limit, but when exceeding this limit,
the allocator will just throw, thereby killing the playback/render process
- decision to favour small memory footprint
- rather use several Activity records to express invocation
- design Activity record as »POD with constructor«
- conceptually, Activity is polymorphic, but on implementation
level, this is "folded down" into union-based data storage,
layering accessor functions on top
- decision how to handle the Extent storage (by forced-cast)
- decision to place the administrative record directly into the Extent
TODO not clear yet how to handle the implicit limitation for future deadlines
using a simple yet performant data structure.
Not clear yet if this approach is sustainable
- assuming that no value initialisation happens for POD payload
- performance trade-off growth when in wrapped-state vs using a list
....still about to find out what kinds of Activities there are,
and what reasonably to implement on layer-2 vs. layer-1
It is clear that the worker will typically invoke a doWork()
operation on layer-2, which in turn will iterate layer-1.
Each worker pulls and performs internal managmenet tasks exclusively
until encountering the next real render task, at which point it will
drop an exclusion flag and then engage into performing the actual
extended work for rendering...
- define a simple record to represent the Activity
- define a handle with an ordering function
- low-level functions to...
+ accept such a handle
+ pick it from the entrace queue
+ pass it for priorisation into the PriQueue
+ dequeue the top priority element
after completing the recent clean-up and refactoring work,
the monad based framework for recursive tree expansion
can be abandoned and retracted.
This approach from functional programming leads to code,
which is ''cool to write'' yet ''hard to understand.''
A second design attempt was based on the pipeline and decorator pattern
and integrates the monadic expansion as a special case, used here to
discover the prerequisites for a render job. This turned out to be
more effective and prolific and became standard for several exploring
and backtracking algorithms in Lumiera.
An extended series of refactoring and partial rewrites resulted
in a new definition of the `Dispatcher` interface and completes
the buildup of a Job-Planning pipeline, including the ability
to discover prerequisites and compute scheduling deadlines.
At this point, I am about to ''switch to the topic'' of the `Scheduler`,
''postponing'' the completion of the `RenderDrive` until the related
questions regarding memory management and Scheduler interface are settled.
- allow to configure the expected job runtime in the test spec
- remove link to EngineConfig and hard-wire the engine latency for now
... extended integration testing reveals two further bugs ;-)
... document deadline calculation
This finishes the last series of refactorings; the basic concept
remains the same, but in the initial version we arranged the expander
function in the pipeline to maintain a Tuple (parent, child) for the
JobTickets. Unfortunately this turned out to be insufficient, since
JobTicket is effectively const and responsible for a complete Sement,
so there is no room to memorise a Deadline for the parent dependency.
This leads to the better idea to link the JobPlanning aggregators
themselves by parent-child references, which is possible since the
whole dependency chain actually sits in the stack embedded into the
Expander (in the pipeline)
This very deep change (which requires almost complete rebuild)
was prompted by the need to process an object (JobPlanning),
which holds several references and is thus move-only, in the
middle of a complex processing pipeline with child expansion.
If this works out well, a long-standing and obnoxious problem
with transforming iterators would be solved, albeit by incurring
a (presumably small) performance overhead, since now the new
value is no longer *assigned*, but rather the existing payload
is destroyed and a new instance is copy/move constructed into
the inline buffer.
The primary purpose (and widely used in Lumieara) is to have a
Lambda create a new Object, which is then returned by value
and thus immediately moved into this inline buffer, where it
resides for further use (as long as the enclosing pipeline
stays alive). Unless such an object does very elaborate
allocations and registrations behind the scene, the
expense of assigning vs creating should be the same.
...in the hope that the Optimiser is able to elide those references entirely,
when (as is here the case) they point into another field of a larger object compound
...as a preparation for solving a logical problem with the Planning-Pipeline;
it can not quite work as intended just by passing down the pair of
current ticket and dependent ticket, since we have to calculate a chained
calculation of job deadlines, leading up to the root ticket for a frame.
My solution idea is to create the JobPlanning earlier in the pipeline,
already *before* the expansion of prerequisites, and rather to integrate
the representation of the dependency relation direcly into JobPlanning
...using hard coded values instead of observation of actual runtimes,
but at least the calculation scheme (now relocated from TimeAnchor to JobPlanning)
should be a reasonable starting point.
TODO: test fails...
- collect list of entities to be picked up from the dispatcher-pipeline
- as it turns out: there is no sensible use for the realTimeDeadline in
in the FrameCoord record ==> remove it
The initial implementation effort for Player and Job-Planning
has been reviewed and largely reworked, and some parts are now
obsoleted by the reworked alternative and can be disabled.
The basic idea will be retained though: JobPlanning is a
data aggregator and performs the final step of creating a Job
- had to fix a logical inconsistency in the underlying Expander implementation
in TreeExplorer: the source-pipeline was pulled in advance on expansion,
in order to "consume" the expanded element immediately; now we retain
this element (actually inaccessible) until all of the immediate
children are consumed; thus the (visible) state of the PipeFrameTick
stays at the frame number corresponding to the top-level frame Job,
while possibly expanding a complete tree of flexible prerequisites
This test now gives a nice visualisation of the interconnected states
in the Job-Planning pipeline. This can be quite complex, yet I still think
that this semi-functional approach with a stateful pipeline and expand functors
is the cleanest way to handle this while encapsulating all details
- fix a bug in the MockDispatcher, when duplicating the ExitNodes.
A vector-ctor with curly braces will be interpreted as std::initializer_list
- add visualisation of the contents appearing at the end of the pipeline
*** something still broken here, increments don't happen as expected
`steam/engine/mock-dispatcher.hpp |cpp` now integrates this
''complete mock setup for render jobs and frame dispatching.''
The exising `DummyJob` has been slightly adapted and renamed
to `MockJob` and is tightly integrated with the other mocks.
The implementation of a `MockDispatcher` necessitated to change
the use of `MockJobTicket`. The initial attempts used a complete
mock implementation, but this approach turned out not to be viable.
Instead — based on the ideas developed for the mock setup —
now the prospective real implementation of `JobTicket` is available
and will be used by the mock setup too. Instead of a synthetic spec,
now a setup of recursively connected `ExitNode`(s) is used; the latter
seems to develop into some kind of Facade for the render node network.
Based on this mock setup, we can now demonstrate the (mostly) complete
Job-Planning pipeline, starting from a segmentation up to render jobs,
and verify proper connectivity and job invocation.
✔
- has to be prepared / supported by the RenderEnvironmentClosure
- actual translation happens when building the Dispatcher-Pipeline
- implementation delegate through
virtual size_t Dispatcher::resolveModelPort (ModelPort)
...ouch this was insidious: the STL implementation for list does not
return a pointer to the element just allocated, but rather retrieves
and dereferences the back() / front() iterator after returning from emplace_back|front()
...which in case of re-entrant allocations is something wildly different
than the initial allocation. Thus a *cheap* and dirty placeholder implementation
just using a STL container is not possible, and we need at least
to code up likewise cheesy placeholder implementation by hand.
- separate allocation and ctor all
- use an inline buffer in the STL container
- explicitly handle ctor failures to discard allocation
- NOT THREADSAFE and likely WASTFUL in terms of performance
==> MockSupport_test now back to GREEN after complete refactoring
The existing implementation of the Player from 2012~2015 inclduded
an additional differentiation by media channel (for multichannel media)
and would build a separate CalcStream for each channel.
The in-depth analysis conducted for the ongoing »Vertical Slice« effort
revealed that this differentiation is besides the point and would never
be materialised: Since -- by definition -- all media processing has
to be done by the engine, also the generation of the final output format
including any channel multiplexing will happen in render nodes.
The only exception would be when only a single channel of multichannel
media is extracted -- yet this case would then translate into a
dedicated ModelPort.
Based on this reasoning, a lot of complexity (and some contradictions)
within the JobTicket implementation can be removed -- together with
some further leftovers of the fist attempt to build JobTickets always
from a Mock specification (we now use construction by the Segment,
based on an ExitNode, which is the expected actual implementation
for production setup)
...by defining a new scheme for access to custom allocators
...and then passing a reference to such an accessor into the
JobTicket ctor, thereby allowing the ticket istelf recursively
to place further JobTicket instances into the allocation space
--> success, test passes (finally)
Up to now, a draft/mock implementation was used, relying on a »spec tuple«,
which was fabricated by MockJobTicket. But with the introduction of
NodeGraphAttachment, the MockSequence now generates a nested ExitNode structure,
and thus the JobTicket will be created through the "real" ctor, and
no longer via MockJobTicket.
Thus it is possible to skip this whole interspersed »spec tuple«,
since ExitNode *is* already this aggregated / abstracted Spec
PROBLEM: can not implement Spec-generation, since
- we must use a λ for internal allocation of JobTickets
- but recursive type inference is not possible
Will thus need to abandon the Spec-Tuple and relocate this
traversal-and-generation code into JobTicket itself
Use another unit-test (FixtureSegment_test) to guide and cover
the transition from the existing fake-implementation to the
actual implementation, where the JobTicket will be generated
on-demand, from a NodeGraphAttachment
It turns out that the real (not mocked) implementation of JobTicket creation
is already required now for this planned (mock)Dispatcher setup;
moreover, this real implementation turns out to be almost identical
to the mock implementation written recently -- just nested structure
of prerequiste JobTickets need to be changed into a similar structur
of ExitNodes
-- as an aside: rearrange various tests to be more in-line
with the envisioned architecture of playback and engine
...this opens up yet another difficult question and a host of new problems
- how are prerequisites detected or arranged by the Builder
- how are prerequisites represented?
- what is an ExitNode in terms of implementation? A subclass of ProcNode?
- how will the actual implementation of JobTicket creation (on-demand) work?
- how to adapt the Mock implementation, while retaining the Specification
for Segments and prerequisites?
..this is now the third attempt, and it seems this one leads to a
clean solution for the type rebinding problem, while also allowing
to unit-test each step in isolation.
The idea is to layer a *templated* builder class on top,
but to slice it away in each step, re-assemble the pipeline
and decorate a new builder instance on top. The net result
is a tightly interconnected processing pipeline without
any spurious interspersed leftovers from the builder,
while all intermediate steps are fully operational
and can thus be unit-tested...
...still very rough edged...
...based on the idea to have a pair(Dependent,Dependency) and to shift these on each level of expansion
PROBLEMS:
- what to use as root level?
- can not handle JobTicket const& in transform-iterator (assignement operator)
...it turns out that we actually do not need to wrap TreeExplorer
on the builder types, because basically there is only a single active
builder type, and the complete processing pipeline can be assembled
in a single terminal function.
The type rebinding problem can thus be solved just by a simple
marker struct, which inherits from a template parameter
...hard to tackle...
The idea is to wrap the TreeExplorer builder, so that our specific
builder functions can delegated to the (inherited) generic builder functions
and would just need to supply some cleverly bound lambdas. However,
resulting types are recursive, which does not play nice with type inference,
and working around that problem leads to capturing a self reference,
which at time of invocation is already invalidated (due to moving the
whole pipeline into the final storage)
...which leads to the next daunting problems:
- we need some mocked ModelPort and DataSink placeholders
- we need a way how to inherit from a partial TreeExplorer pipeline
...introduced in preparation for building the Dispatcher pipeline,
which at its core means to iterate over a sequence of frame positions;
thus we need a way to stop rendering at a predetermined point...
several years ago, it seemed like a good idea to incorporate
the link between nominal time and wall-clock time into a dedicated
anchor point, which also regulates the continued frame planning.
But it turned out that such a design mixes up several concepts
and introduces confusion regarding the meaning of "real time"
- latency can not be reasonably defined for a whole planning chunk
- skipping or sliding due to missed deadlines can not reasonably handled
within such an abstract entity; it must be handled rather at the
level of a playback process
- linking the frame grid generation directly to a planning chunk
undercuts the possible abstraction of a planning pipeline
...which is build a »Job planning pipeline« step by step
in a test setup, and then factor that out as RenderDrive,
to supersede the existing CalcPlanContinuation and get
rid of the Monads this way...
Challenges
- there is a inconsistency with channel usage
- need to establish a way how to transport the output-Sink into the JobFunctor
- need a way to propagate the current frame number to the next planning chunk
The prototypical setup of data structures and test support components
is largely complete by now — with the exception of the `MockDispatcher`,
which will be completed while moving to the next steps pertaining the
setup of a frame dispatch pipeline.
* the existing `DummyJob` was augmented to allow verification of
association between Job and `JobTicket`
* the existing implementation of `JobTicket` was verified and augmented
to allow coverage of the whole usage cycle
* a `MockJobTicket` was implemented on top, which can be generated
from a symbolical test specification (rather than from the real
Fixture data structure)
* a complete `MockSegmentation` was developed, allowing to establish
all the aforementioned data structures without an actual backing
Render Engine. Moreover, `MockSegmentation` can be generated
from the aforementioned symbolic test specification.
* as part of this work, an algorithm to split an existing Segmentation
and to splice in new segments was developed and verified
Last testcase: add deeply nested Prerequisites.
Turns out that the allocator must be able to handle
re-entrant allocations, which std::deque can not fulfil.
Thus using std::list here for the Mock implementation.
In the end, the real allocations will be done by our custom
allocator (AllocationCluster), which can be arranged easily
to support re-entrant allocation calls (since the whole point
is to just place those objects into a pre-allocated large block
and only de-allocate them later in one sway. Thus the allocator
does not need to wait for the object constructor to finish, which
trivially allows for re-entrant calls)
...which uncovers further deeply nested problems,
especially when referring to non-copyable types.
Thus need to construct a common type that can be used
both to refer to the source elements and the expanded elements,
and use this common type as result type and also attempt to
produce better diagnostic messages on type mismatch....
...the improved const correctness on STL iterators uncovered another
latent problem with out diagnositc format helper, which provide
consistently rounded float and double output, but failed to take
CV-qualifiaction into account
This is a subtle and far reaching fix, which hopefully removes
a roadblock regarding a Dispatcher pipeline: Our type rebinding
template used to pick up nested type definitions, especially
'value_type' and 'reference' from iterators and containers,
took an overly simplistic approach, which was then fixed
at various places driven by individual problems.
Now:
- value_type is conceptually the "thing" exposed by the iterator
- and pointers are treated as simple values, and no longer linked
to their pointee type; rather we handle the twist regarding
STL const_iterator direcly (it defines a non const value_type,
which is sensible from the STL point of view, but breaks our
generic iterator wrapping mechanism)
...in an attempt to resolve the deeply nested problems encountered
while building an iterator pipeline for the Dispatcher. It seems
that I was sloppy some years ago and just "bashed them into submission",
thereby mixing up two different meanings of "value_type"
Moreover I seemingly implemented the same helper trait template twice,
so the first step is to switch all usages to meta::TypeBinding
To complete the mock setup, the next step would be to extend the GenNode-based spec langage
to allow defining prerequisite Mock-JobTickets. Setting this up seems rather straight forward --
however, defining a simple testcase to cover this extension runs into surprisingly tricky problems..
- for one, the singleValIterator from Itertools has serious difficulties handling references
- but even more surprising, it seems impossible to make the "prerequisites iterator"
fit into the Tree-Explorer framework (which I intend to use as replacement
for the monadic approach)
after some extended analysis of generic types and template instances,
it seems that not TreeExplorer as such is the primary problem, but rather
there is a conceptual mismatch somewhere deep down in Itertools or Iter-Adapter
By reasoning and analysis I conclude that the differentiation into
multiple channels is likely misplaced in JobTicket; it belongs ratther
into the Segment and should provide a suitable JobTicket for each ModelPort
Handling of prerequisites also needs to be reshaped entirely after
switching to a pipeline builder for the Job-planning pipeline; as
preliminary access point, just add an iterator over the immediate
prerequisites, thereby shifting the exploration mechanism entirely
out of the JobTicket implementation
Testcase: A simple Sementation with a single and bounded Segment
As aside, figured out how to unpack an iterator such as to
tie a fixed number of references through a structural binding:
auto const& [s1,s2,s3] = seqTuple<3> (mockSegs.eachSeg());
...now able to build a mock segmentation which issues dummy jobs,
and is wired such as to verify the right job is invoked for each segment.
And this allows to build and verify the Dispatcher,
without being able to invoke actual render jobs yet.
...this is something I should have done since YEARS, really...
Whenever working with symbolically represented data, tests
typically involve checking *hundreds* of expected results,
and thus it can be really hard to find out where the
failure actually happens; it is better for readability
to have the expected result string immediately in the
test code; now this expected result can be marked
with a user-defined literal, and then on mismatch
the expected and the real value will be printed.
The algorithm coded thus far turns out to be rather generic,
and thus it can be rewritten into a template, while all specific parts
are supplied as λ-Functors.
- instead of Time we use a generic ordering type
- the Iterator is likewise turned into a template parameter
- all the operations are directly supplied as functor types
- C++17 is able to pick up all those λ-Types from the ctor call
This change looks like "low hanging fruit"; the legibility of the code
is not seriously hampered, yet we get the benefit to test this rather
technical piece of logic by an isolated test (which for now is the
primary motivation), and we can hope to re-use it for similar tasks.
There are 12 distinct cases regarding the orientation of two intervals;
The Segmentation::splitSplice() operation shall insert a new Segment
and adjust / truncate / expand / split / delete existing segments
such as to retain the *Invariant* (seamless segmentation covering
the complete time axis)
- how to pass-in a specification given as GenNode
- now this might be translated into a MockJobTicket allocated in the MockSegmentation
Unimplemented: actually build the Segment with suitable start/end time
right now we're lacking a complete working implementation of render node invocation,
and thus the Dispatcher implementation can only be verified with the help
of mocked jobs. However, at least a preliminary implementation of tagging the
invocation instance is available, and thus we're able to verify that
a given job instance indeed belongs to and is "backed" by a specific JobTicket.
This is prerequisite for building up a (likewise mocked) Fixture datastructure,
and this in turn was meant to form the basis for attacking an actual Scheduler
implementation, followed by a real render node invocation.
- can now create a Job from JobTicket::NIL
- on invocation this Job will to nothing
Only when the first real output backend is implemented,
we can decide if this simplistic implementation is enough,
or if an empty output must be explicitly generated...
* using a simplified preliminary implementation of hash chaining (see #1293)
* simplistic implementation of hashing for time values (half-rotation)
* for now just hashing the time into the upper part of the LUID
Maybe we can even live with that implementation for some time,
depending on how important uniform distribution of hash values is
for proper usage of the frame cache.
Needless to say, various further fine points need more consideration,
especially questions of portability (32bit anyone?). Moreover, since
frame times are typically quantised, the search space for the hashed
time values is drastically reduced; conceivably we should rather
research and implement a good hash function for 128bit and then combine
all information into a single hash key....
...using the MockJobTicket setup as point of reference,
since the actual invocation of render nodes will only be drafted
later in this "Vertical Slice" integration effort...
- introduce a JobTicket::NOP (null-object pattern)
- assuming that the function splitSplice() will retain complete coverage allways
Remark:
`Fixture::getPlaylistForRender()` is a leftover from the very early implementation drafts.
This function was more or less based on the way Cinelerra works; it is clear by now
that Lumiera can not possibly work this way, given that we'll build a low-level model
and dispatch precompiled render jobs....
The idea is to escape a "design deadlock" by using a test-driven prototype
implementation of the data structure to back a further development
of the Dispatcher and Scheduler implementation, which then can be used
to gradually elaborate and switch over to an actual implementation
data structure
...requires a first attempt towards defining a `JobTiket`.
This turns out quite tricky, due to using those `LinkedElements`
(intrusive single linked list), which requires all added records
actually to live elsewhere. Since we want to use a custom allocator
later (the `AllocationCluster`), this boils down to allocating those
records only when about to construct the `JobTicket` itself.
What makes matters even worse: at the moment we use a separate spec
per Media channel (maybe these specs can be collapsed later non).
And thus we need to pass a collection -- or better an iterator
with raw specs, which in turn must reveal yet another nested
sequence for the prerequisite `JobTickets`.
Anyhow, now we're able at least to create an empty `JobTicket`,
backed by a dummy `JobFunctor`....