Library: add "obvious" utility to the IterExplorer, allowing to
materialise all contents of the Pipeline into a container
...use this to take a snapshot of all currently active Extent addresses
- use a checksum to prove that ctor / dtor of "content" is not invoked
- let the usage of active extents "wrap around" so that the mem block is re-used
- verify that the same data is still there
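The materialising step can be pictured with a minimal sketch. It assumes an iterator that can be checked for exhaustion through bool conversion, dereferenced and incremented; the names `materialise` and `CountDown` are hypothetical stand-ins and not the actual IterExplorer API.

```cpp
#include <cassert>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for a pipeline: an iterator that can be
// tested for exhaustion, dereferenced and advanced.
struct CountDown
{
    int current;

    explicit operator bool() const { return current > 0; }
    int operator*() const          { return current; }
    CountDown& operator++()        { --current; return *this; }
};

// Pull every element out of the iterator and store a copy in a vector.
// The iterator is taken by value, so the caller's pipeline is not consumed.
template<class IT>
auto materialise (IT iter)
{
    using Val = std::decay_t<decltype(*iter)>;
    std::vector<Val> snapshot;
    for ( ; iter; ++iter)
        snapshot.push_back (*iter);
    return snapshot;
}
```

Calling `materialise (CountDown{3})` would yield a vector holding 3, 2, 1, which can then be compared against a later snapshot.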
The low-level allocator is basically implemented now,
but we still need to check thoroughly that the tricky
wrap-around and expansion logic behaves sanely...
(see #1311)
Iteration should just yield a reference to an Extent,
thereby hiding all details of the actual raw storage (char[]).
This can be achieved by using a wrapper type around a pointer
into the managing vector; from this pointer we may convert
into a vector::iterator with the trick described here
https://stackoverflow.com/a/37101607/444796
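One common way to get from an element pointer back to a vector iterator is plain pointer arithmetic relative to `data()`; whether this matches the linked answer in every detail is an assumption. The helper name `iteratorAt` is hypothetical.

```cpp
#include <cassert>
#include <vector>

// Convert a pointer to an element stored in the vector into an iterator.
// Precondition: elm points into vec (or one past its last element).
template<class T>
typename std::vector<T>::iterator
iteratorAt (std::vector<T>& vec, T* elm)
{
    assert (vec.data() <= elm && elm <= vec.data() + vec.size());
    return vec.begin() + (elm - vec.data());
}
```

This relies on the guarantee that vector storage is contiguous, so the offset of the pointer from `data()` equals the offset of the iterator from `begin()`.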
Furthermore, continued planning of the Activity-Language,
basically clarified the complete usage scenario for now;
seems all implementable right away without further difficulties
- the idea is to use slot-0 in each extent for administrative metadata
- to that end, a specialised GATE-Activity is placed into slot-0
- decision to use the next-pointer for managing the next free slot
- thus we need the help of the underlying ExtentFamily for navigating Extents
Decision to refrain from any attempt to "fix" excessive memory usage,
caused by Epochs still blocked by pending IO operations. Rather, we
assume the engine uses sane parametrisation (possibly with dynamic adjustment)
Yet still there will be some safety limit, but when exceeding this limit,
the allocator will just throw, thereby killing the playback/render process
- decision to favour small memory footprint
- rather use several Activity records to express invocation
- design Activity record as »POD with constructor«
- conceptually, Activity is polymorphic, but on implementation
level, this is "folded down" into union-based data storage,
layering accessor functions on top
- decision how to handle the Extent storage (by forced-cast)
- decision to place the administrative record directly into the Extent
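A rough sketch of what a »POD with constructor« with union-based storage and accessor functions could look like. The verbs, field names and accessors are invented for illustration and are not the actual Activity definition.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

struct Activity
{
    enum Verb : std::uint8_t { INVOKE, GATE };

    struct Invocation { void (*task)(void*); void* arg; };
    struct Condition  { int pendingInputs; std::int64_t deadline; };

    Verb      verb;
    Activity* next;        // chains several records into one invocation
    union {
        Invocation invocation;
        Condition  condition;
    } data;

    explicit Activity (Verb v, Activity* n = nullptr)
        : verb{v}
        , next{n}
    {
        // activate the union member matching the verb
        if (v == GATE) data.condition  = Condition{0, 0};
        else           data.invocation = Invocation{nullptr, nullptr};
    }

    // accessor functions layered on top of the raw storage
    Condition&  asGate()       { assert (verb == GATE);   return data.condition; }
    Invocation& asInvocation() { assert (verb == INVOKE); return data.invocation; }
};

// The record stays trivially copyable and destructible, so it can be placed
// into raw Extent storage without running destructors.
static_assert (std::is_trivially_copyable_v<Activity>);
static_assert (std::is_trivially_destructible_v<Activity>);
static_assert (std::is_standard_layout_v<Activity>);
```

The polymorphism lives only in the `verb` tag and the accessors; the memory footprint is that of the largest union member plus tag and link.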
TODO not clear yet how to handle the implicit limitation for future deadlines
using a simple yet performant data structure.
Not clear yet if this approach is sustainable
- assuming that no value initialisation happens for POD payload
- performance trade-off growth when in wrapped-state vs using a list
The second design from 2017, based on a pipeline builder,
is now renamed `TreeExplorer` ⟼ `IterExplorer` and uses
the memorable entrance point `lib::explore(<seq>)`
✔
after completing the recent clean-up and refactoring work,
the monad based framework for recursive tree expansion
can be abandoned and retracted.
This approach from functional programming leads to code,
which is ''cool to write'' yet ''hard to understand.''
A second design attempt was based on the pipeline and decorator pattern
and integrates the monadic expansion as a special case, used here to
discover the prerequisites for a render job. This turned out to be
more effective and prolific and became standard for several exploring
and backtracking algorithms in Lumiera.
An extended series of refactoring and partial rewrites resulted
in a new definition of the `Dispatcher` interface and completes
the buildup of a Job-Planning pipeline, including the ability
to discover prerequisites and compute scheduling deadlines.
At this point, I am about to ''switch to the topic'' of the `Scheduler`,
''postponing'' the completion of the `RenderDrive` until the related
questions regarding memory management and Scheduler interface are settled.
- allow to configure the expected job runtime in the test spec
- remove link to EngineConfig and hard-wire the engine latency for now
... extended integration testing reveals two further bugs ;-)
... document deadline calculation
This finishes the last series of refactorings; the basic concept
remains the same, but in the initial version we arranged the expander
function in the pipeline to maintain a Tuple (parent, child) for the
JobTickets. Unfortunately this turned out to be insufficient, since
JobTicket is effectively const and responsible for a complete Segment,
so there is no room to memorise a Deadline for the parent dependency.
This leads to the better idea to link the JobPlanning aggregators
themselves by parent-child references, which is possible since the
whole dependency chain actually sits in the stack embedded into the
Expander (in the pipeline)
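The parent-child linking can be illustrated with a small sketch. The struct name, the fixed latency and the rule "a prerequisite must be done before its parent has to start" are assumptions made for this example, not the actual JobPlanning code.

```cpp
#include <cassert>
#include <cstdint>

using Micros = std::int64_t;

// assumed hard-wired engine latency, purely illustrative
constexpr Micros ENGINE_LATENCY = 10;

struct PlanningStep
{
    PlanningStep const* parent;    // job depending on our result; nullptr for the frame's root job
    Micros frameDeadline;          // only used by the root job
    Micros expectedRuntime;

    // Walk up the chain: a prerequisite has to be finished early enough
    // for its parent to run and still meet the parent's own deadline.
    Micros deadline() const
    {
        if (!parent)
            return frameDeadline;
        return parent->deadline() - parent->expectedRuntime - ENGINE_LATENCY;
    }
};
```

Since each step only holds a pointer to its parent, the records can sit in the expansion stack and no deadline has to be stored in the (const) JobTicket.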
...in the hope that the Optimiser is able to elide those references entirely,
when (as is here the case) they point into another field of a larger object compound
...as a preparation for solving a logical problem with the Planning-Pipeline;
it can not quite work as intended just by passing down the pair of
current ticket and dependent ticket, since we have to perform a chained
calculation of job deadlines, leading up to the root ticket for a frame.
My solution idea is to create the JobPlanning earlier in the pipeline,
already *before* the expansion of prerequisites, and rather to integrate
the representation of the dependency relation directly into JobPlanning
...using hard coded values instead of observation of actual runtimes,
but at least the calculation scheme (now relocated from TimeAnchor to JobPlanning)
should be a reasonable starting point.
TODO: test fails...
The initial implementation effort for Player and Job-Planning
has been reviewed and largely reworked, and some parts are now
obsoleted by the reworked alternative and can be disabled.
The basic idea will be retained though: JobPlanning is a
data aggregator and performs the final step of creating a Job
- had to fix a logical inconsistency in the underlying Expander implementation
in TreeExplorer: the source-pipeline was pulled in advance on expansion,
in order to "consume" the expanded element immediately; now we retain
this element (actually inaccessible) until all of the immediate
children are consumed; thus the (visible) state of the PipeFrameTick
stays at the frame number corresponding to the top-level frame Job,
while possibly expanding a complete tree of flexible prerequisites
This test now gives a nice visualisation of the interconnected states
in the Job-Planning pipeline. This can be quite complex, yet I still think
that this semi-functional approach with a stateful pipeline and expand functors
is the cleanest way to handle this while encapsulating all details
- fix a bug in the MockDispatcher, when duplicating the ExitNodes.
A vector-ctor with curly braces will be interpreted as std::initializer_list
- add visualisation of the contents appearing at the end of the pipeline
*** something still broken here, increments don't happen as expected
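The curly-brace pitfall mentioned for the MockDispatcher can be reproduced with a plain `std::vector<int>`; the two helper functions are only there for illustration.

```cpp
#include <cassert>
#include <vector>

// Parentheses select the (count, value) constructor: three elements, each 7.
inline std::vector<int> withParentheses()
{
    return std::vector<int>(3, 7);
}

// Braces prefer the std::initializer_list constructor: two elements, 3 and 7.
inline std::vector<int> withBraces()
{
    return std::vector<int>{3, 7};
}
```

So when the intention is to duplicate an element N times, the parenthesised form is the one to use.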
`steam/engine/mock-dispatcher.hpp |cpp` now integrates this
''complete mock setup for render jobs and frame dispatching.''
The existing `DummyJob` has been slightly adapted and renamed
to `MockJob` and is tightly integrated with the other mocks.
The implementation of a `MockDispatcher` necessitated to change
the use of `MockJobTicket`. The initial attempts used a complete
mock implementation, but this approach turned out not to be viable.
Instead — based on the ideas developed for the mock setup —
now the prospective real implementation of `JobTicket` is available
and will be used by the mock setup too. Instead of a synthetic spec,
now a setup of recursively connected `ExitNode`(s) is used; the latter
seems to develop into some kind of Facade for the render node network.
Based on this mock setup, we can now demonstrate the (mostly) complete
Job-Planning pipeline, starting from a segmentation up to render jobs,
and verify proper connectivity and job invocation.
✔
- has to be prepared / supported by the RenderEnvironmentClosure
- actual translation happens when building the Dispatcher-Pipeline
- implementation delegate through
virtual size_t Dispatcher::resolveModelPort (ModelPort)
...ouch this was insidious: the STL implementation for list does not
return a pointer to the element just allocated, but rather retrieves
and dereferences the back() / front() iterator after returning from emplace_back|front()
...which in case of re-entrant allocations is something wildly different
than the initial allocation. Thus a *cheap* and dirty placeholder implementation
just using a STL container is not possible, and we need at least
to code up a likewise cheesy placeholder implementation by hand.
- separate allocation and ctor call
- use an inline buffer in the STL container
- explicitly handle ctor failures to discard allocation
- NOT THREADSAFE and likely WASTEFUL in terms of performance
==> MockSupport_test now back to GREEN after complete refactoring
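A minimal sketch of such a hand-coded placeholder, following the points listed above: storage is reserved first, the object is then constructed in place, and a failed constructor discards the slot. `SlotAllocator` and `Ticket` are hypothetical names; like the original placeholder, this is neither thread-safe nor efficient.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <list>
#include <new>
#include <stdexcept>
#include <utility>

template<class T>
class SlotAllocator
{
    struct Slot
    {
        alignas(T) unsigned char buffer[sizeof(T)];   // inline buffer
        T* object = nullptr;                          // set once the ctor succeeded
    };
    std::list<Slot> slots_;

public:
    SlotAllocator() = default;
    SlotAllocator (SlotAllocator const&) = delete;
    SlotAllocator& operator= (SlotAllocator const&) = delete;

    ~SlotAllocator()
    {   // destroy in reverse order of allocation
        for (auto slot = slots_.rbegin(); slot != slots_.rend(); ++slot)
            if (slot->object)
                slot->object->~T();
    }

    template<typename... ARGS>
    T& create (ARGS&&... args)
    {
        slots_.emplace_back();                 // step 1: raw storage only
        auto pos = std::prev (slots_.end());   // remember *this* slot
        try {                                  // step 2: ctor, may call create() again
            pos->object = ::new (pos->buffer) T (std::forward<ARGS>(args)...);
            return *pos->object;
        }
        catch (...) {
            slots_.erase (pos);                // discard allocation on ctor failure
            throw;
        }
    }

    std::size_t size() const { return slots_.size(); }
};

// Demo payload: each ticket allocates its prerequisite from within its own ctor.
struct Ticket
{
    int depth;
    Ticket* prerequisite;

    Ticket (SlotAllocator<Ticket>& alloc, int levels, bool failAtLeaf = false)
        : depth{levels}
        , prerequisite{levels > 0 ? &alloc.create (alloc, levels-1, failAtLeaf) : nullptr}
    {
        if (failAtLeaf && levels == 0)
            throw std::runtime_error{"simulated constructor failure"};
    }
};
```

The slot position is captured before the constructor runs, so nested allocations appended in the meantime do not change which element is handed back to the caller.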
The existing implementation of the Player from 2012~2015 included
an additional differentiation by media channel (for multichannel media)
and would build a separate CalcStream for each channel.
The in-depth analysis conducted for the ongoing »Vertical Slice« effort
revealed that this differentiation is beside the point and would never
be materialised: Since -- by definition -- all media processing has
to be done by the engine, also the generation of the final output format
including any channel multiplexing will happen in render nodes.
The only exception would be when only a single channel of multichannel
media is extracted -- yet this case would then translate into a
dedicated ModelPort.
Based on this reasoning, a lot of complexity (and some contradictions)
within the JobTicket implementation can be removed -- together with
some further leftovers of the first attempt to build JobTickets always
from a Mock specification (we now use construction by the Segment,
based on an ExitNode, which is the expected actual implementation
for production setup)
...by defining a new scheme for access to custom allocators
...and then passing a reference to such an accessor into the
JobTicket ctor, thereby allowing the ticket itself recursively
to place further JobTicket instances into the allocation space
--> success, test passes (finally)
Up to now, a draft/mock implementation was used, relying on a »spec tuple«,
which was fabricated by MockJobTicket. But with the introduction of
NodeGraphAttachment, the MockSequence now generates a nested ExitNode structure,
and thus the JobTicket will be created through the "real" ctor, and
no longer via MockJobTicket.
Thus it is possible to skip this whole interspersed »spec tuple«,
since ExitNode *is* already this aggregated / abstracted Spec
PROBLEM: can not implement Spec-generation, since
- we must use a λ for internal allocation of JobTickets
- but recursive type inference is not possible
Will thus need to abandon the Spec-Tuple and relocate this
traversal-and-generation code into JobTicket itself
Use another unit-test (FixtureSegment_test) to guide and cover
the transition from the existing fake-implementation to the
actual implementation, where the JobTicket will be generated
on-demand, from a NodeGraphAttachment
It turns out that the real (not mocked) implementation of JobTicket creation
is already required now for this planned (mock)Dispatcher setup;
moreover, this real implementation turns out to be almost identical
to the mock implementation written recently -- just the nested structure
of prerequisite JobTickets needs to be changed into a similar structure
of ExitNodes
-- as an aside: rearrange various tests to be more in-line
with the envisioned architecture of playback and engine
...this opens up yet another difficult question and a host of new problems
- how are prerequisites detected or arranged by the Builder
- how are prerequisites represented?
- what is an ExitNode in terms of implementation? A subclass of ProcNode?
- how will the actual implementation of JobTicket creation (on-demand) work?
- how to adapt the Mock implementation, while retaining the Specification
for Segments and prerequisites?
...it turns out that we actually do not need to wrap TreeExplorer
on the builder types, because basically there is only a single active
builder type, and the complete processing pipeline can be assembled
in a single terminal function.
The type rebinding problem can thus be solved just by a simple
marker struct, which inherits from a template parameter
...hard to tackle...
The idea is to wrap the TreeExplorer builder, so that our specific
builder functions can delegate to the (inherited) generic builder functions
and would just need to supply some cleverly bound lambdas. However,
resulting types are recursive, which does not play nice with type inference,
and working around that problem leads to capturing a self reference,
which at time of invocation is already invalidated (due to moving the
whole pipeline into the final storage)
...which leads to the next daunting problems:
- we need some mocked ModelPort and DataSink placeholders
- we need a way how to inherit from a partial TreeExplorer pipeline
...introduced in preparation for building the Dispatcher pipeline,
which at its core means to iterate over a sequence of frame positions;
thus we need a way to stop rendering at a predetermined point...
several years ago, it seemed like a good idea to incorporate
the link between nominal time and wall-clock time into a dedicated
anchor point, which also regulates the continued frame planning.
But it turned out that such a design mixes up several concepts
and introduces confusion regarding the meaning of "real time"
- latency can not be reasonably defined for a whole planning chunk
- skipping or sliding due to missed deadlines can not reasonably be handled
within such an abstract entity; it must be handled rather at the
level of a playback process
- linking the frame grid generation directly to a planning chunk
undercuts the possible abstraction of a planning pipeline
...which is to build a »Job planning pipeline« step by step
in a test setup, and then factor that out as RenderDrive,
to supersede the existing CalcPlanContinuation and get
rid of the Monads this way...
Challenges
- there is an inconsistency with channel usage
- need to establish a way how to transport the output-Sink into the JobFunctor
- need a way to propagate the current frame number to the next planning chunk
The prototypical setup of data structures and test support components
is largely complete by now — with the exception of the `MockDispatcher`,
which will be completed while moving to the next steps pertaining to the
setup of a frame dispatch pipeline.
* the existing `DummyJob` was augmented to allow verification of
association between Job and `JobTicket`
* the existing implementation of `JobTicket` was verified and augmented
to allow coverage of the whole usage cycle
* a `MockJobTicket` was implemented on top, which can be generated
from a symbolical test specification (rather than from the real
Fixture data structure)
* a complete `MockSegmentation` was developed, allowing to establish
all the aforementioned data structures without an actual backing
Render Engine. Moreover, `MockSegmentation` can be generated
from the aforementioned symbolic test specification.
* as part of this work, an algorithm to split an existing Segmentation
and to splice in new segments was developed and verified
Last testcase: add deeply nested Prerequisites.
Turns out that the allocator must be able to handle
re-entrant allocations, which std::deque can not fulfil.
Thus using std::list here for the Mock implementation.
In the end, the real allocations will be done by our custom
allocator (AllocationCluster), which can be arranged easily
to support re-entrant allocation calls (since the whole point
is to just place those objects into a pre-allocated large block
and only de-allocate them later in one sweep. Thus the allocator
does not need to wait for the object constructor to finish, which
trivially allows for re-entrant calls)
...which uncovers further deeply nested problems,
especially when referring to non-copyable types.
Thus need to construct a common type that can be used
both to refer to the source elements and the expanded elements,
and use this common type as result type and also attempt to
produce better diagnostic messages on type mismatch....
...the improved const correctness on STL iterators uncovered another
latent problem with our diagnostic format helper, which provides
consistently rounded float and double output, but failed to take
CV-qualification into account
This is a subtle and far reaching fix, which hopefully removes
a roadblock regarding a Dispatcher pipeline: Our type rebinding
template used to pick up nested type definitions, especially
'value_type' and 'reference' from iterators and containers,
took an overly simplistic approach, which was then fixed
at various places driven by individual problems.
Now:
- value_type is conceptually the "thing" exposed by the iterator
- and pointers are treated as simple values, and no longer linked
to their pointee type; rather we handle the twist regarding
STL const_iterator directly (it defines a non const value_type,
which is sensible from the STL point of view, but breaks our
generic iterator wrapping mechanism)
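The twist regarding STL const_iterator can be checked directly with a few static assertions. The helper `yieldsConst` is a hypothetical illustration of deriving constness from the reference type; it is not the actual type rebinding template.

```cpp
#include <iterator>
#include <type_traits>
#include <vector>

using ConstIter = std::vector<int>::const_iterator;

// The STL reports a non-const value_type even for a const_iterator...
static_assert (std::is_same_v<std::iterator_traits<ConstIter>::value_type, int>);

// ...and only the reference type reveals the constness.
static_assert (std::is_same_v<std::iterator_traits<ConstIter>::reference, int const&>);

// Hypothetical helper: derive constness from what dereferencing actually yields.
template<class IT>
constexpr bool yieldsConst =
    std::is_const_v<std::remove_reference_t<typename std::iterator_traits<IT>::reference>>;
```

A generic wrapper that builds its own reference type from `value_type` alone would thus silently drop the const, which is the kind of breakage described above.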
To complete the mock setup, the next step would be to extend the GenNode-based spec language
to allow defining prerequisite Mock-JobTickets. Setting this up seems rather straight forward --
however, defining a simple testcase to cover this extension runs into surprisingly tricky problems..
- for one, the singleValIterator from Itertools has serious difficulties handling references
- but even more surprising, it seems impossible to make the "prerequisites iterator"
fit into the Tree-Explorer framework (which I intend to use as replacement
for the monadic approach)
after some extended analysis of generic types and template instances,
it seems that not TreeExplorer as such is the primary problem, but rather
there is a conceptual mismatch somewhere deep down in Itertools or Iter-Adapter
By reasoning and analysis I conclude that the differentiation into
multiple channels is likely misplaced in JobTicket; it belongs rather
into the Segment and should provide a suitable JobTicket for each ModelPort
Handling of prerequisites also needs to be reshaped entirely after
switching to a pipeline builder for the Job-planning pipeline; as
preliminary access point, just add an iterator over the immediate
prerequisites, thereby shifting the exploration mechanism entirely
out of the JobTicket implementation
Testcase: A simple Segmentation with a single and bounded Segment
As aside, figured out how to unpack an iterator such as to
tie a fixed number of references through a structural binding:
auto const& [s1,s2,s3] = seqTuple<3> (mockSegs.eachSeg());
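How such an unpacking helper might work can be sketched as follows. This is a re-creation under assumptions: it takes an STL container instead of a Lumiera iterator, and the names `seqTupleSketch`, `pullRefs`, `Repeat` and `demoSum` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <tuple>
#include <utility>
#include <vector>

// Helper alias: repeat type T once for every index in a pack.
template<std::size_t, class T>
using Repeat = T;

template<class IT, std::size_t... I>
auto pullRefs (IT pos, IT end, std::index_sequence<I...>)
{
    using Ref = decltype(*pos);
    auto next = [&](std::size_t) -> Ref
                {
                    if (pos == end)
                        throw std::out_of_range{"sequence too short"};
                    return *pos++;
                };
    // braced initialisation evaluates next(I)... strictly left to right
    return std::tuple<Repeat<I,Ref>...>{next(I)...};
}

// Bind the first N elements of a sequence into a tuple of references.
template<std::size_t N, class SEQ>
auto seqTupleSketch (SEQ& seq)
{
    return pullRefs (seq.begin(), seq.end(), std::make_index_sequence<N>{});
}

// Usage: the bound names refer to the original elements, not to copies.
inline int demoSum (std::vector<int>& segs)
{
    auto const& [s1,s2,s3] = seqTupleSketch<3> (segs);
    s1 += 10;
    return s1 + s2 + s3;
}
```

The tuple holds references, so the `const&` on the structured binding only extends the lifetime of the temporary tuple and does not make the elements const.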
...now able to build a mock segmentation which issues dummy jobs,
and is wired such as to verify the right job is invoked for each segment.
And this allows to build and verify the Dispatcher,
without being able to invoke actual render jobs yet.
- only the parts actually touched by the algo will be re-allocated
- when a segment is split, the clone copies carry on all data
Library: add function to check for a bare address (without type info)
...this is something I should have done since YEARS, really...
Whenever working with symbolically represented data, tests
typically involve checking *hundreds* of expected results,
and thus it can be really hard to find out where the
failure actually happens; it is better for readability
to have the expected result string immediately in the
test code; now this expected result can be marked
with a user-defined literal, and then on mismatch
the expected and the real value will be printed.
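The idea of marking the expectation with a user-defined literal can be sketched in a few lines. The suffix `_expect`, the type `ExpectString` and the output format are assumptions for this example and may differ from the actual test helper.

```cpp
#include <cassert>
#include <cstddef>
#include <iostream>
#include <string>

// Marker type: a string tagged as "the expected result".
struct ExpectString
{
    std::string expected;
};

inline ExpectString operator""_expect (const char* text, std::size_t len)
{
    return ExpectString{std::string (text, len)};
}

// Comparing against a marked expectation prints both values on mismatch.
inline bool operator== (std::string const& actual, ExpectString const& marked)
{
    if (actual == marked.expected)
        return true;
    std::cerr << "mismatch\n  expected: " << marked.expected
              << "\n  actual  : " << actual << '\n';
    return false;
}
```

A test line then reads like `CHECK (render(seg) == "Seg(0..5)"_expect);`, where `CHECK` and `render` stand for whatever assertion macro and string rendering the test uses.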
There are 12 distinct cases regarding the orientation of two intervals;
The Segmentation::splitSplice() operation shall insert a new Segment
and adjust / truncate / expand / split / delete existing segments
such as to retain the *Invariant* (seamless segmentation covering
the complete time axis)
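The invariant can be illustrated with a much simplified stand-in, which is not the Segmentation::splitSplice() code: segments are represented only by their start point in a `std::map`, each one extending up to the next key, so truncating, splitting and deleting all reduce to erasing and inserting boundaries. All names here are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <iterator>
#include <limits>
#include <map>
#include <string>
#include <utility>

using Time = std::int64_t;
using Segments = std::map<Time, std::string>;   // start point -> label

// A timeline consisting of one segment covering the complete time axis.
inline Segments emptyTimeline()
{
    return Segments{{std::numeric_limits<Time>::min(), "empty"}};
}

// Label of the segment covering time point t.
inline std::string const& segmentAt (Segments const& segs, Time t)
{
    return std::prev (segs.upper_bound (t))->second;
}

// Insert a new segment [start,end), adjusting whatever it overlaps.
inline void splice (Segments& segs, Time start, Time end, std::string label)
{
    if (end < start) std::swap (start, end);
    if (start == end) return;
    std::string successor = segmentAt (segs, end);   // what continues after `end`
    segs.erase (segs.lower_bound (start), segs.lower_bound (end));
    segs[start] = std::move (label);
    segs.emplace (end, std::move (successor));       // no effect if a boundary already exists
}
```

Because every key marks the start of a segment that runs to the next key, the coverage stays seamless by construction; a representation with explicit start and end per segment has to distinguish the individual orientation cases instead.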
- how to pass-in a specification given as GenNode
- now this might be translated into a MockJobTicket allocated in the MockSegmentation
Unimplemented: actually build the Segment with suitable start/end time
right now we're lacking a complete working implementation of render node invocation,
and thus the Dispatcher implementation can only be verified with the help
of mocked jobs. However, at least a preliminary implementation of tagging the
invocation instance is available, and thus we're able to verify that
a given job instance indeed belongs to and is "backed" by a specific JobTicket.
This is prerequisite for building up a (likewise mocked) Fixture datastructure,
and this in turn was meant to form the basis for attacking an actual Scheduler
implementation, followed by a real render node invocation.
- can now create a Job from JobTicket::NIL
- on invocation this Job will do nothing
Only when the first real output backend is implemented,
we can decide if this simplistic implementation is enough,
or if an empty output must be explicitly generated...
* using a simplified preliminary implementation of hash chaining (see #1293)
* simplistic implementation of hashing for time values (half-rotation)
* for now just hashing the time into the upper part of the LUID
Maybe we can even live with that implementation for some time,
depending on how important uniform distribution of hash values is
for proper usage of the frame cache.
Needless to say, various further fine points need more consideration,
especially questions of portability (32bit anyone?). Moreover, since
frame times are typically quantised, the search space for the hashed
time values is drastically reduced; conceivably we should rather
research and implement a good hash function for 128bit and then combine
all information into a single hash key....
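Reading "half-rotation" as a rotation of the bit pattern by half the word width (an assumption), the time hash could look roughly like this; the function names are hypothetical and the combination with the LUID is left out.

```cpp
#include <cassert>
#include <climits>
#include <cstdint>

// Rotate a 64-bit pattern by half its width, swapping upper and lower half.
constexpr std::uint64_t rotateHalf (std::uint64_t bits)
{
    constexpr unsigned half = sizeof(bits) * CHAR_BIT / 2;
    return (bits << half) | (bits >> half);
}

// Time values (assumed to be microsecond ticks) mostly vary in their low bits;
// the rotation moves this variation into the upper part of the hash.
constexpr std::uint64_t hashTime (std::int64_t micros)
{
    return rotateHalf (static_cast<std::uint64_t> (micros));
}
```

This is cheap and reversible, but it does not mix bits, so quantised frame times still map onto a sparse set of hash values, as noted above.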
...using the MockJobTicket setup as point of reference,
since the actual invocation of render nodes will only be drafted
later in this "Vertical Slice" integration effort...
- introduce a JobTicket::NOP (null-object pattern)
- assuming that the function splitSplice() will retain complete coverage always
Remark:
`Fixture::getPlaylistForRender()` is a leftover from the very early implementation drafts.
This function was more or less based on the way Cinelerra works; it is clear by now
that Lumiera can not possibly work this way, given that we'll build a low-level model
and dispatch precompiled render jobs....
The Fixture and the low-level model backbone deserve a distinct namespace on their own.
Since it's built by the Builder from the Session contents, and also used by the frame dispatch,
we can expect dependence on some types from Steam-Layer, and thus this namespace
needs to reside in Steam-Layer rather, while the actual low-level Model
might become part of Vault-Layer, creating a hierarchy of data structures.
(Remark: likely also the session related namespaces will need a reorganisation)
The idea is to escape a "design deadlock" by using a test-driven prototype
implementation of the data structure to back a further development
of the Dispatcher and Scheduler implementation, which then can be used
to gradually elaborate and switch over to an actual implementation
data structure
...requires a first attempt towards defining a `JobTicket`.
This turns out quite tricky, due to using those `LinkedElements`
(intrusive single linked list), which requires all added records
actually to live elsewhere. Since we want to use a custom allocator
later (the `AllocationCluster`), this boils down to allocating those
records only when about to construct the `JobTicket` itself.
What makes matters even worse: at the moment we use a separate spec
per Media channel (maybe these specs can be collapsed later on).
And thus we need to pass a collection -- or better an iterator
with raw specs, which in turn must reveal yet another nested
sequence for the prerequisite `JobTickets`.
Anyhow, now we're able at least to create an empty `JobTicket`,
backed by a dummy `JobFunctor`....
Looks like we'll actually retain and use this low-level solution
in cases where we just can not afford heap allocations but need
to keep polymorphic objects close to one another in memory.
Since single linked lists are filled by prepending, it is rather
common to need the reversed order of elements for traversal,
which can be achieved in linear time.
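The linear-time reversal is the classic pointer-flipping loop, shown here on a bare node type; `Node` is a hypothetical stand-in and not the `LinkedElements` implementation.

```cpp
#include <cassert>

struct Node
{
    int   val;
    Node* next = nullptr;
};

// Reverse an intrusive single linked list in place; returns the new head.
// Each node is visited exactly once and no allocation takes place.
inline Node* reverse (Node* head)
{
    Node* reversed = nullptr;
    while (head)
    {
        Node* rest = head->next;   // detach the first node...
        head->next = reversed;     // ...and prepend it to the reversed part
        reversed = head;
        head = rest;
    }
    return reversed;
}
```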
And while we're here, we can modernise the templated emplacement functions
- build the reworked Job-planning pipeline more or less from scratch
- back that with mocked `Dispatcher` and `JobTicket`
- then transfer this into a `RenderDrive`, which can be tested as well
- could continue then to a `CalcStream` integration test....
- decision: the Monad-style iteration framework will be abandoned
- the job-planning will be recast in terms of the iter-tree-explorer
- job-planning and frame dispatch will be disentangled
- the Scheduler will deliberately offer a high-level interface
- on this high-level, Scheduler will support dependency management
- the low-level implementation of the Scheduler will be based on Activity verbs
This finishes a long lasting effort to rework the top-level of the Lumiera GTK UI,
to adapt to GTK-3 and the new asynchronous message based architecture.
Special credits and thanks to
* Joel Holdsworth
* Stefan Kangas
Without their relentless foundational work, the Lumiera UI could
never be where it is now. Even if some code was rewritten and several
parts of the old GTK-2 implementation are now obsolete, numerous ideas,
solutions and inspirations were drawn from those early contributions
and live on as part of the reworked GUI.
Note: changing behaviour of TimeSpan to possibly flip start and end,
and also to use Offset as Offset and then re-orient,
since this seems the least surprising behaviour.
These changes carry over into changed default and limiting
on ZoomWindow constructor and various mutators, and most
notably shifting the time span always into allowed domain.
...the implementation was way too naive; in some cases we could go
into an infinite loop. In the end, using Newton approximation was not
necessary (and thus there is no loop anymore), but it helped me get
at a much better solution with very small error margin on average case.
All these corner cases are obviously "academic" to some degree,
but it turns out there is no clear-cut point where you'd be able
just to set a limit and be sure that fractional integer arithmetic
works flawlessly in all cases.
Thus the choice is
- give up (fractional) integers and work with floats and have to
deal with error accumulation
- or do something as chosen here, namely add a boundary zone, where
fractional integer arithmetic can be kept under control, while admitting
small errors, and in turn get the absolutely precise integers in all
everyday standard cases
The value used previously was too conservative, and prevented ZoomWindow
from zooming out to the complete Time domain. This was due to missing the
Time::SCALE denominator, which increased the limit by factor 1e6
In fact the code is able to handle even this extremely reduced limit,
but doing so seems over the top, since now detox() kicks in on several
calculations, leading to rather coarse grained errors.
Thus I decided to use a compromise: lower the limit only by factor 1000;
with typical screen pixel widths, we can reach the full time domain,
while most scaling and zoom calculations can be performed precisely,
without detox() kicking in. Obviously this change requires adjusting
a lot of the test case expectations, since we can now zoom out maximally.
As it turns out, the calculation path initially chosen for the mutateScale(Rat)
was needlessly indirect, and also duplicated several of the safeguards,
meanwhile implemented way better in conformWindowToMetric(Rat)
Thus, instead of relatively re-scaling the window, now we just
limit the given zoomFactor and pass it to conformWindowToMetric()
There is a built-in limitation, which now is even
lowered to 100000 pixels horizontally.
With the techniques introduced in this changeset, it seems possible
to support more -- yet this would be a case of unnecessary genericity;
handling such large numbers will drive more computations into the
danger zone, and doing so incurs cost in terms of testing and debugging.
Placing that into context, contemporary displays are not even 4K on
average, and it does not look likely even for cinema display to go
way beyond 8k -- so yes, I want display hardware with 100000 pixels!!
The key takeaway of this changeset:
- can calculate px = trunc(zoomFactor * duration) step wise,
even when the direct calculation would lead to wrap-around
- can safely adjust and fix the zoomFactor using Newton approximation
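The step-wise calculation can be illustrated by splitting the duration into whole multiples of the denominator plus a remainder. This is a sketch of the general idea for non-negative values, assuming that `num * den` and the final pixel count fit into 64 bit; it is not necessarily the exact scheme used in ZoomWindow, and the Newton adjustment is not covered.

```cpp
#include <cassert>
#include <cstdint>

// px = trunc((num/den) * duration), without ever forming num * duration.
// With duration = whole*den + rest, the product becomes
//   num*whole + (num*rest)/den,  where rest < den.
inline std::int64_t pixelsFor (std::int64_t num, std::int64_t den, std::int64_t duration)
{
    assert (num >= 0 && den > 0 && duration >= 0);
    std::int64_t whole = duration / den;
    std::int64_t rest  = duration % den;
    return num * whole + (num * rest) / den;
}
```

For example, with a zoom factor of 1000 / 10^15 and a duration of 9·10^18 ticks, the direct product would exceed the 64 bit range, while the split form only multiplies 1000 by 9000.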
...even zooming out to span the complete time domain (~19000 years).
But only under the condition that the display window is sufficiently
large in terms of pixels, so we can handle the computation without
glitches.
This should not be a relevant limitation in practice, since a window
size of some 100 pixels is enough to handle Duration::MAX. Needless to add
that it's hard to imagine a media timeline of such tremendous size...
building on these Library changes, plus the safe-add function
developed some days ago, it is now possible to mark a large displacement
as `time::Offset`, and apply this to yield any valid time position,
even extreme negative values