VERIFY_ERROR allows to check that an expected except is actually thrown.
The implementation was lazy however;
it just investigated the C-style error flag instead of *really* verifying
that an *lumiera::Exception* with the expected flag was caught.
This discrepancy can be a problem when there is a stray error flag set,
or for some reason the error flag gets cleared before the exception
reaches the top-level catch-block in the test.
- relocate some code into a dedicated translation unit to reduce #includes
- actually set the thread-ID (the old implementation had only a TODO at that point)
While it would be straight forward from an implementation POV
to just expose both variants on the API (as the C++ standard does),
it seems prudent to enforce the distinction, and to highlight the
auto-detaching behaviour as the preferred standard case.
Creating worker threads just for one computation and joining the results
seemed like a good idea 30 years ago; today we prefer Futures or asynchronous
messaging to achieve similar results in a robust and performant way.
ThreadJoinable can come in handy however for writing unit tests, were
the controlling master thread has to wait prior to perform verification.
So the old design seems well advised in this respect and will be retained
- cut the ties to the old POSIX-based custom threadpool framework
- remove operations deemed no longer necessary
- sync() obsoleted by the new SyncBarrier
- support anything std::invoke supports
...which is the technique used in the existing Threadpool framwork.
As expected, such a solution is significantly slower than the new
atomics-based implementation. Yet how much slower is still striking.
Timing measurements in concurrent usage situation.
Observed delay is in the order of magnitude of known scheduling leeway;
assuming thus no relevant overhead related to implementation technique
Over time, a collection of microbenchmark helper functions was
extracted from occasional use -- including a variant to perform
parallelised microbenchmarks. While not used beyond sporadic experiments yet,
this framework seems a perfect fit for measuring the SyncBarrier performance.
There is only one catch:
- it uses the old Threadpool + POSIX thread support
- these require the Threadpool service to be started...
- which in turn prohibits using them for libary tests
And last but not least: this setup already requires a barrier.
==> switch the existing microbenchmark setup to c++17 threads preliminarily
(until the thread-wrapper has been reworked).
==> also introduce the new SyncBarrier here immediately
==> use this as a validation test of the setup + SyncBarrier
Using the same building blocks, this operation can be generalised even more,
leading to a much cleaner implementation (also with better type deduction).
The feature actually used here, namely summing up all values,
can then be provided as a convenience shortcut, filling in std::plus
as a default reduction operator.
...first used as part of the test harness;
seemingly this is a generic and generally useful shortcut,
similar to algorithm::reduce (or some kind of fold-left operation)
Intended as replacement for the Mutex/ConditionVar based barrier
built into the exiting Lumiera thread handling framework and used
to ensure safe hand-over of a bound functor into the starting new
thread. The standard requires a comparable guarantee for the C++17
concurrency framework, expressed as a "synchronizes_with" assertion
along the lines of the Atomics framework.
While in most cases dedicated synchronisation is thus not required
anymore when swtiching to C++17, some special extended use cases
remain to be addressed, where the complete initialisation of
further support framework must be ensured.
With C++20 this would be easy to achieve with a std::latch, so we
need a simple workaround for the time being. After consideration of
the typical use case, I am aiming at a middle ground in terms of
performance, by using a yield-wait until satisfying the latch condition.
The investigation for #1279 leads to the following conclusions
- the features and the design of our custom thread-wrapper
almost entirely matches the design chosen meanwhile by the C++ committee
- the implementation provided by the standard library however uses
modern techniques (especially Atomics) and is more precisely worked out
than our custom implementation was.
- we do not need an *active* threadpool with work-assignment,
rather we'll use *active* workers and a *passive* pool,
which was easy to implement based on C++17 features
==> decision to drop our POSIX based custom implementation
and to retrofit the Thread-wrapper as a drop-in replacement
+++ start this refactoring by moving code into the Library
+++ create a copy of the Threadwrapper-code to build and test
the refactorings while the application itself still uses
existing code, until the transition is complete
...which however brings the problem that we can no longer block the destructor
of WorkForce by simply joining on all joinable threads (there is a race
between testing joinable() and invoking join(), which does not tolerate
non-joinable state.
There is a second problem: we need to detect and clean-up terminated workers,
even for just finding out how many workers are still active. Fortunately
doing so also solves the waiting problem in the destructor
While in principle it would be possible (and desirable)
to control worker behaviour exclusively through the Work-Functor's return code,
in practice we must concede that Exceptions can always happen from situations
beyond our control. And while it is necessary for the WorkForce-dtor to
join and block (we can not just pull away the resources from running threads),
the same destructor (when called out of order) must somehow be able
at least to ask the running threads to terminate.
Especially for unit tests this becomes an obnoxious problem -- otherwise
each test failure would cause the test runner to hang.
Thus adding an emergency halt, and also improve setup for tests
with a convenience function to inject a work-function-λ
- investigate consistency guarantees through acquire-release
==> turns out we do not need a fence, but it is tantamount
to have a guard variable and actually load and check
the value to ensure we indeed get a happens-before
- elaborate design of the WorkForce
+ no shared control variables necessary
+ no ability to forcibly shut-down the WorkForce
+ rather, all control will be exerted through the return value
of the Work-Functor
Up to now, the DiagnosticFun mock in ActivityDetector only
created an EventLog entry on invocation and was able to retunr
a canned result value. Yet for the job invocation scenario test,
it would be desirable to hook-in a λ with a fake implementation
into the ExecutionContext. As a further convenience, the
return value is now default initialised, instead of being
marked as uninitialised until invocation of "returning(val)"
...regarding the kind of activity (the verb),
and also for some special case access of payload data;
deliberately asserting the correct verb, but no mandatory check,
since this whole Activity-Language is conceived as cohesive
and essentially sealed (not meant to be extended)
It is not sufficient just to pass this "current time" as parameter
into the ActivityLang::dispatchChain(), since some Activities within
this chain will essentially be long-running (think rendering); thus
we need a real callback from within the chain. The obvious solution
is to make this part of the Execution Context, which is an abstraction
of the scheduler environment anyway
...turns out there is still a lot of leeway in the possible implementation,
and seemingly it is too early to decide which case to consider the default.
Thus I'll proceed with the drafted preliminary solution...
- on primary-chain, an inhibited Gate dispatches itself into future for re-check
- on Notification, activation happens if and only if this very notification opens the Gate
- provide a specifically wired requireDirectActivation() to allow enforcing a minimal start time
While the ''general direction'' seems clear, some in-depth
analysis was required to find out what information can reasonably
be expected to be available at this point.
The decision was made to shift the actual deadline calculation
into the Job-Planning altogether, assuming that a preliminary solution
based on data implicitly available there will be enough to implement
simple linear playback, while precise management of job start times
can be added in later, when observation of actual timing behaviour
is available...
Solved by special treatment of a notification, which happens
to decrement the latch to zero: in this case, the chain is
dispatched, but also the Gate is locked permanently to block
any further activations scheduled or forwareded otherwise
TODO: while correct as implemented, the handling of the
notification seems questionable, since re-scheduling the chain immediately
may lead to multiple invocations of the chain, since it might have been "spinned"
and thus re-scheduled already, and we have no way to find out about that
...can not take a shortcut here, since the timing information
embedded into the POST-Activity must somehow be transported
to the Scheduler; key point to note is that the chain will
be performed in »management mode« (single threaded)
...attempt to get this intricate state machine sorted out
Notification turned out quite tricky, since it may emanate
from a concurrently executed phase and we try to avoid having
to protect the gate directly with a lock; rather we re-dispatch
the notification through the queue, which indirectly also ensures
that the worker de-queuing the NOTIFY-Activity operates in
management mode (single threaded, holding the GroomingToken)
Each Epoch in the memory manager holds a Gate in the first slot;
after the logic for Gate-activation is worked out now, we can switch
to using this actual logic to determine when an Epoch can be released
Decision how to handle a failed Gate-check
- spin forward (re-scheduler) by some time amount
- this spin-offset parameter is retrieved from the Execution Context
- thus it will be some kind of engine parameter
With these determinations and the framework for the Execution Context
it is now possible to code up the logic for Gate check, which in turn
can then be verified by the watchGate diagnostics
due to technical limitations this requires to wire the adaptor
as replacement for the subject Activity, so that it can capture
and log the activation, and then pass it on to its watched subject
...turns out that util::toString does not explicitly handle pointers differently,
for very good reasons; this function must always work, always produce a simple and
compact representation, and it must be possible to instantiate the template
and take a function reference (which precludes adding an overload for pointers)
requires to supplement EventLog matching primitives
to pick and verify a specific positional argument.
Moreover, it is more or less arbitrary which job invocation parameters
are unpacked and exposed for verification; we'll have to see what is
actually required for writing tests...
doing so would contradict the fundamental architecture,
all kinds of failures and timeouts need to be handled within
Scheduler-Layer-2 rather.
Jobs are never aborted, nor do they need to know if and when they are invoked
Testcase (detect function invocation) passes now as expected
Some Library / Framework changes
- rename event-log-test.cpp
- allow the ExpectString also to work with concatenated expectation strings
Remark: there was a warning in the comment in event-log.hpp,
pointing out that negative assertions are shallow.
However, after the rework in 9/2018 (commit: d923138d1)
...this should no longer be true, since we perform proper backtracking,
leading to an exhaustive search.
ActivityMatch inherits privately from the EventMatch object,
and is thus able to delegate relevant matching queries, but
also to provide high-level special matchers.
This new design resolves the ambiguity regarding function arguments.
Moreover, we can now record the current sequence-Number as *attribute*
in the respective log record (this is the benefit of using structured
log entries instead of just a textual log), thereby avoiding the various
pitfalls with explicit bracketing sequence-number log entries
bottom line: this reworked design seems to be a better fit,
even while technically the implementation with the wrapped matcher
is somewhat ugly...
The EventLog seems to provide all the building blocks, but we need
some higher level special matchers (and maybe we also want to hide
some of the basic EventLog matchers). A soulution might be to wrap
the EventMatcher and delegate all follow-up builder calls.
This seems adequate, since the EventLog-Matcher is basically used as black box,
building up more elaborate matchers from the provided basic matchers...
Spent some time again to understand how EventLog matching works.
My feelings towards this piece of code are always the same: it is
somewhat too "tricky", but I am not aware of any other technique
to get this degree of elaborate chained matching on structured records,
short of building a dedicated matching engine from scratch.
The other alternative would be to use a flat textual log (instead of
the structured log records from EventLog), but then we'd have to
generate quite intricate regular expressions from the builder,
and I'm really doubtful it would be easier and clearer....
...turns out this is entirely generic and not tied to the context
within ActivityDetector, where it was first introduced to build a
mock functor to log all invocations.
Basically this meta-function generates a new instantiation of the
template X, using the variadic argument pack from template U<ARGS...>
...for coverage of the Activity-Language,
various invocations of unspecific functions must be verified,
with the additional twist that the implementation avoids indirections
and is thus hard to rig for tests.
Solution-Idea: provide a λ-mock to log any invocation into the
Event-Log helper, which was created some years ago to trace GUI communication...
essentially define a concept how to ''perform'' render activities in the Scheduler.
This entails to specify the operation patterns for the four known base cases
and to establish a setup for the implementation.
Further extensive testing with parameter variations,
using the test setup in `BlockFlow_test::storageFlow()`
- Tweaks to improve convergence under extreme overload;
sudden load peaks are now accomodated typically < 5 sec
- Make the test definition parametric, to simplify variations
- Extract the generic microbenchmark helper function
- Documentation
There seems to be a ''sweet spot'' for somewhat larger Epoch sizes around 500 slots.
At least in the test setup used here, which works with a load of 200 Frames / sec,
which is significantly over the typical value of 50fps (video + audio) for simple playback.
The optimisation of averaged allocation times can not be much improved **below 30ns**.
Overall, this can be considered a good result,
since this allocation scheme does way more than just allocate memory,
it also provides a means to track dependencies and lifecycle.
__For context__:
- we should strive at processing one frame in ~ 10ms
- for 10 Activity records per Frame, we currently use < 0.5 µs for
memory and dependency management in the scheduler
- this leaves enough room for the further administrative efforts
(priority queue, job planning, buffer management)
... while this a comparison of apples and oranges, since the standard
heap allocator does not offer any dependency and lifecycle managmenet,
while the BlockFlow scheme developed here is much more complex and
offers a lifetime and dependency control specifically tailored to
the needs of the Scheduler.
Anyway, with the latest tweaks and refactorings, the test case
now shows averaged times per allocation on a comparable level
(both in the range of ~30ns)
BUT -> +50% runtime in -O3 (+20ns)
Investigation seems to indicate
- that the increased (+1 Epochs, 10 -> 11) moving average
caused the Algo to perform worse (strong effect)
- that the Optimiser has problems with boost::rational, which however
yields only a minute effect (+5ns), and only on the critical path
The access via Meyers Singleton has no adverse effect,
rather the new setup gives a tiny benefit (46ns -> 37ns).
Surprisingly, the increased pre-allocation has no observable effect.
On the long run, there will be a central Render Engine parametrisation;
some parameters can even be expected to be dynamic; thus prepare the
BlockFlow allocator to fit in with this expectation
For comparison: use individual managment by refcount.
This supports the conclusion that BlockFlow is more than just a
custom allocator; it also supports a non-trivial lifetime management,
and this comes at a cost.
Playing around with various load patterns uncovers further weak spots
in the regulation mechanism. As a remedy, introduce a stronger feed-back
and especially set the target load factor from 100% -> 90%
to add some headroom to absorb intermittent load peaks
Presumably ''much more observation and fine-tuning'' will be necessary
under real-world load conditions (⟹ Ticket #1318 for later)
- BUG: must prevent the Epoch size to become excessive low
- Problem: feedback signal should not be overly aggressive
Fine-Tuning:
- Dose for Overflow-compensation is delicate
- Moving average and Overflow should be balanced
- ideally the compensatory actions should be one order of magnitude
slower than the characteristic regulation time
Improvement: perform Moving-Average calculations in doubles
..as a heuristic to regulate optimal Epoch duration;
when Epochs are discarded, the effective fill factor can be used
to guess an Epoch duration time, which would (in hindsight)
lead to perfect usage of storage space
..using a simplistic implementation for now: scale down the
Epoch-stepping by 0.9 to increase capacity accordingly.
This is done on each separate overflow event, and will be
counterbalanced by the observation of Epoch fill ratio
performed later on clean-up of completed Epochs
further implementation makes clear that the AllocationHandle,
which is the primary usage front-end, has to rely both on
services of the underlying ExtentFamily allocator, as well
as on the BlockFlow itself for managing the Epoch spacing.
- fix a bug in IterExplorer: when iterating a »state core« directly,
the helper CoreYield passed the detected type through ValueTypeBindings.
This is logically wrong, because we never want to pick up some typedefs,
rather we always want to use the type directly returned from CORE::yield()
Here the iterator returns an Epoch&, which itself is again iterable
(it inherits from std::array<Activity, N>). However, it is clear
that we must not descent into such a "flatMap" style recursive expansion
- draft a simple scheme how to regulate Epoch lengths dynamically
- add diagnostics to pinpoint a given Activity and find out into which
Epoch it has been allocated; used to cover the allocator behaviour
- add preliminary deadline-check (directly instead of using the Activity)
- with this shortcut, now able to implement discarding obsoleted Epochs
- Iteration and use of the underlying `ExtentFamily` is also settled by now
💡 ''Implementation concept for the allocation scheme complete and validated''
...with the preceding IterableDecorator refactoring,
the navigation and access to the storage extents can now be
organised into a clear progression
Allocator::iterator -> EpochIter -> Epoch&
Convenience management and support functions can then be
pushed down into Epoch, while iteration control can be done
high-level in BlockFlow, based on the helpers in Epoch
Especially for the BlockFlow allocator, sanity checks are elided
for performance reasons; yet, generally speaking, it can be a very bad idea
to "optimise" away sanity checks. Thus an additional adaptor is provided
to layer such checks on top of an existing core; and IterEplorer now
always wires in this additional adaptor, and so the original behaviour
is now restored in this respect (and for the largest part of the code base)
While at first sight just a superficial variation of the existing IterStateWrapper,
it became clear with the evolution of the IterExplorer framework that
this setup represents a distinct concept, and especially lends itself
for complex and cohesive collaboration in a layered pipeline. Which
may, or may not be a good idea, depending on the circumstances.
Now, for the implementation of the scheduler memory allocation scheme,
another twist is added to the picture: we can not effort the sanity checks
on each access, even more so when layering / adapting iterators, where
it is essential that the optimiser can remove all unnecessary warts.
..this is the most simple case, where no Epochs are opened yet
..add diagnostics to inspect alloc count and deadlines
..add accessors for the first/last underlying Extent
...continue to proceed test-driven
...scheduler internals turn out to be intricate and cohesive,
and thus the only hope is to adhere to strict testing discipline
Library: add "obvious" utility to the IterExplorer, allowing to
materialise all contents of the Pipeline into a container
...use this to take a snapshot of all currently active Extent addresses
- use a checksum to prove that ctor / dtor of "content" is not invoked
- let the usage of active extents "wrap around" so that the mem block is re-used
- verify that the same data is still there
The low-level allocator is basically implemented now,
but we still need to check thoroughly that the tricky
wrap-around and expansion logic behaves sane...
(see #1311)
Using a Storage* within a wrapper as "pos" will work,
but is borderline trickery, since it amounts to subverting
the idea behind IterAdapter (which is to encapsulate a target
pointer with some control-logic in the managing container).
Using the same storage size and implementation overhead,
it is much more straight-forward to package the complete
iteration logic into a »State Core«, which in this case
however maintains a back-link to the ExtentFamily.
Iteration should just yield an Reference to an Extent,
thereby hiding all details of the actual raw storage (char[]).
This can be achieved by usind a wrapper type around a pointer
into the managing vector; from this pointer we may convert
into a vector::iterator with the trick described here
https://stackoverflow.com/a/37101607/444796
Furthermore, continued planning of the Activity-Language,
basically clarified the complete usage scenario for now;
seems all implementable right away without further difficulties
..with the ability to grow on demand..
..possibly add the new extents in the middle, by first allocating at the end
and then using the std::rotate() algo to bring them to the point
in the middle where new extents are required
- the idea is to use slot-0 in each extent for administrative metadata
- to that end, a specialised GATE-Activity is placed into slot-0
- decision to use the next-pointer for managing the next free slot
- thus we need the help of the underlying ExtentFamily for navigating Extents
Decision to refrain from any attempt to "fix" excessive memory usage,
caused by Epochs still blocked by pending IO operations. Rather, we
assume the engine uses sane parametrisation (possibly with dynamic adjustment)
Yet still there will be some safety limit, but when exceeding this limit,
the allocator will just throw, thereby killing the playback/render process
- decision to favour small memory footprint
- rather use several Activity records to express invocation
- design Activity record as »POD with constructor«
- conceptually, Activity is polymorphic, but on implementation
level, this is "folded down" into union-based data storage,
layering accessor functions on top
- decision how to handle the Extent storage (by forced-cast)
- decision to place the administrative record directly into the Extent
TODO not clear yet how to handle the implicit limitation for future deadlines
using a simple yet performant data structure.
Not clear yet if this approach is sustainable
- assuming that no value initialisation happens for POD payload
- performance trade-off growth when in wrapped-state vs using a list
....still about to find out what kinds of Activities there are,
and what reasonably to implement on layer-2 vs. layer-1
It is clear that the worker will typically invoke a doWork()
operation on layer-2, which in turn will iterate layer-1.
Each worker pulls and performs internal managmenet tasks exclusively
until encountering the next real render task, at which point it will
drop an exclusion flag and then engage into performing the actual
extended work for rendering...
- define a simple record to represent the Activity
- define a handle with an ordering function
- low-level functions to...
+ accept such a handle
+ pick it from the entrace queue
+ pass it for priorisation into the PriQueue
+ dequeue the top priority element
The second design from 2017, based on a pipeline builder,
is now renamed `TreeExplorer` ⟼ `IterExplorer` and uses
the memorable entrance point `lib::explore(<seq>)`
✔
after completing the recent clean-up and refactoring work,
the monad based framework for recursive tree expansion
can be abandoned and retracted.
This approach from functional programming leads to code,
which is ''cool to write'' yet ''hard to understand.''
A second design attempt was based on the pipeline and decorator pattern
and integrates the monadic expansion as a special case, used here to
discover the prerequisites for a render job. This turned out to be
more effective and prolific and became standard for several exploring
and backtracking algorithms in Lumiera.
- allow to configure the expected job runtime in the test spec
- remove link to EngineConfig and hard-wire the engine latency for now
... extended integration testing reveals two further bugs ;-)
... document deadline calculation
This finishes the last series of refactorings; the basic concept
remains the same, but in the initial version we arranged the expander
function in the pipeline to maintain a Tuple (parent, child) for the
JobTickets. Unfortunately this turned out to be insufficient, since
JobTicket is effectively const and responsible for a complete Sement,
so there is no room to memorise a Deadline for the parent dependency.
This leads to the better idea to link the JobPlanning aggregators
themselves by parent-child references, which is possible since the
whole dependency chain actually sits in the stack embedded into the
Expander (in the pipeline)
This very deep change (which requires almost complete rebuild)
was prompted by the need to process an object (JobPlanning),
which holds several references and is thus move-only, in the
middle of a complex processing pipeline with child expansion.
If this works out well, a long-standing and obnoxious problem
with transforming iterators would be solved, albeit by incurring
a (presumably small) performance overhead, since now the new
value is no longer *assigned*, but rather the existing payload
is destroyed and a new instance is copy/move constructed into
the inline buffer.
The primary purpose (and widely used in Lumieara) is to have a
Lambda create a new Object, which is then returned by value
and thus immediately moved into this inline buffer, where it
resides for further use (as long as the enclosing pipeline
stays alive). Unless such an object does very elaborate
allocations and registrations behind the scene, the
expense of assigning vs creating should be the same.
...in the hope that the Optimiser is able to elide those references entirely,
when (as is here the case) they point into another field of a larger object compound
...as a preparation for solving a logical problem with the Planning-Pipeline;
it can not quite work as intended just by passing down the pair of
current ticket and dependent ticket, since we have to calculate a chained
calculation of job deadlines, leading up to the root ticket for a frame.
My solution idea is to create the JobPlanning earlier in the pipeline,
already *before* the expansion of prerequisites, and rather to integrate
the representation of the dependency relation direcly into JobPlanning
...using hard coded values instead of observation of actual runtimes,
but at least the calculation scheme (now relocated from TimeAnchor to JobPlanning)
should be a reasonable starting point.
TODO: test fails...
- collect list of entities to be picked up from the dispatcher-pipeline
- as it turns out: there is no sensible use for the realTimeDeadline in
in the FrameCoord record ==> remove it
The initial implementation effort for Player and Job-Planning
has been reviewed and largely reworked, and some parts are now
obsoleted by the reworked alternative and can be disabled.
The basic idea will be retained though: JobPlanning is a
data aggregator and performs the final step of creating a Job
- had to fix a logical inconsistency in the underlying Expander implementation
in TreeExplorer: the source-pipeline was pulled in advance on expansion,
in order to "consume" the expanded element immediately; now we retain
this element (actually inaccessible) until all of the immediate
children are consumed; thus the (visible) state of the PipeFrameTick
stays at the frame number corresponding to the top-level frame Job,
while possibly expanding a complete tree of flexible prerequisites
This test now gives a nice visualisation of the interconnected states
in the Job-Planning pipeline. This can be quite complex, yet I still think
that this semi-functional approach with a stateful pipeline and expand functors
is the cleanest way to handle this while encapsulating all details
- fix a bug in the MockDispatcher, when duplicating the ExitNodes.
A vector-ctor with curly braces will be interpreted as std::initializer_list
- add visualisation of the contents appearing at the end of the pipeline
*** something still broken here, increments don't happen as expected
`steam/engine/mock-dispatcher.hpp |cpp` now integrates this
''complete mock setup for render jobs and frame dispatching.''
The exising `DummyJob` has been slightly adapted and renamed
to `MockJob` and is tightly integrated with the other mocks.
The implementation of a `MockDispatcher` necessitated to change
the use of `MockJobTicket`. The initial attempts used a complete
mock implementation, but this approach turned out not to be viable.
Instead — based on the ideas developed for the mock setup —
now the prospective real implementation of `JobTicket` is available
and will be used by the mock setup too. Instead of a synthetic spec,
now a setup of recursively connected `ExitNode`(s) is used; the latter
seems to develop into some kind of Facade for the render node network.
Based on this mock setup, we can now demonstrate the (mostly) complete
Job-Planning pipeline, starting from a segmentation up to render jobs,
and verify proper connectivity and job invocation.
✔
- has to be prepared / supported by the RenderEnvironmentClosure
- actual translation happens when building the Dispatcher-Pipeline
- implementation delegate through
virtual size_t Dispatcher::resolveModelPort (ModelPort)
...ouch this was insidious: the STL implementation for list does not
return a pointer to the element just allocated, but rather retrieves
and dereferences the back() / front() iterator after returning from emplace_back|front()
...which in case of re-entrant allocations is something wildly different
than the initial allocation. Thus a *cheap* and dirty placeholder implementation
just using a STL container is not possible, and we need at least
to code up likewise cheesy placeholder implementation by hand.
- separate allocation and ctor all
- use an inline buffer in the STL container
- explicitly handle ctor failures to discard allocation
- NOT THREADSAFE and likely WASTFUL in terms of performance
==> MockSupport_test now back to GREEN after complete refactoring
The existing implementation of the Player from 2012~2015 inclduded
an additional differentiation by media channel (for multichannel media)
and would build a separate CalcStream for each channel.
The in-depth analysis conducted for the ongoing »Vertical Slice« effort
revealed that this differentiation is besides the point and would never
be materialised: Since -- by definition -- all media processing has
to be done by the engine, also the generation of the final output format
including any channel multiplexing will happen in render nodes.
The only exception would be when only a single channel of multichannel
media is extracted -- yet this case would then translate into a
dedicated ModelPort.
Based on this reasoning, a lot of complexity (and some contradictions)
within the JobTicket implementation can be removed -- together with
some further leftovers of the fist attempt to build JobTickets always
from a Mock specification (we now use construction by the Segment,
based on an ExitNode, which is the expected actual implementation
for production setup)
...by defining a new scheme for access to custom allocators
...and then passing a reference to such an accessor into the
JobTicket ctor, thereby allowing the ticket istelf recursively
to place further JobTicket instances into the allocation space
--> success, test passes (finally)
Up to now, a draft/mock implementation was used, relying on a »spec tuple«,
which was fabricated by MockJobTicket. But with the introduction of
NodeGraphAttachment, the MockSequence now generates a nested ExitNode structure,
and thus the JobTicket will be created through the "real" ctor, and
no longer via MockJobTicket.
Thus it is possible to skip this whole interspersed »spec tuple«,
since ExitNode *is* already this aggregated / abstracted Spec
PROBLEM: can not implement Spec-generation, since
- we must use a λ for internal allocation of JobTickets
- but recursive type inference is not possible
Will thus need to abandon the Spec-Tuple and relocate this
traversal-and-generation code into JobTicket itself
Use another unit-test (FixtureSegment_test) to guide and cover
the transition from the existing fake-implementation to the
actual implementation, where the JobTicket will be generated
on-demand, from a NodeGraphAttachment
It turns out that the real (not mocked) implementation of JobTicket creation
is already required now for this planned (mock)Dispatcher setup;
moreover, this real implementation turns out to be almost identical
to the mock implementation written recently -- just nested structure
of prerequiste JobTickets need to be changed into a similar structur
of ExitNodes
-- as an aside: rearrange various tests to be more in-line
with the envisioned architecture of playback and engine
...this opens up yet another difficult question and a host of new problems
- how are prerequisites detected or arranged by the Builder
- how are prerequisites represented?
- what is an ExitNode in terms of implementation? A subclass of ProcNode?
- how will the actual implementation of JobTicket creation (on-demand) work?
- how to adapt the Mock implementation, while retaining the Specification
for Segments and prerequisites?
..this is now the third attempt, and it seems this one leads to a
clean solution for the type rebinding problem, while also allowing
to unit-test each step in isolation.
The idea is to layer a *templated* builder class on top,
but to slice it away in each step, re-assemble the pipeline
and decorate a new builder instance on top. The net result
is a tightly interconnected processing pipeline without
any spurious interspersed leftovers from the builder,
while all intermediate steps are fully operational
and can thus be unit-tested...
...still very rough edged...
...based on the idea to have a pair(Dependent,Dependency) and to shift these on each level of expansion
PROBLEMS:
- what to use as root level?
- can not handle JobTicket const& in transform-iterator (assignement operator)
...it turns out that we actually do not need to wrap TreeExplorer
on the builder types, because basically there is only a single active
builder type, and the complete processing pipeline can be assembled
in a single terminal function.
The type rebinding problem can thus be solved just by a simple
marker struct, which inherits from a template parameter
...hard to tackle...
The idea is to wrap the TreeExplorer builder, so that our specific
builder functions can delegated to the (inherited) generic builder functions
and would just need to supply some cleverly bound lambdas. However,
resulting types are recursive, which does not play nice with type inference,
and working around that problem leads to capturing a self reference,
which at time of invocation is already invalidated (due to moving the
whole pipeline into the final storage)
...which leads to the next daunting problems:
- we need some mocked ModelPort and DataSink placeholders
- we need a way how to inherit from a partial TreeExplorer pipeline
...introduced in preparation for building the Dispatcher pipeline,
which at its core means to iterate over a sequence of frame positions;
thus we need a way to stop rendering at a predetermined point...
several years ago, it seemed like a good idea to incorporate
the link between nominal time and wall-clock time into a dedicated
anchor point, which also regulates the continued frame planning.
But it turned out that such a design mixes up several concepts
and introduces confusion regarding the meaning of "real time"
- latency can not be reasonably defined for a whole planning chunk
- skipping or sliding due to missed deadlines can not reasonably handled
within such an abstract entity; it must be handled rather at the
level of a playback process
- linking the frame grid generation directly to a planning chunk
undercuts the possible abstraction of a planning pipeline
...which is build a »Job planning pipeline« step by step
in a test setup, and then factor that out as RenderDrive,
to supersede the existing CalcPlanContinuation and get
rid of the Monads this way...
Challenges
- there is a inconsistency with channel usage
- need to establish a way how to transport the output-Sink into the JobFunctor
- need a way to propagate the current frame number to the next planning chunk
The prototypical setup of data structures and test support components
is largely complete by now — with the exception of the `MockDispatcher`,
which will be completed while moving to the next steps pertaining the
setup of a frame dispatch pipeline.
* the existing `DummyJob` was augmented to allow verification of
association between Job and `JobTicket`
* the existing implementation of `JobTicket` was verified and augmented
to allow coverage of the whole usage cycle
* a `MockJobTicket` was implemented on top, which can be generated
from a symbolical test specification (rather than from the real
Fixture data structure)
* a complete `MockSegmentation` was developed, allowing to establish
all the aforementioned data structures without an actual backing
Render Engine. Moreover, `MockSegmentation` can be generated
from the aforementioned symbolic test specification.
* as part of this work, an algorithm to split an existing Segmentation
and to splice in new segments was developed and verified
Last testcase: add deeply nested Prerequisites.
Turns out that the allocator must be able to handle
re-entrant allocations, which std::deque can not fulfil.
Thus using std::list here for the Mock implementation.
In the end, the real allocations will be done by our custom
allocator (AllocationCluster), which can be arranged easily
to support re-entrant allocation calls (since the whole point
is to just place those objects into a pre-allocated large block
and only de-allocate them later in one sway. Thus the allocator
does not need to wait for the object constructor to finish, which
trivially allows for re-entrant calls)
...which uncovers further deeply nested problems,
especially when referring to non-copyable types.
Thus need to construct a common type that can be used
both to refer to the source elements and the expanded elements,
and use this common type as result type and also attempt to
produce better diagnostic messages on type mismatch....
...the improved const correctness on STL iterators uncovered another
latent problem with out diagnositc format helper, which provide
consistently rounded float and double output, but failed to take
CV-qualifiaction into account
This is a subtle and far reaching fix, which hopefully removes
a roadblock regarding a Dispatcher pipeline: Our type rebinding
template used to pick up nested type definitions, especially
'value_type' and 'reference' from iterators and containers,
took an overly simplistic approach, which was then fixed
at various places driven by individual problems.
Now:
- value_type is conceptually the "thing" exposed by the iterator
- and pointers are treated as simple values, and no longer linked
to their pointee type; rather we handle the twist regarding
STL const_iterator direcly (it defines a non const value_type,
which is sensible from the STL point of view, but breaks our
generic iterator wrapping mechanism)
...in an attempt to resolve the deeply nested problems encountered
while building an iterator pipeline for the Dispatcher. It seems
that I was sloppy some years ago and just "bashed them into submission",
thereby mixing up two different meanings of "value_type"
Moreover I seemingly implemented the same helper trait template twice,
so the first step is to switch all usages to meta::TypeBinding
To complete the mock setup, the next step would be to extend the GenNode-based spec langage
to allow defining prerequisite Mock-JobTickets. Setting this up seems rather straight forward --
however, defining a simple testcase to cover this extension runs into surprisingly tricky problems..
- for one, the singleValIterator from Itertools has serious difficulties handling references
- but even more surprising, it seems impossible to make the "prerequisites iterator"
fit into the Tree-Explorer framework (which I intend to use as replacement
for the monadic approach)
after some extended analysis of generic types and template instances,
it seems that not TreeExplorer as such is the primary problem, but rather
there is a conceptual mismatch somewhere deep down in Itertools or Iter-Adapter
By reasoning and analysis I conclude that the differentiation into
multiple channels is likely misplaced in JobTicket; it belongs ratther
into the Segment and should provide a suitable JobTicket for each ModelPort
Handling of prerequisites also needs to be reshaped entirely after
switching to a pipeline builder for the Job-planning pipeline; as
preliminary access point, just add an iterator over the immediate
prerequisites, thereby shifting the exploration mechanism entirely
out of the JobTicket implementation
Testcase: A simple Sementation with a single and bounded Segment
As aside, figured out how to unpack an iterator such as to
tie a fixed number of references through a structural binding:
auto const& [s1,s2,s3] = seqTuple<3> (mockSegs.eachSeg());
...now able to build a mock segmentation which issues dummy jobs,
and is wired such as to verify the right job is invoked for each segment.
And this allows to build and verify the Dispatcher,
without being able to invoke actual render jobs yet.
- only the parts actually touched by the algo will be re-allocated
- when a segment is split, the clone copies carry on all data
Library: add function to check for a bare address (without type info)
This macro has turned out to be quite useful in cases
where a generic setup / algorithm / builder need to be customised
with λ adaptors for binding to local or custom types. It relies
on the metafunctions defined in lib/meta/function.hpp to match
the signature of "anything function-like"; so this seems the
proper place to provide that macro alongside
...this is something I should have done since YEARS, really...
Whenever working with symbolically represented data, tests
typically involve checking *hundreds* of expected results,
and thus it can be really hard to find out where the
failure actually happens; it is better for readability
to have the expected result string immediately in the
test code; now this expected result can be marked
with a user-defined literal, and then on mismatch
the expected and the real value will be printed.
The algorithm coded thus far turns out to be rather generic,
and thus it can be rewritten into a template, while all specific parts
are supplied as λ-Functors.
- instead of Time we use a generic ordering type
- the Iterator is likewise turned into a template parameter
- all the operations are directly supplied as functor types
- C++17 is able to pick up all those λ-Types from the ctor call
This change looks like "low hanging fruit"; the legibility of the code
is not seriously hampered, yet we get the benefit to test this rather
technical piece of logic by an isolated test (which for now is the
primary motivation), and we can hope to re-use it for similar tasks.
There are 12 distinct cases regarding the orientation of two intervals;
The Segmentation::splitSplice() operation shall insert a new Segment
and adjust / truncate / expand / split / delete existing segments
such as to retain the *Invariant* (seamless segmentation covering
the complete time axis)
right now we're lacking a complete working implementation of render node invocation,
and thus the Dispatcher implementation can only be verified with the help
of mocked jobs. However, at least a preliminary implementation of tagging the
invocation instance is available, and thus we're able to verify that
a given job instance indeed belongs to and is "backed" by a specific JobTicket.
This is prerequisite for building up a (likewise mocked) Fixture datastructure,
and this in turn was meant to form the basis for attacking an actual Scheduler
implementation, followed by a real render node invocation.
- can now create a Job from JobTicket::NIL
- on invocation this Job will to nothing
Only when the first real output backend is implemented,
we can decide if this simplistic implementation is enough,
or if an empty output must be explicitly generated...
* using a simplified preliminary implementation of hash chaining (see #1293)
* simplistic implementation of hashing for time values (half-rotation)
* for now just hashing the time into the upper part of the LUID
Maybe we can even live with that implementation for some time,
depending on how important uniform distribution of hash values is
for proper usage of the frame cache.
Needless to say, various further fine points need more consideration,
especially questions of portability (32bit anyone?). Moreover, since
frame times are typically quantised, the search space for the hashed
time values is drastically reduced; conceivably we should rather
research and implement a good hash function for 128bit and then combine
all information into a single hash key....
...using the MockJobTicket setup as point of reference,
since the actual invocation of render nodes will only be drafted
later in this "Vertical Slice" integration effort...
- introduce a JobTicket::NOP (null-object pattern)
- assuming that the function splitSplice() will retain complete coverage allways
Remark:
`Fixture::getPlaylistForRender()` is a leftover from the very early implementation drafts.
This function was more or less based on the way Cinelerra works; it is clear by now
that Lumiera can not possibly work this way, given that we'll build a low-level model
and dispatch precompiled render jobs....
The Fixture and the low-level model backbone deserve a distinct namespace on their own.
Since it's built by the Builder from the Session contents, and also used by the frame dispatch,
we can expect dependence on some types from Steam-Layer, and thus this namespace
needs to reside in Steam-Layer rather, while the actual low-level Model
might become part of Vault-Layer, creating a hierarchy of data structures.
(Remark: likely also the session related namespaces will need a reorganisation)
The idea is to escape a "design deadlock" by using a test-driven prototype
implementation of the data structure to back a further development
of the Dispatcher and Scheduler implementation, which then can be used
to gradually elaborate and switch over to an actual implementation
data structure
...requires a first attempt towards defining a `JobTiket`.
This turns out quite tricky, due to using those `LinkedElements`
(intrusive single linked list), which requires all added records
actually to live elsewhere. Since we want to use a custom allocator
later (the `AllocationCluster`), this boils down to allocating those
records only when about to construct the `JobTicket` itself.
What makes matters even worse: at the moment we use a separate spec
per Media channel (maybe these specs can be collapsed later non).
And thus we need to pass a collection -- or better an iterator
with raw specs, which in turn must reveal yet another nested
sequence for the prerequisite `JobTickets`.
Anyhow, now we're able at least to create an empty `JobTicket`,
backed by a dummy `JobFunctor`....
Looks like we'll actually retain and use this low-level solution
in cases where we just can not afford heap allocations but need
to keep polymorphic objects close to one another in memory.
Since single linked lists are filled by prepending, it is rather
common to need the reversed order of elements for traversal,
which can be achieved in linear time.
And while we're here, we can modernise the templated emplacement functions
- build the reworked Job-planning pipeline more or less from scratch
- back that with mocked `Dispatcher` and `JobTicket`
- then transfer this into a `RenderDrive`, which can be tested as well
- could continue then to a `CalcStream` integration test....
- introduce a new entity: RenderDrive
- it supersedes the CalcPlanCalculation, but is managed by CalcStream
- moreover, the RenderDrive will house a IterTreeExplorer-Pipeline
- define the concerns and relationships more clearly (see Drawing)
- prerequisite to disentangle the Job-planning "mechanics"
- decision: the Monad-style iteration framework will be abandoned
- the job-planning will be recast in terms of the iter-tree-explorer
- job-planning and frame dispatch will be disentangled
- the Scheduler will deliberately offer a high-level interface
- on this high-level, Scheduler will support dependency management
- the low-level implementation of the Scheduler will be based on Activity verbs
This finishes a long lasting effort to rework the top-level of the Lumiera GTK UI,
to adapt to GTK-3 and the new asynchronous message based architecture.
Special credits and thanks to
* Joel Holdsworth
* Stefan Kangas
Without their relentless foundational work, the Lumiera UI could
never be where it is now. Even if some code was rewritten and several
parts of the old GTK-2 implementation are now obsolete, numerous ideas
solutions and inspirations were drawn from those early contributions
and live on as part of the reworked GUI.
for sake of developement of the timeline body drawing code,
several tweaks were added to make the impact of the styling stand
out clearly. This changeset removes all those tweaks and restors
the code to intended neutral behaviour
Moreover, the cursom drawing of the timeline now requires some
basic aids to be present in the stylesheet, otherwise the track structure
will not be visible. Thus add some minimalistic styling to the
"light-theme-complement"-stylesheet, mostly based on the usual
predefined theme colours and some box-shadow settings.
This is by no means an adequate graphical solution,
yet it should be enough to get on with coding....
check private notes and mindmap and fix some remaining minor inconsistencies,
notably the calculations in the overlay renderer in the `BodyCanvasWidget`.
Not sure if the initial window width is used properly for calibration
of ZoomWindow "pixel width" base setting.
Follow-up to the layout logic established with this commit:
09714cfe28
An extended round of rebuilding and reworking the global UI structures
can be concluded now. A flexible recursive structure of Tracks has
been implemented for the new Timeline-UI, allowing to contro all
relevant aspects of structure composition by **Diff Messages** sent
up from the Steam-Layer into the UI.
Moreover, the ability to control the custom drawing code through regular
**CSS style rules** has been demonstrated, allowing for seamless integration
of Lumiera UI elements with the existing desktop theme.
This completes the initial implementation round for the TrackHead.
- arrangement and layout for nested sub-Tracks is now settled
- a graphical representation of scope nesting was implemented
Postponed for later...
- still some minor discrepancies on synchronisation of vertical space
between TrackHead and custom drawing in the body (off-by-one?)
- Expanding / Collapsing of Tracks
- Implement actual Controls to influence the Scope, e.g. Volume, Mix-Mode
- Dynamically indicate selection and Muting on the structure display
While by default the Cairo Context is scaled to device units,
we must not assume that the given Context is unscaled; rather
it might have been deliberately scaled...
A notable example is the Gtk Inspector, which offers a "Magnifier" feature
- pick up all relevant values from CSS
- also control the width of the StaveBracket
- observe the given overall height
Moreover, complete documentation drawing in Inkscape
and add a page to the TiddlyWiki, describing the principles
underlying this design and construction.
.timeline__head : The complete header container on the left side
.timeline__navi : navigation control at top
.timeline__pbay : container holding the patchbay on the left side
.fork__head : each individual TrackHeadWidget (possibly nested)
.fork__control : container for the control components for each track scope
.fork__bracket : the StaveBracket drawing to indicate the nesting structure
The tricky part is to derive the anchor point for the upper
and the lower cap of the bracket, taking into account possible padding.
There seems to be a bug hidden somewhere in this logic, since the
line turns out too short at the lower end....?
According to the documentation, we should be able to get a Pango
font description from the CSS style context, and from there we should
be able to retrieve a font-size specification (and a DPI for the display)
Running this experimental code yields a font-size value of 9pt,
leading to a scale factor of about ~6, which seems plausible.
after weighting in the pro, and cons,
I decided to follow the standard path and pick up values
for each StaveBracket instance individually from Gtk::StyleContext,
assuming that the GTK framework will care for caching and performance
...as it turned out, it suffices just to state desired behaviour explicitly
- make the StaveBracket expand=false
- declare the HeadControl expand=true
The reason lies in the fact that on leaf-tracks there are just two
adjacent cells in a single row, lacking any exterior source of layout guidance.
...just drawing a marker cross for now to indicate allocated size;
speaking of size -- GTK sometimes expands allocation horizontally,
while we'd prefer an absolutely fixed size for the purpose at hand.
Intention is to shift focus of development work down towards Player and Scheduler soon.
However, since the timeline display saw substantial improvement it seems prudent
to finish work on some open ends, notably
- the track head structure
- the drawing and styling code
While content rendering for Clip widgets can safely be postponed, regarding the TrackHeadWidget,
where custom drawing is planned to display a structural outline of the nested scopes, some
ground work should be completed to make those plannings explicit.
This both demonstrates the layout of the `TrackHeadWidget`
and puts `ElementBoxWidget` into intended use to indicate
the scope of a track and to provide a placement icon and
an expander/collapser button.
see #1018
see #1219
Layout calculation and balancing between body and track header
now works reasonably well, labels are placed properly and the
calculated layout remains stable when changing window size,
connected panes and scrollbars behave as expected now.
Quite a feat!
...to properly account for
- the "prelude" padding above the root track
- overview rulers located into a fixed separate canvas at top
- slope-down and slope-up borders around nested tracks.
After testing, results seem now to be accurate up to ± 1px
Finally (sigh)
As it turns out, we need to take the actual allocation of the cells
within the grid explicitly into account: combined cells will report
their extension for each of the underlying cells, thus leading to
excessively overestimated measurements.
So we now calculate the overall height based on the actual structure
- first row holds the label
- left column below used as expander
- right column holds individual content
Remaining problems:
- height of ruler canvas at top not taken into account (11px in this example)
- start of sub-track headers not synchronised with start of sub-tracks
in the body area
...seemingly the allocation of grid cells in the `TrackHeadWidget`
is not quite correct yet: even when there are nested sub-tracks,
we always need another row to hold the controls corresponding to
the track itself and the whole scope. And this row is also what
should be adjusted to match the vertical extension of the content
area.
As it turns out, the whole topic how to handle collapsed tracks
was not even considered yet; the calculation of the "track profile"
would need to be reworked to accommodate collapsed tracks, see: #1265
Sometimes, fractional seconds in the ZoomWindow metric can build up
to numerator and denominator values in the 64-bit integer domain.
Thus the division and truncation of the Window pixel with value
must be done in int64_t, while the result value is then
guaranteed to be a small integer < 100000
This caused the canvas to flicker and jump in size and the
resulting scrollbar change caused various cascading effects
on the further layout calculations...
Note: the actual root cause, why this re-entrance happens,
is due to another obvious numerics bug not yet solved.
Here, the canvas width was suddenly set to zero, causing
the scrollbar position to change and thus the ZoomWindow
to re-fire the structure change signal.
However, such invalidation of previously established baseline
values can never be totally excluded in advanced layout calculations,
and thus the evaluation mechanism must be prepared and re-triggered
to start over, until a stable layout is achieved.
- rearrange cell content and disable auto-expand to prevent
the content area from becoming oversized initially
- fix autocompletion error in signal binding,
causing segfault when moving the scrollbar
attempting to get the vertical space allocation in header and content area
synchronised; previously we conflated the content size and additional
padding, but even after distinguishing both, we still get a cyclic
dependency, leading to progressive increasing of allocated size...
After quite some tinkering, instead of extending the DisplayManager interface,
I now prefer to treat this connection rather as an intricate implementation detail:
The TimelineLayout implementatino now provides two translation functions,
which are directly wired as slots from the Signals emitted by moving the
hand of the scrollbar; the idea is that these functions mutate the ZoomWindow,
which then triggers a DisplayEvaltuation, which in turn causes the
drawing code to pick up and translate back the new metric and position.
Results look promising, insofar the DisplayEvaluation is now triggered
repeatedly, and the actual window width in pixel is propagated;
however, the response of the layout code is seemingly random at times,
the allocated height grows monontonously and the code Segfaults when
moving the scrollbar...
this makes the arrangement more symmetric and natural
and also makes the overview ruler scroll alongside the content pane,
thereby creating the (intended) impression of one uniform layout space
As it turns out, this happens as side-effect from the workaround 2019-08-22
fc5eaf857c
Obviously, just set_size() on the canvas is not sufficient for
GTK actually to establish a size-request (seemingly the canvas counts
as /empty/ and only real widgets would make a difference).
However, since the ruler canvas is directly placed into the box,
and not adapted by a ScrolledWindow, explicitly set_size_request()
also causes the enclosing Box to "inherit" this minimal required size,
thereby also spreading out the BodyCanvasWidget beyond the size
actually available. GTK handles this situation by hard-clipping
on the right side, which causes the vertical scrollbar to disappear
and keeps the horizontal scrollbar disabled (since nominally it still
spans the whole size available, even while this size is then clipped
subsequently).
This changeset adds a lot of debug printing and demonstrates this
behaviour by setting only a minimal horizontal size_request, so that
the window is no longer expanded and clipped.
This does not really change the logic of the DisplayEvaluation mechanism,
but makes it much more flexible: instead of having two hard wired components,
the DisplayEvaluation visitor now holds a collection of LayoutElement pointers.
This way, the Layout manager itself (on behalf of the ZoomWindow) can
participate in the process, and on activation will now establish the
window width in pixel.
This works now insofar the drawing on the canvas is adapted coherently;
however something with the setup of the Scrollbar is still not quite right;
some time ago I recall the scrollbar worked, but now it is blocked
and the canvas just clips to the right side.
It is now tied to the start of ZoomWindow::overallSpan(),
thereby defining the (technical) pixel coordinates within the window
and for drawing on the canvas to be always positive. Whenever ZoomWindow
re-calibrates, it's change signal will trigger, causing the
TimelineLayout to perform a new DisplayEvaluationPass,
which in turn prompts all embedded widgets to readjust
their positions accordingly.
Note: changing behaviour of TimeSpan to possibly flip start and end,
and also to use Offset as Offset and then re-orient,
since this seems the least surprising behaviour.
These changes carry over into changed default and limiting
on ZoomWindow constructor and various mutators, and most
notably shifting the time span always into allowed domain.
...the implementation was way too naive; in some cases we could go
into an infinite loop. In the end, using Newton approximation was not
necessary (and thus there is no loop anymore), but it helped me get
at a much better solution with very small error margin on average case.
All these corner cases are obviously "academic" to some degree,
but it turns out there is no clear-cut point where you'd be able
just so set a limit and be sure that fractional integer arithmetic
works flawless in all cases.
Thus the choice is
- give up (fractional) integers and work with floats and have to
deal with error accumulation
- or do something as chosen here, namely add a boundary zone, where
fractional integer arithmetic can be kept under control, while admitting
small errors, and in turn get the absolutely precise integers in all
everyday standard cases
The value used previously was too conservative, and prevented ZommWindow
from zooming out to the complete Time domain. This was due to missing the
Time::SCALE denominator, which increaded the limit by factor 1e6
In fact the code is able to handle even this extremely reduced limit,
but doing so seems over the top, since now detox() kicks in on several
calculations, leading to rather coarse grained errors.
Thus I decided to use a compromise: lower the limit only by factor 1000;
with typical screen pixel widths, we can reach the full time domain,
while most scaling and zoom calculations can be performed precisely,
without detox() kicking in. Obviously this change requires adjusting
a lot of the test case expectations, since we can now zoom out maximally.
As it turns out, the calculation path initially choosen for the mutateScale(Rat)
was needlessly indirect, and also duplicated several of the safeguards,
meanwhile implemented way better in conformWindowToMetric(Rat)
Thus, instead of relatively re-scaling the window, now we just
limit the given zoomFactor and pass it to conformWindowToMetric()
There is a built-in limitation, which now is even
lowered to 100000 pixels horizontally.
With the techniques introduced in this changeset, it seems possible
to support more -- yet this would be a case of unnecessary genricity;
handling such large numbers will drive more computations into the
danger zone, and doing so incurs cost in terms of testing and debugging.
Placing that into context, contemporary displays are not even 4K on
average, and it does not look likely even for cinema display to go
way beyond 8k -- so yes, I want display hardware with 100000 pixels!!
The key takeaway of this changeset:
- can calculate px = trunc(zoomFactor * duration) step wise,
even when the direct calculation would lead to wrap-around
- can safely adjust and fix the zoomFactor using Newton approximation
...even zooming out to span the complete time domain (~19000 years).
But only under the condition that the display window is sufficiently
large in terms of pixels, so we can handle the computation without
glitches.
This should not be a relevant limitation in practice, since a window
size of some 100 pixels is enough to handle Duration::MAX. Needless to add
that it's hard to imagine a media timeline of such tremendous size...
building on these Library changes, plus the safe-add function
developed some days ago, it is now possible to mark a large displacement
as `time::Offset`, and apply this to yield any valid time position,
even extreme negative values
...building on these Library changes, plus the safe-add function
developed some days ago, it is now possible to mark a large displacement
as `time::Offset`, and apply this to yield any valid time position,
even extreme negative values
The APIs for time quantisation were drafted in an early stage of the project
and then never followed-up. Especially Grid::gridAlign has no
real-world usage yet, and is only massaged in some tests.
When looking at QuantiserBasics_test, I was puzzled and led astray,
since this function suggests to materialise a continuous time into
a quantised time -- which it doesn't (there is another dedicated
function Quantiser::materialise() to that end); so, without engaging
into the discussion if this function is of any use, I'll hereby
choose a name better reflecting what it does.
This is a deep refactoring to allow to represent the distance
between all valid time points as a time::Offset or time::Duration.
By design this is possible, since Time::MAX was defined as 1/30 of
the maximum value technically representable as int64_t. However,
introducing a different limiter for offsets and durations turns
out difficult, due to the inconsistencies in the exiting hierarchy
of temporal entities. Which in turn seems to stem from the unfortunate
decision to make time entities immutable, see #1261
Since the limiter is hard wired into the `time::TimeValue` constructor,
we are forced to create a "backdoor" of sorts, to pass up values
with different limiting from child classes. This would not be so
much of a problem if calculations weren't forced to go through `TimeVar`,
which does not distinguish between time points and time durations.
This solution rearranges all checks to be performed now by time::Offset,
while time::Duration will only take the absolute value at construction,
based on the fact that there is no valid construction path to yield
a duration which does not go through an offset first.
Later, when we're ready to sort out the implementation base of time values
(see #1258), this design issue should be revisited
- either we'll allow derived classes explicitly to invoke the limiter functions
- or we may be able to have an automatic conversion path from clearly
marked base implementation types, in which case we wouldn't use the
buildRaw_() and _raw() "backdoor" functions any more...
While the calculation was already basically under control, I just was not content
with the achieved numeric precision -- and the fact that the test case in fact
misses the bar, making it difficult do demonstrate that the calculation
is not derailed. I just had the gut feeling that it must be somehow possible
to achieve an absolute error level, not just a relative error level of 1e-6
Thus I reworked this part into a generic helper function, see #1262
The end result is:
* partial failure. we can only ''guarantee'' the relative error margin of 1e-6
* but in most cases not out to the most extreme numbers, the sophisticated
solution achieves much better results way below the absolute error level of 1µ-Tick
Thus with using rational numbers, we have now a solution that is absolutely precise
in the regular case, and gradually introduces errors at the domain bound
but with an guaranteed relative error margin of 1e-6 (== Time::SCALE)
...now able to achieve the expected error bound of 1e-6
...this seems to be the worst-case for ZoomWindow::setVisiblePos(factor)
for extremely large timeline; seemingly not possible to achieve the
goal set for this special test case
...in a similar vein as done for the product calculation.
In this case, we need to check the dimensions carefully and pick
the best calculation path, but as long as the overall result can
be represented, it should be possible to carry out the calculation
with fractional values, albeit introducing a small error.
As a follow-up, I have now also refactored the re-quantisation
functions, to be usable for general requantisation to another grid,
and I used these to replace the *naive* implementation of the
conversion FSecs -> µ-Grid, which caused a lot of integer-wrap-around
However, while the test now works basically without glitch or wrap,
the window position is still numerically of by 1e-6, which becomes
quite noticeably here due to the large overall span used for the test.
...using a requantisation trick to cancel out some factors in the
product of two rational numbers, allowing to calculate the product
without actual multiplication of (dangerously large) numbers.
with these additional safeguards, the anchorWindowAtPosition()
succeeds without Integer-wrap, but the result is not fully correct
(some further calculation error hidden somewhere??)
- detailed documentation of known problematic behaviour
when working with rational fractions
- demonstrate the heuristic predicate to detect dangerous numbers
- add extensive coverage and microbenchmarks for the integer-logarithm
implementation, based on an example on Stackoverflow. Surprising result:
The std::ilog(double) function is of comparable speed, at least for
GCC-8 on Debian-Buster.
Especially rational numbers with large denominator can be insidious,
since they might cause numeric overflow on seemingly harmless operations,
like adding a small number.
A solution might be to *requantise* the number into a different,
way smaller denominator. Obviously this is a lossy operation;
yet a small and controlled numeric error is always better than
an uncontrolled numeric wrap-around.
- protection against negative numbers seems adequate
- a possible concern are handling of very large time spans
- definitively have to guard against "poisonous" fractions
(e.g. n / INT_MAX)
- some test definitions were simply numerically wrong
- changed some aspects of the specified behaviour, to be more consistent
+ scrolling is more liberal and always allows to extend canvas
+ setting window to a given duration expands around anchor point
This function allows to move the visible range such that it contains
a given time position; the relative location of this point within
the visible range however is in turn determined by relating it to
the current overall canvas: if we are close to the beginning, the
position is also located rather to the left side, and if we're
approaching the canvas end, the position tends to the right side...
(and yes, I am aware that the variant taking a rational number
can be derailed by causing internal numeric overflow, when passing
a maliciously crafted rational number, like INT_MAX-1 / INT_MAX )
Rearrange the internal mutator functions to follow a common scheme,
so that most of the setters can be implemented by simple forwarding.
Move the change-listener triggering up into the actual setters.
This makes further test cases pass
- verify_setup
- verify_calibration
...implying that the pixel width is now retained
and basic behaviour matches expectations
Since conformWindowToMetric() is always called prior to performing
the complete invariant-reestablishment sequence, we can even integrate
the rule for relative scaling into this central function, which
simplifies the mutation implementation significantly. Should
relative positioning go south, the following sanity checks
will push the window back into bounds.
With these changes, the verify_simpleUsage() test passes!
Extensive tests with corner cases soon highlighted this problem
inherent to integer calculations with fractional numbers: it is
possible to derail the calculation by numeric overflow with values
not excessively large, but using large numbers as denominator.
This problem is typically triggered by addition and subtraction,
where you'd naively not expect any problems.
Thus changed the approach in the normalisation function, relying
on an explicitly coded test rather, and performing the adjustment
only after conversion back to simple integral micro-tick scale.
Getting all those requirements translated into code turns out to be a challenging task;
and the usual ascent to handle such a situation is to define **Invariants**
in conjunction with a normalisation scheme; each manipulation will then be
translated into invocation of one of the three fundamental mutators,
and these in turn always lead into the common normalisation sequence.
__Invariants__
- oriented and non-empty windows
- never alter given pxWidth
- zoom metric factor < max zoom
- visibleWindow ⊂ Canvas
Writing this specification unveiled a limitation of our internal
time base implementation, which is a 64bit microsecond grid.
As it turns out, any grid based time representation will always
be not precise enough to handle some relevant time specifications,
which are defined by a divisor. Most notably this affects the precise
display of frame duration in the GUI, and even more relevant,
the sample accurate editing of sound in the timeline.
Thus I decided to perform the internal computation in ZoomWindow
as rational numbers, based on boost::rational
Note: implementation stubbed only, test fails
This ZoomWindow_test highlights again the question about the intended usage
of the Lumiera time entities. In which way do we want to perform time calculations,
and under which circumstances is it adequate to perform arithmetic on
raw time values?
These questions made me think about rather far reaching concerns regarding
subsidiarity and implicit or explicit usage context. Basically I could
reconfirm the design choices taken some years ago -- while I must admit
that the project is headed towards a way larger scale and more loose
coupling of the parts, than I could imagine several years ago, at the
time when the design started...
As a side note: we can not avoid that some knowledge about the time implementation
leaks out from the support lib; time codes themselves are tightly coupled
to the usage scenario within the session and can not be used as means
for implementing UI concerns. And the more generic time frameworks,
like std::chrono (as much as it is desirable to have some integration here)
will not be of any help for most of our specific usage patterns.
The reason is, for film editing we do not have a global time scale,
rather the truth is when the film starts....
implement the first test case: nudge the zoom factor
⟹ scale factor doubled
⟹ visible window reduced to half size
⟹ visible window placed in the middle of the overall range
The solution is to provide a standard implementation in the form of a mix-in,
which directly houses a `ZoomWindow` instance. Moreover, the latter
is deemed a prominent use case for the time::Control, allowing other
components to attach and push changes of the zoom state or register
as listeners to react to state changes.
Actually, the `TimelineLayout`, which hosts all the actual visible
widgets forming the timeline-UI, now integrates this mix-in; and since
`TimelineLayout` is passed to `TimelineController` and used there as
reference-`CanvasHook` for the root track, this implementation of
the `DisplayMetric` interface will ''effectively be used by all
widgets'' attached to the timeline canvas.
Reading my work notes from two years ago, the concept can be validated.
Clarify the relation of the interfaces involved, and the role foreseen
for the upcoming `ZoomWindow` abstraction. This solution approach
will lead to multiple-stage indirect calls, which however are deemed
not to be overly concerning and will be investigated later, to avoid
premature optimisation (see #1254)
- `DisplayMetric` is a focused special purpose abstraction
- it belongs into the general abstraction of the `DisplayManager`
- it is rather linked by use to the other abstraction, the `CanvasHook`
- while the `RelativeCanvasHook` is not an interface, but an implementation mix-in
- and the actual `DisplayMetric` implementation can likewise be provided
as mix-in, since it will typically be implemented in terms of a `ZoomWindow`
Using this scheme, it will be possible to avoid some of the indirect cally
by making this mix-in visible higher up the call graph -- in case the
actual need for optimisation can be confirmed in practice.
* restructure the widgets used to implement ElementBox
* inject a Gtk::EventBox top-level base type to capture all Gdk-Events
* push the Gtk::Frame one level down (TODO: API for managing children)
With these changes
* dragging of Clips in the timeline works as expected
* size constraints are observed precisely
* all warnings and assertion failures from GTK disappeared
Thus we can conclude that the solution approach for size constrained widgets
was successful and this challenging problem is solved.
...as it turned out, the ClipWidget did still invoke the
Gtk::Frame::set_label() function, thereby deactivate our
elaborate custom IDLabel and showing the label text
unconditionally instead (also violating our size constraint)
...as it turns out, this code basically works already when the
widget is not(yet) realized:
- when a widget is hidden, it responds with size=0
- when a widget is shown, it reponds with proper or at least
preliminary size requirement, irrespective if already drawn
After injecting the diff, the widgets are created and then adjusted
in several steps; however, this code all executes from within a single
call to the UI-bus, and thus just piles up a sequence of realize()
and resize() messages, which are only executed later, in case the
Application-UI as a whole is visible on screen.
*Remaining Problems*:
- size-constraint code not working correct in all cases
- dragging works only on the buttons, not on the background
According to plan, this was more or less a drop-in replacement.
However, this first integration prototype highlights some design problems
* `ElementBoxWidget` is designed ''constructor-centric''
* but the population by diff messages will supply crucial information later
* and seemingly the size-constraint code is now invoked prior to widget realisation \\
⟹ Assertion Failure
- Test the new layout code with debugger + dump messages
- Experiment: live changes to the name-ID content
(send msg. "manip" -> Text changed and Layout properly revalidated)
devise a more fine grained algorithm for adapting the display of IDLabel
to a situation with size constrained layout, e.g. for a time calibrated canvas.
We still do not implement the shortening of ID labels (see #1242),
since doing so would be surprisingly expensive. But at least we
do proceed in several steps now
- first attempt to reduce the name-ID (for now: hide it)
- if this doesn't suffice, also hide the menu
- and as a last resort, hide the icon and thus make the IDLabel empty
We are using buttons now, but the standard theme introduces a lot of padding arount button's contents.
Thus we need to consider ways to address the compound of widgets forming an ElementBox; moreover,
this is the classical situation where the BEM notation helps to clarify the intention....
The problem leading to custom styling here is the padding within buttons;
the default stylesheet seemingly adds a min-width and min-height setting,
and some padding within the Button; based on systematic CSS class names,
it is possible to remove these settings specifically for buttons
within the IDLabel in general (no need to treat only the case of an EventBoxLabel
-- IDLabel could become a custom widget on its own
as it turns out, this is a self-contained separate concern,
and thus this arrangement of two icons plus a caption shall
now manage itself as a custom widget.
And while touching this subject, I have also reconsidered
the purpose and arrangement of those icons and completed
the specification with some decisions...
- context menus will be left-click, selection right-click (Blender!)
- we will always show those two icons, just allocate different graphics
- when there is no expander, the 2nd icon will just serve to open the menu
- so the button is almost redundant in that case (except when dragging)
Further extended GTK code survey to clarify the role of the minimum_size,
it is indeed ignored by most standard containers, but it is actually
used by Gtk::Layout as starting point for the query sequence. Thus
it does not make sense to treat minimum and natural size differently;
both queries should be responded by returning our size constraint.
Unless we define additional borders and margins in the CSS, we can be sure
that GTK will base the size allocation on the exact values returned
from the get_required_* functions.
These functions will be invoked only from within the Event Loop
and after the ctor is finished, but before the first "draw".
They will be re-invoked on each "size change" event and on each
focus change (since a focus change may change the style and thus
the actual extension).
...the key point is to ask the embedded box holding the label
about it's preferred_size() -- this info is updated immediately,
even at begin, when the nested child widgets did not yet receive
an allocation.
Even while the preferred-size is something different than the
actual allocation, it will always be smaller and is thus sufficient
to decide if the size constraint can be met
The header "format-cout.hpp" offers a convenience function
to print pretty much any object or data in human readable form.
However, the formatter for pointers used within this framework
switched std::cout into hexadecimal display of numbers and failed
to clean-up this state.
Since the "stickyness" of IOS stream manipulators is generally a problem,
we now provide a RAII helper to capture the previous stream state and
automatically restore it when leaving the scope.
...does not work reliably yet...
- on first invocation, the child allocation is still zero
- later on, there seem to be lots of further invocations,
always when the application window gains focus
- these further invocations somehow change the visible extension
of the widget's background
identify the various dimensions, which require flexibility
to support the intended use cases; try to come up with a
design draft, allowing to settle on a preliminary version
soon, while not hampering further development later on.
Obviously this is a very deep and challenging topic,
and we're far from even remotely addressing it adequately;
we just need to get to the point to use this drafted version
as building block, since these usages will then push us further
into the right direction...
The flexible custom styling yet needs to be definied.
Just adding a stock icon and a standard sized label field for now.
Widget can be constructed and successfully attached to a track.
Complete the investigation and turn the solution into a generic
mix-in-template, which can be used in flexible ways to support
this qualifier notation.
Moreover, recapitulate requirements for the ElementBoxWidget
Basically we want to create ElementBoxWidgets according to a
preconfigured layout scheme, yet we'll need to pass some additional
qualifiers and optional features, and these need to be checked
and used in accordance with the chosen flavour...
Investigating a possible solution based on additional ctor parameters,
which are given as "algebraic terms", and actually wrap a functor
to manipulate a builder configuration record
Seems to work solid now, after switching to the root coordinates provided by GDK.
With local relative coordinates, the subject fidgets while being dragged,
for obvious reasons, since we're shifting the relative point of reference.
Also clarified a strange behaviour of the test drawing code:
Cairo is "turtle graphics", so we need to set the starting point explicitly.
...well, the metric translation is not quite correct,
so it doesn't yet stick to the mouse. But all the challenging
problems within the framework for implementing such a generic
gesture seem to be solved now.
The ClipPresenter can access the CanvasHook wired into its actual ClipDelegate (widget).
And this in turn exposes the DisplayMetric, with the ability to transform
presentation coordinates (pixels) into a model representation (Time)
The actual translation is still hardwired placeholder code,
since it is planned to build an generic component "ZoomWindow"
to provide all the typical zomming and view window translations
found in every timeline editor
- move construct into the buffer
- directly invoke the payload constructor through PlantingHandle
- reconsider type signature and size constraint
- extend the unit test
- document a corner case of c++ "perfect forwarding",
which caused me some grief here
...this extension was spurred by the previeous refactoring.
Since 'emplace' now clearly denotes an operation to move-embed an existing object,
we could as well offer a separate 'create' API, which would take forwarding
arguments as usual and just delegates to the placement-new operation 'create'
already available in the InPlaceBuffer class.
Such would be a convenience shortcut and is not strictly necessary,
since move-construction is typically optimised away; yet it would also
allow to support strictly non-copyable payload types.
This refactoring also highlights a fuzziness in the existing design,
where we just passed the interface type, while being sloppy about the
DEFAULT type. In fact this *is* relevant, since any kind of construction
might fail, necessitating to default-construct a placeholder, since
InPlaceBuffer was intended for zero-overhead usage and thus has in itself
no means to know about the state of its buffer's contents. Thus the
only sane contract is that there is always a valid object emplaced
into the buffer, which in turn forces us to provide a loophole for
class hierarchies with an abstract base class -- in such a case the
user has to provide a fallback type explicitly.
...for the operation on a PlantingHandle, which allows
to implant a sub type instance into the opaque buffer.
* "create" should be used for a constructor invocation
* "emplace" takes an existing object and move-constructs
this allows to avoid multi-step indirection
when translating mouse dragging pixel coordinates
into a time offset for the dragged clip widget.
Moreover this also improves the design,
since the handling of canvas metric is pretty much
a self contained, separate concern
...previously this was modelled as part of the CanvasHook abstraction,
and in fact it will in any case be implemented by delegation to the
TimelineLayout or some kind of display manager.
We need this to tanslate mouse pixel movements into a time change
while dragging, effectively we have to translate a mouse position delta
into a TimeValue delta, and we want to avoid direct coupling to some
timeline display manager, to keep the gesture logic mostly generic.
some bugfixes,
but also a notable change: detect the completion of the gesture
directly when the button is released; this is necessary, because
seemingly we do not get motion_events when no button is pressed,
at least not in this test setup based on a Gtk::Button widget.
...because this is a prototype, but should fit in
with a future frameworks to handle complex interactions and gestures.
And no, we can not afford to rely on a UI toolkit for such a core concern
It is impossible that a framework like e.g. GTK will allow us to
support a custom made hardware controller and integrate it seamlessly
into getsture handling, thereby following a design philosophy that
is in accordance with our fundamental decisions.
...found out that GTK already implements an "implicit grab",
and thus the tricky situation that the mouse slides off the widget
can not happen at all; so in the end it's rather easy to build a trigger
for a dragging gesture.
The demo code is now activated only after the button is down
and just prints the position...
PS: did some research regarding the new Coroutines in C++
Setup the scaffolding necessary to get at the actual clip widget
and to establish a signal connection to the button_pressed signal.
The intention is to watch this in conjunction with mouse movements
for detection of the actual gesture.
At the moment, I am using button widgets as placeholder for the actual
clip widgets (not yet implemented...). And, as a tiny little success,
these buttons now invoke the gesture controller on right click
(left click is seemingly consumed by the button itself)
thus far my implementation concept seems to work as intended....
note: when populating the timeline with actual Clips,
the not-yet implemented linkSubject()-Function of the DragRelocateController
gets invoked (as it should), thereby killing Lumiera