The »Scheduler Service« will be assembled
from the components developed during the last months
- Layer-1
- Layer-2
- Activity-Language
- Block-Flow
- Work-Force
* the implementation logic of the Scheduler is essentially complete now
* all functionality necessary for the worker-function has been demonstrated
As the next step, the »Scheduler Service« can be assembled from the two
Implementation Layers, the Activity-Language and the `BlockFlow` allocator.
This should then be verified by a multi-threaded integration test...
This central operation sits at a crossroads and is used
- from external clients to feed new work to the Scheduler
- from Workers to engage in execution of the next Activity
- recursively from the execution of an Activity-chain
From these requirements the semantics of behaviour can be derived
regarding the GroomingToken and the result values, which indicate
when follow-up work should be processed.
The thread holding the »GroomingToken« is permitted to
manipulate scheduler internals and thus also to define new
activities; this logic is implemented as an atomic lock,
based on the current thread's ID.
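The following sketch illustrates such a grooming lock as an atomic compare-and-swap
on the current thread's ID; the class and function names here are assumptions
for illustration, not the actual Scheduler code.
```
#include <atomic>
#include <thread>

class GroomingToken
  {
    std::atomic<std::thread::id> holder_{};      // default constructed id ≙ »no thread«
    
  public:
    /** attempt to acquire the token for the current thread */
    bool
    acquire()
      {
        std::thread::id free{};                  // expect the token to be unowned
        return holder_.compare_exchange_strong (free, std::this_thread::get_id());
      }
    
    /** relinquish the token, but only if we actually hold it */
    void
    release()
      {
        std::thread::id self = std::this_thread::get_id();
        holder_.compare_exchange_strong (self, std::thread::id{});
      }
    
    /** may the current thread manipulate scheduler internals? */
    bool
    holdsToken()  const
      {
        return holder_.load() == std::this_thread::get_id();
      }
  };
```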
Notably both Layers are conceived as functionality providers;
only at Scheduler top-level will functionality be combined with
external dependencies to create the actual service.
At first sight, this seems confusing; there is a time window,
there is sometimes a `when` parameter, and mostly a `now` parameter
is passed through the activation chain.
However, taking the operational semantics into account, the existing
definitions seem to be (mostly) adequate already: The scheduler is
assumed to activate a chain only ''when'' the defined start time is reached.
As a follow-up to the rework of thread-handling, the implementation
base for locking was likewise switched over from direct usage of
POSIX primitives to the portable wrappers available in
the C++ standard library. All usages have been reviewed and
modernised to prefer λ-functions where possible.
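A minimal sketch of the target style, based on the standard library wrappers;
the class shown here is a made-up example, not actual Lumiera code.
```
#include <mutex>
#include <condition_variable>

class JobQueue
  {
    std::mutex              mtx_;        // replaces pthread_mutex_t
    std::condition_variable cond_;       // replaces pthread_cond_t
    bool                    hasWork_ = false;
    
  public:
    void
    signalWork()
      {
        {
          std::lock_guard<std::mutex> guard{mtx_};  // scoped locking instead of explicit lock/unlock
          hasWork_ = true;
        }
        cond_.notify_one();
      }
    
    void
    awaitWork()
      {
        std::unique_lock<std::mutex> lock{mtx_};
        cond_.wait (lock, [this]{ return hasWork_; });  // wait predicate given as λ
      }
  };
```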
With this series of changes, the old threadpool implementation
and a lot of further low-level support facilities are not used
any more and can be dismantled. Due to the integration efforts
spurred by the »Playback Vertical Slice«, several questions of
architecture could be decided over the last months. The design
of the Scheduler and Engine turned out differently than previously
anticipated; notably the Scheduler now covers a wider array of
functionality, including some asynchronous messaging. This has
ramifications for the organisation of work tasks and threads,
and leads to more deterministic memory management. Resource
management will be done on a higher level, partially superseding
some of the concepts from the early phase of the Lumiera project.
This is Step-2: change the API towards the application.
Notably, all invocation variants supporting member functions
or a reference to bool flags are retracted, since today a
λ-binding directly at the usage site tends to be more readable.
The function names are harmonised with the C++ standard, and
emergency shutdown in the Subsystem-Runner is rationalised.
The old thread-wrapper test is repurposed to demonstrate
the effectiveness of monitor-based locking.
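To illustrate the intended usage style, a hedged sketch follows; the Thread
wrapper shown here is a simplified stand-in and the actual wrapper's signature
and behaviour may differ.
```
#include <atomic>
#include <thread>
#include <utility>

// stand-in Thread wrapper, joining on destruction
struct Thread
  {
    std::thread impl_;
    
    template<class FUN>
    Thread (FUN&& fun) : impl_{std::forward<FUN> (fun)} { }
   ~Thread()  { if (impl_.joinable()) impl_.join(); }
  };

struct Render
  {
    std::atomic<bool> busy{false};
    void renderFrame() { /* ... */ }
    
    void
    run()
      { // previously a dedicated invocation variant would bind a member function
        // and a bool-flag reference; now a λ at the usage site captures both
        Thread worker{[this]{
                              busy = true;
                              renderFrame();
                              busy = false;
                            }};
      }               // worker joins when leaving scope
  };
```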
The WorkForce (passive worker pool) has been coded just recently,
and -- in anticipation of this refactoring -- directly against std::thread
instead of using the old framework.
...the switch is straightforward, using the default case
...add the ability to decorate the thread-IDs with a running counter
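A minimal sketch of such a decoration, assuming a hypothetical helper function:
```
#include <atomic>
#include <string>

// decorate a base name with a running counter to obtain
// distinguishable thread-IDs like "Worker.1", "Worker.2", ...
inline std::string
decorate (std::string const& baseName)
{
  static std::atomic<unsigned> counter{0};
  return baseName + "." + std::to_string (++counter);
}
```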
The investigation for #1279 leads to the following conclusions:
- the features and the design of our custom thread-wrapper
almost entirely match the design chosen meanwhile by the C++ committee
- the implementation provided by the standard library however uses
modern techniques (especially Atomics) and is more precisely worked out
than our custom implementation was.
- we do not need an *active* threadpool with work-assignment,
rather we'll use *active* workers and a *passive* pool,
which was easy to implement based on C++17 features
==> decision to drop our POSIX-based custom implementation
and to retrofit the Thread-wrapper as a drop-in replacement
+++ start this refactoring by moving code into the Library
+++ create a copy of the Threadwrapper-code to build and test
the refactorings while the application itself still uses
existing code, until the transition is complete
...which however brings the problem that we can no longer block the destructor
of WorkForce by simply joining on all joinable threads (there is a race
between testing joinable() and invoking join(), which does not tolerate
a non-joinable state).
There is a second problem: we need to detect and clean up terminated workers,
if only to find out how many workers are still active. Fortunately,
doing so also solves the waiting problem in the destructor.
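One possible shape of this scheme, sketched under assumptions (worker threads
are detached and de-register themselves; all names here are illustrative):
```
#include <atomic>
#include <chrono>
#include <cstddef>
#include <thread>

class WorkForce
  {
    std::atomic<std::size_t> active_{0};
    
  public:
    template<class FUN>
    void
    launchWorker (FUN workLoop)
      {
        ++active_;
        std::thread{[this, workLoop]
                      {
                        workLoop();   // runs until the work functor signals halt
                        --active_;    // self-deregistration on termination
                      }}.detach();
      }
    
   ~WorkForce()
      { // block until all workers have de-registered themselves
        while (0 < active_.load (std::memory_order_acquire))
          std::this_thread::sleep_for (std::chrono::milliseconds(1));
      }
    
    std::size_t size()  const { return active_.load(); }
  };
```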
While in principle it would be possible (and desirable)
to control worker behaviour exclusively through the Work-Functor's return code,
in practice we must concede that Exceptions can always happen in situations
beyond our control. And while it is necessary for the WorkForce-dtor to
join and block (we cannot just pull away the resources from running threads),
the same destructor (when called out of order) must somehow be able
at least to ask the running threads to terminate.
Especially for unit tests this becomes an obnoxious problem -- otherwise
each test failure would cause the test runner to hang.
Thus an emergency halt is added, and the setup for tests is also improved
with a convenience function to inject a work-function-λ.
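A sketch of the resulting worker loop, with assumed names for the control
enum and the emergency flag (not the actual WorkForce code):
```
#include <atomic>

enum class Act { CONTINUE, HALT };

// the worker keeps pulling work as long as the work functor says CONTINUE
// and no emergency halt was requested; exceptions likewise terminate this
// worker instead of crashing the pool
template<class FUN>
void
workerLoop (FUN doWork, std::atomic<bool> const& emergency)
{
  try {
      while (not emergency.load (std::memory_order_relaxed)
             and Act::CONTINUE == doWork())
        { /* keep pulling work */ }
    }
  catch (...)
    { /* terminate this worker, leave the others running */ }
}

// for unit tests, a work-function-λ can be injected directly, e.g.
//   workerLoop ([&]{ return fakeJob()? Act::CONTINUE : Act::HALT; }, emergencyFlag);
```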
- investigate consistency guarantees through acquire-release
==> turns out we do not need a fence, but it is essential
to have a guard variable and actually load and check
the value to ensure we indeed get a happens-before (see the sketch after this list)
- elaborate design of the WorkForce
+ no shared control variables necessary
+ no ability to forcibly shut down the WorkForce
+ rather, all control will be exerted through the return value
of the Work-Functor
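A sketch of the acquire-release guard mentioned above (variable names are illustrative):
```
#include <atomic>
#include <cassert>

int               payload = 0;
std::atomic<bool> ready{false};

void
producer()
{
  payload = 42;                                    // plain write ...
  ready.store (true, std::memory_order_release);   // ... published by the release-store
}

void
consumer()
{
  if (ready.load (std::memory_order_acquire))      // acquire-load, value actually checked
    assert (42 == payload);                        // ⟹ guaranteed to observe the write
}
```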
Up to now, the DiagnosticFun mock in ActivityDetector only
created an EventLog entry on invocation and was able to return
a canned result value. Yet for the job invocation scenario test,
it would be desirable to hook a λ with a fake implementation
into the ExecutionContext. As a further convenience, the
return value is now default initialised, instead of being
marked as uninitialised until invocation of "returning(val)"
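A hypothetical outline of such a mock, just to illustrate the described
behaviour; the actual DiagnosticFun interface may differ in the details.
```
#include <functional>
#include <utility>

template<typename RES, typename... ARGS>
class DiagnosticMock
  {
    RES                         result_{};     // default initialised return value
    std::function<RES(ARGS...)> fakeImpl_;     // optionally hooked-in fake implementation λ
    
  public:
    DiagnosticMock& returning (RES val)                           { result_ = val;  return *this; }
    DiagnosticMock& implementedAs (std::function<RES(ARGS...)> f) { fakeImpl_ = f;  return *this; }
    
    RES
    operator() (ARGS ...args)
      { // a real mock would additionally write an EventLog entry here
        return fakeImpl_? fakeImpl_(std::forward<ARGS> (args)...)
                        : result_;
      }
  };
```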
...regarding the kind of activity (the verb),
and also for some special-case access of payload data;
deliberately asserting the correct verb, but without a mandatory check,
since this whole Activity-Language is conceived as cohesive
and essentially sealed (not meant to be extended)
It is not sufficient just to pass this "current time" as parameter
into the ActivityLang::dispatchChain(), since some Activities within
this chain will essentially be long-running (think rendering); thus
we need a real callback from within the chain. The obvious solution
is to make this part of the Execution Context, which is an abstraction
of the scheduler environment anyway.
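A sketch of this idea, with assumed type and member names standing in for the
actual interface:
```
#include <chrono>

using Time = std::chrono::steady_clock::time_point;   // stand-in for the real time type

struct ExecutionCtx
  {
    virtual ~ExecutionCtx() { }
    
    /** callback: the current time as seen by the Scheduler */
    virtual Time getSchedTime()  =0;
    
    // ... further hooks (posting follow-up Activities, work invocation, ...)
  };

// a long-running Activity re-queries the time through this callback,
// instead of relying on the `now` value passed into ActivityLang::dispatchChain()
```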
...turns out there is still a lot of leeway in the possible implementation,
and seemingly it is too early to decide which case to consider the default.
Thus I'll proceed with the drafted preliminary solution...
- on the primary-chain, an inhibited Gate dispatches itself into the future for a re-check
- on Notification, activation happens if and only if this very notification opens the Gate
- provide a specifically wired requireDirectActivation() to allow enforcing a minimal start time
While the ''general direction'' seems clear, some in-depth
analysis was required to find out what information can reasonably
be expected to be available at this point.
The decision was made to shift the actual deadline calculation
into the Job-Planning altogether, assuming that a preliminary solution
based on data implicitly available there will be enough to implement
simple linear playback, while precise management of job start times
can be added later, once observation of actual timing behaviour
becomes available...
Solved by special treatment of a notification, which happens
to decrement the latch to zero: in this case, the chain is
dispatched, but also the Gate is locked permanently to block
any further activations scheduled or forwarded otherwise.
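A sketch of this latch handling, with illustrative names (not the actual Gate implementation):
```
#include <limits>

class Gate
  {
    int latch_;
    static constexpr int LOCKED = std::numeric_limits<int>::min();
    
  public:
    explicit Gate (int prerequisites) : latch_{prerequisites} { }
    
    /** @return true if this very notification opens the Gate */
    bool
    notify()
      {
        if (latch_ == LOCKED or --latch_ > 0)
          return false;        // still blocked, or already dispatched
        latch_ = LOCKED;       // block all further activations permanently
        return true;           // caller dispatches the chain exactly once
      }
  };
```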
TODO: while correct as implemented, the handling of the
notification seems questionable: re-scheduling the chain immediately
may lead to multiple invocations, because the chain might have been "spinned"
and thus re-scheduled already, and we have no way to find out about that.