Commit graph

27 commits

Author SHA1 Message Date
0b9e184fa3 Library: replace usages of rand() in the whole code base
* most usages are drop-in replacements
 * occasionally the other convenience functions can be used
 * verify call-paths from core code to identify usages
 * ensure reseeding for all tests involving some kind of randomness...

__Note__: some tests were not yet converted,
since their usage of randomness is actually not thread-safe.
This problem existed previously, since also `rand()` is not thread safe,
albeit in most cases it is possible to ignore this problem, as
''garbled internal state'' is also somehow „random“
2024-11-13 04:23:46 +01:00
af680cdfd9 Scheduler-test: adapt tests to changed logic at entrance
- now there can not be any direct dispatch anymore when entering events
- thus there is no decision logic at entrance anymore
- rather the work-function implementation moved down into Layer-2
- so add a unit-test like coverage there (integration in SchedulerService_test)
2023-12-27 00:16:03 +01:00
09f0e92ea3 Scheduler-test: reorganise planning-job entrance and coordination
This amounts to a rather massive refactoring, prompted by the enduring problems
observed when pressing the scheduler. All the various glitches and (fixed) crashes
are related to the way how planning-jobs enter the schedule items,
which is also closely tied to the difficulties getting the locking
for planning-jobs correct.

The solution pursued hereby is to reorder the main avenues into the
scheduler implementation. There is now a streamlined main entrance,
which **always** enqueues only, allowing to omit most checks and
coordination. On the other hand, the complete coordination and dispatch
of the work capacity is now shifted down into the SchedulerCommutator,
thereby linking all coordination and access control close together
into a single implementation facility.

If this works out as intended
 - several repeated checks on the Grooming-Token could be omitted (performance)
 - the planning-job would no longer be able to loose / drop the Token,
   thereby running enforcedly single-threaded (as was the original intention)
 - since all planning effectively originates from planning-jobs, this
   would allow to omit many safety barriers and complexities at the
   scheduler entrance avenue, since now all entries just go into the queue.

WIP: tests pass compiler, but must be adapted / reworked
2023-12-26 03:06:30 +01:00
707fbc2933 Scheduler-test: implement contention mitigation scheme
while my basic assessment is still that contention will not play a significant
role given the expected real world usage scenario — when testing with
tighter schedule and rather short jobs (500µs), some phases of massive contention
can be observed, leading to significant slow-down of the test.

The major problem seems to be that extended phases of contention will
effectively cause several workers to remain in an active spinning-loop for
multiple microseconds, while also permanently reading the atomic lock.

Thus an adaptive scheme is introduced: after some repeated contention events,
workers now throttle down by themselves, with polling delays increased
with exponential stepping up to 2ms. This turns out to be surprisingly
effective and completely removes any observed delays in the test setup.
2023-12-20 20:25:17 +01:00
b497980522 Scheduler-test: guard memory allocations by grooming-token
Turns out that we need to implemented fine grained and explicit handling logic
to ensure that Activity planning only ever happens protected by the Grooming-Token.
This is in accordance to the original design, which dictates that all management tasks
must be done in »management mode«, which can only be entered by a single thread at a time.
The underlying assumption is that the effort for management work is dwarfed in comparison
to any media calculation work.

However, in
5c6354882d
...I discovered an insidious border condition, an in an attempt to fix it,
I broke that fundamental assumpton. The problem arises from the fact that we
do want to expose a *public API* of the Scheduler. Even while this is only used
to ''seed'' a calculation stream, because any further planning- and management work
will be performed by the workers themselves (this is a design decision, we do not
employ a "scheduler thread")
Anyway, since the Scheduler API ''is'' public, ''someone from the outside'' could
invoke those functions, and — unaware of any Scheduler internals — will
automatically acquire the Grooming-Token, yet never release it,
leading to deadlock.

So we need a dedicated solution, which is hereby implemented as a
scoped guard: in the standard case, the caller is a management-job and
thus already holds the token (and nothing must be done). But in the
rare case of an »outsider«, this guard now ''transparently'' acquires
the token (possibly with a blocking wait) and ''drops it when leaving scope''
2023-12-19 23:38:57 +01:00
892099412c Scheduler: integrate sanity check on timings
...especially to prevent a deadline way too far into the future,
since this would provoke the BlockFlow (epoch based) memory manager
to run out of space.

Just based on gut feeling, I am now imposing a limit of 20seconds,
which, given current parametrisation, with a minimum spacing of 6.6ms
and 500 Activities per Block would at maximum require 360 MiB for
the Activities, or 3000 Blocks. With *that much* blocks, the
linear search would degrade horribly anyway...
2023-11-07 18:37:20 +01:00
86b90fbf84 Scheduler: draft high-level API for building a Job schedule
The invocation structure is effectively determined by the
Activity-chain builder from the Activity-Language; but, taking
into account the complexity of the Scheduler code developed thus far,
it seems prudent to encapsulate the topic of "Activities" altogether
and expose only a convenience builder-API towards the Job-Planning
2023-11-06 06:00:00 +01:00
72258c06bd Scheduler: reconciled into clearer design
The problem with passing the deadline was just a blatant symptom
that something with the overall design was not quite right, leading
to mix-up of interfaces and implementation functions, and more and more
detail parameters spreading throughout the call chains.

The turning point was to realise the two conceptual levels
crossing and interconnected within the »Scheduler-Service«

- the Activity-Language describes the patterns of processing
- the Scheduler components handle time-bound events

So by turning the (previously private) queue entry into an
ActivationEvent, the design could be balanced.
This record becomes the common agens within the Scheduler,
and builds upon / layers on top of the common agens of the
Language, which is the Activity record.
2023-11-04 04:49:13 +01:00
b49de0738d Scheduler: implement automatic clean-up of outdated entries
Hooked into the existing processing logic at Layer-2,
and relying on the information functions of Layer-1
2023-11-03 01:17:10 +01:00
b1e0ce1a79 Scheduler: define expected filtering behaviour for significant tasks 2023-11-03 00:31:33 +01:00
6166ab63f2 Scheduler: complete handling of the grooming-token
- Ensure the grooming-token (lock) is reliably dropped
- also explicitly drop it prior to trageted sleeps
- properly signal when not able to acquire the token before dispatch

- amend tests broken by changes since yesterday
2023-10-28 05:35:35 +02:00
a21057bdf2 Scheduler: control structure for the worker-functor 2023-10-22 23:25:35 +02:00
9ce3ad3d72 Scheduler: Layer-2 complete and tested (see #1326)
* the implementation logic of the Scheduler is essentially complete now
 * all functionality necessary for the worker-function has been demonstrated

As next step, the »Scheduler Service« can be assembled from the two
Implementation Layers, the Activity-Language and the `BlockFlow` allocator
This should then be verified by a multi-threaded integration test...
2023-10-19 01:49:08 +02:00
10a2c6908c Scheduler: Layer-2 integration scenario complete
could even rig the diagnostic Execution-Ctx
to drop the GroomingToken at the point when switching to work-mode
2023-10-18 23:02:29 +02:00
c2ddaed28e Scheduler: draft scenario for Layer-2 integration test
Idea: re-use the scenario and instrumentation from
SchedulerActivity_test::scenario_RenderJob()
2023-10-18 18:10:10 +02:00
ee09a2eff2 Scheduler: completed implementation of Layer-2
...some further checks
...one integration test case needs to be written
2023-10-18 17:29:41 +02:00
93fcebb331 Scheduler: implement and verify postDispatch 2023-10-18 16:39:08 +02:00
666546856f Scheduler: design the core API operation - postDispatch
This central operation sits at a crossroad and is used
- from external clients to fed new work to the Scheduler
- from Workers to engage into execution of the next Activity
- recursively from the execution of an Activity-chain

From these requirements the semantics of behaviour can be derived
regarding the GroomingToken and the result values, which indicate
when follow-up work should be processed
2023-10-18 15:50:11 +02:00
55967cd649 Scheduler: work retrieval implementation
- simple approach, delegating to Layer-1
- deliberately no error handling
- GroomingToken not dropped
2023-10-18 04:18:01 +02:00
b57503fb97 Scheduler: define expected behaviour for work retrieval
still not quite sure how to implement it,
but working down from first principles to define test scenarios first...
2023-10-18 02:59:58 +02:00
aa60869082 Scheduler: decision logic for actual dispatch of activities 2023-10-18 01:38:58 +02:00
fa391d1267 Scheduler: torture test the thread access logic
Ensure the GroomingToken mechanism indeed creates an
exclusive section protected against concurrent corruption:

Use a without / with-protection test and verify
the results are exact vs. grossly broken
2023-10-17 21:35:37 +02:00
1223772f14 Scheduler: implement thread access logic
T thread holding the »Grooming Token" is permitted to
manipulate scheduler internals and thus also to define new
activities; this logic is implemented as an Atomic lock,
based on the current thread's ID.
2023-10-17 20:37:32 +02:00
862933e809 Scheduler: define API for Layer-2
Notably both Layers are conceived as functionality providers;
only at Scheduler top-Level will functionality be combined with
external dependencies to create the actual service.
2023-10-17 19:20:53 +02:00
997fc36c81 Workforce: implementation complete 2023-09-09 23:42:13 +02:00
70cd8af806 Workforce: requirement analysis 2023-09-05 00:22:17 +02:00
23a6fbdf4f Scheduler: investigate modes of operation
- analysis of Activity usage
- derive possible memory management schemes
- research regarding asynchronous IO
- decision regarding the memory management scheme
2023-07-03 18:40:37 +02:00