LUMIERA.clone

Author	SHA1	Message	Date
Ichthyostega	bf41474004	Library: investigate Scheduler test failures ...which turn out not to be due to the PRNG changes * the SchedulerCommutator_test was inadvertently broken 2024-04-10 * SchedulerStress_test simply runs for 4min, which is not tolerated by our Testsuite setup see also: `5b62438eb`	2024-11-15 02:20:36 +01:00
Ichthyostega	0b9e184fa3	Library: replace usages of `rand()` in the whole code base * most usages are drop-in replacements * occasionally the other convenience functions can be used * verify call-paths from core code to identify usages * ensure reseeding for all tests involving some kind of randomness... __Note__: some tests were not yet converted, since their usage of randomness is actually not thread-safe. This problem existed previously, since also `rand()` is not thread safe, albeit in most cases it is possible to ignore this problem, as ''garbled internal state'' is also somehow „random“	2024-11-13 04:23:46 +01:00
Ichthyostega	af680cdfd9	Scheduler-test: adapt tests to changed logic at entrance - now there can not be any direct dispatch anymore when entering events - thus there is no decision logic at entrance anymore - rather the work-function implementation moved down into Layer-2 - so add a unit-test like coverage there (integration in SchedulerService_test)	2023-12-27 00:16:03 +01:00
Ichthyostega	09f0e92ea3	Scheduler-test: reorganise planning-job entrance and coordination This amounts to a rather massive refactoring, prompted by the enduring problems observed when pressing the scheduler. All the various glitches and (fixed) crashes are related to the way how planning-jobs enter the schedule items, which is also closely tied to the difficulties getting the locking for planning-jobs correct. The solution pursued hereby is to reorder the main avenues into the scheduler implementation. There is now a streamlined main entrance, which always enqueues only, allowing to omit most checks and coordination. On the other hand, the complete coordination and dispatch of the work capacity is now shifted down into the SchedulerCommutator, thereby linking all coordination and access control close together into a single implementation facility. If this works out as intended - several repeated checks on the Grooming-Token could be omitted (performance) - the planning-job would no longer be able to lose / drop the Token, thereby running enforcedly single-threaded (as was the original intention) - since all planning effectively originates from planning-jobs, this would allow to omit many safety barriers and complexities at the scheduler entrance avenue, since now all entries just go into the queue. WIP: tests pass compiler, but must be adapted / reworked	2023-12-26 03:06:30 +01:00
Ichthyostega	707fbc2933	Scheduler-test: implement contention mitigation scheme while my basic assessment is still that contention will not play a significant role given the expected real world usage scenario — when testing with tighter schedule and rather short jobs (500µs), some phases of massive contention can be observed, leading to significant slow-down of the test. The major problem seems to be that extended phases of contention will effectively cause several workers to remain in an active spinning-loop for multiple microseconds, while also permanently reading the atomic lock. Thus an adaptive scheme is introduced: after some repeated contention events, workers now throttle down by themselves, with polling delays increased with exponential stepping up to 2ms. This turns out to be surprisingly effective and completely removes any observed delays in the test setup.	2023-12-20 20:25:17 +01:00
Ichthyostega	b497980522	Scheduler-test: guard memory allocations by grooming-token Turns out that we need to implement fine grained and explicit handling logic to ensure that Activity planning only ever happens protected by the Grooming-Token. This is in accordance to the original design, which dictates that all management tasks must be done in »management mode«, which can only be entered by a single thread at a time. The underlying assumption is that the effort for management work is dwarfed in comparison to any media calculation work. However, in `5c6354882d` ...I discovered an insidious border condition, and in an attempt to fix it, I broke that fundamental assumption. The problem arises from the fact that we do want to expose a public API of the Scheduler. Even while this is only used to ''seed'' a calculation stream, because any further planning- and management work will be performed by the workers themselves (this is a design decision, we do not employ a "scheduler thread") Anyway, since the Scheduler API ''is'' public, ''someone from the outside'' could invoke those functions, and, unaware of any Scheduler internals, will automatically acquire the Grooming-Token, yet never release it, leading to deadlock. So we need a dedicated solution, which is hereby implemented as a scoped guard: in the standard case, the caller is a management-job and thus already holds the token (and nothing must be done). But in the rare case of an »outsider«, this guard now ''transparently'' acquires the token (possibly with a blocking wait) and ''drops it when leaving scope''	2023-12-19 23:38:57 +01:00
Ichthyostega	892099412c	Scheduler: integrate sanity check on timings ...especially to prevent a deadline way too far into the future, since this would provoke the BlockFlow (epoch based) memory manager to run out of space. Just based on gut feeling, I am now imposing a limit of 20 seconds, which, given current parametrisation, with a minimum spacing of 6.6ms and 500 Activities per Block would at maximum require 360 MiB for the Activities, or 3000 Blocks. With that many blocks, the linear search would degrade horribly anyway...	2023-11-07 18:37:20 +01:00
Ichthyostega	86b90fbf84	Scheduler: draft high-level API for building a Job schedule The invocation structure is effectively determined by the Activity-chain builder from the Activity-Language; but, taking into account the complexity of the Scheduler code developed thus far, it seems prudent to encapsulate the topic of "Activities" altogether and expose only a convenience builder-API towards the Job-Planning	2023-11-06 06:00:00 +01:00
Ichthyostega	72258c06bd	Scheduler: reconciled into clearer design The problem with passing the deadline was just a blatant symptom that something with the overall design was not quite right, leading to mix-up of interfaces and implementation functions, and more and more detail parameters spreading throughout the call chains. The turning point was to realise the two conceptual levels crossing and interconnected within the »Scheduler-Service« - the Activity-Language describes the patterns of processing - the Scheduler components handle time-bound events So by turning the (previously private) queue entry into an ActivationEvent, the design could be balanced. This record becomes the common agens within the Scheduler, and builds upon / layers on top of the common agens of the Language, which is the Activity record.	2023-11-04 04:49:13 +01:00
Ichthyostega	b49de0738d	Scheduler: implement automatic clean-up of outdated entries Hooked into the existing processing logic at Layer-2, and relying on the information functions of Layer-1	2023-11-03 01:17:10 +01:00
Ichthyostega	b1e0ce1a79	Scheduler: define expected filtering behaviour for significant tasks	2023-11-03 00:31:33 +01:00
Ichthyostega	6166ab63f2	Scheduler: complete handling of the grooming-token - Ensure the grooming-token (lock) is reliably dropped - also explicitly drop it prior to targeted sleeps - properly signal when not able to acquire the token before dispatch - amend tests broken by changes since yesterday	2023-10-28 05:35:35 +02:00
Ichthyostega	a21057bdf2	Scheduler: control structure for the worker-functor	2023-10-22 23:25:35 +02:00
Ichthyostega	9ce3ad3d72	Scheduler: Layer-2 complete and tested (see #1326 ) * the implementation logic of the Scheduler is essentially complete now * all functionality necessary for the worker-function has been demonstrated As next step, the »Scheduler Service« can be assembled from the two Implementation Layers, the Activity-Language and the `BlockFlow` allocator This should then be verified by a multi-threaded integration test...	2023-10-19 01:49:08 +02:00
Ichthyostega	10a2c6908c	Scheduler: Layer-2 integration scenario complete could even rig the diagnostic Execution-Ctx to drop the GroomingToken at the point when switching to work-mode	2023-10-18 23:02:29 +02:00
Ichthyostega	c2ddaed28e	Scheduler: draft scenario for Layer-2 integration test Idea: re-use the scenario and instrumentation from SchedulerActivity_test::scenario_RenderJob()	2023-10-18 18:10:10 +02:00
Ichthyostega	ee09a2eff2	Scheduler: completed implementation of Layer-2 ...some further checks ...one integration test case needs to be written	2023-10-18 17:29:41 +02:00
Ichthyostega	93fcebb331	Scheduler: implement and verify postDispatch	2023-10-18 16:39:08 +02:00
Ichthyostega	666546856f	Scheduler: design the core API operation - postDispatch This central operation sits at a crossroad and is used - from external clients to feed new work to the Scheduler - from Workers to engage into execution of the next Activity - recursively from the execution of an Activity-chain From these requirements the semantics of behaviour can be derived regarding the GroomingToken and the result values, which indicate when follow-up work should be processed	2023-10-18 15:50:11 +02:00
Ichthyostega	55967cd649	Scheduler: work retrieval implementation - simple approach, delegating to Layer-1 - deliberately no error handling - GroomingToken not dropped	2023-10-18 04:18:01 +02:00
Ichthyostega	b57503fb97	Scheduler: define expected behaviour for work retrieval still not quite sure how to implement it, but working down from first principles to define test scenarios first...	2023-10-18 02:59:58 +02:00
Ichthyostega	aa60869082	Scheduler: decision logic for actual dispatch of activities	2023-10-18 01:38:58 +02:00
Ichthyostega	fa391d1267	Scheduler: torture test the thread access logic Ensure the GroomingToken mechanism indeed creates an exclusive section protected against concurrent corruption: Use a without / with-protection test and verify the results are exact vs. grossly broken	2023-10-17 21:35:37 +02:00
Ichthyostega	1223772f14	Scheduler: implement thread access logic The thread holding the »Grooming Token« is permitted to manipulate scheduler internals and thus also to define new activities; this logic is implemented as an Atomic lock, based on the current thread's ID.	2023-10-17 20:37:32 +02:00
Ichthyostega	862933e809	Scheduler: define API for Layer-2 Notably both Layers are conceived as functionality providers; only at Scheduler top-Level will functionality be combined with external dependencies to create the actual service.	2023-10-17 19:20:53 +02:00
Ichthyostega	997fc36c81	Workforce: implementation complete	2023-09-09 23:42:13 +02:00
Ichthyostega	70cd8af806	Workforce: requirement analysis	2023-09-05 00:22:17 +02:00
Ichthyostega	23a6fbdf4f	Scheduler: investigate modes of operation - analysis of Activity usage - derive possible memory management schemes - research regarding asynchronous IO - decision regarding the memory management scheme	2023-07-03 18:40:37 +02:00

28 commits
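
Commits `1223772f14`, `fa391d1267` and `b497980522` describe the Grooming-Token as an atomic lock keyed on the current thread's ID, verified by a with/without-protection torture test, and later wrapped in a scoped guard for callers from outside. The following is a minimal sketch of that idea only, not the code from the repository: the names `GroomingToken`, `ScopedGroomingGuard`, `heldByCaller` and `countUnderGuard` are hypothetical, and the yielding wait loop is an assumption standing in for whatever blocking wait the real implementation uses.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical stand-in for the Grooming-Token: an atomic slot that
// stores the ID of the owning thread; a default-constructed ID means "free".
class GroomingToken
{
    std::atomic<std::thread::id> owner_{std::thread::id{}};

public:
    // Attempt to claim the token for the calling thread (no waiting).
    bool tryAcquire()
    {
        std::thread::id expected{};          // only succeeds when the token is free
        return owner_.compare_exchange_strong(expected,
                                              std::this_thread::get_id(),
                                              std::memory_order_acquire,
                                              std::memory_order_relaxed);
    }

    // Hand the token back; must only be called by the current owner.
    void release()
    {
        assert(heldByCaller());
        owner_.store(std::thread::id{}, std::memory_order_release);
    }

    bool heldByCaller() const
    {
        return owner_.load(std::memory_order_relaxed) == std::this_thread::get_id();
    }
};

// Scoped guard: does nothing when the caller already holds the token
// (the "management-job" case); otherwise waits for the token and
// drops it again when leaving scope (the "outsider" case).
class ScopedGroomingGuard
{
    GroomingToken& token_;
    bool acquiredHere_;

public:
    explicit ScopedGroomingGuard(GroomingToken& token)
        : token_{token}
        , acquiredHere_{not token.heldByCaller()}
    {
        if (acquiredHere_)
            while (not token_.tryAcquire())
                std::this_thread::yield();   // assumed wait strategy
    }

    ~ScopedGroomingGuard()
    {
        if (acquiredHere_)
            token_.release();
    }

    ScopedGroomingGuard(ScopedGroomingGuard const&) = delete;
    ScopedGroomingGuard& operator=(ScopedGroomingGuard const&) = delete;
};

// Small torture test: several threads increment a plain (non-atomic)
// counter, each increment protected by the guard.
long countUnderGuard(unsigned threads, unsigned rounds)
{
    GroomingToken token;
    long counter = 0;
    std::vector<std::thread> pool;
    for (unsigned t = 0; t < threads; ++t)
        pool.emplace_back([&] {
            for (unsigned i = 0; i < rounds; ++i)
            {
                ScopedGroomingGuard guard{token};
                ++counter;                   // exclusive section
            }
        });
    for (auto& worker : pool)
        worker.join();
    return counter;
}
```

Called from a `main` as `countUnderGuard(4, 10000)`, the protected counter should come out exact, because the acquire/release pair on the token orders the increments. Nesting a second guard inside an existing one in the same thread is a no-op, which mirrors the "caller already holds the token" case from the commit message.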
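
Commit `707fbc2933` describes an adaptive contention mitigation: after some repeated contention events, workers throttle themselves with polling delays that step up exponentially to a cap of 2ms. A rough sketch of such a delay schedule could look as follows. Only the 2ms cap comes from the commit message; the threshold of tolerated contention events (`SPIN_THRESHOLD`), the starting delay (`BASE_DELAY`), the doubling factor and the function names are assumptions made for illustration.

```cpp
#include <chrono>
#include <thread>

using std::chrono::microseconds;

constexpr unsigned     SPIN_THRESHOLD = 5;   // assumed: contention events tolerated without delay
constexpr microseconds BASE_DELAY{1};        // assumed: first throttling step
constexpr microseconds MAX_DELAY{2000};      // 2ms cap, as stated in the commit message

// Delay to apply after the given number of consecutive contention events:
// zero up to the threshold, then doubling per event until the cap is reached.
constexpr microseconds backoffDelay(unsigned contentionCount)
{
    if (contentionCount <= SPIN_THRESHOLD)
        return microseconds{0};
    unsigned step = contentionCount - SPIN_THRESHOLD - 1;
    if (step >= 11)                          // 2^11 µs already exceeds the cap
        return MAX_DELAY;
    auto delay = BASE_DELAY.count() << step;
    return microseconds{delay < MAX_DELAY.count() ? delay : MAX_DELAY.count()};
}

// Hypothetical helper for a worker loop: sleep according to the schedule.
inline void throttle(unsigned contentionCount)
{
    auto delay = backoffDelay(contentionCount);
    if (delay.count() > 0)
        std::this_thread::sleep_for(delay);
}
```

A worker would count consecutive failures to obtain the lock, call `throttle(count)` before polling again, and reset the count to zero once it succeeds, so that short bursts of contention cost nothing while long phases no longer keep several threads spinning on the atomic.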