/*
STRESS-TEST-RIG.hpp - setup for stress and performance investigation
  Copyright (C)  2024,  Hermann Vosseler <Ichthyostega@web.de>

  **Lumiera** is free software; you can redistribute it and/or modify it
  under the terms of the GNU General Public License as published by the
  Free Software Foundation; either version 2 of the License, or (at your
  option) any later version. See the file COPYING for further details.
*/
/** @file stress-test-rig.hpp
** A test bench to conduct performance measurement series. Outfitted especially
** to determine runtime behaviour of the Scheduler and associated parts of the
** Lumiera Engine through systematic execution of load scenarios.
**
** # Scheduler Stress Testing
**
** The point of departure for any stress testing is to show that the subject will
** break in controlled ways only. For the Scheduler this can easily be achieved by
** overloading it until job deadlines are broken. Much more challenging, however, is
** the task of determining the boundary of regular scheduler operation. This realm
** can be defined by the ability of the scheduler to follow and conform to the
** timings set out explicitly in the schedule. Obviously, short and localised
** load peaks can be accommodated, yet once a persistent backlog builds up,
** the schedule starts to slip and the calculation process will flounder.
**
** A method to determine such a _»breaking point«_ in a systematic way is based on
** building a [synthetic calculation load](\ref test-chain-load.hpp) and establishing
** the timings of a test schedule based on a simplified model of expected computation
** expense. By scaling and condensing these schedule timings, a loss of control can
** be provoked, and determined by statistical observation: since the process of
** scheduling contains an essentially random component, persistent overload will be
** indicated by an increasing variance of the overall runtime, and a departure from
** the nominal runtime of the executed schedule.
**
** Another, complementary observation method is to inject a defined and homogeneous
** load peak into the scheduler and then watch the time it takes to process, the
** processing overhead and the achieved degree of concurrency. The actual observation
** using this measurement setup attempts to establish a single _control parameter_
** as free variable, which allows searching for correlations and building a linear
** regression model to characterise a supposed functional dependency. Simply put,
** given a number of fixed-size jobs (not further correlated) as input, this
** approach yields a »number of jobs per time unit« and a »socket overhead«,
** thereby distilling a _behaviour model_ to describe the actual stochastic data.
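** In the simplest case, such a behaviour model takes the linear form
** `time(n) ≈ socket + gradient·n`, where the gradient captures the average
** expense per job, while the socket term represents a fixed setup overhead.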
**
** ## Setup
** To perform this test scheme, an operational Scheduler is required, and an instance
** of the TestChainLoad must be provided, configured with desired load properties.
** Moreover, the actual measurement setup requires performing several test executions,
** controlling some parameters in accordance with the observation scheme. The control
** parameters and the specifics of the actual setup should be clearly visible, while
** hiding the complexities of measurement execution.
**
** This can be achieved by a »Toolbench«, which is a framework with building blocks,
** providing a pre-arranged _measurement rig_ for the various kinds of measurement setup.
** The implementation code is arranged as a »sandwich« structure...
** - StressTestRig, which is also the framework class, acts as _bottom layer_ to
** provide an anchor point and some common definitions, implying an invocation scheme
** + first a TestChainLoad topology is constructed, based on test parameters
** + this is used to create a TestChainLoad::SchedulerCtx, which is then
** outfitted specifically for each test run
** - the _middle layer_ is a custom `Setup` class, which inherits from the bottom
** layer and fills in the actual topology and configuration for the desired test
** - the actual test is then performed by layering a specific _test tool_ on
** top of the compound, which in turn picks up the parametrisation from the Setup
** and base configuration, visible as base class (template param) \a CONF
** Together, this leads to the following code scheme, which aims to simplify experimentation:
** \code
** using StressRig = StressTestRig<16>;
**
** struct Setup : StressRig
** {
** uint CONCURRENCY = 4;
** //// more definitions
**
** auto testLoad()
** {....define a Test-Chain-Load topology....}
**
** auto testSetup (TestLoad& testLoad)
** { return StressRig::testSetup(testLoad)
** .withLoadTimeBase(500us)
** // ....more customisation here
** }
** };
**
** auto result = StressRig::with<Setup>()
** .perform<bench::SpecialToolClass>();
** \endcode
**
** ## Breaking Point search
** The bench::BreakingPoint tool typically uses a complex interwoven job plan, which is
** tightened until the timing breaks. The _stressFactor_ of the generated schedule will be
** the active parameter of this test, performing a _binary search_ for the _breaking point._
** The measurement attempts to narrow down the point of massive failure, where the ability
** to somehow cope with the schedule breaks down completely. Based on watching the Scheduler
** in operation, the detection was linked to three conditions, which typically will be
** triggered together, and within a narrow and reproducible parameter range:
** - an individual run counts as _accidentally failed_ when the execution slips
** away by more than 2ms with respect to the defined overall schedule. When more
** than 55% of all observed runs are considered failed, the first condition is met
** - moreover, the observed ''standard deviation'' must also surpass the same limit
** of > 2ms, which indicates that the Scheduling mechanism is under substantial
** strain; in regular operation, the slip is rather ~ 200µs.
** - the third condition is that the ''averaged delta'' has surpassed 4ms,
** which is 2 times the basic failure indicator.
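**
** A minimal invocation sketch (assuming a `Setup` class as outlined above); the
** result tuple corresponds to what bench::BreakingPoint yields from `perform()`:
** \code
** auto [stressFac, delta, runTime] = StressRig::with<Setup>()
**                                              .perform<bench::BreakingPoint>();
** \endcode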
**
** ## Parameter Correlation
** As a complement, the bench::ParameterRange tool is provided to run a specific Scheduler setup
** while varying a single control parameter within defined limits. This produces a set of (x,y) data,
** which can be used to search for correlations or build a linear regression model to describe the
** Scheduler's behaviour as function of the control parameter. The typical use case would be to use
** the input length (number of Jobs) as control parameter, leading to a model for Scheduling expense.
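**
** A usage sketch (assuming a `Setup` which mixes in bench::LoadPeak_ParamRange_Evaluation
** and whose `testLoad(Param)` builds a topology sized by the given parameter); the
** measurement yields a result table, rendered here as CSV:
** \code
** auto results = StressRig::with<Setup>()
**                          .perform<bench::ParameterRange> (50, 500);
** cout << results.renderCSV();
** \endcode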
**
** ## Observation tools
** The TestChainLoad, together with its helpers and framework, already offers some tools to visualise
** the generated topology, to calculate statistics, and to watch a performance run with instrumentation.
** In addition, the individual tools provide some debugging output to watch the measurement scheme.
** Result data is either a tuple of values (in case of bench::BreakingPoint), or a table of result
** data as function of the control parameter (for bench::ParameterRange). Result data, when converted
** to CSV, can be visualised as a Gnuplot diagram.
** @see TestChainLoad_test
** @see SchedulerStress_test
** @see binary-search.hpp
** @see gnuplot-gen.hpp
*/
#ifndef VAULT_GEAR_TEST_STRESS_TEST_RIG_H
#define VAULT_GEAR_TEST_STRESS_TEST_RIG_H
#include "test-chain-load.hpp"
#include "lib/binary-search.hpp"
#include "lib/test/transiently.hpp"
#include "vault/gear/scheduler.hpp"
#include "lib/time/timevalue.hpp"
#include "lib/meta/function.hpp"
#include "lib/format-string.hpp"
#include "lib/format-cout.hpp"
#include "lib/gnuplot-gen.hpp"
#include "lib/stat/statistic.hpp"
#include "lib/stat/data.hpp"
#include "lib/random.hpp"
#include "lib/util.hpp"
#include <algorithm>
#include <utility>
#include <vector>
#include <tuple>
#include <array>
namespace vault{
namespace gear {
namespace test {
using std::make_tuple;
using std::forward;
/**
* Configurable template framework for running Scheduler Stress tests
* Use to build a custom setup class, which is then [injected](\ref StressTestRig::with)
* to [perform](\ref StressTestRig::Launcher::perform) a _specific measurement tool._
* Several tools and detailed customisations are available in `namespace bench`
* - bench::BreakingPoint conducts a binary search to _break a schedule_
* - bench::ParameterRange performs a randomised series of parametrised test runs
*/
template<size_t maxFan =DEFAULT_FAN>
class StressTestRig
: util::NonCopyable
{
public:
using TestLoad = TestChainLoad<maxFan>;
using TestSetup = typename TestLoad::ScheduleCtx;
/***********************************************************************//**
* Entrance point: build a stress test measurement setup, to be performed with a
* dedicated \a TOOL class; takes the configuration \a CONF as template parameter,
* which is assumed to inherit (indirectly) from StressRig.
* @tparam CONF specialised subclass of StressRig with customisation
* @return a builder to configure and then launch the actual test
*/
template<class CONF>
static auto
with()
{
return Launcher<CONF>{};
}
/* ======= default configuration (inherited) ======= */
uint CONCURRENCY = work::Config::getDefaultComputationCapacity();
bool INSTRUMENTATION = true;
double EPSILON = 0.01; ///< error bound to abort binary search
double UPPER_STRESS = 1.7; ///< starting point for the upper limit, likely to fail
double FAIL_LIMIT = 2.0; ///< delta-limit when to count a run as failure
double TRIGGER_FAIL = 0.55; ///< %-fact: criterion-1 failures above this rate
double TRIGGER_SDEV = FAIL_LIMIT; ///< in ms : criterion-2 standard deviation
double TRIGGER_DELTA = 2*FAIL_LIMIT; ///< in ms : criterion-3 average delta above this limit
bool showRuns = false; ///< print a line for each individual run
bool showStep = true; ///< print a line for each binary search step
bool showRes = true; ///< print result data
bool showRef = true; ///< calculate single threaded reference time
static uint constexpr REPETITIONS{20};
BlockFlowAlloc bFlow{};
EngineObserver watch{};
Scheduler scheduler{bFlow, watch};
protected:
/** Extension point: build the computation topology for this test */
auto
testLoad(size_t nodes =64)
{
return TestLoad{nodes};
}
/** (optional) extension point: base configuration of the test ScheduleCtx
* @warning the actual setup \a CONF is layered, beware of shadowing. */
auto
testSetup (TestLoad& testLoad)
{
return testLoad.setupSchedule(scheduler)
.withLevelDuration(200us)
.withJobDeadline(500ms)
.withUpfrontPlanning();
}
template<class CONF>
struct Launcher : CONF
{
template<template<class> class TOOL, typename...ARGS>
auto
perform (ARGS&& ...args)
{
return TOOL<CONF>{}.perform (std::forward<ARGS> (args)...);
}
};
};
namespace bench { ///< Specialised tools to investigate scheduler performance
using util::_Fmt;
using util::min;
using util::max;
using std::vector;
using std::declval;
/**************************************************//**
* Specific stress test scheme to determine the
* »breaking point« where the Scheduler starts to slip
*/
template<class CONF>
class BreakingPoint
: public CONF
{
using TestLoad = typename CONF::TestLoad;
using TestSetup = typename TestLoad::ScheduleCtx;
struct Res
{
double stressFac{0};
double percentOff{0};
double stdDev{0};
double avgDelta{0};
double avgTime{0};
double expTime{0};
};
/** prepare the ScheduleCtx for a specifically parametrised test series */
void
configureTest (TestSetup& testSetup, double stressFac)
{
testSetup.withInstrumentation(CONF::INSTRUMENTATION) // side-effect: clear existing statistics
.withAdaptedSchedule(stressFac, CONF::CONCURRENCY, adjustmentFac);
}
/** perform a repetition of test runs and compute statistics */
Res
runProbes (TestSetup& testSetup, double stressFac)
{
auto sqr = [](auto n){ return n*n; };
Res res;
auto& [sf,pf,sdev,avgD,avgT,expT] = res;
sf = stressFac;
std::array<double, CONF::REPETITIONS> runTime;
for (uint i=0; i<CONF::REPETITIONS; ++i)
{
runTime[i] = testSetup.launch_and_wait() / 1000;
avgT += runTime[i];
maybeAdaptScaleEmpirically (testSetup, stressFac);
}
expT = testSetup.getExpectedEndTime() / 1000;
avgT /= CONF::REPETITIONS;
avgD = (avgT-expT); // can be < 0
for (uint i=0; i<CONF::REPETITIONS; ++i)
{
sdev += sqr (runTime[i] - avgT);
double delta = (runTime[i] - expT);
bool fail = (delta > CONF::FAIL_LIMIT);
if (fail)
++ pf;
showRun(i, delta, runTime[i], runTime[i] > avgT, fail);
}
pf /= CONF::REPETITIONS;
sdev = sqrt (sdev/CONF::REPETITIONS);
showStep(res);
return res;
}
/** criterion to decide if this test series constitutes a slipped schedule */
bool
decideBreakPoint (Res& res)
{
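// breakage: either virtually every run slipped, or all three trigger criteria are met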
return res.percentOff > 0.99
or( res.percentOff > CONF::TRIGGER_FAIL
and res.stdDev > CONF::TRIGGER_SDEV
and res.avgDelta > CONF::TRIGGER_DELTA);
}
/**
* invoke a binary search to produce a sequence of test series
* with the goal to narrow down the stressFact where the Schedule slips away.
*/
template<class FUN>
Res
conductBinarySearch (FUN&& runTestCase, vector<Res> const& results)
{
double breakPoint = lib::binarySearch_upper (forward<FUN> (runTestCase)
, 0.0, CONF::UPPER_STRESS
, CONF::EPSILON);
uint s = results.size();
ENSURE (s >= 2);
Res res;
auto& [sf,pf,sdev,avgD,avgT,expT] = res;
// average data over the last three steps investigated for smoothing
uint points = min (results.size(), 3u);
for (uint i=results.size()-points; i<results.size(); ++i)
{
Res const& resx = results[i];
pf += resx.percentOff;
sdev += resx.stdDev;
avgD += resx.avgDelta;
avgT += resx.avgTime;
expT += resx.expTime;
}
pf /= points;
sdev /= points;
avgD /= points;
avgT /= points;
expT /= points;
sf = breakPoint;
return res;
}
/** adaptive scale correction based on observed behaviour */
double adjustmentFac{1.0};
size_t gaugeProbes = 3 * CONF::REPETITIONS;
/**
* Attempt to factor out some observable properties, which are considered circumstantial
* and not a direct result of scheduling overheads. The artificial computational load is
* known to drift towards larger values than calibrated; moreover the actual concurrency
* achieved can deviate from the heuristic assumptions built into the testing schedule.
* The latter is problematic to some degree however, since the Scheduler is bound to
* scale down capacity when idle. To strike a reasonable balance, this adjustment of
* the measurement scale is done only initially, and when the stress factor is high
* and some degree of pressure on the scheduler can thus be assumed.
*/
void
maybeAdaptScaleEmpirically (TestSetup& testSetup, double stressFac)
{
if (not gaugeProbes) return;
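// adjust only under pressure: pow(stressFac,9) effectively mutes low-stress probes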
double gain = util::limited (0, pow(stressFac, 9), 1);
if (gain < 0.2) return;
//
double formFac = testSetup.determineEmpiricFormFactor (CONF::CONCURRENCY);
adjustmentFac = gain*formFac + (1-gain)*adjustmentFac;
testSetup.withAdaptedSchedule(stressFac, CONF::CONCURRENCY, adjustmentFac);
--gaugeProbes;
}
_Fmt fmtRun_ {"....·%-2d: Δ=%4.1f t=%4.1f %s %s"}; // i % Δ % t % t>avg? % fail?
_Fmt fmtStep_{ "%4.2f| : ∅Δ=%4.1f±%-4.2f ∅t=%4.1f %s %%%-3.0f -- expect:%4.1fms"};// stress % ∅Δ % σ % ∅t % fail % percentOff % t-expect
_Fmt fmtResSDv_{"%9s= %5.2f ±%4.2f%s"};
_Fmt fmtResVal_{"%9s: %5.2f%s"};
void
showRun(uint i, double delta, double t, bool over, bool fail)
{
if (CONF::showRuns)
cout << fmtRun_ % i % delta % t % (over? "+":"-") % (fail? "✖":"")
<< endl;
}
void
showStep(Res& res)
{
if (CONF::showStep)
cout << fmtStep_ % res.stressFac % res.avgDelta % res.stdDev % res.avgTime
% (decideBreakPoint(res)? "—◆—":"—◇—")
% (100*res.percentOff) % res.expTime
<< endl;
}
void
showRes(Res& res)
{
if (CONF::showRes)
{
cout << fmtResVal_ % "stressFac" % res.stressFac % "" <<endl;
cout << fmtResVal_ % "fail" %(res.percentOff * 100) % '%' <<endl;
cout << fmtResSDv_ % "delta" % res.avgDelta % res.stdDev % "ms"<<endl;
cout << fmtResVal_ % "runTime" % res.avgTime % "ms"<<endl;
cout << fmtResVal_ % "expected" % res.expTime % "ms"<<endl;
}
}
void
showRef(TestSetup& testSetup)
{
if (CONF::showRef)
cout << fmtResVal_ % "refTime"
% (testSetup.calcRuntimeReference() /1000)
% "ms" << endl;
}
public:
/**
* Launch a measurement sequence to determine the »breaking point«
* for the configured test load and parametrisation of the Scheduler.
* @return a tuple `[stress-factor, delta, run-time]`
*/
auto
perform()
{
TRANSIENTLY(work::Config::COMPUTATION_CAPACITY) = CONF::CONCURRENCY;
TestLoad testLoad = CONF::testLoad().buildTopology();
TestSetup testSetup = CONF::testSetup (testLoad);
vector<Res> observations;
auto performEvaluation = [&](double stressFac)
{
configureTest (testSetup, stressFac);
auto res = runProbes (testSetup, stressFac);
observations.push_back (res);
return decideBreakPoint(res);
};
Res res = conductBinarySearch (std::move (performEvaluation), observations);
showRes (res);
showRef (testSetup);
return make_tuple (res.stressFac, res.avgDelta, res.avgTime);
}
};
/**************************************************//**
* Specific test scheme to perform a Scheduler setup
* over a given control parameter range to determine
* correlations
*/
template<class CONF>
class ParameterRange
: public CONF
{
using TestLoad = typename CONF::TestLoad;
using TestSetup = typename TestLoad::ScheduleCtx;
// Type binding for data evaluation
using Param = typename CONF::Param;
using Table = typename CONF::Table;
void
runTest (Param param, Table& data)
{
TestLoad testLoad = CONF::testLoad(param).buildTopology();
TestSetup testSetup = CONF::testSetup (testLoad)
.withInstrumentation(); // Note: by default Schedule with CONF::LEVEL_STEP
double millis = testSetup.launch_and_wait() / 1000;
auto stat = testSetup.getInvocationStatistic();
CONF::collectResult (data, param, millis, stat);
}
public:
/**
* Launch a measurement sequence running the Scheduler with a
* varying parameter value to investigate (x,y) correlations.
* @return a result table, with one row of collected data for each test run
*/
Table
perform (Param lower, Param upper)
{
TRANSIENTLY(work::Config::COMPUTATION_CAPACITY) = CONF::CONCURRENCY;
Param dist = upper - lower;
uint cnt = CONF::REPETITIONS;
vector<Param> points;
points.reserve (cnt);
Param minP{upper}, maxP{lower};
for (uint i=0; i<cnt; ++i)
{
auto random = lib::defaultGen.uni(); // [0 .. 1.0[
Param pos = lower + Param(floor (random*dist + 0.5));
points.push_back(pos);
minP = min (pos, minP);
maxP = max (pos, maxP);
}
// ensure the bounds participate in test
if (maxP < upper) points[cnt-2] = upper;
if (minP > lower) points[cnt-1] = lower;
Table results;
for (Param point : points)
runTest (point, results);
return results;
}
};
/* ====== Preconfigured ParamRange-Evaluations ====== */
using lib::stat::Column;
using lib::stat::DataTable;
using lib::stat::DataSpan;
using lib::stat::CSVData;
using IncidenceStat = lib::IncidenceCount::Statistic;
/**
* Calculate a linear regression model for two table columns
* @return a tuple `(socket,gradient,Vector(predicted),Vector(deltas),correlation,maxDelta,stdev)`
*/
template<typename F, typename G>
inline auto
linearRegression (Column<F> const& x, Column<G> const& y)
{
lib::stat::RegressionData points;
size_t cnt = min (x.data.size(), y.data.size());
points.reserve (cnt);
for (size_t i=0; i < cnt; ++i)
points.emplace_back (x.data[i], y.data[i]);
return lib::stat::computeLinearRegression (points);
}
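/* Example (sketch): using the result table from a ParameterRange measurement with
 * the LoadPeak_ParamRange_Evaluation mix-in defined below, a regression of run time
 * over load size would read:
 *     auto [socket, gradient, predicted, deltas, correlation, maxDelta, stdev]
 *            = linearRegression (results.param, results.time);
 */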
/**
* Mix-in for setup of a #ParameterRange evaluation to watch
* the processing of a single load peak, using the number of
* added jobs as independent parameter.
* @remark inject this definition (by inheritance) into the
* Setup, which should then also define a TestChainLoad
* graph with an overall size controlled by the #Param
* @see SchedulerStress_test#watch_expenseFunction()
*/
struct LoadPeak_ParamRange_Evaluation
{
using Param = size_t;
struct DataRow
{
Column<Param> param {"load size"}; // independent variable / control parameter
Column<double> time {"result time"};
Column<double> conc {"concurrency"};
Column<double> jobtime {"avg jobtime"};
Column<double> impeded {"avg impeded"};
auto allColumns()
{ return std::tie(param
,time
,conc
,jobtime
,impeded
);
}
};
using Table = DataTable<DataRow>;
void
collectResult(Table& data, Param param, double millis, bench::IncidenceStat const& stat)
{
(void)millis;
data.newRow();
data.param = param;
data.time = stat.coveredTime / 1000;
data.conc = stat.avgConcurrency;
data.jobtime = stat.activeTime / stat.activationCnt;
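// time spent at concurrency ≤ 1, taken as indicator of impeded parallelism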
data.impeded = (stat.timeAtConc(1) + stat.timeAtConc(0))/stat.activationCnt;
}
static double
avgConcurrency (Table const& results)
{
return lib::stat::average (DataSpan<double> (results.conc.data));
}
static string
renderGnuplot (Table const& results)
{
using namespace lib::gnuplot_gen;
string csv = results.renderCSV();
Param maxParam = * std::max_element (results.param.data.begin(), results.param.data.end());
Param xtics = maxParam > 500? 50
: maxParam > 200? 20
: maxParam > 100? 10
: 5;
return scatterRegression(
ParamRecord().set (KEY_CSVData, csv)
.set (KEY_TermSize, "600,600")
.set (KEY_Xtics, int64_t(xtics))
.set (KEY_Xlabel, "load size ⟶ number of jobs")
.set (KEY_Ylabel, "active time ⟶ ms")
.set (KEY_Y2label, "concurrent threads ⟶")
.set (KEY_Y3label, "avg job time ⟶ µs")
);
}
};
//
}// namespace bench
}}}// namespace vault::gear::test
#endif /*VAULT_GEAR_TEST_STRESS_TEST_RIG_H*/