Scheduler-test: reduce impact of scale adjustments on breakpoint-search

The `BreakingPoint` tool conducts a binary search to find the ''stress factor''
at which a given schedule breaks. Some known deviations, related to the
measurement setup, unfortunately impact the interpretation of the
''stress factor'' scale. Earlier, an attempt was made to observe those factors
empirically and to work a ''form factor'' into the ''effective stress factor''
used to guide this measurement method.
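The bisection over the stress factor can be sketched as follows. This is a minimal illustration, not the actual test rig: the names `findBreakingPoint` and `breaksAt` are invented, and `breaksAt` stands in for a whole measurement series judged by a break criterion (cf. `decideBreakPoint()` in the code below); the `epsilon` default mirrors the `EPSILON = 0.01` bound from this commit.

```cpp
#include <cassert>
#include <cmath>
#include <functional>

// Hypothetical sketch of the bisection over the stress factor.
// `breaksAt(f)` abstracts a full measurement run at stress factor f,
// returning true when the schedule is deemed broken.
double findBreakingPoint (std::function<bool(double)> breaksAt,
                          double upperStress, double epsilon =0.01)
{
    double good = 0.0;              // highest stress factor known to pass
    double bad  = upperStress;      // candidate for the lowest failing factor
    while (not breaksAt(bad))
        bad *= 2;                   // ensure the upper bound actually fails
    while (bad - good > epsilon)
    {
        double mid = (good + bad) / 2;
        (breaksAt(mid) ? bad : good) = mid;
    }
    return (good + bad) / 2;
}
```

Since `good` always passes and `bad` always fails, the true breaking point stays bracketed and the result lands within `epsilon` of it.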

Closer investigation with extended and elastic load patterns has now revealed
a strong tendency of the Scheduler to scale down its work resources when not
fully loaded. The adjustments mentioned above may mistake this behaviour for
a sign of a structural limitation of the achievable concurrency.

Thus, as a mitigation, those adjustments are now performed only at the
beginning of the measurement series, and only while the stress factor is
high (implying that the scheduler is actually overloaded and thus has
no incentive to scale down).
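This gating can be sketched roughly like this. The gain curve `pow(stressFac, 9)` clamped to [0,1], the 0.2 cutoff, and the blend formula are taken from the code added in this commit; the `ScaleAdjuster` wrapper and its member names are invented for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Invented wrapper around the gating logic: blend a freshly observed
// form factor into the running adjustment, but only during the initial
// gauge probes and only when the stress factor indicates real overload.
struct ScaleAdjuster
{
    double adjustmentFac = 1.0;
    int    gaugeProbes   = 9;          // e.g. 3 × REPETITIONS

    void
    maybeAdapt (double stressFac, double formFac)
    {
        if (gaugeProbes <= 0) return;  // adjust only at the start of the series
        double gain = std::clamp (std::pow (stressFac, 9), 0.0, 1.0);
        if (gain < 0.2) return;        // low stress: scheduler may be idle-scaling
        adjustmentFac = gain*formFac + (1-gain)*adjustmentFac;
        --gaugeProbes;
    }
};
```

The steep ninth power makes the gain negligible below a stress factor of ~0.85, so observations taken while the scheduler is underloaded leave the adjustment untouched.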

These observations indicate that the »Breaking Point« search must be taken
with a grain of salt: especially when the test load does ''not'' contain
a high degree of interdependencies, it will be ''stretched elastically''
rather than outright broken. Under such circumstances, this measurement
actually gauges the Scheduler's ability to comply with an established
load and computation goal.
Fischlurch 2024-04-17 21:04:03 +02:00
parent 55842a8d4f
commit c934e7f079
10 changed files with 1731 additions and 119 deletions


@@ -0,0 +1,475 @@
digraph {
// Nodes
N0[label="0: 37", shape=doublecircle ]
N1[label="1: 37", shape=circle ]
N2[label="2: 37", shape=circle ]
N3[label="3: 4F", shape=box, style=rounded ]
N4[label="4: 37", shape=circle ]
N5[label="5: 37", shape=circle ]
N6[label="6: 4F", shape=box, style=rounded ]
N7[label="7: 37", shape=circle ]
N8[label="8: 37", shape=circle ]
N9[label="9: 4F", shape=box, style=rounded ]
N10[label="10: 37", shape=circle ]
N11[label="11: 37", shape=circle ]
N12[label="12: 4F", shape=box, style=rounded ]
N13[label="13: 37", shape=circle ]
N14[label="14: 37", shape=circle ]
N15[label="15: 4F", shape=box, style=rounded ]
N16[label="16: 37", shape=circle ]
N17[label="17: 13" ]
N18[label="18: 37", shape=circle ]
N19[label="19: 37", shape=circle ]
N20[label="20: 4F", shape=box, style=rounded ]
N21[label="21: 37", shape=circle ]
N22[label="22: 37", shape=circle ]
N23[label="23: 4F", shape=box, style=rounded ]
N24[label="24: 37", shape=circle ]
N25[label="25: 61", shape=box, style=rounded ]
N26[label="26: 37", shape=circle ]
N27[label="27: 37", shape=circle ]
N28[label="28: 4F", shape=box, style=rounded ]
N29[label="29: 37", shape=circle ]
N30[label="30: 37", shape=circle ]
N31[label="31: 4F", shape=box, style=rounded ]
N32[label="32: 37", shape=circle ]
N33[label="33: 40" ]
N34[label="34: 37", shape=circle ]
N35[label="35: 37", shape=circle ]
N36[label="36: 4F", shape=box, style=rounded ]
N37[label="37: 37", shape=circle ]
N38[label="38: 37", shape=circle ]
N39[label="39: 4F", shape=box, style=rounded ]
N40[label="40: 37", shape=circle ]
N41[label="41: 3A" ]
N42[label="42: 37", shape=circle ]
N43[label="43: 37", shape=circle ]
N44[label="44: 4F", shape=box, style=rounded ]
N45[label="45: 37", shape=circle ]
N46[label="46: 37", shape=circle ]
N47[label="47: 4F", shape=box, style=rounded ]
N48[label="48: 37", shape=circle ]
N49[label="49: F9", shape=box, style=rounded ]
N50[label="50: 37", shape=circle ]
N51[label="51: 37", shape=circle ]
N52[label="52: 4F", shape=box, style=rounded ]
N53[label="53: 37", shape=circle ]
N54[label="54: 37", shape=circle ]
N55[label="55: 4F", shape=box, style=rounded ]
N56[label="56: 37", shape=circle ]
N57[label="57: 40" ]
N58[label="58: 37", shape=circle ]
N59[label="59: 37", shape=circle ]
N60[label="60: 4F", shape=box, style=rounded ]
N61[label="61: 37", shape=circle ]
N62[label="62: 37", shape=circle ]
N63[label="63: 4F", shape=box, style=rounded ]
N64[label="64: 37", shape=circle ]
N65[label="65: 3A" ]
N66[label="66: 37", shape=circle ]
N67[label="67: 37", shape=circle ]
N68[label="68: 4F", shape=box, style=rounded ]
N69[label="69: 37", shape=circle ]
N70[label="70: 37", shape=circle ]
N71[label="71: 4F", shape=box, style=rounded ]
N72[label="72: 37", shape=circle ]
N73[label="73: F9", shape=box, style=rounded ]
N74[label="74: 37", shape=circle ]
N75[label="75: 37", shape=circle ]
N76[label="76: 4F", shape=box, style=rounded ]
N77[label="77: 37", shape=circle ]
N78[label="78: 37", shape=circle ]
N79[label="79: 4F", shape=box, style=rounded ]
N80[label="80: 37", shape=circle ]
N81[label="81: 40" ]
N82[label="82: 37", shape=circle ]
N83[label="83: 37", shape=circle ]
N84[label="84: 4F", shape=box, style=rounded ]
N85[label="85: 37", shape=circle ]
N86[label="86: 37", shape=circle ]
N87[label="87: 4F", shape=box, style=rounded ]
N88[label="88: 37", shape=circle ]
N89[label="89: 3A" ]
N90[label="90: 37", shape=circle ]
N91[label="91: 37", shape=circle ]
N92[label="92: 4F", shape=box, style=rounded ]
N93[label="93: 37", shape=circle ]
N94[label="94: 37", shape=circle ]
N95[label="95: 4F", shape=box, style=rounded ]
N96[label="96: 37", shape=circle ]
N97[label="97: F9", shape=box, style=rounded ]
N98[label="98: 37", shape=circle ]
N99[label="99: 37", shape=circle ]
N100[label="100: 4F", shape=box, style=rounded ]
N101[label="101: 37", shape=circle ]
N102[label="102: 37", shape=circle ]
N103[label="103: 4F", shape=box, style=rounded ]
N104[label="104: 37", shape=circle ]
N105[label="105: 40" ]
N106[label="106: 37", shape=circle ]
N107[label="107: 37", shape=circle ]
N108[label="108: 4F", shape=box, style=rounded ]
N109[label="109: 37", shape=circle ]
N110[label="110: 37", shape=circle ]
N111[label="111: 4F", shape=box, style=rounded ]
N112[label="112: 37", shape=circle ]
N113[label="113: 3A" ]
N114[label="114: 37", shape=circle ]
N115[label="115: 37", shape=circle ]
N116[label="116: 4F", shape=box, style=rounded ]
N117[label="117: 37", shape=circle ]
N118[label="118: 37", shape=circle ]
N119[label="119: 4F", shape=box, style=rounded ]
N120[label="120: 37", shape=circle ]
N121[label="121: F9", shape=box, style=rounded ]
N122[label="122: 37", shape=circle ]
N123[label="123: 37", shape=circle ]
N124[label="124: 4F", shape=box, style=rounded ]
N125[label="125: 37", shape=circle ]
N126[label="126: 37", shape=circle ]
N127[label="127: 4F", shape=box, style=rounded ]
N128[label="128: 37", shape=circle ]
N129[label="129: 40" ]
N130[label="130: 37", shape=circle ]
N131[label="131: 37", shape=circle ]
N132[label="132: 4F", shape=box, style=rounded ]
N133[label="133: 37", shape=circle ]
N134[label="134: 37", shape=circle ]
N135[label="135: 4F", shape=box, style=rounded ]
N136[label="136: 37", shape=circle ]
N137[label="137: 3A" ]
N138[label="138: 37", shape=circle ]
N139[label="139: 37", shape=circle ]
N140[label="140: 4F", shape=box, style=rounded ]
N141[label="141: 37", shape=circle ]
N142[label="142: 37", shape=circle ]
N143[label="143: 4F", shape=box, style=rounded ]
N144[label="144: 37", shape=circle ]
N145[label="145: F9", shape=box, style=rounded ]
N146[label="146: 37", shape=circle ]
N147[label="147: 37", shape=circle ]
N148[label="148: 4F", shape=box, style=rounded ]
N149[label="149: 37", shape=circle ]
N150[label="150: 37", shape=circle ]
N151[label="151: 4F", shape=box, style=rounded ]
N152[label="152: 37", shape=circle ]
N153[label="153: 40" ]
N154[label="154: 37", shape=circle ]
N155[label="155: 37", shape=circle ]
N156[label="156: 4F", shape=box, style=rounded ]
N157[label="157: 37", shape=circle ]
N158[label="158: 37", shape=circle ]
N159[label="159: 4F", shape=box, style=rounded ]
N160[label="160: 37", shape=circle ]
N161[label="161: 3A" ]
N162[label="162: 37", shape=circle ]
N163[label="163: 37", shape=circle ]
N164[label="164: 4F", shape=box, style=rounded ]
N165[label="165: 37", shape=circle ]
N166[label="166: 37", shape=circle ]
N167[label="167: 4F", shape=box, style=rounded ]
N168[label="168: 37", shape=circle ]
N169[label="169: F9", shape=box, style=rounded ]
N170[label="170: 37", shape=circle ]
N171[label="171: 37", shape=circle ]
N172[label="172: 4F", shape=box, style=rounded ]
N173[label="173: 37", shape=circle ]
N174[label="174: 37", shape=circle ]
N175[label="175: 4F", shape=box, style=rounded ]
N176[label="176: 37", shape=circle ]
N177[label="177: 40" ]
N178[label="178: 37", shape=circle ]
N179[label="179: 37", shape=circle ]
N180[label="180: 4F", shape=box, style=rounded ]
N181[label="181: 37", shape=circle ]
N182[label="182: 37", shape=circle ]
N183[label="183: 4F", shape=box, style=rounded ]
N184[label="184: 37", shape=circle ]
N185[label="185: 3A" ]
N186[label="186: 37", shape=circle ]
N187[label="187: 37", shape=circle ]
N188[label="188: 4F", shape=box, style=rounded ]
N189[label="189: 37", shape=circle ]
N190[label="190: 37", shape=circle ]
N191[label="191: 4F", shape=box, style=rounded ]
N192[label="192: 37", shape=circle ]
N193[label="193: F9", shape=box, style=rounded ]
N194[label="194: 37", shape=circle ]
N195[label="195: 37", shape=circle ]
N196[label="196: 4F", shape=box, style=rounded ]
N197[label="197: 37", shape=circle ]
N198[label="198: 37", shape=circle ]
N199[label="199: 4F", shape=box, style=rounded ]
N200[label="200: 37", shape=circle ]
N201[label="201: 40" ]
N202[label="202: 37", shape=circle ]
N203[label="203: 37", shape=circle ]
N204[label="204: 4F", shape=box, style=rounded ]
N205[label="205: 37", shape=circle ]
N206[label="206: 37", shape=circle ]
N207[label="207: 4F", shape=box, style=rounded ]
N208[label="208: 37", shape=circle ]
N209[label="209: 3A" ]
N210[label="210: 37", shape=circle ]
N211[label="211: 37", shape=circle ]
N212[label="212: 4F", shape=box, style=rounded ]
N213[label="213: 37", shape=circle ]
N214[label="214: 37", shape=circle ]
N215[label="215: 4F", shape=box, style=rounded ]
N216[label="216: 37", shape=circle ]
N217[label="217: F9", shape=box, style=rounded ]
N218[label="218: 37", shape=circle ]
N219[label="219: 37", shape=circle ]
N220[label="220: 4F", shape=box, style=rounded ]
N221[label="221: 37", shape=circle ]
N222[label="222: 37", shape=circle ]
N223[label="223: 4F", shape=box, style=rounded ]
N224[label="224: 37", shape=circle ]
N225[label="225: 40" ]
N226[label="226: 37", shape=circle ]
N227[label="227: 37", shape=circle ]
N228[label="228: 4F", shape=box, style=rounded ]
N229[label="229: 37", shape=circle ]
N230[label="230: 37", shape=circle ]
N231[label="231: 4F", shape=box, style=rounded ]
N232[label="232: 37", shape=circle ]
N233[label="233: 3A" ]
N234[label="234: 37", shape=circle ]
N235[label="235: 37", shape=circle ]
N236[label="236: 4F", shape=box, style=rounded ]
N237[label="237: 37", shape=circle ]
N238[label="238: 37", shape=circle ]
N239[label="239: 4F", shape=box, style=rounded ]
N240[label="240: 37", shape=circle ]
N241[label="241: F9", shape=box, style=rounded ]
N242[label="242: 37", shape=circle ]
N243[label="243: 37", shape=circle ]
N244[label="244: 4F", shape=box, style=rounded ]
N245[label="245: 37", shape=circle ]
N246[label="246: 37", shape=circle ]
N247[label="247: 4F", shape=box, style=rounded ]
N248[label="248: 37", shape=circle ]
N249[label="249: 40" ]
N250[label="250: 37", shape=circle ]
N251[label="251: 37", shape=circle ]
N252[label="252: 4F", shape=box, style=rounded ]
N253[label="253: 37", shape=circle ]
N254[label="254: 37", shape=circle ]
N255[label="255: 52", shape=box, style=rounded ]
// Layers
{ /*0*/ rank=min N0 }
{ /*1*/ rank=same N1 N2 N3 }
{ /*2*/ rank=same N4 N5 N6 N7 N8 N9 }
{ /*3*/ rank=same N10 N11 N12 N13 N14 N15 N16 N17 }
{ /*4*/ rank=same N18 N19 N20 N21 N22 N23 N24 N25 }
{ /*5*/ rank=same N26 N27 N28 N29 N30 N31 N32 N33 }
{ /*6*/ rank=same N34 N35 N36 N37 N38 N39 N40 N41 }
{ /*7*/ rank=same N42 N43 N44 N45 N46 N47 N48 N49 }
{ /*8*/ rank=same N50 N51 N52 N53 N54 N55 N56 N57 }
{ /*9*/ rank=same N58 N59 N60 N61 N62 N63 N64 N65 }
{ /*10*/ rank=same N66 N67 N68 N69 N70 N71 N72 N73 }
{ /*11*/ rank=same N74 N75 N76 N77 N78 N79 N80 N81 }
{ /*12*/ rank=same N82 N83 N84 N85 N86 N87 N88 N89 }
{ /*13*/ rank=same N90 N91 N92 N93 N94 N95 N96 N97 }
{ /*14*/ rank=same N98 N99 N100 N101 N102 N103 N104 N105 }
{ /*15*/ rank=same N106 N107 N108 N109 N110 N111 N112 N113 }
{ /*16*/ rank=same N114 N115 N116 N117 N118 N119 N120 N121 }
{ /*17*/ rank=same N122 N123 N124 N125 N126 N127 N128 N129 }
{ /*18*/ rank=same N130 N131 N132 N133 N134 N135 N136 N137 }
{ /*19*/ rank=same N138 N139 N140 N141 N142 N143 N144 N145 }
{ /*20*/ rank=same N146 N147 N148 N149 N150 N151 N152 N153 }
{ /*21*/ rank=same N154 N155 N156 N157 N158 N159 N160 N161 }
{ /*22*/ rank=same N162 N163 N164 N165 N166 N167 N168 N169 }
{ /*23*/ rank=same N170 N171 N172 N173 N174 N175 N176 N177 }
{ /*24*/ rank=same N178 N179 N180 N181 N182 N183 N184 N185 }
{ /*25*/ rank=same N186 N187 N188 N189 N190 N191 N192 N193 }
{ /*26*/ rank=same N194 N195 N196 N197 N198 N199 N200 N201 }
{ /*27*/ rank=same N202 N203 N204 N205 N206 N207 N208 N209 }
{ /*28*/ rank=same N210 N211 N212 N213 N214 N215 N216 N217 }
{ /*29*/ rank=same N218 N219 N220 N221 N222 N223 N224 N225 }
{ /*30*/ rank=same N226 N227 N228 N229 N230 N231 N232 N233 }
{ /*31*/ rank=same N234 N235 N236 N237 N238 N239 N240 N241 }
{ /*32*/ rank=same N242 N243 N244 N245 N246 N247 N248 N249 }
{ /*33*/ rank=same N250 N251 N252 N253 N254 N255 }
// Topology
N0 -> N3
N1 -> N6
N2 -> N9
N4 -> N12
N5 -> N15
N7 -> N17
N8 -> N17
N10 -> N20
N11 -> N23
N13 -> N25
N14 -> N25
N16 -> N25
N17 -> N25
N18 -> N28
N19 -> N31
N21 -> N33
N22 -> N33
N24 -> N33
N26 -> N36
N27 -> N39
N29 -> N41
N30 -> N41
N32 -> N41
N33 -> N41
N34 -> N44
N35 -> N47
N37 -> N49
N38 -> N49
N40 -> N49
N41 -> N49
N42 -> N52
N43 -> N55
N45 -> N57
N46 -> N57
N48 -> N57
N50 -> N60
N51 -> N63
N53 -> N65
N54 -> N65
N56 -> N65
N57 -> N65
N58 -> N68
N59 -> N71
N61 -> N73
N62 -> N73
N64 -> N73
N65 -> N73
N66 -> N76
N67 -> N79
N69 -> N81
N70 -> N81
N72 -> N81
N74 -> N84
N75 -> N87
N77 -> N89
N78 -> N89
N80 -> N89
N81 -> N89
N82 -> N92
N83 -> N95
N85 -> N97
N86 -> N97
N88 -> N97
N89 -> N97
N90 -> N100
N91 -> N103
N93 -> N105
N94 -> N105
N96 -> N105
N98 -> N108
N99 -> N111
N101 -> N113
N102 -> N113
N104 -> N113
N105 -> N113
N106 -> N116
N107 -> N119
N109 -> N121
N110 -> N121
N112 -> N121
N113 -> N121
N114 -> N124
N115 -> N127
N117 -> N129
N118 -> N129
N120 -> N129
N122 -> N132
N123 -> N135
N125 -> N137
N126 -> N137
N128 -> N137
N129 -> N137
N130 -> N140
N131 -> N143
N133 -> N145
N134 -> N145
N136 -> N145
N137 -> N145
N138 -> N148
N139 -> N151
N141 -> N153
N142 -> N153
N144 -> N153
N146 -> N156
N147 -> N159
N149 -> N161
N150 -> N161
N152 -> N161
N153 -> N161
N154 -> N164
N155 -> N167
N157 -> N169
N158 -> N169
N160 -> N169
N161 -> N169
N162 -> N172
N163 -> N175
N165 -> N177
N166 -> N177
N168 -> N177
N170 -> N180
N171 -> N183
N173 -> N185
N174 -> N185
N176 -> N185
N177 -> N185
N178 -> N188
N179 -> N191
N181 -> N193
N182 -> N193
N184 -> N193
N185 -> N193
N186 -> N196
N187 -> N199
N189 -> N201
N190 -> N201
N192 -> N201
N194 -> N204
N195 -> N207
N197 -> N209
N198 -> N209
N200 -> N209
N201 -> N209
N202 -> N212
N203 -> N215
N205 -> N217
N206 -> N217
N208 -> N217
N209 -> N217
N210 -> N220
N211 -> N223
N213 -> N225
N214 -> N225
N216 -> N225
N218 -> N228
N219 -> N231
N221 -> N233
N222 -> N233
N224 -> N233
N225 -> N233
N226 -> N236
N227 -> N239
N229 -> N241
N230 -> N241
N232 -> N241
N233 -> N241
N234 -> N244
N235 -> N247
N237 -> N249
N238 -> N249
N240 -> N249
N242 -> N252
N243 -> N255
N245 -> N255
N246 -> N255
N248 -> N255
N249 -> N255
}

File diff suppressed because it is too large

(binary image added: 66 KiB)


@@ -21,7 +21,7 @@ _Gnuplot_ script. Raw measurement data is stored as CSV (see 'csv.hpp').
Breaking Point Testing
----------------------
Topo-10::
Topo-10.dot::
Topology of the processing load used as typical example for _breaking a schedule._
This Graph with 64 nodes is generated by the pre-configured rules
`configureShape_chain_loadBursts()`; it starts with a single linear, yet »bursts«
@@ -134,6 +134,13 @@ reproducibly slower (at least on my machine). Below 90 jobs, also the spread of
value is larger, as is the spread of time in _impeded state,_ which is defined as less than two
workers processing active job content at a given time.
Stationary Load
---------------
The goal for this setup is to demonstrate stable processing over an extended period of time.
Topo-20.dot::
Topology used to emulate a realistic load.
It comprises small yet interleaved dependency trees,
filling each level up to the maximum capacity (limited here to 8 workers).


@@ -140,7 +140,7 @@ namespace gear {
const auto IDLE_WAIT = 20ms; ///< sleep-recheck cycle for workers deemed _idle_
const size_t DISMISS_CYCLES = 100; ///< number of wait cycles before an idle worker terminates completely
Offset DUTY_CYCLE_PERIOD{FSecs(1,20)}; ///< period of the regular scheduler »tick« for state maintenance.
Offset DUTY_CYCLE_TOLERANCE{FSecs(1,10)}; ///< maximum slip tolerated on duty-cycle start before triggering Scheduler-emergency
Offset DUTY_CYCLE_TOLERANCE{FSecs(2,10)}; ///< maximum slip tolerated on duty-cycle start before triggering Scheduler-emergency
Offset FUTURE_PLANNING_LIMIT{FSecs{20}}; ///< limit timespan of deadline into the future (~360 MiB max)
}


@@ -484,8 +484,8 @@ cout << "time="<<runTime/1000
// double UPPER_STRESS = 12;
//
// double FAIL_LIMIT = 1.0; //0.7;
double TRIGGER_SDEV = 1.0; //0.25;
double TRIGGER_DELTA = 2.0; //0.5;
// double TRIGGER_SDEV = 1.0; //0.25;
// double TRIGGER_DELTA = 2.0; //0.5;
// uint CONCURRENCY = 4;
uint CONCURRENCY = 8;
// bool SCHED_DEPENDS = true;
@@ -506,7 +506,7 @@ cout << "time="<<runTime/1000
{
return StressRig::testSetup(testLoad)
// .withBaseExpense(200us)
.withLoadTimeBase(4ms);
.withLoadTimeBase(5ms);
}
};
auto [stress,delta,time] = StressRig::with<Setup>()


@@ -207,7 +207,7 @@ namespace test {
uint CONCURRENCY = work::Config::getDefaultComputationCapacity();
bool INSTRUMENTATION = true;
double EPSILON = 0.01; ///< error bound to abort binary search
double UPPER_STRESS = 1.2; ///< starting point for the upper limit, likely to fail
double UPPER_STRESS = 1.7; ///< starting point for the upper limit, likely to fail
double FAIL_LIMIT = 2.0; ///< delta-limit when to count a run as failure
double TRIGGER_FAIL = 0.55; ///< %-fact: criterion-1 failures above this rate
double TRIGGER_SDEV = FAIL_LIMIT; ///< in ms : criterion-2 standard derivation
@@ -289,8 +289,6 @@ namespace test {
double expTime{0};
};
double adjustmentFac{1.0};
/** prepare the ScheduleCtx for a specifically parametrised test series */
void
configureTest (TestSetup& testSetup, double stressFac)
@@ -312,8 +310,7 @@ namespace test {
{
runTime[i] = testSetup.launch_and_wait() / 1000;
avgT += runTime[i];
testSetup.adaptEmpirically (stressFac, CONF::CONCURRENCY);
this->adjustmentFac = 1 / (testSetup.getStressFac() / stressFac);
maybeAdaptScaleEmpirically (testSetup, stressFac);
}
expT = testSetup.getExpectedEndTime() / 1000;
avgT /= CONF::REPETITIONS;
@@ -337,9 +334,10 @@ namespace test {
bool
decideBreakPoint (Res& res)
{
return res.percentOff > CONF::TRIGGER_FAIL
return res.percentOff > 0.99
or( res.percentOff > CONF::TRIGGER_FAIL
and res.stdDev > CONF::TRIGGER_SDEV
and res.avgDelta > CONF::TRIGGER_DELTA;
and res.avgDelta > CONF::TRIGGER_DELTA);
}
/**
@@ -377,6 +375,37 @@ namespace test {
return res;
}
/** adaptive scale correction based on observed behaviour */
double adjustmentFac{1.0};
size_t gaugeProbes = 3 * CONF::REPETITIONS;
/**
* Attempt to factor out some observable properties, which are considered circumstantial
* and not a direct result of scheduling overheads. The artificial computational load is
* known to drift towards larger values than calibrated; moreover the actual concurrency
* achieved can deviate from the heuristic assumptions built into the testing schedule.
* The latter is problematic to some degree however, since the Scheduler is bound to
* scale down capacity when idle. To strike a reasonable balance, this adjustment of
* the measurement scale is done only initially, and when the stress factor is high
* and some degree of pressure on the scheduler can thus be assumed.
*/
void
maybeAdaptScaleEmpirically (TestSetup& testSetup, double stressFac)
{
double formFac = testSetup.determineEmpiricFormFactor (CONF::CONCURRENCY);
if (not gaugeProbes) return;
double gain = util::limited (0, pow(stressFac, 9), 1);
if (gain < 0.2) return;
//
// double formFac = testSetup.determineEmpiricFormFactor (CONF::CONCURRENCY);
double afak = adjustmentFac;
adjustmentFac = gain*formFac + (1-gain)*adjustmentFac;
cout << _Fmt{"g:%-2d|%3.1f stress:%4.2f formFac=%5.3f ▶ %5.3f -> %5.3f => %5.3f"}
% gaugeProbes % gain % stressFac% formFac % afak%adjustmentFac % (stressFac/adjustmentFac) <<endl;
testSetup.withAdaptedSchedule(stressFac, CONF::CONCURRENCY, adjustmentFac);
--gaugeProbes;
}
_Fmt fmtRun_ {"....·%-2d: Δ=%4.1f t=%4.1f %s %s"}; // i % Δ % t % t>avg? % fail?
_Fmt fmtStep_{ "%4.2f| : ∅Δ=%4.1f±%-4.2f ∅t=%4.1f %s %%%-3.0f -- expect:%4.1fms"};// stress % ∅Δ % σ % ∅t % fail % percentOff % t-expect


@@ -1971,30 +1971,27 @@ namespace test {
return move(*this);
}
ScheduleCtx&&
adaptEmpirically (double stressFac =1.0, uint concurrency=0)
double
determineEmpiricFormFactor (uint concurrency=0)
{
if (watchInvocations_)
{
auto stat = watchInvocations_->evaluate();
if (0 < stat.activationCnt)
{// looks like we have actual measurement data
ENSURE (0.0 < stat.avgConcurrency);
if (not concurrency)
concurrency = defaultConcurrency();
double worktimeRatio = 1 - stat.timeAtConc(0) / stat.coveredTime;
double workConcurrency = stat.avgConcurrency / worktimeRatio;
double weightSum = chainLoad_.calcWeightSum();
double expectedCompoundedWeight = chainLoad_.calcExpectedCompoundedWeight(concurrency);
double expectedConcurrency = weightSum / expectedCompoundedWeight;
double formFac = 1 / (workConcurrency / expectedConcurrency);
double expectedNodeTime = _uSec(compuLoad_->timeBase) * weightSum / chainLoad_.size();
double realAvgNodeTime = stat.activeTime / stat.activationCnt;
formFac *= realAvgNodeTime / expectedNodeTime;
return withAdaptedSchedule (stressFac, concurrency, formFac);
}
}
return move(*this);
if (not watchInvocations_) return 1.0;
auto stat = watchInvocations_->evaluate();
if (0 == stat.activationCnt) return 1.0;
// looks like we have actual measurement data...
ENSURE (0.0 < stat.avgConcurrency);
if (not concurrency)
concurrency = defaultConcurrency();
double worktimeRatio = 1 - stat.timeAtConc(0) / stat.coveredTime;
double workConcurrency = stat.avgConcurrency / worktimeRatio;
double weightSum = chainLoad_.calcWeightSum();
double expectedCompoundedWeight = chainLoad_.calcExpectedCompoundedWeight(concurrency);
double expectedConcurrency = weightSum / expectedCompoundedWeight;
double formFac = 1 / (workConcurrency / expectedConcurrency);
double expectedNodeTime = _uSec(compuLoad_->timeBase) * weightSum / chainLoad_.size();
double realAvgNodeTime = stat.activeTime / stat.activationCnt;
formFac *= realAvgNodeTime / expectedNodeTime;
cout<<"∅conc:"<<stat.avgConcurrency<<" ....f◇f="<<formFac<<endl;
return formFac;
}
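To make the scale correction above concrete, here is a standalone restatement of the same arithmetic with invented example numbers; the helper `computeFormFactor` and its parameter list are not part of the codebase, they just mirror the computation in `determineEmpiricFormFactor`.

```cpp
#include <cassert>
#include <cmath>

// Invented restatement of the form-factor arithmetic: corrects for
// (a) achieved concurrency deviating from the heuristic expectation and
// (b) drift of the real per-node computation time against calibration.
double computeFormFactor (double avgConcurrency, double timeAtConc0, double coveredTime,
                          double weightSum, double expectedCompoundedWeight,
                          double activeTime, double activationCnt,
                          double timeBase_us, double nodeCnt)
{
    double worktimeRatio       = 1 - timeAtConc0 / coveredTime;         // fraction of covered time spent working
    double workConcurrency     = avgConcurrency / worktimeRatio;        // concurrency while actually working
    double expectedConcurrency = weightSum / expectedCompoundedWeight;
    double formFac             = expectedConcurrency / workConcurrency; // ≡ 1/(work/expected)
    double expectedNodeTime    = timeBase_us * weightSum / nodeCnt;     // µs per node, as calibrated
    double realAvgNodeTime     = activeTime / activationCnt;            // µs per node, as observed
    return formFac * (realAvgNodeTime / expectedNodeTime);
}
```

For instance (numbers invented): an average concurrency of 1.8 with 10% fully idle time yields a work concurrency of 2.0; if the weight sums predict a concurrency of 4 and the node times match calibration exactly, the form factor comes out as 2.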
ScheduleCtx&&


@@ -7352,7 +7352,7 @@ While the ability to reason about activities and verify their behaviour in isola
The way other parts of the system are built requires us to obtain guaranteed knowledge of some job's termination. It is possible to obtain that knowledge with some limited delay, but it needs to be absolutely reliable (violations leading to segfault). The requirements stated above assume this can be achieved through //jobs with guaranteed execution.// Alternatively we could consider installing specific callbacks -- in this case the scheduler itself has to guarantee the invocation of these callbacks, even if the corresponding job fails or is never invoked. It doesn't seem there is any other option.
</pre>
</div>
<div title="SchedulerTest" creator="Ichthyostega" modifier="Ichthyostega" created="202312281814" modified="202404112327" tags="Rendering operational draft img" changecount="99">
<div title="SchedulerTest" creator="Ichthyostega" modifier="Ichthyostega" created="202312281814" modified="202404172013" tags="Rendering operational draft img" changecount="106">
<pre>With the Scheduler testing effort [[#1344|https://issues.lumiera.org/ticket/1344]], several goals are pursued
* by exposing the new scheduler implementation to excessive overload, its robustness can be assessed and defects can be spotted
* with the help of a systematic, calibrated load, characteristic performance limits and breaking points can be established
@@ -7452,17 +7452,15 @@ The example presented to the right uses a similar setup (''8 workers''), but red
As net effect, most of the load peaks are just handled by two workers, especially for larger load sizes; most of the available processing capacity remains unused for such short running payloads. Moreover, on average a significant amount of time is spent with partially blocked or impeded operation (&amp;rarr; light green circles), since administrative work must be done non-concurrently. Depending on the perspective, this can be seen as a weakness -- or as the result of a deliberate trade-off made by the choice of active work-pulling and a passive Scheduler.
The actual average in-job time (&amp;rarr; dark green dots) is offset significantly here, and closer to 400µs -- which is also confirmed by the gradient of the linear model (0.4ms / 2 Threads ≙ 0.2ms/job). With shorter load sizes below 90 jobs, increased variance can be observerd, and measurements can no longer be subsumed under a single linear relation -- in fact, data points seem to be arranged into several groups with differing, yet mostly linear correlation, which also explains the negative socket value of the overall computed model; using only the data points with &gt; 90 jobs would yield a model with slightly lower gradient but a positive offset of ~2ms.
The actual average in-job time (&amp;rarr; dark green dots) is offset significantly here, and closer to 400µs -- which is also confirmed by the gradient of the linear model (0.4ms / 2 Threads ≙ 0.2ms/job). With shorter load sizes below 90 jobs, increased variance can be observed, and measurements can no longer be subsumed under a single linear relation -- in fact, data points seem to be arranged into several groups with differing, yet mostly linear correlation, which also explains the negative socket value of the overall computed model; using only the data points with &gt; 90 jobs would yield a model with slightly lower gradient but a positive offset of ~2ms.
&lt;html&gt;&lt;div style=&quot;clear: both&quot;/&gt;&lt;/html&gt;
Further measurement runs with other parameter values fit well in between the two extremes presented above. It can be concluded that this Scheduler implementation strongly favours larger job sizes starting with several milliseconds, when it comes to processing through an extended homogeneous workload without many job interdependencies. Such larger lot sizes can be handled efficiently and close to expected limits, while very small jobs massively degrade the available performance. This can be attributed both to the choice of a randomised capacity distribution and to pull processing without a central manager.
!!!Stationary Processing
The ultimate goal of //load- and stress testing// is to establish a notion of //full load// and to demonstrate adequate performance under //nominal load conditions.// Thus, after investigating overheads and the breaking point of a complex schedule, a measurement setup was established with load patterns deemed „realistic“ -- based on knowledge regarding some typical media processing demands encountered for video editing. Such a setup entails small dependency trees of jobs loaded with computation times around 5ms, interleaving several challenges up to the available level of concurrency. To determine viable parameter bounds, the //breaking-point// measurement method can be applied to an extended graph with this structure, to find out at which level the computations will use the system's abilities to such a degree that it is not able to move along faster any more.
&lt;html&gt;&lt;img title=&quot;Load topology for stationary processing with 8 cores&quot; src=&quot;dump/2024-04-08.Scheduler-LoadTest/Topo-20.svg&quot; style=&quot;width:100%&quot;/&gt;&lt;/html&gt;
lorem ipsum
lorem ipsum nebbich
ja luia sog I
This research again revealed the tendency of the given Scheduler implementation to ''scale down capacity unless overloaded''. Using the breaking-point method with such a fine-grained and rather homogeneous schedule can be problematic, since a search for the limit will inevitably involve running several probes //below the limit// -- which can cause the scheduler to reduce the number of workers used to a level that just fills the available time. Depending on the path taken, the search can thus find a breaking point corresponding to a throttled capacity, while a search path through parameter ranges of overload will reveal the ability to follow a much tighter schedule. While this is an inherent problem of this measurement approach, it can be mitigated to some degree by limiting the empirical adaptation of the parameter scale to the initial phase of the measurement, while ensuring this initial phase starts from overload territory.
</pre>
</div>
<div title="SchedulerWorker" creator="Ichthyostega" modifier="Ichthyostega" created="202309041605" modified="202312281745" tags="Rendering operational spec draft" changecount="21">


@@ -87696,23 +87696,18 @@ Date:&#160;&#160;&#160;Thu Apr 20 18:53:17 2023 +0200<br/>
</node>
<node COLOR="#338800" CREATED="1713279448677" ID="ID_1699330405" LINK="#ID_1493818454" MODIFIED="1713279611750" TEXT="later: another follow-up error; after the fix, correct behaviour verified">
<richcontent TYPE="NOTE"><html>
<head>
</head>
<head/>
<body>
<p>
...I had tested this fix only superficially, and thereby overlooked that an assertion can trigger (and most likely will trigger at some point, once the repair mechanism covers a larger stretch). This is not a bug in the actual repair/reLink mechanism though; it works precisely, as I could verify once more in detail with the debugger.
</p>
</body>
</html>
</richcontent>
</html></richcontent>
<icon BUILTIN="button_ok"/>
</node>
<node COLOR="#435e98" CREATED="1713279615787" ID="ID_307598630" MODIFIED="1713279730132" TEXT="deadlines are a separate topic">
<richcontent TYPE="NOTE"><html>
<head>
</head>
<head/>
<body>
<p>
once a deadline has been overrun, any further access to the Extent must be regarded as _undefined behaviour_. This also applies to the AllocatorHandle obtained earlier for a specific deadline; it may well be used further (as long as that deadline still lies in the future). Concrete case: attaching another dependency later on. If the anchor of this dependency has already been executed or invalidated by then, it is your own fault!
@@ -116289,9 +116284,7 @@ std::cout &lt;&lt; tmpl.render({&quot;what&quot;, &quot;World&quot;}) &lt;&lt; s
<icon BUILTIN="help"/>
<node CREATED="1713113053955" ID="ID_1665688338" MODIFIED="1713113079288" TEXT="should lie outside the measured job times anyway">
<richcontent TYPE="NOTE"><html>
<head>
</head>
<head/>
<body>
<p>
and thus not enter the form factor via a deviation of the job times
@@ -116302,9 +116295,7 @@ std::cout &lt;&lt; tmpl.render({&quot;what&quot;, &quot;World&quot;}) &lt;&lt; s
<node COLOR="#435e98" CREATED="1713113092429" ID="ID_674076723" MODIFIED="1713131827112" TEXT="&#x27f9; output the actual concurrency">
<node CREATED="1713131159383" ID="ID_1723960817" MODIFIED="1713131190568">
<richcontent TYPE="NODE"><html>
<head>
</head>
<head/>
<body>
<p>
naja... die ist <i>unterirdisch</i>
@ -116314,9 +116305,7 @@ std::cout &lt;&lt; tmpl.render({&quot;what&quot;, &quot;World&quot;}) &lt;&lt; s
</node>
<node CREATED="1713131550668" ID="ID_1108435309" MODIFIED="1713138652526" TEXT="hovers around 2 &#x2014; trending downward">
<richcontent TYPE="NOTE"><html>
<head/>
<body>
<p>
as a reminder: within a series we perform a kind of convergence towards an effective form factor. With the results of one run, the next run is readjusted; the externally given (nominal) stress factor remains, but the actual density is optimised so that it then effectively corresponds to this factor. In the course of this adjustment the schedule apparently gets somewhat compacted each time, and the achieved concurrency falls (from slightly above 2 down to 1.6 most recently)
</p>
</body>
</html></richcontent>
</node>
<node CREATED="1713131782234" ID="ID_68565346" MODIFIED="1713131814392" TEXT="but with a load = 500&#xb5;s this is not surprising either">
<richcontent TYPE="NOTE"><html>
<head/>
<body>
<p>
at least according to the observations meanwhile available from the Param-Range setup
</p>
</body>
</html></richcontent>
<arrowlink COLOR="#a32248" DESTINATION="ID_1108435309" ENDARROW="Default" ENDINCLINATION="-451;33;" ID="Arrow_ID_1455964137" STARTARROW="None" STARTINCLINATION="748;-44;"/>
<icon BUILTIN="messagebox_warning"/>
</node>
<node COLOR="#338800" CREATED="1713132789379" ID="ID_340707211" MODIFIED="1713381423897" TEXT="try with a longer job load (5ms)">
<icon BUILTIN="button_ok"/>
<node CREATED="1713297705300" ID="ID_966787442" MODIFIED="1713381635480" TEXT="after fixing an inconsistency....">
<arrowlink COLOR="#648bb9" DESTINATION="ID_651763651" ENDARROW="Default" ENDINCLINATION="23;-13;" ID="Arrow_ID_1152882180" STARTARROW="None" STARTINCLINATION="-98;8;"/>
</node>
<node COLOR="#435e98" CREATED="1713132381066" FOLDED="true" ID="ID_651763651" MODIFIED="1713279739989" TEXT="invalid access from the allocator (isValidPos (idx))">
<node CREATED="1713297781395" ID="ID_1732931301" MODIFIED="1713297807516" TEXT="the measurement method also runs quite similarly with larger load"/>
<node COLOR="#338800" CREATED="1713378539378" ID="ID_783154296" MODIFIED="1713381614244" TEXT="adjustments to prevent the throttling of concurrency">
<arrowlink COLOR="#6792a2" DESTINATION="ID_543037683" ENDARROW="Default" ENDINCLINATION="-148;-48;" ID="Arrow_ID_367320078" STARTARROW="None" STARTINCLINATION="-395;18;"/>
<icon BUILTIN="yes"/>
<icon BUILTIN="button_ok"/>
</node>
<node BACKGROUND_COLOR="#c8c0b6" CREATED="1713378767420" ID="ID_1986630212" MODIFIED="1713381454582">
<richcontent TYPE="NODE"><html>
<head/>
<body>
<p>
with this, the utilization now goes<i>&#160;reasonably high</i>
</p>
</body>
</html></richcontent>
<richcontent TYPE="NOTE"><html>
<head>
</head>
<body>
<p>
it is and remains a compromise....
</p>
<p>
Here I am trying to make a very specific measurement method halfway generically usable, but in doing so I am already putting dangerously many prior assumptions about the Scheduler into the measurement process
</p>
</body>
</html>
</richcontent>
<icon BUILTIN="forward"/>
</node>
</node>
<node BACKGROUND_COLOR="#e0ceaa" COLOR="#690f14" CREATED="1713297808257" FOLDED="true" ID="ID_661966582" MODIFIED="1713381624280" TEXT="concurrency stays &lt; 3">
<icon BUILTIN="messagebox_warning"/>
<node CREATED="1713298065585" ID="ID_503149788" MODIFIED="1713298079920" TEXT="initially we start with concurrency &gt; 7"/>
<node CREATED="1713298409701" ID="ID_711601314" MODIFIED="1713298565853" TEXT="in the first runs the concurrency initially decreases"/>
<node CREATED="1713298566299" ID="ID_604402875" MODIFIED="1713298828656" TEXT="from run 3~4 on the situation is stable"/>
<node CREATED="1713298635266" ID="ID_1669728736" MODIFIED="1713298694718" TEXT="the expected times are proportionally longer"/>
<node BACKGROUND_COLOR="#dac790" COLOR="#8c3a2c" CREATED="1713298882342" ID="ID_929524356" MODIFIED="1713381398183" TEXT="the readjustment prevents exhausting the full capacity">
<icon BUILTIN="broken-line"/>
<node CREATED="1713299259060" ID="ID_1959279893" MODIFIED="1713299291015" TEXT="seems logical: there is &#x201e;slack&#x201c; in every layer"/>
<node CREATED="1713299335656" ID="ID_1671680079" MODIFIED="1713299361793" TEXT="the concurrency only stays high under (over)pressure">
<icon BUILTIN="messagebox_warning"/>
</node>
<node CREATED="1713299482330" ID="ID_1397469244" MODIFIED="1713299718759" TEXT="the capacity only becomes available with a delay">
<richcontent TYPE="NOTE"><html>
<head/>
<body>
<p>
capacity is normally distributed randomly into an active zone; only when we fall behind the schedule are all workers deployed
</p>
</body>
</html></richcontent>
</node>
<node CREATED="1713299610829" ID="ID_1483803959" MODIFIED="1713299877054" TEXT="schedule relaxed &#x27f9; most cores go to sleep">
<richcontent TYPE="NOTE"><html>
<head/>
<body>
<p>
...if, due to previously observed low parallelism, the schedule is spread out, then the processing of a layer finishes early, and the workers are distributed behind the start point of the next level. There too, capacity thus ramps up only slowly, and after a few rounds a small number of active workers has crystallised. The further readjustment then ensures precisely that the schedule is generous enough for these few workers to manage it.
</p>
</body>
</html></richcontent>
</node>
<node CREATED="1713300172455" ID="ID_1122795031" MODIFIED="1713300182139" TEXT="for confirmation: try with 10ms">
<node CREATED="1713300183270" ID="ID_304503677" MODIFIED="1713300202495" TEXT="&#x27f9; the first round runs &gt; 6-conc"/>
<node CREATED="1713300203656" ID="ID_1857649484" MODIFIED="1713300215597" TEXT="after that the stress is relaxed"/>
<node CREATED="1713300216369" ID="ID_139611638" MODIFIED="1713300237818" TEXT="and the utilization converges &#x27f6; 3.2"/>
</node>
</node>
<node BACKGROUND_COLOR="#ebd5bd" COLOR="#fa002a" CREATED="1713300492012" ID="ID_97024035" MODIFIED="1713300612987">
<richcontent TYPE="NODE"><html>
<head/>
<body>
<p>
this setup does <b>not</b>&#160;observe the &#187;breaking point&#171;
</p>
<p>
&#8212; but rather the attainment of a load target
</p>
</body>
</html></richcontent>
<icon BUILTIN="messagebox_warning"/>
<node CREATED="1713300616842" ID="ID_1218029367" MODIFIED="1713301638566" TEXT="but this is also due to the elastic load">
<richcontent TYPE="NOTE"><html>
<head/>
<body>
<p>
Originally the &#187;breaking point&#171; method was developed against a complex load pattern, which degenerates very markedly under overload, insofar as central prerequisite nodes are then reached only late, so that the entire schedule is drastically delayed. Nothing of that kind is present here. Rather, the runtime simply stretches elastically and proportionally when a schedule, once given without any buffer, is fulfilled exactly. Since the search approaches from the side of low load, never enough pressure is built up to drive the concurrency high.
</p>
</body>
</html></richcontent>
</node>
<node CREATED="1713301234231" ID="ID_47056091" MODIFIED="1713301635592" TEXT="the danger is that the readjustment shifts the scale within the search space">
<richcontent TYPE="NOTE"><html>
<head/>
<body>
<p>
relaxed &#10233; low concurrency &#10233; this point is classified as more stringent and the schedule becomes even more relaxed &#10233; if we now returned to the previous test point, the test goal (= schedule broken) might possibly no longer be met there at all
</p>
</body>
</html></richcontent>
</node>
<node BACKGROUND_COLOR="#e0ceaa" COLOR="#520f69" CREATED="1713303673229" ID="ID_1191581589" MODIFIED="1713381357181" TEXT="the readjustment could be restricted">
<icon BUILTIN="idea"/>
<node CREATED="1713310003377" ID="ID_297923973" MODIFIED="1713310017089" TEXT="namely regarding the concurrency"/>
<node CREATED="1713310017927" ID="ID_1386885337" MODIFIED="1713310026554" TEXT="it could be done only initially"/>
<node CREATED="1713310032685" ID="ID_512748510" MODIFIED="1713310055438" TEXT="it could be allowed only partially"/>
<node CREATED="1713310144799" ID="ID_1050697363" MODIFIED="1713310168964" TEXT="adjustments could be limited locally"/>
<node CREATED="1713310056306" ID="ID_173741470" MODIFIED="1713310188349" TEXT="readjust only under high load"/>
</node>
</node>
<node COLOR="#435e98" CREATED="1713378539378" ID="ID_543037683" MODIFIED="1713381607236" TEXT="adjustments">
<linktarget COLOR="#6792a2" DESTINATION="ID_543037683" ENDARROW="Default" ENDINCLINATION="-148;-48;" ID="Arrow_ID_367320078" SOURCE="ID_783154296" STARTARROW="None" STARTINCLINATION="-395;18;"/>
<icon BUILTIN="yes"/>
<node CREATED="1713378549353" ID="ID_1117044672" MODIFIED="1713378578881" TEXT="gradual adjustment of the stored correction factor"/>
<node CREATED="1713378579642" ID="ID_1127117354" MODIFIED="1713378611245" TEXT="only correct when stress &gt; 0.9">
<node CREATED="1713378613776" ID="ID_1661254915" MODIFIED="1713378628514" TEXT="it turns out: the scheduler throttles capacity down very effectively"/>
<node CREATED="1713378629243" ID="ID_1009761299" MODIFIED="1713378637473" TEXT="you really have to put it under pressure"/>
</node>
<node CREATED="1713378543482" ID="ID_575096101" MODIFIED="1713378734842" TEXT="only readjust at the beginning">
<node CREATED="1713378639765" ID="ID_651104418" MODIFIED="1713378713263" TEXT="...and set the starting point more aggressively"/>
<node CREATED="1713378713903" ID="ID_1710484349" MODIFIED="1713378727005" TEXT=".....so that the first runs happen under pressure"/>
</node>
</node>
</node>
<node COLOR="#435e98" CREATED="1713132381066" FOLDED="true" ID="ID_651763651" MODIFIED="1713381635480" TEXT="invalid access from the allocator (isValidPos (idx))">
<richcontent TYPE="NOTE"><html>
<head/>
<body>
<p>
extent-family.hpp:356
</p>
</body>
</html></richcontent>
<linktarget COLOR="#648bb9" DESTINATION="ID_651763651" ENDARROW="Default" ENDINCLINATION="23;-13;" ID="Arrow_ID_1152882180" SOURCE="ID_966787442" STARTARROW="None" STARTINCLINATION="-98;8;"/>
<icon BUILTIN="broken-line"/>
<node CREATED="1713132457217" ID="ID_1257716735" LINK="#ID_917615827" MODIFIED="1713132548204" TEXT="the same assertion has triggered before">
<richcontent TYPE="NOTE"><html>
<head/>
<body>
<p>
...back then, however, from an entirely different context, which meanwhile is/should be fixed by a rework in the Scheduler.
</p>
</body>
</html></richcontent>
<node CREATED="1713132773293" ID="ID_818906770" MODIFIED="1713132783368" TEXT="occurred after a few runs of the 1st series">
<node CREATED="1713132811175" ID="ID_1384338907" MODIFIED="1713132859169">
<richcontent TYPE="NODE"><html>
<head/>
<body>
<p>
<font color="#114cbf">&#187;investigateWorkProcessing&#171;</font>
</p>
</body>
</html></richcontent>
</node>
<node CREATED="1713132874202" ID="ID_222679060" MODIFIED="1713190676385" TEXT="Stacktrace">
<richcontent TYPE="NOTE"><html>
<head/>
<body>
<p>
<font face="Monospaced" size="2">0000000609: PRECONDITION: extent-family.hpp:356: thread_9: access: (isValidPos (idx)) </font>
</p>
</body>
</html></richcontent>
</node>
<node CREATED="1713133619529" ID="ID_82008393" MODIFIED="1713133626170">
<richcontent TYPE="NODE"><html>
<head/>
<body>
<p>
<font color="#c50955" face="Monospaced" size="2">linkToPredecessor()</font>
</p>
</body>
</html></richcontent>
</node>
<node BACKGROUND_COLOR="#e0ceaa" COLOR="#690f14" CREATED="1713134829893" ID="ID_169525701" MODIFIED="1713134851513">
<richcontent TYPE="NODE"><html>
<head/>
<body>
<p>
occurs <b>reproducibly</b>&#160;from Load=4ms onward
</p>
</body>
</html></richcontent>
</node>
<node CREATED="1713190363888" ID="ID_1032034712" MODIFIED="1713190386538">
<richcontent TYPE="NODE"><html>
<head/>
<body>
<p>
the allocation was made on the <i>predecessor term</i>
</p>
</body>
</html></richcontent>
<icon BUILTIN="button_ok"/>
<node CREATED="1713275014742" ID="ID_1261730499" MODIFIED="1713278884631" TEXT="do not use yield() for the check">
<richcontent TYPE="NOTE"><html>
<head/>
<body>
<p>
...that is <i>somewhat creative</i> anyway
</p>
</body>
</html></richcontent>
<icon BUILTIN="yes"/>
</node>
<node COLOR="#338800" CREATED="1713275064894" ID="ID_18075325" MODIFIED="1713278877792" TEXT="instead, create a (dedicated) safe check function">
</node>
<node COLOR="#435e98" CREATED="1713278922710" ID="ID_1345315613" MODIFIED="1713279328774" TEXT="safe usage?">
<node CREATED="1713278934306" ID="ID_1396168333" MODIFIED="1713278957520">
<richcontent TYPE="NODE"><html>
<head/>
<body>
<p>
this hinges on the <i>deadline</i>
</p>
</body>
</html></richcontent>
<font NAME="SansSerif" SIZE="12"/>
<icon BUILTIN="idea"/>
</node>
</node>
<node CREATED="1713279085734" ID="ID_701792778" MODIFIED="1713279202169" TEXT="the deadline is not checked in the allocator &#x2014; deliberately">
<richcontent TYPE="NOTE"><html>
<head/>
<body>
<p>
The allocator in itself is robust; the deadline merely describes a usage contract; while it is stored in the gate of the block, for the allocator it matters only for finding the appropriate block. The further usage pattern must be guaranteed at a higher level.
</p>
</body>
</html></richcontent>
</node>
<node CREATED="1713279203766" ID="ID_1759355521" MODIFIED="1713279234766" TEXT="the allocation/usage pattern is tied to the usage context"/>
<node CREATED="1713279235549" ID="ID_380603990" MODIFIED="1713279288738" TEXT="concretely: it is your own fault if you set a dependency on an expired activity">
</node>
<node CREATED="1712878724339" ID="ID_355423762" MODIFIED="1712878743580" TEXT="however, only after correcting systematic influences">
<node CREATED="1712878745848" ID="ID_1272865652" MODIFIED="1712878932404" TEXT="the test load runs slower than calibrated">
<richcontent TYPE="NOTE"><html>
<head/>
<body>
<p>
And this holds independently of whether the calibration was done with short or long times and single- or multi-threaded. The deviation occurs only in the real load context and is (visually, judging from the diagrams) correlated with the degree of contention and irregularity in the execution. It tends to decrease for longer test runs, but converges &#8212; even for very large loads and very long runs &#8212; typically towards an offset of ~ +1ms
</p>
</body>
</html></richcontent>
</node>
<node CREATED="1712878761558" ID="ID_398679192" MODIFIED="1712879104846" TEXT="there is a relatively stable deviation of the effective concurrency from the heuristic">
<richcontent TYPE="NOTE"><html>
<head/>
<body>
<p>
And that is to be expected for purely logical reasons alone. When setting up the heuristic for the test schedule I deliberately refrained from any <i>optimal arrangement of the computation paths</i>&#160;(no solving a box-stacking problem!). Added to this are the actual limitations of the worker pool. This yields a characteristic deviation between a theoretically computed concurrency speed-up (as factored into the schedule) and the <b>empirically</b>&#160;observed average concurrency. This is interpreted as the <b>form factor</b>.
</p>
</body>
</html></richcontent>
</node>
<node CREATED="1712879138799" ID="ID_718120993" MODIFIED="1712879147226" TEXT="Load Peak / Param Range">
<node CREATED="1712879148222" ID="ID_461458634" MODIFIED="1712879195546" TEXT="clearly linear relationship">
<richcontent TYPE="NOTE"><html>
<head/>
<body>
<p>
between load size and the runtime for completely processing the generated load peak
</p>
</body>
</html></richcontent>
<node CREATED="1712879211454" ID="ID_686196534" MODIFIED="1712879234055" TEXT="for larger test lengths: very high correlations"/>
<node CREATED="1712879236775" ID="ID_787579644" MODIFIED="1712879295485">
<richcontent TYPE="NODE"><html>
<head/>
<body>
<p>
gradient <b>very close</b>&#160;to the expected value
</p>
</body>
</html></richcontent>
<richcontent TYPE="NOTE"><html>
<head/>
<body>
<p>
when applying the empirically observed effective concurrency and the real average job time
</p>
</body>
</html></richcontent>
</node>
<node CREATED="1712879303773" ID="ID_1548672424" MODIFIED="1712879504409" TEXT="base offset: 5ms + spin-up/down effect">
<richcontent TYPE="NOTE"><html>
<head/>
<body>
<p>
this means: the actually observed base offset depends on the length of the job load and the concurrency: fundamentally, the whole worker-pool size must be accounted for once at the beginning and once at the end &#8212; with reduced concurrency. This follows already from a purely logical consideration: &#187;full load&#171; can only be constituted once <i>the first worker fetches its second job. </i>Analogously the spin-down begins when the first worker <i>falls idle.</i>