Scheduler-test: address defects in memory manager

...discovered during investigation of the latest Scheduler failures.
The root of the problems is that block overflow can potentially trigger
expansion of the allocation pool. Under some circumstances, this on-the-fly
allocation requires a rotation of index slots, thereby invalidating
existing iterators.

While such behaviour is not uncommon with storage data structures (see std::vector),
in this case it turns out to be problematic: for performance reasons,
a usage pattern has emerged which re-uses existing storage »Slots« with a known
deadline. This optimisation seems to have significant leverage on the
planning jobs, which happen to allocate and arrange a whole streak of
Activities with similar deadlines.

One of these problem situations can easily be fixed, since it is triggered
through the iterator itself, using a delegate function to request a storage expansion,
at which point the iterator is able to re-link and fix its internal index.
This solution also has no tangible performance implications in optimised code.
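
The self-repair described above can be illustrated with a minimal sketch. It assumes a hypothetical `SlotRing` whose active slots start at `start_` and wrap around, and whose expansion opens new slots directly below `start_`, moving the upper segment up. The names `SlotRing`, `IndexIter`, `FREE` and `selfRepairWorks` are illustrative assumptions and do not reproduce the actual allocator code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

constexpr int FREE = -1;          // marker for an unused slot (assumption)

// Hypothetical cyclic index: active slots start at start_ and wrap around.
struct SlotRing
{
    std::vector<int> slots;
    std::size_t start_ = 0;

    // Open cnt new slots in the gap right below start_;
    // the segment [start_, end) is thereby moved up by cnt positions.
    void openNew (std::size_t cnt)
    {
        slots.insert (slots.begin() + start_, cnt, FREE);
        start_ += cnt;
    }
};

// Index-based iterator which repairs its own position after an expansion.
struct IndexIter
{
    SlotRing* ring;
    std::size_t index;

    int& yield() const { return ring->slots[index]; }

    void expandAlloc (std::size_t cnt =1)
    {
        std::size_t prevStart = ring->start_;
        ring->openNew (cnt);
        if (index >= prevStart)                  // was in the segment that moved up
            index += ring->start_ - prevStart;   // shift by the same distance
    }
};

// Usage: the repaired iterator and a position below start_ still hit their
// slots, while a plain copy taken before the expansion now points at a free slot.
inline bool selfRepairWorks()
{
    SlotRing ring{{30, 40, FREE, 10, 20}, 3};    // active: 10,20 (pos 3,4), then 30,40 (pos 0,1)
    IndexIter upper{&ring, 4};                   // refers to 20
    IndexIter stale{&ring, 4};                   // never repaired
    upper.expandAlloc (2);                       // slots: 30,40,FREE,FREE,FREE,10,20
    IndexIter lower{&ring, 1};                   // refers to 40, below start_
    return upper.yield() == 20
        && lower.yield() == 40
        && stale.yield() == FREE;
}
```

The unrepaired copy in this sketch stands for the remaining corner case: an iterator that did not itself request the expansion has no opportunity to adjust its index.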

Unfortunately there remains one obscure corner case where such a pool expansion
could also have invalidated other iterators, which are then used later to
attach dependency relations; even a partial fix for that problem seems
to incur a considerable performance cost of about 14% in optimised code.
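
The partial fix relies on the storage blocks themselves staying at a stable address while only the index is rearranged, so an iterator can cache the address of its current block. The following sketch shows that idea and the gap it leaves. It assumes a hypothetical index of heap-allocated blocks; `Block`, `BlockIndex`, `CachingIter` and `cachedAccessLeavesGap` are made-up names, not the project's classes.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

struct Block { int id; };

// Hypothetical index of heap-allocated blocks: the index may be rearranged,
// while each Block stays at a fixed address.
using BlockIndex = std::vector<std::unique_ptr<Block>>;

class CachingIter
{
    BlockIndex* index_;
    std::size_t pos_;
    Block* curr_;                 // cached address of the current block

    Block* access() const
    {
        return pos_ < index_->size()? (*index_)[pos_].get() : nullptr;
    }

public:
    CachingIter (BlockIndex& idx, std::size_t pos)
      : index_{&idx}, pos_{pos}, curr_{access()}
    { }

    bool checkPoint() const { return curr_ != nullptr; }
    Block& yield() const    { return *curr_; }            // uses only the cached address
    void iterNext()         { ++pos_; curr_ = access(); } // relies on the stored position
};

// Usage: after the index was rearranged by someone else, the cached address
// is still correct, yet the next iteration step uses the stale position.
inline bool cachedAccessLeavesGap()
{
    BlockIndex idx;
    for (int i=1; i<=3; ++i)
        idx.push_back (std::make_unique<Block>(Block{i}));

    CachingIter it{idx, 1};                                    // refers to block 2
    idx.insert (idx.begin(), std::make_unique<Block>(Block{0})); // positions shift by one

    bool stable = it.yield().id == 2;     // cached address still refers to block 2
    it.iterNext();                        // position 1 -> 2, which now holds block 2 again
    bool gap = it.yield().id == 2;        // instead of advancing to block 3
    return stable && gap;
}
```

Dereferencing survives the rearrangement in this sketch, but iteration does not, which is why the caching approach counts only as a partial fix.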
Fischlurch 2023-12-26 20:15:04 +01:00
parent af680cdfd9
commit 3716a5b3d4
4 changed files with 124 additions and 22 deletions

@@ -372,15 +372,58 @@ namespace gear {
         return static_cast<Epoch&> (extent);
       }
-    struct StorageAdaptor : RawIter
+    /**
+     * Adapt the access to the raw storage to present the Extents as Epoch;
+     * also caches the address resolution for performance reasons (+20%).
+     */
+    class StorageAdaptor
+      : public RawIter
       {
+        Epoch* curr_{nullptr};
+        Epoch*
+        accessEpoch()
+          {
+            return RawIter::checkPoint()? & asEpoch (RawIter::yield())
+                                        : nullptr;
+          }
+      public:
         StorageAdaptor() = default;
-        StorageAdaptor(RawIter it) : RawIter{it} { }
-        Epoch& yield() const { return asEpoch (RawIter::yield()); }
+        StorageAdaptor(RawIter it)
+          : RawIter{it}
+          , curr_{accessEpoch()}
+          { }
+        bool
+        checkPoint() const
+          {
+            return bool(curr_);
+          }
+        Epoch&
+        yield() const
+          {
+            return *curr_;
+          }
+        void
+        iterNext()
+          {
+            RawIter::iterNext();
+            curr_ = accessEpoch();
+          }
+        void
+        expandAlloc (size_t cnt =1)
+          {
+            RawIter::expandAlloc(cnt);
+            curr_ = accessEpoch();
+          }
       };
   public:
     BlockFlow()
       : alloc_{Strategy::initialEpochCnt()}
@@ -499,7 +542,8 @@ namespace gear {
                 ___sanityCheckAlloc(requiredNew);
                 if (distance % _raw(epochStep_) > 0)
                   ++requiredNew;  // fractional: requested deadline lies within last epoch
-                alloc_.openNew(requiredNew);   // Note: nextEpoch now points to the first new Epoch
+                nextEpoch.expandAlloc (requiredNew);
+                // nextEpoch now points to the first new Epoch
                 for ( ; 0 < requiredNew; --requiredNew)
                   {
                     REQUIRE (nextEpoch);

@@ -163,7 +163,17 @@ namespace mem {
         /* === pass-through extended functionality === */
         size_t getIndex()  { return index; }
-        void expandAlloc(){ exFam->openNew();}
+        void
+        expandAlloc (size_t cnt =1)
+          {
+            size_t prevStart = exFam->start_;
+            exFam->openNew(cnt);
+            if (index >= prevStart)
+              index += (exFam->start_-prevStart);
+                      // was in a segment that might be moved up
+            ENSURE (exFam->isValidPos (index));
+          }
       };

@@ -524,7 +524,7 @@ namespace test {
           gear::BlockFlow<blockFlow::RenderConfig> blockFlow;
           // Note: using the RenderConfig, which uses larger blocks and more pre-allocation
           auto blockFlowAlloc = [&]{
-                  auto allocHandle = blockFlow.until(Time{400,0});
+                  auto allocHandle = blockFlow.until(Time{BASE_DEADLINE});
                   auto allocate = [&, j=0](Time t, size_t check) mutable -> Activity&
                     {
                       if (++j >= 10)  // typically several Activities are allocated on the same deadline

@@ -86024,7 +86024,7 @@ Date:&#160;&#160;&#160;Thu Apr 20 18:53:17 2023 +0200<br/>
 </node>
 <node BACKGROUND_COLOR="#eee5c3" COLOR="#990000" CREATED="1703424162079" ID="ID_247540517" MODIFIED="1703432901042" TEXT="Rework: planning job and coordination">
 <linktarget COLOR="#7d5f93" DESTINATION="ID_247540517" ENDARROW="Default" ENDINCLINATION="-616;-398;" ID="Arrow_ID_1233467542" SOURCE="ID_1652444759" STARTARROW="None" STARTINCLINATION="-1628;68;"/>
-<linktarget COLOR="#a60d7a" DESTINATION="ID_247540517" ENDARROW="Default" ENDINCLINATION="-1144;66;" ID="Arrow_ID_577004378" SOURCE="ID_451970697" STARTARROW="None" STARTINCLINATION="-106;-376;"/>
+<linktarget COLOR="#a60d7a" DESTINATION="ID_247540517" ENDARROW="Default" ENDINCLINATION="-1144;66;" ID="Arrow_ID_577004378" SOURCE="ID_451970697" STARTARROW="None" STARTINCLINATION="-117;-494;"/>
 <icon BUILTIN="yes"/>
 <node BACKGROUND_COLOR="#e0ceaa" COLOR="#690f14" CREATED="1703428031217" ID="ID_936423761" MODIFIED="1703428320686" TEXT="the solution built up so far proves fragile in exceptional cases">
 <richcontent TYPE="NOTE"><html>
@@ -86574,10 +86574,19 @@ Date:&#160;&#160;&#160;Thu Apr 20 18:53:17 2023 +0200<br/>
 </node>
 </node>
 <node CREATED="1703443091412" ID="ID_1259354421" MODIFIED="1703443096295" TEXT="Starting points">
-<node BACKGROUND_COLOR="#eee5c3" COLOR="#990000" CREATED="1703443098499" ID="ID_869681663" MODIFIED="1703443139736" TEXT="the IdxIter should repair itself in case of expandAlloc()">
+<node COLOR="#435e98" CREATED="1703443098499" ID="ID_869681663" MODIFIED="1703618109556" TEXT="the IdxIter should repair itself in case of expandAlloc()">
 <icon BUILTIN="yes"/>
 <node CREATED="1703443141159" ID="ID_1212377700" MODIFIED="1703443157045" TEXT="this would eliminate the self-inflicted hazard"/>
 <node CREATED="1703443157603" ID="ID_1554730434" MODIFIED="1703443178364" TEXT="and the cost is minimal compared to the effort of the rotation"/>
+<node COLOR="#338800" CREATED="1703618122823" ID="ID_424709567" MODIFIED="1703618180054" TEXT="Implementation">
+<icon BUILTIN="button_ok"/>
+<node CREATED="1703618145203" ID="ID_1675853787" MODIFIED="1703618152791" TEXT="only necessary when idx &gt; start_"/>
+<node CREATED="1703618153971" ID="ID_1352276864" MODIFIED="1703618175867" TEXT="then remember the old start and correct afterwards by +&#x394;"/>
+</node>
+<node CREATED="1703620372994" ID="ID_664572519" MODIFIED="1703620376477" TEXT="Benchmark">
+<node CREATED="1703620377237" ID="ID_1761258606" MODIFIED="1703620390018" TEXT="Debug-Mode: +20ns"/>
+<node CREATED="1703620390957" ID="ID_1495081869" MODIFIED="1703620409473" TEXT="Release-Mode / -O3: no measurable difference"/>
+</node>
 </node>
 <node CREATED="1703443353377" ID="ID_1706502358" MODIFIED="1703443373066" TEXT="endangering other iterators is hard to avoid">
 <node CREATED="1703443375198" ID="ID_1083582649" MODIFIED="1703443451296" TEXT="one could refrain from re-using foreign handles"/>
@@ -86610,13 +86619,39 @@ Date:&#160;&#160;&#160;Thu Apr 20 18:53:17 2023 +0200<br/>
 <node BACKGROUND_COLOR="#e0ceaa" COLOR="#690f14" CREATED="1703556475582" ID="ID_957677073" MODIFIED="1703556503251" TEXT="the saving idea: the Extents themselves remain at a stable address in memory">
 <icon BUILTIN="idea"/>
 </node>
-<node BACKGROUND_COLOR="#eee5c3" COLOR="#990000" CREATED="1703556506473" ID="ID_1106474590" MODIFIED="1703556535688" TEXT="so this address can be cached in the high-level iterator (which even saves execution time)">
-<icon BUILTIN="flag-yellow"/>
+<node COLOR="#338800" CREATED="1703556506473" ID="ID_1106474590" MODIFIED="1703626079178" TEXT="so this address can be cached in the high-level iterator (which even saves execution time)">
+<icon BUILTIN="button_ok"/>
 </node>
+<node CREATED="1703626079854" ID="ID_1386992576" MODIFIED="1703626106254" TEXT="Implementation">
+<node CREATED="1703626107267" ID="ID_1403395674" MODIFIED="1703626108245" TEXT="StorageAdapter now becomes a proper class"/>
+<node CREATED="1703626111274" ID="ID_143127896" MODIFIED="1703626135482" TEXT="eager pull of the new address, also after iterNext()"/>
+<node CREATED="1703626141118" ID="ID_1144941782" MODIFIED="1703626159055" TEXT="the complete Iter-Core API is now mapped through explicitly"/>
+<node CREATED="1703626197126" ID="ID_1418005285" MODIFIED="1703626207080" TEXT="code is now much longer, but more explicit and clearer"/>
+<node CREATED="1703626170839" ID="ID_1957522257" MODIFIED="1703626195015" TEXT="expandAlloc must be mapped through as well (pulling the new address)"/>
+<node COLOR="#435e98" CREATED="1703626213560" ID="ID_747018530" MODIFIED="1703626228237" TEXT="as a consequence, the trick with the re-used Iter no longer works">
+<icon BUILTIN="messagebox_warning"/>
+</node>
+</node>
+<node BACKGROUND_COLOR="#f8f1cb" COLOR="#a50125" CREATED="1703627854904" ID="ID_1263028057" MODIFIED="1703627860893" TEXT="Problems">
+<icon BUILTIN="messagebox_warning"/>
+<node CREATED="1703627863246" ID="ID_1039073068" MODIFIED="1703627872651" TEXT="Benchmark">
+<node CREATED="1703627874724" ID="ID_963318017" MODIFIED="1703627915635" TEXT="Debug mode: -25% (better)"/>
+<node CREATED="1703627889163" ID="ID_1345451591" MODIFIED="1703627902101" TEXT="Release mode: +14% (worse)"/>
+</node>
+<node CREATED="1703627918447" ID="ID_27018162" MODIFIED="1703631571590" TEXT="problem not solved: a gap remains">
+<icon BUILTIN="broken-line"/>
+<node CREATED="1703627950892" ID="ID_1504406738" MODIFIED="1703627957597" TEXT="namely when iterating..."/>
+<node CREATED="1703627958197" ID="ID_1474583049" MODIFIED="1703627985170" TEXT="then suddenly the internal, invalid index is used"/>
+</node>
+</node>
 </node>
 </node>
 </node>
+<node COLOR="#338800" CREATED="1703601256942" HGAP="0" ID="ID_607541044" MODIFIED="1703601370098" TEXT="improvement clearly visible in the test" VSHIFT="3">
+<arrowlink COLOR="#37b809" DESTINATION="ID_1410555565" ENDARROW="Default" ENDINCLINATION="-653;-22;" ID="Arrow_ID_1516277754" STARTARROW="None" STARTINCLINATION="475;31;"/>
+<icon BUILTIN="button_ok"/>
+</node>
 </node>
 </node>
 <node COLOR="#338800" CREATED="1687738213842" ID="ID_683259902" MODIFIED="1687738250423" TEXT="Payload: ActOrder">
 <icon BUILTIN="button_ok"/>
@@ -106549,10 +106584,10 @@ Date:&#160;&#160;&#160;Thu Apr 20 18:53:17 2023 +0200<br/>
 </node>
 </node>
 </node>
-<node BACKGROUND_COLOR="#eee5c3" COLOR="#990000" CREATED="1703423707106" ID="ID_451970697" MODIFIED="1703425281750" TEXT="reconsider the handling of planning, dispatch and locking">
-<arrowlink COLOR="#a60d7a" DESTINATION="ID_247540517" ENDARROW="Default" ENDINCLINATION="-1144;66;" ID="Arrow_ID_577004378" STARTARROW="None" STARTINCLINATION="-106;-376;"/>
-<linktarget COLOR="#980e34" DESTINATION="ID_451970697" ENDARROW="Default" ENDINCLINATION="155;677;" ID="Arrow_ID_1930824070" SOURCE="ID_1994748011" STARTARROW="None" STARTINCLINATION="-607;42;"/>
-<icon BUILTIN="flag-yellow"/>
+<node BACKGROUND_COLOR="#eef0c5" COLOR="#990000" CREATED="1703423707106" ID="ID_451970697" MODIFIED="1703601760318" TEXT="reconsider the handling of planning, dispatch and locking">
+<arrowlink COLOR="#a60d7a" DESTINATION="ID_247540517" ENDARROW="Default" ENDINCLINATION="-1144;66;" ID="Arrow_ID_577004378" STARTARROW="None" STARTINCLINATION="-117;-494;"/>
+<linktarget COLOR="#980e34" DESTINATION="ID_451970697" ENDARROW="Default" ENDINCLINATION="163;698;" ID="Arrow_ID_1930824070" SOURCE="ID_1994748011" STARTARROW="None" STARTINCLINATION="-607;42;"/>
+<icon BUILTIN="pencil"/>
 </node>
 <node BACKGROUND_COLOR="#eee5c3" COLOR="#990000" CREATED="1702944889122" ID="ID_816186443" MODIFIED="1703125925828" TEXT="calculation of the systematically expected execution time">
 <linktarget COLOR="#5d567f" DESTINATION="ID_816186443" ENDARROW="Default" ENDINCLINATION="-25;77;" ID="Arrow_ID_280715198" SOURCE="ID_1983964457" STARTARROW="None" STARTINCLINATION="-621;-17;"/>
@@ -107858,7 +107893,7 @@ Date:&#160;&#160;&#160;Thu Apr 20 18:53:17 2023 +0200<br/>
 <icon BUILTIN="hourglass"/>
 </node>
 </node>
-<node BACKGROUND_COLOR="#d7abab" COLOR="#435e98" CREATED="1702949773611" ID="ID_1539248543" MODIFIED="1703424094982" TEXT="again an assertion failure from the allocator">
+<node BACKGROUND_COLOR="#d7abab" COLOR="#435e98" CREATED="1702949773611" FOLDED="true" ID="ID_1539248543" MODIFIED="1703424094982" TEXT="again an assertion failure from the allocator">
 <linktarget COLOR="#fd1e5d" DESTINATION="ID_1539248543" ENDARROW="Default" ENDINCLINATION="-214;-502;" ID="Arrow_ID_1665215112" SOURCE="ID_1042591319" STARTARROW="None" STARTINCLINATION="417;46;"/>
 <linktarget COLOR="#e02174" DESTINATION="ID_1539248543" ENDARROW="Default" ENDINCLINATION="-578;34;" ID="Arrow_ID_250558884" SOURCE="ID_1069075638" STARTARROW="None" STARTINCLINATION="186;12;"/>
 <icon BUILTIN="broken-line"/>
@@ -108464,9 +108499,6 @@ Date:&#160;&#160;&#160;Thu Apr 20 18:53:17 2023 +0200<br/>
 </body>
 </html></richcontent>
 <icon BUILTIN="list"/>
-<node BACKGROUND_COLOR="#fdfdcf" COLOR="#ff0000" CREATED="1703198801785" ID="ID_1451479605" MODIFIED="1703198823852" TEXT="TODO analyse: ASSERT.log">
-<icon BUILTIN="flag-pink"/>
-</node>
 <node CREATED="1703200132880" ID="ID_710240395" MODIFIED="1703200139593" TEXT="was the planner running (re-entrant)"/>
 <node CREATED="1703200140817" ID="ID_1805000189" MODIFIED="1703200165593" TEXT="did a &#xbb;Tick&#xab; take place (&#xd83e;&#xdc32; queue-Head EMPTY)"/>
 <node CREATED="1703200175851" ID="ID_1332001301" MODIFIED="1703200188303" TEXT="did several workers enter the dispatch"/>
@@ -108486,7 +108518,7 @@ Date:&#160;&#160;&#160;Thu Apr 20 18:53:17 2023 +0200<br/>
 </html></richcontent>
 <arrowlink COLOR="#e52142" DESTINATION="ID_960168734" ENDARROW="Default" ENDINCLINATION="408;21;" ID="Arrow_ID_1198532158" STARTARROW="Default" STARTINCLINATION="-380;-216;"/>
 <icon BUILTIN="forward"/>
-<node BACKGROUND_COLOR="#f0d5c5" COLOR="#750099" CREATED="1703346114689" ID="ID_1031276036" MODIFIED="1703351355333" TEXT="how did this situation come about?">
+<node BACKGROUND_COLOR="#f0d5c5" COLOR="#750099" CREATED="1703346114689" FOLDED="true" ID="ID_1031276036" MODIFIED="1703601851865" TEXT="how did this situation come about?">
 <icon BUILTIN="help"/>
 <node CREATED="1703346210222" ID="ID_821266283" MODIFIED="1703346224774" TEXT="Chunk-1 : during pre-roll"/>
 <node CREATED="1703346225482" ID="ID_224653691" MODIFIED="1703346267705" TEXT="Chunk-2 :">
@@ -109133,8 +109165,8 @@ Date:&#160;&#160;&#160;Thu Apr 20 18:53:17 2023 +0200<br/>
 </node>
 </node>
 <node CREATED="1703423664596" ID="ID_1621633186" MODIFIED="1703423680354" TEXT="the guards are getting out of hand and could impair performance"/>
-<node BACKGROUND_COLOR="#e0ceaa" COLOR="#a81342" CREATED="1703423614885" ID="ID_1994748011" MODIFIED="1703423951697" TEXT="I am dissatisfied with the overall situation">
-<arrowlink COLOR="#980e34" DESTINATION="ID_451970697" ENDARROW="Default" ENDINCLINATION="155;677;" ID="Arrow_ID_1930824070" STARTARROW="None" STARTINCLINATION="-607;42;"/>
+<node BACKGROUND_COLOR="#e0ceaa" COLOR="#a81342" CREATED="1703423614885" ID="ID_1994748011" MODIFIED="1703601748343" TEXT="I am dissatisfied with the overall situation">
+<arrowlink COLOR="#980e34" DESTINATION="ID_451970697" ENDARROW="Default" ENDINCLINATION="163;698;" ID="Arrow_ID_1930824070" STARTARROW="None" STARTINCLINATION="-607;42;"/>
 <font ITALIC="true" NAME="SansSerif" SIZE="12"/>
 </node>
 </node>
@@ -109154,6 +109186,22 @@ Date:&#160;&#160;&#160;Thu Apr 20 18:53:17 2023 +0200<br/>
 <node COLOR="#338800" CREATED="1703353928974" ID="ID_538772671" MODIFIED="1703361847604" TEXT="also make the planning time step configurable">
 <icon BUILTIN="button_ok"/>
 </node>
+<node COLOR="#338800" CREATED="1703601935796" ID="ID_1503115625" LINK="#ID_451970697" MODIFIED="1703601960057" TEXT="rework of the entrance: avoid the problem situation altogether">
+<icon BUILTIN="button_ok"/>
+</node>
 </node>
+<node BACKGROUND_COLOR="#c9ceb6" COLOR="#338800" CREATED="1703601281179" ID="ID_1410555565" MODIFIED="1703601662531" TEXT="after the rework: test runs cleanly even under pressure" VSHIFT="11">
+<linktarget COLOR="#37b809" DESTINATION="ID_1410555565" ENDARROW="Default" ENDINCLINATION="-653;-22;" ID="Arrow_ID_1516277754" SOURCE="ID_607541044" STARTARROW="None" STARTINCLINATION="475;31;"/>
+<font NAME="SansSerif" SIZE="13"/>
+<icon BUILTIN="button_ok"/>
+<node CREATED="1703601382110" ID="ID_612095081" MODIFIED="1703601408132" TEXT="due to the re-ordering of the entrance chain, the planner can no longer enter the dispatch"/>
+<node CREATED="1703601408415" ID="ID_1444215970" MODIFIED="1703601428435" TEXT="thus the planning chunks now simply run through (even if they are slow)"/>
+<node CREATED="1703601429221" ID="ID_1288842934" MODIFIED="1703601441154" TEXT="after a short contention phase, the workers back off"/>
+<node CREATED="1703601442222" ID="ID_1064601873" MODIFIED="1703601462225" TEXT="the continuations (and everything else) fall far behind schedule"/>
+<node CREATED="1703601463248" ID="ID_1303287458" MODIFIED="1703601511505" TEXT="therefore the backlog (nominal time) is now always worked off between planning runs"/>
+<node CREATED="1703601514661" ID="ID_1656926263" MODIFIED="1703601541261" TEXT="during which the concurrency ramps up again fairly quickly"/>
+<node CREATED="1703601552039" ID="ID_485470617" MODIFIED="1703601578783" TEXT="the run can now only fail if the deadlines are overrun at some point"/>
+<node CREATED="1703601579504" ID="ID_1688998231" MODIFIED="1703601596285" TEXT="otherwise the calculation still finishes faster than single-threaded"/>
+</node>
 </node>
 </node>