...at the point where I identified the need to parse nested terms.
The goals are still the same:
* write tests to ''verify connectivity'' of nodes generated by the new `NodeBuilder`
* allow for ''extended custom attributes'' in the ProcID
* provide the ability to mark specific parametrisations
* build a Hash-Key to identify a given processing step
__Note Library__: this is the first time `lib::Several` was used to hold a ''const object''.
Some small adjustments in type detection were necessary to make that work.
Access to stored data happens through the `lib::Several` front-end and thus always includes
the const modifier; so casting any const-ness out of the way in the low-level memory management
is not a concern...
This finishes an ''exercise'' in tool design,
which was set off by the requirement to parse the spec-ID of a render node.
While generally within the confines of a helper utility for simple use cases,
the solution became quite succinct and generic, as it can handle arbitrary
LL(n) grammars, possibly with recursion.
...which is the reason for this whole excursion into parser business;
we want to accept specification terms with elements from C++ type expressions,
which especially requires accepting complete comma-separated lists within
angle brackets or parentheses, while splitting at commas only on the top level.
The idea is to model this ''not as an expression'' but rather as an ''extended quote'',
and to use inverted regular expressions for non-quote-characters as terminals.
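To illustrate the underlying problem (a minimal sketch, not framework code): splitting a C++-style type list at top-level commas requires tracking bracket nesting depth, which is beyond the reach of a single regular expression.

```cpp
#include <iostream>
#include <string>
#include <vector>

// Split a spec like "tuple<int,long>, float" at top-level commas only,
// by tracking the nesting depth of angle brackets and parentheses.
std::vector<std::string> splitTopLevel(std::string const& spec)
{
    std::vector<std::string> parts{""};
    int depth = 0;
    for (char c : spec)
    {
        if (c=='<' or c=='(') ++depth;
        if (c=='>' or c==')') --depth;
        if (c==',' and depth==0)
            parts.emplace_back();
        else
            parts.back() += c;
    }
    return parts;
}

int main()
{
    for (auto& part : splitTopLevel("tuple<int,long>, float"))
        std::cout << part << '\n';   // yields "tuple<int,long>" and " float"
}
```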
...evaluating the recursive syntax of a numerical expression!
* so this light-weight parsing support framework indeed allows
to build fully capable LL(x) parsers, when the user knows how
to handle syntax clauses and bind the result models
* furthermore, a notation is demonstrated for arranging the
binding functions so as to keep the syntax definition legible
* this involves a shortcut for homogeneous alternatives
The concept was indeed successful, albeit quite difficult to pull off in detail.
It requires a carefully crafted path of Deduction guides and overloads
to effect the switch from std::function to std::function& at the point
where a predeclared syntax clause placeholder is used recursively.
In accordance with the plan drafted yesterday, I will try to integrate
this essential capability into the framework established thus far by a trick,
requiring only minimal adjustment, but some work by the user.
Since the parse function is defined as an (unqualified) template argument,
it is possible to emplace either a `std::function`, or a reference thereto.
For this to work, the user is required to pre-define the expected result type,
and, furthermore, must later on assign a fully specified clause, which
also has a model transformation binding attached to yield this predeclared
result type.
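A minimal, self-contained sketch of this indirection (with made-up names; the actual framework interface differs): a clause is pre-declared as an empty `std::function` with explicitly given result type, sub-clauses invoke it, and the full definition is assigned afterwards, which allows the rule to recurse through the placeholder.

```cpp
#include <cctype>
#include <functional>
#include <iostream>
#include <string_view>

using Input = std::string_view;

// pre-declared clause placeholder with explicitly given result type;
// the actual definition is assigned later, enabling recursive use
std::function<long(Input&)> expression;

long number(Input& in)
{
    long val = 0;
    while (not in.empty() and std::isdigit(static_cast<unsigned char>(in.front())))
    {
        val = val*10 + (in.front() - '0');
        in.remove_prefix(1);
    }
    return val;
}

long term(Input& in)
{
    if (not in.empty() and in.front() == '(')
    {
        in.remove_prefix(1);
        long val = expression(in);   // recursion through the placeholder
        in.remove_prefix(1);         // consume ')'
        return val;
    }
    return number(in);
}

int main()
{
    expression = [](Input& in)       // supply the full definition afterwards
                 {
                     long val = term(in);
                     while (not in.empty() and in.front() == '+')
                     {
                         in.remove_prefix(1);
                         val += term(in);
                     }
                     return val;
                 };
    Input in{"2+(3+(4+5))"};
    std::cout << expression(in) << '\n';   // prints 14
}
```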
...several improvements resulting from the more elaborate test cases
- spelling out the model types taken as argument can be challenging and tedious,
thus improve the ability to pass a generic λ.
- furthermore, using structured bindings on a SeqModel can also simplify
binding code; this did not work initially, because the compiler picks the wrong strategy
and attempts to bind the structure fields directly; need to provide explicit specialisations
to support the »tuple protocol« for SeqModel, as sketched below.
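A sketch of such specialisations, using a simplified `SeqModel` stand-in (the real type differs): opting into the »tuple protocol« directs the compiler to tuple-like access for structured bindings, rather than binding the data members directly.

```cpp
#include <cstddef>
#include <string>
#include <tuple>

template<typename... TYPES>
struct SeqModel                     // simplified stand-in for the real model type
{
    std::tuple<TYPES...> elms;

    template<std::size_t I>
    decltype(auto) get()       { return std::get<I>(elms); }
    template<std::size_t I>
    decltype(auto) get() const { return std::get<I>(elms); }
};

// explicit specialisations to opt into the »tuple protocol«
namespace std {
    template<typename... TYPES>
    struct tuple_size<SeqModel<TYPES...>>
      : integral_constant<size_t, sizeof...(TYPES)> { };

    template<size_t I, typename... TYPES>
    struct tuple_element<I, SeqModel<TYPES...>>
      : tuple_element<I, tuple<TYPES...>> { };
}

int main()
{
    SeqModel<int, std::string> model{{42, "fun"}};
    auto& [num, sym] = model;       // now resolved via get<I>(), not the fields
    return (num == 42 and sym == "fun")? 0 : 1;
}
```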
...considered several further helpers (like auto-joining into a single string),
but in the end did not implement them, due to questionable relevance
The `bindMatch()` as implemented yesterday works only directly on top
of the terminal parsers, which yield a `RegExp`-Matcher. However,
it would be desirable to provide a generic shortcut to always get
some string as result model. A simple fallback is to return
the part of the input-string accepted thus far.
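A rough sketch of this fallback (hypothetical signature, assuming the not-yet-consumed rest of the input is at hand): the accepted part is simply the prefix of the original input not covered by the rest.

```cpp
#include <string>
#include <string_view>

// yield the consumed prefix as a simple string result model,
// given the full input and the not-yet-consumed rest
std::string acceptedSoFar(std::string_view fullInput, std::string_view rest)
{
    return std::string{fullInput.substr(0, fullInput.size() - rest.size())};
}
```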
Basically the implementation is already in place;
yet for better error messages we need to find out if the given functor
can handle the model present at this stage. Since a generic λ is not
a function by itself (but rather a template), we need to ''probe''
with the expected argument and see if instantiation is possible.
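A minimal sketch of such probing, with `std::string` standing in for the model type. Note that only failures visible in the immediate context (parameter types, trailing return types, constraints) are detectable this way; errors in the body of an unconstrained generic λ would remain hard errors.

```cpp
#include <string>
#include <type_traits>

using SeqModel = std::string;    // stand-in for the actual model type

auto typedBinder   = [](SeqModel const& model) { return model.size(); };
auto genericBinder = [](auto const& model) -> decltype(model.size())
                     { return model.size(); };

static_assert(    std::is_invocable_v<decltype(typedBinder),   SeqModel&>);
static_assert(    std::is_invocable_v<decltype(genericBinder), SeqModel&>);
static_assert(not std::is_invocable_v<decltype(typedBinder),   int&>);
static_assert(not std::is_invocable_v<decltype(genericBinder), int&>);   // int has no size()
```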
⚠ NOTE: still a strange bug related to using the same Syntax several times
Allowing free recursion in grammars is the key enabling feature,
which allows accepting arbitrarily complex structures (like numeric expressions).
It is however also the element which makes the task of parsing a challenging endeavour;
after weighing the arguments, I decided ''not to place the focus on advanced usage,''
yet to open a pathway towards representation of such grammars.
Essentially, I consider it acceptable to require some additional work by the user,
if arbitrary recursive grammars are desired; because this design relies on explicitly
given parse functions, we need to introduce some kind of indirection interface,
to allow ''declaring'' a recursive rule first and later to ''supply the definition,''
which obviously then will involve other rules (or itself) recursively.
This leads to a very ''nifty approach'' towards recursion: we require the user
to provide an ''explicit model type'' beforehand, which implies that this is a
simple type that can be spelled out (no λ) — and so the user is also
''forced to augment the actual rule with a model-binding,'' thereby reducing
the structured return types from the parse into something simple and uniform.
The user ''has to do the hard work,'' but can ''exploit additional knowledge''
related to the specific use case.
All this framework needs to do then is to supply a `std::function`, using the
explicit return type given; everything else will still work as implemented,
since a `std::function` can always stand in for any arbitrary λ.
This is the very key feature that requires a real parser and can not be handled by regular expressions.
After all the groundwork, it is surprisingly easy to provide now;
only coding up all those DSL-variants is tedious. Notably we also
support accepting an ''optional'' bracket, and we support arbitrary
expressions for the ''opening'' and ''closing construct.''
It seemed that using postfix-decorating operators would be a good fit
for the DSL. Exploring this idea further showed however, that such a scheme
is indeed a good fit from the implementation side, but leads to rather confusing
and hard to grasp DSL statements for many non-trivial syntax definitions.
The reason is: such a postfix-decorator will by default work on ''everything defined''
up to that point; this is too much in many cases.
The other alternative would be a function-style definition, which has the benefit
to take the sub-clause directly as argument (so the scope is always explicit).
The downside is that argument arrangement is a bit more tricky for the repetition
combinator (there can be mis-matches, since we take the »SPEC« as free-template argument)
Moreover, with function-style, having more top-level entrance points would
be helpful. Overall, no fundamental roadblock, just more technicalities in the setup
of the DSL functions.
With that re-arranged structure, an optional combinator could be easily integrated,
and a solution was provided to pick up the parser function from a sub-expression
defined as Syntax object.
Meanwhile, some kind of style scheme has emerged for the DSL:
We're working much with postfix-decorating operators, which
augment or extend the ''whole syntax clauses defined thus far''
In accordance with this scheme, I decided also to treat repeated expressions
as a postfix operator (contrary to what was initially planned). This means, the actual
body to be repeated is ''the syntax clause defined thus far'', and the
repeat()-operator only details the number of repetitions and an optional delimiter.
After all the preparation, this now pans out quite well:
* use a simple 3-way branch structure
* the model type was already pre-selected by the `_Join` Model selector
* can just pass the result-model elements to a constructor/builder
* incremental extension can be directly mapped to the predecessor model
This is a rather obnoxious limitation of C++ variadics:
it is impossible to match properly against a mixed sequence,
because the argument pack can only appear in trailing position, which precludes matching
the last or even the penultimate element (which we need here).
After some tinkering, I found a way to recast this as ''rebinding to a remoulded sequence'',
and could package a multitude of related tools into a single helper-template,
which works without any further library dependencies.
🠲 extract into a separate header (`variadic-rebind.hpp`) for ease of use.
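The gist of this technique, sketched with made-up names (the actual `variadic-rebind.hpp` differs): peel the pack recursively to reach the trailing element, and rebind the type sequence found inside any template instance onto another template.

```cpp
#include <tuple>
#include <type_traits>
#include <variant>

// peel types off the front until only the last one remains
template<typename T, typename... TS>
struct LastType : LastType<TS...> { };

template<typename T>
struct LastType<T> { using Type = T; };

// match the types inside any template instance SEQ<TYPES...>
// and rebind them onto another template X
template<class SEQ, template<typename...> class X>
struct Rebind;

template<template<typename...> class SEQ, typename... TYPES,
         template<typename...> class X>
struct Rebind<SEQ<TYPES...>, X>
{
    using Type = X<TYPES...>;
    using Last = typename LastType<TYPES...>::Type;
};

static_assert(std::is_same_v<Rebind<std::tuple<int,double,char>, std::variant>::Type
                            ,std::variant<int,double,char>>);
static_assert(std::is_same_v<Rebind<std::tuple<int,double,char>, std::variant>::Last
                            ,char>);
```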
So this turned out to be much more challenging than expected,
due to the fact that, with this design, typing information is
only available at compile-time. The key trick was to use a
''double-dispatch'' based on a generic lambda. In the end,
this could be rounded out to be self-contained library helper,
which is even fully copyable and assignable and properly
invokes all payload constructors and destructors.
The flip side of this effort is that such a design is very flexible
and direct regarding the parser model-bindings, and it should
be fairly well optimisable, since the structure is entirely
static and without any virtual dispatch.
Proper handling of payload lifecycle was verified using
a tracking test object with checksum.
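Condensed to its core, the idea looks roughly like this (a simplified sketch; the real component additionally supports copy, assignment and swap): an aligned opaque buffer plus a branch index, accessed by probing each branch at compile time and dispatching the properly cast payload into a generic λ.

```cpp
#include <algorithm>
#include <cstddef>
#include <iostream>
#include <new>
#include <string>
#include <tuple>
#include <type_traits>
#include <utility>

template<typename... TYPES>
class SumSketch
{
    alignas(TYPES...) std::byte buf_[std::max({sizeof(TYPES)...})];
    std::size_t idx_;

    template<typename X, typename T, typename... TS>
    static constexpr std::size_t indexOf()
    {
        if constexpr (std::is_same_v<X,T>) return 0;
        else return 1 + indexOf<X,TS...>();
    }

    // probe the branch index and hand the properly cast payload
    // to the given generic λ: the "double dispatch"
    template<std::size_t I, class FUN>
    void dispatch(FUN&& fun)
    {
        if constexpr (I < sizeof...(TYPES))
        {
            using T = std::tuple_element_t<I, std::tuple<TYPES...>>;
            if (idx_ == I)
                fun(*std::launder(reinterpret_cast<T*>(buf_)));
            else
                dispatch<I+1>(std::forward<FUN>(fun));
        }
    }

public:
    template<typename VAL>
    explicit SumSketch(VAL val) : idx_{indexOf<VAL,TYPES...>()}
    {
        new(buf_) VAL{std::move(val)};
    }
    ~SumSketch()   // properly invoke the payload destructor
    {
        visit([](auto& p){ using P = std::decay_t<decltype(p)>; p.~P(); });
    }
    template<class FUN>
    void visit(FUN&& fun) { dispatch<0>(std::forward<FUN>(fun)); }
};

int main()
{
    SumSketch<int, std::string> sum{std::string{"branch-2"}};
    sum.visit([](auto& val){ std::cout << val << '\n'; });   // prints "branch-2"
}
```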
* the implementation of this ''Sum Type'' got quite technical and complicated;
thus it is better extracted as a separate library component
* use this as base for the `AltModel`
* make a usage sketch, invoking only the model interactions required
After exploring the »nested decorator-chain« implementation variant,
I decided to stick to the solution with the λ-visitor, while attempting
to level and smooth-out the design.
* allow to engage ''any'' «slot» at construction
* reverse the order of type parameters to be ascending (as in `std::tuple`)
* make it fully copyable, movable, assignable and provide a `swap()`,
relying on a variant of the "copy-and-swap"-idiom (illustrated below)
* add a ''cross-constructor'' for an extended branch set
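The general shape of that idiom variant, as a generic illustration (not the actual component): the assignment operator takes its argument by value, so both copy- and move-assignment funnel through a single non-throwing `swap()`.

```cpp
#include <string>
#include <utility>

class Holder
{
    std::string payload_;
public:
    explicit Holder(std::string p) : payload_{std::move(p)} { }
    Holder(Holder const&) = default;
    Holder(Holder&&)      = default;

    friend void swap(Holder& a, Holder& b) noexcept
    {
        using std::swap;
        swap(a.payload_, b.payload_);
    }

    Holder& operator=(Holder other)   // by value: copy or move happens at the call site
    {
        swap(*this, other);           // commit the new state; old state dies with `other`
        return *this;
    }
};
```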
To represent the result-model for syntax alternatives,
we need a C++ representation for a ''sum type,'' i.e.
a type that can be one from a fixed set of alternatives.
Obviously the implementation will rely on some kind of Union,
or otherwise employ an opaque buffer and perform a forced cast.
Moreover, to be actually usable, a branch-selector-ID must be
captured and stored alongside, so that code processing the results
can detect which branch of the syntax was chosen.
There seem to be several possible avenues to build and structure
an actual class template to provide this implementation model
* a nested decorator-chain
* using a recursive selector-function with a generic-λ
''all these look quite unattractive, unfortunately....''
Seems like a pragmatic choice, which simplifies most syntax definitions significantly.
In exceptional cases, it is still possible to enforce a boundary explicitly with `\b` or `\B`
* need to pass the parse end-point in the Eval-Result to allow composed models
* this also prepares for support of generic model-binding-λ
With the help of the model-joining case definitions it is then possible to handle sequence extension.
Deliberately, I do not engage in fine-grained signature checking, since this would lead to very technical code; moreover, this is an implementation feature and we control all invocations (with signatures guaranteed to be correct).
Unfortunately, there are some common syntactic structures, which can not easily be dissected by regular expressions alone, since they entail nested subexpressions. While it is possible to get beyond those fundamental limitations with some trickery, doing so remains precisely that, ''trickery.''
After fighting some inner conflicts, since ''I do know how to write a parser'' —
in the end I have brought myself to just do it.
And indeed, as you might expect, I have looked into existing library solutions,
and I would not like to have any one of them as part of the project.
* I do not want a ''parser engine'' or ''parser generator''
* I want the directness of recursive-descent, but combined with Regular Expressions as terminal
* I want to see the structure of the used grammar at the definition site of the custom parser function
* I want deep integration of ''model bindings'' into the parse process, i.e. binding-λ
* I do not want to write model-dissecting or pattern-matching code after the parse
* I do not want to expose ''Monads'' as an interface, since they tend to spread unhealthy structure to surrounding code
* I do not want to leak technicalities of the parse mechanics into the using code
* I do not want to impose hard to remember specific conventions onto the user
Thus I've set the following aims:
* The usage should require only a single header include (ideally header-only)
* The entrance point should be a small number of DSL-starter functions
* The parser shall be implemented by recursive-descent, using the parser-combinator technique
* But I want that wrapped into a DSL, to be able to control what is (not) provided or exposed.
* I want a stateful, applicative logic, since parsing, by its very nature, is stateful!
* I want complete compile-time typing, visible to the optimiser, without a virtual »Parser« interface
And last but not least, ''I do not want to create a ticket, since I do not know if those goals can be achieved...''
Building a correct processing-identification is a complex and challenging task; only some aspects can be targeted and implemented right now, as part of the »Playback Vertical Slice«
* components of the ProcID
* parsing the argument-spec
* dispatch of detail information function to retrieve source ports
The choice to rely on strictly typed functor bindings for the Node operation
bears the danger to produce ''template bloat'' — it would be dangerous to add
further functions to the Port-API naïvely; especially simple information functions
will likely not depend on the full type information.
A remedy to explore would be to exploit properties marked into the Port's `ProcID`
as key for a dispatcher hashtable; assuming that the `NodeBuilder` will be responsible
for registering the corresponding implementation functions, such a solution could even
be somewhat type-safe, as long as the semantics of the ProcID are maintained correctly.
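A rough sketch of what such a dispatcher might look like (all names hypothetical): implementation functions get registered under a key derived from the ProcID properties, so an information function can be retrieved later without the full template type context.

```cpp
#include <functional>
#include <stdexcept>
#include <string>
#include <unordered_map>

using PortInfo = std::string;   // placeholder for the actual result data

class InfoDispatcher
{
    std::unordered_map<std::string, std::function<PortInfo()>> table_;
public:
    // the NodeBuilder would register the implementation while wiring the port
    void enrol(std::string procKey, std::function<PortInfo()> fun)
    {
        table_.emplace(std::move(procKey), std::move(fun));
    }
    PortInfo retrieve(std::string const& procKey) const
    {
        auto it = table_.find(procKey);
        if (it == table_.end())
            throw std::invalid_argument("no info function for: " + procKey);
        return it->second();
    }
};
```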
* this changeset builds a complex processing network for the first time
* furthermore, some ideas towards verification are spelled out
''verification not implemented''
...which aims at building up increasingly more complex Node Graphs,
to validate that all clauses are defined and connected properly.
Reconsidering the testing plan: initially, this test was aimed
primarily at driving me through the construction of the Node builder and
connection scheme. Surprisingly enough, already the first test case basically
forced the complete construction, by setting me on tangential routes,
notably the **parameter handling**.
Now I'm returning to this test plan with an already finished construction,
and thus it can be straightened just to give enough coverage to validate
the correctness of this construction...
The namespace `steam::engine::test::ont` will hold some typical definitions
for the fake „media processing library“ — to be used for validating aspects of mapping and binding.
This picks up the efforts towards a »Test Ontology« from end November:
d80966c1f
The `TestRandOntology` is intended as a playground to gradually find out
how to maintain bindings to processing functionality provided by a specific Library,
and thus related to a ''Domain Ontology''.
Remark: generating symbolic specs might seem like a mere test exercise, yet is in fact
quite crucial, since the node-identity is based on such a spec, which must be ''semantically correct,''
otherwise caching and especially cache invalidation will be broken.
Yesss .... in Lumiera naming and cache invalidation are linked directly ;-)
This is a high-level integration test to sum up this development effort
* an advanced refactoring was carried out to introduce a
flexible and fully-typed binding for the ''processing-functor''
* this entailed a complete rework of the `FeedManifold` to integrate
inline storage for a ''parameter tuple'' and input / output ''buffer tuples''
* optional ''parameter functors'' were included into the design at a deep level,
closely related to the binding of the processing-functor
* the chosen design is thus a compromise between ''everything nodes''
and a ''dedicated parameter-handling'' at invocation level
As a proof-of-concept, a scheme to handle extended parameters was devised,
using a special »Param Agent Node« and extension storage blocks in stack memory.
While not immediately necessary, this design exercise proves the overall design
is flexible enough to accommodate future extended needs.
Actually this is now quite easy to implement, as a shortcut on top of generic functionality;
just in this case the param-functor takes a Time value as argument.
So it's more a matter of documentation to provide a dedicated hook for this common case.
Incidentally, this is also the first test case ever to involve linked nodes,
so it revealed several bugs in the related code, which was not yet covered by tests.
This is a ''move-builder'' and thus represents a tricky and sometimes dangerous setup,
while allowing to switch the type context in the middle of the build process.
It is essential to return an RValue-Reference from all builder calls which
stay on the same builder context.
After fixing those minor (and potentially dangerous) aspects regarding move-references,
the code built yesterday worked as expected!
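The discipline in question, condensed into a generic sketch (hypothetical names): builder calls that stay within the same context are rvalue-ref-qualified and return `std::move(*this)`, so the chain keeps operating on the same object without copying.

```cpp
#include <string>
#include <utility>

class BuilderSketch
{
    std::string spec_;
public:
    // ref-qualified for rvalues: callable only on a builder "in flight"
    BuilderSketch&& withSpec(std::string spec) &&
    {
        spec_ = std::move(spec);
        return std::move(*this);   // stay within the same builder context
    }
    std::string build() &&
    {
        return std::move(spec_);
    }
};

// usage: the whole chain operates on one temporary object
std::string result = BuilderSketch{}.withSpec("port-1").build();
```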
This is some quite technical and redundant code, which largely maps
the configured elements from the Builder-DSL level down into the delegate
builder functors. For the ''Param Agent Node,'' most of the structure
is already embedded deep into the `ParamWeavingPattern`, by virtue of a
tuple of parameter-functors, which are supplied to the builder-API
as a `ParamBuildSpec` (which in fact is in itself a builder and will be
used on a higher level to fill in suitable parameter-functors)
This changeset is assumed to complete the definition of a builder and
weaving pattern for a ''Param Agent Scheme'' — yet only the tests to be
elaborated next will show the extent to which this is true....
unfortunately the "mechanics" of this builder setup are quite convoluted,
due to constraints with the memory manager, which basically force us to
collect a set of ''builder-λ'', together with summing up all the required storage,
so that the actual allocation of all Ports can be done into one contiguous block
of memory, to be connected to the actual Node.
For the regular `PortBuilder`, we use a helper subclass, the `WeavingBuilder`,
to construct this builder-λ. But here, for the setup of a ''Param Agent Node,''
the actual wiring is much simpler and it is not justified to use a delegate builder;
rather we perform the complete setup directly in the terminal sub-builder operation,
prior to returning up to the NodeBuilder, which controls the overall build.
Still having some doubts if using a ''weaving-pattern'' is the right approach here,
but if we do, then the steps would be mapped as drafted here. This includes
passing additional parameters, notably the `TurnoutSystem&` to every step.
As it turns out, we need to embed the Param-Functor tuple,
but only for a single use from a »builder« component.
On the other hand, the nested »Slot« classes are deemed dangerous,
since they just seem to invite being bound into some functor, which
would create a dangling reference once the `ParamBuildSpec` is gone.
Thus it's better to do away with this reference and make those accessors
basically static, because this way they ''can'' be embedded into param-access
functors (and I'd expect precisely that to happen in real use)
...intended to be used as a Turnout for a ''Param Agent Node....''
This leads to several problems, since the ''chain-data-block'' was defined to be non-copyable,
which as such is a good idea, because it will be accessed by a force-cast through the TurnoutSystem.
So the question is how to group and arrange the various steps into the general scheme of a Weaving-Pattern...
In `NodeFeed_test`...
Demonstrate the base mechanism of creating a ''Param Spec'' with a
functor-definition for each parameter. This can then later be used to
invoke those functors and materialise the results into a data tuple,
and this data tuple can be linked into the TurnoutSystem, so that
the parameter values can be accessed type-safe with getter-functors.
Basically this relies on the trick of invoking `std::apply` with a generic variadic λ
on the tuple of functors; within the λ we can use variadic expansion
to pass the results directly into the builder and so construct the param-tuple in-place.
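The core of that trick in isolation (the helper name is made up): `std::apply` unpacks the tuple of parameter-functors into a generic variadic λ, whose pack expansion invokes each functor and materialises all results into a data tuple in one go.

```cpp
#include <tuple>

// invoke every functor in the given tuple and collect the results
template<typename... FUNS>
auto materialiseParams(std::tuple<FUNS...> const& paramFunctors)
{
    return std::apply([](auto const&... fun){ return std::make_tuple(fun()...); }
                     ,paramFunctors);
}

// usage: two parameter-functors yield a std::tuple<int,double>
auto params = materialiseParams(std::tuple{[]{ return 42; }
                                          ,[]{ return 1.618; }});
```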
Oh well.
2024 is almost gone by now.
Had to endure yet another performance of Beethoven's 9th symphony...
This is rather the easy part, building upon the foundation developed with `HeteroData`:
* the `TurnoutSystem` will now accept a `HeteroData`-Accessor
* the `ParamBuildSpec` can thus construct an Accessor-Type for each »slot«
...the trickier part will be how to actually build, populate and attach
such an extension data slot, placed into the local stack frame...
...which in turn would then allow
* to refer to extended parameters within scope
* to build a Param(Agent)Node, which builds a parameter tuple
by invoking the given parameter-functors
Can now demonstrate in the test
* define several »slots«, each with either value or functor
* apply these functors to a `TurnoutSystem`
So this is a design sketch how a `ParamBuildSpec` descriptor could be created,
which in turn would provide the foundation to implement a ''Parameter Weaving Pattern...''
__Note__: since this is an extension for advanced usage, yet relies on a storage layout
defined to allow for extensions like this use case here, the anchor type is now defined
to reside in the `TurnoutSystem` in the form of a ''standard parameter block''.
Those standard invocation parameters are fixed and thus can be hard coded.
Based on ''theoretical reasoning,'' I draw the conclusion that some advanced usages
of processing parameters can not be satisfied by the simple direct integration of a
parameter-functor...
Thus the concept for an extension point, which relies on a dedicated ''Param (Agent) Node''
and a specifically tailored ''Param Weaving Pattern'' to evaluate several parameter functors
and place the results into an extension data block in the invocation stack frame.
* ...by defining a parameter-functor to »drop off« a given value
* ...also add a static sanity check to reject unsuitable parameter-functors \\
(e.g. for a processing-functor with different or even no parameters)
This required some ''type massaging'' to construct the proper follow-up builder type;
other than that, all components work together as expected.
This can be demonstrated both in a direct setup and using the builder.
While the handling of invocation parameters is now integrated in the node processing,
there is still a gap to close in the Node Builder, which is tricky due to the way
the parameter-functor is now integrated deeply into the setup of the `FeedManifold`;
so the `PortBuilder` is now tasked with implanting a `FeedPrototype`, which must be
adapted to a ''specific parameter-functor'' that is only supplied optionally,
as a further build step.
At first this seemed to present a pattern very similar to a ''State Monad'' — and thus
I investigated if encapsulating the build of the prototype into such a State Monad would
simplify the structure of the builder. Yet once again, Monads turned out as ''Anti Pattern'':
we'd have to add an extra component which is superficially generic,
but without any tangible relation to patterns of the real world; it would be
rather technical (using lots of composed λ primitives, which would be condensed
into a single builder function by the compiler / optimiser). But worse still,
this highly complicated setup does not actually solve the problem with N x M
typed contexts, implying that it ''is not actually an abstraction,'' but just
pretends to be generic.
The benefit of this lengthy design exercise is to understand better the situation
in the builder, which amounts to building up a mixed typed context with several
degrees of freedom. It is better to accept this reality and keep it in plain sight,
i.e. let the builder be explicitly typed from end to end and do not try
to package parts of this selection process behind a virtualisation.