..as concluded from the preceding analysis.
NOTE this entails a semantical change, since this
predicate is now only meant to be indicative, not conclusive
remarks: the actual implementation of the diff application process
as bound via the TreeMutator remains yet to be written...
how can ordinary object fields be treated as "Attributes"
and thus tied into the Diff framework defined thus far.
This turns out to be really tricky, even questionable
while simple to add into the implementation, this whole feature
seems rather qestionable to me now, thus I've added a Ticket
to be revisited later.
In a nutshell, right here, when implementing the binding layer
for STL collections, it is easy to enable the framework to treat
Ref::THIS properly, but the *actual implementation* will necessarily
be offloaded onto each and every concrete binding implementation.
Thus client code would have to add support for an rather obscure
shortcut within the Diff language. The only way to avoid this
would be to change the semantics of the "match"-lambda: if this
binding would rather be a back-translation of implementation data
into GenNode::ID values, then we'd be able to implement Ref::THIS
natively. But such an approach looks like a way inferiour deisgn
to me; having delegated the meaning of a "match" to the client
seems like an asset, since it is both natural and opens a lot
of flexibility, without adding complexity.
For that reason I tend to avoid that shortcut now, in the hope
to be able to drop it entirely from the language
...basically this worked right away and was easy to put together.
However, when considering how many components, indirections and
nested lambdas are working together here, I feel a bit dizzy...
:-/
write down a first draft for a definiton section,
to describe the fundamental parts involved, when
applying a diff message onto implementation defined
data structures
After a break of tree weeks, I found it difficult to find may way
amidst all those various levels of abstraction. In addition to this
definition, we'll probably also need a high level overview of the
whole diff system operation.
...all of this implementation boils down to slightly adjusting
the code written for the test-mutation-target. Insofar it pays off now
having implemented this diagnostic and demonstration first.
Moreover I'm implementing this basic scheme of "diff application"
roughly the fourth time, thus things kindof fall into place now.
What's really hard is all those layers of abstraction in between.
Lesson learned (after being off for three weeks, due to LAC and
other obligations): I really need to document the meaning of the
closures, and I need to document the "abstract operational semantics"
of diff application, otherwise no one will be able to provide
the correct closures.
while I still keep my stance not to allow reflection and
switch-on-type, access to the internal / semantic type of
an embedded record seems a valid compromise to allow
to deal with collections of object-like children
of mixed kind.
Indirectly (and quite intentional) this also opens a loophole
to detect if a given GenNode might constitute a nested scope,
but with the for the actual nested element indeed to cary
a type symbol. Effectively this limits the use of this shortcut
to situations where the handling context does have some pre-established
knowledge about what types *might* be expected. This is precisely
the kind of constraint I intend to uphold: I do not want the
false notion of "total flexibility", as is conveyed by introspection.
since we're moving elements around to apply the diff,
dangerous situation might arise in case anyone takes a copy
of the mutator. Thus we effectively limit the possible
usage pattern and only allow to build an anonymous
TreeMutator subclass through the Builder-DSL.
The concrete "onion layers" of the TreeMutator are now limited
- to be created by the chaining operations of the Builder DSl
- to be moved into target location, retaining ownership.
I still feel somewhat queasy with this whole situation!
We need to return the product of the DSL/Builder by value,
but we also want to swap away the current contents before
starting the mutation, and we do not want a stateful lifecycle
for the mutator implementation. Which means, we need to swap
right at construction, and then we copy -- TADAAA!
Thus I'm going for the solution to disallow copying of the
mutator, yet to allow moving, and to change the builder
to move its product into place. Probably should even push
this policy up into the base class (TreeMutator) to set
everyone straight.
Looks like this didn't show up with the test dummy implementation
just because in this case the src buffer also lived within th
TestMutationTarget, which is assumed to sit where it is, so
effectively we moved around only pointers.
the whole implementation will very much be based on
my experiences with the TestMutationTarget and TestWireTap.
Insofar it was a good idea to implement this test dummy first,
as a prototype. Basically what emerges here is a standard pattern
how to implement a tree mutator:
- the TreeMutator will be a one-way-off "throwaway" object.
- its lifecylce starts with sucking away the previous contents
- consuming the diff moves contents back in place
- thus the mutator always attaches onto a target by reference
and needs the ability to manipulate the target
the collection binding can be configured with various
lambdas to supply the basic building blocks of the generated binding.
Since we allow picking up basically anything (functors,
function pointers, function objects, lamdas), and since
we speculate on inlining optimisation of lambdas, we can not
enforce a specific signature in the builder functions.
But at least we can static_assert on the effective signature
at the point where we're generating the actual binding configuration
we can't generate a static assertion so easily here.
Problem is, when forming this type, we don't know if
the user will override and provide a custom binding
in some chained call within the nested DSL.
Might still be able to come up with some clever trick,
like e.g. returing an unsuitable marker type from these
dummy default implementations and then, later on, when
actually building the collection binding, to detect
those marker types and rise a static assert at that point.
This would at least give us a better error message,
and in theory, it should always be possible to
detect this kind of misuse at compile time
...through the use of partial specialisation and SFINAE.
There are some rather specific (yet expectedly not uncommon) cases,
where we'd be able to provide a sensible default for the
- match predicate
- new element constructor
of the binding. While in all other cases, the user
has to provide an explicit implementation for these
crucial building blocks anyway.
...but does not compile, since all of the fallback functions
will be instantiated, even while in fact we're overriding them
right away with something that *can* be compiled.
this prompts me to reconsider and question the basic approach
with closures for binding, while in fact what I am doing here
is to implement an ABC.
- the test will use some really private data types,
valid only within the scope of the test function.
- invoking the builder for real got me into problems
with the aggregate initialisation I'd used.
Maybe it's the function pointers? Anyway, working
around that by definint a telescope ctor
when setting up a binding to child elements within a STL collection,
all the variable elements are preconfigured to a more or less
disabled and inactive state.
the concern is for the structure of the builder to be
incomprehensible and completely buried within the
implementation details of the various binding layers
most of the mutation primitives return bool(true)
when /any/ layer or part of the TreeMuator was able
to cope with the diff verb.
This is based on the assumption to configure the
TreeMutator in such a way that at most one facility
will actually handle and apply a given verb. That is,
we'll assume that the TreeMutator acutally wraps and
adapts *one* custom data structure, to which the
diff has to be applied.
The TestWireTap is special, insofar it indeed targets
a *second* data structure, albeit not a "real" one,
just a dest and diagnostics dummy.
the first part of the unit test (now passing)
is able to demonstrate the full set of diff operations
just by binding to a TestMutationTarget.
Now, after verifying the design of those primmitive operations,
we can now proceed with bindings to "real" data structures
when implementing the assignment and mutation primitives
it became clear that the original approach of just storing
a log or string rendered elements does not work: for
assignment, we need to locate an element by ID
this one went through unnoticed, because the situation
is not covered in unit-test. The tests written thus fare
are more like a proof-of-concept. I didn't want to spend
weeks on writing extensive coverage of all corner cases,
at least not before all aspects of the tree diff protocol
are settled. Seemingly this backfires already
now the full API for the "mutation primitives" is shaped.
Of course the actual implementation is missing, but that
should be low hanging fuit by now.
What still requires some thinking though is how to implement
the selector, so we'll actually get a onion shaped decorator
basically we'll establish a collaboration where both sides
know only the interface (contract) of the partner; a safe margin
for allocation size has to be established through metaprogramming (TODO)
what's problematic is that we leave back waste in the
internal buffer holding the source. Thus it doesn't make
sense to check if this buffer is empty. Rather the
Mutator must offer an predicate emptySrc().
This will be relevant for other implementations as well
while the original name, 'replace', conveys the intention,
this more standard name 'swap' reveals what is done
and thus opens a wider array of possible usage
now this feels like making progress again,
even when just writing stubs ;-)
Moreover, it became clear that the "typing" of typed child collections
will always be ad hoc, and thus needs to be ensured on a case by case
base. As a consequence, all mutation primitives must carry the
necessary information for the internal selector to decide if this
primitive is applicable to a given decorator layer. Because
otherwise it is not possible to uphold the concept of a single,
abstracted "source position", where in fact each typed sub-collection
of children (and thus each "onion layer" in the decorator chain)
maintains its own private position
after sleeping one night over the problem, this seems to be
the most natural solution, since the possibility of assignment
naturally arises from the fact that, for tree diff, we have
to distinguish between the *identity* of an element node and
its payload (which could be recursive). Thus, IFF the payoad
is an assignable value, why not allow to assign it. Doing so
elegnatly solves the problem with assignment of attributes
Signed-off-by: Ichthyostega <prg@ichthyostega.de>
I assumed that, since GenNode is composed of copyable and
assignable types, the standard implementation will do.
But I overlooked the run time type check on the opaque
payload type within lib::Variant. When a type mismatch
is detected, the default implementation has already
assigned and thus altered the IDs.
So we need to roll our own implementation, and to add
insult to injury, we can't use the copy-and-swap idiom either.
This is actually a STL library feature, and was added precisely
for the reason encountered here: if we want logarithmic search,
we'll have to construct a new GenNode object, just to have something
for the set to invoke the comparison operator.
C++14 introduced the convention that the Comparator of the set
may define a marker type `is_transparent` alongside with a generic
comparison operator. But, as is obvious from the source code of
our GNU Standard library implementation, our std::set has no such
overload to make use of that feature
http://en.cppreference.com/w/cpp/container/set/findhttp://stackoverflow.com/questions/20317413/what-are-transparent-comparators
The only good thing is that, just 10 minutes ago, I felt like
a complete moron because I'm writing a unit test for such a simple
storage class. ;-)
...when the Test-Nexus processes a command binding message.
In the real system of course we do not want to log every bind message.
The challenge here is the fact that command binding as such
is opaque, and the types of the data within the bind message
are opaque as well. Finally I settled on the compromise
to log them as strings, but only the DataCap part;
most value types applicable within GenNode
have a string representation to match.
the rationale is that I deliberately do not want to provide
a mechanism to iterate "over all contents in stringified form".
Because this could be seen as an invitation to process GenNode-
datastructures in an imperative way. Please recall we do not
want that. Users shall either *match* contents (using a visitor),
or they are required to know the type of the contents beforehand.
Both cases favour structural and type based programming over
dynamic run-time based inspection of contents
The actual task prompting me to add this iteration mechanism
is that I want to build a diagnostic, which allows to verify
that a binding message was sent over the bus with some
specific parameter values.
...no need to enclose empty sections when there are no
attributes or no children. Makes test code way more readable.
TestEventLog_test PASS as far as implemented
some tests rely on additional diagnostics code being linked in,
which happens, when lib/format-util.hpp is included prior to
the instantiation of lib::diff::Record rsp. lib::Variant.
The reason why i opended this can of worms was to avoid includion
of this formatting and diagnostics code into such basic headers
as lib/variant.hpp or lib/diff/gen-node.hpp
Now it turns out, that on some platforms the linker will use
a later instantiation of lib::Variant::Buff<GenNode>::operator string
in spite of a complete instantiation of this virtual function
being available already in liblumierasupport.so
But the real reason is that -- with this trickery -- we're violating
the single definition rule, so we get what we deserved.
TODO (Ticket #973): at a later point in development we have to re-assess,
the precise impact of including lib/format-util.hpp into
lib/diff/gen-node.hpp
Right now I expect GenNode to be used pervasively, so I am
reluctant to make that header too heavyweight.
because otherwise we'd need to send a whole subtree
over the wire and then descend into it just to find an element.
This too is a ripple effect of making '==' deep
well... this was quite a piece of work
Added some documentation, but a complete documentation,
preferably to the website, would be desirable, as would
be a more complete test covering the negative corner cases
yeah, working with open fire is dangerous...
For performace reasons I've undercut the premise
to make GenNode / Record immutable. Now I'm dealing with
raw storage layout together with this quite hairy distinction
between "attribute scope" and "child scope"
In hindsight, it might have been better to implement Record
as a single list, and to maintain a shortcut pointer to jump
to the start of the attributes.
while implementing this, I've discovered a conceptual error:
we allow to accept attributes, even when we've already entered
the child scope. This means that we can not predictable get back
at the "last" (i.e. the currently touched) element, because this
might be such an attribute. So a really correct implementation
would have to memorise the "current" element, which is really
tricky, given the various ways of touching elements in our
diff language.
In the end I've decided to ignore this problem (maybe a better
solution would have been to disallow those "late" attributes?)
My reasoning is that attributes are unlikely to be full records,
rather just values, and values are never mutated. (but note
that it is definitively possible to have an record as attribute!)
...while I must admit that I'm a bit doubtful about that
language feature, but it does come in handy when manually
writing diff messages. The reason is the automatic naming
of child objects, which makes it often hard to refer to
a child after the fact, since the name can not be
reconstructed systematically.
Obviously the downside of this "anonymous pick / delete"
is that we allow to pick (accept) or even delete just
any child, which happens to sit there, without being
able to detect a synchronisation mismatch between
sender and receiver.
i.e. flat match, not deep equality.
This allows to send just an Ref (with the ID) over the
wire to refer to an complete object to be picked, moved
or deleted on the receiver side.
in the first version, I defined equality to just compare the IDs
But that didn't seem right, or what one would expect by the concept
of equality (this is a long standing discussion with persistent
object-relationally mapped data).
So I changed the semantics of equaility to be "deep".
As this means possiblty to visit a whole tree depth-first,
it seems reasonable to provide the shallow "identity-comparison" likewise.
And the most reaonable choice is to use the "matches(object)" API
for that, since, in case of objects, the matches was defined
as full equality, which now seems redundant.
Thus: from now on: obj.matches(otherObj)
means they share the same IDs
The Ref-GenNode is just a specifically constructed GenNode,
and intended to be sliced down to an ordinary GenNode
immediately after construction. It seems, GCC didn't "get that"
and instead emitted an recursive invocation of the same ctor,
which obviously leads to stack overflow.
Problem solved by explicitly coding the copy initialisation,
after the full definition of Ref is available.
the type is the only meta attribute supported by now,
thus the decision was to handle this manually, instead of
introducing a full scope for meta attributes. Unfortunately
this leads to an assymetry: while it is possible to send an
attribute named "type", which will be intercepted and used
as a new type ID, the type will not show up when iterating
or searching through attributes.
When applying a diff, the only possibility is to *insert*
a new type attribute, and we need to check and handle this
likewise manually.
It is difficult to reconcile our general architecture for the
linearised diff representation with the processing of recursive,
tree-like data structures. The natural and most clean way to
deal with trees is to use recursion, i.e. the processor stack.
But in our case, this means we'd have to peek into the next
token of the language and then forward the diff iterator
into a recursive call on the nested scope. Essentially, this
breaks the separation between receiving a token sequence and
interpretation for a concrete target data structure.
For this reason, it is preferrable to make the stack an
internal state of the concrete interpreter. The downside of
this approach is the quite confusing data storage management;
we try to make the role of the storage elements a bit more
clear through descriptive accessor functions.
implement the list handling primitives analogous to the
implementation of list-diff-applicator -- just again with
the additional twist to keep the attribute and child scopes
separated.
...so now the stage is set. We can reimplement
the handling of the list diff cases here in the context
of tree diff application. The additional twist of course
being the distinction between attribute and child scope
each language token of our "linearised diff representation"
carries a payload data element, which typically is the piece
of data to be altered (added, mutated, etc).
Basically, these elements have value semantics and are
"sent over wire", and thus it seems natural when the
language interpreter functions accept that piece of payload
by-value. But since we're now sending GenNode elements as
parameter data in our diff, which typically are of the
size of 10 data elements (640 bit on a 64bit machine),
it seems more resonable to pass these argument elements
by const& through the interpreter function. This still
means we can (and will indeed) copy the mutated data
values when applying the diff, but we're able to
relay the data more efficiently to the point where
it's consumed.
this boils down to the two alternatives
- manipulate the target data structure
- build an altered copy
since our goal is to handle large tree structures efficiently,
the decision was cast in favour of data manipulation
so basically it's time to explicate the way
our diff language will actually be written.
Similar to the list diff case, it's a linear sequence
of verb tokens, but in this case, the payload value
in each token is a GenNode. This is the very reason
why GenNode was conceived as value object with an
opaque DataCap payload