This is a development snaphot pre release of Lumiera.
It features codebase maintenance, upgrade to C++14 and GTK-3
and some work towards a Proc-GUI connection (unfinished)
Update README, AUTHORS, LICENSE and similar release docs.
...this will be the third preview release
Lumiera is still in pre-alpha stage, and thus
there are no proper releases, just preview snapshots.
Again this version will be built and packaged
on several supported Linux platforms
This page gives the rationale for the way our diff framework is built.
This reasoning might *reduce* the relevance of any decisions
regarding the implementation data structure and thus lead to
far reaching consequences for the whole architecture.
Hehe...
with GenNode, we started to use these global Type-IDs to generate
unique Names for unnamed Children in a diff::Record. This means,
when running in the test-suite, the TypeID for 'short' and 'long' are
likely to be allready allocated, so our Test can not not observe the
allocateion, nor is it sensible to assume fixed numbers for these Type-IDs.
Instead, we create two local types right within the test function, to force
generation of new unique type-IDs, which we can observe
because otherwise we'd need to send a whole subtree
over the wire and then descend into it just to find an element.
This too is a ripple effect of making '==' deep
well... this was quite a piece of work
Added some documentation, but a complete documentation,
preferably to the website, would be desirable, as would
be a more complete test covering the negative corner cases
yeah, working with open fire is dangerous...
For performace reasons I've undercut the premise
to make GenNode / Record immutable. Now I'm dealing with
raw storage layout together with this quite hairy distinction
between "attribute scope" and "child scope"
In hindsight, it might have been better to implement Record
as a single list, and to maintain a shortcut pointer to jump
to the start of the attributes.
while implementing this, I've discovered a conceptual error:
we allow to accept attributes, even when we've already entered
the child scope. This means that we can not predictable get back
at the "last" (i.e. the currently touched) element, because this
might be such an attribute. So a really correct implementation
would have to memorise the "current" element, which is really
tricky, given the various ways of touching elements in our
diff language.
In the end I've decided to ignore this problem (maybe a better
solution would have been to disallow those "late" attributes?)
My reasoning is that attributes are unlikely to be full records,
rather just values, and values are never mutated. (but note
that it is definitively possible to have an record as attribute!)
...while I must admit that I'm a bit doubtful about that
language feature, but it does come in handy when manually
writing diff messages. The reason is the automatic naming
of child objects, which makes it often hard to refer to
a child after the fact, since the name can not be
reconstructed systematically.
Obviously the downside of this "anonymous pick / delete"
is that we allow to pick (accept) or even delete just
any child, which happens to sit there, without being
able to detect a synchronisation mismatch between
sender and receiver.
i.e. flat match, not deep equality.
This allows to send just an Ref (with the ID) over the
wire to refer to an complete object to be picked, moved
or deleted on the receiver side.
in the first version, I defined equality to just compare the IDs
But that didn't seem right, or what one would expect by the concept
of equality (this is a long standing discussion with persistent
object-relationally mapped data).
So I changed the semantics of equaility to be "deep".
As this means possiblty to visit a whole tree depth-first,
it seems reasonable to provide the shallow "identity-comparison" likewise.
And the most reaonable choice is to use the "matches(object)" API
for that, since, in case of objects, the matches was defined
as full equality, which now seems redundant.
Thus: from now on: obj.matches(otherObj)
means they share the same IDs
The Ref-GenNode is just a specifically constructed GenNode,
and intended to be sliced down to an ordinary GenNode
immediately after construction. It seems, GCC didn't "get that"
and instead emitted an recursive invocation of the same ctor,
which obviously leads to stack overflow.
Problem solved by explicitly coding the copy initialisation,
after the full definition of Ref is available.
the type is the only meta attribute supported by now,
thus the decision was to handle this manually, instead of
introducing a full scope for meta attributes. Unfortunately
this leads to an assymetry: while it is possible to send an
attribute named "type", which will be intercepted and used
as a new type ID, the type will not show up when iterating
or searching through attributes.
When applying a diff, the only possibility is to *insert*
a new type attribute, and we need to check and handle this
likewise manually.
It is difficult to reconcile our general architecture for the
linearised diff representation with the processing of recursive,
tree-like data structures. The natural and most clean way to
deal with trees is to use recursion, i.e. the processor stack.
But in our case, this means we'd have to peek into the next
token of the language and then forward the diff iterator
into a recursive call on the nested scope. Essentially, this
breaks the separation between receiving a token sequence and
interpretation for a concrete target data structure.
For this reason, it is preferrable to make the stack an
internal state of the concrete interpreter. The downside of
this approach is the quite confusing data storage management;
we try to make the role of the storage elements a bit more
clear through descriptive accessor functions.
implement the list handling primitives analogous to the
implementation of list-diff-applicator -- just again with
the additional twist to keep the attribute and child scopes
separated.
...so now the stage is set. We can reimplement
the handling of the list diff cases here in the context
of tree diff application. The additional twist of course
being the distinction between attribute and child scope
each language token of our "linearised diff representation"
carries a payload data element, which typically is the piece
of data to be altered (added, mutated, etc).
Basically, these elements have value semantics and are
"sent over wire", and thus it seems natural when the
language interpreter functions accept that piece of payload
by-value. But since we're now sending GenNode elements as
parameter data in our diff, which typically are of the
size of 10 data elements (640 bit on a 64bit machine),
it seems more resonable to pass these argument elements
by const& through the interpreter function. This still
means we can (and will indeed) copy the mutated data
values when applying the diff, but we're able to
relay the data more efficiently to the point where
it's consumed.
this boils down to the two alternatives
- manipulate the target data structure
- build an altered copy
since our goal is to handle large tree structures efficiently,
the decision was cast in favour of data manipulation
so basically it's time to explicate the way
our diff language will actually be written.
Similar to the list diff case, it's a linear sequence
of verb tokens, but in this case, the payload value
in each token is a GenNode. This is the very reason
why GenNode was conceived as value object with an
opaque DataCap payload
while it's still not really clear how we'll use this helper
and if we need it at all -- some weeks ago I changed its
semantics to be strictly based on the delta to a reference level.
Now this means, we could go below level zero, but this doesn't
make any sense in the context of navigating a tree. Actually,
our test case triggered this situation, which caused the
reference level to wrap around, since it is stored in an
unsigned variable.
Thus I'll add a precondition to keep the level positive,
and I'll change the test to comply.
Initially I've deliberately omitted those, to nudge towards
using time quantisation and TCode formatting for any external
representation of time values.
While this recommendation is still valid, the overloaded
string conversion turns out to be helpful for unit testing
and diagnostics in compound data structures.
See Record<GenNode>
initially the intention was to include a "bracketing construct"
into the values returned by the iterator. After considering
the various implementation and representation approaches,
it seems more appropriate just to expose a measure for the
depth-in-tree through the iterator itself, leaving any concerns
about navigation and structure reconstruction to the usage site.
As rationale we consider the full tree reconstruction as a very
specialised use case, and as such the normal "just iteration" usage
should not pay for this in terms of iterator size and implementation
complexity. Once a "level" measure is exposed, the usage site
can do precisely the same, with the help of the
HierarchyOrientationIndicator.