Lumiera: The Inner Core
=======================


[abstract]
******************************************************************************
The Lumiera Developers have a distinct vision about the core of a modern NLE. 
This document outlines some of the basic technical concepts and highlights
the fundamental design decisions. New Developers may find this Document useful
to get an idea about how the different components work together.
******************************************************************************


Overview
--------

Lumiera constitutes of a broad range of subsystems. These are roughly grouped
into three layers plus some extra components. This structure is mostly kept in
the source directory structures.

This three Layers are:

 The User Interface::
     User interfaces are implemented as plug-ins, most commonly one will see
     the default GUI. But also scripting interfaces or specialized GUI's are
     possible.

 The Processing layer::
     Keeps the Session, generates the rendering graphs for sequences.

 The IO and System interface Backend::
     Manages thread-queues, schedules jobs, does the memory management for the
     heavy multimedia data.

The extra components are:

 Lumiera::
     The main program itself, basically acts only as loader to pull the rest up.

 Common::
     Vital components and common services which must be available for pulling up
     any part of the system.

 Library::
     A lot of (largely) stateless helper functionality used throughout the code.


Coding Style
~~~~~~~~~~~~

The Lumiera team agreed on using GNU coding style with some exceptions (no
tabs, line wrap must not always be enforced). Otherwise we are a bit pedantic
to be consistent to make the codebase uniform. Function naming conventions
and other details are described in several RFCs.

Documentation
~~~~~~~~~~~~~

Every public function should be documented with doxygen comments. The overall
design is outlined in this text and documented in the various detail pages and
accompanying RFCs. Bootstrapping the Lumiera design used several other places,
in several TiddlyWikis, an UML model and cehteh's private wiki, most of this
information is asciidoced meanwhile and in progress to be integrated in the
central documentation hierarchy.


Test Driven Development
~~~~~~~~~~~~~~~~~~~~~~~

We strive to use _Test-Driven-Development:_ tests are to be written first,
defining the specification of the entity being tested. Then things get
implemented until they pass their tests. While increasing the initial
development effort significantly, this approach is known to lead to
clearly defined components and overall increases code quality.
In practice, this approach might not be suitable at times,
nevertheless we try to sick to it as far as possible
and maintain fairly complete test coverage.


Lumiera Application
-------------------
Generally speaking, the Application is comprised of several self contained 
_subsystems_, which may depend on each other. Dependencies between components
are to be abstracted through interfaces. Based on the current configuration,
the application framework brings up the necessary subsystems and finally
conducts a clean shutdown. Beyond that, the application framework remains
passive, yet provides vital services commonly used.

While the core components are linked into a coherent application and may
utilise each other's services directly based on C/C++ language facilities,
several important parts of the applications are loaded as plug-ins, starting
with the GUI.


User Interfaces
---------------

The purpose of the user interface(s) is to act on the _high-level data model_
contained within the Session, which belongs to the _processing layer_ below.
User interfaces are implemented as plug-ins and are pulled up on demand,
they won't contain any relevant persistent state beyond presentation.


Processing Layer
----------------

High Level Model
~~~~~~~~~~~~~~~~
_tbw_

Assets
^^^^^^
_tbw_

Placement
^^^^^^^^^
_tbw_

Scoping
^^^^^^^
_tbw_

MObject References
^^^^^^^^^^^^^^^^^^
_tbw_

QueryFocus
^^^^^^^^^^
_tbw_

Output Management
~~~~~~~~~~~~~~~~~
_tbw_

Stream Type System
~~~~~~~~~~~~~~~~~~
_tbw_

Command Frontend
~~~~~~~~~~~~~~~~
_tbw_

Defaults Manager
~~~~~~~~~~~~~~~~
_tbw_

Rules System
~~~~~~~~~~~~
_tbw_

Builder
~~~~~~~
_tbw_

Low Level Model
~~~~~~~~~~~~~~~
_tbw_

Play/Render processes
~~~~~~~~~~~~~~~~~~~~~
_tbw_


Backend
-------

I/O Subsystem
~~~~~~~~~~~~~

.OS Filehandles
as mru cache, round robin reused

.Files
Lumiera has its own abstract file handles which store the state and name of a
file. The associated filehandle doesn't need to be kept open and will be
reopened on demand. Hardlinked files are recognized and opened only once.

.Memory Mapping
All file access is done by memory mapping to reduce data copies between
userland and kernel. Moreover the kernel becomes responsible to schedule
paging (which will be augmented by lumiera) to make the best use of available
resources. Memory is mapped in biggier possibly overlapping windows of
resonable sized chunks. Requests asking for a contingous set of data from the
file in  memory.


.Indexing

.Frameprovider


Threadpools
~~~~~~~~~~~

Manages serveral classes of threads in pools. The threadpool is reasonable
dumb. Higher level management will be done by the Schedulers and Jobs.


Schedulers
~~~~~~~~~~

Scheduling Queues for different purposes:

.Deadline
Higher priority jobs ordered by a deadline time plus some (negative) hystersis. Jobs are
started when they approach their deadline. Jobs who miss their deadline are
never scheduled here.

.Background
Background jobs scheduled by priority and timeout.


.Realtime
Timer driven queue which starts jobs at defined absolute times. Timer might be
also an external synchronization entity.


Job
^^^
a job can be part of multiple queues, the queue which picks them first runs
them. When other queues hit a running job they either just drop it or promote
its priority (to be decided).


Resource Management
~~~~~~~~~~~~~~~~~~~

Running Lumiera requires a lot different resources, such as CPU-Time, Threads,
IO Bandwidth, Memory, Address space and so on. Many of this resources are rather
hard limited and the system will return errors when this limits are hit, but
often one does not even reach this hard limits because performance will
degrade way before coming into the realm of this limits. The goal for Lumiera
is to find a sweet spot for operating with optimal performance. Thus we have
some facilities to monitor and adjust resource usage depending and adapting to
the system and current circumstances.


Profiler
^^^^^^^^

Collects statistic about resource load, helps to decide if job constraints can
be fulfilled.

Things to watch:
 * cpu utilization
 * memory usage (swapping, paging)
 * I/O load, latency


Budget Manager
^^^^^^^^^^^^^^

resources need to be distributed among a lot subsystems and jobs. Each of this
component can become part of a budgeting system which accounts resource usage
and helps to distribute it. Resource usage is only voluntary managed.


Resource collector
^^^^^^^^^^^^^^^^^^

Handles system errors related to resource shortage. There are several classes
of resources defined. Other subsystems can hook in functions to free
resources. Has multiple policies about how aggressive resources should be freed.

If no one cares it does a final abort(). So all systems should hook better
recovery here in!


Common Services
---------------

Subsystem runner
~~~~~~~~~~~~~~~~
_tbw_

Lifecycle events
~~~~~~~~~~~~~~~~
_tbw_

Interface system
~~~~~~~~~~~~~~~~
_tbw_

Plugin loader
~~~~~~~~~~~~~
_tbw_

Rules system
~~~~~~~~~~~~
_tbw_

Serialiser
~~~~~~~~~~
_tbw_

Config loader
~~~~~~~~~~~~~
_tbw_

Lua Scripting
~~~~~~~~~~~~~
_tbw_


Library
-------

The Lumiera support library contains lots of helper functionality
factored out from the internals and re-used. It is extended as we go.


Locking
~~~~~~~
Based on object monitors. Performance critical code
uses mutexes, condition vars and rwlocks direcly.
Intentionally no semaphores.

- C++ locks are managed by scoped automatic variables
- C code uses macros to wrap critical sections


Time
~~~~
Time values are represented by an opaque date type `lumiera::Time`
with overloaded operators. The implementation is based on `gavl_time_t`,
an integral (µsec) time tick value. Any Time handling and conversions
is centralised in library routines.

We distinguish between time values and a _quantisation_ into a frame
or sample grid. In any case, quantisation has to be done once, explicitly
and as late as possible. See the link:{rfc}/TimeHandling.html[Time handling RfC].


Errors
~~~~~~
 
 * As a Rule, Exceptions + RAII are to be preferred over error
   codes and manual cleanup. At external interfaces we rely on
   error states though.
 * Exceptions can happen everywhere and any time
 * Exceptions and Errors shall only be dealt with at locations where it's actually
   possible to _handle_ them (the so called ``fail not repair'' rule)
 * API functions are categorised by _error safety guarantee_
   - *EX_FREE* functions won't raise an exception or set an error state,
     unless the runtime system is corrupted.
   - *EX_STRONG* functions are known to have no tangible effect in
     case of raising an exception / error state (transactional behaviour).
   - *EX_SANE* functions might leave a partial change, but care to leave
     any involved objects in a sane state.
 * Raising an Exception creates an _error state_ -- error states can also
   be set directly per thread.
 * Error states are identified by pointers to static strings.
 * Error states are thread local and sticky (a new state can't be set
   unless a pending state got cleared).


Exception hierarchy
^^^^^^^^^^^^^^^^^^^
Typically, when an error situation is detected, the error will be categorised
by throwing the appropriate exception. Exceptions provide a mechanism to attach
the root cause. The classification happens according to the _relevance for the
application_ as a whole -- _not_ according to the cause.

 `lumiera::Error`:: root of error hierarchy. extends `std::exception`
 `error::Logic`::   contradiction to internal logic assumptions detected
 `error::Fatal`::   (⤷ `Logic`) unable to cope with, internal logic floundered
 `error::Config`::  execution aborted due to misconfiguration
 `error::State`::   unforeseen internal state (usually causes component restart)
 `error::Flag`::    (⤷ `State`) non-cleared error state from C code
 `error::Invalid`:: invalid input or parameters encountered
 `error::External`:: failure in external service the application relies on
 `error::Assertion`:: assertion failure

Please be sure to clear the error state whenever catching and handling an exception.


Error states
^^^^^^^^^^^^
Errors states get declared in headers with the `LUMIERA_ERROR_DECLARE(err)` macro.
A matching definition needs to reside in some translation unit, using the
`LUMIERA_ERROR_DEFINE(err, msg)` macro. There is no central registry, any component
can introduce its own errorcodes but must ensure that the error identifier is
unique.

.Error handling in C
There are two helper macro forms for setting up error conditions, one is
`LUMIERA_ERROR_SET..(flag, err, extra)` and the other one is
`LUMIERA_ERROR_GOTO..(flag, err, extra)`. Each for different logging levels.
The `SET` form just logs an error and sets it, the `GOTO` form also jumps to
an error handler. Both take a NoBug flag used for logging and a optional
`extra` c-string.


[source,C]
--------------------------------------------------------------------------------
const char*
mayfail()
{
  const char* ret = foo();
  if (!ret)
    LUMIERA_ERROR_GOTO (flag, FOO_FAILED, 0);

  if (!bar(ret))
    LUMIERA_ERROR_GOTO (flag, BAR_FAILED, 1,
                        lumiera_tmpbuf_snprintf (256, "foo was %s", ret));

  return "everything ok";

  /* cleanup in reverse setup order */
 BAR_FAILED1:
  lumiera_free (ret);
 FOO_FAILED0:
  return NULL;
}
--------------------------------------------------------------------------------


Singletons
~~~~~~~~~~
_tbw_

Extensible Factory
~~~~~~~~~~~~~~~~~~
_tbw_

Visiting Tool
~~~~~~~~~~~~~
_tbw_

Iterators
~~~~~~~~~
Lumiera Forward Iterator
^^^^^^^^^^^^^^^^^^^^^^^^
_tbw_

Iterator Adapters
^^^^^^^^^^^^^^^^^
_tbw_

Itertools
^^^^^^^^^
_tbw_


Wrappers and Opaque Holders
~~~~~~~~~~~~~~~~~~~~~~~~~~~
- smart handle
- unified value/ptr/reference holder
- ownership managing collection
- opaque holder to ``piggypack'' an object inline,
  without the need for heap allocated storage
- vector of references

Unique Identifiers
~~~~~~~~~~~~~~~~~~

LUID
^^^^
Generating 128 bit non cryptographic strong unique identifiers.

 - having an alternative representation to store a pointer
 - may be extended for a strong (slow) and a fast (weak) variant in future

EntryID
^^^^^^^
Combines an user readable ID, a (compile time) type tag and a hash-ID.
The latter is based on the symbolic ID and the type tag, which is discarded
at runtime (type erasure)

Typed Lookup
^^^^^^^^^^^^
_planned_ a system of per-type lookup tables, based on `EntryID`, together
with an type specific access functor. Together, this allows to translate
transparently and typesafe from symbolic ID to object instance, which
is an prerequisite for integrating a rules based system. Besides, these
tables allow unique IDs per type


Allocators
~~~~~~~~~~
_tbw_


Memory Pools
~~~~~~~~~~~~

Fast memory pools for moderately small static sized allocations in highly
dynamic situations.

 * optimized for cache locality
 * supporting a destructor callback to free all objects


Temporary Buffers
~~~~~~~~~~~~~~~~~

Provides a small number of round robin buffers which need not to be freed.
Must only be used locally when no deep recursion may use tmpbufs. Using these
wrong (recursively) will result in corrupted data.

Very fast and efficient from smallest too hugest allocations. No need to care
for 'free()'


Template Metaprogramming
~~~~~~~~~~~~~~~~~~~~~~~~

Typelists
^^^^^^^^^
_tbw_

Tuples
^^^^^^
_tbw_

Functor Utils
^^^^^^^^^^^^^
_tbw_


Preprocessor Metaprogramming
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

ppmpl.h


CLib wrappers
~~~~~~~~~~~~~

Some wrapers for the C memory management functions malloc, calloc, realloc and
free which never fail. In case of an error the resourcecollector in the
backend is invoked to free resources or doing an emergency shutdown.

Safe wrapers for some string functions from the C-library which also never
fail. NULL strings are propagated to "" empty strings.


Polymorphic Programming in C
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Just a macro for simplyfying vtable function calls
 VCALL(Handle, function, arguments...)
translates to
 Handle->vtable->function (Handle, arguments...)

The user is responsible for setting up a `vtable` member in his datastructures
this macro does some NoBug checks that self and function are initialized.


Algorithms & Datastructures
~~~~~~~~~~~~~~~~~~~~~~~~~~~

Probabilistic Splay Tree
^^^^^^^^^^^^^^^^^^^^^^^^

Self optimizing splay tree

Advantage:
 * can be iterated
 * very fast in situations where few common elements are queried most often

Disadvantages
 * provides no own locking, needs to be locked from outside
 * (almost) every access is a mutation and needs an exclusive lock, bad for concurrency


BTree
^^^^^

Generic B+/B* Tree implementation. Details are provided by a vtable at an
actual implementation.

 * Fine grained (block level) locking with different modes
 * supports cursors to iterate over the data, in both directions
 * exact and inexact searched (what's close before/after something)


Cuckoo Hashing
^^^^^^^^^^^^^^
_Currently defunct, to be revived someday_


Hash functions
^^^^^^^^^^^^^^
_planned_


Linked Lists
^^^^^^^^^^^^

.llist

Cyclic double linked intrusive list.

.slist

Single linked variant.


Most Recent used Cachelists
^^^^^^^^^^^^^^^^^^^^^^^^^^^

A list where old entries are recycled from the tail, used things are removed
from the list (ownership acquired) and released back to the head.

Items not used propagate towards the tail where they will be reused.


//Undocumented
//access-casted.hpp
//advice.hpp symbol-impl.cpp
//allocationcluster.cpp symbol.hpp
//bool-checkable.hpp
//cmdline.cpp
//cmdline.hpp
//del-stash.hpp
//diagnostic-context.hpp
//element-tracker.hpp
//external
//factory.hpp
//format.hpp
//frameid.hpp
//functor-util.hpp
//handle.hpp
//hash-indexed.hpp
//iter-adapter-stl.hpp
//iter-adapter.hpp
//iter-source.hpp
//itertools.hpp
//lifecycle.cpp
//lifecycleregistry.hpp
//lumitime-fmt.hpp
//lumitime.cpp
//multifact-arg.hpp
//multifact.hpp
//nobug-init.cpp <<why here and not in common?
//nobug-init.hpp
//null-value.hpp
//observable-list.hpp
//opaque-holder.hpp
//p.hpp
//query.cpp
//query.hpp
//ref-array-impl.hpp
//ref-array.hpp
//result.hpp
//scoped-holder.hpp
//scoped-ptrvect.hpp
//scopedholdertransfer.hpp
//singleton-ref.hpp
//singleton-subclass.hpp
//singleton.hpp
//singletonfactory.hpp
//singletonpolicies.hpp
//singletonpreconfigure.hpp
//streamtype.cpp
//test
//thread-local.hpp
//tree.hpp
//typed-allocation-manager.hpp
//typed-counter.hpp
//util-foreach.hpp
//util.cpp
//util.hpp
//variant.hpp
//visitor-dispatcher.hpp
//visitor-policies.hpp
//visitor.hpp
//wrapper.hpp
//wrapperptr.hpp