DOC: update technical docs to reflect recent development

At various places, concepts and drafts from the early stage of the
Lumiera Project are still reflected in the online documentation pages.
Over the last months, development focussed on the Render Engine,
causing a shift in some parts of the design and obsoleting other
parts altogether (notably, we are considering the use of IO_URING for async IO).
This commit is contained in:
Fischlurch 2023-10-25 00:02:08 +02:00
parent c4dcdb93c4
commit 23f6f731f1
12 changed files with 746 additions and 855 deletions

View file

@ -278,6 +278,11 @@ function edit()
EDITOR="${EDITOR:-$(git config --get core.editor)}"
EDITOR="${EDITOR:-$VISUAL}"
if [ -z "$EDITOR" ]; then
echo -e "\nFATAL\n\$EDITOR undefined\n\n"
exit 1
fi
local file="$1"
local line=0

View file

@ -1,25 +0,0 @@
The Scheduler
-------------
:Author: CehTeh
:Date: 6/2007
//MENU: label Scheduler
Scheduling is done with two priority queues, one for high-priority jobs and one for low-priority jobs.
These priority queues are ordered by absolute time values plus some job-specific identifier.
There are the following (non-exhaustive) kinds of jobs:
* started job
* job to be canceled
* unscheduled job
* dependency providing jobs
Jobs implement a kind of future. We try hard to avoid any blocking waits.
The Job scheduler runs single-threaded. Its only task is to schedule and delegate jobs to worker threads;
by itself it will never do any extensive processing.
Each job has a pre-configured behaviour for the case of failure or a deadline miss.
Any cancelled or expiring job gets noted in *Statistics*, which is used to adjust timings
for optimal performance and I/O throughput.

View file

@ -9,106 +9,8 @@ data access. Within Lumiera, there are two main kinds of data handling:
* The Session and the object models manipulated through the GUI are kept in memory.
They are backed by a _storage backend,_ which provides database-like storage and
especially logging, replaying and ``Undo'' of all ongoing modifications.
* Media data is handled _frame wise_ -- as described below.
The vault layer (``backend'') uses *memory mapping* to make data available to the program.
This is somewhat different to the more common open/read/write/close file access,
yet gives superior performance and much better memory utilization.
The Vault-Layer must be able to handle more data than will fit into memory,
or even into the address space on 32-bit architectures. Moreover, a project might access more files
than the OS can keep open simultaneously; thus, for the _Files used by the Vault,_ it needs a
*FilehandleCache* to manage file handles dynamically.
Which parts of a file are actually mapped to physical RAM is managed by the kernel;
it keeps a *FileMapCache* to manage the *FileMaps* we've set up.
In the end, the application itself only requests *Data Frames* from the Vault.
To minimize latency and optimize CPU utilization we have a *Prefetch thread* which operates
a *Scheduler* to render and cache frames which are _expected to be consumed soon_. The intention
is to manage the rendering _just in time_.
The prefetcher keeps *Statistics* for optimizing performance.
Accessing Files
---------------
+FileDescriptor+ is the superclass of all possible file types; it has a weak reference to a
+FileHandle+ which is managed within the +FilehandleCache+. On creation, only the existence
(when reading) or, for new files, the write access is checked. The +FileDescriptor+ stores some
generic metadata about the underlying file and its intended use, but the actual opening is done on demand.
The _content of files is memory mapped_ into the process address space.
This is managed by +FileMap+ entries and a +FileMapCache+.
File Handles
~~~~~~~~~~~~
A +FilehandleCache+ serves to store a finite maximum number of +FileHandles+ as an MRU list.
FileHandles are managed by the +FilehandleCache+; basically they just store the underlying OS file
handles and are managed in a lazy/weak way: (re)opened when needed and aging in the cache when not needed.
Since the number of open file handles is limited, aged ones will be closed and reused when the system
needs to open another file.
File Mapping
~~~~~~~~~~~~
The +FileMapCache+ keeps a list of those +FileMaps+ which are currently not in use and thus subject to aging.
Each +FileMap+ object contains many +Frames+. The actual layout depends on the type of the File.
Mappings need to be _page aligned_ while Frames can be anywhere within a file and dynamically sized.
All established ++FileMap++s are managed together in a central +FileMapCache+.
Actually, +FileMap+ objects are transparent to the application. The upper layers will just
request Frames by position and size. Thus, the +File+ entities associate a filename with the underlying
low level File Descriptor and access
Frames
~~~~~~
+Frames+ are the smallest data blocks handled by the Vault. The application tells the Vault Layer to make
Files available and from then on just requests Frames. Actually, those Frames are (references to) blocks
of contiguous memory. They can be anything, depending on the usage of the File (video frames, encoder frames,
blocks of sound samples). Frames are referenced by a smart-pointer like object which manages the lifetime
and caching behavior.
Each frame reference can be in one of three states:
readonly::
the backing +FileMap+ is checked out from the aging list, frames can be read
readwrite::
the backing +FileMap+ is checked out from the aging list, frames can be read and written
weak::
the +FileMap+ object is checked back into the aging list, the frame can't be accessed but we can
try to transform a weak reference into a readonly or readwrite reference
Frames can be addressed uniquely. Whenever a frame is not available -- i.e. the vault can't serve a cached
version of the frame -- a (probably recursive) rendering request will be issued.
Prefetching
~~~~~~~~~~~
There are two important points when we want to access data with low latency:
. Since we handle much more data than will fit into most computers' RAM,
the data which is backed by files has to be paged in and available when needed.
The +Prefetch+ Thread manages page hinting to the kernel (posix_madvise() ...)
. Intermediate Frames must eventually be rendered to the cache.
The Vault Layer will send +Renderjobs+ to the +Scheduler+.
Whenever something queries a +Frame+ from the vault it provides hints about what it is doing.
These hints contain:
* Timing constraints
- When will the +Frame+ be needed
- could we drop the request if it won't be available (rendered) in time
* Priority of this job (as soon as possible, or just in time?)
* action (Playing forward, playing backward, tweaking, playback speed, recursive rendering of dependent frames)
.Notes
* The Vault Layer will try to render related frames in groups.
* This means that subsequent frames are scheduled with lower priority.
* Whenever the program really requests them, the priority will be adjusted.
-> more about link:Scheduler.html[the Scheduling of calculation jobs]
* Media data is handled _frame wise_ -- prefetching data asynchronously.
The goal is to optimize CPU utilization; large scale data IO will be performed
asynchronously. Data retrieval and processing of prerequisites will
be *scheduled* such as to manage the rendering _just in time_.

View file

@ -3,7 +3,7 @@ Design Process : All Plugin Interfaces Are C
[grid="all"]
`------------`-----------------------
*State* _Final_
*State* _Dropped_
*Date* _2007-06-29_
*Proposed by* link:ct[]
-------------------------------------
@ -159,17 +159,50 @@ After a talk on IRC ichthyo and me agreed on making lumiera a multi language
project where each part can be written in the language which will fit it best.
Language purists might disagree on such a mix, but I believe the benefits
outweigh the drawbacks.
-- link:ct[] [[DateTime(2007-07-03T05:51:06Z)]]
ct:: '2007-07-03 05:51:06'
C is the only viable choice here. Perhaps some sort of "test bench" could be
designed to rigorously test plugins for any problems which may cause Lumiera to
become unstable (memory leaks etc).
-- link:Deltafire[] [[DateTime(2007-07-03T12:17:09Z)]]
Deltafire:: '2007-07-03 12:17:09'
after a talk on irc, we decided to do it this way, further work will be
documented in the repository (tiddlywiki/source)
-- link:ct[] [[DateTime(2007-07-11T13:10:07Z)]]
ct:: '2007-07-11 13:10:07'
Development took another direction over the course of the years;
Lumiera is not based on a _generic plug-in architecture_, and the topic
of interfaces for _dedicated plugins_ still needs to be worked out.
Ichthyostega:: '2023-10-24 22:55:23'
Conclusion
----------
Initially there was agreement over the general direction set out by this proposal.
However, _Ichthyo_ was always sceptical regarding the benefits of a generic plug-in
architecture. Experience with high-profile projects based on such a concept seems
to show tremendous possibilities, especially regarding user involvement, but
at the same time also indicates serious problems with long-term sustainability.
The practical development -- mostly driven ahead by _Ichthyo_ -- thus never really
embraced that idea; rather, structuring by internal interfaces and contracts was
preferred. The basic system for loading plug-ins, as indicated by this proposal,
is still used though, to load some dedicated plug-ins, most notably the GUI.
To draw a conclusion: this proposal is now considered *rejected*.
Instead, Ticket https://issues.lumiera.org/ticket/1212[#1212 »Extension Concept«]
was added to the list of relevant
https://issues.lumiera.org/report/17[»Focus Topics«] for further development.
''''
Back to link:/documentation/devel/rfc.html[Lumiera Design Process overview]

View file

@ -3,7 +3,7 @@ Design Process : Data Backend
[grid="all"]
`------------`-----------------------
*State* _Final_
*State* _Parked_
*Date* _2007-06-04_
*Proposed by* link:ct[]
-------------------------------------
@ -42,24 +42,14 @@ This just starts as braindump, I will refine it soon:
http://www.pipapo.org/pipawiki/Cinelerra/Developers/ichthyo/Cinelerra3/Architecture[Architecture] and a sketch/draft about http://www.pipapo.org/pipawiki/Cinelerra/Developers/ichthyo/Possibilities_at_hand[things possible in the middle layer]
Tasks
^^^^^
Parked
~~~~~~
The underlying principles remain valid, yet development took another
direction over the last years. Special frameworks for high-performance
asynchronous IO will be used for dedicated use cases.
Ichthyostega:: '2023-10-24' ~<prg@ichthyostega.de>~
Pros
^^^^
Cons
^^^^
Alternatives
^^^^^^^^^^^^
Rationale
~~~~~~~~~
Comments
@ -71,5 +61,10 @@ Sounds fairly complete to me
Development takes place in the repo now
-- link:ct[] [[DateTime(2007-06-27T16:14:56Z)]]
Development took another direction over the last years;
the former »Backend« layer has been restructured
-- link:Ichthyostega[] [[DateTime(2023-10-24T22:45:55Z)]]
''''
Back to link:/documentation/devel/rfc.html[Lumiera Design Process overview]

View file

@ -0,0 +1 @@
../rfc/ConfigLoader.txt

View file

@ -1,121 +0,0 @@
The incomplete Guide to Lumiera Configuration
=============================================
:author: ct
:date: 8/2008
WARNING: this is a draft from the early days of Lumiera +
IMHO, the whole topic ``Configuration'' requires further
discussion and design.
+
-- -- Ichthyo
'''''''''''''
'''''''''''''
The order is roughly alphabetical, depending on the mood of the writer.
Defaults are noted where present. Not all entries are implemented yet.
General Introduction
--------------------
Lumiera uses plaintext files with an INI-like syntax for
configuration. This syntax is strictly line-based. There are only a
few syntactic elements.
TODO: describe config syntax here
Config Subsystem
----------------
The path where Lumiera searches its configuration. Single components are
separated by colons as in PATH and other such environment variables.
Here it might be handy that any Lumiera configuration can be
overridden by an environment variable:
'LUMIERA_CONFIG_PATH=somewhere:else lumiera ...'
Defaults are initialized at installation time; this is important to
bootstrap the whole configuration system.
config.path
The config system checks for a preferred format when writing config
entries. For each key 'foo.bar', this can be overridden with a key
'config.format.foo.bar' linking to the desired format.
config.formatkey ='config.format.%s'
The following are links to the default formatting when no explicit
format is set for a key. Changing these to a wrong type will break the
system!
config.formatdef.link < config.formatstr.link
config.formatdef.number < config.formatstr.number.dec
config.formatdef.real < config.formatstr.real
config.formatdef.string < config.formatstr.string
config.formatdef.word < config.formatstr.word
config.formatdef.bool < config.formatstr.bool
These are the low-level formatting specifications for the built-in
types, DON'T TOUCH THESE!
config.formatstr.link = '< %s'
config.formatstr.number.dec = '= %lld'
config.formatstr.number.hex = '= 0x%llX'
config.formatstr.number.oct = '= 0%llo'
config.formatstr.real = '= %Lg'
config.formatstr.real.dec = '= %Lf'
config.formatstr.real.sci = '= %Le'
config.formatstr.string = '= %s'
config.formatstr.string.dquoted = '= \"%s\"'
config.formatstr.string.quoted = '= ''%s'''
config.formatstr.word = '= %s'
config.formatstr.bool = '= %d'
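To illustrate the syntax, a hypothetical config fragment follows; the concrete keys and values are examples only, not actual defaults, and the comment syntax is an assumption:

```
# numbers may be written in decimal, hex or octal notation
vault.mmap.max_maps = 60000
vault.file.max_handles = 0x200

# a 'link' entry makes one key an alias for another
config.formatdef.number < config.formatstr.number.dec

# strings may be plain or quoted
session.name = 'My Project'
```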
Plugin System
-------------
The path where Lumiera searches its plugins. Single components are
separated by colons as in PATH and other such environment variables.
Here it might be handy that any Lumiera configuration can be
overridden by an environment variable:
'LUMIERA_PLUGIN_PATH=somewhere:else lumiera ...'
Sensible defaults are initialized at installation time.
plugin.path
I/O Backend
-----------
File handling
~~~~~~~~~~~~~
How many filehandles the backend shall use [approx 2/3 of all available]
vault.file.max_handles
Memory mapped Files
~~~~~~~~~~~~~~~~~~~
Address space limit (memory mapping)
Defaults:
3GiB on 32 bit arch
192TiB on 64 bit arch
vault.mmap.as_limit
Default start size for mmapping windows.
128MB on 32 bit arch
2GB on 64 bit arch
vault.mmap.window_size
How many memory mappings shall be established at most
Default 60000
vault.mmap.max_maps

View file

@ -5,5 +5,4 @@ Here we collect bits of technical documentation for the Vault-Layer.
For now, we have:
* link:ConfigLoader.html[Config Loader brainstorming from 2008]
* link:scheduler.html[Scheduler and Jobs]

View file

@ -4,37 +4,16 @@ Scheduler and Job handling
The purpose of the _Scheduler_ is to run small, self-contained _Jobs_,
ordered by priority and observing specific timing constraints.
Scheduler implementation ideas
------------------------------
NOTE: Subject to [yellow-background]#active design and implementation# work as of 10/2023
Use multiple priority queues:
- background work
- foreground high-priority
- soft-realtime actions
Work-in-progress documentation can be found in the
link:{l}/wiki/renderengine.html#PlaybackVerticalSlice%20RenderEngine%20Scheduler%20SchedulerWorker%20SchedulerMemory%20RenderActivity%20Player%20FrameDispatcher%20JobPlanningPipeline%20PlayProcess%20Rendering%20ProcNode%20NodeOperationProtocol[Tiddly Wiki]
About Jobs
----------
A job is a closure to run a small and limited action or operation, which
in itself _should not block_. Jobs may depend on each other and on resources
to be provided. A job may be contained in multiple queues and may be marked
as _canceled_ -- in which case the job function will never run and the job
will be discarded on occasion.
Job States
~~~~~~~~~~
[source,C]
--------------
enum job_state
  {
    done,      // already done, nothing to do
    running,   // job is running
    waiting,   // waits for some resource (another job)
    rejected,  // sorry, can't do that Dave, time will run out
    expired,   // time expired
    aborted    // got aborted
  };
--------------
to be provided. A job may be rescheduled prior to invocation, but it is
activated at most once by a _Scheduler Action._