lumiera_/doc/design/lowlevel/index.txt

115 lines
5.4 KiB
Text
Raw Normal View History

Design Documents: Vault
=======================
What follows is a summary of Lumiera's *Data Handling Backend*
This is the foundation layer responsible for any high performance or high volume
data access. Within Lumiera, there are two main kinds of data handling:
* The Session and the object models manipulated through the GUI are kept in memory.
They are backed by a _storage backend,_ which provides database-like storage and
especially logging, replaying and ``Undo'' of all ongoing modifications..
* Media data is handled _frame wise_ -- as described below.
The vault layer (``backend'') uses *memory mapping* to make data available to the program.
This is somewhat different to the more common open/read/write/close file access,
while giving superior performance and much better memory utilization.
The Vault-Layer must be able to handle more data than will fit into the memory
or even address space on 32 bit architectures. Moreover, a project might access more files
than the OS can keep open simultaneously, thus the for _Files used by the Vault,_ it needs a
*FilehandleCache* to manage file handle dynamically.
Which parts of a file are actually mapped to physical RAM is managed by the kernel;
it keeps a *FileMapCache* to manage the *FileMaps* we've set up.
In the End, the application itself only requests *Data Frames* from the Vault.
To minimize latency and optimize CPU utilization we have a *Prefetch thread* which operates
a *Scheduler* to render and cache frames which are _expected to be consumed soon_. The intention
is to manage the rendering _just in time_.
The prefetcher keeps *Statistics* for optimizing performance.
Accessing Files
---------------
+FileDescriptor+ is the superclass of all possible filetypes, it has a weak reference to a
+FileHandle+ which is managed in within the +FilehandleCache+. On creation, only the existence
(when reading) or access for write for new files are checked. The +FileDescriptor+ stores some
generic metadata about the underlying file and intended use. But the actual opening is done on demand.
The _content of files is memory mapped_ into the process address space.
This is managed by +FileMap+ entries and a +FileMapCache+.
File Handles
~~~~~~~~~~~~
A +FilehandleCache+ serves to store a finite maximum number of +FileHandles+ as a MRU list.
FileHandles are managed by the +FilehandleCache+; basically they are just storing the underlying OS file
handles and managed in a lazy/weak way, (re)opened when needed and aging in the cache when not needed,
since the amount of open file handles is limited aged ones will be closed and reused when the system
needs to open another file.
File Mapping
~~~~~~~~~~~~
The +FileMapCache+ keeps a list of +FileMaps+, which are currently not in use and subject of aging.
Each +FileMap+ object contains many +Frames+. The actual layout depends on the type of the File.
Mappings need to be _page aligned_ while Frames can be anywhere within a file and dynamically sized.
All established ++FileMap++s are managed together in a central +FileMapCache+.
Actually, +FileMap+ objects are transparent to the application. The upper layers will just
request Frames by position and size. Thus, the +File+ entities associate a filename with the underlying
low level File Descriptor and access
Frames
~~~~~~
+Frames+ are the smallest datablocks handled by the Vault. The application tells the Vault Layer to make
Files available and from then on just requests Frames. Actually, those Frames are (references to) blocks
of continuous memory. They can be anything depending on the usage of the File (Video frames, encoder frames,
blocks of sound samples). Frames are referenced by a smart-pointer like object which manages the lifetime
and caching behavior.
Each frame referece can be in one out of three states:
readonly::
the backing +FileMap+ is checked out from the aging list, frames can be read
readwrite::
the backing +FileMap+ is checked out from the aging list, frames can be read and written
weak::
the +FileMap+ object is checked back into the aging list, the frame can't be accessed but we can
try to transform a weak reference into a readonly or readwrite reference
Frames can be addressed uniquely whenever a frame is not available. The vault can't serve a cached
version of the frame, a (probably recursive) rendering request will be issued.
Prefetching
~~~~~~~~~~~
There are 2 important points when we want to access data with low latency:
. Since we handle much more data than it will fit into most computers RAM.
The data which is backed in files has to be paged in and available when needed.
The +Prefetch+ Thread manages page hinting to the kernel (posix_madvise()..)
. Intermediate Frames must eventually be rendered to the cache.
The Vault Layer will send +Renderjobs+ to the +Scheduler+.
Whenever something queries a +Frame+ from the vault it provides hints about what it is doing.
These hints contain:
* Timing constraints
- When will the +Frame+ be needed
- could we drop the request if it won't be available (rendered) in-time
* Priority of this job (as soon as possible, or just in time?)
* action (Playing forward, playing backward, tweaking, playback speed, recursive rendering of dependent frames)
.Notes
* The Vault Layer will try to render related frames in groups.
* This means that following frames are scheduled with lower priority.
* Whenever the program really requests them the priority will be adjusted.
-> more about link:Scheduler.html[the Scheduling of calculation jobs]