document our custom iterator concept

This commit is contained in:
Fischlurch 2012-01-08 00:13:23 +01:00
parent 484149e73e
commit d2c5297a9d
3 changed files with 191 additions and 3 deletions

View file

@ -149,6 +149,12 @@ for the most common STL containers, plus Map, key and value extractors.
Ichthyostega:: 'Sa 16 Apr 2011 00:20:13 CEST'
minor change: removed support for post-increment. It doesn't fit with the concept
and caused serious problems in practice. A correct implementation of post-increment
would require a ``deep copy'' of any underlying data structures.
Ichthyostega:: 'Sa 07 Jan 2012 21:49:09 CET' ~<prg@ichthyostega.de>~
//endof_comments:

View file

@ -0,0 +1,141 @@
Iterators and Pipelines
=======================
The link:http://c2.com/cgi/wiki?IteratorPattern[Iterator Pattern] allows to
expose the contents or elements of any kind of collection, set or container
for use by client code, without exposing the implementation of the underlying
data structure. Thus, iterators are one of the primary API building blocks.
Lumiera Forward Iterator
------------------------
While most modern languages provide some kind of _Iterator,_ the actual semantics
and the fine points of the implementation vary greatly from language to language.
Unfortunately the C++ standard library uses a very elaborate and rather low-level
notion of iterators, which doesn't mix well with the task of building clean interfaces.
Thus, within the Lumiera core application, we're using our own Iterator concept,
initially defined as link:{ldoc}/devel/rfc/LumieraForwardIterator.html[RfC],
which places the primary focus on interfaces and decoupling, trading off
readability and simplicity for (lesser) performance.
.Definition
An Iterator is a self-contained token value,
representing the promise to pull a sequence of data
- rather then deriving from an specific interface, anything behaving
appropriately _is a Lumiera Forward Iterator._ (``duck typing'')
- the client finds a typedef at a suitable, nearby location. Objects of this
type can be created, copied and compared.
- any Lumiera forward iterator can be in _exhausted_ (invalid) state, which
can be checked by +bool+ conversion.
- especially, default constructed iterators are fixed to that state.
Non-exhausted iterators may only be obtained by API call.
- the exhausted state is final and can't be reset, meaning that any iterator
is a disposable one-way-off object.
- when an iterator is _not_ in the exhausted state, it may be _dereferenced_
(`*i`), yielding the ``current'' value
- moreover, iterators may be incremented (`++i`) until exhaustion.
Motivation
~~~~~~~~~~
The Lumiera Forward Iterator concept is a blend of the STL iterators and
iterator concepts found in Java, C#, Python and Ruby. The chosen syntax should
look familiar to C++ programmers and indeed is compatible to STL containers and
ranges. To the contrary, while a STL iterator can be thought of as being just
a disguised pointer, the semantics of Lumiera Forward Iterators is deliberately
reduced to a single, one-way-off forward iteration, they can't be reset,
manipulated by any arithmetic, and the result of assigning to an dereferenced
iterator is unspecified, as is the meaning of post-increment and stored copies
in general. You _should not think of an iterator as denoting a position_ --
just a one-way off promise to yield data.
Another notable difference to the STL iterators is the default ctor and the
+bool+ conversion. The latter allows using iterators painlessly within +for+
and +while+ loops; a default constructed iterator is equivalent to the STL
container's +end()+ value -- indeed any _container-like_ object exposing
Lumiera Forward Iteration is encouraged to provide such an `end()`-function,
additionally enabling iteration by `std::for_each` (or Lumiera's even more
convenient `util::for_each()`).
Implementation
~~~~~~~~~~~~~~
As pointed out above, within Lumiera the notion of ``Iterator'' is a concept
(generic programming) and doesn't mean a supertype (object orientation). Any
object providing a suitable set of operations can be used for iteration.
- must be default constructible to _exhausted state_
- must be a copyable value object
- must provide a `bool` conversion to detect _exhausted state_
- must provide a pre-increment operator (`++i`)
- must allow dreferentiation (`*i`) to yield the current object
- must throw on any usage in _exhausted state_.
But, typically you wouldn't write all those operations again and again.
Rather, there are two basic styles of iterator implementations, each of which
is supported by some pre defined templates and a framework of helper functions.
Iterator Adapters
^^^^^^^^^^^^^^^^^
Iterators built based on these adaptor templates are lightweight and simple to use
for the implementor. But they don't decouple from the actual implementation, and
the resulting type of the iterator usually is rather convoluted. So the typical
usage scenario is, when defining some kind of custom container, we want to add
a `begin()` and `end()` function, allowing to make it behave similar to a STL
container. There should be an embedded typedef `iterator` (and maybe `const_iterator`).
This style is best used within generic code at the implementation level, but is not
well suited for interfaces.
-> see 'lib/iter-adapter.hpp'
Iteration Sources
^^^^^^^^^^^^^^^^^
Here we define a classical abstract base class to be used at interfaces. The template
`lib::IterSource<TY>` is an abstract promise to yield elements of type TY. It defines
an embedded type `iterator` (which is an iterator adapter, just only depending on
this abstract interface). Typically, interfaces declare to return an
`IterSource<TY>::iterator` as the result of some API call. These iterators will
hold an embedded back-reference to ``their'' source, while the exact nature of this
source remains opaque. Obviously, the price to pay for this abstraction barrier is
calling through virtual functions into the actual implementation of the ``source''.
Helpers to define iterators
^^^^^^^^^^^^^^^^^^^^^^^^^^^
For both kinds of iterator implementation, there is a complete set of adaptors based
on STL containers. Thus, it's possible to expose the contents of such a container,
or the keys, the values or the unique values just with a single line of code. Either
as iterator adapter (-> see 'lib/iter-adapter-stl.hpp'), or as iteration source
(-> see 'lib/iter-source.hpp')
Pipelines
---------
The extended use of iterators as an API building block naturally leads to building
_filter pipelines_: This technique form functional programming completely abstracts
away the actual iteration, focussing solely on the selecting and processing of
individual items. For this to work, we need special manipulation functions, which
take an iterator and yield a new iterator incorporating the manipulation. (Thus,
in the terminology of functional programming, these would be considered to be
``higher order functions'', i.e. functions processing other functions, not values).
The most notable building blocks for such pipelines are
filtering::
each element yielded by the _source iterator_ is evaluated by a _predicate function,_
i.e. a function taking the element as argument and returning a `bool`, thus answering
a ``yes or no'' question. Only elements passing the test by the predicate can pass
on and will appear from the result iterator, which thus is a _filtered iterator_
transforming::
each element yielded by the _source iterator_ is passed through a _transformnig function,_
i.e. a function taking an source element and returing a ``transformed'' element, which
thus may be of a completely different type than the source.
Since these elements can be chained up, such a pipeline may pass several abstraction barriers
and APIs, without either the source or the destination being aware of this fact. The actual
processing only happens _on demand_, when pulling elements from the end of the pipeline.
Oten, this end is either a _collecting step_ (pulling all elements and filling a new container)
or again a IterSource to expose the promise to yield elements of the target type.
Pipelines work best on _value objects_ -- special care is necessary when objects with _reference
semantics_ are involved.

View file

@ -670,17 +670,58 @@ _tbw_
Iterators
~~~~~~~~~
Iterators serve to decouple a collection of elements from the actual data type
implementation used to manage those elements. The use of iterators is a
design pattern.
-> see link:{ldoc}/technical/library/iterator.html[detailed library documentation]
Lumiera Forward Iterator
^^^^^^^^^^^^^^^^^^^^^^^^
_tbw_
Within Lumiera, we don't treat _Iterator_ as a base class -- we treat it as a _concept_
for generic programming, similar to the usage in the STL. But we use our own definition
of the iterator concept, placing the primary focus on interfaces and decoupling.
Our ``Lumiera Forward Iterator'' concept deliberately removes most of the features
known from the STL. Rather, such an iterator is just the promise for pulling values
_once_. The iterator can be disposed when _exhausted_ -- there is no way of resetting,
moving backwards or doing any kind of arithmetic with such an object. The _exhausted
state can be detected by a +bool+ conversion (contrast this with STL iterators, where
you need to compare to an +end+ iterator). Beyond that, the usage is quite similar,
even compatible to +std::for_each+.
Iterator Adapters
^^^^^^^^^^^^^^^^^
_tbw_
We provide a collection of pre defined adapter templates to ease building
Lumiera Forward Iterators.
- a generic solution using a _iteration control_ callback API
- the `lib::RangeIter` just wraps up a pair of iterators for ``current position''
and ``and'' -- compatible with the STL
- there is a variant for automatically dereferencing pointers
- plus a set of adapters for STL containers, allowing to expose each value, each
key, distinct values and so on.
Iterator Adapters are designed for ease of use, they don't conceal the underlying
implementation (and the actual type is often quite convoluted).
Iteration Sources
^^^^^^^^^^^^^^^^^
To the contrary, the `lib::IterSource<TY>` template is an abstract base class.
This allows to expose the promise to deliver values through any kind of API, without
disclosing the actual implementation. Obviously, this requires the use of virtual
functions for the actual iteration.
Again, there are pre-defined adaptors for STL containers, but the actual container
is concealed in this case.
Itertools
^^^^^^^^^
_tbw_
Iterators can be used to build pipelines. This technique from functional programming
allows to abstract away the actual iteration completely, focussing only on the way
individual elements are processed. To support this programming style, several support
templates are provided to build _filtering iterators, transforming iterators,_ to pick
only _unique values,_ to _take a snapshot on-the-fly_ etc. There are convenience
builder functions for those operations, figuring out the actual source and destination
types by template metaprogramming.
Front-end for boost::format