From d2c5297a9d30afaa4151bacb20487985712af508 Mon Sep 17 00:00:00 2001
From: Ichthyostega
Date: Sun, 8 Jan 2012 00:13:23 +0100
Subject: [PATCH] document our custom iterator concept
---
 doc/devel/rfc/LumieraForwardIterator.txt |   6 +
 doc/technical/library/iterator.txt       | 141 +++++++++++++++++++++++
 doc/technical/overview.txt               |  47 +++++++-
 3 files changed, 191 insertions(+), 3 deletions(-)
 create mode 100644 doc/technical/library/iterator.txt
diff --git a/doc/devel/rfc/LumieraForwardIterator.txt b/doc/devel/rfc/LumieraForwardIterator.txt
index e93ca8861..5bb298127 100644
--- a/doc/devel/rfc/LumieraForwardIterator.txt
+++ b/doc/devel/rfc/LumieraForwardIterator.txt
@@ -149,6 +149,12 @@ for the most common STL containers, plus Map, key and value extractors.
 Ichthyostega:: 'Sa 16 Apr 2011 00:20:13 CEST'
+minor change: removed support for post-increment. It doesn't fit with the concept
+and caused serious problems in practice. A correct implementation of post-increment
+would require a ``deep copy'' of any underlying data structures.
+
+Ichthyostega:: 'Sa 07 Jan 2012 21:49:09 CET' ~~
+
 //endof_comments:
diff --git a/doc/technical/library/iterator.txt b/doc/technical/library/iterator.txt
new file mode 100644
index 000000000..e06b9bf4e
--- /dev/null
+++ b/doc/technical/library/iterator.txt
@@ -0,0 +1,141 @@
+Iterators and Pipelines
+=======================
+
+The link:http://c2.com/cgi/wiki?IteratorPattern[Iterator Pattern] makes it possible to
+expose the contents or elements of any kind of collection, set or container
+for use by client code, without exposing the implementation of the underlying
+data structure. Thus, iterators are one of the primary API building blocks.
+
+Lumiera Forward Iterator
+------------------------
+While most modern languages provide some kind of _Iterator,_ the actual semantics
+and the fine points of the implementation vary greatly from language to language.
+Unfortunately the C++ standard library uses a very elaborate and rather low-level
+notion of iterators, which doesn't mix well with the task of building clean interfaces.
+
+Thus, within the Lumiera core application, we're using our own Iterator concept,
+initially defined as link:{ldoc}/devel/rfc/LumieraForwardIterator.html[RfC],
+which places the primary focus on interfaces and decoupling, gaining readability
+and simplicity at the price of (somewhat lesser) performance.
+
+.Definition
+An Iterator is a self-contained token value,
+representing the promise to pull a sequence of data
+
+ - rather than deriving from a specific interface, anything behaving
+   appropriately _is a Lumiera Forward Iterator._ (``duck typing'')
+ - the client finds a typedef at a suitable, nearby location. Objects of this
+   type can be created, copied and compared.
+ - any Lumiera forward iterator can be in _exhausted_ (invalid) state, which
+   can be checked by +bool+ conversion.
+ - in particular, default constructed iterators are fixed in that state.
+   Non-exhausted iterators may only be obtained by API call.
+ - the exhausted state is final and can't be reset, meaning that any iterator
+   is a disposable one-way-off object.
+ - when an iterator is _not_ in the exhausted state, it may be _dereferenced_
+   (`*i`), yielding the ``current'' value
+ - moreover, iterators may be incremented (`++i`) until exhaustion.
+
+
+Motivation
+~~~~~~~~~~
+The Lumiera Forward Iterator concept is a blend of the STL iterators and
+iterator concepts found in Java, C#, Python and Ruby. The chosen syntax should
+look familiar to C++ programmers and indeed is compatible with STL containers and
+ranges.
+In contrast, while an STL iterator can be thought of as being just
+a disguised pointer, the semantics of Lumiera Forward Iterators are deliberately
+reduced to a single, one-way-off forward iteration: they can't be reset or
+manipulated by any arithmetic, and the result of assigning to a dereferenced
+iterator is unspecified, as is the meaning of post-increment and stored copies
+in general. You _should not think of an iterator as denoting a position_ --
+just a one-way-off promise to yield data.
+
+Other notable differences from the STL iterators are the default ctor and the
++bool+ conversion. The latter allows using iterators painlessly within +for+
+and +while+ loops; a default constructed iterator is equivalent to the STL
+container's +end()+ value -- indeed any _container-like_ object exposing
+Lumiera Forward Iteration is encouraged to provide such an `end()`-function,
+additionally enabling iteration by `std::for_each` (or Lumiera's even more
+convenient `util::for_each()`).
+
+Implementation
+~~~~~~~~~~~~~~
+As pointed out above, within Lumiera the notion of ``Iterator'' is a concept
+(generic programming) and doesn't mean a supertype (object orientation). Any
+object providing a suitable set of operations can be used for iteration.
+
+- must be default constructible to _exhausted state_
+- must be a copyable value object
+- must provide a `bool` conversion to detect _exhausted state_
+- must provide a pre-increment operator (`++i`)
+- must allow dereferencing (`*i`) to yield the current object
+- must throw on any usage in _exhausted state_.
+
+But, typically you wouldn't write all those operations again and again.
+Rather, there are two basic styles of iterator implementations, each of which
+is supported by some predefined templates and a framework of helper functions.
+
+Iterator Adapters
+^^^^^^^^^^^^^^^^^
+Iterators built based on these adapter templates are lightweight and simple to use
+for the implementor.
+But they don't decouple from the actual implementation, and
+the resulting type of the iterator usually is rather convoluted. So the typical
+usage scenario is defining some kind of custom container, where we want to add
+a `begin()` and `end()` function to make it behave similarly to an STL
+container. There should be an embedded typedef `iterator` (and maybe `const_iterator`).
+This style is best used within generic code at the implementation level, but is not
+well suited for interfaces.
+
+-> see 'lib/iter-adapter.hpp'
+
+Iteration Sources
+^^^^^^^^^^^^^^^^^
+Here we define a classical abstract base class to be used at interfaces. The template
+`lib::IterSource<TY>` is an abstract promise to yield elements of type TY. It defines
+an embedded type `iterator` (which is an iterator adapter depending only on
+this abstract interface). Typically, interfaces declare that they return an
+`IterSource<TY>::iterator` as the result of some API call. These iterators will
+hold an embedded back-reference to ``their'' source, while the exact nature of this
+source remains opaque. Obviously, the price to pay for this abstraction barrier is
+calling through virtual functions into the actual implementation of the ``source''.
+
+Helpers to define iterators
+^^^^^^^^^^^^^^^^^^^^^^^^^^^
+For both kinds of iterator implementation, there is a complete set of adapters based
+on STL containers. Thus, it's possible to expose the contents of such a container,
+or the keys, the values or the unique values with just a single line of code, either
+as iterator adapter (-> see 'lib/iter-adapter-stl.hpp'), or as iteration source
+(-> see 'lib/iter-source.hpp').
+
+
+Pipelines
+---------
+The extended use of iterators as an API building block naturally leads to building
+_filter pipelines_: this technique from functional programming completely abstracts
+away the actual iteration, focussing solely on the selection and processing of
+individual items.
+For this to work, we need special manipulation functions, which
+take an iterator and yield a new iterator incorporating the manipulation. (Thus,
+in the terminology of functional programming, these would be considered to be
+``higher order functions'', i.e. functions processing other functions, not values.)
+The most notable building blocks for such pipelines are
+
+filtering::
+  each element yielded by the _source iterator_ is evaluated by a _predicate function,_
+  i.e. a function taking the element as argument and returning a `bool`, thus answering
+  a ``yes or no'' question. Only elements accepted by the predicate are passed
+  on and will be yielded by the result iterator, which thus is a _filtered iterator_.
+
+transforming::
+  each element yielded by the _source iterator_ is passed through a _transforming function,_
+  i.e. a function taking a source element and returning a ``transformed'' element, which
+  thus may be of a completely different type than the source.
+
+Since these elements can be chained up, such a pipeline may pass several abstraction barriers
+and APIs, without either the source or the destination being aware of this fact. The actual
+processing only happens _on demand_, when pulling elements from the end of the pipeline.
+Often, this end is either a _collecting step_ (pulling all elements and filling a new container)
+or again an IterSource to expose the promise to yield elements of the target type.
+
+Pipelines work best on _value objects_ -- special care is necessary when objects with _reference
+semantics_ are involved.
+
diff --git a/doc/technical/overview.txt b/doc/technical/overview.txt
index 3e389544c..e6a8f0b5c 100644
--- a/doc/technical/overview.txt
+++ b/doc/technical/overview.txt
@@ -670,17 +670,58 @@ _tbw_
 Iterators
 ~~~~~~~~~
+Iterators serve to decouple a collection of elements from the actual data type
+implementation used to manage those elements. The use of iterators is a
+design pattern.
+-> see link:{ldoc}/technical/library/iterator.html[detailed library documentation]
+
 Lumiera Forward Iterator
 ^^^^^^^^^^^^^^^^^^^^^^^^
-_tbw_
+Within Lumiera, we don't treat _Iterator_ as a base class -- we treat it as a _concept_
+for generic programming, similar to the usage in the STL. But we use our own definition
+of the iterator concept, placing the primary focus on interfaces and decoupling.
+Our ``Lumiera Forward Iterator'' concept deliberately removes most of the features
+known from the STL. Rather, such an iterator is just a promise to yield values
+_once_. The iterator can be disposed of when _exhausted_ -- there is no way of resetting,
+moving backwards or doing any kind of arithmetic with such an object. The _exhausted_
+state can be detected by a +bool+ conversion (contrast this with STL iterators, where
+you need to compare to an +end+ iterator). Beyond that, the usage is quite similar,
+even compatible with +std::for_each+.
 Iterator Adapters
 ^^^^^^^^^^^^^^^^^
-_tbw_
+We provide a collection of predefined adapter templates to ease building
+Lumiera Forward Iterators.
+
+- a generic solution using an _iteration control_ callback API
+- the `lib::RangeIter` just wraps up a pair of iterators for ``current position''
+  and ``end'' -- compatible with the STL
+- there is a variant for automatically dereferencing pointers
+- plus a set of adapters for STL containers, exposing each value, each
+  key, distinct values and so on.
+
+Iterator Adapters are designed for ease of use; they don't conceal the underlying
+implementation (and the actual type is often quite convoluted).
+
+Iteration Sources
+^^^^^^^^^^^^^^^^^
+In contrast, the `lib::IterSource` template is an abstract base class.
+This makes it possible to expose the promise to deliver values through any kind of API, without
+disclosing the actual implementation. Obviously, this requires the use of virtual
+functions for the actual iteration.
+
+Again, there are predefined adapters for STL containers, but the actual container
+is concealed in this case.
 Itertools
 ^^^^^^^^^
-_tbw_
+Iterators can be used to build pipelines. This technique from functional programming
+abstracts away the actual iteration completely, focussing only on the way
+individual elements are processed. To support this programming style, several support
+templates are provided to build _filtering iterators, transforming iterators,_ to pick
+only _unique values,_ to _take a snapshot on-the-fly_ etc. There are convenience
+builder functions for those operations, figuring out the actual source and destination
+types by template metaprogramming.
 Front-end for boost::format