often the result stream is the same length as the input, so we
can reuse the existing token stream
also, in cases where a new stream is required, set its capacity to
the length of the input stream. most output streams are at least
as long as the input, so this may avoid some subsequent resizing
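
a minimal sketch of the two cases above, not the actual filter code. Token, TokenStream, lowerCaseFilter and splitOnHyphenFilter are hypothetical names chosen for illustration. the first filter emits one token per input token and reuses the input stream; the second can grow the stream, so it allocates a new one with capacity len(input).

```go
package main

import (
	"fmt"
	"strings"
)

// Token and TokenStream are hypothetical stand-ins for the real types.
type Token struct {
	Term string
}

type TokenStream []*Token

// lowerCaseFilter emits exactly one token per input token, so it
// rewrites the tokens in place and returns the input stream itself.
func lowerCaseFilter(input TokenStream) TokenStream {
	for _, tok := range input {
		tok.Term = strings.ToLower(tok.Term)
	}
	return input
}

// splitOnHyphenFilter may emit more tokens than it receives, so it
// needs a new stream. every input token yields at least one output
// token, so len(input) is a safe starting capacity; append only
// reallocates when a token is actually split.
func splitOnHyphenFilter(input TokenStream) TokenStream {
	rv := make(TokenStream, 0, len(input))
	for _, tok := range input {
		for _, part := range strings.Split(tok.Term, "-") {
			rv = append(rv, &Token{Term: part})
		}
	}
	return rv
}

func main() {
	stream := TokenStream{{Term: "Well-Known"}, {Term: "Index"}}

	lowered := lowerCaseFilter(stream)
	// compares the addresses of the first slots: same backing array
	fmt.Println(&lowered[0] == &stream[0])

	for _, tok := range splitOnHyphenFilter(lowered) {
		fmt.Println(tok.Term)
	}
}
```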
ultimately this makes it more convenient for us to wire up
different elements of the analysis pipeline, without having to
preload everything into memory before we need it
separately, the index layer now has a mechanism for storing
internal key/value pairs. this is expected to be used by the
top layer to store the mapping, and possibly other pieces of
data, but it is not exposed to the user at the top.
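
one way such a mechanism could look, as a sketch only. the Index type, the "i:" key prefix, SetInternal/GetInternal and the JSON encoding of the mapping are all assumptions made for illustration, not the actual index layer API.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// internalPrefix is a hypothetical key prefix that keeps internal rows
// apart from rows derived from user documents.
const internalPrefix = "i:"

// Index is a toy index layer backed by an in-memory map.
type Index struct {
	rows map[string][]byte
}

func NewIndex() *Index {
	return &Index{rows: make(map[string][]byte)}
}

// SetInternal stores an internal key/value pair.
func (i *Index) SetInternal(key string, val []byte) {
	i.rows[internalPrefix+key] = val
}

// GetInternal returns an internal value and whether it was present.
func (i *Index) GetInternal(key string) ([]byte, bool) {
	val, ok := i.rows[internalPrefix+key]
	return val, ok
}

// saveMapping shows how a top layer might persist its mapping
// (here just field name -> analyzer name) through the internal store.
func saveMapping(idx *Index, mapping map[string]string) error {
	buf, err := json.Marshal(mapping)
	if err != nil {
		return err
	}
	idx.SetInternal("mapping", buf)
	return nil
}

// loadMapping reads the mapping back from the internal store.
func loadMapping(idx *Index) (map[string]string, error) {
	buf, ok := idx.GetInternal("mapping")
	if !ok {
		return nil, fmt.Errorf("no mapping stored")
	}
	var mapping map[string]string
	if err := json.Unmarshal(buf, &mapping); err != nil {
		return nil, err
	}
	return mapping, nil
}

func main() {
	idx := NewIndex()
	if err := saveMapping(idx, map[string]string{"title": "fr"}); err != nil {
		panic(err)
	}
	mapping, err := loadMapping(idx)
	if err != nil {
		panic(err)
	}
	fmt.Println(mapping["title"])
}
```

only the top layer calls SetInternal/GetInternal in this sketch; nothing user-facing hands out the internal keys.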
defined article word maps for french, italian, irish and catalan
defined elision filters for these same languages
updated analyzers for french and italian to use this new filter
irish and catalan still depend on other missing pieces
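
a rough sketch of what an elision filter does with an article word map, not the filter defined in this change. frenchArticles is a small illustrative subset (not the full map), and elide/elisionFilter are hypothetical names. a leading article followed by an apostrophe is stripped; anything else is left alone.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// frenchArticles is an illustrative subset of elided french forms.
var frenchArticles = map[string]bool{
	"l": true, "d": true, "j": true, "m": true, "n": true,
	"s": true, "t": true, "c": true, "qu": true,
}

// elide strips a leading article plus apostrophe from term when the
// text before the first apostrophe is in the articles map.
func elide(term string, articles map[string]bool) string {
	for i, r := range term {
		// accept both the ASCII apostrophe and the typographic one
		if r == '\'' || r == '\u2019' {
			if articles[strings.ToLower(term[:i])] {
				return term[i+utf8.RuneLen(r):]
			}
			return term
		}
	}
	return term
}

// elisionFilter emits one term per input term, so it can overwrite the
// input slice in place instead of allocating a new one.
func elisionFilter(terms []string, articles map[string]bool) []string {
	for i, term := range terms {
		terms[i] = elide(term, articles)
	}
	return terms
}

func main() {
	terms := []string{"l'avion", "qu’il", "aujourd'hui", "train"}
	for _, term := range elisionFilter(terms, frenchArticles) {
		fmt.Println(term)
	}
}
```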
closes #25