From c2a5caa487bd720d33ea9050e9225db80c043b44 Mon Sep 17 00:00:00 2001
From: Gibheer
Date: Fri, 10 Jun 2011 10:55:53 +0200
Subject: [PATCH] some more todo items

---
 TODO | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/TODO b/TODO
index d357e64..e99a6d1 100644
--- a/TODO
+++ b/TODO
@@ -1,3 +1,10 @@
+* implement stemmer for strings
+  * build an abstract stemmer class
+  * implement a simple stemmer
+* build an analyzer to split the strings
+  * the analyzer uses tokens to split the stream
+  * it can throw out string parts, which have no meaning
+  * builds an array of strings with number of occurrences (needed for scoring)
 * IndexSearcher
   * build indexes for attributes
   * implement tree structures for index
@@ -9,9 +16,6 @@
   * could be implemented as per document score stored in the index
   * findings of string in the document in relation to findings in all documents
   * info should be accessible after building the index
-* implement stemmer for strings
-  * build an abstract stemmer class
-  * implement a simple stemmer
 * implement resultset
   * should be streamlined to gather resultset from multiple queries
   * sorts the result if needed
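
The stemmer and analyzer items in this TODO could be sketched roughly as follows. This is only an illustration under stated assumptions, not the project's implementation: the patch does not name a language, so Python is used here, and the names `Stemmer`, `SimpleStemmer`, `Analyzer`, the suffix list, and the stop-word set are all hypothetical. The "simple stemmer" is a naive suffix stripper, not a linguistic algorithm such as Porter's.

```python
import re
from abc import ABC, abstractmethod
from collections import Counter


class Stemmer(ABC):
    """Abstract stemmer class (hypothetical name): reduces a word to a stem."""

    @abstractmethod
    def stem(self, word: str) -> str:
        ...


class SimpleStemmer(Stemmer):
    """Naive suffix stripper, for illustration only."""

    # Assumed suffix list; checked in order, first match wins.
    SUFFIXES = ("ing", "ed", "es", "s")

    def stem(self, word: str) -> str:
        for suffix in self.SUFFIXES:
            # Keep at least three characters so short words stay intact.
            if word.endswith(suffix) and len(word) - len(suffix) >= 3:
                return word[: -len(suffix)]
        return word


class Analyzer:
    """Splits a string into tokens, drops meaningless parts, counts the rest."""

    # Assumed stop-word set standing in for "string parts which have no meaning".
    DEFAULT_STOP_WORDS = frozenset({"a", "an", "and", "of", "the"})

    def __init__(self, stemmer: Stemmer, stop_words=DEFAULT_STOP_WORDS):
        self.stemmer = stemmer
        self.stop_words = stop_words

    def analyze(self, text: str):
        # Tokenize on runs of lowercase letters and digits.
        tokens = re.findall(r"[a-z0-9]+", text.lower())
        # Throw out tokens that carry no meaning.
        kept = [t for t in tokens if t not in self.stop_words]
        # Count occurrences per stem (the input a scorer would need).
        counts = Counter(self.stemmer.stem(t) for t in kept)
        # Return a sorted list of (stem, occurrences) pairs.
        return sorted(counts.items())
```

For example, `Analyzer(SimpleStemmer()).analyze("The index indexes the indexed strings")` returns `[("index", 3), ("string", 1)]`. Returning stem/count pairs keeps the per-document term frequencies available for the scoring items further down the TODO.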