3 Commits

Author SHA1 Message Date
Jan Steemann d2aef2dcdc add words to fulltext index
This parses the fulltext-indexed attributes of documents when a fulltext index exists, and adds the individual words to the index.
As the fulltext index is case-sensitive, all words are added to the index in lower case.
The text tokenisation implementation is still very naive and currently works properly only for character ranges [a-z] and [A-Z].
Unicode words are also supported, but they are neither normalised nor lower-cased yet. Additionally, Unicode punctuation characters are not excluded and will also be added to the index.
Updating documents that are fulltext-indexed currently does not work.
2012-12-02 00:55:59 +01:00
Frank Celler f126016484 added ExtractShapedJsonVocShaper 2012-07-16 15:42:41 +02:00
Frank Celler d2c758d663 the great rename 2012-06-08 15:01:25 +02:00
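
The tokenisation behaviour described in commit d2aef2dcdc can be illustrated with a rough sketch. This is not the code from the commit: the function name `tokenize` is hypothetical, and the treatment of digits and other ASCII characters as word separators is an assumption, since the message only says that [a-z] and [A-Z] are handled properly. Non-ASCII characters are kept unchanged, so they are neither lower-cased nor filtered for punctuation, as the message describes.

```python
def tokenize(text):
    """Split text into index words (hypothetical sketch, not the commit's code).

    ASCII letters form words and are lower-cased. Non-ASCII characters are
    kept as-is: no normalisation, no lower-casing, no punctuation filtering.
    Every other ASCII character (digits included) is assumed to be a separator.
    """
    words = []
    current = []
    for ch in text:
        if "A" <= ch <= "Z":
            # Lower-case ASCII upper-case letters only.
            current.append(chr(ord(ch) + 32))
        elif "a" <= ch <= "z" or ord(ch) >= 128:
            # ASCII lower-case letters and all non-ASCII characters pass through.
            current.append(ch)
        else:
            # Assumed separator: finish the current word, if any.
            if current:
                words.append("".join(current))
                current = []
    if current:
        words.append("".join(current))
    return words
```

With this sketch, `tokenize("Hello, World")` yields `["hello", "world"]`, while `tokenize("Über » Straße")` yields `["Über", "»", "straße"]`: the non-ASCII `Ü` keeps its case and the punctuation character `»` ends up as a word of its own, which mirrors the limitations listed in the commit message.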