Lucene by Example: Specifying Analyzers on a per-field-basis and writing a custom Analyzer/Tokenizer

Lucene is my favourite search engine library, and the more often I use it in my projects, the more features and functionality I find that were unknown to me. In the following tutorial I’d like to share two of those features: on the one hand, the possibility to specify different analyzers on a per-field basis, and on the other hand, the API to create a simple character-based tokenizer and analyzer within a few steps. ...

July 6, 2014 · 7 min · 1468 words · Micha Kops

Extending the Confluence Search Index

When developing plugins for the Confluence wiki, a developer sometimes needs to save additional metadata to a page object using Bandana or the ContentPropertyManager. Wouldn’t it be nice if this metadata were available in the built-in Lucene index? That is where the Confluence Extractor Module comes into play. Overview: An extractor allows the developer to add new fields to the Lucene search index. Creating a new extractor is quite simple: just implement the interface com.atlassian.bonnie.search.Extractor, or bucket.search.lucene.extractor.BaseAttachmentContentExtractor if you want to build a new file extractor. ...

May 23, 2010 · 4 min · 713 words · Micha Kops