Posts Tagged ‘search’
Sunday, December 2nd, 2012
If you want to extract metadata or content from a file, be it an office document, a spreadsheet, an MP3 or an image, or you would like to detect the content type of a given file, then Apache Tika might be a helpful tool for you.
Apache Tika supports a variety of document formats and has a nice, extensible parser and detection API with a lot of built-in parsers available.
(more…)
Tags: Apache, content extraction, formats, Java, lucene, maven, parser, search, tika
Posted in Java | No Comments »
Saturday, September 8th, 2012
In Lucene 4.x there is an API to fetch index statistics for specific document fields.
The following example shows how to create an index with some random documents and fetch some statistics for a field afterwards ...
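To make clear what such per-field statistics measure, here is a minimal plain-Java sketch. It does not use the Lucene API. The class `FieldStatsSketch`, the whitespace tokenizer and the sample values are made up for illustration. The statistic names follow the ones Lucene 4.x reports on a field's terms (`docCount`, `sumDocFreq`, `sumTotalTermFreq`); which of them the full post fetches is an assumption.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

/** Conceptual sketch only: computes field statistics by hand, without Lucene. */
public class FieldStatsSketch {

    /** Holder for the statistics of a single field. */
    public static final class FieldStats {
        public final int docCount;          // documents with at least one token in the field
        public final long sumDocFreq;       // sum over all terms of "documents containing the term"
        public final long sumTotalTermFreq; // total number of tokens in the field
        public final int uniqueTerms;       // number of distinct terms in the field

        FieldStats(int docCount, long sumDocFreq, long sumTotalTermFreq, int uniqueTerms) {
            this.docCount = docCount;
            this.sumDocFreq = sumDocFreq;
            this.sumTotalTermFreq = sumTotalTermFreq;
            this.uniqueTerms = uniqueTerms;
        }
    }

    /** Each list entry is the value of the field in one document (hypothetical input). */
    public static FieldStats compute(List<String> fieldValues) {
        int docCount = 0;
        long sumTotalTermFreq = 0;
        Map<String, Integer> docFreq = new HashMap<>();

        for (String value : fieldValues) {
            String text = value == null ? "" : value.trim().toLowerCase(Locale.ROOT);
            if (text.isEmpty()) {
                continue; // a document without tokens does not count for this field
            }
            String[] tokens = text.split("\\s+"); // naive whitespace "analyzer"
            docCount++;
            sumTotalTermFreq += tokens.length;

            // Every distinct term counts once per document.
            Set<String> distinct = new HashSet<>(Arrays.asList(tokens));
            for (String term : distinct) {
                docFreq.merge(term, 1, Integer::sum);
            }
        }

        long sumDocFreq = 0;
        for (int df : docFreq.values()) {
            sumDocFreq += df;
        }
        return new FieldStats(docCount, sumDocFreq, sumTotalTermFreq, docFreq.size());
    }

    public static void main(String[] args) {
        FieldStats stats = compute(Arrays.asList("lucene in action", "lucene lucene search", ""));
        System.out.println("docCount=" + stats.docCount
                + " sumDocFreq=" + stats.sumDocFreq
                + " sumTotalTermFreq=" + stats.sumTotalTermFreq
                + " uniqueTerms=" + stats.uniqueTerms);
    }
}
```

In a real index these numbers come from the index itself rather than from re-tokenizing the documents, so the sketch only shows what is being counted.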
(more…)
Tags: analyzer, document, field, indexer, lucene, search, stats
Posted in Java | No Comments »
Tuesday, August 28th, 2012
The latest snippet from my Lucene examples demonstrates how to implement a faceted search using the Lucene 4.0 API and how easy it is to define multiple category paths to aggregate search results for different possible facets.
In the following example we index some books, a classic example, and then create multiple category paths for author, publication date and category ...
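Before looking at the Lucene facet API, it may help to see what a faceted search aggregates. The following is a plain-Java sketch and not the Lucene API. The class `FacetCountSketch`, the dimension names and the sample books are hypothetical. It counts, for a list of hits, how many books fall under each value of the author, publication date and category dimensions.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Conceptual sketch only: counts facet values by hand, without Lucene. */
public class FacetCountSketch {

    /** A book with the three facet dimensions used in this sketch. */
    public static final class Book {
        public final String title;
        public final String author;
        public final String year;
        public final String category;

        public Book(String title, String author, String year, String category) {
            this.title = title;
            this.author = author;
            this.year = year;
            this.category = category;
        }
    }

    /** Returns dimension -> (value -> number of hits carrying that value). */
    public static Map<String, Map<String, Integer>> countFacets(List<Book> hits) {
        Map<String, Map<String, Integer>> counts = new TreeMap<>();
        for (Book book : hits) {
            increment(counts, "author", book.author);
            increment(counts, "pubdate", book.year);
            increment(counts, "category", book.category);
        }
        return counts;
    }

    private static void increment(Map<String, Map<String, Integer>> counts,
                                  String dimension, String value) {
        counts.computeIfAbsent(dimension, key -> new TreeMap<>())
              .merge(value, 1, Integer::sum);
    }

    public static void main(String[] args) {
        // Made-up sample data standing in for the search hits.
        List<Book> hits = Arrays.asList(
                new Book("Book 1", "Author A", "2010", "novel"),
                new Book("Book 2", "Author A", "2012", "tutorial"),
                new Book("Book 3", "Author B", "2012", "novel"));
        System.out.println(countFacets(hits));
    }
}
```

A facet engine does the same aggregation at search time, typically with the category paths stored at indexing time so that the hits do not have to be loaded to count them.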
(more…)
Tags: analyzer, Api, facet, faceting, indexer, lucene, maven, sbt, search, taxonomy
Posted in Java | 1 Comment »
Monday, March 26th, 2012
In today’s tutorial we’re exploring the world of faceted searches like the ones we’re used to seeing when we search for an item on Amazon.com or other websites. We’re using Hibernate Search, which offers an API to perform discrete as well as range faceted searches on our persisted data.
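A discrete facet counts hits per distinct value, while a range facet counts hits per interval. The difference is easiest to see in a small plain-Java sketch of range counting. This is not the Hibernate Search API. The class `RangeFacetSketch`, the price values and the range labels are assumptions chosen for illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Conceptual sketch only: counts values per range by hand, without Hibernate Search. */
public class RangeFacetSketch {

    /** A half-open interval [fromInclusive, toExclusive) with a display label. */
    public static final class Range {
        public final String label;
        public final double fromInclusive;
        public final double toExclusive;

        public Range(String label, double fromInclusive, double toExclusive) {
            this.label = label;
            this.fromInclusive = fromInclusive;
            this.toExclusive = toExclusive;
        }
    }

    /** Returns label -> number of values inside that range, in the order the ranges were given. */
    public static Map<String, Integer> countByRange(List<Double> values, List<Range> ranges) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (Range range : ranges) {
            counts.put(range.label, 0); // empty ranges still show up with a count of 0
        }
        for (double value : values) {
            for (Range range : ranges) {
                if (value >= range.fromInclusive && value < range.toExclusive) {
                    counts.merge(range.label, 1, Integer::sum);
                    break; // the ranges are assumed not to overlap
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up prices and ranges.
        List<Double> prices = Arrays.asList(5.0, 12.5, 19.99, 20.0, 45.0);
        List<Range> ranges = Arrays.asList(
                new Range("below 10", Double.NEGATIVE_INFINITY, 10.0),
                new Range("10 to 20", 10.0, 20.0),
                new Range("20 and above", 20.0, Double.POSITIVE_INFINITY));
        System.out.println(countByRange(prices, ranges));
    }
}
```

Whether a boundary value belongs to the lower or the upper range is a design choice. The sketch puts it into the upper one.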
(more…)
Tags: discrete, entity, example, facet, hibernate, jboss, jpa, lucene, persistence, range, search, tutorial
Posted in Java | No Comments »
Sunday, May 23rd, 2010
When developing plugins for the Confluence Wiki, a developer sometimes needs to save additional metadata to a page object using Bandana or the ContentPropertyManager. Wouldn’t it be nice if this metadata were available in the built-in Lucene index?
That is where the Confluence Extractor Module comes into play ...
(more…)
Tags: Confluence, document, example, extractor, field, howto, indexer, lucene, luke, maven, plugin, search, tutorial
Posted in Confluence | 3 Comments »
Thursday, March 25th, 2010
Hello, today I want to post a small tutorial on a simple index and search operation using the Lucene indexer, with Maven for the project setup. (more…)
Tags: demo, document, indexer, lucene, maven, multi-field-search, search, snippets, solr, tutorial
Posted in Java | 1 Comment »