JPA Persistence and Lucene Indexing combined in Hibernate Search

February 5th, 2012 by

We often write applications whose entities have to be persisted in a relational database, using a standard like the Java Persistence API (JPA) with a provider such as Hibernate ORM or EclipseLink.

At the same time, those entities and their fields are often indexed by a fast full-text search engine like Lucene. This raises a number of common problems: keeping both data stores synchronized, handling special content mapped in an entity (an office document, for example), and so on.

Hibernate Search makes all of this a lot easier, as we will hopefully see in the following short tutorial.


 

Prerequisites

You need the following setup to run the samples below.

The Software Stack

In this tutorial I am using the Java Persistence API as the abstraction over the persistence layer, with Hibernate 4.0.1 as the persistence provider and, for the sake of laziness, HSQLDB as the database.

Hibernate Search 4.0.0 is added on top and brings Lucene 3.4.0 with it.

Figure: Architecture Overview

Maven Dependencies

Only three dependencies are needed here: hibernate-search, hibernate-entitymanager and hsqldb. My final pom.xml looks like this:

<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>com.hascode.tutorial</groupId>
<artifactId>hibernate-lucene-tutorial</artifactId>
<version>0.0.1-SNAPSHOT</version>
<name>hasCode.com Hibernate Lucene Tutorial</name>
<properties>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
</properties>
<dependencies>
    <dependency>
        <groupId>org.hibernate</groupId>
        <artifactId>hibernate-search</artifactId>
        <version>4.0.0.Final</version>
    </dependency>
    <dependency>
        <groupId>org.hibernate</groupId>
        <artifactId>hibernate-entitymanager</artifactId>
        <version>4.0.1.Final</version>
    </dependency>
    <dependency>
        <groupId>org.hsqldb</groupId>
        <artifactId>hsqldb</artifactId>
        <version>2.2.8</version>
    </dependency>
</dependencies>
<build>
    <plugins>
        <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-compiler-plugin</artifactId>
            <configuration>
                <source>1.6</source>
                <target>1.6</target>
            </configuration>
        </plugin>
    </plugins>
</build>
<repositories>
    <repository>
        <id>maven2-repository.dev.java.net</id>
        <name>Java.net Repository for Maven</name>
        <url>http://download.java.net/maven/2/</url>
        <layout>default</layout>
    </repository>
</repositories>
</project>

A Book Search Example

Now that we have all the dependencies we need, let's build a simple proof of concept: an application that persists some books and afterwards searches them by a given criterion in both possible ways, via a Lucene query and via a JPQL query.

Adjusting JPA and Hibernate

Create a new file named persistence.xml in the directory src/main/resources/META-INF to adjust the JPA persistence settings. For more detailed information about the available Hibernate Search configuration parameters, please take a look at the corresponding chapter of the official user reference.

<persistence xmlns="http://java.sun.com/xml/ns/persistence" version="1.0">
<persistence-unit name="hascode-local" transaction-type="RESOURCE_LOCAL">
    <provider>org.hibernate.ejb.HibernatePersistence</provider>
    <class>com.hascode.tutorial.Book</class>
    <properties>
        <property name="javax.persistence.jdbc.driver" value="org.hsqldb.jdbcDriver"/>
        <property name="javax.persistence.jdbc.url" value="jdbc:hsqldb:mem:testdb"/>
        <property name="javax.persistence.jdbc.user" value="sa"/>
        <property name="javax.persistence.jdbc.password" value=""/>
        <property name="hibernate.dialect" value="org.hibernate.dialect.HSQLDialect"/>
        <property name="hibernate.hbm2ddl.auto" value="update"/>
    </properties>
</persistence-unit>
</persistence>

Defining an Entity

Create a new class named Book, map it with the usual JPA annotations, and afterwards configure the indexing with the annotations from the org.hibernate.search.annotations package.

Figure: The book entity

The basic annotations used here are @Indexed, which declares that an entity should be indexed, and @Field, which adjusts the index settings for a specific field.

More details on the various ways to adjust the Hibernate Search mapping can be found in the corresponding chapter of the official documentation.

package com.hascode.tutorial;

import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.Id;
import javax.persistence.Lob;

import org.hibernate.search.annotations.Analyze;
import org.hibernate.search.annotations.Field;
import org.hibernate.search.annotations.Indexed;
import org.hibernate.search.annotations.Store;

@Entity
@Indexed(index = "indexes/books")
public class Book {
    private Long id;
    private String title;
    private String summary;
    private String author;

    @Id
    @GeneratedValue
    public Long getId() {
        return id;
    }

    // analyzed: the title is tokenized, so single words are searchable
    @Field(name = "title", analyze = Analyze.YES, store = Store.YES)
    public String getTitle() {
        return title;
    }

    @Lob
    @Field(name = "summary", analyze = Analyze.YES, store = Store.YES)
    public String getSummary() {
        return summary;
    }

    // not analyzed: the author is indexed as one exact term
    @Field(name = "author", analyze = Analyze.NO, store = Store.YES)
    public String getAuthor() {
        return author;
    }

    public void setId(final Long id) {
        this.id = id;
    }

    public void setTitle(final String title) {
        this.title = title;
    }

    public void setSummary(final String summary) {
        this.summary = summary;
    }

    public void setAuthor(final String author) {
        this.author = author;
    }
}
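
To make the difference between Analyze.YES and Analyze.NO a bit more tangible, here is a minimal pure-Java sketch of what happens to a field value at indexing time. It is not Lucene's analyzer code: the class AnalyzeSketch and its two methods are made up for illustration, and a real analyzer may do more than this (dropping common stop words, for example).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Simplified, hypothetical model of how a field value becomes index terms.
// None of these names exist in Lucene or Hibernate Search.
class AnalyzeSketch {

    // Roughly what Analyze.YES does: split into words and lowercase each one.
    static List<String> analyzedTerms(String value) {
        List<String> terms = new ArrayList<String>();
        for (String token : value.split("[^\\p{L}\\p{N}]+")) {
            if (token.length() > 0) {
                terms.add(token.toLowerCase(Locale.ROOT));
            }
        }
        return terms;
    }

    // Analyze.NO: the whole value is kept as one single, unmodified term.
    static List<String> unanalyzedTerms(String value) {
        List<String> terms = new ArrayList<String>();
        terms.add(value);
        return terms;
    }

    public static void main(String[] args) {
        System.out.println(analyzedTerms("My life"));   // prints [my, life]
        System.out.println(unanalyzedTerms("My life")); // prints [My life]
    }
}
```

The practical consequence: a term query against a field that is not analyzed, like author, only matches if the search term equals the indexed value exactly, including case. A query against an analyzed field like title can match a single lowercase word of it.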

Filling Database and Index and Searching both

Now we’re ready to persist and index some of the entities created above.

As you can see, in the first step we initialize the usual JPA context: the EntityManagerFactory, the EntityManager and the transaction. Then we create three book entities and persist them.

Afterwards we search for all books by the author named “fred”, first using a Lucene TermQuery and second using a JPQL query.

package com.hascode.tutorial;

import java.util.List;

import javax.persistence.EntityManager;
import javax.persistence.EntityManagerFactory;
import javax.persistence.EntityTransaction;
import javax.persistence.Persistence;
import javax.persistence.Query;

import org.apache.lucene.index.Term;
import org.apache.lucene.search.TermQuery;
import org.hibernate.search.jpa.FullTextEntityManager;

public class Library {
    @SuppressWarnings("unchecked")
    public static void main(final String... args) {
        // creating persistence context
        final EntityManagerFactory emf = Persistence
                .createEntityManagerFactory("hascode-local");
        final EntityManager em = emf.createEntityManager();
        final EntityTransaction tx = em.getTransaction();

        tx.begin();
        // creating some books to be indexed and persisted
        Book book1 = new Book();
        book1.setTitle("The big book of nothing");
        book1.setSummary("This is a book without any content");
        book1.setAuthor("fred");

        Book book2 = new Book();
        book2.setTitle("Exciting stories I");
        book2.setSummary("A compilation of exciting stories - part 1.");
        book2.setAuthor("selma");

        Book book3 = new Book();
        book3.setTitle("My life");
        book3.setSummary("A book about Fred's life.");
        book3.setAuthor("fred");

        em.persist(book1);
        em.persist(book2);
        em.persist(book3);
        tx.commit();

        // search using lucene
        FullTextEntityManager fullTextEntityManager = org.hibernate.search.jpa.Search
                .getFullTextEntityManager(em);

        // exact match on the author field, which is indexed without analysis
        org.apache.lucene.search.Query authorQuery = new TermQuery(new Term(
                "author", "fred"));
        javax.persistence.Query fullTextQuery = fullTextEntityManager
                .createFullTextQuery(authorQuery);

        System.out.println("searching using lucene..");
        List<Book> result = fullTextQuery.getResultList();
        printResults(result);

        // oldschool JPA search
        Query query = em
                .createQuery("SELECT b FROM Book b WHERE b.author=:author");
        query.setParameter("author", "fred");
        System.out.println("searching using JPA/JPQL..");
        result = query.getResultList();
        printResults(result);

        em.close();
        emf.close();
    }

    private static void printResults(final List<Book> result) {
        System.out.println(String.format("%s items found for author:fred",
                result.size()));
        for (Book b : result) {
            System.out.println("title: " + b.getTitle() + ", summary: "
                    + b.getSummary() + "(id: " + b.getId() + ")");
        }
    }
}

Running this code, you should see the following output:

searching using lucene..
2 items found for author:fred
title: The big book of nothing, summary: This is a book without any content(id: 1)
title: My life, summary: A book about Fred's life.(id: 3)
searching using JPA/JPQL..
2 items found for author:fred
title: The big book of nothing, summary: This is a book without any content(id: 1)
title: My life, summary: A book about Fred's life.(id: 3)
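
Why do both searches return the same books? A rough mental model, sketched here in plain Java: every persisted row also gets an entry in an index that maps a term to the ids of the matching rows. The index search looks the term up and gets ids back, while the database search compares the column value row by row. All names in this sketch (TwoSearchPathsSketch, persist, searchIndex, searchTable) are made up for illustration and are not part of Hibernate Search, Lucene or JPA. The real frameworks do considerably more.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-ins for a database table and a search index.
class TwoSearchPathsSketch {

    // "Database table": id -> author column value.
    private final Map<Long, String> authorById = new LinkedHashMap<Long, String>();
    // "Index": author term -> ids of the rows containing that term.
    private final Map<String, List<Long>> idsByAuthorTerm = new HashMap<String, List<Long>>();

    // Writing a row updates both structures, so they stay in step.
    void persist(long id, String author) {
        authorById.put(id, author);
        List<Long> ids = idsByAuthorTerm.get(author);
        if (ids == null) {
            ids = new ArrayList<Long>();
            idsByAuthorTerm.put(author, ids);
        }
        ids.add(id);
    }

    // Index path: one lookup by term, the result is a list of ids.
    List<Long> searchIndex(String term) {
        List<Long> ids = idsByAuthorTerm.get(term);
        return ids == null ? new ArrayList<Long>() : new ArrayList<Long>(ids);
    }

    // Database path: compare the author value of every row.
    List<Long> searchTable(String author) {
        List<Long> ids = new ArrayList<Long>();
        for (Map.Entry<Long, String> row : authorById.entrySet()) {
            if (row.getValue().equals(author)) {
                ids.add(row.getKey());
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        TwoSearchPathsSketch library = new TwoSearchPathsSketch();
        library.persist(1L, "fred");
        library.persist(2L, "selma");
        library.persist(3L, "fred");
        System.out.println(library.searchIndex("fred")); // prints [1, 3]
        System.out.println(library.searchTable("fred")); // prints [1, 3]
    }
}
```

Both paths arrive at ids 1 and 3, which matches the output of the tutorial code. The point of having the index is that the lookup by term does not get slower in the same way a row-by-row comparison does, and that analyzed fields allow searches a plain equality check cannot express.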

Inspecting the Lucene Index

Last but not least, let's take a look at the created Lucene index. I am using Luke, the Lucene Index Toolbox, here.

Figure: Luke - Lucene Index Overview

Figure: Luke - Lucene Index Document View

Tutorial Sources

I have put the source from this tutorial on my Bitbucket repository – download it there or check it out using Mercurial:

hg clone https://bitbucket.org/hascode/hibernate-search-samples

Troubleshooting

  • “Caused by: java.lang.ClassNotFoundException: Could not load requested class : org.hibernate.search.event.FullTextIndexEventListener”: if you have followed an older tutorial and registered the three Hibernate event listeners in persistence.xml as shown in the following lines, just delete them. They are not needed anymore in newer versions of Hibernate Search.
    <property name="hibernate.ejb.event.post-update"
     value="org.hibernate.search.event.FullTextIndexEventListener" />
    <property name="hibernate.ejb.event.post-insert"
     value="org.hibernate.search.event.FullTextIndexEventListener" />
    <property name="hibernate.ejb.event.post-delete"
     value="org.hibernate.search.event.FullTextIndexEventListener" />
  • Make sure that no Hibernate Annotations version older than 3.3.x is referenced anywhere in your project. It is often pulled in by other dependencies and should be excluded if possible. Please take a look at the discussion in the Hibernate Community Forums.


2 Responses to “JPA Persistence and Lucene Indexing combined in Hibernate Search”

  1. Tom Says:

    Great introduction. Could you give more advanced example with full-text-search and document parsing(XML, PDF etc.) ?

  2. micha kops Says:

    sure! it is indeed a good idea for a follow up tutorial :)
    In retrospect I could have used a WildcardQuery, PhraseQuery or FuzzyQuery instead of a simple TermQuery to demonstrate the advantage of Lucene in the example.
