A look at Google’s Protocol Buffers

July 6th, 2010 by
Protocol Buffers are a serialization format developed by Google. You might ask whether yet another IDL is really needed here: is Google barking up the wrong tree?
But Protocol Buffers offer some advantages over data serialization via XML or JSON. According to Google, compared to XML they:
  • are 3 to 10 times smaller
  • are 20 to 100 times faster
  • provide generated data access classes for programmatic use
  • provide backward compatibility
So let's play around a little with Protocol Buffers in Java and build a small application that serializes and deserializes some data using a defined format.

Small Java Example

We want to define a data structure, write it to a file and read it back in a small Java application. In this example we have a to-do list that contains several to-do objects; each to-do object has a mandatory title (string) and an optional priority (enum).

  1. Download and install the protobuf compiler from the download site. Debian/Ubuntu users can simply run
    sudo apt-get install protobuf-compiler
  2. Otherwise download the desired sources, unpack and run
    ./configure
    make
    make install
  3. Create a new Maven project e.g. by using the Eclipse Maven Plug-in or via console
    mvn archetype:create -DgroupId=com.hascode.examples.protocol-buffer -DartifactId=protocol-buffer-sample
    # or better
    mvn archetype:generate
  4. Create a directory src/main/protobuf
  5. Create a protocol buffers definition in this directory in a file named todo_provider.proto. More detailed information about the syntax of the interface definition language can be found on the Protocol Buffers homepage.
    package examples;
     
    option java_package = "com.hascode.examples.protocol_buffer";
    option java_outer_classname = "TodoProvider";
     
    message Todo {
    	required string title = 1;
    	enum Priority {
    		NORMAL = 1;
    		MEDIUM = 2;
    		HIGH = 3;
    	}
     
    	optional Priority priority = 2;
    }
     
    message Todos {
    	repeated Todo todos = 1;
    }
  6. Compile the definition via protoc --java_out=src/main/java src/main/protobuf/todo_provider.proto or add the following Ant task to your pom.xml
    <build>
    	<plugins>
    		<plugin>
    			<artifactId>maven-antrun-plugin</artifactId>
    			<executions>
    				<execution>
    					<id>generate-sources</id>
    					<phase>generate-sources</phase>
    					<configuration>
    						<tasks>
    							<exec executable="protoc">
    								<arg value="--java_out=src/main/java" />
    								<arg value="src/main/protobuf/todo_provider.proto" />
    							</exec>
    						</tasks>
    						<sourceRoot>src/main/java</sourceRoot>
    					</configuration>
    					<goals>
    						<goal>run</goal>
    					</goals>
    				</execution>
    			</executions>
    		</plugin>
    	</plugins>
    </build>
  7. Now there is a new class named TodoProvider that provides access to the inner classes Todos and Todo.
    Take a look at the methods generated for the Todo and Todos classes: there are integrated builders as well as methods to read and write Todo/Todos objects from and to streams.
    The Todos class, which contains several Todo objects, also offers methods to retrieve the list of todos, query the number of todos or fetch a todo at a given index.
    In addition, the generated classes provide methods to test whether a field is set (hasXXX, e.g. Todo -> hasTitle()) and to get the serialized size of the object, which is very handy imho.
  8. Add the dependency for protobuf-java to your pom.xml
    <dependency>
    	<groupId>com.google.protobuf</groupId>
    	<artifactId>protobuf-java</artifactId>
    	<version>2.3.0</version>
    </dependency>
  9. Finally, my pom.xml looks like this
    <project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    	xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
    	<modelVersion>4.0.0</modelVersion>
    	<groupId>com.hascode.examples.protocol-buffer</groupId>
    	<artifactId>protocol-buffer-sample</artifactId>
    	<version>0.0.1-SNAPSHOT</version>
    	<dependencies>
    		<dependency>
    			<groupId>com.google.protobuf</groupId>
    			<artifactId>protobuf-java</artifactId>
    			<version>2.3.0</version>
    			<scope>compile</scope>
    		</dependency>
    	</dependencies>
    	<build>
    		<plugins>
    			<plugin>
    				<artifactId>maven-antrun-plugin</artifactId>
    				<executions>
    					<execution>
    						<id>generate-sources</id>
    						<phase>generate-sources</phase>
    						<configuration>
    							<tasks>
    								<exec executable="protoc">
    									<arg value="--java_out=src/main/java" />
    									<arg value="src/main/protobuf/todo_provider.proto" />
    								</exec>
    							</tasks>
    							<sourceRoot>src/main/java</sourceRoot>
    						</configuration>
    						<goals>
    							<goal>run</goal>
    						</goals>
    					</execution>
    				</executions>
    			</plugin>
    			<plugin>
    				<groupId>org.apache.maven.plugins</groupId>
    				<artifactId>maven-compiler-plugin</artifactId>
    				<version>2.0.2</version>
    				<configuration>
    					<source>1.6</source>
    					<target>1.6</target>
    				</configuration>
    			</plugin>
    		</plugins>
    	</build>
    </project>
  10. Create a Java class to write/read data using protocol buffers
    package com.hascode.examples.protocol_buffer;
     
    import java.io.FileInputStream;
    import java.io.FileOutputStream;
    import java.io.IOException;
     
    import com.hascode.examples.protocol_buffer.TodoProvider.Todo;
    import com.hascode.examples.protocol_buffer.TodoProvider.Todos;
    import com.hascode.examples.protocol_buffer.TodoProvider.Todo.Priority;
     
    public class Main {
    	public static void main(String[] args) throws IOException {
    		// write some todos to file
    		Todo todo1 = Todo.newBuilder()
    						  .setTitle("Do the laundry")
    						  .setPriority(Priority.MEDIUM).build();
    		Todo todo2 = Todo.newBuilder()
    						  .setTitle("Write the tutorial")
    						  .setPriority(Priority.HIGH).build();
    		Todos todos = Todos.newBuilder()
    							.addTodos(todo1)
    							.addTodos(todo2)
    							.build();
     
    		FileOutputStream os = new FileOutputStream("/tmp/todo.data");
    		todos.writeTo(os);
    		os.close();
     
    		// read todos from file
    		FileInputStream is = new FileInputStream("/tmp/todo.data");
    		Todos newTodos = Todos.parseFrom(is);
    		is.close();
    		for (Todo todo : newTodos.getTodosList()) {
    			System.out.println("Reading todo - title: " + todo.getTitle()
    			+ " priority: " + todo.getPriority());
    		}
    	}
    }
  11. Run the application – you should see this output
    Reading todo - title: Do the laundry priority: MEDIUM
    Reading todo - title: Write the tutorial priority: HIGH
  12. Take a look at the created binary file todo.data in /tmp: it’s only 44 bytes “big”!
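
The 44 bytes can be reproduced by hand. The following is a minimal sketch, assuming the standard Protocol Buffers wire format (varint keys of the form (field number << 3) | wire type, length-prefixed strings and embedded messages). It does not use the generated classes, and the class WireSizeSketch and its helper methods are hypothetical names chosen for illustration only.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the tutorial sources: encodes the two
// todos by hand to show where the bytes of todo.data come from.
class WireSizeSketch {
    // Writes an unsigned varint: 7 payload bits per byte,
    // high bit set on every byte except the last one.
    static void writeVarint(ByteArrayOutputStream out, int value) {
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        out.write(value);
    }

    // Writes a field key: (field number << 3) | wire type.
    static void writeKey(ByteArrayOutputStream out, int fieldNumber, int wireType) {
        writeVarint(out, (fieldNumber << 3) | wireType);
    }

    // Encodes one Todo: title is field 1 (wire type 2, length-delimited),
    // priority is field 2 (wire type 0, varint).
    static byte[] encodeTodo(String title, int priority) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] utf8 = title.getBytes(StandardCharsets.UTF_8);
        writeKey(out, 1, 2);
        writeVarint(out, utf8.length);
        out.write(utf8, 0, utf8.length);
        writeKey(out, 2, 0);
        writeVarint(out, priority);
        return out.toByteArray();
    }

    // Encodes Todos: every Todo is an embedded message in repeated field 1.
    static byte[] encodeTodos(byte[]... todos) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (byte[] todo : todos) {
            writeKey(out, 1, 2);
            writeVarint(out, todo.length);
            out.write(todo, 0, todo.length);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] data = encodeTodos(
                encodeTodo("Do the laundry", 2),      // MEDIUM = 2
                encodeTodo("Write the tutorial", 3)); // HIGH = 3
        System.out.println(data.length + " bytes");
    }
}
```

Each todo costs its title length plus four bytes (two keys, one length prefix, one enum value), and the enclosing list adds two more bytes per entry: (14 + 4 + 2) + (18 + 4 + 2) = 44.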

Downsides

Of course there are also downsides, e.g. the lack of inheritance support and the absence of RMI-style plain objects (the generated objects come with builders and programmatic accessor methods). Debugging XML is also easier than debugging protocol buffers, because the binary format is not human-readable.
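
To illustrate the debugging point, here is a minimal sketch of a wire-format dumper, assuming the standard Protocol Buffers encoding and handling only the two wire types used in this tutorial (varint and length-delimited). The class WireDumpSketch and its methods are hypothetical and not part of the protobuf library. Without the .proto file, such a tool sees only field numbers, and it cannot tell a string from an embedded message.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper: prints the top-level fields of a serialized message
// without knowing its schema.
class WireDumpSketch {
    private final byte[] data;
    private int pos = 0;

    WireDumpSketch(byte[] data) {
        this.data = data;
    }

    // Reads an unsigned varint starting at the current position.
    private long readVarint() {
        long result = 0;
        int shift = 0;
        while (true) {
            int b = data[pos++] & 0xFF;
            result |= (long) (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                return result;
            }
            shift += 7;
        }
    }

    // Returns one line per field: field number plus the raw value.
    String dump() {
        StringBuilder sb = new StringBuilder();
        while (pos < data.length) {
            long key = readVarint();
            int field = (int) (key >>> 3);
            int wireType = (int) (key & 0x7);
            sb.append("field ").append(field);
            if (wireType == 0) {
                sb.append(" (varint) = ").append(readVarint());
            } else if (wireType == 2) {
                int len = (int) readVarint();
                // Guess: show the payload as text; it could also be a nested message.
                String text = new String(data, pos, len, StandardCharsets.UTF_8);
                pos += len;
                sb.append(" (").append(len).append(" bytes) = \"").append(text).append('"');
            } else {
                throw new IllegalArgumentException("wire type " + wireType + " not handled in this sketch");
            }
            sb.append('\n');
        }
        return sb.toString();
    }

    // Hand-written bytes of a single Todo: title "Do the laundry", priority MEDIUM (2).
    static byte[] sampleTodo() {
        byte[] title = "Do the laundry".getBytes(StandardCharsets.UTF_8);
        byte[] todo = new byte[title.length + 4];
        todo[0] = 0x0A;                // key: field 1, wire type 2
        todo[1] = (byte) title.length; // length prefix
        System.arraycopy(title, 0, todo, 2, title.length);
        todo[todo.length - 2] = 0x10;  // key: field 2, wire type 0
        todo[todo.length - 1] = 0x02;  // enum value 2
        return todo;
    }

    public static void main(String[] args) {
        System.out.print(new WireDumpSketch(sampleTodo()).dump());
    }
}
```

The dump shows "field 1" and "field 2" rather than "title" and "priority", and the enum appears as the bare number 2. An XML document would carry those names in the data itself.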

There is a nice article by Jorge M. Faleiro covering this topic. It’s definitely worth a look.

Tutorial Sources

I have put the sources for the examples on Bitbucket. You may download them there or check them out using

hg clone https://bitbucket.org/hascode/hascode-tutorials

Troubleshooting

  • Several errors plus “generics are not supported in -source 1.3 (use -source 5 or higher to enable generics)”: force the Maven compiler to use at least Java 5 by adding the following lines to your pom.xml
    <build>
    	<plugins>
    		[..]
    		<plugin>
    			<groupId>org.apache.maven.plugins</groupId>
    			<artifactId>maven-compiler-plugin</artifactId>
    			<version>2.0.2</version>
    			<configuration>
    				<source>1.5</source>
    				<target>1.5</target>
    			</configuration>
    		</plugin>
    	</plugins>
    </build>
  • “Cannot invoke buildParsed() on the primitive type boolean”: according to this issue, this happens when the version of your protoc compiler differs from the version of the Java library you’re using. Thanks to Matthew Smith for mentioning it! Just install the matching version from the download page. Quick run: protobuf-2.3.0/src/.libs/lt-protoc --java_out=src/main/java src/main/protobuf/todo_provider.proto
  • “I don’t want my classes in src/main/java to be overwritten by the Ant task”: just use the following Ant task. It creates a folder target/generated-sources and saves the generated classes into this folder
    <build>
    	<plugins>
    		<plugin>
    			<artifactId>maven-antrun-plugin</artifactId>
    			<executions>
    				<execution>
    					<id>generate-sources</id>
    					<phase>generate-sources</phase>
    					<configuration>
    						<tasks>
    							<mkdir dir="target/generated-sources" />
    							<exec executable="protoc">
    								<arg value="--java_out=target/generated-sources" />
    								<arg value="src/main/protobuf/todo_provider.proto" />
    							</exec>
    						</tasks>
    						<sourceRoot>target/generated-sources</sourceRoot>
    					</configuration>
    					<goals>
    						<goal>run</goal>
    					</goals>
    				</execution>
    			</executions>
    		</plugin>
    	</plugins>
    </build>

Alternatives

Another alternative for you might be Apache Avro. If you’re interested, please feel free to have a look at my blog article: “Using Apache Avro with Java and Maven”.

Article Updates

  • 2015-03-03: Table of contents and links to other articles added.

2 Responses to “A look at Google’s Protocol Buffers”

  1. Matthew Says:

    Apparently the buildParsed error is due to a mismatch in versions between your src and the protoc. I have tried installing versions 2.3 and 2.4 and they apparently both install the 2.2 protoc. If you can find a 2.3 or 2.4 protoc I would love to see it.

  2. micha kops Says:

    Thanks for your remark! I’ve the protoc version and the version in the tutorial libs to 2.3.0