In this post, we will walk through an example of Avro serialization and deserialization: creating an Avro data file (serializing data) and then deserializing the same file to read its contents back. This is a continuation of our previous post on Avro Schema, in which we defined a schema for an Employee record and compiled it with the avro-tools-1.7.4.jar file, which generated the Java code for the schema. In this post, we will cover the topics below.
- Serializing and Deserializing with Code generation
- Serializing and Deserializing without Code generation
In this section we will focus mainly on the Java API for both approaches.
With Code generation:
Let's create some employee records in an Avro data file with the help of the Employee_Record.java file generated in the example.avro package. Copy the lines of code below into a GenerateDataWithCode.java program in the example package. In Eclipse, these two programs go into their respective packages, example.avro and example.
In the above code we create employee records in three ways (calling setter methods, using the constructor, and via the Builder class). We then serialize these employee objects into an Avro data file with the help of the SpecificDatumWriter and DataFileWriter classes of the Avro library. Below are a few details about these classes.
- SpecificDatumWriter – The DatumWriter implementation for generated Java classes; it writes data that conforms to a schema. It implements the base interface DatumWriter, which converts Java objects into an in-memory serialized format.
- DataFileWriter – Stores a sequence of data conforming to a schema in a file. The schema is stored in the file along with the data, and every datum in a file has the same schema. Data is written with a DatumWriter and grouped into blocks. A synchronization marker is written between blocks so that files can be split, and blocks can be compressed. Extensible metadata is stored in the file header, together with the schema. Existing files may be appended to.
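To make the DataFileWriter behaviour described above more concrete, here is a minimal sketch that runs without any generated classes. It swaps in GenericDatumWriter and GenericRecord in place of SpecificDatumWriter and Employee_Record, and it uses a hypothetical two-field schema (name, age) that is only an assumption for illustration, not the Employee schema from the previous post. The class name DataFileWriterSketch and the output file name are likewise made up. It needs the Avro jar on the classpath.

```java
import java.io.File;
import java.io.IOException;
import org.apache.avro.Schema;
import org.apache.avro.file.CodecFactory;
import org.apache.avro.file.DataFileWriter;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;

class DataFileWriterSketch {

    // Hypothetical schema, used only for this sketch.
    static final String SCHEMA_JSON =
        "{\"type\":\"record\",\"name\":\"Employee\",\"namespace\":\"sketch\","
      + "\"fields\":[{\"name\":\"name\",\"type\":\"string\"},"
      + "{\"name\":\"age\",\"type\":\"int\"}]}";

    static File write(File out) throws IOException {
        Schema schema = new Schema.Parser().parse(SCHEMA_JSON);

        // The DatumWriter turns each record into Avro's binary encoding.
        GenericDatumWriter<GenericRecord> datumWriter =
            new GenericDatumWriter<>(schema);

        try (DataFileWriter<GenericRecord> fileWriter =
                 new DataFileWriter<>(datumWriter)) {
            // Optional block compression; must be set before create().
            fileWriter.setCodec(CodecFactory.deflateCodec(6));

            // create() writes the header, which carries the schema.
            fileWriter.create(schema, out);

            GenericRecord first = new GenericData.Record(schema);
            first.put("name", "Alice");
            first.put("age", 30);
            fileWriter.append(first);

            // Ends the current block and emits a synchronization marker.
            fileWriter.sync();

            GenericRecord second = new GenericData.Record(schema);
            second.put("name", "Bob");
            second.put("age", 41);
            fileWriter.append(second);
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        File out = write(new File("employees-sketch.avro"));
        System.out.println("Wrote " + out.length() + " bytes to " + out);
    }
}
```

With the generated class, the same flow should apply with SpecificDatumWriter<Employee_Record> as the DatumWriter and Employee_Record objects passed to append().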
Once the above program is placed in the correct package hierarchy and compiled, we can run it from within Eclipse. After the run, the employees.avro file appears in the Eclipse project folder.
Below is a snapshot of the project folder after running the above program in Eclipse.
So the Avro data file has now been created successfully.
Now let's read the Avro data file with the help of the program below, which uses the Employee_Record class to read the employee objects and print them on the console. Copy the lines of code below into a DeserializeWithCode.java program.
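As a rough illustration of the reading side, the sketch below writes a tiny file and then reads it back with DataFileReader. To stay runnable without the generated code, it swaps in GenericDatumReader and GenericRecord in place of the Employee_Record class, and its schema (name, age), class name and file name are assumptions made up for this example. It needs the Avro jar on the classpath.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import org.apache.avro.Schema;
import org.apache.avro.file.DataFileReader;
import org.apache.avro.file.DataFileWriter;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;

class DataFileReaderSketch {

    // Hypothetical schema, used only for this sketch.
    static final String SCHEMA_JSON =
        "{\"type\":\"record\",\"name\":\"Employee\",\"namespace\":\"sketch\","
      + "\"fields\":[{\"name\":\"name\",\"type\":\"string\"},"
      + "{\"name\":\"age\",\"type\":\"int\"}]}";

    // Writes two sample records so the reader has something to read.
    static void writeSample(File out) throws IOException {
        Schema schema = new Schema.Parser().parse(SCHEMA_JSON);
        try (DataFileWriter<GenericRecord> writer =
                 new DataFileWriter<>(new GenericDatumWriter<GenericRecord>(schema))) {
            writer.create(schema, out);
            String[] names = {"Alice", "Bob"};
            int[] ages = {30, 41};
            for (int i = 0; i < names.length; i++) {
                GenericRecord r = new GenericData.Record(schema);
                r.put("name", names[i]);
                r.put("age", ages[i]);
                writer.append(r);
            }
        }
    }

    // Reads every record and collects the "name" field.
    static List<String> readNames(File in) throws IOException {
        List<String> names = new ArrayList<>();
        // No schema is passed here: the reader uses the one stored in the file.
        GenericDatumReader<GenericRecord> datumReader = new GenericDatumReader<>();
        try (DataFileReader<GenericRecord> fileReader =
                 new DataFileReader<>(in, datumReader)) {
            for (GenericRecord record : fileReader) {
                // Avro strings come back as Utf8, so convert explicitly.
                names.add(record.get("name").toString());
            }
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        File file = new File("employees-read-sketch.avro");
        writeSample(file);
        for (String name : readNames(file)) {
            System.out.println(name);
        }
    }
}
```

With the generated class, the equivalent reader would presumably be a SpecificDatumReader<Employee_Record> passed to DataFileReader, with each iteration yielding an Employee_Record instead of a GenericRecord.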