Today we are announcing the release of the Microsoft Avro Library. This release is the result of a collaborative effort by multiple teams at Microsoft. It brings a complete and performant .NET implementation of the Avro serialization format to the Azure HDInsight Service and the open source community.
Download this release
The Avro Library is available from the NuGet Gallery. In Visual Studio, the package can be installed or updated through the NuGet Package Manager; in the Package Manager Console, use the following command:
Install-Package Microsoft.Hadoop.Avro
The source code is available under the Apache 2.0 license on CodePlex.
Apache Avro
Apache Avro provides a compact binary data serialization format similar to Thrift or Protocol Buffers. It has additional features that make it more suitable for distributed processing environments like Hadoop. The Avro Library implements the Apache Avro data serialization specification for the .NET environment.
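To give a feel for why the format is compact, here is a minimal Python sketch of how the Apache Avro specification encodes a few primitive values: integers use zig-zag, variable-length coding, strings are a length followed by UTF-8 bytes, and a record is simply its fields written back to back. The function names and the two-field "reading" record are made up for illustration; this shows the wire format only and is not the Avro Library's .NET API.

```python
def zigzag(n: int) -> int:
    """Map a signed 64-bit integer to an unsigned one so small magnitudes stay small."""
    return (n << 1) ^ (n >> 63)


def encode_long(n: int) -> bytes:
    """Encode an Avro int/long: zig-zag, then 7 bits per byte, low-order group first."""
    z = zigzag(n)
    out = bytearray()
    while z > 0x7F:
        out.append((z & 0x7F) | 0x80)  # high bit set: more bytes follow
        z >>= 7
    out.append(z)
    return bytes(out)


def encode_string(s: str) -> bytes:
    """Encode an Avro string: byte length as a long, then the UTF-8 bytes."""
    data = s.encode("utf-8")
    return encode_long(len(data)) + data


def encode_reading(sensor: str, value: int) -> bytes:
    """Encode a hypothetical record {sensor: string, value: long}.

    Fields are concatenated in schema order with no names or tags,
    which is why a reader needs the schema to decode the bytes.
    """
    return encode_string(sensor) + encode_long(value)


print(encode_reading("s1", 3).hex())
```

Because field names never appear in the payload, small records stay small; the cost is that writer and reader must agree on the schema.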
Supported Functionality
The Avro Library serializes arbitrary type structures by building an in-memory expression tree. This expression tree is compiled into IL code, which gives the resulting specialized serializer native performance.
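The underlying idea is "plan once, run many times": work out how to serialize a type up front, then reuse that specialized serializer for every object. The Python sketch below is only a loose analogy, since it composes closures rather than compiling an expression tree to IL, and every name in it (`build_serializer`, `ENCODERS`, the field list) is hypothetical rather than part of the library.

```python
from typing import Callable, Dict, List, Tuple


def encode_long(n: int) -> bytes:
    """Avro long: zig-zag then variable-length, 7 bits per byte."""
    z = (n << 1) ^ (n >> 63)
    out = bytearray()
    while z > 0x7F:
        out.append((z & 0x7F) | 0x80)
        z >>= 7
    out.append(z)
    return bytes(out)


def encode_string(s: str) -> bytes:
    """Avro string: byte length as a long, then UTF-8 bytes."""
    data = s.encode("utf-8")
    return encode_long(len(data)) + data


# Hypothetical lookup table from Avro type name to encoder.
ENCODERS: Dict[str, Callable] = {"long": encode_long, "string": encode_string}


def build_serializer(fields: List[Tuple[str, str]]) -> Callable[[dict], bytes]:
    """Resolve each field's encoder once and return a specialized serializer."""
    plan = [(name, ENCODERS[type_name]) for name, type_name in fields]

    def serialize(record: dict) -> bytes:
        # No per-call type inspection: just walk the precomputed plan.
        return b"".join(encode(record[name]) for name, encode in plan)

    return serialize


# Built once, then reused for every record of this shape.
serialize_reading = build_serializer([("sensor", "string"), ("value", "long")])
print(serialize_reading({"sensor": "s1", "value": 3}).hex())
```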
The library supports the following modes:
- Reflection mode. The IL code for the serializer is built based on the schema of .NET types to achieve maximum performance.
- Generic record mode. The JSON schema of the data can be specified at runtime, which makes it possible to handle dynamic data with an arbitrary schema.
- Container mode. The library can generate portable files with an embedded schema. The file format is compatible with the Avro container file specification and can be used across platforms.
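To illustrate what a schema supplied at runtime and a file with an embedded schema look like on the wire, here is a rough Python sketch that follows the Apache Avro object container file layout: a magic header, a metadata map carrying the JSON schema, a 16-byte sync marker, and then a data block. It is a simplification under stated assumptions: it handles only `string`, `int`, and `long` fields, writes a single uncompressed block, and uses hypothetical names (`write_container`, `SCHEMA_JSON`). It does not show the Avro Library's own .NET API.

```python
import io
import json
import os

MAGIC = b"Obj\x01"  # container file magic: 'O', 'b', 'j', version 1


def encode_long(n: int) -> bytes:
    """Avro long: zig-zag then variable-length, 7 bits per byte."""
    z = (n << 1) ^ (n >> 63)
    out = bytearray()
    while z > 0x7F:
        out.append((z & 0x7F) | 0x80)
        z >>= 7
    out.append(z)
    return bytes(out)


def encode_bytes(b: bytes) -> bytes:
    """Avro bytes: length as a long, then the raw bytes."""
    return encode_long(len(b)) + b


def encode_string(s: str) -> bytes:
    return encode_bytes(s.encode("utf-8"))


def encode_record(schema: dict, record: dict) -> bytes:
    """Encode a dict against a record schema that is only known at runtime."""
    out = bytearray()
    for field in schema["fields"]:
        value = record[field["name"]]
        if field["type"] == "string":
            out += encode_string(value)
        elif field["type"] in ("int", "long"):
            out += encode_long(value)
        else:
            raise ValueError("type not covered by this sketch: %r" % field["type"])
    return bytes(out)


def write_container(stream, schema_json: str, records: list, sync: bytes = None) -> None:
    """Write the records as one uncompressed block with the schema embedded."""
    schema = json.loads(schema_json)
    sync = sync or os.urandom(16)  # marker repeated after every block
    meta = {"avro.schema": schema_json.encode("utf-8"), "avro.codec": b"null"}

    stream.write(MAGIC)
    stream.write(encode_long(len(meta)))  # map block: number of entries
    for key, value in meta.items():
        stream.write(encode_string(key))
        stream.write(encode_bytes(value))
    stream.write(encode_long(0))  # end of map
    stream.write(sync)

    body = b"".join(encode_record(schema, r) for r in records)
    stream.write(encode_long(len(records)))  # object count in this block
    stream.write(encode_long(len(body)))  # block size in bytes
    stream.write(body)
    stream.write(sync)


# Hypothetical schema, passed around as plain JSON text.
SCHEMA_JSON = json.dumps({
    "type": "record",
    "name": "Reading",
    "fields": [
        {"name": "sensor", "type": "string"},
        {"name": "value", "type": "long"},
    ],
})

buf = io.BytesIO()
write_container(buf, SCHEMA_JSON, [{"sensor": "s1", "value": 3}])
print(len(buf.getvalue()), "bytes written")
```

Since the schema travels inside the file, any Avro implementation on any platform should be able to read the data back without out-of-band type information.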