Linux FUSE adapter for Blob Storage

Posted on November 6, 2017

Program Manager

Azure Blob Storage is used in cloud applications every day for its cost-efficiency, scale, and simplicity. However, many applications are still developed with a local file system in mind, and a large number of legacy applications remain in use. Modifying these applications to consume Blob storage through the REST APIs is often a challenge, even with the SDKs we provide. To address this, we are introducing the preview of the new FUSE adapter for Blob Storage, which enables you to mount a blob container on Linux platforms. There is no need to rewrite your application!

blobfuse – A virtual file system for Linux, backed by Azure Blob storage

Filesystem in Userspace (FUSE) is an interface on Linux that allows users to create their own file systems without writing kernel code. blobfuse implements the functions needed to communicate with this interface, and creates a virtual file system backed by Azure Blob storage. You can use blobfuse to mount a container as a non-privileged user on Linux, and access the same data you access through the REST APIs using the regular file system interface.
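To illustrate what "the regular file system interface" means for an application, here is a minimal Python sketch of code that knows nothing about Blob storage. The function name `summarize_logs` and the file names are hypothetical, and the demo uses a temporary directory as a stand-in for a mount point; on a real system you would pass the directory where the container is mounted instead.

```python
import tempfile
from pathlib import Path


def summarize_logs(data_dir: str) -> dict:
    """Count the lines in every .log file under data_dir using plain file I/O."""
    counts = {}
    for path in sorted(Path(data_dir).glob("*.log")):
        with path.open("r", encoding="utf-8") as handle:
            counts[path.name] = sum(1 for _ in handle)
    return counts


def demo() -> dict:
    # Assumption: a temporary directory stands in for the mount point here.
    # The application code above would be identical for a mounted container.
    with tempfile.TemporaryDirectory() as data_dir:
        Path(data_dir, "app.log").write_text("start\nstop\n", encoding="utf-8")
        Path(data_dir, "db.log").write_text("connect\n", encoding="utf-8")
        return summarize_logs(data_dir)


if __name__ == "__main__":
    print(demo())
```

The point of the sketch is that `summarize_logs` only takes a directory path, so nothing in it needs to change when that path is backed by a blob container.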

Here is a demo where I run a few simple operations with blobfuse, including transcoding a video with ffmpeg:

Features

Currently, we implement the following features in blobfuse:

  • Mount a Blob container on Linux
  • Basic file system operations such as mkdir, opendir, readdir, rmdir, open, read, create, write, close, unlink, truncate, stat, and rename
  • Local cache to improve subsequent file access times
  • Parallel download and upload of data to achieve higher throughput
  • Mounting the same container from multiple nodes. With blobfuse mounted on multiple nodes, you can take advantage of the increased throughput and IOPS limit on storage accounts while accessing the data with the regular file system APIs. (Note, however, that there is no synchronization between nodes on writes to Blob storage. See the Limitations section below for more information.)
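The basic file system operations listed above map onto ordinary standard-library calls. The following Python sketch is a hypothetical smoke test, not part of blobfuse: `exercise_basic_operations` and the `smoke-test` directory name are made up for illustration, and the function can be pointed at any writable directory, including a mount point.

```python
import os
import tempfile


def exercise_basic_operations(root: str) -> list:
    """Run the listed operations under root; return the names that completed."""
    done = []
    work_dir = os.path.join(root, "smoke-test")
    first = os.path.join(work_dir, "a.txt")
    second = os.path.join(work_dir, "b.txt")

    os.mkdir(work_dir)
    done.append("mkdir")

    with open(first, "wb") as handle:  # create, write, close
        handle.write(b"hello blob")
    done.append("create/write/close")

    with open(first, "rb") as handle:  # open, read
        assert handle.read() == b"hello blob"
    done.append("open/read")

    assert os.stat(first).st_size == 10
    done.append("stat")

    os.truncate(first, 5)
    assert os.stat(first).st_size == 5
    done.append("truncate")

    os.rename(first, second)
    done.append("rename")

    assert os.listdir(work_dir) == ["b.txt"]  # opendir, readdir
    done.append("opendir/readdir")

    os.unlink(second)
    done.append("unlink")

    os.rmdir(work_dir)
    done.append("rmdir")
    return done


if __name__ == "__main__":
    # A temporary directory is used here; pass a mount point to test a mount.
    with tempfile.TemporaryDirectory() as root:
        print(exercise_basic_operations(root))
```

If any step raises an exception, the traceback shows which operation failed, which can be a quick first check before running a larger application against a mounted container.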

Limitations

blobfuse’s primary usage scenario is to enable developers, data scientists, and system administrators to interact with blob storage using familiar file system operations. It can also help some legacy applications run on top of blob storage. Because blobfuse runs on top of an object store, it is important to understand that not all functionality behaves identically to a regular file system. Some examples include:

  • Unimplemented file system operations in blobfuse may break your existing application. Symbolic links, file permissions, flags, and POSIX file locking operations are currently not implemented.
  • Updating an existing file is an expensive operation as blobfuse downloads the entire file to the local disk before it can modify the contents.
  • You can mount blobfuse from multiple nodes; however, there is no synchronization between nodes for writes to Blob storage. Do not use blobfuse for concurrent writes.
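One way to find out early whether an application depends on unimplemented operations is to probe the target directory before running it. The Python sketch below is only an illustration: `probe_unsupported` and the probe file names are hypothetical, and it assumes that an unimplemented operation surfaces as an `OSError`. A file system could also accept a call without honoring it, so a successful probe is indicative rather than conclusive.

```python
import fcntl
import os
import tempfile


def probe_unsupported(root: str) -> dict:
    """Return {operation: succeeded?} for symlinks and POSIX locks under root."""
    results = {}
    target = os.path.join(root, "probe.txt")
    link = os.path.join(root, "probe.lnk")
    with open(target, "w") as handle:
        handle.write("probe")

    # Symbolic links: assumed to raise OSError where they are not implemented.
    try:
        os.symlink(target, link)
        results["symlink"] = True
        os.unlink(link)
    except OSError:
        results["symlink"] = False

    # POSIX file locking: try a non-blocking exclusive lock, then release it.
    try:
        with open(target, "r+") as handle:
            fcntl.lockf(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
            fcntl.lockf(handle, fcntl.LOCK_UN)
        results["posix_lock"] = True
    except OSError:
        results["posix_lock"] = False

    os.unlink(target)
    return results


if __name__ == "__main__":
    # A temporary directory is used here; pass a mount point to probe a mount.
    with tempfile.TemporaryDirectory() as root:
        print(probe_unsupported(root))
```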

Please keep these limitations in mind when evaluating blobfuse. For a full list of limitations and known issues, please visit the blobfuse repository on GitHub.

Installation

Follow the installation directions. At this time, the only way to install blobfuse is to build it from source and mount using the mount script provided in the repository. We expect to provide install packages in the future. Stay tuned for more information.

Notes

blobfuse is currently in preview, so make sure you have appropriate backups of your data before you use it. We recommend that you enable HTTPS to secure your data over the wire and protect against data corruption.

Feedback

All feedback is welcome. Please drop us a line by creating an issue on the blobfuse repository. We will use your feedback to improve blobfuse, so don’t hesitate to reach out!