FUSE, and FUSE on macOS


Overview

FUSE (Filesystem in Userspace) is an interface that lets you implement a file system as an ordinary application. Compared with traditional file systems, which run in the OS kernel, a FUSE file system is:

  • easier to implement and debug, with no risk of a bug causing a kernel panic,
  • more flexible in the scenarios it can be applied to.

Some applications of FUSE:

  • sshfs: mount an SSH server as a local directory
  • fuse-zip: work with a ZIP archive as though it were a plain directory
  • nydus: an acceleration framework for container images, which uses FUSE to implement the container file system

clock demo

The basic principle of FUSE is to forward file system operations from the kernel to userspace. A FUSE file system consists of two parts:

  • A shared module in the kernel, which looks like a normal file system (such as ext4) to the kernel
  • A userspace process, which uses libfuse to communicate with the kernel part

An ordinary file system call is forwarded to the userspace process through the FUSE module in the kernel, and the reply from the userspace process travels back in the opposite direction.

The userspace process implements the file system by implementing file system operations such as open, read, write, opendir, and stat. Here is the detail of the libfuse high-level API. (Jump to the source code)

Background

1. File system, and its basic structures

A file system, in short, is a software / system module that provides persistence in terms of files and tree-structured directories. On one side of a file system is the disk, which stores raw data; on the other side is the API, which provides operations such as open, read, and write.

An inode (index node) is a data structure in a Unix-style file system that describes a file system object such as a file or a directory. In this context, the inode is stored on disk. It contains the metadata (mode, size, uid …) and the location of the content data on disk. The inode number identifies the inode and acts as a kind of low-level filename.

A directory is a special file that contains the filename -> inode number mapping of its children. To read a file by its path, the file system looks up each path component in the directories in turn, finds the inode of the target file, and then fetches the raw file content. (How to find the data blocks of an inode is a major topic in file system implementation, but it is outside the scope here.)
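
The lookup described above can be sketched as a toy model. This is only an illustration: the Inode class, the INODES table, the inode numbers, and the helper names are all hypothetical, and a real inode stores block locations rather than the content itself.

```python
from dataclasses import dataclass, field

@dataclass
class Inode:
    mode: str                                      # "dir" or "file" (stand-in for real mode bits)
    data: bytes = b""                              # file content (a real inode stores block locations)
    children: dict = field(default_factory=dict)   # name -> inode number, used by directories only

ROOT_INO = 1  # hypothetical choice for the root inode number

# Hypothetical on-disk state: /usr/bin/hello
INODES = {
    1: Inode("dir", children={"usr": 2}),
    2: Inode("dir", children={"bin": 3}),
    3: Inode("dir", children={"hello": 4}),
    4: Inode("file", data=b"hi\n"),
}

def resolve(path: str) -> int:
    """Walk the path one component at a time and return the inode number."""
    ino = ROOT_INO
    for name in (part for part in path.split("/") if part):
        node = INODES[ino]
        if node.mode != "dir":
            raise NotADirectoryError(path)
        if name not in node.children:
            raise FileNotFoundError(path)
        ino = node.children[name]
    return ino

def read_file(path: str) -> bytes:
    """Resolve the path to an inode, then return its content."""
    return INODES[resolve(path)].data
```

Each path component costs one directory read in this model, which is the cost that the kernel caches discussed later try to avoid.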

2. VFS, Dentry and Cache

VFS (virtual file system) is an abstraction layer in Unix-like operating systems that provides a unified interface to the underlying concrete file systems (such as NFS and ext4). VFS unifies the abstractions (file, path, inode), the standard API (open, read, opendir), and so on.

On Linux, VFS consists of the following data structures and their operations:

  • Inode Object (not the exact struct used in the implementation of a concrete file system, but it serves the same purpose)
  • File Object
  • Super Block Object
  • Directory Entry (dentry) (https://www.kernel.org/doc/Documentation/filesystems/vfs.txt)

To access a file by path, such as /usr/local/bin/go, the kernel must find the inode by repeatedly looking up directory contents, which involves many disk operations and leads to slow responses. The directory entry (dentry) is the struct that caches the path-to-inode mapping (actually ‘parent + name’ to inode) in a hash table called the dentry cache. Inodes are also cached in the kernel.
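
A minimal sketch of the ‘parent + name’ to inode idea follows. It is not the kernel's implementation: the DentryCache class, the DIRS table, and the inode numbers are hypothetical, and eviction and negative entries are not modeled.

```python
class DentryCache:
    """Caches (parent inode, name) -> inode so repeated lookups skip the slow path."""

    def __init__(self, lookup_on_disk):
        self._lookup_on_disk = lookup_on_disk  # slow path: (parent_ino, name) -> ino
        self._cache = {}
        self.misses = 0                        # how many times the slow path was taken

    def lookup(self, parent_ino, name):
        key = (parent_ino, name)
        if key not in self._cache:
            self.misses += 1
            self._cache[key] = self._lookup_on_disk(parent_ino, name)
        return self._cache[key]

    def resolve(self, path, root_ino=1):
        """Resolve a full path one component at a time through the cache."""
        ino = root_ino
        for name in filter(None, path.split("/")):
            ino = self.lookup(ino, name)
        return ino

# Hypothetical directory contents for /usr/local/bin/go
DIRS = {(1, "usr"): 2, (2, "local"): 3, (3, "bin"): 4, (4, "go"): 5}
cache = DentryCache(lambda parent, name: DIRS[(parent, name)])
```

Resolving /usr/local/bin/go the first time takes four slow lookups in this model; resolving it again, or resolving any prefix of it, takes none.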

File content has its own caching mechanism, the Unified Page Cache. ‘Unified’ means that the memory page cache and the file system cache are unified, so the file system cache is managed like memory pages. The file system cache is indexed by inode. (Search for ‘The Address Space Object’ in the doc above for more detail.)
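
As a rough illustration of a content cache indexed by inode, the sketch below keys cached pages by (inode number, page index). The PageCache class, the 4096-byte page size, and the backend callback are assumptions for illustration only; the kernel's real page cache also handles write-back, eviction, and memory pressure.

```python
PAGE_SIZE = 4096  # assumed page size for this sketch

class PageCache:
    """Caches file content in fixed-size pages keyed by (inode number, page index)."""

    def __init__(self, read_page_from_fs):
        self._read_page = read_page_from_fs  # slow path: (ino, page_index) -> bytes
        self._pages = {}
        self.fs_reads = 0                    # how many pages were fetched from the file system

    def read(self, ino, offset, size):
        out = bytearray()
        end = offset + size
        while offset < end:
            index, start = divmod(offset, PAGE_SIZE)
            key = (ino, index)
            if key not in self._pages:
                self.fs_reads += 1
                self._pages[key] = self._read_page(ino, index)
            page = self._pages[key]
            chunk = page[start:min(PAGE_SIZE, start + end - offset)]
            if not chunk:                    # reached end of file
                break
            out += chunk
            offset += len(chunk)
        return bytes(out)

# Hypothetical backing file: 10240 bytes stored under inode 7
CONTENT = bytes(range(256)) * 40
page_cache = PageCache(lambda ino, i: CONTENT[i * PAGE_SIZE:(i + 1) * PAGE_SIZE])
```

A read that spans two pages fetches both once; reading the same range again is served entirely from the cache.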

On BSD-derived operating systems such as macOS, the data structures are slightly different. The vnode is a structure similar to the VFS inode. Some docs: [1] [2].

FUSE (on Linux)

In the kernel, FUSE is a normal file system that sits under VFS. It talks to the userspace process through a character device at /dev/fuse.

It is available as an optional module of the Linux kernel. On Debian Linux, fuse is not installed by default. Running sudo apt install fuse installs the core module and the userspace tools. The current major version of the libfuse API is 3.x.

Libfuse

Libfuse (source code) is the library for communicating with the kernel part, and it serves as the C SDK for building your own file system.

Libfuse provides two sets of APIs:

The low-level API is essentially a pass-through of VFS operations. Some details of the low-level API:

  • lookup: an operation callback that looks up a directory entry by name and returns it as an inode
  • Inode based: the targets of all callbacks are specified by inode number
  • VFS caching is controlled through explicit APIs, such as fuse_lowlevel_notify_inval_inode

The high-level API is built on the low-level API. It simplifies things to provide an easier interface for implementing a file system.

  • The API is POSIX-like (chmod, rename).
  • Path based.
  • Automatically handling

The libfuse repo contains a list of examples demonstrating how to implement a file system with libfuse. One practical note: not every callback has to be implemented. In the high-level API, some callbacks fall back to a default implementation when you do not provide your own. Even if some advanced callbacks are missing, the file system can still provide basic functionality.
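
The idea of optional callbacks can be sketched without libfuse itself. The following is not the libfuse API: the dispatch function, the OPS table, and the handlers are hypothetical. It only borrows two conventions that libfuse follows, namely returning a negative errno on failure and answering ENOSYS for an operation that has no handler.

```python
import errno

def dispatch(ops, name, *args):
    """Call the handler registered for `name`, or fall back to -ENOSYS."""
    handler = ops.get(name)
    if handler is None:
        return -errno.ENOSYS      # operation not implemented by this file system
    return handler(*args)

# Hypothetical read-only file system with a single file
FILES = {"/hello": b"Hello, FUSE!\n"}

def my_getattr(path):
    if path == "/":
        return {"mode": "dir"}
    if path in FILES:
        return {"mode": "file", "size": len(FILES[path])}
    return -errno.ENOENT

def my_read(path, size, offset):
    if path not in FILES:
        return -errno.ENOENT
    return FILES[path][offset:offset + size]

# Only two callbacks are provided; everything else falls back to the default.
OPS = {"getattr": my_getattr, "read": my_read}
```

With only getattr and read registered, reading /hello works, while a call such as rename simply reports that the operation is not implemented.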

There are other independent implementations of the FUSE protocol (the userspace SDK) besides libfuse. A notable project is go-fuse, which is implemented entirely from scratch in Go and is well documented.

Performance

Because FUSE adds extra kernel/userspace transitions, it is in principle slower than an in-kernel file system. According to the paper To FUSE or Not to FUSE: Performance of User-Space File Systems:

We found that for many workloads, an optimized FUSE can perform within 5% of native Ext4. However, some workloads are unfriendly to FUSE and even if optimized, FUSE degrades their performance by up to 83%. Also, in terms of the CPU utilization, the relative increase seen is 31%.

Another topic is caching. A FUSE file system does not have to call the userspace process every time the same file content is read. As discussed earlier, VFS has its own caching mechanisms: the dentry cache stores lookup results (filename -> inode), and the Unified Page Cache stores file content. The caching can be controlled to some degree through the FUSE API (options).

FUSE on macOS

macFUSE

FUSE is native to Linux but not to macOS. macFUSE (formerly OSXFUSE) is a third-party implementation. It seems to have been developed originally at Google and is now maintained by an independent developer. macFUSE implements the FUSE functionality as a macOS kernel extension (kext). The core kext was open source up to v3 (repo), but has not been since v4 (the current version is v4.6) (the discussion). The libfuse repo, however, is still open source; it is a fork of the Linux libfuse with only minor changes.

The macFUSE installer package contains the following components:

  • file system bundle /Library/Filesystems/macfuse.fs
    • including the kext and mount_macfuse
  • Objective-C framework /Library/Frameworks/macFUSE.framework
    • a higher-level API than the libfuse high-level API
  • C-based libraries /usr/local/lib/libfuse*.dylib and headers /usr/local/include
  • preference pane

(macFUSE’s official wiki is old, thinly documented, and not very developer-oriented. The mount options page is the part most useful to developers.)

The old macFUSE repo (https://github.com/osxfuse) no longer seems to be updated, but a more recent one by the same author, https://github.com/macfuse , contains the latest code, such as libfuse.

Difference

The main differences between macFUSE and FUSE on Linux are:

  1. macFUSE only supports FUSE v2 (2.9), while FUSE on Linux is at v3.
  2. Some macOS/BSD-specific APIs are added or modified in libfuse.
    • For example, there is a new getxtimes callback, and setxattr has an extra BSD-style argument.
  3. Performance is not as good as on Linux.

All changes in the macFUSE version of libfuse are wrapped in #ifdef __APPLE__, so the API differences are easy to spot. Other aspects of the difference, however, are poorly documented.

Linux FUSE application projects may not run well on macOS without adaptation, due to the BSD-style API. For example, getxattr returns ENODATA on Linux if the attribute is not found, but ENOATTR on macOS.
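
One way to cope with the differing error codes is to accept either one when checking for a missing attribute. The helper below is a small sketch with a hypothetical name (is_xattr_missing); it assumes that ENOATTR is exposed by Python's errno module only on platforms that define it, so the name is looked up defensively.

```python
import errno

# Collect whichever "attribute not found" codes this platform defines.
# ENODATA is the Linux answer; ENOATTR is the macOS/BSD answer.
_XATTR_MISSING = {
    getattr(errno, name)
    for name in ("ENODATA", "ENOATTR")
    if hasattr(errno, name)
}

def is_xattr_missing(err: OSError) -> bool:
    """Return True if the error means 'extended attribute not found'."""
    return err.errno in _XATTR_MISSING
```

Code that calls getxattr can then catch OSError and treat the attribute as absent when is_xattr_missing returns True, instead of comparing against a single platform-specific constant.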