SLIDE 1

File System Project Seminar: File System Interfaces

Prof. Andreas Polze
Andreas Grapentin, Sven Köhler, Max Plauth, Jossekin Beilharz, Felix Eberhardt
Hasso Plattner Institute

SLIDE 2

1 Background

Polze, Grapentin, Köhler Plauth, Beilharz, Eberhardt 18.10.2017 File System Seminar Interfaces Chart 2

SLIDE 3

Interfaces: Multiple File Systems May Be Present

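[Figure: a directory tree rooted at / with the entries proc, usr, home, share, and bin; different parts of the tree are backed by different file systems (btrfs, procfs, ext4, nfs).]

The operating system needs to provide a common interface to the different file system drivers as well as to user space applications.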

SLIDE 4

Interfaces: A Common Abstraction

[Figure: a program calls open and readdir on the Virtual File System in the kernel, which sits above the ext4, procfs, btrfs, and nfs drivers; the block buffer, a socket, and the disk appear below them.]

SLIDE 5

Interfaces: Virtual File Systems

■ First introduced with SunOS 2.0 in 1985 (for NFS)
■ Nowadays most major operating systems provide a VFS or comparable means (Linux, *BSD, macOS, Windows, …)
■ File System In User Space (FUSE) is a cross-platform VFS, fed from user space processes
■ Desktop environments often provide their own VFS (KDE – KIO, GNOME – GIO)

[Figure: the components shown are program, VFS, FUSE, FUSE driver, block buffer, disk, and GIO.]

SLIDE 6

Virtual File System: Tasks

■ Provides an abstraction layer between applications and file systems
■ Presents a common file model
■ Caches inodes, directory entries, and block buffers (e.g. with LRU replacement)
■ May include synchronization/locking mechanisms
■ Eases the implementation of new file system drivers by auto-completing interfaces (e.g. using mmap to implement read/write)
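The last point can be illustrated with a small user-space sketch. It is not kernel code: all names (demo_file_ops, demo_generic_read, demo_complete_ops, demo_mem_map) are hypothetical and only model the idea that a framework fills a missing read slot with a generic implementation built on the driver's mapping function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct demo_file;

/* Hypothetical, simplified operation table (not a kernel structure). */
struct demo_file_ops {
    /* Required: return the start of the file contents and store its length. */
    const char *(*map)(struct demo_file *f, size_t *len);
    /* Optional: copy up to n bytes starting at offset off into buf. */
    size_t (*read)(struct demo_file *f, char *buf, size_t n, size_t off);
};

struct demo_file {
    struct demo_file_ops *ops;
    const char *data;           /* backing memory of this toy file */
    size_t len;
};

/* Generic fallback: implement read on top of the driver's map. */
size_t demo_generic_read(struct demo_file *f, char *buf, size_t n, size_t off)
{
    size_t len = 0;
    const char *base = f->ops->map(f, &len);

    if (off >= len)
        return 0;               /* at or past the end: nothing to read */
    if (n > len - off)
        n = len - off;          /* clamp to the remaining bytes */
    memcpy(buf, base + off, n);
    return n;
}

/* "Auto-complete" an operation table: fill the missing optional slot. */
void demo_complete_ops(struct demo_file_ops *ops)
{
    assert(ops->map != NULL);   /* map is the one operation a driver must supply */
    if (ops->read == NULL)
        ops->read = demo_generic_read;
}

/* A minimal driver that only knows how to map its data. */
const char *demo_mem_map(struct demo_file *f, size_t *len)
{
    *len = f->len;
    return f->data;
}
```

A driver that supplies its own read keeps it; only empty slots are filled, so the fallback never overrides a specialized implementation.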

SLIDE 7

2 Linux VFS

[Figure: Linux Storage Stack Diagram, version 4.0, 2015-06-01; outlines the Linux storage stack as of kernel version 4.0. From top to bottom it shows applications (processes) issuing system calls such as open(2), read(2), write(2), stat(2), and chmod(2); the VFS with block-based file systems (ext2, ext3, ext4, btrfs, xfs, iso9660, ...), network file systems (NFS, coda, smbfs, ceph, ...), pseudo and special purpose file systems (proc, sysfs, tmpfs, ramfs, ...), and stackable file systems (ecryptfs, overlayfs, unionfs, FUSE); the page cache; the block layer with BIOs, I/O schedulers, and device mapper targets; the SCSI layers and device drivers; and the physical devices.]

SLIDE 8

Virtual File System in Linux: The Basic Building Blocks (4+2)

■ struct super_block – represents a mounted file system, links the root directory
■ struct inode – represents an existing file
■ struct file – represents an open file, with its current position (file pointer)
■ struct dentry – represents a directory entry, links a name to an inode
■ struct file_system_type – used to mount a file system, builds a superblock
■ struct vfsmount – a device+mountpoint pair

SLIDE 9

Virtual File System in Linux: Relation Of The Four Basic Objects

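[Figure: two processes (proc1, proc2) each hold a file descriptor (fd) that refers to a file object; each file points via f_dentry to a dentry in the dentry cache; the dentries point via d_inode to an inode, which points via i_sb to the superblock of the storage device. Two dentries that point to the same inode form a hard link.]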
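The pointer chain from the figure can be modelled in a few lines of user-space C. This is a sketch under assumptions, not the kernel's definitions: the demo_ structs are hypothetical stand-ins that keep only the fields named in the figure (f_dentry, d_inode, i_sb) plus a link count and a file position.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins for the kernel objects. */
struct demo_super_block {
    const char *device;
};

struct demo_inode {
    unsigned long i_ino;
    unsigned int i_nlink;               /* number of names for this inode */
    struct demo_super_block *i_sb;
};

struct demo_dentry {
    const char *d_name;
    struct demo_inode *d_inode;
};

struct demo_file {
    struct demo_dentry *f_dentry;
    long f_pos;                         /* per-open read/write position */
};

/* Bind a name to an inode; a second name for the same inode is a hard link. */
void demo_link(struct demo_dentry *d, const char *name, struct demo_inode *inode)
{
    assert(d != NULL && inode != NULL);
    d->d_name = name;
    d->d_inode = inode;
    inode->i_nlink++;
}

/* Open a file through a directory entry; each open gets its own position. */
struct demo_file demo_open(struct demo_dentry *d)
{
    struct demo_file f = { .f_dentry = d, .f_pos = 0 };
    return f;
}

/* Follow file -> dentry -> inode -> superblock, as in the figure. */
struct demo_super_block *demo_file_sb(const struct demo_file *f)
{
    return f->f_dentry->d_inode->i_sb;
}
```

Two opens through different names share the inode but not the position, which is why the position lives in the file object rather than in the inode.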

SLIDE 10

Virtual File System in Linux: OOP-ish?

[Figure: UML class diagram. The abstract class AbstractInode has the attributes size, permission, … and the pure virtual methods create, link, unlink, …; the classes InodeExt4, InodeNFS, and InodeProcFS derive from it.]

Object-oriented programming looks like the ideal paradigm here. But Linux, like most kernels, is written in C, where virtual methods, polymorphism, and inheritance require additional effort.

SLIDE 11

Virtual File System in Linux: How To Do Virtual Methods and Objects in C

[Figure: approaches to objects in C; the only surviving labels are "GObject" and "PT2 <= 2013".]

SLIDE 12

Virtual File System in Linux: How To Do Virtual Methods The Linux Way (I)

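    struct inode {
        unsigned long                 i_ino;      /* inode number */
        umode_t                       i_mode;     /* file type and permissions */
        uid_t                         i_uid;
        gid_t                         i_gid;
        kdev_t                        i_rdev;
        loff_t                        i_size;     /* file size in bytes */
        struct timespec               i_atime;
        struct timespec               i_ctime;
        struct timespec               i_mtime;
        struct super_block           *i_sb;       /* owning superblock */
        struct inode_operations      *i_op;       /* table of inode methods */
        const struct file_operations *i_fop;      /* default file methods */
        void                         *i_private;  /* file system specific data */
    };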

include/linux/fs.h#L566

SLIDE 13

Virtual File System in Linux: How To Do Virtual Methods The Linux Way (II)

    struct inode_operations {
        int (*create) (struct inode *, struct dentry *, int);
        struct dentry * (*lookup) (struct inode *, struct dentry *);
        int (*link) (struct dentry *, struct inode *, struct dentry *);
        int (*unlink) (struct inode *, struct dentry *);
        int (*symlink) (struct inode *, struct dentry *, const char *);
        int (*mkdir) (struct inode *, struct dentry *, int);
        int (*rmdir) (struct inode *, struct dentry *);
        int (*mknod) (struct inode *, struct dentry *, int, dev_t);
        int (*rename) (struct inode *, struct dentry *, struct inode *, struct dentry *);
        int (*readlink) (struct dentry *, char *, int);
        int (*follow_link) (struct dentry *, struct nameidata *);
        void (*truncate) (struct inode *);
        int (*permission) (struct inode *, int);
        int (*setattr) (struct dentry *, struct iattr *);
        int (*getattr) (struct vfsmount *mnt, struct dentry *, struct kstat *);
        ...
    };

inode->i_op->create(...)
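The dispatch pattern behind this call can be tried out in user space. The sketch below is a simplified model, not the kernel's types: demo_inode, demo_inode_operations, memfs_* and rofs_iops are hypothetical names, and the error value -1 stands in for a real errno code.

```c
#include <assert.h>
#include <stddef.h>

struct demo_inode;

/* The "vtable": one slot per virtual method (hypothetical, simplified). */
struct demo_inode_operations {
    int (*create)(struct demo_inode *dir, const char *name);
    int (*unlink)(struct demo_inode *dir, const char *name);
};

/* The "object": data members plus a pointer to its method table. */
struct demo_inode {
    unsigned long i_ino;
    int nr_entries;
    const struct demo_inode_operations *i_op;
};

/* Implementation A: a writable in-memory directory. */
int memfs_create(struct demo_inode *dir, const char *name)
{
    (void)name;                 /* names are not stored in this toy model */
    dir->nr_entries++;
    return 0;
}

int memfs_unlink(struct demo_inode *dir, const char *name)
{
    (void)name;
    if (dir->nr_entries == 0)
        return -1;              /* nothing left to remove */
    dir->nr_entries--;
    return 0;
}

const struct demo_inode_operations memfs_iops = {
    .create = memfs_create,
    .unlink = memfs_unlink,
};

/* Implementation B: a read-only file system leaves the slots empty. */
const struct demo_inode_operations rofs_iops = {
    .create = NULL,
    .unlink = NULL,
};

/* Caller side: dispatch through the table, passing the object explicitly. */
int demo_vfs_create(struct demo_inode *dir, const char *name)
{
    assert(dir != NULL && dir->i_op != NULL);
    if (dir->i_op->create == NULL)
        return -1;              /* method not provided by this file system */
    return dir->i_op->create(dir, name);
}
```

Because every inode of one file system can share a single constant table, the per-object cost of "virtual methods" is one pointer.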

SLIDE 14

Virtual File System in Linux: How To Do Virtual Methods The Linux Way (III)

    struct file_operations {
        struct module *owner;
        loff_t (*llseek) (struct file *, loff_t, int);
        ssize_t (*read) (struct file *, char *, size_t, loff_t *);
        ssize_t (*write) (struct file *, const char *, size_t, loff_t *);
        int (*readdir) (struct file *, void *, filldir_t);
        int (*ioctl) (struct inode *, struct file *, unsigned int, unsigned long);
        int (*mmap) (struct file *, struct vm_area_struct *);
        int (*open) (struct inode *, struct file *);
        int (*flush) (struct file *);
        unsigned long (*get_unmapped_area)(struct file *, ...);
        ...
    };

Your new FS driver needs to fill in those function pointers and provide {super,inode,file,dentry}_operations to the corresponding data struct. The first argument of each function (e.g. struct file *) acts as the “this”-pointer.

SLIDE 15

Virtual File System in Linux: Adding A New File System (Overview)

1. Create a new loadable kernel module
2. Fill a struct file_system_type
3. Register your file system
4. Implement get_sb (called mount in newer kernels) to load and create a superblock
5. Implement file, inode, superblock and dentry operations
6. Implement iget to load an arbitrary inode and iput to save it
7. Load the superblock and create an inode for your root directory

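Steps 2 to 4 can be sketched as a user-space model: a list of registered file system types that a mount request searches by name before calling the type's superblock callback. None of this is the kernel API; demo_fs_type, demo_register_fs, demo_mount and the -1 error value are hypothetical simplifications.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct demo_super_block {
    const char *fs_name;
    int root_ino;
};

/* Hypothetical counterpart of a file system type descriptor. */
struct demo_fs_type {
    const char *name;
    int (*fill_super)(struct demo_super_block *sb);
    struct demo_fs_type *next;          /* link in the registry list */
};

static struct demo_fs_type *demo_fs_list;

static struct demo_fs_type *demo_find_fs(const char *name)
{
    for (struct demo_fs_type *t = demo_fs_list; t != NULL; t = t->next)
        if (strcmp(t->name, name) == 0)
            return t;
    return NULL;
}

/* Step 3: register a type; a duplicate name is rejected. */
int demo_register_fs(struct demo_fs_type *fs)
{
    assert(fs != NULL && fs->name != NULL && fs->fill_super != NULL);
    if (demo_find_fs(fs->name) != NULL)
        return -1;
    fs->next = demo_fs_list;
    demo_fs_list = fs;
    return 0;
}

/* Step 4: mounting looks the type up by name and lets it build the superblock. */
int demo_mount(const char *name, struct demo_super_block *sb)
{
    struct demo_fs_type *fs = demo_find_fs(name);

    if (fs == NULL)
        return -1;                      /* unknown file system type */
    sb->fs_name = fs->name;
    return fs->fill_super(sb);
}

/* Step 2: the driver's own callback and descriptor. */
int demofs_fill_super(struct demo_super_block *sb)
{
    sb->root_ino = 1;                   /* assumed number of the root inode */
    return 0;
}

struct demo_fs_type demofs_type = {
    .name = "demofs",
    .fill_super = demofs_fill_super,
};
```

The registry only stores descriptors; no superblock exists until a mount request names the type, which mirrors the split between registering and mounting in the list above.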
SLIDE 16

Virtual File System in Linux: Adding A New File System (Register)

Start a new kernel module:

    static struct file_system_type demofs_fs_type = {
        .owner    = THIS_MODULE,
        .name     = "demofs",
        .mount    = demofs_mount,       /* required for mounting; older kernels used .get_sb */
        .kill_sb  = kill_block_super,   /* default cleanup, can be specialized */
        .fs_flags = FS_REQUIRES_DEV,    /* I need a device (no procfs, …) */
    };

    /* Register filesystem at startup */
    static int __init init_tue_fs(void)
    {
        return register_filesystem(&demofs_fs_type);
    }

    static void __exit exit_tue_fs(void)
    {
        unregister_filesystem(&demofs_fs_type);
    }

    /* Tell the kernel which functions to call on module load and unload */
    module_init(init_tue_fs);
    module_exit(exit_tue_fs);

SLIDE 17

Virtual File System in Linux: Adding A New File System (Mounting)

    /* Forward declaration: demofs_mount refers to it before its definition */
    static int demofs_fill_super(struct super_block *sb, void *data, int silent);

    static struct dentry *demofs_mount(struct file_system_type *fs_type, int flags,
                                       const char *dev_name, void *data)
    {
        return mount_bdev(fs_type, flags, dev_name, data, demofs_fill_super);
    }

    static int demofs_fill_super(struct super_block *sb, void *data, int silent)
    {
        struct demofs_sb *info;
        struct inode *root;
        int ret;

        sb->s_blocksize = DEMOBSIZE;
        sb->s_blocksize_bits = blksize_bits(DEMOBSIZE);

        /* Read the on-disk superblock (block 0) into a private buffer */
        info = kzalloc(DEMOFS_SB_SIZE, GFP_KERNEL);
        if (!info)
            return -ENOMEM;
        ret = demofs_blk_read(sb, 0, info, DEMOFS_SB_SIZE);
        if (ret)
            goto fail;

        sb->s_fs_info = info;
        root = demofs_iget(sb, ROOT_DIR_INODE);
        ...
    }

In most cases let the kernel handle block device access

SLIDE 18

Virtual File System in Linux: Adding A New File System (Using The Block Buffer)

    int demofs_blk_read(struct super_block *sb, unsigned long pos, void *buf, size_t buflen)
    {
        struct buffer_head *bh;
        unsigned long offset;
        size_t segment;

        while (buflen > 0) {
            offset = pos & (DEMOBSIZE - 1);
            segment = min_t(size_t, buflen, DEMOBSIZE - offset);
            bh = sb_bread(sb, pos >> DEMOBSBITS);
            if (!bh)
                return -EIO;
            memcpy(buf, bh->b_data + offset, segment);
            brelse(bh);     /* decrement ref counter */
            buf += segment;
            buflen -= segment;
            pos += segment;
        }
        return 0;
    }
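The offset and segment arithmetic of this loop can be checked outside the kernel. The following user-space sketch keeps the same loop but makes assumptions of its own: an in-memory array replaces the block device, demo_bread replaces sb_bread, and the block size of 16 bytes is chosen only to keep the example small.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_BSBITS  4                      /* assumed block size: 16 bytes */
#define DEMO_BSIZE   (1u << DEMO_BSBITS)
#define DEMO_NBLOCKS 8

/* An in-memory array stands in for the block device (assumption). */
unsigned char demo_disk[DEMO_NBLOCKS * DEMO_BSIZE];

/* Stand-in for sb_bread: return one whole block, or NULL if out of range. */
const unsigned char *demo_bread(unsigned long block)
{
    if (block >= DEMO_NBLOCKS)
        return NULL;
    return demo_disk + block * DEMO_BSIZE;
}

/* Copy buflen bytes starting at byte position pos, one block at a time. */
int demo_blk_read(unsigned long pos, void *buf, size_t buflen)
{
    unsigned char *out = buf;

    assert(buf != NULL || buflen == 0);
    while (buflen > 0) {
        /* Position inside the current block (block size is a power of two). */
        size_t offset = pos & (DEMO_BSIZE - 1);
        /* Bytes available in this block, limited by what is still wanted. */
        size_t segment = DEMO_BSIZE - offset;
        if (segment > buflen)
            segment = buflen;

        const unsigned char *block = demo_bread(pos >> DEMO_BSBITS);
        if (block == NULL)
            return -1;                      /* read past the end of the device */

        memcpy(out, block + offset, segment);
        out += segment;
        buflen -= segment;
        pos += segment;
    }
    return 0;
}
```

Only the first segment can start mid-block; every later iteration begins at offset 0, so an unaligned request costs at most one extra block access.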

SLIDE 19

Virtual File System in Linux: Adding A New File System (iget, iput)

iget – Load inode based on index number

What to do if the file system has no index numbers (e.g. FAT)? FAT creates a fake number based on the first data block.
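A rough user-space sketch of both ideas follows. It only illustrates the scheme described above and is not taken from any real driver, which may derive its numbers differently; demo_fat_fake_ino, demo_iget, the cache size, and the load counter are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_CACHE_SIZE 8

/*
 * Derive a stand-in inode number from the first data cluster of a file.
 * FAT numbers its data clusters from 2 upwards. Returns 0 if no number
 * can be derived, e.g. for an empty file that has no data cluster yet;
 * a driver would need a fallback for that case.
 */
uint32_t demo_fat_fake_ino(uint32_t first_cluster)
{
    if (first_cluster < 2)
        return 0;
    return first_cluster;
}

struct demo_inode {
    uint32_t i_ino;
    int in_use;
};

static struct demo_inode demo_cache[DEMO_CACHE_SIZE];
int demo_loads;                         /* counts simulated loads from disk */

/* iget-like lookup: return the cached inode, or load it on first use. */
struct demo_inode *demo_iget(uint32_t ino)
{
    struct demo_inode *free_slot = NULL;

    assert(ino != 0);
    for (int i = 0; i < DEMO_CACHE_SIZE; i++) {
        if (demo_cache[i].in_use && demo_cache[i].i_ino == ino)
            return &demo_cache[i];      /* already in memory */
        if (!demo_cache[i].in_use && free_slot == NULL)
            free_slot = &demo_cache[i];
    }
    if (free_slot == NULL)
        return NULL;                    /* cache full; a real cache would evict */

    free_slot->i_ino = ino;             /* here a driver would read from disk */
    free_slot->in_use = 1;
    demo_loads++;
    return free_slot;
}
```

The number only has to be stable and unique while the file system is mounted, which is what lets a cache keyed by it return the same in-memory inode for repeated lookups.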

SLIDE 20

Virtual File System in Linux: Starting Point: Examples In The Kernel

linux/fs/ramfs – Very simple implementation using libfs
linux/fs/romfs – Example using a block device

SLIDE 21

Virtual File System in Linux: Other Stuff – Page Cache

SLIDE 22

3 File System In User Space

SLIDE 23

File System In User Space: Operations

The design very much resembles the Linux VFS. The FUSE C API follows a similar function-pointer pattern, but combines file and inode operations into one struct.
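The combined table can be pictured with a small model. This sketch does not use libfuse: demo_fuse_ops and the hello_ functions are hypothetical, and the callback signatures are heavily simplified; only the callback names getattr and read are borrowed from the FUSE API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model: one table holds inode-like and file-like operations,
 * and every callback identifies its target by path. */
struct demo_fuse_ops {
    int (*getattr)(const char *path, size_t *size);                 /* inode-like */
    int (*read)(const char *path, char *buf, size_t n, size_t off); /* file-like */
};

static const char demo_hello[] = "Hello, FUSE!\n";

/* Report the size of the single file this toy file system contains. */
int hello_getattr(const char *path, size_t *size)
{
    if (strcmp(path, "/hello") != 0)
        return -1;                      /* no such file */
    *size = sizeof demo_hello - 1;
    return 0;
}

/* Copy up to n bytes from offset off; returns the number of bytes copied. */
int hello_read(const char *path, char *buf, size_t n, size_t off)
{
    size_t len = sizeof demo_hello - 1;

    if (strcmp(path, "/hello") != 0)
        return -1;
    if (off >= len)
        return 0;
    if (n > len - off)
        n = len - off;
    memcpy(buf, demo_hello + off, n);
    return (int)n;
}

/* A single table is all the driver hands over. */
const struct demo_fuse_ops hello_ops = {
    .getattr = hello_getattr,
    .read = hello_read,
};
```

With one table there is no separate step of attaching operations to inode and file objects, at the price of resolving the path in every callback.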