
16.2. The Block Device Operations

We had a brief introduction to the block_device_operations structure in the previous section. Now we take some time to look at these operations in a bit more detail before getting into request processing. To that end, it is time to mention one other feature of the sbull driver: it pretends to be a removable device. Whenever the last user closes the device, a 30-second timer is set; if the device is not opened during that time, the contents of the device are cleared, and the kernel will be told that the media has been changed. The 30-second delay gives the user time to, for example, mount an sbull device after creating a filesystem on it.

16.2.1. The open and release Methods

To implement the simulated media removal, sbull must know when the last user has closed the device. A count of users is maintained by the driver. It is the job of the open and release methods to keep that count current.

The open method looks very similar to its char-driver equivalent; it takes the relevant inode and file structure pointers as arguments. When an inode refers to a block device, the field i_bdev->bd_disk contains a pointer to the associated gendisk structure; this pointer can be used to get to a driver's internal data structures for the device. That is, in fact, the first thing that the sbull open method does:

static int sbull_open(struct inode *inode, struct file *filp)
{
    struct sbull_dev *dev = inode->i_bdev->bd_disk->private_data;

    del_timer_sync(&dev->timer);
    filp->private_data = dev;
    spin_lock(&dev->lock);
    if (! dev->users) 
        check_disk_change(inode->i_bdev);
    dev->users++;
    spin_unlock(&dev->lock);
    return 0;
}

Once sbull_open has its device structure pointer, it calls del_timer_sync to remove the "media removal" timer, if any is active. Note that we do not lock the device spinlock until after the timer has been deleted; doing otherwise invites deadlock if the timer function runs before we can delete it. del_timer_sync waits for a running timer function to finish, so if that function needs the lock we are holding, neither side can make progress. With the device locked, we call the kernel function check_disk_change to check whether a media change has happened. One might argue that the kernel should make that call, but the standard pattern is for drivers to handle it at open time.

The last step is to increment the user count and return.

The task of the release method is, in contrast, to decrement the user count and, if indicated, start the media removal timer:

static int sbull_release(struct inode *inode, struct file *filp)
{
    struct sbull_dev *dev = inode->i_bdev->bd_disk->private_data;

    spin_lock(&dev->lock);
    dev->users--;

    if (!dev->users) {
        dev->timer.expires = jiffies + INVALIDATE_DELAY;
        add_timer(&dev->timer);
    }
    spin_unlock(&dev->lock);

    return 0;
}

In a driver that handles a real, hardware device, the open and release methods would set the state of the driver and hardware accordingly. This work could involve spinning the disk up or down, locking the door of a removable device, allocating DMA buffers, etc.
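
The bookkeeping done by sbull_open and sbull_release can be tried out in user space. The following is only a sketch under stated assumptions: the model_* names, the fake clock counted in seconds, and MODEL_SIZE are all invented for illustration, and the timer callback (which this section does not show) is assumed to do nothing more than set the media_change flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define INVALIDATE_DELAY 30   /* seconds here; the driver counts in jiffies */
#define MODEL_SIZE 64         /* bytes of simulated storage (hypothetical) */

struct model_dev {
    int users;                /* number of current openers */
    bool timer_armed;         /* models an active dev->timer */
    long expires;             /* fake-clock time at which the timer fires */
    bool media_change;        /* set once the timer has fired */
    unsigned char data[MODEL_SIZE];
};

/* Assumed timer callback: flag the simulated media as changed. */
static void model_timer_fire(struct model_dev *dev)
{
    dev->timer_armed = false;
    dev->media_change = true;
}

/* Advance the fake clock; fire the timer if its deadline has passed. */
static void model_tick(struct model_dev *dev, long now)
{
    if (dev->timer_armed && now >= dev->expires)
        model_timer_fire(dev);
}

/* Mirrors sbull_open: cancel the timer, check the media on first open,
 * then count the new user. */
static void model_open(struct model_dev *dev)
{
    dev->timer_armed = false;             /* stands in for del_timer_sync() */
    if (dev->users == 0 && dev->media_change) {
        dev->media_change = false;        /* the effect of revalidation */
        memset(dev->data, 0, sizeof(dev->data));
    }
    dev->users++;
}

/* Mirrors sbull_release: drop the count, arm the timer on the last close. */
static void model_release(struct model_dev *dev, long now)
{
    assert(dev->users > 0);
    dev->users--;
    if (dev->users == 0) {
        dev->expires = now + INVALIDATE_DELAY;
        dev->timer_armed = true;          /* stands in for add_timer() */
    }
}
```

Reopening the model before the deadline cancels the timer and leaves the data alone; letting model_tick pass the deadline first means the next model_open sees a blank device. The sketch has no locking at all, so it says nothing about the spinlock ordering discussed above.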

You may be wondering who actually opens a block device. There are some operations that cause a block device to be opened directly from user space; these include partitioning a disk, building a filesystem on a partition, or running a filesystem checker. A block driver also sees an open call when a partition is mounted. In this case, there is no user-space process holding an open file descriptor for the device; the open file is, instead, held by the kernel itself. A block driver cannot tell the difference between a mount operation (which opens the device from kernel space) and the invocation of a utility such as mkfs (which opens it from user space).

16.2.2. Supporting Removable Media

The block_device_operations structure includes two methods for supporting removable media. If you are writing a driver for a nonremovable device, you can safely omit these methods. Their implementation is relatively straightforward.

The media_changed method is called (from check_disk_change) to see whether the media has been changed; it should return a nonzero value if this has happened. The sbull implementation is simple; it queries a flag that has been set if the media removal timer has expired:

int sbull_media_changed(struct gendisk *gd)
{
    struct sbull_dev *dev = gd->private_data;
    
    return dev->media_change;
}

The revalidate method is called after a media change; its job is to do whatever is required to prepare the driver for operations on the new media, if any. After the call to revalidate, the kernel attempts to reread the partition table and start over with the device. The sbull implementation simply resets the media_change flag and zeroes out the device memory to simulate the insertion of a blank disk.

int sbull_revalidate(struct gendisk *gd)
{
    struct sbull_dev *dev = gd->private_data;
    
    if (dev->media_change) {
        dev->media_change = 0;
        memset (dev->data, 0, dev->size);
    }
    return 0;
}
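
To see how the two methods cooperate, it can help to model the caller as well. What follows is a simplified user-space sketch, not the kernel's code: model_check_disk_change, struct model_ops, struct model_disk, and the partitions_stale flag are hypothetical names, and the real check_disk_change does additional work (such as invalidating cached data) that is left out here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct model_disk;

/* Hypothetical stand-in for the two removable-media methods. */
struct model_ops {
    int (*media_changed)(struct model_disk *disk);
    int (*revalidate)(struct model_disk *disk);
};

struct model_disk {
    const struct model_ops *ops;
    int media_change;          /* flag set when the "media" was swapped */
    int partitions_stale;      /* partition table must be reread */
    unsigned char data[32];
};

/* Simplified model of check_disk_change(): returns 1 if media changed. */
static int model_check_disk_change(struct model_disk *disk)
{
    assert(disk != NULL && disk->ops != NULL);
    if (disk->ops->media_changed == NULL)   /* nonremovable device */
        return 0;
    if (!disk->ops->media_changed(disk))    /* same media as before */
        return 0;
    if (disk->ops->revalidate != NULL)
        disk->ops->revalidate(disk);        /* prepare for the new media */
    disk->partitions_stale = 1;             /* partitions get reread next */
    return 1;
}

/* Counterparts of sbull_media_changed and sbull_revalidate. */
static int model_media_changed(struct model_disk *disk)
{
    return disk->media_change;
}

static int model_revalidate(struct model_disk *disk)
{
    if (disk->media_change) {
        disk->media_change = 0;
        memset(disk->data, 0, sizeof(disk->data));
    }
    return 0;
}

/* A removable device supplies both methods; a fixed one omits them. */
static const struct model_ops removable_ops = {
    .media_changed = model_media_changed,
    .revalidate    = model_revalidate,
};

static const struct model_ops fixed_ops = { NULL, NULL };
```

With removable_ops, the first check after a change returns 1 and leaves a zeroed device behind; a second check returns 0 because the flag has been reset. With fixed_ops, the check returns 0 immediately, which is why a driver for a nonremovable device can leave both methods out.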

16.2.3. The ioctl Method

Block devices can provide an ioctl method to perform device control functions. The higher-level block subsystem code intercepts a number of ioctl commands before your driver ever gets to see them, however (see drivers/block/ioctl.c in the kernel source for the full set). In fact, a modern block driver may not have to implement very many ioctl commands at all.

The sbull ioctl method handles only one command—a request for the device's geometry:

int sbull_ioctl (struct inode *inode, struct file *filp,
                 unsigned int cmd, unsigned long arg)
{
    long size;
    struct hd_geometry geo;
    struct sbull_dev *dev = filp->private_data;

    switch(cmd) {
        case HDIO_GETGEO:
        /*
         * Get geometry: since we are a virtual device, we have to make
         * up something plausible.  So we claim 16 sectors, four heads,
         * and calculate the corresponding number of cylinders.  We set the
         * start of data at sector four.
         */
        size = dev->size*(hardsect_size/KERNEL_SECTOR_SIZE);
        geo.cylinders = (size & ~0x3f) >> 6;
        geo.heads = 4;
        geo.sectors = 16;
        geo.start = 4;
        if (copy_to_user((void __user *) arg, &geo, sizeof(geo)))
            return -EFAULT;
        return 0;
    }

    return -ENOTTY; /* unknown command */
}

Providing geometry information may seem like a curious task, since our device is purely virtual and has nothing to do with tracks and cylinders. Even most real block hardware has had far more complicated internal structures for many years. The kernel is not concerned with a block device's geometry; it sees the device simply as a linear array of sectors. There are certain user-space utilities that still expect to be able to query a disk's geometry, however. In particular, the fdisk tool, which edits partition tables, depends on cylinder information and does not function properly if that information is not available.
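
From the other side of the interface, a geometry query is an ordinary ioctl call on an open device. The sketch below shows roughly what a user-space tool might do; query_geometry and print_geometry are hypothetical helpers written for this illustration, and only HDIO_GETGEO and struct hd_geometry (from <linux/hdreg.h>) come from the kernel headers.

```c
#include <assert.h>
#include <fcntl.h>
#include <linux/hdreg.h>   /* struct hd_geometry, HDIO_GETGEO */
#include <stddef.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>

/*
 * Ask the driver behind `path` for its geometry.
 * Returns 0 and fills *geo on success, -1 on any failure.
 */
static int query_geometry(const char *path, struct hd_geometry *geo)
{
    int fd, ret;

    assert(path != NULL && geo != NULL);
    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    ret = ioctl(fd, HDIO_GETGEO, geo);
    close(fd);
    return ret < 0 ? -1 : 0;
}

/* Summarize the answer, or report that the driver gave none. */
static void print_geometry(const char *path)
{
    struct hd_geometry geo;

    if (query_geometry(path, &geo) != 0) {
        printf("%s: geometry not available\n", path);
        return;
    }
    printf("%s: %u cylinders, %u heads, %u sectors/track, start %lu\n",
           path, (unsigned)geo.cylinders, (unsigned)geo.heads,
           (unsigned)geo.sectors, geo.start);
}
```

Calling print_geometry with the path of a block device node requires permission to open that node. A driver that does not answer the command makes the ioctl fail, which is the situation a geometry-dependent tool cannot cope with.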

We would like the sbull device to be partitionable, even with older, simple-minded tools. So, we have provided an ioctl method that comes up with a credible fiction for a geometry that could match the capacity of our device. Most disk drivers do something similar. Note that, as usual, the sector count is translated, if need be, to match the 512-byte convention used by the kernel.
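
The arithmetic behind the invented geometry is easy to check in isolation. This is a stand-alone sketch rather than driver code: struct fake_geometry and make_geometry are hypothetical, and the function assumes its first argument is a count of hardware sectors whose size is a multiple of 512 bytes.

```c
#include <assert.h>

#define KERNEL_SECTOR_SIZE 512   /* the kernel's fixed sector unit */
#define FAKE_HEADS         4
#define FAKE_SECTORS       16    /* sectors per track */

struct fake_geometry {
    unsigned long cylinders;
    unsigned int heads;
    unsigned int sectors;
    unsigned long start;
};

/*
 * Build a made-up geometry for a device of `nsectors` hardware sectors,
 * each `hardsect_size` bytes long (assumed to be a multiple of 512).
 */
static struct fake_geometry make_geometry(unsigned long nsectors,
                                          unsigned int hardsect_size)
{
    struct fake_geometry geo;
    unsigned long size;

    assert(hardsect_size >= KERNEL_SECTOR_SIZE &&
           hardsect_size % KERNEL_SECTOR_SIZE == 0);

    /* Translate to 512-byte sectors, as the kernel counts them. */
    size = nsectors * (hardsect_size / KERNEL_SECTOR_SIZE);

    /*
     * 4 heads * 16 sectors = 64 sectors per cylinder, hence the shift
     * by 6; the mask drops any partial cylinder at the end.
     */
    geo.cylinders = (size & ~0x3fUL) >> 6;
    geo.heads = FAKE_HEADS;
    geo.sectors = FAKE_SECTORS;
    geo.start = 4;
    return geo;
}
```

For example, 1024 sectors of 512 bytes yield 16 cylinders, and 16 cylinders * 4 heads * 16 sectors gives back exactly 1024 sectors. A device of 1000 such sectors yields 15 cylinders, so the last 40 sectors fall outside the reported geometry.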
