Prepared by: Richard A. Sevenich, rsevenic@netscape.net
Chapter 6: Using a Character Driver to Look at the PCI Bus References: Neil Matthew and Richard Stones, Beginning Linux Programming, 2nd Edition, Chapter 21, Wrox Press Ltd. (1999). Neil Matthew, Richard Stones, et alii, Professional Linux Programming, Chapter 26, Wrox Press Ltd. (1999). Don Anderson and Tom Shanley, PCI System Architecture, 4th edition, Chapter 19, Addison Wesley (2000). /usr/src/linux/Documentation/pci.txt /usr/src/linux/include/linux/pci.h /usr/src/linux/drivers/pci/pci.c man pages for lspci and setpci (the pciutils) http://www.pcisig.com - the web site of the PCI SIG (Special Interest Group) which performs various PCI related activities, including maintaining the PCI specification Note 1: Our study in this chapter includes an update and revision of GPL'd code drawn from our first reference - the Wrox book by Matthew and Stones. This is a well done book with great examples. Note 2: The fourth through sixth references above illustrate a common situation - you may need to go to the source code hierarchy for documentation. Further, of those possibilities, items found in /usr/src/linux/Documentation are not necessarily up to date. However, in our case pci.txt is reasonably current - and pci.h and pci.c are well commented and (as source code) necessarily current for the associated kernel. 6.0 Introductory Comments The character driver that evolved in the prior chapters gave us a look at most of the API and infrastructure of the character driver. The pp_probe function was interesting in that it had to deal with hardware ill-designed to let the system learn its whereabouts in I/O space and its specified irq. In this chapter we exercise another character driver. This one will investigate the PCI bus and manipulate the PCI based graphics card. 
The two new areas of interest are
- the Linux PCI layer, which provides data structures and functions for interfacing with PCI devices
- learning how to access memory on an IO device such as the PCI based graphics card

When the PCI bus is implemented in accordance with the fullest expression of the PCI specification, probing for the hardware is reduced to inspecting a PCI database set up during system boot. Linux provides a PCI layer with functions intended to work with this database and the associated PCI bus/cards. This layer has evolved as the kernel versions have moved from 2.0 to 2.2 to 2.4. Our new driver will use some of the functions available in the Linux PCI layer. It is not itself a PCI driver, but looks at an already supported PCI hardware card in your system. Nevertheless, before we build our investigative driver, we will provide an introduction to PCI device drivers.

6.1 PCI Device Drivers

What Linux has determined about a particular PCI device is stored in a corresponding pci_dev struct. The early part of this struct has links to the global list of PCI devices, identifies this struct as a node in the per bus list, identifies which PCI bus this device is on and so on - essentially providing linkage to the entire PCI bus system. It also identifies device specific information such as the vendor, the device, and the device driver. Further, it identifies the resources needed by the device driver, such as irq, I/O ports, and memory. This is clearly a different world from the one we encountered with our earlier driver (in previous chapters). Instead of probing blindly for the device resources needed, the driver now finds them in this (and other) data structures. We'll see more of the pci_dev struct once we construct our own driver.

R.A. Sevenich 2004 Introduction to Linux Device Driver Development 6 - 1

How does a PCI device driver access the information stored in the linked list of pci_dev structs?
In the 2.4 series kernel, the PCI layer examines that linked list on the driver's behalf, looking for the desired device. In particular, the preferred approach is that the driver fill out the pci_driver struct and register that with the PCI layer. As expected, this registration is done within the driver's init_module code. When init_module is executed (e.g. by insmod), the kernel traverses the list of pci_dev structs and calls the probe function identified in the pci_driver struct for each device that matches. Here is the pci_driver struct:

    struct pci_driver {
        struct list_head node;
        char *name;
        const struct pci_device_id *id_table;   /* NULL if wants all devices */
        int (*probe)(struct pci_dev *dev, const struct pci_device_id *id);   /* New device inserted */
        void (*remove)(struct pci_dev *dev);    /* Device removed (NULL if not a hot-plug capable driver) */
        int (*save_state)(struct pci_dev *dev, u32 state);   /* Save Dev Context */
        int (*suspend)(struct pci_dev *dev);    /* Device suspended */
        int (*resume)(struct pci_dev *dev);     /* Device woken up */
        int (*enable_wake)(struct pci_dev *dev, u32 state, int enable);   /* Enable wake event */
    };

Let's describe some of these fields:
- name - the name chosen for the driver
- id_table - references a table of device id's with which this driver is concerned
- probe - this function is called for each device that matches an id in the prior table and is not yet handled by another driver. It sets up the hardware resources and returns 0 if the driver accepts the device; otherwise it returns a negative error code.
- remove - references the function to be called when a device belonging to this driver is removed, either by the driver being unregistered or, in the case of a hot-pluggable device, by being removed physically
- suspend - references a power management function called when the device goes to sleep
- resume - references a power management function called when the device is awakened from a sleep

The PCI driver passes a reference to the pci_driver struct to the kernel as an argument to pci_register_driver. The probe is then called within the scope of pci_register_driver, once for each matching device. In the 2.4 series, pci_register_driver returns the number of devices claimed during registration, so a return value of 0 tells the initialization routine that no device was found.

In the 2.2 series kernel, init_module did much of the work of searching the pci_dev list, work which in the 2.4 series is now done by the static kernel. The search routines used by the older style init_module are still available. In this chapter, we write a driver which investigates the PCI bus, looking for a graphics card with prefetchable memory. Note that our driver will not be a PCI driver itself - the drivers for the PCI devices are already available on our machine and we aren't planning to supplant any. However, the search functions used directly in the older init_module code in the 2.2 series will be useful in our investigative driver.

As indicated above, the pci_register_driver function is used by init_module to register the pci_driver struct with the kernel, but that's not the whole story.
You may have noticed that the pci_driver struct contained only a few entry points and those didn't really deal with the productive work to be accomplished with the driver - they have more to do with finding, suspending, awakening, and removing the device. The PCI device could be a character device, a network device etc. and how it registers its other entry points varies from case to case. For example, if you inspect the device driver for the PCI watchdog timer WDT500/501,

    /usr/src/linux/drivers/char/wdt_pci.c

you find that it uses the file_operations struct, which it encloses in a miscdevice struct, and registers that miscdevice struct via the misc_register function. You might use the Linux cross reference (http://lxr.linux.no) to investigate the internals of various PCI device drivers, employing pci_register_driver as your initial search identifier.

6.2 Our Investigative Module

This example assumes that your graphics card is of the PCI/AGP variety. The PCI bus features high data transfer rates and Plug and Play (PNP) features. The PCI specification indicates that each PCI device has 256 bytes of configuration address space of which the first 64 bytes, called the Configuration Header Region, have predefined purposes. Some of the registers in this predefined area are then written during the PNP configuration process. If a particular card is not completely PNP compliant the system will have an extra burden.

The 64 bytes of predefined registers include fields which are useful to the operating system in deciding which device driver is appropriate. There are nine mandatory fields of which those particularly useful for device determination are:
- Vendor ID
- Device ID
- Revision
- Class Code
- Subsystem Vendor ID
- Subsystem ID

The PCI SIG assigns PCI vendor id's, while the vendor chooses device id's.
As Linux evolves, it becomes less necessary for the PCI device driver writer to directly query the raw PCI data base described in the prior paragraph. The Linux PCI layer now includes not only carefully designed data structures holding the information in a conveniently accessible form, but also functions for interacting with that database. We can use these functions to do such things as:
- find a device
- enable the device
- access device configuration space to discover specific resources
- allocate those resources
- use the device
- deallocate the resources
- disable the device

For example, if we focus on finding a device, we have various functions, including:
- pci_find_device()
- pci_find_class()

Let's say we want to determine if we have a vga capable graphics card and, if so, we want the base address for the prefetchable video memory and also its size. There are a number of useful functions and fields we can use to glean this information. These functions were found by examining Linux source files and also by using the references given at the beginning of this chapter. We'll proceed by showing actual code and then going back and examining the salient program lines.

Let's start with this nearly empty module containing an init_module and a cleanup_module:

    #define __KERNEL__
    #define MODULE
    #include <linux/module.h>
    #include <linux/version.h>
    #include <linux/wrapper.h>
    #include <linux/fs.h>
    #include <linux/sched.h>
    #include <linux/ioport.h>
    #include <linux/pci.h>
    #include <asm/uaccess.h>
    #include <asm/io.h>

    MODULE_LICENSE("GPL");

    #define DEV_NAME "pci_inv"
    #define true 1
    #define false 0
    #define SUCCESS 0
    #define FAILURE -1

    int init_module(void)
    {
        int i;
        unsigned long dev_flags, real_base, size;
        u32 base[6];
        struct pci_dev *pdev = NULL;

        printk(KERN_ALERT "\nTrying to install %s.\n", DEV_NAME);
        printk("Search for PCI_CLASS.\n");
        /* PCI_CLASS_DISPLAY_VGA is a 16-bit class/subclass value; shift it
           to leave room for the 8-bit programming interface */
        if ((pdev = pci_find_class(PCI_CLASS_DISPLAY_VGA << 8, NULL)) != NULL) {
            printk("Found graphics card.\n");
            printk("Vendor is 0x%x.\n", pdev->vendor);
            printk("Device is 0x%x.\n", pdev->device);
            for (i = 0; i < 6; i++) {
                base[i] = pdev->resource[i].start;
                dev_flags = pdev->resource[i].flags;
                if (dev_flags & PCI_BASE_ADDRESS_MEM_PREFETCH) {
                    real_base = base[i] & PCI_BASE_ADDRESS_MEM_MASK;
                    printk("Found prefetchable memory at 0x%lx,\n", real_base);
                    printk(" lurking in base_register[%d]\n", i);
                    size = pdev->resource[i].end + 1 - real_base;
                    printk(" with a size of %lu MB.\n", size/1024/1024);
                    return SUCCESS;
                }
            }
            printk("but not prefetchable memory.\n\n");
            return FAILURE;
        }
        printk(KERN_ALERT "Did not find graphics card.\n");
        return FAILURE;
    }

    void cleanup_module(void)
    {
        printk(KERN_ALERT "Removing %s.\n", DEV_NAME);
    }

Here is the pci information harvested as shown in output from insmod:

    Apr 17 10:25:28 aurach kernel: Trying to install pci_inv.
    Apr 17 10:25:28 aurach kernel: Search for PCI bus.
    Apr 17 10:25:28 aurach kernel: Search for PCI_CLASS.
    Apr 17 10:25:28 aurach kernel: Found graphics card.
    Apr 17 10:25:28 aurach kernel: Vendor is 0x1013.
    Apr 17 10:25:28 aurach kernel: Device is 0xbc.
    Apr 17 10:25:28 aurach kernel: Found prefetchable memory at 0xf6000000,
    Apr 17 10:25:28 aurach kernel: lurking in base_register[0]
    Apr 17 10:25:28 aurach kernel: with a size of 32 MB.

and from rmmod:

    Apr 17 10:25:46 aurach kernel: Removing pci_inv.

Let's discuss some of the program statements in the init_module:
- pdev = pci_find_class(PCI_CLASS_DISPLAY_VGA << 8, NULL) - this searches through the linked list of pci_dev structs looking for a graphics card. As used in the if statement, it quits when it finds the first acceptable card and returns that pci_dev struct. If it finds nothing, it returns NULL.
- printk("Vendor is 0x%x.\n", pdev->vendor); - the vendor field of the struct contains the vendor id
- printk("Device is 0x%x.\n", pdev->device); - the device field contains the device id for that vendor
- base[i] = pdev->resource[i].start; - there are as many as six IO or memory regions and this references the start of one of those (as designated by i)
- dev_flags = pdev->resource[i].flags; - the flags give us crucial information as shown next
- if (dev_flags & PCI_BASE_ADDRESS_MEM_PREFETCH) - we can use the flags field of the resource to determine if we've found prefetchable IO memory as desired
- real_base = base[i] & PCI_BASE_ADDRESS_MEM_MASK; - we just want the bits that constitute the actual address

Note: Video memory on PCI cards is often 'prefetchable' and marked as such, indicating that it can be cached by the CPU and manipulated as desired. Many devices also implement their IO registers in memory regions. These would be marked as not prefetchable to avoid being cached. IO registers, of course, need to be read and written without being cached - otherwise we are reading or writing the cache and not the IO device in its current state.

6.3 Mapping IO Memory

Cards in IO space can have their own memory. For example, a PCI based video card will have sufficient memory for the screen display, often called the frame buffer. In general, this memory in IO space is called IO memory.
How can a driver gain convenient access to this memory? If we focus on the CPU (IA32 variety), we note that the CPU is aware of three kinds of addresses:
- logical addresses
- linear addresses
- physical addresses

The IA32 translates logical addresses to linear addresses using the segmentation system, and then translates linear addresses to physical addresses using the paging system. The physical addresses are driven onto the address bus to access system RAM and ROM. Most other CPU architectures do not employ segmentation, only paging - so those CPU's will be aware of just
- linear addresses
- physical addresses

IA32 segmentation has some virtues, but it is usually treated as a necessary and inescapable nuisance. Operating systems often set its parameters up to provide the simplest segmentation system, minimizing its impact.

IO memory is yet another kind of address space, sometimes called a bus address space. In the IA32 the bus address is the physical address, which is not necessarily true in other architectures. We will focus on the IA32. For the PCI bus, the video card frame buffer is mapped to physical addresses beyond the end of the physical main memory. Further, programs (such as our driver) do not access physical memory directly. Instead, the IO memory is mapped into the virtual memory (logical memory for the IA32, linear for non segmented architectures) accessible to the driver. This is done with the Linux function ioremap, which hides architectural differences so the driver writer can expect to achieve portability.

As a quick example, let's suppose that we investigate the PCI subsystem of our computer with the command lspci -v (cat /proc/pci is deprecated) and find somewhere in the output that our video card has 32 MB of prefetchable 32-bit memory starting at address 0xF6000000.
To make that accessible to our driver we could do this:

    char * vid_mem_ptr = ioremap(0xF6000000, 32*1024*1024);

and then access it in various ways e.g.

    writeb('Y', vid_mem_ptr + 43);

Of course, the driver would need to release this mapping when done with it. In our example, this would be done with

    iounmap(vid_mem_ptr);

6.3.1 The iomap.c driver and related files

In this section we look at the driver and support files from the first of the references as given at the start of this chapter. We've updated it in only a limited fashion; enough to get it up and running for the 2.4 kernel. We'll explain how it works and then use it as suggested in the reference. In this version it does not use the sorts of functions discussed in section 6.2. In fact, the user must learn which card he has etc., by manually entering

    lspci -v

Subsequently, in Section 6.4, we'll look at interesting features of this program. As indicated, the program is based on an earlier version of the Linux PCI layer, so in Section 6.5 we'll make further changes to the driver ... somewhat in the spirit of Section 6.2. This scenario is typical of open source - we can go to GPL'ed source code and learn a great deal, while it continues to evolve under our feet.

We'll show listings for three files, modified from their kernel 2.2 origins so we can compile and run them for a 2.4 kernel:
- iomap.h
- iomap.c
- setup.c

Then in subsequent sections we'll focus on various areas of the example.

Note: The following listings may omit some lines extraneous to this discussion. However, your instructor will provide a tarball of the whole works anyway.
First, here is the header file iomap.h:

    #define IOMAP_MAJOR 42
    #define IOMAP_MAX_DEVS 16

    /* define to use readb and writeb to read/write data */
    #define IOMAP_BYTE_WISE

    /* the device structure */
    typedef struct Iomap {
        unsigned long base;
        unsigned long size;
        char *ptr;
    } Iomap;

    #define IOMAP_GET _IOR(0xbb, 0, Iomap)
    #define IOMAP_SET _IOW(0xbb, 1, Iomap)
    #define IOMAP_CLEAR _IOW(0xbb, 2, long)

    #define MSG(string, args...) printk(KERN_ALERT "iomap: " string, ##args)

Second, here are the salient parts of iomap.c.

    /*
     * Example of memory mapping i/o memory
     */
    #include <linux/module.h>
    #include <linux/version.h>
    #include <linux/wrapper.h>
    #include <linux/kernel.h>
    #include <linux/fs.h>
    #include <linux/types.h>
    #include <linux/config.h>
    #include <linux/sched.h>
    #include <linux/slab.h>
    #include <linux/pci.h>
    #include <linux/pci_ids.h>
    #include <asm/uaccess.h>
    #include <asm/page.h>
    #include <asm/io.h>
    #include "iomap.h"

    Iomap *iomap_dev[IOMAP_MAX_DEVS];

    MODULE_DESCRIPTION("iomap, mapping i/o memory");
    MODULE_AUTHOR("Jens Axboe");
    MODULE_LICENSE("GPL");

    int iomap_remove(Iomap *idev)
    {
        iounmap(idev->ptr);
        MSG("buffer at 0x%lx removed\n", idev->base);
        return 0;
    }

    int iomap_setup(Iomap *idev)
    {
        /* remap the i/o region */
        idev->ptr = ioremap(idev->base, idev->size);
        MSG("setup: 0x%lx extending 0x%lx bytes\n", idev->base, idev->size);
        return 0;
    }

    /* Beware: A RedHatism was encountered around the 2.4.20 Linux kernel
       where RedHat changed a prototype. In particular, RedHat took the
       original prototype:
           extern int remap_page_range(unsigned long from, unsigned long to,
                                       unsigned long size, pgprot_t prot);
       and inserted yet another argument at the start of the argument list:
           struct vm_area_struct *vma
       The prototype in the stock kernel was unchanged. See <linux/mm.h>
       for your system. */

    static int iomap_mmap(struct file *file, struct vm_area_struct *vma)
    {
        Iomap *idev = iomap_dev[MINOR(file->f_dentry->d_inode->i_rdev)];
        unsigned long size;

        /* no such device */
        if (!idev->base)
            return -ENXIO;

        /* size must be a multiple of PAGE_SIZE */
        size = vma->vm_end - vma->vm_start;
        if (size % PAGE_SIZE)
            return -EINVAL;

        if (remap_page_range(vma->vm_start, idev->base, size, vma->vm_page_prot))
            return -EAGAIN;

        MSG("region mmapped\n");
        return 0;
    }

    static int iomap_ioctl(struct inode *inode, struct file *file,
                           unsigned int cmd, unsigned long arg)
    {
        Iomap *idev = iomap_dev[MINOR(inode->i_rdev)];
        switch (cmd) {

        /* create the wanted device */
        case IOMAP_SET: {
            /* if base is set, device is in use */
            if (idev->base)
                return -EBUSY;

            if (copy_from_user(idev, (Iomap *)arg, sizeof(Iomap)))
                return -EFAULT;

            /* base and size must be page aligned */
            if (idev->base % PAGE_SIZE || idev->size % PAGE_SIZE) {
                idev->base = 0;
                return -EINVAL;
            }

            MSG("setting up minor %d\n", MINOR(inode->i_rdev));
            iomap_setup(idev);
            return 0;
        }

        case IOMAP_GET: {
            /* maybe device is not set up */
            if (!idev->base)
                return -ENXIO;

            if (copy_to_user((Iomap *)arg, idev, sizeof(Iomap)))
                return -EFAULT;
            return 0;
        }

        case IOMAP_CLEAR: {
            long tmp;

            /* if base is not set, the device has not been set up */
            if (!idev->base)
                return -EBUSY;

            if (get_user(tmp, (long *)arg))
                return -EFAULT;
            memset_io(idev->ptr, tmp, idev->size);
            return 0;
        }

        default:
            return -1;
        }
    }

    int iomap_open(struct inode *inode, struct file *file)
    {
        int minor = MINOR(inode->i_rdev);

        /* no such device */
        if (minor >= IOMAP_MAX_DEVS)
            return -ENXIO;
        return 0;
    }

    int iomap_release(struct inode *inode, struct file *file)
    {
        return 0;
    }

    static ssize_t iomap_read(struct file *file, char *buf, size_t count, loff_t *offset)
    {
        Iomap *idev = iomap_dev[MINOR(file->f_dentry->d_inode->i_rdev)];
        char *tmp;
        /* device not set up */
        if (!idev->base)
            return -ENXIO;

        /* beyond or at end? */
        if (file->f_pos >= idev->size)
            return 0;

        tmp = (char *) kmalloc(count, GFP_KERNEL);
        if (!tmp)
            return -ENOMEM;

        /* adjust access beyond end */
        if (file->f_pos + count > idev->size)
            count = idev->size - file->f_pos;

        /* get the data from the mapped region */
    #ifdef IOMAP_BYTE_WISE
        {
            int i;
            for (i = 0; i < count; i++)
                tmp[i] = readb(idev->ptr + file->f_pos + i);
        }
    #else
        memcpy_fromio(tmp, idev->ptr + file->f_pos, count);
    #endif

        /* copy retrieved data back to app */
        if (copy_to_user(buf, tmp, count)) {
            kfree(tmp);
            return -EFAULT;
        }
        file->f_pos += count;
        kfree(tmp);
        return count;
    }

    static ssize_t iomap_write(struct file *file, const char *buf, size_t count, loff_t *offset)
    {
        Iomap *idev = iomap_dev[MINOR(file->f_dentry->d_inode->i_rdev)];
        char *tmp;

        /* device not set up */
        if (!idev->base)
            return -ENXIO;

        /* end of mapping? */
        if (file->f_pos >= idev->size)
            return 0;

        tmp = (char *) kmalloc(count, GFP_KERNEL);
        if (!tmp)
            return -ENOMEM;

        /* adjust access beyond end */
        if (file->f_pos + count > idev->size)
            count = idev->size - file->f_pos;

        /* get user data */
        if (copy_from_user(tmp, buf, count)) {
            kfree(tmp);
            return -EFAULT;
        }

        /* write data to i/o region */
    #ifdef IOMAP_BYTE_WISE
        {
            int i;
            for (i = 0; i < count; i++)
                writeb(tmp[i], idev->ptr + file->f_pos + i);
        }
    #else
        memcpy_toio(idev->ptr + file->f_pos, tmp, count);
    #endif

        file->f_pos += count;
        kfree(tmp);
        return count;
    }

    struct file_operations iomap_fops = {
        owner: THIS_MODULE,
        read: iomap_read,
        write: iomap_write,
        ioctl: iomap_ioctl,
        mmap: iomap_mmap,
        open: iomap_open,
        release: iomap_release,
    };

    int init_module(void)
    {
        int res, i;
        /* register device with kernel */
        res = register_chrdev(IOMAP_MAJOR, "iomap", &iomap_fops);
        if (res) {
            MSG("can't register device with kernel\n");
            return res;
        }

        for (i = 0; i < IOMAP_MAX_DEVS; i++) {
            iomap_dev[i] = (Iomap *) kmalloc(sizeof(Iomap), GFP_KERNEL);
            if (!iomap_dev[i]) {
                /* out of memory: undo what has been done so far */
                while (--i >= 0)
                    kfree(iomap_dev[i]);
                unregister_chrdev(IOMAP_MAJOR, "iomap");
                return -ENOMEM;
            }
            iomap_dev[i]->base = 0;
        }
        MSG("module loaded\n");
        return 0;
    }

    void cleanup_module(void)
    {
        int i;
        Iomap *tmp;

        /* delete the devices */
        for (i = 0; i < IOMAP_MAX_DEVS; i++) {
            tmp = iomap_dev[i];
            if (tmp->base)
                iomap_remove(tmp);
            kfree(tmp);
        }

        /* give the major number back to the kernel */
        unregister_chrdev(IOMAP_MAJOR, "iomap");
    }
Lastly, here is the user space program setup.c:

    #include <stdio.h>
    #include <fcntl.h>
    #include <sys/ioctl.h>
    #include "iomap.h"

    #define BASE 0xf6000000 /* specific to your graphics card */

    int main(int argc, char *argv[])
    {
        int fd1 = open("/dev/iomap0", O_RDWR);
        int fd2 = open("/dev/iomap1", O_RDWR);
        Iomap dev1, dev2;

        if (fd1 == -1 || fd2 == -1) {
            perror("open");
            return 1;
        }

        /* setup first device */
        dev1.base = BASE;
        dev1.size = 1024 * 1024;
        if (ioctl(fd1, IOMAP_SET, &dev1)) {
            perror("ioctl");
            return 2;
        }

        /* setup second device, offset starting from first */
        dev2.base = BASE + dev1.size;
        dev2.size = 1024 * 1024;
        if (ioctl(fd2, IOMAP_SET, &dev2)) {
            perror("ioctl");
            return 3;
        }
        return 0;
    }

6.4 How iomap works - a specific example

The iomap.c driver allows us to create one or more devices where each device is a chunk of the frame buffer. Each such device is accessible through the virtual file system e.g. open, close, read, write, etc. The user program begins by opening two devices, iomap0 and iomap1. An opened device is not usable until initialized - a step which sets the device's base address and size. This is done with an ioctl call. The user program could then write and read to the devices to manipulate what is displayed on the monitor screen. However, what the given user program does instead is to then quit, without even closing the devices. Nevertheless, the devices remain set up and available. The reference then suggests that we use the cp command to copy one device over the other - this will use the driver's read and write functions. The result is a scrambled display, as expected.

We'll now work through this scenario in some detail. Let's assume that iomap.c has been compiled and the driver installed via insmod. The next tasks to do are these:
- Make sure that the hard coded BASE in setup.c has been changed to match your card and that you choose a size that will work (1024 * 1024 bytes is hard coded in the original setup.c).
- Next compile setup.c, but before running it, make the needed device files with the appropriate permissions; i.e.,

      mknod /dev/iomap0 c 42 0
      mknod /dev/iomap1 c 42 1
      chmod 666 /dev/iomap*

Now run setup, so that the system is ready to demonstrate how it can manipulate your X Windows display. The suggestion from the reference is to copy part of the screen display on top of another part by entering at the command line:

    cp /dev/iomap1 /dev/iomap0

Let's follow the execution of setup to see what goes on. In particular, consider the calls made by setup that will vector to the driver. These are, in order,

- open two instances of the driver, minor numbers 0 and 1.
      int fd1 = open("/dev/iomap0", O_RDWR);
      int fd2 = open("/dev/iomap1", O_RDWR);

- set up the parameters (base address and size) for both devices.

The driver defines the device as the Iomap struct (in iomap.h):

    typedef struct Iomap {
        unsigned long base;
        unsigned long size;
        char *ptr;
    } Iomap;

The struct contains
- a base address lying somewhere in the frame buffer
- the device extent (bytes of some chunk of the frame buffer)
- a virtual memory pointer that the driver uses to access the device memory

So the setup program next initializes the just opened devices as follows:

    /* setup first device */
    dev1.base = BASE;
    dev1.size = 1024 * 1024;
    if (ioctl(fd1, IOMAP_SET, &dev1)) {
        perror("ioctl");
        return 2;
    }

    /* setup second device, offset 1 meg from first */
    dev2.base = BASE + dev1.size;
    dev2.size = 1024 * 1024;
    if (ioctl(fd2, IOMAP_SET, &dev2)) {
        perror("ioctl");
        return 3;
    }

In the next subsections we'll look at what this accomplishes at the driver level.

6.4.1 Opening instances of the driver

So, what does something such as the following do?

    open("/dev/iomap0", O_RDWR);

The underlying driver function iomap_open merely checks the validity of the minor number and, if that's not OK, returns an error.

6.4.2 Setting up device parameters (base address and size)

As we've seen, the setup program next initializes the base and size fields for the Iomap struct and then calls ioctl for an already open (but not initialized) device, passing the desired command and the address of the Iomap struct with this sort of syntax:

    ioctl(fd1, IOMAP_SET, &dev1);

This triggers these events at the driver level:

1. the driver level ioctl
   - switches to the code for the requested ioctl command (IOMAP_SET)
   - checks that the device has not already been set up
   - copies the user space Iomap struct into the driver's own copy
   - checks the struct base and size fields to assure that they are page aligned
   - calls iomap_setup, passing along the Iomap struct

2. iomap_setup
   As described in section 6.3, this maps the frame buffer so that the driver can access it. In particular, it calls the kernel function ioremap to get the virtual memory pointer and puts that into the Iomap struct.

Now the device as represented by the Iomap struct is initialized, accessible, and ready for use by such functions as read and write. So when setup is done, there are two initialized minor devices which can be used. Again, note that the userland 'setup' program does not close those device files.

6.4.3 Subsequent use of the devices

The setup program did only what its name implies. If we want to use the devices we can write another program to do so - or even use them at the command line. The reference from which this has been extracted then asks the reader to do this:

    cp /dev/iomap1 /dev/iomap0

which copies one area of the frame buffer onto the other.

6.4.4 The Read and Write Functionality in iomap.c

This driver shows how the read and write functionality expected by a user space program is provided by the underlying driver level read and write functions, iomap_read and iomap_write. These use quite different functions than did pp_read and pp_write in our earlier driver. There we were essentially transferring data back and forth between userland memory (mapped to physical RAM) and kernel memory (also mapped to physical RAM). Here we have mapped physical I/O memory to be directly accessible to the kernel space driver.

In user space the default behavior of read and write is blocking i.e. if the system call cannot proceed the caller blocks so other processes can get useful work done in the meantime. When the read or write can proceed the caller is unblocked. To get more specific about each, the behavior is as follows:

read - if no data is available, the process blocks. When data becomes available, the process is awakened and data is returned to the caller.
If there is less data than specified in the read's count argument, the amount available is still delivered and the return value to the read system call is the number of bytes actually read.

write - if there is no space in the write buffer, the process blocks. When some data has been written to the device, opening room in the buffer, the process is awakened and writes successfully to the buffer. If the amount of buffer space is less than specified in the write's count argument, it writes as much as there is space for and the return value to the write system call is the number of bytes actually written.

Alternatively, to specify the nonblocking behavior, the O_NONBLOCK flag is used in the open system call. In both cases, if the system call cannot proceed, it returns immediately with the return value -EAGAIN.

To see how the driver level calls support these behaviors we'll investigate the iomap_read function (iomap_write is similar):

    static ssize_t iomap_read(struct file *file, char *buf, size_t count, loff_t *offset)
    {
        Iomap *idev = iomap_dev[MINOR(file->f_dentry->d_inode->i_rdev)];
        char *tmp;

        /* device not set up */
        if (!idev->base)
            return -ENXIO;

        /* beyond or at end? */
        if (file->f_pos >= idev->size)
            return 0;

        tmp = (char *) kmalloc(count, GFP_KERNEL);
        if (!tmp)
            return -ENOMEM;

        /* adjust access beyond end */
        if (file->f_pos + count > idev->size)
            count = idev->size - file->f_pos;

        /* get the data from the mapped region */
    #ifdef IOMAP_BYTE_WISE
        {
            int i;
            for (i = 0; i < count; i++)
                tmp[i] = readb(idev->ptr + file->f_pos + i);
        }
    #else
        memcpy_fromio(tmp, idev->ptr + file->f_pos, count);
    #endif

        /* copy retrieved data back to app */
        if (copy_to_user(buf, tmp, count)) {
            kfree(tmp);
            return -EFAULT;
        }
        file->f_pos += count;
        kfree(tmp);
        return count;
    }

The sequence of events in the read is clear:
- assure that the device has been set up (i.e. is valid)
- if the file pointer is already at or beyond the end of the file, return 0
- kmalloc enough memory as a buffer for the bytes to be read
- if the number of bytes requested would take the file pointer past the end of the file, adjust the byte count to the number of bytes actually available
- read the number of bytes from the device into the buffer
- copy those bytes to user space
- adjust the file pointer appropriately
- kfree the buffer space
- return the number of bytes read

6.5 Minor but Interesting Modifications to iomap.c

In the two preceding sections we looked at code from the first reference. We can make modifications based on what we saw in section 6.2. For example, we could have init_module automate the device setup that was heretofore being done in the user code with hard coded frame buffer base address and size. In particular, init_module could
- find the graphics card
- find the frame buffer's base address and size
- set up 16 devices, each mapping 1/16th of the buffer

This would require some changes elsewhere in the driver as well. To avoid a dramatic driver revision, we suggest that you merely add a new ioctl command which will
- find the card
- find the frame buffer's base address and size

The new ioctl would use the code from the init_module of section 6.2. We could then open the device, use the new ioctl to get the frame buffer parameters, initialize the devices as before, and finally use the devices as wished. This will be explored in the activities section at the end of this chapter.

6.6 PCI Caveats

If your system is entirely PCI PNP compliant, it makes life relatively easy for the device driver implementer.
However, incompatibilities are quite possible. A non-exhaustive list of such problems would include:

  - BIOS does not contain PCI PNP
  - ISA device which is not PNP configurable
  - PCI device which is not PNP
  - Conflict with memory management software

We'll give a brief description of each of these problem areas.

BIOS does not contain PCI PNP

The PCI specification does not strictly require that the BIOS perform PNP configuration. Such a non-PNP BIOS will start the system with inoperable PNP devices. There is a PNP BIOS Specification, provided by Intel, Phoenix, and Compaq, but it is not part of the PCI specification. Such a situation leaves configuration entirely to the operating system.

ISA device which is not PNP configurable

This sort of device offers no way to configure needed resources in software; the needed resources are determined by the hardware. This removes a great deal of flexibility in resolving any resource conflicts. The PNP BIOS Specification mentioned in the previous section provides an optional (not required) way to handle such devices. Its central feature is that the resources needed are stored in the system's CMOS setup, thereby making the information available to the PNP BIOS PCI configurator.

PCI device which is not PNP

There exist PCI devices which are not PNP - they may, for example, be hardwired to an I/O space region. The PNP BIOS may assign space for the device and everything may work fine if no resource conflicts are encountered.

Conflict with memory management software

The PNP BIOS assigns the needed device memory resources as the system boots. Properly done, there will be no memory assignment conflicts. However, it is possible that a subsequently installed memory manager could cause a conflict. This might also arise with improperly designed device drivers.

6.6.1 The bottom line

If the PCI BIOS meets the PNP BIOS Specification (separate from the PCI Specification), if the devices are PCI PNP compliant, and if neither the OS nor the device drivers introduce conflicts, then all is rosy. More likely, the OS may need to do some reallocation. Further, debugging problems can be quite difficult. For example, there may be a non-compliant card in the system, but fortuitously no resource conflicts, and the system works fine. Then the user adds a fully compliant card and the latent problem rears its head. The user will most likely blame the new card and its vendor. Assigning the blame correctly might be difficult. It is important that non-compliance be identified before purchase and then avoided.

6.7 Activities

6.7.1 Activity 1

Implement the module shown in Section 6.2. Record the results obtained for

  - card vendor id
  - device id
  - frame buffer base address
  - frame buffer size

See if you can find the vendor and device names in /usr/src/linux/include/linux/pci_ids.h. Compare all the above results to what is reported by

    lspci -v

6.7.2 Activity 2

This relies on the code described in Section 6.4. Get the tarball of the necessary code from the instructor. Change the hardcoded BASE value in setup.c to match your card, based on the prefetchable memory address result from the prior activity. Also change the size from 1024x1024 bytes appropriately. Then try the software. The detailed steps are:

  1. expand/untar iomapi.tgz
  2. change to the newly created iomapi/ directory and run make
  3. make the BASE and size changes in the user program setup.c and then compile it
  4. create device nodes for the two minor devices

         mknod /dev/iomap0 c 42 0
         mknod /dev/iomap1 c 42 1

  5. change the device permissions, i.e.

         chmod 666 /dev/iomap*

  6. install the device driver via

         insmod iomap.o

  7. run the user program
  8. then enter

         cp /dev/iomap1 /dev/iomap0

Then describe what happened at the last step.

6.7.3 Activity 3

The setup program hard codes the base address. Remove that and implement a new ioctl command (driver level and user level) so that setup can determine the base address with the new ioctl command and then use it to accomplish its later tasks. Note that the code from Activity 1 will be incorporated into the driver level version of the new ioctl command and will not appear in init_module. Then redo the appropriate parts of the activity above and run the new approach.

6.7.4 Activity 4

Modify the setup.c program so that it uses the read and write functions and closes the devices when done.

6.7.5 Activity 5

The driver function iomap_setup uses the kernel function ioremap to give the driver a pointer for access to the frame buffer in physical memory. Is it possible to give a pointer to the user space program, so that it can access the frame buffer? Investigate the driver's iomap_mmap function and the man page for mmap. Then write a user program to manipulate the frame buffer using the pointer.
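As a starting point for Activity 5, the user space side of mmap might be sketched as follows. This is only a minimal sketch under stated assumptions, not the driver author's code: the helper names map_region and fill_region are hypothetical, and we assume that the driver's iomap_mmap accepts a shared read/write mapping starting at offset 0. The helpers themselves are plain POSIX and work on any file that supports mmap.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not part of iomap.c): map the first len bytes of
 * the file or device node at path into this process, shared and
 * read/write. Returns NULL on failure. */
static unsigned char *map_region(const char *path, size_t len)
{
    void *p;
    int fd;

    assert(path != NULL && len > 0);

    fd = open(path, O_RDWR);
    if (fd < 0)
        return NULL;

    /* offset 0 is an assumption about what the driver's mmap supports */
    p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);                  /* the mapping remains valid after close */

    return (p == MAP_FAILED) ? NULL : (unsigned char *) p;
}

/* Hypothetical helper: write value into every byte of the mapped region,
 * then unmap it. Returns 0 on success and -1 on failure. */
static int fill_region(const char *path, size_t len, unsigned char value)
{
    unsigned char *fb = map_region(path, len);
    volatile unsigned char *p = fb;   /* byte-wise stores, not optimized away */
    size_t i;

    if (!fb)
        return -1;

    for (i = 0; i < len; i++)
        p[i] = value;

    return munmap(fb, len);
}
```

With the driver loaded and a device set up as in Activity 2, a call such as fill_region("/dev/iomap0", size, 0xff), where size is whatever was configured for that device, would be expected to change the corresponding part of the display. Whether it does so depends on how iomap_mmap is implemented, which is exactly what the activity asks you to investigate.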