Hello Kernel World - Introduction to kernel modules

The goal of this practical is a simple hello world ‘program’ running in kernel space. To do this, we will introduce you to kernel modules, which you will use to write your solutions for the following assignments.

Please note that it is strongly recommended that you do all kernel development on the virtual image (we recommend connecting with ssh(1), or PuTTY on Windows), as it is very easy to crash the kernel you are working on during development. If you work on a physical Linux machine and crash its kernel, you may lose data.


What is a kernel module

A kernel module allows us to load new features into a running Linux kernel without rebuilding its entire source or rebooting the machine. This is a very useful feature for many reasons, but the one we are most concerned with is that it makes development much more efficient. Instead of rebuilding the Linux source tree on every modification (which can take anywhere from 5 minutes to 1 hour), we keep our code for the module outside of the kernel source tree.

Not everything can be done outside of the kernel source tree. For instance, modifying the scheduler requires changes to the core of the kernel, and changes of this kind cannot be loaded at runtime. Device drivers and file systems, by contrast, are two kinds of code that can easily be loaded as modules.

Question: why do you think the scheduler must be built into the kernel instead of being loaded as a module?

More information on the concept of kernel modules can be found on Wikipedia.


Anatomy of a kernel module

At a bare minimum, a kernel module requires two callback functions: the initialiser (called when the kernel loads the module) and a cleanup function (called when the kernel unloads the module). These functions are comparable in concept to class constructors and destructors in object-oriented programming languages.

init function

The initialiser function (commonly shortened to the init function) takes no arguments and returns an integer. This return code determines whether the kernel keeps the module loaded: 0 indicates success, while a negative error code (for example -ENOMEM) tells the kernel that initialisation failed, in which case the load is abandoned and the module is not inserted.

    int __init init_module(void)
    {
        printk(KERN_INFO "Hello, kernel world!\n");
        return 0;
    }
    

Convention dictates that the function is named init_module so the kernel can recognise this is the function that should be called when the module is loaded. You will note the __init symbol between the return type and the function name. This is a bit different to normal C programs, but is a common thing in the world of kernel development. __init is a preprocessor definition that instructs the kernel to discard the memory used by the function after it returns (since it will only be called once over a module’s lifetime).

Inside the function, there is a call to a function that you may not have seen before — printk(). This is the kernel’s version of printf() and allows us to talk to the outside world. The arguments are the same as the standard printf() call, except we prefix the format string with a logging level. In the case of our example, we used KERN_INFO. This is a preprocessor definition that will expand out to a prefix string that, due to the way C handles consecutive string constants, will concatenate together (note that there is no comma between the prefix and the format string).

And that’s it for our init function. When the module is loaded, the kernel will execute init_module() and the message will be printed.

In a standard C program that is run from a terminal, the output from printf() is directed to the terminal and appears on the screen (unless redirected to a file). But in the kernel, where does this output go? The answer is dmesg.

dmesg

dmesg prints the contents of the kernel's log buffer (the kernel ring buffer). This is where the output of printk() goes. To see the last few messages that the kernel has printed, type dmesg | tail into a terminal.

cleanup function

We have already created the init function for our module, but now we need to write a cleanup function that will be called when the module is unloaded. It gives us one final chance to clean up and free any memory before the module is removed from the kernel.

    void __exit cleanup_module(void)
    {
        printk(KERN_INFO "Goodbye, kernel world!\n");
    }
    

Our cleanup function simply prints a message to say that the module is being unloaded. This output will go to dmesg as described above.

Module metadata

Each module has some metadata attached to it, such as its author and description, but more importantly, the license it is released under. We have to declare this; otherwise, loading the module will taint the kernel, meaning the kernel is flagged as running a module whose license is not compatible with the GPL license that Linux is released under. For this course, we will declare that our modules are licensed under the GPL.

Insert the following metadata into your module's source, just above any functions:

    MODULE_LICENSE("GPL");
    MODULE_AUTHOR("Your name here");
    MODULE_DESCRIPTION("A simple hello world kernel module");
    

Includes

Since we have called quite a few functions and macros, and used some preprocessor definitions, we need to tell the compiler where to find their definitions. For our hello world program, we need three includes:

  • #include <linux/module.h>: required by every kernel module and should be first in the includes list
  • #include <linux/kernel.h>: this gives us the definitions for KERN_INFO and printk()
  • #include <linux/init.h>: defines the __init and __exit macros

Place these at the very top of the source file.


Building a kernel module

Now that we have written the kernel module, we need a way of compiling it ready for the kernel to load. The kernel's build system can take care of this for the most part, but we still need to set up our own Makefile so make can create the module.

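    # Object to build as a module (hello.c -> hello.ko)
    OBJ=hello.o

    obj-m += $(OBJ)
    MOD_DIR=/lib/modules/$(shell uname -r)/build

    # Note: recipe lines must start with a tab character
    .PHONY: all
    all:
    	$(MAKE) -C $(MOD_DIR) M=$(PWD) modules

    .PHONY: clean
    clean:
    	$(MAKE) -C $(MOD_DIR) M=$(PWD) clean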
    

Try to understand what the Makefile above does (hint: look at the manpage for make(1)), then explain it to your tutor to show that you understand how it works.


Loading and unloading

Loading — insmod

To insert a module into the kernel, we use a program called insmod, like so:

    sudo insmod hello.ko

If the module was loaded successfully, insmod returns status 0. Have a look at dmesg to see the module's init function printing hello world.

Listing — lsmod

Sometimes it is useful to see whether a kernel module is currently loaded, and how much space its code is taking up in the kernel. To do this, we run a program called lsmod without any arguments. Go ahead and type it now; you should see your module in the (possibly quite long) list.

Unloading — rmmod

To unload a module when we are finished with it (or we have rebuilt it and wish to reload it), type:

    sudo rmmod hello.ko

Unloading a module is not guaranteed to work, as the kernel will not allow a module to be unloaded if other modules or user-space programs are using it. More details on this will be provided in the next practical.


Next steps

This practical was an abridged version of chapter 2 of Linux Device Drivers, Third Edition (LDD3): Building and Running Modules. Please work your way through this chapter now, as it contains invaluable information that you will use for assignments 2 and 3.