1 Introduction
1.1 Some background
1.2 How the book works
I RTCore Basics
2 Introductory Examples
2.1 Introduction
2.2 Using RTCore
2.2.1 Hello world
2.2.2 Multithreading
2.2.3 Basic communication
2.2.4 Signalling and multithreading
2.3 Perspective
5 More concepts
5.1 Copying synchronization objects
5.2 API Namespace
5.3 Resource cleanup
5.4 Deadlocks
5.5 Synchronization-induced priority inversion
5.6 Memory management
5.7 Synchronization
5.7.1 Methods and safety
5.7.2 One-way queues
5.7.3 Atomic operations
7 Debugging in RTCore
7.1 Enabling the debugger
7.2 An example
7.3 Notes
7.3.1 Overhead
7.3.2 Remote debugging
7.3.3 Safely stopping faulted applications
7.3.4 GDB notes
II RTLinux® Professional Components
11 Real-time Networking
11.1 Introduction and basic concepts
12 PSDD
12.1 Introduction
12.2 Hello, world with PSDD
12.3 Building and running PSDD programs
B Terminology
Introduction
This runs the real-time application and logs output to a non-real-time Linux
file. For those who have used UNIX at all, this should look very familiar.
1 Generally, we use the term "GPOS" (General Purpose Operating System) to refer
generically to the non-real-time system. The RTCore API and behavior remain the same
regardless of which GPOS is being used.
2 In later chapters, we will see how to reduce this down to 0 microseconds, bypassing
hardware jitter.
This book starts with some background and simple examples and then
takes a detour for an in-depth introduction to the basic concepts of RTCore
and an overview of the API. Next, the available communication models for
exchanging data between real-time threads and the non-real-time domain are
presented. The sample programs then use these mechanisms to show how
to apply them to simple problems. These chapters are devoted to stepping
through these programs, making every step as clear as possible, and require
little prior knowledge. Following that, several chapters are devoted to the
more advanced features of RTLinux Professional, or RTLinuxPro. After
having covered the basic concepts in a few sample programs, we then provide
a basic model for writing real-time drivers.
learning how to write real-time applications using the RTCore OS. The book
focuses on bringing the user up to speed on each facet so they can become
productive quickly, rather than having to piece facts together from scattered sources.
Experience in developing real-time applications is helpful but not necessary,
as RTCore uses the standard POSIX API. Users with some knowledge of
POSIX and UNIX should feel right at home.
The full sources of the programs referenced here can be found in Appendix
G and are provided with the RTLinuxPro development kit.
Part I
RTCore Basics
Chapter 2
Introductory Examples
2.1 Introduction
The RTCore OS is a small, hard real-time operating system that can run
Linux or BSD UNIX as an application server. This allows a standard oper-
ating system to be used as a component of a real-time application. In this
part, we will provide an overview of RTCore capabilities, introducing basic
concepts, the API, and some of the add-on components. This book starts
assuming you have already installed RTLinuxPro, RTCore/BSD or RTLin-
uxFree - refer to the installation instructions that came with your package
for details. This chapter will assume an RTLinuxPro environment, but the
procedures apply equally to a BSD host.
need more grounding, skip ahead to Chapter 3 for a little more background
information.
For this example, you will need to have the core RTCore OS loaded
as described in the appropriate appendix. We assume that you are the root
user and that your current working directory is the rtlinuxpro directory of
RTLinuxPro (or the appropriate installation point for your RTLinuxFree
installation). If you don’t see the referenced files in the directory, type make
to ensure that everything is up to date. Now, on with the code:
#include <stdio.h>
int main(void)
{
printf("Hello from the RTL base system\n");
return 0;
}
Surprised? This is all that is involved - nothing more than what you
would see in a normal C introduction. Running the example (./hello.rtl)
forces the RTCore OS to load the application, and enter the main() context.
Here it prints a message out through standard I/O for the user to see, and
exits.
Those familiar with older RTLinux versions are used to these messages
silently appearing in the kernel’s ring buffer, but now they print through
stdout just like any other application. Also, there is a standard printf(),
rather than the rtl_printf() some users have seen. This printf() is fully
capable, and can handle any format that a normal printf() can handle.
Once the message has been printed, the program exits, RTCore unloads
the application, and we’re done. Now, let’s move on to something a little bit
more useful.
2.2.2 Multithreading
If you’re familiar with POSIX threading, you’ll feel at home with RTCore.
If you’re not familiar with it, there are many solid references on the subject,
such as the O’Reilly book on Pthreads Programming. Let’s start with a
basic example of the pthread model here, with a task that operates on a 1
millisecond interval.
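#include <stdio.h>
#include <pthread.h>
#include <unistd.h>
#include <rtl_posixio.h>

pthread_t thread;

void *thread_code(void *t)
{
        struct timespec next;
        int count = 0;

        /* start from the current time, then advance one period per pass */
        clock_gettime( CLOCK_REALTIME, &next );
        while ( 1 ) {
                timespec_add_ns( &next, 1000*1000 );    /* 1 ms period */
                clock_nanosleep( CLOCK_REALTIME, TIMER_ABSTIME,
                                 &next, NULL);
                count++;
                if (!(count % 1000))
                        printf("woke %d times\n",count);
        }
        return NULL;
}

int main(void)
{
        pthread_create( &thread, NULL, thread_code, (void *)0 );

        rtl_main_wait();        /* block until the application is told to exit */

        pthread_cancel( thread );
        pthread_join( thread, NULL );
        return 0;
}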
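#include <stdio.h>
#include <pthread.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/mman.h>
#include <rtl_posixio.h>

pthread_t thread;
int fd1;

void *thread_code(void *t)
{
        struct timespec next;

        clock_gettime( CLOCK_REALTIME, &next );
        while ( 1 ) {
                timespec_add_ns( &next, 1000*1000*1000 );       /* 1 s period */
                clock_nanosleep( CLOCK_REALTIME, TIMER_ABSTIME,
                                 &next, NULL);
                /* message text is illustrative */
                write( fd1, "a message\n", 10 );
        }
        return NULL;
}

int main(void)
{
        mkfifo( "/communicator", 0666);
        fd1 = open( "/communicator", O_RDWR | O_NONBLOCK );    /* flags assumed */
        ftruncate(fd1, 16<<10);         /* FIFO depth: 16 KB */
        pthread_create( &thread, NULL, thread_code, (void *)0 );
        rtl_main_wait();
        pthread_cancel( thread );
        pthread_join( thread, NULL );
        close( fd1 );
        unlink( "/communicator" );
        return 0;
}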
This code starts up and creates the FIFO with standard POSIX calls.
mkfifo() creates the FIFO with permissions such that a device will appear
in the GPOS filesystem dynamically. We then open the file normally and
call ftruncate() to size it - this sets the 'depth' of the FIFO.
A thread is spun up, we wait to be killed, and the main code is done. Once
rtl_main_wait() completes, we need to close/unlink the FIFO in addition
to the thread cleanup, just like any normal file. RTCore will catch dangling
devices and clean them up for a user, but good programming practice is to
do the work right in the first place.
Our thread in this instance sleeps on a one second interval and writes
to the FIFO every time it wakes up. As before, it will do this indefinitely.
There are no real surprises here, so let’s look at the userspace code:
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
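#include <stdio.h>
#include <pthread.h>
#include <unistd.h>
#include <rtl_posixio.h>
#include <posix/semaphore.h>
/* thread and function names below are assumed */
pthread_t wait_thread, post_thread;
sem_t sema;

void *wait_code(void *t)
{
        while (1) {
                sem_wait(&sema);
                printf("Waiter woke on a post\n");
        }
}

void *post_code(void *t)
{
        struct timespec next;
        clock_gettime( CLOCK_REALTIME, &next );
        while ( 1 ) {
                timespec_add_ns( &next, 1000*1000*1000 );
                clock_nanosleep( CLOCK_REALTIME, TIMER_ABSTIME, &next, NULL);
                printf("Posting to the semaphore\n");   /* message text assumed */
                sem_post(&sema);
        }
        return NULL;
}

int main(void)
{
        sem_init(&sema, 1, 0);
        pthread_create( &wait_thread, NULL, wait_code, (void *)0 );
        pthread_create( &post_thread, NULL, post_code, (void *)0 );
        rtl_main_wait();
        pthread_cancel( post_thread );
        pthread_join( post_thread, NULL );
        pthread_cancel( wait_thread );
        pthread_join( wait_thread, NULL );
        sem_destroy(&sema);
        return 0;
}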
Instead of a single thread, two are spun up, once the semaphore is ini-
tialized. One thread waits on the semaphore, while the other sleeps and
periodically performs the sem post() operation. Before the post occurs, and
after the waiter wakes, a message is printed to indicate the sequence of events.
Semaphores really are that easy - we’ll see how they can be used later on
to very easily handle synchronization problems.
2.3 Perspective
At this point, take a step back and look at what we’ve just covered. In a
short introduction, you’ve seen code that performs standard output, POSIX
threads, communication through real-time devices, and synchronization through
standard POSIX semaphores. None of it required much experience beyond
basic knowledge of C and POSIX, and a little bit of UNIX background. In
fact, these applications are no different than what you would see in a normal
C environment under another UNIX. The difference here is that you get hard
real-time response in your threads.
The point of this was to get you, the reader, handling useful code as
quickly as possible, easing the stigma surrounding real-time programming.
Now that you see that it doesn’t involve occult knowledge, we’ll step back
and take a broader view of RTCore, the API and some of the grounding
principles of real-time programming.
Chapter 3
Real-time Concepts and RTCore
You’ve now seen some basic RTCore code, and can see that real-time pro-
gramming isn’t as mystifying as it sounds. However, before we dive into
detailed coverage of the API, some basic concepts need to be demonstrated.
For those familiar with RTOS concepts, most of this chapter should be re-
view, but skimming is recommended as we will be explaining how RTCore
handles real-time problems.
A system that can fulfill these criteria is fully deterministic and considered
to be "hard" real-time. Of course, there are varying levels of service, as some
hard real-time systems might have a worst case jitter of 2 seconds, while
others provide 25 microseconds. Both qualify according to the definition, but
only one is usable for a wide range of applications. The RTCore approach
qualifies on all of these counts, and the response time tends to be near the
limits of the underlying hardware.
Hard real-time systems will generally have slightly lower average perfor-
mance than soft real-time systems, which in turn are generally not as efficient
with resources as non-real-time systems. This is because non-real-time sys-
tems are concerned with throughput - if an Ethernet transfer is delayed a
little in order to burst out several disk transfers, this results in higher system
output, and has no significant repercussions in a non-real-time environment.
In a hard real-time system, not performing this optimization results in lower
overall throughput, but it maintains determinism. This determinism is what
makes the difference between getting your task done without fail and doing
a "best effort" based on available system resources.
• Since the core system is an RTOS, the vast amount of available software
cannot (in most cases) be used without modification or at least signifi-
cant analysis with respect to real-time demands. It is almost impossible
to determine interactions between the software and the RTOS.
• Because such systems are very complex (and often not well-documented),
it is extremely difficult to reliably achieve full preemption, especially
without performance degradation in some usage scenarios. Add in the
fact that the system is constantly developing, and the problem worsens.
• It runs full featured non-real-time Linux (or BSD) as the lowest priority
task. It is the "idle task" of the RTOS, meaning that it is run whenever
the real-time system has nothing else to execute.
• Soft-IRQs: We will discuss this in detail later, but this approach in-
volves creating a virtual IRQ that is visible to the GPOS. Using this
IRQ, real-time code can signal to the non-real-time system that it needs
memory, and a handler on the other side safely takes care of the possible
blocking when allocating the memory. When this operation completes,
a signal is sent back to the RTCore code. There may be any amount
of delay before this handler gets to do the work, though.
retain statistical information on how your machine floor devices are doing.
Later on, we’ll cover the capabilities of this package in detail.
This will create a thread whose handle is stored in *thread. The thread’s
execution will begin in the start_routine() function with the argument
arg. Attributes controlling the thread are specified by attr; if this value is
NULL, the default values are used and a stack is created internally.
Note that pthread_create() calls are generally limited to being within
the initialization context of main(). If the call is needed during normal
real-time operation, threads can be created with preallocated stack space.
Otherwise, calling pthread_create() from another real-time thread would at
the worst cause deadlock, and at best delay the first real-time thread an
unknown amount while memory is allocated for the stack.
There is an attribute function (pthread_attr_setstackaddr()) that allows
a thread to be prepared with a preallocated stack for operation. Let’s
look at a simple example:
#include <rtl.h>
#include <time.h>
#include <pthread.h>
#include <stdio.h>
return 0;
pthread_cancel(thread1);
pthread_join(thread1, NULL);
pthread_cancel(thread2);
pthread_join(thread2, NULL);
rtl_gpos_free(thread_stack);
}
This again demonstrates the point that anything outside of the main()
call cannot directly allocate memory. Instead, we allocate a stack with
rtl_gpos_malloc() (footnote 2) in main(), where it is safe to block while the
system handles any work, such as defragmentation, associated with the allocation.
This must be allocated, as on some architectures a global static value may
not be a safe place to store the stack of a running thread.
Next, a real-time thread is spawned. Within the handler function, it
initializes an attribute and configures it to use our preallocated area for the
stack. Finally, we spawn the thread and execution occurs just as you would
expect POSIX calls to behave, with the exception being that the stack is
already present. Note: A thread created with pthread_create() is not
guaranteed to be started when the call returns; it is just slated for scheduling.
Note that thread stacks in RTCore are static, and will not grow as needed
depending on call sequence. Users need to make sure that they create enough
stack space for the thread, and prevent too many large structures from being
placed on the stack. In a system that allows for dynamic memory management
and the possible delays incurred by doing so, stacks can dynamically
grow as the application needs space. Under RTCore, growing the stack would
require the program to wait while proper memory is found, possibly destroying
real-time performance. Instead, the stack is allocated at thread creation
and does not grow.
2 rtl_gpos_malloc() uses the correct malloc() available on the host GPOS.
This stack can generally be only a couple dozen kilobytes in size, but
users with large data structures in function contexts need to understand
that these structures can soak up available stack space very quickly, causing
an overflow. If a thread has a 20K stack and its call chain nests three deep
through a function with a 7K local structure, the three frames alone need
21K and an overflow will occur.
Smaller structures should be used, or large structures should be kept off the
stack, or the thread’s stack should be enlarged to compensate.
Lastly, RTCore uses sizeof(struct rtl_thread_struct) bytes on the
bottom of the stack for thread information. For nearly all users, this
difference is negligible, and is not likely to be noticed. But if you need to manage
your stack down to the last byte, it is recommended that you allocate your
needed stack size plus the size of the structure to be safe. So instead of:
rtl_gpos_malloc(32768);
use:
rtl_gpos_malloc(32768 + sizeof(struct rtl_thread_struct));
pthread_kill() sends the signal specified by signo to the specified
thread. This is fast and deterministic if called on a thread running on the local
CPU, but there can be a delay when signalling a thread on a remote CPU.
Programmers can use these calls to manipulate the stack size of the thread
the attribute is tied to. Note that this must be done within the main()
context, where memory management is possible. Refer back to our example
in Section 4.2.1 for details, both on this and the pthread_attr_setstackaddr()
call. If these attributes are not set, the RTCore OS will handle the stack
manipulation internally.
Again, note that thread stacks under RTCore are static, and will not
grow as needed based on what functions are called. Users need to ensure
that they have enough stack space for their thread from the start. Section
4.2.1 has more details.
These calls are important when creating threads from within the real-time
kernel. As there is no memory management, threads need to be spawned us-
ing preallocated memory areas. By using these calls to manage the stack
address, one can create threads from inside the real-time kernel. We’ve al-
ready seen this used in the thread creation example, and as you can see, it
is not difficult to manage.
Use these two calls to switch a thread’s joinable state from
PTHREAD_CREATE_JOINABLE to PTHREAD_CREATE_DETACHED. Alternatively,
the pthread_detach() call can be used to alter a running thread’s state.
4.3 Synchronization
4.3.1 POSIX spinlocks
RTCore provides support for the POSIX spinlock functions too. The API is
much like other POSIX objects - there is an initialization/destruction set:
pthread_spin_init(pthread_spinlock_t *lock, int pshared);
pthread_spin_destroy(pthread_spinlock_t *lock);
and a set of locking calls:
pthread_spin_lock(pthread_spinlock_t *lock);
pthread_spin_trylock(pthread_spinlock_t *lock);
pthread_spin_unlock(pthread_spinlock_t *lock);
Again, no surprises - these calls allow you to take a lock, try to take it but
return if the lock is already held, and unlock a given spinlock, respectively.
The spinlocks themselves behave just like a normal spinlock - they will spin
a given thread in a busy loop waiting for the resource, rather than putting
it on a wait queue to be woken up later.
As such, the same spinlock caveats apply - they are generally only prefer-
able to another synchronization method when the given thread will spin a
shorter amount of time waiting than the sum of the work involved in putting
it on a queue (and any associated locking), and waking it up appropriately
when the resource becomes available. In a real-time system, it is also of
course important that the resource is available quickly so the thread does
not lose determinism due to a faulty locking chain in other subsystems.
mean calls that may leave the system in an unknown state if the call is in-
terrupted in the middle of execution. An example would be a function that
locks several mutexes in order to do work, and installs no cleanup handlers.
If the call is halfway through and is cancelled by a remote pthread_cancel()
call, that thread will exit while holding some mutexes, potentially blocking
other threads indefinitely.
It is possible to handle mutex cleanups in a safe manner if one pushes
cleanup handlers for all shared resources, but this is complicated and dan-
gerous. Extreme care must be taken to ensure that held resources are freed
in a manner that doesn’t incur locking, and that everything is cleaned prop-
erly for every possible means of failure. Failing to get this absolutely right
will leave all waiting threads blocked forever, as the cancelled thread will
terminate with locked mutexes left behind.
#include <rtl.h>
#include <time.h>
#include <unistd.h>
#include <pthread.h>
#include <stdio.h>
pthread_t thread;
pthread_mutex_t mutex;
application with CTRL-C at the command line, it induces the cancel and
cleanup handler, causing a proper exit.
Note again the concept of a cancellation point - if the code pushes the can-
cel handler on, but the thread is cancelled asynchronously before it actually
locks the mutex, the thread will continue to run until it enters a cancellation
point. It will continue to execute, running through the code after the cleanup
handler push but before the mutex lock. Once it locks the mutex, the thread is
cancellable, the signal will be delivered, and the handler will be called from a
known point. Think of cancellation points as being places where the system
checks to see if it should stop and clean up.
Consider this case without the cleanup handler, even where the code
wasn’t infinitely blocked. Once the thread locks the mutex, and another
process asynchronously cancels the thread, the thread will still wait for a
cancellation point, but without the handler, it will exit with the mutex held,
and any other code that depends on it will be blocked indefinitely. Now
imagine what happens if you have multiple resources held at various times,
depending on the call chain. Any lockable resource that isn’t attached to a
cleanup handler properly can cause a deadlock if the holding thread
is cancelled.
As you can see, while there are mechanisms to avoid cancellation prob-
lems, extreme care must be taken to make sure that everything is handled
properly. Failure to do so in every possible cancel situation will result in
system deadlock. With a real-time system, this can be disastrous, and it is
for this reason we’ve taken this time to demonstrate how careful one must
be.
4.4 Mutexes
The POSIX-style mutexes are also available to real-time programmers as a
means of controlling access to shared resources. As timing is critical, it is
important that mutexes are handled in such a way that blocking will not
impede correct operation of the real-time application.
As with the standard POSIX call, this locks a mutex, allowing the caller
to know that it is safe to work on whatever resources the mutex protects. In
a real-time context, the time a mutex is held must be short, as long hold
times could cause serious delays in other waiting threads.
The pthread_mutex_trylock() call will attempt to lock a mutex, and
will return immediately, whether it gets the lock or not. Based on the return
value, one can tell whether the lock is held, and take appropriate action. For
some applications that may not be able to wait for a lock indefinitely, this is
a way to avoid long delays.
As with normal POSIX calls, this initializes a given mutex. If a
pthread_mutexattr_t is provided, it will use it; otherwise a default attribute
set will be created and attached. The second call of course destroys an
existing mutex, assuming that it is in a proper state and not already locked.
Destroying a mutex that is in use will result in an error to the caller.
This is used to initialize a given mutex attribute with the normal values,
or destroy an already existing attribute.
int pthread_mutexattr_settype(
pthread_mutexattr_t *attr,
int type);
int pthread_mutexattr_gettype(
pthread_mutexattr_t *attr,
int *type);
This call allows you to set the type of mutex used. For example, the
type can be either PTHREAD_MUTEX_NORMAL, which implies normal
mutex blocking, or PTHREAD_MUTEX_SPINLOCK_NP, which will force
the mutex to use spinlock semantics when attempting to grab a lock. The
second call will return the type previously set, or the default value.
int pthread_mutex_setprioceiling(
pthread_mutex_t *mutex,
int prioceiling,
int *old_ceiling);
int pthread_mutex_getprioceiling(
const pthread_mutex_t *mutex,
int *prioceiling);
This call sets the priority ceiling for the given mutex, returning the old
value in old_ceiling. This call blocks until the mutex can be locked for
modification. The second call returns the current ceiling. More detail on
priority ceilings will follow later on.
A condition variable must be created and destroyed just like any other
object. Note that there is an attribute object that is specific to condition
variables, and can be used to drive the behavior of the variable.
As with pthread_cond_wait(), this call waits for a condition to happen,
locked by a mutex. In this version, however, it will only wait the amount
of time specified by abstime. Based on the return value, the caller can
determine whether the call succeeded and the condition occurred, or if time
ran out.
4.6 Semaphores
Again, RTCore semaphores look just like POSIX semaphores. As with
condition variables, if a thread is cancelled while holding a process-shared
semaphore, that semaphore will never be released, and consequently a deadlock
situation can occur. It is the programmer’s responsibility to ensure that
semaphores are handled properly in cleanup handlers.
Signals that interrupt sem_wait() and sem_post() will terminate these
functions, so that neither acquiring nor releasing the semaphore is
accomplished. The function call interrupted by a signal will return with value
EINTR.
This function will store the current value of the semaphore in the sval
variable.
sem_post() increases the count of the semaphore, and never blocks,
although it may induce an immediate switch if posting to a semaphore that a
higher priority thread is waiting for.
These are the calls used to force a wait until the semaphore reaches a
non-zero count, and operate in the same way the mutex wait calls do. The
sem_wait() call blocks the caller until a non-zero count is reached, and the
sem_trywait() call does the same without blocking, returning EAGAIN if the
count was 0. The sem_timedwait() call blocks up to the time
specified by abs_timeout.
With this configuration, RTCore will prepare the thread 7us in advance,
and then release it exactly when the required period hits. It will take away
from available CPU throughput as the chip will be spinning in that prepara-
tion window, but it will provide jitterless timing.
Users with very specific timing requirements may want to disable inter-
rupts during this, so that an interrupt that comes in as the thread is being
prepared does not induce even the slightest jitter to the thread.
deviate from that time by more than 13 microseconds. If the hardware, un-
der load, deviates by 10 microseconds, and the RTCore scheduling takes 3,
and the context switch time takes 2, you are already outside of the allowable
range.
For some users, the application might not be too cost-sensitive, and it is
just a matter of getting faster hardware. But for low cost systems, or where
there is no faster hardware, RTCore offers the advance timer. This allows
you to compensate for things like scheduling and hardware jitter.
The advance timer works as follows: When you make a call to clock_nanosleep()
to sleep until your next scheduling period, RTCore allows an extra flag and
structure to perform early scheduling. So instead of:
a CPU such that the GPOS cannot run on that CPU. The benefit is that
the real-time code can then in many cases live entirely in cache, and achieve
more deterministic results at high speeds, as the GPOS cannot run on that
CPU and disturb the cache usage.
Tests on larger scale systems with significant bus traffic indicate that
reserve CPU capabilities can reduce jitter by an order of magnitude.
Affinity on NetBSD
RTCore on NetBSD also allows users to pin userspace processes to a specific
CPU. (Generally, they can be scheduled to run on any CPU.) To pin a
process, user programs must include the rtl_pthread.h header file, and
call:
rtl_setproc_affinity(cpu_num);
More concepts
So far, we’ve looked at some basic real-time concepts, introduced some ex-
amples, and walked through the basics of the API. Given that the API is
POSIX, much of the learning curve is gone, and we can now hop back into
some general programming practices and concepts, and how they work in
RTCore. Let’s start off with some basic practices:
5.4 Deadlocks
When using synchronization primitives, it is the programmer’s responsibility
to ensure either that all shared resources are correctly freed if asynchronous
signals are enabled, or that these signals are blocked. Make sure to use thread
cleanup handlers to safely free resources if the thread is cancelled while
holding a resource.
it is best to allocate it during initialization and then allocate pieces from that
pool by hand during execution.
The reason for this approach is simple - bounded time allocation in a
general purpose memory allocator is difficult to prove. For non-real-time
applications, a generic allocator is fine - the user calls malloc(), and the call
may return immediately if a chunk is available, or it may block indefinitely.
On an active system, it is entirely possible that memory may be extremely
fragmented, and the allocator might have to do a lot of work in order to
defragment existing pieces enough that the user request can be handled. In
a real-time system, this may mean that your thread is indefinitely blocked.
Users can allocate a large chunk during initialization (in main()), where
the code does not have real-time demands. This will ensure that a pool is
around for real-time use. If the usage pattern of the pool is known, a simple
allocator/deallocator could be implemented on top of this pool that would
allow for memory management calls in bounded time.
Bounded-time allocators will return with an answer in a specific time
frame, but may not use memory as efficiently as they could, while generic
allocators will optionally take extra time to use every last bit of memory. Some
bounded-time allocation mechanisms and algorithms do exist, and FSMLabs
is evaluating and testing some of these options. Future releases of RTCore
will likely include a bounded time allocator of some type as a convenience to
users.
5.7 Synchronization
This is possibly the most important concept in real-time systems engineering.
While synchronization is important as a protection mechanism in normal,
non-real-time threaded applications, it can make or break a real-time system.
In a normal application, a waiting thread will do just that - wait. In a hard
real-time system, a waiting thread might mean that a fuel pump isn’t being
properly regulated, as it is waiting on a mutex that another thread has held
too long.
protection will save the system from a programmer who doesn’t understand
or care what locks are held when in a real-time system. In fact, the presence
of it may result in carelessness on the part of the developer.
RTCore offers the standard POSIX synchronization methods, such as
semaphores, mutexes, and spinlocks, but also focuses on other, higher per-
formance synchronization methods. In fact, much of RTCore is designed
in such a way that synchronization is not necessary, or is very lightweight.
Heavy synchronization methods such as spinlocks can disable interrupts and
interfere with other activity in the system. Lighter mechanisms such as
atomic operations create very little bus traffic and have a minimal impact.
Of course, an entirely lock-free mechanism is even better, if possible.
An example of this is the RTCore POSIX I/O subsystem. The original
Free versions were very fast but had no locking mechanisms whatsoever.
While the performance was good, it didn’t hold up to industrial use. It
needed proper locking in order to traverse the layers properly. The layer also
needed to stay as fast as it was before - users want it to be fast and safe. A
simple and effective method would be to put mutexes around each contended
piece, locking and unlocking as needed.
While simple, this would severely slow the system down, as mutexes in-
volve waiting on queues, switching threads while others complete, and so on.
Instead, FSMLabs added a light algorithm based on atomic operations. (These
are covered in more detail in section 5.7.3.) As requests come in to
add a device name to the pool, atomic operations such as rtl_xchg are
used to grab pieces of the pool. This avoids disabling interrupts, and allows
other threads to use other pieces of the pool at the same time.
Some other restructuring was also used, resulting in a more flexible and
safe architecture that was just as fast as it had always been. Other systems
require different approaches, from heavier synchronization to none at all, but
it is very important that the correct method is chosen, not just one that
works.
Now that we’ve briefly covered the topic (synchronization is a very broad
topic in real-time systems), let’s look at a specific example of a light syn-
chronization method in RTCore.
#include <rtl.h>
#include <time.h>
#include <stdio.h>
#include <unistd.h>
#include <pthread.h>
This requires some explanation, as the syntax hides much of the work.
There are two threads, spawned as normal, where one enqueues data and the
other dequeues. Both are periodic, and as a quick method of preventing the
queue from overflowing, the dequeueing thread defines a period half as long
as the enqueueing thread. Half of the dequeue calls result in an empty queue
being found, but this is acceptable for our purposes.
Now let’s break down the interesting part into discrete steps, starting
with the initial declarations.
Declarations
We need to define a queue for data to flow between the threads. The syntax
defines two steps: Let’s look at just step 1 first:
DEFINE_OWQTYPE(our_queue,32,int,0,-1);
This first step creates a datatype for our queue. Think of this as the
backing for the queue operations - it defines the queue, its properties, and
structure. Parameter 1 is the name that will be provided so that we can
instantiate the queue itself, and parameter 2 defines the length of the queue.
Parameter 3 defines the type of unit the queue is made of - here we use an
integer as the base element, but we could have used pointers or anything
else. As the queue operations copy data into the queue, light units such as
pointers are favored over large structures. Parameters 4 and 5 are not used
at the moment.
We now have a queue structure named our_queue containing 32 elements,
each the size of an int. If you were passing characters or structures through
the queue, you would use char or struct x as parameter 3.
DEFINE_OWQFUNC(our_queue,32,int,0,-1);
our_queue Q;
Usage
Looking on to the thread code, you can see that the actual usage of the queue
is simple: One thread calls our_queue_enq(&Q,count), which is the enqueue
function created in step 2 above, using our defined structure Q, and pushing
a value of count into it. The other end calls our_queue_deq(&Q), which
returns a value of the correct type off of the queue for usage in the other
thread. Note that step 2 also defines a few other simple calls for the queue:
our_queue_full() to see if the queue is full, our_queue_init() to initialize a
queue structure, and our_queue_top(), which will return the current head of
the queue without removing it. (This also serves as an isempty() function.)
Queue interaction, as you can see, is very easy. It is also extremely fast,
and doesn’t require locking in most cases. The code is safe when one thread
enqueues while another thread dequeues at the same time, which is the
common case. The user needs to add an external lock only when two or more
threads are enqueueing data at the same time, or when two or more threads
are dequeueing data at the same time. Otherwise, no additional locking is
needed.
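To make the "one enqueuer, one dequeuer, no lock" property concrete, here is a minimal sketch of a single-producer/single-consumer ring buffer written with standard C11 atomics. This is not the code generated by DEFINE_OWQTYPE or DEFINE_OWQFUNC, whose internals are not shown in this book; the names spsc_queue, spsc_init(), spsc_enq() and spsc_deq() are hypothetical and only illustrate the general technique.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define QLEN 32  /* must be a power of two in this sketch */

struct spsc_queue {
    int slots[QLEN];
    atomic_uint head;  /* next slot to read; written only by the consumer */
    atomic_uint tail;  /* next slot to write; written only by the producer */
};

void spsc_init(struct spsc_queue *q)
{
    assert((QLEN & (QLEN - 1)) == 0);  /* index wraparound relies on this */
    atomic_init(&q->head, 0);
    atomic_init(&q->tail, 0);
}

/* Producer side. Returns false without waiting if the queue is full. */
bool spsc_enq(struct spsc_queue *q, int value)
{
    unsigned tail = atomic_load_explicit(&q->tail, memory_order_relaxed);
    unsigned head = atomic_load_explicit(&q->head, memory_order_acquire);

    if (tail - head == QLEN)
        return false;
    q->slots[tail % QLEN] = value;
    /* Publish the slot only after its contents have been written. */
    atomic_store_explicit(&q->tail, tail + 1, memory_order_release);
    return true;
}

/* Consumer side. Returns false without waiting if the queue is empty. */
bool spsc_deq(struct spsc_queue *q, int *out)
{
    unsigned head = atomic_load_explicit(&q->head, memory_order_relaxed);
    unsigned tail = atomic_load_explicit(&q->tail, memory_order_acquire);

    if (head == tail)
        return false;
    *out = q->slots[head % QLEN];
    /* Hand the slot back to the producer only after it has been read. */
    atomic_store_explicit(&q->head, head + 1, memory_order_release);
    return true;
}
```

Each index has exactly one writer, which is why no lock is needed. With two producers, both could read the same tail value and write the same slot, which is the situation where an external lock becomes necessary.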
These two calls atomically set or clear a bit within a word, respectively.
The first parameter specifies which bit should be changed, and the second
specifies the base address to be used for the operation.
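For readers who want to see the idea in portable terms, the following sketch mirrors the (bit number, address) parameter order described above using standard C11 atomics. The function names sketch_set_bit() and sketch_clear_bit() are hypothetical stand-ins, not the RTCore calls, and the sketch assumes the bit number stays within a single unsigned long.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>

#define WORD_BITS (sizeof(unsigned long) * CHAR_BIT)

/* Atomically set bit nr in the word at addr. */
void sketch_set_bit(unsigned nr, atomic_ulong *addr)
{
    assert(nr < WORD_BITS);
    atomic_fetch_or(addr, 1UL << nr);
}

/* Atomically clear bit nr in the word at addr. */
void sketch_clear_bit(unsigned nr, atomic_ulong *addr)
{
    assert(nr < WORD_BITS);
    atomic_fetch_and(addr, ~(1UL << nr));
}
```

Because each update is a single read-modify-write operation, two threads changing different bits of the same word cannot lose each other's update, and no lock or interrupt disabling is involved.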
Chapter 6
Communication between RTCore and the GPOS
The two components of a complete RTCore system, real-time and the user-
space, generally run in two separate, protected address spaces. The real-time
component lives in the RTCore kernel, while the rest of the code lives as a
normal process within the GPOS. In order to manage each side, there has
to be some kind of communication between the two. RTCore offers several
mechanisms to facilitate this.
6.1 printf()
printf() is probably the simplest means of communicating from a real time
thread down to non-real-time applications. When an RTCore application
starts up, it creates a ’stdout’ device to communicate to the calling envi-
ronment, usually a terminal device of some kind. Calls to printf() in the
real-time application appear in the calling terminal the same way a printf()
call would in a normal application. This allows you to log real-time output
the following way:
The printf() implementation is fully capable, and can handle any nor-
mal data type and format. It also is a lightly synchronized method compared
to some others we will present here, and very fast as a result, without im-
pacting other core activity.
mkfifo("/mydevice", 0777);
fd = open("/mydevice", O_NONBLOCK);
ftruncate(fd, 8192);
#include <rtl.h>
#include <time.h>
#include <pthread.h>
#include <unistd.h>
#include <rtl_fifo.h>
pthread_t thread;
while (status) {
        usleep(1000000);
        /* non-blocking read: only act when data actually arrived */
        ret = read(fd1, &read_int, sizeof(int));
        if (ret > 0) {
                printf("/mydev0: %d (%d)\n",
                       read_int, ret);
                write(fd2, &read_int, ret);
        }
        /* any data on fd0 is the signal to shut down */
        ret = read(fd0, &read_int, sizeof(int));
        if (ret > 0) status = 0;
}
return 0;
}
mkfifo("/mydev1", 0777);
fd1 = open("/mydev1", O_NONBLOCK);
ftruncate(fd1, 4096);
mkfifo("/mydev2", 0777);
fd2 = open("/mydev2", O_NONBLOCK);
ftruncate(fd2, 4096);
We’ve already seen most of this code in other examples, but this succinctly
shows you how to use POSIX I/O from within RTCore. As usual, we spawn
a real-time thread from within main(), but we first have to explicitly create,
open, and size our FIFOs with the proper amount of preallocated space.
In the thread, we read() from fd0 to see if it’s time to shut down, and
otherwise read() from fd1 and write received data to fd2. These calls are
non-blocking for a reason - if the real-time thread ended up waiting for a
GPOS application that rarely got scheduling time, it would not be determin-
istic. So in this case, we just sleep and attempt to read from the devices.
There isn’t much to look at on the user-space side, but for the sake of
completeness, here it is:
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
Looks pretty much like any other userspace application, doesn’t it? That’s
because it is. All we do is open the FIFOs, dump data over them, and read
it back. After we’re done, we write to a third FIFO to signal the real-time
thread that it’s time to shut down, and then we close the files. One minor
difference is that on this end, we didn’t open the devices as non-blocking,
although it can easily be done that way.
ftruncate(fd,32768);
1. You are running in the main() context, and not in a thread. This way,
if the call determines that there is no preallocated space to use for the
device, it is safe to block while the memory allocation work is handled.
2. You are in the real-time context, and RTCore is running with preallo-
cated buffers for your data. In this case, even if you never performed an
explicit O_CREAT, the open is safe, because RTCore has space set aside
for use by the FIFO. In this case, you will be forced to use the default
FIFO size that RTCore was built to use. This is a legacy option for the
/dev/rtf* devices and does not apply to arbitrarily named devices. It
also depends on specific compilation settings in RTCore, and as such,
using arbitrarily-named devices with proper sizing during initialization
is recommended.
6.3.5 Limitations
The real-time kernel is not bound to operate synchronously with the normal
operating system thread. If the real-time kernel is under heavy load, it may
not be able to schedule time for the GPOS to pull the data from the FIFO.
Since the FIFO is implemented as a buffer, it is feasible that the buffer
might fill from the real-time side before the user-space thread gets a chance
to catch up. In this case, it is advisable to increase the size of the buffer
(with ftruncate()) or to flush the buffer from the real-time code to prevent
the user-space application from receiving invalid data.
The inverse of this problem is that the FIFO cannot be a deterministic
means of getting command data to the real-time module. The real-time
kernel is not forced to run the GPOS thread with any regularity, as it may
have more important things to do. A command input from a graphical
interface on the OS side through the FIFO may not get across immediately,
and determinism should never be assumed.
A subtler problem that must be overcome by the programmer is that
the data passed through the FIFO is completely unstructured. This means
that if the real-time code pushes a structure into the FIFO with something
like write(fd,&x,sizeof(struct x));, the user-space code should pull it
out on the other side by reading the same amount of data into an identical
structure. There has to be some kind of coordination between the two in
order to determine a protocol for the data, as otherwise it will appear to
be a random stream of bits. For many applications, a simple structure will
suffice, possibly with a timestamp in order to determine when the data was
sampled and placed in the FIFO.
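One way to picture such a protocol is a fixed-size record that both sides agree on. The sketch below is only an assumption about what a record might look like: the struct sample_msg layout, the MSG_MAGIC marker, and the msg_pack() and msg_unpack() helpers are hypothetical and not part of RTCore. The writer would pass the packed buffer to write(), and the reader would pass the bytes it got from read() to msg_unpack().

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MSG_MAGIC 0x52544d31u  /* hypothetical marker value */

/* Fixed-width fields so both sides agree on the layout. */
struct sample_msg {
    uint32_t magic;         /* lets the reader detect a misaligned stream */
    uint32_t seq;           /* lets the reader detect dropped records */
    int64_t  timestamp_ns;  /* when the data was sampled */
    int32_t  value;         /* the sample itself */
    int32_t  reserved;      /* keeps the struct free of implicit padding */
};

/* Writer side: fills buf with one record, returns the byte count to write(). */
size_t msg_pack(unsigned char *buf, size_t buflen,
                uint32_t seq, int64_t timestamp_ns, int32_t value)
{
    struct sample_msg m = { MSG_MAGIC, seq, timestamp_ns, value, 0 };

    assert(buflen >= sizeof m);
    memcpy(buf, &m, sizeof m);
    return sizeof m;
}

/* Reader side: returns 0 on success, -1 on a short or unrecognized record. */
int msg_unpack(const unsigned char *buf, size_t len, struct sample_msg *out)
{
    if (len < sizeof *out)
        return -1;
    memcpy(out, buf, sizeof *out);
    return out->magic == MSG_MAGIC ? 0 : -1;
}
```

This sketch assumes that the real-time code and the user-space program are built for the same architecture, so that field sizes and byte order match on both ends.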
6.4.1 mmap()
If you are not familiar with mmap(), please refer to the RTCore or standard
man page for full details. The basic idea is that you open a file descriptor,
call mmap() on it with a given range, and it returns a pointer to an area in
this file or device. Under RTCore, this is used with a device. As we shall see,
the real-time module and the user-space application both open the same
device, call mmap(), and can subsequently access the same area of memory.
The shared memory devices themselves are created with the POSIX
shm_open(), destroyed with shm_unlink(), and sized with ftruncate().
Please refer to the man pages for specific details - only an overview will be
given here.
First, the device must be created. This is done with shm_open(), which
takes the name of the device, open flags, and optionally a set of permission
bits. If you are the first user and are creating the device, use RTL_O_CREAT.
Furthermore, if you want this device to be automatically visible in the GPOS
filesystem, specify a non-zero value for the permission bits. For example, the
following call creates a node named /dev/rtl_shm_region that is visible to
the GPOS with permission 0600, and returns a usable file descriptor attached
to the device:
ftruncate(shm_fd,400000);
Note that this will round up the size of the shared region in order to align
it on a page boundary (page size is dependent on architecture but generally
4096 bytes). Also, as it does perform memory allocation, it must occur in
the initialization segment. Now you can use mmap() from either real-time
code or user-space code, as in:
addr = (char*)mmap(0,MMAP_SIZE,PROT_READ|PROT_WRITE,
MAP_SHARED,shm_fd,0);
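The page rounding mentioned above is simple arithmetic, sketched here with a hypothetical helper named round_up_to_page() (not an RTCore call). Assuming a 4096-byte page, the 400000-byte request becomes 98 pages, or 401408 bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Round len up to the next multiple of page_size (a power of two). */
size_t round_up_to_page(size_t len, size_t page_size)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    return (len + page_size - 1) & ~(page_size - 1);
}
```

A mapping length that is already a multiple of the page size is left unchanged, so asking for exactly one page still yields one page.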
close(shm_fd);
shm_unlink("/dev/rtl_shm_region");
It is worth noting that these need not be in order: if a thread is still using
the area and another calls shm_unlink(), the region will remain valid until
the last user calls close() on the file descriptor. RTCore does reference
counting on devices like shared memory and FIFOs in order to allow this
behavior.
6.4.2 An Example
The theory and practice are very simple, so without further discussion, let’s
look at an example. First, the real-time application:
#include <rtl.h>
#include <time.h>
#include <fcntl.h>
#include <pthread.h>
#include <unistd.h>
#include <stdio.h>
#include <sys/mman.h>
p.sched_priority = 1;
pthread_setschedparam(pthread_self(), SCHED_FIFO, &p);
p.sched_priority = 1;
pthread_setschedparam(pthread_self(), SCHED_FIFO, &p);
while (1) {
        timespec_add_ns(&next, 1000000000);
        clock_nanosleep(CLOCK_REALTIME, TIMER_ABSTIME,
                        &next, NULL);
        printf("rtl_reader thread sees "
               "0x%x, 0x%x, 0x%x, 0x%x\n",
               raddr[0], raddr[1], raddr[2], raddr[3]);
}
}
ftruncate(wfd, MMAP_SIZE);
close(wfd);
close(rfd);
shm_unlink("/dev/rtl_mmap_test");
return 0;
}
First, we create and open a device twice, once for a reader thread and
once for a writer. A thread is spawned for each task, which actually performs
the mmap(). Note that the ftruncate() call is in the main() context, as it
needs to perform memory allocation to back the shared area. Further calls
such as mmap() that don’t cause allocations can happen anywhere.
The result of the mmap() call is a reference to the shared area, so once we
have the handles needed, we can reference the area freely. One thread updates
the area every second, and the other reads it. Now we have an area that
is shared between real-time threads, but what about userspace? The same
mechanism applies, as you can see here:
#include <stdio.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <stdlib.h>
#include <errno.h>
int main(void)
{
        int fd;
        unsigned char *addr;
        if ((fd = open("/dev/rtl_mmap_test", O_RDWR)) < 0) {
                perror("open");
                exit(-1);
        }
        addr = mmap(0, MMAP_SIZE, PROT_READ, MAP_SHARED, fd, 0);
        if (addr == MAP_FAILED) {
                printf("return was %d\n", errno);
                perror("mmap");
                exit(-1);
        }
        while (1) {
                printf("userspace: the rtl shared area contains"
                       " : 0x%x, 0x%x, 0x%x, 0x%x\n",
                       addr[0], addr[1], addr[2], addr[3]);
                sleep(1);
        }
        close(fd);
        return 0;
}
There isn’t much work involved here. The code opens the device as a
normal file and calls mmap() on it just as before. This piece of code performs
the same action as the reader in the real-time space, dumping the values of
the first few bytes of data every second or so. As the writer updates the area,
both the real-time reader and the user-space program see the same changes.
As with other RTCore mechanisms, it is assumed that the real-time side
does the initial work of creating the shared area. This ensures that the real-
time code has a handle on what exists, and doesn’t have to wait
for some user-space application to get around to doing the work first. If you
attempt to start the user-space code first, it will fail for multiple reasons:
first, the device isn’t there to be opened until shm_open() is called from
real-time code, and second, even if it is there, there are no registered hooks
for the device.
6.4.3 Limitations
With shared memory, there is no inherent coordination between userspace
and real-time, as you can see in the example. Any rules governing usage of
the area must be added by your code. At any point, user code can overwrite
one area that a real-time thread needed to retain data in. In addition, one
can’t write to the area from real-time and then wait for it to be read and
cleared when Linux gets time to schedule your user-space process. This would
delay your real-time code indefinitely.
A little bit of synchronization can solve this type of problem. For example,
if you are using the area to get frames of data over to user-space, the real-time
thread could write the blocks at a given interval across the shared space, and
prepend each segment with a status byte indicating the state of the data.
The user-space program, when it is done reading or analyzing each segment,
can update that status byte to show that it is in use. This way the real-time
side can easily tell what areas are safe to overwrite.
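The status-byte scheme described above might be sketched as follows. This is one possible reading of it, not an RTCore facility: the three states, the struct segment layout, and the seg_publish(), seg_claim() and seg_release() helpers are all hypothetical. The sketch also assumes that a one-byte C11 atomic is lock-free on the target, so that it behaves correctly when the segment lives in memory shared between the two sides.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <string.h>

#define SEG_PAYLOAD 255  /* payload bytes per segment (hypothetical) */

enum { SEG_FREE = 0, SEG_READY = 1, SEG_IN_USE = 2 };

struct segment {
    atomic_uchar status;              /* the prepended status byte */
    unsigned char data[SEG_PAYLOAD];
};

/* Real-time side: never waits. Skips the segment if it is not free. */
bool seg_publish(struct segment *s, const void *src, size_t len)
{
    assert(len <= SEG_PAYLOAD);
    if (atomic_load_explicit(&s->status, memory_order_acquire) != SEG_FREE)
        return false;
    memcpy(s->data, src, len);
    /* Mark the data ready only after it is fully written. */
    atomic_store_explicit(&s->status, SEG_READY, memory_order_release);
    return true;
}

/* User-space side: claim a ready segment before reading it. */
bool seg_claim(struct segment *s)
{
    unsigned char expected = SEG_READY;

    return atomic_compare_exchange_strong(&s->status, &expected, SEG_IN_USE);
}

/* User-space side: hand the segment back once the data has been consumed. */
void seg_release(struct segment *s)
{
    atomic_store_explicit(&s->status, SEG_FREE, memory_order_release);
}
```

The important property is that the real-time side only ever checks the byte and moves on. If user-space falls behind, the real-time thread can skip the segment or drop the frame, but it is never delayed.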
This by-hand coordination can also easily allow you to direct real-time
code from user-space. One simple use is to allow control of real-time threads.
If both ends know that a certain area is meant to direct the actions of a real-
time thread, userspace code can easily flip a bit and indicate that a certain
thread should be suspended, resumed, or even spawned. This can be used
to (non-deterministically) direct nearly anything that the real-time code is
doing, or vice-versa.
CPU0
0: 75636868 XT-PIC timer
1: 6 XT-PIC keyboard
2: 0 XT-PIC cascade
4: 106 XT-PIC serial
5: 157842206 XT-PIC eth0
8: 1 XT-PIC rtc
13: 1 XT-PIC fpu
14: 13637083 XT-PIC ide0
CPU0
0: 1398262 RTLinux virtual irq timer
1: 4 RTLinux virtual irq keyboard
2: 0 RTLinux virtual irq cascade
11: 4902708 RTLinux virtual irq usb-uhci, eth0
12: 0 RTLinux virtual irq PS/2 Mouse
14: 29546 RTLinux virtual irq ide0
15: 5 RTLinux virtual irq ide1
219: 12178 RTLinux virtual irq sofirq jitter test
220: 0 RTLinux virtual irq RTLinux Scheduler
221: 26 RTLinux virtual irq RTLinux FIFO
222: 1293626 RTLinux virtual irq RTLinux CLOCK_GPOS
223: 5124 RTLinux virtual irq RTLinux printf
NMI: 0
ERR: 0
actual execution. This is due to the same reason as before: The real-time
kernel may prevent the GPOS kernel from running for some time, depend-
ing on the current set of demands. However, for soft-real-time tasks, this is
generally a sufficient approach.
Again, it must be stressed that the GPOS is only seeing RTCore virtual
IRQs. The handlers the GPOS had installed before RTCore was loaded are
not affected but are now managed by the interrupt emulation layer, and thus
have become soft interrupts. This process of insertion is handled transpar-
ently to GPOS drivers.
This can be used to handle many inter-kernel communication mechanisms.
As previously discussed, rtl_printf() uses this mechanism to pass data to
the kernel ring buffer. It could also serve as a way for real-time code to
allocate memory, by signalling a GPOS handler to safely perform the memory
management asynchronously.
The string passed as the second argument to rtl_get_soft_irq() is the
string name that will be associated with the IRQ, which on Linux will be
displayed in /proc/interrupts. It is a good idea to make this something
meaningful, especially if you are making heavy use of the soft IRQ handlers.
The interrupt number assigned is the first free interrupt number from
the top down. As such, there is little risk this will ever collide with a real
hardware interrupt. rtl_get_soft_irq() will return -1 if there is a failure,
but should otherwise return the value of the IRQ registered.
To actually signal the interrupt to Linux, the function
rtl_global_pend_irq() is given the soft interrupt number. When the Linux
kernel runs, it will see this interrupt as pending and execute your Linux
handler.
The interrupt handler declaration is just like the one you would use for a
regular Linux interrupt handler:
6.5.2 An Example
This section wouldn’t be complete without a simple example. The soft IRQ
API is fairly small, so let’s look at a piece of code that uses all of the calls:
#include <rtl.h>
#include <time.h>
#include <stdio.h>
#include <pthread.h>
pthread_t thread;
static int our_soft_irq;
p.sched_priority = 1;
pthread_setschedparam(pthread_self(),
                      SCHED_FIFO, &p);
soft_irq_handler(), which in turn dumps the current interrupt count via
printk(). On exit, the cleanup_module() destroys our real-time thread as
usual, and then deregisters the soft IRQ handler.
As you can see, it isn’t very hard to interact with the Linux kernel in this
fashion. By simply pending interrupts, you can trigger your own handlers to
do some dirty work in the GPOS kernel, without sacrificing determinism in
your real-time code.
Chapter 7
Debugging in RTCore
No one likes to admit it, but most developers spend a large chunk of time
debugging code, rather than writing it. Bugs in RTCore can be even more
difficult to trace down: by inserting any debug traces or other mechanisms,
the system is changed, and all of a sudden the bug won’t trigger. (Timing de-
pendent bugs are of course possible in other systems, but are more prevalent
in real-time development.)
Additionally, all real-time code, if it is running inside the RTCore kernel,
has the potential to halt the machine (PSDD threads live in external address
spaces). Debugging userspace applications is simpler, as a failure will simply
result in the death of the process, not the kernel. Trying to tackle the bug
is usually just a matter of cleaning up and trying the program again. These
luxuries are harder to come by in the kernel.
Fortunately, RTCore provides a debugger that can often prevent pro-
gramming errors from bringing the system down. Loaded with the rest of
RTCore (it can be disabled through recompilation with source), the RTCore
debugger watches for exceptions of any kind, and stops the thread that caused
the problem before the system goes down.
7.2 An example
There are some important things to know about the debugger, but before
getting into the details, let’s walk through a simple example to describe
exactly what we are talking about. As with anything else, the first step is a
hello world application:
pthread_t thread;
pthread_t thread2;
void *start_routine(void *arg)
{
        int i;
        struct sched_param p;
        struct timespec next;
        p.sched_priority = 1;
        pthread_setschedparam(pthread_self(), SCHED_FIFO, &p);
        if (((long) arg) == 1) {
                /* cause a memory access error */
                *(unsigned long *)0 = 0x9;
        }
has hit an exception, we can run GDB on the object file that was saved for
us during compilation for debugging:
# gdb hello.o.debug
(gdb)
The next step is to connect GDB to the real-time system. This is accom-
plished using the remote debugging facility of GDB. The real-time system
provides a real-time FIFO for debugger communication:
The above message tells us that we are indeed debugging through /dev/rtf10,
that the thread ID that faulted is 1123450880, and that the fault was in the
function start_routine, which was passed 1 argument named arg with value
0x1. This is all contained in source file test.c on source line 25. GDB also
displays the actual source line in question. The error that was generated was
indeed where we placed it.
Now, we examine the function call history. This may be necessary in
complex applications in order to determine the source of an error. Typing
bt will cause GDB to print the stack backtrace that led to this point.
(gdb) bt
#0 start_routine (arg=0x1) at test.c:25
#1 0xd1153227 in ?? ()
Perhaps it is not clear what type of variables are being operated on. If
you wish to examine the type and values of some variables use the following
commands:
To get a better idea of what other operations are being performed in this
function one can list source code for any function name or any set of line
numbers with:
Once you are done debugging, you may exit the debugger and stop exe-
cution of the process being debugged.
(gdb) quit
The program is running. Exit anyway? (y or n) y
RTCore will resume execution of all threads but will leave the application
that was being debugged stopped. To actually remove the application or
module you must stop it through the means that you are used to - either by
sending it a signal to stop it (perhaps by typing control-c in the window) or
removing the application module.
7.3 Notes
There are a few items to keep in mind when using the RTCore debugger.
Most of these items are short but important, so keep them in mind in order
to make your debugging sessions more effective.
7.3.1 Overhead
The debugger module, when loaded, catches all exceptions raised, regardless
of whether they are related to real-time code, the GPOS, or otherwise. This
incurs some overhead: Consider for example the case where a userspace program
causes several page faults as it is working through some data. These page
faults cause the debugger to do at least some minor work to see if the fault
is real-time related. This may lead to a slight degradation of the GPOS
performance, so if the GPOS really needs some extra processing, the debugger
module may be removed. In practice, however, the benefits of having
protection against misbehaving RT programs usually outweigh the overhead
incurred by the debugger.
For those who wish to avoid this overhead, the source version of RTCore
allows you to reconfigure the OS without the debugger for production use.
This starts netcat on the device, listening on port 5000, feeding data from
the network listener into the FIFO, and also pushing data coming out of the
FIFO out onto that same listener. In GDB running on the development
host, you can connect to the remote real-time system with target remote
targethost:5000, where targethost is the target machine name.
(gdb) CTRL-Z
[1]+ Stopped gdb
# killall app_name
# kill %1
Make sure to not trigger any GDB commands that would cause the real-
time code to continue, as it would just execute the faulty code again.
gdb hello.o
(gdb) symbol-file /var/run/rtlmod/ksyms
(gdb) target remote /dev/rtf10
Chapter 8
Tracing in RTCore
8.1 Introduction
Real-time programs can be challenging to debug because traditional debug-
ging techniques such as setting breakpoints and step-by-step execution are
not always appropriate. This is mainly due to two reasons:
• Some errors are in the timing of the system. Stopping the program
changes the timing, so the system cannot be analyzed without modi-
fying its behavior.
• RTL_TRACE_RESUME – The system has recovered from an overflow con-
dition.
• RTL_TRACE_SCHED_CTX_SWITCH – context switch. The accompanying data
is a void * pointer to the new thread.
• RTL_TRACE_CLOCK_NANOSLEEP – the thread invoked the clock_nanosleep
call.
• RTL_TRACE_BLOCK – the thread voluntarily blocks itself (e.g., as a result
of a clock_nanosleep call).
The events may be selectively enabled for tracing with the
rtl_trace_set_filter function. For best performance, it is advisable to
disable unneeded event types.
It is possible to perform function call tracing with the help of the tracer. To do
this, the program to be analyzed must be compiled with the -finstrument-functions
option to gcc. For an example, please see examples/tracer/testmod.c in
the RTCore distribution. For modules compiled with -finstrument-functions,
two special events are generated:
Chapter 9
IRQ Control
Once RTCore is loaded, the GPOS does not have any direct control over
hardware IRQs - manipulation is handled through RTCore when there are
no real-time demands. However, RTCore applications can manipulate IRQs
for real-time control. We’ll now cover the basic usage of the IRQ control
routines.
This will hook the function passed as the second argument to rtl_request_irq
to be called when IRQ irq_num occurs, much like any other IRQ handler.
When that function is invoked, it will run in interrupt context. This means
that some functions may not be callable from the handler and all interrupts
will be disabled. This handler is not debuggable directly, but threads are,
and it is safe to post to a semaphore that a thread is waiting on. The thread
will be switched to immediately so that operations can be performed in a
real-time thread context. Upon execution of any operation in this thread that
causes a thread switch, control will return to the interrupt handler.
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <semaphore.h>
pthread_t thread;
sem_t irqsem;
int irq = -1;
while (1) {
        sem_wait(&irqsem);
        count++;
        printf("IRQ %d has occurred %d times\n",
               irq, count);
}
return NULL;
}
out:
rtl_pthread_cancel(thread);
rtl_pthread_join(thread, NULL);
return ret;
}
This code initializes and pulls the requested IRQ for tracking from the
passed-in arguments. It then spawns a thread that waits on a semaphore -
this thread will be printing out the IRQ count as they occur. As mentioned,
the handler will be invoked in interrupt context, and as such is fairly limited
in what it can do. Instead, the handler is hooked up but does no real work
except for the rtl_global_pend_irq() for the GPOS and sem_post() for
the thread.
As with the other examples, this can continue indefinitely. If it is hooked
to the interrupt for a hard drive, it will trigger a message with a count for
each interrupt triggered by the device. When the application is stopped with
a CTRL-C, it will release the IRQ handler, kill the thread, and unload as
usual. The GPOS IRQ handler will then be the only handler for the device.
void function(void)
{
rtl_irqstate_t flags;
These calls do disable the real interrupts, so they must be used with
care. Interrupts should never be disabled longer than absolutely necessary,
as events may be missed. The system may also run out of control if the ap-
plication never re-enables the interrupts. However, some applications cannot
handle any kind of jitter during certain operations, even the minimal over-
head of receiving an Ethernet IRQ, and must disable all interrupts for short
periods.
While this is a simple mechanism for synchronization, it cannot be stressed
enough that lighter mechanisms that do not disable interrupts are almost al-
ways favorable. Even if you think that the code protected with disabled
interrupts is not on an important path, it may be running on the same hard-
ware with another application that cannot tolerate that kind of behavior.
Please see section 5.7.3 for more details.
Specific IRQs can be enabled or disabled with rtl_hard_enable_irq(irq_num)
and rtl_hard_disable_irq(irq_num) respectively. This allows the user to
target a specific IRQ rather than the entire set.
9.3 Spinlocks
pthread_spin_lock includes an implicit save of the interrupt state followed
by an interrupt disable, and pthread_spin_unlock includes an implicit restore
of the interrupt state saved at the time of the corresponding
pthread_spin_lock call.
This can be a problem in cases where the locks are released in a different
order from when they were taken. For example:
void function(void)
{
        rtl_pthread_spinlock_t lock1, lock2;
        /* acquire lock 1 */
        rtl_pthread_spin_lock(&lock1);
        /* ...interrupts are now disabled here... */
        /* acquire lock 2 */
        rtl_pthread_spin_lock(&lock2);
        /* the state saved in lock2 is interrupts "disabled" */
        /* release lock 1 */
        rtl_pthread_spin_unlock(&lock1);
        /* interrupts are now enabled, since the state saved in lock1
           was "enabled", even though lock2 is still held */
        /* release lock 2 */
        rtl_pthread_spin_unlock(&lock2);
        /* restores an interrupt-disabled state */
}
Note that state restore when releasing lock1 and lock2 is incorrect since
the locks were acquired in a different order than they were released in.
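The effect can be reproduced without any hardware by modelling the interrupt flag as a plain variable. The following sketch is a simulation only: sim_spin_lock() and sim_spin_unlock() are hypothetical stand-ins that copy the save-and-restore behavior described above, and they are not the RTCore spinlock implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

static bool irqs_enabled = true;  /* simulated CPU interrupt flag */

struct sim_lock {
    bool saved_state;  /* interrupt flag at the time the lock was taken */
};

void sim_spin_lock(struct sim_lock *l)
{
    assert(l != NULL);
    l->saved_state = irqs_enabled;  /* implicit save */
    irqs_enabled = false;           /* implicit disable */
}

void sim_spin_unlock(struct sim_lock *l)
{
    assert(l != NULL);
    irqs_enabled = l->saved_state;  /* implicit restore */
}

bool sim_irqs_enabled(void)
{
    return irqs_enabled;
}

/* Returns the flag as seen while one lock is still held. */
bool release_in_acquire_order(void)
{
    struct sim_lock lock1, lock2;
    bool seen;

    irqs_enabled = true;
    sim_spin_lock(&lock1);
    sim_spin_lock(&lock2);
    sim_spin_unlock(&lock1);  /* restores "enabled" while lock2 is held */
    seen = irqs_enabled;
    sim_spin_unlock(&lock2);  /* restores "disabled" and leaves it that way */
    return seen;
}

/* Same locks, released in the reverse of the order they were taken. */
bool release_in_reverse_order(void)
{
    struct sim_lock lock1, lock2;
    bool seen;

    irqs_enabled = true;
    sim_spin_lock(&lock1);
    sim_spin_lock(&lock2);
    sim_spin_unlock(&lock2);  /* restores "disabled": lock1 is still safe */
    seen = irqs_enabled;
    sim_spin_unlock(&lock1);  /* restores the original "enabled" state */
    return seen;
}
```

In the first function, interrupts come back on while lock2 is still held and then stay off after the function ends. In the second, interrupts stay off until the last lock is released and then return to their original state, which is why releasing in reverse order of acquisition is the safe pattern.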
Chapter 10
Writing Device Drivers
This chapter presents examples of several classes of RTCore drivers and how
they interact with user-level programs and other RTCore applications.
Writing RTCore device drivers is very similar to writing normal RTCore
applications. Since all memory, including device memory, is accessible to
RTCore applications, every RTCore program can potentially function as a
driver. Where drivers and normal RTCore applications differ is in how they
communicate with user-space (GPOS) applications and other RTCore pro-
grams.
and only RTCore applications, through files, just as with a standard UNIX
system. These files are managed by RTCore and are not directly accessible
from the GPOS environment. For example, a Linux application that opens
/dev/lpt0 is communicating with the Linux (non-realtime) parallel port
driver and not the RTCore driver. Conversely, an RTCore application that
opens /dev/lpt0 is communicating with the RTCore driver and not with the
Linux driver.
The example driver below provides a /dev/lpt0 file that can be used
through the POSIX open(), read(), write(), ioctl(), mmap() and close() calls
from RTCore applications. Two files, /dev/lpt0 and /dev/lpt1, are created.
When an RTCore application performs any operation on these files, the driver
prints a message.
#include <rtl.h>
#include <stdio.h>
#include <rtl_posixio.h>

static ssize_t rtl_par_read(struct rtl_file *filp, char *buf,
                            size_t count, off_t *ppos)
{
    printf("read() called on file /dev/lpt%d\n", filp->f_priv);
    return 0;
}

static ssize_t rtl_par_write(struct rtl_file *filp, const char *buf,
                             size_t count, off_t *ppos)
{
    printf("write() called on file /dev/lpt%d\n", filp->f_priv);
    return 0;
}
{
    printf("open() called on file /dev/lpt%d\n", filp->f_priv);
    return 0;
}
int rtl_par_mmap(struct rtl_file *filp, void *a, size_t b,
                 int c, int d, off_t e, caddr_t *f)
{
    return 0;
}
int rtl_par_munmap(struct rtl_file *filp, void *a, size_t length)
{
    return 0;
}
{
    printf("sigaction() with SIGPOLL called\n");
    return 0;
}
    return 0;
}
This would have been passed in our fops structure with everything else.
When the last user exits, RTCore will call this function so that our memory
is safely deallocated, and not when other threads may be using it. Otherwise,
if some code was using the area when another called rtl_unregister_dev(),
the memory would be freed out from under active code.
RTCore provides a couple of routines that let you control these counts
by hand if needed: incr_dev_usage(int minor) and decr_dev_usage(int
minor). This is helpful if you need to work with device resources and want
to make sure that the last user doesn’t exit and cause the destruction of all
device resources while this work is occurring. An alternative is to perform
a normal open() on the device, do the work, and then close(). This is
the simplest method, but some drivers may still derive some use from the
incr/decr routines.
There is one more factor to keep in mind when using these calls: the
rtl_namei() call performs an implicit incr_dev_usage(). This is done in
order to simplify the process of safely allocating a device. For functions
that use rtl_namei(), there must be a symmetric decr_dev_usage() call to
prevent an artificially raised usage count.
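The bookkeeping can be sketched with a small simulation. Everything below is hypothetical: sim_incr_dev_usage(), sim_decr_dev_usage() and sim_namei() are stand-ins written for this illustration, and they do not reproduce the real RTCore signatures or internals. The sketch only shows why a lookup that takes an implicit reference must be paired with an explicit decrement.

```c
#include <assert.h>

#define SIM_MAX_MINORS 4

/* Simulated per-device usage counts and "resources destroyed" flags. */
int sim_usage[SIM_MAX_MINORS];
int sim_destroyed[SIM_MAX_MINORS];

void sim_incr_dev_usage(int minor)
{
    assert(minor >= 0 && minor < SIM_MAX_MINORS);
    sim_usage[minor]++;
}

/* Dropping the last reference releases the device resources. */
void sim_decr_dev_usage(int minor)
{
    assert(minor >= 0 && minor < SIM_MAX_MINORS);
    assert(sim_usage[minor] > 0);
    if (--sim_usage[minor] == 0)
        sim_destroyed[minor] = 1;
}

/* Stand-in for a lookup that takes an implicit reference,
 * as rtl_namei() is described to do in the text. */
int sim_namei(int minor)
{
    sim_incr_dev_usage(minor);
    return minor;
}

/* Balanced use: the implicit reference is dropped again.
 * Returns the usage count left behind. */
int lookup_balanced(void)
{
    sim_usage[0] = 0;
    sim_destroyed[0] = 0;

    int minor = sim_namei(0);    /* count is now 1 */
    /* ... work with the device here ... */
    sim_decr_dev_usage(minor);   /* count back to 0, resources released */
    return sim_usage[0];
}

/* Unbalanced use: the decrement is forgotten, so the count stays raised
 * and the resources can never be released. */
int lookup_unbalanced(void)
{
    sim_usage[0] = 0;
    sim_destroyed[0] = 0;

    (void)sim_namei(0);          /* count is now 1 and never dropped */
    return sim_usage[0];
}
```

In the balanced case the count returns to zero and the simulated resources are released; in the unbalanced case the count stays at one, which is the artificially raised usage count the text warns about.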
RTLinux® Professional Components
Chapter 11
Real-time Networking
Chapter 12
PSDD
12.1 Introduction
The standard RTLinux/Pro (RTCore) execution model may be described as
running multiple hard real-time threads in the context of a general purpose
OS kernel. This model is very simple and efficient. However, it also implies
no memory protection boundaries between real-time tasks and the OS kernel.
For some applications, the single name space for all processes may also be a
problem. This is where Process Space Development Domain (PSDD) comes
into play.
In PSDD, real-time threads execute in the context of ordinary user-space
processes and thus have the benefits of memory protection, extended
libc support, and easier development and debugging. It is also possible to use
it for prototyping ordinary in-kernel RTCore modules.
all: psddhello
include rtl.mk
psddhello: psddhello.c
	$(USER_CC) $(USER_CFLAGS) -o psddhello psddhello.c \
	-L$(RTL_LIBS_DIR) -lpsdd
in real-time mode. Second, rtl_/RTL_ prefixes are added to the names of
all RTCore POSIX functions and constants to distinguish them from other
user-space POSIX threads implementations, e.g. LinuxThreads/glibc.
Most RTCore API functions have the rtl_ prefix added to their names to
avoid ambiguity. This can cause confusion. For example, both
nanosleep() and rtl_nanosleep() are available in the PSDD environment.
nanosleep() should only be used in GPOS context (functions called from
main()). On the other hand, rtl_nanosleep() should only be called from
RT threads and never from GPOS context. A single program may use both
functions in different contexts.
of those are rtl_open, rtl_read, rtl_write, rtl_ioctl, and rtl_close. Most
devices currently do not implement blocking I/O, and thus require the O_NONBLOCK
flag to open them. The notable exception is /dev/irq. Commonly available
devices include:
rt_fd=rtl_open("/dev/rtf0",O_WRONLY|O_CREAT|O_NONBLOCK);
After that, the rtl_write call can be used to put data into the RT-FIFO.
The user side will use ordinary user-space open/read/write functions
to access the FIFO.
RTCore also provides rtl_inb and rtl_outb functions for accessing x86
I/O space.
offending system call is being made. In order to do this, one must config-
ure RTCore and enable the option “Put PSDD tasks in a debuggable state
when executing syscalls” and recompile RTCore. With this option enabled
a breakpoint will be inserted anywhere a PSDD task executes a non-PSDD
system call. The programmer can then attach the debugger to the applica-
tion and obtain a source listing, backtrace or any other useful information in
determining where the call was made.
When the same example is run with the above debugging option enabled
the console output looks like:
To debug this, one runs GDB as normal and connects to the debugger
FIFO:
Now you can see that the call occurred in readv(), a part of libc. This
information by itself is not very useful, since one already knows that the error
occurred inside libc: the system printed a message notifying us that a system
call was made. To find out where that call was made from, a backtrace listing
shows:
(gdb) back
#0 0x080518c2 in __readv (fd=0, vector=0x0, count=0) at ../sysdeps/unix/sysv/linux/read
#1 0x080481e7 in start_routine ()
#2 0x0804831a in psdd_startup ()
From this one can tell that the function that called readv() was start_routine(),
which is the function that makes this call in the example.
void rt_thread(void *arg) {
    /* thread executed in hard real-time */
    while (1) {
        /* block the execution until the next run */
        fsched_block();
        user_code();
    }
}

int main(int argc, char **argv) {
    struct fsched_task_struct task_desc;
    application_init();
    // initialize RT subsystem
    fsched_init(argc, argv, &task_desc, NULL);
The hard real time part of the user process is a PSDD RT thread and
therefore subject to the same restrictions, e.g., it can not use UNIX system
calls or non-reentrant library functions.
The task code itself does not contain any scheduling information. This
information is supplied when attaching a new task to the scheduler via com-
mand line interface. This approach allows the user to change schedules with-
out recompiling.
The power of PSDD can be seen from the fact that the frame scheduler
itself is implemented using hard real time user facilities of PSDD. Thus, quite
complicated real-time applications can be developed using the framework.
The source for the frame scheduler can be found in the PSDD distribution.
fsched create
- create and initialize the frame schedulers subsystem. This command
has to be issued before any other commands can be used. In the
current implementation, this starts the user-space scheduler process,
rtl fsched, and thus the directory containing rtl fsched must be
present in the user’s PATH variable.
fsched delete
- destroy the frame schedulers subsystem
fsched config -mpf minor_cycles_per_frame -dt dt_per_minor_cycle
[ -s sched_id ] [ -i interrupt_source ]
fsched debug -p pid - break in the user process "pid". The break
happens at the next minor cycle, and all scheduling activity stops. After
that, it is possible to attach to the process with GDB and perform
source-level debugging. Please refer to the GDB example in the distribution
and to Chapter 7 for more information.
all: engine
include fsched.mk
engine: engine.c
$(USER_CC) $(FSCHED_CFLAGS) -o engine engine.c $(FSCHED_LIBS)
export PATH=$PATH:/directory_that_contains_fsched
fsched create
sleep 1
fsched config -mpf 10 -dt 50
fsched attach -n user1 -rf 3 -smc 1 -p 1
fsched start
This will run the fsched info command every second and display its output
full screen. An example of such a screen is displayed in Figure 12.5.
For each task, fsched info displays execution statistics: last, running
average, min and max execution times in microseconds, total number of
execution cycles, and number of overruns. Percentage of the current CPU
time used by the RT tasks is also displayed.
12.12 Conclusion
PSDD offers a simple means of writing complex real-time code in user space,
while still allowing for the normal RTCore approach of splitting real-time
logic from management code. Users with no knowledge of GPOS kernel
programming can use it for rapid prototyping and deployment of real-time
applications. Others may use it as a testbed for code that will eventually
run in kernel mode.
Chapter 13
The Controls Kit (CKit)
13.1 Introduction
During the implementation of controllers and control algorithms, one finds
oneself needing to handle parameter updates and alarms in a well behaved,
controlled manner. Moreover, these may sometimes be handled in the context
of a distributed application, as would be the case in dangerous environments.
For example, a fully automated assembly plant may need to be centrally
monitored and tuned from a remote location.
FSMLabs has addressed this problem by introducing the FSMLabs Con-
trols Kit (CKit). It is a collection of utilities for building control systems and
control interfaces using XML to describe control objects. The Controls Kit
provides software for exporting RTLinux control variables, including meth-
ods for defining composite objects, setting alarms and triggers, updating and
exporting control information to either a local or remote machine. The CKit
makes it easy to develop both the localized and distributed application via
a set of API interfaces and libraries as well as the highly portable XML
document standard.
This document provides an overview of the CKit by working through a
simple PID example. For more in-depth documentation, please refer to the
CKit manual.
13.2.1 Overview
The design of our example is as follows. There are two threads of inter-
est. The first, and lowest priority thread, is the Linux thread. The second,
and higher priority thread, is our control thread. The context of the lower
priority thread (Linux) will be used to initialize the hardware, initialize the
PID parameters, and also to run our trigger function which will be used to
command a RUN or STOP to our task.
The program will sit idly once loaded into kernel memory. Then, as soon
as the “Run” parameter is updated to TRUE, our program will begin exe-
cution on a periodic basis until the “Run” parameter is updated to FALSE.
At the core of the entire CKit design is the idea of entities. For this
program, we thus need three entities. These entities are listed as:
1. Test: This is our toplevel “group” entity. Its job is to group all of the
underlying entities into one coherent group for ease of manipulation.
13.2.2 Coding
Declarations
The global declarations and includes needed for this program are rather
straightforward:
First, we include the appropriate libraries which correspond to the core
RTLinux, the core CKit, and the FSMCL library functions, respectively. We
also define simple constants for use within our program:
#include <rtl.h>
#include <ck_module.h>
#include <FSMCL_core.h>
#ifndef FALSE
#define FALSE (0)
#define TRUE (1)
#endif
Next, we declare our entities for the PID, the boolean, and the test group,
respectively:
Finally, we make the necessary declarations for the runtime, high priority
thread, and the trigger function which will do the job of starting or stopping
the runtime task:
CK_group_init(&TestGroup,
"Test",
"Test Group for PID Example",
NULL);
Second, we initialize the PID entity, and assign it to belong to the “Test”
group by passing it the pointer to the “TestGroup” entity for its parenthood:
FSMCL_PID_init(&MainPID,
"PIDControl",
"Test PID controller",
&TestGroup);
Next, we initialize the runtime boolean, assign it to the Test group entity,
set its default value to FALSE, and associate the trigger function
(triggerFcn() in the code below) with it, so that the next time this variable
is updated, the given function will run in the context of the Linux thread.
We pass a void pointer to this function which points to the boolean variable
itself so that it can be used within the function.
CK_boolean_init(&RunBool,
"Run",
"Set to 1 to run, 0 to stop",
&TestGroup,
FALSE);
CK_execute_on_update(&RunBool,
triggerFcn,
(void *)&RunBool);
We now have an option here. We can either initialize all of the parameters
for the PID controller here, or we can do it from the command line through
a script prior to running. For this example, we choose to do it within the
program as such:
Finally, we initialize our main thread using the usual RTLinux API call:
which simply creates a thread, whose entry point is the function threadSrc().
We want to create a well-behaved program. Consequently, we must
clean up after ourselves. Thus, in the cleanup routine (cleanup_module()),
we must destroy both our entities and also the main thread. To do so, we
simply type:
Note that in this example, all we did was clean up the toplevel entity, which
automatically cleans up all of its children.
Trigger Function
Next, we design the trigger function. As you may recall, each time that
the runtime flag “RunBool” is updated from the command line, this trigger
function will execute. Each time that it executes, we shall – depending on
the value of the variable – take appropriate actions. Thus, the entire trigger
function is written as:
if (arg==NULL){
return;
}
if (val){
CK_message("Commencing control run");
pthread_wakeup_np(testThrd);
} else {
CK_message("Controller shut down by user");
}
}
down after the next iteration. Notice that we could have added additional
if-then statements here that would detect if there has been a change in the
variable since the last time that this trigger function was run.
Runtime
Our final coding concern is the design of the main thread. Here, we set up two
while loops. The innermost loop is nothing more than your standard periodic
loop as seen in many of the RTLinux example programs. The external while
loop is an infinite loop which will be used to run and stop our program.
Thus, in this design, the outermost loop will first check the value of the
runtime boolean. If the value is false, it will simply suspend the thread and
go to sleep until it is kick started by the trigger function. At that point, it
will initialize the internal values of the PID controller and will immediately
begin execution of the periodic component of the thread. Within the periodic
component, we will sample our A/D boards, calculate the PID control using
the sensed values, and finally write out the control output through our D/A
boards.
The periodic component will continue to execute until the runtime boolean
becomes false. As soon as it becomes false, the code will exit the innermost
while loop. And then, prior to once again suspending itself, the code will
convert the thread into a non-periodic thread.
The entire code for the main thread is shown in what follows:
/* Main Loop */
while (1) {
if (!CK_scalar_get_boolean(&RunBool)){
pthread_suspend(pthread_self());
}
/* Periodic loop */
while(CK_scalar_get_boolean(&RunBool)){
pthread_wait_np ();
return 0;
}
Thus, at this point, we must enable the CKit daemon which will moni-
tor alarms and commands. This is a daemon with rich functionality, such as
alarm actions, shell execution on behalf of RTLinux programs, and XML/Text
output streams. To start up the CKit daemon, we simply type:
at the command line. Immediately we will see a set of alarms appear which
show all the registered parameters up to that point. In this case, we are
going to trap all alarms of level 10 (Hexadecimal 0xA) so that as soon as
they occur, they shall be emailed to our technician with the actual message
appearing as the subject line. In this example, we are using the special
expansion variables %L, %M, and %R to denote the alarm level, the alarm
message, and the recommended action. For example, a critical alarm with
the message “Startup failure” and recommended action “Did you turn on the
power relay?” would appear in the subject line in techman’s email program
as: “10:Startup failure - Did you turn on the power relay?”. Please refer to
the CKit documentation for a complete description of the rich capabilities
of this daemon. We can also redirect all output into a pipe, which can then be
networked to a remote machine via netcat or FSMLabs’ XML-RPC interface.
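To make the expansion concrete, here is a minimal sketch of how such a template could be expanded. It is not the CKit daemon's code: expand_alarm_template() is a hypothetical helper, the template string "%L:%M - %R" is inferred from the subject line quoted above, and printing the level in decimal is an assumption based on that same example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Expand %L (level), %M (message) and %R (recommended action) in tmpl.
 * Writes a NUL-terminated result into out (capacity outsz).
 * Returns 0 on success, -1 if the result does not fit.
 * Hypothetical helper for illustration; not part of the CKit API. */
int expand_alarm_template(const char *tmpl, int level, const char *msg,
                          const char *action, char *out, size_t outsz)
{
    size_t used = 0;

    assert(tmpl != NULL && msg != NULL && action != NULL && out != NULL);
    if (outsz == 0)
        return -1;
    out[0] = '\0';

    for (const char *p = tmpl; *p != '\0'; p++) {
        char one[2] = { *p, '\0' };   /* default: copy the character as-is */
        char num[16];
        const char *piece = one;

        if (*p == '%' && p[1] != '\0') {
            switch (p[1]) {
            case 'L':
                snprintf(num, sizeof num, "%d", level);
                piece = num;
                p++;
                break;
            case 'M':
                piece = msg;
                p++;
                break;
            case 'R':
                piece = action;
                p++;
                break;
            default:
                break;                /* unknown sequence: keep the '%' */
            }
        }

        size_t len = strlen(piece);
        if (used + len >= outsz)      /* leave room for the terminator */
            return -1;
        memcpy(out + used, piece, len + 1);
        used += len;
    }
    return 0;
}
```

With the template "%L:%M - %R", level 10, the message "Startup failure" and the action "Did you turn on the power relay?", the helper produces the subject line quoted in the text.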
We next have two ways of proceeding. We can use the command line programs
ck_hrt_op or ck_hrt_op_net (the XML/Text utilities used for local
and networked parameter manipulation, respectively), which can be embedded
within any scripting language. Alternatively, and for simplicity’s sake,
we can use the standalone graphical front end ck_hrt_op_GUI [1] to view and
update the parameters on either local or remote machines. Thus, we start up
the graphical front end by typing:
ck_hrt_op_GUI
Immediately, we obtain the front end as seen in Figure 13.2. Thus, we can
now proceed to view and update any parameters, and when ready, we simply
set the “Run” boolean to “TRUE”.
[1] This is actually a Perl program written using the GtkPerl libraries.
Figure 13.2: The CK graphical user interface. Notice that this GUI parses
the XML generated by the underlying CK tools.
13.3 Conclusion
You now have the basic tools needed to write your own distributed and local
controller algorithms and still manipulate parameters and alarms in real time.
From here, we highly recommend that you look at the CKit documentation,
which describes the features and capabilities of each of the CKit commands
in more detail.
Appendices
Appendix A
List of abbreviations
Appendix B
Terminology
• Frontside Bus : This is the high speed bus that exists between the CPU
and memory.
• Host Bridge : The host bridge acts as a hub between most major
subsystems in a PC. It acts as an interface between CPUs, memory,
video, and other busses, such as PCI.
$$\sum_{i=1}^{n} \frac{C_i}{T_i} < n\left(2^{1/n} - 1\right)$$
with C being the worst case execution time and T the period of each
task. As the task number n increases, the utilization converges to about
69%, which is not as efficient as other schedulers, but is preferable in
situations requiring static scheduling.
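As a worked illustration of this bound, the sketch below checks a task set against it. The function names rms_utilization() and rms_bound_holds() are hypothetical, chosen for this example. To avoid computing an n-th root, the condition U < n(2^(1/n) - 1) is rewritten in the equivalent form (1 + U/n)^n < 2. The test is sufficient only: a task set that fails it is not necessarily unschedulable.

```c
#include <assert.h>
#include <stddef.h>

/* Total utilization U = sum of C[i] / T[i], where wcet[i] is the worst
 * case execution time and period[i] the period of task i (same units). */
double rms_utilization(const double *wcet, const double *period, size_t n)
{
    double u = 0.0;

    for (size_t i = 0; i < n; i++) {
        assert(period[i] > 0.0);
        u += wcet[i] / period[i];
    }
    return u;
}

/* Returns 1 if U < n(2^(1/n) - 1), checked as (1 + U/n)^n < 2,
 * and 0 otherwise. */
int rms_bound_holds(const double *wcet, const double *period, size_t n)
{
    assert(n > 0);

    double base = 1.0 + rms_utilization(wcet, period, n) / (double)n;
    double power = 1.0;

    for (size_t i = 0; i < n; i++)
        power *= base;
    return power < 2.0;
}
```

For three tasks with periods 4, 5 and 10 and execution times of 1 each, U is 0.55, which is below the three-task bound of roughly 0.78, so the test passes. With execution times 2, 2 and 3 the utilization is 1.2 and the test fails.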
• Busy wait loop : This is the act of waiting for an event in a running
process, using the CPU during the wait. Rather than being put to
sleep and rescheduled when the event occurs, the process spins doing
useless activity during the wait. This saves the overhead of scheduling
another process in and then having to reschedule the first.
• Cache flush : A cache flush involves writing the content of the cache to
memory or to whatever media is appropriate. This is only necessary on
hardware that does not support write-through caching, or on SMP systems
when a task moves between CPUs. Generally cache flushes have a
noticeable influence on performance, especially for real-time operations.
As the flushed data must be refetched, the resulting delay from a flush
may result in jitter.
• Context Switch : removing the currently running thread from the processor
and starting a different thread on this CPU. A context switch in
RTCore will only save the state of the integer registers unless floating
point is enabled. (See pthread_attr_setfp_np.)
• Global Variables : Global variables are those that are visible through-
out the application, rather than being restricted to a specific thread.
The variables themselves are not protected against concurrent activity,
and usually require some kind of synchronization primitives to ensure
safe handling.
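As a small illustration of this point, the sketch below protects a global counter with a mutex. It uses the standard POSIX threads API on an ordinary system; RTCore's POSIX-style calls are assumed to be used in the same way. The names shared_counter, counter_lock and run_workers() are made up for this example.

```c
#include <assert.h>
#include <pthread.h>

#define WORKERS    4
#define INCREMENTS 10000

/* Global variable visible to every thread. Updating it without the
 * mutex would allow concurrent increments to be lost. */
long shared_counter;
pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;

/* Each worker increments the global counter under the lock. */
static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < INCREMENTS; i++) {
        pthread_mutex_lock(&counter_lock);
        shared_counter++;
        pthread_mutex_unlock(&counter_lock);
    }
    return NULL;
}

/* Start the workers, wait for them, and return the final count. */
long run_workers(void)
{
    pthread_t threads[WORKERS];

    shared_counter = 0;
    for (int i = 0; i < WORKERS; i++) {
        int rc = pthread_create(&threads[i], NULL, worker, NULL);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < WORKERS; i++)
        pthread_join(threads[i], NULL);
    return shared_counter;
}
```

Because every update happens while the mutex is held, run_workers() returns WORKERS * INCREMENTS on every run. Depending on the toolchain, the program may need to be built with the -pthread option.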
the assertion of the interrupt (the electric signal being active on the
interrupt pin) to the point where this interrupt service routine is called
is defined as the interrupt response time. In practice the interrupt
response time is the time from asserting the interrupt until the system
acknowledges it or responds with a noticeable action. This time is
therefore a little longer than the "theoretical" interrupt response time.
• Latency : The time between requesting an action and the actual oc-
currence.
• Multithreaded : A process that has more than one flow of control (In
general, there are also shared resources between these control paths).
has two states: locked and unlocked. Once a mutex has been locked
by a thread all other threads that try to lock it will block until the
thread that acquired the mutex unlocks it. After this one of the blocked
threads will acquire it.
expires it will be preempted by the kernel and the next runnable thread
will be scheduled.
that received the signal. Note that signal handlers are installed at the
process level and not at the thread level, so if an asynchronous signal is
received, it cannot be directed at a specific thread. Only signals issued
from within the process can be sent to specific thread IDs that
exist within that specific process.
• Sigaction : The sigaction call controls the actions taken upon reception
of a given set of signals. It sets up signal handlers for the action, among
other things.
• Synchronous Signal : Any signal that is the result of a thread's action
and occurs in direct reaction to that action. This is opposed to
asynchronous signals, which may arrive at any time and may not be
related to a thread action. An example of a synchronous signal would
be a thread that does a division by zero, causing an FPE_INTDIV. Synchronous
signals are delivered to the process that caused them and not
to the specific thread.
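The Sigaction entry above can be illustrated with a short sketch that installs a handler and then raises the signal from within the process. It uses the standard POSIX sigaction() call on an ordinary system; how RTCore's own variant differs is not covered here. The names on_signal(), last_signal and install_and_raise() are made up for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Written by the handler; sig_atomic_t keeps the access signal-safe. */
static volatile sig_atomic_t last_signal;

/* Handler: only records which signal arrived. */
static void on_signal(int signo)
{
    last_signal = signo;
}

/* Install on_signal() for signo, raise the signal, and return the
 * signal number the handler saw (or -1 on failure). */
int install_and_raise(int signo)
{
    struct sigaction sa;

    assert(signo > 0);
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_signal;        /* the action to take on reception */
    sa.sa_flags = 0;
    if (sigemptyset(&sa.sa_mask) != 0)
        return -1;

    last_signal = 0;
    if (sigaction(signo, &sa, NULL) != 0)
        return -1;
    if (raise(signo) != 0)            /* handler runs before raise() returns */
        return -1;
    return (int)last_signal;
}
```

Calling install_and_raise(SIGUSR1) returns SIGUSR1 once the handler has run. The handler is installed for the whole process, which matches the note above that handlers exist at the process level rather than per thread.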
Appendix C
Familiarizing with RTLinuxPro
C.1 Layout
The installation guide should walk you through the specifics of each directory,
so we will focus only on the important ones here:
1. bin - This is where all of the tool binaries exist. Make sure that the full
path to this directory is first in your $PATH variable, so that you use
these tools before any others in your path. (Running gcc -v should
report information including the /opt/rtldk-x.y path if this is
configured properly.)
There are many other directories, but these are central to our uses here.
Also of note is the docs directory, which contains API documentation and
more information on getting started with RTCore.
kernel. This procedure is the same as with any other kernel, and as the procedure
is beyond the scope of this book, we suggest the normal Kernel-HOWTO
for details. Essentially, it involves changing to the rtlinux_kernel_2_4 directory,
building a kernel image suited to your device needs (or using the
provided stock image), and installing that image. This may be a local LILO
or GRUB update, or it might be a matter of making the image available for
TFTP by an embedded board. Again, we assume a self-hosted development
environment for this example.
Once the system is running the correct kernel, RTCore can be loaded
with the following commands:
cd /opt/rtldk-x.y/rtlinuxpro
./modules/rtcore &
This will load the RTCore OS found in the installation, which will vary
based on any additional components installed. Unloading the OS consists of:
killall rtcore
ppc6xx_root directory inside the development kit tree, but this name will
vary by architecture.
For a generic x86 system as we described in the installation section, you
will likely use the host filesystem already present. However, if you intend
to use separate systems for development and testing (as advised) or are tar-
getting a different architecture completely, this option should help speed the
development process. For many embedded systems, it is much simpler to NFS
root mount a remote filesystem, at least for testing, rather than rebuilding
an image every time you generate new binary code for the target.
For most distributions, exporting this tree is a very simple exercise,
and is no different than exporting any other NFS mount point. Edit your
/etc/exports and run exportfs -a, and the tree will be available to the
embedded system. In many environments, it is also advisable to simply have
the device retrieve its kernel image from DHCP, and build the image such
that it automatically mounts the root filesystem from your development ma-
chine. If this is useful for your environment, the kernel build offers an option
to build the boot parameters in as automatic arguments to the bootstrap
process. For example, under a PowerPC build, under the kernel’s ’General
setup’ option, you can set the boot options to be (as one variable):
The setting defined here specifies that the root filesystem is on NFS, and that this
root filesystem lives on the machine at 10.0.0.2, under ${RTLDK_ROOT}/ppc6xx_root.
Be sure to replace ${RTLDK_ROOT} with the correct location of your development
kit. The IP setting configures the device to use bootp in order to
configure itself, although there are many options that may be used in order to
configure the interface. These arguments are built into the kernel image, and
are passed as normal parameters to the boot process during a TFTP-based
boot, just as if they were typed in at a LILO prompt. For more information
on these options, refer to Documentation/nfsroot.txt inside the Linux kernel
tree. Many users need read/write access to the root filesystem, at least
for testing. Add rw after the root=/dev/nfs to use this, or remount your
NFS root as read/write on the target, with:
Once these options are configured and your remote device is using the
NFS mount as its root filesystem, you can do all development on the host
machine with the development kit, and move the resulting images under
the NFS mount point. For simplicity, it is often useful to simply copy the
rtlinuxpro directory from the kit under the NFS root mount. While some
of these pieces should be removed for the final system, this simple copy will
allow access to all of the targeted real-time code needed for the embedded
device.
C.4 Summary
This chapter on development kit usage might come across as being rather
light, and there is a reason for that. The development kit is intended to be
simple to use, and to allow a programmer to install a stable build environment
for producing real-time code. As such, this involves installing the kit, placing
the tools in your path, and then using the various components (such as the
root filesystem and modules) as needed. The intent is that configuration and
use is as simple as possible, allowing the programmer to concentrate on the
task at hand, and not have to be distracted by development tool problems
in the build environment. Specific details such as board configuration for
network boot are described in more detail in the devkit_manual.pdf document
provided with RTLinuxPro.
Appendix D
Important System Commands
This is an overview with some usage examples that might be helpful when
working with RTCore, and most UNIXes in general.
bunzip2
The bzip2 and bunzip2 commands are for compressing and decompressing
.bz2 files. bzip2 offers better compression ratios than gzip, and is becoming
more popular on FTP sites and other distribution locations.
bunzip2 linux-2.4.0.tar.bz2
Decompress the compressed archive.
bzip2recover file.bz2
Recover data from a damaged archive.
bunzip2 -t file.bz2
Test if the file could be decompressed, but don’t do it.
dmesg
The kernel logs important messages in a ring buffer. To view the contents
of this buffer you can use the dmesg command.
dmesg
Dump the entire ring-buffer content to the terminal.
dmesg -c
Dump it to the terminal and then clear it.
dmesg -n level
Set the level at which the kernel will print a message to the console. Setting
dmesg -n 1 will only allow panic messages through to the console, but
all messages are logged via syslog.
find
find can be used to find a specific fileset in a directory hierarchy, and
optionally execute a command on these files.
find .
List all files in the current directory and below.
find . -name "*.[ch]"
List all files in the directory and below that end in .c or .h (c-sources and
header files)
find . -type f -exec ls -l {} \;
Find all regular files and display a long listing (ls -l) of them.
find . -name "*.[ch]" -exec grep -le _min_ipl {} \;
List all files in the directory hierarchy that contain the string "_min_ipl"
in them.
find /usr/src/ -type f -exec grep -lie MONOTONIC {} \;
List all files below /usr/src/ that contain the string MONOTONIC, using
a non case sensitive search. (MONoToniC will also match.)
grep
grep is for searching strings using regular expressions. Regular expres-
sions are comprised of characters, wildcards, and modifiers. Refer to the grep
man page or a book on regular expression syntax for details.
gunzip
The gunzip command will decompress .gz files. You will not need any
options for decompression. For compressing files use gzip.
gunzip FILE.gz
Decompress FILE.gz which will rename it to FILE in the process.
gunzip -c FILE.gz
Decompress FILE.gz and send the decompressed output to standard out-
put, and not to a file.
gzip FILE
Compress FILE renaming it to FILE.gz with the default compression
speed.
gzip -9 FILE
Compress FILE with the best compression ratio. (This will be slow)
init
Init is the master resource controller of a SysV-type Unix system. While
testing an RTCore system it is advisable to do this in init or runlevel 1, which
is a single user mode without networking and with a minimum set of system
resources.
init 1
Put the computer into runlevel 1. (No networking, single user mode)
init 2
After tests in init 1 ran successfully, bring the box back up to a multiuser
networking system. This need not be runlevel 2, and will vary depending on
which UNIX you are running. Check /etc/inittab to see which runlevel
is the system default runlevel. It should be safe to run it back up to the
runlevel set as default.
init 6
Reboot the system.
init 0
Halt the system.
locate
Many but not all Linux systems have the locate database available, which
caches all filenames on the system and makes it easier to locate a specific file.
locate irq.c
List all files on the system that have irq.c in their names. (alpha_irq.c and
irq.c.rej will also match.)
locate rtlinux | more
If the search is too general, output will be more than a screen. By piping
the output into the ”more” program a paged listing is displayed.
make
GNU make is one of the primary tools for any development under modern
UNIXes. Given a makefile, Makefile, or GNUmakefile, which are the default
names make will look for, make will build a source tree, resolving dependencies
based on the information and macros given in the makefile.
make -f my_makefile
This will run make with the provided my_makefile, if the name isn’t one
of the default names that GNU make will search for.
make -n
This will instruct make only to report what it would do, but will not
actually process any source files.
make -k
Normally make will terminate on the first fatal error it encounters. With
the -k flag make can be forced to continue. This makes sense if multiple
independent executables are to be built within a source tree, and one wants
to build the rest even if the first fails.
make -p -f /dev/null
Show the database settings that make will apply by default, without actually
compiling anything. This will list all implicit rules and variable settings.
objdump
objdump allows you to view symbol information in object files, such as
kernel modules. It also allows you to disassemble object files. This is helpful
when trying to locate what could be causing system hangs with a module.
The output is not very user friendly, but if short functions were used it should
not be too hard to read. If long functions with many flow control statements
were used, it can be close to unreadable.
tar
Archives ending in .tar (compressed tar files will end in .tar.gz, .tar.bz2,
or .tgz) can be unpacked with tar. To make this operation safe, check what
uname
To get the exact system name of the running kernel, use the uname command.
A common problem is having the wrong kernel running and hitting "funny"
problems as a result, such as symbol problems on module load. Running uname
should clear up any question of which kernel is active.
uname
Print the system type. (e.g. "Linux")
uname -m
Print the system hardware type. (e.g. "i586")
uname -r
Print the kernel release name of the running system. (e.g. 2.4.16-rtl)
uname -a
Print the full system string, dumping all known information about the
running kernel.
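The same check can be made from inside a program before it relies on a
particular kernel. The following is a minimal sketch using the standard POSIX
uname() call, which returns the same release string that uname -r prints. The
helper names are hypothetical, and the "-rtl" suffix is only an assumption
taken from the example release name above; it is not an RTCore API.

```c
#include <string.h>
#include <sys/utsname.h>

/* Return 1 if `release` ends with `suffix`, 0 otherwise. */
int release_has_suffix(const char *release, const char *suffix)
{
    size_t rlen = strlen(release);
    size_t slen = strlen(suffix);

    if (slen > rlen)
        return 0;
    return strcmp(release + (rlen - slen), suffix) == 0;
}

/*
 * Return 1 if the running kernel's release string (what `uname -r`
 * prints) ends with `suffix`, 0 if it does not, -1 if uname() fails.
 */
int running_kernel_has_suffix(const char *suffix)
{
    struct utsname u;

    if (uname(&u) != 0)
        return -1;
    return release_has_suffix(u.release, suffix);
}
```

A program could call running_kernel_has_suffix("-rtl") at startup and refuse
to continue on a 0 result, which catches the wrong-kernel case before any
module is loaded.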
Appendix E
Things to Consider
There may be a lot of flexibility here, depending on need. For a very high
performance application, the range of possible architectures may be limited
to a small handful of target systems. But for others, such as a few low
frequency sampling threads, a much slower system will likely be cheaper and
still provide ample resources.
A prime example of this is the National Semiconductor Geode processor.
While it is x86-compatible, many operations are virtualized on the chip,
meaning that performance may degrade during certain time windows. (Video
and audio are two known problem areas.) While the chip goes into System
Management Mode (SMM) to handle this activity, hardware-induced jitter
may spike as high as 5 milliseconds. For many applications, this is the kiss
of death, but others may be fine with this level of jitter. For these lower
bandwidth applications, the Geode is a cheap x86-compatible solution for
the field, and the jitter is within specification.
Because of situations like these, FSMLabs recommends evaluation
and testing with the RTLinuxPro test suite, followed by hard analysis of
application demands. If the target hardware suits the application, a potential
5 millisecond jitter may not matter, and in that case a Geode is perfectly
suitable. The important part is that requirements are defined and
understood so that the proper hardware and software configuration can be
selected.
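One way to make "within specification" concrete is to express the tolerable
jitter as a fraction of the task period and compare it against the measured
worst case. The sketch below is only an illustration: the function name and
the per-mille tolerance policy are assumptions, and only the 5 millisecond
figure comes from the text above.

```c
/*
 * Return 1 if the worst-case jitter fits inside the share of the period
 * that the application tolerates, 0 otherwise.
 *
 * worst_jitter_ns    measured worst-case jitter in nanoseconds
 * period_ns          task period in nanoseconds
 * tolerance_permille tolerated jitter in thousandths of the period
 *                    (an assumed policy, chosen by the application)
 */
int jitter_within_budget(long long worst_jitter_ns,
                         long long period_ns,
                         int tolerance_permille)
{
    long long budget_ns = period_ns * tolerance_permille / 1000;

    return worst_jitter_ns <= budget_ns;
}
```

With a tolerance of 10 per mille (1%), a 5 ms worst case fits a task sampling
once per second (budget 10 ms) but not a task running every millisecond
(budget 10 microseconds).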
Appendix F
System Testing
When selecting a platform for RTCore, the only way to know whether it will
really do the job for you is to test on the actual hardware. While the test
environment does not have to mirror the target environment exactly, the closer
you get to the final system, the more reliable the results will be. In general
the outcome of these tests provides answers to three essential questions:
This will not eliminate the requirement to evaluate the final system setup
you wish to deploy, but it will minimize the risk of running into hardware
related problems during project development.
bash scripts/regression.sh
This will then run a set of tests, which MUST all return the status [ OK ].
If any of the tests fail, contact support@fsmlabs.com. If the first run passed
without any errors, it is generally helpful to keep the regression test running
for a while. To run the test in an infinite loop, issue the following command,
again from the rtlinuxpro directory:
bash scripts/long_regression.sh
(This is the normal regression script run in an infinite loop, printing the
number of runs completed as it goes.)
make dep
make clean
make -j 60
This will add to the thrashing of the GPOS VM. Increase the number
of find processes running, preferably staggered in time so that the buffer
cache is cycled through. Add other applications until swapping is induced,
and the system is under heavy load. For SMP machines, it helps to have
more instances running, as each CPU thrashes over the PCI bus. For some
embedded boards, running make on the kernel is not feasible, but a high
number of finds is a good approximation, when done in conjunction with the
next step.
Finally, run a ping flood (ping -f machine) from another machine on
the network, over a link of at least 100 Mbit. This, in addition to the disk
work, will put the machine under heavy interrupt load. Feel free to add more
work, as RTCore will handle the load. It is important that you determine what
your hardware is capable of doing with respect to real-time demands.
Many test applications from other vendors do very short tests, either in
time or number of interrupts (some as short as a minute). Due to potential
cache interactions and other factors, it is important that a test machine be
placed under load for a long time, preferably days. FSMLabs performs all
testing under heavy load for a period of at least 48 hours before releasing
any kind of performance numbers.
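The quantity such a long-running test tracks is the worst case, not the
average. As a rough illustration, the sketch below records the largest
lateness seen when waking from periodic absolute deadlines. It uses only
standard POSIX calls on an ordinary system and is not the RTLinuxPro test
suite; the function names and the choice of CLOCK_MONOTONIC are assumptions
made for this example.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif
#include <time.h>

#define NSEC_PER_SEC 1000000000LL

/* Convert a timespec to a single nanosecond count. */
static long long ts_to_ns(const struct timespec *ts)
{
    return (long long)ts->tv_sec * NSEC_PER_SEC + ts->tv_nsec;
}

/*
 * Sleep until `iterations` successive absolute deadlines spaced
 * `period_ns` apart. Return the largest observed lateness in
 * nanoseconds, or -1 if a clock call fails or the sleep is interrupted.
 */
long long measure_max_lateness_ns(long long period_ns, int iterations)
{
    struct timespec next, now;
    long long max_late = 0;
    int i;

    if (clock_gettime(CLOCK_MONOTONIC, &next) != 0)
        return -1;

    for (i = 0; i < iterations; i++) {
        long long deadline = ts_to_ns(&next) + period_ns;
        long long late;

        /* Advance the absolute deadline by one period. */
        next.tv_sec = (time_t)(deadline / NSEC_PER_SEC);
        next.tv_nsec = (long)(deadline % NSEC_PER_SEC);

        /* clock_nanosleep returns an error number, not -1, on failure. */
        if (clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &next, NULL) != 0)
            return -1;
        if (clock_gettime(CLOCK_MONOTONIC, &now) != 0)
            return -1;

        /* Lateness is how far past the deadline the thread woke up. */
        late = ts_to_ns(&now) - deadline;
        if (late > max_late)
            max_late = late;
    }
    return max_late;
}
```

Calling measure_max_lateness_ns(1000000, n) with a large n while the load
described above is running shows why short tests mislead: the maximum can
only grow as rarer interactions are finally hit.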
Appendix G
Sample programs
Here we've collected the source code for all of the examples used in the book.
They are also provided in the RTLinuxPro distribution.
G.1 Hello world
#include <stdio.h>

int main(void)
{
    printf("Hello from the RTL base system\n");
    return 0;
}
G.2 Multithreading
#include <stdio.h>
#include <pthread.h>
#include <unistd.h>
#include <rtl_posixio.h>

pthread_t thread;
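void *thread_code(void *t)
{
    struct timespec next;
    int count = 0;

    /* Start from the current time so the first deadline is valid. */
    clock_gettime( CLOCK_REALTIME, &next );

    while ( 1 ) {
        /* Advance the absolute deadline by 1 ms and sleep until it. */
        timespec_add_ns( &next, 1000*1000 );
        clock_nanosleep( CLOCK_REALTIME, TIMER_ABSTIME,
                         &next, NULL);
        count++;
        if (!(count % 1000))
            printf("woke %d times\n",count);
    }
    return NULL;
}

int main(void)
{
    pthread_create( &thread, NULL, thread_code, (void *)0 );

    rtl_main_wait();
    return 0;
}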
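The listing relies on timespec_add_ns() to move the absolute wake-up time
forward by one period. For readers trying the pattern outside RTCore, the
following is a hypothetical portable stand-in (ts_add_ns is not an RTCore
name) that shows the carry from nanoseconds into seconds. It assumes a
non-negative increment.

```c
#include <time.h>

#define NS_PER_SECOND 1000000000LL

/*
 * Add `ns` nanoseconds (assumed non-negative) to `ts`, carrying any
 * overflow of the nanosecond field into the seconds field so that
 * tv_nsec stays in the range 0..999999999.
 */
void ts_add_ns(struct timespec *ts, long long ns)
{
    long long total = (long long)ts->tv_nsec + ns;

    ts->tv_sec += (time_t)(total / NS_PER_SECOND);
    ts->tv_nsec = (long)(total % NS_PER_SECOND);
}
```

Keeping the deadline absolute and only ever adding the period to it is what
prevents drift: time spent in the loop body does not push later wake-ups back.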
G.3 FIFOs
G.3.1 Real-time component
#include <stdio.h>
#include <pthread.h>
#include <unistd.h>
#include <sys/mman.h>
#include <rtl_posixio.h>
pthread_t thread;
int fd1;
while ( 1 ) {
timespec_add_ns( &next, 1000*1000*1000 );
return NULL;
}
int main(void)
{
    mkfifo( "/communicator", 0666);
    ftruncate(fd1, 16<<10);
    pthread_cancel( thread );
    pthread_join( thread, NULL );
    close( fd1 );
    unlink( "/communicator" );
    return 0;
}
G.4 Semaphores
#include <stdio.h>
#include <pthread.h>
#include <unistd.h>
#include <rtl_posixio.h>
#include <posix/semaphore.h>
while ( 1 ) {
timespec_add_ns( &next, 1000*1000*1000 );
return NULL;
}
int main(void)
{
    sem_init(&sema, 1, 0);
    sem_destroy(&sema);
    return 0;
}
p.sched_priority = 1;
pthread_setschedparam(pthread_self(), SCHED_FIFO, &p);
p.sched_priority = 1;
pthread_setschedparam(pthread_self(), SCHED_FIFO, &p);
ftruncate(wfd,MMAP_SIZE);
close(wfd);
close(rfd);
shm_unlink("/dev/rtl_mmap_test");
return 0;
}
int main(void)
{
    int fd;
    unsigned char *addr;

    if ((fd=open("/dev/rtl_mmap_test", O_RDWR))<0) {
        perror("open");
        exit(-1);
    }
    addr = mmap(0, MMAP_SIZE, PROT_READ, MAP_SHARED, fd, 0);
    if (addr == MAP_FAILED) {
        printf("return was %d\n",errno);
        perror("mmap");
        exit(-1);
    }
    while (1) {
        printf("userspace: the rtl shared area contains"
               " : 0x%x, 0x%x, 0x%x, 0x%x\n",
               addr[0], addr[1], addr[2], addr[3]);
        sleep(1);
    }
    close(fd);
    return 0;
}
pthread_t thread;
pthread_mutex_t mutex;
return 0;
}
pthread_cancel(thread1);
pthread_join(thread1, NULL);
pthread_cancel(thread2);
pthread_join(thread2, NULL);
rtl_gpos_free(thread_stack);
}
pthread_t thread;
static int our_soft_irq;
p.sched_priority = 1;
pthread_setschedparam(pthread_self(),
                      SCHED_FIFO, &p);
#include <rtl.h>
#include <rtl_pthread.h>
#include <rtl_unistd.h>
#include <rtl_ioctl.h>
#include <rtl_fifo.h>
#include <rtl_time.h>
#include <rtl_signal.h>
#include <rtl_errno.h>
#include <stdio.h>
#include <arch/rtl_io.h>
#include <sys/fcntl.h>
#define FIFO_NO 3
#define RTC_IRQ 8
int fd_fifo;
int fd_irq;
rtl_pthread_t thread;
char save_cmos_A;
char save_cmos_B;
static int filter(int x) {
    static int oldx;
    int ret;
    if (x & 0x80) {
        x = 382 - x;
    }
    char devname[30];
    sprintf(devname, "/dev/rtf%d", FIFO_NO);
    fd_fifo = rtl_open(devname, O_WRONLY|O_CREAT|O_NONBLOCK);
    if (fd_fifo < 0) {
        rtl_printf("open of %s returned %d; errno = %d\n",
                   devname, fd_fifo, rtl_errno);
        return -1;
    }
    rtl_ioctl (fd_fifo, RTF_SETSIZE, 4000);
    return 0;
}
214 APPENDIX H. THE RTLINUX WHITEPAPER