You are on page 1of 1

Watchdog timers

In many data-acquisition applications the PC must communicate with some external entity such as an intelligent data-logging module or a programmable logic controller. In these cases it can be useful for both components of the system to be aware of whether the other is functioning correctly. There are a number of ways in which the state of one subsystem can be determined by another. A program running on a PC can close a normally open contact to indicate that it has booted successfully and is currently monitoring some process or other. If the PC and relay subsequently lose power, the contact will open and alert external equipment or the operator to the situation. However, suppose that power to the PC remained uninterrupted, but the software failed due to a coding error or memory corruption. The contact would remain closed even though the PC was no longer functional. The system could not then make any attempt to automatically recover from the situation. Problems like this are potentially expensive, especially in long-term data-logging applications where the computer may be left unattended and any system crash could result in the loss of many days worth of data. A watchdog timer can help to overcome these problems. This is a simple analogue or digital device which is used to monitor the state of one of the component parts of a data-acquisition or computer system. The subsystem being monitored is required to refresh the watchdog timer periodically. This is usually done by regularly pulsing or changing the state of a digital input to the watchdog timer. In some implementations the watchdog generates a periodic timing signal and the subsystem being monitored must then refresh the watchdog within a predetermined interval after receipt of this signal. If the watchdog is not refreshed within a specified time period it will generate a time-out signal. This signal can be used to reset the subsystem or it can be used for communicating the timeout condition to other subsystems. The IBM PS/2 range of computers is equipped with a watchdog timer which monitors the computers system timer interrupt (IRQ0). If the software fails to service the interrupt, the watchdog generates an NMI (see Chapter 5). It is worth mentioning at this point that you should avoid placing watchdog-refresh routines within a hardware-generated periodic interrupt handler (e.g. the system timer interrupt). In the event of a software failure, it is possible that the interrupt will continue to be generated at the normal rate! It is sometimes necessary to interface a watchdog timer to a PCbased data-acquisition system in order to detect program crashes or loss of power to the PC. The timeout signal might be fed to a programmable logic controller, for example, to notify it (or the operator) of the error condition. It is also possible to reboot the PC by connecting the timeout signal to the reset switch (present on most PC-compatible machines) via a suitable relay and/or logic circuits. Occasionally, software crashes can (depending upon the operating system) leave the PCs support circuits in such a state of disarray that even a hardware reset cannot reboot the computer. The only solution in this case is to temporarily turn off the computers power. Although rebooting via the reset switch might be possible, the process can take up to two or three minutes on some PCs. It is not always easy for the software to completely recover from this type of failure, especially if the program crash or loss of power occurred at some critical time such as during a disk-write operation. It is preferable for the software to attempt to return to a default operating mode and not to rely on any settings or other information recorded on disk. The extent to which this is feasible will depend upon the nature and complexity of the application.

You might also like