
An Online Prediction Framework for Sensor Networks

Mohsen Mollanoori
Tarbiat Modarres University mollanoori@modares.ac.ir

M. Majid Hormati
Sharif University of Technology mhormati@math.sharif.edu

Nasrollah M. Charkari
Tarbiat Modarres University charkari@modares.ac.ir

Abstract: This paper presents a novel approach to online prediction in sensor networks based on the temporal correlation of sensed data. The proposed approach greatly reduces the amount of data transmitted by sensor nodes, thus increasing network lifetime. Current prediction frameworks in sensor networks use a model created offline to predict the sensed value. In contrast, our approach creates and uses a prediction model online, so no extra buffering or model creation delay is needed. Moreover, the error introduced by our framework is bounded and often negligible.

Keywords: Wireless Sensor Networks, Data Gathering, Temporal Dimension, Prediction.

1. Introduction

In wireless sensor networks, nodes are usually powered by non-rechargeable batteries, so the energy available to them is very limited. As a result, reducing energy consumption and prolonging the network lifetime are major challenges in sensor networks. This limited energy is mostly consumed in the transmission and reception of data. Hence, reducing the number of transmitted packets greatly affects the network lifetime. Since sensed data in WSNs reflects physical attributes of the environment, the generated data has both spatial and temporal correlation. Many algorithms exploit this property to prolong the network lifetime. In this paper, we propose a novel approach that exploits the temporal correlation of sensed data to reduce the energy consumption of network nodes. The proposed method uses a prediction model to reduce the number of transmitted packets and prolong the network lifetime.

There are two types of data gathering queries in sensor networks. In the first type, the result of the query is updated periodically at an interval specified by the user. Queries of this type are called continuous or long-running queries. In the second type, queries are answered once [1]. Our proposed method optimizes energy consumption while answering continuous queries. Many algorithms use in-network query processing and in-network aggregation to reduce energy consumption in WSNs [2], [3], [4]. In-network aggregation is not useful in many cases, e.g. when all the sensed data is requested by the user. Our proposed method is applicable in cases where in-network aggregation is not.

Our approach uses two predictors, one at the source and one at the destination. The user provides an error threshold ε. If the prediction error at the source node is within this threshold, the sensed value is not transmitted to the destination; otherwise, it is transmitted. The total error in the results is guaranteed to be bounded by the user-provided threshold ε. Moreover, all sensed data, including values that are never transmitted, is available online at the sink node with a maximum error of ε. In contrast with [5] and [6], in which the prediction model is created offline and the model parameters are sent to the sink, no extra buffering or model creation delay is needed. Under this scheme, any predictor can only reduce the number of transmissions compared with the case in which no prediction is used. The experimental results show that with an error threshold of 3%, the network lifetime can be extended up to 29 times compared with the case in which no prediction is used.

2. Related Works

Energy-efficient data aggregation was among the first research issues addressed in sensor networks. Directed Diffusion [3] and TAG [2] are among the first works in this area. These algorithms use in-network data aggregation to save energy and extend the network lifetime.

Clustered AGgregation (CAG) is a routing algorithm that leverages the spatial and temporal locality of data generated by nodes to reduce the number of transmitted packets. The algorithm forms clusters of nodes sensing similar values within a given threshold (spatial correlation). These clusters remain unchanged as long as the sensor values stay within a threshold over time (temporal correlation). With CAG, only one sensor reading per cluster is transmitted. Thus, CAG provides energy-efficient, approximate aggregation results with a small, bounded, and often negligible error [7]. Our proposed method differs from these works in that it neither exploits the spatial correlation among sensed data nor uses in-network aggregation. Instead, it exploits the temporal correlation among the data sensed at a single node to minimize the number of transmitted messages.

TiNA uses a hierarchical in-network data aggregation method to gather sensed data in WSNs. In this algorithm, every leaf node and relay node sends its sensed value to its parent only if the value has changed by more than a user-specified error threshold compared with the last transmitted value. The results generated by TiNA are therefore approximate [8]. Our proposed method differs from TiNA in two ways: first, it does not perform in-network aggregation, and second, different prediction models can be applied, which can greatly reduce the transmission cost.

TEEN is a clustered routing algorithm that uses two different threshold values, the hard threshold and the soft threshold. In this algorithm, nodes continuously sense the environment. When the sensed value reaches the hard threshold, the sensor switches on its transmitter and sends the data for the first time. After that, the node sends a sensed value only if the value is greater than the hard threshold and differs from the last sent value by an amount equal to or greater than the soft threshold [9].

In [5], a statistical method is presented that uses prediction modeling to decrease the temporally redundant data transmitted to the sink. A prediction model is fit to the sensor data at each node, and the estimated model parameters are transmitted to the sink node. The prediction model is expected to perform well for some period of time, the length of which depends on the level of temporal correlation in the sensor data. Statistical hypothesis testing is used to infer the correct time to update the model parameters. In contrast with this method, which creates its model offline, our method creates the model online. Therefore, our method imposes no extra delay while gathering the sensed values.

In [10], an algorithm for decreasing the energy consumption of approximate in-network data aggregation is proposed. The algorithm adaptively redistributes the error thresholds to the nodes that benefit the most and tries to minimize the total number of transmitted messages in the network. The authors of [11] proposed new approaches for linear forecasting in sensor networks. The proposed approaches have O(1) space and time complexity, which makes them feasible for sensor network applications.

3. System Model
Our proposed method assumes a network consisting of a set of sensor nodes (N) and one or more sink nodes. Each sensor node is equipped with m sensors, S1, S2, ..., Sm, sensing the environment. Each sensor node takes a reading from sensor Si every τ time units and sends it to the sink node. Our approach makes no assumption about the routing algorithm or the network topology, so any routing algorithm that delivers a single reading from the source node to the sink is acceptable. We assume independent streams from each sensor node to the sink, i.e. no aggregation is done in relay nodes. With this assumption, the system model can be seen as a set of source-destination pairs (Figure 1). The transmission channel may be single-hop or multi-hop.

Source -- Transmission Channel --> Destination

Figure 1 - A single source-destination pair connected by a transmission channel

In this paper we focus on answering queries of the form:

SELECT sensor_name FROM SENSORS EPOCH DURATION τ VALUES WITHIN ε%

The query requests all values of sensor_name. The results must be updated every τ time units, and ε is the acceptable error threshold for the results.

It is clear that the error caused by predicting any value at the sink does not exceed the user-specified error threshold, because otherwise the node sends the sensed value to the sink.
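The bound can be spelled out case by case. As a sketch, write s for the sensed value (assumed non-zero), v for the prediction, which is identical at the node and at the sink because both predictors learn from the same values, and ε for the threshold expressed as a fraction; the symbol \hat{s} for the value available at the sink is introduced here only for illustration:

```latex
\hat{s} =
\begin{cases}
s, & \text{if } \dfrac{|s - v|}{|s|} > \varepsilon \quad \text{(the value is transmitted)},\\[1ex]
v, & \text{otherwise (nothing is transmitted)},
\end{cases}
\qquad\Longrightarrow\qquad
\frac{|s - \hat{s}|}{|s|} \le \varepsilon .
```

In the first case the relative error is zero; in the second it is at most ε by the test itself.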
Sensor Node:
  ε: user-specified error threshold
  s: value read from sensor
  t: current epoch
  p: the predictor

  v = p.predict(t)
  if (|s - v| / |s|) > ε
    p.learn(s, t)
    send_to_sink(s)
  end if

Figure 2 - The pseudo code for sensor nodes

4. Proposed Method
The proposed method uses two predictors, one on the source node and one on the destination (sink) node. The key idea is to maintain the same state in the two predictors at both ends of the channel. In addition to keeping the predictors in sync, the communication overhead should be acceptable. To achieve this, we propose that the predictors learn only from those values that they cannot predict correctly. We explain the idea more precisely below.

Two operations are defined on a predictor p. With p.learn(v, t), the predictor p is trained on the sensed value v at time epoch t; with v = p.predict(t), the predictor p predicts the value v for time epoch t. An appropriate predictor should satisfy the following properties:

i. It must learn only from the values that could not be predicted correctly.
ii. It should be lightweight, i.e. each operation should finish in O(1) time and use O(1) space.
iii. It must be deterministic, i.e. any instance of the predictor starting from the same initial state must predict the same value for a given time epoch t and learning sequence L = [(v1, t1), (v2, t2), ..., (vn, tn)].

At any time epoch t, each node works as follows. First, it senses a value s from its sensor and predicts a value v using its predictor. If the sensed value s differs from the predicted value v by more than the user-specified threshold, the node trains its predictor with the value s and sends s to the sink. Otherwise, the predictor is not trained and nothing is sent. Figure 2 shows the pseudo code for the algorithm that runs on the sensor nodes.
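The node-side steps described above, together with the matching sink-side steps of Figure 3, can be sketched in Python as follows. This is a minimal illustration rather than the authors' implementation: the class name LastValuePredictor, the helper functions, the shared initial state of 0.0, the lossless in-order channel, and the sample readings are all assumptions made for the example. The predictor simply returns the last value it learned, which is the behavior of the Single Point Predictor of Section 5.1.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class LastValuePredictor:
    """Illustrative O(1), deterministic predictor: predicts the last learned value."""
    last: float = 0.0  # assumed initial state, identical at node and sink

    def learn(self, value: float, epoch: int) -> None:
        self.last = value

    def predict(self, epoch: int) -> float:
        return self.last


def node_step(p: LastValuePredictor, s: float, t: int, eps: float) -> Optional[float]:
    """Sensor-node side (Figure 2): return the value to send, or None to stay silent."""
    v = p.predict(t)
    if abs(s - v) / abs(s) > eps:  # relative error test; assumes s != 0
        p.learn(s, t)              # learn only from mispredicted values
        return s
    return None


def sink_step(p: LastValuePredictor, received: Optional[float], t: int) -> float:
    """Sink side (Figure 3): use the received value if there is one, else predict."""
    if received is not None:
        p.learn(received, t)
        return received
    return p.predict(t)


def run(readings: List[float], eps: float) -> Tuple[List[float], int]:
    """Replay a trace over a lossless channel (an assumption of this sketch)."""
    node_p, sink_p = LastValuePredictor(), LastValuePredictor()
    reconstructed, sent = [], 0
    for t, s in enumerate(readings):
        msg = node_step(node_p, s, t, eps)
        sent += msg is not None
        reconstructed.append(sink_step(sink_p, msg, t))
    return reconstructed, sent


readings = [20.0, 20.1, 20.2, 20.9, 21.0, 21.1, 25.0, 25.1]  # made-up values
values, sent = run(readings, eps=0.03)
print(sent, values)
```

Because both predictors start from the same state and learn from exactly the transmitted values, they remain in sync without exchanging any model parameters. With these made-up readings and a 3% threshold, only three of the eight values are transmitted.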

At the sink, there is a predictor pi corresponding to each node ni. If a value is received from node ni, the corresponding predictor is trained on it and the received value is taken as the sensed value at ni; otherwise, the sink uses pi to predict the sensed value of ni. Figure 3 shows the pseudo code of the algorithm running at the sink.

Sink:
  t: current epoch
  ri: value received from sensor i, or null if nothing has been received
  pi: the predictor of sensor i
  vi: the sensed value at sensor i, with maximum error ε

  for each node i
    if ri <> null
      pi.learn(ri, t)
      vi = ri
    else
      vi = pi.predict(t)
    end if
  end for

Figure 3 - The pseudo code for the sink

5. Experiments
To analyze the effectiveness of our approach, we conducted several experiments. We used a simple Shortest Path Tree (SPT) routing algorithm. In SPT, every node sends data along the shortest path to the sink node. To build an SPT, every node selects as its next hop the neighbor that minimizes the routing cost to the sink [12]. We used real traces of sensor data from the Intel Berkeley data set [13] in our experiments. The data set contains sensor readings from 54 sensor nodes deployed in the Intel Berkeley research lab. The readings consist of sensed values for temperature, humidity, light, and voltage. Because a considerable number of the light sensor readings are missing, we excluded them from our experiments.

5.1. Predictors
Here we introduce three heuristic predictors that are used in our experiments. The predictors have the necessary properties mentioned earlier. The value vc is predicted at a given time epoch tc from a learning sequence L = [(v1, t1), (v2, t2), ..., (vn, tn)], where t1 < t2 < ... < tn < tc, as follows.

Single Point Predictor (SPP): This predictor always returns the last value of the learning sequence, as shown in Equation (1). The predictor works well when there is high temporal locality in the sensed data.

$$v_c = \mathrm{SPP}(t_c) = v_n \qquad (1)$$

Simple Linear Extrapolation (SLE): The predictor creates a tangent line at the end of the learning sequence, i.e. the line through its last two points, and extends it beyond that limit. Equation (2) shows the formula to obtain vc. The predictor provides good results when the sensed data is linearly predictable.

$$v_c = \frac{t_c - t_{n-1}}{t_n - t_{n-1}} \, (v_n - v_{n-1}) + v_{n-1} \qquad (2)$$

Even Linear Extrapolation (ELE): This prediction model is a mixture of SPP and SLE. It is mostly useful when most of the data is linearly predictable but rapid changes (jumps) occur in some parts of the sensed values. Equation (3) shows the formula to obtain vc.

$$v_c = \begin{cases} v_n, & n \text{ is odd} \\ \dfrac{t_c - t_{n-1}}{t_n - t_{n-1}} \, (v_n - v_{n-1}) + v_{n-1}, & n \text{ is even} \end{cases} \qquad (3)$$

5.2. Energy Model
The energy model used in our experiments is the same as the one used in [14]. In this model, the energy consumed to transmit a packet over a distance d is:

$$E_d = C + K d^{\alpha} \qquad (4)$$

As both simulated algorithms are hop by hop and the distance between transmitter and receiver in our simulations is short, we set α = 2 in all cases (free space model). As in [14], we used C = 50 nJ/bit and K = 10 pJ/bit.

5.3. Simulation Results
We compared our approach, using the predictors described above, with the case in which all data is sent to the sink (no prediction). To evaluate the approach, we used three different metrics.

Lifetime: The lifetime of the network is taken to be the time of the first node failure, as in [15] and [16]. Figure 4 shows the lifetime of the network for different error threshold values and different sensors (Voltage, Temperature, Humidity). The approach prolonged the network lifetime up to 29 times for the voltage sensor using SPP with an error threshold of 3%, because the voltage readings are almost stationary with only small changes. The two other predictors perform better in temperature and humidity prediction. The lifetime of the network for temperature and humidity is prolonged up to 10 times using the ELE and SLE predictors respectively.

Average Energy Consumption (AEC): The AEC of the nodes is the total energy consumed by the nodes divided by the number of nodes. SPP reduced the AEC for voltage readings by more than 180 times. The AEC for temperature and humidity is reduced by up to 14.9 and 16.5 times respectively. Figure 5 compares the AEC for different error thresholds and different predictors.

Quality of Data (QoD): Although the user provides a maximum error threshold, the values generated by our approach usually have a much lower average error. We define the QoD of the data as in Equation (5):

$$\mathrm{QoD} = 1 - \frac{\sum_i \dfrac{|S_i - V_i|}{|S_i|}}{N} \qquad (5)$$

where Si is a sensor reading and Vi is the value generated by our approach. Figure 6 shows the QoD plots for different sensor readings and different error threshold values. For an error threshold of 3%, the QoD is about 99% for all three sensor readings.
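As a rough illustration of how the three predictors of Section 5.1 and the QoD metric of Equation (5) fit together, the following Python sketch replays a trace through each predictor and reports the number of transmitted values and the resulting QoD. Several choices here are assumptions and not taken from the paper: the function names, the rule that the first reading is always transmitted, the fallback of SLE to SPP when only one point has been learned, reading N as the number of readings, and the synthetic linearly rising trace used in place of the Intel Lab data.

```python
from typing import Callable, List, Tuple

Seq = List[Tuple[float, float]]  # learning sequence [(v1, t1), ..., (vn, tn)]


def spp(L: Seq, tc: float) -> float:
    """Equation (1): return the last learned value."""
    return L[-1][0]


def sle(L: Seq, tc: float) -> float:
    """Equation (2): extend the line through the last two learned points."""
    if len(L) < 2:  # assumption: fall back to SPP until two points are known
        return spp(L, tc)
    (v0, t0), (v1, t1) = L[-2], L[-1]
    return (tc - t0) / (t1 - t0) * (v1 - v0) + v0


def ele(L: Seq, tc: float) -> float:
    """Equation (3): SPP when n is odd, SLE when n is even."""
    return spp(L, tc) if len(L) % 2 == 1 else sle(L, tc)


def replay(readings: List[float], predict: Callable[[Seq, float], float],
           eps: float) -> Tuple[List[float], int]:
    """Return (values available at the sink, number of transmitted values).

    The whole sequence is kept only for readability; the last two points
    are enough for all three predictors, which keeps the state O(1).
    """
    L: Seq = []
    out: List[float] = []
    sent = 0
    for t, s in enumerate(readings):
        v = predict(L, t) if L else None  # assumption: first reading is sent
        if v is None or abs(s - v) / abs(s) > eps:
            L.append((s, t))              # learn only from mispredicted values
            out.append(s)
            sent += 1
        else:
            out.append(v)
    return out, sent


def qod(S: List[float], V: List[float]) -> float:
    """Equation (5): one minus the mean relative error over the readings."""
    return 1 - sum(abs(s - v) / abs(s) for s, v in zip(S, V)) / len(S)


ramp = [20.0 + 0.5 * t for t in range(20)]  # synthetic, linearly rising trace
results = {}
for name, pred in [("SPP", spp), ("SLE", sle), ("ELE", ele)]:
    values, sent = replay(ramp, pred, eps=0.03)
    results[name] = (sent, qod(ramp, values))
    print(name, sent, round(results[name][1], 4))
```

On this synthetic ramp, SPP transmits 10 of the 20 readings, while SLE and ELE transmit only 2 each, which is consistent with the observation that the extrapolating predictors suit linearly predictable data. These numbers describe the toy trace only and say nothing about the results on the real data set.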

Figure 4 - The lifetime of the network (vertical axis) for different values of error threshold (horizontal axis), using different predictors. Left to right: Voltage, Temperature, and Humidity.

Figure 5 - The average energy consumption (vertical axis) of the nodes for different values of error threshold (horizontal axis), using different predictors. Left to right: Voltage, Temperature, and Humidity.

Figure 6 - The Quality of Data (vertical axis) for different values of error threshold (horizontal axis), using different predictors. Left to right: Voltage, Temperature, and Humidity.

6. Conclusion
Transferring all the sensed data to the sink has high energy overhead in sensor networks. Furthermore, as sensor nodes have limited bandwidth, transferring all sensed data in a short time interval may greatly increase the collision of packets. In this paper we proposed a framework for online prediction in sensor networks. With a bounded and often negligible error, the framework could greatly extend the network lifetime.

Future work could extend the method to support in-network aggregation and develop adaptive predictors.

Acknowledgment
This research was funded in part by Iran Telecommunication Research Center (ITRC).

References
[1] R. Gummadi, X. Li, R. Govindan, C. Shahabi, and W. Hong, "Energy-efficient data organization and query processing in sensor networks," in Data Engineering, 2005. ICDE 2005.
[2] S. Madden, M. J. Franklin, J. Hellerstein, and W. Hong, "TAG: a Tiny AGgregation Service for Ad-Hoc Sensor Networks," in USENIX Association 5th Symposium on Operating Systems Design and Implementation, 2002.
[3] C. Intanagonwiwat, R. Govindan, and D. Estrin, "Directed Diffusion: A Scalable and Robust Communication Paradigm for Sensor Networks," 2000.
[4] S. Pattem, B. Krishnamachari, and R. Govindan, "The Impact of Spatial Correlation on Routing with Compression in Wireless Sensor Networks," in IPSN, Berkeley, California, USA, 2004.
[5] T. Arici, T. Akgun, and Y. Altunbasak, "A Prediction Error-Based Hypothesis Testing Method for Sensor Data Acquisition," ACM Transactions on Sensor Networks, vol. 2, pp. 529-556, 2006.
[6] D. Tulone and S. Madden, "An Energy-efficient Querying Framework In Sensor Networks For Detecting Node Similarities," in MSWiM'06, Malaga, Spain: ACM, 2006.
[7] S. Yoon and C. Shahabi, "The Clustered AGgregation (CAG) Technique Leveraging Spatial and Temporal Correlations in Wireless Sensor Networks," ACM Transactions on Sensor Networks, vol. 3, 2007.
[8] M. A. Sharaf, J. Beaver, A. Labrinidis, and P. K. Chrysanthis, "TiNA: A Scheme for Temporal Coherency-Aware in-Network Aggregation," in MobiDE'03, California, USA: ACM, 2003.
[9] A. Manjeshwar and D. P. Agrawal, "TEEN: A Routing Protocol for Enhanced Efficiency in Wireless Sensor Networks," IEEE, 2001.
[10] A. Deligiannakis, Y. Kotidis, and N. Roussopoulos, "Hierarchical In-Network Data Aggregation with Quality Guarantees," Springer-Verlag, 2004.
[11] J.-J. Lim and K. G. Shin, "Energy-efficient Self-adapting Online Linear Forecasting for Wireless Sensor Network Applications," IEEE, 2005.
[12] M. Mollanoori and N. M. Charkari, "LAD: A Routing Algorithm to Prolong the Lifetime of Wireless Sensor Networks," in ICNSC'08, China: IEEE, 2008 (to appear).
[13] "Intel Lab Data, http://berkeley.intelresearch.net/labdata."
[14] W. B. Heinzelman, A. P. Chandrakasan, and H. Balakrishnan, "An Application-Specific Protocol Architecture for Wireless Microsensor Networks," IEEE Transactions on Wireless Communications, vol. 1, 2002.
[15] S. Panichpapiboon, G. Ferrari, and O. K. Tonguz, "Optimal Transmit Power in Wireless Sensor Networks," IEEE Transactions on Mobile Computing, vol. 5, 2006.
[16] W. Liang and Y. Liu, "Online Data Gathering for Maximizing Network Lifetime in Sensor Networks," IEEE Transactions on Mobile Computing, vol. 6, 2007.
