
Expert Systems with Applications 37 (2010) 3310–3317


A hybrid of nonlinear autoregressive model with exogenous input and autoregressive moving average model for long-term machine state forecasting

Hong Thom Pham, Van Tung Tran, Bo-Suk Yang *
School of Mechanical Engineering, Pukyong National University, San 100, Yongdang-dong, Nam-gu, Busan 608-739, South Korea
* Corresponding author. Tel.: +82 51 629 6152; fax: +82 51 629 6150. E-mail address: bsyang@pknu.ac.kr (B.-S. Yang).
doi:10.1016/j.eswa.2009.10.020

Keywords: Autoregressive moving average (ARMA); Nonlinear autoregressive with exogenous input (NARX); Long-term prediction; Machine state forecasting

Abstract

This paper presents an improved hybrid of the nonlinear autoregressive with exogenous input (NARX) model and the autoregressive moving average (ARMA) model for long-term machine state forecasting based on vibration data. In this study, vibration data are considered as a combination of two components: a deterministic component and an error component. The deterministic component describes the degradation index of the machine, whilst the error component depicts the appearance of uncertain parts. An improved hybrid forecasting model, the NARX–ARMA model, is built in which the NARX network, which is well suited to nonlinear problems, forecasts the deterministic component, while the ARMA model, with its capability in linear prediction, forecasts the error component. The final forecasting result is the sum of the results obtained from these two single models. The performance of the NARX–ARMA model is then evaluated using data of a low methane compressor acquired from a condition monitoring routine. In order to corroborate the advantages of the proposed method, a comparative study of the forecasting results obtained from the NARX–ARMA model and traditional models is also carried out. The comparative results show that the NARX–ARMA model outperforms the traditional models and could be used as a potential tool for machine state forecasting.

© 2009 Elsevier Ltd. All rights reserved.

1. Introduction

Machine state forecasting plays an increasingly important role in modern industry due to its ability to foretell the states of a machine in the future. This provides the necessary information for system operators to implement the essential actions to avoid catastrophic failures, which lead to costly maintenance or even human casualties. Moreover, foretelling the states of a machine enables maintenance actions to be scheduled more effectively, avoids unplanned breakdowns, assists maintainers in estimating the remaining useful life, and provides alarms before a fault reaches a critical level, preventing machinery performance degradation and malfunction (Liu, Wang, & Golnaraghi, 2009). Consequently, machine state forecasting has attracted considerable attention from researchers in recent years.

In order to predict the future states of a machine, a forecasting model uses the available observations that are generated from measured data by appropriate signal processing techniques. The measured data could be vibration, acoustic, oil analysis, temperature, pressure, moisture, etc. Among them, vibration data are commonly used because the signals are easy to measure and analyze. Several forecasting models have been successfully proposed in the literature, in which model-based techniques and data-driven techniques are commonly utilized. Model-based techniques are applicable where accurate mathematical models can be constructed from the physical fundamentals of a system, whilst data-driven techniques require large amounts of historical failure data to build forecasting models that learn the system behavior. Obviously, data-driven techniques are less accurate than model-based techniques in prediction capability. However, data-driven techniques, which are frequently based on artificial intelligence, can flexibly generate forecasting models regardless of the complexity of the system. Therefore, these techniques, some of which have been proposed in Liu et al. (2009), Tran, Yang, Oh, and Tan (2008), Vachtsevanos and Wang (2001), Wang (2007), and Wang, Golnaraghi, and Ismail (2004), are the first choice in many researchers' investigations.
An alternative approach to improve the predicting capability in time series forecasting is the combination of model-based and data-driven techniques. According to Zhang (2003), the reasons for hybridizing these models are: (i) in practice, it is difficult to determine whether a time series under study is generated from a linear or a nonlinear underlying process, or whether one particular method is more effective than the other in out-of-sample forecasting; (ii) real-world data are rarely purely linear or nonlinear, so neither model-based techniques nor data-driven techniques alone can be adequate in modeling and forecasting. Model-based techniques can adequately capture the linear component of a time series, while data-driven techniques are highly flexible in modeling the nonlinear components. Accordingly, numerous hybrid models have been proposed to provide more precise predictions. For instance, Zhang (2003) combined the autoregressive integrated moving average (ARIMA) model and a neural network model to forecast three well-known time series sets: the sunspot data, the Canadian lynx data and the British pound/US dollar exchange rate data. Ince and Trafalis (2006) proposed a hybrid model including parametric techniques (e.g. ARIMA, vector autoregressive) and nonparametric techniques (e.g. support vector regression, artificial neural networks) for forecasting the exchange market. A hybrid of ARIMA and support vector machines was successfully presented by Pai and Lin (2005) for stock price prediction problems. Other outstanding hybrid approaches can be found in Rojas et al. (2008), Tseng, Yu, and Tzeng (2002), and Valenzuela et al. (2008). Most of these hybrid models were implemented as the following process: first, the model-based technique was used to predict the linear relation; then, the data-driven technique was utilized to forecast the residuals between the actual values and the predicted results obtained in the previous step. The final results were the sum of the results gained from each model. Furthermore, these hybrid approaches were merely regarded as short-term prediction methodologies.

In this study, an improved hybrid forecasting model is proposed for long-term prediction of the operating states of a machine. The prediction strategy used here is the recursive strategy, which is one of the strategies mentioned in Sorjamaa, Hao, Reyhani, Ji, and Lendasse (2007). This forecasting model, involving the nonlinear autoregressive with exogenous input (NARX) model (Leontaritis & Billings, 1985) and the autoregressive moving average (ARMA) model (Box & Jenkins, 1970), is novel in the following aspects: (1) vibration data indicating the state of the machine are divided into a deterministic component and an error component, the latter being the residual between the actual data and the deterministic component. NARX and ARMA are simultaneously employed to forecast the former and the latter, respectively. The final forecasting results are the sum of the results obtained from each single model; (2) long-term forecasting, which is still a difficult and challenging task in the time series prediction domain, is applied.

Additionally, the number of observations used as the input for the forecasting model, the so-called embedding dimension, is a problem often encountered in time series forecasting techniques. The embedding dimension can be estimated by using either Cao's method (1997) or the false nearest neighbor (FNN) method (Kennel, Brown, & Abarbanel, 1992). However, the FNN method depends on the chosen parameters, wherein different values lead to different results. Furthermore, the FNN method also depends on the number of available observations and is sensitive to additional noise. Cao's method overcomes the shortcomings of the FNN approach and is therefore chosen in this study.

2. Background knowledge

2.1. Nonlinear autoregressive model with exogenous inputs (NARX)

The NARX model is an important class of discrete-time nonlinear systems that can be mathematically represented as follows:

y(t + 1) = f[y(t), y(t − 1), ..., y(t − n_y + 1); u(t), u(t − 1), ..., u(t − n_u + 1); W] = f[y(t), u(t); W]    (1)

where u(t) ∈ ℝ and y(t) ∈ ℝ respectively represent the input and output of the model at time t, n_u ≥ 1 and n_y ≥ 1 (n_y ≥ n_u) are the input-memory and output-memory orders, W is a weight matrix, and f is the nonlinear function, which is approximated by a multilayer perceptron. The structure of an NARX network is depicted in Fig. 1.

Fig. 1. The structure of NARX with n_u inputs and n_y output delays.

Basically, the NARX network is trained in one of two modes:

Parallel (P) mode: the output is fed back to the input of the feedforward neural network as part of the standard NARX architecture:

ŷ(t + 1) = f̂[y_p(t), u(t); W] = f̂[ŷ(t), ŷ(t − 1), ..., ŷ(t − n_y + 1); u(t), u(t − 1), ..., u(t − n_u + 1); W]    (2)

Series-parallel (SP) mode: the output's regressor is formed only by actual values of the system's output:

ŷ(t + 1) = f̂[y_sp(t), u(t); W] = f̂[y(t), y(t − 1), ..., y(t − n_y + 1); u(t), u(t − 1), ..., u(t − n_u + 1); W]    (3)

As mentioned above, the NARX network inputs include the regressors of the inputs and outputs of the system, while a time series consists of one or more measured output channels with no measured input. Hence, the forecasting ability of the NARX network may be limited when it is applied to time series data without an input regressor. In this kind of application, the tapped-delay line over the input signal is eliminated, and the NARX network is reduced to the plain focused time-delay neural network architecture (Lin, Horne, Tino, & Giles, 1997):

ŷ(t + 1) = f[y(t), y(t − 1), ..., y(t − n_y + 1); W]    (4)

In Menezes and Barreto (2006), a simple strategy based on Takens' embedding theorem was proposed for solving this problem. This strategy allows the computational abilities of the original NARX network to be fully exploited in nonlinear time series prediction tasks and is described as the following process.

Firstly, the input signal regressor, denoted by u(t), is defined by the delay embedding coordinates:

u(t) = [y(t), y(t − τ), ..., y(t − (d_E − 1)τ)]    (5)

where d_E = n_u is the embedding dimension and τ is the embedding delay.

Secondly, since the NARX network can be trained in two different modes, the output signal regressor y(t) can be written as follows:

y_sp(t) = [y(t), y(t − 1), ..., y(t − n_y + 1)]    (6)
y_p(t) = [ŷ(t), ŷ(t − 1), ..., ŷ(t − n_y + 1)]    (7)

where the output regressor y_sp(t) for the SP mode in Eq. (6) contains n_y past values of the actual time series, while the output regressor y_p(t) for the P mode in Eq. (7) contains n_y past values of the estimated time series.

For a suitably trained network, these outputs are estimates of previous values of y(t + 1) and obey the following predictive relationships implemented by the NARX network:

ŷ(t + 1) = f[y_sp(t), u(t); W]    (8)
ŷ(t + 1) = f[y_p(t), u(t); W]    (9)

The NARX networks trained according to Eqs. (8) and (9) are denoted henceforth as the NARX-SP and NARX-P networks, respectively.
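To make the two training regimes concrete, the short sketch below (our own illustration in Python/NumPy, not code from the paper) builds the delay-embedded input regressor u(t) of Eq. (5) together with the SP-mode and P-mode output regressors of Eqs. (6) and (7) for a scalar series; all function and variable names are ours.

```python
import numpy as np

def input_regressor(y, t, d_e, tau):
    """Delay-embedded input regressor u(t) of Eq. (5):
    [y(t), y(t - tau), ..., y(t - (d_e - 1) * tau)]."""
    return np.array([y[t - i * tau] for i in range(d_e)])

def sp_regressor(y, t, n_y):
    """Series-parallel (SP) output regressor of Eq. (6): past *actual* values."""
    return np.array([y[t - i] for i in range(n_y)])

def p_regressor(y_hat, t, n_y):
    """Parallel (P) output regressor of Eq. (7): past *predicted* values."""
    return np.array([y_hat[t - i] for i in range(n_y)])

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    y = np.sin(0.1 * np.arange(200)) + 0.05 * rng.standard_normal(200)

    d_e, tau, n_y = 4, 1, 4
    t = 50
    print("u(t)    =", input_regressor(y, t, d_e, tau))
    print("y_sp(t) =", sp_regressor(y, t, n_y))  # equals u(t) here since tau = 1 and n_y = d_e
```

In practice the SP regressor is only available while actual observations exist, so training is normally done in SP mode; during long-term forecasting the network runs in P mode and its own predictions are fed back, which is what makes recursive multi-step prediction challenging.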
2.2. Autoregressive moving average (ARMA)

The ARMA(p, q) prediction model for a time series y_t is given as follows:

y_t = c + Σ_{i=1}^{p} φ_i y_{t−i} + Σ_{j=1}^{q} θ_j e_{t−j} + e_t    (10)

where c is a constant, p is the number of autoregressive orders, q is the number of moving average orders, φ_i are the autoregressive coefficients, θ_j are the moving average coefficients, and e_t is a normal white noise process with zero mean and variance σ².

Box and Jenkins (1970) proposed three iterative steps to build ARMA models for time series: model identification, parameter estimation and diagnostic checking. Elaborate information on each step can be found in Zhang (2003). In order to determine the orders of the ARMA model, the autocorrelation function (ACF) and the partial autocorrelation function (PACF) are used in conjunction with the Akaike information criterion. Another selection technique associated with the ACF and PACF for estimating the parameters of the ARMA model is maximum likelihood estimation (MLE) (Ljung & Box, 1987), which is used in this study.

For a weakly stationary stochastic process, the first and second moments exist and do not depend on time:

E(y_1) = E(y_2) = ... = E(y_t) = μ
V(y_1) = V(y_2) = ... = V(y_t) = σ²    (11)
Cov(y_t, y_{t−k}) = Cov(y_{t+1}, y_{t−k+1}) = γ_k

From the conditions in Eq. (11), the covariances are functions only of the lag k; they are usually called autocovariances. The autocorrelations, denoted as ρ_k, can likewise be written so that they depend only on the lag:

ρ_k = Cov(Y_t, Y_{t+k}) / √(Var(Y_t) Var(Y_{t+k})) = γ_k / γ_0    (12)

The autocorrelations considered as a function of k are referred to as the ACF. Note that since

γ_k = Cov(Y_t, Y_{t−k}) = Cov(Y_{t−k}, Y_t) = Cov(Y_t, Y_{t+k}) = γ_{−k}    (13)

it follows that γ_k = γ_{−k}, and only the positive half of the ACF is usually given.

In practice, because a time series has a finite number of observations N, only an estimate of the autocorrelation can be obtained. If r_k denotes the estimated autocorrelation coefficient, the formula to obtain these parameters is

r_k = Σ_{t=1}^{N−k} (Y_t − μ)(Y_{t+k} − μ) / Σ_{t=1}^{N} (Y_t − μ)²    (14)

Then the partial ACF can be attained from

w_t = (Y_t − μ) = Φ_{1k} w_{t−1} + Φ_{2k} w_{t−2} + ... + Φ_{kk} w_{t−k} + e_t    (15)
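As a concrete illustration of the identification step, the sketch below estimates the sample ACF of Eq. (14) and fits a low-order ARMA model to a synthetic error series. It uses statsmodels' ARIMA class with zero differencing as a stand-in for a dedicated ARMA routine; this choice of library, the synthetic data, and all names in the snippet are ours rather than the authors'.

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA  # ARMA(p, q) fitted as ARIMA(p, 0, q)

def sample_acf(y, max_lag):
    """Sample autocorrelation r_k of Eq. (14) for k = 0 .. max_lag."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    dev = y - y.mean()
    denom = np.sum(dev ** 2)
    return np.array([np.sum(dev[: n - k] * dev[k:]) / denom
                     for k in range(max_lag + 1)])

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Synthetic stand-in for the error component e(t): an ARMA-like noise series.
    e = rng.standard_normal(750)
    for k in range(1, len(e)):
        e[k] += 0.5 * e[k - 1]

    print("ACF up to lag 5:", np.round(sample_acf(e, 5), 3))

    # Orders (p, q) would be chosen from the ACF/PACF and a criterion such as AIC;
    # the orders finally adopted in the paper are ARMA(3, 4) and ARMA(3, 3).
    model = ARIMA(e, order=(3, 0, 4)).fit()
    print("5-step-ahead forecast of the error component:", model.forecast(steps=5))
```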


H.T. Pham et al. / Expert Systems with Applications 37 (2010) 3310–3317 3313

difficult convergence if the number of neurons is small or over-fit- 4. Proposed forecasting system
ting if the number of neurons is large. On the other hand, ARMA
model is not able to apply well for the data which includes nonlin- In order to forecast the future states of machine, the proposed
ear components and is inadequate the stationary condition. In this system comprises four procedures sequentially as shown in
paper, an improved hybrid model is proposed in which the vibra- Fig. 3, namely, data acquisition, building model, validating model,
tion data is divided into two components: deterministic and error. and forecasting. The role of each procedure is explained as follows:
The deterministic component x = [x1, x2, . . . , k, . . . , xt1] is obtained
from a time series data y = [y1, y2, . . . , yk, . . . , yt] by using filtering Step 1 Data acquisition: this procedure is used to obtain the
technique, where xk is described as: vibration data from machine condition. This data is then split
into two parts: training set and testing set. Different data is
yk1 þ yk þ ykþ1 used for different purposes in the prognosis system. Training
xk ¼ ; k ¼ 1; 2; 3; . . . ; t  1 ð16Þ
3 set is used for creating the forecasting models whilst testing
set is utilized to test the trained models.
The error component e = [e1, e2, . . . , ek, . . ., et1]is the residual be- Step 2 Building model: Training data is separated into two com-
tween y and x, where ek = yk  xk, k = 1, 2, 3, . . . , t  1. The deter- ponents: deterministic component and error component. They
ministic component is degradation indicator which describes are used to build NARX–ARMA model as the process mentioned
clearly the machine’s health. This component is suitably captured in the previous section.
by NARX network. ARMA model is used to analyze the error compo- Step 3 Validating model: this procedure is used for measuring the
nent which describes the appearance of uncertain parts. The process performance capability.
of m step-ahead prediction using this proposal is shown in Fig. 2 Step 4 Forecasting: long-term forecasting method is used to fore-
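A minimal sketch of this decomposition is given below, assuming the three-point centred moving average of Eq. (16); how the boundary samples are treated is our own choice, since the filter itself does not define it, and the names are illustrative.

```python
import numpy as np

def decompose(y):
    """Split a vibration series y into a deterministic component x (Eq. (16))
    and an error component e = y - x, as described in Section 3.

    The three-point centred average needs both neighbours, so the first and
    last samples are simply dropped here (a boundary-handling assumption of
    this sketch).
    """
    y = np.asarray(y, dtype=float)
    x = (y[:-2] + y[1:-1] + y[2:]) / 3.0   # deterministic (degradation) component
    e = y[1:-1] - x                        # error (uncertain) component
    return x, e

if __name__ == "__main__":
    rng = np.random.default_rng(2)
    t = np.arange(1000)
    y = 0.5 + 0.001 * t + 0.1 * rng.standard_normal(t.size)  # drifting, noisy signal
    x, e = decompose(y)
    print("deterministic mean:", x.mean().round(3), " error std:", e.std().round(3))
```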
4. Proposed forecasting system

In order to forecast the future states of the machine, the proposed system comprises four sequential procedures, as shown in Fig. 3: data acquisition, building model, validating model, and forecasting. The role of each procedure is explained as follows:

Step 1 Data acquisition: this procedure obtains the vibration data from machine condition monitoring. The data are then split into two parts, a training set and a testing set, which serve different purposes in the prognosis system. The training set is used for creating the forecasting models, whilst the testing set is utilized to test the trained models.
Step 2 Building model: the training data are separated into two components, a deterministic component and an error component, which are used to build the NARX–ARMA model following the process described in the previous section.
Step 3 Validating model: this procedure measures the performance capability of the trained model.
Step 4 Forecasting: the long-term forecasting method is used to forecast the future states of the machine. The forecast results are assessed by the error between the forecast values and the actual values in the testing set.

Fig. 3. Proposed forecasting system.

5. Experiments and results

5.1. Experiments

The proposed method is applied to a real system to forecast the trending data of a low methane compressor of a petrochemical plant. The compressor, shown in Fig. 4, is driven by a 440 kW, 6600 V, 2-pole motor operating at a speed of 3565 rpm. Other information on the system is summarized in Table 1.

Fig. 4. Low methane compressor: wet screw type.

The condition monitoring system of this compressor consists of two types, namely off-line and on-line. In the off-line system, accelerometers were installed along the axial, vertical, and horizontal directions at various locations of the drive-end motor, non drive-end motor, male rotor compressor and suction part of the compressor. In the on-line system, accelerometers were located at the same positions as in the off-line system but only in the horizontal direction. The trending data were recorded from August 2005 to November 2005 and included peak acceleration and envelope acceleration data. The average recording duration was 6 h during the data acquisition process. Each data record consisted of approximately 1200 data points, as shown in Figs. 5 and 6, and contained information on the machine history with respect to the time sequence (vibration amplitude); consequently, it can be classified as time series data.
Table 1
Description of the system.

Electric motor                           Compressor
Voltage   6600 V                         Type      Wet screw
Power     440 kW                         Lobe      Male rotor (4 lobes), female rotor (6 lobes)
Pole      2                              Bearing   Thrust: 7321 BDB; radial: sleeve type
Bearing   NDE: #6216, DE: #6216
RPM       3565 rpm

Fig. 5. The entire peak acceleration data of the low methane compressor.
Fig. 6. The entire envelope acceleration data of the low methane compressor.

5.2. Results

In order to build the forecasting model, 719 points of the peak acceleration data and 749 points of the envelope acceleration data are used. The remaining points of each data set are then utilized to test the forecasting model. Additionally, the root mean square error (RMSE) given in Eq. (17) is employed to evaluate the forecasting capability:

RMSE = √( (1/N) Σ_{i=1}^{N} (y_i − ŷ_i)² )    (17)

where N represents the number of data points, y_i is the actual value and ŷ_i represents the predicted value.
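Eq. (17) translates directly into a small helper (an illustration of ours, not part of the original study):

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean square error of Eq. (17)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.sqrt(np.mean((y_true - y_pred) ** 2))

# Example: rmse([1.0, 2.0, 3.0], [1.1, 1.9, 3.2]) is roughly 0.141
```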


The deterministic component x_t is obtained from the vibration data by the filtering technique. Figs. 7 and 8 show the deterministic data after filtering of the envelope and peak acceleration data, respectively. The NARX forecasting model is then generated using these data.

Fig. 7. The filtered envelope acceleration data of the low methane compressor.
Fig. 8. The deterministic component of the peak acceleration data.

To build the NARX model, the embedding dimension d_E must first be determined. As mentioned in Section 1, either the FNN method or Cao's method could be used to estimate d_E. Cao's method can settle on a suitable embedding dimension of a time series and clearly distinguish deterministic signals from stochastic signals, which is the reason why it is chosen in this paper. According to Cao (1997), two important quantities, E1(d) and E2(d), need to be calculated. E1(d) is used to choose the minimum embedding dimension d_E, namely the dimension at which E1(d) reaches saturation. E2(d) is used for the practical problem of deciding whether E1(d) is still slowly increasing or has stopped changing once the embedding dimension d is sufficiently large. For random data, the future values are independent of the past values; hence, E2(d) ≈ 1 for any value of d. However, for deterministic data, E2(d) depends on d; as a result, it cannot be a constant for all d.

Figs. 9 and 10 depict these quantities for the deterministic components of the envelope and peak acceleration data, respectively. In these figures, E1(d) obviously reaches its saturation at d = 4 and d = 5 for the envelope and peak acceleration data, respectively. Consequently, the minimum embedding dimension d_E is chosen as 4 for the envelope acceleration data and 5 for the peak acceleration data to build the NARX model.

Fig. 9. The values E1 and E2 of the deterministic component of the envelope data.
Fig. 10. The values E1 and E2 of the deterministic component of the peak acceleration data.
Other parameters to be considered are the numbers of neurons in the two hidden layers of the NARX model. The numbers of neurons, N1 and N2, in the first and second hidden layers are chosen according to the following heuristic:

N1 = 2 d_E + 1,  N2 = √N1    (18)

where N2 is rounded up toward the next integer. By substituting d_E = 4 and d_E = 5 into Eq. (18), the numbers of neurons are found to be N1 = 9, N2 = 3 for the envelope acceleration data and N1 = 11, N2 = 3 for the peak acceleration data, respectively.

The order of the output regressor n_y in the NARX-P and NARX-SP models is calculated as the product of the time delay τ and the minimum embedding dimension d_E. The time delay τ is chosen as 1 because the long-term recursive forecasting methodology is used in this paper. Thus, the order of the output regressor is set to n_y = τ × d_E = 4 for the envelope acceleration data and n_y = 5 for the peak acceleration data, respectively. Both the NARX-P and NARX-SP models are employed in order to select the proper NARX model. The standard back-propagation algorithm with 500 epochs and a learning rate of 0.01 is used to train the networks. The forecasting capability of these models is evaluated by the RMSE values in Table 2. From this table, the NARX-SP model is superior to the NARX-P model in terms of accuracy. Hence, NARX-SP is chosen to hybridize with the ARMA model.

Table 2
The RMSE values of NARX-SP and NARX-P.

Vibration data    NARX-SP                   NARX-P
                  Training    Predicting    Training    Predicting
Peak              0.0232      0.0260        0.0417      0.0450
Envelope          0.0241      0.0188        0.0597      0.0363
In the next step of the forecasting process, the error component e mentioned in Section 3 is forecast by the ARMA model. In order to create this model, the model identification procedure is first implemented to check the stationarity condition. If the stationarity condition is not met, it must be considered how many orders of differencing are needed to make the data stationary.
A time series is considered to be stationary if its autocorrelation structure is constant over time or the lag-1 autocorrelation is zero or negative. Figs. 11 and 12, which depict the ACF of the error components of the envelope and peak acceleration data, show that the error component satisfies the stationarity requirement. Thus, it can be used directly to generate the ARMA model without requiring any order of differencing. Based on the ACF, the PACF and experimental results, an ARMA(3, 4) model for the envelope acceleration data and an ARMA(3, 3) model for the peak acceleration data are chosen in this study. Furthermore, MLE is used to estimate the model parameters φ_i and θ_j.

Fig. 11. ACF of the error component of the envelope acceleration data.
Fig. 12. ACF of the error component of the peak acceleration data.

Finally, the forecast values of the hybrid model are the sum of the results obtained from the NARX-SP and ARMA models.

Table 3 shows a summary of the RMSE values of the three models applied to the peak and envelope acceleration data. In this table, all the RMSEs of the NARX–ARMA model are considerably lower than those of the other traditional models; it is more accurate for both the peak and the envelope acceleration data. Examples of one-step-ahead forecasting are shown in Fig. 13 for the envelope acceleration data and in Fig. 14 for the peak acceleration data. They indicate that the NARX–ARMA hybrid model can effectively capture and track the system behavior.

Table 3
The RMSE values of forecasting results of ARMA, NARX and NARX–ARMA.

Number of steps ahead    Peak acceleration data               Envelope acceleration data
                         ARMA      NARX      NARX–ARMA        ARMA      NARX      NARX–ARMA
1                        0.2647    0.0703    0.0363           0.2647    0.2090    0.0693
2                        0.2821    0.2221    0.0595           0.2821    0.2581    0.1178
3                        0.2921    0.2421    0.0671           0.2826    0.2664    0.1232
4                        0.3026    0.2826    0.0896           0.2935    0.2867    0.1276
5                        0.3226    0.3026    0.0956           0.2950    0.3018    0.1310
6                        0.3950    0.3250    0.1764           0.3125    0.3271    0.1331

Fig. 13. The forecasting results of the NARX–ARMA model for the envelope acceleration data using one-step-ahead prediction.
Fig. 14. The forecasting results of the NARX–ARMA model for the peak acceleration data using one-step-ahead prediction.
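Putting the pieces together, the final NARX–ARMA forecast is simply the element-wise sum of the two component forecasts, scored with the RMSE of Eq. (17). The schematic sketch below uses made-up numbers and hypothetical helper names, purely to show the combination step.

```python
import numpy as np

def hybrid_forecast(narx_forecast, arma_forecast):
    """NARX-ARMA forecast: deterministic forecast plus error forecast."""
    return np.asarray(narx_forecast) + np.asarray(arma_forecast)

def rmse(y_true, y_pred):
    return np.sqrt(np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2))

# Example with made-up numbers (not the paper's data):
x_hat = np.array([0.52, 0.53, 0.55])   # NARX-SP forecast of the deterministic component
e_hat = np.array([0.01, -0.02, 0.00])  # ARMA forecast of the error component
y_true = np.array([0.54, 0.50, 0.56])
print("hybrid forecast:", hybrid_forecast(x_hat, e_hat))
print("RMSE:", round(rmse(y_true, hybrid_forecast(x_hat, e_hat)), 4))
```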

6. Conclusions

Machine state forecasting plays an increasingly important role in modern industry due to its ability to foretell the operating condition of a machine in the future. Hence, finding a precise and reliable forecasting model is an important and challenging task. In this paper, an improved hybrid model consisting of NARX and ARMA is investigated for long-term forecasting. Peak acceleration and envelope acceleration trending data of a low methane compressor are used to demonstrate the predictive ability of the proposed method. The results of a comparative study show that the improved hybrid model (NARX–ARMA) has a higher forecasting accuracy than the other traditional models. This demonstrates that the NARX–ARMA model is a reliable and accurate tool for forecasting the machine state.

References

Box, G. E. P., & Jenkins, G. (1970). Time series analysis: Forecasting and control. San Francisco, CA: Holden-Day.
Cao, L. Y. (1997). Practical method for determining the minimum embedding dimension of a scalar time series. Physica D, 110, 43–50.
Ince, H., & Trafalis, T. B. (2006). A hybrid model for exchange rate prediction. Decision Support Systems, 42, 1054–1062.
Kennel, M. B., Brown, R., & Abarbanel, H. D. I. (1992). Determining embedding dimension for phase-space reconstruction using a geometrical construction. Physical Review A, 45, 3403–3411.
Leontaritis, I. J., & Billings, S. A. (1985). Input–output parametric models for non-linear systems. International Journal of Control, 41, 303–344.
Lin, T., Horne, B. G., Tino, P., & Giles, C. L. (1997). A delay damage model selection algorithm for NARX neural networks. IEEE Transactions on Signal Processing, 45(11), 2719–2730.
Liu, J., Wang, W., & Golnaraghi, F. (2009). A multi-step predictor with a variable input pattern for system state forecasting. Mechanical Systems and Signal Processing, 23(5), 1586–1599.
Ljung, L., & Box, G. E. P. (1987). System identification theory for the user. Englewood Cliffs, NJ: Prentice-Hall.
Menezes, J. M. P., & Barreto, G. A. (2006). A new look at nonlinear time series prediction with NARX recurrent neural network. In Proceedings of the ninth Brazilian symposium on neural networks (pp. 160–165).
Pai, P. F., & Lin, C. S. (2005). A hybrid ARIMA and support vector machines model in stock price forecasting. Omega, 33, 497–505.
Rojas, I., Valenzuela, O., Rojas, F., Guillen, A., Herrera, L. J., Pomares, H., et al. (2008). Soft-computing techniques and ARMA model for time series prediction. Neurocomputing, 71, 519–537.
Sorjamaa, A., Hao, J., Reyhani, N., Ji, Y., & Lendasse, A. (2007). Methodology for long-term prediction of time series. Neurocomputing, 70, 2861–2869.
Tran, V. T., Yang, B. S., Oh, M. S., & Tan, A. C. C. (2008). Machine condition prognosis based on regression trees and one-step-ahead prediction. Mechanical Systems and Signal Processing, 22(5), 1179–1193.
Tseng, F. M., Yu, H. C., & Tzeng, G. H. (2002). Combining neural network model with seasonal time series ARIMA model. Technological Forecasting & Social Change, 69, 71–87.
Vachtsevanos, G., & Wang, P. (2001). Fault prognosis using dynamic wavelet neural networks. In AUTOTESTCON Proceedings, IEEE Systems Readiness Technology Conference (pp. 857–870).
Valenzuela, O., Rojas, I., Rojas, F., Pomares, H., Herrera, L. J., Guillen, A., et al. (2008). Hybridization of intelligent techniques and ARIMA models for time series prediction. Fuzzy Sets and Systems, 159, 821–845.
Wang, W. (2007). An adaptive predictor for dynamic system forecasting. Mechanical Systems and Signal Processing, 21, 809–823.
Wang, W. Q., Golnaraghi, M. F., & Ismail, F. (2004). Prognosis of machine health condition using neuro-fuzzy systems. Mechanical Systems and Signal Processing, 18, 813–831.
Zhang, G. P. (2003). Time series forecasting using a hybrid ARIMA and neural network model. Neurocomputing, 50, 159–175.
