Multivariate Approach to Time Series Model Identification
CHAPTER ONE
OBJECTIVES OF THE STUDY
This work is aimed at developing an entirely new model identification method which addresses most of the challenges of existing methods. In achieving this, our specific objectives are outlined below.
To develop quadratic discriminant functions for each of the ARMA models considered and define the iterative steps (algorithm) to be followed in application of the functions.
To apply the method to both simulated and real time series data.
To briefly compare the proposed method with existing methods.
CHAPTER TWO
LITERATURE REVIEW
INTRODUCTION
Though no work has previously been done on the application of multivariate analytical methods to Time Series ARMA model identification, a lot has been done on model identification generally. In this section we briefly review the extent of that work, from the use of the ACF, PACF, EACF and canonical correlations, to hypothesis testing and, later, the use of information criteria.
REVIEW OF LITERATURE
Box and Jenkins (1976) have the credit for the most popular judgmental model identification method. Their approach selects theoretical models to be fitted to a Time Series based on careful examination of the sample ACF and PACF. The behaviour of the sample ACF and PACF calculated from the series to be fitted is compared with that of the available theoretical models, and the model with similar behaviour is suspected to have generated the series and is therefore considered. Theoretically, the ACF of an AR(p) tails off while its PACF cuts off at lag p; the ACF of an MA(q) cuts off at lag q while its PACF tails off; and both the ACF and PACF of an ARMA(p,q) tail off. This approach has come under serious criticism on a couple of grounds. First, the method is highly judgmental: the behaviour of the sample ACF and PACF cannot be perfectly matched with the theoretical behaviour in most cases, so deciding whether the ACF or PACF cuts off at a certain lag depends on individual judgment. Second, the Box and Jenkins approach is merely tentative; the model identified at this stage is subject to a series of modifications, and the model selected at the end is usually different from the initial one. This obviously leads to fitting and re-fitting of several models and a series of goodness-of-fit tests. It is based on these and similar arguments that Anderson (1975) concluded that the Box and Jenkins approach, though attractive, is in practice time consuming, difficult and relatively expensive computationally.
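To make the inspection step concrete, the following is a minimal sketch, assuming Python with numpy and statsmodels, of computing the sample ACF and PACF for a simulated AR(1) series; the coefficient 0.7 and the series length are illustrative choices, not values from this study.

```python
import numpy as np
from statsmodels.tsa.arima_process import arma_generate_sample
from statsmodels.tsa.stattools import acf, pacf

np.random.seed(0)
# Simulate an illustrative AR(1) series: (1 - 0.7B) y_t = e_t
# (statsmodels expects the AR polynomial coefficients, hence [1, -0.7])
y = arma_generate_sample(ar=[1, -0.7], ma=[1], nsample=200)

print(np.round(acf(y, nlags=10), 2))   # expected to tail off for an AR(1)
print(np.round(pacf(y, nlags=10), 2))  # expected to cut off after lag 1
```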
One of the shortfalls of the Box and Jenkins approach is specifying the values of p and q in an ARMA(p,q) model, since both the ACF and PACF tail off in this case. Tsay and Tiao (1984) proposed the extended autocorrelation function (EACF) approach to address this problem. The EACF method uses the fact that if the AR part of a mixed ARMA model is known, "filtering out" the autoregression from the observed series results in a pure MA process, which enjoys the cut-off property. Chan (1999), in a comparative simulation study, concluded that the EACF approach has good sampling properties for moderately large sample sizes. In formalizing the procedure, the sample EACF is calculated and compared with the theoretical EACF behaviour of the available models, and the values of p and q are chosen accordingly. Cryer and Chan (2008) specifically pointed out that there will never be as clear a cut-off in the sample EACF as in the theoretical EACF. This makes the method even more difficult than the Box and Jenkins approach.
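The filtering idea can be illustrated with a short sketch. This is not Tsay and Tiao's full EACF table, only a hedged first-pass illustration using a single AR fit; their procedure uses iterated regressions precisely because a single AR regression is inconsistent for mixed models. The ARMA(1,1) parameters below are illustrative assumptions.

```python
import numpy as np
from statsmodels.tsa.arima_process import arma_generate_sample
from statsmodels.tsa.ar_model import AutoReg
from statsmodels.tsa.stattools import acf

np.random.seed(1)
# ARMA(1,1): (1 - 0.6B) y_t = (1 + 0.4B) e_t
y = arma_generate_sample(ar=[1, -0.6], ma=[1, 0.4], nsample=500)

# "Filter out" a rough AR(1) fit; the residuals are then examined for
# the MA cut-off property that a pure MA process would display.
resid = AutoReg(y, lags=1).fit().resid
print(np.round(acf(resid, nlags=6), 2))
```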
Bhansali (1993) worked on a hypothesis testing approach to model identification. The approach involves pre-fitting all the suspected models and testing hypotheses concerning the order until the right model is established. Consider an AR(p): the procedure is to test the hypothesis that the model is an AR(p) against the alternative that it is an AR(p+1). We continue increasing the value of p by 1 and repeating the test until the null hypothesis can no longer be rejected.
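A minimal sketch of this sequential testing idea follows, assuming a 5% significance level and a test on the highest-order AR coefficient as the test of the null hypothesis; the AR(2) series and the upper bound of 10 are illustrative.

```python
import numpy as np
from statsmodels.tsa.ar_model import AutoReg
from statsmodels.tsa.arima_process import arma_generate_sample

np.random.seed(2)
# An illustrative AR(2) series
y = arma_generate_sample(ar=[1, -0.5, -0.3], ma=[1], nsample=500)

p = 0
while p < 10:
    fit = AutoReg(y, lags=p + 1).fit()
    # stop when the lag-(p+1) coefficient is no longer significant
    if fit.pvalues[-1] > 0.05:
        break
    p += 1
print("selected order:", p)
```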
Whittle (1963) introduced an order selection procedure based on a residual variance plot. Take an AR(p), for example: the lowest possible value of $p$ is chosen and the model fitted, and the error variance is estimated from the fitted model as $\hat{\sigma}_p^2$; if the order fitted is lower than the actual order, then $\hat{\sigma}_p^2$ is greater than $\sigma^2$. The value of $p$ is increased and the model refitted, with $\hat{\sigma}_p^2$ calculated for each $p$. The values of $p$ are plotted against the $\hat{\sigma}_p^2$, and the value of $p$ corresponding to the point where $\hat{\sigma}_p^2$ stops reducing is adopted as the order of the process. Jenkins and Watts (1968) also used this method and further suggested that it can be used for order selection of MA and ARMA models.
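A short sketch of the residual-variance procedure, assuming an illustrative upper bound of 10 on $p$ and an AR(2) series; the flattening point of the printed variances plays the role of the plot.

```python
import numpy as np
from statsmodels.tsa.ar_model import AutoReg
from statsmodels.tsa.arima_process import arma_generate_sample

np.random.seed(3)
y = arma_generate_sample(ar=[1, -0.5, -0.3], ma=[1], nsample=500)  # AR(2)

for p in range(1, 11):
    fit = AutoReg(y, lags=p).fit()
    # estimated residual variance for this order; it should stop
    # decreasing appreciably once p reaches the true order
    print(p, round(fit.sigma2, 4))
```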
Whittle's (1963) procedure motivated Akaike into the research on order selection for which he is so well known today.
Akaike (1969) improved on Whittle's (1963) procedure for order selection of AR processes. He suggested that an AR(p) be fitted for $p = 0, 1, 2, \ldots, m$, where $m$ is an upper bound placed on $p$. The value of an order selection criterion, a standardized error term, is calculated for each of the models, and the $p$ that minimizes the calculated criterion is selected as the order of the process.
Akaike (1969) proposed the use of the final prediction error, FPE. For a particular value of $p$, the FPE is defined as
$$\mathrm{FPE}(p) = \hat{\sigma}_p^2 \, \frac{N + p}{N - p} \qquad (1)$$
where $N$ is the series length and $\hat{\sigma}_p^2$ is the MLE of the variance of the residual based on the order-$p$ model. The $p$ for which FPE is lowest is adopted as the order of the AR model.
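As a sketch, the FPE rule can be computed directly from successive AR fits; here the residual variance reported by statsmodels' AutoReg stands in for the exact MLE, which is an assumption of this illustration.

```python
import numpy as np
from statsmodels.tsa.ar_model import AutoReg
from statsmodels.tsa.arima_process import arma_generate_sample

np.random.seed(4)
y = arma_generate_sample(ar=[1, -0.5, -0.3], ma=[1], nsample=500)  # AR(2)
N = len(y)

fpe = {}
for p in range(1, 11):
    fit = AutoReg(y, lags=p).fit()
    fpe[p] = fit.sigma2 * (N + p) / (N - p)  # FPE(p), equation (1)
print("order with minimum FPE:", min(fpe, key=fpe.get))
```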
Akaike (1974) introduced the popular Akaike Information Criterion (AIC). AIC broadened the scope of usage of information criteria: it can be used in Time Series model identification in a wide range of situations, and its usage is not even restricted to Time Series analysis (Zazli, 2002). For an AR(p) process, the AIC is given as
$$\mathrm{AIC}(p) = N \ln \hat{\sigma}_p^2 + 2p. \qquad (2)$$
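A companion sketch of AIC-based order selection using the form in equation (2); note that statsmodels also reports a likelihood-based AIC via the fitted model's aic attribute, which may differ from this simplified form by constants.

```python
import numpy as np
from statsmodels.tsa.ar_model import AutoReg
from statsmodels.tsa.arima_process import arma_generate_sample

np.random.seed(5)
y = arma_generate_sample(ar=[1, -0.5, -0.3], ma=[1], nsample=500)  # AR(2)
N = len(y)

# AIC(p) = N * ln(sigma2_hat) + 2p, as in equation (2)
aic = {p: N * np.log(AutoReg(y, lags=p).fit().sigma2) + 2 * p
       for p in range(1, 11)}
print("order with minimum AIC:", min(aic, key=aic.get))
```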
CHAPTER THREE
METHODOLOGY
THE BAYESIAN AND FISHER’S CLASSIFICATION RULE
The problem of Discriminant Analysis is the problem of classifying a given observation into one of $k$ available classes. Consider the populations
$$\pi_i \sim N(\boldsymbol{\mu}_i, \boldsymbol{\Sigma}_i), \qquad i = 1, 2, \ldots, k. \qquad (5)$$
Note that (5) implies that the population in group $i$ is normal with parameters $\boldsymbol{\mu}_i$ and $\boldsymbol{\Sigma}_i$; this is the basic assumption of Discriminant Analysis, whether linear or quadratic. When $\boldsymbol{\mu}_i$ differs with $i$ and $\boldsymbol{\Sigma}$ is the same for all groups, the linear discriminant function is adopted; the quadratic discriminant function is adopted when both $\boldsymbol{\mu}_i$ and $\boldsymbol{\Sigma}_i$ are assumed to vary with $i$. Let $p_i$ be the prior probability that an arbitrary unit belongs to group $i$. Defined mathematically, we are interested in $P(\pi_i \mid \mathbf{x})$, which is given by
$$P(\pi_i \mid \mathbf{x}) = \frac{P(\pi_i, \mathbf{x})}{P(\mathbf{x})} \qquad (6)$$
$$= \frac{p_i f_i(\mathbf{x})}{\sum_{j=1}^{k} p_j f_j(\mathbf{x})}, \qquad (7)$$
where $f_i$ is the density of group $i$. Equation (7) is Bayes' theorem, hence the name Bayes classification rule for the rule being derived. We compute $P(\pi_i \mid \mathbf{x})$ for each class given the observation vector $\mathbf{x}$; the class with the highest value of $P(\pi_i \mid \mathbf{x})$ is the most likely class for the unit being considered. $P(\pi_i \mid \mathbf{x})$ is the class posterior probability, while
$$p_i = P(\pi_i) \qquad (8)$$
is the prior probability. With knowledge of the prior probabilities, the posterior probabilities can be calculated and used to predict the class membership of arbitrary units.
Consider classes $\pi_i$ and $\pi_j$: class $\pi_i$ is more likely than $\pi_j$ if $P(\pi_i \mid \mathbf{x}) > P(\pi_j \mid \mathbf{x})$.
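The rule above can be written out as a short sketch, assuming two hypothetical Gaussian groups with made-up parameters: the posterior in equation (7) is the prior times the group density, normalized, and the unit is assigned to the group with the largest posterior.

```python
import numpy as np
from scipy.stats import multivariate_normal

# Hypothetical group parameters (mu_i, Sigma_i) and equal priors
mus = [np.array([0.0, 0.0]), np.array([2.0, 1.0])]
sigmas = [np.eye(2), np.array([[1.0, 0.3], [0.3, 0.5]])]
priors = [0.5, 0.5]

def classify(x):
    # p_i * f_i(x) for each group, then normalize: Bayes theorem, eq. (7)
    scores = [priors[i] * multivariate_normal.pdf(x, mus[i], sigmas[i])
              for i in range(len(mus))]
    posteriors = np.array(scores) / sum(scores)
    return int(np.argmax(posteriors)), posteriors

label, post = classify(np.array([1.0, 0.5]))
print("assigned group:", label, "posteriors:", np.round(post, 3))
```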
CHAPTER FOUR
RESULTS
THE PROPOSED FUNCTIONS
Following the procedure described above, the following classifiers were developed for the various models considered (note that the prior probability is assumed equal for all groups and is therefore 1/6 for each).
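For readers who wish to reproduce the construction, the following is a hedged sketch of how classifiers of this kind can be built, using scikit-learn's QuadraticDiscriminantAnalysis as a stand-in for the hand-derived functions; the group parameters, series length and seed are illustrative assumptions, not the settings of this study.

```python
import numpy as np
from statsmodels.tsa.arima_process import arma_generate_sample
from statsmodels.tsa.stattools import acf, pacf
from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis

# Illustrative parameter choices for the six groups (not those of the study)
GROUPS = {
    "AR(1)":     dict(ar=[1, -0.6], ma=[1]),
    "AR(2)":     dict(ar=[1, -0.5, -0.3], ma=[1]),
    "MA(1)":     dict(ar=[1], ma=[1, 0.6]),
    "MA(2)":     dict(ar=[1], ma=[1, 0.5, 0.3]),
    "ARMA(1,1)": dict(ar=[1, -0.6], ma=[1, 0.4]),
    "ARMA(2,2)": dict(ar=[1, -0.5, -0.3], ma=[1, 0.4, 0.2]),
}

def features(y):
    # ACF at lags 1-4 and PACF at lags 2-5: the 8-dimensional vector
    return np.concatenate([acf(y, nlags=4)[1:5], pacf(y, nlags=5)[2:6]])

np.random.seed(0)
X, labels = [], []
for name, pars in GROUPS.items():
    for _ in range(50):  # 50 simulated series per group
        y = arma_generate_sample(nsample=200, **pars)
        X.append(features(y))
        labels.append(name)

# Quadratic discriminant functions with equal priors of 1/6
qda = QuadraticDiscriminantAnalysis(priors=[1 / 6] * 6)
qda.fit(np.array(X), labels)

# Classify a fresh series: the group maximizing the classifier is chosen
y_new = arma_generate_sample(ar=[1, -0.6], ma=[1], nsample=200)
print(qda.predict(features(y_new).reshape(1, -1)))
```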
CHAPTER FIVE
SUMMARY AND CONCLUSION
SUMMARY
In this work, we simulated 300 sets of Time Series data, 50 each for AR(1), AR(2), MA(1), MA(2), ARMA(1,1) and ARMA(2,2). From each of the 300 series, we calculated the ACF for lags 1-4 and the PACF for lags 2-5. We combined the calculated ACF and PACF values to form an eight-dimensional multivariate random variable, each group now having 50 × 8 data points. We then used the eight-dimensional multivariate random variables to build a quadratic discriminant function for each of the groups; we chose to call these discriminant functions "classifiers". We simulated another 25 time series for which we calculated the ACF for lags 1-4 and the PACF for lags 2-5; the calculated values constitute the elements of the observation vectors. We then substituted the observation vector calculated from, say, the first of the 25 simulated series into each of the classifiers; the classification rule is to classify the series from which the vector was calculated into the group that maximizes the value of the classifier. This was done for each of the 25 series, and the classification results are presented in the table of classification results. We also developed an algorithm to be used alongside the classifiers, so that misclassification at the initial stage merely implies that one will need more than a single iteration to correctly identify an appropriate model. We went ahead to apply the method to a real-life Time Series that had already been fitted by Box et al (2008), and the right model was identified in two iterations.
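The iterative use of the classifiers can be sketched as follows, assuming the qda object, features function and y_new series from the sketch in Chapter Four: ranking the groups by posterior probability gives the order in which candidate models are tried across iterations.

```python
import numpy as np

# Posterior probability of each group for the new series
post = qda.predict_proba(features(y_new).reshape(1, -1))[0]
ranking = [qda.classes_[i] for i in np.argsort(post)[::-1]]
print("candidates in order of preference:", ranking)
# Iteration 1 fits ranking[0]; if diagnostic checks reject it,
# iteration 2 proceeds to ranking[1], and so on.
```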
DISCUSSION OF RESULTS
Chapter Two reviewed the shortcomings of existing identification methods: the Box and Jenkins (1976) approach is highly judgmental, merely tentative and, as Anderson (1975) concluded, time consuming and computationally expensive in practice, and it offers no clear way to specify p and q in an ARMA(p,q) model since both the ACF and PACF tail off in that case. The results of this work are discussed against that background.
This new approach used 300 simulated ARMA series to build the classifiers, one for each of the six classes of models. Validation of the classifiers was done using 25 further simulated ARMA series. We extracted the ACF for lags 1-4 and the PACF for lags 2-5 to form the multivariate variables used in building the quadratic discriminant functions (the classifiers). The usefulness of the classifiers does not stop at a misclassification: where a misclassification occurs, the model whose classifier value is closest to that of the model wrongly suggested is adopted next, and in the two such cases we observed, those models were in fact the right models. The algorithm for utilizing the new method is given alongside the classifiers. The two cases of misclassification observed in our table merely imply that a second iteration is needed to correctly identify those models; a single iteration sufficed for the remaining 23 simulated series. We also applied the algorithm to a real-life Time Series, which was correctly identified after two iterations. Our approach addresses the challenge posed by the uncertainties associated with the behaviour of the ACF and PACF in the Box and Jenkins approach, and it suggests a specific candidate model even before any model is fitted, unlike the information criteria approach, which identifies the right model only after fitting several models.
CONTRIBUTION
Box et al (2008) specifically said this:
…it should also be explained that identification is necessarily inexact because the question of what types of models exist in practice, and in what circumstances, is a property of the behaviour of the physical world and therefore cannot be decided by purely mathematical argument. Furthermore, because at the identification stage no precise formulation of the problem is available, statistically "inefficient" methods must necessarily be used. It is a stage at which graphical methods are particularly useful and judgment must be exercised. However, it should be borne in mind that preliminary identification commits us to nothing except tentative consideration of a class of models that will later be efficiently fitted and checked.
The nature of the challenges of model identification methods presented in the above statement of Box et al (2008) motivated this work. This is the first work to consider the application of a powerful multivariate analytical tool, the discriminant function, to ARMA model identification. It is also the first exact model identification approach able to suggest a specific model at the initial stage while also providing the next course of action in cases where the initially selected model turns out to be the wrong one. Our approach answers most of the research questions raised in the quoted paragraph.
Contributions of this work to Time Series model identification are summarised below:
Our approach has reduced the negative impact of individual judgment on Time Series ARMA model identification.
There is now a precise formulation of the problem in Time Series ARMA model identification.
The computational cost associated with Time Series modelling due to inexact model identification has been greatly reduced.
REFERENCES
- Akaike, H. (1969). Fitting autoregressive models for prediction. Annals of the Institute of Statistical Mathematics, 21: 243-247.
- Akaike, H. (1974). A new look at the statistical model identification. IEEE Transactions on Automatic Control, 19(6): 716-723.
- Akaike, H. (1979). A Bayesian extension of the minimum AIC procedure of autoregressive model fitting. Biometrika, 66: 237-242.
- Akaike, H. (1980). Likelihood and the Bayes procedure. In Bernardo, J. M. et al. (eds.), Bayesian Statistics, Valencia: University Press, pp. 143-166.
- Anderson, D. R. (2008). Model Based Inference in the Life Sciences. Springer.
- Anderson, O. D. (1975). Distinguishing between simple Box and Jenkins models. International Journal of Mathematical Education in Science and Technology, 6(4): 461-465.
- Bhansali, R. J. (1993). Order selection for time series models: a review. In Subba Rao, T. (ed.), Developments in Time Series Analysis, Chapman and Hall, London.
- Bickel, P. J. and Levina, E. (2004). Some theory for Fisher's linear discriminant function, "naive Bayes", and some alternatives when there are many more variables than observations. Bernoulli, 10(6): 989-1010.
- Box, G. E. P. and Jenkins, G. M. (1970). Time Series Analysis: Forecasting and Control. Holden-Day.
- Box, G. E. P. and Jenkins, G. M. (1976). Time Series Analysis: Forecasting and Control. Holden-Day.
- Box, G. E. P., Jenkins, G. M. and Reinsel, G. C. (2008). Time Series Analysis: Forecasting and Control. New Jersey: Wiley.
- Burnham, K. P. and Anderson, D. R. (2002). Model Selection and Multimodel Inference: A Practical Information-Theoretic Approach (2nd ed.). Springer-Verlag. ISBN 0-387-95364-7.