Thursday, July 18, 2019

Time Series

A time series is a set of observations xi, each one recorded at a specific time t. After being recorded, these data are rigorously studied to develop a model. This model is then used to produce future values, in other words, to make a forecast.

Important Characteristics to Consider First

When first looking at a time series, some questions must be asked:

Does the time series have a trend or seasonality over time?
Are there outliers?
With time series data, outliers are the points that lie far away from the other data.
Is there a long-run cycle or period?
Is there constant variance over time?

Essentials of a Good Time Series

The data must cover a sufficient period.
There must be an equal time gap between observations.
There must be a constant or normal period.

Example 1

The following plot is a time series plot of the annual number of earthquakes in the world with seismic magnitude over 7.0, for 99 consecutive years. By a time series plot, we simply mean that the variable is plotted against time.

Some features of the plot:

There is no trend.
The mean of the series is 20.2.
There is no seasonality, as the data are annual.
There are no outliers.

Example 2

The next plot shows a time series of quarterly production of beer in Australia for 18 years.

Some important features are:

There is an increasing trend.
There is seasonality.
There are no obvious outliers.

The Components of Time Series

The components of a time series are factors that can bring changes to the time series.

Trend component, Tt

When there is an increase or a decrease in the data over a long period of time, we say that there is a trend. Sometimes a trend is said to be changing direction when it goes from an increasing trend to a decreasing one. It is the result of events such as price inflation, population growth or economic changes.

Seasonal component, St

A seasonal pattern exists when the time series exhibits regular fluctuations at specific times. It arises from influences such as natural conditions or social and cultural behaviors. For example, the sales of ice cream are relatively high in summer, so the salesman expects greater profit in summer than in winter.

Cyclic component, Ct

If the time series shows an up and down movement over a given period of time, it is said to have a cyclical pattern.

Irregular component, It

Irregular components consist of changes that are unlikely to be repeated in a time series.
Examples are floods, fires, earthquakes or cyclones.

Combining the time series components

A time series is a combination of the components discussed above. These components can be combined either additively or multiplicatively.

Additive model

It is linear, and the changes are made by the same amount over time.

Yt = Tt + Ct + St + It

Multiplicative model

It is non-linear, such as quadratic or exponential, and the changes increase or decrease over time.

Yt = Tt × Ct × St × It

Uses

Time series can be useful in the following fields:

Statistics
Signal processing
Econometrics
Mathematical finance
Astronomy
Earthquake prediction
Weather forecasting

Importance of Time Series for Businesses

There are many benefits of time series for business purposes.

Helpful for study of past behavior

Businessmen use time series to study past behavior and to see the trend of the sales or profit of their businesses.

Helpful in forecasting

Time series is a great tool for forecasting. Businesses can make a time series of the past strategies of their competitors and estimate their future strategies. In this way, they can build a better strategy and make more profit.

Helpful in comparison

Time series can be used to calculate the trend of two or more branches of the same company and compare their performance. Rewards can be given based on that performance.

However, time series have some limitations for a business. Sales forecasting relies on past results to predict future expectations, but a new company has only a limited amount of data on which to base predictions. Even with enough data, past results do not always indicate what future sales will be.

To fully understand this topic, we will work through an example.

Example 3

We will consider the actual arrivals of passengers at an airport over the years 1949 to 1960.
From these data, we will make a forecast.

The first step is to plot the data and obtain descriptive measures such as trends or seasonal fluctuations.

The second step is to check the stationarity of the time series.

Stationarity

A time series is said to be stationary if its mean and variance do not change over time. Obviously, not all the time series that we encounter are stationary. Stationarity is important because most of the models we work with assume that the time series is stationary. If the time series has the same behavior over time, there is a high probability that it will follow the same behavior in the future.

How to check for stationarity?

For the graph that was plotted, we can see that it has an increasing trend with some seasonal pattern. But it is not always evident from a plot whether a series has a trend or a seasonal pattern. We can check for stationarity using the following:

Plotting rolling statistics

We plot the moving average or moving variance and see whether it changes with time. But, as it is a visual technique, we will rely more on the next test.

Dickey-Fuller test

It is one of the statistical methods to check for stationarity. The null hypothesis is that the time series is non-stationary, and the alternative hypothesis is that it is stationary.

As shown below, the test results consist of the test statistic and critical values at different significance levels. If the test statistic is less than the critical value, we reject the null hypothesis.

Results of Dickey-Fuller Test

Test Statistic                   0.815369
p-value                          0.991880
Lags Used                       13.000000
Number of Observations Used    130.000000
Critical Value (1%)             -3.481682
Critical Value (5%)             -2.884042
Critical Value (10%)            -2.578770

According to the Dickey-Fuller test, the test statistic (0.815369) is greater than every critical value, so we cannot reject the null hypothesis.
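The decision rule of the Dickey-Fuller test can be written out as a short comparison. The sketch below is only an illustration: the function name `adf_rejections` is hypothetical, and the numbers are copied from the results table above rather than computed. In practice they would come from a library routine such as `adfuller` in statsmodels, which is an assumption about tooling since the post does not name the software used.

```python
def adf_rejections(test_statistic, critical_values):
    """Map each significance level to True if the null hypothesis
    (the series is non-stationary) is rejected at that level."""
    return {
        level: test_statistic < value
        for level, value in critical_values.items()
    }


# Values copied from the Dickey-Fuller results reported for the raw series.
test_statistic = 0.815369
critical_values = {"1%": -3.481682, "5%": -2.884042, "10%": -2.578770}

rejections = adf_rejections(test_statistic, critical_values)

# The series is treated as stationary only if the null is rejected
# at one of the listed levels.
is_stationary = any(rejections.values())

for level, rejected in rejections.items():
    outcome = "reject" if rejected else "fail to reject"
    print(f"{level}: {outcome} the null hypothesis")
print("stationary:", is_stationary)
```

Keeping the comparison per significance level makes it easy to state the confidence at which a series can be called stationary, which is how the later results in this post are reported.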
The time series is therefore not stationary. However, there are various methods to make a time series stationary.

How to make a time series stationary?

The assumption of stationarity is very important when modelling a time series, but most practical time series are not stationary. In practice, we cannot make a time series perfectly stationary; most of the time, we can only conclude that it is stationary at a given confidence level, such as 99%.

Before going into detail, we will discuss the reasons why a time series is not stationary. There are two major reasons: trend and seasonality.

Having discussed the reasons, we will now talk about the techniques to make the time series stationary.

Transformation

Log transformation is probably the most commonly used form of transformation.

Differencing

Differencing is a widely used method to make the time series stationary. It is performed by subtracting the previous observation from the current one. When making the forecast, the process of differencing must be inverted to convert the data back to its original scale. This can be done by adding the difference value to the previous value.

Using the Dickey-Fuller test, we can see that the test statistic is -2.717131 and that the critical values at 1%, 5% and 10% are -3.482501, -2.884398 and -2.578960 respectively. The test statistic is below only the 10% critical value, so the time series is stationary with 90% confidence.
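The log transformation, differencing and the inversion of differencing described above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions: the helper names `difference` and `invert_difference` are hypothetical, and the input values are made up for the example, not taken from the airport data.

```python
import math


def difference(values):
    """First-order differencing: each observation minus the previous one."""
    return [current - previous for previous, current in zip(values, values[1:])]


def invert_difference(first_value, differences):
    """Undo differencing by adding each difference to the previous restored value."""
    restored = [first_value]
    for diff in differences:
        restored.append(restored[-1] + diff)
    return restored


# Hypothetical observations (assumption), growing by 10% per step.
series = [100.0, 110.0, 121.0, 133.1, 146.41]

# Step 1: log transformation.
log_series = [math.log(value) for value in series]

# Step 2: differencing of the log series (one value shorter than the input).
log_differences = difference(log_series)

# Step 3: invert the differencing, then take the exponent to undo the log.
restored_log = invert_difference(log_series[0], log_differences)
restored = [math.exp(value) for value in restored_log]

print(log_differences)
print(restored)
```

Because the made-up series grows by a constant percentage, its log differences are constant, which shows why a log transformation followed by differencing can remove an exponential trend.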
Second- or third-order differencing can be applied to get better results.

Decomposition

In decomposition, the time series is divided into several components, mainly the trend, cyclical, seasonal and irregular components. The time series can be broken down using either an additive or a multiplicative model. We will assume a multiplicative model for our example.

Since the trend and seasonality were separated from the residuals, we can check the stationarity of the residuals.

The Dickey-Fuller test gives a test statistic of -6.332387, and the critical values at 1%, 5% and 10% are -3.485122, -2.885538 and -2.579569 respectively. The test statistic is below the 1% critical value, so we can conclude that the residuals are stationary at 99% confidence.

Now, we can go forward with the forecasting.

Forecasting the time series

We will fit this time series using the ARIMA model. ARIMA is an acronym that stands for Autoregressive Integrated Moving Average. It is a linear equation similar to a linear regression. The first goal is to find the values of the parameters (p, d, q), but before finding these values, two situations in stationarity must be discussed.

The first case is a strictly stationary series without any dependence among the values. In this case, we can model the residuals as white noise.

The second case is a series with significant dependence among the values.

The predictors mainly depend on the parameters (p, d, q) of the ARIMA model:

Number of AR (Auto-Regressive) terms (p)

It is the number of lag observations included in the model.
This term helps to incorporate the effect of past values into the model.

Number of MA (Moving Average) terms (q)

It is the size of the moving average window; that is, this term expresses the error of the model as a linear combination of the error values observed at previous time points.

Number of differences (d)

It is the number of times that the raw observations are differenced.

In order to obtain the values of p and q, we will use the following two plots:

Autocorrelation Function, ACF

This function measures the correlation of the time series with a lagged version of itself.

Partial Autocorrelation Function, PACF

This function measures the correlation between the time series and a lagged version of itself, after controlling for the values of the time series at all shorter lags.

In the ACF and PACF plots, the dotted lines mark the confidence interval, and the lags at which the plots first cross the upper confidence bound are taken as p and q. The value of p is obtained from the PACF plot and the value of q is obtained from the ACF plot. We can see that both p and q are 2.

Now that we have obtained p and q, we will make three different ARIMA models: AR, MA and the combined model. The RSS (residual sum of squares) of each model will be given.

AR model

MA model

Combined model

From the plots, it is clearly shown that the RSS of the AR and MA models are the same and that of the combined model is much better. As the combined model gives a better result, the following steps take its values back to the original scale:

The predicted results are stored.
The differences are converted to the log scale. This can be done by adding the differences consecutively to the base number.
The exponent is taken and the result is compared to the original series.

Therefore, we have the final result.
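The autocorrelation function used in the forecasting section to choose q can be sketched in plain Python. This is an illustration under assumptions, not the code behind the plots in this post: the function names are hypothetical, the confidence bound uses the common large-sample approximation 1.96 / sqrt(n), the rule of taking the first lag that falls below the upper bound is one common reading of the plot, and the input series is made up.

```python
import math


def acf(values, max_lag):
    """Sample autocorrelation of `values` at lags 0..max_lag.

    Assumes the series is not constant (otherwise the denominator is zero).
    """
    n = len(values)
    mean = sum(values) / n
    centered = [value - mean for value in values]
    denominator = sum(c * c for c in centered)
    correlations = []
    for lag in range(max_lag + 1):
        numerator = sum(centered[t] * centered[t - lag] for t in range(lag, n))
        correlations.append(numerator / denominator)
    return correlations


def confidence_bound(n, z=1.96):
    """Approximate 95% bound for the autocorrelation of white noise."""
    return z / math.sqrt(n)


def first_lag_below_bound(correlations, bound):
    """Return the first lag whose correlation falls below the upper bound,
    or None if no lag does."""
    for lag, value in enumerate(correlations):
        if value < bound:
            return lag
    return None


# Hypothetical series (assumption), not the airport data.
series = [1.0, 2.0, 3.0, 4.0, 5.0]

correlations = acf(series, max_lag=2)
bound = confidence_bound(len(series))
suggested_lag = first_lag_below_bound(correlations, bound)

print(correlations, bound, suggested_lag)
```

The partial autocorrelation needs an extra regression step for each lag and is left out of this sketch; a statistics library would normally be used for both functions.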
