# question 4

## **Exercise 4**

**4.1)** First we plot the United States GDP series from the `global_economy` dataset without alteration, then we plot the same series after a Box-Cox transformation, with lambda chosen by the Guerrero method. After the transformation the variation is more stable across the series and the trend is closer to linear, which justifies the transformation.

```{r, figures-side, fig.show="hold", out.width="50%"}
library(fpp3)  # attaches tsibble, fable, feasts, dplyr, ggplot2 and the global_economy data

# Annual US GDP; the Year index is kept automatically by the tsibble
US_GDP <- global_economy %>%
  filter(Country == "United States") %>%
  select(Country, GDP)

# Box-Cox parameter selected with the Guerrero method
lambda <- US_GDP %>%
  features(GDP, features = guerrero) %>%
  pull(lambda_guerrero)

box_US_GDP <- US_GDP %>%
  mutate(GDP = box_cox(GDP, lambda))

US_GDP %>% autoplot(GDP)
box_US_GDP %>% autoplot(GDP)
```

The time plot shows clear non-stationarity, with a steady, strong upward trend. The ACF of the transformed GDP series has significant spikes at every lag shown and decays slowly as the lag increases. This slow decay is caused by the trend; the data are annual, so there is no seasonality. The series is highly autocorrelated.

```{r}
box_US_GDP %>%
  gg_tsdisplay(GDP, plot_type = 'partial') +
  labs(title = "United States GDP")

# Automatic order selection, searching beyond the default stepwise path
arima_fit <- box_US_GDP %>%
  model(arima = ARIMA(GDP, stepwise = FALSE, approx = FALSE))

arima_fit %>% gg_tsresiduals()
```

In the PACF, the last significant spike is at lag 1, which is consistent with an ARIMA(1,1,0) model with drift. The model is fitted to the Box-Cox transformed series, which stabilizes the variance. The AICc is lower than it would be on the raw scale, but AICc values computed on different scales are not directly comparable.

```{r}
print(arima_fit)
```

**4.2)** After fitting a range of candidate ARIMA models, we can see that the model labelled `arima022` has the lowest AICc, 648.7516. This is an ARIMA(0,2,2) model, with p = 0, d = 2, q = 2. AICc values are only comparable between models with the same order of differencing d, so this ranking is strictly valid only among the d = 2 candidates.
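
Written in backshift notation, an ARIMA(0,2,2) model takes the form below. This is the general textbook form, not output from the fitted model: it assumes no constant term, $y_t$ denotes the Box-Cox transformed GDP, $B$ is the backshift operator, and $\theta_1$, $\theta_2$ are placeholders for the estimated moving-average coefficients.

$$
(1 - B)^2 y_t = (1 + \theta_1 B + \theta_2 B^2)\,\varepsilon_t
$$

Expanding the double difference gives the equivalent recursive form:

$$
y_t = 2y_{t-1} - y_{t-2} + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \theta_2 \varepsilon_{t-2}
$$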
```{r}
fit_models <- box_US_GDP %>%
  model(
    arima000 = ARIMA(GDP ~ pdq(0,0,0)),
    arima010 = ARIMA(GDP ~ pdq(0,1,0)),
    arima110 = ARIMA(GDP ~ pdq(1,1,0)),
    arima210 = ARIMA(GDP ~ pdq(2,1,0)),
    arima020 = ARIMA(GDP ~ pdq(0,2,0)),
    arima120 = ARIMA(GDP ~ pdq(1,2,0)),
    arima220 = ARIMA(GDP ~ pdq(2,2,0)),
    arima320 = ARIMA(GDP ~ pdq(3,2,0)),
    arima011 = ARIMA(GDP ~ pdq(0,1,1)),
    arima111 = ARIMA(GDP ~ pdq(1,1,1)),
    arima211 = ARIMA(GDP ~ pdq(2,1,1)),
    arima021 = ARIMA(GDP ~ pdq(0,2,1)),
    arima121 = ARIMA(GDP ~ pdq(1,2,1)),
    arima221 = ARIMA(GDP ~ pdq(2,2,1)),
    arima321 = ARIMA(GDP ~ pdq(3,2,1)),
    arima012 = ARIMA(GDP ~ pdq(0,1,2)),
    arima112 = ARIMA(GDP ~ pdq(1,1,2)),
    arima212 = ARIMA(GDP ~ pdq(2,1,2)),
    arima022 = ARIMA(GDP ~ pdq(0,2,2)),
    arima122 = ARIMA(GDP ~ pdq(1,2,2)),
    arima222 = ARIMA(GDP ~ pdq(2,2,2)),
    arima322 = ARIMA(GDP ~ pdq(3,2,2)),
    stepwise = ARIMA(GDP),
    search = ARIMA(GDP, stepwise = FALSE)
  )

# With several models in the mable, report() falls back to a glance() summary
report(fit_models)

# One row per model, showing the orders that were fitted
fit_models %>%
  pivot_longer(!Country, names_to = "Model name", values_to = "Orders")

# Rank the candidates by AICc
glance(fit_models) %>%
  arrange(AICc) %>%
  select(.model:AICc)

fit_models %>%
  select(search) %>%
  gg_tsresiduals()
```

**4.3)** Forecasts from the best ARIMA model are plotted alongside those from the ETS model. The ETS model has an AICc of 3191.941, compared with about 648 for the ARIMA model. A lower AICc indicates a better model only when the models belong to the same class and are fitted to the same data. Here the ETS model is fitted to raw GDP while the ARIMA model is fitted to the Box-Cox transformed, differenced series, so this gap does not by itself show that the ARIMA model is better.

```{r}
fit_ets <- US_GDP %>%
  model(ETS(GDP))

fit_ets %>%
  forecast(h = 10) %>%
  autoplot(US_GDP) +
  labs(title = "United States GDP 10 Year Forecast Using ETS")

# Forecast only the `search` model; values are on the Box-Cox scale
fit_models %>%
  select(search) %>%
  forecast(h = 10) %>%
  autoplot(box_US_GDP) +
  labs(title = "United States GDP 10 Year Forecast Using ARIMA Model")
```
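
A scale-free way to compare the two model classes is accuracy on a held-out test set. The chunk below is a minimal sketch and not part of the original analysis: the 2007 cut-off and the names `train`, `lambda_train`, `fit_compare` and `acc` are assumptions made for illustration. The Box-Cox transformation is placed inside the ARIMA formula so that fable back-transforms the forecasts to the original GDP scale, which puts both models' errors in the same units.

```{r}
library(fpp3)

# US GDP series, rebuilt here so the chunk is self-contained
US_GDP <- global_economy %>%
  filter(Country == "United States") %>%
  select(Country, GDP)

# Assumed split: train on data up to 2007, hold out the remaining years
train <- US_GDP %>% filter(Year <= 2007)

# Estimate lambda on the training data only, to avoid using test-set information
lambda_train <- train %>%
  features(GDP, features = guerrero) %>%
  pull(lambda_guerrero)

fit_compare <- train %>%
  model(
    ets = ETS(GDP),
    arima = ARIMA(box_cox(GDP, lambda_train))
  )

# Forecast the held-out decade and score both models on the original scale
acc <- fit_compare %>%
  forecast(h = 10) %>%
  accuracy(US_GDP) %>%
  select(.model, RMSE, MAE, MAPE)

acc
```

The model with the smaller RMSE, MAE and MAPE forecasts the held-out years more accurately. A single split is sensitive to the chosen cut-off, so the result should be read as indicative only.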