Setup

For the rest of this guide, we will use time-series ausbeer from package fpp:

library("fpp")

beer_train = head(ausbeer, -20)
beer_test = head(tail(ausbeer, 20), 10)

NARX Model

A simple NARX model can be crafted using narx. A model with

can be instantiated as follows:

library("mltsp")
library("e1071")
spec = build_narx(svm, p=2, d=0, P=1, D=1, freq=frequency(ausbeer))
model = narx(spec, beer_train)

model
## NARX svm(2, 0)(1, 1)[4]

or alternatively:

model = narx(beer_train, svm, p=2, d=0, P=1, D=1, freq=frequency(ausbeer))

Duplicating the model

One could re-instantiate the model for use with another dataset using

beer_train2 = head(ausbeer, -10)

model2 = narx(model, beer_train2)

Prediction

Prediction uses either forecast (compatible with package forecast):

fcst = forecast(model, h = 10)
plot(fcst)
lines(beer_test, col="red")

or use predict if you want to also supply new data but use the same model:

beer_train2 = head(ausbeer, -10) # this is using data from another future!
beer_test2 = tail(ausbeer, 10) # this is using data from another future!

fcst2 = predict(model, beer_train2, h = 10)

plot(ausbeer)
lines(beer_test2, col="red")
lines(fcst2, col="blue")

Alternatively, one could reuse the model using

model2 = narx(model, beer_train2)
fcst3 = forecast(model2, h = 10)

plot(fcst3)
lines(fcst2, col="red")

Using exogenous data

Use xreg parameter as exogenous data. This example tries to forecast a random walk. Without xreg, this should be almost impossible:

set.seed(0)

tstamps = seq(as.Date("2000-01-01"), length.out = 110, by='day')
x = xts(runif(length(tstamps)), tstamps)
xreg = 1 - 0.5 * x
yreg = xts(runif(110), tstamps)

colnames(xreg) = colnames(yreg) = "xreg"

# training and testing data
x_train = head(x, 100)
x_test = tail(x, 10)
ind_test = index(x_test)

For simplicity, we use SimpleLM as the learner, which is a simple wrapper for lm. * lm is not compatible with narx as it requires a formula. SimpleLM allows using a linear model without resorting to crafting formulas, similar to what svm from package e1071 does.

Model one, without xreg:

model = narx(x_train, SimpleLM, p = 2)
pred1 = forecast(model, h=10)

Model two, with correlated (good) xreg:

model2 = narx(x_train, SimpleLM, p = 2, xreg = xreg)
pred2 = forecast(model2, xreg=xreg[ind_test])

Model three, with an uncorrelated (bad) xreg:

model3 = narx(x_train, SimpleLM, p = 2, xreg = yreg)
pred3 = forecast(model3, xreg=yreg[ind_test])

The results:

rmse <- function(x,y) sqrt(mean((x-y)^2))

c(Err_without_xreg= rmse(pred1$mean, x_test),
  Err_with_xreg= rmse(pred2$mean, x_test),
  Err_with_bad_xreg= rmse(pred3$mean, x_test))
##  Err_without_xreg     Err_with_xreg Err_with_bad_xreg 
##      2.755231e-01      1.570092e-16      2.985320e-01

The smallest error is obtained from the model with correlated xreg.

Notes

  • Some ML models, such as SimpleLM, require the same column names in training and testing to fit data.
  • Instead of h as the forecast horizon, xreg and its time-stamps are used.