Picture a scenario in which the head of marketing asks you whether they should launch a promo campaign next week in a specific city, to increase demand and hit the monthly target.
If you have built a predictive model that forecasts demand accurately, you can reply with an informed answer.
You have the following data points to help you give this informed answer:
1. Weekday [Monday to Sunday]
2. Day period [Early morning to Overnight]
3. Holiday [1/0]
4. Weather continuous variables [Temperature, humidity, wind speed and precipitation]
Demand is affected by an array of factors, some of which are hard to identify or to model. The more influential independent variables we can find, the better; a simple linear regression or an ARIMA time series model on its own is unlikely to predict demand accurately.
Regression algorithms in R, fitted to the data to find the best model:
Multiple Linear Regression: lm.outbound <- lm(sale_id ~ weekday + day_period + holiday + temperature + humidity + wind_speed, data = trainset) # fit a linear regression
Random Forest (requires the randomForest package): rf.outbound <- randomForest(sale_id ~ weekday + day_period + holiday + temperature + humidity + wind_speed, data = trainset) # fit a random forest regression
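A minimal sketch of how the two fits above could be compared on held-out data. The data frame is simulated purely for illustration: the column names follow the formulas above, but the values, the day_period levels, the 80/20 split and the helper names rmse and mae are all assumptions. The random forest is fitted only if the randomForest package is installed.

```r
# Simulated stand-in for the real demand data (illustrative values only)
set.seed(42)
n <- 500
demand <- data.frame(
  weekday     = factor(sample(c("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"),
                              n, replace = TRUE)),
  day_period  = factor(sample(c("Early morning", "Daytime", "Overnight"),
                              n, replace = TRUE)),
  holiday     = rbinom(n, 1, 0.1),
  temperature = rnorm(n, 15, 8),
  humidity    = runif(n, 30, 90),
  wind_speed  = rexp(n, 0.2)
)
# Hypothetical response: sales per period, named sale_id as in the formulas above
demand$sale_id <- 50 + 10 * (demand$weekday %in% c("Sat", "Sun")) +
  15 * demand$holiday + 0.8 * demand$temperature + rnorm(n, sd = 5)

# Hold out 20% of the rows for testing
test_idx <- sample(n, size = 0.2 * n)
trainset <- demand[-test_idx, ]
testset  <- demand[test_idx, ]

rmse <- function(error) sqrt(mean(error^2))
mae  <- function(error) mean(abs(error))

# Linear regression, scored on the test set
lm.outbound <- lm(sale_id ~ weekday + day_period + holiday + temperature +
                    humidity + wind_speed, data = trainset)
lm_error <- testset$sale_id - predict(lm.outbound, newdata = testset)
results <- data.frame(model = "lm", MAE = mae(lm_error), RMSE = rmse(lm_error))

# Random forest, scored the same way if the package is available
if (requireNamespace("randomForest", quietly = TRUE)) {
  rf.outbound <- randomForest::randomForest(
    sale_id ~ weekday + day_period + holiday + temperature + humidity + wind_speed,
    data = trainset)
  rf_error <- testset$sale_id - predict(rf.outbound, newdata = testset)
  results <- rbind(results, data.frame(model = "randomForest",
                                       MAE = mae(rf_error), RMSE = rmse(rf_error)))
}
print(results)
```

Because the simulated response is linear in its inputs, the linear model may well come out ahead here; on real demand data the ranking can differ, which is why both models are scored on the same test rows.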
Through testing the algorithms, we might find that the Random Forest regression is the most accurate, with the lowest MAE and RMSE. Duke University recommends the following steps for choosing the correct regression model:
If there is any one statistic that normally takes precedence over the others, it is the root mean squared error (RMSE), which is the square root of the mean squared error. When it is adjusted for the degrees of freedom for error (sample size minus number of model coefficients), it is known as the standard error of the regression or standard error of the estimate in regression analysis or as the estimated white noise standard deviation in ARIMA analysis. This is the statistic whose value is minimized during the parameter estimation process, and it is the statistic that determines the width of the confidence intervals for predictions. It is a lower bound on the standard deviation of the forecast error (a tight lower bound if the sample is large and values of the independent variables are not extreme), so a 95% confidence interval for a forecast is approximately equal to the point forecast “plus or minus 2 standard errors”–i.e., plus or minus 2 times the standard error of the regression.
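As a rough illustration of the paragraph above, the sketch below computes the plain RMSE, the degrees-of-freedom-adjusted standard error of the regression, and the approximate "plus or minus 2 standard errors" interval. It uses R's built-in mtcars data rather than the demand data, and the new observation (wt = 3, hp = 110) is an arbitrary value chosen for the example.

```r
# Built-in mtcars data: model fuel economy from weight and horsepower
fit <- lm(mpg ~ wt + hp, data = mtcars)
res <- residuals(fit)

n <- length(res)        # sample size
p <- length(coef(fit))  # number of model coefficients, including the intercept

rmse_raw <- sqrt(mean(res^2))           # plain RMSE, divides by n
se_reg   <- sqrt(sum(res^2) / (n - p))  # adjusted for degrees of freedom

# se_reg is what summary(fit) reports as "Residual standard error"
all.equal(se_reg, sigma(fit))

# Approximate 95% interval: point forecast plus or minus 2 standard errors
new_car <- data.frame(wt = 3, hp = 110)
point   <- unname(predict(fit, newdata = new_car))
approx_interval <- c(lower = point - 2 * se_reg,
                     forecast = point,
                     upper = point + 2 * se_reg)
print(approx_interval)

# Exact prediction interval for comparison; it is slightly wider because it
# also accounts for uncertainty in the estimated coefficients
exact_interval <- predict(fit, newdata = new_car,
                          interval = "prediction", level = 0.95)
print(exact_interval)
```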
However, there are a number of other error measures by which to compare the performance of models in absolute or relative terms:
- The mean absolute error (MAE) is also measured in the same units as the data, and is usually similar in magnitude to, but slightly smaller than, the root mean squared error. It is less sensitive to the occasional very large error because it does not square the errors in the calculation.
The regression output will not usually calculate the MAE for you, so you will need to compute it yourself in R.
Here is code to calculate both RMSE and MAE.
RMSE (root mean squared error), also called RMSD (root mean squared deviation), and MAE (mean absolute error) are both used to evaluate models by summarizing the differences between the actual (observed) and predicted values. MAE gives equal weight to all errors, while RMSE gives extra weight to large errors.
In R:
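# Example data
actual <- c(4, 6, 9, 10, 4, 6, 4, 7, 8, 7)
predicted <- c(5, 6, 8, 10, 4, 8, 4, 9, 8, 9)
# Calculate the errors
error <- actual - predicted
sqrt(mean(error^2))  # RMSE, about 1.18
mean(abs(error))     # MAE, 0.8
# For a fitted model such as lm(weight ~ group), where group is a factor
# built with gl(2, 10, 20, labels = ...), take error <- residuals(model) instead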
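To make the difference in weighting concrete, here is a small sketch that wraps the two measures in helper functions (rmse and mae are names chosen here, not base R functions) and applies them to the example vectors, then to a hypothetical variation in which one prediction is badly off.

```r
# Helper functions: both take a vector of errors (actual minus predicted)
rmse <- function(error) sqrt(mean(error^2))
mae  <- function(error) mean(abs(error))

actual    <- c(4, 6, 9, 10, 4, 6, 4, 7, 8, 7)
predicted <- c(5, 6, 8, 10, 4, 8, 4, 9, 8, 9)
error <- actual - predicted

rmse(error)  # about 1.18
mae(error)   # 0.8

# Hypothetical variation: the first forecast misses by 10 instead of 1
predicted_outlier <- predicted
predicted_outlier[1] <- 14
error_outlier <- actual - predicted_outlier

rmse(error_outlier)  # about 3.36
mae(error_outlier)   # 1.7
```

The single large miss roughly doubles the MAE but nearly triples the RMSE, which is the extra weight on large errors described above.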
Other recommended ways to compare models:
After fitting a number of different regression or time series forecasting models to a given data set, you have many criteria by which they can be compared:
- Error measures in the estimation period: root mean squared error, mean absolute error, mean absolute percentage error, mean absolute scaled error, mean error, mean percentage error
- Error measures in the validation period (if you have done out-of-sample testing): Ditto
- Residual diagnostics and goodness-of-fit tests: plots of actual and predicted values; plots of residuals versus time, versus predicted values, and versus other variables; residual autocorrelation plots, cross-correlation plots, and tests for normally distributed errors; measures of extreme or influential observations; tests for excessive runs, changes in mean, or changes in variance (lots of things that can be “OK” or “not OK”)
- Qualitative considerations: intuitive reasonableness of the model, simplicity of the model, and above all, usefulness for decision making!
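The error measures and a couple of the residual checks listed above can be sketched as follows. This is an illustration under stated assumptions, not a prescribed procedure: it uses R's built-in AirPassengers series instead of the demand data, a simple trend-plus-month linear model, an arbitrary split into the first 120 months (estimation) and the last 24 (validation), and a MASE scaled by the in-sample one-step naive forecast. The function name error_measures is made up for this example.

```r
# Built-in monthly airline passenger counts, 1949-1960
ap <- data.frame(
  y     = as.numeric(AirPassengers),
  t     = seq_along(AirPassengers),
  month = factor(as.integer(cycle(AirPassengers)))
)
est <- ap[1:120, ]    # estimation period
val <- ap[121:144, ]  # validation period (out of sample)

fit <- lm(y ~ t + month, data = est)

# All six error measures for one set of predictions
error_measures <- function(actual, predicted, insample) {
  e <- actual - predicted
  naive_mae <- mean(abs(diff(insample)))  # in-sample MAE of the naive forecast
  c(RMSE = sqrt(mean(e^2)),
    MAE  = mean(abs(e)),
    MAPE = 100 * mean(abs(e / actual)),
    MASE = mean(abs(e)) / naive_mae,
    ME   = mean(e),
    MPE  = 100 * mean(e / actual))
}

measures <- rbind(
  estimation = error_measures(est$y, fitted(fit), est$y),
  validation = error_measures(val$y, predict(fit, newdata = val), est$y)
)
print(round(measures, 2))

# Two residual diagnostics on the estimation period
res <- residuals(fit)
lag1_autocorr <- acf(res, plot = FALSE)$acf[2]  # lag-1 residual autocorrelation
normality_p   <- shapiro.test(res)$p.value      # Shapiro-Wilk normality test
print(c(lag1_autocorr = lag1_autocorr, normality_p = normality_p))
```

Comparing the two rows shows how much the fit degrades out of sample, and a large lag-1 autocorrelation is a hint that the model is leaving structure in the residuals.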
With so many plots and statistics and considerations to worry about, it’s sometimes hard to know which comparisons are most important. What’s the real bottom line?