We will conduct this forecasting in Gretl as estimating parameters in Excel is pretty complicated.
The spreadsheet Regression Dataset contains the same data as our basic dataset, however there were couple of transformations
- Instead of number of bookings on certain day before the arrival date, I calculated cumulated number of bookings on certain day before arrival day.
- We want to forecast 2013, therefore I cut the data period to 1.10.201 – 30.12.2012 – in dataset sheet for purpose of modeling
- The MAPE sheet is expanded by final bookings from period 1.01.2013 – 30.12.2013 for purpose of testing accuracy of this method.
Please follow my steps here:
1. Firstly let’s important data from Regression Dataset spreadsheet.*
* I assume that you’ve already installed gretl software
2. Find the location you saved the spreadsheet and remember to change the filter in right bottom from .xls to .xlsx
3. Start import at column 1; row 1
4. Now lets choose “model” and “ordinary least squares” – it will be our method of estimating parameters.
5. Now we choose our dependent variable (it will be final number of bookings on the arrival date – Y0) and now we can choose the independent variables – lets try following sets:
As a result our model looks like:
How should I use it?
If you want to forecast how many rooms will be booked tomorrow (as tomorrow is arrival date); You should check how many rooms are booked today (for tomorrow’s date) multiply the result by 0,598 and than add 42,067.
e.g. today is 31th May 2014 and we want to predict what will be the number of final bookings for tomorrow’s date: 1st June 2014. We know that we have already 87 bookings for tomorrow; so = 42,067 + 0,598*87 = 94 ; so we can predict that on 1st May there will be 94 rooms booked.
b) Similarly we can create models with different independent variables, some examples I suggest below:
6. So now you created the whole bunch of different linear regression (estimated with OLS) models. How to choose the best one?
For this purpose I decided to use 7th model. This model use number of bookings 1,3,7 days before arrival date as independent variables.
So our model looks like: Yt=42,63 + 0,536Y1 + 0,057Y3 + 0,0097Y7 and we can use values from 2013 to calculate forecast and MAPE.
The calculations are in the Regression Dataset spreadsheet in “MAPE” Sheet, as a result we achieved MAPE of 5,7% what is the smallest so result so far. It also means, that this forecasting method is the best so far (on our dataset)
Do you want to Return