Module 12 Assignment
a. Construct a time series plot using R
#a. Construct a time series plot
# Load necessary library
library(ggplot2)
# Create a data frame with one row per month for 2012 and 2013
# (month.abb is R's built-in vector "Jan", "Feb", ..., "Dec")
df <- data.frame(
  Month = factor(rep(month.abb, 2), levels = month.abb),
  Year = rep(c(2012, 2013), each = 12),
  Charge = c(31.9, 27, 31.3, 31, 39.4, 40.7, 42.3, 49.5, 45, 50, 50.9, 58.5,
             39.4, 36.2, 40.5, 44.6, 46.8, 44.7, 52.2, 54, 48.8, 55.8, 58.7, 63.4)
)
# Create a time series plot
ggplot(df, aes(x = interaction(Year, Month, lex.order = TRUE), y = Charge, group = Year, color = as.factor(Year))) +
geom_line() +
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1)) +
labs(x = "Month", y = "Charge", color = "Year")
b. Employ the Exponential Smoothing Model as outlined in Avril Coghlan's notes and report the statistical outcome.
#b. Employ Exponential Smoothing Model
# Load necessary library
library(forecast)
# Create a time series object
ts_obj <- ts(df$Charge, start = c(2012, 1), frequency = 12)
# Fit an Exponential Smoothing model
model <- ets(ts_obj)
# Print the model
print(model)
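For comparison, here is a base R sketch of simple exponential smoothing with HoltWinters(). It assumes the notes referred to in the prompt use this function with the trend and seasonal components switched off, which is my reading and not something stated in the assignment. HoltWinters() estimates alpha by minimizing the squared one-step errors, so its alpha will generally differ from the one ets() reports.

```r
# Charges copied from the data frame above.
charge <- c(31.9, 27, 31.3, 31, 39.4, 40.7, 42.3, 49.5, 45, 50, 50.9, 58.5,
            39.4, 36.2, 40.5, 44.6, 46.8, 44.7, 52.2, 54, 48.8, 55.8, 58.7, 63.4)
ts_obj <- ts(charge, start = c(2012, 1), frequency = 12)

# Simple exponential smoothing: level only, no trend (beta) or seasonality (gamma).
hw_fit <- HoltWinters(ts_obj, beta = FALSE, gamma = FALSE)

# Estimated smoothing parameter and sum of squared one-step errors.
print(hw_fit$alpha)
print(hw_fit$SSE)
```

HoltWinters() is part of the stats package that ships with R, so this version runs without installing forecast.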
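c. Provide a discussion of the time series and the Exponential Smoothing Model results you obtained.
The time series plot shows charges rising over the course of both years. In 2012 they climbed from 31.9 in January to 58.5 in December, and in 2013 from 39.4 to 63.4. The 2013 line sits above the 2012 line in every month, but the rise within each year is about the same (26.6 in 2012 versus 24.0 in 2013), so the data do not show a faster rate of increase in 2013. The plot alone cannot explain the increase; possible factors include changes in usage patterns, changes in interest rates, or changes in the student population.
Observations from the plot:
The line for 2012 (red) trends upward over the year, with small dips in February, April, and September.
The line for 2013 (green) follows a similar upward path at a higher level, with dips in February, June, and September.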
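One way to check the reading of the plot numerically is to compare the change within each year and the gap between the two years month by month. This is a small base R sketch; the variable names are my own.

```r
# Charges copied from the data frame above.
charge <- c(31.9, 27, 31.3, 31, 39.4, 40.7, 42.3, 49.5, 45, 50, 50.9, 58.5,
            39.4, 36.2, 40.5, 44.6, 46.8, 44.7, 52.2, 54, 48.8, 55.8, 58.7, 63.4)
y2012 <- charge[1:12]
y2013 <- charge[13:24]

# Change from January to December within each year.
rise <- c("2012" = y2012[12] - y2012[1], "2013" = y2013[12] - y2013[1])

# Average month-to-month change within each year.
avg_step <- c("2012" = mean(diff(y2012)), "2013" = mean(diff(y2013)))

# Year-over-year gap for each calendar month (2013 minus 2012).
gap <- setNames(y2013 - y2012, month.abb)

print(rise)
print(avg_step)
print(gap)
```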
Here’s a breakdown of the ETS model summary:
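Smoothing parameters: These are the parameters that the model uses to “smooth” the data in order to identify the underlying pattern. In my model, alpha = 0.8635 is the smoothing parameter for the level (or the average value) of the series. Alpha lies between 0 and 1, and a value this close to 1 means the level is driven mostly by the most recent observations, with older observations given little weight.
Initial states: This is the starting value for the level of the series, l = 30.9107. The summary reports no trend or seasonal states, so the fitted model has a level component only.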
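sigma: This is the standard deviation of the residuals (the one-step forecast errors), which is 0.1354 in my model. Since the charges range from 27 to 63.4, a value this small is presumably on a relative scale, as it would be if ets() selected a multiplicative error model; in that case it corresponds to errors of roughly 13.5% of the level. A smaller sigma indicates that the model fits the data more closely.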
AIC, AICc, BIC: These are information criteria used for model selection. They balance the fit of the model (in terms of the likelihood) against the complexity of the model (in terms of the number of parameters). Among models fitted to the same data, lower values indicate a better model. In my case, the AIC is 164.3730, the corrected AIC (AICc) is 165.5730, and the BIC is 167.9071.
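The smoothing parameter and initial level reported above drive a simple recursion: each new level is a weighted average of the latest observation and the previous level. Here is a minimal base R sketch of that recursion. It plugs in the rounded alpha and initial level from the summary, so the numbers may differ slightly from what ets() computes internally, and the variable names are my own.

```r
# Charges copied from the data frame above.
charge <- c(31.9, 27, 31.3, 31, 39.4, 40.7, 42.3, 49.5, 45, 50, 50.9, 58.5,
            39.4, 36.2, 40.5, 44.6, 46.8, 44.7, 52.2, 54, 48.8, 55.8, 58.7, 63.4)

alpha <- 0.8635   # smoothing parameter from the model summary (rounded)
l0 <- 30.9107     # initial level from the model summary (rounded)

# Level recursion: l_t = alpha * y_t + (1 - alpha) * l_{t-1}
level <- numeric(length(charge))
prev <- l0
for (t in seq_along(charge)) {
  level[t] <- alpha * charge[t] + (1 - alpha) * prev
  prev <- level[t]
}

# The one-step-ahead fitted value for month t is the level from month t - 1.
fitted_vals <- c(l0, head(level, -1))
errors <- charge - fitted_vals

# With no trend or seasonal component, every future forecast equals the last level.
next_forecast <- tail(level, 1)

print(round(level, 2))
print(round(errors, 2))
print(next_forecast)
```

Because alpha is close to 1, each level stays close to the observation for the same month, which is why the forecast ends up near the December 2013 charge.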
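The AIC, AICc, and BIC have no absolute scale, so the values above cannot by themselves show that my model fits the data well; they only become informative when compared with the values for alternative models fitted to the same data. The high alpha means the forecasts stay close to the most recent observation, and because the model has no trend component, its forecasts are flat even though the plot shows charges rising. Without comparing it to other models, such as one with a trend, or checking the residuals, it’s hard to say how good this model is.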
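A possible next step is to fit a few alternative models and compare their AICc, then test the residuals of the preferred model for leftover autocorrelation. This is a sketch using ets() from the forecast package and Box.test() from base R. The three candidate models and the lag of 6 are my own choices for illustration, not part of the assignment.

```r
library(forecast)

# Charges copied from the data frame above.
charge <- c(31.9, 27, 31.3, 31, 39.4, 40.7, 42.3, 49.5, 45, 50, 50.9, 58.5,
            39.4, 36.2, 40.5, 44.6, 46.8, 44.7, 52.2, 54, 48.8, 55.8, 58.7, 63.4)
ts_obj <- ts(charge, start = c(2012, 1), frequency = 12)

# Candidate non-seasonal models. In the three-letter code the first letter is
# the error type, the second the trend, and the third the seasonality.
candidates <- list(
  "ANN (additive error, no trend)"       = ets(ts_obj, model = "ANN"),
  "MNN (multiplicative error, no trend)" = ets(ts_obj, model = "MNN"),
  "AAN (additive error, additive trend)" = ets(ts_obj, model = "AAN", damped = FALSE)
)

# Lower AICc is better among models fitted to the same series.
aicc <- sapply(candidates, function(m) m$aicc)
print(sort(aicc))

# Ljung-Box test on the residuals of the lowest-AICc candidate.
# A small p-value suggests the residuals are still autocorrelated.
best <- candidates[[which.min(aicc)]]
lb <- Box.test(residuals(best), lag = 6, type = "Ljung-Box")
print(lb)
```

With only 24 observations, both the AICc ranking and the Ljung-Box test have limited power, so the output is a rough guide and not a verdict.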