The FSSA workflow begins with the creation and visualization of functional time series (fts) data. We illustrate the FSSA algorithms with a univariate example based on call center data. The number of calls made to a bank's call center was recorded every six minutes on each day of 1999, so each day provides 240 observations from which we can estimate a functional data (fd) object. We could analyze each day's fd object individually, but it is more interesting to study how the data behave between days. We expect the number of calls to follow a pattern across days; for instance, the center likely receives more calls on weekdays than on weekends, and we would like to uncover that behavior. We therefore move from a finite-dimensional analysis, in which each day is examined independently for within-day patterns, to a search for patterns between days, in particular weekly patterns tied to the day of the week. This type of analysis calls for fd objects that capture call center behavior not only within a day but also across the 365 days of the year. We create the fd objects from the raw call center data as follows.
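library(fda)
library(Rfssa)
data("Callcenter")
## Define functional objects
# Square-root-transformed counts: 240 intraday intervals per row, one column per day
D <- matrix(sqrt(Callcenter$calls), nrow = 240)
N <- ncol(D) # Number of days
time <- seq(ISOdate(1999, 1, 1), ISOdate(1999, 12, 31), by = "day")
K <- nrow(D) # Number of intraday observations
u <- seq(0, K, length.out = K) # Evaluation grid for the intraday observations
d <- 22 # Optimal number of basis elements
basis <- create.bspline.basis(c(min(u), max(u)), d)
Ysmooth <- smooth.basis(u, D, basis) # Smooth each day's observations into a curve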
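As a quick check of the dimensions used above, the following base R sketch reproduces the arithmetic behind the 240 intraday intervals and the 365 days, and shows that matrix() fills column by column, so that each column holds one day. The vector x is a placeholder index rather than the call center data, and the sketch assumes that the raw series is stored in chronological order, day after day.

```r
# Number of 6-minute intervals in a 24-hour day
obs_per_day <- (24 * 60) / 6

# Number of days in 1999
days <- seq(as.Date("1999-01-01"), as.Date("1999-12-31"), by = "day")
n_days <- length(days)

# Placeholder vector standing in for the long series of call counts
# (an index, NOT the real data)
x <- seq_len(obs_per_day * n_days)

# matrix() fills column-wise, so column j holds the 240 values of day j
M <- matrix(x, nrow = obs_per_day)
dim(M)

# The first interval of day 2 is element 241 of the long vector
M[1, 2]
```

If the raw series were stored in a different order, the reshaping step would need to be adapted accordingly.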
At this point, we have created 365 curves, each representing the number of calls received over the course of a given day. Because the data are by definition dependent on time, we are really working with fts data and need to convert the curves to a proper fts object as follows.
## Define functional time series
Y <- fts(Ysmooth$fd,time = time)
In this version of the Rfssa package, we introduce plotting for fts objects based on the ‘plotly’ package. These plots help in choosing a suitable lag parameter for the FSSA algorithm and give a clearer view of the fts data. The following examples show the plotting options available to the user, using our call center data.
Here is a line plot of the fts object, where each curve represents the calls received over the course of one day, for all 365 days. We see that, regardless of the day, the call center receives fewer calls early in the day and more calls later in the day, which is the trend component we expect the FSSA algorithm to find.
plot(Y, main="Line Plot of Callcenter Data",xlab = "Intraday Intervals",ylab = "Sqrt of Callcenter", type='line')
The next plot is a heat map, which offers a top-down view of the fts object so that the user can more clearly see patterns in the functions across days. We see periodicity in the fts object: about every seven days, the behavior of the functions changes. This is to be expected, as the bank receives fewer calls on weekends.
plot(Y, main = "Heat Map of Callcenter Data",tlab="Days",xlab = "Intraday Intervals", type='heatmap')
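The weekly pattern suggested by the heat map can also be checked numerically by averaging a per-day summary within each day of the week. The following is a minimal base R sketch; weekday_profile is a hypothetical helper, not part of Rfssa, and D_demo is synthetic stand-in data built so that weekends are lower than weekdays. It is not the call center data.

```r
# Hypothetical helper (not part of Rfssa): mean daily level by day of the week.
# D has one column per day; dates is a Date vector with one entry per column.
weekday_profile <- function(D, dates) {
  stopifnot(ncol(D) == length(dates))
  daily_mean <- colMeans(D)                # one summary value per day
  wd <- as.integer(format(dates, "%u"))    # 1 = Monday, ..., 7 = Sunday
  tapply(daily_mean, wd, mean)             # average within each weekday
}

# Synthetic stand-in data, for illustration only (NOT the call center data):
# a constant level of 10 on weekdays and 2 on weekends.
dates <- seq(as.Date("1999-01-01"), as.Date("1999-12-31"), by = "day")
wd <- as.integer(format(dates, "%u"))
level <- ifelse(wd >= 6, 2, 10)
D_demo <- matrix(rep(level, each = 240), nrow = 240)

profile <- weekday_profile(D_demo, dates)
profile
```

With the objects defined earlier, the same helper could be applied as weekday_profile(D, as.Date(time)); a clear drop for days 6 and 7 would be consistent with the seven-day periodicity seen in the heat map.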
The following 3D surface and 3D line plot options are helpful because they combine the information in the line plot with the information in the heat map. For our call center example, we again see the seven-day periodic behavior.
plot(Y, xlab = "Intraday intervals",ylab = "Sqrt of Callcenter", type='3Dsurface')
plot(Y, xlab = "Intraday intervals",ylab = "Sqrt of Callcenter", type='3Dline')
In general, the fts class and its new plotting options are very helpful and informative when analyzing functional time series data, as seen in our univariate call center example. These plots give us clues about how to choose the lag parameter and what types of patterns to look for in the FSSA plots. For instance, since we see a roughly seven-day periodic component, we expect the number seven to appear in some way in our functional singular spectrum analysis, such as in the paired plots.
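To make the link between the observed period and the lag parameter concrete, the sketch below lists candidate lags under a common SSA heuristic: take the lag to be a multiple of the suspected period and no larger than half the series length. This heuristic is an assumption introduced here for illustration, not a rule stated by the package, and the particular lag picked at the end is only an example.

```r
N <- 365      # number of days (functions) in the series
period <- 7   # period suggested by the heat map and 3D plots

# Assumed heuristic: lags that are multiples of the period and at most N / 2
candidate_L <- seq(period, floor(N / 2), by = period)
candidate_L

# Illustrative choice only; any element of candidate_L satisfies the heuristic
L <- candidate_L[4]

# With Rfssa loaded and Y defined as above, the decomposition would then be
# requested along these lines (not run here):
# U <- fssa(Y, L)
```

Comparing the FSSA plots for a few of these candidates is one way to see how sensitive the extracted weekly component is to the choice of lag.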