Comparative Study of Wavelet-SARIMA and EMD-SARIMA for Forecasting Daily Temperature Series

,


1.Introduction
Climate change is one of the most critical issues in our time. Earth's climate has changed throughout history and threatens the lives and livelihood of billions of people. Earth-orbiting satellites and additional technological advances have enabled scientists to see the big picture by collecting different information about our planet and its climate on a global scale. The collected data over many years show the signals of climate change. Research conducted in 2018 [1] says that climate change is the most severe environmental problem. It directly affects human activities and property. Precipitation, melting ice, drought, poverty, river drying, and reducing the number and distribution of groundwater resources are some of the climate change signals. Temperature is one of the main climate change elements. Increasing or decreasing in temperature will affect the weather pattern in the lives of plants and animals. Understanding and predicting the future courses of temperature quantities based on the historical time series data is fundamental in climate change to know the pattern and increase in the frequency and magnitude of extreme events [2]- [5]. In addition, a good forecast with minimum error support for future preparation and good decisionmaking. Most time series forecasting methods are based on analyzing the past observed data by assuming that the past patterns in the data can be used to forecast future events.
The autoregressive integrated moving average (ARIMA) model is one of the most popular time series modelling methods. The main aim of the ARIMA model is to cautiously and rigorously study the past observations of a time series to develop a suitable model which can predict future value.
Over the past years, the ARIMA model has been widely used in temperature time series forecasting. For annual temperature prediction in Libya, [6] study showed that the linear ARIMA model and the quadratic ARIMA model had the best performance in making short-term predictions.
The study [32] used the ARIMA model for 50 year time period  in the south of Iran, and the ARIMA model was selected as the optimal model for temperature data. Furthermore, [7] used the seasonal autoregressive integrated moving average (SARIMA) model to forecast the monthly mean of the maximum surface air temperature of India, and [8] used SARIMA for monthly mean temperature in Nanjing, China. In all these studies, the selected models had higher prediction accuracy. But a disadvantage of the ARIMA model, it assumes that data are stationary and has a limited ability to capture nonstationarity and nonlinear data. ARIMA model cannot handle noisy time series data without preprocessing it. So removing the noise before applying the ARIMA model makes it easier in the modelling process to capture the features of the actual series. Different Int. J. Anal. Appl. (2022), 20:17 3 papers have introduced the hybridized approach (combination of other methods to model to get more accurate results) of decomposition models and ARIMA model to get a well-fitted model with good prediction performance of nonstationary series, [9]- [13]. Decomposition methods decompose the main time series into stationary subcomponents, which help us to see the detailed structure of the series and improve the model's prediction performance by removing noisy components. Wavelet transform (WT) and Empirical mode decomposition (EMD) based ARIMA model achieve better prediction accuracy than direct applying ARIMA model to our time series.
A comparative study in Bangladesh [9] used wavelet decomposition with the ARIMA and artificial neural network models for temperature prediction. In the study, the hybrid WT-ARIMA model result was effective. This technique supports the ARIMA model to fit the data structure, but it also has progressed in forecasting performance [10]- [12] [34]. Wavelet analysis has beautiful and deep mathematical properties, making them a well-adapted tool for different data types. It has much attention in signal processing since its theoretical development in 1984 by Grossman & Morlet. The studies of climate changes using wavelet analysis have received much consideration and applied in detecting climate signals [13]- [15]. In theory and applications, wavelet transforms enormously relates to Fourier Transform. Still, the advantage of using WT is that it offers a simultaneous localization of output in frequency and time domain with faster computation. The most significant benefit is separating fine details in the signal using small wavelets [16]. As a result, several applied fields are making use of wavelets such as [17], [18], [33].
Empirical Mode Decomposition (EMD) is a data-adaptive time-frequency representation technique proposed by Huang. It requires only that the decomposed component consists of a simple intrinsic mode of oscillations [19], which makes it quite different from WT. This method is most suitable for nonlinear and nonstationary time series data. The EMD methodology is based on a sifting process, which identifies local maxima and minima and results in intrinsic mode functions (IMFs). In detail, it decomposes the time series into a finite sum of intrinsic mode functions [20].
EMD decomposes a nonstationary signal into several stationary series components. And its combination with the ARIMA model improves prediction performance than traditional [21]. The hybrid EMD-ARIMA model shows effective long-term and short-term prediction [22], [23].
After denoising the original series with EMD and wavelet analyzer for the applied ARIMA model, we must get a suitable parameters estimation technique to build a well-fitted model, which is the main problem in model selection. A good model selecting process will balance the goodness of model fit with simplicity. In most studies, AIC and BIC are used to estimate the parameters of the ARIMA (p, d, q) model. Another effective method of finding a well fit model is based on the residual structure of the model [24]. Accurate parameter estimation makes its residual sequence stationary, which means a well fit model makes a stationary residual series. From this perspective, [25] used the stationarity of estimated residual series to calibrate inaccurate estimation in the case of data with complex noise. A nonstationary residual series shows a trend in the sequence that the model does not capture, which lead to under-fitting. And the stationarity of residual series can be measured by the nonstationary measure (NS) technique proposed by Ding et al. [26], [27], [28]. NS is constructed based on Shannon entropy and ergodic theory, the level of nonstationarity of a data series and the value of NS is in the interval [0, 1]. But the more the NS value is smaller, the more stationary the series is. From this perspective, the Nonstationary measure is used in this paper to check the NS value of residual in both the denoising and model selection process. This method has been used in different studies, such as [29] using NS measure to select forecasting models for irrigation water consumption and [30] using decomposition of noise and trend of a series based EMD and nonstationary measure. This paper aims to find a model that can predict all data features. And the main problems are the limitation of the ARIMA model to nonstationary data and model selection techniques. This paper examines the daily mean temperature's statistical properties. We constructed a predictive model to forecast the daily mean temperature for long term prediction, using a comparative study of the seasonal ARIMA model based on decomposition methods. Wavelet transforms (WT) and Empirical mode decomposition (EMD) remove noise from the temperature series and Seasonal autoregressive integrated moving-average (SARIMA) model applied to the denoised series. And nonstationary measures are used in model construction in all processes. Model performance is tested in three years and one year and a half ahead forecasting.

Study area and data collection
Temperature time-series data is used in this study. Our study aims to build a well fit model representing the series and forecasting the future temperature trend. The study area is in Eritrea, and one city is chosen, Asmara. We chose Asmara as the study location to represent Eritrea's high land climatic condition as it is the central area of the region. Daily average temperature data of 30 years collected from January 1, 1991, to December 31, 2020, from the website-Int. J. Anal. Appl. (2022), 20:17 5 https://power.larc.nasa.gov/data-access-viewer/. Collected time series data has strong seasonality, irregular patterns and trends. We do not have any missing data. Eritrea found in East Africa or the Horn of Africa, between 12° and 18° north, and 36° and 44° east. It borders the Red Sea, Ethiopia, Djibouti, and Sudan. Asmara is the capital city of Eritrea, home to almost 1 million people, and it is the largest and most populated city. Asmara has a high temperature in the spring from April to June and again in September reaches an averagely of 27.8°c. It went to the lowest temperature from November to February, on averagely, to 21.6°c. The Horn Africa region has an unpredictable change in climate, which causes frequent droughts. So as Eritrea is part of this region, the climate signals and future prediction are essential.

Wavelet analysis
Wavelet analysis is a mathematical model that reveals nonstationary time series data information in the time and frequency domain. That is why it is suitable for the nonstationary and seasonal temperature series. The main advantage of wavelet transform is that it decomposes a signal into different frequencies at a different resolution and is used as an alternative to Fourier transform.
The basic wavelet function called "mother wavelet" is defined as: the parameter a is called the scaling parameter and measure the degree of compression, while b the translation parameter moves the function in the time location to analyze the signal. The transformation process is implemented using a multi-resolution decomposition technique. It divides the original series into different domain components, the detail (high frequency) series , and the smooth (approximate) series , using a high pass filter and low pass filter, respectively. There exist various choices of wavelet basis functions such as Haar, Daubechies, Symlet, Meyer, biorthogonal wavelet, etc. A detailed explanation of wavelet transform functions is in [11], [13]. In our study, the wavelet family was selected based on the NS measure of the residual between denoised series and original series.
When conducting wavelet analysis, selecting the optimal number of decomposition levels is one of the keys to determining the performance of a model in the wavelet domain. Based on [9], the number of decomposition levels is selected using the following formula: where L is the number of decomposition levels, N is time-series length. The original time-series data will be decomposed into its detailed , and approximate (smooth) series , . And with the wavelet de-nosing method, the series was reconstructed by removing the high-frequency component. Generally, EMD decomposes a noisy signal into different IMFs plus residuals. The decomposed signal is written as: where ℎ , indicates the i th IMF and represents the residual of the signal . In the EMD, each IMF function must satisfy the following conditions: 1. The number of maximum and minimum extrema and the number of zero crossings must either be equal or differ at most by one.
2. At any point in the IMF series, the mean of the envelope defined by the local maxima and the local minima is zero.
From the second condition, we can understand that IMF is stationary, which simplifies the analysis, but IMF can have variable amplitude and frequency. An iterative procedure of EMD called the "sifting process" is adopted to extract the separate IMF components. The detailed guidelines of the process can refer it in [31]. Step1: let be the original time series, decompose the series with EMD decomposition procedure Step2: construct the Denoised series by adding low-frequency components Step3: calculate the difference series between original and denoised data = − , t=1, 2,…, n (3. 8) Step4: If the Ns value of the difference series is ( ) ≤ 0.05 then is the new Denoised series. If the value is bigger, go to step 2. and i=i+1. Continue the process until we get a stationary residual series.  where testing set and ̂ forecasted data series with n length series. MAE is the mean absolute difference between the testing set and forecasted data. RMSE, the square root of the average square difference of test set and forecasted value. In both, the minimum value is a better result.
The formula gives for MASE: where Q is the scaling statistic computed on the training set, for non-seasonal time series, the Q value is the mean absolute error of the one-step naïve forecasting method.
and for seasonal series: where m is the season period and n the size of the training set, if the MASE result is less than one, it means a better forecasting value than the naïve forecast, and if the result is greater than one, it shows worse forecasting. A model with a minimum value in all tests is considered better performance.

Data Division
The average temperature time series data used in this study were daily collected data of 30 years in Asmara, Eritrea. A nonparametric trend analysis is used in our research. The seasonal Mann-Kendall (M-K) test shows an increasing trend in the series as the p-value is <0.0001. Table 1.
demonstrates the statistical description of the data and the type of trend in the series. Sen's slope refers to the slope of the trend, which is 0.013. As we see the structure of our data next step was to find a model that captures the trend and features of the series. And categorizing the series into groups is essential because different sets are needed for training data, while another set is used as testing data. Training data is manipulated to train and develop the forecasting model.
In contrast, testing data is treated as future data that must be used to compare the fitted model's predictability accuracy. It gives the ability to compare the effectiveness of different models of prediction. The partitioning way of the series is essential, which could affect the performance of the forecasting result of the models, and several aspects such as the data type and the size of the available data should be taken into consideration. In this research, the training set was chosen from    The time-series data is stationary if the ACF graph cuts off rapidly, but if the ACF graph dies down too slow, the series is considered nonstationary. In Fig.6, it is noticeable that the ACF graph drops gradually; thus, the training set of the actual data is considered nonstationary and nonlinear.
In this study, decomposition methods were added to remove the noise component of the series, and as the decomposed components illustrate, the original series has seasonal behavior. Therefore, the ARIMA model will use seasonal differencing to handle the seasonal behavior of the series.  But in some models, The result of three model selection criteria may not reduce proportionally. In this situation model is selected by balancing these three criteria to find the minimum value of AIC and BIC and stationary residual series where the residual NS value must be less or equal to 0.05.
Results of AIC, BIC and NS of sample models from WT and EMD based SARIMA are listed in SARIMA(2,0,1)(1,1,1)365 has the minimum AIC and BIC value, but as shown in Table 2., the NS value of its residual is greater than 0.05, which implies its residual series is not stationary. While  Table 3. illustrate all the result of the selected models, and all hybrid models have stationary residual series, which show how well they fit the training set.     Figure 7: 1095 days forecasting series Vs testing set The estimated coefficients value, standard error, t-statistic and p-value of the model are given in Table 6. Since all P-values are less than 0.05, the results are statistically significant.

Conclusion
Eritrea is located in the area where climate change is a major issue. And the climate of this country is affected by topography, which divide the country into high land and low land. Most of the population lives in the high land region and more than 70% of the population life depends on agriculture. Therefore, studies conducting on climate change are important to understand the trend and future prediction. As temperature is one element of climate change in this study we focus on finding a model for long term prediction of temperature in the city Asmara. This study assessed the capability of prediction of SARIMA based two different decomposition methods and the characteristics of the temperature data. the models capture all the information and structure of the series. As we see in the model construction and forecasting evaluation, models chosen based on the NS value has higher performance. As we are going for minimum AIC and BIC, the model's residual is showing nonstationarity. And this is one drawback of AIC and BIC we see in this study. Measuring the NS value of the residual is practical to find a well-fitted model.