Dataset

Electricity demand data for the state of Victoria, Australia, from 2002 to 2015, as used in the following paper:

K. Bandara, R.J. Hyndman, and C. Bergmeir (2021)
    MSTL: A Seasonal-Trend Decomposition Algorithm for Time Series with Multiple
    Seasonal Patterns. arXiv preprint arXiv:2107.13462.

Import

import datetime
import re

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

sns.set_context("talk")

Load Data

data = pd.read_csv(
    "../Datasets/victoria_electricity_demand.csv",
    usecols=["demand", "date_time"],
    parse_dates=["date_time"],
    index_col=["date_time"],
)
data.shape

Plot the Data

fig, ax = plt.subplots(figsize=[20, 5])
data.loc["2012":].plot(y="demand", legend=False, ax=ax)
ax.set_xlabel("Time")
ax.set_ylabel("Electricity Demand")
ax.set_title("Electricity Demand")
plt.tight_layout()

Using sklearn to create dummy features

from sklearn.preprocessing import OneHotEncoder

# Ensure all sklearn transformers output pandas DataFrames
from sklearn import set_config

set_config(transform_output="pandas")

df = data.copy()

Create some features from the date

df["month_of_year"] = df.index.month
df["week_of_year"] = df.index.isocalendar().week
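
ISO week numbers do not always line up with the calendar year, so `week_of_year` can be 52 or 53 in early January and 1 in late December. The following minimal sketch uses a few hand-picked dates (illustrative only, not taken from the dataset) to show how the two features behave around year boundaries:

```python
import pandas as pd

# Hand-picked dates around year boundaries (illustrative, not from the dataset).
idx = pd.DatetimeIndex(["2012-01-01", "2012-01-02", "2012-12-31", "2015-12-31"])
toy = pd.DataFrame(index=idx)

# Same feature definitions as above.
toy["month_of_year"] = toy.index.month
toy["week_of_year"] = toy.index.isocalendar().week

print(toy)
```

Here 2012-01-01 (a Sunday) falls in ISO week 52, 2012-12-31 (a Monday) falls in ISO week 1, and 2015-12-31 falls in ISO week 53. This is worth keeping in mind when reading the week dummies created later.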

Create the one hot encoder transformer

transformer = OneHotEncoder(
    sparse_output=False,  # Dense output is required to enable pandas output.
    drop="first",  # Drop the first dummy of each feature to avoid the dummy variable trap.
)

Create seasonal dummy variables from our date features

result = transformer.fit_transform(df[["month_of_year", "week_of_year"]])
result.head()