lhf.tex

\documentclass[11pt,a4paper,]{article}
\usepackage{lmodern}

\usepackage{amssymb,amsmath}
\usepackage{ifxetex,ifluatex}
\usepackage{fixltx2e} % provides \textsubscript
\ifnum 0\ifxetex 1\fi\ifluatex 1\fi=0 % if pdftex
  \usepackage[T1]{fontenc}
  \usepackage[utf8]{inputenc}
\else % if luatex or xelatex
  \usepackage{unicode-math}
  \defaultfontfeatures{Ligatures=TeX,Scale=MatchLowercase}
\fi
% use upquote if available, for straight quotes in verbatim environments
\IfFileExists{upquote.sty}{\usepackage{upquote}}{}
% use microtype if available
\IfFileExists{microtype.sty}{%
\usepackage[]{microtype}
\UseMicrotypeSet[protrusion]{basicmath} % disable protrusion for tt fonts
}{}
\PassOptionsToPackage{hyphens}{url} % url is loaded by hyperref
\usepackage[unicode=true]{hyperref}
\hypersetup{
            pdftitle={Fast forecast reconciliation using linear models},
            pdfkeywords={hierarchical forecasting, grouped forecasting, reconciling forecast, linear regression},
            pdfborder={0 0 0},
            breaklinks=true}
\urlstyle{same}  % don't use monospace font for urls
\usepackage{geometry}
\geometry{left=2.5cm,right=2.5cm,top=2.5cm,bottom=2.5cm}
\usepackage[style=authoryear-comp,]{biblatex}
\addbibresource{references.bib}
\usepackage{longtable,booktabs}
% Fix footnotes in tables (requires footnote package)
\IfFileExists{footnote.sty}{\usepackage{footnote}\makesavenoteenv{long table}}{}
\IfFileExists{parskip.sty}{%
\usepackage{parskip}
}{% else
\setlength{\parindent}{0pt}
\setlength{\parskip}{6pt plus 2pt minus 1pt}
}
\setlength{\emergencystretch}{3em}  % prevent overfull lines
\providecommand{\tightlist}{%
  \setlength{\itemsep}{0pt}\setlength{\parskip}{0pt}}
\setcounter{secnumdepth}{5}

% set default figure placement to htbp
\makeatletter
\def\fps@figure{htbp}
\makeatother


\title{Fast forecast reconciliation using linear models}

%% MONASH STUFF

%% CAPTIONS
\RequirePackage{caption}
\DeclareCaptionStyle{italic}[justification=centering]
 {labelfont={bf},textfont={it},labelsep=colon}
\captionsetup[figure]{style=italic,format=hang,singlelinecheck=true}
\captionsetup[table]{style=italic,format=hang,singlelinecheck=true}

%% FONT
\RequirePackage{bera}
\RequirePackage{mathpazo}

%% HEADERS AND FOOTERS
\RequirePackage{fancyhdr}
\pagestyle{fancy}
\rfoot{\Large\sffamily\raisebox{-0.1cm}{\textbf{\thepage}}}
\makeatletter
\lhead{\textsf{\expandafter{\@title}}}
\makeatother
\rhead{}
\cfoot{}
\setlength{\headheight}{15pt}
\renewcommand{\headrulewidth}{0.4pt}
\renewcommand{\footrulewidth}{0.4pt}
\fancypagestyle{plain}{%
\fancyhf{} % clear all header and footer fields
\fancyfoot[C]{\sffamily\thepage} % except the center
\renewcommand{\headrulewidth}{0pt}
\renewcommand{\footrulewidth}{0pt}}

%% MATHS
\RequirePackage{bm,amsmath}
\allowdisplaybreaks

%% GRAPHICS
\RequirePackage{graphicx}
\setcounter{topnumber}{2}
\setcounter{bottomnumber}{2}
\setcounter{totalnumber}{4}
\renewcommand{\topfraction}{0.85}
\renewcommand{\bottomfraction}{0.85}
\renewcommand{\textfraction}{0.15}
\renewcommand{\floatpagefraction}{0.8}

%\RequirePackage[section]{placeins}

%% SECTION TITLES
\RequirePackage[compact,sf,bf]{titlesec}
\titleformat{\section}[block]
  {\fontsize{15}{17}\bfseries\sffamily}
  {\thesection}
  {0.4em}{}
\titleformat{\subsection}[block]
  {\fontsize{12}{14}\bfseries\sffamily}
  {\thesubsection}
  {0.4em}{}
\titlespacing{\section}{0pt}{*5}{*1}
\titlespacing{\subsection}{0pt}{*2}{*0.2}


%% TITLE PAGE
\def\Date{\number\day}
\def\Month{\ifcase\month\or
 January\or February\or March\or April\or May\or June\or
 July\or August\or September\or October\or November\or December\fi}
\def\Year{\number\year}

\makeatletter
\def\wp#1{\gdef\@wp{#1}}\def\@wp{??/??}
\def\jel#1{\gdef\@jel{#1}}\def\@jel{??}
\def\showjel{{\large\textsf{\textbf{JEL classification:}}~\@jel}}
\def\nojel{\def\showjel{}}
\def\addresses#1{\gdef\@addresses{#1}}\def\@addresses{??}
\def\cover{{\sffamily\setcounter{page}{0}
        \thispagestyle{empty}
        \placefig{2}{1.5}{width=5cm}{monash2}
        \placefig{16.9}{1.5}{width=2.1cm}{MBusSchool}
        \begin{textblock}{4}(16.9,4)ISSN 1440-771X\end{textblock}
        \begin{textblock}{7}(12.7,27.9)\hfill
        \includegraphics[height=0.7cm]{AACSB}~~~
        \includegraphics[height=0.7cm]{EQUIS}~~~
        \includegraphics[height=0.7cm]{AMBA}
        \end{textblock}
        \vspace*{2cm}
        \begin{center}\Large
        Department of Econometrics and Business Statistics\\[.5cm]
        \footnotesize http://monash.edu/business/ebs/research/publications
        \end{center}\vspace{2cm}
        \begin{center}
        \fbox{\parbox{14cm}{\begin{onehalfspace}\centering\Huge\vspace*{0.3cm}
                \textsf{\textbf{\expandafter{\@title}}}\vspace{1cm}\par
                \LARGE\@author\end{onehalfspace}
        }}
        \end{center}
        \vfill
                \begin{center}\Large
                \Month~\Year\\[1cm]
                Working Paper \@wp
        \end{center}\vspace*{2cm}}}
\def\pageone{{\sffamily\setstretch{1}%
        \thispagestyle{empty}%
        \vbox to \textheight{%
        \raggedright\baselineskip=1.2cm
     {\fontsize{24.88}{30}\sffamily\textbf{\expandafter{\@title}}}
        \vspace{2cm}\par
        \hspace{1cm}\parbox{14cm}{\sffamily\large\@addresses}\vspace{1cm}\vfill
        \hspace{1cm}{\large\Date~\Month~\Year}\\[1cm]
        \hspace{1cm}\showjel\vss}}}
\def\blindtitle{{\sffamily
     \thispagestyle{plain}\raggedright\baselineskip=1.2cm
     {\fontsize{24.88}{30}\sffamily\textbf{\expandafter{\@title}}}\vspace{1cm}\par
        }}
\def\titlepage{{\cover\newpage\pageone\newpage\blindtitle}}

\def\blind{\def\titlepage{{\blindtitle}}\let\maketitle\blindtitle}
\def\titlepageonly{\def\titlepage{{\pageone\end{document}}}}
\def\nocover{\def\titlepage{{\pageone\newpage\blindtitle}}\let\maketitle\titlepage}
\let\maketitle\titlepage
\makeatother

%% SPACING
\RequirePackage{setspace}
\spacing{1.5}

%% LINE AND PAGE BREAKING
\sloppy
\clubpenalty = 10000
\widowpenalty = 10000
\brokenpenalty = 10000
\RequirePackage{microtype}

%% PARAGRAPH BREAKS
\setlength{\parskip}{1.4ex}
\setlength{\parindent}{0em}

%% HYPERLINKS
\RequirePackage{xcolor} % Needed for links
\definecolor{darkblue}{rgb}{0,0,.6}
\RequirePackage{url}

\makeatletter
\@ifpackageloaded{hyperref}{}{\RequirePackage{hyperref}}
\makeatother
\hypersetup{
     citecolor=0 0 0,
     breaklinks=true,
     bookmarksopen=true,
     bookmarksnumbered=true,
     linkcolor=darkblue,
     urlcolor=blue,
     citecolor=darkblue,
     colorlinks=true}

%% KEYWORDS
\newenvironment{keywords}{\par\vspace{0.5cm}\noindent{\sffamily\textbf{Keywords:}}}{\vspace{0.25cm}\par\hrule\vspace{0.5cm}\par}

%% ABSTRACT
\renewenvironment{abstract}{\begin{minipage}{\textwidth}\parskip=1.4ex\noindent
\hrule\vspace{0.1cm}\par{\sffamily\textbf{\abstractname}}\newline}
  {\end{minipage}}


\usepackage[T1]{fontenc}
\usepackage[utf8]{inputenc}

\usepackage[showonlyrefs]{mathtools}
\usepackage[no-weekday]{eukdate}

%% BIBLIOGRAPHY

\makeatletter
\@ifpackageloaded{biblatex}{}{\usepackage[style=authoryear-comp, backend=biber, natbib=true]{biblatex}}
\makeatother
\ExecuteBibliographyOptions{bibencoding=utf8,minnames=1,maxnames=3, maxbibnames=99,dashed=false,terseinits=true,giveninits=true,uniquename=false,uniquelist=false,doi=false, isbn=false,url=true,sortcites=false}

\DeclareFieldFormat{url}{\texttt{\url{#1}}}
\DeclareFieldFormat[article]{pages}{#1}
\DeclareFieldFormat[inproceedings]{pages}{\lowercase{pp.}#1}
\DeclareFieldFormat[incollection]{pages}{\lowercase{pp.}#1}
\DeclareFieldFormat[article]{volume}{\mkbibbold{#1}}
\DeclareFieldFormat[article]{number}{\mkbibparens{#1}}
\DeclareFieldFormat[article]{title}{\MakeCapital{#1}}
\DeclareFieldFormat[inproceedings]{title}{#1}
\DeclareFieldFormat{shorthandwidth}{#1}
% No dot before number of articles
\usepackage{xpatch}
\xpatchbibmacro{volume+number+eid}{\setunit*{\adddot}}{}{}{}
% Remove In: for an article.
\renewbibmacro{in:}{%
  \ifentrytype{article}{}{%
  \printtext{\bibstring{in}\intitlepunct}}}

\makeatletter
\DeclareDelimFormat[cbx@textcite]{nameyeardelim}{\addspace}
\makeatother
\renewcommand*{\finalnamedelim}{%
  %\ifnumgreater{\value{liststop}}{2}{\finalandcomma}{}% there really should be no funny Oxford comma business here
  \addspace\&\space}


\wp{29/19}
\jel{C10,C14,C22}

\RequirePackage[absolute,overlay]{textpos}
\setlength{\TPHorizModule}{1cm}
\setlength{\TPVertModule}{1cm}
\def\placefig#1#2#3#4{\begin{textblock}{.1}(#1,#2)\rlap{\includegraphics[#3]{#4}}\end{textblock}}


\author{Mahsa~Ashouri, Rob J~Hyndman, Galit~Shmueli}
\addresses{\textbf{Mahsa Ashouri}\newline
Institute of Service Science, National Tsing Hua University, Taiwan
\newline{Email: \href{mailto:mahsa.ashouri@iss.nthu.edu.tw}{\nolinkurl{mahsa.ashouri@iss.nthu.edu.tw}}}\newline Corresponding author\\[1cm]
\textbf{Rob J Hyndman}\newline
Monash University, Clayton VIC 3800, Australia
\newline{Email: \href{mailto:rob.hyndman@monash.edu}{\nolinkurl{rob.hyndman@monash.edu}}}\\[1cm]
\textbf{Galit Shmueli}\newline
Institute of Service Science, National Tsing Hua University, Taiwan
\newline{Email: \href{mailto:galit.shmueli@iss.nthu.edu.tw}{\nolinkurl{galit.shmueli@iss.nthu.edu.tw}}}\\[1cm]
}

\date{\sf\Date~\Month~\Year}
\makeatletter
 \lfoot{\sf Ashouri, Hyndman, Shmueli: \@date}
\makeatother

%% Any special functions or other packages can be loaded here.

\usepackage{booktabs}
\usepackage{float}
\usepackage{longtable}
\usepackage{cases}
\usepackage{array}
\usepackage{todonotes}
%\usepackage[ruled,vlined,linesnumbered]{algorithm2e}
%\usepackage{algpseudocode}
\usepackage{algorithm,algorithmic,amsmath}

%\usepackage[backend=biber]{biblatex}
%\usepackage[backend=biber, bibencoding=utf8, style=authoryear, citestyle=authoryear]{biblatex}

\mathtoolsset{showonlyrefs=true}
\allowdisplaybreaks

\def\addlinespace{}
\usepackage[section]{placeins}

\usepackage{float}
\let\origfigure\figure
\let\endorigfigure\endfigure
\renewenvironment{figure}[1][2] {
    \expandafter\origfigure\expandafter[!htbp]
} {
    \endorigfigure
}

\let\origtable\table
\let\endorigtable\endtable
\renewenvironment{table}[1][2] {
    \expandafter\origtable\expandafter[!htbp]
} {
    \endorigtable
}
\usepackage{booktabs}
\usepackage{longtable}
\usepackage{array}
\usepackage{multirow}
\usepackage{wrapfig}
\usepackage{float}
\usepackage{colortbl}
\usepackage{pdflscape}
\usepackage{tabu}
\usepackage{threeparttable}
\usepackage{threeparttablex}
\usepackage[normalem]{ulem}
\usepackage{makecell}
\usepackage{xcolor}


\begin{document}
\maketitle
\begin{abstract}
Forecasting hierarchical or grouped time series using a reconciliation approach involves two steps: computing base forecasts and reconciling the forecasts. Base forecasts can be computed by popular time series forecasting methods such as Exponential Smoothing (ETS) and Autoregressive Integrated Moving Average (ARIMA) models. The reconciliation step is a linear process that adjusts the base forecasts to ensure they are coherent. However using ETS or ARIMA for base forecasts can be computationally challenging when there are a large number of series to forecast, as each model must be numerically optimized for each series. We propose a linear model that avoids this computational problem and handles the forecasting and reconciliation in a single step. The proposed method is very flexible in incorporating external data, handling missing values and model selection. We illustrate our approach using a dataset on monthly Australian domestic tourism, as well as a simulated dataset. We compare our approach to reconciliation using ETS and ARIMA, and show that our approach is much faster while providing similar levels of forecast accuracy.
\end{abstract}
\begin{keywords}
hierarchical forecasting, grouped forecasting, reconciling forecast, linear regression
\end{keywords}

\hypertarget{introduction}{%
\section{Introduction}\label{introduction}}

Modern data collection tools have dramatically increased the amount of available time series data \autocite{januschowski2013forecasting}. For example, the internet of things and point-of-sale scanning produce huge volumes of time series in a short period of time. Naturally, there is an interest in forecasting these time series, yet forecasting large collections of time series is computationally challenging.

\hypertarget{hierarchical-and-grouped-time-series}{%
\subsection{Hierarchical and grouped time series}\label{hierarchical-and-grouped-time-series}}

In many cases, these time series can be structured and disaggregated based on hierarchies or groups such as geographic location, product type, gender, etc. An example of hierarchical time series is the monthly number of Australian domestic tourists, which can be disaggregated into different states and then into different zones. Figure \ref{fig:hierarchicalexample} shows a schematic of such a hierarchical time series structure with three levels. The top level is the total series, formed by aggregating all the bottom level series. In the middle level, series are aggregations of their own child series; for instance, series A is the aggregation of AW and AX. Finally, the bottom level series, includes the most disaggregated series. In our example, A might represent the Northern Territory (NT) state, which can be disaggregated into northern coast NT and central NT.

\begin{figure}

{\centering \includegraphics[width=200px,height=170px,trim=0 0 190 0,clip=true]{Paper-Figures/hierarchical_example} 

}

\caption{An example of a two level hierarchical structure.}\label{fig:hierarchicalexample}
\end{figure}

Grouped time series involve more complicated aggregation structures compared to strictly hierarchical time series. To continue our Australian tourism example: suppose we have two grouping factors for each tourist that are not nested: purpose of travel (Business/Holiday) and sex (Male/Female). The disaggregated series for each combination of purpose of travel and sex can be combined to form purpose of travel subtotals, or sex subtotals. These subtotals can be combined to give the overall total. Both subtotals are of interest.

We can think of such structures as hierarchical time series without a unique hierarchy. A schematic of this grouped time series structure is shown in Figure \ref{fig:groupexample} with two grouping factors, each of two levels (A/B and C/D). The series in this structure can be split first into groups A and B and then subdivided further into C and D (left side), or split first into C and D and then subdivided into A and B (right side). The final disaggregation is identical in both cases, but the middle level aggregates are different.

\begin{figure}

{\centering \includegraphics[width=330px,height=180px]{lhf_files/figure-latex/groupexample-1} 

}

\caption{An example of a two level grouped structure.}\label{fig:groupexample}
\end{figure}

We use the same notation \autocite[following][]{fpp2} for both hierarchical and grouped time series. We denote the total series at time \(t\) by \(y_t\), and the series at node \(Z\) (subaggregation level \(Z\)) and time \(t\) by \(y_{Z,t}\). For describing the relationships between series, we use an \(N\times M\) matrix, called the ``summing matrix'', denoted by \(\bm{S}\), in which \(N\) is the overall number of nodes and \(M\) is the number of bottom level nodes. For example in Figure \ref{fig:hierarchicalexample}, \(N = 7\) and \(M = 4\), while in Figure \ref{fig:groupexample}, \(N=9\) and \(M=4\). Then we can write \(\bm{y}_t=\bm{S}\bm{b}_t\), where \(\bm{y}_t\) is a vector of all the level nodes at time \(t\) and \(\bm{b}_t\) is the vector of all the bottom level nodes at time \(t\). For the example shown in Figure \ref{fig:groupexample}, the equation can be written as follows:
\begin{equation}\label{eq:Smatrixexample}
  \begin{pmatrix}
    y_{t}\\y_{A,t}\\y_{B,t}\\y_{C,t}\\y_{D,t}\\y_{AC,t}\\y_{AD,t}\\y_{BC,t}\\y_{BD,t}
  \end{pmatrix} =
  \begin{pmatrix}
    1&1&1&1\\1&1&0&0\\0&0&1&1\\1&0&1&0\\0&1&0&1\\1&0&0&0\\0&1&0&0\\0&0&1&0\\0&0&0&1\\
  \end{pmatrix}
  \begin{pmatrix}
    y_{AC,t}\\y_{AD,t}\\y_{BC,t}\\y_{BD,t}\\
  \end{pmatrix}.
\end{equation}

\hypertarget{forecasting-hierarchical-time-series}{%
\subsection{Forecasting hierarchical time series}\label{forecasting-hierarchical-time-series}}

If we just forecast each series individually, we are ignoring the hierarchical or grouping structure, and the forecasts will not be ``coherent''. That is, forecasts for lower level series will not necessarily add up to forecasts for higher level series. This means forecasts will not add up in a way that is consistent with the aggregation structure of the time series collection \autocite{fpp2}.

There are several available methods that consider the hierarchical structure information when forecasting time series. These include the top-down \autocite{gross1990disaggregation,fliedner2001hierarchical}, bottom-up \autocite{kahn1998revisiting}, middle-out and optimal combination \autocite{hyndman2011optimal} approaches. In the top-down approach, we first forecast the total series and then disaggregate the forecast to form lower level series forecasts based on a set of historical or forecasted proportions \autocite[for details see][]{athanasopoulos2009hierarchical}. In the bottom-up approach, the forecasts in each level of the hierarchy can be computed by aggregating the bottom level series forecasts. However, we may not get good upper-level forecasts because the most disaggregated series can be noisy and so their forecasts are often inaccurate. In the middle-out approach, the process can be started from one of the middle levels and other forecasts can be computed using aggregation for upper levels and disaggregation for lower levels. Finally, optimal combination uses all the \(N\) forecasts for all of the series in the entire structure, and then uses an optimization process to reconcile the resulting forecasts. The advantage of the optimal combination method, compared with the other methods, is that it considers all information in the hierarchy, including any correlations among the series.

In the optimal combination method, reconciled forecasts can be computed using the following generalized least squares equation \autocite{mint2018}
\begin{equation}\label{eq:mint}
  \tilde{\bm{y}}_{t+h}=\bm{S}(\bm{S}'\bm{W}_{h}^{-1}\bm{S})^{-1}\bm{S}'\bm{W}_{h}^{-1}\hat{\bm{y}}_{t+h},
\end{equation}
where \(\hat{\bm{y}}_{t+h}\) represents a vector of \(h\)-step-ahead base forecasts for all levels of the hierarchy, and \(\bm{W}_{h}\) is the covariance matrix of forecast errors for the \(h\)-step-ahead base forecasts.

Several possible methods for estimating \(\bm{W}_{h}\) are available. \textcite{mint2018} discuss a simple ``structural scaling'' approximation whereby \(\bm{W}_{h} = k_{h} \bm{\Lambda}\) with \(k_{h}\) being a positive constant, \(\bm{\Lambda} = \text{diag}(\bm{S}\bm{1})\), and \(\bm{1}\) being a unit vector of dimension \(M\) (the number of bottom level series). Note that \(\bm{\Lambda}\) simply contains the row sums of the summing matrix \(\bm{S}\), and that \(k_{h}\) will cancel out in \eqref{eq:mint}.\footnote{For simplicity of exposition, we used the structural scaling (wls\_struct) summing matrix for reconciliation in all the results. In \protect\hyperlink{appendixA}{Appendix A}, we compare two alternatives: Tables \ref{tab:Tourismdatadifrecrolling} and \ref{tab:Tourismdatadifrecfix} display the results for estimates of \(\bm{W}_h\) using a shrinkage estimator (\texttt{mint\_shrink}) and variance scaling (\texttt{wls\_var}) for the Australian tourism example.} Thus
\begin{equation}\label{eq:mint2}
  \tilde{\bm{y}}_{t+h}=\bm{S}(\bm{S}'\bm{\Lambda}^{-1}\bm{S})^{-1}\bm{S}'\bm{\Lambda}^{-1}\hat{\bm{y}}_{t+h}.
\end{equation}

The most computationally challenging part of the optimal combination method is to produce all the base forecasts that make up \(\hat{\bm{y}}_{t+h}\). In many applications, there may be thousands or even millions of individual series, and each of them must be forecast independently. The most popular time series forecasting methods such as ETS and ARIMA models \autocite{fpp2} involve non-linear optimization routines to estimate the parameters via maximum likelihood estimation. Usually, multiple models are fitted for each series, and the best is selected by minimizing Akaike's Information Citerion \autocite{akaike1998information}. This computational challenges increases with the number of lower level series as well as in the number of aggregations of interest.

We therefore propose a new approach to compute the base forecasts that is both computationally fast while maintaining an acceptable forecasting accuracy level. Our proposed approach avoids the computational challenge of using ETS or ARIMA that require numerical optimization for each series. It is very flexible in terms of incorporating external data, handling missing values, and model selection. And finally, our approach handles the forecasting and reconciliation in a single step.

\hypertarget{proposed-approach-linear-model}{%
\section{\texorpdfstring{Proposed approach: Linear model \label{sec:proposedapproach1}}{Proposed approach: Linear model }}\label{proposed-approach-linear-model}}

Our proposed approach is based on using linear regression models for computing base forecasts. Suppose we have a linear model that we use for forecasting, and we wish to apply it to \(N\) different series which have some aggregation constraints. We have observations \(y_{t,i}\) from times \(t=1,\dots,T\) and series \(i=1,\dots,N\). Then

\begin{equation}
\label{eq:basicequation}
  y_{t,i} = \bm{\beta}_{i}' \bm{x}_{t,i} + \varepsilon_{t,i}
\end{equation}

where \(\bm{x}_{t,i}= (1, x_{t,1,i},\dots,x_{t,p,i})\) is a \((p+1)\)-vector of regression variables, \({\bm{\beta}}_i = (\beta_{0,i}, \beta_{1,i}, \beta_{2,i}, \dots, \beta_{p,i})\) is a \((p+1)\)-vector of coefficients and \({\varepsilon}_{t,i}\) is the error\footnote{If different predictors are chosen for different series, we can still choose a constant \(p\) that includes all the predictors, and then set the relevant coefficients in the \(X\) matrix to \(0\).}. This equation for all the observations in matrix form can be written as follows:
\begin{align}\label{eq:linearmodel}
  \begin{pmatrix}
  \bm{y}_1\\
  \bm{y}_2\\
  \bm{y}_3 \\
  \vdots\\
  \bm{y}_N
  \end{pmatrix}&=
  \begin{pmatrix}
  \bm{X}_1 & 0        & 0        & \dots  & 0\\
  0        & \bm{X}_2 & 0        & \dots  & 0\\
  0        & 0        & \bm{X}_3 & \ddots & \vdots \\
  \vdots   & \vdots   & \ddots   & \ddots & 0\\
  0        & 0        & \dots    & 0      & \bm{X}_N
  \end{pmatrix}
  \begin{pmatrix}
  \bm{\beta}_1\\
  \bm{\beta}_2\\
  \bm{\beta}_3\\
  \vdots\\
  \bm{\beta}_N
  \end{pmatrix}+
  \begin{pmatrix}
  \bm{\varepsilon}_1\\
  \bm{\varepsilon}_2\\
  \bm{\varepsilon}_3\\
  \vdots \\
  \bm{\varepsilon}_N
  \end{pmatrix}
  \\[2ex]
  \bm{Y} &= \bm{BX} + \bm{E},
\end{align}
where \(\bm{y}_i = (y_{1,i}, y_{2,i}, \dots, y_{T,i})\) is a \(T\)-vector, \({\bm{\beta}}_i = (\beta_{0,i}, \beta_{1,i}, \beta_{2,i}, \dots, \beta_{p,i})\) is a \((p+1)\)-vector, \({\bm{\varepsilon}}_i = (\varepsilon_{1,i}, \varepsilon_{2,i}, \dots, \varepsilon_{T,i})\) is a \(T\)-vector and \(\bm{X}_i\) is the \(T\times (p+1)\)-matrix
\begin{equation}\label{eq:Xmatrixdefinition}
  \bm{X}_i = \begin{pmatrix}
  1 & x_{1,i,1} & x_{1,i,2} & \dots & x_{1,i,p}\\
  1 & x_{2,i,1} & x_{2,i,2} & \dots & x_{2,i,p}\\
  \vdots & \vdots & \vdots & & \vdots \\
  1 & x_{T,i,1} & x_{T,i,2} & \dots & x_{T,i,p}
\end{pmatrix}.
\end{equation}

Equation \eqref{eq:linearmodel} can be written as \(\bm{Y} = \bm{X} \bm{B} + \bm{E}\), with parameter estimates given by \(\hat{\bm{B}} = (\bm{X}'\bm{X})^{-1} \bm{X}'\bm{Y}\). Then the base forecasts are obtained using
\begin{equation}\label{eq:baseforecasts}
  \hat{\bm{y}}_{t+h} = \bm{X}_{t+h}^* \hat{\bm{B}},
\end{equation}
where \(\hat{\bm{y}}_{t+h}\) is an \(N\)-vector of forecasts, \(\hat{\bm{B}}\) comprises \(N\) stacked \((p+1)\)-vectors of estimated coefficients, and \(\bm{X}_{t+h}^*\) is the \(N\times N(p+1)\) matrix
\pagebreak[3]\begin{equation}
  \bm{X}_{t+h}^* =
  \begin{pmatrix}
  \bm{x}_{t+h,1}' & 0               & 0               & \dots  & 0\\
  0               & \bm{x}_{t+h,2}' & 0               & \dots  & 0\\
  0               & 0               & \bm{x}_{t+h,3}' & \ddots & \vdots \\
  \vdots          & \vdots          & \ddots          & \ddots & 0\\
  0               & 0               & \dots           & 0      & \bm{x}_{t+h,N}'
  \end{pmatrix}.
\end{equation}
Note that we use \(\bm{X}^*_{t}\) to distinguish this matrix, which combines \(\bm{x}_{t,i}\) across all series for one time from \(\bm{X}_i\) which combines \(\bm{x}_{t,i}\) across all time for one series.

Finally, we can combine the two linear equations for computing base forecasts and reconciled forecasts (Equations \eqref{eq:mint2} and \eqref{eq:baseforecasts}) to obtain the reconciled forecasts with a single equation:
\begin{equation}\label{eq:singlestep}
    \tilde{\bm{y}}_{t+h} = \bm{S}(\bm{S}'\bm{\Lambda}\bm{S})^{-1}\bm{S}'\bm{\Lambda}
                            (\bm{X}_{t+h}^* \hat{\bm{B}})
                         = \bm{S}(\bm{S}'\bm{\Lambda}\bm{S})^{-1}\bm{S}'\bm{\Lambda}
                            \bm{X}_{t+h}^* (\bm{X}'\bm{X})^{-1} \bm{X}'\bm{Y}.
\end{equation}

\hypertarget{simplified-formulation-for-a-fixed-set-of-predictors-bf-x}{%
\subsection{\texorpdfstring{Simplified formulation for a fixed set of predictors (\(\bf {X}\)) \label{sec:proposedapproach2}}{Simplified formulation for a fixed set of predictors (\textbackslash bf \{X\}) }}\label{simplified-formulation-for-a-fixed-set-of-predictors-bf-x}}

If we have the same set of predictor variables, \(\bm{X}\), for all the series, we can write Equations \eqref{eq:linearmodel} to \eqref{eq:singlestep} more easily using multivariate regression equations, and we can obtain all the reconciled forecasts for all the series in one equation. In that case, Equation \eqref{eq:linearmodel} can be rearranged as follows:
\begin{equation}\label{eq:linearmodelsameX}
  \begin{pmatrix}
  y_{11} & \dots & y_{1N}\\
  y_{21} & \dots & y_{2N}\\
  \vdots &       & \vdots\\
  y_{T1} & \dots & y_{TN}
  \end{pmatrix} =
  \begin{pmatrix}
  1      & X_{11} & \dots & X_{1p}\\
  1      & X_{21} & \dots & X_{2p}\\
  \vdots & \vdots &       & \vdots\\
  1      & X_{T1} & \dots & X_{Tp}
  \end{pmatrix}
  \begin{pmatrix}
  \beta_{01} & \dots & \beta_{0N}\\
  \beta_{11} & \dots & \beta_{1N}\\
  \vdots     &       & \vdots\\
  \beta_{p1} & \dots & \beta_{pN}
  \end{pmatrix} \\
  +
  \begin{pmatrix}
  \varepsilon_{11} & \dots & \varepsilon_{1N}\\
  \varepsilon_{21} & \dots & \varepsilon_{2N}\\
  \vdots           &       & \vdots\\
  \varepsilon_{T1} & \dots & \varepsilon_{TN}
  \end{pmatrix},
\end{equation}
where \(\bm{Y}\), \(\bm{X}\), \(\bm{B}\) and \(\bm{E}\) are now matrices of size \(T\times N\), \(T\times (p+1)\), \((p+1)\times N\) and \(T \times N\), respectively. Equations \eqref{eq:baseforecasts} to \eqref{eq:singlestep} can be written accordingly using Equation \eqref{eq:linearmodelsameX} and here \(\bm{X}^*_{t+h,i} = \bm{X}^*_{t+h}\), where \(\bm{X}^*_{t+h}\) is an \(h\times (p+1)\) matrix. This simpler formulation translates into a computational advantage, as the \(\bm X\) matrix is smaller, thereby easing matrix multiplication operations.

\hypertarget{ols-predictors}{%
\subsection{OLS predictors}\label{ols-predictors}}

As an example of the \(\bm{X}_t\) matrix in Equation \eqref{eq:linearmodel}, we can refer to the set of predictors proposed in \textcite{ashouri2018} for modeling trend, seasonality and autocorrelation by using lagged values (\(y_{t-1}\), \(y_{t-2}\), \dots), trend variables and seasonal dummy variables:
\begin{equation}\label{eq:linearmodelexample}
   y_t = \alpha_0 + \alpha_1 t + \beta_1 s_{1,t} + \cdots + \beta_{m-1} s_{m-1,t} + \gamma_1 y_{t-1} + \cdots + \gamma_p y_{t-p} + \delta z_t + \varepsilon_t.
\end{equation}
Here, \(s_{j,t}\) is a dummy variable taking value 1 if time \(t\) is in season \(j\) (\(j=1, 2, \dots, m\)), \(y_{t-k}\) is the \(k\)th lagged value for \(y_t\) and \(z_t\) is some external information at time \(t\). The seasonal period \(m\) depends on the problem; for instance, if we have daily data with day-of-week seasonality, then \(m=7\).

When a single OLS model is fitted to a collection of time series (e.g.~several bottom-level series), then trend and seasonality predictors are the same for all series and we can use the simpler multivariate regression models in Equation \eqref{eq:linearmodelsameX}. However, this formulation is inappropriate when including lags, which differ for each series, and/or series-specific external series, in which case we use the formulation in Equation \eqref{eq:linearmodel}.

When there are many options for choosing predictors, such as many seasonal dummy variables, lags, or high order trend terms, we can consider applying a model selection approach such as Akaike's Information Criterion or leave-one-out cross-validation (LOOCV) to select the best set of predictors in terms of prediction. In practice, LOOCV can be computationally heavy except in the special case of linear models \autocite{christensenplane} and therefore using linear models provide a viable solution. Also, when the number of seasons \(m\) is large (e.g.~in hourly data), Fourier terms can result in fewer predictors than dummy variables. The number of Fourier terms can also be determined using the same AIC or LOOCV approach \autocite{fpp2}.

\hypertarget{computational-considerations}{%
\subsection{\texorpdfstring{Computational considerations \label{sec:computationalconsiderations}}{Computational considerations }}\label{computational-considerations}}

There are two ways for computing the above forecasts. First, we could create the matrices \(\bm{Y}\), \(\bm{X}\) and \(\bm{E}\), and then directly use the above equations (taking advantage of sparse matrix routines) to obtain the forecasts. Alternatively, we could use separate regression models to compute the coefficients for each linear model individually. Although the matrix, \(\bm{X}'\bm{X}\), which we need to invert is sparse and block diagonal, it is still faster to use the second approach involving separate regression models.

\hypertarget{prediction-intervals}{%
\subsection{Prediction intervals}\label{prediction-intervals}}

For obtaining prediction intervals, we need to compute the variance of reconciled forecasts as follows \autocite{mint2018}:
\begin{equation}\label{eq:variance}
    \text{Var}(\tilde{\bm{y}}_{t+h})
        = \bm{S}\bm{G}{\bm{\Sigma}_{t+h}} \bm{G}'\bm{S}',
\end{equation}
where \(\bm{G} = (\bm{S}'\bm{\Lambda}\bm{S})^{-1}\bm{S}'\bm{\Lambda}\) and \({\bm{\Sigma}_{t+h}}\) denotes the variance of the base forecasts given by the usual linear model formula \autocite{fpp2}
\begin{equation}\label{eq:recvariance}
  \bm{\Sigma}_{t+h} = \sigma^2\left[1 + \bm{X}_{t+h}^*(\bm{X}'\bm{X})^{-1}(\bm{X}_{t+h}^*)'\right].
\end{equation}
where \(\sigma^2\) is the variance of the base model residuals. Assuming normally distributed errors, we can easily obtain any required prediction intervals corresponding to elements of \(\tilde{\bm{y}}_{t+h}\) using the diagonals of \eqref{eq:variance}.

\hypertarget{applications}{%
\section{Applications}\label{applications}}

In this section we illustrate our approach using a real dataset and a simulated dataset\footnote{All methods were run on a Linux server with Intel Xeon Silver 4108 (1.80GHz / 8-Cores / 11MB Cache)*2 and 8GB DDR4 2666 DIMM ECC Registered Memory. R version 3.6.1. All the displayed computation times are only for the reconciled point forecasts.}. The real data study includes forecasting monthly Australian domestic tourism. This dataset contains 304 series with both hierarchical and grouped structure with strong seasonality. In the simulation studies, we simulate series based on the monthly Australian domestic tourism data and systematically modify the forecasting horizon, noise level, hierarchy levels, and number of series. In \protect\hyperlink{appendixB}{Appendix B} , we use another real dataset involving daily Wikipedia pageviews. In all these cases, we compare the forecasting accuracy of ETS, ARIMA\footnote{For running ETS and ARIMA, we applied \texttt{ETS} and \texttt{ARIMA} functions from the \texttt{fable} package \autocite{o2019fable}. The two sets of functions were run independently and not immediately one after the other.} and the proposed linear OLS forecasting model, with and without the reconciliation step. In these applications, we used the weighted reconciliation approach from Equation \eqref{eq:mint2}. For comparing these methods, we use the average of Root Mean Square Errors (RMSEs) across all series and also display box plots for forecast errors along with the raw forecast errors. To aid visibility, we suppress plotting the outliers.

We apply two methods for generating forecasts that align with two different practical forecasting scenarios. The first approach is \emph{rolling origin} forecasting, where we generate one-step-ahead forecasts (\(\tilde{\bm{y}}_{t+1}\) where \(t\) changes). This mimics the scenario where data are refreshed every time period. In the second \emph{fixed origin} method, forecasts are generated at a fixed time \(t\) for \(h\) steps ahead: \(\tilde{\bm{y}}_{t+1}, \tilde{\bm{y}}_{t+2},\dots, \tilde{\bm{y}}_{t+h}\) (we replace lagged values of \(y\) by their forecasts if they occur at periods after the forecast origin; See algorithms \ref{alg:algorithmrolling} and \ref{alg:algorithmfixed} for more details of the rolling and fixed origin approaches).

\begin{algorithm}[h]
  \begin{algorithmic}[1]
  \caption{Hierarchical and grouped time series rolling origin OLS forecast reconciliation}\label{alg:algorithmrolling}

  \STATE $data \gets matrix(y_{1:T, 1:N}, x_{1:T, 1:N, 1:p})$
    \FOR{$i \in \{1,\dots,N\}$}
      \FOR{$k \in \{1,\dots,h\}$}
       \STATE $data \gets data_{(1:(T-h) +(k-1)),i}$
       \STATE $newdata \gets data_{((T-h)+k),i}$
       \STATE $fit \gets lm(y_{t,i} \sim x_{t,i,1} + x_{t,i,2} + \dots + x_{t,i,p}, data = data)$
       \STATE $\hat{y}_{t+k,i} \gets predict(fit, newdata = newdata)$
      \ENDFOR
    \ENDFOR
    \STATE ${\bm{S}} \gets summing~matrix(groups_{data})$
    \STATE $\Lambda \gets diag(\bm{S1})$
    \STATE $adjust \gets \bm{S}(\bm{S'}\Lambda\bm{S})^{-1}\bm{S}\Lambda$
    \FOR{$i \in {1, \dots,h}$}
    \STATE $\tilde{y}_{(T-h)+i,1:N} \gets adjust \times \hat{y}_{(T-h)+i, 1:N}$
    \ENDFOR
    \STATE \bf{return} $\tilde{y}_{((T-h):T,1:N}$
  \end{algorithmic}
\end{algorithm}

\begin{algorithm}[h]
  \begin{algorithmic}[1]
  \caption{Hierarchical and grouped time series fixed origin OLS forecast reconciliation}\label{alg:algorithmfixed}
  \STATE $data \gets matrix(y_{1:T, 1:N}, x_{1:T, 1:N, 1:p})$
    \FOR{$i \in \{1,\dots,N\}$}
    \STATE $data \gets data_{(1:(T-h),i}$
    \STATE $fit \gets lm(y_{t,i} \sim x_{t,i,1} + x_{t,i,2} + \dots + x_{t,i,p}, data = data)$
    \STATE $newdata \gets data_{((T-h)+1),i}$ \FOR{$k \in \{1,\dots,h\}$}
        \STATE $\hat{y}_{t+k,i} \gets predict(fit, newdata = newdata)$
        \STATE $newdata \gets \hat {data}_{((T-h)+k),i}$
      \ENDFOR
    \ENDFOR
    \STATE ${\bm{S}} \gets summing~matrix(groups_{data})$
    \STATE $\Lambda \gets diag(\bm{S1})$
    \STATE $adjust \gets \bm{S}(\bm{S'}\Lambda\bm{S})^{-1}\bm{S}\Lambda$
    \FOR{$i \in {1, \dots,h}$}
    \STATE $\tilde{y}_{(T-h)+i,1:N} \gets adjust \times \hat{y}_{(T-h)+i, 1:N}$
    \ENDFOR
    \STATE \bf{return} $\tilde{y}_{((T-h):T,1:N}$
  \end{algorithmic}
\end{algorithm}

\hypertarget{australian-domestic-tourism}{%
\subsection{Australian domestic tourism}\label{australian-domestic-tourism}}

This dataset has 19 years of monthly visitor nights in Australia by Australian tourists, a measure used as an indicator of tourism activity \autocite{mint2018}. The data were collected by computer-assisted telephone interviews with 120,000 Australians aged 15 and over \autocite{researchAustralia2005}. The dataset includes 304 time series each of length 228 observations. The hierarchy and grouping structure for this dataset is made using geographic and purpose of travel information.

\begingroup\fontsize{9}{11}\selectfont

\begin{longtable}[t]{rllrll}
\caption{\label{tab:Australiageographicaldivision}Australian geographic hierarchical structure.}\\
\toprule
Series & Name & Label & Series & Name & Label\\
\midrule
Total &  &  & Region &  & \\
1 & Australia & Total & 55 & Lakes & BCA\\
State &  &  & 56 & Gippsland & BCB\\
2 & NSW & A & 57 & Phillip Island & BCC\\
3 & VIC & B & 58 & General Murray & BDA\\
4 & QLD & C & 59 & Goulburn & BDB\\
5 & SA & D & 60 & High Country & BDC\\
6 & WA & E & 61 & Melbourne East & BDD\\
7 & TAS & F & 62 & Upper Yarra & BDE\\
8 & NT & G & 63 & Murray East & BDF\\
Zone &  &  & 64 & Wimmera+Mallee & BEA\\
9 & Metro NSW & AA & 65 & Western Grampians & BEB\\
10 & Nth Coast NSW & AB & 66 & Bendigo Loddon & BEC\\
11 & Sth Coast NSW & AC & 67 & Macedon & BED\\
12 & Sth NSW & AD & 68 & Spa Country & BEE\\
13 & Nth NSW & AE & 69 & Ballarat & BEF\\
14 & ACT & AF & 70 & Central Highlands & BEG\\
15 & Metro VIC & BA & 71 & Gold Coast & CAA\\
16 & West Coast VIC & BB & 72 & Brisbane & CAB\\
17 & East Coast VIC & BC & 73 & Sunshine Coast & CAC\\
18 & Nth East VIC & BD & 74 & Central Queensland & CBA\\
19 & Nth West VIC & BE & 75 & Bundaberg & CBB\\
20 & Metro QLD & CA & 76 & Fraser Coast & CBC\\
21 & Central Coast QLD & CB & 77 & Mackay & CBD\\
22 & Nth Coast QLD & CC & 78 & Whitsundays & CCA\\
23 & Inland QLD & CD & 79 & Northern & CCB\\
24 & Metro SA & DA & 80 & Tropical North Queensland & CCC\\
25 & Sth Coast SA & DB & 81 & Darling Downs & CDA\\
26 & Inland SA & DC & 82 & Outback & CDB\\
27 & West Coast SA & DD & 83 & Adelaide & DAA\\
28 & West Coast WA & EA & 84 & Barossa & DAB\\
29 & Nth WA & EB & 85 & Adelaide Hills & DAC\\
30 & Sth WA & EC & 86 & Limestone Coast & DBA\\
31 & Sth TAS & FA & 87 & Fleurieu Peninsula & DBB\\
32 & Nth East TAS & FB & 88 & Kangaroo Island & DBC\\
33 & Nth West TAS & FC & 89 & Murraylands & DCA\\
34 & Nth Coast NT & GA & 90 & Riverland & DCB\\
35 & Central NT & GB & 91 & Clare Valley & DCC\\
Region &  &  & 92 & Flinders Range and Outback & DCD\\
36 & Sydney & AAA & 93 & Eyre Peninsula & DDA\\
37 & Central Coast & AAB & 94 & Yorke Peninsula & DDB\\
38 & Hunter & ABA & 95 & Australia's Coral Coast & EAA\\
39 & North Coast NSW & ABB & 96 & Experience Perth & EAB\\
40 & Northern Rivers Tropical NSW & ABC & 97 & Australia's SouthWest & EAC\\
41 & South Coast & ACA & 98 & Australia's North West & EBA\\
42 & Snowy Mountains & ADA & 99 & Australia's Golden Outback & ECA\\
43 & Capital Country & ADB & 100 & Hobart and the South & FAA\\
44 & The Murray & ADC & 101 & East Coast & FBA\\
45 & Riverina & ADD & 102 & Launceston, Tamar and the North & FBB\\
46 & Central NSW & AEA & 103 & North West & FCA\\
47 & New England North West & AEB & 104 & Wilderness West & FCB\\
48 & Outback NSW & AEC & 105 & Darwin & GAA\\
49 & Blue Mountains & AED & 106 & Kakadu Arnhem & GAB\\
50 & Canberra & AFA & 107 & Katherine Daly & GAC\\
51 & Melbourne & BAA & 108 & Barkly & GBA\\
52 & Peninsula & BAB & 109 & Lasseter & GBB\\
53 & Geelong & BAC & 110 & Alice Springs & GBC\\
54 & Western & BBA & 111 & MacDonnell & GBD\\
\bottomrule
\end{longtable}
\endgroup{}

\begin{figure}

{\centering \includegraphics[width=450px,height=150px]{Paper-Figures/Australian_hierarchy_structure} 

}

\caption{Australian geographic hierarchical structure.}\label{fig:Australiahierarchystructure}
\end{figure}

\begin{figure}

{\centering \includegraphics[width=450px,height=360px]{Paper-Figures/ausTurRegionsBW} 

}

\caption{Australia map, showing the eight states.}\label{fig:Australiahierarchystructuremap}
\end{figure}

In this dataset we have three levels of geographic divisions in Australia. In the first level, Australia is divided into seven ``States'' including New South Wales (NSW), Victoria (VIC), Queensland (QLD), South Australia (SA), Western Australia (WA), Tasmania (TAS) and Northern Territory (NT). In the second and third levels, it is divided into 27 ``Zones'' and 76 ``Regions'' (for details about Australia geographic divisions see Figure \ref{fig:Australiahierarchystructure} and Table \ref{tab:Australiageographicaldivision} and also Figure \ref{fig:Australiahierarchystructuremap} which shows Australia map divided by tourism region and colored by states\footnote{Reproduced from: \url{www.tra.gov.au/tra/2016/Tourism_Region_Profiles/Region_profiles/index.html}}).

We have four purposes of travel: Holiday (Hol), Visiting friends and relatives (Vis), Business (Bus) and Other (Oth). So there are \(76\times4 = 304\) series at the most disaggregate level. Based on the geographic hierarchy and purpose grouping, we end up with 8 aggregation levels with 555 series in total as shown in Table \ref{tab:Australiageographicalpurposedivision}.

\begin{table}[!h]

\caption{\label{tab:Australiageographicalpurposedivision}Number of Australian domestic tourism series at each aggregation level.}
\centering
\begin{tabular}[t]{lr}
\toprule
Division & Series\\
\midrule
Australia & 1\\
State & 7\\
Zone & 27\\
Region & 76\\
Purpose & 4\\
State x Purpose & 28\\
Zone x Purpose & 108\\
Region x Purpose & 304\\
\hline
Total & 555\\
\bottomrule
\end{tabular}
\end{table}

We report the forecast results for all these aggregation levels, as well as the average RMSE across all the levels of the hierarchy. We used the same predictors in the OLS predictor
matrix for the rolling and fixed origin approaches. For the rolling and fixed origin model, we include a quadratic trend, 11 dummy variables, and lags 1 and 12. This is intended to capture the monthly seasonality. In addition, before running the model, we partition the data into training and test sets, with the last 24 months (2 years) as our test set, and the rest as our training set.

\begin{table}[!h]

\caption{\label{tab:Tourismdataresulrolling}Mean(RMSE) on 24 months test set for ETS, ARIMA and OLS with and without reconciliation - Rolling origin.}
\centering
\begin{tabular}[t]{lrrrrrr}
\toprule
\multicolumn{1}{c}{} & \multicolumn{3}{c}{Unreconciled} & \multicolumn{3}{c}{Reconciled} \\
\cmidrule(l{3pt}r{3pt}){2-4} \cmidrule(l{3pt}r{3pt}){5-7}
Level & ETS & ARIMA & OLS & ETS & ARIMA & OLS\\
\midrule
Total & 1516 & 1504 & 1634 & 1733 & 1840 & 1864\\
State & 511 & 501 & 498 & 497 & 482 & 509\\
Zone & 215 & 217 & 213 & 211 & 210 & 213\\
Region & 123 & 125 & 117 & 118 & 120 & 117\\
Purpose & 676 & 674 & 682 & 673 & 713 & 713\\
State x Purpose & 213 & 217 & 213 & 208 & 209 & 213\\
Zone x Purpose & 98 & 103 & 98 & 96 & 99 & 97\\
Region x Purpose & 56 & 58 & 56 & 56 & 57 & 56\\
\bottomrule
\end{tabular}
\end{table}

\begin{table}

\caption{\label{tab:TourismdataresultRMSE}Mean(RMSE) on 24 months test set for ETS, ARIMA and OLS with and without reconciliation - Fixed origin.}
\centering
\begin{tabular}[t]{lrrrrrr}
\toprule
\multicolumn{1}{c}{} & \multicolumn{3}{c}{Unreconciled} & \multicolumn{3}{c}{Reconciled} \\
\cmidrule(l{3pt}r{3pt}){2-4} \cmidrule(l{3pt}r{3pt}){5-7}
Level & ETS & ARIMA & OLS & ETS & ARIMA & OLS\\
\midrule
Total & 2239 & 3433 & 2529 & 2492 & 2744 & 2819\\
State & 594 & 610 & 597 & 573 & 583 & 612\\
Zone & 240 & 230 & 243 & 237 & 234 & 243\\
Region & 133 & 132 & 127 & 127 & 127 & 126\\
Purpose & 767 & 829 & 876 & 822 & 889 & 921\\
State x Purpose & 227 & 233 & 237 & 222 & 226 & 236\\
Zone x Purpose & 103 & 106 & 105 & 102 & 102 & 104\\
Region x Purpose & 59 & 59 & 59 & 58 & 58 & 58\\
\bottomrule
\end{tabular}
\end{table}

\begin{figure}

{\centering \includegraphics[width=1\linewidth]{lhf_files/figure-latex/boxplotrollingtourism-1} 

}

\caption{Box plots of rolling origin forecast errors from reconciled and unreconciled ETS, ARIMA and OLS methods at each hierarchical level.}\label{fig:boxplotrollingtourism}
\end{figure}

\begin{figure}

{\centering \includegraphics[width=1\linewidth]{lhf_files/figure-latex/boxplottourism-1} 

}

\caption{Box plots of fixed origin forecast errors for reconciled and unreconciled ETS, ARIMA and OLS methods at each hierarchical level.}\label{fig:boxplottourism}
\end{figure}

In Figures \ref{fig:boxplotrollingtourism} and \ref{fig:boxplottourism} we display the error box plots for both reconciled and unreconciled forecasts using all three methods, for the rolling origin and fixed origin forecasts. In these figures we see the error distributions across all the models.

Together with Tables \ref{tab:Tourismdataresulrolling} and \ref{tab:TourismdataresultRMSE}, results show that our proposed OLS forecasting model produces forecast accuracy similar to ETS and ARIMA, which are computationally heavy for many time series (see Table \ref{tab:Tourismdatacomputationtime}). We also see the usefulness of the reconciliation in decreasing the average RMSE in all three methods. Except for the total series, reconciliation improves forecasts in all the hierarchy levels. Also, because the higher level series have higher counts, the errors are larger in magnitude (\protect\hyperlink{appendixA}{Appendix A} shows the box plots with scaled errors\footnote{Scaled errors are computed by subtracting the mean and dividing by the standard deviation.}, to better compare errors across all the hierarchy levels). In addition, we see that (as expected) by applying rolling origin 1-step-ahead forecasts, the error densities are closer and more tightly distributed around zero than the fixed origin multi-step-ahead forecasts.

Figures \ref{fig:forecstrolling24tourismtotal} and \ref{fig:forecstrolling24tourism} show the rolling and fixed origin forecast results for the total series and one of the bottom level series, AAAVis (Sydney - Business). In these plots we have both reconciled (dashed lines) and unreconciled (dotted lines) forecasts and we see that the reconciliation step improves the forecasts in this series. We also see that the OLS model forecast accuracy is similar to the other two methods.

\begin{figure}

{\centering \includegraphics[width=1\linewidth]{lhf_files/figure-latex/forecstrolling24tourismtotal-1} 

}

\caption{Comparing ETS, ARIMA and OLS forecasts (reconciled and unreconciled) for 'Total' series. (Top: rolling origin, Bottom: fixed origin).}\label{fig:forecstrolling24tourismtotal}
\end{figure}

\begin{figure}

{\centering \includegraphics[width=1\linewidth]{lhf_files/figure-latex/forecstrolling24tourism-1} 

}

\caption{Comparing ETS, ARIMA and OLS forecasts (reconciled and unreconciled) for 'AAAVis' bottom-level series. (Top: rolling origin, Bottom: fixed origin).}\label{fig:forecstrolling24tourism}
\end{figure}

Figures \ref{fig:predinttotal} and \ref{fig:predintAAAVis} display the prediction interval for the OLS approach, with and without reconciliation forecasts for the total series and one of the bottom level series, AAAVis (Sydney - Visiting).

\begin{figure}

{\centering \includegraphics[width=1\linewidth]{lhf_files/figure-latex/predinttotal-1} 

}

\caption{Comparing ETS, ARIMA and OLS reconciled forecasts and prediction intervals for 'Total' series. (Top: rolling origin, Bottom: fixed origin).}\label{fig:predinttotal}
\end{figure}

\begin{figure}

{\centering \includegraphics[width=1\linewidth]{lhf_files/figure-latex/predintAAAVis-1} 

}

\caption{Comparing ETS, ARIMA and OLS reconciled forecasts and prediction intervals for 'AAAVis' bottom-level series. (Top: rolling origin, Bottom: fixed origin).}\label{fig:predintAAAVis}
\end{figure}

Table \ref{tab:Tourismdatacomputationtime} compares the computation time of the three methods for rolling and fixed origin forecasting. We see that the OLS forecasting model is much faster compared to the other methods. Also, since reconciliation is a linear process, in all methods it is very fast and does not affect computation time significantly.

Since we are using a linear model, we can easily include exogenous variables which can often be helpful in improving forecast accuracy. In this application, we tried including an ``Easter'' dummy variable indicating the timing of Easter, but its affect on forecast accuracy was minimal, so it was omitted in the model reported here.

Finally, Table \ref{tab:Tourismdatacomputationtimeappendix} shows that, as mentioned in Section \ref{sec:computationalconsiderations}, computation is faster using separate regression models compared to the matrix approach (even using sparse matrix algebra).

\begin{table}

\caption{\label{tab:Tourismdatacomputationtime}Computation time (seconds) for ETS, ARIMA and OLS with reconciliation, for  Rolling and fixed origin forecasts, on a 24 months test set.}
\centering
\begin{tabular}[t]{lrr}
\toprule
 & Rolling origin & Fixed origin\\
\midrule
ETS & 14648 & 618\\
ARIMA & 30346 & 1085\\
OLS & 48 & 18\\
\bottomrule
\end{tabular}
\end{table}

\begin{table}

\caption{\label{tab:Tourismdatacomputationtimeappendix}Computation time (seconds) for OLS using the matrix approach and separate regression models, with reconciliation, for rolling and fixed origin, on a 12 months test set.}
\centering
\begin{tabular}[t]{lrr}
\toprule
 & Rolling origin & Fixed origin\\
\midrule
Matrix approach & 210 & 106\\
Separate models & 48 & 18\\
\bottomrule
\end{tabular}
\end{table}

\hypertarget{australian-domestic-tourism-simulation-study}{%
\subsection{Australian domestic tourism simulation study}\label{australian-domestic-tourism-simulation-study}}

We provide results from two simulation studies based on the Australian domestic tourism dataset, to evaluate the sensitivity of our results to several factors. In the first study, we simulate bottom-level series similar to the real bottom-level series of the tourism data, with the same number of series and the same length. We then generate forecasts for four forecast horizons (12, 24, 36 and 48 months) with four different noise levels (standard deviation=0.01, 0.1, 0.5 and 1)\footnote{Since the level of the series are different, we first scale the simulated series (subtracting by mean and dividing by standard deviation), add the white noise series and then we rescale the series.}.

Tables \ref{tab:TourismdatasimrollingnoiseFH} and \ref{tab:TourismdatasimfixnoiseFH} display the average of the RMSEs for 12 to 48 month-ahead forecasts with different noise levels. Results are shown for the base and the reconciled forecasts for both rolling and fixed origin approaches. The results show that, as expected, by increasing the forecast horizon and/or noise level, the average RMSE increases in all the three methods. Also, the proposed OLS approach shows similar results compared with ETS and ARIMA. It should be noted that for both rolling and fixed origin forecasts in the OLS approach we use the same set of predictors as the Australian domestic tourism example.

\begin{table}[!h]

\caption{\label{tab:TourismdatasimrollingnoiseFH}Mean RMSE of rolling origin forecasts for simulated data (304 bottom-level series and 8 levels of hierarchy), for different error levels,  methods (ETS, ARIMA, OLS), forecast horizons, with/without reconciliation.}
\centering
\begin{tabular}[t]{lrrrrl}
\toprule
Reconciliation & Error & Forecast horizon & ETS & ARIMA & OLS\\
\midrule
rec & 0.01 & 12 & 123.8 & 119.8 & 130.4\\
rec & 0.01 & 24 & 122.6 & 119.4 & 128.8\\
rec & 0.01 & 36 & 124.6 & 122.3 & 131.8\\
rec & 0.01 & 48 & 122.1 & 120.4 & 129.0\\
rec & 0.10 & 12 & 123.7 & 120.1 & 129.7\\
rec & 0.10 & 24 & 122.9 & 120.3 & 129.2\\
rec & 0.10 & 36 & 125.8 & 123.7 & 133.1\\
rec & 0.10 & 48 & 123.5 & 122.1 & 130.5\\
rec & 0.50 & 12 & 143.9 & 143.2 & 146.9\\
rec & 0.50 & 24 & 149.9 & 146.3 & 153.2\\
rec & 0.50 & 36 & 155.8 & 151.5 & 160.0\\
rec & 0.50 & 48 & 154.5 & 150.8 & 159.7\\
rec & 1.00 & 12 & 192.9 & 198.1 & 193.9\\
rec & 1.00 & 24 & 207.1 & 209.0 & 209.3\\
rec & 1.00 & 36 & 215.7 & 215.4 & 218.0\\
rec & 1.00 & 48 & 218.4 & 217.5 & 220.9\\
unrec & 0.01 & 12 & 132.2 & 125.8 & 132.9\\
unrec & 0.01 & 24 & 128.5 & 125.4 & 131.0\\
unrec & 0.01 & 36 & 130.2 & 128.5 & 134.0\\
unrec & 0.01 & 48 & 127.8 & 126.6 & 131.7\\
unrec & 0.10 & 12 & 132.0 & 126.2 & 132.2\\
unrec & 0.10 & 24 & 129.5 & 126.0 & 131.4\\
unrec & 0.10 & 36 & 131.9 & 129.9 & 135.1\\
unrec & 0.10 & 48 & 129.5 & 128.3 & 133.0\\
unrec & 0.50 & 12 & 149.5 & 149.7 & 148.2\\
unrec & 0.50 & 24 & 156.5 & 153.7 & 154.5\\
unrec & 0.50 & 36 & 163.3 & 158.1 & 160.7\\
unrec & 0.50 & 48 & 161.9 & 157.7 & 160.7\\
unrec & 1.00 & 12 & 200.9 & 205.4 & 194.5\\
unrec & 1.00 & 24 & 214.9 & 216.3 & 210.1\\
unrec & 1.00 & 36 & 222.0 & 221.9 & 218.1\\
unrec & 1.00 & 48 & 223.6 & 223.5 & 221.2\\
\bottomrule
\end{tabular}
\end{table}

\begin{table}[!h]

\caption{\label{tab:TourismdatasimfixnoiseFH}Mean RMSE of fixed origin forecasts for simulated data (304 bottom-level series and 8 levels of hierarchy), for different error levels, methods (ETS, ARIMA, OLS), forecast horizons, with/without reconciliation.}
\centering
\begin{tabular}[t]{lrrrrr}
\toprule
Reconciliation & Error & Forecast horizon & ETS & ARIMA & OLS\\
\midrule
rec & 0.01 & 12 & 123.6 & 119.8 & 131.4\\
rec & 0.01 & 24 & 126.5 & 124.2 & 137.3\\
rec & 0.01 & 36 & 133.4 & 131.9 & 148.6\\
rec & 0.01 & 48 & 133.5 & 133.9 & 152.3\\
rec & 0.10 & 12 & 124.8 & 120.3 & 130.7\\
rec & 0.10 & 24 & 128.0 & 124.5 & 137.2\\
rec & 0.10 & 36 & 135.3 & 132.2 & 148.9\\
rec & 0.10 & 48 & 135.5 & 134.3 & 152.6\\
rec & 0.50 & 12 & 144.6 & 143.4 & 147.8\\
rec & 0.50 & 24 & 154.5 & 151.6 & 158.2\\
rec & 0.50 & 36 & 162.8 & 160.7 & 170.0\\
rec & 0.50 & 48 & 164.1 & 163.5 & 174.3\\
rec & 1.00 & 12 & 194.1 & 197.7 & 194.7\\
rec & 1.00 & 24 & 212.4 & 213.3 & 213.0\\
rec & 1.00 & 36 & 221.9 & 222.1 & 224.8\\
rec & 1.00 & 48 & 225.8 & 227.1 & 231.0\\
unrec & 0.01 & 12 & 133.1 & 125.3 & 133.3\\
unrec & 0.01 & 24 & 136.3 & 129.7 & 139.1\\
unrec & 0.01 & 36 & 143.4 & 138.9 & 150.2\\
unrec & 0.01 & 48 & 143.8 & 141.0 & 153.8\\
unrec & 0.10 & 12 & 133.4 & 126.2 & 132.5\\
unrec & 0.10 & 24 & 136.9 & 130.7 & 138.9\\
unrec & 0.10 & 36 & 144.4 & 140.0 & 150.4\\
unrec & 0.10 & 48 & 145.0 & 142.4 & 154.0\\
unrec & 0.50 & 12 & 150.3 & 147.8 & 148.7\\
unrec & 0.50 & 24 & 161.0 & 156.6 & 159.2\\
unrec & 0.50 & 36 & 170.4 & 167.5 & 170.8\\
unrec & 0.50 & 48 & 172.2 & 171.0 & 175.0\\
unrec & 1.00 & 12 & 202.2 & 204.2 & 195.0\\
unrec & 1.00 & 24 & 220.9 & 220.4 & 213.5\\
unrec & 1.00 & 36 & 230.1 & 229.1 & 225.1\\
unrec & 1.00 & 48 & 233.8 & 233.3 & 231.3\\
\bottomrule
\end{tabular}
\end{table}

Figures \ref{fig:TourismdatasimPIrollingnoiseFH} and \ref{fig:TourismdatasimPIfixnoiseFH} display one of the bottom level series with its ARIMA, ETS, and OLS 12 to 48 months ahead reconciled forecasts and prediction intervals, while changing the noise levels. From these two figures we see the OLS approach prediction intervals are almost identical to those from ARIMA and ETS.

\begin{figure}

{\centering \includegraphics[width=1\linewidth]{lhf_files/figure-latex/TourismdatasimPIrollingnoiseFH-1} 

}

\caption{Comparing reconciled 'rolling origin' forecasts and prediction intervals for a sample bottom-level series across different error levels (different panels).}\label{fig:TourismdatasimPIrollingnoiseFH}
\end{figure}

\begin{figure}

{\centering \includegraphics[width=1\linewidth]{lhf_files/figure-latex/TourismdatasimPIfixnoiseFH-1} 

}

\caption{Comparing reconciled 'fixed origin' forecasts and prediction intervals for a sample bottom-level series across different error levels (different panels).}\label{fig:TourismdatasimPIfixnoiseFH}
\end{figure}

Figures \ref{fig:TourismsimcomputationtimerollingnoiseFH} and \ref{fig:TourismsimcomputationtimefixednoiseFH} show the computation time (seconds) for ETS, ARIMA and OLS methods on rolling and fixed origin forecasts. From these figures we see that increasing the forecast horizon from one to 48 months increases computation time almost linearly, while the noise level does not change the computation time. Also, the computation time for ARIMA and ETS is much longer than OLS. Note that the computation time for the reconciliation step is less than a second and therefore that would be similar for base and reconciled forecasts.

\begin{figure}

{\centering \includegraphics[width=1\linewidth]{lhf_files/figure-latex/TourismsimcomputationtimerollingnoiseFH-1} 

}

\caption{Computation time (seconds) for rolling origin reconciled forecasts using ETS, ARIMA and OLS, by horizon and by different error values (panels), for 304 bottom-level series and 8 levels of hierarchy.}\label{fig:TourismsimcomputationtimerollingnoiseFH}
\end{figure}

\begin{figure}

{\centering \includegraphics[width=1\linewidth]{lhf_files/figure-latex/TourismsimcomputationtimefixednoiseFH-1} 

}

\caption{ Computation time (seconds) for fixed origin reconciled forecasts using ETS, ARIMA and OLS, by horizon and by different error values (panels), for 304 bottom-level series and 8 levels of hierarchy.}\label{fig:TourismsimcomputationtimefixednoiseFH}
\end{figure}

In the second simulation study, we fix the forecast horizon at \(h=24\) and the noise at 0.5, and then create 4 different hierarchy levels (8, 10, 12 and 18) --- obtained using the hierarchy structures in Table \ref{tab:Australiageographicalpurposedivision}, \ref{tab:simlevel1}, \ref{tab:simlevel1level2} and \ref{tab:simlevel1level2group1} (\(8 =\) same as Australian domestic tourism data; \(9 =\) adding one hierarchy factor, resulting in 10 levels; \(10=\) adding two hierarchy factors, resulting in 12 levels; and \(11 =\) adding two hierarchy factors and one grouping factor, resulting in 18 levels)\footnote{For simplicity we just include 2-way aggregation combinations.}).

\begin{table}[!h]

\caption{\label{tab:simlevel1}Number of simulated Australian domestic tourism series at each aggregation level - adding one hierarchy variable (Level 1)}
\centering
\begin{tabular}[t]{lr}
\toprule
Division & Series\\
\midrule
Australia & 1\\
Level 1 & 3\\
State & 7\\
Zone & 27\\
Region & 76\\
Purpose & 4\\
Level 1 x Purpose & 12\\
State x Purpose & 28\\
Zone x Purpose & 108\\
Region x Purpose & 304\\
\hline
Total & 570\\
\bottomrule
\end{tabular}
\end{table}

\begin{table}[!h]

\caption{\label{tab:simlevel1level2}Number of simulated Australian domestic tourism series at each aggregation level - adding two hierarchy variables (Level 1 and Level 2)}
\centering
\begin{tabular}[t]{lr}
\toprule
Division & Series\\
\midrule
Australia & 1\\
Level 1 & 3\\
Level 2 & 5\\
State & 7\\
Zone & 27\\
Region & 76\\
Purpose & 4\\
Level 1 x Purpose & 12\\
Level 2 x Purpose & 20\\
State x Purpose & 28\\
Zone x Purpose & 108\\
Region x Purpose & 304\\
\hline
Total & 595\\
\bottomrule
\end{tabular}
\end{table}

\begin{table}[!h]

\caption{\label{tab:simlevel1level2group1}Number of simulated Australian domestic tourism series at each aggregation level - adding two hierarchy and one grouping variables (Level 1, Level 2 and Group 1)}
\centering
\begin{tabular}[t]{lr}
\toprule
Division & Series\\
\midrule
Australia & 1\\
Level 1 & 3\\
Level 2 & 5\\
State & 7\\
Zone & 27\\
Region & 76\\
Purpose & 4\\
Group 1 & 5\\
Level 1 x Purpose & 12\\
Level 2 x Purpose & 20\\
State x Purpose & 28\\
Zone x Purpose & 108\\
Level 1 x Group 1 & 5\\
Level 2 x Group 1 & 6\\
State x Group 1 & 7\\
Zone x Group 1 & 27\\
Purpose x Group 1 & 20\\
Bottom level & 304\\
\hline
Total & 665\\
\bottomrule
\end{tabular}
\end{table}

We also simulated four sizes of bottom-level series (304, 608, 1520 and 3040). In order to add series, we change the number of `Purpose' categories (grouping factor) in the Australian domestic tourism example. Table \ref{tab:TourismsimlevelNS} displays the total number of series based on 304, 608, 1520 and 3040 bottom levels series with 8, 10, 12 and 18 hierarchy levels.

\begin{table}

\caption{\label{tab:TourismsimlevelNS}Total number of the series in the hierarchy structure based on the different number of series with 8, 10, 12 and 18 levels of the hierarchy.}
\centering
\begin{tabular}[t]{lrrrr}
\toprule
\multicolumn{1}{c}{} & \multicolumn{4}{c}{Total series} \\
\cmidrule(l{3pt}r{3pt}){2-5}
Bottom level series & 8 & 10 & 12 & 18\\
\midrule
304 & 555 & 570 & 595 & 665\\
608 & 999 & 1026 & 1071 & 1161\\
1520 & 2331 & 2394 & 2499 & 2649\\
3040 & 4551 & 4674 & 2643 & 5129\\
\bottomrule
\end{tabular}
\end{table}

Tables \ref{tab:TourismdatasimrollinglevelNS} and \ref{tab:TourismdatasimfixlevelNS} display the average RMSEs by number of bottom level series (304, 608, 1520 and 3040), number of hierarchy levels (8, 10, 12 and 18), for ARIMA, ETS and OLS, with rolling and fixed origin forecasts. These results are both for reconciled and unreconciled forecasts. We see from these tables that reconciling the forecasts decreases the mean RMSE, while increasing the number of series increases the mean RMSE.

\begin{table}[!h]

\caption{\label{tab:TourismdatasimrollinglevelNS}Mean RMSE by number of hierarchy levels, number of bottom-level series, method, with/without reconciliation. Simulated series has error value 0.5. Forecasting uses rolling origin for a 24-month horizon.}
\centering
\begin{tabular}[t]{lrrrrr}
\toprule
Reconciliation & Series & Levels & ETS & ARIMA & OLS\\
\midrule
rec & 304 & 8 & 149.9 & 146.3 & 153.2\\
rec & 304 & 10 & 164.6 & 160.5 & 168.1\\
rec & 304 & 12 & 176.6 & 172.3 & 180.2\\
rec & 304 & 18 & 204.6 & 208.4 & 207.1\\
rec & 608 & 8 & 786.6 & 806.2 & 822.5\\
rec & 608 & 10 & 879.5 & 904.1 & 921.0\\
rec & 608 & 12 & 937.6 & 966.8 & 982.5\\
rec & 608 & 18 & 1070.4 & 1105.7 & 1122.8\\
rec & 1520 & 8 & 1075.3 & 1182.3 & 1059.6\\
rec & 1520 & 10 & 1202.9 & 1329.3 & 1188.2\\
rec & 1520 & 12 & 1284.4 & 1422.7 & 1268.9\\
rec & 1520 & 18 & 1472.9 & 1651.1 & 1459.3\\
rec & 3040 & 8 & 1156.4 & 1257.7 & 1155.1\\
rec & 3040 & 10 & 1296.4 & 1416.6 & 1295.4\\
rec & 3040 & 12 & 1403.0 & 1518.7 & 1383.5\\
rec & 3040 & 18 & 1599.8 & 1762.6 & 1601.3\\
unrec & 304 & 8 & 156.5 & 153.7 & 154.5\\
unrec & 304 & 10 & 172.3 & 168.9 & 169.7\\
unrec & 304 & 12 & 185.1 & 180.9 & 181.8\\
unrec & 304 & 18 & 212.9 & 199.6 & 208.9\\
unrec & 608 & 8 & 788.4 & 813.0 & 822.5\\
unrec & 608 & 10 & 882.8 & 912.4 & 921.0\\
unrec & 608 & 12 & 941.3 & 975.5 & 982.5\\
unrec & 608 & 18 & 1075.0 & 1114.1 & 1122.8\\
unrec & 1520 & 8 & 1042.3 & 1220.3 & 1059.6\\
unrec & 1520 & 10 & 1170.7 & 1368.6 & 1188.2\\
unrec & 1520 & 12 & 1253.0 & 1463.0 & 1268.9\\
unrec & 1520 & 18 & 1447.7 & 1691.1 & 1459.3\\
unrec & 3040 & 8 & 1149.4 & 1294.2 & 1155.1\\
unrec & 3040 & 10 & 1290.0 & 1454.2 & 1295.4\\
unrec & 3040 & 12 & 1400.4 & 1557.3 & 1383.5\\
unrec & 3040 & 18 & 1598.3 & 1801.2 & 1601.3\\
\bottomrule
\end{tabular}
\end{table}

\begin{table}[!h]

\caption{\label{tab:TourismdatasimfixlevelNS}Mean RMSE by number of hierarchy levels, number of bottom-level series, method, with/without reconciliation. Simulated series has error value 0.5. Forecasting uses fixed origin for a 24-month horizon.}
\centering
\begin{tabular}[t]{lrrrrr}
\toprule
Reconciliation & Series & Levels & ETS & ARIMA & OLS\\
\midrule
rec & 304 & 8 & 154.5 & 151.6 & 158.2\\
rec & 304 & 10 & 169.6 & 166.4 & 174.1\\
rec & 304 & 12 & 181.8 & 178.2 & 186.4\\
rec & 304 & 18 & 209.8 & 205.0 & 213.4\\
rec & 608 & 8 & 786.0 & 809.0 & 802.7\\
rec & 608 & 10 & 880.5 & 907.5 & 898.6\\
rec & 608 & 12 & 939.4 & 970.4 & 958.6\\
rec & 608 & 18 & 1070.9 & 1107.3 & 1095.3\\
rec & 1520 & 8 & 1077.2 & 1181.9 & 1039.1\\
rec & 1520 & 10 & 1208.9 & 1327.8 & 1165.4\\
rec & 1520 & 12 & 1291.0 & 1421.5 & 1244.8\\
rec & 1520 & 18 & 1486.4 & 1643.6 & 1430.8\\
rec & 3040 & 8 & 1148.8 & 1263.0 & 1134.8\\
rec & 3040 & 10 & 1288.3 & 1420.5 & 1272.7\\
rec & 3040 & 12 & 1376.2 & 1522.0 & 1359.3\\
rec & 3040 & 18 & 1587.6 & 1764.4 & 1573.6\\
unrec & 304 & 8 & 161.0 & 156.6 & 159.2\\
unrec & 304 & 10 & 177.1 & 172.9 & 175.4\\
unrec & 304 & 12 & 189.9 & 185.6 & 187.7\\
unrec & 304 & 18 & 217.9 & 213.0 & 214.7\\
unrec & 608 & 8 & 790.4 & 816.6 & 802.7\\
unrec & 608 & 10 & 886.4 & 916.0 & 898.6\\
unrec & 608 & 12 & 945.6 & 979.2 & 958.7\\
unrec & 608 & 18 & 1078.2 & 1116.4 & 1095.4\\
unrec & 1520 & 8 & 1038.2 & 1196.3 & 1037.7\\
unrec & 1520 & 10 & 1174.9 & 1343.8 & 1163.8\\
unrec & 1520 & 12 & 1258.2 & 1438.1 & 1243.0\\
unrec & 1520 & 18 & 1465.3 & 1659.4 & 1429.1\\
unrec & 3040 & 8 & 1136.1 & 1274.0 & 1142.8\\
unrec & 3040 & 10 & 1274.9 & 1432.0 & 1281.8\\
unrec & 3040 & 12 & 1362.7 & 1532.8 & 1369.1\\
unrec & 3040 & 18 & 1578.6 & 1777.8 & 1583.2\\
\bottomrule
\end{tabular}
\end{table}

In Figures \ref{fig:TourismdatasimPIrollinglevelNS} and \ref{fig:TourismdatasimPIfixlevelNS}, we compare the reconciled ETS, ARIMA, and OLS forecasts and prediction intervals for one of the bottom level series. From these figures we see that all three approaches have similar prediction intervals.

\begin{figure}

{\centering \includegraphics[width=1\linewidth]{lhf_files/figure-latex/TourismdatasimPIrollinglevelNS-1} 

}

\caption{Comparing reconciled rolling origin forecasts and prediction intervals for a sample bottom-level series, for different number of bottom-level series and hierarchy levels (different panels). Simulated series has error value 0.5 and 24 months test set.}\label{fig:TourismdatasimPIrollinglevelNS}
\end{figure}

\begin{figure}

{\centering \includegraphics[width=1\linewidth]{lhf_files/figure-latex/TourismdatasimPIfixlevelNS-1} 

}

\caption{Comparing reconciled fixed origin forecasts and prediction intervals for a sample bottom-level series, for different number of bottom-level series and hierarchy levels (different panels). Simulated series has error value 0.5 and 24 months test set.}\label{fig:TourismdatasimPIfixlevelNS}
\end{figure}

Finally, in Figures \ref{fig:TourismsimcomputationtimerollinglevelNS} and \ref{fig:TourismsimcomputationtimefixedlevelNS}, we display the computation time for rolling and fixed origin approaches with different levels and number of bottom level series. We see that increasing the number of series increases computation time linearly. Increasing the number of levels slightly increases the computation time. In both figures we see a substantial difference in computation time between ARIMA and ETS compared with our OLS approach.\footnote{The reconciliation step computation time varies from less than one second to around three seconds, as a function of the number of series. All figures display only the base forecasts computation times.}

\begin{figure}

{\centering \includegraphics[width=1\linewidth]{lhf_files/figure-latex/TourismsimcomputationtimerollinglevelNS-1} 

}

\caption{Computation time (seconds) for rolling origin reconciled forecasts using ETS, ARIMA and OLS, by number of bottom-level series (x-axis), and by levels of hierarchy (panels). Simulated series has error value 0.5 and 24 months test set.}\label{fig:TourismsimcomputationtimerollinglevelNS}
\end{figure}

\begin{figure}

{\centering \includegraphics[width=1\linewidth]{lhf_files/figure-latex/TourismsimcomputationtimefixedlevelNS-1} 

}

\caption{Computation time (seconds) for fixed origin reconciled forecasts using ETS, ARIMA and OLS, by number of bottom-level series (x-axis), and by levels of hierarchy (panels). Simulated series has error value 0.5 and 24 months test set.}\label{fig:TourismsimcomputationtimefixedlevelNS}
\end{figure}

\FloatBarrier

\hypertarget{conclusion}{%
\section{Conclusion}\label{conclusion}}

We have proposed a linear model approach to fast forecasting of hierarchical or grouped time series, with accuracy that nearly matches that of forecast methods such as ETS and ARIMA. This is especially useful in large collections of time series, as is typical in hierarchical and grouped structures. Although ETS and ARIMA are advantageous in terms of forecasting power and accuracy, they can be computationally heavy when facing large collections of time series in the hierarchy.

An important feature of our proposed OLS model is its ability to easily include external information such as holiday dummies or other external series. We also note that OLS has the additional practical advantage of handling missing data while ETS and ARIMA require imputation.

\textcite{pennings2017} proposed another approach for forecasting hierarchical time series using state space models. Although their approach is flexible in handling outliers, missing data and external features, it is less flexible across different kinds of datasets and is computationally much more demanding.

Another advantage of our approach is that it can be computed in a single matrix equation \eqref{eq:singlestep}. This makes it extremely fast and easy to implement, and enables standard results to be derived with minimal effort (e.g., prediction intervals).

In this paper, we computed prediction intervals based on a normality assumption for all the real and simulated examples. One future direction for potentially improving prediction intervals is applying bootstrap-based methods which only assume that the forecast errors are uncorrelated.

\hypertarget{acknowledgements}{%
\section*{Acknowledgements}\label{acknowledgements}}
\addcontentsline{toc}{section}{Acknowledgements}

Ashouri and Shmueli were partially funded by Ministry of Science and Technology (MOST), Taiwan {[}Grant 106-2420-H-007-019{]}. Hyndman's research is supported by the Australian Center of Excellence in Mathematical and Statistical Frontiers.

\clearpage

\hypertarget{appendix-appendix}{%
\appendix}


\hypertarget{appendix-a}{%
\section{Appendix A}\label{appendix-a}}

Figures \ref{fig:boxplotrollingtourismappendix} and \ref{fig:boxplottourismappendix} display boxplots of the scaled forecast errors for the tourism example. These plots are displayed for both rolling forward and multiple-step-ahead forecasts, respectively. We see that scaled error magnitudes are similar at different hierarchy/grouping levels.

\begin{figure}

{\centering \includegraphics[width=0.88\linewidth]{lhf_files/figure-latex/boxplotrollingtourismappendix-1} 

}

\caption{Box plots of scaled forecast errors from reconciled and unreconciled ETS, ARIMA, and OLS rolling origin forecasts for tourism demand data. Panels are different hierarchy levels.}\label{fig:boxplotrollingtourismappendix}
\end{figure}

\begin{figure}

{\centering \includegraphics[width=0.88\linewidth]{lhf_files/figure-latex/boxplottourismappendix-1} 

}

\caption{Box plots of scaled forecast errors from reconciled and unreconciled ETS, ARIMA, and OLS fixed origin forecasts for tourism demand data. Panels are different hierarchy levels.}\label{fig:boxplottourismappendix}
\end{figure}

Tables \ref{tab:Tourismdatadifrecrolling} and \ref{tab:Tourismdatadifrecfix} display the reconciliation results using two other reconciliation matrices introduced in \autocite{mint2018}: shrinkage estimator (\texttt{mint\_shrink}) and variance scaling (\texttt{wls\_var}). These results are presented for both rolling origin (Table \ref{tab:Tourismdatadifrecrolling}) and fixed origin forecasts (Table \ref{tab:Tourismdatadifrecfix}). We see that in this case the shrinkage estimator provides the best reconciled forecasts.

\begin{table}

\caption{\label{tab:Tourismdatadifrecrolling}Comparing Mean(RMSE) of three   different reconciliation matrices for rolling origin forecasts on a 24 months test set.}
\centering
\begin{tabular}[t]{lrrrrrrrrr}
\toprule
\multicolumn{1}{c}{} & \multicolumn{3}{c}{mint\_shrink} & \multicolumn{3}{c}{wls\_var} & \multicolumn{3}{c}{wls\_struct} \\
\cmidrule(l{3pt}r{3pt}){2-4} \cmidrule(l{3pt}r{3pt}){5-7} \cmidrule(l{3pt}r{3pt}){8-10}
Level & ETS & ARIMA & OLS & ETS & ARIMA & OLS & ETS & ARIMA & OLS\\
\midrule
Total & 1600 & 1543 & 1834 & 1705 & 1853 & 1910 & 1628 & 1680 & 1753\\
State & 483 & 448 & 500 & 489 & 478 & 511 & 504 & 492 & 503\\
Zone & 208 & 202 & 209 & 210 & 209 & 213 & 213 & 214 & 213\\
Region & 117 & 116 & 115 & 117 & 119 & 116 & 120 & 122 & 117\\
Purpose & 662 & 645 & 703 & 675 & 725 & 720 & 675 & 694 & 698\\
State x Purpose & 207 & 203 & 209 & 208 & 209 & 213 & 210 & 213 & 213\\
Zone x Purpose & 96 & 97 & 96 & 97 & 99 & 97 & 97 & 101 & 98\\
Region x Purpose & 56 & 57 & 55 & 56 & 57 & 55 & 56 & 58 & 56\\
\bottomrule
\end{tabular}
\end{table}

\begin{table}

\caption{\label{tab:Tourismdatadifrecfix}Comparing Mean(RMSE) of three   different reconciliation matrices for fixed origin forecasts on a 24 months test set.}
\centering
\begin{tabular}[t]{lrrrrrrrrr}
\toprule
\multicolumn{1}{c}{} & \multicolumn{3}{c}{mint\_shrink} & \multicolumn{3}{c}{wls\_var} & \multicolumn{3}{c}{wls\_struct} \\
\cmidrule(l{3pt}r{3pt}){2-4} \cmidrule(l{3pt}r{3pt}){5-7} \cmidrule(l{3pt}r{3pt}){8-10}
Level & ETS & ARIMA & OLS & ETS & ARIMA & OLS & ETS & ARIMA & OLS\\
\midrule
Total & 2444 & 2721 & 2506 & 2541 & 2738 & 2880 & 2369 & 3108 & 2678\\
State & 567 & 580 & 569 & 577 & 585 & 618 & 583 & 597 & 604\\
Zone & 232 & 233 & 228 & 235 & 236 & 244 & 238 & 232 & 243\\
Region & 125 & 126 & 120 & 125 & 127 & 126 & 130 & 129 & 127\\
Purpose & 819 & 868 & 849 & 836 & 898 & 936 & 795 & 859 & 898\\
State x Purpose & 221 & 224 & 224 & 223 & 226 & 238 & 225 & 229 & 236\\
Zone x Purpose & 101 & 102 & 100 & 101 & 103 & 104 & 103 & 104 & 105\\
Region x Purpose & 58 & 58 & 56 & 58 & 58 & 58 & 59 & 59 & 59\\
\bottomrule
\end{tabular}
\end{table}

In Figure \ref{fig:predintAAAVisrecunrec}, we compare the prediction intervals for reconciled and unreconciled forecasts for rolling and fixed origin forecasts. We can see the effect of the reconciliation on improving the forecasts' standard deviations (i.e.~narrower prediction intervals).

\begin{figure}

{\centering \includegraphics[width=1\linewidth]{lhf_files/figure-latex/predintAAAVisrecunrec-1} 

}

\caption{Comparing forecasts and prediction intervals of 'AAAVis' bottom-level series across methods. Left: rolling origin. Right: fixed origin.}\label{fig:predintAAAVisrecunrec}
\end{figure}

\hypertarget{appendix-b-wikipedia-pageviews-grouped-structure}{%
\section{Appendix B: Wikipedia pageviews: Grouped structure}\label{appendix-b-wikipedia-pageviews-grouped-structure}}

We illustrate our method on another real dataset. The Wikipedia dataset includes 12 months of daily data (2016-06-01 to 2017-06-29) on Wikipedia pageviews for the most popular social networks articles \autocite{ashouri2018}. This dataset is noisier than the Australian monthly tourism data, making forecasting more challenging. The data has a grouped structure with the following attributes (see Table \ref{tab:wikipediagroupingstructure}):

\begin{itemize}
\tightlist
\item
  \emph{Agent}: Spider, User;
\item
  \emph{Access}: Desktop, Mobile app, Mobile web;
\item
  \emph{Language}: en (English), de (German), es (Spanish), zh (Chinese); and
\item
  \emph{Purpose}: Blogging related, Business, Gaming, General purpose, Life style, Photo sharing, Reunion, Travel, Video.
\end{itemize}

Figure \ref{fig:wikigroupstructure} shows one possible hierarchy for this dataset, but the order of the hierarchy can be switched.

\begin{figure}

{\centering \includegraphics[width=500px,height=250px]{Paper-Figures/Wiki_group_structure} 

}

\caption{One possible hierarchical structure for the Wikipedia pageviews dataset.}\label{fig:wikigroupstructure}
\end{figure}

\begin{table}

\caption{\label{tab:wikipediagroupingstructure}Social networking Wikipedia article grouping structure}
\centering
\begin{tabular}[t]{llll}
\toprule
Grouping & Series & Grouping & Series\\
\midrule
Total &  & Language & \\
 & 1. Social Network &  & 10. zh (Chinese)\\
Access &  & Purpose & \\
 & 2. Desktop &  & 11. Blogging related\\
 & 3. Mobile app &  & 12. Business\\
Agent &  &  & 13. Gaming\\
 & 4.  Mobile web &  & 14. General purpose\\
 & 5. Spider &  & 15. Life style\\
 & 6. User &  & 16. Photo sharing\\
Language &  &  & 17. Reunion\\
 & 7. en (English) &  & 18. Travel\\
 & 8. de (German) &  & 19. Video\\
 & 9. es (Spanish) &  & \\
\bottomrule
\end{tabular}
\end{table}

We consider the main aggregation factors and their two-way combinations.\footnote{First, while we present results for single groups, we applied all the two-way combinations in the reconciliation step. Second, there are four more 3-way aggregation combinations that we do not include: Agent \(\times\) Access \(\times\) Language, Agent \(\times\) Access \(\times\) Purpose, Agent \(\times\) Language \(\times\) Purpose, and Access \(\times\) Language \(\times\) Purpose. Including these four additional aggregations might slightly improve the results but for simplicity, we excluded them.} The final dataset includes 913 time series, each with length 394. Table \ref{tab:wikidivision} shows the group structure's different levels and the number of series in each level.

\begin{table}[!h]

\caption{\label{tab:wikidivision}Number of Wikipedia pageviews series at each aggregation level.}
\centering
\begin{tabular}[t]{lr}
\toprule
Division & Series\\
\midrule
Total pageviews & 1\\
Access & 3\\
Agent & 2\\
Language & 4\\
Purpose & 9\\
Access x Agent & 5\\
Access x Language & 12\\
Access x Purpose & 27\\
Agent x Language & 8\\
Agent x Purpose & 18\\
Language x Purpose & 33\\
Bottom level & 913\\
\hline
Total & 1035\\
\bottomrule
\end{tabular}
\end{table}

For this daily dataset, in the OLS forecasting model we include in the predictor matrix the following terms: a quadratic trend, 6 seasonal dummies, and lags 1 and 7 for rolling and fixed origin models. We partitioned the data into training and test sets. We used the last 28 days for our test set and the rest for the training set.

Tables \ref{tab:wikipediadataresulrolling} and \ref{tab:wikipediadataresultRMSE} show the RMSE results. Although these time series are noisier, we still get acceptable results for the OLS forecasting model compared with ETS and ARIMA. In this case, we get similar results with and without the reconciliation step.

\begin{table}[!h]

\caption{\label{tab:wikipediadataresulrolling}Mean RMSE for ETS, ARIMA and OLS with and without reconciliation - Rolling origin - Wikipedia dataset}
\centering
\begin{tabular}[t]{lrrrrrr}
\toprule
\multicolumn{1}{c}{} & \multicolumn{3}{c}{Unreconciled} & \multicolumn{3}{c}{Reconciled} \\
\cmidrule(l{3pt}r{3pt}){2-4} \cmidrule(l{3pt}r{3pt}){5-7}
Level & ETS & ARIMA & OLS & ETS & ARIMA & OLS\\
\midrule
Total & 11224 & 14835 & 12968 & 12745 & 13304 & 11959\\
Access & 6532 & 6660 & 5832 & 6787 & 6818 & 5885\\
Agent & 8263 & 10278 & 9372 & 8571 & 8992 & 8504\\
Language & 4889 & 6273 & 5688 & 6059 & 5879 & 5448\\
Purpose & 5237 & 4663 & 4107 & 4548 & 4132 & 3805\\
Bottom level & 358 & 240 & 262 & 364 & 242 & 263\\
\bottomrule
\end{tabular}
\end{table}

\begin{table}

\caption{\label{tab:wikipediadataresultRMSE}Mean RMSE for ETS, ARIMA and OLS with and without reconciliation - Fixed origin - Wikipedia dataset}
\centering
\begin{tabular}[t]{lrrrrrr}
\toprule
\multicolumn{1}{c}{} & \multicolumn{3}{c}{Unreconciled} & \multicolumn{3}{c}{Reconciled} \\
\cmidrule(l{3pt}r{3pt}){2-4} \cmidrule(l{3pt}r{3pt}){5-7}
Level & ETS & ARIMA & OLS & ETS & ARIMA & OLS\\
\midrule
Total & 14731 & 24206 & 20204 & 16570 & 26148 & 19691\\
Access & 7098 & 10710 & 8497 & 8728 & 11519 & 8381\\
Agent & 13634 & 18163 & 14986 & 12508 & 17656 & 14580\\
Language & 6248 & 9505 & 7914 & 6855 & 10756 & 8022\\
Purpose & 7442 & 8614 & 5694 & 5259 & 7508 & 5542\\
Bottom level & 437 & 388 & 366 & 441 & 390 & 366\\
\bottomrule
\end{tabular}
\end{table}

\begin{figure}

{\centering \includegraphics[width=1\linewidth]{lhf_files/figure-latex/boxplotrollingwiki-1} 

}

\caption{Box plots of forecast errors for reconciled and unreconciled ETS, ARIMA and OLS methods at each hierarchy level for rolling origin forecasts of Wikipedia pageviews.}\label{fig:boxplotrollingwiki}
\end{figure}

\begin{figure}

{\centering \includegraphics[width=1\linewidth]{lhf_files/figure-latex/boxplotwiki-1} 

}

\caption{Box plots of forecast errors for reconciled and unreconciled ETS, ARIMA and OLS methods at each hierarchy level for fixed origin forecasts of Wikipedia pageviews.}\label{fig:boxplotwiki}
\end{figure}

\begin{figure}

{\centering \includegraphics[width=1\linewidth]{lhf_files/figure-latex/forecstrolling28wikitotal-1} 

}

\caption{Comparing reconciled and unreconciled ETS, ARIMA and OLS rolling and fixed origin forecasts for Wikipedia pageviews 'Total' Series.}\label{fig:forecstrolling28wikitotal}
\end{figure}

\begin{figure}

{\centering \includegraphics[width=1\linewidth]{lhf_files/figure-latex/forecstrolling28wiki-1} 

}

\caption{Comparing reconciled and unreconciled ETS, ARIMA and OLS rolling and fixed origin forecasts for Wikipedia pageviews 'desktopusenPho21' bottom-level series}\label{fig:forecstrolling28wiki}
\end{figure}

\begin{figure}

{\centering \includegraphics[width=1\linewidth]{lhf_files/figure-latex/predinttotalwiki-1} 

}

\caption{The actual test set for the 'Total series' compared to the forecasts from reconciled and unreconciled OLS methods with prediction interval for rolling and fixed origin Wikipedia pageviews.}\label{fig:predinttotalwiki}
\end{figure}

\begin{figure}

{\centering \includegraphics[width=1\linewidth]{lhf_files/figure-latex/predintdesktopus-1} 

}

\caption{The actual test set for the 'desktopusenPho21' bottom level series compared to the forecasts from reconciled and unreconciled OLS methods with prediction interval for rolling and fixed origin  Wikipedia pageviews.}\label{fig:predintdesktopus}
\end{figure}

Figures \ref{fig:boxplotrollingwiki} and \ref{fig:boxplotwiki} display the forecast error box plot. These plots are for rolling and fixed origin forecasts over 28 days in each level of grouping. We see that the error distribution is similar in all levels across the different methods. The only exception is the Total series, where ETS performs significantly better than ARIMA and OLS. We also note that the reconciliation is less effective. As in the tourism example, in higher levels series have higher counts and therefore their error magnitudes are larger.

In Figures \ref{fig:forecstrolling28wikitotal} and \ref{fig:forecstrolling28wiki}, we display results for the total and one of the bottom level series, ``desktopusenPho21'' (desktop-user-english-photo sharing). The plot shows rolling and fixed origin forecast results over the 28 day test set for ETS, ARIMA and OLS, with (dashed lines) and without (dotted lines) applying reconciliation. We see that the OLS forecasting model performs close to the other two methods, and reconciliation improves the forecasts. In Figures \ref{fig:predinttotalwiki} and \ref{fig:predintdesktopus}, we present the reconciled prediction intervals for the total and ``desktopusenPho21'' series. In the fixed origin results, the ETS prediction intervals were much wider and hence have been omitted to allow for easier visualization of the results.

Table \ref{tab:wikipediadatacomputationtime} presents the computation times for all three methods. ETS and ARIMA are clearly much more computationally heavy compared with OLS. As in the Australian tourism dataset, running reconciliation does not have much effect on computation time.

\begin{table}

\caption{\label{tab:wikipediadatacomputationtime}Computation time (seconds) for ETS, ARIMA and OLS with reconciliation - Rolling and fixed origin forecasts - Wikipedia dataset}
\centering
\begin{tabular}[t]{lrr}
\toprule
 & Rolling origin & Fixed origin\\
\midrule
ETS & 22592 & 882\\
ARIMA & 20016 & 1682\\
OLS & 116 & 61\\
\bottomrule
\end{tabular}
\end{table}

\clearpage

\printbibliography

\end{document}