\documentclass[11pt,a4paper,]{article}
%DIF LATEXDIFF DIFFERENCE FILE
%DIF DEL lhf-oldtmp-125924.tex Fri Jul 31 17:07:10 2020
%DIF ADD lhf.tex Fri Jul 31 17:05:41 2020
\usepackage{lmodern}
\usepackage{amssymb,amsmath}
\usepackage{ifxetex,ifluatex}
\usepackage{fixltx2e} % provides \textsubscript
\ifnum 0\ifxetex 1\fi\ifluatex 1\fi=0 % if pdftex
\usepackage[T1]{fontenc}
\usepackage[utf8]{inputenc}
\else % if luatex or xelatex
\usepackage{unicode-math}
\defaultfontfeatures{Ligatures=TeX,Scale=MatchLowercase}
\fi
% use upquote if available, for straight quotes in verbatim environments
\IfFileExists{upquote.sty}{\usepackage{upquote}}{}
% use microtype if available
\IfFileExists{microtype.sty}{%
\usepackage[]{microtype}
\UseMicrotypeSet[protrusion]{basicmath} % disable protrusion for tt fonts
}{}
\PassOptionsToPackage{hyphens}{url} % url is loaded by hyperref
\usepackage[unicode=true]{hyperref}
\hypersetup{
pdftitle={Fast forecast reconciliation using linear models},
pdfkeywords={hierarchical forecasting, grouped forecasting, reconciling forecast, linear regression},
pdfborder={0 0 0},
breaklinks=true}
\urlstyle{same} % don't use monospace font for urls
\usepackage{geometry}
\geometry{left=2.5cm,right=2.5cm,top=2.5cm,bottom=2.5cm}
\usepackage[style=authoryear-comp,]{biblatex}
\addbibresource{references.bib}
\usepackage{longtable,booktabs}
% Fix footnotes in tables (requires footnote package)
\IfFileExists{footnote.sty}{\usepackage{footnote}\makesavenoteenv{longtable}}{}
\IfFileExists{parskip.sty}{%
\usepackage{parskip}
}{% else
\setlength{\parindent}{0pt}
\setlength{\parskip}{6pt plus 2pt minus 1pt}
}
\setlength{\emergencystretch}{3em} % prevent overfull lines
\providecommand{\tightlist}{%
\setlength{\itemsep}{0pt}\setlength{\parskip}{0pt}}
\setcounter{secnumdepth}{5}
% set default figure placement to htbp
\makeatletter
\def\fps@figure{htbp}
\makeatother
\title{Fast forecast reconciliation using linear models}
%% MONASH STUFF
%% CAPTIONS
\RequirePackage{caption}
\DeclareCaptionStyle{italic}[justification=centering]
{labelfont={bf},textfont={it},labelsep=colon}
\captionsetup[figure]{style=italic,format=hang,singlelinecheck=true}
\captionsetup[table]{style=italic,format=hang,singlelinecheck=true}
%% FONT
\RequirePackage{bera}
\RequirePackage{mathpazo}
%% HEADERS AND FOOTERS
\RequirePackage{fancyhdr}
\pagestyle{fancy}
\rfoot{\Large\sffamily\raisebox{-0.1cm}{\textbf{\thepage}}}
\makeatletter
\lhead{\textsf{\expandafter{\@title}}}
\makeatother
\rhead{}
\cfoot{}
\setlength{\headheight}{15pt}
\renewcommand{\headrulewidth}{0.4pt}
\renewcommand{\footrulewidth}{0.4pt}
\fancypagestyle{plain}{%
\fancyhf{} % clear all header and footer fields
\fancyfoot[C]{\sffamily\thepage} % except the center
\renewcommand{\headrulewidth}{0pt}
\renewcommand{\footrulewidth}{0pt}}
%% MATHS
\RequirePackage{bm,amsmath}
\allowdisplaybreaks
%% GRAPHICS
\RequirePackage{graphicx}
\setcounter{topnumber}{2}
\setcounter{bottomnumber}{2}
\setcounter{totalnumber}{4}
\renewcommand{\topfraction}{0.85}
\renewcommand{\bottomfraction}{0.85}
\renewcommand{\textfraction}{0.15}
\renewcommand{\floatpagefraction}{0.8}
%\RequirePackage[section]{placeins}
%% SECTION TITLES
\RequirePackage[compact,sf,bf]{titlesec}
\titleformat{\section}[block]
{\fontsize{15}{17}\bfseries\sffamily}
{\thesection}
{0.4em}{}
\titleformat{\subsection}[block]
{\fontsize{12}{14}\bfseries\sffamily}
{\thesubsection}
{0.4em}{}
\titlespacing{\section}{0pt}{*5}{*1}
\titlespacing{\subsection}{0pt}{*2}{*0.2}
%% TITLE PAGE
\def\Date{\number\day}
\def\Month{\ifcase\month\or
January\or February\or March\or April\or May\or June\or
July\or August\or September\or October\or November\or December\fi}
\def\Year{\number\year}
\makeatletter
\def\wp#1{\gdef\@wp{#1}}\def\@wp{??/??}
\def\jel#1{\gdef\@jel{#1}}\def\@jel{??}
\def\showjel{{\large\textsf{\textbf{JEL classification:}}~\@jel}}
\def\nojel{\def\showjel{}}
\def\addresses#1{\gdef\@addresses{#1}}\def\@addresses{??}
\def\cover{{\sffamily\setcounter{page}{0}
\thispagestyle{empty}
\placefig{2}{1.5}{width=5cm}{monash2}
\placefig{16.9}{1.5}{width=2.1cm}{MBusSchool}
\begin{textblock}{4}(16.9,4)ISSN 1440-771X\end{textblock}
\begin{textblock}{7}(12.7,27.9)\hfill
\includegraphics[height=0.7cm]{AACSB}~~~
\includegraphics[height=0.7cm]{EQUIS}~~~
\includegraphics[height=0.7cm]{AMBA}
\end{textblock}
\vspace*{2cm}
\begin{center}\Large
Department of Econometrics and Business Statistics\\[.5cm]
\footnotesize http://monash.edu/business/ebs/research/publications
\end{center}\vspace{2cm}
\begin{center}
\fbox{\parbox{14cm}{\begin{onehalfspace}\centering\Huge\vspace*{0.3cm}
\textsf{\textbf{\expandafter{\@title}}}\vspace{1cm}\par
\LARGE\@author\end{onehalfspace}
}}
\end{center}
\vfill
\begin{center}\Large
\Month~\Year\\[1cm]
Working Paper \@wp
\end{center}\vspace*{2cm}}}
\def\pageone{{\sffamily\setstretch{1}%
\thispagestyle{empty}%
\vbox to \textheight{%
\raggedright\baselineskip=1.2cm
{\fontsize{24.88}{30}\sffamily\textbf{\expandafter{\@title}}}
\vspace{2cm}\par
\hspace{1cm}\parbox{14cm}{\sffamily\large\@addresses}\vspace{1cm}\vfill
\hspace{1cm}{\large\Date~\Month~\Year}\\[1cm]
\hspace{1cm}\showjel\vss}}}
\def\blindtitle{{\sffamily
\thispagestyle{plain}\raggedright\baselineskip=1.2cm
{\fontsize{24.88}{30}\sffamily\textbf{\expandafter{\@title}}}\vspace{1cm}\par
}}
\def\titlepage{{\cover\newpage\pageone\newpage\blindtitle}}
\def\blind{\def\titlepage{{\blindtitle}}\let\maketitle\blindtitle}
\def\titlepageonly{\def\titlepage{{\pageone\end{document}}}}
\def\nocover{\def\titlepage{{\pageone\newpage\blindtitle}}\let\maketitle\titlepage}
\let\maketitle\titlepage
\makeatother
%% SPACING
\RequirePackage{setspace}
\spacing{1.5}
%% LINE AND PAGE BREAKING
\sloppy
\clubpenalty = 10000
\widowpenalty = 10000
\brokenpenalty = 10000
\RequirePackage{microtype}
%% PARAGRAPH BREAKS
\setlength{\parskip}{1.4ex}
\setlength{\parindent}{0em}
%% HYPERLINKS
\RequirePackage{xcolor} % Needed for links
\definecolor{darkblue}{rgb}{0,0,.6}
\RequirePackage{url}
\makeatletter
\@ifpackageloaded{hyperref}{}{\RequirePackage{hyperref}}
\makeatother
\hypersetup{
citecolor=0 0 0,
breaklinks=true,
bookmarksopen=true,
bookmarksnumbered=true,
linkcolor=darkblue,
urlcolor=blue,
citecolor=darkblue,
colorlinks=true}
%% KEYWORDS
\newenvironment{keywords}{\par\vspace{0.5cm}\noindent{\sffamily\textbf{Keywords:}}}{\vspace{0.25cm}\par\hrule\vspace{0.5cm}\par}
%% ABSTRACT
\renewenvironment{abstract}{\begin{minipage}{\textwidth}\parskip=1.4ex\noindent
\hrule\vspace{0.1cm}\par{\sffamily\textbf{\abstractname}}\newline}
{\end{minipage}}
\usepackage[T1]{fontenc}
\usepackage[utf8]{inputenc}
\usepackage[showonlyrefs]{mathtools}
\usepackage[no-weekday]{eukdate}
%% BIBLIOGRAPHY
\makeatletter
\@ifpackageloaded{biblatex}{}{\usepackage[style=authoryear-comp, backend=biber, natbib=true]{biblatex}}
\makeatother
\ExecuteBibliographyOptions{bibencoding=utf8,minnames=1,maxnames=3, maxbibnames=99,dashed=false,terseinits=true,giveninits=true,uniquename=false,uniquelist=false,doi=false, isbn=false,url=true,sortcites=false}
\DeclareFieldFormat{url}{\texttt{\url{#1}}}
\DeclareFieldFormat[article]{pages}{#1}
\DeclareFieldFormat[inproceedings]{pages}{\lowercase{pp.}#1}
\DeclareFieldFormat[incollection]{pages}{\lowercase{pp.}#1}
\DeclareFieldFormat[article]{volume}{\mkbibbold{#1}}
\DeclareFieldFormat[article]{number}{\mkbibparens{#1}}
\DeclareFieldFormat[article]{title}{\MakeCapital{#1}}
\DeclareFieldFormat[inproceedings]{title}{#1}
\DeclareFieldFormat{shorthandwidth}{#1}
% No dot before number of articles
\usepackage{xpatch}
\xpatchbibmacro{volume+number+eid}{\setunit*{\adddot}}{}{}{}
% Remove In: for an article.
\renewbibmacro{in:}{%
\ifentrytype{article}{}{%
\printtext{\bibstring{in}\intitlepunct}}}
\makeatletter
\DeclareDelimFormat[cbx@textcite]{nameyeardelim}{\addspace}
\makeatother
\renewcommand*{\finalnamedelim}{%
%\ifnumgreater{\value{liststop}}{2}{\finalandcomma}{}% there really should be no funny Oxford comma business here
\addspace\&\space}
\wp{29/19}
\jel{C10,C14,C22}
\RequirePackage[absolute,overlay]{textpos}
\setlength{\TPHorizModule}{1cm}
\setlength{\TPVertModule}{1cm}
\def\placefig#1#2#3#4{\begin{textblock}{.1}(#1,#2)\rlap{\includegraphics[#3]{#4}}\end{textblock}}
\author{Mahsa~Ashouri, Rob J~Hyndman, Galit~Shmueli}
\addresses{\textbf{Mahsa Ashouri}\newline
Institute of Service Science, National Tsing Hua University, Taiwan
\newline{Email: \href{mailto:[email protected]}{\nolinkurl{[email protected]}}}\newline Corresponding author\\[1cm]
\textbf{Rob J Hyndman}\newline
Monash University, Clayton VIC 3800, Australia
\newline{Email: \href{mailto:[email protected]}{\nolinkurl{[email protected]}}}\\[1cm]
\textbf{Galit Shmueli}\newline
Institute of Service Science, National Tsing Hua University, Taiwan
\newline{Email: \href{mailto:[email protected]}{\nolinkurl{[email protected]}}}\\[1cm]
}
\date{\sf\Date~\Month~\Year}
\makeatletter
\lfoot{\sf Ashouri, Hyndman, Shmueli: \@date}
\makeatother
%% Any special functions or other packages can be loaded here.
\usepackage{booktabs}
\usepackage{float}
\usepackage{longtable}
\usepackage{cases}
\usepackage{array}
\usepackage{todonotes}
%\usepackage[backend=biber]{biblatex}
%\usepackage[backend=biber, bibencoding=utf8, style=authoryear, citestyle=authoryear]{biblatex}
\mathtoolsset{showonlyrefs=true}
\allowdisplaybreaks
\def\addlinespace{}
\usepackage[section]{placeins}
\usepackage{float}
\let\origfigure\figure
\let\endorigfigure\endfigure
\renewenvironment{figure}[1][2] {
\expandafter\origfigure\expandafter[!htbp]
} {
\endorigfigure
}
\let\origtable\table
\let\endorigtable\endtable
\renewenvironment{table}[1][2] {
\expandafter\origtable\expandafter[!htbp]
} {
\endorigtable
}
\usepackage{booktabs}
\usepackage{longtable}
\usepackage{array}
\usepackage{multirow}
\usepackage{wrapfig}
\usepackage{float}
\usepackage{colortbl}
\usepackage{pdflscape}
\usepackage{tabu}
\usepackage{threeparttable}
\usepackage{threeparttablex}
\usepackage[normalem]{ulem}
\usepackage{makecell}
\usepackage{xcolor}
%DIF PREAMBLE EXTENSION ADDED BY LATEXDIFF
%DIF UNDERLINE PREAMBLE %DIF PREAMBLE
\RequirePackage[normalem]{ulem} %DIF PREAMBLE
\RequirePackage{color}\definecolor{RED}{rgb}{1,0,0}\definecolor{BLUE}{rgb}{0,0,1} %DIF PREAMBLE
\providecommand{\DIFaddtex}[1]{{\protect\color{blue}\uwave{#1}}} %DIF PREAMBLE
\providecommand{\DIFdeltex}[1]{{\protect\color{red}\sout{#1}}} %DIF PREAMBLE
%DIF SAFE PREAMBLE %DIF PREAMBLE
\providecommand{\DIFaddbegin}{} %DIF PREAMBLE
\providecommand{\DIFaddend}{} %DIF PREAMBLE
\providecommand{\DIFdelbegin}{} %DIF PREAMBLE
\providecommand{\DIFdelend}{} %DIF PREAMBLE
\providecommand{\DIFmodbegin}{} %DIF PREAMBLE
\providecommand{\DIFmodend}{} %DIF PREAMBLE
%DIF FLOATSAFE PREAMBLE %DIF PREAMBLE
\providecommand{\DIFaddFL}[1]{\DIFadd{#1}} %DIF PREAMBLE
\providecommand{\DIFdelFL}[1]{\DIFdel{#1}} %DIF PREAMBLE
\providecommand{\DIFaddbeginFL}{} %DIF PREAMBLE
\providecommand{\DIFaddendFL}{} %DIF PREAMBLE
\providecommand{\DIFdelbeginFL}{} %DIF PREAMBLE
\providecommand{\DIFdelendFL}{} %DIF PREAMBLE
%DIF HYPERREF PREAMBLE %DIF PREAMBLE
\providecommand{\DIFadd}[1]{\texorpdfstring{\DIFaddtex{#1}}{#1}} %DIF PREAMBLE
\providecommand{\DIFdel}[1]{\texorpdfstring{\DIFdeltex{#1}}{}} %DIF PREAMBLE
\newcommand{\DIFscaledelfig}{0.5}
%DIF HIGHLIGHTGRAPHICS PREAMBLE %DIF PREAMBLE
\RequirePackage{settobox} %DIF PREAMBLE
\RequirePackage{letltxmacro} %DIF PREAMBLE
\newsavebox{\DIFdelgraphicsbox} %DIF PREAMBLE
\newlength{\DIFdelgraphicswidth} %DIF PREAMBLE
\newlength{\DIFdelgraphicsheight} %DIF PREAMBLE
% store original definition of \includegraphics %DIF PREAMBLE
\LetLtxMacro{\DIFOincludegraphics}{\includegraphics} %DIF PREAMBLE
\newcommand{\DIFaddincludegraphics}[2][]{{\color{blue}\fbox{\DIFOincludegraphics[#1]{#2}}}} %DIF PREAMBLE
\newcommand{\DIFdelincludegraphics}[2][]{% %DIF PREAMBLE
\sbox{\DIFdelgraphicsbox}{\DIFOincludegraphics[#1]{#2}}% %DIF PREAMBLE
\settoboxwidth{\DIFdelgraphicswidth}{\DIFdelgraphicsbox} %DIF PREAMBLE
\settoboxtotalheight{\DIFdelgraphicsheight}{\DIFdelgraphicsbox} %DIF PREAMBLE
\scalebox{\DIFscaledelfig}{% %DIF PREAMBLE
\parbox[b]{\DIFdelgraphicswidth}{\usebox{\DIFdelgraphicsbox}\\[-\baselineskip] \rule{\DIFdelgraphicswidth}{0em}}\llap{\resizebox{\DIFdelgraphicswidth}{\DIFdelgraphicsheight}{% %DIF PREAMBLE
\setlength{\unitlength}{\DIFdelgraphicswidth}% %DIF PREAMBLE
\begin{picture}(1,1)% %DIF PREAMBLE
\thicklines\linethickness{2pt} %DIF PREAMBLE
{\color[rgb]{1,0,0}\put(0,0){\framebox(1,1){}}}% %DIF PREAMBLE
{\color[rgb]{1,0,0}\put(0,0){\line( 1,1){1}}}% %DIF PREAMBLE
{\color[rgb]{1,0,0}\put(0,1){\line(1,-1){1}}}% %DIF PREAMBLE
\end{picture}% %DIF PREAMBLE
}\hspace*{3pt}}} %DIF PREAMBLE
} %DIF PREAMBLE
\LetLtxMacro{\DIFOaddbegin}{\DIFaddbegin} %DIF PREAMBLE
\LetLtxMacro{\DIFOaddend}{\DIFaddend} %DIF PREAMBLE
\LetLtxMacro{\DIFOdelbegin}{\DIFdelbegin} %DIF PREAMBLE
\LetLtxMacro{\DIFOdelend}{\DIFdelend} %DIF PREAMBLE
\DeclareRobustCommand{\DIFaddbegin}{\DIFOaddbegin \let\includegraphics\DIFaddincludegraphics} %DIF PREAMBLE
\DeclareRobustCommand{\DIFaddend}{\DIFOaddend \let\includegraphics\DIFOincludegraphics} %DIF PREAMBLE
\DeclareRobustCommand{\DIFdelbegin}{\DIFOdelbegin \let\includegraphics\DIFdelincludegraphics} %DIF PREAMBLE
\DeclareRobustCommand{\DIFdelend}{\DIFOaddend \let\includegraphics\DIFOincludegraphics} %DIF PREAMBLE
\LetLtxMacro{\DIFOaddbeginFL}{\DIFaddbeginFL} %DIF PREAMBLE
\LetLtxMacro{\DIFOaddendFL}{\DIFaddendFL} %DIF PREAMBLE
\LetLtxMacro{\DIFOdelbeginFL}{\DIFdelbeginFL} %DIF PREAMBLE
\LetLtxMacro{\DIFOdelendFL}{\DIFdelendFL} %DIF PREAMBLE
\DeclareRobustCommand{\DIFaddbeginFL}{\DIFOaddbeginFL \let\includegraphics\DIFaddincludegraphics} %DIF PREAMBLE
\DeclareRobustCommand{\DIFaddendFL}{\DIFOaddendFL \let\includegraphics\DIFOincludegraphics} %DIF PREAMBLE
\DeclareRobustCommand{\DIFdelbeginFL}{\DIFOdelbeginFL \let\includegraphics\DIFdelincludegraphics} %DIF PREAMBLE
\DeclareRobustCommand{\DIFdelendFL}{\DIFOaddendFL \let\includegraphics\DIFOincludegraphics} %DIF PREAMBLE
%DIF LISTINGS PREAMBLE %DIF PREAMBLE
\RequirePackage{listings} %DIF PREAMBLE
\RequirePackage{color} %DIF PREAMBLE
\lstdefinelanguage{DIFcode}{ %DIF PREAMBLE
%DIF DIFCODE_UNDERLINE %DIF PREAMBLE
moredelim=[il][\color{red}\sout]{\%DIF\ <\ }, %DIF PREAMBLE
moredelim=[il][\color{blue}\uwave]{\%DIF\ >\ } %DIF PREAMBLE
} %DIF PREAMBLE
\lstdefinestyle{DIFverbatimstyle}{ %DIF PREAMBLE
language=DIFcode, %DIF PREAMBLE
basicstyle=\ttfamily, %DIF PREAMBLE
columns=fullflexible, %DIF PREAMBLE
keepspaces=true %DIF PREAMBLE
} %DIF PREAMBLE
\lstnewenvironment{DIFverbatim}{\lstset{style=DIFverbatimstyle}}{} %DIF PREAMBLE
\lstnewenvironment{DIFverbatim*}{\lstset{style=DIFverbatimstyle,showspaces=true}}{} %DIF PREAMBLE
%DIF END PREAMBLE EXTENSION ADDED BY LATEXDIFF
\begin{document}
\maketitle
\begin{abstract}
Forecasting hierarchical or grouped time series usually involves two steps: computing base forecasts and reconciling the forecasts. Base forecasts can be computed by popular time series forecasting methods such as Exponential Smoothing (ETS) and Autoregressive Integrated Moving Average (ARIMA) models. The reconciliation step is a linear process that adjusts the base forecasts to ensure they are coherent. However, using ETS or ARIMA for base forecasts can be computationally challenging when there are a large number of series to forecast, as each model must be numerically optimized for each series. We propose a linear model that avoids this computational problem and handles the forecasting and reconciliation in a single step. The proposed method is highly flexible, accommodating external data, missing values and model selection. We illustrate our approach using two datasets: monthly Australian domestic tourism and daily Wikipedia pageviews. We compare our approach to reconciliation using ETS and ARIMA, and show that our approach is much faster while providing similar levels of forecast accuracy.
\end{abstract}
\begin{keywords}
hierarchical forecasting, grouped forecasting, reconciling forecast, linear regression
\end{keywords}
\hypertarget{introduction}{%
\section{Introduction}\label{introduction}}
Modern data collection tools have dramatically increased the amount of available time series data \DIFaddbegin \autocite{januschowski2013forecasting}\DIFaddend . For example, the \DIFdelbegin \DIFdel{Internet of Things }\DIFdelend \DIFaddbegin \DIFadd{internet of things }\DIFaddend and point-of-sale scanning produce huge volumes of time series in a short period of time. Naturally, there is an interest in forecasting these time series, yet forecasting large collections of time series is computationally challenging.
\hypertarget{hierarchical-and-grouped-time-series}{%
\subsection{Hierarchical and grouped time series}\label{hierarchical-and-grouped-time-series}}
In many cases, these time series can be structured and disaggregated based on hierarchies or groups such as geographic location, product type, gender, etc. An example of a hierarchical time series is sales in restaurant chains, which can be disaggregated into different \DIFdelbegin \DIFdel{stores and then different types of food or drinks}\DIFdelend \DIFaddbegin \DIFadd{states and then into different stores}\DIFaddend . Figure \ref{fig:hierarchicalexample} shows a schematic of such a hierarchical time series structure with three levels. The top level is the total series, formed by aggregating all the bottom level series. In the middle level, series are aggregations of their own child series; for instance, series A is the aggregation of AW and AX. Finally, the bottom level contains the most disaggregated series.
\begin{figure}
{\centering \includegraphics[width=200px,height=170px,trim=0 0 190 0,clip=true]{Paper-Figures/hierarchical_example}
}
\caption{An example of a two-level hierarchical structure.}\label{fig:hierarchicalexample}
\end{figure}
Grouped time series involve more complicated aggregation structures compared to strictly hierarchical time series. To take the simplest example, suppose we have two grouping factors which are not nested: sex (Male/Female) and city (New York/San Francisco). The disaggregated series for each combination of sex and city can be combined to form city sub-totals, or sex sub-totals. These sub-totals can be combined to give the overall total. Both sub-totals are of interest.
We can think of such structures as hierarchical time series without a unique hierarchy. A schematic of this grouped time series structure is shown in Figure \ref{fig:groupexample} with two grouping factors, each of two levels (A/B and C/D). The series in this structure can be split first into groups A and B and then subdivided further into C and D (left side), or split first into C and D and then subdivided into A and B (right side). The final disaggregation is identical in both cases, but the middle level aggregates are different.
\begin{figure}
{\centering \includegraphics[width=330px,height=180px]{lhf_files/figure-latex/groupexample-1}
}
\caption{An example of a two-level grouped structure.}\label{fig:groupexample}
\end{figure}
We use the same notation \autocite[following][]{fpp2} for both hierarchical and grouped time series. We denote the total series at time \(t\) by \(y_t\), and the series at node \(Z\) (subaggregation level \(Z\)) and time \(t\) by \(y_{Z,t}\). For describing the relationships between series, we use an \(N\times M\) matrix, called the ``summing matrix'', denoted by \(\bm{S}\), in which \(N\) is the overall number of nodes and \(M\) is the number of bottom level nodes. For example, in Figure \ref{fig:hierarchicalexample}, \(N = 7\) and \(M = 4\), while in Figure \ref{fig:groupexample}, \(N=9\) and \(M=4\). Then we can write \(\bm{y}_t=\bm{S}\bm{b}_t\), where \(\bm{y}_t\) is a vector containing all nodes at time \(t\) and \(\bm{b}_t\) is the vector of all the bottom level nodes at time \(t\). For the example shown in Figure \ref{fig:groupexample}, the equation can be written as follows:
\begin{equation}\label{eq:Smatrixexample}
\begin{pmatrix}
y_{t}\\y_{A,t}\\y_{B,t}\\y_{C,t}\\y_{D,t}\\y_{AC,t}\\y_{AD,t}\\y_{BC,t}\\y_{BD,t}
\end{pmatrix} =
\begin{pmatrix}
1&1&1&1\\1&1&0&0\\0&0&1&1\\1&0&1&0\\0&1&0&1\\1&0&0&0\\0&1&0&0\\0&0&1&0\\0&0&0&1\\
\end{pmatrix}
\begin{pmatrix}
y_{AC,t}\\y_{AD,t}\\y_{BC,t}\\y_{BD,t}\\
\end{pmatrix}.
\end{equation}
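The aggregation constraint \(\bm{y}_t=\bm{S}\bm{b}_t\) can be checked numerically. The following minimal sketch (Python with NumPy, illustrative values only, not part of the paper's own code) builds \(\bm{S}\) for the grouped structure of Figure \ref{fig:groupexample} and verifies that the bottom level series aggregate correctly:

```python
import numpy as np

# Summing matrix S for the grouped structure (N = 9 nodes, M = 4 bottom series).
# Rows: Total, A, B, C, D, AC, AD, BC, BD; columns: AC, AD, BC, BD.
S = np.array([
    [1, 1, 1, 1],  # Total
    [1, 1, 0, 0],  # A = AC + AD
    [0, 0, 1, 1],  # B = BC + BD
    [1, 0, 1, 0],  # C = AC + BC
    [0, 1, 0, 1],  # D = AD + BD
    [1, 0, 0, 0],  # AC
    [0, 1, 0, 0],  # AD
    [0, 0, 1, 0],  # BC
    [0, 0, 0, 1],  # BD
])

# Illustrative bottom-level observations b_t; all other nodes follow as y_t = S b_t.
b_t = np.array([10.0, 20.0, 30.0, 40.0])
y_t = S @ b_t
print(y_t[0])  # total series value: 100.0
```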
\hypertarget{forecasting-hierarchical-time-series}{%
\subsection{Forecasting hierarchical time series}\label{forecasting-hierarchical-time-series}}
If we just forecast each series individually, we are ignoring the hierarchical or grouping structure, and the forecasts will not be ``coherent''\DIFdelbegin \DIFdel{(}\DIFdelend \DIFaddbegin \DIFadd{. That is, }\DIFaddend they will not add up \DIFdelbegin \DIFdel{appropriately)}\DIFdelend \DIFaddbegin \DIFadd{in a way that is consistent with the aggregation structure of the time series collection }\autocite{fpp2}\DIFaddend .
There are several available methods that consider the hierarchical structure information when forecasting time series. These include the top-down \autocite{gross1990disaggregation,fliedner2001hierarchical}, bottom-up \autocite{kahn1998revisiting}, middle-out and optimal combination \autocite{hyndman2011optimal} approaches. In the top-down approach, we first forecast the total series and then disaggregate the forecast to form lower level series forecasts based on a set of historical and forecasted proportions \autocite[for details see][]{athanasopoulos2009hierarchical}. In the bottom-up approach, the forecasts in each level of the hierarchy can be computed by aggregating the bottom level series forecasts. However, we may not get good upper-level forecasts because the most disaggregated series can be noisy and so their forecasts are often inaccurate. In the middle-out approach, the process can be started from one of the middle levels and other forecasts can be computed using aggregation for upper levels and disaggregation for lower levels. Finally, optimal combination uses all the \DIFdelbegin \DIFdel{\(n\) }\DIFdelend \DIFaddbegin \DIFadd{\(N\) }\DIFaddend forecasts for all of the series in the entire structure, and then uses an optimization process to reconcile the resulting forecasts. The advantage of the optimal combination method, compared with the other methods, is that it considers all information in the hierarchy, including any correlations among the series.
In the optimal combination method, reconciled forecasts can be computed using the following equation known as weighted least squares (WLS) \autocite{mint2018}
\begin{equation}\label{eq:mint}
\tilde{\bm{y}}_{h}=\bm{S}(\bm{S}'\bm{W}_h^{-1}\bm{S})^{-1}\bm{S}'\bm{W}_h^{-1}\hat{\bm{y}}_h,
\end{equation}
where \(\hat{\bm{y}}_h\) represents a vector of \(h\)-step-ahead base forecasts for all levels of the hierarchy, and \(\bm{W}_h\) is the covariance matrix of forecast errors for the \(h\)-step-ahead base forecasts.
Several possible simple methods for estimating \(\bm{W}_h\) are available. \textcite{mint2018} discuss a simple approximation whereby \(\bm{W}_h = k_h \bm{\Lambda}\) with \(k_h\) being a positive constant, \(\bm{\Lambda} = \text{diag}(\bm{S}\bm{1})\), and \(\bm{1}\) being a column of 1s. Note that \(\bm{\Lambda}\) simply contains the row sums of the summing matrix \(\bm{S}\), and that \(k_h\) will cancel out in \eqref{eq:mint}. Thus
\begin{equation}\label{eq:mint2}
\tilde{\bm{y}}_{h}=\bm{S}(\bm{S}'\bm{\Lambda}^{-1}\bm{S})^{-1}\bm{S}'\bm{\Lambda}^{-1}\hat{\bm{y}}_h.
\end{equation}
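A minimal sketch of this reconciliation step, using a hypothetical two-node hierarchy (Total \(=\) A \(+\) B) and NumPy rather than the authors' own implementation, shows how incoherent base forecasts become coherent after applying \eqref{eq:mint2}:

```python
import numpy as np

def reconcile(S, y_hat):
    # y_tilde = S (S' L^{-1} S)^{-1} S' L^{-1} y_hat, with L = diag(S 1),
    # i.e. the weights are the row sums of the summing matrix S.
    lam_inv = np.diag(1.0 / S.sum(axis=1))
    G = np.linalg.solve(S.T @ lam_inv @ S, S.T @ lam_inv)
    return S @ (G @ y_hat)

# Smallest possible hierarchy: Total = A + B.
S = np.array([[1, 1], [1, 0], [0, 1]])
y_hat = np.array([100.0, 45.0, 50.0])   # incoherent base forecasts: 45 + 50 != 100
y_tilde = reconcile(S, y_hat)
print(np.isclose(y_tilde[0], y_tilde[1] + y_tilde[2]))  # True: now coherent
```

Already-coherent forecasts pass through unchanged, since \(\bm{S}\bm{G}\bm{S}=\bm{S}\).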
The most computationally challenging part of the optimal combination method is to produce all the base forecasts that make up \(\hat{\bm{y}}_h\). In many applications, there may be thousands or even millions of individual series, and each of them must be forecast independently. The most popular time series forecasting methods such as ETS and ARIMA models \autocite{fpp2} involve non-linear optimization routines to estimate the parameters via maximum likelihood estimation. Usually, multiple models are fitted for each series, and the best is \DIFdelbegin \DIFdel{select }\DIFdelend \DIFaddbegin \DIFadd{selected }\DIFaddend by minimizing Akaike's Information Criterion \autocite{akaike1998information}. This computational challenge increases with the number of lower level series as well as with the number of aggregations of interest.
We therefore propose a new approach for computing the base forecasts that is computationally fast while maintaining an acceptable level of forecast accuracy.
\hypertarget{proposed-approach-linear-model}{%
\section{\texorpdfstring{Proposed approach: Linear model \label{sec:proposedapproach1}}{Proposed approach: Linear model }}\label{proposed-approach-linear-model}}
Our proposed approach is based on using linear regression models for computing base forecasts. Suppose we have a linear model that we use for forecasting, and we wish to apply it to \(N\) different series which have some aggregation constraints. We have observations \(y_{t,i}\) from times \(t=1,\dots,T\) and series \(i=1,\dots,N\). Then
\begin{equation}\label{eq:basicequation}
y_{t,i} = \bm{\beta}_{i}' \bm{x}_{t,i} + \varepsilon_{t,i}
\end{equation}
where \(\bm{x}_{t,i}=\{1, x_{t,i,1},\dots,x_{t,i,p}\}\) is a \((p+1)\)-vector of regression variables. This equation for all the observations in matrix form can be written as follows:
\begin{equation}\label{eq:linearmodel}
\begin{pmatrix}
\bm{y}_1\\
\bm{y}_2\\
\bm{y}_3 \\
\vdots\\
\bm{y}_N
\end{pmatrix}=
\begin{pmatrix}
\bm{X}_1 & 0 & 0 & \dots & 0\\
0 & \bm{X}_2 & 0 & \dots & 0\\
0 & 0 & \bm{X}_3 & \ddots & \vdots \\
\vdots & \vdots & \ddots & \ddots & 0\\
0 & 0 & \dots & 0 & \bm{X}_N
\end{pmatrix}
\begin{pmatrix}
\bm{\beta}_1\\
\bm{\beta}_2\\
\bm{\beta}_3\\
\vdots\\
\bm{\beta}_N
\end{pmatrix}+
\begin{pmatrix}
\bm{\varepsilon}_1\\
\bm{\varepsilon}_2\\
\bm{\varepsilon}_3\\
\vdots \\
\bm{\varepsilon}_N
\end{pmatrix},
\end{equation}
where \(\bm{y}_i = \{y_{1,i}, y_{2,i}, \dots, y_{T,i}\}\) is a \(T\)-vector, \({\bm{\beta}}_i = \{\beta_{0,i}, \beta_{1,i}, \beta_{2,i}, \dots, \beta_{p,i}\}\) is a \((p+1)\)-vector, \({\bm{\varepsilon}}_i = \{\varepsilon_{1,i}, \varepsilon_{2,i}, \dots, \varepsilon_{T,i}\}\) is a \(T\)-vector and \(\bm{X}_i\) is the \(T\times (p+1)\)-matrix
\begin{equation}\label{eq:Xmatrixdefinition}
\bm{X}_i = \begin{pmatrix}
1 & x_{1,i,1} & x_{1,i,2} & \dots & x_{1,i,p}\\
1 & x_{2,i,1} & x_{2,i,2} & \dots & x_{2,i,p}\\
\vdots & \vdots & \vdots & & \vdots \\
1 & x_{T,i,1} & x_{T,i,2} & \dots & x_{T,i,p}
\end{pmatrix}.
\end{equation}
Equation \eqref{eq:linearmodel} can be written as \(\bm{Y} = \bm{X} \bm{B} + \bm{E}\), with parameter estimates given by \(\hat{\bm{B}} = (\bm{X}'\bm{X})^{-1} \bm{X}'\bm{Y}\). Then the base forecasts are obtained using
\begin{equation}\label{eq:baseforecasts}
\hat{\bm{y}}_{t+h} = \bm{X}_{t+h}^* \hat{\bm{B}},
\end{equation}
where \(\hat{\bm{y}}_{t+h}\) is an \(N\)-vector of forecasts, \(\hat{\bm{B}}\) comprises \(N\) stacked \((p+1)\)-vectors of estimated coefficients, and \(\bm{X}_{t+h}^*\) is the \(N\times N(p+1)\) matrix
\pagebreak[3]\begin{equation}
\bm{X}_{t+h}^* =
\begin{pmatrix}
\bm{x}_{t+h,1}' & 0 & 0 & \dots & 0\\
0 & \bm{x}_{t+h,2}' & 0 & \dots & 0\\
0 & 0 & \bm{x}_{t+h,3}' & \ddots & \vdots \\
\vdots & \vdots & \ddots & \ddots & 0\\
0 & 0 & \dots & 0 & \bm{x}_{t+h,N}'
\end{pmatrix}.
\end{equation}
Note that we use \(\bm{X}^*_{t}\) to distinguish this matrix, which combines \(\bm{x}_{t,i}\) across all series for one time, from \(\bm{X}_i\), which combines \(\bm{x}_{t,i}\) across all times for one series.
Finally, we can combine the two linear equations for computing base forecasts and reconciled forecasts (Equations \eqref{eq:mint2} and \eqref{eq:baseforecasts}) to obtain the reconciled forecasts with a single equation:
\begin{equation}\label{eq:singlestep}
\tilde{\bm{y}}_{t+h} = \bm{S}(\bm{S}'\bm{\Lambda}^{-1}\bm{S})^{-1}\bm{S}'\bm{\Lambda}^{-1}
(\bm{X}_{t+h}^* \hat{\bm{B}})
= \bm{S}(\bm{S}'\bm{\Lambda}^{-1}\bm{S})^{-1}\bm{S}'\bm{\Lambda}^{-1}
\bm{X}_{t+h}^* (\bm{X}'\bm{X})^{-1} \bm{X}'\bm{Y}.
\end{equation}
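To make the reconciliation step in Equation \eqref{eq:singlestep} concrete, the following minimal sketch applies \(\bm{S}(\bm{S}'\bm{\Lambda}\bm{S})^{-1}\bm{S}'\bm{\Lambda}\) to incoherent base forecasts for a toy two-level hierarchy; the hierarchy, the choice \(\bm{\Lambda}=\bm{I}\), and all numbers are illustrative assumptions, not values from the paper.

```python
import numpy as np

# Toy hierarchy: Total = A + B, so N = 3 series and 2 bottom-level series.
S = np.array([[1.0, 1.0],   # Total
              [1.0, 0.0],   # A
              [0.0, 1.0]])  # B

Lambda = np.eye(3)  # identity weights, i.e. OLS reconciliation

# Incoherent base forecasts for (Total, A, B): 10 != 4 + 5.
y_hat = np.array([10.0, 4.0, 5.0])

# Reconciliation: y_tilde = S (S' Lambda S)^{-1} S' Lambda y_hat
P = np.linalg.solve(S.T @ Lambda @ S, S.T @ Lambda)
y_tilde = S @ (P @ y_hat)

print(y_tilde)  # coherent: first element equals the sum of the other two
```

After reconciliation the top-level forecast exactly equals the sum of the bottom-level forecasts, which is the defining property of the projection.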
\hypertarget{simplified-formulation-for-a-fixed-set-of-predictors-bf-x}{%
\subsection{\texorpdfstring{Simplified formulation for a fixed set of predictors (\(\bf {X}\)) \label{sec:proposedapproach2}}{Simplified formulation for a fixed set of predictors (\textbackslash bf \{X\}) }}\label{simplified-formulation-for-a-fixed-set-of-predictors-bf-x}}
If we have the same set of predictor variables, \(\bm{X}\), for all the series, we can write Equations \eqref{eq:linearmodel} to \eqref{eq:singlestep} more compactly as a multivariate regression, and obtain the reconciled forecasts for all series in a single equation. In that case, Equation \eqref{eq:linearmodel} can be rearranged as follows:
\begin{equation}\label{eq:linearmodelsameX}
\begin{pmatrix}
y_{11} & \dots & y_{1N}\\
y_{21} & \dots & y_{2N}\\
\vdots & & \vdots\\
y_{T1} & \dots & y_{TN}
\end{pmatrix} =
\begin{pmatrix}
1 & X_{11} & \dots & X_{1p}\\
1 & X_{21} & \dots & X_{2p}\\
\vdots & \vdots & & \vdots\\
1 & X_{T1} & \dots & X_{Tp}
\end{pmatrix}
\begin{pmatrix}
\beta_{01} & \dots & \beta_{0N}\\
\beta_{11} & \dots & \beta_{1N}\\
\vdots & & \vdots\\
\beta_{p1} & \dots & \beta_{pN}
\end{pmatrix}
+
\begin{pmatrix}
\varepsilon_{11} & \dots & \varepsilon_{1N}\\
\varepsilon_{21} & \dots & \varepsilon_{2N}\\
\vdots & & \vdots\\
\varepsilon_{T1} & \dots & \varepsilon_{TN}
\end{pmatrix},
\end{equation}
where \(\bm{Y}\), \(\bm{X}\), \(\bm{B}\) and \(\bm{E}\) are now matrices of size \(T\times N\), \(T\times (p+1)\), \((p+1)\times N\) and \(T \times N\), respectively. Equations \eqref{eq:baseforecasts} to \eqref{eq:singlestep} can be written accordingly using Equation \eqref{eq:linearmodelsameX} and here \(\bm{X}^*_{t+h,i} = \bm{X}^*_{t+h}\), where \(\bm{X}^*_{t+h}\) is an \(h\times (p+1)\) matrix.
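As a numerical illustration of this simplified formulation, with a common design matrix \(\bm{X}\) a single least-squares solve yields the entire \((p+1)\times N\) coefficient matrix \(\bm{B}\), and it matches \(N\) per-series fits. The sizes below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
T, p, N = 60, 3, 5  # illustrative sizes, not from the paper

# Common design matrix: intercept column plus p predictors.
X = np.column_stack([np.ones(T), rng.standard_normal((T, p))])  # T x (p+1)
B_true = rng.standard_normal((p + 1, N))
Y = X @ B_true + 0.1 * rng.standard_normal((T, N))              # T x N

# One least-squares solve recovers all N coefficient vectors at once.
B_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)                   # (p+1) x N

# It agrees with fitting each series separately.
B_sep = np.column_stack([np.linalg.lstsq(X, Y[:, i], rcond=None)[0]
                         for i in range(N)])
print(np.allclose(B_hat, B_sep))  # True
```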
\hypertarget{ols-predictors}{%
\subsection{OLS predictors}\label{ols-predictors}}
As an example of the \(\bm{X}_t\) matrix in Equation \eqref{eq:linearmodel}, we can refer to the set of predictors proposed in \textcite{ashouri2018} for modeling trend, seasonality and autocorrelation by using lagged values (\(y_{t-1}\), \(y_{t-2}\), \dots), trend variables and seasonal dummy variables:
\begin{equation}\label{eq:linearmodelexample}
y_t = \alpha_0 + \alpha_1 t + \beta_1 s_{1,t} + \cdots + \beta_{m-1} s_{m-1,t} + \gamma_1 y_{t-1} + \cdots + \gamma_p y_{t-p} + \delta z_t + \varepsilon_t.
\end{equation}
Here, \(s_{j,t}\) is a dummy variable taking value 1 if time \(t\) is in season \(j\) (\(j=1, 2, \dots, m\)), \(y_{t-k}\) is the \(k\)th lagged value for \(y_t\) and \(z_t\) is some external information at time \(t\). The seasonal period \(m\) depends on the problem; for instance, if we have daily data with day-of-week seasonality, then \(m=7\).
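To make Equation \eqref{eq:linearmodelexample} concrete, here is a minimal sketch of how such a predictor matrix could be assembled for daily data with day-of-week seasonality (\(m=7\)) and \(p=2\) lags. The function and variable names are illustrative, not from the paper's implementation, and the external term \(z_t\) is omitted for brevity.

```python
import numpy as np

def design_matrix(y, m=7, n_lags=2):
    """Build rows of [intercept, trend, m-1 seasonal dummies, lags]."""
    T = len(y)
    rows = []
    for t in range(n_lags, T):  # skip the first n_lags periods (no lags yet)
        trend = [1.0, float(t)]                                       # alpha_0, alpha_1 t
        season = [1.0 if t % m == j else 0.0 for j in range(m - 1)]   # m-1 dummies
        lags = [y[t - k] for k in range(1, n_lags + 1)]               # y_{t-1}, ..., y_{t-p}
        rows.append(trend + season + lags)
    return np.array(rows), y[n_lags:]

# Illustrative weekly-seasonal series with a mild trend.
y = np.sin(2 * np.pi * np.arange(100) / 7) + 0.01 * np.arange(100)
X, y_resp = design_matrix(y)
print(X.shape)  # (98, 10): 2 trend columns + 6 dummies + 2 lags
```

The response vector is shortened to match the rows for which all lags are observed.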
Because we use lags and external series as predictors in Equation \eqref{eq:linearmodelexample}, we do not have the same set of predictors for all the series, \(y_t\). However, if we use only trend and seasonal dummy variables as predictors, then the simpler formulation of Equation \eqref{eq:linearmodelsameX} applies and the model can be written as a multivariate regression.
When there are many options for choosing predictors, such as many seasonal dummy variables, lags, or high-order trend terms, we can apply a model selection approach such as Akaike's Information Criterion (AIC) or leave-one-out cross-validation (LOOCV) to select the best set of predictors in terms of prediction accuracy. In practice, LOOCV can be computationally heavy except in the special case of linear models \autocite{christensenplane}, so using linear models provides a viable solution. Also, when the number of seasons \(m\) is large (e.g., in hourly data), Fourier terms can result in fewer predictors than dummy variables. The number of Fourier terms can also be determined using the same AIC or LOOCV approach \autocite{fpp2}.
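The special case that makes LOOCV cheap for linear models is the closed-form identity \(e_{(t)} = e_t/(1-h_{tt})\), where \(h_{tt}\) is the \(t\)th diagonal entry of the hat matrix \(\bm{H} = \bm{X}(\bm{X}'\bm{X})^{-1}\bm{X}'\). The sketch below verifies the shortcut against a naive leave-one-out loop; data and sizes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
T, p = 40, 3  # illustrative sizes
X = np.column_stack([np.ones(T), rng.standard_normal((T, p))])
y = X @ rng.standard_normal(p + 1) + rng.standard_normal(T)

# Fast LOOCV: scale full-sample residuals by (1 - h_tt).
H = X @ np.linalg.solve(X.T @ X, X.T)       # hat matrix
e = y - H @ y                               # full-sample residuals
loocv_fast = np.mean((e / (1 - np.diag(H))) ** 2)

# Naive LOOCV: refit T times, each time leaving one observation out.
errs = []
for t in range(T):
    mask = np.arange(T) != t
    beta = np.linalg.lstsq(X[mask], y[mask], rcond=None)[0]
    errs.append(y[t] - X[t] @ beta)
loocv_slow = np.mean(np.square(errs))

print(np.isclose(loocv_fast, loocv_slow))  # True
```

The identity means LOOCV costs a single fit rather than \(T\) fits, which is why it is practical as a selection criterion here.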
While OLS is popular in practice for forecasting time series, it is often frowned upon because of its independence assumption. This can cause problems for parametric inference but is less of an issue for forecasting, where OLS often performs sufficiently well, as evidenced by its widespread use in practice. Further, the autoregressive terms in the above model should capture most of the autocorrelation in the data.
\hypertarget{computational-considerations}{%
\subsection{\texorpdfstring{Computational considerations \label{sec:computationalconsiderations}}{Computational considerations }}\label{computational-considerations}}
There are two ways of computing the above forecasts. First, we could create the matrices \(\bm{Y}\), \(\bm{X}\) and \(\bm{E}\), and then use the above equations directly (taking advantage of sparse matrix routines) to obtain the forecasts. Alternatively, we could fit separate regression models to compute the coefficients for each linear model individually. Although the matrix \(\bm{X}'\bm{X}\) that we need to invert is sparse and block diagonal, it is still faster to use the second approach involving separate regression models.
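The equivalence behind this choice can be checked numerically: because the stacked system is block diagonal, solving it in one go gives exactly the same coefficients as running the \(N\) separate regressions. A toy sketch with assumed sizes; `block_diag` is a small helper written for the illustration, not a reference to the paper's code.

```python
import numpy as np

def block_diag(mats):
    """Stack matrices along the diagonal (tiny illustrative helper)."""
    r = sum(m.shape[0] for m in mats)
    c = sum(m.shape[1] for m in mats)
    out = np.zeros((r, c))
    i = j = 0
    for m in mats:
        out[i:i + m.shape[0], j:j + m.shape[1]] = m
        i += m.shape[0]
        j += m.shape[1]
    return out

rng = np.random.default_rng(2)
T, p, N = 30, 2, 3  # illustrative sizes

Xs = [np.column_stack([np.ones(T), rng.standard_normal((T, p))]) for _ in range(N)]
ys = [X_i @ rng.standard_normal(p + 1) + 0.1 * rng.standard_normal(T) for X_i in Xs]

# One stacked solve of the block-diagonal system...
beta_big = np.linalg.lstsq(block_diag(Xs), np.concatenate(ys), rcond=None)[0]

# ...equals N separate small regressions (which are faster in practice).
beta_sep = np.concatenate([np.linalg.lstsq(X_i, y_i, rcond=None)[0]
                           for X_i, y_i in zip(Xs, ys)])
print(np.allclose(beta_big, beta_sep))  # True
```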
\hypertarget{prediction-intervals}{%
\subsection{Prediction intervals}\label{prediction-intervals}}
To obtain prediction intervals, we need to compute the variance of the reconciled forecasts as follows \autocite{mint2018}:
\begin{equation}\label{eq:variance}
\text{Var}(\tilde{\bm{y}}_{t+h})
= \bm{S}\bm{P}{\bm{\Sigma}_{t+h}} \bm{P}'\bm{S}',
\end{equation}
where \(\bm{P} = (\bm{S}'\bm{\Lambda}\bm{S})^{-1}\bm{S}'\bm{\Lambda}\) and \({\bm{\Sigma}_{t+h}}\) denotes the variance of the base forecasts given by the usual linear model formula \autocite{fpp2}
\[
\bm{\Sigma}_{t+h} = \sigma^2\left[1 + \bm{X}_{t+h}^*(\bm{X}'\bm{X})^{-1}(\bm{X}_{t+h}^*)'\right],
\]
where \(\sigma^2\) is the variance of the base model residuals. Assuming normally distributed errors, we can easily obtain any required prediction intervals corresponding to elements of \(\tilde{\bm{y}}_{t+h}\) using the diagonals of \eqref{eq:variance}.
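A minimal sketch of Equation \eqref{eq:variance} for a toy two-level hierarchy with Total = A + B, assuming \(\bm{\Lambda}=\bm{I}\), an illustrative diagonal \(\bm{\Sigma}_{t+h}\), and normal errors for the 95\% intervals; all numbers are assumptions for illustration.

```python
import numpy as np

# Toy hierarchy: Total = A + B.
S = np.array([[1.0, 1.0],
              [1.0, 0.0],
              [0.0, 1.0]])
Lambda = np.eye(3)
P = np.linalg.solve(S.T @ Lambda @ S, S.T @ Lambda)

Sigma = np.diag([4.0, 1.0, 1.0])         # assumed Var of base forecasts
V = S @ P @ Sigma @ P.T @ S.T            # Var of reconciled forecasts

y_tilde = np.array([9.0, 4.0, 5.0])      # assumed reconciled point forecasts
half_width = 1.96 * np.sqrt(np.diag(V))  # 95% half-widths under normal errors
lower, upper = y_tilde - half_width, y_tilde + half_width
print(np.all(upper > lower))  # True
```

Only the diagonal of \(\bm{S}\bm{P}\bm{\Sigma}_{t+h}\bm{P}'\bm{S}'\) is needed for marginal intervals, which keeps the computation light even for large hierarchies.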
\hypertarget{applications}{%
\section{Applications}\label{applications}}
In this section we illustrate our approach using two real data sets and one simulated example\todo{Check R version. Currently 4.02}\footnote{All methods were run on a Linux server with Intel Xeon Silver 4108 (1.80GHz / 8-Cores / 11MB Cache)*2 and 8GB DDR4 2666 DIMM ECC Registered Memory. R version 1.2.5019.}. The real data study includes forecasting monthly Australian domestic tourism and forecasting daily Wikipedia pageviews. In the simulation studies, we simulate series based on the monthly Australian domestic tourism data and systematically modify the forecasting horizon, noise level, hierarchy levels, and number of series. We compare the forecasting accuracy of ETS, ARIMA\footnote{For running ETS and ARIMA, we applied the `ets' and `auto.arima' functions from the `forecast' package \autocite{Rforecast}. The two sets of functions were run independently and not immediately one after the other.} and the proposed linear OLS forecasting model, with and without the reconciliation step. In these applications, we used the weighted reconciliation approach from Equation \eqref{eq:mint2}. For comparing these methods, we use the average of Root Mean Square Errors (RMSEs) across all series and also display box plots of the forecast errors along with the raw forecast errors. To aid visibility, we suppress plotting the outliers.
The two real datasets differ in terms of structure, size and behavior. The tourism data contains 304 series with both hierarchical and grouped structure, while the Wikipedia pageviews dataset contains 913 series with grouped structure. The tourism dataset has strong seasonality, while the Wikipedia data are noisier.
We apply two methods for generating forecasts that align with two different practical forecasting scenarios. The first approach is \emph{rolling origin} forecasting, where we generate one-step-ahead forecasts (\(\tilde{\bm{y}}_{t+1}\) where \(t\) changes); this mimics the scenario where data are refreshed every time period. In the second, \emph{fixed origin}, method, forecasts are generated at a fixed time \(t\) for \(h\) steps ahead: \(\tilde{\bm{y}}_{t+1}, \tilde{\bm{y}}_{t+2},\dots, \tilde{\bm{y}}_{t+h}\) (we replace lagged values of \(y\) by their forecasts if they occur at periods after the forecast origin).
\hypertarget{australian-domestic-tourism}{%
\subsection{Australian domestic tourism}\label{australian-domestic-tourism}}
This dataset has 19 years of monthly visitor nights in Australia by Australian tourists, a measure used as an indicator of tourism activity \autocite{mint2018}. The data were collected by computer-assisted telephone interviews with 120,000 Australians aged 15 and over \autocite{researchAustralia2005}. The dataset includes 304 time series each of length 228 observations. The hierarchy and grouping structure for this dataset is made using geographic and purpose of travel information.
\begingroup\fontsize{9}{11}\selectfont
\begin{longtable}[t]{rllrll}
\caption{\label{tab:Australiageographicaldivision}Australia geographic hierarchical structure.}\\
\toprule
Series & Name & Label & Series & Name & Label\\
\midrule
Total & & & Region & & \\
1 & Australia & Total & 55 & Lakes & BCA\\
State & & & 56 & Gippsland & BCB\\
2 & NSW & A & 57 & Phillip Island & BCC\\
3 & VIC & B & 58 & General Murray & BDA\\
4 & QLD & C & 59 & Goulburn & BDB\\
5 & SA & D & 60 & High Country & BDC\\
6 & WA & E & 61 & Melbourne East & BDD\\
7 & TAS & F & 62 & Upper Yarra & BDE\\
8 & NT & G & 63 & Murray East & BDF\\
Zone & & & 64 & Wimmera+Mallee & BEA\\
9 & Metro NSW & AA & 65 & Western Grampians & BEB\\
10 & Nth Coast NSW & AB & 66 & Bendigo Loddon & BEC\\
11 & Sth Coast NSW & AC & 67 & Macedon & BED\\
12 & Sth NSW & AD & 68 & Spa Country & BEE\\
13 & Nth NSW & AE & 69 & Ballarat & BEF\\
14 & ACT & AF & 70 & Central Highlands & BEG\\
15 & Metro VIC & BA & 71 & Gold Coast & CAA\\
16 & West Coast VIC & BB & 72 & Brisbane & CAB\\
17 & East Coast VIC & BC & 73 & Sunshine Coast & CAC\\
18 & Nth East VIC & BD & 74 & Central Queensland & CBA\\
19 & Nth West VIC & BE & 75 & Bundaberg & CBB\\
20 & Metro QLD & CA & 76 & Fraser Coast & CBC\\
21 & Central Coast QLD & CB & 77 & Mackay & CBD\\
22 & Nth Coast QLD & CC & 78 & Whitsundays & CCA\\
23 & Inland QLD & CD & 79 & Northern & CCB\\
24 & Metro SA & DA & 80 & Tropical North Queensland & CCC\\
25 & Sth Coast SA & DB & 81 & Darling Downs & CDA\\
26 & Inland SA & DC & 82 & Outback & CDB\\
27 & West Coast SA & DD & 83 & Adelaide & DAA\\
28 & West Coast WA & EA & 84 & Barossa & DAB\\
29 & Nth WA & EB & 85 & Adelaide Hills & DAC\\
30 & Sth WA & EC & 86 & Limestone Coast & DBA\\
31 & Sth TAS & FA & 87 & Fleurieu Peninsula & DBB\\
32 & Nth East TAS & FB & 88 & Kangaroo Island & DBC\\
33 & Nth West TAS & FC & 89 & Murraylands & DCA\\
34 & Nth Coast NT & GA & 90 & Riverland & DCB\\
35 & Central NT & GB & 91 & Clare Valley & DCC\\
Region & & & 92 & Flinders Range and Outback & DCD\\
36 & Sydney & AAA & 93 & Eyre Peninsula & DDA\\
37 & Central Coast & AAB & 94 & Yorke Peninsula & DDB\\
38 & Hunter & ABA & 95 & Australia's Coral Coast & EAA\\
39 & North Coast NSW & ABB & 96 & Experience Perth & EAB\\
40 & Northern Rivers Tropical NSW & ABC & 97 & Australia's SouthWest & EAC\\
41 & South Coast & ACA & 98 & Australia's North West & EBA\\
42 & Snowy Mountains & ADA & 99 & Australia's Golden Outback & ECA\\
43 & Capital Country & ADB & 100 & Hobart and the South & FAA\\
44 & The Murray & ADC & 101 & East Coast & FBA\\
45 & Riverina & ADD & 102 & Launceston, Tamar and the North & FBB\\
46 & Central NSW & AEA & 103 & North West & FCA\\
47 & New England North West & AEB & 104 & Wilderness West & FCB\\
48 & Outback NSW & AEC & 105 & Darwin & GAA\\
49 & Blue Mountains & AED & 106 & Kakadu Arnhem & GAB\\
50 & Canberra & AFA & 107 & Katherine Daly & GAC\\
51 & Melbourne & BAA & 108 & Barkly & GBA\\
52 & Peninsula & BAB & 109 & Lasseter & GBB\\
53 & Geelong & BAC & 110 & Alice Springs & GBC\\
54 & Western & BBA & 111 & MacDonnell & GBD\\
\bottomrule
\end{longtable}
\endgroup{}
\begin{figure}
{\centering \includegraphics[width=450px,height=150px]{Paper-Figures/Australian_hierarchy_structure}
}
\caption{Australian geographic hierarchical structure.}\label{fig:Australiahierarchystructure}
\end{figure}
\begin{figure}
{\centering \includegraphics[width=450px,height=360px]{Paper-Figures/ausTurRegions}
}
\caption{Australia tourism region map - colors represent states.}\label{fig:Australiahierarchystructuremap}
\end{figure}
In this dataset we have three levels of geographic division in Australia. At the first level, Australia is divided into seven ``States'': New South Wales (NSW), Victoria (VIC), Queensland (QLD), South Australia (SA), Western Australia (WA), Tasmania (TAS) and Northern Territory (NT). At the second and third levels, it is divided into 27 ``Zones'' and 76 ``Regions'' (for details of Australia's geographic divisions see Figure \ref{fig:Australiahierarchystructure} and Table \ref{tab:Australiageographicaldivision}, and also Figure \ref{fig:Australiahierarchystructuremap}, which shows the map of Australia divided by tourism region and colored by state\footnote{\url{www.tra.gov.au/tra/2016/Tourism_Region_Profiles/Region_profiles/index.html}}).
We have four purposes of travel: Holiday (Hol), Visiting friends and relatives (Vis), Business (Bus) and Other (Oth). So there are \(76\times4 = 304\) series at the most disaggregate level. Based on the geographic hierarchy and purpose grouping, we end up with 8 aggregation levels with 555 series in total as shown in Table \ref{tab:Australiageographicalpurposedivision}.
\begin{table}[!h]
\caption{\label{tab:Australiageographicalpurposedivision}Number of Australian domestic tourism series at each aggregation level.}
\centering
\begin{tabular}[t]{lr}
\toprule
Division & Series\\
\midrule
Australia & 1\\
State & 7\\
Zone & 27\\
Region & 76\\
Purpose & 4\\
State x Purpose & 28\\
Zone x Purpose & 108\\
Region x Purpose & 304\\
\hline
Total & 555\\
\bottomrule
\end{tabular}
\end{table}
We report the forecast results for all these aggregation levels, as well as the average RMSE across all the levels of the hierarchy. For both the rolling and fixed origin models, we include a linear trend, 11 seasonal dummy variables, and lags 1 and 12 in the OLS predictor matrix; this is intended to capture the monthly seasonality. In addition, before running the model, we partition the data into training and test sets, with the last 24 months (2 years) as the test set and the rest as the training set.
\begin{table}[!h]
\caption{\label{tab:Tourismdataresulrolling}Mean(RMSE) on 2 year test set for ETS, ARIMA and OLS with and without reconciliation - Rolling origin - Tourism dataset.}
\centering
\begin{tabular}[t]{lrrrrrr}
\toprule
\multicolumn{1}{c}{} & \multicolumn{3}{c}{Unreconciled} & \multicolumn{3}{c}{Reconciled} \\
\cmidrule(l{3pt}r{3pt}){2-4} \cmidrule(l{3pt}r{3pt}){5-7}
Level & ETS & ARIMA & OLS & ETS & ARIMA & OLS\\
\midrule
Total & 1516 & 1445 & 2191 & 1517 & 1517 & 2194\\
State & 511 & 493 & 594 & 500 & 500 & 561\\
Zone & 215 & 219 & 234 & 210 & 210 & 219\\
Region & 123 & 125 & 126 & 119 & 119 & 121\\
Purpose & 676 & 709 & 781 & 674 & 674 & 786\\
State x Purpose & 213 & 220 & 231 & 213 & 213 & 221\\
Zone x Purpose & 98 & 102 & 102 & 97 & 97 & 98\\
Region x Purpose & 56 & 58 & 57 & 56 & 56 & 56\\
\bottomrule
\end{tabular}
\end{table}
\begin{table}
\caption{\label{tab:TourismdataresultRMSE}Mean(RMSE) on 2 year test set for ETS, ARIMA and OLS with and without reconciliation - Fixed origin - Tourism dataset.}
\centering
\begin{tabular}[t]{lrrrrrr}
\toprule
\multicolumn{1}{c}{} & \multicolumn{3}{c}{Unreconciled} & \multicolumn{3}{c}{Reconciled} \\
\cmidrule(l{3pt}r{3pt}){2-4} \cmidrule(l{3pt}r{3pt}){5-7}
Level & ETS & ARIMA & OLS & ETS & ARIMA & OLS\\
\midrule
Total & 2239 & 3554 & 3873 & 2233 & 3460 & 3877\\
State & 594 & 570 & 789 & 556 & 659 & 777\\
Zone & 240 & 230 & 273 & 235 & 250 & 265\\
Region & 133 & 129 & 142 & 128 & 132 & 139\\
Purpose & 767 & 824 & 1172 & 802 & 1019 & 1169\\
State x Purpose & 227 & 241 & 277 & 225 & 246 & 269\\
Zone x Purpose & 103 & 105 & 110 & 102 & 106 & 108\\
Region x Purpose & 59 & 59 & 62 & 59 & 59 & 61\\
\bottomrule
\end{tabular}
\end{table}
\begin{figure}
{\centering \includegraphics[width=1\linewidth]{lhf_files/figure-latex/boxplotrollingtourism-1}
}
\caption{Box plots of rolling origin forecast errors from reconciled and unreconciled ETS, ARIMA and OLS methods at each hierarchical level for tourism demand.}\label{fig:boxplotrollingtourism}
\end{figure}
\begin{figure}
{\centering \includegraphics[width=1\linewidth]{lhf_files/figure-latex/boxplottourism-1}
}
\caption{Box plots of fixed origin forecast errors for reconciled and unreconciled ETS, ARIMA and OLS methods at each hierarchical level for tourism demand.}\label{fig:boxplottourism}
\end{figure}
In Figures \ref{fig:boxplotrollingtourism} and \ref{fig:boxplottourism} we display the error box plots for both reconciled and unreconciled forecasts using all three methods, for the rolling origin and fixed origin forecasts. In these figures we see the error distributions across all the models.
Together with Tables \ref{tab:Tourismdataresulrolling} and \ref{tab:TourismdataresultRMSE}, the results show that our proposed OLS forecasting model produces forecast accuracy similar to ETS and ARIMA, which are computationally heavy for many time series (see Table \ref{tab:Tourismdatacomputationtime}). We also see the usefulness of reconciliation in decreasing the average RMSE for all three methods. Except for the total series, reconciliation improves forecasts at all levels of the hierarchy. Also, because the higher level series have higher counts, the errors are larger in magnitude (\protect\hyperlink{appendixA}{Appendix A} shows the box plots with scaled errors\footnote{Scaled errors are computed by subtracting the mean and dividing by the standard deviation.}, to better compare errors across all the hierarchy levels). In addition, we see that, as expected, with rolling origin 1-step-ahead forecasts the error distributions are more tightly concentrated around zero than with the fixed origin multi-step-ahead forecasts.
Figures \ref{fig:forecstrolling24tourismtotal} and \ref{fig:forecstrolling24tourism} show the rolling and fixed origin forecast results for the total series and one of the bottom level series, BACBus (Geelong - Business). These plots show both reconciled (dashed lines) and unreconciled (dotted lines) forecasts, and we see that the reconciliation step improves the forecasts for this series. We also see that the OLS model forecast accuracy is similar to that of the other two methods.
\begin{figure}
{\centering \includegraphics[width=1\linewidth]{lhf_files/figure-latex/forecstrolling24tourismtotal-1}
}
\caption{The actual test set for the 'Total series' compared to the forecasts from reconciled and unreconciled ETS, ARIMA and OLS methods for rolling and fixed origin tourism demand.}\label{fig:forecstrolling24tourismtotal}
\end{figure}
\begin{figure}
{\centering \includegraphics[width=1\linewidth]{lhf_files/figure-latex/forecstrolling24tourism-1}
}
\caption{The actual test set for the 'BACBus' bottom level series compared to the forecasts from reconciled and unreconciled ETS, ARIMA and OLS methods for rolling and fixed origin tourism demand.}\label{fig:forecstrolling24tourism}
\end{figure}
Figures \ref{fig:predinttotal} and \ref{fig:predintBACBus} display the prediction intervals for the OLS approach, with and without reconciliation, for the total series and one of the bottom level series, BACBus (Geelong - Business).\todo{Why don't the intervals increase with the horizon?}
\begin{figure}
{\centering \includegraphics[width=1\linewidth]{lhf_files/figure-latex/predinttotal-1}
}
\caption{The actual test set for the 'Total series' compared to the forecasts from reconciled and unreconciled OLS methods with prediction intervals for fixed origin tourism demand.}\label{fig:predinttotal}
\end{figure}
\begin{figure}
{\centering \includegraphics[width=1\linewidth]{lhf_files/figure-latex/predintBACBus-1}
}
\caption{The actual test set for the 'BACBus' bottom level series compared to the forecasts from reconciled and unreconciled OLS methods with prediction intervals for fixed origin tourism demand.}\label{fig:predintBACBus}
\end{figure}
Table \ref{tab:Tourismdatacomputationtime} compares the computation time of the three methods for rolling and fixed origin forecasting. The OLS forecasting model is much faster than the other methods. Also, since reconciliation is a linear process, it is very fast for all methods and does not significantly affect computation time.
Since we are using a linear model, we can easily include exogenous variables, which can often help to improve forecast accuracy. In this application, we tried including an ``Easter'' dummy variable indicating the timing of Easter, but its effect on forecast accuracy was minimal, so it was omitted from the model reported here.
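As a sketch of how such a dummy enters a linear model, the following simulates monthly data with a holiday effect and estimates it by least squares. Everything here is hypothetical: the data, the toy ``Easter'' timing, and the regressor set are assumptions for illustration, not the paper's specification.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 120                                   # ten years of monthly data
month = np.arange(n) % 12
# Toy "Easter" indicator: April (month index 3) of even-numbered years.
easter = ((month == 3) & (np.arange(n) // 12 % 2 == 0)).astype(float)

# Design matrix: intercept, trend, 11 monthly dummies (January dropped),
# and the Easter dummy as an extra exogenous column.
X = np.column_stack([
    np.ones(n),
    np.arange(n),
    *(np.eye(12)[month].T[1:]),
    easter,
])
# Simulated demand: trend, a December peak, and a true Easter effect of 5.
y = 100 + 0.5 * np.arange(n) + 10 * (month == 11) + 5 * easter \
    + rng.normal(0, 1, n)

beta, *_ = np.linalg.lstsq(X, y, rcond=None)
# beta[-1] estimates the Easter effect (true value 5 in this simulation).
```

Adding the extra column leaves the model linear, so fitting cost is essentially unchanged; whether the dummy helps is then an empirical question, as in the application above.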
Finally, Table \ref{tab:Tourismdatacomputationtimeappendix} shows that, as discussed in Section \ref{sec:computationalconsiderations}, computation is faster using separate regression models than using the matrix approach (even with sparse matrix algebra).
\begin{table}
\caption{\label{tab:Tourismdatacomputationtime}Computation time (seconds) for ETS, ARIMA and OLS with and without reconciliation - Rolling and fixed origin forecasts on a 24 month test set - Tourism dataset}
\centering
\begin{tabular}[t]{>{\raggedright\arraybackslash}p{3cm}>{\raggedleft\arraybackslash}p{3cm}>{\raggedleft\arraybackslash}p{3cm}rr}
\toprule
\multicolumn{1}{c}{} & \multicolumn{2}{c}{Rolling origin} & \multicolumn{2}{c}{Fixed origin} \\
\cmidrule(l{3pt}r{3pt}){2-3} \cmidrule(l{3pt}r{3pt}){4-5}
& Unreconciled & Reconciled & Unreconciled & Reconciled\\
\midrule
ETS & 10924.6 & 10924.6 & 407.1 & 407.1\\
ARIMA & 31146.4 & 31146.5 & 1116.2 & 1116.2\\
OLS & 48.4 & 48.3 & 17.4 & 17.8\\
\bottomrule
\end{tabular}
\end{table}
\begin{table}
\caption{\label{tab:Tourismdatacomputationtimeappendix}Computation time (seconds) for OLS using the matrix approach and separate regression models, with and without reconciliation, on a rolling and fixed origin for 24 steps ahead.}
\centering
\begin{tabular}[t]{>{\raggedright\arraybackslash}p{3cm}>{\raggedleft\arraybackslash}p{3cm}>{\raggedleft\arraybackslash}p{3cm}rr}
\toprule
\multicolumn{1}{c}{} & \multicolumn{2}{c}{Rolling origin} & \multicolumn{2}{c}{Fixed origin} \\
\cmidrule(l{3pt}r{3pt}){2-3} \cmidrule(l{3pt}r{3pt}){4-5}
& Unreconciled & Reconciled & Unreconciled & Reconciled\\
\midrule
Matrix approach & 202.1 & 209.8 & 87.7 & 105.7\\
Separate models & 48.4 & 48.3 & 16.7 & 16.9\\
\bottomrule
\end{tabular}
\end{table}
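The two strategies in Table \ref{tab:Tourismdatacomputationtimeappendix} can be sketched as follows. The sizes and data are toy assumptions; the point is that one regression per series and a single stacked block-diagonal system give identical coefficients, so the choice is purely computational.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, p = 20, 100, 5                      # series, observations, predictors
X = [rng.normal(size=(n, p)) for _ in range(m)]
y = [rng.normal(size=n) for _ in range(m)]

# Separate models: m small least-squares problems.
beta_sep = np.concatenate(
    [np.linalg.lstsq(Xi, yi, rcond=None)[0] for Xi, yi in zip(X, y)])

# Matrix approach: one large block-diagonal design, solved in one go.
X_block = np.zeros((m * n, m * p))
for i, Xi in enumerate(X):
    X_block[i * n:(i + 1) * n, i * p:(i + 1) * p] = Xi
beta_mat = np.linalg.lstsq(X_block, np.concatenate(y), rcond=None)[0]

# Identical estimates; the block system just costs more to factorise
# unless its sparsity is fully exploited.
assert np.allclose(beta_sep, beta_mat)
```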
\DIFaddbegin \hypertarget{australian-domestic-tourism-simulation-study}{%
\subsection{Australian domestic tourism simulation study}\label{australian-domestic-tourism-simulation-study}}
\DIFadd{We provide results from two simulation studies based on the Australian domestic tourism dataset, to evaluate the sensitivity of our results to several factors. In the first study, we simulate bottom-level series similar to the real bottom-level series of the tourism data, with the same number of series and the same length. We then generate forecasts for four forecast horizons (12, 24, 36 and 48 months) with four different noise levels (standard deviation = 0.01, 0.1, 0.5 and 1)}\footnote{\DIFadd{Since the levels of the series differ, we first scale each simulated series (subtracting the mean and dividing by the standard deviation), add the white noise, and then rescale the series.}}\DIFadd{.
}
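The scale--perturb--rescale step described in the footnote can be sketched as below. The function name and the example series are assumptions for illustration, not the paper's simulation code.

```python
import numpy as np

def add_scaled_noise(series, noise_sd, rng):
    """Standardise a series, add Gaussian noise of the given standard
    deviation, then restore the original location and scale."""
    mu, sigma = series.mean(), series.std()
    scaled = (series - mu) / sigma                  # standardise
    noisy = scaled + rng.normal(0.0, noise_sd, series.shape)
    return noisy * sigma + mu                       # rescale to original units

rng = np.random.default_rng(1)
# Toy monthly series with a seasonal pattern, 20 years long.
series = 100 + 10 * np.sin(np.arange(240) * 2 * np.pi / 12)
noisy = add_scaled_noise(series, noise_sd=0.1, rng=rng)
```

Working on the standardised scale makes a given `noise_sd` comparable across series whose levels differ, which is why the scaling is done before the noise is added.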
\DIFadd{Tables \ref{tab:TourismdatasimrollingnoiseFH} and \ref{tab:TourismdatasimfixnoiseFH} display the average RMSE for 12 to 48 month-ahead forecasts with different noise levels. Results are shown for the base and the reconciled forecasts, for both rolling and fixed origin approaches. As expected, increasing the forecast horizon and/or the noise level increases the average RMSE for all three methods. The proposed OLS approach gives results similar to those of ETS and ARIMA. Note that for both rolling and fixed origin forecasts, the OLS approach uses the same set of predictors as in the Australian domestic tourism example.
}
\begin{table}[!h]
\caption{\label{tab:TourismdatasimrollingnoiseFH}\DIFaddFL{Mean RMSE on one to four year test set with different error levels for ETS, ARIMA and OLS with and without reconciliation - Rolling origin - 304 bottom level series and 8 levels of hierarchy - Simulated tourism dataset}}
\centering
\begin{tabular}[t]{lrrrrl}
\toprule
\DIFaddFL{Reconciliation }& \DIFaddFL{Error }& \DIFaddFL{Forecast horizon }& \DIFaddFL{ETS }& \DIFaddFL{ARIMA }& \DIFaddFL{OLS}\\
\midrule
\DIFaddFL{rec }& \DIFaddFL{0.01 }& \DIFaddFL{12 }& \DIFaddFL{142.0 }& \DIFaddFL{139.5 }& \DIFaddFL{140.0}\\
\DIFaddFL{rec }& \DIFaddFL{0.01 }& \DIFaddFL{24 }& \DIFaddFL{167.3 }& \DIFaddFL{164.4 }& \DIFaddFL{162.4}\\
\DIFaddFL{rec }& \DIFaddFL{0.01 }& \DIFaddFL{36 }& \DIFaddFL{136.4 }& \DIFaddFL{136.1 }& \DIFaddFL{136.9}\\
\DIFaddFL{rec }& \DIFaddFL{0.01 }& \DIFaddFL{48 }& \DIFaddFL{127.9 }& \DIFaddFL{127.3 }& \DIFaddFL{130.8}\\
\DIFaddFL{rec }& \DIFaddFL{0.10 }& \DIFaddFL{12 }& \DIFaddFL{142.8 }& \DIFaddFL{142.5 }& \DIFaddFL{142.0}\\
\DIFaddFL{rec }& \DIFaddFL{0.10 }& \DIFaddFL{24 }& \DIFaddFL{146.7 }& \DIFaddFL{148.6 }& \DIFaddFL{146.5}\\
\DIFaddFL{rec }& \DIFaddFL{0.10 }& \DIFaddFL{36 }& \DIFaddFL{138.8 }& \DIFaddFL{138.8 }& \DIFaddFL{139.0}\\
\DIFaddFL{rec }& \DIFaddFL{0.10 }& \DIFaddFL{48 }& \DIFaddFL{129.7 }& \DIFaddFL{130.1 }& \DIFaddFL{132.1}\\
\DIFaddFL{rec }& \DIFaddFL{0.50 }& \DIFaddFL{12 }& \DIFaddFL{172.3 }& \DIFaddFL{171.5 }& \DIFaddFL{171.7}\\
\DIFaddFL{rec }& \DIFaddFL{0.50 }& \DIFaddFL{24 }& \DIFaddFL{178.4 }& \DIFaddFL{173.9 }& \DIFaddFL{175.9}\\