-
Notifications
You must be signed in to change notification settings - Fork 5
/
Copy pathPRE.tex
343 lines (243 loc) · 13.2 KB
/
PRE.tex
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
\newpage
\section{Partial Redundancy Elimination}
Partial redundancy elimination (PRE) is a global optimization introduced by Morel and Renvoise\cite{morel1979global}. It
combines and extends two other techniques: common subexpression elimination and loop-invariant code motion.
The main difference between Partial Redundancy Elimination (PRE) and Common Subexpression Elimination (CSE)
is that PRE aims to eliminate redundancy in a limited set of execution paths,
whereas CSE aims to eliminate redundancy across all execution paths.
An expression is partially redundant at point p if it
is redundant along some, but not all, paths that reach
p. PRE converts partially-redundant expressions into
redundant expressions. The basic idea is simple. First,
it uses data-flow analysis to discover where expressions
are partially redundant. Next, it solves a data-flow
problem that shows where inserting copies of a computation would convert a partial redundancy into a full
redundancy. Finally, it inserts the appropriate code and
deletes the redundant copy of the expression.
\textbf{A key feature of PRE is that it never lengthens an
execution path.} To see this more clearly, consider the
example shown in Figure \ref{fig:p82}. In the fragment on the left, the second
computation of \texttt{x + y} is partially redundant; it is only
available along one path from the if. Inserting an evaluation of
\texttt{x + y} on the other path makes the computation
redundant and allows it to be eliminated, as shown in
the right-hand fragment. Note that the left path stays
the same length while the right path has been shortened.
\begin{figure}[H]
\centering
\includegraphics[width=0.5\textwidth]{p82.png}
\caption{}
\label{fig:p82}
\end{figure}
Loop-invariant expressions are also partially redundant,
as shown in Figure \ref{fig:p83}. On the left, \texttt{x + y} is
partially redundant since it is available from one predecessor
(along the back edge of the loop), but not the
other. Inserting an evaluation of \texttt{x + y} before the loop
allows it to be eliminated from the loop body.
\begin{figure}[H]
\centering
\includegraphics[width=0.5\textwidth]{p83.png}
\caption{}
\label{fig:p83}
\end{figure}
\subsection{Finding Partially Available Expressions}
For every expression, we can do a dataflow analysis.
\begin{center}
\begin{tabular}{|c|c|}
\hline Direction & Forward \\
\hline Meet operator & \( \cup \) \\
\hline Lattice & \( \{ 0,1 \} \) \\
\hline Top(T) & \( 0 \) \\
\hline Bottom & \( 1 \) \\
\hline Boundary condition for entry node & \( 0 \) \\
\hline Initialization for internal nodes & \(\mathrm{T}\) \\
\hline Finited escending chain? & \checkmark \\
\hline Transferfunction & \( PAVOUT[i] = (PAVIN[i] - KILL[i]) \cup AVLOC[i]\) \\
\hline Monotone\&Distributive? & \checkmark \\
\hline AVLOC & {Expression is {\color{blue}locally available (AVLOC)} if downwards exposed. } \\
\hline KILL & Expression is {\color{blue}killed ( KILL)} if any assignments to operands. \\
\hline
\end{tabular}
\end{center}
\begin{figure}[H]
\centering
\includegraphics[width=0.8\textwidth]{p84.pdf}
\caption{For \texttt{a+b}, the result of Partially Available Expressions's transfer function within a basic block.}
\label{fig:p84}
\end{figure}
\begin{figure}[H]
\centering
\includegraphics[width=0.8\textwidth]{p85.pdf}
\caption{For \texttt{a+b}, the result Partially Available Expressions of dataflow analysis.}
\label{fig:p85}
\end{figure}
\subsection{Finding Anticipated Expression}
For PRE, care must be taken that the hoisting would do no harm: Never introduce
a new expression along any path. Otherwise the
hoisting will lengthen at least one trace of the program, defying optimality; even
worse, if the hoisted instruction throws an exception, the program’s semantics
change.
\begin{definition}{Local Anticipability(ANTLOC)}
An expression may
be locally anticipated in a block i if there is at least one
computation of the expression in the block i, and if the
commands appearing in the block before the first computation
of the expression do not modify its operands.
\end{definition}
\begin{center}
\begin{tabular}{|c|c|}
\hline Direction & backward \\
\hline Meet operator & \( \cap \) \\
\hline Lattice & \( \{ 0,1 \} \) \\
\hline Top(T) & \( 1 \) \\
\hline Boundary condition for exit node & \( 0 \) \\
\hline Initialization for internal nodes & \(\mathrm{T}\) \\
\hline Finited escending chain? & \checkmark \\
\hline Transferfunction & \( ANTIN[i] = ANTLOC[i] \cup (ANTOUT[i] - KILL[i])\) \\
\hline Monotone\&Distributive? & \checkmark \\
\hline ANTLOC & Expression is locally {\color{blue}anticipated(ANTLOC))} is upward exposed. \\
\hline
\end{tabular}
\end{center}
\begin{figure}[H]
\centering
\includegraphics[width=0.8\textwidth]{p86.pdf}
\caption{For \texttt{a+b}, the result of Anticipated Expression's transfer function within a basic block.}
\label{fig:p86}
\end{figure}
\begin{figure}[H]
\centering
\includegraphics[width=0.8\textwidth]{p87.pdf}
\caption{For \texttt{a+b}, the result Anticipated Expression of dataflow analysis.}
\label{fig:p87}
\end{figure}
\subsection{Where Do we Want to Insert Computations?}
\begin{figure}[H]
\centering
\includegraphics[width=0.8\textwidth]{p88.pdf}
\caption{For \texttt{a+b}, \texttt{t2 = a + b} and \texttt{t3 = a + 4} are both partially redundant. But where can we insert the new computation \texttt{a + b} in order to optimally eliminate redundancy?
The best choice is BB1 because this wil make \texttt{a + b} fully redundant. But if we insert to BB2 and BB3, there are some path that calculate \texttt{a + b} more than once (e.g. Entry$\rightarrow$\texttt{t1= a+b} $\rightarrow$ BB2$\rightarrow$\texttt{t2= a+b}) }
\label{fig:p88}
\end{figure}
We define "\textbf{Placement Possible }"(PP) dataflow analysis. First of all, we are only going to place computations at the end of the blocks.
We want to insert the computation at the earliest place where our \textbf{Placement Possible }is true. If the \textbf{Placement Possible }is true output of a block(PPOUT), it means
it is fine to insert at the end of this block or earlier. If it is true athe the beginning of the block, it means we could insert it at the beginning of the block or earlier. Becuase we want to
insert it at the end of the block, so if \textbf{Placement Possible }is true at the beginning of the block(PPIN), you won't insert it in the block, it means you need to take care of something before.
So when PPIN is true, it really means is for every predecessor block, it is either possible to keep moving it back and make it fully redundant by placing in that block or earlier or it is not necessary because
it is already generatedin one of those blocks.
We insert if PPOUT is true and either PPIN is false so we cannot move it back any further or if we locally kill it then clearly we have to insert it here because trying to insert it earlier would not work. And we only want to insert it if it is not
already available.
We want to delete an expression where PPIN is true (somehow we make it fully redundant) and it is anticipated locally.
For safety reasons, if we want to place at output of a block, we want to place at entry of all successors. (PPOUT)
$$
\text { PPOUT[i] }=\left\{\begin{array}{cl}
0 & i=\text { entry } \\
\bigcap_{s \in \operatorname{succ}(i)} \operatorname{PPIN}[\mathrm{s}] & \text { otherwise }
\end{array}\right.
$$
When PPIN is true, if means
\begin{itemize}
\item we have a local computation to place, or a placement at the end of this block which we can move up
\item we want to move computation to output of all predecessors where expression is not already available (don’t insert at input) (for every predecessor,)
\item we gain something by moving it up (PAVIN heuristic) (not too far)
\end{itemize}
$$
\operatorname{PPIN}[\mathrm{i}]=\left\{\begin{array}{cc}
0 & i=\text { exit } \\
([\text { ANTLOC[i] } \cup \text { (PPOUT }[i]-\operatorname{KILL}[i])] & \\
\cap \bigcap_{p \in \text { preds }(i)}(\text { PPOUT }[p] \cup \text { AVOUT }[p]) & \text { otherwise } \\
\cap \text { PAVIN }[i]) &
\end{array}\right.
$$
\begin{figure}[H]
\centering
\includegraphics[width=0.8\textwidth]{p102.pdf}
\caption{Example for PRE for \texttt{a+b}}
\label{fig:p102}
\end{figure}
\begin{figure}[H]
\centering
\includegraphics[width=0.8\textwidth]{p103.pdf}
\caption{Example for PRE for \texttt{a+b}}
\label{fig:p103}
\end{figure}
\subsubsection{Safety}
It is safe to insert only if anticipated.
$$
\text { PPIN[i] } \subseteq(\text { PPOUT[i] - KILL[i]) U {\color{red}ANTLOC}[i] }
$$
$$
\text { PPOUT[i] }=\left\{\begin{array}{cl}
0 & i=\text { entry } \\
\bigcap_{s \in \operatorname{succ}(i)} \operatorname{PPIN}[\mathrm{s}] & \text { otherwise }
\end{array}\right.
$$
$$
{\color{red}\text{INSERT}} \subseteq \text{PPOUT} \subseteq {\color{red}\text{ANTOUT}}
$$
So it is safe.
\subsection{Perform}
On every path from an INSERT, there is a DELETE.
\subsection{Limiations}
\subsection{A new way to think about partial redundancy}
% We’ll start with an “enabling term.”
% \[
% \mathrm{CONST}[i] = \mathrm{ANTIN}[i] \cap ( \mathrm{PAVIN}[i] \cup ( \neg \mathrm{KILL}[i] \cap \neg \mathrm{ANTLOC[i]} ))
% \]
% This term say that we require the
% expression to be:
% \begin{itemize}
% \item Anticipated at the start of block i (somebody wants the expression) \\
% {\large \textbf{and}}
% \item {\large \textbf{(2a)}} The expression must be partially
% available (to perhaps transform
% into full availability)\\
% {\large \textbf{or}}
% \item {\large \textbf{(2b)}} The block neither kills nor
% computes the expression.
% \end{itemize}
% Next, we compute PPIn[i] and PPOut[i]
% .
% PP means “possible placement” of a computation at the start (PPIn[i]) or
% end (PPOut[i]) of a block. These values determine whether a
% computation of the expression would
% be “useful” at the start or end of a
% basic block.
% $$
% \text { PPOUT[i] }=\left\{\begin{array}{cl}
% 0 & i=\text { entry } \\
% \bigcap_{s \in \text { succ(i) }} \operatorname{PPIN}[\mathrm{s}] & \text { otherwise }
% \end{array}\right.
% $$
% We try to move computations "up" (nearer the start block). It makes sense to compute an
% expression at the end of a block if it
% makes sense to compute at the start
% of all the block’s successors.
% $$
% \operatorname{PPIN}[\mathrm{i}]=\left\{\begin{array}{cc}
% 0 & i=\text { exit } \\
% \text { ([ANTLOC[i] } \cup(\mathrm{PPOUT}[i]-\mathrm{KILL}[i])] & \\
% \bigcap_{p \in \text { preds(i) }}(\mathrm{PPOUT}[\mathrm{p}] \cup \mathrm{AVOUT}[\mathrm{p}]) & \text { otherwise } \\
% \cap \text { CONST[i]) }
% \end{array}\right.
% $$
% To determine if PPIni
% is true, we first
% check the enabling term. It makes sense
% to consider a computation of the
% expression at the start of block i if the
% expression is anticipated (wanted) and
% partially available or if the expression is
% anticipated (wanted) and it is neither
% computed nor killed in the block.
% We then check that the expression is
% anticipated locally or that it is
% unchanged within the block and possibly
% positioned at the end of the block.
% Finally, we check that all the block’s
% predecessors either have the expression
% available at their ends or are willing to
% position a computation at their end.
% Note also, the bi-directional nature of
% this equation.