Convert ddp.jl to Column-Major #109

Closed
wants to merge 13 commits into master from transddp

Conversation

@ZacCranko (Contributor) commented May 3, 2016

Summary

This is a placeholder to convert the "row-major" code in ddp.jl to a more Julia-friendly column-major layout. To that end, the specification for R is now of dimension states x actions, and Q is of dimension states_tomorrow x states x actions.

There are two places where we reap the benefits of the new data structures: the computation of the product Q * v, and solving for the state-wise max with s_wise_max!.
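
For concreteness, a minimal sketch of the new shapes (a hypothetical illustration only, assuming n states and m actions are already defined):

# Hypothetical shapes under the new column-major convention
R = Array{Float64}(n, m)     # rewards: states x actions
Q = Array{Float64}(n, n, m)  # transition probabilities: states_tomorrow x states x actions
v = Array{Float64}(n)        # value function over states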

Technical Details

There is a new ctranspose method defined for Array{T, 3} that transposes the old-format Q to the new one, permuting the indices according to

@inbounds for i in 1:m, j in 1:n, k in 1:o
    Q_new[k,i,j] = Q_old[i,j,k]
end

Transposes of a three-dimensional array are somewhat arbitrary to define, but the reason I chose this particular configuration for Q is that it is the most efficient for defining the multiplication Q * v: with the new-format Q, the elements of the resulting matrix Q * v are given by inner products of v with the (contiguous) "columns" of Q.

@inbounds for i in 1:s, j in 1:a
    R[j, i] = dot(v, Q[:, i, j])
end

Similarly the s_wise_max! code now computes the max over the columns of the computed matrix R instead of the rows.
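
For reference, the corresponding one-liner (this is the change that shows up in the diff further down; maximum(A, dim) is the reduction syntax used throughout this PR):

# New: reduce over the first (contiguous) dimension, one value per column.
s_wise_max(vals::AbstractMatrix) = vec(maximum(vals, 1))
# Old (master): reduce over the second dimension, one value per row.
# s_wise_max(vals::AbstractMatrix) = vec(maximum(vals, 2))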

Comparison

Solving the ddp0 model from the tests with VFI:
Branch master:

================ Benchmark Results ========================
     Time per evaluation: 17.92 μs [17.04 μs, 18.79 μs]
Proportion of time in GC: 1.07% [0.36%, 1.77%]
        Memory allocated: 8.31 kb
   Number of allocations: 80 allocations
       Number of samples: 5201
   Number of evaluations: 284701
         R² of OLS model: 0.743
 Time spent benchmarking: 10.47 s

Branch transddp:

================ Benchmark Results ========================
     Time per evaluation: 11.84 μs [11.32 μs, 12.36 μs]
Proportion of time in GC: 1.73% [0.90%, 2.56%]
        Memory allocated: 8.05 kb
   Number of allocations: 76 allocations
       Number of samples: 5801
   Number of evaluations: 504201
         R² of OLS model: 0.765
 Time spent benchmarking: 10.54 s

These results can be computed with

Pkg.checkout("QuantEcon", "transddp") # new code
# `Pkg.checkout("QuantEcon", "master")` # use this for the old code
using QuantEcon, Benchmarks
beta = 0.95
n, m = 2, 2
R = [5.0 10.0; -1.0 -Inf]
Q = Array{Float64}(n, m, n)
Q[:, :, 1] = [0.5 0.0; 0.0 0.0]
Q[:, :, 2] = [0.5 1.0; 1.0 1.0]
ddp0 = DiscreteDP(R, Q, beta)

v_init = [0.0, 1.0]
solve(ddp0, v_init, VFI)
@benchmark solve(ddp0, v_init, VFI)

Some issues to be addressed before merging:

  • Currently the old tests are being run with ctranspose transposing R and Q to the new format. This is convenient for troubleshooting, but once we're happy I'll hardcode the tests, and we should consider removing the ctranspose method for Array{T, 3} either as part of this PR or at some later stage.
  • I need to write some additional tests for the new input verification code in the DiscreteDP constructor (which checks that Q is of the correct orientation, to stop users accidentally using the old format).
  • Currently there is a test failing in random_mc.jl. I'll fix this once everyone is happy with the code in ddp.jl.

Update: I originally said, mistakenly, that the new-format Q matrix is states x states_tomorrow x actions; this has been fixed.

sglyon and others added 12 commits April 5, 2016 16:13
to search over columns then rows i.e. over a transposed vals matrix

transpose both s_wise_max fns

transpose R, tidy tests

transpose multiplication code properly

transpose RQ_sigma

tidy the tests up a bit

improve tests, tidy up

fix a commit mistake
@oyamad (Member) commented May 3, 2016

@ZacCranko Thanks for your efforts on this!

What about the SA (or AS) formulation? For this case Q should be of shape states_tomorrow x sa_pairs. Then the expected values are computed by v * Q, where v is the vector of values tomorrow; the result is a vector of length L, with L being the number of sa pairs, and each entry is a standard dot product of v with a column of Q.

For s_wise_max I think sa pairs should be ordered as (state_1, action_1), (state_1, action_2), (state_1, action_3), ....

(Actually I now suspect we don't need the two versions of internal data structures and in fact only the SA is enough (for both the Julia and Python versions).)
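
A minimal sketch of what that SA formulation could look like (the names Q_sa, v, EV and the sizes are purely illustrative, not the package's API):

# Q_sa is states_tomorrow x L; column l holds the transition distribution of the l-th (s, a) pair.
n, L = 3, 6
Q_sa = rand(n, L)
Q_sa = Q_sa ./ sum(Q_sa, 1)  # normalise each column to a distribution
v = rand(n)                  # values tomorrow
EV = vec(v' * Q_sa)          # length-L vector of expected values, one per (s, a) pair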

@oyamad (Member) commented May 3, 2016

Let me add some more words.

There are two issues: (1) How to store the data in the memory; (2) How to access them.
("Product formulation"/ "SA formulation" concerns (2) and I think it is of secondary importance.)

(1)
We have to decide what the most efficient ordering of the data is for the computation of *, s_wise_max, and the other operations. This is independent of whether it is Julia or Python/NumPy; the data, R and Q, are stored linearly either way.

My view is the following:

The values (rewards) of R should be ordered in memory as:
r(s1, a1), r(s1, a2), r(s1, a3), ...

The values (probabilities) of Q should be ordered in memory as:
q(s1, a1, t1), q(s1, a1, t2), ..., q(s1, a2, t1), q(s1, a2, t2), ..., q(s1, a3, t1), q(s1, a3, t2), ...
where t's are states tomorrow.

Then given the values tomorrow v, the expected values are computed as:
q(s1, a1, t1) * v(t1) + q(s1, a1, t2) * v(t2) + ...
for (s1, a1) and so on.

(2)
There are two ways to know what in the linear data corresponds to what. One is to define s_indices, a_indices, and a_indptr; the other is to give R and Q dimensions 2 and 3, respectively. Julia and Python differ in indexing in the latter case.

We should discuss the issue (1) first. Is there a unique most efficient way? Are there several equally efficient ways and should we decide based on the issue (2)?
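
As a minimal, language-level illustration of point (1) in Julia (column-major storage: the first index varies fastest in memory):

A = reshape(collect(1:8), (2, 2, 2))
vec(A)      # [1, 2, 3, 4, 5, 6, 7, 8] == A[1,1,1], A[2,1,1], A[1,2,1], A[2,2,1], A[1,1,2], ...
A[:, 1, 1]  # a contiguous slice: [1, 2]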

@ZacCranko (Contributor, Author) commented May 4, 2016

My view is the following:

The values (rewards) of R should be ordered in memory as:
r(s1, a1), r(s1, a2), r(s1, a3), ...

The values (probabilities) of Q should be ordered in memory as:
q(s1, a1, t1), q(s1, a1, t2), ..., q(s1, a2, t1), q(s1, a2, t2), ..., q(s1, a3, t1), q(s1, a3, t2), ...
where t's are states tomorrow.

Then given the values tomorrow v, the expected values are computed as:
q(s1, a1, t1) * v(t1) + q(s1, a1, t2) * v(t2) + ...
for (s1, a1) and so on.

If I understand correctly, I think the issue here is that with column-major ordering, an s x a x t array Q will internally be stored in memory linearly as

[ [Q(1,1,1), Q(2,1,1), ..., Q(s,1,1)], [Q(1,2,1), ..., Q(s,2,1)], ..., [Q(1,a,1), ..., Q(s,a,1)] ],
[ [Q(1,1,2), Q(2,1,2), ..., Q(s,1,2)], [Q(1,2,2), ..., Q(s,2,2)], ..., [Q(1,a,2), ..., Q(s,a,2)] ],
...,
[ [Q(1,1,t), Q(2,1,t), ..., Q(s,1,t)], [Q(1,2,t), ..., Q(s,2,t)], ..., [Q(1,a,t), ..., Q(s,a,t)] ]

where the [ ] merely delimit the groupings. This is why I permuted the indices of Q according to

@inbounds for i in 1:m, j in 1:n, k in 1:o
    Q_new[k,i,j] = Q_old[i,j,k]
end

Since the inner products happen over the 't' indices, we make these the "columns" in the matrix.
I.e., Q is now t x s x a. @oyamad If I have understood correctly, the result is the memory ordering you envisage.
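
As a cross-check (not part of the PR code), the permutation loop above is equivalent to a single permutedims call, assuming Q_old has dimensions m x n x o:

Q_old = rand(3, 4, 5)
Q_new = permutedims(Q_old, (3, 1, 2))  # Q_new[k, i, j] == Q_old[i, j, k]
@assert Q_new[2, 1, 3] == Q_old[1, 3, 2]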

@oyamad (Member) commented May 4, 2016

(I misread your Q as being of dimension "states x states_tomorrow x actions".)

If we want to align the probabilities in memory as
q(s1, a1, t1), q(s1, a1, t2), ..., q(s1, a2, t1), q(s1, a2, t2), ..., q(s1, a3, t1), q(s1, a3, t2), ...
(state tomorrow changes first, action changes second, and then state today changes) and if we want to access them with a 3-dimensional Q, then shouldn't Q be of shape "states_tomorrow x actions x states"? Maybe we disagree on the "best" memory layout?
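
For comparison, a quick sketch of the two candidate orientations (illustrative names and sizes only; Q_old is the old-format states x actions x states_tomorrow array):

Q_old = rand(3, 4, 5)                  # states x actions x states_tomorrow
Q_cur = permutedims(Q_old, (3, 1, 2))  # states_tomorrow x states x actions (this PR)
Q_alt = permutedims(Q_old, (3, 2, 1))  # states_tomorrow x actions x states (suggested here)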

@oyamad (Member) commented May 4, 2016

My design of the Python version may have misled us. I was ignorant about the distinction among:

  1. How to order the data (reward values and probability values)
    Common for Julia and Python/NumPy; the data are stored contiguously.
  2. How to access the data internally
    Here the issue is whether we need both DiscreteDP with (NQ, NR) = (3, 2) and DiscreteDP with (NQ, NR) = (2, 1).
  3. What constructors to provide to the user
    It is definitely useful to provide two constructors with (2-dim R, 3-dim Q) and with (1-dim R, 2-dim Q).

I didn't realize that there is no need for 2 to be the same as 3, and I paid no attention to 1 ex ante (but ex post it seems to be the efficient one).

We should discuss 1 first.

@ZacCranko (Contributor, Author)

and if we want to access them with a 3-dimensional Q, then shouldn't Q be of shape "states_tomorrow x actions x states"? Maybe we disagree on the "best" memory layout?

There's no difference performance-wise between picking an orientation for Q of states_tomorrow x actions x states versus (the current) states_tomorrow x states x actions. I can't think of a clear reason to prefer one over the other, but if you'd rather I implement it the other way then I am happy to do so.

@oyamad (Member) commented May 4, 2016

On my machine there is a difference:

julia> n, m = 10^3, 10^3;

julia> vals = randn(n, m);

julia> vals_T = transpose(vals);

julia> @time maximum(vals, 2); @time maximum(vals, 2); @time maximum(vals, 2);
  0.003414 seconds (14 allocations: 8.359 KB)
  0.002886 seconds (14 allocations: 8.359 KB)
  0.002876 seconds (14 allocations: 8.359 KB)

julia> @time maximum(vals_T, 1); @time maximum(vals_T, 1); @time maximum(vals_T, 1);
  0.001320 seconds (14 allocations: 8.359 KB)
  0.001179 seconds (14 allocations: 8.359 KB)
  0.001205 seconds (14 allocations: 8.359 KB)

The reason is that, at least on some machines, it takes less time for the computer to access contiguous data than non-contiguous data. Does this make sense?

@ZacCranko (Contributor, Author)

When I said before that "there was no difference", that was with regard to defining the three-dimensional multiplication Q * v. Of course there is a difference when computing the column-wise vs. row-wise maximum.

@ZacCranko (Contributor, Author)

Looking over the multiplication code, it may even be a tiny bit faster to use states_tomorrow x actions x states, but the difference is likely negligible.

@@ -571,7 +592,7 @@ s_wise_max!(ddp::DiscreteDP, vals::AbstractMatrix, out::Vector, out_argmax::Vect
 Return the `Vector` `max_a vals(s, a)`, where `vals` is represented as a
 `AbstractMatrix` of size `(num_states, num_actions)`.
 """
-s_wise_max(vals::AbstractMatrix) = vec(maximum(vals, 2))
+s_wise_max(vals::AbstractMatrix) = vec(maximum(vals, 1))
Member:

Why did you change this while keeping the shape (states, actions) fixed?

Contributor (Author):

Good question, I'm not sure. Should it be vec(maximum(vals, 2))?

Member:

If the sa pairs are in the order that the state changes first, so that the action is accessed by the second argument, then it should be vec(maximum(vals, 2)), and vice versa.
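
A small illustration of that convention (hypothetical numbers, with the state index changing fastest within the flat sa ordering):

num_states, num_actions = 3, 2
vals_flat = collect(1.0:6.0)                          # one value per (s, a) pair
vals = reshape(vals_flat, (num_states, num_actions))  # vals[s, a], column-major
s_max = vec(maximum(vals, 2))                         # max over actions for each state: [4.0, 5.0, 6.0]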

@oyamad (Member) commented May 4, 2016

There seems to be some confusion (see the comment above).

Try this:

beta = 0.95;
n, m = 2, 2;
R = [5.0 10.0; -1.0 -Inf];
Q = Array{Float64}(n, n, m);
Q[:, 1, 1] = [0.5, 0.5];
Q[:, 1, 2] = [0, 1];
Q[:, 2, 1] = [0, 1];
Q[:, 2, 2] = [0.5, 0.5];  # Arbitrary
ddp0 = DiscreteDP(R, Q, beta);

Then I get:

julia> bellman_operator(ddp0, [0, 0])
2-element Array{Float64,1}:
  5.0
 10.0

The outcome should be [10, -1].

@ZacCranko (Contributor, Author)

Easy fix. I'll make sure to add this to the test suite.
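
A minimal sketch of such a regression test (assuming Base.Test and the ddp0 constructed in the snippet above, once the orientation is fixed):

using Base.Test
# With v = [0, 0] the Bellman operator just picks the best one-period reward in each
# state: [max(5, 10), max(-1, -Inf)] == [10, -1].
@test isapprox(bellman_operator(ddp0, [0.0, 0.0]), [10.0, -1.0])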

@oyamad (Member) commented May 4, 2016

Do we agree now that there is a clear reason why one of the two orders is preferred to the other?

@ZacCranko (Contributor, Author)

I still don't know why one would be preferred to another. Please let me know how you'd like me to set it up and I'll be happy to implement it.

@oyamad (Member) commented May 4, 2016

  • state_tomorrow versus sa_pair
    state_tomorrow should change faster for *.
    (We are in agreement on this, right?)

  • state_today versus action
    action should change faster for s_wise_max.

    Do you agree that maximum(vals_transpose, 1) is faster than maximum(vals, 2)?

@ZacCranko (Contributor, Author) commented May 5, 2016

Of course I agree that computing the maximum is faster over the columns of an array. I just don't understand what you'd like me to change in the codebase. Do you want me to change the orientation of the Q matrix to states_tomorrow x actions x states? What other changes need to be made?

@sglyon (Member) commented Jul 6, 2016

Ping.

@ZacCranko and @oyamad, where did we leave off with this? Can we make some checkboxes (either in a separate comment or in the original description above) that provide a set of TODO items that would help us get this merged?

@oyamad (Member) commented Jul 6, 2016

I would like to discuss first what would be the best way to order the data in memory. If I understand correctly, it is language independent and so should be common between the Python and the Julia versions.

@sglyon (Member) commented Jul 7, 2016

@oyamad, by your most recent comment do you mean to say that you believe there is a single most efficient way the memory should be ordered for our particular algorithms with Markov chains? If so, then I think we might be able to have a fruitful discussion. Otherwise, could you please explain more about what you meant above?

@oyamad (Member) commented Jul 12, 2016

@spencerlyon2 Let's discuss in a separate issue #124.

@sglyon added the on hold label Jul 12, 2016
@ZacCranko closed this Oct 30, 2023