New Python iterator approach: vectorise without dtype #9
For later automation, it is easier to have all inputs vectorized.
Just because one of its parameters is an input and an output at the same time. And it also uses the complex eraASTROM structure.
…instead of empty() and copyto()
@mhvk Below is a verbatim copy of my timing script:
bm1_ = (<double *>(dataptrarray[3]))[0]
ppr_ = (<double *>(dataptrarray[4]))
eraAb(pnat_, v_, s_, bm1_, ppr_)
status = iternext(GetNpyIter(it))
I am not well versed in C, but would it be faster to assign GetNpyIter(it) to a variable before entering the loop, and then use that variable rather than calling this (inline) function multiple times?
In principle it should be basically the same, because GetNpyIter is just an accessor that changes the type.
But I just tried it now like this:
cdef char** dataptrarray = GetDataPtrArray(GetNpyIter(it))
cdef IterNextFunc iternext = GetIterNext(GetNpyIter(it), NULL)
cdef NpyIter* ititer = GetNpyIter(it)
cdef int status = 1
while status:
    pnat_ = (<double *>(dataptrarray[0]))
    v_ = (<double *>(dataptrarray[1]))
    s_ = (<double *>(dataptrarray[2]))[0]
    bm1_ = (<double *>(dataptrarray[3]))[0]
    ppr_ = (<double *>(dataptrarray[4]))
    eraAb(pnat_, v_, s_, bm1_, ppr_)
    status = iternext(ititer)
And I got a ~2% speedup on a 5000-element array. So quite small, but it does seem to help.
Unless there's some possibility the iter pointer changes during iteration?
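The hoisting idea discussed above can be sketched in plain Python. This is only an analogy, not the Cython wrapper from this PR: `IterWrapper` and `get_raw()` are hypothetical stand-ins for the `it` object and `GetNpyIter(it)`, and the sketch assumes the accessor returns the same underlying object on every call, which is exactly the condition raised in the question above.

```python
class IterWrapper:
    """Hypothetical stand-in for the Python-level iterator object `it`."""

    def __init__(self, values):
        self._values = list(values)

    def get_raw(self):
        # Plays the role of GetNpyIter(it): a cheap accessor that is
        # assumed to return the same underlying object on every call.
        return self._values


def total_accessor_in_loop(wrapper):
    """Call the accessor on every iteration (the original loop shape)."""
    total = 0.0
    i = 0
    while i < len(wrapper.get_raw()):
        total += wrapper.get_raw()[i]
        i += 1
    return total


def total_accessor_hoisted(wrapper):
    """Call the accessor once, before the loop, and reuse the result."""
    raw = wrapper.get_raw()  # hoisted: valid only if `raw` never changes
    n = len(raw)
    total = 0.0
    i = 0
    while i < n:
        total += raw[i]
        i += 1
    return total


wrapper = IterWrapper(range(5000))
same_result = total_accessor_in_loop(wrapper) == total_accessor_hoisted(wrapper)
```

Both functions compute the same sum; any timing difference between them depends on the interpreter or compiler, so it is best measured (for example with `timeit`) rather than assumed.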
@jwoillez - Sorry, I was unclear: I wondered whether you could point me to the
@jwoillez @mhvk - note that it's not entirely a fair comparison to compare this to what's in astropy/astropy#2992 because the latter has other changes dealing with error-processing. But hopefully that will be the same cost either way. @mhvk, to look at the
Oh, but I realized I have a local copy - it basically looks like this, @mhvk:
I added d2dtf(), aper(), and atco13(), in order to see what the wrapper would look like in various cases.
I would appreciate if someone could have a critical look at the content of
In the commit above, to deal with
@eteq - just to clarify, the approach with the regular arrays here is now 30% faster than with the record arrays? Given the copies made for the latter, I guess that is not so surprising; the record array version probably could be sped up as well, but nice not to have a need for that!
This is based on the work in cython_numpy_auto and the non-automated tests in cython_npyiter
To make it match with the automated version and help me spot differences in generated erfa.pyx
And the commits above add automation to the
@mhvk - yep, that's what I was seeing. @jwoillez - thanks! I'll try to adjust astropy/astropy#2992 to use this auto-generating approach, then, as it's clearly an improvement over the other one. Hopefully it won't be too much work to get this to behave fine with the error-checking and array-wrapping additions I made there. If we're really lucky, maybe it will even fix the
Once that's in, we'll have to decide what to do with this repo. My instinct is to maybe just merge everything and then add a clear note in the README that this is for experimentation, and that the file in the astropy repo is where people should go for "production" code.
@eteq Do you really want to spend any time cleaning this repository? If the official python wrapper is going to be moved to astropy, I would just get rid of this one. If improvements and/or experiments are needed at a later date, why not do that in the astropy side?
@jwoillez - I suppose that's true. I guess I'm just thinking of keeping this for "posterity" as it were, but it can just be left as-is right now.
Look only at the last two commits, as this is a continuation of another PR.
To give it a spin:
The only tricky part in the implementation is that NpyIter is not exposed in numpy.pxd. This requires the use of the private NpyIter_GetDataPtrArray() from numpy/arrayobject.h. This implementation is inspired by https://github.com/rainwoodman/chealpy/blob/master/chealpy/npyiter.pxd
It could be further improved by including the output parameter ppr in the iterator itself. This would save indexing time after the call to eraAb
...
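As a rough illustration of letting the iterator own the output operand, here is a sketch using NumPy's Python-level `np.nditer` rather than the C `NpyIter` API used in this PR. It assumes NumPy is installed; `scaled_sum` is a hypothetical scalar function chosen for brevity, not an ERFA routine, and it does not handle the length-3 vector arguments that `eraAb` takes.

```python
import numpy as np


def scaled_sum(a, b, scale):
    """Compute scale * (a + b) elementwise, broadcasting a against b.

    The third operand is passed as None with the 'allocate' flag, so the
    iterator itself creates the output array with the broadcast shape.
    """
    it = np.nditer(
        [a, b, None],
        op_flags=[["readonly"], ["readonly"], ["writeonly", "allocate"]],
    )
    with it:
        for x, y, out in it:
            # `out` is a zero-dimensional view into the allocated output.
            out[...] = scale * (x + y)
        # The allocated output is the third operand of the iterator.
        return it.operands[2]


result = scaled_sum(np.arange(3.0), 1.0, 2.0)
broadcast = scaled_sum(np.ones((2, 1)), np.arange(3.0), 1.0)
```

With this pattern the caller never pre-allocates or indexes the output; the iterator steps through inputs and output together.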