
Multiple rhs ma57 #58

Merged (3 commits, Aug 17, 2019)

Conversation

michel2323 (Member)

This, together with using ifort for MA57, considerably speeds up the backsolve: about 40% on Theta, and even more on other CPUs.

@michel2323 michel2323 requested a review from cnpetra August 15, 2019 20:57
int *iwork = new int[n];
double *work = new double[NRHS*n];
int lwork = NRHS*n;
FNAME(ma57cd)( &job, &n,
cnpetra (Collaborator) commented Aug 15, 2019

The code was doing one rhs at a time to make sure the solve is done with iterative refinement.

When doing multiple rhs, ma57cd does not perform iterative refinement, or at least that was the case a couple of years ago. Please confirm this has changed and that iterative refinement is performed on multiple rhs; otherwise the accuracy of PIPS' linear solves will be severely affected.
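For context, the iterative refinement being discussed is the classic loop x ← x + solve(b − A·x). A minimal C++ sketch on a toy 2×2 system, with a deliberately perturbed solve standing in for an unrefined factorization backsolve (all names here are hypothetical, not the MA57 API):

```cpp
#include <array>
#include <cmath>

using Vec2 = std::array<double, 2>;

// Toy SPD system; a stand-in for the factored matrix.
constexpr double A[2][2] = {{4.0, 1.0}, {1.0, 3.0}};

Vec2 matvec(const Vec2& x) {
    return {A[0][0]*x[0] + A[0][1]*x[1], A[1][0]*x[0] + A[1][1]*x[1]};
}

// Exact solve via Cramer's rule, then perturbed to mimic the error of a
// finite-precision backsolve without refinement (hypothetical stand-in).
Vec2 roughSolve(const Vec2& b) {
    double det = A[0][0]*A[1][1] - A[0][1]*A[1][0];
    Vec2 x = {( A[1][1]*b[0] - A[0][1]*b[1]) / det,
              (-A[1][0]*b[0] + A[0][0]*b[1]) / det};
    x[0] *= 1.0 + 1e-3;   // injected solver error
    x[1] *= 1.0 - 1e-3;
    return x;
}

double residualNorm(const Vec2& x, const Vec2& b) {
    Vec2 ax = matvec(x);
    return std::hypot(b[0] - ax[0], b[1] - ax[1]);
}

// One step of iterative refinement: x += solve(b - A*x).
Vec2 refineOnce(Vec2 x, const Vec2& b) {
    Vec2 ax = matvec(x);
    Vec2 d = roughSolve({b[0] - ax[0], b[1] - ax[1]});
    return {x[0] + d[0], x[1] + d[1]};
}
```

Even with an inexact correction solve, one refinement sweep shrinks the residual by roughly the solver's relative error, which is why skipping it for the multiple-rhs path matters.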

A collaborator commented:

If I understand correctly, you might be able to do the iterative refinement outside of ma57 in a multiple right-hand-side manner. You just need to compute the multiple-rhs residual with a multiple-rhs multiply (which will also be faster than single ones), compute the norms of all of the residuals, and then do a (possibly smaller) multiple-rhs solve on the columns selected for a second solve. The one part I don't understand is whether the multiply is somehow done in extended precision inside ma57 and could not be done that way outside.
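To make that scheme concrete, a hedged C++ sketch on a toy diagonal system (the diagonal solve and the per-column error model are stand-ins, not MA57 calls): solve all columns, form the multi-rhs residual, measure each column's norm, and re-solve only the columns above a tolerance.

```cpp
#include <vector>
#include <cmath>
#include <cstddef>

using Col = std::vector<double>;

// Toy diagonal system A = diag(a); a stand-in for the factored KKT matrix.
static const std::vector<double> a = {4.0, 3.0, 5.0};

// Inexact solve: relative error 'err' mimics the accuracy loss of a
// multiple-rhs solve without iterative refinement.
Col roughSolve(const Col& b, double err) {
    Col x(b.size());
    for (std::size_t i = 0; i < b.size(); ++i)
        x[i] = b[i] / a[i] * (1.0 + err);
    return x;
}

Col residual(const Col& x, const Col& b) {
    Col r(b.size());
    for (std::size_t i = 0; i < b.size(); ++i)
        r[i] = b[i] - a[i] * x[i];
    return r;
}

double norm2(const Col& r) {
    double s = 0.0;
    for (double v : r) s += v * v;
    return std::sqrt(s);
}

// Refinement done outside the solver: check every column's residual norm,
// then re-solve only the columns above 'tol' (in a real code this second
// solve would again be one blocked multiple-rhs call).
void refineSelected(std::vector<Col>& X, const std::vector<Col>& B,
                    double err, double tol) {
    for (std::size_t j = 0; j < X.size(); ++j) {
        Col r = residual(X[j], B[j]);
        if (norm2(r) > tol) {
            Col d = roughSolve(r, err);
            for (std::size_t i = 0; i < r.size(); ++i) X[j][i] += d[i];
        }
    }
}
```

The second solve touches only the offending columns, so the refinement pass stays cheap relative to the initial blocked solve.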

cnpetra (Collaborator) commented Aug 16, 2019

Yes, we could/should do it outside; the ma57 manual does not say anything about extended precision. That is the short story. The long story is that that part of the code is responsible for the majority of the execution time (and, I believe, for the numerical issues on large problems, now that I see that PIPS-NLP does no iterative refinement there). In PIPS-IPM we managed to drastically reduce the execution time with https://epubs.siam.org/doi/10.1137/130908737, which essentially exploits the sparsity of these multiple rhs; the accuracy was not bad at all.

A collaborator commented:

@cnpetra I do iterative refinement (IR) outside. That is inspired by IPOPT, which also uses ma57cd and does IR outside. By doing this, the key point is that we can do IR based on the full KKT residual instead of the residual of the augmented system. Currently PIPS-NLP has an option to choose which criterion to use (or use both). Note that I added this outside IR after I ran the benchmark with the CUTEr tests. Doing IR on the full system helps convergence, but probably only on about 5 problems out of 900.

I noticed that in my current implementation I only do IR for the KKT system, which comes with one rhs; I didn't do IR when we solve multiple rhs to build the Schur complement, i.e., in void sLinsys::addTermToDenseSchurCompl(sData *prob, DenseSymMatrix& SC)

I agree with you and @BarrySmith that we should add this when we build the Schur complement.
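As a sketch of what an IR-aware Schur complement assembly could look like (toy diagonal KKT block and hypothetical names; addTermToDenseSchurCompl itself accumulates the border contributions this way, but the refinement step is the assumed addition): solve K·x = b for each border column, apply one refinement sweep, then accumulate SC -= Bᵀ·x.

```cpp
#include <array>
#include <cmath>

using Vec2 = std::array<double, 2>;

constexpr double K[2]     = {4.0, 2.0};          // diagonal stand-in for the KKT block
constexpr double Bb[2][2] = {{1.0, 0.0},
                             {1.0, 2.0}};        // toy border block (rows x columns)

// Inexact K^{-1}: per-component relative error mimics an unrefined solve.
Vec2 roughSolve(const Vec2& b) {
    return {b[0] / K[0] * 1.001, b[1] / K[1] * 0.999};
}

// One iterative refinement sweep: x += solve(b - K*x).
Vec2 refine(Vec2 x, const Vec2& b) {
    Vec2 r = {b[0] - K[0]*x[0], b[1] - K[1]*x[1]};
    Vec2 d = roughSolve(r);
    return {x[0] + d[0], x[1] + d[1]};
}

// Accumulate the contribution SC -= B^T K^{-1} B, with refined solves,
// one border column at a time (hypothetical analogue of the PIPS routine).
void addTermToSchur(double SC[2][2]) {
    for (int j = 0; j < 2; ++j) {
        Vec2 bcol = {Bb[0][j], Bb[1][j]};
        Vec2 x = refine(roughSolve(bcol), bcol);
        for (int i = 0; i < 2; ++i)
            SC[i][j] -= Bb[0][i]*x[0] + Bb[1][i]*x[1];
    }
}
```

Without the refine step, each error in x lands directly in the assembled SC entries, which is exactly the concern raised above about the correction-step matrix.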

A collaborator commented:

SC linear algebra is a different animal than Ipopt/pipsnlpserial. Without IR for multiple rhs, the solve for the correction step in your outer IR is done with a matrix that can have potentially large errors in its components, which makes me doubt that the outer IR is as effective as it should be.

A collaborator commented:

Yes, I just realized that we didn't do IR for building the SC when Michel suggested solving multiple rhs with MA57. I totally agree with you; we should have IR when building the SC. This probably explains why pipsnlp ends up with slightly different steps for large problems with different numbers of processes.

A collaborator commented:

Glad we're on the same page.

nychiang (Collaborator) commented Aug 15, 2019 via email

cnpetra (Collaborator) commented Aug 16, 2019

Yes @nychiang, I stand corrected: ma57dd. I was confused by the fact that PIPS-NLP's MA57Solver class does not seem to call ma57dd, while I remember exactly that it is in PIPS-IPM, helping a lot with numerical stability in the linear solves. Am I missing something? Any idea why iterative refinement is not in PIPS-NLP? The lack of it would explain some of the numerical issues in #40.

cnpetra commented Aug 16, 2019

@michel2323 I really think we should address #59 in this PR

cnpetra commented Aug 17, 2019

Letting the PR go through; it can't do any more harm.

@cnpetra cnpetra self-requested a review August 17, 2019 13:41
@cnpetra cnpetra merged commit e1f4252 into master Aug 17, 2019
4 participants