Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Multiple rhs ma57 #58

Merged
merged 3 commits into from
Aug 17, 2019
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
22 changes: 16 additions & 6 deletions PIPS-NLP/Core/LinearSolvers/Ma57Solver/Ma57Solver.C
Original file line number Diff line number Diff line change
Expand Up @@ -27,7 +27,6 @@ extern int gOoqpPrintLevel;
extern int gOuterSolve;
extern int separateHandDiag;


extern double gHSL_PivotLV;
extern double gMA57_Ordering;

Expand Down Expand Up @@ -330,7 +329,6 @@ void Ma57Solver::solve( OoqpVector& rhs_in )
SimpleVector & rhs = dynamic_cast<SimpleVector &>(rhs_in);

double * drhs = rhs.elements();

int * iwork = new int[5*n];
double * work = new double[n];

Expand Down Expand Up @@ -410,10 +408,22 @@ void Ma57Solver::solve(GenMatrix& rhs_in)
assert(n==N);

// we need checks on the residuals, can't do that with multiple RHS
for (int i = 0; i < NRHS; i++) {
SimpleVector v(rhs[i],N);
solve(v);
}

double *drhs = rhs.elements();
//std::cout << "[MA57] N: " << N << " NRHS: " << NRHS << std::endl;
int job = 1; // Solve using A
int one = 1;

int *iwork = new int[n];
double *work = new double[NRHS*n];
int lwork=NRHS*N;
FNAME(ma57cd)( &job, &n,
Copy link
Collaborator

@cnpetra cnpetra Aug 15, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the code was doing one rhs at the time to make make sure the solve is with iterative solve.

When doing multiple rhs, ma57cd does not perform iterative refinement, or at least this was the case a couple of years ago. Please confirm this has been changed and iterative refinement is performed on multiple rhs. Otherwise the accuracy of PIPS' linear solves will be severely affected

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand correctly you might be able to do the iterative refinement outside of the ma57 in a multiple right hand side manner. I think you just need to compute the multiple right hand side residual with a multiple right hand side multiply (which will also be faster than single ones). Compute the norms of all of the residuals and then do a (possibly smaller) multiple right hand side solve on the columns selected for a second solve. The one part I don't understand is if the multiply is done somehow with extended precision inside of ma57 but could not be done this way outside.

Copy link
Collaborator

@cnpetra cnpetra Aug 16, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, we could/should do it outside. ma57 manual does not say anything about extended precision. this would be the short story. The long story is that that part of the code is responsible for the majority of the execution time (and I believe for the numerical issues on large problems now that I see that in PIPS-NLP there is no iterative solve done). In PIPS-IPM we've managed to drastically reduce execution time with https://epubs.siam.org/doi/10.1137/130908737, which essentially exploits the sparsity of these multiple rhs -- the accuracy was not bad at all

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cnpetra I do iterative refinement (IR) outside. That is inspired by IPOPT, which also uses ma57cd and do IR outside. By doing this, the key point is that we can do IR based on the full KKT residual, instead of the residual of the augmented system. Currently PIPS-NLP has an option to choose which criteria to use (or use both). Note that I added this outside IR after I run the benchmark with CUTEr tests. Doing IR on the full system helps the convergence, but, probably it helps only 5 problems out of 900 problems.

I noticed that in my current implementation, I only do IR for the KKT system which comes with one rhs, I didn't do IR when we solve multiple rhs to build the Schur complement, i.e., in void sLinsys::addTermToDenseSchurCompl(sData *prob, DenseSymMatrix& SC)

I agree with you and @BarrySmith that we should add this when we build the Schur complement.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SC linear algebra is a different animal than Ipopt/pipsnlpserial. Without ir for multiple rhs, the solve for the correction step in your outer IR is done with a matrix that can have potentially large errors in its components, which makes me doubt that the outer ir is as effective as it should be.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I just realized that we didn't do IR for builing SC when Michel suggested to solving multiple rhs by MA57. I totally agree with you and we should have IR when buillding SC. This probably explains that why pipsnlp ends up getting slightly different steps for large problem with different number of processes..

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

glad we're on the same page

fact, &lfact, ifact, &lifact,
&NRHS, drhs, &n,
work, &lwork, iwork,
icntl, info );
delete [] iwork;
delete [] work;
}

int* Ma57Solver::new_iworkn(int dim)
Expand Down
1 change: 0 additions & 1 deletion PIPS-NLP/Core/LinearSolvers/MumpsSolver/MumpsSolver.C
Original file line number Diff line number Diff line change
Expand Up @@ -244,7 +244,6 @@ void MumpsDenseSolver::sysMatToSpTriplet()
int MumpsSolver::matrixChanged()
{
if(mumpsMpiComm!=MPI_COMM_NULL) {
printf("mumps Rank %d says hello\n", my_mumps_rank_);
#ifndef WITHOUT_PIPS
if(Msys!=NULL)
sysMatToSpTriplet();
Expand Down
6 changes: 3 additions & 3 deletions PIPS-NLP/Core/NlpStoch/sLinsysRoot.C
Original file line number Diff line number Diff line change
Expand Up @@ -261,7 +261,7 @@ void sLinsysRoot::afterFactor()
{
int mype; MPI_Comm_rank(mpiComm, &mype);

//if( (mype/256)*256==mype)
if( 0==mype)
{

for (size_t c=0; c<children.size(); c++) {
Expand Down Expand Up @@ -402,7 +402,7 @@ void sLinsysRoot::Ltsolve( sData *prob, OoqpVector& x )
#ifdef TIMING
int myRank; MPI_Comm_rank(MPI_COMM_WORLD, &myRank);

//if(256*(myRank/256) == myRank)
if(0 == myRank)
{
double tTotResChildren=0.0;
for(size_t it=0; it<children.size(); it++) {
Expand Down Expand Up @@ -741,7 +741,7 @@ int sLinsysRoot::factorizeKKT()
//MPI_Barrier(mpiComm);
int mype; MPI_Comm_rank(mpiComm, &mype);
// note, this will include noop scalapack processors
//if( (mype/256)*256==mype )
if( 0==mype )
printf(" rank %d 1stSTAGE FACT %g SEC ITER %d\n", mype, st, (int)g_iterNumber);
#endif
return negEValTemp;
Expand Down