Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consider adding COLEQ and ROWEQ IndexUnaryOp operators #139

Open
eriknw opened this issue Apr 24, 2022 · 9 comments
Open

Consider adding COLEQ and ROWEQ IndexUnaryOp operators #139

eriknw opened this issue Apr 24, 2022 · 9 comments
Labels
enhancement New feature or request performance enhancement GraphBLAS is working but could be faster.

Comments

@eriknw
Copy link
Contributor

eriknw commented Apr 24, 2022

I think these would be nice to have. Their usage can come about when wanting to run an algorithm for a single node.

I know one can extract a column or row and assign this into a new matrix to get the same result. This is sometimes slower for me than performing select with ROWLE followed by another select with ROWGT, so I think a single select operation in SuiteSparse:GraphBLAS would be the fastest (and clearest user-code!).

I have no strong opinions on also adding COLNE and ROWNE (I probably would, but wouldn't object either way).

@eriknw
Copy link
Contributor Author

eriknw commented Apr 24, 2022

And I suppose the current best way to do this is to make a matrix with a single element on a diagonal and do a matrix multiply from the left or right.

@DrTimothyAldenDavis
Copy link
Owner

Good suggestion.

However, I also need to speed up GrB_Col_extract, particularly for the common case when the index list is GrB_ALL, to extract an entire row or an entire column. The method uses the standard GrB_Matrix_extract, which is a very general method. I should instead catch the special case of GrB_ALL for GrB_Col_extract, and speed up that case. I would then use the more general GrB_Matrix_extract only when the index list is something else.

This gives a single vector as the result, of course, not the matrix you are looking for.

@DrTimothyAldenDavis DrTimothyAldenDavis added enhancement New feature or request performance enhancement GraphBLAS is working but could be faster. labels Apr 25, 2022
@eriknw
Copy link
Contributor Author

eriknw commented May 13, 2022

btw, I found a different implementation, so I would not use these if available atm. Still, these may be nice (but def. not necessary), especially if they provide a performance improvement.

@DrTimothyAldenDavis
Copy link
Owner

OK. I'll close this for now, but I'll work on the basic cases of GrB_Col_extract to speed it up when extracting a whole row from a matrix held by row, or a whole column from a matrix held by column. Those operations are important special cases and should be a lot faster than they currently are.

@eriknw
Copy link
Contributor Author

eriknw commented May 26, 2022

btw, I'm playing around with implementing square clustering coefficient, and one of my variants could use COLNE and ROWNE IndexUnaryOp.

Do you know if square clustering coefficient is implemented anywhere using GraphBLAS? It's often a bit mind-bending to develop algorithms for the first time, so I'm not 100% confident my current approaches are the best. I can share once they pass tests.

Anyway, in one variant, I need to delete an entire row and an entire column (and I do this for each node, so it would be nice to make it fast). Instead of deleting these, I could instead apply e.g. ROWNE to set the row to 0, b/c I suspect changing the structure is costly. Another way I can apply 0 to values of a row is with row assign with a mask.

So, if these additional IndexUnaryOps existed, I would at least benchmark variants that use them.

edit: I'll also try UDFs, since I think what I need is something like ROWCOLNE i.e., to False where the row index or column index is equal to the thunk. Nevermind, only need to set e.g. the column to 0, so COLNE or ROWNE would be applicable. But, it may be best to simply apply POSITIONJ once, and then do equality checks.

@DrTimothyAldenDavis
Copy link
Owner

There is a "local clustering coefficient" method in LAGraph (experimental/algorithm/LAGraph_lcc.c). @hoke-t is also adding an implementation of graphlet counting, which might be related. Otherwise I don't know of anything.

@eriknw
Copy link
Contributor Author

eriknw commented May 28, 2022

Thanks for pointing out LAGraph's local clustering coefficient. I think I implemented this for undirected and directed, unweighted and weighted graphs. I should compare (and benchmark) against LAGraph. I use NetworkX's definition (and tests): https://networkx.org/documentation/stable/reference/algorithms/generated/networkx.algorithms.cluster.clustering.html. My implementations are in the file linked below.

Here's my version of square counting as of May 28:
https://github.com/python-graphblas/graphblas-algorithms/blob/a81498fbc3abed5f73d5163c25b6cc550610a688/graphblas_algorithms/algorithms/cluster.py#L243-L280

Instead of applying COLNE, I apply POSITIONJ once outside the loop, then apply ISNE within the loop. I expect performance to be dominated by other operations, but COLNE could still help with reducing memory, since there would be no need to keep the intermediate matrix from applying POSITIONJ.

Also, note that square clustering performs the outer product of the degrees vector (and uses a mask). I have found vector inner and outer product very useful. XREF: #57

@DrTimothyAldenDavis
Copy link
Owner

To be precise, you would like to see COLNE, ROWNE, and perhaps ROWCOLNE. The latter would be:

(i != y) and (j != y)

correct?

@eriknw
Copy link
Contributor Author

eriknw commented Jun 1, 2022

COLEQ, ROWEQ, COLNE, and ROWNE seem like the most natural to add.

If ROWCOLEQ and/or ROWCOLNE were added (up to you), yeah, I would have them be:

  • ROWCOLEQ: (i == y) or (j == y)
  • ROWCOLNE: (i != y) and (i != y), same as not ROWCOLEQ(...)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request performance enhancement GraphBLAS is working but could be faster.
Projects
None yet
Development

No branches or pull requests

2 participants