Issues matching genes in network and sample matrix #5

Closed
ggonzalezp opened this issue Apr 5, 2019 · 2 comments

@ggonzalezp

Hi,

I was trying to run the examples and there seemed to be some issues in pyNBS_core.py in this piece of code (lines 77-86):

    # Filter columns by network nodes only if network is given
    if propNet is not None:
        # Check if network node names intersect with somatic mutation matrix column names
        # If there is no intersection, throw an error, gene names are not matched
        if len(set(list(propNet.nodes)).intersection(set(sm_mat.columns)))==0:
            raise ValueError('No mutations found in network nodes. Gene names may be mismatched.')
        gind_sample_filt = gind_sample.T.ix[list(propNet.nodes)].fillna(0).T
    else:
        gind_sample_filt = gind_sample
    return gind_sample_filt

Specifically:


  1. This line
    if len(set(list(propNet.nodes)).intersection(set(sm_mat.columns)))==0:

was always raising the error because the propNet.nodes names were unicode and the sm_mat.columns names were str, so the intersection was always empty.

I changed that line to convert propNet.nodes names to string when computing the intersection:

    if len(set(list(str(n) for n in propNet.nodes)).intersection(set(sm_mat.columns))) == 0:


  2. I assumed that the following line was meant to add to the matrix the genes that were in the network but not in the matrix. The problem is that it actually raised an error when indexing the data frame with values that were not present. I changed this:

    gind_sample_filt = gind_sample.T.ix[list(propNet.nodes)].fillna(0).T

to:

    for n in propNet.nodes:
        if n not in gind_sample.T.index:
            gind_sample_filt[n] = 0

Please let me know if those changes look okay.
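
For reference, the two changes described above can also be sketched in one step with pandas' `DataFrame.reindex`, which keeps only the requested columns and fills labels that are missing with a chosen value instead of raising. This is a minimal sketch and not the pyNBS code: `filter_to_network_nodes` is a hypothetical helper, the gene and sample names are made up, and it assumes gene names are column labels without duplicates.

```python
import pandas as pd


def filter_to_network_nodes(gind_sample, node_names):
    """Restrict a samples-by-genes matrix to the genes of a network.

    Hypothetical helper for illustration; it is not part of pyNBS.
    Genes in the network but absent from the matrix become all-zero columns.
    """
    # Normalize both sides to plain str so mixed text types compare equal.
    nodes = [str(n) for n in node_names]
    data = gind_sample.copy()
    data.columns = [str(c) for c in data.columns]

    # Fail early when no gene name is shared at all.
    if not set(nodes).intersection(data.columns):
        raise ValueError('No mutations found in network nodes. '
                         'Gene names may be mismatched.')

    # Keep only network genes, in network order; fill missing genes with 0.
    return data.reindex(columns=nodes, fill_value=0)


# Toy data (made up): 2 samples x 3 genes.
sm_mat = pd.DataFrame([[1, 0, 1], [0, 1, 0]],
                      index=['s1', 's2'],
                      columns=['TP53', 'KRAS', 'EGFR'])
network_nodes = [u'TP53', u'KRAS', u'BRCA1']  # BRCA1 is not in the matrix

filtered = filter_to_network_nodes(sm_mat, network_nodes)
print(filtered)
```

In this toy case `EGFR` is dropped because it is not a network node, and `BRCA1` appears as a column of zeros.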

Maybe it is something related to the specific Python/package versions I am using. I am running Python 2.7. My pip freeze output is:


appnope==0.1.0
autograd==1.2
backports-abc==0.5
backports.functools-lru-cache==1.5
backports.shutil-get-terminal-size==1.0.0
Bottleneck==1.2.1
certifi==2019.3.9
cycler==0.10.0
decorator==4.4.0
enum34==1.1.6
future==0.17.1
futures==3.2.0
intel-openmp==2019.0
ipykernel==4.10.0
ipython==5.8.0
ipython-genutils==0.2.0
jupyter-client==5.2.4
jupyter-core==4.4.0
kiwisolver==1.0.1
lifelines==0.19.5
matplotlib==2.2.4
mkl==2019.0
networkx==2.2
numpy==1.16.2
pandas==0.24.2
pathlib2==2.3.3
pexpect==4.6.0
pickleshare==0.7.5
prompt-toolkit==1.0.15
ptyprocess==0.6.0
Pygments==2.3.1
pyNBS==0.2.0
pyparsing==2.3.1
python-dateutil==2.8.0
pytz==2018.9
pyzmq==18.0.1
scandir==1.10.0
scikit-learn==0.20.3
scipy==1.2.1
seaborn==0.9.0
simplegeneric==0.8.1
singledispatch==3.4.0.3
six==1.12.0
subprocess32==3.5.3
tornado==5.1.1
traitlets==4.3.2
wcwidth==0.1.7
@justinkhuang
Collaborator

Hi Guadalupe,
Thank you for your fixes. You are correct about your assumption in point #2, and these fixes look fine to me to use. I think the error is the result of your pandas version. Pandas deprecated the '.ix' indexer when it moved to 0.20+, and I also believe that the correct way to identify nodes in a network in networkx is now to call ".nodes" as a function on the graph object (e.g. propNet.nodes()). Changes like these are now causing pyNBS, which was built on older versions of these dependencies, to throw errors when users have updated versions of those packages installed.
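
As a small illustration of the networkx point: to the best of my knowledge, `nodes` was a plain method in networkx 1.x, while in 2.x it is a view object that can be iterated directly or called. Under that assumption, calling `propNet.nodes()` and wrapping the result in a list comprehension should behave the same on both series. The toy graph and gene names below are made up.

```python
import networkx as nx

# Toy network (gene names chosen only for illustration).
propNet = nx.Graph()
propNet.add_edge('TP53', 'KRAS')

# nodes() returns a list in networkx 1.x and a NodeView in 2.x;
# iterating over it yields the node names either way.
node_names = [str(n) for n in propNet.nodes()]
print(node_names)
```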

@ggonzalezp
Author

Hi Justin,
Thank you for replying!
