Pre-train Model #3

Open
r0438930 opened this issue Mar 21, 2021 · 4 comments

Comments

@r0438930

Dear,

Running make.exe on the Makefile generates magic.exe.

Next I run 'python prepare_pretrain_embedding.py'. The subprocess.call('magic -i ...) call is recognized, but the subprocess.call('python format_transform.py ...) call gives me the error below. I realize that com-amazon_final.f.txt is not found, but what is this file and how do I generate it? Is it the output of subprocess.call('magic ...)?

Since I am trying to run CommunityGAN on different datasets I really hope to find a solution.

ERROR MESSAGE:
Traceback (most recent call last):
  File "format_transform.py", line 13, in <module>
    with open(bigclam_out_filename) as fp, open(output_filename, 'w') as out_fp:
FileNotFoundError: [Errno 2] No such file or directory: '../src/PreTrain/community_detection/com-amazon_final.f.txt'

Sincerely,

Simon

@SamJia
Owner

SamJia commented Mar 22, 2021

Hi Simon,

The file '../src/PreTrain/community_detection/com-amazon_final.f.txt' is the trained embedding file of the pretrain model MAGIC, so it should be generated when MAGIC runs.
Line 9 (subprocess.call('python format_transform.py ...)) reads that file and transforms it into the format CommunityGAN expects.
If you want to run on different datasets, edit prepare_pretrain_embedding.py and replace the corresponding filenames in lines 8-9.
You can also comment out lines 10-13, which run the pretraining for the provided youtube and dblp datasets.
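
When adapting the script to another dataset, it may help to check for the embedding file before the format_transform.py step runs, so that a missing MAGIC output is reported clearly. The following is only a sketch: the helper names are hypothetical, and the "<prefix>final.f.txt" naming is an assumption inferred from the -o prefix "com-amazon_" and the file name "com-amazon_final.f.txt" seen in this thread.

```python
from pathlib import Path


def magic_embedding_path(output_prefix):
    """Return the embedding file MAGIC is assumed to write for a given -o prefix.

    Assumption: MAGIC appends "final.f.txt" to the prefix, as the
    "com-amazon_" -> "com-amazon_final.f.txt" pairing in this thread suggests.
    """
    return Path(f"{output_prefix}final.f.txt")


def require_embedding(output_prefix):
    """Raise a descriptive error if the assumed MAGIC output file is missing."""
    path = magic_embedding_path(output_prefix)
    if not path.is_file():
        raise FileNotFoundError(
            f"MAGIC output {path} not found; check that magic actually ran "
            "to completion before calling format_transform.py"
        )
    return path
```

Calling require_embedding with the same prefix passed to -o, just before the format_transform.py call, would turn the later FileNotFoundError into an earlier and more specific one.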

Best regards,
Yuting Jia

@r0438930
Author

Dear Yuting,

Thanks for the fast reply!

The real problem is that when I run magic, the "..._final.f.txt" file is not generated! I made sure I gave it the right input AGM file. I have tried all three included datasets, but I do not see any files being generated. So format_transform.py is not really the problem; magic simply does not produce that txt file.

I fear it may have something to do with how magic.exe is built. I installed MinGW as described in the readme. I opened the PreTrain directory in PowerShell and ran the "make" command, which generated all the ".o" files as well as magic.exe. Then I went back to Python and ran prepare_pretrain_embedding.py with the correct -i and -o arguments. The code runs, but no "...final.f.txt" file is generated.

When I run subprocess.call(magic ...) I do not get any errors (exit code 0). Running magic prints the output below. Note that for -o I tried to generate the file directly in the PreTrain directory, and for -i I placed the agm file in the PreTrain directory as well.

Hopefully this makes my problem more concrete. Obviously a minimal reproducible example is not really possible here.

Do you have any suggestions to get "...final.f.txt" to be generated?


PYTHON MESSAGE POST:
Read arguments doneMon Mar 22 08:26:54 2021
usage:bigclam.exe
Model initiate doneMon Mar 22 08:26:54 2021
-o:Output Graph data prefix(default: A:\Users\Sam\OneDrive - sjtu.edu.cn\Lab\AAAI2019\CDGAN\data\community_detection\com-amazon_pretrain_)=com-amazon_
-i:Input edgelist file name(default: A:\Users\Sam\OneDrive - sjtu.edu.cn\Lab\AAAI2019\CDGAN\data\community_detection\com-amazon_agm.txt)=com-amazon_agm.txt
-l:Input file name for node dates (Node ID, Node date) (default: none)=
-t:Input file name for node' text (Node ID, Node texts), 'none' means do not load text (default: none)=
-nt:Number of threads for parallelization(default: 8)=20
-c:The number of communities to detect (-1 detect automatically)(default: 500)=100
-mc:Minimum number of communities to try(default: 5)=
-xc:Maximum number of communities to try(default: 500)=
-nc:How many trials for the number of communities(default: 10)=
-sa:Alpha for backtracking line search(default: 0.05)=
-sb:Beta for backtracking line search(default: 0.1)=
-st:Allow reference between two same time node or not (0: don't allow, 1: allow)(default: 0)=
-woe:Disable Eta or not (0: enable eta, 1: disable eta, 2: symmetric eta)(default: 1)=
-se:same Eta or not (0: different eta, 1: same eta)(default: 1)=
-mi:Maximum number of update iteration(default: 500)=200
-si:How many iterations for once save(default: 5000)=
-rsi:How many iterations for once negative sampling(default: 10)=
-sa:Zero Threshold for F and eta(default: 0.0001)=
-lnf:Remain only largest how many elements for F(default: 0)=

Process finished with exit code 0


Thanks in advance,

Simon

@r0438930
Author

UPDATE:

When I run magic.exe directly in Windows PowerShell, it executes fully.
There seems to be a problem with 'subprocess.call()', which does not fully execute magic.exe.
I do not know the exact cause, but this resolved it for me and all .txt files were generated!
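
For anyone who would rather keep the call inside Python, one thing to try is passing the command as a list with an explicit working directory and checking the result afterwards. Note that subprocess.call only returns the child's exit status and does not raise on failure, so a script that ignores the return value can still finish with exit code 0. The sketch below is untested against magic.exe; run_and_check is a hypothetical helper name, and the command itself is left to the caller.

```python
import subprocess
from pathlib import Path


def run_and_check(command, workdir, expected_file):
    """Run `command` inside `workdir` and confirm `expected_file` was written there.

    `command` is a list such as [path_to_exe, arg1, arg2, ...]; no shell is used.
    """
    workdir = Path(workdir).resolve()
    result = subprocess.run(
        command,
        cwd=workdir,          # relative -i / -o paths resolve against this directory
        capture_output=True,  # keep stdout and stderr so they can be inspected
        text=True,
    )
    print(result.stdout)
    print(result.stderr)
    if result.returncode != 0:
        raise RuntimeError(f"{command[0]} exited with code {result.returncode}")
    produced = workdir / expected_file
    if not produced.is_file():
        raise FileNotFoundError(f"{command[0]} finished but {produced} is missing")
    return produced
```

The idea would be to call it with the same magic command and arguments that work in PowerShell, with workdir set to the PreTrain directory and expected_file set to the "...final.f.txt" name.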

Hopefully this helps anyone with a similar issue in the future!

@shivansh1704bharadwaj

@r0438930
Sir, how did you download the magic.exe file?
Could you please provide a link? I cannot find it anywhere.
