Distributional semantics using contexts rather than documents #50
Comments
Hi Jiang,

You'll want to use the GwsMain class which uses the GenericWordSpace

Thanks,

On Wed, Apr 2, 2014 at 5:42 AM, jiangfeng wrote:
Command:

java edu.ucla.sspace.mains.GwsMain -d data/wiki.sample data/output-sample/ -t 6 -o sparse_text -F include=data/wiki_vocab_sample.lst;exclude=data/english-stop-words-large.txt

What I get:

...
|0,994.0,1,2457.0,2,796.0,3,19110.0,4,1510.0,5,1990.0,6,1256.0,7,18830.0,...
...

It seems that a representation of an empty word is generated. Could you help check this?

Thanks,
Hi Jiang,

Yes, this looks like a bug. The boolean logic for filtering this case

Thanks,

On Wed, Apr 2, 2014 at 10:03 PM, jiangfeng wrote:
Hi David,

I would like to ask a little more. I realized that the

Thanks,
Hi Jiang,

So you want to report the output of GwsMain and then use that as input to

If all you want to do is run SVD on the GWS data, that's currently not

Thanks,

On Sun, May 4, 2014 at 8:15 AM, jiangfeng wrote:
Dear developers,
I found that VsmMain computes the word-document matrix, which captures co-occurrences of words and documents. Could I instead generate distributional representations using the context within a window of a certain size (say, 10), and use PMI rather than tf-idf as the elements of the word-context matrix?

Thanks,
Jiang
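
To make the request concrete, here is a minimal standalone Java sketch of a window-based word-context matrix reweighted with PMI. It does not use any S-Space classes and is not how GwsMain is implemented; the names `WindowPmi`, `cooccurrences`, and `pmi` are hypothetical and chosen only for illustration. It assumes a symmetric window, raw counts, and the plain definition PMI(w, c) = log(n(w,c) * N / (n(w) * n(c))), where N is the total of all counts.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative sketch only; hypothetical class, not part of the S-Space API. */
final class WindowPmi {

    /** Counts (word, context) pairs where the context is at most `window` tokens away. */
    static Map<String, Map<String, Double>> cooccurrences(List<String> tokens, int window) {
        Map<String, Map<String, Double>> counts = new HashMap<>();
        for (int i = 0; i < tokens.size(); i++) {
            int lo = Math.max(0, i - window);
            int hi = Math.min(tokens.size() - 1, i + window);
            for (int j = lo; j <= hi; j++) {
                if (j == i) {
                    continue; // a token is not its own context
                }
                counts.computeIfAbsent(tokens.get(i), k -> new HashMap<>())
                      .merge(tokens.get(j), 1.0, Double::sum);
            }
        }
        return counts;
    }

    /** Replaces each count with PMI(w, c) = log(n(w,c) * N / (n(w) * n(c))). */
    static Map<String, Map<String, Double>> pmi(Map<String, Map<String, Double>> counts) {
        Map<String, Double> rowSums = new HashMap<>();
        Map<String, Double> colSums = new HashMap<>();
        double total = 0.0;
        for (Map.Entry<String, Map<String, Double>> row : counts.entrySet()) {
            for (Map.Entry<String, Double> cell : row.getValue().entrySet()) {
                rowSums.merge(row.getKey(), cell.getValue(), Double::sum);
                colSums.merge(cell.getKey(), cell.getValue(), Double::sum);
                total += cell.getValue();
            }
        }
        Map<String, Map<String, Double>> result = new HashMap<>();
        for (Map.Entry<String, Map<String, Double>> row : counts.entrySet()) {
            Map<String, Double> out = new HashMap<>();
            for (Map.Entry<String, Double> cell : row.getValue().entrySet()) {
                double expected = rowSums.get(row.getKey()) * colSums.get(cell.getKey());
                out.put(cell.getKey(), Math.log(cell.getValue() * total / expected));
            }
            result.put(row.getKey(), out);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("the cat sat on the mat".split(" "));
        // Window of 2 tokens on each side; the issue suggests a size such as 10.
        Map<String, Map<String, Double>> weighted = pmi(cooccurrences(tokens, 2));
        weighted.forEach((word, contexts) -> System.out.println(word + " -> " + contexts));
    }
}
```

Only observed pairs are stored, so the result stays sparse; unobserved pairs (whose PMI would be negative infinity) are simply absent. Variants such as positive PMI or smoothing would change only the line that computes the cell value.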