commit ae2e5e3df4
Author: williamleif
Date:   2017-05-31 06:39:43 -07:00

    Merging changes from GitHub online edits.


@@ -32,6 +32,10 @@ As input, at minimum the code requires that a --train_prefix option is specified
* <train_prefix>-feats.npy --- "A numpy-stored array of node features; ordering given by id_map.json"
* <train_prefix>-walks.txt --- "A text file specifying random walk co-occurrences (one pair per line)" (*only for unsupervised)
To run the model on a new dataset, you need to make data files in the format described above.
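As a concrete illustration, here is a minimal sketch (not from the repo) of writing the `<prefix>-feats.npy` and `<prefix>-id_map.json` files for a toy dataset; the file names follow the format above, while the toy node ids and feature dimension are made up:

```python
# Minimal sketch: build the id map and feature files for a toy dataset.
# Convention from above: row id_map[node] of the feats array holds that
# node's features.
import json
import numpy as np

prefix = "./toy"            # hypothetical --train_prefix
nodes = ["n0", "n1", "n2"]  # hypothetical node ids

# Map each node id to a consecutive integer row index.
id_map = {n: i for i, n in enumerate(nodes)}
with open(prefix + "-id_map.json", "w") as f:
    json.dump(id_map, f)

# One 50-dimensional feature row per node, ordered by id_map.
feats = np.random.rand(len(nodes), 50).astype(np.float32)
np.save(prefix + "-feats.npy", feats)
```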
To run random walks for the unsupervised model (and to generate the <prefix>-walks.txt file),
you can use the `run_walks` function in `graphsage.utils`.
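If you would rather generate the walks file yourself, recall that its format is just one co-occurring node pair per line. The following hedged sketch writes such a file from a toy adjacency list; it is a stand-in for `run_walks` in `graphsage.utils`, whose exact signature is not shown here:

```python
# Hedged stand-in for graphsage.utils.run_walks: emit random-walk
# co-occurrence pairs, one pair per line, per the format above.
import random

prefix = "./toy"                                        # hypothetical prefix
adj = {"n0": ["n1"], "n1": ["n0", "n2"], "n2": ["n1"]}  # toy adjacency list

with open(prefix + "-walks.txt", "w") as f:
    for start in adj:
        node = start
        for _ in range(5):          # a short walk from each node
            node = random.choice(adj[node])
            f.write("{}\t{}\n".format(start, node))
```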
#### Model variants
The user must also specify a --model, the variants of which are described in detail in the paper:
* graphsage_mean -- GraphSAGE with mean-based aggregator
@@ -49,4 +53,6 @@ Note that the full log outputs and stored embeddings can be 5-10Gb in size (on t
#### Using the output of the unsupervised models
The unsupervised variants of GraphSAGE will output embeddings to the logging directory as described above.
These embeddings can then be used in downstream machine learning applications.
The `eval_scripts` directory contains examples of feeding the embeddings into simple logistic classifiers.
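For illustration, here is a hedged sketch of that downstream step: load the stored embeddings, align them with node ids, and fit a scikit-learn logistic classifier. The file names (`val.npy`, `val.txt`) and the label file are assumptions; see `eval_scripts` for the repo's own versions:

```python
# Hedged sketch of feeding stored embeddings into a logistic classifier.
# File names and the label mapping are assumptions, not the repo's API.
import json
import numpy as np
from sklearn.linear_model import LogisticRegression

embeds = np.load("log_dir/val.npy")          # assumed: one embedding per row
with open("log_dir/val.txt") as f:
    node_ids = [line.strip() for line in f]  # assumed: row-aligned node ids

with open("toy-class_map.json") as f:        # hypothetical node -> label map
    class_map = json.load(f)

X = embeds
y = np.array([class_map[n] for n in node_ids])

split = int(0.8 * len(y))                    # toy train/test split
clf = LogisticRegression().fit(X[:split], y[:split])
print("test accuracy:", clf.score(X[split:], y[split:]))
```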