NetKit Description
NetKit-SRL is now open source on SourceForge. Please
go there for source code, new releases, and updates.
Old description:
NetKit is an open-source Network Learning Toolkit for statistical
relational learning. It is written in Java 1.5 and is described in
greater detail in the JMLR companion paper:
Macskassy, S. A., Provost, F. (2007). Classification in Networked Data: A Toolkit and a Univariate Case Study. Journal of Machine Learning Research, 8(May):935-983. [pdf].
[People]
[Example Data]
[Description]
[Binaries]
People
The people responsible for this toolkit are:
- Sofus A. Macskassy
- Kaveh R. Ghazi
Bug reports and comments should be directed to Sofus.
Example Data
Example data files for NetKit, taken from the data used in the journal paper [zip] (1.3Mb)
Description
- Input
The toolkit takes as its input a graph of homogeneous nodes and
heterogeneous edges. (While the input supports heterogeneous nodes,
the methods implemented only work with homogeneous nodes.) The
graph is defined by a schema file, which defines the set of nodes
and (directed) edges making up the graph. The data consists of
one file per node-type and one file per edge-type.
- Output
The toolkit is designed for probability estimation of
class-membership for categorical attributes. It can build a model
and predict values for any given categorical attribute for any given
node-type.
- Overview
The toolkit is modular and designed to be easy to use and to extend.
Each component is plug-and-play, so a comparative study can swap out
one component while keeping the rest of the setup exactly the same.
The toolkit consists of three main modules:
- Local Learner - a classifier using only intrinsic attributes of
an instance. This is used to initialize priors of the network.
- Relational Learner - a classifier using both intrinsic and
relational attributes, where a relational attribute is an attribute
associated with nodes in the neighborhood of the node whose
attribute is being predicted.
- Collective Inference Method - a method for performing collective
inferencing on a set of unknown nodes.
- High-level algorithm
The toolkit proceeds in the following steps:
- Induce a local model and apply it to initialize the priors.
- Induce a relational model.
- Apply collective inferencing with the relational model as its
classifier.
- Stop when the collective inference method converges to a stable
state or a stopping threshold is met.
- Output final estimates.
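The steps of the high-level algorithm can be illustrated with a minimal Java sketch. This is not NetKit's API and none of the names below come from the toolkit: the class CollectiveInferenceSketch, its methods, and the parameters maxIter and eps are hypothetical. As simplifying assumptions, the local model is just the class prior of the labeled nodes, the relational model is an unweighted average of the neighbors' current estimates, collective inference is a plain synchronous iterative update, and edges are treated as undirected adjacency lists. Unknown labels are encoded as -1.

```java
import java.util.Arrays;

// Illustrative only: a toy version of the local / relational / collective loop.
final class CollectiveInferenceSketch {

    // Local model (assumption): class prior estimated from the labeled nodes only.
    static double[] classPrior(int[] labels, int numClasses) {
        double[] prior = new double[numClasses];
        int known = 0;
        for (int y : labels) {
            if (y >= 0) {
                prior[y] += 1.0;
                known++;
            }
        }
        for (int c = 0; c < numClasses; c++) {
            prior[c] = (known == 0) ? 1.0 / numClasses : prior[c] / known;
        }
        return prior;
    }

    // Relational model (assumption): mean of the neighbors' current estimates.
    static double[] neighborAverage(int node, int[][] neighbors, double[][] est, double[] prior) {
        int[] nbrs = neighbors[node];
        if (nbrs.length == 0) {
            return prior.clone(); // isolated node: fall back to the prior
        }
        double[] avg = new double[prior.length];
        for (int nb : nbrs) {
            for (int c = 0; c < avg.length; c++) {
                avg[c] += est[nb][c] / nbrs.length;
            }
        }
        return avg;
    }

    // Collective inference: re-estimate all unknown nodes until the largest
    // change falls below eps or maxIter iterations have been run.
    static double[][] run(int[][] neighbors, int[] labels, int numClasses, int maxIter, double eps) {
        int n = labels.length;
        double[] prior = classPrior(labels, numClasses);
        double[][] est = new double[n][];
        for (int i = 0; i < n; i++) {
            if (labels[i] >= 0) {
                est[i] = new double[numClasses];
                est[i][labels[i]] = 1.0;      // known label: fixed estimate
            } else {
                est[i] = prior.clone();       // unknown label: initialize with the prior
            }
        }
        for (int iter = 0; iter < maxIter; iter++) {
            double[][] next = new double[n][];
            double maxChange = 0.0;
            for (int i = 0; i < n; i++) {
                if (labels[i] >= 0) {
                    next[i] = est[i];         // labeled nodes are never updated
                    continue;
                }
                next[i] = neighborAverage(i, neighbors, est, prior);
                for (int c = 0; c < numClasses; c++) {
                    maxChange = Math.max(maxChange, Math.abs(next[i][c] - est[i][c]));
                }
            }
            est = next;
            if (maxChange < eps) {
                break;                        // converged to a stable state
            }
        }
        return est;                           // final class-membership estimates
    }

    public static void main(String[] args) {
        // Chain 0 - 1 - 2 - 3 with the two end nodes labeled (classes 0 and 1).
        int[][] neighbors = {{1}, {0, 2}, {1, 3}, {2}};
        int[] labels = {0, -1, -1, 1};
        double[][] est = run(neighbors, labels, 2, 100, 1e-9);
        for (int i = 0; i < est.length; i++) {
            System.out.println("node " + i + ": " + Arrays.toString(est[i]));
        }
    }
}
```

In this toy chain, each unlabeled node ends up leaning toward the class of the labeled node it is closer to. Swapping in a different local learner, relational learner, or inference loop only requires replacing the corresponding method, which mirrors the plug-and-play design described above.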
Java 1.5 binaries (latest update: April 10, 2007):
Note: The source code will be available shortly. I am in the middle of
setting it up as a sourceforge project. It will be released under
the Apache License v2.0.
- April 10, 2007
Many updates and optimizations. Specifically, NetKit now has
stronger weka integration and will work as a fully functional
relational classifier. Aggregation has also been sped up.
User Guide (pdf)
NetKit (w/weka jar) [zip] (2.2Mb)
(this includes the jar file for weka version 3.4.2)
NetKit (no weka) [zip] (260Kb)
You will need to download weka for yourself at: http://www.cs.waikato.ac.nz/~ml/weka
The jar file you download should be put in the same directory as NetKit.jar. You must name it weka.jar.
- May 26, 2005
User Guide (pdf)
NetKit (w/weka jar) [tgz] [zip] (2Mb)
(this includes the jar file for weka version 3.4.2)
NetKit (no weka) [tgz] [zip] (180Kb)
You will need to download weka for yourself at: http://www.cs.waikato.ac.nz/~ml/weka
The jar file you download should be put in the same directory as NetKit.jar. You must name it weka.jar.
- Feb 1, 2005
NetKit (w/weka jar) [tgz] [zip] (2Mb)
(this includes the jar file for weka version 3.4.2)
NetKit (no weka) [tgz] [zip] (175Kb)
You will need to download weka for yourself at: http://www.cs.waikato.ac.nz/~ml/weka
The jar file you download should be put in the same directory as NetKit.jar. You must name it weka.jar.
Let me know if you want to be
notified when NetKit gets updated.