NetKit Description

NetKit-SRL is now open source on SourceForge!

NetKit-SRL is now on SourceForge. Please go there for source code, new releases, and updates.

Old description:

NetKit is an open-source Network Learning Toolkit for statistical relational learning. It is written in Java 1.5 and is described in greater detail in the JMLR companion paper:
  • Macskassy, S. A., Provost, F. (2007). Classification in Networked Data: A Toolkit and a Univariate Case Study. Journal of Machine Learning Research, 8(May):935-983. [pdf].

    [People] [Example Data] [Description] [Binaries]

    People
    The people responsible for this toolkit are:

    • Sofus A. Macskassy
    • Kaveh R. Ghazi
    Bug reports and comments should be directed to Sofus.

    Example Data
    Example data files for NetKit, taken from those used in the journal paper [zip] (1.3 MB)

    Description

    • Input

      The toolkit takes as its input a graph of homogeneous nodes and heterogeneous edges. (While the input supports heterogeneous nodes, the methods implemented only work with homogeneous nodes.) The graph is defined by a schema file, which defines the set of nodes and (directed) edges making up the graph. The data consists of one file per node-type and one file per edge-type.

    • Output

      The toolkit is designed for probability estimation of class-membership for categorical attributes. It can build a model and predict values for any given categorical attribute for any given node-type.

    • Overview:

      The toolkit is modular, easy to use, and easy to extend. Each component is plug-and-play, which makes it straightforward to run comparative studies under exactly the same setup.

      The toolkit consists of three main modules:

      1. Local Learner - a classifier using only intrinsic attributes of an instance. This is used to initialize priors of the network.
      2. Relational Learner - a classifier using both intrinsic and relational attributes, where a relational attribute is an attribute associated with nodes in the neighborhood of the node whose attribute is being predicted.
      3. Collective Inference Method - a method for performing collective inferencing on a set of unknown nodes.

    • High-level algorithm

      The toolkit proceeds as follows:

      1. Induce a local model and apply it to initialize the priors.
      2. Induce a relational model.
      3. Apply collective inferencing with the relational model as its classifier.
      4. Stop when the collective inference method converges to a stable state or a stopping threshold is met.
      5. Output final estimates.
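
The five steps above can be illustrated with a minimal Java sketch. This is not NetKit's API: the class name CollectiveInferenceSketch, the method run, and all parameters are hypothetical and exist only for this example. It assumes the simplest possible components: the local model is just the class prior estimated from the labeled nodes, the relational model is a plain average of the neighbors' current class estimates (so it has no parameters to induce), and collective inference is a synchronous iterative update that stops at a change threshold or an iteration cap.

```java
final class CollectiveInferenceSketch {

    /**
     * Hypothetical illustration of the high-level loop; not part of NetKit.
     *
     * @param neighbors  neighbors[i] holds the indices of the nodes adjacent to node i
     * @param labels     class index of each labeled node, or -1 if the label is unknown
     * @param numClasses number of values of the categorical attribute
     * @param maxIters   stopping threshold on the number of iterations
     * @param tol        stop once no estimate changes by more than this amount
     * @return one class-membership probability vector per node
     */
    static double[][] run(int[][] neighbors, int[] labels,
                          int numClasses, int maxIters, double tol) {
        int n = labels.length;

        // Step 1: "local model". Assumed here to be the class prior of the
        // labeled nodes (uniform if there are none); it initializes the unknowns.
        double[] prior = new double[numClasses];
        int known = 0;
        for (int y : labels) {
            if (y >= 0) { prior[y] += 1.0; known++; }
        }
        for (int c = 0; c < numClasses; c++) {
            prior[c] = known > 0 ? prior[c] / known : 1.0 / numClasses;
        }
        double[][] est = new double[n][];
        for (int i = 0; i < n; i++) {
            if (labels[i] >= 0) {
                est[i] = new double[numClasses];
                est[i][labels[i]] = 1.0;      // known labels stay fixed
            } else {
                est[i] = prior.clone();
            }
        }

        // Step 2: "relational model". Assumed here to be the average of the
        // neighbors' current estimates, so nothing needs to be trained.

        // Steps 3-4: collective inference with synchronous updates.
        for (int iter = 0; iter < maxIters; iter++) {
            double[][] next = new double[n][];
            double maxChange = 0.0;
            for (int i = 0; i < n; i++) {
                if (labels[i] >= 0 || neighbors[i].length == 0) {
                    next[i] = est[i];         // nothing to re-estimate
                    continue;
                }
                double[] avg = new double[numClasses];
                for (int j : neighbors[i]) {
                    for (int c = 0; c < numClasses; c++) avg[c] += est[j][c];
                }
                for (int c = 0; c < numClasses; c++) {
                    avg[c] /= neighbors[i].length;
                    maxChange = Math.max(maxChange, Math.abs(avg[c] - est[i][c]));
                }
                next[i] = avg;
            }
            est = next;
            if (maxChange < tol) break;       // converged to a stable state
        }

        // Step 5: output the final estimates.
        return est;
    }
}
```

For example, on a four-node chain 0-1-2-3 where node 0 is labeled with class 0 and node 3 with class 1, calling run with the adjacency lists {{1}, {0, 2}, {1, 3}, {2}} and labels {0, -1, -1, 1} pulls the estimate for node 1 toward class 0 and the estimate for node 2 toward class 1. Swapping in a different local learner, relational learner, or inference method only changes the corresponding step, which is the kind of modularity the overview describes.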

    Java 1.5 binaries (latest update: April 10, 2007):
    Note: The source code will be available shortly. I am in the middle of setting it up as a sourceforge project. It will be released under the Apache License v2.0.

    • April 10, 2007
      Many updates and optimizations. Specifically, NetKit now has stronger Weka integration and will work as a fully functional relational classifier. Aggregation has also been sped up.
      User Guide (pdf)
      NetKit (w/weka jar) [zip] (2.2Mb)
                (this includes the jar file for weka version 3.4.2)
      NetKit (no weka) [zip] (260Kb)
                You will need to download weka for yourself at: http://www.cs.waikato.ac.nz/~ml/weka
                The jar file you download should be put in the same directory as NetKit.jar. You must name it weka.jar.
    • May 26, 2005
      User Guide (pdf)
      NetKit (w/weka jar) [tgz] [zip] (2Mb)
                (this includes the jar file for weka version 3.4.2)
      NetKit (no weka) [tgz] [zip] (180Kb)
                You will need to download weka for yourself at: http://www.cs.waikato.ac.nz/~ml/weka
                The jar file you download should be put in the same directory as NetKit.jar. You must name it weka.jar.
    • Feb 1, 2005
      NetKit (w/weka jar) [tgz] [zip] (2Mb)
                (this includes the jar file for weka version 3.4.2)
      NetKit (no weka) [tgz] [zip] (175Kb)
                You will need to download weka for yourself at: http://www.cs.waikato.ac.nz/~ml/weka
                The jar file you download should be put in the same directory as NetKit.jar. You must name it weka.jar.
    Let me know if you want to be notified when NetKit gets updated.