Graph Mining using Graph Pattern Profiles
[pdf]
Proceedings of the 2008 International Conference on Artificial Intelligence (ICAI).
Abstract
This paper presents our investigation into graph mining methods to
help users understand large graphs. Our approach is a two-step
process: First calculate subgraph labels and then calculate
distribution statistics on these labels. Our approach is flexible in
that it can identify a range of patterns from very abstract to very
specific (e.g., isomorphisms). The statistics that we calculate can
be used to find rare and common patterns, patterns that are
(dis)similar to the distribution of induced subgraphs of the same
size, patterns that are (dis)similar to each other, as well as
variance of graph patterns given a specific set of input node types.
We also investigate a method to understand structural characteristics
by analyzing clusters that are created by ``collapsing'' overlapping
instances of user-specified patterns. We evaluated our approach on
two publicly available networks---the Texas CS web-site from WebKB and
the internet movie database.