Description
Compares a graph with a classification/clustering file.
The comparison is performed at the level of arcs: for each arc of the
graph, the program tests whether the source and target nodes are found in a
common cluster (intra-cluster arcs) or in separate clusters
(inter-cluster arcs).
In the cluster file, each node can be associated with one or several
clusters. For example, the "cluster" file could contain the result of a
classification such as the Gene Ontology (GO), where each gene can be
associated with multiple GO classes.
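The arc test described above can be sketched in Python as follows. This is only an illustration of the principle, not the program's actual implementation: the function name classify_arcs, the in-memory data layout, and the choice to treat a node absent from the cluster file as belonging to no cluster are all assumptions.

```python
def classify_arcs(arcs, node_clusters):
    """Label each arc as intra- or inter-cluster.

    arcs: iterable of (source, target) pairs.
    node_clusters: dict mapping a node name to the set of clusters
    it belongs to (a node may belong to several clusters).
    """
    results = []
    for source, target in arcs:
        # Clusters shared by both end nodes. Assumption: a node that is
        # absent from the cluster file belongs to no cluster.
        common = node_clusters.get(source, set()) & node_clusters.get(target, set())
        status = "intra" if common else "inter"
        results.append((source, target, status, sorted(common)))
    return results


# Hypothetical toy data: node "b" belongs to two clusters.
arcs = [("a", "b"), ("b", "c"), ("c", "d")]
node_clusters = {"a": {"GO1"}, "b": {"GO1", "GO2"}, "c": {"GO2"}, "d": {"GO3"}}
labelled = classify_arcs(arcs, node_clusters)
```

Because a node may belong to several clusters, an arc counts as intra-cluster in this sketch as soon as its two nodes share at least one cluster.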
Authors
Options
Input and output formats
Graph format
The accepted input formats are GML, tab-delimited and adjacency matrix.
For more explanations about these, refer to the manual of convert-graph.
Classification format
A two-column file whose columns contain the node name and the cluster name, respectively.
If the induced option is used, the classification file must consist only of a list of nodes, each node being
the first word of a line.
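As an illustration, the two layouts of the classification file could be read as sketched below. The function name read_classification, the tab as column separator, and the placeholder cluster name "induced" are assumptions made for this example; they are not taken from the program.

```python
from collections import defaultdict


def read_classification(lines, induced=False):
    """Return a dict mapping each node name to its set of clusters.

    lines: iterable of text lines from the classification file.
    induced: if True, only the first word of each line (the node) is read.
    """
    node_clusters = defaultdict(set)
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip empty lines
        if induced:
            # Node list only: every node goes into one placeholder
            # cluster (hypothetical name, chosen for this sketch).
            node = line.split()[0]
            node_clusters[node].add("induced")
        else:
            # Two columns: node name, then cluster name (tab assumed).
            node, cluster = line.split("\t")[:2]
            node_clusters[node].add(cluster)
    return dict(node_clusters)


# Hypothetical file contents.
two_columns = ["geneA\tGO1", "geneA\tGO2", "geneB\tGO1"]
node_list = ["geneA some comment", "geneB"]
by_cluster = read_classification(two_columns)
by_list = read_classification(node_list, induced=True)
```

A node listed on several lines with different cluster names ends up with several clusters, which matches the multiple-membership case mentioned in the description.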
Column specifications (only for tab-delimited format)
Source and target columns. Columns containing the source and target nodes.
Weight or label column. Column containing the weight or the label on the edge.
Results
Annotated graph
Each row corresponds to one arc, identified by its source and target
nodes, and with additional columns for the annotations, indicating
the status of each arc (intra- or inter-cluster), plus some
statistics about the clusters associated to the source and target
nodes. Extra columns are documented in the header of the file.
Graph
A list of intra-cluster edges in the requested format. Intra-cluster edges belonging
to different classes are colored differently.
Node cluster connection table
A tab-delimited file where each row represents a node of the graph,
and each column a cluster (class). The cells indicate the number of
arcs connecting each node to each cluster (class) (or the sum of the
weights on the edges).
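One possible way to compute such a table is sketched below. It is not the program's implementation: the function name node_cluster_table is hypothetical, and the sketch assumes that an arc connects a node to a cluster when the node at its other end belongs to that cluster, with arcs counted in both directions.

```python
from collections import defaultdict


def node_cluster_table(arcs, node_clusters, use_weights=False):
    """Count, for each node, the arcs linking it to each cluster.

    arcs: iterable of (source, target, weight) triples.
    node_clusters: dict mapping a node name to its set of clusters.
    use_weights: if True, sum the arc weights instead of counting arcs.
    """
    table = defaultdict(lambda: defaultdict(int))
    for source, target, weight in arcs:
        value = weight if use_weights else 1
        # Assumption: each arc is credited to both of its end nodes.
        for node, neighbour in ((source, target), (target, source)):
            for cluster in node_clusters.get(neighbour, ()):
                table[node][cluster] += value
    return {node: dict(row) for node, row in table.items()}


# Hypothetical toy data.
arcs = [("a", "b", 2.0), ("b", "c", 0.5)]
node_clusters = {"a": {"X"}, "b": {"X", "Y"}, "c": {"Y"}}
counts = node_cluster_table(arcs, node_clusters)
weights = node_cluster_table(arcs, node_clusters, use_weights=True)
```

In this toy example, node "a" has one arc to cluster "X" and one to cluster "Y", both through its neighbour "b", which belongs to the two clusters.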