Signed or unsigned: which network type is preferable?

How should pairs of nodes with strong negative correlations be treated in a correlation network analysis? One option is to consider them connected, just as if the correlation were positive. A network constructed in this way is an unsigned network, because the sign of the correlation does not matter. Alternatively, strongly negatively correlated nodes can be considered unconnected. This leads to a signed network, so called because the sign of a strong correlation value makes all the difference between the pair of nodes being strongly connected or not connected at all. To avoid any confusion, I want to emphasize that in both cases the resulting adjacency matrix (the matrix that contains the connection strengths between nodes) is always non-negative.
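To make the distinction concrete, here is a minimal sketch in base R. It assumes the soft-thresholding definitions documented for the adjacency function in WGCNA (unsigned: |cor|^power; signed: ((1 + cor)/2)^power). The correlation values and the power of 6 are made up for illustration.

```r
# Toy correlations between one node and five others (made-up values).
r <- c(-0.9, -0.5, 0, 0.5, 0.9)

# Assumed soft-thresholding power, chosen only for illustration.
beta <- 6

# Unsigned network: the sign is discarded, so cor = -0.9 and cor = 0.9
# produce the same connection strength.
unsignedAdj <- abs(r)^beta

# Signed network: correlations are first mapped from [-1, 1] to [0, 1],
# so strong negative correlations end up with adjacency near zero.
signedAdj <- ((1 + r) / 2)^beta

# Both adjacencies are non-negative; only the treatment of negative
# correlations differs.
print(round(data.frame(cor = r, unsigned = unsignedAdj, signed = signedAdj), 4))
```

With these numbers, the pair with correlation -0.9 gets an unsigned adjacency of about 0.53 but a signed adjacency that is practically zero, while both adjacencies stay within [0, 1].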

Should you use a signed or unsigned network? By and large, I recommend using one of the signed varieties, for two main reasons. First, more often than not, direction does matter: it is important to know where node profiles go up and where they go down, and grouping negatively correlated nodes together necessarily mixes the two directions. Second, negatively correlated nodes often belong to different categories. For example, in gene expression data, negatively correlated genes tend to come from biologically very different categories. It is true that some pathways or processes involve pairs of genes that are negatively correlated; if there are enough negatively correlated genes, they will form a module of their own, and the two modules can then be analyzed together. (For the advanced practitioner, another option is to use the fuzzy module membership measure based on the module eigengene to attach a few strongly negatively correlated genes to a module after the modules have been identified.)

By and large does not mean always, and there may be applications in which an unsigned network is preferable. In principle there’s also nothing wrong with carrying out both types of analysis, but working with two related yet distinct analyses of the same data may quickly get confusing and tiring.

For historical reasons (compatibility with old calculations), the defaults in the current implementation of the WGCNA R package disregard my own recommendation and imply unsigned networks. To work with signed networks, set the argument type or networkType (the name depends on the function) to "signed" or "signed hybrid" (more on their difference later) whenever calling a function that in some shape or form constructs a network. If in doubt, help in R is always only a few keystrokes away.
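As a rough sketch of what this looks like in practice, the snippet below requests signed networks from two WGCNA functions. The simulated data, the object names, and the soft-thresholding powers are assumptions made for illustration; with real data you would choose the power from your own analysis.

```r
library(WGCNA)

set.seed(1)
# Simulated stand-in for real data: 20 samples (rows) by 50 nodes (columns).
datExpr <- matrix(rnorm(20 * 50), nrow = 20, ncol = 50)
colnames(datExpr) <- paste0("node", 1:50)

# adjacency() selects the network type through the argument `type`.
# The powers used here are illustrative assumptions, not recommendations
# derived from this data.
adjSigned <- adjacency(datExpr, type = "signed", power = 12)
adjHybrid <- adjacency(datExpr, type = "signed hybrid", power = 6)

# TOMsimilarityFromExpr() uses `networkType` for the same purpose.
tomSigned <- TOMsimilarityFromExpr(datExpr, networkType = "signed",
                                   TOMType = "signed", power = 12)

# Whatever the network type, the adjacency stays non-negative.
print(range(adjSigned))
print(range(adjHybrid))
```

Because the default is unsigned, it is worth passing the network type explicitly in every such call rather than relying on it being carried over from an earlier step; ?adjacency and ?TOMsimilarityFromExpr list the accepted values.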

Read on: Two types of signed networks in WGCNA

Published by Peter Langfelder