Vote: Fortune 500, or Al-Qaeda?
People working together on projects tend to interact in fairly predictable ways -- whether that project is installing a new computer system, or blowing up a building. So looking only at the links between people won't tell you much about what those folks are up to. In fact, the links can be rather deceptive at times, especially if your data set is huge, like the NSA's ginormous database of phone records. Other information is needed to fill in the gaps.
Here's an example, below. Can you tell which cluster is from a Fortune 500 company, and which one is from Al-Qaeda? Network analysis guru Valdis Krebs shows this slide to corporate and government audiences. Their answers are usually pretty scattershot. Take your guesses in the comments section. Valdis will be back later on with the right answer.

My guess is that the one on the left is Al-Qaeda, due to the multiple connections between cells and the hub, the few inter-cell connections, and the larger central hub.
In industry, cells/departments have specific patterns:
- Outlying clusters usually have 1 connection since there is only one manager or one external contact point.
- Leaves of a cell may connect to leaves of other cells. E.g., a developer may call a hardware tech for questions. This bypasses the central hub.
- The core is tight -- few external contacts. This is due to a core management team (e.g., CEO + VPs).
The graph on the right shows all of these features.
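The corporate traits listed above can be counted mechanically once each node is assigned to a cell. What follows is only a minimal sketch under stated assumptions: the function name `cell_link_profile`, the `"core"` label, and the toy org chart are all hypothetical, and real data would first need some way to assign people to cells.

```python
from collections import defaultdict

def cell_link_profile(edges, cell_of, core="core"):
    """Count, per outlying cell, its links into the core, and count
    links that directly join two different outlying cells."""
    uplinks = defaultdict(int)   # cell -> number of edges into the core
    cross_cell = 0               # edges between two different non-core cells
    for a, b in edges:
        ca, cb = cell_of[a], cell_of[b]
        if ca == cb:
            continue             # link inside one cell; not of interest here
        if core in (ca, cb):
            outlying = cb if ca == core else ca
            uplinks[outlying] += 1
        else:
            cross_cell += 1      # e.g. a developer calling a hardware tech
    return dict(uplinks), cross_cell

# Hypothetical toy data: a CEO, two VPs, two managers, and their staff.
cell_of = {
    "ceo": "core", "vp1": "core", "vp2": "core",
    "mgr_dev": "dev", "dev1": "dev", "dev2": "dev",
    "mgr_hw": "hw", "tech1": "hw",
}
edges = [
    ("ceo", "vp1"), ("ceo", "vp2"), ("vp1", "vp2"),
    ("vp1", "mgr_dev"), ("mgr_dev", "dev1"), ("mgr_dev", "dev2"),
    ("vp2", "mgr_hw"), ("mgr_hw", "tech1"),
    ("dev1", "tech1"),           # leaf-to-leaf link that bypasses the hub
]

uplinks, cross_cell = cell_link_profile(edges, cell_of)
print(uplinks, cross_cell)
```

In this toy graph each outlying cell has exactly one link into the core and there is one leaf-to-leaf link between cells, which is the pattern described for industry.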
For terrorist networks:
- The main hub has many cross-connections. This prevents a single loss (capture/kill) from breaking the entire network. (Industry does not worry about this since subordinates are documented. Terrorists usually do not document structures since documents could compromise their network. Redundancy is used in lieu of documentation.)
- Cells may have cross-communication with parents, but are isolated from other cells. The multiple connections to parents show the communication redundancy. There is no intercommunication between cells because they do not know that the others exist.
This matches the graph on the left.
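The redundancy argument can be made concrete by asking which single node, if removed, splits the rest of the network. Here is a rough sketch, assuming an undirected graph given as a list of edges; the function names and both toy graphs are hypothetical illustrations, not data from the slide.

```python
def is_connected(nodes, edges):
    """Return True if the given nodes form one connected component,
    using only edges whose endpoints are both in `nodes`."""
    nodes = set(nodes)
    if not nodes:
        return True
    adj = {n: set() for n in nodes}
    for a, b in edges:
        if a in nodes and b in nodes:
            adj[a].add(b)
            adj[b].add(a)
    start = next(iter(nodes))
    seen, stack = {start}, [start]
    while stack:                      # plain depth-first search
        n = stack.pop()
        for m in adj[n]:
            if m not in seen:
                seen.add(m)
                stack.append(m)
    return seen == nodes

def single_points_of_failure(edges):
    """List nodes whose removal (capture/kill) disconnects the remainder."""
    nodes = {n for e in edges for n in e}
    return sorted(n for n in nodes if not is_connected(nodes - {n}, edges))

# Hypothetical graph 1: each cell reaches the hub through one contact.
single_uplink = [
    ("h1", "h2"),
    ("h1", "a1"), ("a1", "a2"),
    ("h2", "b1"), ("b1", "b2"),
]
# Hypothetical graph 2: cross-connected hub, two uplinks per cell,
# and no links between cell a and cell b.
redundant = [
    ("h1", "h2"), ("h2", "h3"), ("h1", "h3"),
    ("a1", "a2"), ("a1", "h1"), ("a2", "h2"),
    ("b1", "b2"), ("b1", "h2"), ("b2", "h3"),
]

print(single_points_of_failure(single_uplink))
print(single_points_of_failure(redundant))
```

The brute-force check removes each node in turn, which is fine for small examples; on a large data set an articulation-point algorithm would be the usual substitute.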
Then again, I could be wrong.
A lot of this depends on the source of the data, the duration of the collection, and the scope of the graph. Is this a single Fortune 500 company or a department? Do the graphs span a week or a year?
Are the graphs from phone trees, network connections, IRC, IM, or something else?
Posted by: Dr. Neal Krawetz at May 17, 2006 8:59 AM