Connectivity Tab

 
 
 
This tab applies the connectivity analysis methods to graph format files.
 
1

Connectivity Analysis Method

 
Options are:
1) Shortest-path betweenness centrality
2) Shortest-path betweenness centrality subset
3) Current flow betweenness centrality
4) Current flow betweenness centrality subset
5) Minimum cost flow (min-cost-max-flow; both all-pairs and subset)
6) Time series minimum cost flow
 
In addition, the following network flow methods are available but not currently documented:
PageRank
Minimum cut
Maximum flow
Maximum flow - flow value
Maximum flow - normalized
 
 
2

Browse for graph format file

 
Browse to locate the input graph format file produced in the previous step.
 
3

Optional files

 
Several optional files can be supplied as input to the connectivity analysis.
 
Options in this menu are:
 
1) None specified - All pairs of nodes will be analyzed (typical option for centrality analysis)
 
2) Generic data - Subset centrality using network flow methods can accept the node (hex) ids of the specific source/target node pairs to be analyzed. The input file takes one of two forms. It can be a two-line file with a space-separated row of source ids on the first line and target ids on the second line (see sourcetarget.txt, provided for Tutorial exercise 1, for an example). Alternatively, it can be a file of more than two lines, in which each line gives the space-separated ids of one source/target pair. Non-network flow methods (shortest-path, current flow, and PageRank) cannot accept this input format.
 
3) Coordinate data - This is a text file with a .crd extension, produced during hexmap generation (see 'Hexmap Tab'), with three space-separated fields: node (hex) id, x-coordinate, y-coordinate. The coordinate file is used in 'bounded-distance' analyses, in which only pairs of nodes less than a threshold distance apart are analyzed.
 
4) Source and target data - Input for subset centrality using either network flow methods or python-NetworkX based methods (shortest-path and current flow). This input consists of two files, one listing the node (hex) ids of the source nodes and the other listing the node (hex) ids of the target nodes for the subset centrality analysis. Each file contains a single column of node ids (see sourcetarget2.txt, provided for Tutorial exercise 2, for an example).
 
Node id files are typically produced in a GIS by overlaying points or patches of interest on the shapefile produced by the Toolkit.
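
The file layouts described above can be read with a few lines of Python. This is a minimal sketch, not the Toolkit's own reader: the function names parse_generic and parse_crd are hypothetical, node ids are kept as strings because the manual does not state their type, and blank lines are assumed to be ignorable.

```python
def parse_generic(text):
    """Read 'Generic data' text in either of the two documented layouts."""
    rows = [line.split() for line in text.splitlines() if line.strip()]
    if len(rows) < 2:
        raise ValueError("expected at least two non-empty lines")
    if len(rows) == 2:
        # Two-line layout: first row holds source ids, second row target ids.
        return {"sources": rows[0], "targets": rows[1]}
    # Layout with more than two lines: each line is one source/target pair.
    for row in rows:
        if len(row) != 2:
            raise ValueError(f"expected two ids per line, got {row!r}")
    return {"pairs": [tuple(row) for row in rows]}


def parse_crd(text):
    """Read coordinate text: node id, x-coordinate, y-coordinate per line."""
    coords = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        node_id, x, y = line.split()
        coords[node_id] = (float(x), float(y))
    return coords


# Illustrative values only; these ids and coordinates are made up.
print(parse_generic("1 2 3\n7 8\n"))
print(parse_generic("1 7\n2 8\n3 9\n"))
print(parse_crd("12 100.5 200.0\n13 150.5 200.0\n"))
```

A source or target file (option 4) is a single column of ids, so text.split() on its contents is enough to read it.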
 
 
 
4

Browse for optional input files

 
Browse to locate the optional input file(s) if needed. These options will be disabled if not relevant to the method chosen.
 
 
5

Distance threshold

 
Enter the distance threshold used for bounded-distance analyses here (minimum cost flow only). Express distances in the units of the original .asc file (typically meters).
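
As an illustration of how a distance threshold restricts the node pairs that are analyzed, the sketch below selects the pairs that lie closer together than the threshold. It assumes straight-line (Euclidean) distance between node coordinates and unordered pairs; the manual does not specify either, and the function name pairs_within and the sample coordinates are hypothetical.

```python
import math
from itertools import combinations


def pairs_within(coords, threshold):
    """Return node-id pairs whose straight-line distance is below threshold.

    coords maps node id -> (x, y), in the same units as the threshold.
    """
    selected = []
    for (a, (ax, ay)), (b, (bx, by)) in combinations(sorted(coords.items()), 2):
        if math.hypot(ax - bx, ay - by) < threshold:
            selected.append((a, b))
    return selected


# Made-up coordinates in meters: nodes 1 and 2 are 500 m apart,
# node 3 is several kilometers from both.
coords = {1: (0.0, 0.0), 2: (300.0, 400.0), 3: (3000.0, 0.0)}
print(pairs_within(coords, 1000.0))
```

Comparing every pair is quadratic in the number of nodes, which is fine for a sketch but would call for a spatial index on large hexmaps.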
 
6

Use Scaling

 
If checked, min-cost-max-flow output is scaled by (n - 1)(n - 2), where n is the number of nodes. (Shortest-path and current flow BC results are scaled by default.) Do not use this option for subset analyses.
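
The scaling can be sketched as follows. This assumes that "scaled by" means each node's raw score is divided by (n - 1)(n - 2), as in the usual normalization of betweenness centrality, and that the score table has one entry per node in the graph. The function name scale_scores and the sample scores are hypothetical.

```python
def scale_scores(raw_scores):
    """Divide each raw score by (n - 1)(n - 2), where n is the node count."""
    n = len(raw_scores)
    if n < 3:
        raise ValueError("scaling by (n - 1)(n - 2) needs at least three nodes")
    factor = (n - 1) * (n - 2)
    return {node: score / factor for node, score in raw_scores.items()}


# Four nodes, so the factor is (4 - 1) * (4 - 2) = 6.
raw = {"a": 6.0, "b": 3.0, "c": 0.0, "d": 12.0}
print(scale_scores(raw))
```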
 
 
7

Browse for output

 
Browse for the location of the output file.
 
8

No Data Value

 
Typically, nodes with NODATA (usually -9999) values will not be included in the analysis. However, as they were typically filtered out in the generation of the graph format file ('Graphs' tab), this option is redundant except in cases where files that were created outside the application are being analyzed.
 
9

Number of threads

 
Users of systems with multi-core CPUs can improve performance by using multiple threads. By default, this value is set to the maximum number of virtual CPUs available. Lower it to leave some CPU capacity available for other programs. This parameter is relevant only when the min cost flow method is used.
10

Supply Fraction

 
Min-cost-max-flow analyses first identify the maximum flow between a node pair, and then the min-cost flow path for that flow. If desired, the user can specify that the mapped min-cost flow path accommodates a set fraction of the max-flow. This may be useful where planners want to maintain, for example, 80% of existing connectivity at minimum cost.
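
The two-stage idea (find the maximum flow, then route a fraction of it at minimum cost) can be sketched in plain Python. This is a textbook successive-shortest-paths routine on a toy graph, not the solver used by the Toolkit. All function names, the edge list, and the capacities and costs are hypothetical, and edge costs are assumed to be non-negative.

```python
def add_edge(graph, u, v, capacity, cost):
    """Add edge u->v plus a zero-capacity reverse edge for the residual graph."""
    graph.setdefault(u, [])
    graph.setdefault(v, [])
    forward = [v, capacity, cost, len(graph[v])]   # last field: index of reverse edge
    backward = [u, 0, -cost, len(graph[u])]
    graph[u].append(forward)
    graph[v].append(backward)


def build(edges):
    graph = {}
    for u, v, capacity, cost in edges:
        add_edge(graph, u, v, capacity, cost)
    return graph


def cheapest_path(graph, source, sink):
    """Bellman-Ford over edges with spare capacity; returns predecessor links."""
    dist = {node: float("inf") for node in graph}
    dist[source] = 0
    prev = {}
    for _ in range(len(graph) - 1):
        changed = False
        for u in graph:
            if dist[u] == float("inf"):
                continue
            for i, (v, cap, cost, _rev) in enumerate(graph[u]):
                if cap > 0 and dist[u] + cost < dist[v]:
                    dist[v] = dist[u] + cost
                    prev[v] = (u, i)
                    changed = True
        if not changed:
            break
    return prev if sink in prev else None


def send_flow(graph, source, sink, limit=float("inf")):
    """Push up to `limit` units along successively cheapest paths."""
    flow = cost = 0
    while flow < limit:
        prev = cheapest_path(graph, source, sink)
        if prev is None:
            break
        # Bottleneck capacity on the path, capped by what is left of the limit.
        push = limit - flow
        node = sink
        while node != source:
            u, i = prev[node]
            push = min(push, graph[u][i][1])
            node = u
        # Apply the flow and update the residual capacities.
        node = sink
        while node != source:
            u, i = prev[node]
            edge = graph[u][i]
            edge[1] -= push
            graph[node][edge[3]][1] += push
            cost += push * edge[2]
            node = u
        flow += push
    return flow, cost


def min_cost_for_fraction(edges, source, sink, fraction):
    """Route `fraction` of the maximum flow at minimum cost; return (flow, cost)."""
    max_flow, _ = send_flow(build(edges), source, sink)
    return send_flow(build(edges), source, sink, limit=fraction * max_flow)


# Toy network: (from, to, capacity, cost per unit of flow).
edges = [("s", "a", 3, 1), ("a", "t", 3, 1), ("s", "b", 2, 2), ("b", "t", 2, 3)]
print(min_cost_for_fraction(edges, "s", "t", 1.0))   # full maximum flow
print(min_cost_for_fraction(edges, "s", "t", 0.8))   # 80% of the maximum flow
```

In this toy network the maximum flow is 5 units. Routing only 80% of it (4 units) leaves most of the expensive s-b-t route unused, so the total cost falls from 16 to 11.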
11

Probability

 
This value specifies the alpha parameter used in shortest-path BC, current-flow BC, and PageRank analyses. It represents a different parameter in each of the three methods:
1) Shortest-path BC: Alpha represents the proportion of nodes sampled (k/n) in the approximation process. The default value of 0.05 results in 5% of the nodes being subsampled, which typically results in an approximation that is >99% correlated with the exact value. Entering a value of 1 in this field will cause the exact algorithm to be used (warning: this may cause the function to take a long time to complete).
2) Current-flow BC: Alpha represents the absolute error tolerance (epsilon) in the approximation function. The default value of 0.1 typically results in an approximation that is >99% correlated with the exact value. Entering a value of 0 in this field will cause the exact algorithm to be used (warning: this may cause the function to use large amounts of RAM or fail with a memory error if sufficient RAM is not available).
3) PageRank: Alpha represents (1 - the probability of a 'jump' to a random node in the graph). A default alpha value of 0.85 is used in typical webpage ranking applications, but a default alpha value of 0.95 is more appropriate in this context.
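
The arithmetic behind the first and third meanings of alpha can be made concrete with a short sketch. The helper names pivot_count and jump_probability are hypothetical, and rounding k to the nearest whole node (with a minimum of one) is an assumption; the manual only states that alpha equals k/n.

```python
def pivot_count(alpha, n):
    """Number of sampled nodes k for a sampling proportion alpha = k / n."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be greater than 0 and at most 1")
    if alpha == 1:
        return n                      # every node is used: the exact algorithm
    return max(1, round(alpha * n))   # assumption: round, but sample at least one node


def jump_probability(alpha):
    """PageRank: probability of a 'jump' to a random node, given alpha."""
    return 1 - alpha


print(pivot_count(0.05, 2000))   # default alpha on a 2000-node graph
print(pivot_count(1, 2000))
print(jump_probability(0.95))
```

With the default of 0.05, a 2000-node graph is approximated from 100 sampled nodes, and a PageRank alpha of 0.95 corresponds to a jump probability of about 0.05.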
12

Run button

 
 
 
