Characterization of complex networks c A survey of measurements official title Characterization of complex networks: A survey of measurements "C:\Users\kurtw_000\Box Sync\DocDR\2014\03-29-2014d1134\Characterization of complex networks c A survey of measurements.pdf" ref 139 -[Communication boundaries in networks] ref 142 -Degree distribution of complex networks from statistical mechanics principle -here is another paper from the same author (Ginestra Bianconi) which looks extremely interesting -- { The entropy of randomized network ensembles http://iopscience.iop.org/0295-5075/81/2/28005 } --and another: Entropy of network ensembles -[pr_Ginestra Bianconi portal 06-26-2015d1058] -[some notes on k-core decomposition 06-22-2015d1010] Some notes { Extracted Annotations (7/5/2015, 3:37:55 PM) "Informally speaking, if the measurements considered in the feature vector are such that small changes of the network topology (e.g., add/ remove a few edges or vertices) imply large changes in the measurements (large values of k~k), those measurements can be considered as being highly sensitive or unstable." (Demaria et al 2014:727) "There is an unlimited set of topological measurements, and they are often correlated, implying redundancy in most of the cases." (Demaria et al 2014:728) "All network types can be derived from the weighted digraph through appropriate transformations." (Demaria et al 2014:729) "Graphs with selfor duplicate connections are sometimes called multigraphs, or degenerate graphs" (Demaria et al 2014:729) "most of the networks of interest are sparse, in the sense that only a small fraction of all possible edges are present." (Demaria et al 2014:730) "loop or cycle is defined as a walk starting and terminating in the same vertex i and passing only once through each vertex k n . In case all the vertices and edges along a walk are distinct, the walk is a path. Many measurements are based on the length of such connecting paths" (Demaria et al 2014:730) "Out-degree of vertex i" (Demaria et al 2014:731) "In-degree of vertex i" (Demaria et al 2014:731) "percolation [63] of the network." (Demaria et al 2014:731) "In their 1959 paper [5], Erdo s and Renyi introduced a model to generate random graphs consisting of N vertices and M edges." (Demaria et al 2014:732) "For this model, P(k) (the degree distribution, see section 6) is a Poisson distribution" (Demaria et al 2014:732) "small world property, i.e. most vertices can be reached from the others through a small number of edges" (Demaria et al 2014:732) "famous experiment made by Milgram in 1967" (Demaria et al 2014:732) "The most popular model of random networks with small world characteristics and an abundance of short loops was developed by Watts and Strogatz [8] and is called the Watts-Strogatz (WS) small-world model." (Demaria et al 2014:734) "small-world networks are common in a variety of realms ranging from the C. elegans neuronal system to power grids." (Demaria et al 2014:734) "The degree distribution for small-world networks is similar to that of random networks, with a peak at hki¼ 2 (" (Demaria et al 2014:734) "Models to generate networks with a given degree distribution, while being random in other aspects, have been proposed." (Demaria et al 2014:735) "the degree distribution has been found to follow a power law for large k" (Demaria et al 2014:736) "These networks are called scale-free networks" (Demaria et al 2014:736) "The Barabasi-Albert (BA) network model is based on two basic rules: growth and preferential attachment." (Demaria et al 2014:736) "The vertices which receive the new edges are chosen following a linear preferential attachment rule, i.e. the probability of the new vertex i to connect with an existing vertex j is proportional to the degree of j," (Demaria et al 2014:736) "Thus, the most connected vertices have greater a probability of receiving new vertices. This is known as "the rich get richer" paradigm." (Demaria et al 2014:736) "Networks with community structure Some real networks, such as social and biological networks, present modular structure [10]. These networks are formed by sets or communities of vertices such that most connections are found between vertices inside the same community, while connections between vertices of different communities are less common. A model to generate networks with this property was proposed by Girvan and Newman [10]." (Demaria et al 2014:736) "there are many networks where the position of vertices is particularly important as it influences the network's evolution. This is the case for highway networks or the Internet, for example, where the position of cities and routers can be localized in a map and the edges between them correspond to real physical entities, such as roads and optical fibers [85]. This kind of network is called geographical or spatial network." (Demaria et al 2014:737) "For undirected, unweighted graphs, the number of edges in a path connecting vertices i and j is called the length of the path. A geodesic path (or shortest path), between vertices i and j, is one of the paths connecting these vertices with minimum length (many geodesic paths may exist between two vertices); the length of the geodesic paths is the geodesic distance d ij between vertices i and j." (Demaria et al 2014:738) "The determination of shortest distances in a network is only possible with global information on the structure of the network. This information is not always available. When global information is unavailable, navigation in a network must happen using limited, local information and a specific algorithm. The effective distance between two vertices is thus generally larger than the shortest distance, and dependent on the algorithm used for navigation as well as network structure" (Demaria et al 2014:739) "Vulnerability" (Demaria et al 2014:739) "characterize the presence of loops of order three is through the clustering coefficient." (Demaria et al 2014:740) "coefficient to measure how cyclic a network is." (Demaria et al 2014:742) "High clustering coefficient and power-law degree distribution are ubiquitous in most real networks [2" (Demaria et al 2014:742) "cycles with length larger than three, such as by using the grid coefficient, c 4,i , which is defined as the fraction of all quadrilaterals (cycles of length four) passing by the vertex i divided by the maximum possible number of quadrilaterals sharing the vertex i." (Demaria et al 2014:742) "statistical distribution of loops of order 3, 4 and 5 remains stable during the network evolution." (Demaria et al 2014:743) "tendency of hubs to be well connected with each other. This phenomenon, known as rich-club, can be measured by the rich-club coefficient" (Demaria et al 2014:743) "degree distribution, P(k), which expresses the fraction of vertices in a network with degree k" (Demaria et al 2014:744) "An important property of many real world networks is their power law degree distribution [" (Demaria et al 2014:744) "An objective quantification of the level to which a log-log distribution of points approach a power law can be provided by the respective Pearson coefficient, which is henceforth called straightness and abbreviated as st." (Demaria et al 2014:744) "vertices of high degree tend to connect with vertices of high degree, and the network is classified as assortative, whereas whenever k nn (k) is a decreasing function of k, vertices of high degree tend to connect with vertices of low degree, and the network is called disassortative" (Demaria et al 2014:745) "While social networks tend to be assortative, biological and technological networks are often disassortative [24]" (Demaria et al 2014:745) "disease propagation, social networks would ideally be vulnerable (i.e. the network is dismantled into connected components, isolating the focus of disease) and technological and biological networks should be resilient against attacks." (Demaria et al 2014:745) "Some networks include vertices of different types" (Demaria et al 2014:746) "network is called bipartite if its vertices can be separated into two sets such that edges exist only" (Demaria et al 2014:746) "between vertices of different sets." (Demaria et al 2014:747) "Entropy and energy are key concepts in thermodynamics, statistical mechanics [130] and information theory [131" (Demaria et al 2014:747) "The entropy of the degree distribution provides an average measurement of the heterogeneity of the network, which can be defined as" (Demaria et al 2014:747) "Network entropy has been related to the robustness of networks, i.e. their resilience to attacks [134], and the contribution of vertices to the network entropy is correlated with lethality in protein interactions networks [135]." (Demaria et al 2014:748) "entropy of the remaining degree" (Demaria et al 2014:748) "The structure of a complex network is related to its reliability and information propagation speed" (Demaria et al 2014:748) "The difficulty while searching information in the network can be quantified through the information entropy of the network" (Demaria et al 2014:748) "In order to measure how difficult it is to locate vertices in the network starting from a given vertex i, the access information is used," (Demaria et al 2014:748) "To quantify how difficult is to find the vertex b starting from the other vertices in the network, the hide information is used," (Demaria et al 2014:749) "a network with a low value of T has a star structure and a low value of R means that the network is composed by hubs connected in a string." (Demaria et al 2014:749) "Rosvall et al. [141], who studied networks with higher order organization like modular or hierarchical structure." (Demaria et al 2014:749) "Bianconi [142] proposed a theoretical approach to describe the emergence of scale-free degree distribution or finite-scale degree distribution in complex networks. In particular, the energy associated to a degree distribution N k is given as EðfN k gÞ¼ logðN G Þ," (Demaria et al 2014:749) "The entropy of each distribution, S({N k }), is defined as e S ðfN k gÞ¼N N k ¼ Q ð2LÞ! k ðkN k Þ! ," (Demaria et al 2014:750) "Bianconi showed that the optimal degree distribution with respect to the free energy minimization is obtained for scale-free degree distribution [142]." (Demaria et al 2014:750) perhaps this is the thing I have been looking for (note on p.750) "In networks, the greater the number of paths in which a vertex or edge participates, the higher the importance of this vertex or edge for the network. Thus, assuming that the interactions follow the shortest paths between two vertices, it is possible to quantify the importance of a vertex or a edge in terms of its betweenness centrality" (Demaria et al 2014:750) "The central point dominance will be 0 for a complete graph and 1 for a star graph in which there is a central vertex included in all paths" (Demaria et al 2014:750) "The spectrum of a network corresponds to the set of eigenvalues i (i ¼ 1, 2, ... , N) of its adjacency matrix A" (Demaria et al 2014:751) "Many real networks present an inhomogeneous connecting structure characterized by the presence of groups whose vertices are more densely interconnected to one another than with the rest of the network." (Demaria et al 2014:751) "This modular structure has been found in many kinds of networks such as social networks [152, 153], metabolic networks [154] and in the worldwide flight transportation network [89]." (Demaria et al 2014:751) "measurement of the quality of a particular division of networks was proposed by Newman and Girvan [159], called modularity and typically represented by Q" (Demaria et al 2014:752) "Ziv et al. [161] proposed the modularity to be defined in terms of information entropy (see section 8). This algorithm, which has been called the Network Information Bottleneck," (Demaria et al 2014:753) "Currently, this method is believed to be the most precise, as it is able to find a division with the highest value of modularity for many networks [169]." (Demaria et al 2014:754) "real networks, the distribution of sizes of communities tends to follow a power law." (Demaria et al 2014:756) "If communities are to be identified with high precision, the spectral method proposed by Newman [169] is a good choice." (Demaria et al 2014:758) "However, if priority is assigned to speed, methods such as those using greedy algorithms (runs in O(N log 2 N)) should be considered [176]." (Demaria et al 2014:758) "An interesting way to describe the topology of real networks in terms of subgraphs is by using the k-core decomposition" (Demaria et al 2014:759) "k-core of the original network" (Demaria et al 2014:759) "An algorithm for this type of visualization is publicly available, namely the Large Network Visualization tool" (Demaria et al 2014:759) "investigations about topology of the Internet using k-core decomposition are presented by Carmi et al." (Demaria et al 2014:759) "protein interaction networks are analyzed in terms of k-cores by Wuchty and Almaas [187]. The k-core approach has also been applied in order to predict the function of proteins" (Demaria et al 2014:760) "Network motifs are subgraphs that appear more frequently in a real network than could be statistically expected" (Demaria et al 2014:760) "In order to quantify the significance of a given motif, its Z-score can be computed." (Demaria et al 2014:761) "for transcription networks of Saccharomyces cerevisiae and Escherichia coli two main motifs are identified: feed-forward loop and bi-fan; for neurons: feed-forward loop, bi-fan and bi-parallel; for food-webs: three chain and bi-parallel; for electronic circuits: feed-forward loop, bi-fan, bi-parallel, four-node feedback and three node feedback loop; and for the WWW: feedback with two mutual dyads, fully connected triad and uplinked mutual dyad." (Demaria et al 2014:761) "Thus, motifs can be considered as building blocks of complex networks and many papers have been published investigating the functions and evolution of motifs in networks [15]." (Demaria et al 2014:761) "Two fundamental operations of mathematical morphology are dilation and erosion" (Demaria et al 2014:763) "The hierarchical degree of a subgraph g at distance d, henceforth represented as k d (g), can be defined as the number of edges connecting rings R d (g) to R dþ1 (g). Note that k 0 (i) is equal to k i ." (Demaria et al 2014:764) "Another measurement which can be hierarchically extended is the clustering coefficient. The rs-clustering coefficient of g, C rs (g)," (Demaria et al 2014:765) "Other possible hierarchical measurements are briefly described in the following. The convergence ratio" (Demaria et al 2014:765) "Song et al. [201] analyzed complex networks by using fractal methodologies and verified that real complex networks may consist of self-repeating patterns on all length scales." (Demaria et al 2014:765) "In order to measure the fractal dimension of complex networks, a box counting method and a cluster growing method has been proposed" (Demaria et al 2014:765) "d f is the fractal cluster dimension." (Demaria et al 2014:766) "It might be of interest to quantify the 'complexity' of a network. Lattices and other regular structures, as well as purely random graphs, should have small values of complexity. Some recent proposals are briefly presented below" (Demaria et al 2014:766) "Meyer-Ortmanns [204] associated the complexity of the network with the number of topologically non-equivalent graphs generated by splitting vertices and partitioning the edges of the original vertex among the new vertices, the transformations being restricted by some constraints to guarantee the generation of valid graphs." (Demaria et al 2014:766) "The off-diagonal complexity, proposed by Claussen [205] is defined as an entropy of a specially defined vertex-vertex edge correlation matrix. An element with indexes (k, l) of this matrix has contributions from all edges that connect a vertex of degree k to a vertex of degree l (only values k > l are used)." (Demaria et al 2014:767) "A matching index can be assigned to each edge in a network in order to quantify the similarity between the connectivity of the two vertices adjacent to that edge [207]. A low value of the matching index identifies an edge that connects two dissimilar regions of the network, thus possibly playing an important role as a shortcut between distant network regions" (Demaria et al 2014:767) "Perturbation analysis Another important property of a given measurement relates to how much it changes when the networks undergo small perturbations (e.g., rewiring, edge or vertex attack, weight changes, etc.)" (Demaria et al 2014:772) "the node degree and the clustering coefficient are uncorrelated for most networks" (Demaria et al 2014:773) "Bayesian decision theory" (Demaria et al 2014:780) "US Airlines Transportation Network (USATN): The USATN is composed by 332 US airports" (Demaria et al 2014:784) "Protein-Protein Interaction Network of Saccharomyces Cerevisiae (PPIN): PPIN is formed by 1922 proteins" (Demaria et al 2014:784) "Autonomous System (AS): In the Internet, an AS is a collection of IP networks and routers under the control of one entity that presents a common routing policy to the Internet. E" (Demaria et al 2014:784) "ach AS is a large domain of IP addresses that usually belongs to one organization such as a university, a business enterprise, or an Internet Service Provider." (Demaria et al 2014:784) "Table 5. Summary of discussed measurements." (Demaria et al 2014:788) "We hope it has been made clear that the several available measurements often provide complementary characterization of distinct connectivity properties of the structures under analysis" (Demaria et al 2014:791) "All in all, this survey provides for the first time, an integrated presentation and discussion of a comprehensive set of measurements previously covered in separate works." (Demaria et al 2014:792) finished (note on p.798) } ----------- See also -[Hide and seek on complex networks] (note that this paper does not reference the "Characterization of. . . " paper, but the "Characterization of. . ." paper does reference the "Hide and seek. . ." paper (I think))