Updated: Sep 18, 2020
Network analysis is increasingly used to understand social interactions (e.g., McClurg, 2003). More than ever before, we are connected with others in the world, and the structure of such relationships can be understood as a social network in which the links between “nodes” (i.e., people, companies, webpages…) determine the network topology. Social networks have a dramatic impact on our health, behavior, values, outcomes, and happiness, among other things (e.g., Heaney and Israel, 2008; Christakis and Fowler, 2007). They determine who leads, who follows, who influences, who informs, etc., and this has important implications for individuals’ social capital (Coleman, 1988). However, individuals are often blind to this impact. That is, the effects that social networks and their structure exert on us frequently operate through channels we are not even aware of (see Nolan et al., 2008).
People frequently benefit, or suffer, from their position in the network without knowing how or why. For example, some individuals have many contacts, others are key players in the network's information flow, others have few but very good contacts, etc., and these factors determine to a large extent why some are more popular, influential or powerful than others (e.g., Knoke & Yang, 2019; Jackson & Rogers, 2007). Fortunately, network analysis can accurately identify not only the network structure (e.g., how centralized, dense or cohesive it is) but also individual positioning measures such as these. As we will see below, these three individual measures correspond, respectively, to the key network analysis concepts of degree, betweenness, and eigenvector centrality.
There are in fact many applications of network-based measures. For example, mapping the network of employee relationships within a company is a powerful way to understand individual- and group-level differences in key indicators such as performance, job satisfaction, conflict, attrition, or disengagement (see, e.g., Espín et al., 2017). At Behave4, we know the importance of network analysis for organizations and have developed tools to obtain all the crucial information that can be used to improve organizational output. Even if this may look like a very complex endeavor, it is actually as easy as asking people who their friends are, or with whom they prefer to, and/or prefer not to, accomplish a specific job task or share a leisure activity (Brañas-Garza et al., 2017; Espín et al., 2017).
But, more specifically, where can network analysis be applied in organizations? Let us see some areas or processes we can improve using network analysis:
Employee onboarding: if we know how our people are related to each other (communication flow, influence, etc.), we will be able to design an optimized onboarding process to ensure the best possible experience for the new employee.
Management improvement: knowing how our team is configured and interconnected allows us to foster integration and improve the team’s cohesiveness.
Communication effectiveness: if we know who has more influence in the network, we can design more effective communication strategies. Moreover, we can identify what we call “cut-points” (weak spots in the network that are vulnerable to disruptions in the flow of information, resources, and influence), which are critical for organizational communication.
Employee integration: identifying who is more vulnerable in our team/network can help us improve their integration and employee experience. At the very least, we will have key information available to anticipate possible negative consequences (disengagement, isolation, decreased performance or motivation, job dissatisfaction, etc.) and make data-driven decisions.
Based on our expertise, we wanted to offer our friends in the HR world an analysis of their relationships on Twitter, as a meaningful example of the power of network-based measures. To do so, we chose one hashtag which is well known in the HR world: #HRCommunity. HR professionals use this hashtag to share ideas, views, activities, or jokes (typically) related to people management. We gathered data from all the Twitter accounts which used the hashtag during a single weekend. In this case, rather than asking people about their friends or contacts, we built the network of relationships within this (reduced) set of #HRCommunity users from public Twitter information about the followers of the accounts that used the hashtag. In other words, we analyzed ‘who follows who’ (at that time; this might have changed) among the 62 connected Twitter accounts in the sample. This means that we did not examine who interacts more with others or who gives them more likes or retweets. We just focused on the network of relationships at that time. Of course, our goal was not to be totally systematic or to get representative data (for that, we would need to analyze longer time horizons). Instead, we wanted to show that even for such a short time window and relatively small dataset, network analysis has many interesting things to say!
To start our study, we first needed to collect all the tweets containing the hashtag #HRCommunity that were posted during the aforementioned time interval (4th and 5th of July, 2020). Fortunately, Twitter allows developers to retrieve these data through its API. There are also some helpful Python and R packages that simplify data retrieval. You only need a Twitter API key and some programming knowledge. Let’s get to work.
Since we had already identified the community we wanted to focus on, we proceeded with the data collection. We retrieved all the data that the Twitter API returned for a search on the hashtag “#HRCommunity”. The free Twitter API returns the tweets published in the last 7 days. We ran our search on the 6th of July and collected a total of 1,122 tweets dated between “2020-06-27” and “2020-07-06”. We later cleaned and filtered the data to work only with the tweets we needed, that is, those posted during the last weekend. It is worth mentioning that the Twitter API works in UTC, so the time parameters passed to the API and the data it returns are expressed in that time zone.
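The filtering step can be sketched roughly as follows. This is only an illustration, not the code used for this post: the tweet records and the field names "user" and "created_at" are made up, and the only point is that the weekend boundaries are compared in UTC.

```python
from datetime import datetime, timezone

# Hypothetical tweet records; the field names are assumptions for
# illustration, not the exact Twitter API schema.
tweets = [
    {"user": "alice", "created_at": "2020-07-03T23:59:59+00:00"},
    {"user": "bob", "created_at": "2020-07-04T00:00:00+00:00"},
    {"user": "carol", "created_at": "2020-07-05T18:30:00+02:00"},
    {"user": "dave", "created_at": "2020-07-06T00:00:00+00:00"},
]

# Weekend window in UTC: start inclusive, end exclusive.
START = datetime(2020, 7, 4, tzinfo=timezone.utc)
END = datetime(2020, 7, 6, tzinfo=timezone.utc)


def in_window(tweet):
    """Return True if the tweet was posted inside the UTC weekend window."""
    posted = datetime.fromisoformat(tweet["created_at"]).astimezone(timezone.utc)
    return START <= posted < END


weekend_tweets = [t for t in tweets if in_window(t)]
unique_users = sorted({t["user"] for t in weekend_tweets})
print(unique_users)
```

Converting every timestamp to UTC before comparing avoids silently dropping or adding tweets posted near midnight in other time zones.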
Once we had all the tweets published with the hashtag #HRCommunity, we extracted the unique users who had posted them. In total, there were 69 unique users. Next, we needed to retrieve all the followers of each user. We let the computer work for a few hours and gathered the ‘tiny’ amount of 487,955 followers (these 69 people are important!). We then built an edge list for our 69 users. The edge list records ‘who follows who’ within the sample and, therefore, allows us to draw and analyze the resulting network. Seven of the users neither followed nor were followed by anyone else among the 69, so they were excluded from the sample. Therefore, we analyzed a network of 62 users. Note that all this information was public, and we did not collect any private information.
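As a minimal sketch of this step (not the original code), the edge list can be derived from a mapping of each sampled account to the set of its followers. The account names and the `followers` structure below are hypothetical.

```python
# Hypothetical input: for each account in the sample, the set of accounts
# that follow it. All names are made up for illustration.
followers = {
    "ana": {"ben", "cai", "outsider1"},
    "ben": {"ana"},
    "cai": {"ana", "outsider2"},
    "dee": {"outsider3"},  # no links to or from anyone else in the sample
}


def build_edge_list(followers):
    """Return directed (follower, followed) pairs restricted to the sample."""
    sample = set(followers)
    return sorted(
        (follower, followed)
        for followed, fans in followers.items()
        for follower in fans
        if follower in sample and follower != followed
    )


edges = build_edge_list(followers)

# Accounts that appear in no edge are isolates and can be dropped.
connected = {user for edge in edges for user in edge}
isolates = sorted(set(followers) - connected)

print(edges)
print(isolates)
```

Followers outside the sample are discarded, which is why hundreds of thousands of follower records can collapse into a small network.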
Figure 1 displays the #HRCommunity network for the weekend under scrutiny. Nodes represent Twitter users, and their pairwise relationships are denoted by straight lines (i.e., links). Users with no line between them neither follow nor are followed by each other. Larger nodes denote higher betweenness centrality of the user (see below). Our analysis yielded four clusters within the network, which are depicted using different colors. In the following, we briefly define each of the individual measures obtained and report their values for each of the 62 users. Tables 1-4 display the data arranged by cluster, with users sorted within their cluster according to betweenness centrality to facilitate interpretation. Note, however, that all the individual measures can be compared between individuals even if they belong to different clusters.
Figure 1. The #HRCommunity network (4th-5th July 2020; 62 users). The size of the nodes (i.e., users) reflects betweenness centrality. The links between the nodes reflect ‘who follows who’. The different node colors refer to different clusters within the whole network.
Individual network measures
Cluster refers to the group assigned to a member of the network by a clustering algorithm. In this case, the clustering algorithm is the Louvain method, which identifies groups or clusters within the network by evaluating which members are more closely interrelated with each other. Please see Tables 1-4.
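The Louvain method searches for the partition that maximizes a quality score called modularity. The sketch below does not implement Louvain itself; it only shows how modularity is computed for a given partition, under the simplifying assumption that links are treated as undirected and unweighted. The toy graph and the function name are ours.

```python
def modularity(edges, community):
    """Modularity of a partition of an undirected, unweighted graph.

    edges: list of (u, v) pairs, each undirected edge listed once.
    community: dict mapping every node to a community label.
    """
    m = len(edges)
    internal = {}    # number of edges with both endpoints in the community
    degree_sum = {}  # sum of node degrees per community
    for u, v in edges:
        cu, cv = community[u], community[v]
        degree_sum[cu] = degree_sum.get(cu, 0) + 1
        degree_sum[cv] = degree_sum.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(
        internal.get(c, 0) / m - (d / (2 * m)) ** 2
        for c, d in degree_sum.items()
    )


# Toy graph: two triangles joined by a single bridge (3, 4).
toy_edges = [(1, 2), (2, 3), (1, 3), (4, 5), (5, 6), (4, 6), (3, 4)]
two_groups = {1: "A", 2: "A", 3: "A", 4: "B", 5: "B", 6: "B"}
one_group = {node: "A" for node in range(1, 7)}

print(modularity(toy_edges, two_groups))
print(modularity(toy_edges, one_group))
```

Splitting the toy graph into its two triangles scores higher than lumping everyone together, which is the kind of comparison a modularity-based method relies on when it assigns clusters.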
Betweenness Centrality measures the number of times a network member lies on the shortest path between two other members. Individuals with a high betweenness centrality in a social network serve as a bridge through which information between other individuals passes. For this reason, they are considered to have considerable influence within the network: they can control information flows and use them to their own benefit (or toward any other goal). In our #HRCommunity network, @sbrownehr (with a huge betweenness centrality of 747.16), @wattsnextBen (219.87), and @Jon_Thurmond (193.59) are the profiles with the highest betweenness centrality. Please see Tables 1-4.
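For readers who want to see the mechanics, here is a compact sketch of Brandes' algorithm for unnormalized betweenness on a directed, unweighted graph. It is an illustration on a made-up four-node graph, not necessarily the routine or the normalization used to produce the values in the tables.

```python
from collections import deque


def betweenness(nodes, edges):
    """Unnormalized betweenness for a directed, unweighted graph (Brandes)."""
    succ = {n: [] for n in nodes}
    for u, v in edges:
        succ[u].append(v)
    score = {n: 0.0 for n in nodes}
    for s in nodes:
        # BFS from s: count shortest paths (sigma) and record predecessors.
        dist = {s: 0}
        sigma = {n: 0 for n in nodes}
        sigma[s] = 1
        preds = {n: [] for n in nodes}
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in succ[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate dependencies, starting from the farthest nodes.
        delta = {n: 0.0 for n in nodes}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    return score


# Toy graph: two equally short routes from "a" to "d", via "b" or via "c".
toy_nodes = ["a", "b", "c", "d"]
toy_edges = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")]
scores = betweenness(toy_nodes, toy_edges)
print(scores)
```

When several shortest paths connect the same pair, each intermediary receives a fraction of the credit, which is why betweenness values are often not whole numbers.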
Degree Centrality measures the total number of links of one individual within a social network. The larger the number of links, the higher the degree centrality. However, this measure does not take into account the quality or importance of each link. In directed networks (those in which the direction of the link matters, as in our case) we can differentiate between In-degree and Out-degree centrality. In-degree is the number of links pointing to an individual from others, while Out-degree is the number of links pointing from that individual to other people. Twitter itself is an easy example of why this matters: being followed by others (In-degree) has dramatically different implications than following others (Out-degree). Within our reduced sample, @sbrownehr (In-degree = 52; i.e., 52 out of the 62 follow him), @Jon_Thurmond (41), and @KyraMatkovichHR (40) have the highest In-degree and are therefore the most popular users. Regarding Out-degree, the highest values are associated with @sbrownehr (51), @KyraMatkovichHR (45), and @Jon_Thurmond (44), who are thus the users following the largest number of users in the sample. Please see Tables 1-4.
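Given an edge list of (follower, followed) pairs, both degree measures reduce to simple counts. The sketch below uses made-up account names and assumes each pair appears only once.

```python
from collections import Counter

# Hypothetical (follower, followed) pairs.
edges = [("ana", "ben"), ("cai", "ben"), ("ben", "ana"), ("dee", "ben")]

# Out-degree: how many accounts a user follows within the sample.
out_degree = Counter(follower for follower, _ in edges)
# In-degree: how many accounts in the sample follow the user.
in_degree = Counter(followed for _, followed in edges)

for user in sorted({u for edge in edges for u in edge}):
    print(user, "in:", in_degree[user], "out:", out_degree[user])
```

A `Counter` returns 0 for users that never appear on one side of an edge, so accounts that follow others but have no followers in the sample need no special handling.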
K-Reach Centrality measures the number of other individuals that one member of the network can reach in k steps. For instance, 2-reach identifies how many individuals can be reached in two steps or fewer. 1-reach, therefore, is equivalent to degree centrality. In the case of a social network, 2-reach would represent how many friends and friends of friends an individual has in a certain group. Here, we focus on 2-reach to get meaningful information without overloading the analysis. The majority of users in our network can reach 90% or more of the other users in two steps or fewer, which indicates that this network has a high density. However, only a few users are able to reach the whole network in a couple of steps: @sbrownehr, @wattsnextBen, and @TheBuzzOnHR. Please see Tables 1-4.
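A breadth-first search that stops after k steps is enough to compute k-reach. The sketch below assumes that steps follow the direction of the links (from follower to followed); the opposite convention is equally possible and is not specified here. The toy edges are made up.

```python
from collections import deque


def k_reach(edges, source, k):
    """Number of other nodes reachable from source in at most k directed steps."""
    succ = {}
    for u, v in edges:
        succ.setdefault(u, set()).add(v)
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        if dist[v] == k:
            continue  # do not expand beyond k steps
        for w in succ.get(v, ()):
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return len(dist) - 1  # exclude the source itself


# Toy chain a -> b -> c -> d, plus a -> e.
toy_edges = [("a", "b"), ("b", "c"), ("c", "d"), ("a", "e")]
for steps in (1, 2, 3):
    print(steps, k_reach(toy_edges, "a", steps))
```

With k = 1 the function simply counts direct links, which matches the remark that 1-reach is equivalent to degree centrality.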
Closeness Centrality assigns each individual a score based on how close the individual is to the rest of the network. It is calculated from the shortest-path distances to the other members, so that a high value indicates that the individual can easily reach other members of the network, while a low value entails being in a peripheral position. Again, considering the direction of the link, we can differentiate between In-closeness and Out-closeness. According to the former, the most central users are @sbrownehr (0.0076), @Jon_Thurmond (0.007), @KyraMatkovichHR (0.0069), @doublempeacock (0.0068), and @socialmicole (0.0068). Regarding Out-closeness, the highest scores are observed for @sbrownehr (0.0076), @KyraMatkovichHR (0.0072), @Jon_Thurmond (0.0071), and @TheBuzzOnHR (0.0071). Please see Tables 1-4.
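One common way to turn distances into a closeness score is to take the reciprocal of the summed shortest-path distances. The sketch below uses that definition as an assumption; software packages differ in how they normalize the score and how they treat unreachable members, so it will not necessarily reproduce the values reported above.

```python
from collections import deque


def closeness(edges, node, mode="out"):
    """Reciprocal of the summed shortest-path distances from the node
    (mode="out") or to the node (mode="in"), over the members it can reach."""
    adj = {}
    for u, v in edges:
        if mode == "in":
            u, v = v, u  # walk the links backwards for in-closeness
        adj.setdefault(u, set()).add(v)
    dist = {node: 0}
    queue = deque([node])
    while queue:
        v = queue.popleft()
        for w in adj.get(v, ()):
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    total = sum(dist.values())
    return 1 / total if total else 0.0


# Toy chain a -> b -> c.
toy_edges = [("a", "b"), ("b", "c")]
print(closeness(toy_edges, "a", "out"))
print(closeness(toy_edges, "c", "in"))
print(closeness(toy_edges, "c", "out"))
```

In the toy chain, "a" scores well on Out-closeness but "c" scores zero, which illustrates why the two directions are reported separately in a follower network.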
Eigenvector Centrality is a more sophisticated measure of centrality. In social networks, it measures the centrality of an individual given by the relative importance of her connections. The more popular or well-connected her closest neighbors, the higher her eigenvector centrality will be. Eigenvector centrality is often interpreted as the level of influence or power of an individual within the social network. According to this measure, @sbrownehr (1), @KyraMatkovichHR (0.93), and @Jon_Thurmond (0.93) could be considered the most influential users in the #HRCommunity network. Please see Tables 1-4.
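The idea that "importance comes from important connections" can be sketched with a simple power iteration. The version below makes several assumptions of ours: a user receives importance from the accounts that follow her, each user keeps her own score at every step (a standard shift that helps the iteration settle), and scores are rescaled so that the top user gets 1. It is a toy illustration, not the routine used for the tables.

```python
def eigenvector_centrality(nodes, edges, iterations=200):
    """Power iteration with a self-term, rescaled so the top score is 1."""
    preds = {n: [] for n in nodes}
    for u, v in edges:
        preds[v].append(u)  # v receives importance from its follower u
    score = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        new = {n: score[n] + sum(score[u] for u in preds[n]) for n in nodes}
        top = max(new.values())
        score = {n: value / top for n, value in new.items()}
    return score


# Toy star: "hub" and each of "a" and "b" follow one another.
toy_nodes = ["hub", "a", "b"]
toy_edges = [("hub", "a"), ("a", "hub"), ("hub", "b"), ("b", "hub")]
scores = eigenvector_centrality(toy_nodes, toy_edges)
print(scores)
```

In the toy star, the two peripheral users end up with a fairly high score despite having a single connection each, because that connection is to the most central user.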
The Clustering Coefficient measures how interconnected the neighbors of a user are. In other words, the maximum score (1) is given to a user all of whose connections are also connected to each other, and the minimum (0) to a user none of whose connections are connected. People with high levels of clustering tend to interact only with a reduced set of people and, therefore, cannot fully exploit the information in the network and its associated benefits. The highest clustering coefficients in our network are found for @HeartshipHour (0.61), @BrysonTenma (0.5), and @amitsethi85 (0.5).
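A minimal sketch of the local clustering coefficient follows. It assumes link direction is ignored (two users count as connected if either follows the other), which is one common convention but not necessarily the one behind the values above. The toy graph is made up.

```python
def clustering_coefficient(edges, node):
    """Share of pairs of the node's neighbors that are themselves connected,
    ignoring link direction."""
    nbrs = {}
    for u, v in edges:
        if u != v:
            nbrs.setdefault(u, set()).add(v)
            nbrs.setdefault(v, set()).add(u)
    friends = sorted(nbrs.get(node, set()))
    k = len(friends)
    if k < 2:
        return 0.0  # fewer than two neighbors: no pairs to compare
    linked = sum(
        1
        for i, a in enumerate(friends)
        for b in friends[i + 1:]
        if b in nbrs[a]
    )
    return linked / (k * (k - 1) / 2)


# Toy graph: "x" is linked to "a", "b" and "c"; only "a" and "b" are linked.
toy_edges = [("x", "a"), ("x", "b"), ("x", "c"), ("a", "b")]
print(clustering_coefficient(toy_edges, "x"))
print(clustering_coefficient(toy_edges, "a"))
```

User "a" gets the maximum score because her only two contacts know each other, while "x" has a wider but looser neighborhood and scores lower.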
Table 1. Individual data for cluster 1 - The #HRCommunity network (4th-5th July 2020)
Table 2. Individual data for cluster 2 - The #HRCommunity network (4th-5th July 2020)
Table 3. Individual data for cluster 3 - The #HRCommunity network (4th-5th July 2020)
Table 4. Individual data for cluster 4 - The #HRCommunity network (4th-5th July 2020)
It might be argued that people have not changed much, but our technology definitely has, and nowadays we are able to illustrate the structure of our social world in terms of connections and influence. Social network analysis provides us with a huge amount of crucial information which we could easily exploit to understand, for example, workflows and differences in job performance, as well as to improve wellbeing and other people-centric variables. With this knowledge at hand, in sum, we can effectively promote behavioral change.
We hope that our exploratory (and not necessarily representative) analysis inside the #HRCommunity has been able to show the potential of network-related measures to make data-driven decisions in organizations.
At first sight, we can observe the great interconnectedness between all the users of the network, which means that users are in general very close to each other and there is great cohesion. It will be hard to miss any piece of relevant information within this community!
Some users are certainly more influential than others. In this case, @sbrownehr has proven to be a ‘super connector’, a bridge to the rest of the community. Other users with high potential to manage the information flow in the network are @wattsnextBen, @Jon_Thurmond and @JoannaSuvarna. We could say that if you want to spread a message or motivate any behavior inside the #HRCommunity, you should do it through these users. Yet, as our results show, there are differences across centrality measures, meaning that some users display higher potential for some tasks than for others. And, again, this is only a non-representative example from one weekend. The results should therefore be taken with caution.
Another interesting conclusion is the presence of secondary clusters in the network. Even if the network as a whole is well connected, there are four main subgroups whose members are especially closely bonded to each other.
We hope this post encourages organizations to perform more network analysis!
Brañas-Garza, P., Jimenez, N., & Ponti, G. (2017). Eliciting real-life social networks: A guided tour. Journal of Behavioral Economics for Policy, 1, 33-39.
Christakis, N. A., & Fowler, J. H. (2007). The spread of obesity in a large social network over 32 years. New England journal of medicine, 357(4), 370-379.
Coleman, J. S. (1988). Social capital in the creation of human capital. American journal of sociology, 94, S95-S120.
Espín, A. M., Reyes-Pereira, F., & Ciria, L. F. (2017). Organizations should know their people: A behavioral economics approach. Journal of Behavioral Economics for Policy, 1(S), 41-48.
Heaney, C. A., & Israel, B. A. (2008). Social networks and social support. Health behavior and health education: Theory, research, and practice, 4, 189-210.
Jackson, M. O., & Rogers, B. W. (2007). Relating network structure to diffusion properties through stochastic dominance. The BE Journal of Theoretical Economics, 7(1).
Knoke, D., & Yang, S. (2019). Social network analysis (Vol. 154). Sage Publications.
McClurg, S. D. (2003). Social networks and political participation: The role of social interaction in explaining political participation. Political research quarterly, 56(4), 449-464.
Nolan, J. M., Schultz, P. W., Cialdini, R. B., Goldstein, N. J., & Griskevicius, V. (2008). Normative social influence is underdetected. Personality and social psychology bulletin, 34(7), 913-923.