Visual Social Network Analytics for the Enterprise
There is a growing need for visual analytics in the workplace as information overload continues to worsen. Countless new sources of knowledge appear publicly every day, ranging from blog posts and wiki pages to papers and patents. Many people aim to keep track of every data source that produces information relevant to their interests, but too much information is being produced and there is too little time to manage it all. Instead, users are forced to rely on sophisticated information discovery tools, such as web search engines, to locate data on demand. Such tools crawl, mine, and rank data sources and organize the data in a manner that allows users to reach large amounts of information. However, most systems that support information discovery are document-centric: their databases are indexed by documents and their user interfaces focus on documents as well.
Nevertheless, not all users performing information discovery tasks are interested in documents. For instance, users trying to locate experts or build teams seek to find people, not documents, relevant to their interests so they can build new relationships. Another example is users who wish to reflect on existing relationships to understand how information is flowing through their company and how much people are collaborating. We refer to these people-centric tasks as relationship discovery tasks because they are tasks in which users examine existing relationships or create new ones.
Relationship discovery is not trivial to support, as the backbone of a people-centric discovery system is a social graph, not a traditional document-indexed database. However, it is possible to extract a social graph from documents with social attributes. In this paper, we focus on extracting a social graph from documents in the enterprise. Such documents include traditional media like papers, patents and organizational charts, as well as online social media, like blogs, bookmarks, and communities.
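As a rough illustration of how a social graph can be derived from documents with social attributes, the sketch below links every pair of people who appear on the same document and accumulates a score per document type. The record layout, the function name build_social_graph, and the TYPE_WEIGHTS values are assumptions made for this example; they are not SaND's actual schema or weighting scheme.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical per-type weights, chosen only for illustration.
TYPE_WEIGHTS = {"paper": 3.0, "patent": 3.0, "blog": 1.0, "bookmark": 0.5}

def build_social_graph(documents):
    """Return {(person_a, person_b): {doc_type: score}} from co-occurrence."""
    graph = defaultdict(lambda: defaultdict(float))
    for doc in documents:
        weight = TYPE_WEIGHTS.get(doc["type"], 1.0)
        # Every unordered pair of distinct people on a document gets an edge.
        for a, b in combinations(sorted(set(doc["people"])), 2):
            graph[(a, b)][doc["type"]] += weight
    return {pair: dict(by_type) for pair, by_type in graph.items()}

documents = [
    {"type": "paper", "people": ["alice", "bob"]},
    {"type": "blog", "people": ["bob", "alice", "carol"]},
    {"type": "bookmark", "people": ["carol"]},  # one person, so no edge
]
graph = build_social_graph(documents)
```

Keeping the scores separated by document type preserves the evidence for each relationship, which a people-centric interface can later show to the user.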
There are several differences between document discovery tasks and relationship discovery tasks. For instance, in order to judge the relevance of a result set of people, users may require access to features rarely present in document discovery interfaces. As an example, access to the social structure of relationships is critical to understanding who the people are, how well they communicate with others, and which ties they may share with the user. Furthermore, a list of people's names may not be enough if they are strangers, so access to evidence of why particular people were chosen is crucial. Also, users should be able to filter by meaningful facets, as not all candidates can fulfill all users' needs equally. For instance, an enterprise team-building task might require finding people with certain types of job roles at particular locations. We incorporate these considerations into the design of SaNDVis, a visual analytics tool that supports relationship discovery better than typical information discovery interfaces.
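The facet filtering described above can be sketched in a few lines. The Person fields, the sample names and locations, and the filter_by_facets helper are hypothetical and serve only to show the idea of narrowing candidates by job role and location; they do not reflect SaNDVis's data model.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Person:
    name: str
    job_role: str
    location: str

def filter_by_facets(people, **facets):
    """Keep people whose attributes match every requested facet value."""
    return [
        p for p in people
        if all(getattr(p, field) == value for field, value in facets.items())
    ]

people = [
    Person("alice", "designer", "Cambridge"),
    Person("bob", "engineer", "Cambridge"),
    Person("carol", "engineer", "Haifa"),
]
matches = filter_by_facets(people, job_role="engineer", location="Cambridge")
```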
The contribution of this work is an end-to-end analytics system that supports relationship discovery in the enterprise. Such a system requires several components: SaND mines and aggregates social media from dozens of data sources, SaNDGraph organizes the data into a people-centric database to support fast social graph queries, and SaNDVis provides a visual analytics UI to help users manage this complex, multi-dimensional information. SaNDVis not only represents the social graph, but also highlights evidence for why the relationships exist as well as linking to related documents.
Visualization Technique
As people are the focus of relationship discovery, the largest component of SaNDVis is the social graph view. The top n people related to the user's topic are displayed (by default, n=25). However, in this view, people are not simply represented as a textual list but instead displayed using a social graph visualization. While such a display is more complex to comprehend than a list, the visualization highlights a pivotal type of information relevant to relationship discovery: social position.
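Selecting the people to display can be pictured as a top-n query over per-person relevance scores. In the sketch below, the relevance dictionary and the top_people helper are assumptions for illustration; only the default of n=25 comes from the description above.

```python
import heapq

def top_people(relevance, n=25):
    """Return the n highest-scoring (person, score) pairs, best first."""
    return heapq.nlargest(n, relevance.items(), key=lambda item: item[1])

relevance = {"alice": 0.9, "bob": 0.4, "carol": 0.7, "dave": 0.1}
shown = top_people(relevance, n=3)
```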
Social position is important because users will typically be unfamiliar with most of the people who match their analytic queries. However, by seeing how those people connect to the user, the user's peers, or other known individuals, users can gauge which people are better suited for their relationship tasks. Social position can also be a barometer for judging whether or not a matched person might be willing to communicate with the user. For instance, prior work shows that 'social software participation' is a significant signal of likelihood of contact. Finding a matched person with few social connections may be adequate, but finding a well-connected individual might better meet the user's needs.
Social position, as shown in the above image, is conveyed via a social graph visualization. Nodes represent each of the top people matching the user's topic, and edges represent the types of relationships that connect various people. Each node features the person's name and image. As there can be multiple categories of relationships connecting two individuals, each edge carries one band per category, producing a "rainbow" when multiple categories are present. The overall thickness of an edge is thus relative to the overall relationship score, determined by SaND's weighting scheme described above.
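One way to picture the banded edges is to scale an edge's total width by its overall score and then divide that width among the categories in proportion to each category's share. This is a minimal sketch under assumed names (edge_bands, max_total, max_width) and an assumed linear scaling; the actual mapping used by SaNDVis is not specified here.

```python
def edge_bands(category_scores, max_total, max_width=12.0):
    """Split an edge's drawn width into one band per relationship category."""
    total = sum(category_scores.values())
    if total <= 0 or max_total <= 0:
        return {}
    # The strongest edge in the graph (score == max_total) gets max_width.
    width = max_width * total / max_total
    return {cat: width * score / total for cat, score in category_scores.items()}

# An edge with an overall score of 4.0 in a graph whose strongest edge is 8.0.
bands = edge_bands({"paper": 3.0, "blog": 1.0}, max_total=8.0)
```

With these inputs the edge is drawn at half the maximum width, and the "paper" band takes three quarters of it.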
While social graph visualizations have a tendency to be complex, sometimes derisively compared to hairballs or spaghetti, SaNDVis's design attempts to maximize visual legibility. Nodes and links are positioned using an advanced force-directed, stress majorization algorithm to minimize node overlaps and edge crossings. The number of nodes is, by default, kept to only 25 so the visualization is optimized to the design guidelines for achieving "NetViz Nirvana", but users can increase this number.
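To give a feel for force-directed positioning, the sketch below implements a basic spring embedder in the style of Fruchterman and Reingold: all node pairs repel and linked pairs attract, with movement capped by a cooling temperature. It is deliberately simpler than the stress majorization approach mentioned above and is not the algorithm SaNDVis uses; the function name and parameter values are assumptions.

```python
import math
import random

def force_layout(nodes, edges, iterations=200, seed=0):
    """Spring embedder: every pair of nodes repels, linked pairs attract."""
    if not nodes:
        return {}
    rng = random.Random(seed)
    pos = {v: [rng.random(), rng.random()] for v in nodes}
    k = 1.0 / math.sqrt(len(nodes))  # target edge length in a unit square
    for step in range(iterations):
        disp = {v: [0.0, 0.0] for v in nodes}
        for i, u in enumerate(nodes):
            for v in nodes[i + 1:]:
                dx, dy = pos[u][0] - pos[v][0], pos[u][1] - pos[v][1]
                dist = math.hypot(dx, dy) or 1e-9
                push = k * k / dist  # repulsion grows as nodes get closer
                disp[u][0] += dx / dist * push
                disp[u][1] += dy / dist * push
                disp[v][0] -= dx / dist * push
                disp[v][1] -= dy / dist * push
        for u, v in edges:
            dx, dy = pos[u][0] - pos[v][0], pos[u][1] - pos[v][1]
            dist = math.hypot(dx, dy) or 1e-9
            pull = dist * dist / k  # attraction grows with edge length
            disp[u][0] -= dx / dist * pull
            disp[u][1] -= dy / dist * pull
            disp[v][0] += dx / dist * pull
            disp[v][1] += dy / dist * pull
        temp = 0.1 * (1.0 - step / iterations)  # movement cap, cooled each step
        for v in nodes:
            length = math.hypot(disp[v][0], disp[v][1]) or 1e-9
            move = min(length, temp)
            pos[v][0] += disp[v][0] / length * move
            pos[v][1] += disp[v][1] / length * move
    return {v: (p[0], p[1]) for v, p in pos.items()}

layout = force_layout(["a", "b", "c", "d"], [("a", "b"), ("b", "c"), ("c", "d")])
```

The all-pairs repulsion loop is quadratic in the number of nodes, which is acceptable for a default of 25 nodes but would need a spatial index for much larger graphs.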
Results Highlights
Results of a 12-month deployment of SaNDVis are in the process of being published at ICWSM and VAST. These papers will be put online shortly. Please contact Adam Perer if you would like to view pre-print versions of these papers.