THE CROSS-DISCIPLINARY CHALLENGES OF VISUALIZING DATA
Isabel Meirelles, Associate Professor, graphic design, Northeastern University, U.S.
Rikke Schmidt Kjærgaard, Assistant Professor, Interdisciplinary Nanoscience Center,
Aarhus University, Denmark
Miriah Meyer, Assistant Professor, computer science, University of Utah, U.S.
Bang Wong, Creative Director, Broad Institute of MIT and Harvard, U.S.
Data are growing rapidly in volume and pose a substantial visualization challenge. Every day we generate massive amounts of data in the form of photos we take, electronic messages we send, and queries we make to Internet browsers. The amount of shared data has more than doubled during the past five years (Internet World Stats). The complexity of collecting and analyzing big data in meaningful ways challenges and changes the fundamentals of research, affecting research methodology itself and our approach to tool design. From medicine to sociology, analysis of quantifiable data about life on Earth has allowed researchers to gain new insight, for example into the genetic and molecular underpinnings of disease. Analyzing big data encourages new research questions, triggers new data interactions, and motivates new research technologies and methods. In particular, we are experiencing an enormous increase in the development of visual technologies, tools and methods for exploring and analyzing data. Making sense of data visually is fundamental to most research processes. We depend on visual patterns and guidance in everything we see. When reviewing an article in a journal, we often look at the graphics and visual representations first, before reading the actual paper. Visual representations and analytical tools have the potential to augment our reasoning capacities by facilitating perceptual inference, revealing patterns, and expanding our working memory.
Visualizations have emerged as an important component of understanding and interpreting data. In the emerging field of visual analysis, several key areas of focus exist: 1) analytical reasoning techniques (enabling users to obtain deep insights that directly support assessment, planning and decision making); 2) visual representations and interaction techniques (allowing users to see, explore, and filter large amounts of information into intelligible partitions); 3) data representations and transformations (converting all types of conflicting and dynamic data in ways that support visualization and analysis); and 4) techniques to support production, presentation and dissemination of the results (in the appropriate context to a variety of audiences) (Thomas & Cook 2005: 4). These focus areas are to be pursued through collaboration and interaction between subjects such as scientific analytics, information analytics, knowledge discovery, cognitive and perceptual science, expertise data management, geo-spatial analytics, human-computer interaction, and many more (Keim et al. 2006). A cross-disciplinary framework is fundamental to practicing successful data visualization and is the substance of the challenges presented in this paper.
Access to massive amounts of data, together with an urgent need for tools to help us process information and create reliable representations, has fostered the current trend of exploiting visual methods for discovering new knowledge and supporting decision-making processes. Most research disciplines are using visualization techniques to interpret and gain insight into huge volumes of unstructured data. In our work we have identified two major challenges for data visualization and information design:
- The lack of communication and exchange of visual methods, tools and strategies across different research areas, resulting in unnecessary duplication of efforts.
- The lack of a common set of skills as a basis for more effective collaborations between people in different fields to develop and improve visual tools.
While we see domain-specific efforts advancing visualization techniques, they often remain part of the knowledge of these particular communities, and are rarely shared across domains. For example, there are two main venues for visualization of biological data: Visualizing Biological Data (VIZBI) and BioVis (part of the Institute of Electrical and Electronics Engineers’ VisWeek). Although these meetings intend to integrate disciplines, few biologists, who represent the domain-expert side of these projects, attend. It is necessary to explore integrative ways of sharing efforts in devising visual methods so as to help advance not only the field of data visualization, but also the areas in which visualizations help advance knowledge.
Visualization tools and software solutions are increasingly designed to facilitate particular projects, data or results, and rarely offer a general approach. Furthermore, making such tools is often expensive and time consuming. It requires methodical approaches from practitioners in many disciplines, and can only be done in highly interdisciplinary collaborations between scientists, computer scientists, data designers, scientific illustrators, and many others. This constitutes a significant challenge for data visualization. What is needed is a ‘common language’ and shared skill sets that transcend conventional professional boundaries from computer science to graphic design. A research team needs to be able to interpret the underlying structure of a dataset in a very abstract, algorithmic way, as well as understand the process of mapping data attributes to specific visual encoding channels; these skills are natural extensions of basic computer science principles. Similarly, practitioners need to be able to extract the tasks and define the visual representations that will best capture the essence of the dataset; these skills relate to fundamental concepts found in design. Practitioners of data visualization also need to work in multidisciplinary environments and communicate with field experts in order to extract knowledge about specific application areas. Critical analysis, communication abilities, and social skills are therefore all highly important in a successful collaboration.
In our personal experience, each of us had a subset of these required skills and had to learn the others in order to have meaningful interactions with each other. Countless resources have shortened the path to knowledge. Technologies and databases offer free access to a large number of documents, files, figures, numbers, and facts that tell stories about the world we live in. We now need to ask: How do we gain proper and relevant insight? How do we define the appropriate methods to explore, analyze, and communicate information? How do we go about teaching these skills and methods to the upcoming generation of visualization practitioners and data scientists? We argue for a more effective, structured and scalable way of doing this, rather than the serendipitous trajectories that we ourselves followed. We see several major challenges ahead for data visualization, from educating future generations of data designers to providing support mechanisms for those already working in the field. We ask: can we define a common knowledge base and think differently about teaching computer science and design principles with the goal of visual analysis in mind? How do we bring these common sets of skills to cross-disciplinary teams of current practitioners?
To answer these questions and to meet the cross-disciplinary challenges of data visualization set out by this paper, we present the following recommendations to advance the ongoing effort in the field of data visualization:
- Establish channels for cross-domain communication (e.g., professional meetings, peer-reviewed publications, and community-maintained web-based forums).
- Develop an interdisciplinary common ground.
- Convey to funding bodies a vision of the potential payoffs of cross-domain initiatives.
1. Establishing channels of cross-domain communication
The recommendation and ambition of creating a common platform for knowledge exchange is aimed at a diverse data visualization community, including data producers, data designers, graphic designers, computer scientists, analysts, illustrators, etc.
Most of us use visual methods and tools to synthesize information and data. We do that to analyze and reason about our questions and subjects, to discover patterns, to understand structural features, and to communicate ideas and results effectively. However, current methods for data visualization and information design are dispersed and rarely subject to cross-disciplinary knowledge exchange. Individually, all disciplines involved in data visualization advance the research and practice of visualizing data by devising new visual methods, new algorithms, and new design features. Individual research communities share their best practices in domain-specific conferences, meetings and journals. Researchers join other communities only out of sheer curiosity or by coincidence, and their knowledge rarely overlaps without self-motivated pursuit and communication. For data visualization to advance as a distinct research field we need more immediate interaction and direct knowledge sharing. A common platform for knowledge exchange and sharing of best practices would provide that. Such a platform would not only strengthen research interaction, tool development, and design ideas for data visualization, but also provide valuable knowledge of design initiatives and methods that failed to perform as expected.
To encourage cross-domain and interdisciplinary exchange we recommend creating platforms that include cross-disciplinary meetings, research conferences and workshops, and open online repositories for sharing knowledge of ongoing and concluded research projects, published papers, current tools and method databases, and calls for papers. Such platforms should allow documentation, storage, search, evaluation and retrieval of research and knowledge related to data visualization and information design. It will be advantageous if strategies, methods and tools created in a particular field are accessible to other domains. We are a growing community of practitioners in the field of data visualization. Having a common ground and the means to share experiences can help advance the field, and further encourage interdisciplinary cooperation and collaboration.
2. Developing an interdisciplinary common ground
The recommendation to create an interdisciplinary common ground emphasizes the need for a shared foundation in visualization at university level. This common ground for discussion and collaboration is aimed at members of the diverse data visualization community in academia.
Currently, few strategies defending or describing a common ground in data visualization and information design exist. New developments of tools and methods tend to be subject to casual and individual demands, subjective design ideas, visual consensus in the particular field, and a lack of visual training for the information designer or data analyst. As pointed out in the previous section, the education of young researchers is also constrained to domain-specific techniques, and students are rarely exposed to or encouraged to use visual analysis methods from other fields. The curriculum, and hence the education of students working with any kind of data visualization, tends to be narrow in focus, leaving any use of untried ways of reasoning up to the individual student. There are several initiatives that promote numerical literacy across all ages and genders: from incentives toward a strong mathematical and scientific foundation in K-12 education, to encouraging women to embrace STEM education. But there is hardly any initiative that universally addresses the need for spatial and visual thinking along with analytical and numerical reasoning. The challenges posed by big data and the burgeoning practice of data visualization require us to rethink the education of the next generation of data visualizers at university level.
With the objective of bridging engineering and design aspects of data visualization, and thereby advancing educational settings and curricula, we recommend forming taskforces to trace and outline a common pedagogical approach incorporating visual and analytical, statistical and computational core values and techniques. A proposed common ground and educational basis would include the analytical and data-oriented models and methods from computer science, allowing a common language for the structure and complexity of visualization systems. From the arts and design, we would recommend including the perceptual and human-centered methods and strategies, allowing for a discussion of form, perspective, and usability. We believe that the basics of these two areas of enquiry and two ways of reasoning can be brought together, enriching the way we communicate in collaborative groups as well as adding skills that can benefit the way we work in either one of these groups. The effort will encourage disciplines to adopt curricula that are domain specific while attending to interdisciplinary pedagogical needs.
3. Funding cross-domain initiatives
Recognizing the importance of diverse skills for developing effective visualizations means providing resources for researchers and practitioners to come together to reach a common goal, while also pushing the boundaries of their individual domains. Funding agencies need to financially support a holistic solution to dealing with big data, which includes funding a broad range of research areas that approach the problem from different perspectives.
There are funding opportunities in place for visualization research that we consider fundamental and that should continue as they help advance the visualization field in general. However, there seems to be a lack of funding for tackling visualization research (broadly construed) in the context of driving, real-world problems. For example, in tackling a specific biological question, the need for, and difficulty of, developing appropriate techniques and tools for making sense of the data should be accounted for in funding proposals. Proposals that include (equal) partnerships between application experts and visualization researchers should be encouraged, with appropriate resources for both fields to advance.
Furthermore, we recommend providing additional funding to encourage cross-disciplinary initiatives such as those described in recommendations #1 and #2. We believe that supporting the study of interdisciplinary teams across domains will be needed if we are to define and promote a common visualization platform and educational system. Funding could support, for example, launching selected pilot pedagogical projects to pioneer recommendation #2, as well as archiving and retrieving research from diverse domains as described in recommendation #1.
Data are meaningless unless we can reason about them and ultimately gain insight through analysis. One of the main challenges we face in research environments working with big data is to find appropriate methods and strategies to make sense of the data and advance knowledge in academia, businesses, and government. To meet this challenge, this paper focuses on visualization methods as a means of facilitating analytical reasoning processes in diverse research domains. Visualization is one method to address complexity. We believe it is the most ubiquitous, and one that can be used in meaningful ways together with other techniques, as it effectively engages our sensory and cognitive systems. However, in order to advance the field of data visualization we need to create common grounds for sharing knowledge across domains, while also advancing curricula to prepare a new generation of practitioners.
Thomas J.J. and Cook K.A. (editors) (2005) Illuminating the Path: The Research and Development Agenda for Visual Analytics. IEEE Press.
Keim D.A., Mansmann F., Schneidewind J., and Ziegler H. (2006) Challenges in Visual Data Analysis. Proceedings of Information Visualization (IV 2006). IEEE, pp. 9-16.