Big data, bigger view: Advanced visualization for the humanities

Most Pixels Ever: Cluster Edition (MostPixelsEverCE), recently released by the Texas Advanced Computing Center (TACC) at the University of Texas at Austin, US, is an open source software tool that lets researchers – especially those in the humanities – create interactive, multimedia visualizations on high-resolution tiled displays.

Tiled display. Image courtesy Samuel Mann

Supported by a National Endowment for the Humanities startup grant called "A Thousand Words: Advanced Visualization for the Humanities," MostPixelsEverCE is based on an open source programming language called Processing. As a programming toolkit for creating images, animations, and interactions, Processing initially served as a software sketchbook for teaching the fundamentals of computer programming within a visual context. It has evolved into a tool for students, artists, designers, and researchers creating finished professional work.

MostPixelsEverCE is a library for extending Processing sketches to multi-node tiled displays. This makes it possible to render interactive Processing sketches across distributed computing systems on many displays. The library is intended for tiled display systems, but also works in other types of environments. With simple modifications, a researcher can render a sketch across a cluster at the native resolution of the displays, significantly increasing the amount of data visualized at one time. The library is designed to run on Linux and OS X-based clusters.
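To make the idea of rendering at native resolution concrete, here is a minimal sketch of the arithmetic behind a tiled display. It is not MostPixelsEverCE code and does not use the library's API, which this article does not describe. The names `Viewport`, `tile_viewports`, and `to_local`, and the 3-by-2 wall of 1920x1080 panels, are assumptions chosen for illustration, and the sketch ignores the bezels between panels.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Viewport:
    """The slice of the shared canvas that one display is responsible for."""
    x: int       # left edge, in global canvas pixels
    y: int       # top edge, in global canvas pixels
    width: int
    height: int


def tile_viewports(cols, rows, tile_width, tile_height):
    """Return the full canvas size and one viewport per tile, row by row."""
    canvas_size = (cols * tile_width, rows * tile_height)
    viewports = [
        Viewport(col * tile_width, row * tile_height, tile_width, tile_height)
        for row in range(rows)
        for col in range(cols)
    ]
    return canvas_size, viewports


def to_local(viewport, gx, gy):
    """Translate a global canvas point into a tile's local coordinates.

    Returns None when the point lies outside that tile.
    """
    lx, ly = gx - viewport.x, gy - viewport.y
    if 0 <= lx < viewport.width and 0 <= ly < viewport.height:
        return lx, ly
    return None


# Hypothetical wall: 3 columns by 2 rows of 1920x1080 panels.
canvas_size, viewports = tile_viewports(3, 2, 1920, 1080)
```

In a scheme like this, each node would draw the same scene but only the part that falls inside its own viewport, so the total number of pixels on view grows with the number of displays rather than being capped by a single screen.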

Visualization clusters and tiled displays help small groups of people collaboratively explore large amounts of data and a range of visualizations, including high-resolution imagery (satellite, aerial photography, and scientific instruments); high-resolution movies (hi-res animations and time-series simulation results); 2D information displays (maps, charts, graphs, data, and text); and 3D visualizations (complex geometries and interactive exploration of 3D datasets).

Rob Turknett, digital media, arts, and humanities coordinator at TACC, notes, "The goal is to make visualization tools easier for humanities researchers to use. The proliferation of digitized textual, visual, and aural resources is a great boon for the humanities, offering opportunities for new kinds of scholarship, but it also brings a new complexity. As the amount of cultural data that scholars work with increases, it becomes crucial to visualize that data on a sufficiently high-resolution display. Conventional display resolutions simply aren't keeping pace with this explosion of online cultural data to be explored."

The work borrows ideas from a library called Most Pixels Ever, created by Daniel Shiffman at the Interactive Telecommunications Program at NYU's Tisch School of the Arts. However, Shiffman's version requires considerable configuration by users, according to Brandt Westing, technical lead on MostPixelsEverCE and manager of the TACC/ACES Visualization Lab (Vislab). "Using Shiffman's work as an inspiration, we re-wrote the software from scratch to work on any type of composite display from laptops to the highest-end visualization clusters and tiled displays."

"Most of the tools that exist for these displays are developed by and for scientists, yet there are many researchers from the humanities and arts who want to do visualization," Turknett says. "The software that we've developed is part of an effort to make advanced visualization systems more accessible to people who may not have a deep technical background."

Software tools will enable a new class of scholars from the humanities to use high-resolution displays and advanced computing to create visualizations, interactive maps, and multimedia. Video courtesy Texas Advanced Computing Center (TACC).

Jason Baldridge, an associate professor in the Linguistics Department at the University of Texas at Austin, researches a range of issues involving the connections among language, computation, geography, and time. His research has the potential to improve a variety of applications based on natural language processing and text analytics widely used to analyze unstructured data.

"We're awash in very large collections of text and we simply cannot read through all of them," observes Baldridge. "We need improved tools for exploring text collections so people can find interesting patterns, and this new software developed by TACC can help us accomplish this goal."

Baldridge's current project involves analyzing a collection of several hundred texts from the Civil War. "Using the new software on TACC's Stallion display cluster, we're parallelizing the computations to do visualizations and view an enormous amount of data at once, both of which are incredibly useful in exploring the output from our models and applications." Baldridge uses the software to identify text passages from memoirs that are connected to a particular city and time. "Because they connect language to the real world, they lend themselves to novel visualizations that illustrate the geographical and historical context of text collections and language use," Baldridge clarifies.

Tanya Clement, an assistant professor in the School of Information at the University of Texas at Austin, builds tools for scholars who analyze literary texts. "Humanities researchers have not had access to large data sets until recent decades. It's essential for humanities scholars to be involved in the creation of new software and tools so the concerns of the community are reflected," Clement notes. Both Baldridge and Clement collaborated with TACC on the project.

MostPixelsEverCE is also in use at two other institutions: the University of Texas at El Paso, US, and the Center for Simulation Visualization and Real-time Prediction at the University of Texas at San Antonio, US. The tool is open source and available for download. For more information, visit Most Pixels Ever: Cluster Edition and A Thousand Words: Advanced Visualization for the Humanities.

A version of this story first appeared on the Texas Advanced Computing Center (TACC) website.
