
Extreme scale video image retrieval and research

These days, everywhere we go, screens bombard us with images and information. We can't escape the barrage, but what is the impact? What is it doing to us as human beings? With novel visualization techniques, Virginia Kuhn and her team are beginning to uncover some intriguing clues.

Virginia Kuhn. Image courtesy GameChangers.

Kuhn is an associate professor in the School of Cinematic Arts at the University of Southern California (USC) in Los Angeles, US. She joined USC in 2005 after successfully defending one of the first born-digital dissertations in the country. Kuhn challenges academic conventions born of print literacy, integrating multimedia literacy into courses across the academic spectrum and inspiring new research and teaching practices. I caught up with her and Dave Bock (DB) to learn more about the Large Scale Video Analytics project and Movie-cube.

Can you share with our readers where you’re at and where you’re going?

VK: Since 2007 I have served as cinema faculty and also as the associate director of the Institute for Multimedia Literacy (IML), one of the longest running new media programs in the US (founded in 1998). Just this year we became the newest division of the School of Cinematic Arts at the University of Southern California. The IML is now just one research unit in the Media Arts + Practice (MAP) division. This evolution, I think, speaks to the growing importance of research and teaching around cinematic language.


When did work on the Large Scale Video Analytics project begin and who is involved?

VK: With support from the XSEDE project and their Extended Collaborative Support Services – together with allocations on Gordon at the San Diego Supercomputer Center (SDSC) – the project took hold in 2012. The NCSA's Institute for Computing in Humanities, Arts, and Social Science (ICHASS) reached out to initiate and collaborate on the project.


What sorts of needs are driving the project?

VK: The problem with studying massive video archives is that they would take longer to watch than any one person could manage in a lifetime. Today nearly every person carries at least one (if not two) devices that can capture images, producing an explosion of moving images and video, and we have had no way of studying this time-based medium with any precision. We desperately needed to automate the process.


Spatial representation of time-based media. Image courtesy Dave Bock.

Are there other drivers?

VK: Image tagging is a huge issue. When people tag images, the words they use for both objects and concepts can be wildly divergent. Contradictory and incomplete tagging really defies our ability to do any type of satisfactory search of video archives.


What results have you had so far?

VK: With Gordon at SDSC, we’ve been able to crunch about 300 films; we can scan them quickly and see different things in the footage. When we start to expand to thousands of films, it will shape the sorts of questions we can ask. A really interesting part of this project is Movie-cube. It’s just one of the novel visualization tools that Dave Bock at NCSA has developed.


What is Movie-cube?

DB: Movie-cube is a technique that attempts to represent and display aspects of a movie sequence within a single image. Specifically, this technique shows how space and time in a movie can be represented simultaneously. The custom software converts a digital video sequence into a three-dimensional dataset – termed a movie cube – by extracting and ordering each frame of the sequence along the Z axis.
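The stacking step DB describes can be sketched in a few lines. This is an illustrative reconstruction, not the project's actual software: the frames below are synthetic stand-ins, since in practice each 2D array would come from a video decoder.

```python
import numpy as np

# Synthetic stand-in for decoded grayscale frames: 300 frames,
# each 120 pixels tall by 160 pixels wide.
frames = [np.random.rand(120, 160) for _ in range(300)]

# Stack the frames along a new third axis to form the "movie cube":
# axis 0 = frame height (Y), axis 1 = frame width (X), axis 2 = time (Z).
cube = np.stack(frames, axis=2)

print(cube.shape)  # (120, 160, 300): one 2D image per point in time
```

Once the sequence is in this form, space and time become interchangeable axes of a single dataset, which is what lets standard volume-visualization tools operate on it.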


What happens after the video sequence is converted?

DB: A variety of common scientific visualization techniques can be used to show multiple perspectives, something unattainable in traditional methodologies. When orthogonal planes are sliced through the dataset, for example, unique and interesting patterns emerge showing how elements within the movie change over time.
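Slicing such a cube orthogonally is a simple indexing operation. The sketch below (synthetic data; the variable names are illustrative) fixes one spatial coordinate so each resulting 2D image shows how a single column or row of pixels evolves across every frame: shot changes appear as abrupt breaks, camera moves as sloped streaks.

```python
import numpy as np

# A synthetic movie cube: height x width x time.
cube = np.random.rand(120, 160, 300)

# Fix one X column: rows of the result trace that column of pixels
# through all 300 frames.
vertical_slice = cube[:, 80, :]    # shape (120, 300): Y versus time

# Fix one Y row: rows of the result trace that row of pixels over time.
horizontal_slice = cube[60, :, :]  # shape (160, 300): X versus time

print(vertical_slice.shape, horizontal_slice.shape)
```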

How does working with cinematic data differ from scientific data?

DB: There really are no standards for graphical representations of cinematic data, as there are with scientific data. This challenge makes this project very appealing. Each new form of representation I develop brings about a brand new set of interesting ideas and insights. With the Movie-cube slices, we saw shot changes, camera and scene movement, and interesting aspects of lighting.


Movie-cube demo; rendering slice planes across time, first vertically and then horizontally. Video courtesy Dave Bock.

What types of things can be deduced from shot change, camera angle, lighting?

VK: For example, in the old 1950s safety videos you can see the appeal to authoritarianism because the cop is in the frame the entire time – he never moves. The boy is in the frame most of the time, and the girl is there for only a few minutes. Even the direction of the camera angle will uncover useful information. For instance, people shot from a low angle appear to have more power relative to the viewer, whereas the reverse is true for people shot from a high angle. Beyond finding trends and changes in the way things are filmed over time, we'll really be able to identify larger shifts and their implications. We don't receive these cues on a conscious level, but they still impact us.

What does lighting reveal?

VK: The color timing on footage, even if it is shot, say, in Africa, is usually developed in the west and timed to a cool palette. This makes Caucasians look okay, but doesn't accurately represent people of color at all. Looking across multiple archives we'll be able to ask questions about how widespread this issue is, especially now that we have more footage being created.


What is it like working with humanists as opposed say to biochemists or astrophysicists?

DB: Since visualizing video data is rather new, the researchers don't have any preconceived ideas of what representations should look like. This mode of working is often very different from other fields, where specific types of graphical forms already exist.

VK: With more traditional scientists, the data is not shifted by the way it is visualized – but, at the heart of my work, the way data looks impacts what it means. It’s critical. Dave has been an incredible resource, helping us find or see meaning through visualization.


Do you have any advice for other humanists who may be interested in large-scale visualization?

DB: My advice would be to contact XSEDE; NCSA is the first point of contact. We would like the opportunity to continue work on projects like these.


Where do things go from here?

VK: The next component of this project is getting crowdsourced tagging going, so the system can become smarter as more people use it. Another obstacle on the horizon is image recognition. It’s still not all that effective.

With 1.1 billion smartphone users in the world, the amount, type, and pace of images we see are dramatically different from past years. With so many devices providing individualized output, the impact is changing as a result. Simply viewing something affects our ideas, emotions, and thinking. Kuhn believes humans in general are losing some amount of impulse control, which speaks volumes to the issue of shortened attention spans.

Kuhn's project is one I will be watching. Her work can be found in peer-reviewed online journals such as Kairos, Electronic Book Review, Transformative Works and Cultures, and Enculturation, as well as in print. She has edited two digital anthologies and is currently finishing a print anthology titled Future Texts: Subversive Performance and Feminist Bodies.


