Hard Drive: An Experiment in Intervisuality

Katya Sander

What images are on your computer? Do you know what they look like out of context? What might an image outside its context be? What is the context of an image, and how would we define the framework in which an image gains its meaning? And, ultimately, what image of readers and users is produced if those images are understood in terms of the “image archive” or “image history” they’ve accumulated?

Instead of intertextuality, can one perhaps speak of an intervisuality, as a way of understanding how the meaning of images is shaped by other images related to or existing around them?

As for intervisuality, I’m drawing on notions of intertextuality as introduced by Julia Kristeva [1]. Kristeva referred to texts in terms of two axes: a horizontal axis connecting the author and reader of a text, and a vertical axis connecting the specific text to others existing around it. According to Kristeva, shared codes unite these two axes. She suggested that every text and reading depends on prior codes: “every text is from the outset under the jurisdiction of other discourses which impose a universe on it” [2]. Kristeva introduced the idea of “structuration”: instead of limiting our attention to the structure of a text, we should study how this structure came into being. This is its “structuration,” which involved siting the text “within the totality of previous or synchronic texts” of which it was a “transformation” [3].

One might try to imagine images in a similar way, as existing always “within the totality of previous or synchronic images” in relational structures determined by shared codes.

The Hard Drive project attempts to think through a number of possible relational axes among images, texts and contexts. An online magazine is a perfect setting for this. It contains different texts by different authors, all related to each other by the framework of the issue itself, and each of these texts is accompanied by images, also related to each other to varying degrees. These images are also related to the reader/viewer in terms of what images s/he might already know, appreciate, use, store, or even own. Furthermore, the fact that the magazine exists online allows for the flexible and ever-shifting embedding of images, all chosen through specific criteria and changing from one user to the next. Hard Drive renders visible a set of relationships between images and texts, and between authors and readers/viewers, and perhaps even between images and images, existing both online and on our hard drives.

Images and text

Western cultures share a tradition of subordinating images to the textual, of lending the textual an authority over pictures. Text defines, anchors, and names the images, the motifs and spaces of which are cut off at the frame. The frame is what makes an image, but it is also a violent tool; it isolates the visible and differentiates it from that which is outside of it, thereby producing a new “not-visible.” To make up for the sudden removal of an entire universe by the frame, a text is added: a title, a place, a date; a name of a person, animal, building; or perhaps an explanation of an object and how to use it. The text gives us a guide to that isolated and free-floating fragment; it gives us the possibility of re-establishing some kind of order around it. But this new order pertains to the world of the spectator: a subject who not only looks at images but also reads and speaks and hopes to identify something.

When searching for an image, we might have a vivid idea of what the image looks like, but the only way to search is by using text. Going through art historical archives, collections, museums, catalogues, we have to remember text: a country, a name, a pose of a model, a motif, a type of landscape or style. Language is the ground on which archives are built. That is what any museum or collection primarily shows us: how and to what extent images can be mapped by and organized through language. However, museums and library archives are no longer the obvious places to search for images. When looking for an image today, one goes online.

Different search engines suggest different routes through ever-expanding clusters of images. And so far, most search engines cannot actually see images. Most of them sort and categorize images in terms of what they are told the images depict, that is, through text—through tags. A tag is similar to what in traditional taxonomy is called classification. A tag gives an image a handle, an ability to be placed somewhere and to be found through a search. A tag is a name for a category, but unlike traditional taxonomy, a tag doesn’t have to be predefined. In traditional archives, a limited number of classificatory terms are defined beforehand, and so each item will have one correct way of being classified. In a tagging system, however, a user or author can use any word to classify an item at any point in time. There is no “wrong” categorization, and an item may have an unlimited number of different tags, corresponding to an unlimited number of categories. Instead of many items being related to each other by belonging to the same category, one category—or name—is related to another by being assigned to the same item.
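
The structure described above can be sketched in a few lines of Python. This is only an illustration, not the implementation of any particular tagging service: the file names, the tags, and the function names images_for_tag and related_tags are all invented for the example. It shows an item carrying any number of free-form tags, and two tags becoming related simply by being assigned to the same item.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical image file names and free-form tags, invented for illustration.
tagged_images = {
    "img_001.jpg": {"horse", "field", "summer"},
    "img_002.jpg": {"horse", "race"},
    "img_003.jpg": {"race", "protest"},
}

def images_for_tag(tag):
    """Return every image that carries the given tag, sorted by name."""
    return sorted(name for name, tags in tagged_images.items() if tag in tags)

def related_tags():
    """Link two tags whenever they have been assigned to the same image."""
    links = defaultdict(set)
    for tags in tagged_images.values():
        for a, b in combinations(sorted(tags), 2):
            links[a].add(b)
            links[b].add(a)
    return dict(links)
```

Here no list of permitted categories exists in advance. The tag "race" ends up linked both to "horse" and to "protest" only because users happened to attach it to images of each.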

Could tags thus be understood as an indication of another hierarchy emerging, one in which language is mapped by images and not the other way around? Tagging systems are often described as nonhierarchical or bottom-up, instead of the traditional taxonomical top-down system through which categories are predefined. However, in relation to the image itself, text is still the criterion for access. The searchers, in using an index, must express their interests in the same language that was used by the indexers. In the case of tags, the indexers might be anyone, in many cases simply the users. So the focus on language is emphasized even further: one does not have to study a specific archive and its structure to be able to search it; instead, one has to imagine how “most people” would name this or that item. An idea of the general or the normal (perhaps the universal?) overwrites the specific architecture of specific archives, and their singular claims for universality.

Images and images

For years, researchers around the world have worked on ways to make computers able to actually see images: to recognize edges, surfaces, forms, scales, colors, depths, etc., and to relate them to others that are similar, and in this way “recognize” something seen in an image, thereby transcending the linguistic framework, the matching tags and terms. This prompts several questions: What could the relations between images within an ever-expanding image archive, structured through visual properties rather than linguistic ones, ultimately be? Can we imagine images organized through this kind of “purely” mechanical visuality? How would they be seen, i.e., organized, understood, categorized? And how would they be ordered, searched for, and found? What might “similarity” or “difference” mean, in terms of this “purely visual,” without any linguistic interference? Is that even possible for us to imagine?

So far, a small number of search engines [4] offer searches through image identification technology rather than keywords, metadata, or watermarks. They search for images based on how they look, rather than based on which tag words describe them.
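
What it could mean to compare images by how they look can be hinted at with a very simple technique, sometimes called an average hash. The sketch below is an assumption for illustration only and does not describe how any of the search engines mentioned here actually work. Each image is assumed to be already reduced to a tiny grid of grayscale values (0 to 255); the grids and the function names are invented.

```python
def average_hash(pixels):
    """Reduce a grayscale grid to bits: 1 where a pixel is brighter than the mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    return [1 if value > mean else 0 for value in flat]

def hamming_distance(hash_a, hash_b):
    """Count the positions at which two equal-length fingerprints differ."""
    return sum(a != b for a, b in zip(hash_a, hash_b))

# Invented 2x2 grayscale grids standing in for downscaled images.
original = [[10, 200], [220, 30]]
brightened = [[40, 230], [250, 60]]   # same picture, uniformly lighter
different = [[200, 10], [30, 220]]    # bright and dark areas swapped

distance_to_copy = hamming_distance(average_hash(original), average_hash(brightened))
distance_to_other = hamming_distance(average_hash(original), average_hash(different))
```

A small distance suggests a modified version of the same image, a large one a different image. No word or tag enters the comparison at any point, though what counts as "similar" here is still defined by whoever chose the measure.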

Some of these search engines specialize in “automated image matching”: locating identical or modified versions of an image, thereby telling the searcher (or owner) where it came from, or how it is being used. Others make it possible to send images as queries from smartphones to search engines or databases. Google’s recent “Goggles,” for example, recognizes strings of letters or symbols, logos, landmarks, or objects in digital images, and converts the information into a symbolic format such as plain text, which can then be used to access information about the object shown. (Photograph your immediate environs and find out where the nearest tanning salon or Art & Human Rights conference might be.)

Other software technology under development is intended to do for digital images on the Web what Google’s original PageRank software did for text-based searches: not only operate with image-recognition software but also combine it with mathematical algorithms for weighting and ranking images according to what seems most useful. Using technology designed for automatic face recognition, this software can now also identify objects other than faces (mountains, horses, teapots, motifs that are instantly recognizable to humans) and prioritize the results in order to present those best suited to the searcher. Whereas the first versions of Google’s PageRank would prioritize pages according to what was assumed relevant to general user activity and links, newer versions identify the specific user and prioritize the pages shown according to the previous activities and searches of this individual. This development has been criticized for ultimately producing a kind of mirror of the users themselves: a search for the term mouse, for example, brings up very different results for a biologist than it does for a computer engineer, and a search on the word race offers different results for a right-wing extremist than for a human-rights activist (provided they each use their own computer). We know this technology very well from, for example, Amazon’s automatic suggestions (“Continue Shopping: Customers who bought items in your Recent History also bought…”) or the iTunes “TuneUp,” which identifies songs on your computer and compares them to those that other users, with similar songs, have on their computers, in order to suggest additional material for you.

Hard Drive

Hard Drive operates with a similar technique, understanding your hard drive as your image archive. Thus, the images you will see accompanying the texts in Red Hook are “suggested images for this text” based on visual image-recognition technology in combination with keywords (tags) identified in each of the texts. The types and content of the images (TIFFs, JPGs, and PNGs) on the hard drive(s) of the computer you use when you read the journal will be visually identified. The images you see along with each text are chosen from the vast archive of images accessible online. The choices are based on images you already have on your computer, identified through “purely visual” properties and prioritized in relation to predefined tags in each text.
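
One way to picture this combination of visual likeness and tags is the following sketch. It is a guess at the general shape of such a selection, not the project's actual software: the scoring rule (tag overlap multiplied by visual likeness), the bit-list "fingerprints," and all names and data are assumptions made up for the example.

```python
def visual_similarity(hash_a, hash_b):
    """Share of matching bits between two equal-length visual fingerprints."""
    matches = sum(a == b for a, b in zip(hash_a, hash_b))
    return matches / len(hash_a)

def suggest_images(text_tags, local_hashes, online_images, limit=2):
    """Rank online images by tag overlap with a text and likeness to local images."""
    scored = []
    for name, (tags, fingerprint) in online_images.items():
        tag_overlap = len(text_tags & tags)
        if tag_overlap == 0:
            continue  # the image shares no keyword with the text
        # Likeness to the closest image found on the reader's hard drive.
        likeness = max(
            (visual_similarity(fingerprint, h) for h in local_hashes),
            default=0.0,
        )
        scored.append((tag_overlap * likeness, name))
    # Highest score first; ties are broken alphabetically by name.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:limit]]

# Invented data: tags found in a text, fingerprints of local images,
# and online images with their tags and fingerprints.
text_tags = {"archive", "frame"}
local_hashes = [[0, 1, 1, 0], [1, 1, 0, 0]]
online_images = {
    "a.jpg": ({"archive"}, [0, 1, 1, 0]),            # identical to a local image
    "b.jpg": ({"archive", "frame"}, [1, 1, 0, 1]),   # two shared tags, close match
    "c.jpg": ({"horse"}, [0, 1, 1, 0]),              # no shared tag
}

suggestions = suggest_images(text_tags, local_hashes, online_images)
```

With a different set of local fingerprints, the same text would be accompanied by different images, which is the point of the experiment: the selection changes from one reader's hard drive to the next.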

Red Hook guarantees that all information regarding the images on your computer will be immediately deleted and rendered impossible to use once you exit the journal’s site.


  1. Kristeva coined the term in the 1960s, drawing on the work of Mikhail Bakhtin; it was thought to apply as much to reading as to writing.
  2. Julia Kristeva, cited in The Pursuit of Signs: Semiotics, Literature, Deconstruction by Jonathan Culler, 1981, p. 116.
  3. From Le texte du roman by Julia Kristeva, cited in Language and Materialism: Developments in Semiology and the Theory of the Subject, by Coward & Ellis, 1977, p. 52.
  4. A small number of search engines, such as TinEye, Piximilar, PixMatch, SAPIR, and others, offer searches through image identification technology.

