The Surface of the Metaverse
Tonight, Microsoft announced its new "Surface" multi-touch interface and hardware system. Looking for all the world like one of those old Ms. Pac-Man video game tables found in older bars and pizza joints, the Surface device combines a high-powered Windows computer with a 30" display, set horizontally. Surface is controlled by touching this screen with one or more fingers, manipulating images in a reasonably intuitive manner.
The system bears a remarkable resemblance to the multi-touch display Jeff Han demonstrated at TED in 2006, but it's unclear just how much (if anything) he had to do with the Microsoft product. Surface does include some nifty features that Han's vertically mounted screens couldn't offer, such as recognizing when a digital device has been put onto the table and reacting accordingly -- downloading pictures from cameras, opening up a jukebox app for an MP3 player, etc. I was impressed by the gestural controls for these features (such as "tossing" a file towards a device to upload it); a key aspect of a usable kinesthetic interface has to be a subtle sense of physics, so that "objects" (virtual though they may be) have a perceived mass and momentum.
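To make that "perceived mass and momentum" point concrete, here's a toy sketch of what a "toss" gesture boils down to: the object inherits the flick's velocity, then coasts to a stop under friction. This is purely illustrative -- the names and constants are my own assumptions, not anything Microsoft has published.

```python
# Minimal sketch of "toss" physics for a virtual object: the object keeps
# the velocity of the releasing flick and coasts to a stop under friction.
# All names and constants are illustrative assumptions, not Surface's API.

FRICTION = 0.92  # fraction of velocity retained each frame; tuned by feel

class TossableObject:
    def __init__(self, x, y):
        self.x, self.y = x, y
        self.vx, self.vy = 0.0, 0.0

    def release(self, flick_vx, flick_vy):
        """Called when the finger lifts; inherit the flick's velocity."""
        self.vx, self.vy = flick_vx, flick_vy

    def step(self, dt):
        """Advance one frame: coast along, bleeding off speed."""
        self.x += self.vx * dt
        self.y += self.vy * dt
        self.vx *= FRICTION
        self.vy *= FRICTION

# A photo "tossed" at 800 px/s keeps drifting toward the camera icon,
# slowing as if it had real weight.
photo = TossableObject(100, 100)
photo.release(800, 0)
for _ in range(30):
    photo.step(1 / 60)
```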
Okay, nifty tech, undoubtedly terrifically expensive for the foreseeable future, but if it's at all functional -- and my guess is that it will be -- it's probably a progenitor of a device we'll have in our homes by the middle of the next decade, and will find in cereal boxes not too long after that.
What struck me while watching the demos and reading the breathless write-up in Popular Mechanics (of all places) was that the multi-touch display system is probably the apotheosis of the two-dimensional interface model. It comes the closest to treating virtual objects as occupying 3D space and having weight, without compromising the utility of more traditional flat documents and menus. Users aren't limited to a single point of contact with the display (e.g., a mouse pointer), breaking an ironclad law dating from the earliest days of computers. In the end, a mouse pointer and a text insert cursor are making the same claim: here is the sole point of interaction with the machine. Multi-touch interfaces (whether Microsoft's Surface, Apple's iPhone, or whatever) toss aside that fundamental rule.
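To see what tossing aside that rule means in practice, consider the bookkeeping a multi-touch system has to do that a mouse-driven one never did: several simultaneous contacts, each with its own identity, from which gestures like a two-finger pinch fall out naturally. The sketch below is purely illustrative -- my own names, not any vendor's actual API.

```python
import math

# Illustrative sketch: track several simultaneous contacts by ID, and derive
# a pinch-zoom factor from any two of them -- something a single mouse
# pointer, by definition, can never express.

class TouchTracker:
    def __init__(self):
        self.points = {}  # touch_id -> (x, y)

    def touch_down(self, touch_id, x, y):
        self.points[touch_id] = (x, y)

    def touch_move(self, touch_id, x, y):
        self.points[touch_id] = (x, y)

    def touch_up(self, touch_id):
        self.points.pop(touch_id, None)

    def pinch_distance(self):
        """Distance between the first two active contacts, if present."""
        if len(self.points) < 2:
            return None
        (x1, y1), (x2, y2) = list(self.points.values())[:2]
        return math.hypot(x2 - x1, y2 - y1)

tracker = TouchTracker()
tracker.touch_down(1, 100, 100)
tracker.touch_down(2, 200, 100)
start = tracker.pinch_distance()
tracker.touch_move(2, 300, 100)
zoom = tracker.pinch_distance() / start  # 2.0: fingers spread, image doubles
```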
The appeal of Surface (etc.) for computing tasks, however, will be limited in many commonplace arenas. Multi-touch isn't going to make spreadsheets, blogging or surfing the web any simpler or more powerful. It will have some utility in photo and video editing, although here the question of whether greasy fingers will prove a regular problem rears its head. No, the real market for multi-touch is in the world of the Metaverse, especially in the Augmented Reality and Mirror Worlds versions.
(The final version of the Metaverse Roadmap Overview will finally be out in the next couple of weeks, if not sooner, btw.)
The core logic of both Mirror Worlds and Augmented Reality is the intertwining of physical reality and virtual space, in large measure to take advantage of an information substrate tied to spatial relationships. This substrate relies heavily upon abundant sensors, mobile devices, and the willingness of citizens to tag/annotate/identify their environments. The Augmented Reality form emphasizes the in situ availability of the information substrate, while the Mirror Worlds form emphasizes its analytic and topsight power. In each case, the result is a flow of information about places, people, objects and context, one which relies on both history and dynamic interconnections. This may well be the breakthrough technology that makes it possible to control such information flows.
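One way to picture that substrate is as a pile of annotations keyed to coordinates, retrievable from wherever you happen to be standing. The toy sketch below is only an illustration of the idea -- the names and the crude proximity query are my own assumptions, not a description of any real Mirror Worlds system.

```python
import math

# Toy sketch of a spatial information substrate: citizen-made annotations
# tagged to coordinates, retrievable by proximity. Purely illustrative.

EARTH_RADIUS_M = 6371000

def distance_m(lat1, lon1, lat2, lon2):
    """Approximate ground distance via the haversine formula."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

class AnnotationStore:
    def __init__(self):
        self.entries = []  # (lat, lon, tag, note)

    def annotate(self, lat, lon, tag, note):
        self.entries.append((lat, lon, tag, note))

    def nearby(self, lat, lon, radius_m=100):
        """Everything tagged within radius_m of the viewer's position."""
        return [(tag, note) for (alat, alon, tag, note) in self.entries
                if distance_m(lat, lon, alat, alon) <= radius_m]

substrate = AnnotationStore()
substrate.annotate(37.7793, -122.4193, "history", "City Hall rebuilt after the 1906 quake")
print(substrate.nearby(37.7790, -122.4190))  # what's tagged near where I'm standing
```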
Both of these manifestations of the Metaverse could readily take advantage of an interface system that allowed complex kinetic and gestural controls, with Mirror Worlds working best with a massive table/wall screen, and Augmented Reality working best with a hand-held device -- or maybe just the hand. One of Jeff Han's insights while developing his multi-touch system was that human kinesthetic senses need something to push against in order to work right. "Tapping" something virtual in mid-air may look cool in the movies, but runs against how our bodies have evolved: our muscles and minds expect something to be there, offering physical resistance, when we touch it. Rather than digital buttons floating in mid-air (or a total reliance on a so-called "conversational interface"), mobile systems will almost certainly have either a portable tablet or (in my view the eventual winner) a way of using one hand to draw on the other, mimicking a stylus and tablet. The parallel here is to the touchpad found on most laptops: imagine using similar gestures and motions, but on your other hand instead of on a piece of plastic.
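The nice thing about the touchpad parallel is that it only needs relative motion, not an absolute coordinate system painted on your palm. Here's a minimal sketch of that mapping, with the actual sensing hardware hand-waved away; every name in it is an illustrative assumption.

```python
# Minimal sketch of "palm as touchpad": whatever senses the fingertip on the
# other palm only has to report successive positions; the cursor moves by the
# *relative* delta, just like a laptop touchpad. Sensing itself is assumed away.

SENSITIVITY = 3.0  # screen pixels per unit of movement on the palm

class PalmPointer:
    def __init__(self, screen_w, screen_h):
        self.screen_w, self.screen_h = screen_w, screen_h
        self.cursor_x, self.cursor_y = screen_w / 2, screen_h / 2
        self.last = None  # last sensed (x, y) on the palm, if touching

    def finger_lifted(self):
        self.last = None  # next touch starts a fresh stroke, no cursor jump

    def finger_at(self, palm_x, palm_y):
        if self.last is not None:
            dx, dy = palm_x - self.last[0], palm_y - self.last[1]
            self.cursor_x = min(max(self.cursor_x + dx * SENSITIVITY, 0), self.screen_w)
            self.cursor_y = min(max(self.cursor_y + dy * SENSITIVITY, 0), self.screen_h)
        self.last = (palm_x, palm_y)

pointer = PalmPointer(1280, 800)
pointer.finger_at(10, 10)   # touch down: no movement yet
pointer.finger_at(14, 10)   # drag right across the palm; cursor nudges right
pointer.finger_lifted()     # lift, reposition, touch again: cursor stays put
```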
There are some obvious drawbacks to this interaction model -- from the aforementioned greasy fingers to the ergonomics of head and arm positions in extended use -- but my guess is that the number of innovative applications of the interface (most of which haven't even been imagined yet) will outweigh any initial physical clumsiness.
Comments
When I saw the video of this last night I was impressed. The previous surface computing stuff MS demonstrated in March was fun to watch, but this really does seem much more like the sort of thing that could spur adoption, increase volume and lower costs.
Posted by: csven | May 30, 2007 5:58 AM
Jamais, you and I are just barely old enough to remember "gorilla arm" complaints from people using light pens on monitors.
I completely agree that there needs to be a physical interface, and that it needs to be something we're already accustomed to using.
Posted by: jet | May 30, 2007 7:31 AM
I've been scratching my head for a couple of years about why GUI environments (including video games) don't support multiple mouse cursors when plugging in a second USB mouse is trivial.
Using a one-button mouse like it was your hand is like trying to work with one hand and three fingers tied behind your back.
Posted by: Nato Welch | May 30, 2007 12:12 PM
Without seeing this gizmo, it sounds like an extension of drawing tablets and light pens (the mice have learnt to leap, thereby giving the illusion of a multiple-point interface).
It does have the advantage that what you see is what you manipulate.
What's so difficult, in this age of Bluetooth, Wii and GPS navigation, about having a system that constantly tracks the point of your light pen in 3D? Heck, why not have a glove with tracking points at the tips of each finger?
It might even fit in a cereal box!
Posted by: Tony Fisk | May 30, 2007 4:53 PM
If I need to be creative, the best "first stage" device is a glove, with small "ballpoint rolling ball bearings" in the fingertips. That's 5 fairly delicate contact points and no smudging at all. Losing a finger or hand in ten years might prove a bigger disaster than it is already.
The same would hold for augmented reality; the fingertips could contact small feedback devices (vibration) in the parts of the hand, when you "vtouch" something. The mind would soon learn to recognize such subtle vtactile clues, applying finesse well into the millimeter range.
Imagine 3D painting with such a set of gloves. I recently acquired a WACOM with the help of a friend; the tip of the stylus holds a massive concentrated wealth of feedback devices. The same technology could be compressed into the gloves, whether they work on a flatscreen display surface or (several years later) in a telepresence projected environment.
Imagine me creating 3D paintings, in a high-definition maya/photoshop environment in what, 10 years?
*drools like an idiot*
Posted by: Dagon | May 31, 2007 1:55 PM
Jamais, did Google release their new "Street View" feature on GoogleMaps at this conference? I first noticed it two nights ago.
I think it fits into both an augmented reality and virtual world scenario.
I work at a high school and the kids love it -- non-stop wows! Reflection back onto the physical world through virtual mobility and consciousness, I suspect, renders a heightened awareness of reality, and will thus lead to new consequences and opportunities for human interaction and consciousness in the physically-real domain.
Posted by: matt waxman | June 1, 2007 4:25 PM
Web Fingers
The latest input device:
We have combined press-on nails, photovoltaics, thin-film batteries, micro-accelerometers, and wireless communication to create an unobtrusive way for a computer to know the position and movement of each of your fingers -- creating an intuitive gestural interface that is small, lightweight, and powered by ambient light.
The perfect companion for your I-shades.
Posted by: jim moore | June 1, 2007 7:56 PM
I'm having a little bit of trouble seeing how "surface" itself fits in to augmented reality or augcog. Augmented reality really needs to be in a mobile device to work. Your comments on feedback are well taken, but it would seem to me that something like Dagon proposes above, combined with a "traditional" display, is more suited for augcog than a touch screen.
What do you think?
Posted by: Howard Berkey | June 3, 2007 10:15 AM
In principle, what Dagon proposes would be more appropriate for augmented reality than a touch interface, but I strongly suspect that the broader public would reject the need to wear gloves for casual interaction with the infosphere. One of my general rules of technology is that tech that adapts to people will tend to win out over tech that requires people to adapt to it.
Another advantage of a touch-screen handheld for AR is the clear modality: there's no question of whether or not the gesture is meant to be AR interactive. You could also instantiate an input mode with the "drawing on hand" system with an easy but uncommon gesture, such as tapping on one's palm with two fingers held close together.
Jim Moore has it -- press-on nails-based computing.
Matt, I don't know if Google Street View was unveiled at the D5 conference, but it certainly has made a splash...
Posted by: Jamais Cascio | June 3, 2007 11:08 AM