The “Uncanny Valley”, success of CGI, and the human storyteller

William B. Demeritt III
June 11th, 2012

In the June 2012 issue of International Cinematographer’s Guild Magazine, Garrett Brown was interviewed about his work as a cameraman and his revolutionary invention, the Steadicam. In reading the interview, I came across this snippet:

Will the human touch of an operator moving a camera through space lose favor with younger audiences weaned on a diet of CGI moves and effects?
GB: I surely hope not because that human presence is our connection to the narrative. The speeder bike chase in Return of the Jedi was effective because the POV was somehow still anchored and terrestrial – I was actually walking through the woods. The CGI boys would have made the trees ten thousand feet tall, and the speeder bike would have been flying at absurd altitudes, like a dogfight. Look at the pod race in the fourth Star Wars film. To my eyes it’s less gripping, yet much bigger in scale.

For those unaware, he shot the plates for the speeder bike scenes in “Return of the Jedi”, as well as the speeder bikes on a stage. For those plates, Garrett actually walked through a real forest with a camera on a Steadicam, brushing past branches and downed trees at normal speed. The result was the intense, fast-paced chase scene in the legendary final installment of the Star Wars trilogy.

He also refers to a scene in “Star Wars – Episode I: The Phantom Menace” which features an epic “pod racer” sequence. Here, you have two similar scenes: fast-paced, super-kinetic race/chase scenes that swap between leading and following multiple subjects, including the story’s protagonist. Both required plates and practical work, but even to the person who shot the former, the latter seems less compelling.

Years ago, I was anticipating a feature film called “Final Fantasy: The Spirits Within”, mostly because I couldn’t wait to see a “Final Fantasy” story transition to the screen (I’d recently finished Final Fantasy VII, and to date it’s one of the best game stories ever made). More than that, I was really enthusiastic about the technology driving CGI towards “photo-realism”.

The film “bombed”. With a production budget estimated at $137 million, it has earned $85 million globally since its release on July 11, 2001. The story was reviewed mostly as acceptable, and the effects were called “revolutionary”. So, why did the film tank? In a way, it was a 108-minute pod-racer scene, only without any of the practical filming. The entire film pushed the technology to its limits, with “photo-realistic” computer-generated human actors in entirely CG worlds. Was computer-generated imagery to blame?

However, since 2001, audiences have supported countless successful films from Pixar, Dreamworks, Fox Animation Studios, Nickelodeon Animation and more. Clearly, audiences are hungry for CGI films, so much so that on January 12, 2004, Walt Disney Studios closed its Florida animation studio. Many said that Disney was banking entirely on the popularity of CG animated films, allegedly declaring 2D animation (the style that created the Walt Disney empire) “dead”. The human hand was officially severed from animation.

Why is CG the way of the future of special effects and animated film? Why has it done so well for itself in the last 10 years despite the failure of “Final Fantasy: The Spirits Within”? And why did the pod-racing scene fail to thrill when the speeder-bike scenes in Jedi succeeded? Why isn’t “better” best?

In 1970, robotics professor Masahiro Mori published a paper entitled “Bukimi no Tani Genshō”, in which he coined the phrase “the uncanny valley”. In quick summary, the theory hypothesizes that people feel increasing familiarity with and acceptance of things that mimic human likeness, up to a point: when the likeness becomes nearly real, acceptance drops off rapidly into uneasiness and rejection. Moving humanoids earn greater acceptance, but they also fall into a deeper “valley” of rejection as they approach near-realism. That rapid swing from acceptance to rejection, before acceptance recovers again for real humans, is the “uncanny valley”.

With the uncanny valley in mind, let’s consider the change in CGI film since “Final Fantasy: The Spirits Within”. 9 of the 10 highest grossing animated films of all time are CG animated films released after November 2, 2001 (“Monsters, Inc” was released 4 months after “Final Fantasy”).

None of them made any further attempt at photo-realistic humans. All of them, while epic in scope and compelling in animation technology, depicted humans as nothing more than “cartoonish”. Humans, animals and monsters alike were impressionistic creations with exaggerated liberties and no attempt at realism, from the simple humans in “Shrek 2” (highest grossing of all time) to “Up” (6th highest grossing) with its square-faced curmudgeon of a “human” star.

Also worthy of note: “Toy Story 3” and “Finding Nemo” (#3 and #4 grossing) give humans of any kind significantly reduced screen time. The audience knows the whole movie is a cartoon.

I think animation studios got the idea early on: photorealism will always hit a barrier in connecting with human audiences when depicting humans. What about live-action filmmaking? The pod-racer scene had countless non-human characters on screen, none even attempting a humanoid appearance. It even featured a real human protagonist in a practical pod-racer. So why do so many people skilled in cinema say it’s “less gripping”?

I would argue that filmmakers are discovering what the CG animators discovered back in 2001: tactile realism is a tool of the storyteller, and faking realism to excess means a quick freefall into the “uncanny valley”. I would even suggest that this extends beyond human appearance and acceptance to the appearance and acceptance of events.

I was pleased to read this quote on the uncanny valley, a theory researchers were apparently still testing last year at the University of California, San Diego:
“The brain doesn’t seem tuned to care about either biological appearance or biological motion per se… What it seems to be doing is looking for its expectations to be met – for appearance and motion to be congruent.”

Perhaps these discoveries support the idea that congruence of appearance and motion matters beyond human figures, for events and landscapes as well? We don’t know what to expect of magical effects, so perhaps a quick wand blast in “Harry Potter” doesn’t offend us quite as badly as a drawn-out, 20-minute fight scene between King Kong and multiple T-Rex dinosaurs? Maybe the less a film’s world resembles our own, the higher the peak before the fall into the uncanny valley?

More filmmakers return to tactile realism every day. The most recent and largest-scale example I can think of is Ridley Scott’s “Prometheus”, where actors reported minimal green screen work and real stage/set building:
“There was basically no [green screen], maybe looking out of a window occasionally,” Pearce added. “But no, it was all built stuff. I remember seeing a couple of sets and being completely awestruck over a couple of days and then somebody saying, ‘But oh, have you seen the big stage?’ And you go, ‘Oh? There’s more?’”
– Guy Pearce

I enjoyed “Tron: Legacy”, even in 3D, but I had one gripe with an otherwise well-done film: Jeff Bridges on the bed with his son at the very beginning of the film, and at the door. The CG effects used to recreate a young Jeff Bridges were done as well as technology would allow, and in “the grid” (inside the computer), the “uncanny valley” wasn’t so eerie: we were inside a computer landscape, so an “eerie” effect was tolerated because of the context. However, I always found it disturbing to introduce that special effect so early in the film, because the filmmakers made such an effort to delineate “the grid” from the real world (the real world was 2D filmmaking, but “the grid” introduced the 3D effect).

More than that, I think filmmakers are reintroducing the human hand into filmmaking. A human can anticipate the expectations of an audience far better than a tyrannical computer model, no matter how “photorealistic”. Quite literally, if we don’t consider the audience’s expectations, we may never keep them from stumbling into that valley, and then we’ve lost them for that precious moment or minute of screen time.

“Rango”, an entirely CGI “photo-realistic” film that looked incredible, still stylized its characters enough that photo-realism wasn’t pushed beyond its limits. Further, the film (and others, including the Oscar-nominated Pixar film “Wall-E”) had ASC member Roger Deakins as a lighting consultant. Once again, the human eye was needed to help fool the audience.

What is the truth? The truth is the audience knows those X-wing fighters are on strings. The audience knows, for our purposes, those speeder bikes were on a soundstage somewhere, not zipping through a redwood forest. To quote “The Prestige”: “The audience knows the truth: the world is simple. It’s miserable, solid all the way through. But if you could fool them, even for a second, then you can make them wonder…”. I would suggest that you cannot give them that wonder if you’re straining their notions of what is acceptable. Grander isn’t necessarily better. As Garrett put it, larger scale doesn’t necessarily make it more gripping.

A director should always have a hand on the story, and I’ll always try to have a hand on the camera.

I recommend you read the Wikipedia article and supporting articles regarding the “uncanny valley”, as I think it’s quite fascinating and relevant to our future as storytellers.

William B. Demeritt III