The Changing Face of Video

Video, as a medium for the transmission of information, is ubiquitous, familiar, and about to change forever.

An (apparently retouched) version of the first televised advertisement ever.

We take for granted continued improvements in the technologies that underlie video creation, such as cameras, microphones, post-processing, and CGI (the evolution of Canon lenses over the years is one example). Yet the core set of tasks inherent in video creation never changes: humans formulate an idea of the video they’d like to create, then attempt to realize that vision (with plenty of bumps along the way). The resulting video may resonate with viewers, or it may not. In either case, its creators use information about how the video was received, and by whom, to form a (highly lossy) feedback loop that informs their future efforts.

However, technological innovations are already underway that will dramatically alter this human-in-the-loop process of video formulation and creation.

This post is about the changing face of video in human society.

The Importance of Video

Surveys indicate the average American watches nearly six hours of video per day. To put it another way, 38% of the waking lives of Americans are spent watching video flashing across the screens of their mobile devices, computers, tablets, and TVs.

Nor is this video glut restricted to the United States. In 2017, 75% of global internet traffic was dedicated to video, and this figure is projected to reach 82% by 2021.

The reasons for the explosion of video are, on the face of it, obvious. Video lends itself to passive consumption, requiring only that we exercise innate human behaviors, such as observing movement, and skills developed in early childhood, such as processing spoken language. Both require little cognitive effort, particularly compared with reading. Evolutionary pressures have also made movement and sound more effective at capturing human attention than static information such as written words, which gives video an inherent advantage over text when it comes to grabbing and keeping viewers’ attention. At the same time, video is a comparatively rich channel for information transmission, permitting enormously complex scenes to be conveyed in just a few frames.

The stimuli we experience during our conscious lives are a main driver of our decision-making processes. Thus, because video is such a rich and digestible channel for these external stimuli, people who seek to influence the behavior of others will nearly always employ video as a core channel for cognitive access to their target market. Advertisers, in particular, have leveraged the power of video since shortly after its inception.

The Four Pillars of Video Effectiveness

Whether an advertiser is building their video campaign to increase product sales, drive brand awareness, or affect public action (such as through Public Service Announcements), the following four criteria provide a useful framework for predicting creative effectiveness. To wit, the video must:

Capture and Retain Attention: the effectiveness of a message is directly related to the level of attention devoted to it. This is a given; if no one watches or pays attention to a video, it is powerless to influence their behavior.

Engage Emotionally: emotional engagement is a key driver of behavior. Sparking the right mix of emotions leads to positive association, memorability, and even (sometimes) virality. Often these emotions are overt, such as the feelings of joy associated with comedy. Other times, they are subtle, such as the feelings of trust and safety that financial institutions often seek to engender through their advertising. Generally, advertisers seek to elicit positive-valence emotions, such as happiness. However, negative-valence emotions, such as sadness and anger, can be extremely effective if properly directed. It is not an oversimplification to say that the intensity of an emotion is a better predictor of campaign success than its overall valence (tenor).

Associate Properly: a video could go viral by driving strong emotional reaction and high levels of attention, yet fail to influence human behavior in the manner desired by its creators if the emotions are not associated with the correct object. For example, an advertiser may create a comedic ad to sell toothpaste which succeeds in eliciting peals of laughter from any audience that sees it. But if the toothpaste brand is never mentioned, or is mentioned at the wrong time, then the fun, positive emotions the advertiser has elicited will not be transferred onto the correct object (the toothpaste brand), and the campaign is unlikely to succeed.

Remain Memorable: generally, the behavioral change that advertisers seek to elicit through video is not desirable, or even possible, in the moment when the video is consumed. Rather, the decision (conscious or unconscious) to make the change (for instance, switching toothpaste brands), as well as the emotional and associative reasons the decision was made, must be retained in memory until the action can be performed (such as the next time one is at the store to buy toothpaste). Emotional intensity, attention, and associations are all predictors, not guarantees, of memorability. There is, however, no better way to increase memorability than by maximizing attention, emotion, and association. (Strictly speaking, the memorability of the ad itself is only a proxy for the memorability of the decision it prompts, but it is a useful one.)

As an aside, these rules hold true for video content creators outside of advertising, as well. When a content producer is creating the new season of a hit show, they seek to do the same thing — to engender, through their video, certain strong feelings in their audience to keep them watching.

The Future of Video in the Long Tail

Unfortunately, the effectiveness of a given video for a single individual, under the fourfold framework, varies wildly based on a multitude of factors. These include personal experience, demographics, location, platform, and even time of day. While broad trends can be assessed and exploited (Hollywood blockbusters are but one example of this), no single video can have a predictable, similarly positive impact on each person in a population. The economics of video production have led to the primacy of mass-market appeal, where video producers seek to maximize ROI by catering to the broadest audience possible. This often leaves us with content that earns money from large numbers of people but is loved and remembered by very few.

In the future, though, technological innovation will continue to drive the evolution of video creation, sparking a transition from content developed by humans for mass consumption to the algorithmic generation of bespoke content, designed from the ground up for (at first) highly targeted groups and, ultimately, for the individual consumer. It will be generated to suit not just the individual’s overall tastes (which are a product of both genetic legacy and experience) but also their desires at precisely that time. Video will thus exist to cater to the long tail, rather than the majority.

One example of an early foray into bespoke content generation is the popular Black Mirror series on Netflix, which released an episode (‘Bandersnatch’) that plays out like a choose-your-own-adventure novel. The episode pauses at various points and asks viewers what they want the main character to do. The story then unfolds based on their selections. Of course, this content was all generated in advance, and permits only a handful of potential story lines. It also requires active input from the viewer in order to deviate from the default story line. However, it is an interesting experiment in custom content, in which two people who watched the ‘same’ episode might actually have seen very different story lines.

Shifts in an individual’s video preferences can happen over different time periods. For instance, a childhood preference for animated cartoons may shift to action movies during the teenage years, and then to political thrillers during middle age. Another type of shift might occur from month to month, as when a death in the family leads that same person to prefer lighthearted comedy films for several weeks. Shifts can also occur on even shorter timescales, such as an evening of nostalgia that leads that same middle-aged person who generally prefers political thrillers to revisit the cartoons of their youth. Historical watch data on that individual, biometric data about how they felt about what they saw, and predictive models generated through large-scale collaborative filtering will together produce a clear picture of what video, exactly, will resonate with the individual at this very moment… even if they didn’t know to search for it themselves.
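
As a rough illustration of the collaborative filtering idea mentioned above, here is a minimal sketch in Python. Everything in it is hypothetical: the viewer names, genres, ratings, and the helper functions `cosine_similarity` and `predict_rating` are invented for this example, and a real system would operate at far greater scale and fold in signals such as time of day or biometric data. The sketch simply guesses how much one viewer would like something they have not yet watched, based on the ratings of viewers with similar histories.

```python
from math import sqrt

# Hypothetical watch-history ratings: viewer -> {genre: rating from 1 to 5}.
ratings = {
    "viewer_a": {"cartoon": 5, "action": 3, "thriller": 1},
    "viewer_b": {"cartoon": 4, "action": 3, "thriller": 1, "comedy": 5},
    "viewer_c": {"cartoon": 1, "action": 2, "thriller": 5, "comedy": 2},
}


def cosine_similarity(a, b):
    """Cosine similarity computed over the items both viewers have rated."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[i] * b[i] for i in shared)
    norm_a = sqrt(sum(a[i] ** 2 for i in shared))
    norm_b = sqrt(sum(b[i] ** 2 for i in shared))
    return dot / (norm_a * norm_b)


def predict_rating(target, item, all_ratings):
    """Similarity-weighted average of other viewers' ratings for `item`.

    Returns None when no other viewer has rated the item.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for other, other_ratings in all_ratings.items():
        if other == target or item not in other_ratings:
            continue
        sim = cosine_similarity(all_ratings[target], other_ratings)
        weighted_sum += sim * other_ratings[item]
        weight_total += sim
    return weighted_sum / weight_total if weight_total else None


# viewer_a has never rated comedy; estimate it from similar viewers.
predicted = predict_rating("viewer_a", "comedy", ratings)
print(f"Predicted comedy rating for viewer_a: {predicted:.2f}")
```

Because viewer_a’s history looks much more like viewer_b’s than viewer_c’s, the estimate is pulled toward viewer_b’s high comedy rating. This is the simplest user-based form of the technique; it says nothing about the shorter-term mood shifts described above, which would need additional inputs.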

Of course, knowing what would resonate is only part of the problem. Given the unique preferences of each person, and how often those preferences shift, the ‘perfect’ video for that person, at that time, almost certainly doesn’t exist yet. Thus, for optimal impact on the viewer, video will need to be created in real time and delivered to the consumer for maximum enjoyment. The speed at which this whole process must take place means that, as a practical matter, machines must create this content. Research into this field of algorithmic content generation is already underway.