Several months ago, I saw a brief clip of researchers developing new technology for mining videos; in the clip, experimental software automatically parsed video files to identify objects, symbols and people. The software could annotate the video with comments such as “red car moving from left to right from 3:05:20 to 3:05:59”. I wasn’t able to find that clip again, but I did find some other interesting work on this subject that I think is directly relevant to the future of social media analytics.

A very nice introduction to audio and video mining can be found in Intel’s Technology Journal. The article describes the computing requirements (at the tera-scale, still rather prohibitive) and some early applications, such as content summarization, annotation, search and indexing, analysis and surveillance for entertainment, healthcare, and personal use. Obviously, in addition to the video images, this entails mining the audio for many of the same purposes for which text content is mined today (sentiment, semantics, keywords, searching, etc.). In one example, they show how the different teams and players can be identified in a soccer match:

Video mining sports: Identifying soccer players and teams

I didn’t have to look very far to find examples of companies already making money in audio and video mining. For example, the appropriately named Video Mining Corporation has some interesting products for extracting consumer insights and behavior, as shown in some stills from their website (visit their link to see the full Flash animations):

Video Mining Corporation - Consumer Insight Examples

A security solutions company (TrueSentry) also has some interesting video mining products and examples:

TrueSentry Video Mining - People Identification

TrueSentry Video Mining - Vehicle Identification

There is apparently even more work being done in pure audio, though the key niche right now seems to be call quality and monitoring. Sound Communications Inc. appears to offer automated sentiment / satisfaction analytics of call audio, as does Dalbar Corporation. Dalbar’s product, “Voice of Your Customer (VOYC) …produces four types of results that are extracted from telephone conversations with customers through a proprietary computer-based analysis”. The primary result of the VOYC system is a satisfaction index, which reports on the relationship with customers and identifies the factors that improve customer satisfaction and those that injure it. Measured outcomes include the customer’s predisposition (favorable or unfavorable); issue resolution (resolved or unresolved); value added that was not expected by the customer (added or not added); and the customer’s reaction at the end of the conversation (positive or negative). Information on VOYC was provided by my friend Lou Harvey (President of Dalbar). Another audio mining company is Nuance, which offers a tool for searching and indexing audio content.

Moving on to still images, LTU Technologies appears to offer white-label image search and recognition solutions. Its two main products are an image search and retrieval engine (currently used by Corbis, among many others) and an image recognition and filtering engine. These tools are based on automatic comparison and description of static images, and have obvious applications for tracking and protecting brands and logos. The following is an example from LTU’s website, where their Image-Filter™ system successfully categorizes two images with identical tags/filenames as porn or not. Apparently, computers are already smart enough to have an appreciation for human anatomy!

LTU Technologies Image Filter Example

The examples I listed here are far from comprehensive, and there appears to be a tremendous body of academic literature on the subject. All of these applications require immense bandwidth and computing capacity (with images, audio, and video in increasing order of complexity), but if we can count on Moore’s Law to continue, these won’t be obstacles for very long.

I think these examples are evidence that social media applications of video, audio, and image mining are not far away. Imagine applying analytics to the content of podcasts, vidcasts, and viral videos. Stretch your imagination a little further, and consider applying search and statistical capabilities across all available webcams or tools like Google Street View. Perhaps it’s not so far-fetched to predict that in the next fifteen years we will be able to run a real-time search on queries like “the number of people in New Hampshire wearing red hats.”
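To make the red-hat query a little more concrete, here is a minimal Python sketch of how it might look once video has been reduced to annotations like the “red car” comment mentioned at the top of this post. Everything in it is an assumption for illustration: the `Annotation` fields, the `count_matching` helper, the camera ids, and the sample records are all made up, and none of it reflects how any of the products above actually work.

```python
from dataclasses import dataclass


def to_seconds(timestamp: str) -> int:
    """Convert an 'H:MM:SS' timestamp into a number of seconds."""
    hours, minutes, seconds = (int(part) for part in timestamp.split(":"))
    return hours * 3600 + minutes * 60 + seconds


@dataclass(frozen=True)
class Annotation:
    """One machine-generated observation (hypothetical schema)."""
    source: str            # camera or video id
    region: str            # where the footage was recorded
    label: str             # kind of object, e.g. "person" or "car"
    attributes: frozenset  # descriptive tags, e.g. {"red hat"}
    start: str             # first time the object is seen, 'H:MM:SS'
    end: str               # last time the object is seen, 'H:MM:SS'

    def visible_at(self, timestamp: str) -> bool:
        """True if the object is on screen at the given time."""
        return to_seconds(self.start) <= to_seconds(timestamp) <= to_seconds(self.end)


def count_matching(annotations, region, label, attribute, timestamp):
    """Count annotations matching a region, label, attribute and moment."""
    return sum(
        1
        for a in annotations
        if a.region == region
        and a.label == label
        and attribute in a.attributes
        and a.visible_at(timestamp)
    )


# Made-up sample data, standing in for the output of a video mining system.
SAMPLE = [
    Annotation("cam-1", "New Hampshire", "person", frozenset({"red hat"}), "3:05:20", "3:05:59"),
    Annotation("cam-2", "New Hampshire", "person", frozenset({"blue hat"}), "3:05:00", "3:06:00"),
    Annotation("cam-3", "Vermont", "person", frozenset({"red hat"}), "3:05:00", "3:06:00"),
    Annotation("cam-1", "New Hampshire", "car", frozenset({"red"}), "3:05:20", "3:05:59"),
]

print(count_matching(SAMPLE, "New Hampshire", "person", "red hat", "3:05:30"))  # prints 1
```

The query itself is the easy part. The sketch counts annotations rather than distinct people, so the same person seen by two cameras would be counted twice, and producing reliable annotations from raw video in the first place is where the bandwidth and computing capacity go.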



 


 

