To accurately define the emerging status and trends of the convergent video landscape, ZTE and other players have proposed the concept of big video that covers four aspects: big content, big network, big data, and big ecosystem.
The big video industry has benefited greatly from recent advancements in artificial intelligence (AI), especially machine learning (ML) and deep learning (DL), a specialized and powerful form of ML that uses many processing layers.
Advancements in AI
Artificial intelligence has two major schools: rule-based expert systems and data-driven machine learning systems.
Rule-based expert systems perform efficiently and deterministically but lack the ability to learn adaptively from the data sets being processed.
Data-driven machine learning systems, especially deep learning systems, are able to solve problems that are hard to define or enumerate by explicit rules, for example, natural language understanding. They can also learn autonomously from massive data sets and improve their accuracy over time.
ML and DL require massive computing power, which was infeasible to provide until cloud computing systems were commercially deployed. In recent years, ML/DL systems have become able to recognize and understand objects, faces, voices, and conversations with 95%+ accuracy, an error rate comparable to that of human classification. This capability unlocks many advanced features in big video.
AI for Big Content
Big content denotes the current diversified trends in the digital content business, ranging from audio to video to virtual reality (VR) / augmented reality (AR) / mixed reality (MR), from standard definition (SD) to high definition (HD) to ultra high definition (UHD), from traditional studio-produced long-form movies and dramas to user-generated mobile-friendly portrait-mode short videos.
The content may be professionally generated content (PGC), occupationally generated content (OGC) such as live casting of online games, or user-generated content (UGC).
To support a smooth multi-screen experience and sharing on social network services, short clips and trailers of long content are essential. These used to be produced manually by experts.
One smart use case of AI for big video is to automatically generate interesting video clips as trailers. AI is able to analyze each video frame, identify the objects in it, and mark them up with timestamps. For example, for a 90-minute soccer game, AI is able to pick out scenes related to goals, misses, and audience cheers.
On the audio/speech side of big video, natural language processing (NLP), a branch of AI, has enabled automatic generation of subtitles and closed captions based on a good understanding of the speech and conversations in a video.
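The clip-selection step described above can be illustrated with a minimal Python sketch. It assumes that an upstream recognition model has already produced timestamped event labels; the `TaggedEvent` type, the label names, and the 10-second padding are hypothetical choices made for this illustration, not part of any particular product.

```python
from dataclasses import dataclass


@dataclass
class TaggedEvent:
    time_s: float  # timestamp of the detected event, in seconds
    label: str     # hypothetical label such as "goal", "miss", or "cheer"


def select_clips(events, wanted, pad_s=10.0, duration_s=None):
    """Return merged (start, end) windows around events whose label is wanted."""
    windows = []
    for ev in sorted(events, key=lambda e: e.time_s):
        if ev.label not in wanted:
            continue
        start = max(0.0, ev.time_s - pad_s)
        end = ev.time_s + pad_s
        if duration_s is not None:
            end = min(end, duration_s)
        if windows and start <= windows[-1][1]:
            # Overlaps the previous window: extend it instead of adding a new clip.
            windows[-1] = (windows[-1][0], max(windows[-1][1], end))
        else:
            windows.append((start, end))
    return windows


# Assumed tags for a 90-minute (5400-second) soccer game.
events = [
    TaggedEvent(600.0, "goal"),
    TaggedEvent(608.0, "cheer"),
    TaggedEvent(1500.0, "throw_in"),
    TaggedEvent(3000.0, "miss"),
]
clips = select_clips(events, {"goal", "miss", "cheer"}, pad_s=10.0, duration_s=5400.0)
print(clips)  # [(590.0, 618.0), (2990.0, 3010.0)]
```

The goal and the cheer that follows it fall into one merged clip, while the throw-in is ignored because its label is not in the wanted set.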
Moreover, object-recognition (including face-recognition) technologies have made it possible to automatically tag objects with additional information. This is especially useful for VR, AR, and MR.
AI for Big Network
Big network captures the fact that modern video content may be delivered via many types of networks, including terrestrial broadcasting, analog/digital cable, analog/digital satellite, and managed IPTV and unmanaged OTT video over fixed broadband networks, Wi-Fi networks, and mobile data networks.
IPTV and OTT video systems are notoriously hard to maintain and manage due to their non-deterministic and non-repeatable nature. Unsupervised learning algorithms can be used to automatically detect anomalies in network operations.
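As a rough illustration of label-free anomaly detection on network measurements, the sketch below flags outliers with a robust z-score based on the median and the median absolute deviation (MAD). This is a deliberately simple statistical stand-in, not the method of any specific IPTV or OTT system; the latency values, the function name, and the 3.5 threshold are assumptions made for the example.

```python
import statistics


def find_anomalies(samples, threshold=3.5):
    """Return indices of samples whose robust z-score exceeds the threshold."""
    median = statistics.median(samples)
    # Median absolute deviation: a spread estimate that outliers barely affect.
    mad = statistics.median([abs(x - median) for x in samples])
    if mad == 0:
        return []  # no measurable spread, so nothing can be ranked as an outlier
    # 0.6745 rescales the MAD so the score is comparable to a standard z-score.
    return [
        i for i, x in enumerate(samples)
        if 0.6745 * abs(x - median) / mad > threshold
    ]


# Hypothetical per-minute stream latency readings in milliseconds.
latency_ms = [40, 42, 41, 39, 40, 180, 41, 40, 42, 39]
print(find_anomalies(latency_ms))  # [5]
```

No labeled examples of faults are needed: the detector learns what is normal from the data itself, which is the property that makes unsupervised approaches attractive for networks whose failures are hard to reproduce.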
AI for Big Data
Big data in the video industry covers multiple dimensions: by each subscriber, by each content/asset, and by each network resource.
Face recognition of each subscriber by the set-top box (STB) or by soft clients (e.g. smartphones/tablets) provides a smooth user experience with an individualized electronic program guide (EPG).
A content recommendation system is critical to attracting users and increasing average revenue per user (ARPU). It is based on supervised learning algorithms that remember what a user likes and dislikes and fit a formula over many features/properties of the user and of each piece of content.
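One way to picture such a learned formula is a small logistic regression trained on past likes and dislikes. The sketch below is illustrative only: the three content features, the toy viewing history, and the training settings are assumptions, and a production recommender would use far richer user and content features.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def score(weights, bias, features):
    """Predicted probability that the user likes an item with these features."""
    return sigmoid(bias + sum(w * x for w, x in zip(weights, features)))


def train(examples, epochs=200, lr=0.5):
    """Fit logistic regression by stochastic gradient ascent on (features, liked) pairs."""
    weights = [0.0] * len(examples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for features, liked in examples:
            err = liked - score(weights, bias, features)
            bias += lr * err
            weights = [w + lr * err * x for w, x in zip(weights, features)]
    return weights, bias


# Hypothetical features per item: [is_sports, is_drama, is_short_clip].
# The label is 1 if the user liked the item and 0 if the user disliked it.
history = [
    ([1, 0, 1], 1),
    ([1, 0, 0], 1),
    ([0, 1, 0], 0),
    ([0, 1, 1], 0),
]
weights, bias = train(history)
sports_score = score(weights, bias, [1, 0, 0])
drama_score = score(weights, bias, [0, 1, 1])
print(sports_score > drama_score)  # True
```

Candidate items can then be ranked by their scores, so that content resembling what the user liked in the past appears first in the guide.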
AI for Big Ecosystem
The big ecosystem of the video industry involves many parties, e.g. content producers, content aggregators, solution vendors, multi-channel video service providers/operators, and advertisers.
An interesting example here is smart AI advertisements based on image recognition and video blending. For example, appropriate advertisements or logos may be added onto open space in the video scene and will appear as naturally embedded objects.
APIs for accessing subscriber data and network statistics can be offered to third-party developers who create advanced features and services. For example, enterprise-oriented online education systems can utilize the content delivery network built for operator-oriented pay TV services.
AI is not a panacea that can solve all problems in one shot. Nevertheless, with massive data to train the system, AI can become smarter and smarter. For the big video industry, smart AI leads to happier users, more attractive content, better-managed networks, and more prosperous ecosystems.
AI, Video offerings, Big video, Big content, Big network, Big data, Big ecosystem