One of the biggest questions being asked these days is how YouTube uses Big Data. YouTube is often described as the second most popular search engine on the web, and it generates vast amounts of data from video viewing, posting, and linking. The study aims to understand how YouTube uses this data to create personalized features such as Recommended Channels, which are based on the videos users watch most often. It also examines how YouTube predicts which video topics will be popular in the future, and how it uses data visualization to improve its user experience.
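To make the idea of recommendations "based on the videos users watch most often" concrete, here is a minimal sketch in Python. It is not YouTube's actual algorithm, which is not described here; it simply counts how often each channel appears in a user's watch history and returns the most frequent ones. The function name `recommend_channels`, the `(channel, video_id)` record shape, and the sample channel names are all assumptions made for illustration.

```python
from collections import Counter


def recommend_channels(watch_history, top_n=3):
    """Return the channels a user watched most often, most frequent first.

    watch_history is a list of (channel, video_id) pairs. This shape is a
    hypothetical stand-in for whatever viewing records a real system keeps.
    """
    counts = Counter(channel for channel, _video_id in watch_history)
    return [channel for channel, _count in counts.most_common(top_n)]


# Made-up sample data for illustration only.
history = [
    ("CookingDaily", "v1"),
    ("TechTalk", "v2"),
    ("CookingDaily", "v3"),
    ("TravelNow", "v4"),
    ("TechTalk", "v5"),
    ("CookingDaily", "v6"),
]

print(recommend_channels(history, top_n=2))  # ['CookingDaily', 'TechTalk']
```

A production recommender would presumably weigh many more signals, such as watch time or recency, but the counting step captures the basic intuition.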
The first question is where YouTube stores its data. The answer is in modular data centers, which are portable and can be placed wherever storage capacity is needed. YouTube uses five or six Google data centers, along with its own content distribution network (CDN). This infrastructure is critical to audience engagement, because the data it holds can offer valuable insights into user preferences. Of these components, the CDN is the most visible to viewers, since a CDN delivers content from servers located close to the people requesting it.
YouTube has also used this data to remove videos with objectionable content. It reported removing 8.3 million videos from the platform in the fourth quarter of 2017. Most of these were first flagged by automated systems, and 76 percent of the machine-flagged videos were removed before receiving a single view. But YouTube's algorithms are far from foolproof. For instance, they sometimes mistake newsworthy videos for violent extremism. As a result, Google employs full-time human specialists who work alongside the AI to monitor content.
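The division of labor between automated flagging and human reviewers can be pictured as a simple triage rule. The sketch below is an assumption-laden illustration, not YouTube's real pipeline: it supposes a classifier that outputs a score between 0 and 1, and the `triage` function and both thresholds are hypothetical values chosen only to show the idea.

```python
def triage(score, remove_at=0.95, review_at=0.60):
    """Route a video based on a hypothetical classifier score in [0, 1].

    The thresholds are illustrative assumptions, not published values.
    """
    if score >= remove_at:
        return "remove"        # very confident: act automatically
    if score >= review_at:
        return "human_review"  # uncertain: send to a human specialist
    return "keep"              # low score: leave the video up


# Made-up scores for three videos.
for score in (0.99, 0.72, 0.10):
    print(score, triage(score))
```

Routing the uncertain middle band to people is one way a system could avoid removing newsworthy footage that merely resembles prohibited content.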