This project is a comprehensive analysis of Netflix's movies and TV shows data using SQL. The goal is to extract valuable insights and answer a set of business questions based on the dataset. This README describes the project's objectives, business problems, and findings. The provided 'Netflix_problem_statements_and_solutions.sql' file contains the solutions to the business problems listed below.

The objectives are:
- Analyze the distribution of content types (movies vs TV shows).
- Identify the most common ratings for movies and TV shows.
- List and analyze content based on release years, countries, and durations.
- Explore and categorize content based on specific criteria and keywords.
The data for this project is sourced from the Kaggle dataset:
- Dataset Link: Movies Dataset

The business problems addressed are:

- Count the number of Movies vs TV Shows.
- Find the most common rating for movies and TV shows.
- List all movies released in a specific year (e.g., 2020).
- Find the top 5 countries with the most content on Netflix.
- Identify the longest movie.
- Find content added in the last 5 years.
- Find all the movies/TV shows by director 'Rajiv Chilaka'.
- List all TV shows with more than 5 seasons.
- Count the number of content items in each genre.
- For each year, find the average number of content items released in India on Netflix, and return the top 5 years with the highest average content release.
- List all movies that are documentaries.
- Find all content without a director.
- Find how many movies the actor 'Salman Khan' appeared in over the last 10 years.
- Find the top 10 actors who have appeared in the highest number of movies produced in India.
- Categorize the content based on the presence of the keywords 'kill' and 'violence' in the description field. Label content containing these keywords as 'Bad' and all other content as 'Good'. Count how many items fall into each category.
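
A few of the problems above can be sketched as follows. This is a minimal, self-contained illustration, not the contents of 'Netflix_problem_statements_and_solutions.sql'. It uses Python's built-in `sqlite3` module, a hypothetical table named `netflix`, made-up sample rows, and column names (`type`, `release_year`, `duration`, `director`, `description`) that are assumed to match the dataset. The SQL is written for SQLite, so details may differ in other databases, and the keyword query assumes that either keyword alone is enough to label an item 'Bad'.

```python
import sqlite3

# In-memory database with a hypothetical `netflix` table.
# Column names are assumptions about the dataset layout.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE netflix (
        show_id      TEXT PRIMARY KEY,
        type         TEXT,
        title        TEXT,
        director     TEXT,
        release_year INTEGER,
        duration     TEXT,
        description  TEXT
    )
""")

# Made-up sample rows, for illustration only.
rows = [
    ("s1", "Movie", "Sample Movie A", "Director One", 2020, "90 min",
     "A detective hunts a man who will kill again."),
    ("s2", "Movie", "Sample Movie B", None, 2019, "120 min",
     "A family road trip."),
    ("s3", "TV Show", "Sample Show C", "Director Two", 2020, "6 Seasons",
     "A town torn apart by violence."),
    ("s4", "TV Show", "Sample Show D", None, 2021, "2 Seasons",
     "Friends open a bakery."),
]
conn.executemany("INSERT INTO netflix VALUES (?, ?, ?, ?, ?, ?, ?)", rows)

# Count the number of movies vs TV shows.
type_counts = conn.execute("""
    SELECT type, COUNT(*) AS total
    FROM netflix
    GROUP BY type
    ORDER BY type
""").fetchall()

# List all movies released in a specific year (here 2020).
movies_2020 = conn.execute("""
    SELECT title
    FROM netflix
    WHERE type = 'Movie' AND release_year = 2020
""").fetchall()

# List TV shows with more than 5 seasons. The season count is taken
# from the leading number in `duration` (e.g. '6 Seasons').
long_shows = conn.execute("""
    SELECT title
    FROM netflix
    WHERE type = 'TV Show'
      AND CAST(substr(duration, 1, instr(duration, ' ') - 1) AS INTEGER) > 5
""").fetchall()

# Find all content without a director.
no_director = conn.execute("""
    SELECT title
    FROM netflix
    WHERE director IS NULL
    ORDER BY title
""").fetchall()

# Label content as 'Bad' if the description contains 'kill' or
# 'violence' (either keyword, an assumption), otherwise 'Good'.
categories = conn.execute("""
    SELECT
        CASE
            WHEN lower(description) LIKE '%kill%'
              OR lower(description) LIKE '%violence%' THEN 'Bad'
            ELSE 'Good'
        END AS category,
        COUNT(*) AS total
    FROM netflix
    GROUP BY category
    ORDER BY category
""").fetchall()

print(type_counts, movies_2020, long_shows, no_director, categories)
```

Note that a substring match such as `'%kill%'` also matches words like "skill", so the 'Bad' count is an upper bound on descriptions that mention the keyword as a whole word.
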

The main findings are:

- Content Distribution: The dataset contains a diverse range of movies and TV shows with varying ratings and genres.
- Common Ratings: Insights into the most common ratings provide an understanding of the content's target audience.
- Geographical Insights: The top countries and the average content releases for India highlight regional content distribution.
- Content Categorization: Categorizing content based on specific keywords helps in understanding the nature of content available on Netflix.
This analysis provides a comprehensive view of Netflix's content and can help inform content strategy and decision-making.
You can watch the video tutorial here: Advanced SQL Project | Netflix Data Analysis Using SQL (Guided)
Thanks to the 'Zero Analyst' channel for creating such useful content!
