Movie Explorer presents an interactive visualization of the relationship between User-Generated Content (UGC) on popular websites and movies’ features, including box office and genre. Casual audiences are likely to judge a movie’s quality by the user-generated ratings on popular movie review websites such as IMDb and Douban Movie (豆瓣电影), while user-generated tags offer a non-numerical measure that gives audiences a glimpse of a movie’s word of mouth. The project also considers the number, frequency and relevance of user-generated tags in order to compare the similarity between movies. To address these questions, the project implements multiple visualization types: a Stacked Area Chart, a Node Network and a Scatter Plot.
Year-Trend Stacked Area Chart
Visual Encoding: X axis: Year
Y axis: Stacked Number of Movies
Color: Genre
Menu bar: Filter
The chart shows an overall trend in movie production from 1933 to 2014. Each colored area represents one genre, as labelled on the right, and the y axis shows the stacked number of movies.
Interaction: When the mouse hovers over an area, a floating panel beside the cursor shows the movie production of all genres in that year. Users can also filter genres from the top menu bar, or switch between the expanded and stream views of the chart. Clicking an area stretches the y axis to provide a better view.
Interesting Findings: Overall movie production drops sharply in 2009, which might have been caused by the global financial crisis.
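The stacking behind such a chart can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the project's actual code: the function name `stack_layers`, the toy counts, and the genre names are hypothetical. Each genre's band for a year starts where the previous genre's band ended; in the expanded view, each year's counts are first divided by that year's total so the bands fill the range 0 to 1.

```python
def stack_layers(counts_by_genre, expanded=False):
    """Return {genre: [(y0, y1), ...]}: the lower and upper edge of each
    genre's band per year, stacked in the dict's insertion order."""
    n_years = len(next(iter(counts_by_genre.values())))
    # Total number of movies per year, used only for the expanded view.
    totals = [sum(series[i] for series in counts_by_genre.values())
              for i in range(n_years)]
    baseline = [0.0] * n_years
    layers = {}
    for genre, series in counts_by_genre.items():
        band = []
        for i, count in enumerate(series):
            value = float(count)
            if expanded:
                # Normalize to a share of the year's total (0 if no movies).
                value = value / totals[i] if totals[i] else 0.0
            band.append((baseline[i], baseline[i] + value))
            baseline[i] += value
        layers[genre] = band
    return layers


# Hypothetical toy data: movie counts for three consecutive years.
counts = {"Action": [3, 4, 1], "Drama": [5, 4, 1]}

stacked = stack_layers(counts)
expanded = stack_layers(counts, expanded=True)
print(stacked["Drama"][0])   # (3.0, 8.0): Drama sits on top of Action
print(expanded["Drama"][2])  # (0.5, 1.0): each genre has half of the last year
```

In the stacked view the top edge of the last band is the year's total production, which is what makes an overall drop visible at a glance.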
Node Network
Visual Encoding: Node: movie or tag
Line width: the relevance of the tag to a movie
Position: the relevance of the movie to a tag
The chart presents a network of tags and movies, with tag nodes on the outer circle and movie nodes inside it. The line width indicates the degree of relevance between a movie and a tag.
Interaction: When the mouse hovers over a movie node, all the tag nodes with which the movie is labelled light up, along with the lines between them. When the mouse hovers over a tag node, all the movie nodes carrying that tag light up, along with the other tags those movies have.
Interesting Findings: The tags “Gangsters” and “fight scenes” seem standard for the action genre.
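One possible way to compute such a layout is sketched below in Python. The write-up does not say how positions or line widths were actually derived, so everything here is an assumption: tags are spaced evenly on a unit circle, each movie is placed at the relevance-weighted average of its tags' positions (so it drifts toward the tags most relevant to it), and line width grows linearly with a relevance score assumed to lie in [0, 1]. The function names, the width range, and the tags "romance" and "space" are hypothetical.

```python
import math


def tag_positions(tags, radius=1.0):
    """Place tag nodes evenly on a circle of the given radius."""
    n = len(tags)
    return {
        tag: (radius * math.cos(2 * math.pi * i / n),
              radius * math.sin(2 * math.pi * i / n))
        for i, tag in enumerate(tags)
    }


def movie_position(relevance, positions):
    """Relevance-weighted average of the positions of a movie's tags.
    ASSUMPTION: this weighting is illustrative, not the project's method."""
    total = sum(relevance.values())
    if total == 0:
        return (0.0, 0.0)  # no relevant tags: put the movie at the center
    x = sum(w * positions[t][0] for t, w in relevance.items()) / total
    y = sum(w * positions[t][1] for t, w in relevance.items()) / total
    return (x, y)


def line_width(relevance, min_width=0.5, max_width=4.0):
    """Map a relevance score in [0, 1] linearly to a stroke width."""
    return min_width + relevance * (max_width - min_width)


# Hypothetical tags and relevance scores.
positions = tag_positions(["gangsters", "fight scenes", "romance", "space"])
movie = {"gangsters": 0.9, "fight scenes": 0.3}
x, y = movie_position(movie, positions)
print(round(x, 3), round(y, 3))
print(line_width(movie["gangsters"]), line_width(movie["fight scenes"]))
```

Because a weighted average of points on the circle always falls inside it, movie nodes stay within the ring of tag nodes, matching the layout described above.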
Rate-BoxOffice Scatter Plot
Visual Encoding: X axis: Box Office
Y axis: Rating
Circle Radius: Number of users who rated the movie
Interaction: When the mouse hovers over a movie circle, the movie’s title appears above the chart and the circle is brought to the front.
Interesting Findings: There has long been a debate over whether a movie can achieve both box-office success and audience approval in a commercial market. While the pattern does not answer this question, it suggests that a movie with a high box office is unlikely to receive a poor rating.
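For the circle radius encoding, a common choice is a square-root scale, so that a circle's area (rather than its radius) is proportional to the number of raters. Whether the project used this scale is not stated; the Python sketch below simply assumes it, and the function name `circle_radius`, the maximum radius, and the sample counts are hypothetical.

```python
import math


def circle_radius(num_raters, max_raters, max_radius=20.0):
    """Square-root scale: circle AREA is proportional to num_raters.
    ASSUMPTION: the project's real scale and pixel sizes are unknown."""
    if max_raters <= 0:
        return 0.0
    return max_radius * math.sqrt(num_raters / max_raters)


# Hypothetical rater counts: the second movie has a quarter of the raters.
r_big = circle_radius(10000, 10000)
r_small = circle_radius(2500, 10000)
print(r_big, r_small)  # 20.0 10.0: half the radius, a quarter of the area
```

With a linear radius scale the smaller circle would instead have one sixteenth of the area, which visually understates less-rated movies.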
1. Developed the interactive visualizations in D3.js in the form of a Stacked Area Chart, a Node Network and a Scatter Plot; the Stacked Area Chart uses nvd3.js.
2. Processed 2 movie datasets with more than 5000 items in Python:
- Cleaning: Filtered out movies missing genre data (i.e., rows where the table reads “no genres listed”)
- Transformation: Split the genre data into an array (e.g., from “genre 1 | genre 2 |…” to [genre 1, genre 2, …])
- Wrangling: For each genre, counted the number of movies released in each year, and output a temporary dataset from the Python program for later lookup
3. Optimized visual encoding in terms of visualization type and mark type.
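The cleaning, transformation and wrangling steps listed above can be sketched in Python roughly as follows. This is a minimal illustration, not the original script: the row layout (dicts with `title`, `year` and `genres` keys), the function names, and the sample rows are all assumptions.

```python
from collections import Counter

NO_GENRE = "no genres listed"  # marker for missing genre data


def clean(rows):
    """Cleaning: drop movies whose genre field is the missing-data marker."""
    return [row for row in rows if NO_GENRE not in row["genres"]]


def split_genres(row):
    """Transformation: turn 'genre 1 | genre 2' into ['genre 1', 'genre 2']."""
    genres = [g.strip() for g in row["genres"].split("|") if g.strip()]
    return {**row, "genres": genres}


def count_by_genre_and_year(rows):
    """Wrangling: count movies per (genre, year) pair."""
    counts = Counter()
    for row in rows:
        for genre in row["genres"]:
            counts[(genre, row["year"])] += 1
    return counts


# Hypothetical sample rows; real column names may differ.
rows = [
    {"title": "Movie A", "year": 2008, "genres": "Action | Crime"},
    {"title": "Movie B", "year": 2008, "genres": "Action"},
    {"title": "Movie C", "year": 2009, "genres": "(no genres listed)"},
]

counts = count_by_genre_and_year([split_genres(r) for r in clean(rows)])
print(counts[("Action", 2008)])  # 2
```

The resulting counter plays the role of the temporary dataset: a lookup keyed by genre and year that a stacked area chart can read directly.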
Instructor: Prof. Nan Cao of Tongji University Intelligent Big Data Visualization Lab
The proposal can be viewed at https://pan.baidu.com/s/1raeaGSO .