Think Summer: Project 4 — 2024
Question 1
For the show Gilmore Girls, there are 7 seasons listed in the IMDB database. Find the average rating of each of the seven seasons. Hint: Use AVG to find the average, and GROUP BY the season_number. Make a plot or dotchart in R to show the average rating for each season.
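As a rough illustration of how AVG and GROUP BY work together, here is a minimal sketch on a made-up table. It assumes the DBI and RSQLite packages are installed; the table name toy_episodes and its values are hypothetical and are not taken from the IMDB database, whose real tables you will need to join yourself.

```r
# Toy illustration of AVG with GROUP BY; the table and its values are made up.
library(DBI)

con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Hypothetical table: one row per episode, with its season and rating.
toy_episodes <- data.frame(
  season_number = c(1, 1, 2, 2, 3),
  rating        = c(8.0, 9.0, 7.0, 8.0, 6.5)
)
dbWriteTable(con, "toy_episodes", toy_episodes)

# AVG computes one mean for each group defined by GROUP BY.
myDF <- dbGetQuery(con, "
  SELECT season_number, AVG(rating) AS avg_rating
  FROM toy_episodes
  GROUP BY season_number
  ORDER BY season_number;
")
dbDisconnect(con)

# Name each value by its season so that dotchart can label the rows.
myresults <- myDF$avg_rating
names(myresults) <- paste("Season", myDF$season_number)
dotchart(myresults, xlab = "Average rating")
```

In this toy table the three averages are 8.5, 7.5, and 6.5. The same pattern should carry over once the query selects only the episodes of the show you care about.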
Question 2
Identify the six most popular episodes of the show Grey’s Anatomy (where "popular" denotes a high rating).
Question 3
Make a dotchart in R showing the results of the previous question.
Hint: You can use your work from SQL, and export the results to a dataframe called myDF in R. Then you can use something like:
# use a dbGetQuery here, to import the SQL results to R, and then
myresults <- myDF$rating
names(myresults) <- myDF$primary_title
dotchart(myresults)
Question 4
Make a plot or dotchart showing the total amount of money donated in each of the top 10 states, during the 2000 federal election cycle.
Question 5
Make a dotchart that shows how many movies premiered in each year. You do not need to show all of the years; there are too many years! Just show the number of movies premiered in each year since the year 2000.
Question 6
Among the three big New York City airports (JFK, LGA, EWR), which of these airports had the worst DepDelay (on average) in 2005? Can you solve this with 1 line of R, using tapply, rather than using 3 separate lines of R? Hint: After you run the tapply, you can index your results using [c("JFK", "LGA", "EWR")] to look up all 3 airports at once.
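To show the shape of the tapply-then-index idea, here is a small sketch on made-up rows. The data frame toy_flights and the column name Origin are assumptions for illustration; only DepDelay and the three airport codes come from the question.

```r
# Toy stand-in for the flight data; these rows are made up.
toy_flights <- data.frame(
  Origin   = c("JFK", "JFK", "LGA", "LGA", "EWR", "EWR", "ORD"),
  DepDelay = c(10, 20, 5, NA, 30, 40, 100)
)

# Mean DepDelay per origin airport; na.rm = TRUE skips missing delays.
avg_delay <- tapply(toy_flights$DepDelay, toy_flights$Origin, mean, na.rm = TRUE)

# Index by name to keep only the three airports of interest.
nyc <- avg_delay[c("JFK", "LGA", "EWR")]
nyc

# The airport with the largest average delay in this toy data.
names(which.max(nyc))
```

Indexing by a character vector returns the airports in the order requested, so the other airports in the data (ORD here) never need to be filtered out beforehand.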
Question 7
LIKE is a very powerful tool. You can read about SQLite’s version of LIKE in the SQLite documentation. Use LIKE to analyze the primary_title of all IMDB titles: First determine how many titles have Batman anywhere in the title, and then determine how many titles have Superman anywhere in the title. Which one occurs more often?
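The sketch below illustrates how the % wildcard lets LIKE match a word anywhere in a string. It assumes the DBI and RSQLite packages are installed; the table toy_titles and its four rows are made up for the example and are not the IMDB data.

```r
# Toy illustration of LIKE with wildcards; the table is made up.
library(DBI)

con <- dbConnect(RSQLite::SQLite(), ":memory:")

toy_titles <- data.frame(
  primary_title = c("Batman Begins", "The Lego Batman Movie",
                    "Superman Returns", "Casablanca")
)
dbWriteTable(con, "toy_titles", toy_titles)

# % matches any run of characters (including none), so '%Batman%'
# matches the word at the start, in the middle, or at the end of a title.
batman <- dbGetQuery(con, "
  SELECT COUNT(*) AS n FROM toy_titles
  WHERE primary_title LIKE '%Batman%';
")
superman <- dbGetQuery(con, "
  SELECT COUNT(*) AS n FROM toy_titles
  WHERE primary_title LIKE '%Superman%';
")
dbDisconnect(con)

batman$n    # 2 in this toy table
superman$n  # 1 in this toy table
```

Note that SQLite's LIKE is case-insensitive for ASCII letters by default, so '%batman%' would match the same rows here.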
Question 8
How much money was donated during the 2000 federal election cycle by people who have PURDUE listed somewhere in their employer name? How much money was donated by people who have MICROSOFT listed somewhere in their employer name? Hint: You might use grep or grepl (a logical version of grep) to solve this one.
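Here is one way the grepl approach might look on a tiny made-up data frame. The names toy_donations, employer, and amount are hypothetical; the real election data will have its own column names.

```r
# Toy stand-in for the donation data; these rows are made up.
toy_donations <- data.frame(
  employer = c("PURDUE UNIVERSITY", "MICROSOFT CORP", "PURDUE UNIV", NA, "SELF"),
  amount   = c(100, 250, 50, 75, 20)
)

# grepl returns TRUE/FALSE for each row; fixed = TRUE treats the pattern
# as plain text rather than a regular expression.
# Rows with a missing employer yield FALSE, so they are left out of the sum.
is_purdue    <- grepl("PURDUE", toy_donations$employer, fixed = TRUE)
is_microsoft <- grepl("MICROSOFT", toy_donations$employer, fixed = TRUE)

sum(toy_donations$amount[is_purdue])     # 150 in this toy data
sum(toy_donations$amount[is_microsoft])  # 250 in this toy data
```

grep would work too: it returns the matching row positions instead of a logical vector, and either form can be used to index the amounts.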
Question 9
How much money was donated during the 2000 federal election cycle by people from your hometown? (Be sure to match the city and the state.)
Question 10
As in Monday’s project, during the years 2000 to 2020, how many people (from the people table) died in each year? Make a plot or dotchart to show the number of people who died in each year.
Question 11
As in Wednesday’s project, consider only the flights that arrive in Indianapolis (airport code IND), i.e., for which Indianapolis is the destination. What are the 10 most popular origin airports? Make a plot or dotchart to show the number of flights from each of these 10 most popular origin airports (with Indianapolis as the destination airport).
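A count-sort-truncate pattern is one possible way to get the most popular origins. The sketch below uses made-up rows; the data frame toy_flights and the column names Origin and Dest are assumptions for illustration.

```r
# Toy stand-in for the flight data; these rows are made up.
toy_flights <- data.frame(
  Origin = c("ORD", "ATL", "ORD", "DFW", "ATL", "ORD", "DEN"),
  Dest   = c("IND", "IND", "IND", "IND", "IND", "IND", "ORD")
)

# Keep only flights landing at IND, then count flights per origin.
to_ind <- toy_flights[toy_flights$Dest == "IND", ]
counts <- sort(table(to_ind$Origin), decreasing = TRUE)

# head(counts, 10) would give the top 10; this toy data has only a few
# origins, so we keep the top 2 instead.
top <- head(counts, 2)
dotchart(as.numeric(top), labels = names(top), xlab = "Flights to IND")
```

Sorting before truncating matters: head only keeps the first entries, so the counts must already be in decreasing order.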