From the following specifications, form MongoDB queries that work on three collections:
movies)people)oscars)Note that whenever we refer to a person or movie, we also embed the
associated entity’s name for convenience. Note also that we include two
Boolean fields to person documents, hasActed and hasDirected, that
are only included when their values are true. Sample documents are
included below.
{ _id: "0499549", name: "Avatar", year: 2009, rating: "PG-13", runtime: 162, genre: "AVYS", earnings_rank: 1, actors: [ { id: "0000244", name: "Sigourney Weaver" }, { id: "0002332", name: "Stephen Lang" }, { id: "0735442", name: "Michelle Rodriguez" }, { id: "0757855", name: "Zoe Saldana" }, { id: "0941777", name: "Sam Worthington" } ], directors: [ { id: "0000116", name: "James Cameron" } ] } { _id: "0000059", name: "Laurence Olivier", dob: "1907-5-22", pob: "Dorking, Surrey, England, UK", hasActed: true, hasDirected: true } { _id: ObjectID("528bf38ce6d3df97b49a0569"), year: 2013, type: "BEST-ACTOR", person: { id: "0000358", name: "Daniel Day-Lewis" }, movie: { id: "0443272", name: "Lincoln" } }
What are some of the methods available for use in MongoDB? Which of them can be used interchangeably?
db.collection.find(), which takes a selection and
projection (in relational algebra terms) JavaScript object to describe a
query.
The db.collection.distinct() method takes a field name and
selection object and produces distinct results. db.collection.count()
takes a selection object and returns the number of documents that
match the selection.
Finally, the db.collection.aggregate() method takes a pipeline
of objects, each keyed by a special aggregate operator. The full
list of operators are available in the lecture notes and on the
MongoDB documentation,
but here are a few we’ve covered in detail:
$match, which works like the selection objects you can pass
to db.collection.find() and others, and limits the documents
that pass to the next step in the pipeline$group, which works like a SQL GROUP BY clause, and can
pass counts/groupings to the next step in the pipeline$project, which works like the projection objects you can pass
to db.collection.find() and others, and can include/exclude fields
from a result, or create/rename fieldsFind the top ten highest grossing movies in the database. For each film, output the movie’s name and the name of the director(s).
db.movies.find({ earnings_rank: { $lte: 10 }}, { name: 1, "directors.name": 1, _id: 0 })
The following is also acceptable:
db.movies.aggregate( { $match: { earnings_rank: { $lte: 10 } } }, { $project: { name: 1, directors: "$directors.name", _id: 0 } } )
How many actors in the database were born in California?
db.people.count({ hasActed: true, pob: /, California/ })
The following is also acceptable:
db.people.aggregate( { $match: { hasActed: true, pob: /, California/ } }, { $group: { _id: 0, numActors: { $sum: 1 } } }, { $project: { numActors: 1, _id: 0 } } )
Which people in the database have directed a movie in which Tom Hanks has acted?
db.movies.distinct("directors.name", { "actors.name": "Tom Hanks" })
Find all years before 1970 that have 6 or more movies in the database.
db.movies.aggregate( { $match: { year: { $lt: 1970 } } }, { $group: { _id: "$year", count: { $sum: 1 } } }, { $match: { count: { $gte: 6 } } }, { $project: { year: "$_id", count: 1, _id: 0 } } )
Find the person in the database who has acted in the most movies (the $unwind operator may be useful for this query).
db.movies.aggregate( { $unwind: "$actors" }, { $group: { _id: "$actors.name", numPerformances: { $sum: 1 } } }, { $sort: { numPerformances : -1 } }, { $limit: 1 }, { $project: { _id: 0, actor: "$_id", numPerformances: 1 } } )
Who directs the longest movies? Of all directors who have directed 5 or more movies, find the 10 directors whose movies have the longest average runtime.
db.movies.aggregate( { $unwind: "$directors" }, { $group: { _id: "$directors.name", avgLength: { $avg: "$runtime" }, numMovies: { $sum: 1 } } }, { $match: { numMovies: { $gte: 5 } } }, { $sort: { avgLength : -1 } }, { $limit: 10 }, { $project: { _id: 0, director: "$_id", avgLength: 1, numMovies: 1 } } )
Last updated on September 26, 2024.