From the following specifications, form MongoDB queries that work on three collections:
movies
)people
)oscars
)Note that whenever we refer to a person or movie, we also embed the
associated entity’s name for convenience. Note also that we include two
Boolean fields to person documents, hasActed
and hasDirected
, that
are only included when their values are true. Sample documents are
included below.
{ _id: "0499549", name: "Avatar", year: 2009, rating: "PG-13", runtime: 162, genre: "AVYS", earnings_rank: 1, actors: [ { id: "0000244", name: "Sigourney Weaver" }, { id: "0002332", name: "Stephen Lang" }, { id: "0735442", name: "Michelle Rodriguez" }, { id: "0757855", name: "Zoe Saldana" }, { id: "0941777", name: "Sam Worthington" } ], directors: [ { id: "0000116", name: "James Cameron" } ] } { _id: "0000059", name: "Laurence Olivier", dob: "1907-5-22", pob: "Dorking, Surrey, England, UK", hasActed: true, hasDirected: true } { _id: ObjectID("528bf38ce6d3df97b49a0569"), year: 2013, type: "BEST-ACTOR", person: { id: "0000358", name: "Daniel Day-Lewis" }, movie: { id: "0443272", name: "Lincoln" } }
What are some of the methods available for use in MongoDB? Which of them can be used interchangeably?
db.collection.find()
, which takes a selection and
projection (in relational algebra terms) JavaScript object to describe a
query.
The db.collection.distinct()
method takes a field name and
selection object and produces distinct results. db.collection.count()
takes a selection object and returns the number of documents that
match the selection.
Finally, the db.collection.aggregate()
method takes a pipeline
of objects, each keyed by a special aggregate operator. The full
list of operators are available in the lecture notes and on the
MongoDB documentation,
but here are a few we’ve covered in detail:
$match
, which works like the selection objects you can pass
to db.collection.find()
and others, and limits the documents
that pass to the next step in the pipeline$group
, which works like a SQL GROUP BY
clause, and can
pass counts/groupings to the next step in the pipeline$project
, which works like the projection objects you can pass
to db.collection.find()
and others, and can include/exclude fields
from a result, or create/rename fieldsFind the top ten highest grossing movies in the database. For each film, output the movie’s name and the name of the director(s).
db.movies.find({ earnings_rank: { $lte: 10 }}, { name: 1, "directors.name": 1, _id: 0 })
The following is also acceptable:
db.movies.aggregate( { $match: { earnings_rank: { $lte: 10 } } }, { $project: { name: 1, directors: "$directors.name", _id: 0 } } )
How many actors in the database were born in California?
db.people.count({ hasActed: true, pob: /, California/ })
The following is also acceptable:
db.people.aggregate( { $match: { hasActed: true, pob: /, California/ } }, { $group: { _id: 0, numActors: { $sum: 1 } } }, { $project: { numActors: 1, _id: 0 } } )
Which people in the database have directed a movie in which Tom Hanks has acted?
db.movies.distinct("directors.name", { "actors.name": "Tom Hanks" })
Find all years before 1970 that have 6 or more movies in the database.
db.movies.aggregate( { $match: { year: { $lt: 1970 } } }, { $group: { _id: "$year", count: { $sum: 1 } } }, { $match: { count: { $gte: 6 } } }, { $project: { year: "$_id", count: 1, _id: 0 } } )
Find the person in the database who has acted in the most movies (the $unwind operator may be useful for this query).
db.movies.aggregate( { $unwind: "$actors" }, { $group: { _id: "$actors.name", numPerformances: { $sum: 1 } } }, { $sort: { numPerformances : -1 } }, { $limit: 1 }, { $project: { _id: 0, actor: "$_id", numPerformances: 1 } } )
Who directs the longest movies? Of all directors who have directed 5 or more movies, find the 10 directors whose movies have the longest average runtime.
db.movies.aggregate( { $unwind: "$directors" }, { $group: { _id: "$directors.name", avgLength: { $avg: "$runtime" }, numMovies: { $sum: 1 } } }, { $match: { numMovies: { $gte: 5 } } }, { $sort: { avgLength : -1 } }, { $limit: 10 }, { $project: { _id: 0, director: "$_id", avgLength: 1, numMovies: 1 } } )
Last updated on September 26, 2024.