Bonus problem from Section 2:
For each departure city in last section’s the Flights relation, find the flight
numbers of the flights from that city to New York (if any). If there
is no such flight, use a value of NULL
to indicate that fact.
UNION
should not be used.
Say our Flights table looks like this (ignoring the columns that are irrelevant to this problem):
flno |
origin |
dest |
---|---|---|
10 |
Chicago |
New York |
20 |
Chicago |
New York |
30 |
Los Angeles |
New York |
40 |
Los Angeles |
Seattle |
50 |
Seattle |
Anchorage |
60 |
New York |
Boston |
The result we eventually want for this problem is:
origin |
flno |
---|---|
Chicago |
10 |
Chicago |
20 |
Los Angeles |
30 |
Seattle |
|
New York |
|
Note that the result set has every possible departure city from the
table matched up with the flight numbers that get you from that city
to New York. Note also that if you can’t get to New York by starting
from a given departure city, it has one row in the result set and the
flno value is NULL
.
This sounds like how an outer join works, so let’s use an outer join from Flights to itself:
SELECT F1.origin, F2.flno FROM Flights AS F1 LEFT OUTER JOIN Flights AS F2 ON F1.flno = F2.flno AND F2.dest = 'New York';
When we join Flights with itself using the join condition
F1.flno = F2.flno
, we end up with just as many rows as Flights has,
duplicating the columns of Flights so that it looks like this:
F1.flno |
F1.origin |
F1.dest |
F2.flno |
F2.origin |
F2.dest |
---|---|---|---|---|---|
10 |
Chicago |
New York |
10 |
Chicago |
New York |
20 |
Chicago |
New York |
20 |
Chicago |
New York |
30 |
Los Angeles |
New York |
30 |
Los Angeles |
New York |
40 |
Los Angeles |
Seattle |
40 |
Los Angeles |
Seattle |
50 |
Seattle |
Anchorage |
50 |
Seattle |
Anchorage |
60 |
New York |
Boston |
60 |
New York |
Boston |
Then, by specifying AND F2.dest = 'New York'
, we can get NULL
to
appear wherever a given flight doesn’t arrive in New York:
F1.flno |
F1.origin |
F1.dest |
F2.flno |
F2.origin |
F2.dest |
---|---|---|---|---|---|
10 |
Chicago |
New York |
10 |
Chicago |
New York |
20 |
Chicago |
New York |
20 |
Chicago |
New York |
30 |
Los Angeles |
New York |
30 |
Los Angeles |
New York |
40 |
Los Angeles |
Seattle |
|
|
|
50 |
Seattle |
Anchorage |
|
|
|
60 |
New York |
Boston |
|
|
|
Finally, we project the columns F1.origin
and F2.flno
to get this:
origin |
flno |
---|---|
Chicago |
10 |
Chicago |
20 |
Los Angeles |
30 |
Los Angeles |
|
Seattle |
|
New York |
|
But this isn’t what we wanted! Los Angeles should not appear twice: it should appear once, since flight number 30 gets us to New York. (We don’t care about flight 40, the flight that departs in Los Angeles for Seattle.)
We can modify our query to exclude the rows where F2.flno
is NULL
when
there’s a flight to New York from that origin already:
SELECT DISTINCT F1.origin, F2.flno FROM Flights AS F1 LEFT OUTER JOIN Flights AS F2 ON F1.flno = F2.flno AND F2.dest = 'New York' WHERE NOT (F2.flno IS NULL AND F1.origin IN (SELECT origin FROM Flights WHERE dest = 'New York'));
With this query, we no longer have the row for flight 40.
Consider the following relational schemas, which relate to a database containing information for a hospital.
Given the above schema write the following SQL queries:
Find the schedule and number of appointments for each doctor on April 1, 2013.
Specifically, display the name of the doctor followed by the earliest appointment of the day for that doctor,
the latest appointment time of the day, and the total number of appointments
that day for each doctor who has at least one appointment on April 1, 2013.
SELECT S.name, MIN(A.appt_time), MAX(A.appt_time), COUNT(*) FROM Staff S, Appointment A WHERE S.id = A.staff_id AND A.appt_date = '2013-04-01' GROUP BY S.id, S.name;
Find the names of all staff members who do not have an appointment on the afternoon of March 6, 2013.
SELECT name FROM Staff WHERE id NOT IN (SELECT staff_id FROM Appointment WHERE appt_date = '2013-03-06' AND appt_time >= '12:00:00');
Find schedules of all staff members for the afternoon of March 6, 2013. The result set should include the names of each staff member, and the times of all appointments they have on March 6, 2013. A given name may appear more than once if that staff member has more than one appointment. Make sure to include those staff members that happen to have NO appointments on that date.
SELECT s.name, a.appt_time FROM Staff as s LEFT OUTER JOIN Appointment as a ON s.id = a.staff_id WHERE a.appt_date = '2013-03-06' and a.appt_time >= '12:00:00';
What are the advantages of fixed-length records? What are the disadvantages?
NULL
entries, or entries that
require fewer bytes than the specified field size).
Why are variable-length records advantageous?
NULL
entries, and variable-length record fields only require
as much space as is necessary for the particular field value in that record.
A record contains the following fields from our Movie relation from Problem Set 1:
id
, a fixed-length string of 7 charactersname
, a variable-length string of up to 30 charactersearnings_rank
, an integerruntime
, an integerAssume that we are using the record format discussed in lecture in which each record begins with a header of field offsets. What will the record look like for the following tuple? Assume that you have one-byte characters and four-byte integers.
('5678463', 'Philadephia', NULL, 125)
Note that we use a special offset -1 for the NULL
value, and that we also
include the offset of the end of the record, so that we can compute the
length of the last field. In this case, because the NULL
value is for an
integer, we could theoretically use an offset of 39 for it instead. Then,
when we subtracted and discovered that the length of the integer is 0, we
would know that the value is NULL
. However, that approach does not work
for string values, since an empty string (i.e., a string of length 0) is
not equivalent to a NULL
value for a string field.
Last updated on September 26, 2024.