Consider a database with objects X and Y. Transaction T1 reads objects X and Y and then writes object X. Transaction T2 reads objects X and Y and then writes objects X and Y.
Give an example of a schedule that involves one of the transactions reading dirty data.
| T1 | T2 |
|---|---|
| r(X) | |
| r(Y) | |
| w(X) | |
| r(X) | |
| w(Y) | |
| commit | |
| r(Y) | |
| w(X) | |
| commit |
T1’s read of X is a dirty read because the the value of X was changed by T2, which has not yet committed. Note that T1’s read of Y is not dirty, because it occurs after T2 commits.
Would two-phase locking prevent the dirty read from occurring? If so, explain why. If not, modify the schedule so that it uses two-phase locking but still includes a dirty read.
| T1 | T2 |
|---|---|
| xl(X); r(X) | |
| xl(Y); r(Y) | |
| w(X); u(X) | |
| xl(X); r(X) | |
| w(Y); u(Y) | |
| commit | |
| sl(Y); r(Y) | |
| w(X) | |
| u(X); u(Y) | |
| commit |
Note that both T1 and T2 employ two-phase locking: once they begin releasing locks, they don’t acquire any additional locks. However, because T2 releases its exclusive lock on X before it commits, T1 is able to read X after T2 has written it and before T2 has committed, which constitutes a dirty read.
How can we use locking to prevent dirty reads? Explain how your proposed approach would prevent the dirty read in the schedule from problem 1.
Does the approach that you specified in your answer to problem 3 replace two-phase locking (2PL), or do we still need 2PL as well? Explain your answer.
| T1 | T2 |
|---|---|
| xl(X); r(X) | |
| sl(Y); r(Y) | |
| w(X) | |
| sl(Y); r(Y) | |
| u(Y) | |
| xl(Y); w(Y) | |
| u(X); u(Y); commit | |
| xl(X); r(X) | |
| w(X) | |
| u(X); commit |
Note that both T1 and T2 employ strict locking: they hold their exclusive locks until they commit. However, T1 does not employ 2PL: it acquires a lock for X after having released a lock for Y. The resulting schedule is not serializable. To see this, note that T2 writes both X and Y. T1 reads the value of X written by T2, but it reads the prior value of Y. Thus, this transaction is equivalent to neither of the two possible serial orderings.
Consider the following schedule involving three transactions T1, T2, and T3, where si indicates the start of transaction Ti:
s1; r1(X); s2; w2(Y); w2(X); s3; r3(Z); w3(Y); c2; c3; w1(Y); w1(Z); c1
If the DBMS uses timestamp-based concurrency control (with the Thomas Write Rule), what would be the response of the system to each of the actions in the schedule? For each action in the schedule, indicate the reaction of the database, whenever relevant.
When a data item is read, show its new read timestamp. If a read is ignored, make note of this and go to the next action.
Consider the following schedule involving three transactions T1, T2, and T3, where si indicates the start of transaction Ti:
s1; r1(X); s2; w2(Y); w2(X); s3; r3(Z); w3(Y); c2; c3; w1(Y); w1(Z); c1
Consider the sequence of actions above, with multiversion concurrency control. You may assume that the read and write timestamps of all data items are initialized to zero, and all commit bits are initially set to true.
Why can we no longer use the Thomas Write Rule to ignore writes from transactions whose timestamps are less than the write timestamp of the object it wants to write?
What would the system’s responses be to each of these actions? Illustrate the reaction of the database, and remember create a new version of a data item whenever necessary.