Section 6

Locking

Consider a database with objects X and Y. Transaction T1 reads objects X and Y and then writes object X. Transaction T2 reads objects X and Y and then writes objects X and Y.

Give an example of a schedule that involves one of the transactions reading dirty data.

Here’s one possible schedule:

T1 T2

r(X)

r(Y)

w(X)

r(X)

w(Y)

commit

r(Y)

w(X)

commit

T1’s read of X is a dirty read because the the value of X was changed by T2, which has not yet committed. Note that T1’s read of Y is not dirty, because it occurs after T2 commits.
Would two-phase locking prevent the dirty read from occurring? If so, explain why. If not, modify the schedule so that it uses two-phase locking but still includes a dirty read.

No, regular two-phase locking would not necessarily prevent the dirty read. Here’s a schedule that demonstrates this fact:

T1 T2

xl(X); r(X)

xl(Y); r(Y)

w(X); u(X)

xl(X); r(X)

w(Y); u(Y)

commit

sl(Y); r(Y)

w(X)

u(X); u(Y)

commit

Note that both T1 and T2 employ two-phase locking: once they begin releasing locks, they don’t acquire any additional locks. However, because T2 releases its exclusive lock on X before it commits, T1 is able to read X after T2 has written it and before T2 has committed, which constitutes a dirty read.
How can we use locking to prevent dirty reads? Explain how your proposed approach would prevent the dirty read in the schedule from problem 1.

We can use strict locking, which requires that transactions hold all exclusive locks until they complete. Under strict locking, T2 would hold its exclusive lock on X until it committed. Thus, T1 would not be able to acquire the lock needed to read X until after T2 commits. Thus, it would not be able to perform the dirty read.
Does the approach that you specified in your answer to problem 3 replace two-phase locking (2PL), or do we still need 2PL as well? Explain your answer.

No, strict locking does not replace 2PL; it augments it. We still need 2PL, because strict locking only governs the exclusive locks, and we still need to require that transactions do not acquire any shared locks after they have begun releasing them. For example, here is an example of a problematic schedule that uses strict locking but does not use 2PL:

T1 T2

xl(X); r(X)

sl(Y); r(Y)

w(X)

sl(Y); r(Y)

u(Y)

xl(Y); w(Y)

u(X); u(Y); commit

xl(X); r(X)

w(X)

u(X); commit

Note that both T1 and T2 employ strict locking: they hold their exclusive locks until they commit. However, T1 does not employ 2PL: it acquires a lock for X after having released a lock for Y. The resulting schedule is not serializable. To see this, note that T2 writes both X and Y. T1 reads the value of X written by T2, but it reads the prior value of Y. Thus, this transaction is equivalent to neither of the two possible serial orderings.

T1	T2
	r(X)
	r(Y)
	w(X)
r(X)
	w(Y)
	commit
r(Y)
w(X)
commit

T1	T2
	xl(X); r(X)
	xl(Y); r(Y)
	w(X); u(X)
xl(X); r(X)
	w(Y); u(Y)
	commit
sl(Y); r(Y)
w(X)
u(X); u(Y)
commit

T1	T2
	xl(X); r(X)
	sl(Y); r(Y)
	w(X)
sl(Y); r(Y)
u(Y)
	xl(Y); w(Y)
	u(X); u(Y); commit
xl(X); r(X)
w(X)
u(X); commit

Timestamp-based Concurrency Control

Consider the following schedule involving three transactions T1, T2, and T3, where si indicates the start of transaction Ti:

s1; r1(X); s2; w2(Y); w2(X); s3; r3(Z); w3(Y); c2; c3; w1(Y); w1(Z); c1

If the DBMS uses timestamp-based concurrency control (with the Thomas Write Rule), what would be the response of the system to each of the actions in the schedule? For each action in the schedule, indicate the reaction of the database, whenever relevant.

When a transaction begins, give it a starting timestamp larger than the action preceding it.
When a data item is modified, show its new write timestamp.
When a data item is read, show its new read timestamp. If a read is ignored, make note of this and go to the next action.
Assume without loss of generality that the read and write timestamps of all of the data elements are initially 0.
1. s1: T1 begins and is assigned a timestamp TS(T1) > 0.
2. r1(X): To determine if T1 should be allowed to read X, the system checks if TS(T1) ≥ WTS(X). It is, because WTS(X) is still 0. Thus, the DBMS allows T1 to read X, and it sets RTS(X) = max(RTS(X), TS(T1)) = max(0, TS(T1)) = TS(T1).
3. s2: T2 begins and is assigned a timestamp TS(T2) > TS(T1).
4. w2(Y): To determine if T2 should be allowed to write Y, the system first checks if TS(T2) ≥ RTS(Y). It is, because RTS(Y) is still 0. The system next checks if TS(T2) ≥ WTS(Y). It is, because WTS(Y) is still 0. Thus, the DBMS allows T2 to write Y and it sets WTS(Y) = TS(T2).
5. w2(X): TS(T2) ≥ RTS(X) = TS(T1), since TS(T2) > TS(T1). TS(T2) ≥ WTS(X), because WTS(X) is still 0. Thus, the DBMS allows T2 to write X, and it sets WTS(X) = TS(T2).
6. s3: T3 begins and is assigned a timestamp TS(T3) > TS(T2) > TS(T1).
7. r3(Z): TS(T3) ≥ WTS(Z), since WTS(Z) = 0. Thus, the DBMS allows T3 to read Z, and it sets RTS(Z) = max(RTS(Z), TS(T3)) = max(0, TS(T3)) = TS(T3).
8. w3(Y): TS(T3) ≥ RTS(Y), since RTS(Y) = 0. TS(T3) ≥ WTS(Y) = TS(T2), because TS(T3) > TS(T2). Thus, T3 is allowed to write, and it becomes the most recent writer of Y (WTS(Y) = TS(T3)).
9. c2: T2 commits.
10. c3: T3 commits.
11. w1(Y): TS(T1) ≥ RTS(Y) because RTS(Y) is still 0. But TS(T1) < WTS(Y) = TS(T3). Thus, the write cannot be allowed, because the current value of Y was written by T3, which comes after T1 in the equivalent serial ordering based on the timestamps. Without the Thomas Write Rule, the DBMS would reject T1’s request and abort T1. With the Thomas Write Rule, it ignores the request—treating T1’s write as if if it occurred before the write by T3 and was overwritten.
12. w1(Z): TS(T1) < RTS(Z) = TS(T3). T3 was the last reader of Z. It should have read the value that T1 is about to write, because T3 comes after T1 in the equivalent serial schedule. Thus, this write cannot be allowed. The system aborts T1 and restarts it sometime later with a new timestamp.
13. c1: T1 has been aborted, so we disregard c1 in this exercise.

Multiversion Concurrency Control

Consider the following schedule involving three transactions T1, T2, and T3, where si indicates the start of transaction Ti:

s1; r1(X); s2; w2(Y); w2(X); s3; r3(Z); w3(Y); c2; c3; w1(Y); w1(Z); c1

Consider the sequence of actions above, with multiversion concurrency control. You may assume that the read and write timestamps of all data items are initialized to zero, and all commit bits are initially set to true.

Why can we no longer use the Thomas Write Rule to ignore writes from transactions whose timestamps are less than the write timestamp of the object it wants to write?
What would the system’s responses be to each of these actions? Illustrate the reaction of the database, and remember create a new version of a data item whenever necessary.
To make it easier to distinguish between the versions, we will assign concrete timestamps to the transactions, based on the order in which they start.
1. s1: T1 begins and is assigned a timestamp TS(T1) = 1.
2. r1(X): No change, except the first version X(0) of X is retrieved, and RTS(X(0)) is updated to TS(T1) = 1.
3. s2: T2 begins and is assigned a timestamp TS(T2) = 2.
4. w2(Y): T2 is allowed to write, and a new copy Y(2) of Y is created with RTS = 0, WTS = TS(T2) = 2, and commit = false.
5. w2(X): T2 is allowed to write, and a new copy X(2) of X is created with RTS = 0, WTS = TS(T2) = 2, and commit = false.
6. s3: T3 begins and is assigned a timestamp TS(T3) = 3.
7. r3(Z): TS(T3) ≥ WTS(Z(0)), since WTS(Z(0)) = 0. So T3 reads Z(0), and we set RTS(Z(0)) = max(RTS(Z(0)), TS(T3)) = TS(T3) = 3.
8. w3(Y): T3 is allowed to write, and a new copy Y(3) of Y is created with RTS = 0, WTS = TS(T3) = 3, and commit = false.
9. c2: T2 commits. Thus, X(2).commit is set to true and Y(2).commit is set to true.
10. c3: T3 commits. Thus, Y(3).commit is set to true.
11. w1(Y): T1 is allowed to write; a new copy Y(1) is created with RTS = 0, WTS = TS(T1) = 1, and commit = false.
12. w1(Z): TS(T1) < RTS(Z(0)), so the write is denied as before.