Hand-Over-Hand SRCU Use Case, Second Try?

Again, the repaired code fragment is as follows:

  struct foo {
    struct list_head list;
    ...
  };

  LIST_HEAD(mylist);
  /* Assumed to be initialized elsewhere, for example with init_srcu_struct(). */
  struct srcu_struct mysrcu;

  void process(void)
  {
    int i1, i2;  /* indexes returned by srcu_read_lock() */
    struct foo *p;

    i1 = srcu_read_lock(&mysrcu);
    p = list_entry_rcu(mylist.next, struct foo, list);
    while (&p->list != &mylist) {
      do_something_with(p);
      /* Enter a new critical section before leaving the old one. */
      i2 = srcu_read_lock(&mysrcu);
      p = list_entry_rcu(p->list.next, struct foo, list);
      srcu_read_unlock(&mysrcu, i1);
      i1 = i2;
    }
    srcu_read_unlock(&mysrcu, i1);
  }

And again, as is customary with SRCU, the list is manipulated using list_add_rcu(), list_del_rcu(), and friends.

And yet again, what are the advantages and disadvantages of this hand-over-hand SRCU list traversal?

The biggest disadvantage is still that it is totally broken. To see this, consider a four-element list where the second and third elements are deleted in quick succession, and where a reader fetched a pointer to the second element just before it was removed from the list:

[Figure: Delete B and C from list A,B,C,D]

Row (a) of this figure shows the reader just having fetched a pointer to element B. Row (b) shows the state just after element B has been removed and the corresponding SRCU grace period started, denoted by the dashed lines and pink color. The reader's reference to element B is legitimate, as the SRCU grace period cannot end until the reader exits its SRCU read-side critical section.

Finally, row (c) shows the state just after element C has been removed and the corresponding SRCU grace period started. The question now is “What happens when the reader follows the ->next pointer out of element B to element C?”

And the answer is “trouble”, and lots of it. The reader starts a new SRCU read-side critical section, advances to element C, and then exits the old critical section. Unfortunately, the new critical section began after element C's SRCU grace period started, so that grace period need not wait for it. Once the reader exits the old critical section, nothing prevents element C from being freed (and, worse, reallocated as something else) before the reader has finished referencing it.
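The ordering argument can be replayed in a small userspace toy model. This is only a sketch under stated assumptions: it is not how SRCU is implemented, every toy_* name is hypothetical, and logical timestamps stand in for real SRCU state. The only rule it models is that a grace period must wait for read-side critical sections that began before it, and need not wait for any that began later.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: all toy_* names are hypothetical, not kernel APIs. */
#define TOY_MAX_READERS 8

struct toy_srcu {
	int now;                       /* logical clock, bumped on every event */
	int start[TOY_MAX_READERS];    /* start time of each critical section */
	bool active[TOY_MAX_READERS];  /* true while that section is still open */
	int nreaders;                  /* number of sections begun so far */
};

/* Begin a read-side critical section and return its index. */
static int toy_read_lock(struct toy_srcu *s)
{
	assert(s->nreaders < TOY_MAX_READERS);
	int idx = s->nreaders++;

	s->start[idx] = ++s->now;
	s->active[idx] = true;
	return idx;
}

/* End the read-side critical section identified by idx. */
static void toy_read_unlock(struct toy_srcu *s, int idx)
{
	s->now++;
	s->active[idx] = false;
}

/* Model removing an element: return the time its grace period starts. */
static int toy_start_gp(struct toy_srcu *s)
{
	return ++s->now;
}

/*
 * A grace period is permitted to end once no critical section that
 * began before it is still open.  Later sections are not waited for.
 */
static bool toy_gp_may_end(const struct toy_srcu *s, int gp_start)
{
	for (int i = 0; i < s->nreaders; i++)
		if (s->active[i] && s->start[i] < gp_start)
			return false;
	return true;
}

/*
 * Replay the hand-over-hand scenario.  Returns true if element C's
 * grace period may end while the reader still holds a pointer to C.
 */
static bool toy_replay_hand_over_hand(void)
{
	struct toy_srcu s = { 0 };
	int i1, i2, gp_c;
	bool c_may_be_freed;

	i1 = toy_read_lock(&s);    /* reader fetches pointer to B */
	(void)toy_start_gp(&s);    /* B removed, its grace period starts */
	gp_c = toy_start_gp(&s);   /* C removed, its grace period starts */
	i2 = toy_read_lock(&s);    /* new section begins too late for gp_c */
	/* The reader advances from B to C at this point. */
	toy_read_unlock(&s, i1);   /* old section, the only protection, ends */
	c_may_be_freed = toy_gp_may_end(&s, gp_c);
	toy_read_unlock(&s, i2);
	return c_may_be_freed;
}
```

In this model, toy_replay_hand_over_hand() returns true: after the old critical section ends, the only section still open began after element C's grace period started, so nothing holds that grace period open while the reader is still using element C.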

Now, it is possible to fix this by making deletion find all pointers to the deleted element and change them to point to some element that is still in the list, but this is left as an exercise for the reader. Oh, and don't forget to test the case of removing all of the elements in the list in quick succession.

In short, overlapping SRCU read-side critical sections should not be used without very careful thought. Can you come up with a valid use case?