Advogato: Lockfiles and queuing

Posted 27 Sep 2001 at 18:51 UTC by bbense

With the advent of fast multi-cpu machines, I've noticed that some software is failing to get file locks. Would some kind of queuing help?

he problem came to my attention in the context of a "procmail-like" local email delivery program called "deliver". On a busy multi-cpu machine, it's standard lockfile algorithm can fail to ever get the lock.

In pseudo-code it looks like this:

while (  i < NUM_TRYS )  {
	if( get_lock( foo.lock ) ) 
	        break ; 
	else { 
	         i++ ;   
	         sleep(A_LITTLE_WHILE) ;
	} 
}

The standard way to "fix" this when it starts having problems is to either increase NUM_TRYS or add an expontential backoff to the sleep time. The expontential backoff is probably the most robust solution of the two, since eventually the email load/speed will overwhelm the NUM_TRYS approach.

What occurred to me is use a kind of queueing to limit the number of processes attempting to get the same lock. It's a bit more complicated, but might improve the throughput.

Here's the pseudo-code:

if (get_lock ( foo.lock) )
	break ; 
else { 
	i = 0 ; 
	while(  i < MAX_LOCK_QUEUE_LENGTH ) {   
		if( get_lock( foo.lock.i ) ) 
	        	break ; 
		else 
	           i++ ; 
	} 
	/* Now we have a lock on foo.lock.i */
	iplus = i ;
	i-- ;  
	
	while ( i > 0 ) { 
		if ( get_lock( foo.lock.i) { 
		     release_lock( foo.lock.iplus) ;
		     i--; iplus-- ;  
	        } else { 
		     sleep(A_LITTLE_WHILE); 
		} 
	}
	/* Now we should be the only one attempting to 
	   lock foo.lock since we have foo.lock.0 . 
	   We'll use the standard algorithm in case
	   of other programs attempting to lock 
	   the file
         */ 
	
	while (  j < NUM_TRYS )  {
		if( get_lock( foo.lock ) )
	                release_lock( foo.lock.0 );  
	        	break ; 
		else { 
	        	 j++ ;   
	         	sleep(A_LITTLE_WHILE) ;
		} 

       }

Obviously, the real code would need to be a bit more complicated to handle various error conditions. But this should present the basic idea. The advantage I hope to gain is that at most two or three processes would be competing for the same lock at a given instant. Questions:

1. Have I missed something obvious?

2. Is it worth doing?

3. Has somebody done this already?

IPC?, posted 27 Sep 2001 at 23:09 UTC by Malx »

Isn't it reimplementation of semaphores? (SYSV originated?)
Have you seen also messages and ...forgot 3rd?

Humm..., posted 28 Sep 2001 at 09:15 UTC by freetype »

I'm really sorry to say that but your code is flawed. Basically, your final loop:

while (  j < NUM_TRYS )  {
	if( get_lock( foo.lock ) )
                release_lock( foo.lock.0 );  
        	break ; 
	else { 
        	 j++ ;   
         	sleep(A_LITTLE_WHILE) ;
	}

suffers from exactly the same problems than your original algorithm (i.e. foo.lock can be released and grabbed by another task while all your queued tasks are in "sleep"). Even when adding a queue, you cannot guarantee that under any schedule a task will always be able to get the lock if you only use lock/unlock/sleep !!

Generally speaking, the only way to ensure that a task will get a lock that it requested is to use specific IPC objects that are supported by the operating system. These generally provide their own task queue as well. Examples are semaphores, mutexes, events, conditional variables, monitors, etc..

You'll probably find lots of literature on the subject in any good CS book teaching multi-tasking. Note that your original pseudo-code is a variant of a special IPC object named a 'spinlock', which is normally used to perform very fast reads or updates to protected data or resources. They are however not suitable when you need to hold a lock for a long period of time though

A solved problem, posted 28 Sep 2001 at 20:36 UTC by raph »

Dijkstra solved this problem in the '60s. Here are a couple of citations:

Appendix of The Structure of "THE"-Multiprogramming System, originally published May 1968, reprinted as a Communications of the ACM Classic of the Month, March 1996.

Typewritten manuscript of "The structure of the 'THE'-multiprogramming system"

I saw this quip recently in another context, but it seems to apply equally well here: "those who have studied history are condemned to stand by and watch others repeat it."

I'm really sorry to say that but your code is flawed. Basically, your final loop:
while ( j < NUM_TRYS ) { if( get_lock( foo.lock ) ) release_lock( foo.lock.0 ); break ; else { j++ ; sleep(A_LITTLE_WHILE) ; }
suffers from exactly the same problems than your original algorithm (i.e. foo.lock can be released and grabbed by another task while all your queued tasks are in "sleep"). Even when adding a queue, you cannot guarantee that under any schedule a task will always be able to get the lock if you only use lock/unlock/sleep !!

- I'm not trying for CS theory perfection, I'm shooting for something better than the original code. In this case you have to use the lock/unlock/sleep in the final step, because that's what the other programs use. This works "well enough" in the context of 2 or 3 processes. It can fail badly when you get 20 or 30 messages in a minute. Expontential backoff will "fix" this but since there are usually at least 1 big sendmail process waiting for the deliver process to acquire the lock, the less time sleeping the better. I guess the only way to know for sure is to write the code and do the benchmarks.

- I'm looking for a way to improve the odds, not guarantee certainty. If only 2 or 3 process are competing for the lock and you make 20 tries, the odds that you will eventually get the lock are at worst ( 1- ( .6666)^20) = 0.9997. However, if 20 processes are competing it's ( 1- ( 0.95)^20 ) = 0.64.

- Well, semaphores would be ideal, but are they portable without significant pain? Is there an open src app that uses then across many OS's? (i.e. is there any OS that doesn't support POSIX 4 ?) You'd still have to have that final loop though.

- Well, when I was thinking it up, I was sure I had seen it somewhere before, but not in the context of file locking. The process only needs to hold the lock long enough to append a single message to mail file. I guess the real problem is that unix mail files are showing their age. Using a dir-based mail folder would make this problem much simpler.

Neat -- but..., posted 28 Sep 2001 at 20:44 UTC by nmw »

Actually, this is a pretty cool algorythm, depending only on open(O_CREAT|O_EXCL).

Pretty easy to code, particularly in shell. The IPC mechanisms others are suggesting are not easily accessible via shell scripts...

I do see a problem though: this algorythm forces FIFO locking order, but does not really reduce sleep/wake cycles -- it probably adds such cycles. And there's no exponential backoff or anything like that.

One way to get around this might be to use pipes (IPC) as lock files, and, when waiting, wait by reading from the pipe of the lock you're waiting for (when unlocking writing to the lock your releasing). Gotta be careful about atomicity -- you don't want to lose wakeups ... or get deadlocked...

Besides the comments about not fixing the problem right, I also think that at least in the case of local mail delivery you are trying to fix the wrong problem. :-) If you're really too busy trying to deliver mail, you want to be able to back it off. Say with qmail you can just return a "soft failure" error code and it'll be retried later (when the system is less busy).

If you often have trouble optaining a lock, the problem is probably not the lock mechamism, but that each action requires the lock for too long (for mail look at maildir or something like that, for databases look at something that can do finer grained locking).

- ask

- Well, I guess I worded my question poorly. I didn't make the constraints of the problem clear enough. What I wanted to ask was

Is there a way to minimize contention of standard file locking algorithms without changing the underlying locking primitives?

- I want to make my wood bridge stronger, not tear it down and replace it with steel. I realize that this is not always the best strategy, but given that I don't want to spend my time porting and rewriting all the mail clients that use the Unix mbox format that's the constraints of the problem.

- I'll just up num_trys and worry about it later... I thought this was a neat enough idea to share, but apparently it's not. Oh well, it's less embarrassing than the time I thought I had invented quaternions.

- Booker C. Bense

> Is there a way to minimize contention of standard file locking
> algorithms without changing the underlying locking primitives?

I believe it's impossible to ensure fair access to the file with a non-blocking "lock" primitive. There exist several efficient non-blocking synchronisation algorithms, but they all rely on the availability of atomic operations, like "compare_and_set", which affect global _values_ accessible to all competitors. And this is not possible in the context of file locks.

If you don't control all competitors (i.e. if some of them are programs that you didn't write), you won't be able to get rid of the problem you describe (and using an exponential backoff in one of the programs will not fix it).

On the other hand, if all "frequent" competitors are instances of a program you control (e.g. that you wrote or can modify), it's certainly possible to synchronize them before they access the lock. For example by using a global mutex.

(by the way, some OSes do not support pthread, but none of them is a Unix-variant, and I highly doubt that the mbox format is going to be used elsewhere, so don't hesitate to use that..)

Another way is to find some "clever" algorithm to change the odds of contention, but I'd be extremely cautious with this approach, for a number of reasons:

first, you need to be sure that your algorithm is correct, and won't generate an infinite loop under certain circumstances. that's not so easy, unfortunately :-(
it's generally a bad idea to use statistics to justify certain algorithms, because thread scheduling isn't as random as it's supposed to be. It's not unusual to see some "patterns" appear on some machines or contexts, and not others. This can make debugging extremely difficult (and explains why correctness is very important for multi-tasking programming).

Also, don't forget that the time you'll spend on designing, testing and tuning your algorithm might be significant, especially when you have the option of using global mutexes which are pretty straightforward to use.

But, if you want to go this route, I'd advise you to look at the following site, and make a few document searches, there are tons of information here if you take the time to explore it:

http://citeseer.nj.nec.com/cs

>> I want to make my wood bridge stronger, not tear it down and
replace it with steel. I realize that this is not always the best
strategy, but given that I don't want to spend my time porting and
rewriting all the mail clients that use the Unix mbox format that's
the constraints of the problem.

Try to identify who's competing the most frequently for the file lock. I don't think that mbox mail clients lock the file when a user reads e-mail (but please let me know if I'm wrong). Most probably, the delivery agent(s) is/are the programs to "optimize" or "fix" for this specific problem.

Otherwise, what are your options at choosing a different mailbox format (e.g. Maildir ??)

>> I'll just up num_trys and worry about it later... I thought this
was a neat enough idea to share, but apparently it's not. Oh well,
it's less embarrassing than the time I thought I had invented quaternions.

Please don't be embarassed, synchronisation is subtler than programmers generally believe and there's nothing wrong with learning new things. Your post is worthwhile because it is also a good way to educate some of Advogato's readers on the topic.

Besides, there is still a chance that you come up with a great algorithm for what you need after learning a bit more on the subject, and we'll still be glad to share it here !!.