Things UNIX can do atomically 2010/01/06
This is a catalog of things UNIX-like/POSIX-compliant operating systems can do atomically, making them useful as building blocks for thread-safe and multi-process-safe programs without mutexes or read/write locks. The list is by no means exhaustive and I expect it to be updated frequently for the foreseeable future.
The philosophy here is to let the kernel do as much work as possible. At my most pessimistic, I trust the kernel developers more than a trust myself. More practically, it’s stupid to spend CPU time locking around an operation that’s already atomic. Added 2010-01-07.
Operating on a pathname
The operations below are best left to local filesystems. More than a few people have written in crying foul if any of these techniques are used on an NFS mount. True. When there are multiple kernels involved, the kernel can’t very well take care of all the locking for us. Added 2010-01-06.
mv -T <oldsymlink> <newsymlink>
atomically changes the target of<newsymlink>
to the directory pointed to by<oldsymlink>
and is indispensable when deploying new code. Updated 2010-01-06: both operands are symlinks. (So this isn’t a system call, it’s still useful.)A reader pointed out thatDeleted 2010-01-06:ln -Tfs <directory> <symlink>
accomplishes the same thing without the second symlink. Added 2010-01-06.strace(1)
shows thatln -Tfs <directory> <symlink>
actually callssymlink(2)
,unlink(2)
, andsymlink(2)
once more, disqualifying it from this page.mv -T <oldsymlink> <newsymlink>
ends up callingrename(2)
which can atomically replace<newsymlink>
. Caveat 2013-01-07: this does not apply to Mac OS X, whosemv(1)
doesn’t callrename(2)
.mv(1)
.link(oldpath, newpath)
creates a new hard link callednewpath
pointing to the same inode asoldpath
and increases the link count by one. This will fail with the error codeEEXIST
ifnewpath
already exists, making this a useful mechanism for locking a file amongst threads or processes that can all agree upon the namenewpath
. I prefer this technique for whole-file locking because the lock is visible tols(1)
.link(2)
.symlink(oldpath, newpath)
operates very much likelink(2)
but creates a symbolic link at a new inode rather than a hard link to the same inode. Symbolic links can point to directories, which hard links cannot, making them a perfect analogy tolink(2)
when locking entire directories. This will fail with the error codeEEXIST
ifnewpath
already exists, making this a perfect analogy tolink(2)
that works for directories, too. Be careful of symbolic links whose target inode has been removed ("dangling" symbolic links) —open(2)
will fail with the error codeENOENT
. It should be mentioned that inodes are a finite resource (this particular machine has 1,245,184 inodes).symlink(2)
. Added 2010-01-07rename(oldpath, newpath)
can change a pathname atomically, providedoldpath
andnewpath
are on the same filesystem. This will fail with the error codeENOENT
ifoldpath
does not exist, enabling interprocess locking much likelink(oldpath, newpath)
above. I find this technique more natural when the files in question will beunlink
ed later.rename(2)
.open(pathname, O_CREAT | O_EXCL, 0644)
creates and opens a new file. (Don’t forget to set the mode in the third argument!)O_EXCL
instructs this to fail with the error codeEEXIST
ifpathname
exists. This is a useful way to decide which process should handle a task: whoever successfully creates the file.open(2)
.mkdir(dirname, 0755)
creates a new directory but fails with the error codeEEXIST
ifdirname
exists. This provides for directories the same mechanismlink(2)
open(2)
withO_EXCL
provides for files.mkdir(2)
. Added 2010-01-06; edited 2013-01-07.
Operating on a file descriptor
fcntl(fd, F_GETLK, &lock)
,fcntl(fd, F_SETLK, &lock)
, andfcntl(fd, F_SETLKW, &lock)
allow cooperating processes to lock regions of a file to serialize their access.lock
is of typestruct flock
and describes the type of lock and the region being locked.F_SETLKW
is particularly useful as it blocks the calling process until the lock is acquired. There is a “mandatory locking” mode but Linux’s implementation is unreliable as it’s subject to a race condition.fcntl(2)
.fcntl(fd, F_GETLEASE)
andfcntl(fd, F_SETLEASE, lease)
ask the kernel to notify the calling process withSIGIO
when another processopen
s ortruncate
s the file referred to byfd
. When that signals arrives, the lease needs to be removed byfcntl(fd, F_SETLEASE, F_UNLCK)
.fcntl(fd, F_NOTIFY, arg)
is similar but doesn’t block other processes, so it isn’t useful for synchronization.fcntl(2)
.mmap(0, length, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0)
returns a pointer from which a file’s contents can be read and written by normal memory operations. By making frequent use ofmsync(addr, length, MS_INVALIDATE)
, data written in this manner can be shared between processes that both map the same file.mmap(2)
,msync(2)
.
Operating on virtual memory
__sync_fetch_and_add
,__sync_add_and_fetch
,__sync_val_compare_and_swap
, and friends provide a full barrier so “no memory operand will be moved across the operation, either forward or backward.” These operations are the basis for most (all?) lock-free algorithms. GCC Atomic Builtins.
Something I should add to my repertoire? Race condition? Let me know at r@rcrowley.org or @rcrowley and I’ll fix it.