Issues of Rust’s remove_dir_all implementation on Windows

I am recently working on a CLI tool to manage and distribute CTF problems. While I was implementing the remove repository operation, I got an unexpected Access is denied. (os error 5) message on stable-x86_64-pc-windows-msvc toolchain (Rust 1.31.1).

fs::remove_dir_all(...) in this code was emitting the error.

let mut repo_index = env.data_dir().read_repo_index()?;

let image_list = runtime.block_on(docker::list_images(env))?;
if docker::image_from_repo_exists(&image_list, repo_name) {
    Err(SomaError::RepositoryInUseError)?;
}

let repository = repo_index
    .remove(repo_name)
    .ok_or(SomaError::RepositoryNotFoundError)?;
env.data_dir().write_repo_index(repo_index)?;

remove_dir_all(repository.local_path())?;

env.printer().write_line(&format!(
    "Successfully removed repository: '{}'.",
    &repo_name
));

Ok(())

I first ensured that no program is using files inside the directory. Then, I started searching about the issue. Surprisingly, it was a long-standing issue in the standard library from 2015 (#29497).

According to @pitdicker’s comment, problems with the current remove_dir_all implementation on Windows are:

  • cannot remove contents if the path becomes longer than MAX_PATH
  • files may not be deleted immediately, causing remove_dir to fail
  • unable to remove read-only files

Mine was the third case. .git/objects/pack contained files with read-only attributes, which caused the denial of the access. This behavior was surprising because I had no problem deleting the directory with File Explorer or on Linux. Apparently, this is the default behavior of Windows API, and Python had a similar issue. I agree to Tim Golden’s comment which says that “this, unfortunately, is the classic edge-case where intra-platform consistency and inter-platform consistency clash,” but I hope to have an easy fix in Rust like Python’s onerror argument instead of manually writing a directory recursion with permission handling.

The second problem is also noteworthy. The core reason for it is that unlike POSIX API, Windows file deletion API does not delete the file immediately but mark it for “delete later.” Therefore, even though DeleteFile call returns, it is not guaranteed that the file is actually deleted from the file system. Racing the File System talk in CppCon2015 mentions how to wrongly delete a directory tree on Windows. Unfortunately, this is the way how Rust’s remove_dir_all is implemented.

Slide from “Racing the File System”

As a result, the issue is causing spurious failures in rustup (#995). Also, tempdir and Maskerad implemented their version of remove_dir_all to bypass this problem. There was a PR (#31944) to fix this problem, but it was not merged to the upstream because of the difficulty of defining reasonable cross-platform behavior for Windows and Linux, the complexity of permission handling, and the inactivity from the original author.

The best solution, for now, seems using remove_dir_all crate which is based on PR #31944. I understand that it is hard to define the reasonable behavior for this kind of operations especially for cross-platform projects, but at least I could have saved much time if these edge cases were listed in the official documentation.

댓글 남기기