Feature #16965

Updated by stanhu (Stan Hu) 3 months ago

In recent Red Hat kernels (for example: RHEL 7.8 using kernel 3.10.0-1127.8.2.el7.x86_64), `copy_file_range()` can fail when Ruby calls it from `IO.copy_stream`: it may return `EOPNOTSUPP` on an NFS mount, or `ENOSYS` where the system call has been disabled. A simple `FileUtils.copy_file` then fails with `Operation not supported - copy_file_range` on these kernels.


  

 [This sample program](https://gitlab.com/kevenhughes/arcport/-/blob/master/cfr.c) reproduces the `ENOSYS` error when run under `strace`:

 ``` 
 $ strace ./a.out  
 execve("./a.out", ["./a.out"], 0x7ffde8ca0680 /* 38 vars */) = 0 
 brk(NULL)                                 = 0x24bc000 
 brk(0x24bd1c0)                            = 0x24bd1c0 
 arch_prctl(ARCH_SET_FS, 0x24bc880)        = 0 
 uname({sysname="Linux", nodename="xxx", ...}) = 0 
 readlink("/proc/self/exe", "/home/xxx/tmp/a.out", 4096) = 21 
 brk(0x24de1c0)                            = 0x24de1c0 
 brk(0x24df000)                            = 0x24df000 
 access("/etc/ld.so.nohwcap", F_OK)        = -1 ENOENT (No such file or directory) 
 openat(AT_FDCWD, "./1", O_RDONLY)         = 3 
 openat(AT_FDCWD, "./2", O_WRONLY)         = 4 
 copy_file_range(3, NULL, 4, NULL, 1, 0) = -1 ENOSYS (Function not implemented) 
 fstat(3, {st_mode=S_IFREG|0664, st_size=0, ...}) = 0 
 fstat(4, {st_mode=S_IFREG|0664, st_size=0, ...}) = 0 
 fcntl(4, F_GETFL)                         = 0x8001 (flags O_WRONLY|O_LARGEFILE) 
 read(3, "", 1)                            = 0 
 exit_group(0)                             = ? 
 +++ exited with 0 +++ 
 ``` 

 The behavior possibly changed in a recent security release: https://access.redhat.com/errata/RHSA-2020:1465

 Ruby's io.c detects whether `copy_file_range()` is defined, not whether it is actually supported. The following test program illustrates the hole in the detection mechanism: 


 ``` 
 #include <syscall.h> 
 #include <stdio.h> 

 #if defined __linux__ && defined __NR_copy_file_range 
 #    define USE_COPY_FILE_RANGE 1 
 #else 
 #    define USE_COPY_FILE_RANGE 0 
 #endif 

 int main() 
 { 
   printf("copy_file_range? %d\n", USE_COPY_FILE_RANGE); 
 } 
 ``` 

 `USE_COPY_FILE_RANGE` is set to 1 even when the system call does not succeed at run time.
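 To make the gap concrete, here is a minimal sketch of a run-time probe. The helper name `kernel_has_copy_file_range` is hypothetical and is not part of io.c. It assumes that a kernel which implements the call rejects invalid descriptors with `EBADF`, while a kernel without it fails with `ENOSYS`. It would not catch a per-filesystem `EOPNOTSUPP` such as the NFS case.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper (not part of io.c).
 * Returns  1 if the running kernel appears to implement copy_file_range(),
 *          0 if the call fails with ENOSYS,
 *         -1 if the syscall number is not defined at build time. */
int
kernel_has_copy_file_range(void)
{
#if defined __linux__ && defined __NR_copy_file_range
    /* Pass invalid descriptors on purpose: nothing is copied. An
     * implemented syscall is expected to fail with EBADF here, a
     * missing or disabled one with ENOSYS. */
    long ret = syscall(__NR_copy_file_range, -1, NULL, -1, NULL, (size_t)0, 0U);

    if (ret == -1 && errno == ENOSYS)
        return 0;
    return 1;
#else
    return -1;
#endif
}
```

 On a kernel like the one in the trace above, this would presumably return 0 even though `USE_COPY_FILE_RANGE` is 1.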

 I suggest a few improvements: 

 1. Use a compile-time test to verify that `copy_file_range()` can actually be executed.  
 2. Make it possible to disable `USE_COPY_FILE_RANGE` via a build option. Since the test in item 1 could still pass when the build runs on a Docker host whose kernel supports `copy_file_range()`, it would be helpful to be able to disable it manually.
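
 As a rough sketch of suggestion 2, the macro could be derived only when the builder has not already set it, so that passing `-DUSE_COPY_FILE_RANGE=0` to the compiler wins. The helper `copy_chunk` below is hypothetical and not taken from io.c. As an additional assumption beyond the two suggestions, it also falls back to `read`/`write` when the call reports `ENOSYS` or `EOPNOTSUPP`, which would cover a binary built on a host that supports the call and run on one that does not.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>
#ifdef __linux__
#  include <sys/syscall.h>
#endif

/* Derive the default only if the builder did not set the macro, so that
 * -DUSE_COPY_FILE_RANGE=0 on the compiler command line takes precedence. */
#ifndef USE_COPY_FILE_RANGE
#  if defined __linux__ && defined __NR_copy_file_range
#    define USE_COPY_FILE_RANGE 1
#  else
#    define USE_COPY_FILE_RANGE 0
#  endif
#endif

/* Hypothetical helper: copy up to len bytes from src_fd to dst_fd at
 * their current offsets. Returns the number of bytes copied (0 at end of
 * file), or -1 with errno set. */
ssize_t
copy_chunk(int src_fd, int dst_fd, size_t len)
{
    char buf[8192];
    ssize_t n, off = 0;

#if USE_COPY_FILE_RANGE
    long ret = syscall(__NR_copy_file_range, src_fd, NULL, dst_fd, NULL, len, 0U);

    if (ret >= 0)
        return (ssize_t)ret;
    /* Only the two errors from this report trigger the fallback. */
    if (errno != ENOSYS && errno != EOPNOTSUPP)
        return -1;
#endif

    /* Fallback path: plain read/write through a bounded buffer. */
    if (len > sizeof(buf))
        len = sizeof(buf);
    n = read(src_fd, buf, len);
    if (n <= 0)
        return n;
    while (off < n) {
        ssize_t w = write(dst_fd, buf + off, (size_t)(n - off));
        if (w < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        off += w;
    }
    return n;
}
```

 A caller would loop on `copy_chunk` until it returns 0. Building with `-DUSE_COPY_FILE_RANGE=0` compiles out the syscall entirely and leaves only the `read`/`write` path.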

 Reported by GitLab customers: https://gitlab.com/gitlab-org/gitlab/-/issues/218999 
