IO copy API for stream copying #20399
I have done some benchmarking of this on Linux. The results vary between runs because they are highly system dependent, but I can still see some useful results from it:
So this seems good in terms of Linux performance, which probably matters most. We should look at some follow-ups to improve FreeBSD, macOS and other Unix variants. sendfile works a bit differently there for partial writes (I think it blocks, but I made some changes so that it is only used for real sockets, so it should be fine), so it needs more testing, but I think that can be a follow-up PR. I should also note that the pipe check is there for future use: the pipe flag is currently only supported on Windows, but if we supported pipes better on Linux, we could make use of it and that would limit some of the overhead for splice. @devnexen Please could you review it?
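To make the partial-write concern concrete, here is a minimal sketch of a retry loop around the Linux `sendfile` call. It is not the code from this PR: `sketch_sendfile_all` and `sketch_sendfile_demo` are hypothetical names, the signature used is the Linux one (FreeBSD and macOS take different arguments), and it assumes a blocking destination socket.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/sendfile.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not the PR's API): send up to `len` bytes from the
 * current offset of regular file `in_fd` to `out_fd` with Linux sendfile.
 * Returns the number of bytes sent, or -1 if an error occurred before
 * anything was sent (errno is left as set by sendfile). */
static ssize_t sketch_sendfile_all(int out_fd, int in_fd, size_t len)
{
    size_t total = 0;

    assert(out_fd >= 0 && in_fd >= 0);
    while (total < len) {
        /* NULL offset: read from in_fd's file position and advance it. */
        ssize_t n = sendfile(out_fd, in_fd, NULL, len - total);
        if (n < 0) {
            if (errno == EINTR) {
                continue; /* interrupted, retry */
            }
            return total > 0 ? (ssize_t)total : -1;
        }
        if (n == 0) {
            break; /* end of the input file */
        }
        total += (size_t)n; /* partial write: loop for the remainder */
    }
    return (ssize_t)total;
}

/* Demo: push a small temp file through a Unix socket pair.
 * Returns 0 when the bytes arrive intact, -1 otherwise. */
static int sketch_sendfile_demo(void)
{
    static const char msg[] = "hello sendfile";
    const size_t len = sizeof msg - 1;
    char path[] = "/tmp/sketch_sendfile_XXXXXX";
    char buf[64] = {0};
    int sv[2];
    int fd = mkstemp(path);

    if (fd < 0) {
        return -1;
    }
    unlink(path); /* the open descriptor keeps the file alive */
    if (write(fd, msg, len) != (ssize_t)len
        || lseek(fd, 0, SEEK_SET) != 0
        || socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0) {
        close(fd);
        return -1;
    }

    ssize_t sent = sketch_sendfile_all(sv[0], fd, len);
    ssize_t got = sent > 0 ? read(sv[1], buf, sizeof buf - 1) : -1;

    close(fd);
    close(sv[0]);
    close(sv[1]);
    return (sent == (ssize_t)len && got == sent && memcmp(buf, msg, len) == 0) ? 0 : -1;
}
```

The loop treats a short return value as progress rather than failure, which is the behaviour that would need separate testing on the other Unix variants.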
I'll do a deeper dive sometime next week, but here are some quick findings.
This introduces a new API for fd copying and modifies php_stream_copy_to_stream_ex to use it. The implementation is separated for various platforms, and the end result has a couple of implications:

- sendfile is used for copying a file to a generic fd (e.g. a socket) on all platforms except Windows, which uses TransmitFile.
- splice is used for copying between generic fds (e.g. sockets) on Linux.
- copy_file_range should get used on Alpine Linux by invoking the syscall directly (as musl does not seem to implement it).
- copy_file_range is used in a loop, so it is called multiple times for files bigger than 2GB on Linux.
- File mmap for copying is removed, as it allowed crashing PHP when another process modified the mapped file. It was used as a fallback for file copying; sendfile should partially replace it.
- File-to-file copying was optimized on Windows with the use of ReadFile and WriteFile.

Closes phpGH-20399

Co-authored-by: David Carlier <devnexen@gmail.com>
This introduces a new API for fd copying and modifies php_stream_copy_to_stream_ex to use it. The implementation is separated for various platforms, and the end result has a couple of implications:

- sendfile is used for copying a file to a generic fd (e.g. a socket) on all platforms except Windows, which uses TransmitFile.
- splice is used for copying between generic fds (e.g. sockets) on Linux.
- copy_file_range should get used on Alpine Linux by invoking the syscall directly (as musl does not seem to implement it).
- copy_file_range is used in a loop, so it is called multiple times for files bigger than 2GB on Linux.
- File mmap for copying is removed, as it allowed crashing PHP when another process modified the mapped file. It was used as a fallback for file copying; sendfile should partially replace it.
- File-to-file copying was optimized on Windows with the use of ReadFile and WriteFile.

This also adds various tests, including Linux unit tests.

Closes phpGH-20399

Co-authored-by: David Carlier <devnexen@gmail.com>
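For readers unfamiliar with splice: one end of every `splice` call has to be a pipe, so copying between two sockets typically goes through an intermediate pipe. The following is a rough sketch of that pattern, not the implementation in this commit. `sketch_splice_copy` and `sketch_splice_demo` are hypothetical names, and creating a fresh pipe per call is an assumption made only to keep the example short.

```c
#define _GNU_SOURCE /* for splice() and SPLICE_F_MOVE */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not the PR's API): move up to `len` bytes from
 * `in_fd` to `out_fd` through a temporary pipe. Returns the number of bytes
 * moved, or -1 if an error occurred before anything was moved. */
static ssize_t sketch_splice_copy(int in_fd, int out_fd, size_t len)
{
    int p[2];
    size_t total = 0;
    int failed = 0;

    assert(in_fd >= 0 && out_fd >= 0);
    if (pipe(p) < 0) {
        return -1;
    }
    while (total < len && !failed) {
        /* Step 1: source fd -> pipe. */
        ssize_t in = splice(in_fd, NULL, p[1], NULL, len - total, SPLICE_F_MOVE);
        if (in < 0 && errno == EINTR) {
            continue;
        }
        if (in < 0) {
            failed = 1;
        }
        if (in <= 0) {
            break; /* end of input or error */
        }
        /* Step 2: pipe -> destination fd, until the pipe is drained. */
        while (in > 0) {
            ssize_t out = splice(p[0], NULL, out_fd, NULL, (size_t)in, SPLICE_F_MOVE);
            if (out < 0 && errno == EINTR) {
                continue;
            }
            if (out <= 0) {
                failed = 1;
                break;
            }
            in -= out;
            total += (size_t)out;
        }
    }
    int saved_errno = errno; /* close() must not clobber the error code */
    close(p[0]);
    close(p[1]);
    errno = saved_errno;
    return (failed && total == 0) ? -1 : (ssize_t)total;
}

/* Demo: relay a message from one Unix socket pair to another.
 * Returns 0 when the bytes arrive intact, -1 otherwise. */
static int sketch_splice_demo(void)
{
    static const char msg[] = "hello splice";
    const size_t len = sizeof msg - 1;
    char buf[64] = {0};
    int a[2], b[2];
    int rc = -1;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, a) < 0) {
        return -1;
    }
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, b) == 0) {
        if (write(a[0], msg, len) == (ssize_t)len && shutdown(a[0], SHUT_WR) == 0) {
            ssize_t moved = sketch_splice_copy(a[1], b[0], 1024);
            if (moved == (ssize_t)len
                && read(b[1], buf, sizeof buf - 1) == moved
                && memcmp(buf, msg, len) == 0) {
                rc = 0;
            }
        }
        close(b[0]);
        close(b[1]);
    }
    close(a[0]);
    close(a[1]);
    return rc;
}
```

The two-step hop through the pipe is the overhead mentioned earlier in the thread; if one side were already known to be a pipe, a single `splice` call per chunk would be enough.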
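This introduces a new API for fd copying and modifies
`php_stream_copy_to_stream_ex` to use it. The implementation is separated
for various platforms, and the end result has a couple of implications:

- `sendfile` is used for copying a file to a generic fd (e.g. a socket) on all platforms except Windows, which uses `TransmitFile`.
- `splice` is used for copying between generic fds (e.g. sockets) on Linux.
- `copy_file_range` should get used on Alpine Linux by invoking the syscall directly (as musl does not seem to implement it).
- `copy_file_range` is used in a loop, so it is called multiple times for files bigger than 2GB on Linux.
- File mmap for copying is removed, as it allowed crashing PHP when another process modified the mapped file. It was used as a fallback for file copying; `sendfile` should partially replace it.
- File-to-file copying was optimized on Windows with the use of `ReadFile` and `WriteFile`.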
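The two `copy_file_range` points above can be illustrated with a small sketch. This is not the code from the PR: `sketch_copy_file_range_all` and `sketch_copy_demo` are hypothetical names. It assumes a Linux kernel that provides `SYS_copy_file_range`, and it invokes the syscall directly so that it does not depend on a libc wrapper.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not the PR's API): copy up to `len` bytes from the
 * current offset of `src_fd` to the current offset of `dest_fd`. A single
 * call may copy fewer bytes than requested, hence the loop. Returns the
 * number of bytes copied, or -1 if an error occurred before any copy. */
static ssize_t sketch_copy_file_range_all(int src_fd, int dest_fd, size_t len)
{
    size_t total = 0;

    assert(src_fd >= 0 && dest_fd >= 0);
    while (total < len) {
        /* Raw syscall; NULL offsets mean "use and advance the fd offsets". */
        long n = syscall(SYS_copy_file_range, (long)src_fd, (void *)NULL,
                         (long)dest_fd, (void *)NULL, len - total, 0UL);
        if (n < 0) {
            if (errno == EINTR) {
                continue;
            }
            return total > 0 ? (ssize_t)total : -1;
        }
        if (n == 0) {
            break; /* end of the source file */
        }
        total += (size_t)n;
    }
    return (ssize_t)total;
}

/* Demo: copy one temp file into another. Returns 0 on a verified copy,
 * 1 if the kernel or filesystem rejects copy_file_range here, -1 on any
 * other failure. */
static int sketch_copy_demo(void)
{
    static const char msg[] = "hello copy_file_range";
    const size_t len = sizeof msg - 1;
    char src[] = "/tmp/sketch_cfr_src_XXXXXX";
    char dst[] = "/tmp/sketch_cfr_dst_XXXXXX";
    char buf[64] = {0};
    int rc = -1;
    int in = mkstemp(src);
    int out = mkstemp(dst);

    if (in >= 0 && out >= 0
        && write(in, msg, len) == (ssize_t)len
        && lseek(in, 0, SEEK_SET) == 0) {
        ssize_t n = sketch_copy_file_range_all(in, out, len);
        if (n < 0 && (errno == ENOSYS || errno == EXDEV
                      || errno == EINVAL || errno == EOPNOTSUPP)) {
            rc = 1; /* unsupported in this environment */
        } else if (n == (ssize_t)len
                   && pread(out, buf, len, 0) == n
                   && memcmp(buf, msg, len) == 0) {
            rc = 0;
        }
    }
    if (in >= 0) {
        close(in);
        unlink(src);
    }
    if (out >= 0) {
        close(out);
        unlink(dst);
    }
    return rc;
}
```

A real caller would presumably fall back to a generic read/write loop when the syscall reports that it is unsupported; the demo only reports that case instead of handling it.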