wasi: add cat/sort/tail/touch integration coverage and related fixes#11712
wasi: add cat/sort/tail/touch integration coverage and related fixes#11712DePasqualeOrg wants to merge 17 commits into
Conversation
|
Would you split PR (at least for symlink support)? |
9bced35 to
aa75a81
Compare
|
I separated the symlink changes out into #11713. |
|
I added a new commit to address an issue that would cause CI failures. The WASI CI uses stable Rust, but
|
|
I resolved the linter error in CI. |
|
GNU testsuite comparison: |
|
GNU testsuite comparison: |
6b396f5 to
562cefd
Compare
|
GNU testsuite comparison: |
|
Is it able to split PR per utility? Diff is still too big. How about |
|
The changes here are interdependent, and splitting out |
562cefd to
e6f63b1
Compare
|
I've added WASI integration coverage for four more tools, plus the fixes needed to make those tests pass. New integration testsAdded Fixestouch: sort: cp: replaced uucore: cat: stub stays – when stdout is inherited from a host file descriptor, wasmtime reports its fstat as all-zero so dev/inode comparison can never match. Comment updated to reflect that accurately. Deferred
VerificationIntegration tests pass on macOS host and on Linux in an Ubuntu 24.04 Docker container (1592 passed / 0 failed / 176 ignored). |
9645693 to
7c34168
Compare
|
GNU testsuite comparison: |
|
GNU testsuite comparison: |
|
GNU testsuite comparison: |
|
GNU testsuite comparison: |
| } else { | ||
| filetime::set_file_times(dest, atime, mtime)?; | ||
| #[cfg(target_os = "wasi")] | ||
| { |
There was a problem hiding this comment.
please move that into a new function
|
i am sorry but it can't be merged this way, it needs to be split into several PRs also, it adds way too many wasi_no_threads everywhere |
de79c07 to
eaa010b
Compare
|
GNU testsuite comparison: |
5aa26b0 to
f96a56c
Compare
|
Before I split this up, I moved the sync/threaded variants into separate files, which reduces the number of cfg attributes in uu_sort significantly. If that's the direction you'd like to take, I'll start splitting this into separate PRs. |
f96a56c to
260501c
Compare
e7a99ab to
f2a1656
Compare
|
I'll use this draft PR as a reference for splitting out smaller PRs that can be reviewed and merged individually. Before doing that, I'll wait for #11713 to be reviewed and merged. |
Add #[cfg(target_os = "wasi")] blocks alongside existing unix and windows platform code. No changes to existing platform behavior. Enables compilation to wasm32-wasip1 and wasm32-wasip1-threads targets for running in WASI-compatible runtimes like WasmKit and Wasmer.
On wasm32-wasip1 (no atomics), sort crashes because ext_sort, merge, check, and rayon all spawn threads unconditionally. Add synchronous code paths gated on cfg(all(target_os = "wasi", not(target_feature = "atomics"))) so sort works on both wasip1 (sync) and wasip1-threads (threaded). Key changes: - Extract read_to_chunk() from chunks::read() for shared use - Add synchronous ext_sort with chunked sort-write-merge flow - Add SyncFileMerger for threadless merge operations - Add synchronous check for order verification - Gate rayon par_sort with sequential fallback
f2a1656 to
de7fbc1
Compare
Merging this PR will degrade performance by 3.66%
|
| Mode | Benchmark | BASE |
HEAD |
Efficiency | |
|---|---|---|---|---|---|
| ❌ | Memory | cp_recursive_deep_tree[(120, 4)] |
562.5 KB | 615.6 KB | -8.62% |
| ❌ | Simulation | ls_recursive_balanced_tree[(6, 4, 15)] |
49.6 ms | 52.3 ms | -5.18% |
| ❌ | Simulation | ls_recursive_wide_tree[(10000, 1000)] |
33 ms | 34.3 ms | -3.81% |
| ⚡ | Simulation | single_date_now |
85.5 µs | 82.7 µs | +3.38% |
Tip
Investigate this regression by commenting @codspeedbot fix this regression on this PR, or directly use the CodSpeed MCP with your agent.
Comparing DePasqualeOrg:wasi-support (53ebe10) with main (4505d51)
Footnotes
-
46 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports. ↩
|
This pr is still in draft and too big. Could you please split it? Thanks BTW, long comment 0 are not helpful |
|
@sylvestre, this has already been discussed. It's in draft because I am in the process of splitting out one PR at a time. Draft status means it doesn't need your attention at this time. I already noted this above. See #12503, which is the first PR split out from this one and ready for review. What do you mean "long comment 0 are not helpful"? This sentence is not even grammatically correct and is incomprehensible to me. I'm really doing my best to help here and don't understand the attitude I'm getting in return. It makes me not want to contribute to open source anymore. |
|
Sorry for my tone. I am direct because of the volume of pr on uutils and I don't always take the time to be polite. I was talking about: Which seems written by a llm and isn't interesting for a reviewer as we focus on the code. Also, I asked for the split almost two months ago :) this is why I repeated it |
|
@sylvestre, I already responded to your request to split it at that time, asking for feedback, and didn't get any. To move things along, I made the first split two weeks ago and flipped this PR to draft status. So your explanation of your repeated request today doesn't make sense. I added what I thought was a reasonably concise summary of the changes, because the code alone doesn't tell the whole story, and not everyone looks only at the code. |
This PR includes comprehensive
wasm32-wasiplatform improvements, enabling many more tools to compile for WASI beyond the existingfeat_wasmset. It includes a full synchronous fallback forsorton targets without thread support, and symlink support forcpvia WASI'ssymlink_path.These changes underwent many rounds of review and refinement with Claude Code and Codex. I've tested them extensively running on Wasmtime and Wasmer.
Motivation
Recent PRs (#11574, #11568, #11569, #11573, #11595, #11624) added initial WASI support – platform stubs,
FileInformation, tail feature-gating, and a basic single-threaded sort path. This PR builds on that work with three improvements:target_feature = "atomics"to distinguishwasm32-wasip1(no threads) fromwasm32-wasip1-threads, so the threaded sort path is preserved on runtimes that support it.cp: usessymlink_pathinstead of returning errors. (lnsymlink support is in a separate PR.)This PR supports the following WASI targets:
wasm32-wasip1: universal baseline, works on all WASI runtimeswasm32-wasip1-threads: supports threads via the atomics and bulk-memory proposalsChanges
uucore
lib.rs#![cfg_attr(all(target_os = "wasi", feature = "fs"), feature(wasi_ext))]— onlystd::os::wasi::fsneeds the unstable gate;std::os::wasi::ffiis stablefeatures/fs.rsFileInformationusingstd::os::wasi::fs::MetadataExtfornlink(),PartialEq(dev+ino),Hash(dev+ino). Addis_stdin_directory,path_ends_with_terminatorfeatures/fsext.rs#[cfg(not(target_os = "wasi"))]fromread_fs_listso the inner WASI block (returning emptyVec) is reachablefeatures/mode.rsget_umask()returning default 0o022mods/io.rsinto_stdio()converts throughFilefirst (no directStdio::from(OwnedFd)on WASI)sort: synchronous fallback for WASI without threads
On
wasm32-wasip1,sortcrashes because it unconditionally spawns threads viastd::thread::spawnand rayon. This PR adds synchronous code paths gated on#[cfg(all(target_os = "wasi", not(target_feature = "atomics")))]so that sort works on both WASI targets.The sort command uses threads in four places, each with a synchronous alternative:
ext_sort/threaded.rsmerge.rsSyncFileMergerthat reads on demandcheck.rscheck_syncthat reads and checks inlinesort.rspar_sort_by/par_sort_unstable_bysort_by/sort_unstable_byThe
ext_sortmodule unconditionally compiles thethreadedmodule, which handles both cases via internalcfgguards. The existing separatewasi.rs(a read-all-into-memory fallback from #11624) is removed in favor of this more complete implementation, along with its unusedparse_into_chunkhelper.Both the synchronous
ext_sortandmergepaths emit a warning and fall back to uncompressed temp files if--compress-programis passed, since process spawning is not available on WASI without threads.The
chunks.rsfile extractsread_to_chunk()fromread()so both threaded and synchronous code paths share the same chunk-reading logic.Other tool crates
platform/mod.rsis_unsafe_overwritestub (returns false)cp.rsstd::os::wasi::fs::symlink_path; return error for timestamp preservation on WASI (filetimepanics infrom_last_access_time/from_last_modification_time) —handle_preservesuppresses for optional (-a) and reports for required (--preserve=timestamps)native_int_str.rs#[cfg(target_os = "wasi")] use std::os::wasi::ffi::{OsStrExt, OsStringExt}mktemp.rs#[cfg(unix)]instead of#[cfg(not(windows))]Cargo.tomlctrlccrate on WASI (no signal handling); makerayonconditional on atomics supporttmp_dir.rsctrlcusagesort.rsTMPDIRenv var before callingenv::temp_dir()(panics on WASI)platform/mod.rsPid,ProcessChecker(#[allow(dead_code)]— follow mode is disabled on WASI),supports_pid_checkspaths.rsfile_id_eq(returns false — no stable inode API on WASI yet)text.rs,args.rstouch.rsUnsupportedPlatformFeatureerror fortouch -on WASI (no/dev/stdoutpath)What's stubbed vs. fully functional
Most WASI stubs are for Unix concepts that don't exist in WASI's capability-based security model:
touch -(returns error – WASI has no/dev/stdoutpath), timestamp preservation incp(filetimecrate panics on WASI)Build requirements
std::os::wasi::fsextensions, gated with#![cfg_attr(all(target_os = "wasi", feature = "fs"), feature(wasi_ext))])wasm32-wasip1:cargo +nightly build --target wasm32-wasip1 --releasewasm32-wasip1-threads:cargo +nightly build --target wasm32-wasip1-threads -Zbuild-std=std,panic_abort --releaseTesting
cargo build --releasecompiles with no warningswasm32-wasip1: compiles, uses synchronous paths for sortwasm32-wasip1-threads: compiles, uses threaded paths for sortwasip1-threads) and Wasmtime (wasip1)sortrequiresTMPDIRenvironment variable on WASI (Rust'sstd::env::temp_dir()panics on WASI – this is a Rust std library issue, not specific to coreutils)Note on cfg alias commit
The last commit adds a
build.rsto the sort crate that defines awasi_no_threadscfg alias, replacing ~49 instances of the verbose#[cfg(all(target_os = "wasi", not(target_feature = "atomics")))]with#[cfg(wasi_no_threads)]. This is a readability improvement only and can be reverted if maintainers prefer the explicit predicates.