# OxCaml Changes: 5.2.0minus-23 to 5.2.0minus-25

This document summarizes the key changes between OxCaml versions 5.2.0minus-23
and 5.2.0minus-25.

## Major New Features

### The Dissector: Linking Large Executables (#5146)

A new compiler pass called the "dissector" enables linking very large executables
with the small code model by:

- Analyzing all object files to compute total ELF section sizes
- Partitioning object files to prevent relocation overflow
- Partially linking each partition
- Creating an intermediate PLT/GOT for cross-partition calls

Enable with `-dissector`. Additional flags:
- `-dissector-partition-size <gb>` - Set partition threshold (default varies)
- `-ddissector` - Verbose logging
- `-ddissector-sizes` - Dump section sizes per file
- `-ddissector-partitions` - Keep partition files for debugging

### Untagged Int Arrays (#4643)

Full support for packed arrays of small integers:

```ocaml
(* New array types - tightly packed *)
let bytes : int8# array = [| #0s; #1s; #255s |]       (* 1 byte/element *)
let shorts : int16# array = [| #0S; #1S; #32767S |]   (* 2 bytes/element *)
let ints : int# array = [| #0; #1; #42 |]             (* native word/element *)
let chars : char# array = [| #'a'; #'b'; #'c' |]      (* 1 byte/element *)
```

Pattern matching now works with `char#` ranges:
```ocaml
match c with
| #'a'..#'z' -> `lowercase
| #'A'..#'Z' -> `uppercase
| _ -> `other
```

### Unboxed Elements at Module Top-Level (#4020, #5064)

Unboxed types can now appear at the top-level of modules:

```ocaml
module M = struct
  let pi : float# = #3.14159
  let answer : int32# = #42l
end
```

Previously this was prohibited.

## SIMD Enhancements

### AVX2 Gather Intrinsics (#5040)

Gather operations for loading values from non-contiguous memory addresses:

```ocaml
(* Gather using index vector - loads arr[indices[0]], arr[indices[1]], etc.
   *)
gather_int32x4 ~base ~indices ~scale ~mask
gather_float64x2 ~base ~indices ~scale ~mask
```

### BMI/BMI2 Intrinsics (#5065)

Complete set of bit manipulation instructions:

- **BMI**: `andn`, `bextr`, `blsi`, `blsmsk`, `blsr`, `tzcnt`
- **BMI2**: `bzhi`, `mulx`, `pdep`, `pext`, `rorx`, `sarx`, `shrx`, `shlx`
- **POPCNT**: `popcnt_int32`, `popcnt_int64`
- **LZCNT**: `lzcnt_int32`, `lzcnt_int64`

### SIMD Load/Store Intrinsics (#4994)

Direct memory operations with explicit alignment handling:

```ocaml
(* Aligned/unaligned loads and stores *)
vec128_load_aligned, vec128_store_aligned
vec128_load_unaligned, vec128_store_unaligned
vec256_load_aligned, vec256_store_aligned

(* Non-temporal (streaming) stores *)
vec128_store_aligned_uncached

(* Partial loads/stores *)
vec128_load_low64, vec128_load_low32
vec128_store_low64, vec128_store_low32
```

### 128-bit Integer Arithmetic (#5025)

Support for wide integer arithmetic using register pairs.

### Float64/Int64 Cast Builtins (#5114)

Bitwise reinterpretation between float64 and int64.

## Mode System Changes

### New `shareable` Portability Mode

The portability axis now has three values instead of two:

```
nonportable → shareable → portable
```

- `nonportable`: Functions capturing uncontended mutable state
- `shareable`: Functions capturing shared state (may execute in parallel)
- `portable`: Functions capturing all values at contended (may execute concurrently)

### `@@ global` Implies `@@ aliased`

For modalities, `@@ global` now always implies `@@ aliased`.
Using `@@ global unique` together is forbidden to ensure soundness of borrowing:

```ocaml
(* OK *)
type 'a t = { field : 'a @@ global aliased }

(* ERROR - forbidden *)
type 'a t = { field : 'a @@ global unique }
```

### Improved Modal Inclusion Errors (#5112)

Better error messages when mode constraints are violated.

## CFG Backend Improvements

### CFG Reducibility Checking (#4920, #4921)

The compiler now checks for and handles irreducible control flow graphs,
with safeguards to prevent optimizations that could create them.

### CFG Value Propagation (#4879, #4807)

Extended value propagation to float values and improved terminator simplification.
New flags:
- `-cfg-value-propagation` / `-no-cfg-value-propagation`
- `-cfg-value-propagation-float` / `-no-cfg-value-propagation-float`

### Register Allocation Affinity (#5059)

Basic support for register affinity hints in the allocator.

### CFG Peephole: Neutral Element Removal (#4932)

Removes operations whose operand is a neutral element (e.g., adding 0).

## Flambda2 / Optimizer Improvements

### Reaper Enhancements

- **Auto mode for direct call preservation** (#5081): New `-reaper-preserve-direct-calls auto`
  option that preserves direct calls only when the reaper cannot identify
  called functions.
- **Type rewriting** (#5043): The reaper can now rewrite types.
- **Local field handling**: New `-reaper-local-fields` flag.

### Improved Inlining Metrics (#5116)

Added profiling counters for inlining decisions that don't decrease code size.

### to_cmm Safety Improvements (#4941)

Prevents illegal re-orderings when converting to Cmm representation.
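
As a generic illustration of why re-ordering matters (this is a toy sketch in plain OCaml, not the actual to_cmm logic, and the names `trace`, `emit`, `in_source_order`, and `reordered` are hypothetical), swapping two effectful bindings keeps the computed value but changes the observable order of effects:

```ocaml
(* Hypothetical toy example; not taken from the OxCaml sources. *)
let trace = Buffer.create 16

(* [emit tag v] records [tag] in the trace, then returns [v]. *)
let emit tag v =
  Buffer.add_string trace tag;
  v

(* Bindings evaluated in source order: "a" is recorded before "b". *)
let in_source_order () =
  Buffer.clear trace;
  let x = emit "a" 1 in
  let y = emit "b" 2 in
  let recorded = Buffer.contents trace in
  (x + y, recorded)

(* The same bindings with the two effectful expressions swapped, as a
   naive substitution pass might do. The sum is unchanged, but the
   order in which the effects happen is not. *)
let reordered () =
  Buffer.clear trace;
  let y = emit "b" 2 in
  let x = emit "a" 1 in
  let recorded = Buffer.contents trace in
  (x + y, recorded)

let () =
  let sum1, trace1 = in_source_order () in
  let sum2, trace2 = reordered () in
  Printf.printf "%d %s\n%d %s\n" sum1 trace1 sum2 trace2
```

Both functions return the same sum, yet the recorded traces differ, which is the kind of behavioural change a safe translation has to rule out.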

## Runtime Metaprogramming (Experimental)

### Slambda and Quotes (#4776, #5023, #5077)

Initial support for compile-time metaprogramming:
- `runtime_metaprogramming` language extension
- Slambda splices in Lambda
- Quote printing and AST mapper fixes

## Probes Support

### Unoptimized Implementation (#5007)

Basic probe support for runtime instrumentation. Closure middle-end support
was removed (#4990) - probes now require Flambda 2.

## Type System

### With-Bounds for GADTs Re-enabled (#5046)

Support for with-bounds constraints on GADTs has been restored.

### Looser Checking for Staticity (#5075)

More permissive checking for static expressions.

## Compiler Flags

### New Flags

- `-dissector` and related debug flags
- `-cfg-value-propagation[-float]`
- `-reaper-preserve-direct-calls auto`
- `-reaper-local-fields`
- `-reaper-unbox`
- `-reaper-change-calling-conventions`
- `-flambda2-expert-cmm-safe-subst`
- `-ddwarf-metrics-output-file`
- `-no-locs` (for test output)

## Build System Changes

- Removed `boot` directory (#5067)
- Added `make clean` and `make distclean` targets (#5035)
- Manifest files support (#4986)

## Bug Fixes

- Fixed computation of code size for `Boolean_not` switches (#5121)
- Fixed code size computation when converting switch to lookup table (#5120)
- Fixed over-estimation of removed primitives due to canonicalization (#5118)
- Race condition fix for `-dump-dir` with nonexistent path (#5010)
- Fixed fiber cache leak (#5017)
- Don't access globals in tight marking loop (#4997)
- Avoid touching global variables in sweep loop (#5022)
- Application type error now preempts mode error (#5073)
- Fixed simplify terminator pass for irreducible graphs (#5174)
- Small int SIMD casts no longer sign-extend incorrectly (#4987)

## Array Tag Changes (#5126)

Rearranged array tags to restore backwards compatibility with existing code.

## Documentation Updates

- Parallelism tutorials now use `#(...)` syntax for unboxed tuples
- Clarified portability mode documentation
- Updated small numbers documentation with array support
- Fixed contended/uncontended field projection documentation