# OxCaml Changes: 5.2.0minus-23 to 5.2.0minus-25

This document summarizes the key changes between OxCaml versions 5.2.0minus-23
and 5.2.0minus-25.

## Major New Features

### The Dissector: Linking Large Executables (#5146)

A new compiler pass called the "dissector" enables linking very large executables
with the small code model by:

- Analyzing all object files to compute total ELF section sizes
- Partitioning object files to prevent relocation overflow
- Partially linking each partition
- Creating an intermediate PLT/GOT for cross-partition calls
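
The partitioning step can be pictured with a small sketch in plain OCaml. It is only an illustration of the idea, not the dissector's implementation: the `obj` record, the `partition` function, and the greedy in-order strategy are all assumptions made for this example, and sizes are abstract units rather than real ELF section sizes.

```ocaml
(* Hypothetical model: an object file is a name plus a total section size.
   The real dissector reads ELF section sizes; this is only a sketch. *)
type obj = { name : string; size : int }

(* Greedily pack files, in order, into partitions whose total size stays at
   or below [limit]. A file larger than [limit] gets a partition of its own. *)
let partition ~limit (objs : obj list) : obj list list =
  let close current acc =
    match current with [] -> acc | _ -> List.rev current :: acc
  in
  let rec go current used acc = function
    | [] -> List.rev (close current acc)
    | o :: rest ->
      if current <> [] && used + o.size > limit
      then go [ o ] o.size (close current acc) rest
      else go (o :: current) (used + o.size) acc rest
  in
  go [] 0 [] objs

let () =
  let objs =
    [ { name = "a.o"; size = 600 }; { name = "b.o"; size = 500 };
      { name = "c.o"; size = 300 }; { name = "d.o"; size = 900 } ]
  in
  (* With limit 1000 this yields three partitions:
     [a.o], [b.o; c.o], [d.o]. *)
  partition ~limit:1000 objs
  |> List.iteri (fun i part ->
       Printf.printf "partition %d: %s\n" i
         (String.concat " " (List.map (fun o -> o.name) part)))
```

Keeping each partition under a size bound is what lets calls inside a partition use short relative relocations, while calls that cross partitions go through the intermediate PLT/GOT.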

Enable with `-dissector`. Additional flags:
- `-dissector-partition-size <gb>` - Set partition threshold (default varies)
- `-ddissector` - Verbose logging
- `-ddissector-sizes` - Dump section sizes per file
- `-ddissector-partitions` - Keep partition files for debugging

### Untagged Int Arrays (#4643)

Full support for packed arrays of small integers:

```ocaml
(* New array types - tightly packed *)
let bytes : int8# array = [| #0s; #1s; #127s |] (* 1 byte/element; int8 is signed, max 127 *)
let shorts : int16# array = [| #0S; #1S; #32767S |] (* 2 bytes/element *)
let ints : int# array = [| #0; #1; #42 |] (* native word/element *)
let chars : char# array = [| #'a'; #'b'; #'c' |] (* 1 byte/element *)
```

Pattern matching now works with `char#` ranges:
```ocaml
match c with
| #'a'..#'z' -> `lowercase
| #'A'..#'Z' -> `uppercase
| _ -> `other
```

### Unboxed Elements at Module Top-Level (#4020, #5064)

Unboxed types can now appear at the top-level of modules:

```ocaml
module M = struct
  let pi : float# = #3.14159
  let answer : int32# = #42l
end
```

Previously this was prohibited.

## SIMD Enhancements

### AVX2 Gather Intrinsics (#5040)

Gather operations for loading values from non-contiguous memory addresses:

```ocaml
(* Gather using index vector - loads arr[indices[0]], arr[indices[1]], etc. *)
gather_int32x4 ~base ~indices ~scale ~mask
gather_float64x2 ~base ~indices ~scale ~mask
```
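
As a mental model, a masked gather behaves like the scalar loop below. This is a sketch in plain OCaml over ordinary arrays, not the intrinsic itself: the `gather` function and its `fallback` argument are hypothetical names introduced here, and the hardware `scale` (the byte multiplier applied to each index) is absorbed into normal array indexing.

```ocaml
(* Scalar model of a masked gather: lane i receives base.(indices.(i)) when
   mask.(i) is set, and keeps fallback.(i) otherwise. Masked-off lanes never
   read from [base]. *)
let gather ~(base : 'a array) ~(indices : int array) ~(mask : bool array)
    ~(fallback : 'a array) : 'a array =
  Array.mapi
    (fun lane idx -> if mask.(lane) then base.(idx) else fallback.(lane))
    indices

let () =
  let result =
    gather
      ~base:[| 10; 20; 30; 40; 50 |]
      ~indices:[| 4; 0; 2; 1 |]
      ~mask:[| true; true; false; true |]
      ~fallback:[| 0; 0; 0; 0 |]
  in
  (* Lane 2 is masked off, so it keeps its fallback value. *)
  assert (result = [| 50; 10; 0; 20 |])
```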

### BMI/BMI2 Intrinsics (#5065)

Complete set of bit manipulation instructions:

- **BMI**: `andn`, `bextr`, `blsi`, `blsmsk`, `blsr`, `tzcnt`
- **BMI2**: `bzhi`, `mulx`, `pdep`, `pext`, `rorx`, `sarx`, `shrx`, `shlx`
- **POPCNT**: `popcnt_int32`, `popcnt_int64`
- **LZCNT**: `lzcnt_int32`, `lzcnt_int64`
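
For readers unfamiliar with these instructions, the reference definitions below spell out what several of them compute. They are written in plain OCaml on the native `int` type as a sketch of the semantics only; the function names mirror the instruction mnemonics and are not the OxCaml intrinsic names, and the real instructions operate on 32- or 64-bit registers.

```ocaml
(* Reference semantics on OCaml's native [int]; illustration only. *)
let andn a b = lnot a land b   (* bits of b that are not set in a *)
let blsi x = x land (-x)       (* isolate the lowest set bit *)
let blsmsk x = x lxor (x - 1)  (* mask up to and including the lowest set bit *)
let blsr x = x land (x - 1)    (* clear the lowest set bit *)

(* Zero all bits at positions >= n. Assumes 0 <= n < 63 in this model. *)
let bzhi x n = x land ((1 lsl n) - 1)

(* Count set bits by clearing the lowest one until none remain. *)
let popcnt x =
  let rec go x n = if x = 0 then n else go (x land (x - 1)) (n + 1) in
  go x 0

(* pext: collect the bits of [src] selected by [mask] into the low bits. *)
let pext src mask =
  let rec go mask out_bit acc =
    if mask = 0 then acc
    else
      let low = mask land (-mask) in
      let acc = if src land low <> 0 then acc lor (1 lsl out_bit) else acc in
      go (mask land (mask - 1)) (out_bit + 1) acc
  in
  go mask 0 0

(* pdep: scatter the low bits of [src] to the positions set in [mask]. *)
let pdep src mask =
  let rec go mask in_bit acc =
    if mask = 0 then acc
    else
      let low = mask land (-mask) in
      let acc = if src land (1 lsl in_bit) <> 0 then acc lor low else acc in
      go (mask land (mask - 1)) (in_bit + 1) acc
  in
  go mask 0 0

let () =
  assert (pext 0b10110110 0b11110000 = 0b1011);
  assert (pdep 0b1011 0b11110000 = 0b10110000)
```

`pext` and `pdep` are inverses on the masked bits, which is why they are commonly paired for packing and unpacking bit fields.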

### SIMD Load/Store Intrinsics (#4994)

Direct memory operations with explicit alignment handling:

```ocaml
(* Aligned/unaligned loads and stores *)
vec128_load_aligned, vec128_store_aligned
vec128_load_unaligned, vec128_store_unaligned
vec256_load_aligned, vec256_store_aligned

(* Non-temporal (streaming) stores *)
vec128_store_aligned_uncached

(* Partial loads/stores *)
vec128_load_low64, vec128_load_low32
vec128_store_low64, vec128_store_low32
```

### 128-bit Integer Arithmetic (#5025)

Support for wide integer arithmetic using register pairs.
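
The changelog does not list the builtins themselves, so the following is only a sketch of the underlying idea in standard OCaml: a 128-bit value held as two 64-bit halves, with the carry out of the low half propagated into the high half. The `u128` type and `add` function are hypothetical names for this illustration.

```ocaml
(* A 128-bit unsigned value as a (high, low) pair of 64-bit halves,
   mirroring how such a value would occupy a register pair. *)
type u128 = { hi : int64; lo : int64 }

let add (a : u128) (b : u128) : u128 =
  let lo = Int64.add a.lo b.lo in
  (* The low half overflowed (unsigned) iff the wrapped sum is smaller
     than one of the operands. *)
  let carry = if Int64.unsigned_compare lo a.lo < 0 then 1L else 0L in
  { hi = Int64.add (Int64.add a.hi b.hi) carry; lo }

let () =
  (* 0xFFFF_FFFF_FFFF_FFFF + 1 carries into the high half. *)
  let sum = add { hi = 0L; lo = -1L } { hi = 0L; lo = 1L } in
  assert (sum = { hi = 1L; lo = 0L })
```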

### Float64/Int64 Cast Builtins (#5114)

Bitwise reinterpretation between float64 and int64.
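
To make the distinction from numeric conversion concrete, the standard library already exposes the same reinterpretation for boxed values through `Int64.bits_of_float` and `Int64.float_of_bits`. The example below uses those stdlib functions rather than the new builtins, whose names are not given here.

```ocaml
(* Reinterpretation keeps the IEEE 754 bit pattern unchanged. *)
let bits_of_one = Int64.bits_of_float 1.0   (* 0x3FF0000000000000L *)

(* Numeric conversion, by contrast, changes the representation. *)
let converted_one = Int64.of_float 1.0      (* 1L *)

(* Round-tripping through the bit pattern preserves the exact value,
   including the sign of zero. *)
let round_trip (x : float) : float =
  Int64.float_of_bits (Int64.bits_of_float x)

let () =
  assert (bits_of_one = 0x3FF0000000000000L);
  assert (converted_one = 1L);
  assert (Int64.bits_of_float (round_trip (-0.0)) = Int64.min_int)
```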

## Mode System Changes

### New `shareable` Portability Mode

The portability axis now has three values instead of two:

```
nonportable → shareable → portable
```

- `nonportable`: Functions capturing uncontended mutable state
- `shareable`: Functions capturing shared state (may execute in parallel)
- `portable`: Functions capturing all values at contended (may execute concurrently)

### `@@ global` Implies `@@ aliased`

For modalities, `@@ global` now always implies `@@ aliased`. Combining
`@@ global` with `unique` is forbidden, to keep borrowing sound:

```ocaml
(* OK *)
type 'a t = { field : 'a @@ global aliased }

(* ERROR - forbidden *)
type 'a t = { field : 'a @@ global unique }
```

### Improved Modal Inclusion Errors (#5112)

Better error messages when mode constraints are violated.

## CFG Backend Improvements

### CFG Reducibility Checking (#4920, #4921)

The compiler now checks for and handles irreducible control flow graphs,
with safeguards to prevent optimizations that could create them.
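
The changelog does not say which algorithm the compiler uses. As background, one classic way to test reducibility is T1/T2 reduction: repeatedly delete self-loops (T1) and merge any non-entry node that has exactly one predecessor into that predecessor (T2); the graph is reducible exactly when this collapses it to a single node. The sketch below implements that textbook test in standard OCaml. The names `is_reducible` and `graph_of_list` are hypothetical, and the input is assumed to list every node as a key, with all nodes reachable from the entry.

```ocaml
module IntSet = Set.Make (Int)
module IntMap = Map.Make (Int)

(* Build a successor map from (node, successors) pairs. *)
let graph_of_list l =
  List.fold_left
    (fun g (n, ss) -> IntMap.add n (IntSet.of_list ss) g)
    IntMap.empty l

let is_reducible ~entry (succs : IntSet.t IntMap.t) : bool =
  let preds g n =
    IntMap.fold
      (fun m ss acc -> if IntSet.mem n ss then IntSet.add m acc else acc)
      g IntSet.empty
  in
  (* T1: drop self-loops. *)
  let t1 g = IntMap.mapi (fun n ss -> IntSet.remove n ss) g in
  (* T2 candidate: a non-entry node with exactly one predecessor. *)
  let find_t2 g =
    IntMap.fold
      (fun n _ found ->
        match found with
        | Some _ -> found
        | None ->
          if n = entry then None
          else
            let ps = preds g n in
            if IntSet.cardinal ps = 1 then Some (n, IntSet.choose ps)
            else None)
      g None
  in
  let rec loop g =
    let g = t1 g in
    if IntMap.cardinal g <= 1 then true
    else
      match find_t2 g with
      | None -> false
      | Some (n, p) ->
        (* Merge n into p: p inherits n's successors and n disappears. *)
        let n_succs = IntMap.find n g in
        let g = IntMap.remove n g in
        loop
          (IntMap.mapi
             (fun m ss ->
               let ss = IntSet.remove n ss in
               if m = p then IntSet.union ss n_succs else ss)
             g)
  in
  loop succs

let () =
  (* A natural loop 1 <-> 2 entered only through node 1: reducible. *)
  let ok = graph_of_list [ (0, [ 1 ]); (1, [ 2 ]); (2, [ 1; 3 ]); (3, []) ] in
  (* A cycle 1 <-> 2 that can be entered at either node: irreducible. *)
  let bad = graph_of_list [ (0, [ 1; 2 ]); (1, [ 2 ]); (2, [ 1 ]) ] in
  assert (is_reducible ~entry:0 ok);
  assert (not (is_reducible ~entry:0 bad))
```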

### CFG Value Propagation (#4879, #4807)

Extended value propagation to float values and improved terminator simplification.
New flags:
- `-cfg-value-propagation` / `-no-cfg-value-propagation`
- `-cfg-value-propagation-float` / `-no-cfg-value-propagation-float`

### Register Allocation Affinity (#5059)

Basic support for register affinity hints in the allocator.

### CFG Peephole: Neutral Element Removal (#4932)

Removes operations whose operand is a neutral element (e.g., adding 0).
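
The rewrite can be illustrated on a toy expression tree. This is a sketch in standard OCaml, not the compiler's pass: the real peephole works on CFG instructions, and the `expr` type and `simplify` function are hypothetical names introduced for this example. The set of rules shown (adding 0, multiplying by 1) is likewise an assumption.

```ocaml
(* Toy expression language standing in for machine-level operations. *)
type expr =
  | Reg of string
  | Const of int
  | Add of expr * expr
  | Mul of expr * expr

(* Simplify bottom-up so that neutral elements exposed by simplifying a
   child are also removed at the parent. *)
let rec simplify (e : expr) : expr =
  match e with
  | Reg _ | Const _ -> e
  | Add (a, b) -> (
    match simplify a, simplify b with
    | x, Const 0 | Const 0, x -> x   (* x + 0 = 0 + x = x *)
    | a, b -> Add (a, b))
  | Mul (a, b) -> (
    match simplify a, simplify b with
    | x, Const 1 | Const 1, x -> x   (* x * 1 = 1 * x = x *)
    | a, b -> Mul (a, b))

let () =
  (* r1 + (0 * 1): the inner product reduces to 0, then the sum to r1. *)
  assert (simplify (Add (Reg "r1", Mul (Const 0, Const 1))) = Reg "r1")
```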

## Flambda2 / Optimizer Improvements

### Reaper Enhancements

- **Auto mode for direct call preservation** (#5081): New `-reaper-preserve-direct-calls auto`
  option that preserves direct calls only when the reaper cannot identify
  called functions.
- **Type rewriting** (#5043): The reaper can now rewrite types.
- **Local field handling**: New `-reaper-local-fields` flag.

### Improved Inlining Metrics (#5116)

Added profiling counters for inlining decisions that don't decrease code size.

### to_cmm Safety Improvements (#4941)

Prevents illegal re-orderings when converting to Cmm representation.

## Runtime Metaprogramming (Experimental)

### Slambda and Quotes (#4776, #5023, #5077)

Initial support for compile-time metaprogramming:
- `runtime_metaprogramming` language extension
- Slambda splices in Lambda
- Quote printing and AST mapper fixes

## Probes Support

### Unoptimized Implementation (#5007)

Basic probe support for runtime instrumentation. Closure middle-end support
was removed (#4990) - probes now require Flambda 2.

## Type System

### With-Bounds for GADTs Re-enabled (#5046)

Support for with-bounds constraints on GADTs has been restored.

### Looser Checking for Staticity (#5075)

More permissive checking for static expressions.

## Compiler Flags

### New Flags

- `-dissector` and related debug flags
- `-cfg-value-propagation[-float]`
- `-reaper-preserve-direct-calls auto`
- `-reaper-local-fields`
- `-reaper-unbox`
- `-reaper-change-calling-conventions`
- `-flambda2-expert-cmm-safe-subst`
- `-ddwarf-metrics-output-file`
- `-no-locs` (for test output)

## Build System Changes

- Removed `boot` directory (#5067)
- Added `make clean` and `make distclean` targets (#5035)
- Manifest files support (#4986)

## Bug Fixes

- Fixed computation of code size for `Boolean_not` switches (#5121)
- Fixed code size computation when converting switch to lookup table (#5120)
- Fixed over-estimation of removed primitives due to canonicalization (#5118)
- Fixed race condition for `-dump-dir` with nonexistent path (#5010)
- Fixed fiber cache leak (#5017)
- Don't access globals in tight marking loop (#4997)
- Avoid touching global variables in sweep loop (#5022)
- Application type error now preempts mode error (#5073)
- Fixed simplify terminator pass for irreducible graphs (#5174)
- Small int SIMD casts no longer sign-extend incorrectly (#4987)

## Array Tag Changes (#5126)

Rearranged array tags to restore backwards compatibility with existing code.

## Documentation Updates

- Parallelism tutorials now use `#(...)` syntax for unboxed tuples
- Clarified portability mode documentation
- Updated small numbers documentation with array support
- Fixed contended/uncontended field projection documentation