fix scheduler idempotency test race condition
root cause: test was observing partial scheduler tick. each flow run
insertion commits separately, so test could see 25 runs mid-tick and
then 50 when tick completed - appearing as a failure.
fix: wait for count to stabilize before recording baseline, ensuring
we measure after the scheduler tick completes rather than during it.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>