Research

Negative results

Model variants and feature additions that were tested and judged against the Brier + ECE gate over an 8×90-day walk-forward evaluation, and that did not improve the shipped ensemble. They are published in full because a decision not to ship is the same calibration story as a decision to ship: each entry below records a hypothesis someone could plausibly have written down, the test that judged it, and why the test rejected it.
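
For readers unfamiliar with the two gate metrics, here is a minimal sketch of how a Brier + ECE gate over walk-forward folds could be computed. It is an illustration under stated assumptions, not the project's actual implementation: the function names, the 10-bin top-label ECE, and the pass rule (mean Brier must fall and mean ECE must not rise) are all hypothetical, since this page does not specify them.

```python
from typing import List, Sequence, Tuple

# One fold = (predicted probability vectors, index of the observed outcome).
Fold = Tuple[Sequence[Sequence[float]], Sequence[int]]


def brier_score(probs: Sequence[Sequence[float]], outcomes: Sequence[int]) -> float:
    """Multiclass Brier score: squared error summed over classes, averaged over matches."""
    total = 0.0
    for p, y in zip(probs, outcomes):
        total += sum((pk - (1.0 if k == y else 0.0)) ** 2 for k, pk in enumerate(p))
    return total / len(probs)


def expected_calibration_error(
    probs: Sequence[Sequence[float]], outcomes: Sequence[int], n_bins: int = 10
) -> float:
    """Top-label ECE with equal-width confidence bins (bin count is an assumption)."""
    bins: List[List[Tuple[float, float]]] = [[] for _ in range(n_bins)]
    for p, y in zip(probs, outcomes):
        p = list(p)
        conf = max(p)
        pred = p.index(conf)
        idx = min(int(conf * n_bins), n_bins - 1)  # conf == 1.0 goes in the last bin
        bins[idx].append((conf, 1.0 if pred == y else 0.0))
    n = len(probs)
    ece = 0.0
    for b in bins:
        if b:
            avg_conf = sum(c for c, _ in b) / len(b)
            accuracy = sum(a for _, a in b) / len(b)
            ece += len(b) / n * abs(avg_conf - accuracy)
    return ece


def passes_gate(baseline_folds: Sequence[Fold], candidate_folds: Sequence[Fold]) -> bool:
    """Hypothetical pass rule: lower mean Brier and no higher mean ECE across folds."""

    def mean_metrics(folds: Sequence[Fold]) -> Tuple[float, float]:
        briers = [brier_score(p, y) for p, y in folds]
        eces = [expected_calibration_error(p, y) for p, y in folds]
        return sum(briers) / len(briers), sum(eces) / len(eces)

    base_brier, base_ece = mean_metrics(baseline_folds)
    cand_brier, cand_ece = mean_metrics(candidate_folds)
    return cand_brier < base_brier and cand_ece <= base_ece
```

In an 8×90-day setup, each of the eight folds would hold the out-of-sample predictions for one 90-day window, and a candidate variant would be compared with the shipped ensemble on the same folds.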

11 of the 25 notes in the corpus were not adopted. The full notes index, including the variants that were adopted, is at /research/notes/.

Why we publish what wasn't adopted

No cherry-picking. If only the variants that improved on the gate were published, the shipped ensemble would look more certain than it is. What was not adopted is evidence of what the ensemble and the gate cannot tell apart: it is the negative space around every adopted change.
Prevents accidental re-testing. A failed ablation from six months ago is invisible to a new collaborator unless its write-up is discoverable. Keeping negative results on the same surface as positive ones means that "has anyone tried this?" has an answer that does not require reading the commit log.
Bounds the model's ceiling. A run of failed high-capacity variants on the same corpus is itself a measurement: the gate is hard to clear with the data currently available. A reader who can see the failures gets more from that signal than one who sees only the successes.

Negative results

A within-match chase layer "passes" the headline gate — and the placebo proves it shouldn't

Is composite coverage the lever for the player-strength offset? (No)

Does a player-form (momentum) offset improve match forecasts? (No)

Can we fit the player-strength coefficient instead of hand-setting it? (No)

Anytime-scorer `start_prob` v2 — predicted-XI layer (default-off)

Do teams try harder in must-win games? (No, actually)

Letting team ratings drift over time (didn't improve predictions)

Do some playing styles beat others? (Not enough to measure)

Retuning the models for tournament football — what changed

Does extra rest between matches help? (Not measurably)

Can international-tournament StatsBomb signals beat the club-derived baseline?
