Gaps in the formal proof relative to the paper’s causal identification and consistency claims
Lean entrypoint: ConjointSD.lean
-
Approximation-error bounds for the linear-in-terms model are not instantiated
- Remaining gap: supply concrete approximation bounds (C, δ) or variance nonnegativity needed to apply the approximation lemmas.
- Empirical bound sketch (not formalized): use a sample split to estimate an approximation error ε between the linear-in-terms model and a richer benchmark, then plug ε into the approximate SD targets.
- Empirical RMSEs vs target-human-population L2 bounds are not linked: the proof uses target-human-population L2 distances under
ν_pop, but the R workflow computes test-set RMSEs. A generalization/LLN step is needed to show the sample RMSE converges to the target-human-populationL2(ν_pop)distance (or to a weighted target-human-population target), and then thread that into the L2-approximation assumptions used in the bounds. - Component SD targets require an additive-projection definition to separate estimation error from approximation error (projection residual size).
-
Missingness and “not sure” responses are not modeled
- The formalized outcomes are bounded real values on [0,100] and do not model the “not sure” response option or any missingness mechanism.
- To fix: justify treating “not sure” as ignorable, or add a missingness/selection model and connect it to the estimands.
- Patch plan (MAR/MCAR): (i) introduce a missingness indicator
Rand observed outcomeYobs := if R=1 then Y else default, plus a notation that treats “not sure” asR=0; (ii) add a bundled assumption inConjointSD/Assumptions.leanfor MCAR (R ⟂ (Y, A, X)withP(R=1)>0) or MAR (R ⟂ Y | (A, X)with positivityP(R=1 | A, X) >= c); (iii) show the target estimand is identifiable from observed data under MAR by rewriting expectations asE[Y | A, X] = E[Yobs | A, X, R=1]and then integrating over(A, X); (iv) update the estimator definitions to either (a) treat missing as ignorable and use complete cases under MCAR, or (b) add inverse-probability weights1/P(R=1 | A, X)under MAR; (v) thread the new assumptions into the consistency theorems by replacing outcome integrability with conditional integrability givenR=1, and adding the positivity bound needed for LLN/variance control; (vi) update docs (readable/Assumptions.md, any affectedreadable/*.md,proven_statements.md,gaps.md) and rerun the dependency tables.