Methods & Math
A precise specification of the baseline scorer, evaluation protocol, and the interpretable ML refinement used in this study.
Pipeline Overview
1. Invitation-only: each partner completes a 50-item questionnaire independently (TIPI, PVQ-21, Social, Communication/Friendship, Lifestyle, Priorities/Deal-breakers).
2. The baseline scorer computes FinalMatch for true partners and sampled non-partners; we evaluate AUC, lift, and a median-rank proxy.
3. Interpretable ML learns domain multipliers via 5-fold CV; we re-evaluate and report the change in AUC/lift.
Notation & Encoding
Let $Q$ be the set of items, with domain map $d\colon Q \to \mathcal{D}$. For a pair of participants $(A, B)$ and an item $q \in Q$, denote self‑answers by $a^A_q$ and $a^B_q$. Importance weights (OkCupid‑style) are encoded as non‑negative values $w^A_q, w^B_q \ge 0$, with $w = 0$ meaning the item is irrelevant to that respondent.
Likert items have scale size $K_q$, with answers $a \in \{1, \dots, K_q\}$. MC‑single items have option sets $O_q$, with answers $a \in O_q$.
Acceptability Maps (Likert vs. MC)
Each item records the set of partner answers A would accept. For Likert items, A selects a tolerance label (Same, ±1, ±2, Any) mapped to $t^A_q \in \{0, 1, 2, \infty\}$. The acceptability set is

$\mathcal{A}^A_q = \{\, a \in \{1, \dots, K_q\} : |a - a^A_q| \le t^A_q \,\}.$
For MC‑single items, A selects a subset $\mathcal{A}^A_q \subseteq O_q$ (defaulting to A's own choice $\{a^A_q\}$ if unspecified). The directed indicator is $\mathbb{1}[\, a^B_q \in \mathcal{A}^A_q \,]$.
(MC‑multi items are retained for analysis; when placed in a domain with base multiplier 0, they do not affect the score.)
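A minimal sketch of the two acceptability maps, assuming Likert answers are stored as integers and MC‑single answers as option labels; the helper names (`likert_accepts`, `mc_accepts`) and the tolerance label strings are illustrative, not the app's actual API.

```python
import math

# Tolerance labels for Likert items mapped to a numeric radius ("Any" is unbounded).
LIKERT_TOLERANCE = {"same": 0, "pm1": 1, "pm2": 2, "any": math.inf}

def likert_accepts(own_answer: int, partner_answer: int, tolerance_label: str) -> bool:
    """Directed indicator for a Likert item: does A accept B's answer?"""
    return abs(partner_answer - own_answer) <= LIKERT_TOLERANCE[tolerance_label]

def mc_accepts(own_answer: str, partner_answer: str, accepted: set[str] | None) -> bool:
    """Directed indicator for an MC-single item.
    If A left the acceptability set unspecified, default to A's own choice."""
    accepted_set = accepted if accepted else {own_answer}
    return partner_answer in accepted_set

# Example: A answered 4 on a 5-point Likert item with tolerance ±1,
# so B's answer of 5 is acceptable while 2 is not.
assert likert_accepts(4, 5, "pm1")
assert not likert_accepts(4, 2, "pm1")
```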
Per‑Domain Multipliers & User Priorities
Let $m_d$ be the base multiplier for domain $d \in \mathcal{D}$. Preregistered defaults:
- Values: 3.0, Communication: 2.0, Lifestyle: 2.0, Social: 1.5, Personality: 1.0, Friendship: 1.0
- Quality/IMC or unknown domains: 0.0 (excluded from scoring)
User priorities (Q47) map option labels → domains; if A prioritizes domain $d$, set $p^A_d$ to a fixed boost factor $> 1$, else $p^A_d = 1$. The effective per‑user multiplier is $m_d \cdot p^A_d$.
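The defaults and the priority boost reduce to a small lookup, as in the sketch below; the constant `PRIORITY_BOOST` is a placeholder, not a value taken from the preregistration.

```python
# Preregistered base multipliers m_d (Quality/IMC and unknown domains get 0.0).
BASE_MULTIPLIERS = {
    "Values": 3.0,
    "Communication": 2.0,
    "Lifestyle": 2.0,
    "Social": 1.5,
    "Personality": 1.0,
    "Friendship": 1.0,
}

PRIORITY_BOOST = 2.0  # placeholder for the preregistered boost factor p_d > 1

def effective_multiplier(domain: str, prioritized_domains: set[str]) -> float:
    """m_d * p_d^A: base multiplier times the user's priority boost (Q47)."""
    base = BASE_MULTIPLIERS.get(domain, 0.0)  # unknown domains are excluded
    boost = PRIORITY_BOOST if domain in prioritized_domains else 1.0
    return base * boost
```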
Core Scoring: Baseline Algorithm
Items contribute only if both partners answered and at least one side's weight is non‑zero. Define the contributing item set and overlap

$Q_{AB} = \{\, q \in Q : a^A_q, a^B_q \text{ answered},\ \max(w^A_q, w^B_q) > 0 \,\}, \qquad n_{AB} = |Q_{AB}|.$
Directional (A←B) satisfaction and its B←A analogue:

$S_{A \leftarrow B} = \dfrac{\sum_{q \in Q_{AB}} m_{d(q)}\, p^A_{d(q)}\, w^A_q\, \mathbb{1}[a^B_q \in \mathcal{A}^A_q]}{\sum_{q \in Q_{AB}} m_{d(q)}\, p^A_{d(q)}\, w^A_q}, \qquad S_{B \leftarrow A} \text{ defined symmetrically.}$
Mutuality (primary: geometric; sensitivity: arithmetic):

$M^{\text{geo}}_{AB} = \sqrt{S_{A \leftarrow B}\, S_{B \leftarrow A}}, \qquad M^{\text{arith}}_{AB} = \tfrac{1}{2}\big(S_{A \leftarrow B} + S_{B \leftarrow A}\big).$
Apply a finite‑sample penalty with hyperparameter $c$ and a min‑overlap gate $n_{\min}$:

$\text{FinalMatch}_{AB} = \max\!\big(0,\ M_{AB} - c / n_{AB}\big),$

where $M_{AB}$ is the chosen aggregator and pairs with $n_{AB} < n_{\min}$ are excluded.
Preregistered grid: several values of $c$ and of $n_{\min}$, with both aggregators evaluated.
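A compact sketch of the baseline scorer as reconstructed above: per‑item weights are the effective domain multiplier times the importance weight, the two directional ratios are combined geometrically, and the $c/n$ penalty and overlap gate are applied. The dataclass fields, function names, and default hyperparameters are illustrative, not the repository's API.

```python
import math
from dataclasses import dataclass

@dataclass
class ItemResponse:
    domain: str
    importance: float          # w_q >= 0; 0 means "irrelevant"
    accepts_partner: bool      # 1[a_q^B in A_q^A], precomputed per direction

def directional_satisfaction(items: list[ItemResponse],
                             multiplier: dict[str, float]) -> float:
    """S_{A<-B}: weighted fraction of A's contributing items satisfied by B."""
    num = den = 0.0
    for it in items:
        w = multiplier.get(it.domain, 0.0) * it.importance
        num += w * it.accepts_partner
        den += w
    return num / den if den > 0 else 0.0

def final_match(s_ab: float, s_ba: float, n_overlap: int,
                c: float = 1.0, n_min: int = 10) -> float | None:
    """Geometric mutuality with a c/n penalty; None if the overlap gate fails.
    c and n_min are grid hyperparameters, not the preregistered values."""
    if n_overlap < n_min:
        return None
    mutual = math.sqrt(s_ab * s_ba)
    return max(0.0, mutual - c / n_overlap)
```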
Quick Reference: Scoring Formula
$\text{FinalMatch}_{AB} = \max\!\Big(0,\ \sqrt{S_{A \leftarrow B}\, S_{B \leftarrow A}} \;-\; \frac{c}{n_{AB}}\Big)$

IMC/Quality items carry base weight 0 and are excluded. Pairs with insufficient overlap ($n_{AB} < n_{\min}$) are filtered.
ML Refinement: Learned Domain Weights
Features are per‑domain mutual satisfactions (no penalty), normalized to $[0,1]$:

$x_d = \sqrt{S^{(d)}_{A \leftarrow B}\, S^{(d)}_{B \leftarrow A}}, \qquad d \in \mathcal{D},$

where $S^{(d)}$ restricts the sums to items with $d(q) = d$.
We fit L2‑regularized logistic regression with 5‑fold stratified cross‑validation (liblinear solver, fixed regularization strength $C$). The objective is

$\min_{\beta_0,\, \beta}\ \tfrac{1}{2}\|\beta\|_2^2 + C \sum_{i} \log\!\big(1 + e^{-y_i(\beta_0 + \beta^\top x_i)}\big).$
Foldwise coefficients are averaged; we map them to positive, normalized multipliers (for interpretability and drop‑in use):

$\hat m_d = \frac{\max(\bar\beta_d, 0)}{\sum_{d'} \max(\bar\beta_{d'}, 0)},$

where $\bar\beta_d$ is the coefficient for domain $d$ averaged across folds.
We report cross‑validated AUC as a measure of separability. (Negative sampling yields a contrastive dataset of positives vs. sampled non‑partners; stratification mitigates class‑imbalance variance.)
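A sketch of the refinement step with scikit-learn, matching the description above (liblinear solver, 5-fold stratified CV, fold-averaged coefficients mapped to non-negative normalized multipliers). Variable names and the clipping/normalization details are assumptions; the study's code path is tiger-alg/ml/learn_domain_weights.py.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold
from sklearn.metrics import roc_auc_score

def learn_domain_multipliers(X: np.ndarray, y: np.ndarray, domains: list[str],
                             C: float = 1.0, seed: int = 0):
    """X: per-domain mutual satisfactions in [0, 1], one row per scored pair.
    y: 1 for true couples, 0 for sampled non-partners."""
    skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)
    coefs, aucs = [], []
    for train_idx, test_idx in skf.split(X, y):
        clf = LogisticRegression(penalty="l2", C=C, solver="liblinear")
        clf.fit(X[train_idx], y[train_idx])
        coefs.append(clf.coef_.ravel())
        aucs.append(roc_auc_score(y[test_idx], clf.decision_function(X[test_idx])))
    mean_coef = np.mean(coefs, axis=0)
    positive = np.clip(mean_coef, 0.0, None)          # drop negative weights
    multipliers = positive / positive.sum() if positive.sum() > 0 else positive
    return dict(zip(domains, multipliers)), float(np.mean(aucs))
```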
Advanced Models: Distance‑Kernel Family
PAM, soft‑gated, and evolutionary variants that incorporate trait distance
Downstream ML models operate on per‑domain mutual acceptability $\mu_d \in [0,1]$ (derived from a probabilistic acceptability model, PAM), a trait distance $\delta_d$ computed from normalized self‑reports, and an optional reliability weight $r_d$. All distance‑kernel variants used in this study share the same core score:

$\text{Score}(A, B) = \sum_{d \in \mathcal{D}} w_d\, r_d\, \mu_d^{\alpha_d}\, k_d(\delta_d),$

where $w_d$ are domain weights, $\alpha_d$ exponents (default 1), and $k_d(\cdot)$ is a distance kernel chosen per domain:
Baseline / PAM (Similarity‑only)
At the domain level we drop distance and use logit‑transformed mutuality:

$x_d = \operatorname{logit}\!\big(\tilde{\mu}_d\big), \qquad \tilde{\mu}_d = \min\!\big(\max(\mu_d, \epsilon),\ 1 - \epsilon\big),$

with $\operatorname{logit}(p) = \log\frac{p}{1-p}$ and a small clipping constant $\epsilon$. PAM refines $\mu_d$ using directional lower‑confidence bounds before mutual aggregation.
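A sketch of the similarity-only feature; the directional lower-confidence bound is illustrated here with a Wilson lower bound, which is one possible choice rather than the bound PAM necessarily uses.

```python
import math

def logit(p: float, eps: float = 1e-3) -> float:
    """Clipped logit transform of a mutuality in [0, 1]."""
    p = min(max(p, eps), 1.0 - eps)
    return math.log(p / (1.0 - p))

def wilson_lower(successes: int, n: int, z: float = 1.0) -> float:
    """One possible directional lower-confidence bound on an acceptance rate."""
    if n == 0:
        return 0.0
    p = successes / n
    denom = 1.0 + z * z / n
    centre = p + z * z / (2 * n)
    margin = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    return (centre - margin) / denom

def pam_feature(sat_ab: tuple[int, int], sat_ba: tuple[int, int]) -> float:
    """Directional LCBs, geometric mutual aggregation, then the logit feature x_d."""
    lcb_ab = wilson_lower(*sat_ab)   # (accepted items, answered items) for A<-B
    lcb_ba = wilson_lower(*sat_ba)
    return logit(math.sqrt(lcb_ab * lcb_ba))
```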
Soft‑Gated Model (Learns similarity vs complementarity)
Each domain mixes similarity and complementarity with non‑negative gates:

$k_d(\delta_d) = g^{\text{sim}}_d\,(1 - \delta_d) + g^{\text{comp}}_d\,\delta_d, \qquad g^{\text{sim}}_d,\ g^{\text{comp}}_d \ge 0,$

with $\delta_d$ normalized to $[0,1]$.
We learn the domain weights and gates by minimizing a pairwise ranking (BPR) loss over triples (anchor $i$, partner $j^{+}$, non‑partner $j^{-}$):

$\mathcal{L}_{\text{BPR}} = -\sum_{(i,\, j^{+},\, j^{-})} \log \sigma\!\big(\mathrm{Score}(i, j^{+}) - \mathrm{Score}(i, j^{-})\big),$

where $\sigma$ is the logistic sigmoid.
After training we infer each domain as similarity ($g^{\text{sim}}_d \gg g^{\text{comp}}_d$), complementarity ($g^{\text{comp}}_d \gg g^{\text{sim}}_d$), or irrelevant (both gates near 0).
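A minimal PyTorch sketch of the soft-gated score and its BPR objective, under the gate parameterization reconstructed above (non-negativity enforced with softplus). Tensor shapes and names are assumptions; the model consumes precomputed per-domain (mu, delta) features for anchor/partner/non-partner triples.

```python
import torch
import torch.nn.functional as F

class SoftGatedScorer(torch.nn.Module):
    """Per-domain mix of similarity and complementarity with non-negative gates."""
    def __init__(self, n_domains: int):
        super().__init__()
        self.w = torch.nn.Parameter(torch.zeros(n_domains))        # domain weights
        self.g_sim = torch.nn.Parameter(torch.zeros(n_domains))    # similarity gate
        self.g_comp = torch.nn.Parameter(torch.zeros(n_domains))   # complementarity gate

    def forward(self, mu: torch.Tensor, delta: torch.Tensor) -> torch.Tensor:
        # mu, delta: (batch, n_domains); delta normalized to [0, 1]
        k = F.softplus(self.g_sim) * (1 - delta) + F.softplus(self.g_comp) * delta
        return (F.softplus(self.w) * mu * k).sum(dim=-1)

def bpr_loss(model, mu_pos, delta_pos, mu_neg, delta_neg):
    """Pairwise ranking loss over (anchor, partner, non-partner) triples."""
    diff = model(mu_pos, delta_pos) - model(mu_neg, delta_neg)
    return -F.logsigmoid(diff).mean()
```

Training would iterate `bpr_loss` over minibatches of triples with a standard optimizer; after convergence the softplus-transformed gates are inspected to label each domain as above.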
Evolutionary Model (Discrete search)
Here each domain chooses a discrete mode $\mathrm{mode}_d \in \{\text{similarity},\ \text{complementarity},\ \text{off}\}$ and kernel parameters $\theta_d$ that are searched rather than learned by gradient.
The evolutionary search maximizes an identification‑focused objective (a held‑out ranking criterion such as Hit@K) over generations, using elitism, domain‑wise crossover, and small mutations in the kernel parameters $\theta_d$ to obtain an interpretable configuration.
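A sketch of the discrete search loop, assuming a fitness callable (e.g., validation Hit@K) is supplied; the population size, mutation rates, and config encoding are illustrative.

```python
import random
from typing import Callable

MODES = ["similarity", "complementarity", "off"]

def random_config(domains: list[str]) -> dict:
    return {d: {"mode": random.choice(MODES), "bandwidth": random.uniform(0.1, 2.0)}
            for d in domains}

def mutate(config: dict, sigma: float = 0.1) -> dict:
    """Small mutation: occasionally flip a mode, otherwise jitter a bandwidth."""
    child = {d: dict(v) for d, v in config.items()}
    d = random.choice(list(child))
    if random.random() < 0.3:
        child[d]["mode"] = random.choice(MODES)
    child[d]["bandwidth"] = max(0.05, child[d]["bandwidth"] + random.gauss(0, sigma))
    return child

def crossover(a: dict, b: dict) -> dict:
    """Domain-wise crossover: each domain inherits its block from one parent."""
    return {d: dict(random.choice([a, b])[d]) for d in a}

def evolve(domains: list[str], fitness: Callable[[dict], float],
           pop_size: int = 40, generations: int = 50, elite: int = 4) -> dict:
    pop = [random_config(domains) for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(pop, key=fitness, reverse=True)
        parents = ranked[:pop_size // 2]
        children = [mutate(crossover(random.choice(parents), random.choice(parents)))
                    for _ in range(pop_size - elite)]
        pop = ranked[:elite] + children           # elitism
    return max(pop, key=fitness)
```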
In the current pilot data, the soft‑gated model and ML refinement perform best on Hit@K and Mutual@K; the evolutionary model provides a complementary, fully discrete summary of domain modes.
Evaluation Protocol (Ranking‑Based)
For each observed couple, we score the true pair and $k$ sampled non‑partners (preregistered default for $k$). The primary metric is AUC via the Mann–Whitney U statistic with standard tie handling:

$\mathrm{AUC} = \frac{1}{n_{+} n_{-}} \sum_{i \in \mathcal{P}} \sum_{j \in \mathcal{N}} \Big( \mathbb{1}[s_i > s_j] + \tfrac{1}{2}\,\mathbb{1}[s_i = s_j] \Big),$

where $\mathcal{P}$ are true‑pair scores, $\mathcal{N}$ are non‑partner scores, $n_{+} = |\mathcal{P}|$, and $n_{-} = |\mathcal{N}|$.
We also report lift and a median‑rank proxy derived from the sampled distractors. Multiple preregistered variants (different $c$, $n_{\min}$, and aggregator) are compared transparently.
For the distance‑kernel models we additionally track Hit@$K$ (fraction of anchors whose true partner is ranked in the top $K$), Mutual@$K$ (fraction of true couples where both partners rank each other in the top $K$), and mean reciprocal rank (MRR) as complementary identification metrics.
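A sketch of the ranking metrics, assuming each anchor's candidate list contains the true partner plus k sampled non-partners; the tie-handled AUC mirrors the Mann-Whitney formulation above, and all function names are illustrative.

```python
import numpy as np

def auc_mann_whitney(pos_scores: np.ndarray, neg_scores: np.ndarray) -> float:
    """AUC with half-credit for ties, i.e. U / (n+ * n-)."""
    diff = pos_scores[:, None] - neg_scores[None, :]
    return float(((diff > 0).sum() + 0.5 * (diff == 0).sum()) / diff.size)

def rank_of_true(true_score: float, distractor_scores: np.ndarray) -> int:
    """1-based rank of the true partner among (true + distractors); ties count against us."""
    return int(1 + (distractor_scores >= true_score).sum())

def hit_at_k(ranks: list[int], k: int) -> float:
    return float(np.mean([r <= k for r in ranks]))

def mean_reciprocal_rank(ranks: list[int]) -> float:
    return float(np.mean([1.0 / r for r in ranks]))

def mutual_at_k(ranks_a: list[int], ranks_b: list[int], k: int) -> float:
    """Fraction of couples where *both* partners rank each other in their top K."""
    return float(np.mean([ra <= k and rb <= k for ra, rb in zip(ranks_a, ranks_b)]))
```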
Mathematical Properties & Computational Profile
- Range & truncation: $S_{A \leftarrow B}, S_{B \leftarrow A} \in [0,1]$, so $M_{AB} \in [0,1]$; after the penalty and floor at 0, $\text{FinalMatch}_{AB} \in [0,1]$.
- Monotonicity in acceptability: Expanding $\mathcal{A}^A_q$ or increasing the tolerance $t^A_q$ weakly increases $S_{A \leftarrow B}$ (and analogously for B).
- Scale invariance (per‑direction): Multiplying all weights $w^A_q$ by a positive constant leaves $S_{A \leftarrow B}$ unchanged (it is a ratio).
- Bottleneck effect: Geometric aggregation gives $M_{AB} = 0$ if either direction is 0; the arithmetic mean is used only for sensitivity.
- Penalty shape: $c / n_{AB}$ yields a diminishing penalty as overlap grows; the gate $n_{AB} \ge n_{\min}$ suppresses unstable scores.
- Complexity: Scoring one pair is $O(\bar n)$; evaluation with $k$ sampled non‑partners per couple is $O\big(C\,(k+1)\,\bar n\big)$, where $C$ is the number of couples and $\bar n$ the average overlap.
References
- TIPI (Ten‑Item Personality Inventory). Gosling, S. D., Rentfrow, P. J., & Swann, W. B. (2003). A very brief measure of the Big‑Five personality domains. Journal of Research in Personality, 37(6), 504–528.
- Schwartz values / PVQ‑21. Schwartz, S. H. (2012). An overview of the Schwartz theory of basic values. Online Readings in Psychology and Culture, 2(1). (PVQ‑21 used by the European Social Survey.)
- AUC and Mann–Whitney U. Mann, H. B., & Whitney, D. R. (1947). On a test of whether one of two random variables is stochastically larger than the other. Annals of Mathematical Statistics, 18(1), 50–60.
- Stable matching (DA). Gale, D., & Shapley, L. S. (1962). College admissions and the stability of marriage. American Mathematical Monthly, 69(1), 9–15.
- Strategy‑proofness (proposers). Dubins, L. E., & Freedman, D. A. (1981). Machiavelli and the Gale–Shapley algorithm. American Mathematical Monthly, 88(7), 485–494.
- Stability & incentives. Roth, A. E. (1982). The economics of matching: Stability and incentives. Mathematics of Operations Research, 7(4), 617–628.
- Probabilistic serial (random assignment). Bogomolnaia, A., & Moulin, H. (2001). A new solution to the random assignment problem. Journal of Economic Theory, 100(2), 295–328.
- Software (scikit‑learn). Pedregosa, F., et al. (2011). Scikit‑learn: Machine learning in Python. Journal of Machine Learning Research, 12, 2825–2830.
- Software (LIBLINEAR). Fan, R.‑E., Chang, K.‑W., Hsieh, C.‑J., Wang, X.‑R., & Lin, C.‑J. (2008). LIBLINEAR: A library for large linear classification. Journal of Machine Learning Research, 9, 1871–1874.
Notes: PVQ‑21 portrait wording is attributed to the ESS instrument; our app exposes official text when provided. We cite scikit‑learn and LIBLINEAR corresponding to the code paths used in tiger-alg/ml/learn_domain_weights.py.