Methods & Math
A precise specification of the baseline scorer, evaluation protocol, and the interpretable ML refinement used in this study.
Pipeline Overview
1. Invitation-only: each partner completes a 50-item questionnaire independently (TIPI, PVQ-21, Social, Communication/Friendship, Lifestyle, Priorities/Deal-breakers).
2. The baseline scorer computes FinalMatch for true partners and sampled non-partners; we evaluate AUC, lift, and a median-rank proxy.
3. Interpretable ML learns domain multipliers via 5-fold CV; we re-evaluate and report the change in AUC/lift.
Notation & Encoding
Let $Q$ be the set of items, with domain map $d\colon Q \to \mathcal{D}$. For a pair of participants $(A, B)$ and an item $q \in Q$, denote self‑answers by $a^A_q$ and $a^B_q$. Importance weights (OkCupid‑style) are encoded as non‑negative values $w^A_q, w^B_q \ge 0$, with $w = 0$ meaning the item is irrelevant to that respondent.
Likert items have scale size $K_q$, with answers $a \in \{1, \dots, K_q\}$. MC‑single items have option sets $O_q$, with answers $a \in O_q$.
Acceptability Maps (Likert vs. MC)
Each item records the set of partner answers A would accept. For Likert items, A selects a tolerance label (Same, ±1, ±2, Any) mapped to $t^A_q \in \{0, 1, 2, \infty\}$. The acceptability set is

$\mathcal{A}^A_q = \{\, a \in \{1, \dots, K_q\} : |a - a^A_q| \le t^A_q \,\}.$
For MC‑single items, A selects a subset $\mathcal{A}^A_q \subseteq O_q$ (defaulting to A's own choice $\{a^A_q\}$ if unspecified). The directed indicator is $\mathbb{1}[\, a^B_q \in \mathcal{A}^A_q \,]$.
(MC‑multi items are retained for analysis; when placed in a domain with base multiplier 0, they do not affect the score.)
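A minimal sketch of the two acceptability maps, assuming Likert answers are stored as integers and MC‑single answers as option labels; the helper names (`likert_accepts`, `mc_accepts`) and the tolerance label strings are illustrative, not the app's actual API.

```python
import math

# Tolerance labels for Likert items mapped to a numeric radius ("Any" is unbounded).
LIKERT_TOLERANCE = {"same": 0, "pm1": 1, "pm2": 2, "any": math.inf}

def likert_accepts(own_answer: int, partner_answer: int, tolerance_label: str) -> bool:
    """Directed indicator for a Likert item: does A accept B's answer?"""
    return abs(partner_answer - own_answer) <= LIKERT_TOLERANCE[tolerance_label]

def mc_accepts(own_answer: str, partner_answer: str, accepted: set[str] | None) -> bool:
    """Directed indicator for an MC-single item.
    If A left the acceptability set unspecified, default to A's own choice."""
    accepted_set = accepted if accepted else {own_answer}
    return partner_answer in accepted_set

# Example: A answered 4 on a 5-point Likert item with tolerance ±1,
# so B's answer of 5 is acceptable while 2 is not.
assert likert_accepts(4, 5, "pm1")
assert not likert_accepts(4, 2, "pm1")
```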
Per‑Domain Multipliers & User Priorities
Let $m_d$ be the base multiplier for domain $d \in \mathcal{D}$. Preregistered defaults:
- Values: 3.0, Communication: 2.0, Lifestyle: 2.0, Social: 1.5, Personality: 1.0, Friendship: 1.0
- Quality/IMC or unknown domains: 0.0 (excluded from scoring)
User priorities (Q47) map option labels → domains; if A prioritizes domain $d$, set $p^A_d$ to a fixed boost factor $> 1$, else $p^A_d = 1$. The effective per‑user multiplier is $m_d \cdot p^A_d$.
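The defaults and the priority boost reduce to a small lookup, as in the sketch below; the constant `PRIORITY_BOOST` is a placeholder, not a value taken from the preregistration.

```python
# Preregistered base multipliers m_d (Quality/IMC and unknown domains get 0.0).
BASE_MULTIPLIERS = {
    "Values": 3.0,
    "Communication": 2.0,
    "Lifestyle": 2.0,
    "Social": 1.5,
    "Personality": 1.0,
    "Friendship": 1.0,
}

PRIORITY_BOOST = 2.0  # placeholder for the preregistered boost factor p_d > 1

def effective_multiplier(domain: str, prioritized_domains: set[str]) -> float:
    """m_d * p_d^A: base multiplier times the user's priority boost (Q47)."""
    base = BASE_MULTIPLIERS.get(domain, 0.0)  # unknown domains are excluded
    boost = PRIORITY_BOOST if domain in prioritized_domains else 1.0
    return base * boost
```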
Core Scoring: Baseline Algorithm
Items contribute only if both partners answered and at least one side's weight is non‑zero. Define the contributing item set and overlap

$Q_{AB} = \{\, q \in Q : a^A_q, a^B_q \text{ answered},\ \max(w^A_q, w^B_q) > 0 \,\}, \qquad n_{AB} = |Q_{AB}|.$
Directional (A←B) satisfaction and its B←A analogue:

$S_{A \leftarrow B} = \dfrac{\sum_{q \in Q_{AB}} m_{d(q)}\, p^A_{d(q)}\, w^A_q\, \mathbb{1}[a^B_q \in \mathcal{A}^A_q]}{\sum_{q \in Q_{AB}} m_{d(q)}\, p^A_{d(q)}\, w^A_q}, \qquad S_{B \leftarrow A} \text{ defined symmetrically.}$
Mutuality (primary: geometric; sensitivity: arithmetic):

$M^{\text{geo}}_{AB} = \sqrt{S_{A \leftarrow B}\, S_{B \leftarrow A}}, \qquad M^{\text{arith}}_{AB} = \tfrac{1}{2}\big(S_{A \leftarrow B} + S_{B \leftarrow A}\big).$
Apply a finite‑sample penalty with hyperparameter $c$ and a min‑overlap gate $n_{\min}$:

$\text{FinalMatch}_{AB} = \max\!\big(0,\ M_{AB} - c / n_{AB}\big),$

where $M_{AB}$ is the chosen aggregator and pairs with $n_{AB} < n_{\min}$ are excluded.
Preregistered grid: several values of $c$ and of $n_{\min}$, with both aggregators evaluated.
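A compact sketch of the baseline scorer as reconstructed above: per‑item weights are the effective domain multiplier times the importance weight, the two directional ratios are combined geometrically, and the $c/n$ penalty and overlap gate are applied. The dataclass fields, function names, and default hyperparameters are illustrative, not the repository's API.

```python
import math
from dataclasses import dataclass

@dataclass
class ItemResponse:
    domain: str
    importance: float          # w_q >= 0; 0 means "irrelevant"
    accepts_partner: bool      # 1[a_q^B in A_q^A], precomputed per direction

def directional_satisfaction(items: list[ItemResponse],
                             multiplier: dict[str, float]) -> float:
    """S_{A<-B}: weighted fraction of A's contributing items satisfied by B."""
    num = den = 0.0
    for it in items:
        w = multiplier.get(it.domain, 0.0) * it.importance
        num += w * it.accepts_partner
        den += w
    return num / den if den > 0 else 0.0

def final_match(s_ab: float, s_ba: float, n_overlap: int,
                c: float = 1.0, n_min: int = 10) -> float | None:
    """Geometric mutuality with a c/n penalty; None if the overlap gate fails.
    c and n_min are grid hyperparameters, not the preregistered values."""
    if n_overlap < n_min:
        return None
    mutual = math.sqrt(s_ab * s_ba)
    return max(0.0, mutual - c / n_overlap)
```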
Quick Reference: Scoring Formula
$\text{FinalMatch}_{AB} = \max\!\Big(0,\ \sqrt{S_{A \leftarrow B}\, S_{B \leftarrow A}} \;-\; \frac{c}{n_{AB}}\Big)$

IMC/Quality items carry base weight 0 and are excluded. Pairs with insufficient overlap ($n_{AB} < n_{\min}$) are filtered.
ML Refinement: Learned Domain Weights
Features are per‑domain mutual satisfactions (no penalty), normalized to $[0,1]$:

$x_d = \sqrt{S^{(d)}_{A \leftarrow B}\, S^{(d)}_{B \leftarrow A}}, \qquad d \in \mathcal{D},$

where $S^{(d)}$ restricts the sums to items with $d(q) = d$.
We fit L2‑regularized logistic regression with 5‑fold stratified cross‑validation (liblinear solver, fixed regularization strength $C$). The objective is

$\min_{\beta_0,\, \beta}\ \tfrac{1}{2}\|\beta\|_2^2 + C \sum_{i} \log\!\big(1 + e^{-y_i(\beta_0 + \beta^\top x_i)}\big).$
Foldwise coefficients are averaged; we map them to positive, normalized multipliers (for interpretability and drop‑in use):

$\hat m_d = \frac{\max(\bar\beta_d, 0)}{\sum_{d'} \max(\bar\beta_{d'}, 0)},$

where $\bar\beta_d$ is the coefficient for domain $d$ averaged across folds.
We report cross‑validated AUC as a measure of separability. (Negative sampling yields a contrastive dataset of positives vs. sampled non‑partners; stratification mitigates class‑imbalance variance.)
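A sketch of the refinement step with scikit-learn, matching the description above (liblinear solver, 5-fold stratified CV, fold-averaged coefficients mapped to non-negative normalized multipliers). Variable names and the clipping/normalization details are assumptions; the study's code path is tiger-alg/ml/learn_domain_weights.py.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold
from sklearn.metrics import roc_auc_score

def learn_domain_multipliers(X: np.ndarray, y: np.ndarray, domains: list[str],
                             C: float = 1.0, seed: int = 0):
    """X: per-domain mutual satisfactions in [0, 1], one row per scored pair.
    y: 1 for true couples, 0 for sampled non-partners."""
    skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)
    coefs, aucs = [], []
    for train_idx, test_idx in skf.split(X, y):
        clf = LogisticRegression(penalty="l2", C=C, solver="liblinear")
        clf.fit(X[train_idx], y[train_idx])
        coefs.append(clf.coef_.ravel())
        aucs.append(roc_auc_score(y[test_idx], clf.decision_function(X[test_idx])))
    mean_coef = np.mean(coefs, axis=0)
    positive = np.clip(mean_coef, 0.0, None)          # drop negative weights
    multipliers = positive / positive.sum() if positive.sum() > 0 else positive
    return dict(zip(domains, multipliers)), float(np.mean(aucs))
```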
Advanced Models: Distance‑Kernel Family
PAM, soft‑gated, and evolutionary variants that incorporate trait distance
Downstream ML models operate on per‑domain mutual acceptability $\mu_d \in [0,1]$ (derived from a probabilistic acceptability model, PAM), a trait distance $\delta_d$ computed from normalized self‑reports, and an optional reliability weight $r_d$. All distance‑kernel variants used in this study share the same core score:

$\text{Score}(A, B) = \sum_{d \in \mathcal{D}} w_d\, r_d\, \mu_d^{\alpha_d}\, k_d(\delta_d),$

where $w_d$ are domain weights, $\alpha_d$ exponents (default 1), and $k_d(\cdot)$ is a distance kernel chosen per domain:
Baseline / PAM (Similarity‑only)
At the domain level we drop distance and use logit‑transformed mutuality:

$x_d = \operatorname{logit}\!\big(\tilde{\mu}_d\big), \qquad \tilde{\mu}_d = \min\!\big(\max(\mu_d, \epsilon),\ 1 - \epsilon\big),$

with $\operatorname{logit}(p) = \log\frac{p}{1-p}$ and a small clipping constant $\epsilon$. PAM refines $\mu_d$ using directional lower‑confidence bounds before mutual aggregation.
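A sketch of the similarity-only feature; the directional lower-confidence bound is illustrated here with a Wilson lower bound, which is one possible choice rather than the bound PAM necessarily uses.

```python
import math

def logit(p: float, eps: float = 1e-3) -> float:
    """Clipped logit transform of a mutuality in [0, 1]."""
    p = min(max(p, eps), 1.0 - eps)
    return math.log(p / (1.0 - p))

def wilson_lower(successes: int, n: int, z: float = 1.0) -> float:
    """One possible directional lower-confidence bound on an acceptance rate."""
    if n == 0:
        return 0.0
    p = successes / n
    denom = 1.0 + z * z / n
    centre = p + z * z / (2 * n)
    margin = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    return (centre - margin) / denom

def pam_feature(sat_ab: tuple[int, int], sat_ba: tuple[int, int]) -> float:
    """Directional LCBs, geometric mutual aggregation, then the logit feature x_d."""
    lcb_ab = wilson_lower(*sat_ab)   # (accepted items, answered items) for A<-B
    lcb_ba = wilson_lower(*sat_ba)
    return logit(math.sqrt(lcb_ab * lcb_ba))
```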
Soft‑Gated Model (Learns similarity vs complementarity)
Each domain mixes similarity and complementarity with non‑negative gates:

$k_d(\delta_d) = g^{\text{sim}}_d\,(1 - \delta_d) + g^{\text{comp}}_d\,\delta_d, \qquad g^{\text{sim}}_d,\ g^{\text{comp}}_d \ge 0,$

with $\delta_d$ normalized to $[0,1]$.
We learn the domain weights and gates by minimizing a pairwise ranking (BPR) loss over triples (anchor $i$, partner $j^{+}$, non‑partner $j^{-}$):

$\mathcal{L}_{\text{BPR}} = -\sum_{(i,\, j^{+},\, j^{-})} \log \sigma\!\big(\mathrm{Score}(i, j^{+}) - \mathrm{Score}(i, j^{-})\big),$

where $\sigma$ is the logistic sigmoid.
After training we infer each domain as similarity ($g^{\text{sim}}_d \gg g^{\text{comp}}_d$), complementarity ($g^{\text{comp}}_d \gg g^{\text{sim}}_d$), or irrelevant (both gates near 0).
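A minimal PyTorch sketch of the soft-gated score and its BPR objective, under the gate parameterization reconstructed above (non-negativity enforced with softplus). Tensor shapes and names are assumptions; the model consumes precomputed per-domain (mu, delta) features for anchor/partner/non-partner triples.

```python
import torch
import torch.nn.functional as F

class SoftGatedScorer(torch.nn.Module):
    """Per-domain mix of similarity and complementarity with non-negative gates."""
    def __init__(self, n_domains: int):
        super().__init__()
        self.w = torch.nn.Parameter(torch.zeros(n_domains))        # domain weights
        self.g_sim = torch.nn.Parameter(torch.zeros(n_domains))    # similarity gate
        self.g_comp = torch.nn.Parameter(torch.zeros(n_domains))   # complementarity gate

    def forward(self, mu: torch.Tensor, delta: torch.Tensor) -> torch.Tensor:
        # mu, delta: (batch, n_domains); delta normalized to [0, 1]
        k = F.softplus(self.g_sim) * (1 - delta) + F.softplus(self.g_comp) * delta
        return (F.softplus(self.w) * mu * k).sum(dim=-1)

def bpr_loss(model, mu_pos, delta_pos, mu_neg, delta_neg):
    """Pairwise ranking loss over (anchor, partner, non-partner) triples."""
    diff = model(mu_pos, delta_pos) - model(mu_neg, delta_neg)
    return -F.logsigmoid(diff).mean()
```

Training would iterate `bpr_loss` over minibatches of triples with a standard optimizer; after convergence the softplus-transformed gates are inspected to label each domain as above.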
Evolutionary Model (Discrete search)
Here each domain chooses a discrete mode $\mathrm{mode}_d \in \{\text{similarity},\ \text{complementarity},\ \text{off}\}$ and kernel parameters $\theta_d$ that are searched rather than learned by gradient.
The evolutionary search maximizes an identification‑focused objective (a held‑out ranking criterion such as Hit@K) over generations, using elitism, domain‑wise crossover, and small mutations in the kernel parameters $\theta_d$ to obtain an interpretable configuration.
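A sketch of the discrete search loop, assuming a fitness callable (e.g., validation Hit@K) is supplied; the population size, mutation rates, and config encoding are illustrative.

```python
import random
from typing import Callable

MODES = ["similarity", "complementarity", "off"]

def random_config(domains: list[str]) -> dict:
    return {d: {"mode": random.choice(MODES), "bandwidth": random.uniform(0.1, 2.0)}
            for d in domains}

def mutate(config: dict, sigma: float = 0.1) -> dict:
    """Small mutation: occasionally flip a mode, otherwise jitter a bandwidth."""
    child = {d: dict(v) for d, v in config.items()}
    d = random.choice(list(child))
    if random.random() < 0.3:
        child[d]["mode"] = random.choice(MODES)
    child[d]["bandwidth"] = max(0.05, child[d]["bandwidth"] + random.gauss(0, sigma))
    return child

def crossover(a: dict, b: dict) -> dict:
    """Domain-wise crossover: each domain inherits its block from one parent."""
    return {d: dict(random.choice([a, b])[d]) for d in a}

def evolve(domains: list[str], fitness: Callable[[dict], float],
           pop_size: int = 40, generations: int = 50, elite: int = 4) -> dict:
    pop = [random_config(domains) for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(pop, key=fitness, reverse=True)
        parents = ranked[:pop_size // 2]
        children = [mutate(crossover(random.choice(parents), random.choice(parents)))
                    for _ in range(pop_size - elite)]
        pop = ranked[:elite] + children           # elitism
    return max(pop, key=fitness)
```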
In the current pilot data, the soft‑gated model and ML refinement perform best on Hit@K and Mutual@K; the evolutionary model provides a complementary, fully discrete summary of domain modes.
Evaluation Protocol (Ranking‑Based)
For each observed couple, we score the true pair and $k$ sampled non‑partners (preregistered default for $k$). The primary metric is AUC via the Mann–Whitney U statistic with standard tie handling:

$\mathrm{AUC} = \frac{1}{n_{+} n_{-}} \sum_{i \in \mathcal{P}} \sum_{j \in \mathcal{N}} \Big( \mathbb{1}[s_i > s_j] + \tfrac{1}{2}\,\mathbb{1}[s_i = s_j] \Big),$

where $\mathcal{P}$ are true‑pair scores, $\mathcal{N}$ are non‑partner scores, $n_{+} = |\mathcal{P}|$, and $n_{-} = |\mathcal{N}|$.
We also report lift and a median‑rank proxy derived from the sampled distractors. Multiple preregistered variants (different $c$, $n_{\min}$, and aggregator) are compared transparently.
For the distance‑kernel models we additionally track Hit@$K$ (fraction of anchors whose true partner is ranked in the top $K$), Mutual@$K$ (fraction of true couples where both partners rank each other in the top $K$), and mean reciprocal rank (MRR) as complementary identification metrics.
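A sketch of the ranking metrics, assuming each anchor's candidate list contains the true partner plus k sampled non-partners; the tie-handled AUC mirrors the Mann-Whitney formulation above, and all function names are illustrative.

```python
import numpy as np

def auc_mann_whitney(pos_scores: np.ndarray, neg_scores: np.ndarray) -> float:
    """AUC with half-credit for ties, i.e. U / (n+ * n-)."""
    diff = pos_scores[:, None] - neg_scores[None, :]
    return float(((diff > 0).sum() + 0.5 * (diff == 0).sum()) / diff.size)

def rank_of_true(true_score: float, distractor_scores: np.ndarray) -> int:
    """1-based rank of the true partner among (true + distractors); ties count against us."""
    return int(1 + (distractor_scores >= true_score).sum())

def hit_at_k(ranks: list[int], k: int) -> float:
    return float(np.mean([r <= k for r in ranks]))

def mean_reciprocal_rank(ranks: list[int]) -> float:
    return float(np.mean([1.0 / r for r in ranks]))

def mutual_at_k(ranks_a: list[int], ranks_b: list[int], k: int) -> float:
    """Fraction of couples where *both* partners rank each other in their top K."""
    return float(np.mean([ra <= k and rb <= k for ra, rb in zip(ranks_a, ranks_b)]))
```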
Mathematical Properties & Computational Profile
- Range & truncation: $S_{A \leftarrow B}, S_{B \leftarrow A} \in [0,1]$, so $M_{AB} \in [0,1]$; after the penalty and floor at 0, $\text{FinalMatch}_{AB} \in [0,1]$.
- Monotonicity in acceptability: Expanding $\mathcal{A}^A_q$ or increasing the tolerance $t^A_q$ weakly increases $S_{A \leftarrow B}$ (and analogously for B).
- Scale invariance (per‑direction): Multiplying all weights $w^A_q$ by a positive constant leaves $S_{A \leftarrow B}$ unchanged (it is a ratio).
- Bottleneck effect: Geometric aggregation gives $M_{AB} = 0$ if either direction is 0; the arithmetic mean is used only for sensitivity.
- Penalty shape: $c / n_{AB}$ yields a diminishing penalty as overlap grows; the gate $n_{AB} \ge n_{\min}$ suppresses unstable scores.
- Complexity: Scoring one pair is $O(\bar n)$; evaluation with $k$ sampled non‑partners per couple is $O\big(C\,(k+1)\,\bar n\big)$, where $C$ is the number of couples and $\bar n$ the average overlap.
References
- TIPI (Ten‑Item Personality Inventory). Gosling, S. D., Rentfrow, P. J., & Swann, W. B. (2003). A very brief measure of the Big‑Five personality domains. Journal of Research in Personality, 37(6), 504–528.
- Schwartz values / PVQ‑21. Schwartz, S. H. (2012). An overview of the Schwartz theory of basic values. Online Readings in Psychology and Culture, 2(1). (PVQ‑21 used by the European Social Survey.)
- AUC and Mann–Whitney U. Mann, H. B., & Whitney, D. R. (1947). On a test of whether one of two random variables is stochastically larger than the other. Annals of Mathematical Statistics, 18(1), 50–60.
- Stable matching (DA). Gale, D., & Shapley, L. S. (1962). College admissions and the stability of marriage. American Mathematical Monthly, 69(1), 9–15.
- Strategy‑proofness (proposers). Dubins, L. E., & Freedman, D. A. (1981). Machiavelli and the Gale–Shapley algorithm. American Mathematical Monthly, 88(7), 485–494.
- Stability & incentives. Roth, A. E. (1982). The economics of matching: Stability and incentives. Mathematics of Operations Research, 7(4), 617–628.
- Probabilistic serial (random assignment). Bogomolnaia, A., & Moulin, H. (2001). A new solution to the random assignment problem. Journal of Economic Theory, 100(2), 295–328.
- Software (scikit‑learn). Pedregosa, F., et al. (2011). Scikit‑learn: Machine learning in Python. Journal of Machine Learning Research, 12, 2825–2830.
- Software (LIBLINEAR). Fan, R.‑E., Chang, K.‑W., Hsieh, C.‑J., Wang, X.‑R., & Lin, C.‑J. (2008). LIBLINEAR: A library for large linear classification. Journal of Machine Learning Research, 9, 1871–1874.
Notes: PVQ‑21 portrait wording is attributed to the ESS instrument; our app exposes official text when provided. We cite scikit‑learn and LIBLINEAR corresponding to the code paths used in tiger-alg/ml/learn_domain_weights.py.