TigerMatch
Overview

Methods & Math

A precise specification of the baseline scorer, evaluation protocol, and the interpretable ML refinement used in this study.

Baseline & PAM scorers
Directed acceptability with importance and per-domain multipliers, plus a probabilistic acceptability model (PAM) that adds uncertainty-aware mutuality.
Distance-kernel ML models
Soft-gated and evolutionary domain-mode models over mutual acceptability and trait distance; each domain chooses similarity, complementarity, or irrelevance.
Evaluation metrics
True partners vs sampled non-partners; Hit@K, Mutual@K, MRR, and AUC via Mann–Whitney U, plus simple lift and median-rank diagnostics.

Pipeline Overview

  1. Invitation-only: each partner completes a 50-item questionnaire independently (TIPI, PVQ-21, Social, Communication/Friendship, Lifestyle, Priorities/Deal-breakers).
  2. Baseline scorer computes FinalMatch for true partners and sampled non-partners; we evaluate AUC, lift, and a median-rank proxy.
  3. Interpretable ML learns domain multipliers via 5-fold CV; we re-evaluate and report the change in AUC/lift.

Notation & Encoding

Let $\mathcal{I}$ be the set of items, with domain map $d:\mathcal{I}\to\mathcal{D}$. For a pair of participants $A,B$ and an item $i\in\mathcal{I}$, denote self‑answers by $a_i^A$ and $a_i^B$. Importance weights (OkCupid‑style) are encoded as

$$ w_i^A \in \{0,\,1,\,10,\,50,\,250\} \quad\text{and}\quad w_i^B \in \{0,\,1,\,10,\,50,\,250\}. $$

Likert items have scale size $K_i\in\{5,6,7\}$ with $a_i^\cdot \in \{1,\dots,K_i\}$. MC‑single items have $M_i$ options with values $\{1,\dots,M_i\}$.

Acceptability Maps (Likert vs. MC)

Each item records the set of partner answers A would accept. For Likert items, A selects a tolerance label (Same, ±1, ±2, Any) mapped to $\tau_i^A \in \{0,1,2,\infty\}$. The acceptability set is

$$ \mathcal{A}_i^A = \begin{cases} \{\, b \in \{1,\dots,K_i\} : |\,b - a_i^A\,| \le \tau_i^A \,\}, & \tau_i^A < \infty\\[2mm] \{1,\dots,K_i\}, & \tau_i^A = \infty \end{cases} $$

For MC‑single items, A selects $\Omega_i^A \subseteq \{1,\dots,M_i\}$ (defaulting to A's own choice if unspecified). The directed indicator is

$$ I_i^A(B) = \begin{cases} \mathbf{1}\!\left[\, a_i^B \in \Omega_i^A \,\right], & \text{MC\_SINGLE}\\[2mm] \mathbf{1}\!\left[\, a_i^B \in \mathcal{A}_i^A \,\right], & \text{Likert} \end{cases} $$

(MC‑multi items are retained for analysis; when placed in a domain with base multiplier 0, they do not affect the score.)
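The directed indicator can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the study's implementation: the function names `likert_accepts` and `mc_accepts` are hypothetical, and "unspecified" is assumed to mean that no accepted set was supplied (`None`).

```python
import math
from typing import Optional, Set


def likert_accepts(own_answer: int, partner_answer: int,
                   tolerance: float, scale_size: int) -> bool:
    """Return True if partner_answer lies in A's Likert acceptability set.

    tolerance is 0, 1, 2, or math.inf (the "Any" label).
    """
    if not 1 <= partner_answer <= scale_size:
        raise ValueError("partner answer is outside the item's scale")
    if math.isinf(tolerance):
        return True  # "Any": every scale point is acceptable
    return abs(partner_answer - own_answer) <= tolerance


def mc_accepts(own_answer: int, partner_answer: int,
               accepted: Optional[Set[int]] = None) -> bool:
    """Return True if partner_answer is in A's accepted option set.

    Assumption: when no set is given, it defaults to A's own choice.
    """
    if accepted is None:
        accepted = {own_answer}
    return partner_answer in accepted
```

For example, on a 7-point item with A's answer 4 and tolerance ±1, a partner answer of 5 is accepted and 6 is not.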

Per‑Domain Multipliers & User Priorities

Let $m_d$ be the base multiplier for domain $d\in\mathcal{D}$. Preregistered defaults:

  • Values: 3.0, Communication: 2.0, Lifestyle: 2.0, Social: 1.5, Personality: 1.0, Friendship: 1.0
  • Quality/IMC or unknown domains: 0.0 (excluded from scoring)

User priorities (Q47) map option labels → domains; if A prioritizes $d$, set $p_d^A = 1.5$, else $p_d^A = 1$.

$$ W_i^A = w_i^A \cdot m_{\,d(i)} \cdot p_{\,d(i)}^A, \qquad W_i^B = w_i^B \cdot m_{\,d(i)} \cdot p_{\,d(i)}^B. $$
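As a small illustration of the effective weight, the sketch below multiplies importance, base multiplier, and priority boost. The dictionary and function names are hypothetical; the multiplier values are the preregistered defaults listed above, and any domain missing from the table is treated as weight 0.

```python
# Preregistered base multipliers; domains not listed (Quality/IMC, unknown) get 0.0.
BASE_MULTIPLIER = {
    "Values": 3.0,
    "Communication": 2.0,
    "Lifestyle": 2.0,
    "Social": 1.5,
    "Personality": 1.0,
    "Friendship": 1.0,
}
IMPORTANCE_LEVELS = (0, 1, 10, 50, 250)


def effective_weight(importance: int, domain: str, prioritized: set) -> float:
    """Compute W_i = w_i * m_d(i) * p_d(i) for one participant and one item."""
    if importance not in IMPORTANCE_LEVELS:
        raise ValueError(f"unexpected importance weight: {importance}")
    m = BASE_MULTIPLIER.get(domain, 0.0)       # excluded domains score 0
    p = 1.5 if domain in prioritized else 1.0  # user-priority boost (Q47)
    return importance * m * p
```

With importance 10 on a Values item that the participant prioritizes, this gives 10 · 3.0 · 1.5 = 45.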

Core Scoring: Baseline Algorithm

Items contribute only if both answered and at least one side's weight is non‑zero. Define

$$ \mathcal{I}_{AB} = \bigl\{\, i\in\mathcal{I} : a_i^A\neq\varnothing,\; a_i^B\neq\varnothing,\; W_i^A>0 \text{ or } W_i^B>0 \,\bigr\},\quad n = |\mathcal{I}_{AB}|. $$

Directional (A←B) satisfaction and its B←A analogue:

$$ s_A(B) = \frac{\sum_{i\in\mathcal{I}_{AB}} W_i^A \, I_i^A(B)}{\sum_{i\in\mathcal{I}_{AB}} W_i^A}, \qquad s_B(A) = \frac{\sum_{i\in\mathcal{I}_{AB}} W_i^B \, I_i^B(A)}{\sum_{i\in\mathcal{I}_{AB}} W_i^B}. $$

Mutuality (primary: geometric; sensitivity: arithmetic):

$$ \text{Match}_{\mathrm{geo}} = 100\,\sqrt{s_A(B)\,s_B(A)},\qquad \text{Match}_{\mathrm{mean}} = 100\cdot \tfrac{1}{2}\bigl(s_A(B)+s_B(A)\bigr). $$

Apply a finite‑sample penalty with hyperparameter $c>0$ and a min‑overlap gate $n_{\min}$:

$$ \mathrm{FinalMatch} = \begin{cases} \max\!\bigl(0,\; \text{Match}_{\star} - \dfrac{c}{\sqrt{n}} \bigr), & n \ge n_{\min} \\[3mm] \text{undefined (filtered)}, & n < n_{\min} \end{cases} \quad \text{ where } \star\in\{\mathrm{geo},\mathrm{mean}\}. $$

Preregistered grid: $c\in\{60,100,140\}$ and $n_{\min}\in\{15,20,25\}$, with both aggregators evaluated.
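The scoring steps above can be sketched end to end as follows. This is an illustrative sketch, not the project's code: the tuple layout of an item and the name `final_match` are assumptions, every item passed in is assumed to have been answered by both partners, and a direction whose total weight is zero is assumed to score 0 (the specification does not define that case).

```python
import math
from typing import Optional, Sequence, Tuple

# One item: (W_A, W_B, I_A_of_B, I_B_of_A). Layout is illustrative only.
Item = Tuple[float, float, int, int]


def final_match(items: Sequence[Item], c: float = 100.0, n_min: int = 20,
                aggregator: str = "geo") -> Optional[float]:
    """Return FinalMatch in [0, 100], or None when the pair is filtered."""
    # Keep items where at least one side's effective weight is non-zero.
    kept = [it for it in items if it[0] > 0 or it[1] > 0]
    n = len(kept)
    if n < n_min:
        return None  # min-overlap gate: score undefined

    sum_wa = sum(it[0] for it in kept)
    sum_wb = sum(it[1] for it in kept)
    # Assumption: a direction with zero total weight scores 0.
    s_a = sum(it[0] * it[2] for it in kept) / sum_wa if sum_wa > 0 else 0.0
    s_b = sum(it[1] * it[3] for it in kept) / sum_wb if sum_wb > 0 else 0.0

    if aggregator == "geo":
        match = 100.0 * math.sqrt(s_a * s_b)
    elif aggregator == "mean":
        match = 100.0 * 0.5 * (s_a + s_b)
    else:
        raise ValueError("aggregator must be 'geo' or 'mean'")

    # Finite-sample penalty, floored at 0.
    return max(0.0, match - c / math.sqrt(n))
```

As a worked case: with 25 equally weighted items, A accepting all of B's answers and B accepting 16 of A's, the directional scores are 1 and 0.64, the geometric match is 80, and with c = 100 the penalty is 100/√25 = 20, giving a final score of 60.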

Quick Reference: Scoring Formula

$$ s_A(B) = \frac{\sum_{i\in\mathcal I_{AB}} W_i^A \cdot I_i^A(B)}{\sum_{i\in\mathcal I_{AB}} W_i^A}, \qquad W_i^A = w_i^A\, m_{d(i)}\, p_{d(i)}^A $$
$$ \text{FinalMatch} = \max\!\Bigl(0,\; 100\,\sqrt{s_A(B)\, s_B(A)} - \frac{c}{\sqrt{n}}\Bigr) $$

IMC/Quality items carry base weight 0 and are excluded. Pairs with insufficient overlap are filtered ($n<n_{\min}$).

ML Refinement: Learned Domain Weights

Features are per‑domain mutual satisfactions (no penalty), normalized to [0,1]:

$$ x_{AB}^{(d)} = \frac{1}{100}\,\mathrm{Match}^{(d)}_{\mathrm{geo}} = \sqrt{ s_A^{(d)}(B)\, s_B^{(d)}(A) } \,,\quad d\in\mathcal{D}. $$

We fit L2‑regularized logistic regression with 5‑fold stratified cross‑validation (liblinear solver, $C=1.0$). With labels $y_j\in\{-1,+1\}$, the objective is

$$ \min_{\beta\in\mathbb{R}^{|\mathcal{D}|}} \sum_{j=1}^N \log\!\bigl(1 + \exp\{-y_j\,\beta^\top x_j\}\bigr) + \lambda \,\|\beta\|_2^2, \qquad \lambda=\tfrac{1}{2C}, $$

where $\lambda = 1/(2C)$ follows from scikit‑learn's parameterization $\tfrac{1}{2}\|\beta\|_2^2 + C\sum_j \log(\cdot)$.

Foldwise coefficients are averaged; we map them to positive, normalized multipliers (for interpretability and drop‑in use):

$$ m_d^{\mathrm{learned}} = \frac{\max(\bar\beta_d,\varepsilon)}{\sum_{k\in\mathcal{D}}\max(\bar\beta_k,\varepsilon)}\,,\quad \varepsilon=10^{-3}. $$

We report cross‑validated AUC as a measure of separability. (Negative sampling yields a contrastive dataset of positives vs. sampled non‑partners; stratification mitigates class‑imbalance variance.)
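The coefficient-to-multiplier mapping is simple enough to show in plain Python. The sketch below covers only that post-processing step, not the model fit; the function name and the per-fold dictionary layout are assumptions made for illustration.

```python
from typing import Dict, List


def coefficients_to_multipliers(fold_coefs: List[Dict[str, float]],
                                eps: float = 1e-3) -> Dict[str, float]:
    """Average per-fold coefficients, floor them at eps, and normalize to sum 1."""
    domains = list(fold_coefs[0])
    # Foldwise average of each domain's coefficient.
    mean = {d: sum(f[d] for f in fold_coefs) / len(fold_coefs) for d in domains}
    # Flooring at eps keeps every multiplier strictly positive.
    floored = {d: max(mean[d], eps) for d in domains}
    total = sum(floored.values())
    return {d: floored[d] / total for d in domains}
```

A domain whose averaged coefficient is negative ends up with a multiplier close to zero rather than a negative one, which is what makes the result usable as a drop-in replacement for the base multipliers.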

Advanced Models: Distance‑Kernel Family

PAM, soft‑gated, and evolutionary variants that incorporate trait distance

Downstream ML models operate on per‑domain mutual acceptability $m_d(A,B)\in[0,1]$ (derived from a probabilistic acceptability model, PAM), a trait distance $\Delta_d(A,B)\in[0,1]$ from normalized self‑reports, and an optional reliability weight $r_{A,B,d}\in(0,1]$. All distance‑kernel variants used in this study share the same core score:

$$ S(A,B;\Theta) \;=\; \sum_{d} w_d\,[m_d(A,B)]^{\alpha_d}\,k_d\!\bigl(\Delta_d(A,B);\mu_d,\sigma_d\bigr)^{\beta_d}\,r_{A,B,d}, $$

where $w_d\ge 0$ are domain weights, $\alpha_d,\beta_d\ge 0$ exponents (default 1), and $k_d$ is a distance kernel chosen per domain:

$$ k_{\mathrm{sim}}(\Delta) = \exp\!\bigl(-(\Delta/\sigma)^2\bigr), \quad k_{\mathrm{comp}}(\Delta) = \exp\!\Bigl(-\frac{(\Delta-\mu)^2}{2\sigma^2}\Bigr), \quad k_{\mathrm{irr}}(\Delta)\equiv 1. $$
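A minimal sketch of the three kernels and the shared core score follows. The dictionary keys used to describe a domain term (`w`, `m`, `delta`, `mode`, and so on) are hypothetical names chosen for this example; exponents and the reliability weight default to 1.

```python
import math
from typing import Dict, Iterable


def k_sim(delta: float, sigma: float) -> float:
    """Similarity kernel: peaks at zero trait distance."""
    return math.exp(-(delta / sigma) ** 2)


def k_comp(delta: float, mu: float, sigma: float) -> float:
    """Complementarity kernel: peaks at a preferred distance mu."""
    return math.exp(-((delta - mu) ** 2) / (2 * sigma ** 2))


def core_score(terms: Iterable[Dict]) -> float:
    """Sum w * m^alpha * k(delta)^beta * r over domain terms."""
    total = 0.0
    for t in terms:
        mode = t.get("mode", "irr")
        if mode == "sim":
            k = k_sim(t["delta"], t["sigma"])
        elif mode == "comp":
            k = k_comp(t["delta"], t["mu"], t["sigma"])
        elif mode == "irr":
            k = 1.0  # distance is ignored for this domain
        else:
            raise ValueError(f"unknown kernel mode: {mode}")
        alpha = t.get("alpha", 1.0)
        beta = t.get("beta", 1.0)
        r = t.get("r", 1.0)  # optional reliability weight
        total += t["w"] * t["m"] ** alpha * k ** beta * r
    return total
```

Note that the similarity kernel equals 1 at Δ = 0, while the complementarity kernel equals 1 at Δ = μ, so the two differ only in where the preferred distance sits and in how σ scales the width.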

Baseline / PAM (Similarity‑only)

At the domain level we drop distance and use logit‑transformed mutuality:

$$ S_{\mathrm{base}}(A,B) \;=\; \sum_d w_d\,\operatorname{logit}\!\bigl(m_d(A,B)\bigr), $$

with $\alpha_d=1,\ \beta_d=0$ and $k_d\equiv 1$. PAM refines $m_d(A,B)$ using directional lower‑confidence bounds before mutual aggregation.

Soft‑Gated Model (Learns similarity vs complementarity)

Each domain mixes similarity and complementarity with non‑negative gates:

$$ S_d(A,B) = w_d\, m_d(A,B)\,\bigl(a_d\,k_{\mathrm{sim}}(\Delta_d) + b_d\,k_{\mathrm{comp}}(\Delta_d)\bigr), $$
$$ S_{\mathrm{soft}}(A,B)=\sum_d S_d(A,B). $$

We learn $w_d,a_d,b_d\ge 0$ by minimizing a pairwise ranking (BPR) loss over triples (anchor $A$, partner $P$, non‑partner $N$):

$$ \ell(A,P,N) = \log\!\bigl(1 + \exp\bigl(-(S_{\mathrm{soft}}(A,P)-S_{\mathrm{soft}}(A,N))\bigr)\bigr) + \lambda_1\!\sum_d |w_d| + \lambda_{\mathrm{excl}}\!\sum_d a_d b_d. $$

After training we infer each domain as similarity ($a_d\gg b_d$), complementarity ($b_d\gg a_d$), or irrelevant ($w_d\approx 0$).
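To make the loss concrete, the sketch below evaluates the soft‑gated score and the loss for a single triple; it does not perform the optimization. Several things here are assumptions rather than facts from the study: the function names, the data layout, the regularization strengths `lam1` and `lam_excl`, and the use of one shared `mu` and `sigma` for all domains.

```python
import math
from typing import Dict, Tuple


def k_sim(delta: float, sigma: float) -> float:
    return math.exp(-(delta / sigma) ** 2)


def k_comp(delta: float, mu: float, sigma: float) -> float:
    return math.exp(-((delta - mu) ** 2) / (2 * sigma ** 2))


def soft_score(pair: Dict[str, Tuple[float, float]],
               params: Dict[str, Tuple[float, float, float]],
               mu: float = 0.5, sigma: float = 0.25) -> float:
    """pair[d] = (m_d, delta_d); params[d] = (w_d, a_d, b_d)."""
    total = 0.0
    for d, (w, a, b) in params.items():
        m, delta = pair[d]
        total += w * m * (a * k_sim(delta, sigma) + b * k_comp(delta, mu, sigma))
    return total


def bpr_term(s_pos: float, s_neg: float) -> float:
    """log(1 + exp(-(s_pos - s_neg))), computed without overflow."""
    x = -(s_pos - s_neg)
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def triple_loss(partner_pair, non_partner_pair, params,
                lam1: float = 0.01, lam_excl: float = 0.1) -> float:
    """Ranking term plus L1 on weights plus the a*b exclusivity penalty."""
    ranking = bpr_term(soft_score(partner_pair, params),
                       soft_score(non_partner_pair, params))
    l1 = lam1 * sum(abs(w) for w, _, _ in params.values())
    excl = lam_excl * sum(a * b for _, a, b in params.values())
    return ranking + l1 + excl
```

The exclusivity term is zero whenever a domain uses only one gate, which is what pushes each domain toward a clean similarity or complementarity reading.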

Evolutionary Model (Discrete search)

Here each domain chooses a discrete mode and kernel parameters that are searched rather than learned by gradient:

$$ z_d \in \{\mathrm{sim},\mathrm{comp},\mathrm{irr}\},\; w_d\in[0,5],\; \mu_d\in\{0.25,0.50,0.75\},\; \sigma_d\in\{0.15,0.25,0.35\}. $$
$$ k_d(\Delta_d) = \begin{cases} 1, & z_d=\mathrm{irr}\\[2pt] k_{\mathrm{sim}}(\Delta_d), & z_d=\mathrm{sim}\\[2pt] k_{\mathrm{comp}}(\Delta_d), & z_d=\mathrm{comp} \end{cases} \quad\Longrightarrow\quad S_{\mathrm{evo}}(A,B) = \sum_d w_d\, m_d(A,B)\,k_d\!\bigl(\Delta_d(A,B)\bigr). $$

The evolutionary search maximizes an identification‑focused objective over generations:

$$ J = 3\cdot \mathrm{Hit@1} + 2\cdot \mathrm{Mutual@}K + \mathrm{Hit@}K + 0.5\cdot \mathrm{AUC}, $$

using elitism, domain‑wise crossover, and small mutations in $z_d,w_d,\mu_d,\sigma_d$ to obtain an interpretable configuration.
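The objective and one possible mutation operator can be sketched as below. The objective weights and the search grids come from the text; everything else is an assumption made for illustration, including the function names, the configuration layout, the choice to mutate one field of one domain per step, and the step size `w_step`.

```python
import random
from typing import Dict

MODES = ("sim", "comp", "irr")
MU_GRID = (0.25, 0.50, 0.75)
SIGMA_GRID = (0.15, 0.25, 0.35)


def objective(hit1: float, mutual_k: float, hit_k: float, auc: float) -> float:
    """J = 3*Hit@1 + 2*Mutual@K + Hit@K + 0.5*AUC."""
    return 3.0 * hit1 + 2.0 * mutual_k + hit_k + 0.5 * auc


def mutate(config: Dict[str, Dict], rng: random.Random,
           w_step: float = 0.25) -> Dict[str, Dict]:
    """Return a copy of config with one field of one domain perturbed.

    Hypothetical operator: the source only says mutations are "small".
    """
    child = {d: dict(genes) for d, genes in config.items()}
    d = rng.choice(sorted(child))
    field = rng.choice(("z", "w", "mu", "sigma"))
    if field == "z":
        child[d]["z"] = rng.choice(MODES)
    elif field == "w":
        nudged = child[d]["w"] + rng.uniform(-w_step, w_step)
        child[d]["w"] = min(5.0, max(0.0, nudged))  # keep w in [0, 5]
    elif field == "mu":
        child[d]["mu"] = rng.choice(MU_GRID)
    else:
        child[d]["sigma"] = rng.choice(SIGMA_GRID)
    return child
```

Because Hit@1 carries the largest weight, a configuration that places the true partner first for more anchors is favored over one that only improves AUC.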

In the current pilot data, the soft‑gated model and ML refinement perform best on Hit@K and Mutual@K; the evolutionary model provides a complementary, fully discrete summary of domain modes.

Evaluation Protocol (Ranking‑Based)

For each observed couple, we score the true pair and $k$ sampled non‑partners (default $k=20$). The primary metric is AUC via the Mann–Whitney U statistic with standard tie handling:

$$ \mathrm{AUC} = \frac{U}{N_+ N_-}, \qquad U = \sum_{x\in\text{true}} \sum_{y\in\text{non}} \Bigl( \mathbf{1}[\, x>y \,] + \tfrac{1}{2}\mathbf{1}[\, x=y \,] \Bigr). $$

We also report lift $\Delta = \mathbb{E}[S\,|\,Y{=}1]-\mathbb{E}[S\,|\,Y{=}0]$ and a median‑rank proxy derived from the sampled distractors. Multiple preregistered variants (different $c$, $n_{\min}$, aggregator) are compared transparently.

For the distance‑kernel models we additionally track $\mathrm{Hit@}K$ (fraction of anchors whose true partner is ranked in the top $K$), $\mathrm{Mutual@}K$ (fraction of true couples where both rank each other in the top $K$), and mean reciprocal rank $\mathrm{MRR} = \tfrac{1}{N}\sum_i 1/\mathrm{rank}_i$ as complementary identification metrics.
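The ranking metrics above can be computed directly from scores, as in this sketch. The function names are hypothetical, and the tie rule in `rank_of_true` (a distractor that ties the true partner is counted as ranked ahead) is an assumption, since the text specifies tie handling only for AUC.

```python
from typing import Sequence


def auc_mann_whitney(pos: Sequence[float], neg: Sequence[float]) -> float:
    """AUC = U / (N+ * N-), counting ties as one half."""
    u = 0.0
    for x in pos:
        for y in neg:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    return u / (len(pos) * len(neg))


def rank_of_true(true_score: float, distractors: Sequence[float]) -> int:
    """1-based rank of the true partner among sampled non-partners.

    Assumption: ties are resolved pessimistically.
    """
    return 1 + sum(1 for s in distractors if s >= true_score)


def hit_at_k(ranks: Sequence[int], k: int) -> float:
    """Fraction of anchors whose true partner is ranked in the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)


def mrr(ranks: Sequence[int]) -> float:
    """Mean reciprocal rank."""
    return sum(1.0 / r for r in ranks) / len(ranks)
```

The double loop in `auc_mann_whitney` is quadratic in the number of scores, which is fine for pilot‑sized data; a rank‑based formulation would be the usual choice at larger scale.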

Mathematical Properties & Computational Profile

  • Range & truncation: $s_A,s_B\in[0,1]$, so $\text{Match}_{\mathrm{geo}}\in[0,100]$; $\mathrm{FinalMatch}\in[0,100]$ after penalty and floor at 0.
  • Monotonicity in acceptability: Expanding $\mathcal{A}_i^A$ or $\Omega_i^A$ weakly increases $s_A(B)$ (and analogously for B).
  • Scale invariance (per‑direction): Multiplying all $W_i^A$ by a positive constant leaves $s_A(B)$ unchanged, because the constant cancels in the ratio.
  • Bottleneck effect: Geometric aggregation gives $\text{Match}_{\mathrm{geo}}=0$ if either direction is 0; arithmetic mean is used only for sensitivity.
  • Penalty shape: $c/\sqrt{n}$ yields diminishing penalty as overlap grows; the gate $n_{\min}$ suppresses unstable scores.
  • Complexity: Scoring one pair is $\mathcal{O}(|\mathcal{I}_{AB}|)$; evaluation with $k$ sampled non‑partners per couple is $\mathcal{O}(C\cdot (k{+}1)\cdot\bar{I})$, where $C$ is the number of couples and $\bar{I}$ is the average overlap.

References

  1. TIPI (Ten‑Item Personality Inventory). Gosling, S. D., Rentfrow, P. J., & Swann, W. B. (2003). A very brief measure of the Big‑Five personality domains. Journal of Research in Personality, 37(6), 504–528.
  2. Schwartz values / PVQ‑21. Schwartz, S. H. (2012). An overview of the Schwartz theory of basic values. Online Readings in Psychology and Culture, 2(1). (PVQ‑21 used by the European Social Survey.)
  3. AUC and Mann–Whitney U. Mann, H. B., & Whitney, D. R. (1947). On a test of whether one of two random variables is stochastically larger than the other. Annals of Mathematical Statistics, 18(1), 50–60.
  4. Stable matching (DA). Gale, D., & Shapley, L. S. (1962). College admissions and the stability of marriage. American Mathematical Monthly, 69(1), 9–15.
  5. Strategy‑proofness (proposers). Dubins, L. E., & Freedman, D. A. (1981). Machiavelli and the Gale–Shapley algorithm. American Mathematical Monthly, 88(7), 485–494.
  6. Stability & incentives. Roth, A. E. (1982). The economics of matching: Stability and incentives. Mathematics of Operations Research, 7(4), 617–628.
  7. Probabilistic serial (random assignment). Bogomolnaia, A., & Moulin, H. (2001). A new solution to the random assignment problem. Journal of Economic Theory, 100(2), 295–328.
  8. Software (scikit‑learn). Pedregosa, F., et al. (2011). Scikit‑learn: Machine learning in Python. Journal of Machine Learning Research, 12, 2825–2830.
  9. Software (LIBLINEAR). Fan, R.‑E., Chang, K.‑W., Hsieh, C.‑J., Wang, X.‑R., & Lin, C.‑J. (2008). LIBLINEAR: A library for large linear classification. Journal of Machine Learning Research, 9, 1871–1874.

Notes: PVQ‑21 portrait wording is attributed to the ESS instrument; our app exposes official text when provided. We cite scikit‑learn and LIBLINEAR corresponding to the code paths used in tiger-alg/ml/learn_domain_weights.py.