This framework identifies Fannie Mae and Freddie Mac single-family mortgage loans entering a configurable expiration window, then cross-references borrower identity data against the SSA Death Master File and aggregated obituary sources using cosine similarity on embedding vectors. The output surfaces loans where the primary borrower may be deceased — enabling servicers, investors, and compliance teams to take appropriate action before maturity.
$7.1T
Fannie Mae + Freddie Mac combined (2025)
~$430B
loans maturing within rolling 12 months
~3.0M
Death Master File new records/year
≥ 0.88
fuzzy identity match confidence cutoff
Matching Pipeline
Six-stage identity matching workflow
Loans Fetched
feeds into: Borrower Profile Build
GSE Loan Fetch
Step 01 of 6
Full Description
Pull Fannie Mae and Freddie Mac single-family MBS loan-level disclosure files.
Filter to loans where maturity_date falls within the configurable expiration window (default: 12 months).
Fannie Mae: monthly CAS/MBS disclosure files.
Freddie Mac: quarterly STACR/ACIS files.
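The maturity filter described above can be sketched as follows. This is a minimal illustration, not the framework's actual loader: it assumes maturity dates are compared at month granularity, and the names `month_index`, `in_expiration_window`, and the sample loan dictionaries are hypothetical.

```python
from datetime import date

def month_index(d: date) -> int:
    """Map a date to a running month count so months can be subtracted."""
    return d.year * 12 + d.month

def in_expiration_window(maturity_date: date, as_of: date,
                         window_months: int = 12) -> bool:
    """True if the loan matures between the as-of month and the window end."""
    delta = month_index(maturity_date) - month_index(as_of)
    return 0 <= delta <= window_months

# Made-up sample loans, for illustration only.
loans = [
    {"loan_id": "A", "maturity_date": date(2026, 11, 1)},
    {"loan_id": "B", "maturity_date": date(2031, 4, 1)},
    {"loan_id": "C", "maturity_date": date(2025, 12, 1)},
]
as_of = date(2026, 1, 15)
expiring = [loan for loan in loans
            if in_expiration_window(loan["maturity_date"], as_of)]
```

Here only loan A falls inside the default 12-month window; loan B matures too late and loan C has already matured.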
Profile Strings
feeds into: Death Record Ingestion
Borrower Profile Build
Step 02 of 6
Full Description
Construct a normalized borrower profile string per loan: full name + approximate birth year (derived from origination age field) + state + origination year.
This string becomes the embedding anchor.
SSN is never included.
The origination age field provides a proxy birth year within ±1 year accuracy.
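The proxy birth year is simple arithmetic: origination year minus age at origination, with a one-year band either side because the age field does not say whether the birthday had passed. A minimal sketch, with hypothetical function names:

```python
def approx_birth_year(origination_year: int, origination_age: int) -> int:
    """Proxy birth year: the year of origination minus the borrower's age then."""
    return origination_year - origination_age

def birth_year_range(origination_year: int, origination_age: int,
                     tolerance: int = 1) -> tuple[int, int]:
    """Inclusive (low, high) birth-year band reflecting the ±1 year accuracy."""
    center = approx_birth_year(origination_year, origination_age)
    return (center - tolerance, center + tolerance)
```

For example, a borrower aged 62 at a 2009 origination gets a proxy birth year of 1947 and a band of 1946 to 1948.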
Death Records
feeds into: Embedding Vectorization
Death Record Ingestion
Step 03 of 6
Full Description
Load SSA Death Master File (DMF) and supplemental obituary feeds (Obituaries.com, Legacy.com, local newspaper scrapes).
Each record normalized to: full name + birth date + death date + state of death + source.
Records merged on name+DOB key to deduplicate across sources.
~3M new DMF entries per year.
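The name+DOB deduplication can be sketched as below. The normalization rule (lowercase, strip punctuation, collapse whitespace) and the record field names are assumptions made for illustration; the source does not specify them.

```python
import re

def normalize_name(name: str) -> str:
    """Lowercase, drop non-letter characters, and collapse whitespace."""
    cleaned = re.sub(r"[^a-z\s]", "", name.lower())
    return " ".join(cleaned.split())

def merge_death_records(records: list[dict]) -> list[dict]:
    """Keep one record per (normalized name, birth date) key, tracking sources."""
    merged: dict[tuple[str, str], dict] = {}
    for rec in records:
        key = (normalize_name(rec["full_name"]), rec["birth_date"])
        if key not in merged:
            # First sighting: copy the record and start a list of sources.
            merged[key] = {**rec, "sources": [rec["source"]]}
        elif rec["source"] not in merged[key]["sources"]:
            # Duplicate from another feed: remember the extra source only.
            merged[key]["sources"].append(rec["source"])
    return list(merged.values())

# Made-up records, for illustration only.
records = [
    {"full_name": "Jane Q. Sample", "birth_date": "1947-03-02", "source": "DMF"},
    {"full_name": "JANE Q  SAMPLE", "birth_date": "1947-03-02", "source": "Legacy.com"},
    {"full_name": "John Sample", "birth_date": "1950-07-19", "source": "DMF"},
]
deduped = merge_death_records(records)
```

The two spellings of the first name collapse to one record carrying both sources, so `deduped` holds two entries.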
Vectors
feeds into: Cosine Similarity Match
Embedding Vectorization
Step 04 of 6
Full Description
Run both borrower profile strings and death record strings through the Jina-v5 embedding model (1024-dim).
Embed the full concatenated identity string for each record.
Store vectors as float arrays in pgvector or in a standalone HNSW index.
Both corpora must be in the same embedding space, so use the same model version and tokenizer for both.
Threshold
feeds into: Ranked Match Output
Cosine Similarity Match
Step 05 of 6
Full Description
For each expiring loan, compute cosine similarity against the death record pool using approximate nearest neighbor search (an HNSW index, either standalone or inside pgvector).
Candidate pairs scoring at or above the 0.88 threshold are returned.
Pre-filter death records by state and approximate age range to reduce the search space.
The threshold is configurable; 0.88 balances recall against false-positive rate.
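The pre-filter and threshold logic can be sketched with an exhaustive scan. Note that this swaps in a brute-force loop for the approximate nearest neighbor search named above, purely to keep the example self-contained. The field names, the `max_age_delta` default of 3 years, and the two-dimensional toy vectors are assumptions.

```python
import math

def cosine_sim(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x ** 2 for x in a))
    norm_b = math.sqrt(sum(x ** 2 for x in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def find_candidates(loan: dict, death_records: list[dict],
                    threshold: float = 0.88,
                    max_age_delta: int = 3) -> list[tuple[str, float]]:
    """Return (record_id, score) pairs at or above the threshold, best first."""
    candidates = []
    for rec in death_records:
        # Pre-filter: same state and a plausible birth year, before scoring.
        if rec["state"] != loan["state"]:
            continue
        if abs(rec["birth_year"] - loan["birth_year"]) > max_age_delta:
            continue
        score = cosine_sim(loan["vector"], rec["vector"])
        if score >= threshold:
            candidates.append((rec["record_id"], score))
    return sorted(candidates, key=lambda pair: pair[1], reverse=True)

# Toy 2-dimensional vectors stand in for real 1024-dimensional embeddings.
loan = {"state": "OH", "birth_year": 1947, "vector": [1.0, 0.0]}
death_records = [
    {"record_id": "r1", "state": "OH", "birth_year": 1947, "vector": [0.9, 0.1]},
    {"record_id": "r2", "state": "FL", "birth_year": 1947, "vector": [1.0, 0.0]},
    {"record_id": "r3", "state": "OH", "birth_year": 1930, "vector": [1.0, 0.0]},
    {"record_id": "r4", "state": "OH", "birth_year": 1948, "vector": [0.5, 0.5]},
]
matches = find_candidates(loan, death_records)
```

Only `r1` survives: `r2` fails the state filter, `r3` fails the age filter, and `r4` scores below the threshold.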
Tiers
pipeline complete
Ranked Match Output
Step 06 of 6
Full Description
Return results sorted by match confidence descending.
Each result includes: loan metadata (ID, GSE, maturity, UPB, state), death record metadata (obit source, dates), cosine score, age delta (years), state match flag, and composite confidence tier (Strong / Probable / Possible / None).
Output routes to servicer review workflow.
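One possible shape for a result row and the descending sort is sketched below. The field names and types are assumptions derived from the list above; the source does not define a schema.

```python
from dataclasses import dataclass

@dataclass
class MatchResult:
    # Loan metadata
    loan_id: str
    gse: str
    maturity: str
    upb: float
    state: str
    # Death record metadata
    obit_source: str
    death_date: str
    # Match metrics
    cosine_score: float
    age_delta_years: int
    state_match: bool
    tier: str

def rank_results(results: list[MatchResult]) -> list[MatchResult]:
    """Sort by match confidence, highest cosine score first."""
    return sorted(results, key=lambda r: r.cosine_score, reverse=True)
```

`sorted` is stable, so rows with equal scores keep their input order.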
Embedding Strategy
Cosine similarity via Jina-v5 (1024-dim)
Both borrower profiles and death records are encoded as a single normalized identity string before embedding. This ensures the vector captures the semantic meaning of the full identity, not just the name. Matching follows the same approach used in the Panopticon Polymarket prediction market pipeline.
Borrower Profile String
"[FIRST] [LAST] born approx [YEAR] state [ST]
mortgage origination [ORIG_YEAR] loan maturity [MAT_YEAR]"
Birth year is derived from the origination age field in the GSE disclosure. SSN is never included.
Death Record String
"[FIRST] [LAST] born [DOB_YEAR] died [DOD_YEAR]
state [ST] source [OBIT_SOURCE]"
Death records are normalized before embedding. DMF and obituary sources are merged on a name+DOB key.
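The two templates above translate directly into string builders. This is a sketch under one assumption: the line break in each template is treated as a single space. The function names are hypothetical.

```python
def borrower_profile_string(first: str, last: str, birth_year: int,
                            state: str, orig_year: int, mat_year: int) -> str:
    """Fill the borrower template; the SSN is deliberately not a parameter."""
    return (f"{first} {last} born approx {birth_year} state {state} "
            f"mortgage origination {orig_year} loan maturity {mat_year}")

def death_record_string(first: str, last: str, dob_year: int,
                        dod_year: int, state: str, source: str) -> str:
    """Fill the death record template with normalized fields."""
    return (f"{first} {last} born {dob_year} died {dod_year} "
            f"state {state} source {source}")

# Made-up identity, for illustration only.
profile = borrower_profile_string("JANE", "SAMPLE", 1947, "OH", 2009, 2026)
record = death_record_string("JANE", "SAMPLE", 1947, 2024, "OH", "Legacy.com")
```

Both strings share the name, birth year, and state tokens, which is what lets an embedding model place a true match close together.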
Cosine Similarity Function
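import math

def cosine_sim(a: list[float], b: list[float]) -> float:
    # Dot product over the paired components of both vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x ** 2 for x in a))
    norm_b = math.sqrt(sum(x ** 2 for x in b))
    # Return 0.0 for an all-zero vector instead of dividing by zero.
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

Match Score Distribution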
Simulated — 1,000 loan pool vs death record corpus
Age Delta vs. Cosine Similarity
Candidate matches — age difference (yrs) vs match score
Match Confidence Tiers
Action thresholds and workflow routing
Strong Match
≥ 0.92
Name + DOB delta ≤1yr + state match. High probability of deceased borrower. Requires immediate servicer review.
Probable Match
0.88 – 0.91
Name + DOB delta ≤3yr or state mismatch. Likely the same individual but requires secondary verification.
Possible Match
0.82 – 0.87
Partial name overlap or DOB uncertainty. Common name may produce false positives. Manual review required.
No Match
< 0.82
Below similarity threshold; borrower identity not found in death records.
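One way to turn the tiers above into code is sketched below. Two choices here are assumptions, not stated in the source: the score bands are treated as half-open so that a value such as 0.915 still lands in a tier, and a score of 0.92 or higher that fails the age or state condition is downgraded to Probable.

```python
def confidence_tier(score: float, age_delta_years: int,
                    state_match: bool) -> str:
    """Map a cosine score plus corroborating signals to a confidence tier."""
    if score >= 0.92:
        # Strong also requires a DOB delta of at most 1 year and a state match.
        if age_delta_years <= 1 and state_match:
            return "Strong"
        return "Probable"  # assumed downgrade when corroboration is missing
    if score >= 0.88:
        return "Probable"
    if score >= 0.82:
        return "Possible"
    return "None"
```

Applied to the illustrative rows in the sample results table, this yields Strong for 0.95 with a 0-year delta, Probable for 0.91 and 0.88, and Possible for 0.85.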
Data Sources
GSE loan disclosures, death registries, and obituary feeds
Fannie Mae MBS Disclosure
GSE Loan Data
Fields: Loan sequence, maturity date, origination age, state, UPB, LTV, purpose
Public (monthly CAS/MBS disclosure files)
Freddie Mac Loan Performance
GSE Loan Data
Fields: Loan ID, scheduled maturity, originator, UPB, borrower credit score, MSA
Public (quarterly disclosure, STACR/ACIS files)
SSA Death Master File (DMF)
Federal Death Registry
Fields: First/last name, SSN (last 4), DOB, DOD, state of death
NTIS subscription; limited public file available
Obituaries.com / Legacy.com
Obituary Aggregator
Fields: Full name, dates, city/state, family mentions, funeral home
Web scrape / API (terms permitting)
Local Newspaper Obit Archives
Obituary Source
Fields: Name, age, city, family tree, employer history
Scrape; ~4,500 US newspapers indexed
FindAGrave / BillionGraves
Burial Registry
Fields: Full name, birth/death dates, burial state, family links
Public API (rate-limited)
Sample Match Results
Illustrative output (borrower data anonymized)

| Loan ID | GSE | Maturity | UPB | St | Borrower (anonymized) | Obit / DMF Match | Score | Age Δ | Tier |
|---|---|---|---|---|---|---|---|---|---|
| FM-2009-000441 | Fannie Mae | Nov 2026 | $184,200 | OH | Robert E. [REDACTED], ~age 79 | Legacy.com — Robert E. *** of Columbus, OH, died Sep 2024, age 79 | 0.95 | 0 | Strong |
| FM-2011-008872 | Freddie Mac | Feb 2027 | $221,500 | FL | Dorothy M. [REDACTED], ~age 82 | FindAGrave — Dorothy M. *** of Sarasota, FL, died Jan 2025, age 83 | 0.91 | 1 | Probable |
| FM-2013-019004 | Fannie Mae | Apr 2027 | $308,900 | AZ | James W. [REDACTED], ~age 74 | Obituaries.com — James W. *** of Tucson, AZ, died Mar 2025, age 77 | 0.88 | 3 | Probable |
| FM-2010-031155 | Freddie Mac | Aug 2026 | $97,400 | PA | Margaret A. [REDACTED], ~age 71 | Pittsburgh Post-Gazette — Margaret A. *** of Bethel Park, died Jun 2024, age 74 | 0.85 | 3 | Possible |
All names redacted. Loan IDs, states, ages, and match scores are illustrative of the output schema.
Legal & Compliance Framework
FCRA, GLBA, and RESPA considerations
FCRA Compliance
The Fair Credit Reporting Act governs use of consumer credit information; death match outputs used in lending decisions are subject to FCRA requirements
Servicers acting on match results must follow adverse action notice procedures if any action affects the estate's credit standing
DMF data resale is restricted under NTIS license; internal-use matching is permissible with appropriate data handling agreements
GLBA / Data Security
Gramm-Leach-Bliley Act protects non-public personal information (NPI) of borrowers; death record cross-referencing must occur within a compliant data environment
Borrower name + loan data constitutes NPI; all matching must occur in an isolated, logged environment with role-based access
Embedding vectors derived from NPI are still regulated; treat derived vectors as sensitive data assets
RESPA / Servicer Obligations
RESPA §6 requires servicers to respond to Qualified Written Requests from successors in interest (e.g., surviving spouse or estate executor)
Upon confirmed death of a borrower, the servicer must work with the estate or successor — not accelerate the loan without due process
Confirmed deceased-borrower matches should trigger servicer notification workflow, not automated collection actions
Birdsong Capital Knowledge Base
Environment 16 — includes GSE, REIT, industrial RE, and finance entities