Namsor - The global leader in name origin, gender, and ethnicity analysis

Namsor is the world's most trusted solution for onomastics and name-based analytics. Using a specialized NLP engine and the most comprehensive name database available (7+ billion names, 22 alphabets), we accurately identify the origin, ethnicity, and gender behind any name.

Trusted by governments, international organizations, and the global research community, Namsor transforms names into actionable insights for scientific research, diaspora marketing, migration studies, the fight against bank fraud, and internal and external security.

Our mission: Empowering global insight through name-based intelligence

At Namsor, our mission is to unlock the hidden meaning behind names — turning them into powerful data points that reveal origin, gender, and ethnicity with unmatched precision. We believe that names matter: they carry cultural, historical, and geographic significance that can foster deeper understanding, more inclusive policies, and better-informed decisions.

By combining advanced Natural Language Processing (NLP) with the world's most comprehensive name database, we empower a wide range of stakeholders — from academic researchers to governments, NGOs, and corporations — to harness name-based intelligence.

Whether it's for migration and diaspora analysis, diversity analysis, or market localization, Namsor delivers the insights needed to make data truly global. Our technology is also a trusted asset in national and international security, Know Your Customer (KYC) compliance, and the fight against fraud and financial crime, where identifying the origin of names can enhance risk assessment and prevent illicit activities.

Our goal: make name-based data accessible, ethical, and impactful — for a safer, more inclusive, and better-connected world.

What we do: Empowering global insight through name-based intelligence

All the information that Namsor can estimate from a name.

Namsor provides advanced name-based analytics to identify the geographic origin, ethnicity, and gender behind any personal name, whether a first name, last name, or full name. Our specialized AI models analyze names in 22 writing systems, covering billions of individuals across the globe.

We help you:

🌍 Detect country or region of origin for names from any language or culture
🧬 Infer likely ethnicity and cultural background for demographic analysis
🚻 Determine gender based on first name, full name, or name pairings
🕵 Strengthen identity verification in KYC and AML workflows
🔐 Support national and global security efforts through origin-based intelligence
📊 Segment global audiences for marketing, CRM enrichment, and diaspora outreach
🎓 Advance research in sociology, migration, public health, political science, and more
🧾 Autocomplete fields in forms to streamline onboarding and boost completion rates
⚖️ Detect bias in AI systems for transparency and compliance with the EU AI Act

Whether you're working with international datasets, validating identities, or studying migration flows, Namsor turns names into actionable insight — accurately, ethically, and at scale.

Custom models built for your needs

We develop tailor-made AI models based on your data, region, and analytical goals. Below are real examples of custom solutions we've delivered for world-class organizations.

Social Research Modeling

We built a custom model for the World Bank and the International Organization for Migration (IOM) to estimate caste groupings from Indian names, enabling research on internal migration and social inequalities.

Name Transliteration Engine

We created a high-precision transliteration model for cross-border identity providers to convert Mandarin and Kanji names into Latin script, improving international interoperability.

Risk & Policy Intelligence

We developed a custom model for international risk analysts to support geopolitical analysis, compliance monitoring, and decision-making based on onomastic intelligence.

Why Namsor: The most accurate name origin and classification technology

Choosing the right solution for name-based analytics is critical when precision, scale, and trust matter. Unlike generic AI or basic indexation tools, Namsor delivers expert-level onomastic analysis — combining morphological understanding, linguistic context, and cultural nuances to reach unparalleled accuracy.

Verified by science. Trusted by institutions.

Namsor has contributed to over 600 scientific publications and collaborates with some of the world's most prestigious universities, international organizations, and global companies.

Following numerous audits and benchmarks by scientific, academic, and independent bodies, Namsor's name verification technology has been rated as the most accurate in the world.

These evaluations, performed on a wide range of Namsor functionalities, have demonstrated a significantly higher level of precision compared to other solutions:

Elsevier audit

Rated most accurate by Science-Metrix for Elsevier and the European Commission, Namsor was selected for the EU's official She Figures gender statistics.

Harvard & University of Chicago

Namsor was validated on 250,000 individuals from North Carolina's voter registry — showing high alignment between predicted and self-reported ethnicities.

Uber benchmark

A comparative study on U.S. race/ethnicity inference tools concluded that Namsor outperformed all alternatives.

Research Square benchmark

Evaluated on 90,000 researchers' names, Namsor demonstrated outstanding accuracy for origin and ethnicity prediction.

Research Done audit

An audit carried out by Research Done to validate the results obtained via Namsor.

"It is actually fascinating to see how close the estimation were to the real data, especially with key parts of the analysis."

Zack Kertcher, Research Done

Columbia University benchmark

Coming soon.

Namsor vs. other solutions: how we compare

Below is a comparison between Namsor and other common approaches like LLMs, basic databases, and general onomastic APIs:

	Namsor	Database comparison	Large Language Model (LLM)	Onomastic solutions
Accuracy
Language coverage
Names covered	99.99%	75% to 92% (depending on the solution)	80% to 95% (depending on the model)	99.99%
Distinguishes typos from cultural nuances	✓	✕	✕	⯁
Specialised onomastic analysis	Expert approach (morphology, context)	None (crude indexing)	Generic (not dedicated to names)	Partial
Taxonomy segmentation	Very high (origin, ethnicity, location, U.S. race/ethnicity, etc.)	Low (origin, location)	Low (poor understanding of different taxonomies)	Low (origin, location)
Data updates	Continuous (data & algorithms)	Sporadic	Low priority	Irregular
Speed of analysis per name (lower is better)	0.03 sec.	0.03 sec.	1 to 5 sec.	0.2 sec.
User-friendliness	Very high (API, CSV/Excel, SDK, Google Sheets, Make, n8n, Zapier)	Medium (API, CSV, SDK)	Very low (no dedicated API or tools)	Low (API only)
Privacy and anonymity	Very high (anonymisable data, machine learning can be deactivated)	Medium (no anonymisation of data)	Very low (data retention and compulsory machine learning)	Low (data retention)
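
To put the per-name speeds from the comparison above in perspective, the short Python sketch below converts them into total processing time for a dataset. It assumes names are processed strictly one after another, with no batching or parallelism, and the dataset size of one million names is a hypothetical figure chosen for illustration. The function name `sequential_hours` is ours and not part of any Namsor tool.

```python
# Per-name latencies in seconds, copied from the comparison table.
LATENCY_SEC = {
    "Namsor": 0.03,
    "Database comparison": 0.03,
    "LLM (best case)": 1.0,
    "LLM (worst case)": 5.0,
    "Onomastic solutions": 0.2,
}


def sequential_hours(n_names: int, latency_sec: float) -> float:
    """Hours needed to process n_names one at a time at a fixed latency."""
    return n_names * latency_sec / 3600


if __name__ == "__main__":
    n = 1_000_000  # hypothetical dataset size
    for approach, latency in LATENCY_SEC.items():
        print(f"{approach}: {sequential_hours(n, latency):.1f} h")
```

Under these assumptions, a latency of 0.03 seconds per name corresponds to roughly 8 hours for one million names, while 1 to 5 seconds per name corresponds to roughly 280 to 1,400 hours.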

Our technology: How we turn names into deep insight

Example of a basic morphological analysis of the Sharma surname.

At Namsor, we've built a unique artificial intelligence dedicated to the morphological and cultural analysis of names. Unlike generic AI or simple databases, our models are rooted in over a decade of expertise in natural language processing (NLP) and onomastics — the scientific study of names.

Our technology analyzes the structure, context, and cultural signals embedded in names to estimate their origin, ethnicity, gender, country of residence, or even sociocultural categories such as caste, and to transliterate names between scripts.

Our tools work not only with full names, but also with nicknames, brand names, or non-standard spellings — across thousands of naming conventions and 22 alphabets: Cyrillic, Georgian, Latin, Arabic, Devanagari, Bengali, Greek, Armenian, Thai, Hebrew, Kannada, Gujarati, Tamil, Hangul, Telugu, Gurmukhi, Oriya, Han (Chinese traditional and simplified characters, Kanji), Hiragana, Myanmar, Katakana and Malayalam.
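
As a small illustration of what working across writing systems involves, the Python sketch below guesses the dominant script of a name from the Unicode character names of its letters. This is a generic heuristic built on the standard `unicodedata` module, not Namsor's method, and the function name `dominant_script` is hypothetical. Han characters are reported under the label `CJK`, because that is the first word of their Unicode character names.

```python
import unicodedata
from collections import Counter


def dominant_script(name: str) -> str:
    """Return a rough script label such as 'LATIN' or 'CYRILLIC'.

    The label is the first word of each letter's Unicode character name,
    which is a heuristic and not a lookup of the Unicode Script property.
    """
    counts = Counter()
    for ch in name:
        # Count letters only; skip spaces, digits, punctuation, and marks.
        if unicodedata.category(ch).startswith("L"):
            counts[unicodedata.name(ch, "UNKNOWN").split()[0]] += 1
    if not counts:
        return "UNKNOWN"
    return counts.most_common(1)[0][0]


if __name__ == "__main__":
    for sample in ("Sharma", "Шарма", "शर्मा", "김민준"):
        print(sample, dominant_script(sample))
```

Knowing the script is only a first routing step; a name written in Latin letters can still come from almost any naming tradition, which is where morphological and cultural analysis comes in.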

We also design custom AI models to meet the specific challenges of our clients. Whether it's for a particular geography, industry, or data environment, we build bespoke solutions that go beyond our standard APIs.

Refined through global collaboration

What makes Namsor unique is not just its architecture, but the way it's continuously refined. Our AI evolves through active research partnerships and ongoing collaboration with top-tier institutions:

🎓 Universities: Harvard, Berkeley, Sciences Po, VNU, etc.
🧬 Scientific groups: Elsevier, Springer Nature, The Lancet, ASME, SSRN, etc.
🌍 International institutions: United Nations, European Commission, IOM, OECD, World Bank, etc.
🕵 Experts: historians, linguists, and anthropologists from around the world.

These collaborations help us integrate new knowledge, validate results, and maintain a high level of scientific rigor.

How our tools are designed

From assembling our data sets to continuous learning, each step is designed to optimize our estimations.

1. Large-scale data collection
After more than ten years of research, we've assembled a high-quality dataset made up of billions of names from diverse cultures, languages, and regions.
2. Onomastic model training
We train our AI using onomastics and integrate insights from linguistics, morphology, and cultural analysis.
3. Model validation
We evaluate multiple models in parallel using specialized metrics to identify edge cases, anomalies, and potential biases.
4. Continuous learning
Our team continuously refines these models and retains only the best-performing versions to ensure maximum reliability and accuracy.
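
The validation and selection steps above can be sketched in a few lines of Python. This is a minimal illustration under our own assumptions, not Namsor's pipeline: it uses plain accuracy as the metric, breaks it down by group to surface uneven performance, and keeps the candidate with the best overall score. The model names, groups, and labels are toy values made up for the example.

```python
from collections import defaultdict


def accuracy(predictions, labels):
    """Share of predictions that match the reference labels."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def accuracy_by_group(predictions, labels, groups):
    """Accuracy computed separately for each group (e.g. per script)."""
    hits, totals = defaultdict(int), defaultdict(int)
    for p, y, g in zip(predictions, labels, groups):
        totals[g] += 1
        hits[g] += int(p == y)
    return {g: hits[g] / totals[g] for g in totals}


def select_best(candidates, labels):
    """Name of the candidate model with the highest overall accuracy."""
    return max(candidates, key=lambda name: accuracy(candidates[name], labels))


# Toy validation set: illustrative values only, not real evaluation data.
labels = ["F", "M", "F", "M", "F", "M"]
groups = ["latin", "latin", "latin", "cyrillic", "cyrillic", "cyrillic"]
candidates = {
    "model_a": ["F", "M", "F", "F", "M", "M"],
    "model_b": ["F", "M", "M", "M", "F", "M"],
}

if __name__ == "__main__":
    for name, preds in candidates.items():
        print(name, accuracy(preds, labels), accuracy_by_group(preds, labels, groups))
    print("retained:", select_best(candidates, labels))
```

The per-group breakdown matters because a model can score well overall while performing poorly on one group, which is the kind of potential bias the validation step is meant to catch.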

Our commitments: Ethics & responsibility

At Namsor, we believe that powerful technology must come with a deep sense of responsibility. Our solutions are designed to deliver precise name-based insights while strictly respecting individual privacy, ethical standards, and regulatory frameworks such as the GDPR.

Control over your data

We give every user full control over how their data is used. By default, submitted data contributes to improving our models. You can easily disable this by turning off the “learnable” option via the API or dashboard. You can also activate the “anonymised” mode, which uses SHA hashing to anonymize the data, preserving privacy while still allowing fast, consistent processing.
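
To illustrate how a SHA digest can stand in for a name, here is a minimal Python sketch using the standard `hashlib` module. It is not Namsor's implementation: the choice of SHA-256, the normalization steps, the `|` separator, and the function name `name_digest` are all assumptions made for the example.

```python
import hashlib
import unicodedata


def name_digest(first_name: str, last_name: str) -> str:
    """Return a hex SHA-256 digest of a normalized full name.

    Normalization (NFC plus case folding) is an assumption of this sketch,
    so that spelling variants differing only in case map to the same digest.
    """
    normalized = unicodedata.normalize("NFC", f"{first_name}|{last_name}").casefold()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    print(name_digest("Ada", "Lovelace"))
```

Because the digest is deterministic, the same name always produces the same key, which is what makes consistent processing possible without storing the name itself.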

This level of transparency and flexibility is rare among AI providers, and a key reason why organizations choose Namsor for responsible data analysis.

Ethical design and GDPR compliance

Namsor only collects the minimum data required for model training and predictions. All data is handled legally and ethically, with a focus on data minimization, transparency, and proportionality. We strictly comply with GDPR and other international data protection laws.

Transparency at every level

To ensure full explainability, the Namsor API offers a special option: a detailed explanation of the AI reasoning behind each prediction. With one click, users can receive a breakdown (in Python logic) of how gender, origin, or ethnicity were estimated — making our decision process both auditable and trustworthy.

Supporting research and social impact

Namsor is actively used by researchers, NGOs, and institutions to support scientific studies, fraud prevention, fair hiring, and anti-discrimination efforts. Our tools help map global diversity responsibly, without reinforcing bias or stereotypes.

From gender equality initiatives to migration policy research, we're committed to advancing knowledge while protecting the dignity and privacy of every individual.
