Namsor – The global leader in name origin, gender, and ethnicity analysis
Namsor is the world's most trusted solution for onomastics and name-based analytics. Using a specialized NLP engine and the most comprehensive name database available (7+ billion names, 22 alphabets), we accurately identify the origin, ethnicity, and gender behind any name.
Trusted by governments, international organizations, and the global research community, Namsor transforms names into actionable insights for scientific research, diaspora marketing, migration studies, the fight against bank fraud, and internal and external security.
Our mission: Empowering global insight through name-based intelligence
At Namsor, our mission is to unlock the hidden meaning behind names — turning them into powerful data points that reveal origin, gender, and ethnicity with unmatched precision. We believe that names matter: they carry cultural, historical, and geographic significance that can foster deeper understanding, more inclusive policies, and better-informed decisions.
By combining advanced Natural Language Processing (NLP) with the world's most comprehensive name database, we empower a wide range of stakeholders — from academic researchers to governments, NGOs, and corporations — to harness name-based intelligence.
Whether it's for migration and diaspora analysis, diversity analysis, or market localization, Namsor delivers the insights needed to make data truly global. Our technology is also a trusted asset in national and international security, Know Your Customer (KYC) compliance, and the fight against fraud and financial crime, where identifying the origin of names can enhance risk assessment and prevent illicit activities.
Our goal: make name-based data accessible, ethical, and impactful — for a safer, more inclusive, and better-connected world.
What we do: Empowering global insight through name-based intelligence

Namsor provides advanced name-based analytics to identify the geographic origin, ethnicity, and gender behind any personal name, whether a first name, last name, or full name. Our specialized AI models analyze names in over 22 writing systems, covering billions of individuals across the globe.
We help you:
- 🌍 Detect country or region of origin for names from any language or culture
- 🧬 Infer likely ethnicity and cultural background for demographic analysis
- 🚻 Determine gender based on first name, full name, or name pairings
- 🕵 Strengthen identity verification in KYC and AML workflows
- 🔐 Support national and global security efforts through origin-based intelligence
- 📊 Segment global audiences for marketing, CRM enrichment, and diaspora outreach
- 🎓 Advance research in sociology, migration, public health, political science, and more
- 🧾 Autocomplete fields in forms to streamline onboarding and boost completion rates
- ⚖️ Detect bias in AI systems for transparency and compliance with the EU AI Act
Whether you're working with international datasets, validating identities, or studying migration flows, Namsor turns names into actionable insight — accurately, ethically, and at scale.
Custom models built for your needs
We develop tailor-made AI models based on your data, region, and analytical goals. Below are real examples of custom solutions we've delivered for world-class organizations.
Social Research Modeling
We built a custom model for the World Bank and the International Organization for Migration (IOM) to estimate caste groupings from Indian names — enabling research on internal migration and social inequalities.
Name Transliteration Engine
We created a high-precision transliteration model for cross-border identity providers to convert Mandarin Chinese and Japanese (Kanji) names into Latin script, improving international interoperability.
Risk & Policy Intelligence
We developed a custom model for international risk analysts to support geopolitical analysis, compliance monitoring, and decision-making based on onomastic intelligence.
Why Namsor: The most accurate name origin and classification technology
Choosing the right solution for name-based analytics is critical when precision, scale, and trust matter. Unlike generic AI or basic indexation tools, Namsor delivers expert-level onomastic analysis — combining morphological understanding, linguistic context, and cultural nuances to reach unparalleled accuracy.
Verified by science. Trusted by institutions.
Namsor has contributed to over 600 scientific publications and collaborates with some of the world's most prestigious universities, international organizations, and global companies.
Following numerous audits and benchmarks by scientific, academic, and independent bodies, Namsor's name verification technology has been rated as the most accurate in the world.
These evaluations, performed on a wide range of Namsor functionalities, have demonstrated a significantly higher level of precision compared to other solutions:

Elsevier audit
Rated most accurate by Science-Metrix for Elsevier and the European Commission, Namsor was selected for the EU's official She Figures gender statistics.
Science-Metrix report
Harvard & University of Chicago
Namsor was validated on 250,000 individuals from North Carolina's voter registry — showing high alignment between predicted and self-reported ethnicities.
Harvard/Chicago university study
Uber benchmark
A comparative study on U.S. race/ethnicity inference tools concluded that Namsor outperformed all alternatives.
Uber benchmark
Research Square benchmark
Evaluated on 90,000 researchers' names, Namsor demonstrated outstanding accuracy for origin and ethnicity prediction.
Research Square benchmark
Research Done audit
An audit carried out by Research Done to validate the results obtained via Namsor.
"It is actually fascinating to see how close the estimation were to the real data, especially with key parts of the analysis."
Columbia University benchmark
Coming soon.
Namsor vs. other solutions: how we compare
Below is a comparison between Namsor and other common approaches like LLMs, basic databases, and general onomastic APIs:
Feature | Namsor | Database comparison | Large Language Model (LLM) | Onomastic solutions |
---|---|---|---|---|
Accuracy | | | | |
Language coverage | | | | |
Names covered | 99.99% | 75% to 92% (depending on the solution) | 80% to 95% (depending on the model) | 99.99% |
Distinguishes typos from cultural nuances | ✓ | ✕ | ✕ | ⯁ |
Specialised onomastic analysis | Expert approach (morphology, context) | None (crude indexing) | Generic (not dedicated to names) | Partial |
Taxonomy segmentation | Very high (origin, ethnicity, location, U.S. race/ethnicity, etc.) | Low (origin, location) | Low (poor understanding of different taxonomies) | Low (origin, location) |
Data updates | Continuous (data & algorithms) | Sporadic | Not a priority | Irregular |
Speed of analysis per name (lower is better) | 0.01 sec. | 0.01 sec. | 1 sec. to 5 sec. | 0.03 sec. |
User-friendliness | Very high (API, CSV/Excel, SDK) | Very high (API, CSV/Excel, SDK) | Very low (no dedicated API or tools) | Low (API only) |
Privacy and anonymity | Very high (anonymisable data, machine learning can be deactivated) | Medium (no anonymisation of data) | Very low (data retention and compulsory machine learning) | Low (data retention) |
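To put the per-name speeds from the table in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes names are processed strictly one at a time, with no batching or parallelism, and uses an illustrative dataset of one million names. That dataset size and the helper `sequential_hours` are assumptions made for this example, not figures or functions from Namsor.

```python
# Per-name latencies in seconds (low, high), copied from the comparison table.
LATENCY_SEC = {
    "Namsor": (0.01, 0.01),
    "Database comparison": (0.01, 0.01),
    "Large Language Model (LLM)": (1.0, 5.0),
    "Onomastic solutions": (0.03, 0.03),
}

# Illustrative dataset size: an assumption, not a figure from the source.
N_NAMES = 1_000_000


def sequential_hours(n_names: int, latency_sec: float) -> float:
    """Hours needed to process n_names one at a time (no batching)."""
    return n_names * latency_sec / 3600


for tool, (low, high) in LATENCY_SEC.items():
    low_h = sequential_hours(N_NAMES, low)
    high_h = sequential_hours(N_NAMES, high)
    if low == high:
        print(f"{tool}: about {low_h:.1f} hours")
    else:
        print(f"{tool}: about {low_h:.1f} to {high_h:.1f} hours")
```

Real-world throughput depends on batching, concurrency, and network overhead, so these numbers only show the order of magnitude implied by the table.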
Our technology: How we turn names into deep insight

At Namsor, we've built a unique artificial intelligence dedicated to the morphological and cultural analysis of names. Unlike generic AI or simple databases, our models are rooted in over a decade of expertise in natural language processing (NLP) and onomastics — the scientific study of names.
Our technology analyzes the structure, context, and cultural signals embedded in names to estimate their origin, ethnicity, gender, country of residence, or even sociocultural categories like caste, or to transliterate them.
Our tools work not only with full names, but also with nicknames, brand names, or non-standard spellings — across thousands of naming conventions and 22 alphabets: Cyrillic, Georgian, Latin, Arabic, Devanagari, Bengali, Greek, Armenian, Thai, Hebrew, Kannada, Gujarati, Tamil, Hangul, Telugu, Gurmukhi, Oriya, Han (Chinese traditional and simplified characters, Kanji), Hiragana, Myanmar, Katakana and Malayalam.
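When working with multi-script datasets, it can help to know which writing system a name uses before sending it for analysis. The sketch below is a simple heuristic built on Python's standard `unicodedata` module. It is not Namsor's method, and the function name `dominant_script` is hypothetical. It relies on the fact that the first word of a character's Unicode name usually identifies its script. Han characters are reported as `CJK`.

```python
import unicodedata
from collections import Counter


def dominant_script(name: str) -> str:
    """Return the most frequent script prefix among the letters of `name`.

    Heuristic: the first word of a character's Unicode name is usually its
    script, e.g. 'CYRILLIC CAPITAL LETTER I' -> 'CYRILLIC'.
    """
    counts = Counter()
    for ch in name:
        if ch.isalpha():  # skip spaces, digits, punctuation, combining marks
            prefix = unicodedata.name(ch, "UNKNOWN").split()[0]
            counts[prefix] += 1
    return counts.most_common(1)[0][0] if counts else "UNKNOWN"


for sample in ["Maria Silva", "Иван Петров", "محمد", "王伟", "김민준"]:
    print(sample, "->", dominant_script(sample))
```

This only labels the script, not the language or origin: a Latin-script name can come from hundreds of naming traditions, which is the harder problem that name-origin models address.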
We also design custom AI models to meet the specific challenges of our clients. Whether it's for a particular geography, industry, or data environment, we build bespoke solutions that go beyond our standard APIs.
Refined through global collaboration
What makes Namsor unique is not just its architecture, but the way it's continuously refined. Our AI evolves through active research partnerships and ongoing collaboration with top-tier institutions:
- 🎓 Universities: Harvard, Berkeley, Sciences Po, VNU, etc.
- 🧬 Scientific groups: Elsevier, Springer Nature, The Lancet, ASME, SSRN, etc.
- 🌍 International institutions: United Nations, European Commission, IOM, OECD, World Bank, etc.
- 🕵 Experts: Historians, linguists, anthropologists from around the world.
These collaborations help us integrate new knowledge, validate results, and maintain a high level of scientific rigor.
How our tools are designed
From assembling our data sets to continuous learning, each step is designed to optimize our estimations.
- 1. Large-scale data collection
After more than ten years of research, we've assembled a high-quality dataset made up of billions of names from diverse cultures, languages, and regions.
- 2. Onomastic model training
We train our AI using onomastics and integrate insights from linguistics, morphology, and cultural analysis.
- 3. Model validation
We evaluate multiple models in parallel using specialized metrics to identify edge cases, anomalies, and potential biases.
- 4. Continuous learning
Our team continuously refines these models and retains only the best-performing versions to ensure maximum reliability and accuracy.
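The validation and selection steps can be pictured with a small, generic sketch. It is not Namsor's actual pipeline: the metric (worst-group accuracy), the function names, and the toy records are all assumptions chosen for illustration. The idea is to score each candidate model per demographic group, so that a model that looks good overall but fails on one group is not retained.

```python
from collections import defaultdict


def accuracy_by_group(records):
    """records: iterable of (group, true_label, predicted_label) tuples."""
    hits, totals = defaultdict(int), defaultdict(int)
    for group, truth, predicted in records:
        totals[group] += 1
        hits[group] += int(truth == predicted)
    return {group: hits[group] / totals[group] for group in totals}


def worst_group_accuracy(records):
    """Accuracy on the group where the model performs worst."""
    return min(accuracy_by_group(records).values())


def pick_best(candidates):
    """candidates: dict of model name -> records. Keep the most even performer."""
    return max(candidates, key=lambda name: worst_group_accuracy(candidates[name]))


# Toy data (hypothetical): both models get 5 of 6 predictions right overall,
# but model_a is much weaker on group "B".
candidates = {
    "model_a": [("A", "f", "f"), ("A", "m", "m"), ("A", "f", "f"), ("A", "m", "m"),
                ("B", "f", "f"), ("B", "m", "f")],
    "model_b": [("A", "f", "f"), ("A", "m", "m"), ("A", "f", "f"), ("A", "m", "f"),
                ("B", "f", "f"), ("B", "m", "m")],
}

for name, records in candidates.items():
    print(name, accuracy_by_group(records))
print("retained:", pick_best(candidates))
```

Selecting on the weakest group rather than the overall average is one simple way to surface the kind of bias the validation step is meant to catch.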
Ethics & responsibility
At Namsor, we believe that powerful technology must come with a deep sense of responsibility. Our solutions are designed to deliver precise name-based insights while strictly respecting individual privacy, ethical standards, and regulatory frameworks such as the GDPR.
Control over your data
We give every user full control over how their data is used. By default, submitted data contributes to improving our models. You can easily disable this by turning off the “learnable” option via the API or dashboard. You can also activate the “anonymised” mode, which applies one-way SHA hashing to the data, ensuring privacy while still allowing fast, consistent processing.
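To illustrate why a SHA-based approach can protect a name while still allowing consistent processing, here is a minimal sketch using Python's standard `hashlib`. It is an illustration of the general technique only, not Namsor's implementation. The source does not say which SHA variant is used, so SHA-256 is an assumption, and the `hash_name` function and its `salt` parameter are hypothetical.

```python
import hashlib
import unicodedata


def hash_name(name: str, salt: str = "") -> str:
    """Return a SHA-256 hex digest of a normalized name.

    Normalizing first (Unicode NFC, trimmed, case-folded) makes the same name
    map to the same digest regardless of spacing or letter case.
    """
    normalized = unicodedata.normalize("NFC", name).strip().casefold()
    return hashlib.sha256((salt + normalized).encode("utf-8")).hexdigest()


digest_1 = hash_name("Maria Silva")
digest_2 = hash_name("  maria silva ")
digest_3 = hash_name("Mario Silva")

print(digest_1 == digest_2)  # same name, same digest: lookups stay consistent
print(digest_1 == digest_3)  # different name, different digest
```

The digest cannot be reversed directly, but because names come from a limited vocabulary, an unsalted hash of a common name can be guessed by hashing candidate names. A secret salt, as in the optional parameter above, makes that kind of guessing harder.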
This level of transparency and flexibility is rare among AI providers, and a key reason why organizations choose Namsor for responsible data analysis.
Ethical design and GDPR compliance
Namsor only collects the minimum data required for model training and predictions. All data is handled legally and ethically, with a focus on data minimization, transparency, and proportionality. We strictly comply with GDPR and other international data protection laws.
Transparency at every level
To ensure full explainability, the Namsor API offers a special option: a detailed explanation of the AI reasoning behind each prediction. With one click, users can receive a breakdown (in Python logic) of how gender, origin, or ethnicity was estimated, making our decision process both auditable and trustworthy.
Supporting research and social impact
Namsor is actively used by researchers, NGOs, and institutions to support scientific studies, fraud prevention, fair hiring, and anti-discrimination efforts. Our tools help map global diversity responsibly, without reinforcing bias or stereotypes.
From gender equality initiatives to migration policy research, we're committed to advancing knowledge while protecting the dignity and privacy of every individual.