Frequently Asked Questions - Everything you need to know about Namsor

Q: What features does Namsor offer?

Namsor provides a comprehensive suite of name analysis features, all accessible via REST API, SDKs, CSV/Excel upload, Google Sheets or no-code integrations. Standard features include: gender detection, name origin (131 countries), ethnicity and diaspora estimation (139 ethnicities), country of residence inference (247 countries), US race/ethnicity classification (US Census categories), Indian name analysis (caste, religion, state), name parsing (split full name), name type recognition (personal name, brand, pseudonym, place name), and phone number formatting. Namsor also generates name embeddings: numerical vector representations of proper names that capture morphological, cultural and linguistic signals, usable in custom machine learning pipelines. Beyond standard features, Namsor builds custom AI models for specific industry needs, including fake name detection for KYC and compliance, romance scam detection, and name transliteration (e.g. Mandarin or Kanji to Latin).

Q: What is onomastics and how does Namsor use it?

Onomastics is the scientific study of proper names: their origin, structure, meaning and cultural usage. It is a branch of linguistics that analyzes how names carry information about a person's gender, geographic heritage, language, religion or ethnic background. Namsor uses computational onomastics, a discipline that combines morphological analysis of names with artificial intelligence. Rather than simply matching a name against a list, Namsor decodes the internal structure of a name to extract meaningful signals. Names contain morphemes (roots, prefixes, suffixes) that carry cultural and linguistic information. For example, the suffix -ović (Petrović, Jovanović) is a patronymic marker signaling South Slavic origin. The prefix Al- (Al-Fayed) is the Arabic definite article, indicating Arab heritage. The suffix -ko signals a Ukrainian family name (Shevchenko, Bondarenko) but a feminine Japanese given name (Hanako, Yoshiko). These are simplified illustrations of well-known morphological patterns. In practice, Namsor's AI models detect far more subtle signals in name structures, identifying micro-patterns across billions of names that go beyond what traditional onomastic analysis can capture. This morphological approach is what allows Namsor to classify names it has never encountered before, including rare names, newly invented names, or names from underrepresented populations.

Q: Which institutions have validated Namsor's accuracy?

Namsor's accuracy has been independently validated through peer-reviewed studies, institutional audits and large-scale scientific benchmarks. Namsor is cited over 1,200 times on Google Scholar and has contributed to more than 600 academic publications. Elsevier and Science-Metrix (2018) judged Namsor the most accurate tool for name-based gender inference and selected it to power the European Commission's SheFigures gender statistics. Bursztyn, Chaney, Hassan and Rao (Harvard University and University of Chicago, 2022) validated Namsor on 250,000 individuals from the North Carolina voter registry for origin and ethnicity classification. Rieke, Southerland, Svirsky and Hsu (Uber, ACM FAccT 2022) found in an internal benchmark that Namsor outperformed all alternatives for race and ethnicity inference. Sebo (Journal of the Medical Library Association, 2021) confirmed Namsor as one of the most accurate gender detection tools and the only one with zero unclassified names on 6,131 physicians. Sebo, Shamsi and Wang (Internal and Emergency Medicine, Springer, 2026) compared three gender detection APIs on 11,999 marathon runners; Namsor achieved the lowest error rate, with 100% of names classified. Sebo (PLOS ONE, 2023) evaluated Namsor on 88,699 researcher names, confirming its precision for origin and ethnicity. A benchmark by Columbia University is currently in progress.

Q: Is Namsor used in academic research?

Yes, extensively. Namsor is cited in over 1,200 Google Scholar publications and has contributed to more than 600 academic studies across disciplines. Researchers use Namsor for gender gap analysis, bibliometrics, migration and diaspora studies, epidemiology and public health, and discrimination and bias research. Namsor is used across medicine, sociology, economics, political science, computer science and information science. Researchers choose Namsor because it is the reference solution used by leading scientific publishers. Elsevier and Springer Nature rely on Namsor for their own bibliometric analyses of author demographics. Research teams from Harvard, Columbia University, Yale, Oxford, HEC and other major universities use Namsor in their studies. Namsor allows retroactive analysis of large datasets where self-reported demographics are unavailable. It is fast, cost-effective, and its accuracy has been independently validated in peer-reviewed studies, making it defensible in academic methodology sections. Namsor offers a dedicated support program for researchers and scientists preparing a publication.

Q: Is Namsor used by governments and international organizations?

Yes. Namsor is trusted by governments, international organizations and public institutions for large-scale demographic analysis and policy research. The European Commission uses Namsor to power the gender statistics in the SheFigures reports, produced by Elsevier and Science-Metrix. The United Nations uses Namsor for demographic and digital inclusion research, including the EQUALS Research Report and ECLAC studies on the digital footprint in Latin America and the Caribbean. The World Bank commissioned a custom Namsor model to estimate caste groupings from Indian names for research on internal migration and social inequalities. The International Organization for Migration partnered with the World Bank on the Indian caste model and uses Namsor for diaspora mapping projects including the Armenian, Georgian and Azerbaijani diasporas. The Federal Reserve Bank of Chicago used Namsor to classify the ethnic origin of authors in a working paper on cultural change in the economics profession (García-Jimeno & Parsa, 2024). The DARES (French Ministry of Labour) uses Namsor for labor market and demographic analysis in France. The Boston Planning & Development Agency used Namsor to map the Brazilian scientific diaspora. Namsor's combination of accuracy, privacy controls and regulatory compliance (GDPR, CCPA, EU AI Act) makes it suitable for public sector use cases where data sensitivity is critical.

Q: Is Namsor used by companies?

Yes. Namsor powers name analysis at scale for companies across a wide range of industries, from global enterprises to fast-growing startups. While most clients operate under confidentiality, the types of organizations using Namsor include international airports, global airlines, business travel and tourism platforms, neobanks, global money transfer and remittance leaders, pharmaceutical companies, scientific publishers, global cosmetics brands, e-commerce platforms, retail companies, marketing and advertising agencies, AI and big data companies, recruitment and HR tech platforms, and intelligence and risk analysis firms. Companies choose Namsor because it scales from thousands to billions of names with consistent accuracy, integrates through API, SDK, CSV/Excel tools and no-code platforms, and meets enterprise requirements for GDPR, CCPA and EU AI Act compliance.

Q: Why is a specialized onomastic API better than a name lookup database?

Name lookup databases work by matching an input name against a precompiled list. When the name is in the list, the result can be correct. When it is not, the tool either returns no result or falls back on an approximate match with no guarantee of accuracy. Lookup databases typically cover between 75% and 92% of names. That gap is not random. The missing 8% to 25% of unrecognized names are disproportionately rare names, non-Western names, transliterated names and newly coined names. A morphological approach can still classify these correctly because analysis does not depend on having seen that exact name before. Name databases treat Muhammed, Mohammed and Muhammad as separate entries. A specialized onomastic API recognizes them as transliteration variants of the same Arabic root and classifies them consistently. Most lookup databases only offer basic classifications like origin and location. They analyze first name and last name in isolation, missing the cultural signals that emerge from their combination. The same first name paired with different last names can indicate completely different origins, genders or ethnicities. Lookup databases also cannot distinguish a fake name from a rare one: both are simply absent from the list. A specialized onomastic API can detect structural anomalies in a fabricated name while still classifying a genuinely rare name correctly. Finally, lookup databases depend on periodic imports from public registries or crowdsourced lists, while Namsor's models are continuously updated with both new data and improved algorithms.

Q: Why is a specialized onomastic API better than a general-purpose LLM for name classification?

LLMs can appear accurate on name classification when tested on common names. In practice, on real-world data, they fall short in every critical dimension. LLMs are trained on publicly available data, including lists of the best-known names by country. When tested on these names, their results are correct, precisely because they have been overtrained on this data. This creates a dangerous bias: it gives a false sense of accuracy that collapses on real-world datasets. When Namsor tested three major LLMs on a real-world dataset of 400,000 names submitted by actual API users, Namsor correctly classified over 92% of names while the best-performing LLM achieved approximately 62%, with 18% of names left unclassified, 8% assigned to the wrong taxonomy and 12% attributed to the wrong country. Beyond missing names, LLMs frequently confuse classification categories, mixing linguistic origins with countries and diasporas, sometimes referencing entities that no longer exist such as the Persian Empire. LLMs process names at the syllable or token level, while Namsor performs letter-by-letter morphological analysis, capturing micro-patterns that syllable-level processing misses. LLMs produce non-deterministic results, take 1 to 5 seconds per name versus 0.03 seconds for Namsor, and retain input data for training by default. Despite these limitations, LLMs can provide useful semantic context about names, which is why Namsor V3 integrates a semantic model alongside its morphological and statistical models.

Q: What happens if a name isn't in your dataset?

Namsor still classifies it. Unlike lookup-based tools that return no result when a name is absent from their list, Namsor does not depend on having seen a name before. Namsor analyzes the structure of a name letter by letter, extracting cultural, linguistic and geographic signals from its roots, prefixes, suffixes and phonetic patterns. This means Namsor can classify a rare name, a transliterated name, a misspelled name or even a completely invented name. In two independent peer-reviewed studies, Namsor achieved zero unclassified names while competing tools left up to 25% of names without a result. Namsor classifies 99.99% of names submitted, regardless of origin, writing system or frequency.

Q: How many names can Namsor analyze and in which alphabets?

Namsor's models are trained on a proprietary database of 13 billion unique names, the largest in the industry. Over 12 billion names have been processed through the platform to date, covering individuals, companies and aliases from every region of the world. Namsor supports 22 writing systems: Latin, Cyrillic, Arabic, Han (Chinese traditional and simplified, Kanji), Hangul (Korean), Hiragana, Katakana, Devanagari, Bengali, Georgian, Greek, Armenian, Thai, Hebrew, Kannada, Gujarati, Tamil, Telugu, Gurmukhi, Oriya, Myanmar and Malayalam. Unlike lookup-based tools that leave 8% to 25% of names unclassified, Namsor's morphological analysis ensures that virtually every name receives a classification, including rare names, transliterated names and newly invented names. In independent benchmarks, Namsor is the only tool that consistently achieves zero unclassified names (Sebo, 2021; Sebo, Shamsi & Wang, 2026).

Question 1

Is Namsor the best tool available?

Answer · 2026

Yes. Namsor is the most widely validated and comprehensive name analysis tool on the market. Here is the evidence, dimension by dimension.

Most accurate, validated by peer-reviewed studies

Gender detection. A 2026 study published in Internal and Emergency Medicine (Springer) on 11,999 marathon runners from seven major international marathons showed that Namsor achieved an error rate of 4.8%, nearly half that of the next best tool at 8.0% (p < 0.001).

Origin classification. On a benchmark of 400,000 names, Namsor reached 92% accuracy, compared to 62% for the best-performing large language model. Researchers from Harvard and the University of Chicago validated Namsor on 250,000 individuals from the North Carolina voter registry (Bursztyn, Chaney, Hassan & Rao, 2022). A study published in PLOS ONE confirmed accuracy on 88,699 researcher names.

Highest coverage: 99.99% of names classified

A study published in the Journal of the Medical Library Association on 6,131 Swiss physicians showed that Namsor left 0% of names unclassified, compared to 0.3% to 16.4% for competing tools. On unique names, Namsor's error rate only moves from 2.0% to 3.1%, while a major competing tool's error rate jumps from 17.7% to 28.2%.

Fastest: 30 ms per name, 80 to 500 ms per batch

Namsor processes a single name in under 30 ms and a batch of several hundred names in 80 ms to less than 500 ms depending on name complexity. For comparison, large language models (LLMs) typically take 1 to 5 seconds per name for similar classification tasks. At this speed, processing 1 million names takes minutes, not days.
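
As a rough back-of-the-envelope check of these figures, the sketch below estimates how long 1 million names would take if calls were made strictly one after another. The batch size of 200 names is an assumption standing in for "several hundred", the function name is illustrative, and nothing here calls the Namsor API.

```python
# Back-of-the-envelope timing estimate from the latency figures quoted above.
# Assumption: one batch holds 200 names ("several hundred" in the text).

def sequential_seconds(n_names: int, seconds_per_call: float, names_per_call: int = 1) -> float:
    """Total time if calls are made one after another, with no parallelism."""
    n_calls = -(-n_names // names_per_call)  # ceiling division
    return n_calls * seconds_per_call

N = 1_000_000
BATCH_SIZE = 200  # assumed value, not a documented limit

batch_fast = sequential_seconds(N, 0.080, BATCH_SIZE)  # 80 ms per batch
batch_slow = sequential_seconds(N, 0.500, BATCH_SIZE)  # 500 ms per batch
llm_fast = sequential_seconds(N, 1.0)                  # 1 s per name
llm_slow = sequential_seconds(N, 5.0)                  # 5 s per name

print(f"Batched: {batch_fast / 60:.1f} to {batch_slow / 60:.1f} minutes")
print(f"LLM:     {llm_fast / 86400:.1f} to {llm_slow / 86400:.1f} days")
# Batched: 6.7 to 41.7 minutes
# LLM:     11.6 to 57.9 days
```

Under these assumptions the "minutes, not days" comparison holds for batched requests; a different batch size or parallel calls would shift both ranges.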

Most complete: nine features, deepest taxonomies in the industry

Namsor offers nine classification features: gender detection, origin (131 countries), ethnicity and diaspora (139 cultural groups), country of residence (247 countries and territories), US race/ethnicity (US Census categories), Indian name analysis (caste, religion, state), name parsing, name type recognition and phone number formatting. This range covers 22 writing systems (Latin, Cyrillic, Arabic, Han, Hangul, Devanagari, Hiragana, Katakana, Hebrew, Thai and more) with the deepest taxonomy segmentation in the industry.

Most private

Namsor is the only name analysis tool offering both data anonymization through SHA encryption and deactivatable machine learning on your data, fully compliant with GDPR, CCPA and the EU AI Act. Unlike LLMs, which transmit your data to third-party providers and may reuse it for their training, Namsor operates on dedicated infrastructure and provides a downloadable Data Processing Agreement.

Recognized by the global scientific community

Namsor is cited in over 1,200 Google Scholar publications and has contributed to more than 600 academic studies published in venues such as Nature, The Lancet Global Health, PLOS ONE, the British Journal of Surgery, the Journal of Medical Internet Research, Scientometrics, the Journal of the Medical Library Association and Internal and Emergency Medicine.

Elsevier and Springer Nature rely on Namsor for their own bibliometric analyses of author demographics. Namsor was selected by the European Commission to power the gender statistics in its SheFigures reports.

Question 2

What features does Namsor offer?

Answer

Namsor provides a comprehensive suite of name analysis features, all accessible via REST API, SDKs, CSV/Excel upload, Google Sheets or no-code integrations.

Standard features

Gender detection: determine if a name is male or female
Name origin: identify the country of origin across 131 countries
Ethnicity and diaspora: estimate cultural and ethnic background across 139 groups
Country of residence: infer where a person currently lives among 247 countries
US race/ethnicity: classify according to US Census categories
Indian name analysis: detect 12 caste groups, religions and states
Name parsing: split a full name into first and last name
Name type recognition: classify as personal name, brand, pseudonym or place name
Phone number formatting: detect the country code and validate the structure of a phone number, using the associated name

Name embeddings

Namsor generates name embeddings: numerical vector representations of proper names that capture morphological, cultural and linguistic signals. These vectors can be integrated into your own machine learning pipelines for clustering, similarity search or custom classification tasks. Available on namsor.ai.

Custom models

Beyond standard features, Namsor builds custom AI models for specific industry needs, including fake name detection for KYC and compliance, romance scam detection, and name transliteration (e.g. Mandarin or Kanji to Latin).

Question 3

What is onomastics and how does Namsor use it?

Answer

Onomastics is the scientific study of proper names: their origin, structure, meaning and cultural usage. It is a branch of linguistics that analyzes how names carry information about a person's gender, geographic heritage, language, religion or ethnic background.

How Namsor applies onomastics

Namsor uses computational onomastics, a discipline that combines morphological analysis of names with artificial intelligence. Rather than simply matching a name against a list, Namsor decodes the internal structure of a name to extract meaningful signals.

Morphological analysis in practice

Names contain morphemes (roots, prefixes, suffixes) that carry cultural and linguistic information. For example:

The suffix "-ović" (Petrović, Jovanović) is a patronymic marker signaling South Slavic origin
The prefix "Al-" (Al-Fayed) is the Arabic definite article, indicating Arab heritage
The suffix "-ko" signals a Ukrainian family name (Shevchenko, Bondarenko) but a feminine Japanese given name (Hanako, Yoshiko)

This last example illustrates why lookup tables fail: the same suffix carries opposite gender signals depending on linguistic context. Onomastic analysis decodes these patterns. Lookup tables cannot.
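
To make the idea concrete, here is a deliberately tiny sketch of affix-based signal extraction using only the three patterns listed above. It is a toy illustration, not Namsor's model or API: the `AFFIX_SIGNALS` table and the `affix_signals` function are hypothetical names, and a hand-written rule list like this covers only a sliver of what a trained model would pick up.

```python
# Toy illustration only: hand-written affix rules for the three patterns above.
# AFFIX_SIGNALS and affix_signals are hypothetical names, not part of any Namsor SDK.

AFFIX_SIGNALS = [
    # (kind, affix, name part it applies to, signal)
    ("suffix", "ović", "last", "South Slavic patronymic"),
    ("prefix", "al-", "last", "Arabic definite article"),
    ("suffix", "ko", "last", "Ukrainian family name"),
    ("suffix", "ko", "first", "Japanese feminine given name"),
]

def affix_signals(name, part):
    """Return the signals whose affix matches `name` for the given name part."""
    lowered = name.lower()
    hits = []
    for kind, affix, applies_to, signal in AFFIX_SIGNALS:
        if applies_to != part:
            continue
        if kind == "suffix":
            matched = lowered.endswith(affix)
        else:
            matched = lowered.startswith(affix)
        if matched:
            hits.append(signal)
    return hits

print(affix_signals("Petrović", "last"))    # ['South Slavic patronymic']
print(affix_signals("Al-Fayed", "last"))    # ['Arabic definite article']
print(affix_signals("Shevchenko", "last"))  # ['Ukrainian family name']
print(affix_signals("Hanako", "first"))     # ['Japanese feminine given name']
```

The two "-ko" rows show the point made above: the same suffix maps to different signals depending on whether it appears in a family name or a given name, so the rule needs context beyond the string itself.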

Beyond human onomastics

The examples above are simplified illustrations of well-known morphological patterns. In practice, Namsor's AI models detect far more subtle signals in name structures, identifying micro-patterns across billions of names that go beyond what traditional onomastic analysis can capture. The result is a level of precision that no human expert or static rule set can replicate at scale.

Why it matters

This morphological approach is what allows Namsor to classify names it has never encountered before, including rare names, newly invented names, or names from underrepresented populations that do not appear in publicly available name lists.

Question 4

Which institutions have validated Namsor's accuracy?

Answer · 2018

Namsor's accuracy has been independently validated through peer-reviewed studies, institutional audits and large-scale scientific benchmarks. Namsor is cited over 1,200 times on Google Scholar and has contributed to more than 600 academic publications.

Elsevier and Science-Metrix (2018)

Namsor was judged the most accurate tool for name-based gender inference and selected to power the gender statistics in the European Commission's SheFigures reports. (Read the report)

Harvard University and the University of Chicago (2022)

Bursztyn L., Chaney T., Hassan T.A., Rao A. validated Namsor on a dataset of 250,000 individuals from the North Carolina voter registry for origin and ethnicity classification. (Read the study)

Uber, ACM FAccT (2022)

Rieke A., Southerland V., Svirsky D., Hsu M. conducted an internal benchmark comparing name-based race and ethnicity inference tools and found that Namsor outperformed all alternatives tested. (Read the benchmark)

Journal of the Medical Library Association (2021)

Sebo P. conducted a peer-reviewed study on 6,131 physicians in Switzerland and confirmed Namsor as one of the most accurate gender detection tools, and the only one with zero unclassified names. (Read the study)

Internal and Emergency Medicine, Springer (2026)

Sebo P., Shamsi A., Wang T. compared three leading gender detection APIs on 11,999 runners from seven international marathons. Namsor achieved the lowest error rate and classified 100% of names. (Read the study)

PLOS ONE (2023)

Sebo P. evaluated Namsor on 88,699 researcher names and confirmed its precision for origin and ethnicity classification. (Read the study)

Columbia University

Benchmark currently in progress.

Question 5

Is Namsor used in academic research?

Answer

Yes, extensively. Namsor is cited in over 1,200 Google Scholar publications and has contributed to more than 600 academic studies across disciplines.

Types of research

Researchers use Namsor in a wide range of studies, including:

Gender gap analysis: measuring female representation in scientific authorship, editorial boards, grant allocations and career progression
Bibliometrics: analyzing author demographics across large publication databases (Scopus, PubMed, Web of Science)
Migration and diaspora studies: tracking population flows, immigrant integration and diaspora mapping
Epidemiology and public health: studying demographic patterns in health outcomes and clinical trial participation
Discrimination and bias research: detecting ethnic or racial disparities in hiring, citations, funding and peer review

Disciplines

Namsor is used across medicine, sociology, economics, political science, computer science and information science, among others.

Why researchers choose Namsor

Namsor is the reference solution used by leading scientific publishers. Elsevier and Springer Nature rely on Namsor for their own bibliometric analyses of author demographics. Research teams from Harvard, Columbia University, Yale, Oxford, HEC and other major universities use Namsor in their studies.

Namsor allows retroactive analysis of large datasets where self-reported demographics are unavailable. It is fast, cost-effective, and its accuracy has been independently validated in peer-reviewed studies, making it defensible in academic methodology sections.

Researcher support program

Namsor offers a dedicated support program for researchers and scientists preparing a publication. Contact Namsor to learn more.

Question 6

Is Namsor used by governments and international organizations?

Answer · 2024

Yes. Namsor is trusted by governments, international organizations and public institutions for large-scale demographic analysis and policy research.

International organizations

Among many others, here are a few examples of international organizations using Namsor:

European Commission: Namsor powers the gender statistics in the SheFigures reports, produced by Elsevier and Science-Metrix, to measure women's contribution to scientific research across Europe (read the report)
United Nations: uses Namsor for demographic and digital inclusion research, including the EQUALS Research Report and the ECLAC study on the digital footprint in Latin America and the Caribbean
World Bank: commissioned a custom Namsor model to estimate caste groupings from Indian names, enabling research on internal migration and social inequalities
IOM: partnered with the World Bank on the Indian caste model, and uses Namsor for diaspora mapping projects including the Armenian diaspora, the Georgian diaspora and the Azerbaijani diaspora

Government and public sector

Among many others, here are a few examples of government and public sector institutions using Namsor:

Federal Reserve Bank of Chicago: used Namsor to classify the ethnic origin of authors in a working paper on cultural change in the economics profession (García-Jimeno & Parsa, 2024)
DARES (French Ministry of Labour): uses Namsor for labor market and demographic analysis in France (CNIS report, 2022)
Boston Planning & Development Agency: used Namsor to map the Brazilian scientific diaspora in Boston

Why the public sector trusts Namsor

Namsor's combination of accuracy, privacy controls and regulatory compliance (GDPR, CCPA, EU AI Act) makes it suitable for public sector use cases where data sensitivity is critical.

Question 7

Is Namsor used by companies?

Answer

Yes. Namsor powers name analysis at scale for companies across a wide range of industries, from global enterprises to fast-growing startups. While most clients operate under confidentiality, the types of organizations using Namsor include:

Transportation and travel

International airports
Global airlines
Business travel and tourism platforms

Financial services

Neobanks
Global money transfer and remittance leaders

Science and publishing

Pharmaceutical companies
Scientific publishers

Retail, e-commerce and marketing

Global cosmetics brands
E-commerce platforms
Retail companies
Marketing and advertising agencies

Technology and data

AI and big data companies
Recruitment and HR tech platforms

Security and intelligence

Intelligence and risk analysis firms

Why companies choose Namsor

Namsor scales from thousands to billions of names with consistent accuracy, integrates through API, SDK, CSV/Excel tools and no-code platforms, and meets enterprise requirements for GDPR, CCPA and EU AI Act compliance.

Question 8

Why is a specialized onomastic API better than a name lookup database?

Answer

Name lookup databases work by matching an input name against a precompiled list. When the name is in the list, the result can be correct. When it is not, the tool either returns no result or falls back on an approximate match with no guarantee of accuracy.

Coverage drops on real-world data

Lookup databases typically cover between 75% and 92% of names, depending on the solution. That gap is not random. The missing 8% to 25% of unrecognized names are disproportionately rare names, non-Western names, transliterated names and newly coined names. These are precisely the names that a morphological approach can still classify correctly, because analysis does not depend on having seen that exact name before.

No ability to distinguish typos from cultural nuances

Name databases treat "Muhammed", "Mohammed" and "Muhammad" as separate entries. A specialized onomastic API recognizes them as transliteration variants of the same Arabic root and classifies them consistently. Conversely, when a name contains a genuine typo, an onomastic model can still extract the morphological signal, while a database either mismatches or returns nothing.
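
The contrast can be sketched with a toy example. An exact-match lookup misses any spelling it has not stored, while even a crude normalization maps the three variants to a single key. The consonant-skeleton key below is an assumption chosen for brevity, not Namsor's algorithm, and `KNOWN_NAMES`, `exact_lookup` and `skeleton_key` are hypothetical names.

```python
# Toy contrast between exact-match lookup and a crude variant key.
# The consonant-skeleton heuristic is an illustrative assumption, not Namsor's method.
from typing import Optional

KNOWN_NAMES = {"muhammad": "Arabic"}  # hypothetical one-entry lookup table

def exact_lookup(name: str) -> Optional[str]:
    """Exact-match lookup: any spelling that is not stored returns None."""
    return KNOWN_NAMES.get(name.lower())

def skeleton_key(name: str) -> str:
    """Drop vowels, then collapse consecutive duplicate consonants."""
    key = []
    for ch in name.lower():
        if ch in "aeiou":
            continue
        if key and key[-1] == ch:
            continue
        key.append(ch)
    return "".join(key)

variants = ["Muhammed", "Mohammed", "Muhammad"]
print([exact_lookup(v) for v in variants])   # [None, None, 'Arabic']
print({skeleton_key(v) for v in variants})   # {'mhmd'}
```

A key this crude would also merge unrelated names, so it only illustrates why variant handling has to look at name structure rather than exact strings.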

Shallow taxonomy and no contextual understanding

Most lookup databases only offer basic classifications: origin and sometimes location. They analyze first name and last name in isolation, missing the cultural signals that emerge from their combination. For example, the same first name paired with different last names can indicate completely different origins, genders or ethnicities. Only a model that understands name morphology and cultural context can capture these nuances.

Lookup databases also cannot distinguish a fake name from a rare one: both are simply absent from the list. A specialized onomastic API can detect structural anomalies in a fabricated name while still classifying a genuinely rare name correctly. This distinction is critical for KYC, fraud prevention and compliance workflows.

Sporadic updates

Lookup databases depend on periodic imports from public registries, census data or crowdsourced lists. Namsor's models are continuously updated with both new data and improved algorithms, adapting to evolving naming patterns across cultures.

Question 9

Why is a specialized onomastic API better than a general-purpose LLM for name classification?

Answer

LLMs can appear accurate on name classification when tested on common names. In practice, on real-world data, they fall short in every critical dimension.

Accuracy collapses on real names

LLMs are trained on publicly available data, including lists of the best-known names by country that appear on thousands of websites. When tested on these names, their results are correct, precisely because they have been overtrained on this data. This creates a dangerous bias: it gives a false sense of accuracy that collapses on real-world datasets.

When Namsor tested three major LLMs on a real-world dataset of 400,000 names submitted by actual API users, the results were very different. Namsor correctly classified over 92% of names. The best-performing LLM achieved approximately 62%, with 18% of names left unclassified, 8% assigned to the wrong taxonomy (confusing origin with diaspora, language with country), and 12% attributed to the wrong country.

Taxonomy confusion

Beyond missing names, LLMs frequently confuse classification categories. They mix linguistic origins (Latin, Greek, Cyrillic) with countries, and countries with diasporas. Some responses reference entities that no longer exist, such as the Persian Empire. A specialized onomastic API maintains strict, consistent taxonomies across every classification.

Syllable-level vs. letter-level analysis

LLMs process names at the syllable or token level, which limits their ability to detect fine morphological signals. Namsor's models perform letter-by-letter morphological analysis, capturing micro-patterns that syllable-level processing misses entirely.

Non-deterministic results

The same name submitted twice to an LLM can produce different answers. For research, compliance or any use case requiring reproducibility, this is disqualifying. A specialized API returns the same result every time.

Latency and cost

An LLM takes 1 to 5 seconds per name. Namsor processes a name in 0.03 seconds. At scale, the difference is the gap between minutes and days.

Privacy risk

LLMs retain input data and use it for training by default. Name data submitted to an LLM cannot be anonymized or excluded from model training. Namsor offers anonymized mode with SHA encryption and opt-out from machine learning.

But LLMs bring one thing

Despite their limitations on precision and taxonomy, LLMs can provide useful semantic context about names. This is why Namsor V3 integrates a semantic model alongside its morphological and statistical models, capturing the best of LLM capabilities without their weaknesses.

Question 10

What happens if a name isn't in your dataset?

Answer · 2021

Namsor still classifies it. Unlike lookup-based tools that return no result when a name is absent from their list, Namsor does not depend on having seen a name before.

Morphological analysis, not lookup

Namsor analyzes the structure of a name letter by letter, extracting cultural, linguistic and geographic signals from its roots, prefixes, suffixes and phonetic patterns. This means Namsor can classify a rare name, a transliterated name, a misspelled name or even a completely invented name.

Proven in benchmarks

In two independent peer-reviewed studies, Namsor achieved zero unclassified names, while competing tools left up to 25% of names without a result (Sebo, 2021; Sebo, Shamsi & Wang, 2026).

99.99% classification rate

Namsor classifies virtually every name submitted, regardless of origin, writing system or frequency.

Question 11

How many names can Namsor analyze and in which alphabets?

Answer · 2021

Namsor's models are trained on a proprietary database of 13 billion unique names, the largest in the industry. Over 12 billion names have been processed through the platform to date, covering individuals, companies and aliases from every region of the world.

22 writing systems supported

Namsor analyzes names written in Latin, Cyrillic, Arabic, Han (Chinese traditional and simplified, Kanji), Hangul (Korean), Hiragana, Katakana, Devanagari, Bengali, Georgian, Greek, Armenian, Thai, Hebrew, Kannada, Gujarati, Tamil, Telugu, Gurmukhi, Oriya, Myanmar and Malayalam.
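
If you want to know which writing systems are present in your own data before submitting it, a minimal sketch using only the Python standard library could look like this. It is an assumption-laden heuristic (it reads the first word of each character's Unicode name) and says nothing about how Namsor detects scripts; `dominant_script` is a hypothetical helper name.

```python
import unicodedata
from collections import Counter


def dominant_script(name: str) -> str:
    """Guess a name's writing system from the Unicode names of its letters."""
    counts = Counter()
    for ch in name:
        if ch.isalpha():
            # e.g. "CYRILLIC CAPITAL LETTER I" -> "CYRILLIC"
            counts[unicodedata.name(ch, "UNKNOWN").split()[0]] += 1
    return counts.most_common(1)[0][0] if counts else "UNKNOWN"
```

Note that Han characters are reported as `CJK` by this heuristic, because their Unicode names start with "CJK UNIFIED IDEOGRAPH".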

99.99% classification rate

Unlike lookup-based tools that leave 8% to 25% of names unclassified, Namsor's morphological analysis ensures that virtually every name receives a classification, including rare names, transliterated names and newly invented names. In independent benchmarks, Namsor is the only tool that consistently achieves zero unclassified names (Sebo, 2021; Sebo, Shamsi & Wang, 2026).

Question 12

Why does Namsor offer 4 different features for analyzing name origin?

Answer

Namsor offers four features for analyzing name origin because there are four different questions you can ask about a person, and each one requires a different answer. They are not redundant: a single name often returns four different but equally valid results.

The four questions and the four features

Origin answers "Where does this person's family historically come from?" It returns a country code (ISO) and covers 131 countries.
Ethnicity / Diaspora answers "What cultural identity does this person belong to?" It returns a named cultural group from 139 groups (e.g. Scottish, Catalan, Hispanic, Jewish, Tatar, AfricanAmerican).
Country of Residence answers "Where does this person currently live?" It returns a country code and covers 247 countries and territories — the broadest geographic coverage of the four features.
US Race / Ethnicity answers "Which US Census racial category does this person belong to?" It returns one of six Census categories: White, Black/African American, Hispanic/Latino, Asian, Native Hawaiian/Pacific Islander, American Indian/Alaska Native.

Why four features instead of one?

Because the four concepts genuinely do not overlap. A person can be ethnically Chinese, of Chinese ancestral origin, living in the United States, and classified as Asian under US Census categories — all at the same time. None of these four facts can be derived from any single other one.

A name like "García" tells you something about ancestral roots (Spanish), but not about where the person lives (could be Spain, Mexico, Colombia, the US, or anywhere else) and not about cultural identity (could be Spanish, Mexican, Hispanic-American, etc.). A name like "Smith" could belong to someone whose family has lived in the US for ten generations, or to someone who recently moved to London from Australia. One feature cannot answer all four questions correctly, so Namsor offers four specialized features instead of one approximate one.

One name, four answers: an example

For Wei Zhang living in San Francisco, the four features return:

Feature	Returns	What it tells you
Origin	CN (China)	His family historically comes from China
Ethnicity	Chinese	His cultural identity is Chinese
Country of Residence	US (United States)	He currently lives in the United States
US Race	Asian	His US Census racial category

All four answers are correct. They simply answer different questions. Choosing the right feature means knowing which question you are actually asking.

Why coverage differs across features

The four features cover different numbers of countries or groups because each is built around a different concept:

Origin (131 countries): limited to countries that are historically sources of population. Immigration countries like the US, Canada, Australia, Brazil, Argentina and most of Latin America are not in the taxonomy because there is no single "American origin" or "Brazilian origin."
Ethnicity (139 groups): captures cultural identities that don't always align with country borders — including sub-national groups (Scottish, Catalan), transnational groups (Hispanic, Jewish) and communities defined by shared culture rather than geography.
Country of Residence (247 countries and territories): the most geographically complete feature. Covers every country, including immigration destinations, newly formed states, overseas territories and micro-states.
US Race (6 categories): strictly aligned with the US Census taxonomy, used for federal reporting and disparate impact analysis.

A common pitfall to know about

Origin will not return the United States, Canada, Brazil, Mexico, Colombia, Argentina, Australia or any other immigration country for people who live there. Because Origin reflects ancestral roots, it returns the country the family historically came from instead — typically Spain or Portugal for Latin America, or various European/African/Asian countries for the US, Canada and Australia.

If you need the country where the person actually lives, use Country of Residence instead of Origin. This is the single most common confusion among new Namsor users.

Quick decision guide

You know where the person lives or works → use Ethnicity / Diaspora with the country code. Most precise option for multicultural countries.
You only have a name with no context (social media aliases, anonymous lists) → use Origin. Works from the name alone.
You need to know where someone currently lives → use Country of Residence. The only feature that covers immigration countries like the US, Canada, Australia and Latin America.
You need US Census-aligned categories → use US Race, ideally with a ZIP code for neighborhood-level precision.
You want both cultural detail and geographic distribution → combine Ethnicity + Country of Residence on the same dataset.
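
The first four rules of the guide can be sketched as a small helper. The function name, its parameters and the order in which the rules are checked are assumptions made for illustration; the fifth rule (combining Ethnicity with Country of Residence) is a matter of running two features and is left out.

```python
def choose_feature(knows_residence: bool = False,
                   needs_current_country: bool = False,
                   needs_us_census: bool = False) -> str:
    """Map the decision guide above onto a single feature name."""
    if needs_us_census:
        return "US Race"
    if needs_current_country:
        return "Country of Residence"
    if knows_residence:
        return "Ethnicity / Diaspora"
    return "Origin"  # only a name, no context
```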

Question 13

Why does Namsor return Spain or Portugal instead of the actual country someone lives in?

Answer

Short answer: Namsor's Origin feature returns the country a person's family historically came from, not the country where they currently live. For someone living in Latin America, the United States, Canada, Australia or any other immigration country, Origin will return the ancestral country instead of the country of residence. This is by design, not a bug.

Why Origin works this way

Origin is built around 131 countries that are historically sources of population, not destinations. Countries built largely through immigration (United States, Canada, Australia, New Zealand, Brazil, Argentina, and most of Latin America) are not in the Origin taxonomy because there is no single ancestral origin shared by their populations.

For someone living in São Paulo, ancestral roots could be Portuguese, Italian, Japanese, Lebanese, German or African. There is no "Brazilian origin" in the historical sense Origin is designed to capture. The same is true for the US, where ancestral roots span every continent. Origin therefore returns the country the family historically came from, which is the only meaningful answer the feature can give within its taxonomy.

Examples across regions

Here are typical results that often surprise new users:

Person	Origin returns	Why
Diego Hernández in Buenos Aires	ES (Spain)	Hernández is a Spanish surname, not Argentine
Ana Costa in Rio de Janeiro	PT (Portugal)	Costa is Portuguese, not Brazilian
John Smith in Boston	GB (Great Britain)	Smith is a British surname, not American
Liam O'Connor in Sydney	IE (Ireland)	O'Connor is an Irish surname, not Australian
Mohammed Hassan in Toronto	EG (Egypt) or similar	Arabic name, not Canadian in origin
Hiroshi Tanaka in São Paulo	JP (Japan)	Japanese name (large Nikkei community in Brazil)
Wei Zhang in Vancouver	CN (China)	Chinese name, not Canadian in origin

In all these cases, Origin is doing exactly what it is designed to do: identify ancestral roots. The "wrong" country is only wrong relative to a question Origin was never built to answer.

How to get the actual country of residence

Use Country of Residence instead of Origin. Country of Residence is built around a different question (where someone currently lives) and covers 247 countries and territories, including all immigration countries that Origin cannot return.

For the same examples above, Country of Residence returns:

Diego Hernández in Buenos Aires → AR (Argentina)
Ana Costa in Rio de Janeiro → BR (Brazil)
John Smith in Boston → US (United States)
Liam O'Connor in Sydney → AU (Australia)
Mohammed Hassan in Toronto → CA (Canada)

If you need cultural identity rather than geography (for example, identifying the Hispanic, African American or Asian American community a person belongs to), use Ethnicity / Diaspora instead. Ethnicity can return groups like HispanoLatino, AfricanAmerican or AsianAmerican that neither Origin nor Country of Residence can represent.

When Origin is still the right feature

Origin remains the right choice in several cases:

You have no context about where the person lives (anonymous lists, social media aliases, historical records). Origin is the only feature that works from a name alone.
You specifically want ancestral roots for genealogy, family history research or migration studies.
You are studying historical population movements, diasporas or migration patterns. In this context, ancestral country is exactly the signal you want.
The names come from a country that is in the Origin taxonomy (most of Europe, Asia, Africa and the Middle East). For these populations, Origin and Country of Residence often return the same answer.

For most analytics, customer segmentation, compliance and localization use cases involving immigration countries, Country of Residence is the more appropriate feature.

Question 14

Can I use Namsor without coding?

Answer

Yes. Namsor offers four no-code ways to analyze names at scale, with no technical skills required.

CSV and Excel tool

Upload a spreadsheet, choose the analysis type, map your columns and download the enriched file. Supports .xls, .xlsx, .csv, .txt and .ods files. Learn more about the CSV/Excel tool.

Google Sheets add-on

Analyze up to 500,000 names directly inside a Google Sheet. Install from the Google Workspace Marketplace and run analyses from the sidebar.

No-code automations

Connect Namsor to 8,000+ apps through Zapier, Make or n8n to automate name analysis in your existing workflows (CRM enrichment, form submissions, database sync). Learn more about no-code integrations.

Interactive forms on feature pages

Every feature page includes an interactive form at the top, so you can run small analyses directly from the Namsor website with no setup. Useful for testing a feature before integrating it, validating a one-off result or showing the product to a colleague.

Which option to choose

Use the Google Sheets add-on for collaborative work, the CSV/Excel tool for one-off large batches, no-code automations for recurring workflows, and the feature page forms for quick tests.

Question 15

What programming languages does Namsor support, and does it provide SDKs and a CLI?

Answer

Namsor provides official SDKs and a CLI for developers, all open-source on GitHub.

Supported languages

Native SDKs are available in four languages:

Java
Python
JavaScript
Go (Golang)

For languages without an official SDK, the Namsor REST API can be called directly from any language that supports HTTP requests.

SDKs

Each SDK wraps the Namsor REST API with typed methods, authentication handling and batch support, making integration straightforward in your application's data flow.

CLI (command-line tool)

Run name analyses from your terminal without writing code. Useful for quick tests, scripted pipelines and server-side automation.

How they're built

Namsor SDKs are generated through OpenAPI Generator from the official API specification. This guarantees consistency across languages and automatic updates when the API evolves.

Installation

Install via standard package managers:

Java: Maven or Gradle (com.namsor:namsor-sdk2)
Python: pip install namsor
JavaScript: npm install namsor
Go: go get github.com/namsor/namsor-go-sdk

Source code and documentation

All SDKs and the CLI are publicly available on the Namsor GitHub organization. Learn more about Namsor developer tools.

Question 16

Is there API documentation?

Answer

Yes. Namsor publishes complete, interactive API documentation with code examples, endpoint references and authentication guides. Read the Namsor API documentation.

What's documented

Every endpoint across all features is documented (gender, origin, ethnicity, country of residence, US race, Indian name, split name, name type, phone number format), with request/response schemas, error codes and rate limits.

Code examples

Ready-to-copy snippets in JavaScript, Python, Java and Shell (curl) for each endpoint.

API details

Base URL: https://v2.namsor.com/NamSorAPIv2
Current version: 2.0.21
Authentication: API key (header-based)
Format: JSON
Batch support: up to 100 names per POST request
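
To make these details concrete, here is a minimal sketch that prepares a batch request with the Python standard library without sending it. Only the base URL, the JSON format and the 100-name limit come from the details above. The endpoint path `/api2/json/genderBatch`, the `X-API-KEY` header name and the `personalNames` payload layout are assumptions that should be checked against the API documentation.

```python
import json
import urllib.request

BASE_URL = "https://v2.namsor.com/NamSorAPIv2"  # from the API details above
MAX_BATCH = 100  # documented limit per POST request


def build_batch_request(names, api_key, path="/api2/json/genderBatch"):
    """Prepare (but do not send) a JSON POST request for up to 100 names.

    The endpoint path, the X-API-KEY header name and the payload field
    names are assumptions; confirm them in the API documentation.
    """
    if len(names) > MAX_BATCH:
        raise ValueError(f"at most {MAX_BATCH} names per request")
    payload = {
        "personalNames": [
            {"id": str(i), "firstName": first, "lastName": last}
            for i, (first, last) in enumerate(names)
        ]
    }
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"X-API-KEY": api_key, "Content-Type": "application/json"},
        method="POST",
    )


req = build_batch_request([("Ana", "Costa"), ("Wei", "Zhang")], api_key="YOUR_KEY")
```

Passing `req` to `urllib.request.urlopen` would send it; the sketch stops short of that so it runs without a key or network access.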

Advanced topics covered

Learnable mode: opt out of machine learning on your data
Anonymized mode: irreversibly anonymizes names with SHA before logging, so no raw name data is stored
API Explainability: detailed reasoning output in Python logic
API Enumerators: full list of return values for alphabets, countries, diasporas, castes, religions, US races and name types

Question 17

How fast is the Namsor API per name?

Answer

A single name can be processed in under 30ms, and a batch of several hundred names typically completes in 80ms to under 500ms, depending on name complexity. Namsor is built for high-throughput name analysis at scale, with batch endpoints, persistent connections and a tuned inference layer.

Batch processing: 80ms to less than 500ms for hundreds of names

When you send a batch of names through a POST endpoint, Namsor processes them in parallel server-side and returns the full response in 80ms to under 500ms for several hundred names, depending on name complexity. Names in non-Latin scripts or with ambiguous structure may sit at the higher end of the range. This is the recommended mode for production workloads.

GET vs POST endpoints

GET endpoints: process one name per request, typically in under 30ms. Useful for quick tests, integration debugging and very low-volume workflows.
POST endpoints: process up to 100 names per request. Use these for production, bulk enrichment and batch pipelines.

How to maximize throughput

Use POST batch endpoints instead of looping GET calls
Run parallel batch requests if you need to process millions of names
For very large workloads, the CSV/Excel tool handles millions of names per file, compared to 500,000 for the Google Sheets add-on
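
The first two tips can be sketched as follows: split the input into chunks of at most 100 names and submit the chunks in parallel. `send_batch` is a hypothetical callable standing in for a real API call, and the worker count is an arbitrary assumption to tune against your plan and rate limits.

```python
from concurrent.futures import ThreadPoolExecutor


def chunked(items, size=100):
    """Yield consecutive slices of at most `size` items (100 = POST batch limit)."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def process_all(names, send_batch, workers=4):
    """Run `send_batch` on each chunk in parallel and flatten results in order.

    `send_batch` is a hypothetical callable that posts one batch and returns
    one result per name; it stands in for a real API call.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(send_batch, chunked(names))
        return [item for batch in results for item in batch]


# Stand-in sender so the sketch runs without network access.
demo = process_all([f"name{i}" for i in range(250)],
                   lambda batch: [n.upper() for n in batch])
```

Threads suit this workload because each call spends most of its time waiting on the network, and `pool.map` keeps results in input order.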

Why this matters

For comparison, large language models (LLMs) typically take 1 to 5 seconds per name for similar classification tasks. At Namsor's batch speed, processing 1 million names takes minutes, not days.
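
A back-of-the-envelope check of that claim, under stated assumptions: calls are made strictly one after another, network overhead is ignored, and each batch carries 100 names (the documented POST limit) at the 80ms to 500ms batch latency quoted above.

```python
def sequential_seconds(total_names, names_per_call, seconds_per_call):
    """Total wall-clock time if calls are made one after another."""
    calls = -(-total_names // names_per_call)  # ceiling division
    return calls * seconds_per_call


N = 1_000_000
llm_days = [sequential_seconds(N, 1, s) / 86_400 for s in (1, 5)]
batch_minutes = [sequential_seconds(N, 100, s) / 60 for s in (0.08, 0.5)]
# llm_days      -> roughly 11.6 to 57.9 days
# batch_minutes -> roughly 13.3 to 83.3 minutes
```

Parallel requests shorten both figures, but the gap of several orders of magnitude remains.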

Question 18

Is Namsor free to use?

Answer

Yes. Every Namsor account starts with 2,500 free credits per month, with no credit card required.

What you can do with 2,500 credits

The number of names you can analyze depends on the feature you use:

2,500 names for gender detection, name splitting or name type recognition (1 credit each)
250 names for origin, country of residence, US race or Indian name analysis (10 credits each)

No time limit on the free tier

Credits renew every month automatically. You can use Namsor for free indefinitely within the free quota.

Upgrade when you need more

Paid plans start at $19/month and unlock larger quotas, lower per-credit costs and premium features. See Namsor pricing plans.

Question 19

How does Namsor pricing work?

Answer

Namsor uses a credit-based system: every name analysis consumes a defined number of credits, from 1 to 50 depending on the feature.

Credit cost by feature

1 credit: simple analyses (gender, name splitting, name type)
10 credits: mid-tier analyses (origin, country of residence, US race, Indian name)
20 credits: advanced analyses (ethnicity/diaspora)

Two ways to pay

Monthly subscription (recommended): includes a monthly credit quota at a 30% discount versus one-time purchases. Plans range from Free (2,500 credits) to Enterprise (10 million credits).
One-time credit packs: purchase credits as needed. Credits remain valid for 120 days.

Smart deduplication

On Ultra, Mega and Enterprise plans, repeated names in the same batch are charged only once (for up to 10 or 20 occurrences of each duplicate), which can significantly reduce costs on large customer databases.

No lock-in

All subscriptions are monthly with no commitment. You can upgrade, downgrade or cancel anytime. See full pricing details and compare plans.

Question 20

What happens if I exceed my monthly credits?

Answer

On a paid Namsor subscription, you continue to use the API without interruption. Additional credits are automatically billed at the end of the current billing period, at a per-credit rate that depends on your plan.

Additional credit pricing by plan

Free: up to 200,000 additional credits/month at $0.005 per credit
PRO: up to 500,000 additional credits/month at $0.003 per credit
ULTRA: up to 2 million additional credits/month at $0.002 per credit
MEGA: up to 10 million additional credits/month at $0.001 per credit
ENTERPRISE: up to 100 million additional credits/month at $0.0005 per credit

Larger plans offer lower per-credit rates, so heavy users benefit from economies of scale.
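
The rate table above translates directly into a small calculator. The dictionary and function names are hypothetical; the caps and per-credit prices are copied from the list.

```python
# plan -> (max additional credits per month, price per additional credit in USD)
OVERAGE = {
    "FREE": (200_000, 0.005),
    "PRO": (500_000, 0.003),
    "ULTRA": (2_000_000, 0.002),
    "MEGA": (10_000_000, 0.001),
    "ENTERPRISE": (100_000_000, 0.0005),
}


def overage_cost(plan: str, extra_credits: int) -> float:
    """Cost in USD of credits used beyond the monthly quota."""
    cap, rate = OVERAGE[plan.upper()]
    if extra_credits > cap:
        raise ValueError(f"{plan} allows at most {cap:,} additional credits per month")
    return round(extra_credits * rate, 2)
```

For example, 100,000 additional credits cost $300 on PRO and $100 on MEGA.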

Stay in control: soft and hard limits

Two configurable limits let you control exactly how much you want to spend on additional credits:

Hard limit: caps your total monthly consumption. Once reached, your API key is automatically disabled until the next billing cycle. This prevents unexpected billing.
Soft limit: a warning threshold. Once reached, you receive a notification email but the API continues to work. Useful to get early alerts without blocking production.

Both limits can be adjusted anytime in the Plan management section of your account.
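
The behaviour of the two limits can be summarized in a few lines. This is a sketch of the rules as described above, not Namsor's billing code; the function name, the status labels and the choice to treat a limit as reached when usage equals it are assumptions.

```python
def limit_status(used: int, soft_limit: int, hard_limit: int) -> str:
    """Classify monthly consumption against the two configurable limits."""
    if used >= hard_limit:
        return "blocked"   # API key disabled until the next billing cycle
    if used >= soft_limit:
        return "warning"   # notification email sent, API keeps working
    return "ok"
```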

Need more than your plan allows?

To increase your hard limit beyond your plan's maximum additional credits, contact the Namsor team. We can also help you choose the most cost-effective plan for your expected volume.

Question 21

Do credits roll over, and how long are they valid?

Answer

Credit validity depends on whether you are on a monthly subscription or bought a one-time credit pack.

Subscription credits

Subscription credits do not roll over from month to month. Each billing cycle gives you a fresh allocation that must be used within that month. Unused credits expire at the end of the cycle and the next month starts with the full plan allocation. This keeps pricing simple and predictable.

One-time credit purchases

One-time credit purchases are valid for 120 days from the date of purchase. You can use them at your own pace within that window. If you exhaust them before 120 days, you can buy more credits at any time. The 120-day validity starts fresh with each new purchase.

What happens when you downgrade your plan

When you downgrade your subscription plan, any credits you have not yet used from the previous plan are preserved and added to your account. They remain valid until the end of the original subscription period. After that date, the new (lower) plan quota applies normally.

This means a downgrade never causes you to lose credits you have already paid for.

In summary

Subscription credits: reset each month, no carryover
One-time credits: valid for 120 days
Plan downgrade: unused credits preserved until the end of the old billing cycle
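
For planning purposes, the 120-day window of a one-time pack is simple date arithmetic. The helper name and the example date are hypothetical, and whether the final day itself is still usable is an assumption to confirm with Namsor.

```python
from datetime import date, timedelta

PACK_VALIDITY_DAYS = 120  # one-time credit packs, per the summary above


def pack_expiry(purchased_on: date) -> date:
    """Date that falls 120 days after the purchase date."""
    return purchased_on + timedelta(days=PACK_VALIDITY_DAYS)


expiry = pack_expiry(date(2025, 1, 1))  # -> date(2025, 5, 1)
```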

Question 22

Why does Diaspora analysis cost more credits than Gender detection?

Answer

Credit cost reflects the computational complexity of each prediction, not an arbitrary price. The more possible outcomes a model has to choose from, the more resources each prediction requires.

Gender detection: a binary outcome

Gender detection classifies a name into two possible outcomes (male or female on a continuous scale). The underlying model is compact, trained on a simpler decision surface and returns a result quickly. Cost: 1 credit per name.

Diaspora: 139 cultural groups

Diaspora analysis classifies a name into 139 cultural groups. Each group carries distinct linguistic, morphological and cultural signals that the model must disentangle from potentially overlapping patterns. The model is larger, the training data is more diverse, and each prediction requires evaluating many possible outcomes simultaneously. Cost: 20 credits per name.

How pricing scales across features

The same logic applies across all Namsor features:

1 credit: simple classifications with binary or short taxonomies (Gender, Split Name, Name Type)
10 credits: mid-complexity features with national-level taxonomies (Origin with 131 countries, Country of Residence with 247 territories, US Race with 6 Census categories, Indian Name classifications)
11 credits: combined analysis (Phone Number Format, which parses a name and a phone number together)
20 credits: granular cultural classification (Diaspora with 139 groups)
50 credits: cross-entity analysis (Names Corridor, which analyzes the interaction between two names for cross-border dynamics)
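
To budget a job, the tiers above can be turned into a lookup table. The dictionary keys are hypothetical labels chosen for this sketch, not API identifiers; the credit values come from the list.

```python
CREDITS_PER_NAME = {
    "gender": 1, "split_name": 1, "name_type": 1,
    "origin": 10, "country_of_residence": 10, "us_race": 10, "indian_name": 10,
    "phone_number_format": 11,
    "diaspora": 20,
    "names_corridor": 50,
}


def credits_needed(feature: str, n_names: int) -> int:
    """Credits consumed by analyzing n_names with one feature."""
    return CREDITS_PER_NAME[feature] * n_names


def names_covered(feature: str, credits: int) -> int:
    """Number of names a given credit balance covers for one feature."""
    return credits // CREDITS_PER_NAME[feature]
```

With the 2,500 free monthly credits, `names_covered` gives 2,500 names for gender, 250 for origin and 125 for diaspora.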

Pricing is proportional to what you get

Choosing Diaspora instead of Gender means choosing a much deeper analysis, not the same analysis at a higher price. The extra credits reflect the extra signal, granularity and infrastructure required to deliver a classification across 139 groups instead of 2.

Question 23

Is there a discount for researchers or academic use?

Answer

Yes. Namsor runs a dedicated research support program with discounts ranging from 40% to 99% on name analysis credits, designed to make rigorous onomastic methods accessible to academic teams, PhD students and research projects.

What determines your discount

The exact discount depends on several factors:

Team: size and composition of the research group
Project: nature, scope and scientific ambition of the research
Publication target: the journals or conferences where results will appear

Larger projects destined for high-impact peer-reviewed venues typically qualify for the deepest discounts.

Hands-on methodology support

For complex research projects, Namsor's team can provide methodology support at no additional cost. This includes:

Advice on which features best suit your research question (Origin vs Ethnicity, gender classification thresholds, alphabet coverage)
Guidance on structuring batch processing and handling edge cases
Recommendations for reproducible analysis pipelines

A proven track record in academia

Namsor is already used in over 600 academic publications and cited in more than 1,200 Google Scholar results. This includes studies published by Harvard, Columbia, Yale, HEC and in top-tier venues such as Nature, The Lancet Global Health, PLOS ONE, British Journal of Surgery, Journal of Medical Internet Research, Scientometrics (Springer), Journal of the Medical Library Association and Internal and Emergency Medicine (Springer).

Elsevier and Springer Nature use Namsor internally for bibliometric gender analyses, including for the European Commission's She Figures reports.

How to apply

Contact the Namsor team with a short description of your project, team, methodology and target publication. You will usually receive a tailored offer within a few working days.

Question 24

What is Namsor V2?

Answer

Namsor V2 is the current production version of Namsor, available at namsor.app. It is the product used by researchers, enterprises and institutions worldwide.

A specialized morphological engine

Namsor V2 is built on a single specialized morphological model that analyzes the internal structure of names letter by letter, detecting cultural, linguistic and geographic signals embedded in roots, prefixes and suffixes. The model is trained on a proprietary dataset of 5 billion unique names.

Purpose-built for onomastics

Unlike general-purpose tools, Namsor V2 is dedicated entirely to the analysis of proper names. It covers gender detection, geographic origin, ethnicity and diaspora, country of residence, US race/ethnicity, Indian name classification, name parsing, name type recognition and phone number formatting.

Transparent and privacy-first

Namsor V2 includes an Explainability API that details the reasoning behind each classification in Python, as well as an anonymized mode (irreversible SHA hashing) and an opt-out from machine learning.

Independently validated

Namsor V2 is the version benchmarked by Elsevier, Harvard, the University of Chicago, Uber and in multiple peer-reviewed studies. It is cited in over 1,200 Google Scholar publications.

Question 25

What is Namsor V3 and how does it differ from V2?

Answer

Namsor V3 is the next generation of Namsor's name analysis platform, available on request at namsor.ai. It represents a fundamental architectural evolution from V2.

From one model to three, on a massively expanded dataset

Namsor V2 relies on a single morphological model trained on 5 billion unique names. Namsor V3 moves to three models combined in a single pipeline, trained on a dataset massively expanded to 13 billion unique names:

Improved morphological model: letter-by-letter analysis of name structure (roots, prefixes, suffixes), in a deeply reworked version compared to V2
Statistical model (new): a brand new layer in V3, refining probabilities based on the largest proprietary name dataset in the industry, now expanded to 13 billion unique names
Semantic model (new): a large language model that captures contextual and cultural meaning beyond what morphology and statistics alone can detect

Why add a semantic model?

Morphological and statistical models excel at precision and consistency but can miss contextual nuances that a semantic model captures, for example that a name is associated with a specific historical period, social class or regional dialect. The semantic layer adds this depth without sacrificing the speed, privacy and determinism that define Namsor.

What stays the same

Namsor V3 retains the core principles that make Namsor trusted by researchers and institutions: deterministic results, sub-second latency, optional anonymization and an opt-out from machine learning.

What V3 unlocks beyond V2

Namsor V3 is a separate platform with its own API, opening capabilities that V2 does not offer:

Name embeddings: numerical vector representations of names for integration into your own machine learning models
Custom models: purpose-built solutions for fake name detection, fraud detection, romance scam detection, name transliteration and more
Model enhancement: use Namsor's name intelligence to improve your own predictive models (churn prediction, customer lifetime value, forecasting)

Available on request

Namsor V3 is accessible at namsor.ai. Contact Namsor to discuss access and migration from V2.

Question 26

What are Namsor name embeddings and how can I use them in my own models?

Answer

Namsor name embeddings are high-dimensional numerical vectors that encode the full onomastic fingerprint of a proper name. Each embedding captures thousands of signals in a single vector of several thousand dimensions.

What embeddings capture

A single name embedding encodes far more information than any individual Namsor feature. It contains signals related to gender, geographic origin, diaspora, religion, age patterns, social background, name type (real vs fabricated), historical period and many other dimensions that are not classifiable by human experts. These signals are learned from Namsor's proprietary dataset of 13 billion unique names across 22 writing systems.

How to use them in your own models

Name embeddings can be injected as input features into any machine learning pipeline. Concrete applications include:

Fraud detection: a global money transfer leader uses Namsor embeddings to detect fake names in transaction flows
Operational forecasting: an international airport integrates name embeddings to improve passenger flow predictions
Fake profile detection: scientific organizations use embeddings to identify fabricated identities in their databases
Clustering and similarity search: group names by cultural proximity without relying on predefined categories
Custom classification: train your own models using name embeddings as features for churn prediction, customer lifetime value or any domain-specific task

Why embeddings, not just API calls

Namsor's standard API returns discrete classifications (male/female, country, ethnicity). Embeddings give you the raw signal, a dense numerical representation that your own models can exploit in ways that predefined categories cannot capture. You keep full control over the downstream logic.
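
As an illustration of similarity search on embeddings, the sketch below computes cosine similarity in plain Python. The vectors are made up and only four-dimensional; how embeddings are actually retrieved from Namsor V3 is not shown here and is not assumed.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up 4-dimensional vectors; real embeddings have thousands of dimensions.
emb = {
    "name_a": [0.9, 0.1, 0.0, 0.2],
    "name_b": [0.8, 0.2, 0.1, 0.3],
    "name_c": [0.0, 0.9, 0.7, 0.1],
}
closest = max(("name_b", "name_c"),
              key=lambda k: cosine_similarity(emb["name_a"], emb[k]))
```

In this toy data `name_b` is the nearest neighbour of `name_a`. The same vectors could equally be fed as feature columns into a classifier or clustering algorithm of your choice.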

Available on Namsor V3

Name embeddings are accessible through the Namsor V3 platform at namsor.ai. Contact Namsor to discuss integration.

Question 27

Is Namsor GDPR, CCPA and EU AI Act compliant?

Answer

Yes. Namsor is fully compliant with the three major regulatory frameworks governing data protection and artificial intelligence.

GDPR (General Data Protection Regulation)

Namsor applies data minimization principles, collecting only what is essential for model operation. Users retain full control over their data: the learnable option can be deactivated to prevent data from contributing to model training, and anonymized mode irreversibly hashes name data with SHA before it is logged. A Data Processing Agreement (DPA) is available for download.

CCPA (California Consumer Privacy Act)

Namsor's privacy architecture meets CCPA requirements for transparency, data access and deletion rights. The same anonymization and opt-out mechanisms that ensure GDPR compliance also satisfy CCPA obligations.

EU AI Act

Namsor is designed for compliance with the EU AI Act's requirements on algorithmic transparency and fairness. The Explainability API provides a detailed breakdown of how each classification is produced, enabling full traceability of origin, gender and ethnicity estimations. This level of transparency allows organizations to audit Namsor's reasoning and demonstrate compliance in regulated use cases.

Question 28

What is anonymized mode and how does Namsor encrypt name data?

Answer

Namsor gives users full control over how their data is stored and used, through two independent privacy settings available in the account page or via API.

Anonymized mode

When set to true, all processed names are irreversibly hashed with SHA before being stored, so the original name cannot be recovered from the stored value. Namsor retains only the hash, which it uses for deduplication (smart processing): redundant queries are still recognized, so you are not billed multiple times for the same name even with anonymized data.
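
The reason deduplication still works on hashed data is that a hash function is deterministic: the same input always yields the same digest. A minimal sketch, assuming SHA-256 (the text above only says "SHA") and assuming no normalization or salting, neither of which is documented here:

```python
import hashlib


def anonymize(name: str) -> str:
    """Return a hex SHA-256 digest of the name (SHA-256 is an assumed variant)."""
    return hashlib.sha256(name.encode("utf-8")).hexdigest()


a = anonymize("Ana Costa")
b = anonymize("Ana Costa")   # same name -> same digest, so duplicates match
c = anonymize("Ana Costas")  # one extra letter -> unrelated digest
```

Comparing `a` and `b` identifies a repeated query without the raw name ever being stored, while `c` shares nothing recognizable with either.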

Learnable mode

When set to false, the data processed through your API key does not feed Namsor's machine learning algorithm. Your data is used for classification only and does not contribute to model improvement.

Storage encryption

All data logs, whether anonymized or not, are secured using AES encryption before being stored.

Both settings are independent

You can disable machine learning while keeping full data logs, or enable anonymization while allowing machine learning. The two controls can be combined to match your organization's privacy requirements.

Question 29

Is name analysis with Namsor privacy-safe compared to using LLMs?

Answer

Yes. Namsor is significantly more privacy-safe than sending names to a general-purpose LLM and offers controls that most LLM providers don't.

The problem with LLMs for name analysis

When you send names to a general-purpose LLM, the data typically:

Leaves your infrastructure and travels to a third-party provider
May be retained for model training, depending on the provider's terms
Is processed by a model that wasn't designed for name analysis and has no dedicated privacy controls
Is often logged in prompt history, accessible to employees of the LLM provider

How Namsor is different

Purpose-built for name analysis. Namsor only processes names, not broader personal data or context. The scope of data exposure is minimal.
Opt-out of machine learning. Set learnable=false and your data never feeds Namsor's algorithm. Your names are used for classification only.
Anonymized mode. Set anonymized=true and Namsor irreversibly hashes names with SHA before logging. No raw name data is stored.
AES encryption. All data logs are encrypted with AES at rest.
Data Processing Agreement. A standard DPA is available for download and covers your GDPR and CCPA obligations.

Bottom line

Sending names to a general LLM exposes more data, with fewer controls. Namsor limits exposure to names only and gives you explicit controls over storage, training and anonymization.

Question 30

What is Namsor's API Explainability feature and how does it ensure transparency?

Answer

API Explainability is a Namsor feature that returns a detailed explanation of how the AI arrived at each classification, in the form of a closed mathematical formula including both training data features and the complete model logic.

What it returns

When enabled, the API response includes an additional field containing the AI's reasoning as executable Python code. This code shows exactly which features, weights and decision paths produced the result for that specific name.

Why it matters for compliance

The EU AI Act requires bias detection and correction mechanisms in high-risk AI systems. Namsor's Explainability output can be stored as audit evidence, documenting how each inference was made. This is particularly valuable for regulated industries (finance, insurance, recruitment, healthcare) where decisions based on demographic inference must be defensible.

How it's delivered

The explanation is returned as Python logic. Namsor recommends removing tabs and line breaks for clean execution.

Cost and activation

Additional cost: 50 credits per name processed
Contact the Namsor team to enable Explainability on your account
Add the header X-OPTION-EXPLANABILITY: true to your requests
Namsor requires signed documentation and an NDA before activation, to protect the intellectual property of the underlying model
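
A minimal Python sketch of attaching the Explainability header to a request. Only the header name X-OPTION-EXPLANABILITY comes from the text above. The X-API-KEY header, the placeholder URL and the helper names are assumptions for illustration, and no request is actually sent.

```python
import urllib.request

# Header name as given in this FAQ.
EXPLAIN_HEADER = "X-OPTION-EXPLANABILITY"


def build_headers(api_key: str, explain: bool = False) -> dict[str, str]:
    """Build request headers; the API key header name is an assumption."""
    headers = {"X-API-KEY": api_key, "Accept": "application/json"}
    if explain:
        headers[EXPLAIN_HEADER] = "true"
    return headers


def build_request(url: str, api_key: str, explain: bool = False) -> urllib.request.Request:
    """Prepare (but do not send) a GET request carrying the headers."""
    return urllib.request.Request(url, headers=build_headers(api_key, explain), method="GET")


if __name__ == "__main__":
    # Placeholder URL: replace with the endpoint from the official API documentation.
    req = build_request("https://api.example.invalid/gender/Jane/Doe", "YOUR_API_KEY", explain=True)
    print(req.get_method(), req.full_url)
```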

Who it's for

Teams building high-risk AI systems, conducting algorithmic audits, preparing AI Act compliance documentation, or requiring detailed traceability for internal governance.

Question 31

Does Namsor offer on-premise or private cloud deployment?

Answer

Yes. Beyond the standard SaaS API on namsor.app, Namsor offers two dedicated deployment options for organizations with strict data sovereignty, compliance or infrastructure requirements.

On-premise deployment

Namsor's servers are installed directly within the client's infrastructure. Names never leave the client's network perimeter, which meets the most demanding data sovereignty and confidentiality requirements.

On-premise is suited to defense and intelligence agencies, national security bodies, healthcare providers, government institutions and organizations operating under strict regulatory frameworks.

Private cloud deployment

A dedicated Namsor instance runs on a private cloud, isolated from the main Namsor SaaS infrastructure. The client chooses the cloud region for processing, which is relevant for GDPR localization, regional compliance or latency optimization.

This option provides the benefits of a dedicated infrastructure without the operational complexity of an on-premise setup.

Who these options are for

Banks and financial services under strict data handling rules
Large enterprises with very high volumes of names to process
Organizations under specific regulatory constraints such as Swiss banking secrecy or European digital sovereignty
Defense, intelligence and government bodies
Any entity requiring full control over where and how names are processed

Discuss your deployment requirements

Each on-premise or private cloud project is scoped to the client's infrastructure, volume and compliance needs. Contact the Namsor team to discuss your requirements and get a tailored proposal.

Question 32

How can I detect gender and ethnicity bias in a dataset using name analysis?

Answer

Name analysis lets you measure the gender and ethnic composition of any dataset where names are available, even when self-reported demographic data is missing or incomplete. Namsor powers bias detection in CRMs, hiring pipelines, scientific publications, customer bases and editorial boards.

The typical workflow

Export your dataset (CSV, Excel, Google Sheet or API-accessible database)
Run Namsor on the name column using the relevant feature: Gender, Ethnicity, Origin, or US Race
Aggregate the results by the dimension you care about (team, department, year, region)
Compare distributions against your reference benchmark (national population, industry average, target representation)
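
The last two steps of this workflow (aggregate, then compare against a benchmark) can be sketched in a few lines of Python. The label values and benchmark figures below are invented for the example, and the function names are hypothetical.

```python
from collections import Counter


def shares(labels):
    """Turn a list of inferred labels into a {label: share} distribution."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: count / total for label, count in counts.items()}


def gaps(observed, benchmark):
    """Observed share minus benchmark share for every benchmark category."""
    return {key: observed.get(key, 0.0) - benchmark[key] for key in benchmark}


if __name__ == "__main__":
    # Invented example: inferred gender labels for one department.
    inferred = ["female", "male", "male", "male"]
    benchmark = {"female": 0.5, "male": 0.5}
    print(gaps(shares(inferred), benchmark))
```

A negative gap means the category is under-represented relative to the chosen benchmark. Which benchmark is appropriate remains a decision for the analyst.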

What you can measure

Gender representation: share of women vs men in hiring shortlists, promotions, authorship, customer base, editorial boards
Ethnic representation: share of each cultural background in the same contexts
Regional origin: geographical diversity of a population
US race/ethnicity distribution: for US-specific reporting aligned with Census categories

Why name analysis is the right tool

Self-reported demographic data is often missing, outdated or inconsistent. Name analysis reconstructs the distribution retroactively on any historical dataset, without asking individuals to disclose sensitive information. Namsor's output is meant to be aggregated into statistics, never used to label individuals.

Privacy and scope

Namsor returns probabilities, not certainties. Use name-based inference at the group level (statistics, reporting, audits), not to make individual decisions about people. This is both an ethical best practice and an EU AI Act requirement for systems that rely on inferred attributes.

Question 33

Can Namsor detect fake names and bots?

Answer

Yes. Namsor detects fake names with two levels of accuracy, depending on your requirements and volume.

Basic level: Name Type Recognition combined with Ethnicity

Namsor flags potentially fake names by combining its Name Type Recognition feature (anthroponym, brand, toponym, pseudonym classification) with Ethnicity analysis. This combined approach delivers solid accuracy for screening, risk scoring and exploratory detection work, and is accessible on request by contacting the Namsor team.

Expert level: Namsor V3 embeddings and custom models

For production-grade fake name detection, Namsor V3 provides name embeddings and custom models that capture the fine morphological, phonetic and cultural patterns that distinguish real names from generated or synthetic ones. Two options are available:

Embeddings: plug Namsor V3 embeddings (several thousand dimensions per name) into your own fraud detection model to significantly improve its performance
Custom models: have the Namsor team build a fake name detection model trained on your data, delivered as an API endpoint

Continuous improvement with feedback loops

Custom V3 models can be enhanced with a feedback loop: as your team labels detected names as true or false positives, the model retrains on this signal and improves over time. This adaptive approach keeps detection accuracy high even as fraud patterns evolve.

Proven accuracy

In a test on real data from one of the global leaders in money transfer, a Namsor V3 custom model reached over 94% accuracy in detecting fake names.

Why it works

Fake names, bots and synthetic profiles leave linguistic traces: improbable phoneme sequences, cross-cultural inconsistencies, low-frequency morphological patterns. Namsor V3 was trained on 13 billion names and captures these signals in its embeddings, outperforming generic anti-fraud machine learning that relies on behavioral or network-based features only.

Who it's for

Trust and safety teams, fraud prevention units, KYC and onboarding flows, marketplaces, social platforms, neobanks, money transfer and remittance companies.

Get started

To discuss fake name detection for your use case, contact the Namsor team. To learn more about Namsor V3 embeddings and custom models, visit namsor.ai.

Question 34

How can name analysis improve KYC and fraud prevention?

Answer

Name analysis strengthens KYC and fraud prevention by enriching identity data, scoring risk and flagging suspicious patterns at multiple points in the customer journey.

Where name analysis fits in a KYC workflow

Onboarding: verify that a submitted name matches expected patterns for the declared country, language and cultural background. Catch inconsistencies before an account is opened.
Risk scoring: incorporate name-derived features (origin, ethnicity, cultural consistency) into your risk engine to improve the signal without requiring additional PII.
Ongoing monitoring: re-analyze names periodically to detect identity manipulation or gradual drift in a customer profile.
Sanctions and PEP screening support: normalize and transliterate names across alphabets before matching them against watchlists, reducing false negatives on non-Latin names.

Fraud pattern detection

Namsor helps fraud teams detect patterns associated with several types of financial crime, including account takeover attempts, impersonation, romance scams and authorized push payment (APP) fraud. In these cases, analyzing the names involved in a transaction, alongside other risk signals, reveals anomalies that purely behavioral or network-based fraud models miss.

To protect the integrity of these detection systems, Namsor does not publish the specific linguistic or statistical markers used in its fraud models. Customers receive these details under NDA during integration.

Real-world use

Several global leaders in money transfer and remittance use Namsor to strengthen their fraud prevention stack, benefiting from Namsor's coverage of 22 alphabets, 99.99% classification rate and V3 custom models trained on industry-specific data.

Why name analysis complements traditional fraud models

Behavioral models (login patterns, device fingerprints, transaction velocity) detect what someone is doing. Name analysis helps detect who someone is claiming to be. Combined, they reduce false positives and catch sophisticated identity-based fraud that behavioral signals alone miss.

Get started

To discuss KYC and fraud prevention for your specific stack, contact the Namsor team. For production-grade custom fraud detection models, visit namsor.ai.

Question 35

How can name analysis power marketing segmentation and audience analytics?

Answer

Name analysis lets marketing teams segment audiences, personalize campaigns and analyze customer or influencer bases by cultural origin, language, gender and ethnicity, all from data that most organizations already have: names.

International marketing segmentation

Split your contact database, email list or CRM by cultural origin, language group or country of residence to run targeted campaigns. Personalize message tone, language, imagery and offers by segment. Allocate media budget based on where your real audience is, not where you assumed it would be.

Audience analytics on your existing base

Understand the real composition of your customer base, newsletter subscribers, app users or community members. Namsor reconstructs the demographic distribution retroactively, even when self-reported data is missing or incomplete. Typical questions you can answer:

What share of my customers comes from each cultural background?
How does gender distribution vary across my product lines?
Which regions are over- or under-represented versus my target market?
How has my audience composition evolved over the past 3 years?

Influencer and partnership mapping

For influencer marketing, brand partnerships or community programs, Namsor helps identify and group creators by cultural origin, language and gender. This enables:

Building diverse influencer rosters that reflect your target markets
Matching creators to campaigns by linguistic or cultural fit
Measuring the demographic reach of an influencer's follower base (when follower names are accessible)

Integration with your stack

Namsor connects to CRMs and marketing platforms through the Google Sheets add-on, CSV/Excel tool, Zapier, Make, n8n, or the REST API. Analyses run in real time on form submissions or in batch on existing databases.

Privacy and compliance

Use name-based segmentation at the aggregate level for campaign strategy, not to make individual decisions about consumers. Namsor is GDPR and CCPA compliant and offers anonymized mode for privacy-sensitive workflows.

Question 36

How is Namsor used for AI bias detection and EU AI Act compliance?

Answer

The EU AI Act requires providers and deployers of high-risk AI systems to detect, document and mitigate discriminatory bias. Namsor helps on both sides of this requirement: auditing existing AI systems for bias, and documenting AI decisions with a verifiable audit trail.

Auditing an existing AI system for bias

When an AI system (hiring tool, credit scoring model, insurance pricing engine, fraud detector) makes decisions about individuals, the AI Act requires evidence that outcomes are not systematically biased against protected groups. Namsor lets you test this by:

Running names from your training data or production logs through Namsor to infer gender, origin or ethnicity at the aggregate level
Segmenting your AI system's decisions (accept/reject, approve/deny, high/low score) by these inferred demographic groups
Measuring disparities in outcomes across groups and comparing them against fairness thresholds (disparate impact ratio, statistical parity, equalized odds)

This approach fits directly into the bias detection and correction duties defined in AI Act Article 10 (data governance) and Article 15 (accuracy, robustness and cybersecurity).
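
To make two of the fairness metrics named above concrete, here is a minimal Python sketch computing selection rates per inferred group, the disparate impact ratio and the statistical parity difference. The group labels and decisions are invented, and the function names are hypothetical. Any pass/fail threshold is left to the auditor.

```python
def selection_rates(decisions):
    """decisions: iterable of (group, accepted) pairs. Returns {group: rate}."""
    totals, accepted = {}, {}
    for group, ok in decisions:
        totals[group] = totals.get(group, 0) + 1
        accepted[group] = accepted.get(group, 0) + int(bool(ok))
    return {group: accepted[group] / totals[group] for group in totals}


def disparate_impact_ratio(rates, protected, reference):
    """Selection rate of the protected group divided by the reference group's."""
    if rates[reference] == 0:
        raise ValueError("reference group has a zero selection rate")
    return rates[protected] / rates[reference]


def statistical_parity_difference(rates, protected, reference):
    """Selection rate of the protected group minus the reference group's."""
    return rates[protected] - rates[reference]


if __name__ == "__main__":
    # Invented decisions: (inferred group, accepted by the audited system).
    log = [("female", True), ("female", True), ("female", False), ("female", False),
           ("male", True), ("male", True), ("male", True), ("male", True)]
    rates = selection_rates(log)
    print(disparate_impact_ratio(rates, "female", "male"))
    print(statistical_parity_difference(rates, "female", "male"))
```

Equalized odds additionally requires ground-truth outcomes per individual, so it is not covered by this sketch.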

Documenting AI decisions with Explainability

When your own AI system uses Namsor for inference, the API Explainability feature returns the complete reasoning behind each classification as executable Python code, including training features and model weights. This output can be stored as verifiable audit evidence for every decision, satisfying the transparency requirements of AI Act Article 13 (transparency and provision of information to deployers).

The sensitive data exception

The EU AI Act introduces a specific exception in Article 10(5): providers may process special categories of personal data (ethnic origin, gender identity) specifically to detect and correct bias in high-risk AI systems, as a matter of substantial public interest. Namsor's name-based inference is designed to support this lawful use case while respecting data minimization principles.

Industries using Namsor for AI Act preparation

Finance and insurance: bias audits on credit scoring, pricing and underwriting models
Recruitment and HR: fairness testing on CV screening and candidate ranking algorithms
Healthcare: equity analysis of clinical decision support tools
Public sector: audits of algorithmic decision systems used by administrations

Critical usage principle

Use Namsor's output only at the group level for bias detection and correction. Do not use name-based inference to make individual decisions about people: this would defeat the purpose of the AI Act and create new discrimination risks.

Get started

For Explainability activation and AI Act documentation support, contact the Namsor team. A signed NDA is required before Explainability can be enabled.

Question 37

How can Namsor improve data quality and enrich CRM databases?

Answer

Namsor helps data teams clean, validate and enrich customer databases at scale by turning raw names into structured, actionable attributes: split first/last name, detect invalid entries, infer gender, origin, country of residence, ethnicity and more.

Data cleaning and validation

Detect invalid entries: the Name Type Recognition feature flags non-person entries in your name fields (brand names, placeholders like "TEST" or "Customer", toponyms, nonsense strings). Filter these out before they pollute downstream processes.
Split full names: when first name and last name are merged in a single field, the Split Name feature separates them correctly, including for names that don't follow Western conventions.
Normalize across alphabets: Namsor handles names in 22 writing systems, reducing data inconsistencies in international databases.

Data enrichment

Add high-value attributes to every contact in your CRM:

Gender: populate a gender field when it's missing, for analytics or personalization
Origin: country of cultural origin (131 countries supported)
Country of residence: infer where a contact currently lives (247 countries supported)
Ethnicity / Diaspora: cultural background for segmentation (139 ethnicities supported)
US Race: for US-specific reporting aligned with 6 Census categories

Advanced deduplication with Namsor V3

Traditional CRM deduplication fails on name variants. Namsor V3 custom models use name embeddings to compute semantic similarity between name variants, detecting duplicates that exact-match algorithms miss:

Abbreviations and reordering: "Jean Dupont", "J. Dupont", "Jean Du Pont" and "Dupont, Jean" recognized as the same person
Accents and diacritics: "François", "Francois" matched together
Typos and misspellings: "Catherine", "Catherien" and "Cathrine" matched despite data entry errors
Transliteration variants: "Mohammed", "Mohamed" and "Muhammad" recognized as the same Arabic name; "Владимир", "Vladimir" and "Wladimir" identified as the same name written in different scripts

This is particularly valuable for:

Legacy databases: reconcile historical contacts entered under inconsistent formatting conventions
International CRMs: unify customer records across languages, scripts and regional naming conventions
Post-merger data consolidation: merge customer bases from multiple sources without losing or duplicating records

Advanced deduplication is available through custom V3 models. To discuss your specific use case, contact the Namsor team or visit namsor.ai.
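
For comparison, here is what a simple rule-based matching key looks like in Python. This is a baseline sketch, not Namsor's embedding approach: it handles accents, punctuation and word order, but, as the last check shows, it cannot absorb typos or transliteration variants. The function names are hypothetical.

```python
import unicodedata


def strip_accents(text):
    """Remove diacritics by decomposing characters and dropping combining marks."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def match_key(name):
    """Accent-free, case-insensitive, order-independent key for exact matching."""
    cleaned = strip_accents(name).casefold().replace(",", " ").replace(".", " ")
    return " ".join(sorted(cleaned.split()))


if __name__ == "__main__":
    print(match_key("Jean Dupont") == match_key("Dupont, Jean"))   # reordering
    print(match_key("François") == match_key("Francois"))          # diacritics
    print(match_key("Catherine") == match_key("Catherien"))        # typo: no match
```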

Privacy and usage principle

Enriched attributes are statistical inferences, not certified facts. Use them at the aggregate level for segmentation, analytics and reporting. Avoid using inferred attributes to make individual decisions about consumers. Namsor is GDPR and CCPA compliant and offers anonymized mode for privacy-sensitive workflows.

Question 38

How do international organizations use name analysis for migration and diaspora mapping?

Answer

International organizations use name analysis to map diasporas, track migration flows and estimate the size and composition of populations when traditional census or registration data is missing, incomplete or outdated. Namsor has powered several published studies for UN agencies, the World Bank and city governments.

The core workflow

Collect name data from professional sources: researcher databases (ORCID), labor market intelligence platforms (LinkedIn, job boards), public registries or administrative data
Run Namsor to infer origin, ethnicity or diaspora membership at the aggregate level
Apply refinement filters: exclude false positives from related cultural groups (e.g. distinguish Brazilian from Portuguese or Angolan names), add keyword filters on cities or institutions of origin
Enrich with professional and educational attributes: job titles, degree level, industry, employer, field of study
Aggregate by geography or professional segment: measure the share of each diaspora group by country, region, city, industry or institution
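
The refinement and aggregation steps above can be sketched in Python as follows. The record fields (origin, probability, education, city), the threshold and the sample data are all hypothetical, and combining the two filters with a logical AND is only one possible reading of the workflow.

```python
from collections import Counter


def refine(records, target_origin, min_probability, keywords=()):
    """Keep records inferred as target_origin with enough confidence.

    When keywords are given, a record must also mention one of them in its
    free-text "education" field (a hypothetical field for this sketch).
    """
    kept = []
    for rec in records:
        if rec["origin"] != target_origin or rec["probability"] < min_probability:
            continue
        text = rec.get("education", "").casefold()
        if keywords and not any(k.casefold() in text for k in keywords):
            continue
        kept.append(rec)
    return kept


def share_by(records, field):
    """Share of records per value of the given field, e.g. host city."""
    counts = Counter(rec[field] for rec in records)
    total = sum(counts.values())
    return {key: n / total for key, n in counts.items()} if total else {}


if __name__ == "__main__":
    sample = [
        {"origin": "BR", "probability": 0.90, "education": "Universidade de São Paulo", "city": "Boston"},
        {"origin": "PT", "probability": 0.80, "education": "Universidade de Lisboa", "city": "Boston"},
        {"origin": "BR", "probability": 0.40, "education": "", "city": "Cambridge"},
        {"origin": "BR", "probability": 0.85, "education": "UNICAMP, Campinas", "city": "Cambridge"},
    ]
    print(share_by(refine(sample, "BR", 0.6), "city"))
```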

What you can measure

Size of a diaspora abroad: how many people of origin X currently live in country Y (e.g. the IOM study identified 26,945 researchers of Armenian origin living outside Armenia)
Professional and educational composition: degree levels, fields of study, industries, seniority and employer types within a diaspora
Geographic concentration: where diaspora communities settle within a host country, down to metropolitan area or neighborhood level
Skills and industry specialization: which sectors a diaspora is concentrated in (healthcare, engineering, research, tech), enabling targeted knowledge transfer programs

Published examples

IOM mapped the Armenian diaspora in the United States and France by running Namsor onomastic analysis on the ORCID researcher database and ZoomInfo professional profiles, identifying 26,945 scientists of Armenian origin living outside Armenia. Namsor has also powered IOM diaspora mapping projects for Georgia and Azerbaijan.
United Nations (ECLAC) used Namsor for its study "Tracking the digital footprint in Latin America and the Caribbean", applying name-based inference to understand population flows in the region.
Boston Planning & Development Agency mapped the Brazilian scientific diaspora in Greater Boston by combining Namsor's diaspora and origin models with labor market data, applying filters to distinguish Brazilian professionals from other Lusophone groups (Portuguese, Angolan, Cabo Verdean).

Why name analysis is effective for this use case

Migration and diaspora research often faces the problem of missing self-reported data: people don't always register their ethnicity or origin in administrative systems, census coverage varies, and second-generation migrants are often invisible in traditional statistics. Name analysis reconstructs the demographic picture retroactively from data that is already available (names in professional registries, authorships, public records), without requiring new data collection.

Privacy and aggregation principle

Diaspora mapping with Namsor is always done at the aggregate level (populations, neighborhoods, professional groups), never to identify or track specific individuals. This is both an ethical requirement and a GDPR/CCPA compliance best practice.

Question 39

Can Namsor auto-detect language or salutation from a contact name?

Answer

Yes. By combining gender, origin and phone number inference, Namsor lets you auto-detect language, salutation, phone prefix and country from basic contact data, without asking the user to fill in additional fields.

Salutation and language from a name

Gender inference determines whether the contact is male or female, enabling the correct title (Mr / Mrs / Ms)
Origin or country of residence inference identifies the contact's cultural and linguistic background
Combine both to generate a localized salutation: "Herr" for a German male, "Madame" for a French female, "Señor" for a Spanish male, "Dear Ms" for an English-speaking female

Language detection from a name

Namsor's origin feature returns the most likely country of cultural origin from a name. Map this country to its primary language and you have a reliable language preference signal, without asking the contact to fill in an additional field.
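
Once gender and country have been inferred, the mapping to a salutation and a language is a plain lookup. The Python sketch below uses the four salutation examples given above. The two-letter country codes, the language codes, the label values "male" and "female" and the fallback defaults are assumptions made for the example.

```python
# (country code, gender) -> salutation, using the examples from this FAQ.
SALUTATIONS = {
    ("DE", "male"): "Herr",
    ("FR", "female"): "Madame",
    ("ES", "male"): "Señor",
    ("GB", "female"): "Dear Ms",
}

# Country code -> primary language code (illustrative subset).
PRIMARY_LANGUAGE = {"DE": "de", "FR": "fr", "ES": "es", "GB": "en"}


def salutation(country, gender, default=""):
    """Return a localized salutation, or the default when no rule is defined."""
    return SALUTATIONS.get((country, gender), default)


def language(country, default="en"):
    """Return the primary language for a country, or the default if unknown."""
    return PRIMARY_LANGUAGE.get(country, default)


if __name__ == "__main__":
    print(salutation("DE", "male"), language("DE"))
    print(salutation("FR", "female"), language("FR"))
```

Falling back to a neutral default when no rule matches avoids guessing a title for low-confidence or unsupported cases.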

Phone prefix and country from a name and phone number

When a contact provides a phone number alongside their name, Namsor's Phone Number Format feature identifies the international phone prefix, validates the number structure and infers the country code. This is particularly useful when:

Users enter a phone number without the international prefix
You need to validate that a phone number is consistent with the contact's name origin (fraud signal if mismatched)
You want to auto-route calls or SMS to the correct regional team

Where to use this

Contact forms: auto-populate salutation, language and phone prefix fields as soon as the name and number are entered
Email personalization: generate properly gendered and localized greetings at scale
CRM onboarding: enrich new contacts with language and country fields for routing to the right support or sales team
Call center routing: use the inferred language and phone country to connect callers to agents who speak their language
Direct mail and print: produce correctly addressed and titled correspondence across markets

Cost

Gender costs 1 credit. Origin costs 10 credits. Phone Number Format costs 11 credits. For a full contact enrichment (salutation + language + phone validation), running all three features on a single contact costs 22 credits.
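
The arithmetic can be checked with a short Python sketch. The per-feature prices are the ones quoted above, while the dictionary keys and function names are hypothetical.

```python
# Credits per name for each feature, as quoted in this FAQ.
CREDITS = {"gender": 1, "origin": 10, "phone_format": 11}


def credits_per_contact(features):
    """Total credits to run the given features on one contact."""
    return sum(CREDITS[feature] for feature in features)


def total_credits(contact_count, features):
    """Total credits to run the given features on a whole list of contacts."""
    return contact_count * credits_per_contact(features)


if __name__ == "__main__":
    full = ["gender", "origin", "phone_format"]
    print(credits_per_contact(full))        # 1 + 10 + 11
    print(total_credits(1000, full))        # for a 1,000-contact list
```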

Question 40

Does Namsor build custom name analysis models?

Answer

Yes. Namsor builds custom AI models trained on your data and your specific classification needs, delivered as a dedicated API endpoint on the Namsor V3 platform.

What a custom model is

A custom model extends Namsor's standard features beyond the built-in classifications (gender, origin, ethnicity). Instead of a generic taxonomy, the model is trained to answer a question specific to your business:

"Is this name likely fraudulent?"
"What caste group does this Indian name belong to?"
"Is this name transliterated from Mandarin?"
"Are these two name records the same person?" (cross-script and cross-format deduplication)
"Does this name match an entry on a sanctions or PEP list?" (sanctions screening optimization with fuzzy matching across alphabets and transliteration variants)
"Which prospects in my database look most like my top customers?" (lookalike and audience expansion using vector search with cosine similarity on name embeddings to identify prospects with similar cultural and demographic profiles)

Types of custom models

Classification models: assign names to categories specific to your domain (caste, religion, tribe, linguistic group, customer segment)
Detection models: identify patterns in names (fake names, bots, synthetic profiles)
Matching models: compare name records to detect duplicates across formats, scripts and transliterations, or to optimize sanctions and PEP list screening with fuzzy matching
Scoring models: assign a probability or risk score to each name based on your criteria
Lookalike models: use vector search (cosine similarity) on name embeddings to find prospects with cultural and demographic profiles similar to your best customers
Transliteration models: convert names between writing systems (e.g. Mandarin to Latin, Arabic to Latin)
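
The vector search behind lookalike models rests on cosine similarity. The Python sketch below shows the computation on tiny made-up vectors. Real name embeddings have several thousand dimensions and come from the Namsor V3 platform, so the vectors, names and function names here are purely illustrative.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def top_lookalikes(seed, candidates, k):
    """Return the k candidate names whose vectors are most similar to seed."""
    ranked = sorted(
        candidates.items(),
        key=lambda item: cosine_similarity(seed, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:k]]


if __name__ == "__main__":
    # Made-up 3-dimensional vectors standing in for real embeddings.
    best_customer = [1.0, 0.0, 0.0]
    prospects = {
        "prospect_a": [0.9, 0.1, 0.0],
        "prospect_b": [0.0, 1.0, 0.0],
        "prospect_c": [0.5, 0.5, 0.0],
    }
    print(top_lookalikes(best_customer, prospects, k=2))
```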

How the process works

Scoping: define the classification objective and the target taxonomy with the Namsor team
Data exchange: share labeled training data under NDA (Namsor provides guidance on data format and volume)
Model training: Namsor trains a custom model using V3 embeddings (several thousand dimensions per name) and your labeled data
Validation: review precision, recall and edge cases on a holdout test set
Delivery: the model is deployed as a dedicated API endpoint, ready for production integration
Continuous improvement: optionally, set up a feedback loop where your team labels predictions as correct or incorrect, and the model retrains periodically on this signal
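
For the validation step, precision and recall on a holdout set can be computed as in this minimal Python sketch. The labels are invented (1 means "fake name", 0 means "genuine") and the function name is hypothetical.

```python
def precision_recall(y_true, y_pred):
    """Return (precision, recall) for binary labels where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # correctly flagged
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # flagged but genuine
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


if __name__ == "__main__":
    holdout_truth = [1, 1, 0, 0]
    model_output = [1, 0, 1, 0]
    print(precision_recall(holdout_truth, model_output))
```

The same function can be rerun after each retraining cycle of the feedback loop to check that the model is actually improving.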

What you get

A dedicated API endpoint on the Namsor V3 platform (namsor.ai)
Trained on your data, tuned to your taxonomy
Batch and real-time inference
Optional feedback loop for continuous improvement
Documentation and integration support

Contact the Namsor team to discuss your custom model requirements.

Question 41

What industries has Namsor built custom AI models for?

Answer

Namsor has built custom AI models for organizations across several industries. While most engagements are covered by NDAs, the following examples illustrate the range of domains where Namsor V3 embeddings and custom models deliver results.

International organizations and development

The World Bank and the IOM commissioned custom models for Indian names: caste group classification, religion estimation and sub-region identification, enabling research on internal migration and social inequality.

Namsor has also built models with different levels of geographic and ethnic granularity depending on the client's needs, including ethnicity classification in Australia and regional segmentation models adapted to specific national contexts.

Financial services and money transfer

Custom V3 models detect fake names, fraudulent identities and authorized push payment (APP) fraud patterns in transaction flows. In tests on real data from one of the global leaders in money transfer, a custom model reached over 94% accuracy in detecting fake names.

Transportation and aviation

Custom models built on name embeddings power passenger flow forecasting at international airports, using the cultural and geographic profile of passenger names to improve demand predictions by route and season.

Global identity verification

Namsor has built bidirectional transliteration models for a global identity verification provider: Latin to Mandarin and Mandarin to Latin, Latin to Kanji and Kanji to Latin. These models power an intelligent name translation engine used in KYC, electronic identity verification and PEP/sanctions screening across writing systems.

Security and intelligence

Custom models support risk analysis and intelligence workflows where name-based demographic inference is a critical signal.

Marketing and social listening

Namsor is currently developing custom models for synthetic account detection on social platforms, helping brands and agencies identify fake profiles and assess the authenticity of online audiences.

Discuss your industry

These examples represent a fraction of what Namsor V3 custom models can do. To explore a model for your specific domain, contact the Namsor team or visit namsor.ai.
