The goal of rethnicity is to provide a method to predict ethnicity from names of people.


I created this package to help applied researchers with their studies of ethnic bias and discrimination and, potentially, to help eliminate racial and ethnic disparities. By using this package, you agree to the following:

  1. You will NOT use this package for purposes other than academic research.
  2. You will NOT disclose the predicted ethnic group to the public, given the names data you might have.
  3. You will NOT discriminate against anyone on the basis of race or color by using the methods provided by this package.
  4. You understand that the method cannot make 100% correct predictions, and that you should treat the results with caution.
  5. You will not use the information to study individuals, but rather to study populations in the aggregate.

Again, please use the package responsibly, and refer to the methodology paper for details.


I recommend using the wonderful package manager pak to install this package:

# first install `pak` if not yet installed
# install.packages("pak")

# install the CRAN version
pak::pkg_install("rethnicity")

# or install the GitHub development version

Of course, you can also install the package the traditional way. Install the released version of rethnicity from CRAN with:

install.packages("rethnicity")

Or the development version from GitHub with:

# install.packages("devtools")

How to use this package?

There is a vignette that discusses how to use this package.
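
As a quick orientation before reading the vignette, here is a minimal sketch of a typical call. It assumes the package exports `predict_ethnicity()` with `firstnames`, `lastnames`, and `method` arguments, and that the result is a data frame with a `race` column; neither is confirmed here, so check the package documentation. The names are made-up placeholders. In keeping with the terms above, the sketch reports group shares in aggregate rather than labels for individuals.

```r
# Hypothetical placeholder names; replace with your own data.
first_names <- c("Maria", "Wei", "John")
last_names  <- c("Garcia", "Chen", "Smith")

# Only run the prediction if the package is installed.
if (requireNamespace("rethnicity", quietly = TRUE)) {
  # Assumed signature: predict_ethnicity(firstnames, lastnames, method).
  pred <- rethnicity::predict_ethnicity(
    firstnames = first_names,
    lastnames  = last_names,
    method     = "fullname"
  )

  # Inspect the returned structure before relying on any column names.
  str(pred)

  # Summarise in aggregate. Assumes a `race` column holds the predicted group.
  if ("race" %in% names(pred)) {
    print(prop.table(table(pred$race)))
  }
}
```

Because the predictions are probabilistic, the aggregate shares are more reliable than the label assigned to any single name.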

Documentation on Methodology

The complete description of the methodology is available on arXiv and published in SoftwareX. Please cite it as:

  title = {Rethnicity: {{An R}} Package for Predicting Ethnicity from Names},
  shorttitle = {Rethnicity},
  author = {Xie, Fangzhou},
  year = {2022},
  month = jan,
  journal = {SoftwareX},
  volume = {17},
  pages = {100965},
  issn = {2352-7110},
  doi = {10.1016/j.softx.2021.100965},

  title = {Rethnicity: Predicting {{Ethnicity}} from {{Names}}},
  shorttitle = {Predicting {{Ethnicity}} from {{Names}} with Rethnicity},
  author = {Xie, Fangzhou},
  year = {2021},
  month = sep,
  journal = {arXiv:2109.09228 [cs]},
  eprint = {2109.09228},
  eprinttype = {arxiv},


This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

This license was chosen to prohibit commercial use while keeping the package free and accessible for non-commercial academic use.