ethnoLab: inferring cultural patterns from ethnographic writing

This material is based upon work supported by the National Science Foundation under Grant Number 2024286. HDNS-I: Infrastructure for Knowledge Linkages from Ethnography of World Societies. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.


HRAF API


Download slides for AAA2023 presentation | Download Jupyter Notebook for examples | Notes for Demo materials

Word2Vec Network

What can artificial intelligence and machine learning contribute to anthropology, the most human of disciplines? Ethnography combines elements of science and the humanities, with ethnographic writing mediating exposition and science. Data science techniques such as Natural Language Processing (NLP) and statistical analysis can lead to a better understanding of the patterns that emerge from ethnographers' accounts of their observations and experiences within or across societies and cultures.

iKLEWS (Infrastructure for Knowledge Linkages from Ethnography of World Societies) is an NSF-funded HRAF project that uses data science to create digital semantic infrastructure and associated computer services supporting research based on HRAF’s growing ethnographic database, eHRAF World Cultures. A basic goal of iKLEWS is to greatly expand the support that eHRAF World Cultures offers for scientific, scholarly, and applied research.

Network of words seeded from Kinship

In NLP, the Word2Vec algorithm uses a neural network model trained on a large corpus of text to reconstruct the linguistic contexts of words. Each word in the corpus is assigned a vector in a shared space, which lets the model learn word associations, detect synonyms, and suggest other related terms. PCA plots are another way to visualize these vectors, here by projecting them into a 3D field. As the demonstration will show, a few concepts used as starting keywords can be expanded into lists of related terms drawn from the eHRAF corpus. Networks showing how these terms interconnect, together with relevant excerpts from ethnographic texts, can lead the researcher to further related material.
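The keyword expansion described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the three-dimensional vectors in `TOY_VECTORS` are invented for the example and are not drawn from the eHRAF corpus, and the helper names `cosine`, `most_similar`, and `expand_network` are hypothetical, not part of any iKLEWS service. A trained Word2Vec model would supply the vectors in practice, typically with hundreds of dimensions.

```python
import math

# Toy vectors invented for illustration only (NOT from eHRAF).
# A trained Word2Vec model would provide real vectors instead.
TOY_VECTORS = {
    "kinship":  [0.90, 0.10, 0.00],
    "lineage":  [0.85, 0.15, 0.05],
    "descent":  [0.80, 0.20, 0.10],
    "marriage": [0.60, 0.50, 0.10],
    "harvest":  [0.00, 0.20, 0.90],
    "planting": [0.10, 0.10, 0.95],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def most_similar(word, vectors, topn=3):
    """Return the topn (term, similarity) pairs closest to `word`."""
    scores = [
        (other, cosine(vectors[word], vec))
        for other, vec in vectors.items()
        if other != word
    ]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:topn]


def expand_network(seeds, vectors, topn=2, depth=2):
    """Expand seed terms breadth-first into a list of weighted edges.

    Each edge is (source, neighbor, similarity). Terms already visited
    are not expanded again, so the walk terminates.
    """
    edges = []
    seen = set(seeds)
    frontier = list(seeds)
    for _ in range(depth):
        next_frontier = []
        for word in frontier:
            for neighbor, score in most_similar(word, vectors, topn):
                edges.append((word, neighbor, round(score, 3)))
                if neighbor not in seen:
                    seen.add(neighbor)
                    next_frontier.append(neighbor)
        frontier = next_frontier
    return edges


if __name__ == "__main__":
    for source, target, score in expand_network(["kinship"], TOY_VECTORS):
        print(f"{source} -> {target} ({score})")
```

Starting from the single seed `kinship`, the expansion stays within the cluster of kinship-related toy terms and never reaches the agricultural ones, which is the behavior a seeded term network relies on. The resulting edge list could be passed to any graph-drawing tool to produce a network view.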

PCA plot of kinship

Learn more about how to find HRAF at the AAA/CASCA 2023 Annual Meeting.