A physical data visualization of Judaism, Islam, and Christianity
In Fall 2018, I took a Digital Humanities course at MIT with Kurt Fendt. For our final project, we pitched an idea to make an interactive visualization that explored the similarities/differences between the Abrahamic religions. I designed/developed the visualizations and created the machine learning models.
Initially, we really had no idea what the final form of this would look like. So we decided to focus on our primary research question: how can we visualize the similarities between religions?
But, more specifically, we wanted to know: how do we understand the context and similarity between religious texts?
This question is personally very important to me. I grew up in a secular Jewish household but was surrounded by plenty of religiously conservative people in my extended family (all of whom are Russian immigrants). I've thought a lot about religion in my free time and about how it can simultaneously divide and unify families. With my background in NLP and data visualization, I wanted to take a different approach than most comparative analyses of the Abrahamic faiths and focus on interaction, geometry, and physicality.
Carla and I primarily worked on the design and development of the visualizations. Peri and Abnell primarily worked on background research and qualitative analysis.
We started out wanting to create a walkable, immersive installation. The idea was that a user could walk among three large triangular panes, each of which would represent one of the religions. As you approach each pane, the most dominant words from that religion's text would appear inside it.
From a top down view, you can see how the basic user interaction would take place. The blue square in the center represents a user, and each panel lights up/activates as it gets approached.
As we entered the prototyping stage, we thought more critically about what the core idea of interacting physically instead of digitally meant. We also realistically assessed our time constraints and the skillsets of our team: nobody had a ton of experience with hardware, and we were most skilled (collectively) at data analysis, data visualization, and 3D modeling/fabrication.
We used publicly available CSVs of the Bible (KJV) and the Quran (Yusuf-Ali translation). We decided to use the version that our Muslim friends recommended so we could iterate quickly. That being said, we acknowledge that committing to one specific translation of a religious text can cost some accuracy in meaning.
Figuring out the "shape of similarity"
We knew that we wanted to visualize the overall shape of similarity, so we needed a reliable and mature way to calculate these geometries.
In Natural Language Processing (NLP), there's a popular algorithm called Word2Vec which is able to create vector representations of words, i.e. turns words into lists of numbers that can be compared mathematically. Intuitively, we can easily tell that words like "electricity" and "battery" are more related than "electricity" and "sausage," but how can we get a computer to understand these relationships? This is where Word2Vec shines.
In our case, Word2Vec uses a Continuous Bag of Words (CBOW) algorithm to create these word vectors. Imagine a little robot hovering over every word in a book. Word2Vec looks at the text around each word and learns how to predict the word in between.
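To make the "robot hovering over every word" picture a bit more concrete, here is a minimal sketch of the (context, target) pairs a CBOW-style model gets to learn from. This is only an illustration, not our project code: the `cbow_pairs` helper and the window size of 2 are assumptions I picked for the example, and the real algorithm goes on to train a small neural network on pairs like these.

```python
def cbow_pairs(tokens, window=2):
    """Yield (context_words, target_word) pairs, CBOW style.

    `window` is how many words on each side count as context.
    Hypothetical helper for illustration only.
    """
    for i, target in enumerate(tokens):
        left = tokens[max(0, i - window):i]
        right = tokens[i + 1:i + 1 + window]
        context = left + right
        if context:
            yield context, target


sentence = "in the beginning god created the heaven and the earth".split()
pairs = list(cbow_pairs(sentence, window=2))

# Show the first few pairs: the surrounding words, then the word to predict.
for context, target in pairs[:3]:
    print(context, "->", target)
```

Each pair says "given these neighbours, guess the word in the middle," which is the prediction task described above.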
Word2Vec gets better and better at predicting words based on their surrounding contexts and eventually can tell you which words appear more frequently with each other.
The entire process is based on distributional semantics, in particular the idea that similar words appear in similar contexts.
"You shall know a word by the company it keeps" - John Rupert Firth
Because similarity can be thought of as a form of "distance", where similar words are "closer" to each other, Word2Vec allows us to use their vector representations as coordinates. Lucky for us, coordinates can be easily plotted and compared.
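As a rough sketch of what "closer" means numerically, one common choice is cosine similarity between two word vectors. The three-dimensional vectors below are made up for illustration (real Word2Vec vectors have many more dimensions and come from training), and `cosine_similarity` is a helper name I chose for this example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up toy vectors, NOT output from a trained model.
toy_vectors = {
    "electricity": [0.9, 0.8, 0.1],
    "battery": [0.8, 0.9, 0.2],
    "sausage": [0.1, 0.2, 0.9],
}

sim_battery = cosine_similarity(toy_vectors["electricity"], toy_vectors["battery"])
sim_sausage = cosine_similarity(toy_vectors["electricity"], toy_vectors["sausage"])

print(f"electricity vs battery: {sim_battery:.2f}")
print(f"electricity vs sausage: {sim_sausage:.2f}")
```

With these toy numbers, "battery" scores much higher than "sausage," which matches the intuition from earlier.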
My insight was simple: if you can plot how similar words are to each other on a chart, then you can look at the words that outline any other word. In effect, these shape outlines can give a high-level view of what similarity looks like.
Below, you can see a chart which shows power and electricity related words plotted next to each other and some outlines of related words.
We knew that using Word2Vec, we could obtain the "coordinates" of all of the words in each religious text. After training the models and obtaining the word vectors for all the vocabularies, we were able to peek into the actual geometries that each text produced.
After labeling and some design cleanup, we were able to take the generated geometries and prep them for laser cutting and further design work.
Given our initial results from training three Word2Vec models, one for each religious text, we started our physical prototyping by laser cutting several small pieces of acrylic to see what looking through and around them felt like.
Once we were done with the small-scale physical prototype, we began designing the larger physical enclosure for the final piece. We wanted users to slide through different words that could be compared across religions.
Constructing a narrative
When it came down to choosing the words to display, we boiled it down to two options: objectively choose the words by looking at the top shared vocabularies, or subjectively create a narrative. The most shared words ended up being lackluster, so we decided to choose several key words based on our own experiences with religion and the story we wanted to tell.
We wanted to tell a story about how religions develop and wanted the visualizations to reflect that story. We chose "believe" to be the central point of the piece. "Vision" and "faith" would precede it (words about prediction and anticipation), followed by "remember" and "forget" (words about retrospection).
The coolest part about this linear narrative is that it's actually 2D. When each religion is lined up in chronological order, we can view not only how each religion develops over time, but how each thematic word develops through all the religions.
Computational design pipeline
We used the Gensim library to create our Word2Vec models. We also experimented with Doc2Vec for sentence- and verse-level embeddings and with LDA for clustering topics, but we decided to keep things simple and use only word embeddings. In plain English (for people who aren't familiar with machine learning): our program looked at how similar words were to each other instead of sentences or verses.
Overall, the entire pipeline looked something like this:
1. Create an aligned Word2Vec model for all three texts
2. Export a file of the word vectors
3. Convert all word vectors into geometries (specifically, convex hulls)
4. Export each geometry into DXF files to be rendered in Rhino
5. Align the DXF files in Rhino to prep for laser cutting on acrylic
6. Laser cut the pieces!
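The geometry steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not our project code: it assumes the word vectors have already been reduced to 2D coordinates (the coordinates below are made up), it uses Andrew's monotone chain algorithm for the convex hull, and it writes a bare-bones ASCII DXF containing only an ENTITIES section of LINE entities. The `convex_hull` and `hull_to_dxf` names are hypothetical, and a dedicated DXF library would be a more robust choice for real files.

```python
def convex_hull(points):
    """Convex hull of 2D points (monotone chain), counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # Positive when the turn o -> a -> b is counter-clockwise.
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # Drop the last point of each half because it repeats the other half's first.
    return lower[:-1] + upper[:-1]


def hull_to_dxf(hull, layer="0"):
    """Return minimal ASCII DXF text with one LINE entity per hull edge."""
    lines = ["0", "SECTION", "2", "ENTITIES"]
    for (x1, y1), (x2, y2) in zip(hull, hull[1:] + hull[:1]):
        lines += [
            "0", "LINE", "8", layer,
            "10", f"{x1:.4f}", "20", f"{y1:.4f}",  # start point
            "11", f"{x2:.4f}", "21", f"{y2:.4f}",  # end point
        ]
    lines += ["0", "ENDSEC", "0", "EOF"]
    return "\n".join(lines) + "\n"


# Made-up 2D positions for words near "believe" (illustrative only).
neighbour_coords = {
    "faith": (0.0, 0.0),
    "vision": (4.0, 0.0),
    "remember": (4.0, 3.0),
    "forget": (0.0, 3.0),
    "hope": (2.0, 1.0),
    "trust": (1.0, 2.0),
}

hull = convex_hull(list(neighbour_coords.values()))
dxf_text = hull_to_dxf(hull)
print(hull)
```

Writing `dxf_text` to a `.dxf` file gives one closed outline per word, which is the kind of shape we then arranged for laser cutting.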
Our final presentation for class can be viewed here. We didn't have quite enough time to install LEDs into the piece to make the engravings glow, but we are still very happy with how everything came out.