
Week 10: Word Embeddings II
Quick introduction to Word Embedding + Matrix Factorization
Coding
In-class Exercise: Applied Readings
We saw ONE way to estimate word vectors with neural networks: Word2Vec
However, Word2Vec embeddings themselves are rarely used in modern deep learning (DL)
Think of Word2Vec as a general approach to learning dense representations of words via self-supervision.
But… many distinct neural network (NN) architectures now do this.
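To make "self-supervision" concrete, here is a minimal sketch of how training examples can be built from raw text alone, in the style of a skip-gram setup. The function name `skipgram_pairs`, the toy sentence, and the window size are illustrative assumptions, not part of any library.

```python
def skipgram_pairs(tokens, window=2):
    """Build (center, context) training pairs from a list of tokens.

    The labels come from the text itself: each word is paired with the
    words that appear within `window` positions of it. No human
    annotation is needed, which is what "self-supervision" refers to.
    """
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # a word is not its own context
                pairs.append((center, tokens[j]))
    return pairs


# Hypothetical toy sentence, for illustration only.
tokens = ["the", "senate", "passed", "the", "bill"]
pairs = skipgram_pairs(tokens, window=1)
for center, context in pairs:
    print(center, "->", context)
```

A model trained on such pairs learns to predict context words from a center word (or the reverse), and the dense vectors are a by-product of that prediction task.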


Count-based methods rely on word co-occurrence to learn embeddings:
Compute Unigram Probabilities: Measure how often each word appears in the corpus
Compute Word Co-occurrences: Count how often words appear together (e.g., within a fixed context window) across the entire corpus
Calculate PMI (Pointwise Mutual Information): Measures how much more often two words co-occur than expected by chance
\[ PMI(\text{word}_1, \text{word}_2) = \log \frac{P(\text{word}_1, \text{word}_2)}{P(\text{word}_1) \, P(\text{word}_2)} \]
Construct a Co-occurrence Matrix: Store PMI values in a large word-by-word matrix
Apply Singular Value Decomposition (SVD): Reduce the matrix to extract meaningful lower-dimensional word vectors
SVD is a matrix factorization technique used in linear algebra and data science
Reduces dimensionality by identifying important patterns in data: splits the matrix into fundamental patterns (singular vectors), ranked by importance (singular values)
The GloVe algorithm is a variation of this approach.
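The count-based steps above can be sketched in a few lines of NumPy. This is a toy illustration under stated assumptions: the three-sentence corpus, the window size, and the number of dimensions `k` are made up; word probabilities are taken from the row sums of the co-occurrence matrix rather than from raw unigram counts; and negative or undefined PMI values are clipped to zero (positive PMI, a common variant, not necessarily the one used in any specific paper).

```python
import numpy as np

# Hypothetical toy corpus, for illustration only.
corpus = [
    "the senate passed the bill".split(),
    "the house passed the budget".split(),
    "the court struck the law".split(),
]
window = 2  # assumed context window size

vocab = sorted({w for sent in corpus for w in sent})
idx = {w: i for i, w in enumerate(vocab)}
V = len(vocab)

# Step 1: count co-occurrences within the window.
counts = np.zeros((V, V))
for sent in corpus:
    for i, w in enumerate(sent):
        for j in range(max(0, i - window), min(len(sent), i + window + 1)):
            if j != i:
                counts[idx[w], idx[sent[j]]] += 1

# Step 2: turn counts into probabilities.
total = counts.sum()
p_joint = counts / total
p_word = counts.sum(axis=1) / total  # marginals of the co-occurrence matrix

# Step 3: PMI = log( P(w1, w2) / (P(w1) P(w2)) ); clip to positive PMI.
with np.errstate(divide="ignore"):
    pmi = np.log(p_joint / np.outer(p_word, p_word))
ppmi = np.maximum(pmi, 0.0)  # pairs that never co-occur (-inf) become 0

# Step 4: SVD, keeping the k largest singular values.
U, S, Vt = np.linalg.svd(ppmi)
k = 2
embeddings = U[:, :k] * S[:k]  # one k-dimensional vector per word

for w in vocab:
    print(w, np.round(embeddings[idx[w]], 3))
```

Scaling `U` by the singular values is one common choice; others scale by their square root or use `U` alone. On a real corpus the matrix is large and sparse, so a truncated sparse SVD would be the practical option.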
Let’s now discuss several applications of embeddings in social science papers. These papers show:
How to use embeddings to track semantic changes over time
How to use embeddings to measure emotion in political language
How to use embeddings to measure gender and ethnic stereotypes
And a favorite of political scientists, how to use embeddings to measure ideology.
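Many of these applications share one basic move: define a direction in the embedding space from contrasting words, then measure how other words align with it. The sketch below illustrates that idea only. The two-dimensional vectors are invented for illustration (real embeddings have hundreds of dimensions and come from a trained model), and the papers listed here each use their own, more careful measurement strategies.

```python
import numpy as np

# Hypothetical 2-d vectors, invented for illustration only.
vectors = {
    "man": np.array([1.0, 0.2]),
    "woman": np.array([-1.0, 0.2]),
    "engineer": np.array([0.6, 0.8]),
    "nurse": np.array([-0.6, 0.8]),
}


def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


# A semantic axis defined as the difference between two anchor words.
axis = vectors["man"] - vectors["woman"]

# Positive scores lean toward the first anchor, negative toward the second.
scores = {w: cosine(vectors[w], axis) for w in ("engineer", "nurse")}
print(scores)
```

In practice, researchers usually average over several anchor word pairs to build the axis, and compare scores across corpora or time periods rather than reading a single number in isolation.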
Work in pairs
Each pair will select one of the five applied papers for this week
Prepare a presentation about this paper for your colleagues (30min)
Your presentation needs to answer three key questions:
How are word embeddings used?
What substantive/applied insights are extracted from the embeddings approach?
Could this be done with non-embedding methods (any of the approaches we saw before embeddings)?
Every group will have 5 minutes to present
Rodman, E. “A Timely Intervention: Tracking the Changing Meanings of Political Concepts with Word Vectors.” Political Analysis 28, no. 1 (2020): 87-111.
Gennaro, Gloria, and Elliott Ash. “Emotion and reason in political language.” The Economic Journal 132, no. 643 (2022): 1037-1059.
Rheault, Ludovic, and Christopher Cochrane. “Word embeddings for the analysis of ideological placement in parliamentary corpora.” Political Analysis 28, no. 1 (2020): 112-133.
Kozlowski, Austin C., Matt Taddy, and James A. Evans. 2019. “The Geometry of Culture: Analyzing the Meanings of Class through Word Embeddings.” American Sociological Review 84, no. 5: 905–49.
Garg, Nikhil, Londa Schiebinger, Dan Jurafsky and James Zou. 2018. “Word embeddings quantify 100 years of gender and ethnic stereotypes.” Proceedings of the National Academy of Sciences 115(16):E3635–E3644.