I am trying to reproduce the results of this paper: https://arxiv.org/pdf/1607.06520.pdf
Specifically this part:
To identify the gender subspace, we took the ten gender pair difference vectors and computed its principal components (PCs). As Figure 6 shows, there is a single direction that explains the majority of variance in these vectors. The first eigenvalue is significantly larger than the rest.
I am using the same set of word vectors as the authors (Google News Corpus, 300 dimensions), which I load into word2vec.
The 'ten gender pair difference vectors' the authors refer to are computed from the following word pairs:
I've computed the differences between each normalized vector in the following way:
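import gensim
import numpy as np

# Load the pretrained Google News vectors (300 dimensions).
model = gensim.models.KeyedVectors.load_word2vec_format('GoogleNews-vectors-negative300.bin', binary=True)
model.init_sims()  # precompute the L2-normalized vectors used by use_norm=True

pairs = [('she', 'he'),
         ('her', 'his'),
         ('woman', 'man'),
         ('Mary', 'John'),
         ('herself', 'himself'),
         ('daughter', 'son'),
         ('mother', 'father'),
         ('gal', 'guy'),
         ('girl', 'boy'),
         ('female', 'male')]

# One row per pair: normalized vector of the first word minus that of the second.
difference_matrix = np.array([model.word_vec(a, use_norm=True) - model.word_vec(b, use_norm=True) for a, b in pairs])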
I then perform PCA on the resulting matrix, with 10 components, as per the paper:
from sklearn.decomposition import PCA
pca = PCA(n_components=10)
pca.fit(difference_matrix)
However, I get very different results when I look at pca.explained_variance_ratio_:
array([ 2.83391436e-01, 2.48616155e-01, 1.90642492e-01,
9.98411858e-02, 5.61260498e-02, 5.29706681e-02,
2.75670634e-02, 2.21957722e-02, 1.86491774e-02,
1.99108478e-32])
or with a chart:
The first component accounts for less than 30% of the variance when it should be above 60%!
The results I get are similar to what I get when I try to do the PCA on randomly selected vectors, so I must be doing something wrong, but I can't figure out what.
Note: I've tried without normalizing the vectors, but I get the same results.
The authors released the code for the paper on GitHub: https://github.com/tolga-b/debiaswe
Specifically, you can see their code for creating the PCA plot in this file.
Here is the relevant snippet of code from that file:
def doPCA(pairs, embedding, num_components = 10):
    matrix = []
    for a, b in pairs:
        center = (embedding.v(a) + embedding.v(b))/2
        matrix.append(embedding.v(a) - center)
        matrix.append(embedding.v(b) - center)
    matrix = np.array(matrix)
    pca = PCA(n_components = num_components)
    pca.fit(matrix)
    # bar(range(num_components), pca.explained_variance_ratio_)
    return pca
Based on the code, it looks like they take the difference between each word in a pair and the average vector of that pair. To me, it's not clear that this is what they meant in the paper. However, I ran this code with their pairs and was able to recreate the graph from the paper.
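One likely explanation for the gap, offered as my own reading rather than something the paper or the repository states: sklearn's PCA subtracts the column mean before decomposing. With one difference vector per pair, that mean is essentially the direction the pairs share, so centering removes it and only pair-specific variation is left. In doPCA each pair contributes two rows that are exact negatives of each other, so the column mean is already zero and the shared direction survives as the first component. Below is a minimal sketch on synthetic data (not the Google News vectors); the helper explained_variance_ratio and the synthetic construction are assumptions made for illustration, and the helper uses a NumPy SVD in place of sklearn's PCA.

```python
import numpy as np


def explained_variance_ratio(matrix):
    """Hypothetical helper: per-component variance ratios after column centering.

    Centering first and then taking an SVD mirrors what a standard PCA does.
    """
    centered = matrix - matrix.mean(axis=0)
    singular_values = np.linalg.svd(centered, compute_uv=False)
    variances = singular_values ** 2
    return variances / variances.sum()


rng = np.random.default_rng(0)
dim, n_pairs = 300, 10

# Synthetic stand-in for the embedding (an assumption, not real word vectors):
# every pair differs along one shared unit direction plus small pair-specific noise.
shared = rng.normal(size=dim)
shared /= np.linalg.norm(shared)
centers = rng.normal(size=(n_pairs, dim))
noise = 0.3 * rng.normal(size=(n_pairs, dim)) / np.sqrt(dim)
half_diff = 0.5 * shared + noise  # plays the role of (a - b) / 2 for each pair
a = centers + half_diff
b = centers - half_diff

# Approach from the question: one difference vector per pair.
diff_ratio = explained_variance_ratio(a - b)

# Approach from doPCA: both members of each pair, each relative to the pair mean.
pair_center = (a + b) / 2
stacked = np.empty((2 * n_pairs, dim))
stacked[0::2] = a - pair_center
stacked[1::2] = b - pair_center
paired_ratio = explained_variance_ratio(stacked)

print("difference vectors, first component:", diff_ratio[0])
print("pair-centered rows, first component:", paired_ratio[0])
```

On data built this way, the first ratio from the difference-only matrix stays small while the pair-centered one is dominated by a single component, which matches the pattern in the question. The last ratio in the difference-only case is also numerically zero because ten mean-centered rows span at most nine dimensions, consistent with the 1.99e-32 value in the question's output.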