What does a weighted word embedding mean?

Dawn17 · Dec 9, 2017 · Viewed 8.8k times · Source

In the paper that I am trying to implement, it says,

In this work, tweets were modeled using three types of text representation. The first one is a bag-of-words model weighted by tf-idf (term frequency - inverse document frequency) (Section 2.1.1). The second represents a sentence by averaging the word embeddings of all words (in the sentence) and the third represents a sentence by averaging the weighted word embeddings of all words, the weight of a word is given by tf-idf (Section 2.1.2).

I am not sure about the third representation, the weighted word embeddings, where the weight of each word is given by tf-idf. I am not even sure how word embeddings and tf-idf can be used together.

Answer

Maxim · Dec 9, 2017

Averaging word embeddings (possibly with weights) makes sense, though depending on the downstream algorithm and the training data this sentence representation may not be the best. The intuition is the following:

  • You might want to handle sentences of different length, hence the averaging (better than plain sum).
  • Some words in a sentence are usually much more valuable than others. TF-IDF is the simplest measure of a word's value. Note that because you divide by the sum of the weights, the scale of the result stays comparable across sentences.
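The weighted variant can be sketched in a few lines. This is a minimal illustration, not the paper's exact implementation: the corpus, the 2-d embedding vectors, and the plain `log(N / df)` IDF formula are all hypothetical choices for the example.

```python
import math
from collections import Counter

# Toy corpus (tokenized sentences) and toy 2-d embeddings -- hypothetical values.
corpus = [
    ["the", "cat", "sat"],
    ["the", "dog", "ran"],
    ["the", "cat", "ran"],
]
embeddings = {
    "the": [0.1, 0.1],
    "cat": [0.9, 0.2],
    "dog": [0.2, 0.9],
    "sat": [0.5, 0.5],
    "ran": [0.4, 0.6],
}

def idf(word, docs):
    # IDF = log(N / df): words appearing in every document get weight 0.
    df = sum(1 for d in docs if word in d)
    return math.log(len(docs) / df)

def weighted_sentence_vector(sentence, docs):
    # Weight of each word = its tf within the sentence times its idf over the corpus.
    tf = Counter(sentence)
    weights = {w: tf[w] * idf(w, docs) for w in set(sentence)}
    total = sum(weights.values())
    dim = len(next(iter(embeddings.values())))
    vec = [0.0] * dim
    for w, wt in weights.items():
        for i, x in enumerate(embeddings[w]):
            vec[i] += wt * x
    # Dividing by the total weight turns the weighted sum into a weighted average,
    # so the scale does not grow with sentence length.
    if total > 0:
        vec = [v / total for v in vec]
    return vec
```

For `["the", "cat", "sat"]`, the word "the" appears in every document, so its tf-idf weight is zero and the sentence vector is effectively a weighted combination of the "cat" and "sat" embeddings. The unweighted second representation from the paper is the same computation with every weight set to 1.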

See also this paper by Kenter et al. There is a nice post that compares these two approaches across different algorithms and concludes that neither is significantly better than the other: some algorithms favor simple averaging, some perform better with TF-IDF weighting.