
The Implementation of Word2Vec



Phase 1: Assemble the Graph

1. Create the dataset and generate samples.

Input: center word; output: context word. Build a dictionary of the most common words and feed in the indices of those words.
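Building that dictionary takes only a few lines with collections.Counter. A minimal sketch, assuming words is the tokenized corpus (a hypothetical name) and VOCAB_SIZE caps the vocabulary:

from collections import Counter

def build_vocab(words, vocab_size):
    # Index 0 is reserved for 'UNK'; the vocab_size - 1 most frequent
    # words get the remaining indices.
    dictionary = {'UNK': 0}
    for word, _ in Counter(words).most_common(vocab_size - 1):
        dictionary[word] = len(dictionary)
    # Encode the corpus as indices; unknown words map to 0.
    index_words = [dictionary.get(word, 0) for word in words]
    return dictionary, index_words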

The inputs in each batch have shape [BATCH_SIZE], and the corresponding outputs have shape [BATCH_SIZE, 1].

import tensorflow as tf

# The dataset yields batches of (center word indices, target word indices)
# from the generator `gen`.
dataset = tf.data.Dataset.from_generator(gen,
                                         (tf.int32, tf.int32),
                                         (tf.TensorShape([BATCH_SIZE]), tf.TensorShape([BATCH_SIZE, 1])))
iterator = dataset.make_initializable_iterator()
center_words, target_words = iterator.get_next()
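The generator gen itself is not defined in this post. A minimal sketch of what it could look like, assuming index_words is the corpus already encoded as word indices and SKIP_WINDOW is the maximum context distance (both hypothetical names):

import numpy as np
import random

def generate_sample(index_words, context_window_size):
    # For each center word, yield (center, target) pairs for the context
    # words within a randomly sized window around it.
    for i, center in enumerate(index_words):
        context = random.randint(1, context_window_size)
        for target in index_words[max(0, i - context): i]:
            yield center, target
        for target in index_words[i + 1: i + context + 1]:
            yield center, target

def gen():
    single_gen = generate_sample(index_words, SKIP_WINDOW)
    while True:
        center_batch = np.zeros(BATCH_SIZE, dtype=np.int32)
        target_batch = np.zeros([BATCH_SIZE, 1], dtype=np.int32)
        try:
            for i in range(BATCH_SIZE):
                center_batch[i], target_batch[i] = next(single_gen)
        except StopIteration:
            return  # corpus exhausted; surfaces as tf.errors.OutOfRangeError
        yield center_batch, target_batch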

2. Define the weights (in this case, the embedding matrix)

Each row corresponds to the representation vector of one word. If each word is represented by a vector of size EMBED_SIZE, the embedding matrix has shape [VOCAB_SIZE, EMBED_SIZE]. The embedding matrix is initialized from a random distribution; in this case, a uniform distribution.

embed_matrix = tf.get_variable('embed_matrix',
                               shape=[VOCAB_SIZE, EMBED_SIZE],
                               initializer=tf.random_uniform_initializer())

3. Inference (compute the forward path of the graph)

The embed_matrix has dimension VOCAB_SIZE × EMBED_SIZE, with each row corresponding to the vector representation of the word at that index. Use tf.nn.embedding_lookup to get the slice of all corresponding rows in the embedding matrix.

embed = tf.nn.embedding_lookup(embed_matrix, center_words, name='embed')
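As a sanity check on what the lookup returns: for a 1-D list of ids, tf.nn.embedding_lookup is equivalent to gathering rows of the matrix. A toy example with made-up values:

# embedding_lookup on ids [1, 3] just gathers rows 1 and 3 of the matrix.
params = tf.constant([[0.0, 0.1], [1.0, 1.1], [2.0, 2.1], [3.0, 3.1]])
ids = tf.constant([1, 3])
looked_up = tf.nn.embedding_lookup(params, ids)  # shape [2, 2]
same_thing = tf.gather(params, ids)              # identical result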

4. Define the loss function

NCE loss is already implemented in TensorFlow, so we can just use it.

Note that in the TF 1.x API the third argument is labels and the fourth is inputs (their order was swapped relative to earlier releases), so it is safest to pass them as keyword arguments, as below. We also need a weight matrix and a bias to compute the NCE loss; they will be updated by the optimizer. After sampling, the final output is computed inside the tf.nn.nce_loss op.

# Create the NCE weight and bias.
nce_weight = tf.get_variable('nce_weight',
                             shape=[VOCAB_SIZE, EMBED_SIZE],
                             initializer=tf.truncated_normal_initializer(stddev=1.0 / (EMBED_SIZE ** 0.5)))
nce_bias = tf.get_variable('nce_bias', initializer=tf.zeros([VOCAB_SIZE]))
loss = tf.reduce_mean(tf.nn.nce_loss(weights=nce_weight,
                                     biases=nce_bias,
                                     labels=target_words,
                                     inputs=embed,
                                     num_sampled=NUM_SAMPLED,
                                     num_classes=VOCAB_SIZE))
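For intuition about what NCE buys us: the full softmax loss it approximates would require logits over the entire vocabulary at every step, whereas NCE only evaluates NUM_SAMPLED negative classes plus the true one. A sketch of the full-softmax version, for comparison only (not part of the model):

# The O(VOCAB_SIZE) loss that sampled losses like NCE avoid computing.
logits = tf.matmul(embed, nce_weight, transpose_b=True) + nce_bias  # [BATCH_SIZE, VOCAB_SIZE]
full_loss = tf.reduce_mean(
    tf.nn.sparse_softmax_cross_entropy_with_logits(
        labels=tf.squeeze(target_words, axis=1),
        logits=logits))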

5. Define the optimizer

optimizer = tf.train.GradientDescentOptimizer(LEARNING_RATE).minimize(loss)
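Plain gradient descent works here; if you want faster convergence, swapping in a different optimizer is a one-line change (a variant, not part of the original post):

optimizer = tf.train.AdamOptimizer(LEARNING_RATE).minimize(loss)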

Phase 2: Execute the computation

with tf.Session() as sess:
    sess.run(iterator.initializer)
    sess.run(tf.global_variables_initializer())

    writer = tf.summary.FileWriter('graph/word2vec_simple', sess.graph)

    for index in range(NUM_TRAIN_STEPS):
        try:
            loss_batch, _ = sess.run([loss, optimizer])
        except tf.errors.OutOfRangeError:
            # The generator ran out of data; reinitialize for another epoch.
            sess.run(iterator.initializer)
    writer.close()
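Once training finishes, the learned vectors can be pulled out of the graph for downstream use. A minimal sketch, where word_to_index is a hypothetical dict from the vocabulary step; this must run inside the same session, before the with block closes:

final_embeddings = sess.run(embed_matrix)  # NumPy array, shape [VOCAB_SIZE, EMBED_SIZE]
vector = final_embeddings[word_to_index['king']]  # embedding for one word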

The full model is in word2vec.py; word2vec_eager implements the same model with eager execution.

Reference

Stanford CS20SI: TensorFlow for Deep Learning Research

This article is licensed under the Creative Commons BY-NC-SA 4.0 International license.
Article link: https://ereebay.me/archives/word2vec.html (please credit the source and include this link when reposting)
