Tuesday, June 28, 2016

TensorFlow sequence-to-sequence LSTM within LSTM (nested)


I would like to build an LSTM with a special word embedding, but I have some questions about how this would work.

As you may know, some LSTMs operate on characters: characters in, characters out. I would like to do the same, but with an abstraction over words: a nested LSTM would learn a robust embedding for each word so that the model is resistant to slight character-level errors.

So a tiny LSTM would unroll over every letter of a word to create an embedding of that word. Each embedded word in a sentence would then be fed as input to a higher-level LSTM, which would operate at the word level at every time step rather than on characters.
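Here is roughly what I have in mind on the input side, as a minimal sketch only. The sizes, placeholder names, and variable scopes are things I made up for illustration, and the rnn_cell / dynamic_rnn API details may differ between TensorFlow versions:

```python
import tensorflow as tf

# Illustrative sizes only (assumptions, not from any paper)
batch_size, num_words, max_word_len = 32, 20, 16
char_vocab, char_dim, word_dim = 128, 64, 256

# Character ids for every word of every sentence: [batch, words, chars]
char_ids = tf.placeholder(tf.int32, [batch_size, num_words, max_word_len])

char_table = tf.get_variable("char_table", [char_vocab, char_dim])
chars = tf.nn.embedding_lookup(char_table, char_ids)
# Fold the word axis into the batch so the char LSTM sees one word per row
chars = tf.reshape(chars, [batch_size * num_words, max_word_len, char_dim])

with tf.variable_scope("char_lstm"):
    char_cell = tf.nn.rnn_cell.BasicLSTMCell(word_dim, state_is_tuple=True)
    # The final hidden state of the tiny char LSTM is used as the word embedding
    _, char_state = tf.nn.dynamic_rnn(char_cell, chars, dtype=tf.float32)
    word_embeddings = tf.reshape(char_state.h, [batch_size, num_words, word_dim])

with tf.variable_scope("word_lstm"):
    word_cell = tf.nn.rnn_cell.BasicLSTMCell(word_dim, state_is_tuple=True)
    # The higher-level LSTM steps over one embedded word per time step
    word_outputs, _ = tf.nn.dynamic_rnn(word_cell, word_embeddings, dtype=tf.float32)
```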

Questions:

- I cannot find the research paper that described this approach anymore. If you know what I am talking about, I would like to put a name on what I want to do.
- Does TensorFlow open-source code for this already exist?
- Otherwise, do you have an idea of how to implement it? The output of the network might be harder to deal with, since we would need to undo the word embedding in order to train on characters with an output nested LSTM; a rough sketch follows this list. The whole thing should be trained as a single unit (workflow: LSTM on characters in, LSTM on words, LSTM on characters out).
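For the output side, here is a rough continuation of the sketch above, again with made-up names, and with teacher forcing and a plain softmax projection chosen only as one possible way to do it: each word-level output seeds a small character-level decoder LSTM, so the three LSTMs form one graph and one training step.

```python
with tf.variable_scope("char_decoder"):
    # Previous target characters (teacher forcing) and the characters to predict
    dec_inputs = tf.placeholder(tf.int32, [batch_size, num_words, max_word_len])
    targets = tf.placeholder(tf.int32, [batch_size, num_words, max_word_len])

    dec_chars = tf.nn.embedding_lookup(char_table, dec_inputs)
    dec_chars = tf.reshape(dec_chars, [batch_size * num_words, max_word_len, char_dim])

    # "Undo" the word embedding: the word-level output for each position
    # becomes the initial hidden state of that word's character decoder.
    init_h = tf.reshape(word_outputs, [batch_size * num_words, word_dim])
    init_state = tf.nn.rnn_cell.LSTMStateTuple(tf.zeros_like(init_h), init_h)

    dec_cell = tf.nn.rnn_cell.BasicLSTMCell(word_dim, state_is_tuple=True)
    dec_out, _ = tf.nn.dynamic_rnn(dec_cell, dec_chars, initial_state=init_state)

    # Project every decoder step back onto the character vocabulary
    proj_w = tf.get_variable("proj_w", [word_dim, char_vocab])
    proj_b = tf.get_variable("proj_b", [char_vocab])
    logits = tf.matmul(tf.reshape(dec_out, [-1, word_dim]), proj_w) + proj_b

    loss = tf.reduce_mean(
        tf.nn.sparse_softmax_cross_entropy_with_logits(
            logits=logits, labels=tf.reshape(targets, [-1])))

# One optimizer over the whole graph: chars in -> words -> chars out
train_op = tf.train.AdamOptimizer(1e-3).minimize(loss)
```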

I guess that rnn_cell.MultiRNNCell would stack LSTMs on top of each other rather than nest them.
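For reference, this is what I understand MultiRNNCell to do: stack layers depth-wise at each time step, which is not the nesting described above.

```python
# Two LSTM layers stacked vertically: at each time step the second layer
# consumes the first layer's output (depth), not an inner unrolled LSTM.
stacked_cell = tf.nn.rnn_cell.MultiRNNCell(
    [tf.nn.rnn_cell.BasicLSTMCell(256, state_is_tuple=True) for _ in range(2)],
    state_is_tuple=True)
```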

Otherwise, would you recommend training the embeddings (input and output) as an autoencoder outside the main LSTM?

