Hi,
I have made a chatbot, using the translation model [1] (with some modifications), by feeding it message-response pairs from the Ubuntu Dialogue Corpus. I was wondering if anyone has any ideas about how I can handle context in a conversation? That is, I don't want the chatbot to forget what I have previously said after I enter a new sentence.
I can only think of one strategy, which is to handle the context during preprocessing. Let's say I have this conversation:
M1: Hi, how are you?
R1: Hey, good! I just finished work at the restaurant. How are you?
M2: Good. How was it?
R2: Exhausting…
M3: Many customers?
R3: Yes, and they didn’t tip well either!
Then I could put them in pairs like this: (M1-R1), (R1M2-R2), (R2M3-R3), etc. Another option would be to keep the full history from M1 in each pair, e.g. (M1-R1), (M1R1M2-R2), (M1R1M2R2M3-R3) - but then the length of the training sentences will increase (a lot), leading to more memory allocation during training, and I would probably need to shrink my network (fewer neurons in each layer).
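To make the idea concrete, here is a rough sketch of that preprocessing step. The function name, the `context_window` parameter, and the list-of-turns input format are just my own illustration, not part of any library: `context_window=1` produces the (R1M2-R2) style pairs, and a large window produces the full-history variant.

```python
def build_pairs(turns, context_window=1):
    """Build (source, target) training pairs from a conversation.

    turns: alternating list of strings [M1, R1, M2, R2, ...].
    context_window: how many previous turns to prepend to each message.
      context_window=1 -> (M1, R1), (R1 M2, R2), (R2 M3, R3), ...
      a very large window -> (M1, R1), (M1 R1 M2, R2), ... (full history)
    """
    pairs = []
    for i in range(0, len(turns) - 1, 2):        # i indexes each message Mk
        start = max(0, i - context_window)
        source = " ".join(turns[start:i + 1])    # previous turns + current message
        target = turns[i + 1]                    # the response Rk
        pairs.append((source, target))
    return pairs

conv = ["Hi, how are you?",
        "Hey, good! I just finished work at the restaurant. How are you?",
        "Good. How was it?",
        "Exhausting...",
        "Many customers?",
        "Yes, and they didn't tip well either!"]

for src, tgt in build_pairs(conv, context_window=1):
    print(src, "->", tgt)
```

One practical middle ground between the two options is a sliding window: keep only the last few turns as context, so sentence lengths stay bounded while recent context is preserved.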
They did something similar in this paper [2], but I don't understand how their model is built or how it handles this.
[1] https://www.tensorflow.org/tutorials/seq2seq
[2] https://arxiv.org/abs/1506.06714