How to create a smart chat-bot?

Roman picture Roman · Nov 17, 2009 · Viewed 53.1k times · Source

I know that it's still an open problem so I don't expect to see complete answers here. I just want to find some approaches to solve the next problem:

I have a model (assume that is's bot's memory), and different words are associated with different objects in the model. Speaking with the bot is like executing sql-queries with a DB. Language is a very hard formalizable protocol. And we can't just write a million lines of code to implement some real language. But I believe that it's absolutely possible to implement some self-learning mechanism. How can it be implemented? Is it possible to implement learning "from scratch" or "from few basic words"? Just want to hear your ideas.

Actually, English is a very strict language and it's one of the easiest languages for experimenting with AI. Many other languages allow you to change the order of words (for example). And in some cases changed order can change the whole meaning or just add some intonation. I really don't have any ideas how to teach a bot for these things.

Answer

mjv picture mjv · Nov 17, 2009

The first step, in taking this game to the next level, is ...

...to have a very clear view of prior art!

(and pardon me to say, the question doesn't suggest that you have such an extensive insight into the matter [and you're not alone, count me in ;-)])

Even, and maybe in particular, if your intention is to apply completely novel techniques and models, it seems important to review the literature on current and past practices. Aside from possibly identifying elements that may be adapted or reused in a new implementation, a survey of the domain will provide an keen understanding of the nature of the problem[s].

I've personally tried -on various and multiple occasions!- either the naive approach or the sophomoric approach to tackling broadly-defined problems. With the naive approach, one has but a very slight idea of the true nature and scope of the problem. The sophomoric sees us better equipped with domain knowledge and also with related tools, but this can also be misleading because without a deeper understanding, we tend to mis-read/mis-understand new material offered to us and also misuse some of the tools (a bit like the the fellow who's "good with a hammer" for whom many things look like a nail...)

It is particularly easy to make these mistakes in the field of NLP. That's because

  • Common sense seems to be all is required: after all a child, who's native tongue is English understands subtleties like
       "He's not really an expert"
       "He's really not an expert"
    (small wink at the OP's reference to the ordering of word in the English language)
  • We live in such exciting times, technology and knowledge wise: Processing power, programming language and tools, mathematical techniques, availability of affordable corpora... to name a few of these things that make this moment in time so special.

Far from me the idea of discouraging you in your chat-bot endeavor, I just hope that this long and generic exposé will encourage to look-before-you-leap, as this will truly save you time in the long run, I think in two ways:

  • provide you some frames of references (again, even if your intention is to "think outside these boxes")
  • maybe entice you to redefine the problem, for example by limiting it to particular domains of conversation (sports, or health, or life at a particular university campus...) or by focusing on a particular aspect of the problem (semantic awareness, smooth, natural sounding grammar, use of colloquial forms...)

Good luck ;-)