
read from file to find answer
 
 

Hello, I’ve been enjoying learning CS for over a week and hopefully can stick with it for my project. Very nice work.

I need CS to answer questions from info on a file (which is updated regularly, so the file will be a database, but just a .txt for now), i.e. I need to search the file for a key and output the related info. Can I do this?
Thanks!

 

 
  [ # 1 ]

Depends on what you actually want.

ChatScript can read a file of facts using ^import. The facts can be something like
(  George_Washington birthdate "He was born on February 22, 1732" )
And you can then query the facts.

This, however, is something one imagines doing at startup of the system. If the system stays running continuously for months on end, your file updates won’t be seen. You’d have to figure out a refresh mechanism, maybe merely taking down the server once a day and restarting it.
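
For the "regularly updated file" part, a minimal sketch of regenerating the facts file from the source text file before each restart. This is Python run outside ChatScript, not anything ChatScript ships with. It assumes one record per line in the form key, TAB, info; the relation name `info`, the function names, and the sample record are made up for illustration.

```python
def to_fact_line(key: str, relation: str, info: str) -> str:
    """Format one record as a '( subject verb object )' triple."""
    subject = "_".join(key.split())      # multiword keys joined by underscores
    text = info.replace('"', "'")        # keep the quoted object well-formed
    return f'( {subject} {relation} "{text}" )'


def convert(lines, relation="info"):
    """Turn 'key<TAB>info' lines into fact lines, skipping blank or malformed rows."""
    facts = []
    for line in lines:
        line = line.strip()
        if not line or "\t" not in line:
            continue
        key, info = line.split("\t", 1)
        facts.append(to_fact_line(key, relation, info.strip()))
    return facts


if __name__ == "__main__":
    # Hypothetical sample record, only to show the output shape.
    sample = ["widget 42\tIn stock, aisle 7"]
    for fact in convert(sample):
        print(fact)
```

Writing the returned lines to the file you pass to ^import, then restarting the server, would be one way to do the daily refresh described above.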

 

 
  [ # 2 ]

For searching a database (file or web or otherwise), you can have ChatScript call a system function which you write, passing in text arguments. The output from that call is only an integer, but the call could write a facts file, and the bot could then import the fact (the answer) to reply to the user.

Depending on how long your call takes, and whether this is done from a heavily used server, you may have performance issues because ChatScript will be stalled while your call is being made.
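
A rough sketch of what such an external lookup could look like, written in Python as a stand-in for whatever program you call. It assumes the same key, TAB, info line format; the names `lookup`, `data_path`, and `facts_path` and the relation `info` are hypothetical. It returns only an integer status (0 = found, 1 = not found) and leaves the answer in a one-fact file for the bot to import.

```python
import os
import tempfile


def lookup(key: str, data_path: str, facts_path: str) -> int:
    """Search data_path for key; write a one-fact file and return 0 if found, else 1."""
    wanted = key.strip().lower()
    with open(data_path, encoding="utf-8") as data:
        for line in data:
            name, sep, info = line.rstrip("\n").partition("\t")
            if sep and name.strip().lower() == wanted:
                subject = "_".join(name.split())
                text = info.strip().replace('"', "'")
                with open(facts_path, "w", encoding="utf-8") as out:
                    out.write(f'( {subject} info "{text}" )\n')
                return 0
    return 1


if __name__ == "__main__":
    # Throwaway demo data in a temp directory (hypothetical record).
    workdir = tempfile.mkdtemp()
    data_file = os.path.join(workdir, "data.txt")
    with open(data_file, "w", encoding="utf-8") as f:
        f.write("YYY 891S\tsample record\n")
    status = lookup("yyy 891s", data_file, os.path.join(workdir, "answer.txt"))
    print("status:", status)
```

The scan is linear in the file size, so for a large or remote database the stall mentioned above is the thing to measure first.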

 

 
  [ # 3 ]

I’ve gone with the first approach, creating a table from the .csv file, which is easier for now…
I have a few issues I’d appreciate help with.

1. The table contains over 20 arguments. Facts are triplets. So if I want to handle many elements simultaneously, I’ll have to work with facts within facts within facts ....  Or is there a simpler approach?

2. The elements are mostly nonwords that the user will query. I’m having difficulty saving the user variables exactly as written, e.g.  (user inputs) Show me YYY 891S. The variable stores Y_891_S if assigned to one variable. If I assign it to two variables I get Y_891_S as var #1 and 891 as var #2. What I want to query is YYY_891S.

 

 
  [ # 4 ]

Yes, with triples representing all data, when you have complex stuff you either have to nest facts within facts or create a set of linearly related facts, depending on what the facts are and how you access them.
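
To illustrate the "linearly related facts" option, here is a small Python sketch (outside ChatScript) that flattens each table row into one triple per column, all sharing the row's key as subject. The column names and sample row are invented; only the idea of (row key, column, value) triples is the point.

```python
import csv
import io


def row_to_triples(row_id: str, row: dict) -> list:
    """Emit one (row_id, column, value) triple per named, non-empty column."""
    return [
        (row_id, column, value)
        for column, value in row.items()
        if column is not None and value
    ]


def table_to_triples(csv_text: str, key_column: str) -> list:
    """Flatten a CSV table into triples keyed by the value in key_column."""
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        row_id = row.pop(key_column)
        triples.extend(row_to_triples(row_id, row))
    return triples


if __name__ == "__main__":
    # Hypothetical two-attribute table; a real one would have 20+ columns.
    sample = "code,color,size\nYYY_891S,red,10\n"
    for triple in table_to_triples(sample, "code"):
        print(triple)
```

With this layout a 20-column row becomes about 20 flat facts, and no fact has to contain another fact.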

ChatScript has default spelling correction, repetition truncation, proper-name merging, etc. as part of handling words for natural language. You’ll need to shut them off.
In the simplecontrol bot definition, use $token = 0 to shut down various features.
But the tokenizer itself currently cannot be controlled. It is what assumes 891S is a mistake. I could give you control over that. But in the meantime, you’d want to show me a sample table entry (maybe) and maybe alter the code for the rule retrieving this IF YOU KNOW what you are expecting from the user. E.g.,
u: (show me _*1 _* )
  if you know it will be something like YYY and 891S. Yes, the tokenizer will have converted 891S into 891 S and stored it on _1 as 891_S, but you could fix it with
u: (show me _*1 _* ) substitute(character _1 _ "")  # delete underscore.

 

 
  [ # 5 ]

I plan to validate all user entries, but anyway yes I do know what the user will input.
I just tried the line
u: (show me _*1 _* ) substitute(character _1 _ "")
and when I redid :build, the system returned
Probably bad double-quoting: ")
indefinitely until
Bus error: 10
and it shut down. When I try to restart I get
Segmentation fault: 11

What have I done?

 

 
  [ # 6 ]

Not trusted me. :)

You can recover by removing the substitute command, which is bad… (my bad),
then erasing your topic folder contents and rebuilding.

I’ll have to give you a correct way to delete the _ shortly.

 

 
  [ # 7 ]

Meanwhile… it was fine for me… here’s what I did.

1. in harry bot I defined $token = 0 instead of the one allowing spell checking
2. I defined this rule in introductions:

u: ( I _*1 _*  ) '_0 _2 = substitute(character _1 _ "" ) _2 noerase() repeat()

I tested it with input:  “I YYY 283S” and it output “YYY 283S”

I had no compile error messages.

 

 
  [ # 8 ]

Does YOURS have curly double quotes rather than normal ASCII double quotes? I can’t tell from here. I typed ordinary double quotes, but the forum software may display them as curly ones.

 

 
  [ # 9 ]

The substitute line now works. I like that solution. I see now that the previous attempt also works, but I can only (later) save _2 to a $var on the first call. Not sure why. ASCII quotes were not the issue previously. Maybe I didn’t leave a space between the _ and the "", i.e. substitute(character _1 _""), but I didn’t want to repeat the mess.

The $token = 0, was that in simplecontrol? I don’t wanna lose the tokenizer for good…
I tried defining it inside the topic but it resulted in issues with responders and maybe sometimes other stuff.

I was hoping there was a trick with the quotes or something to read “as written”. I had also tried adding to allownonwords.
Thanks

 

 
  [ # 10 ]

Probably missing the space.
$token can be found in simplecontrol.
You can keep the old definition, but simply add a line after it that does $token = 0
There are only two places you can pull this stunt off: the bot definition topic or the preprocess topic. You need that value in effect when each sentence is parsed. Normal topic placement is too late.

There is currently no “raw” mode, in part because the tokenizer is still required to decide where sentence boundaries are. While it would be possible to control the tokenizer so that it would keep as a token all non-blank content and not try to fix anything, there has been no call to do so. And if your app can manage with the tricks described, there is still no call to override the tokenizer.

 

 
  [ # 11 ]

I’m finding that when I define the bot with token 0, I’ve disabled all multiple rejoinders without a verify comment (multiple = at least 2 choices).

I have that same problem when I define it inside the topic (my bad for saying responders).
What I did:
1. Placed a gambit with token = 0 in the topic and let it be called.
2. Called the responder in question and redefined token before the rejoinder.
All works well, once.
Where should I define token = 0 so that it is always set before the responder?

 

 
  [ # 12 ]

just to be precise….  $token = 0   (not token = 0).

If you are trying to override token control only sometimes, then you can do the assignment on a gambit or a responder, and it will apply to the input read immediately after that (i.e. the rejoinders). Of course you have to ensure it gets turned on again.

When you say all works well once… what does this mean? The typical “once” failure is trying to call a rule multiple times that doesn’t have noerase or repeat on it (depending on what kind of failure you experience).

 

 
  [ # 13 ]

I missed this answer, didn’t get the email… that was a huge run for a late Friday :D


Yes I was lazy with the $

With “once” I meant that in my example the $token definition is in a gambit, so I get to call this gambit once before it’s erased. I need to define $token = 0 just before the responder that handles this input. In that same responder I define $token = 239 before the rejoinders, and that prevents any issues.
I’m working on having a forced responder (instead of a gambit) that would introduce an otherwise unnecessary confirmation to the user that we’re about to search the database, but yeah, not the ideal conversation.

 

 
  [ # 14 ]

Anyway… you don’t need any more information from me, presumably. I can let you stew in a cesspool of your own construction.

 

 
  [ # 15 ]

If you ever see a similar situation:

Since I know the possible codes the user should input, I ended up solving (so far so good) the issue using an IF Construct.  I learned how CS interprets each intended code and tested against these.
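
The same idea, sketched in Python rather than ChatScript just to show the shape of it: a table from "what the tokenizer produced" to "the code the user meant". Only the Y_891_S / YYY_891S pair comes from this thread; the names `OBSERVED_TO_CODE` and `resolve` are made up, and a real table would need one entry per code, filled in from testing.

```python
# Observed tokenizer output -> the code the user actually typed.
# Only this first pair is taken from the thread; add one entry per real code.
OBSERVED_TO_CODE = {
    "Y_891_S": "YYY_891S",
}


def resolve(captured: str):
    """Return the intended code for a captured token string, or None if unknown."""
    return OBSERVED_TO_CODE.get(captured.strip())


if __name__ == "__main__":
    print(resolve("Y_891_S"))
    print(resolve("something_else"))
```

An unknown input comes back as None, which is the place to ask the user to re-enter the code rather than guess.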


...stew in a cesspool o_0

 
