|
Posted: Feb 13, 2016 |
[ # 106 ]
|
|
Senior member
Total posts: 179
Joined: Feb 11, 2015
|
Hi Bruce, thanks for your support.
When I do as you say, that is, delete the contents of DICT/ENGLISH and put my “a.txt” there (saved as UTF-8), CS crashes as soon as I run it. Why? chatscript.exe only runs without crashing if I delete the contents of both DICT/BASIC and DICT/ENGLISH and put my a.txt in DICT/BASIC. Am I doing something wrong?
Thanks in advance, Bruce.
|
|
|
|
|
Posted: Feb 13, 2016 |
[ # 107 ]
|
|
Moderator
Total posts: 2372
Joined: Jan 12, 2010
|
If the a.txt you used is like the a 0000.txt you emailed me, the file format is all wrong. I sent you an email.
|
|
|
|
|
Posted: Feb 13, 2016 |
[ # 108 ]
|
|
Senior member
Total posts: 179
Joined: Feb 11, 2015
|
Hi Bruce, thanks for your support.
Yes, I fixed the a.txt: some words were run together, and I removed the words that had an empty definition (). After that CS worked fine. But most of the words without a definition were verbs and verb conjugations. What would be more suitable for a Spanish chatbot that has to tell a past-tense verb from a present-tense one: do I need to create a table macro in a *.top file, or can I just list all the verb conjugations directly in a.txt?
Thanks in advance.
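A quick way to catch the kind of formatting problems described above (entries run together on one line, or empty () definitions) is to scan the file before starting CS. The sketch below is in Python and assumes one entry per line in the `word ( FLAGS ) key=value` shape of the entries quoted later in this thread. `check_line`, `check_text` and the regular expression are illustrative helpers, not part of ChatScript, and the sample lines are made up.

```python
import re

# Assumed entry shape, inferred from entries quoted in this thread:
#   word ( FLAG FLAG ... ) optional_key=value ...
ENTRY_RE = re.compile(r"^\s*(\S+)\s+\(([^()]*)\)\s*(.*)$")


def check_line(line):
    """Return a list of problems found in one dictionary line (hypothetical helper)."""
    problems = []
    if not line.strip():
        return problems  # blank lines are ignored
    if line.count("(") != 1 or line.count(")") != 1:
        problems.append("expected exactly one ( ) group; entries may be run together")
        return problems
    match = ENTRY_RE.match(line)
    if match is None:
        problems.append("line does not look like: word ( FLAGS ) key=value")
        return problems
    if not match.group(2).split():
        problems.append("empty definition ()")
    return problems


def check_text(text):
    """Yield (line_number, stripped_line, problem) for every problem in the text."""
    for number, line in enumerate(text.splitlines(), start=1):
        for problem in check_line(line):
            yield number, line.strip(), problem


if __name__ == "__main__":
    # Made-up sample: one good entry, one empty definition, two entries run together.
    sample = "\n".join([
        "fue ( VERB_PAST VERB ) conjugate=sido",
        "hecho ( )",
        "fondo ( NOUN ) mayor ( NOUN )",
    ])
    for number, line, problem in check_text(sample):
        print(f"line {number}: {problem}: {line}")
```

To run it on a real file, read the file with `open(path, encoding="utf-8")` and pass its contents to `check_text`; a decoding error at that point would also show that the file was not really saved as UTF-8.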
|
|
|
|
|
Posted: Feb 13, 2016 |
[ # 109 ]
|
|
Moderator
Total posts: 2372
Joined: Jan 12, 2010
|
You can list all verb conjugations as single words with their appropriate description. You should link them together with conjugate=xxx (see the existing dictionary words “am”, “are”, “be” for illustration). The system may not currently give you markings on the base word automatically when it sees a conjugated form, but if the data is in the dictionary, I can make it do that in the future.
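To make the linking concrete, here is a rough Python sketch that writes one dictionary line per conjugated form and chains the forms with conjugate=. It assumes the `word ( FLAGS ) conjugate=next` layout seen in the English entries for “am”, “are” and “be” quoted later in this thread, and it assumes the links loop back to the first word. Both are guesses from those entries, not documented behavior. `link_forms` is a hypothetical helper and the flags in the example are illustrative only.

```python
def link_forms(forms):
    """Build dictionary lines for (word, flags) pairs, chaining each word to the next.

    Assumption: the chain is circular, so the last word points back to the first.
    """
    lines = []
    for index, (word, flags) in enumerate(forms):
        next_word = forms[(index + 1) % len(forms)][0]
        lines.append(f"{word} ( {' '.join(flags)} ) conjugate={next_word}")
    return lines


if __name__ == "__main__":
    # Two forms of "ser" mentioned in this thread; the flag choices are illustrative.
    ser_forms = [
        ("ser", ["VERB", "VERB_INFINITIVE"]),
        ("fue", ["VERB", "VERB_PAST"]),
    ]
    for line in link_forms(ser_forms):
        print(line)
```

Generating the lines from a table like this keeps every conjugate= target inside the same file, so no link can point at a word that has no entry.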
|
|
|
|
|
Posted: Feb 14, 2016 |
[ # 110 ]
|
|
Senior member
Total posts: 179
Joined: Feb 11, 2015
|
Hi Bruce, thanks for your support.
I got CS to load a.txt (placed in DICT\ENGLISH) without problems, but even so, CS 6.1d can’t recognize the word mañana, or the characters áéíóú, in the console. CS 5.9 can. Why is that? Am I doing something wrong?
Do you want me to send you my newly revised a.txt?
Thanks in advance, Bruce.
|
|
|
|
|
Posted: Feb 14, 2016 |
[ # 111 ]
|
|
Moderator
Total posts: 2372
Joined: Jan 12, 2010
|
|
|
|
|
|
Posted: Feb 15, 2016 |
[ # 112 ]
|
|
Senior member
Total posts: 179
Joined: Feb 11, 2015
|
Hi Bruce, thanks for your support.
I thoroughly checked my b.txt (it is now called “b” instead of “a”). Some word definitions were misspelled; I corrected them, and now CS recognizes the DICT, including mañana and the áéíóú characters.
The DICT loads and works even though I left some words without a definition (); I corrected everything else.
I sent my b.txt to your email; please check it out, Bruce. I’m especially concerned about some messages the CS engine prints when loading my b.txt DICT, such as:
Missing plurality fondo
Missing plurality pensamiento
Missing comparison fuerte
Missing comparison mayor
Missing conjugation hecho
Missing conjugation streap_tease
I guess they’re related to the last part of some words’ definitions:
plural=
comparative=
conjugate=
I have some doubts about the correct way to create a Spanish DICT:
- What do COMMON4, COMMON2, COMMON1 mean?
- What do KINDERGARTEN and GRADE3_4 mean?
Just in case you need it, I also sent you c.txt, which is b.txt without any undefined () words.
Thanks in advance, Bruce, and thanks for your support.
PS: Bruce, about verb conjugation: it would be great to put all the verb conjugations the Spanish chatbot needs in the DICT itself, but I didn’t quite understand how to fill in the definition () of each verb form. How do I handle the tenses, like those in this table? https://www.englisch-hilfen.de/en/grammar/tenses_table.pdf
Could you please give me an example?
These are the definitions of the verbs “am”, “are” and “be” that you suggested:
am ( meanings=1 SINGULAR_PERSON VERB_PRESENT AUX_BE VERB COMMON4 COMMON2 KINDERGARTEN posdefault:VERB ) conjugate=be
are ( meanings=1 VERB_PRESENT AUX_BE VERB COMMON4 COMMON2 COMMON1 KINDERGARTEN ) conjugate=am
be ( meanings=14 glosses=7 VERB_INFINITIVE AUX_BE VERB COMMON4 COMMON2 COMMON1 KINDERGARTEN posdefault:VERB VERB_TAKES_ADJECTIVE VERB_DIRECTOBJECT VERB_NOOBJECT PRESENTATION_VERB ) conjugate=were
But Spanish has many more verb conjugations. How should I write the definitions for them?
Thanks in advance.
|
|
|
|
|
Posted: Feb 17, 2016 |
[ # 113 ]
|
|
Senior member
Total posts: 179
Joined: Feb 11, 2015
|
Hi Bruce,
I have tried to work out the DICT definitions for the verb forms. I filled in their definitions with (VERB_PRESENT) and (VERB_PAST), and filled in the conjugate= part, but I think there is a gap between English and Spanish grammar for these two tenses:
will ( NOUN_ABSTRACT AUX_VERB_FUTURE NOUN NOUN_SINGULAR COMMON4 COMMON2 COMMON1 KINDERGARTEN posdefault:NOUN )
could ( AUX_VERB_FUTURE COMMON4 COMMON2 COMMON1 KINDERGARTEN )
because in Spanish every verb has its own conjugated form for each of these two tenses, e.g. sera, sería:
ser[be] ( VERB_INFINITIVE AUX_BE VERB COMMON4 COMMON2 COMMON1 KINDERGARTEN posdefault:VERB VERB_TAKES_ADJECTIVE VERB_DIRECTOBJECT VERB_NOOBJECT PRESENTATION_VERB ) conjugate=fueron[were]
fue[was] ( SINGULAR_PERSON VERB_PAST AUX_BE VERB COMMON4 COMMON2 COMMON1 KINDERGARTEN posdefault:VERB ) conjugate=sido[been]
sera[will_be]
sería[could_be]
How can I define these two tenses in the DICT? Every Spanish verb has those two forms and I need to fill them in; Spanish has nothing like the auxiliaries “will” and “could”.
Thanks in advance, Bruce.
|
|
|
|
|
Posted: Feb 22, 2016 |
[ # 114 ]
|
|
Member
Total posts: 5
Joined: Jan 13, 2016
|
Hi everyone, I’m following this conversation in order to continue with the steps toward a Spanish bot. It seems difficult to “translate” a dictionary from English because Spanish has a more complex taxonomy and different rules.
I think Eduardo Bedoya is making the right point. The COMMON definitions seem to be additional parameters for defining verbs, but they are not enough.
It would be helpful to expand those lists and make it possible to create new parameters.
I’m totally new to CS, so I’m trying to get up to speed in order to help with this work.
Is there something online, a repository, where I can help you?
Regards!
|
|
|
|
|
Posted: Feb 22, 2016 |
[ # 115 ]
|
|
Moderator
Total posts: 2372
Joined: Jan 12, 2010
|
One need not fill in this kind of data: COMMON4 COMMON2 COMMON1 KINDERGARTEN GRADE1_2…
Common is used to specify how common (frequently used) the word is, for spell correction.
Grade is for when one learns the word; it has no specific use.
VERB_CONJUGATE1, VERB_CONJUGATE2, VERB_CONJUGATE3 are used in English for knowing which part of a multiple-word verb to conjugate, hence they are available for reuse in Spanish for whatever additional tense markings you wish.
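Since the COMMON and grade markers are optional, entries copied from the English dictionary can be slimmed down mechanically. Here is a minimal Python sketch, assuming the `word ( FLAGS ) key=value` line layout used in this thread. The prefix list covers only the markers named here (COMMON*, KINDERGARTEN, GRADE*), and `strip_optional_flags` is a hypothetical helper, not a ChatScript tool.

```python
import re

# Markers described in this thread as optional; the list may be incomplete.
OPTIONAL_PREFIXES = ("COMMON", "GRADE", "KINDERGARTEN")

GROUP_RE = re.compile(r"\(([^()]*)\)")


def strip_optional_flags(line):
    """Remove frequency and grade markers from the ( ... ) group of one entry."""
    def rewrite(match):
        kept = [flag for flag in match.group(1).split()
                if not flag.startswith(OPTIONAL_PREFIXES)]
        return "( " + " ".join(kept) + " )" if kept else "( )"
    # Only the first parenthesized group is treated as the flag list.
    return GROUP_RE.sub(rewrite, line, count=1)


if __name__ == "__main__":
    print(strip_optional_flags(
        "could ( AUX_VERB_FUTURE COMMON4 COMMON2 COMMON1 KINDERGARTEN )"
    ))
```

Anything after the closing parenthesis, such as conjugate=be, is left untouched.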
|
|
|
|
|
Posted: Mar 1, 2016 |
[ # 116 ]
|
|
Senior member
Total posts: 179
Joined: Feb 11, 2015
|
Hi Bruce, thanks for your support.
I’ve been working on the Spanish DICT, and I downloaded CS 6.2b.
I had a problem when I defined a verb form like this:
seré ( SINGULAR_PERSON VERB VERB_CONJUGATE1 )
When starting, CS showed this message: “Verb seré lacks tenses”.
I tried to solve this, and now I know it works if I define the verb like this:
seré ( SINGULAR_PERSON VERB VERB_INFINITIVE VERB_CONJUGATE1 )
This way the DICT builds without any message and CS starts normally.
Please tell me if this solution is wrong: can I combine VERB_INFINITIVE with VERB_CONJUGATE1 to mark my new Spanish tenses?
I did it that way because that is how VERB_CONJUGATE1 is used in the English DICT, always together with VERB_INFINITIVE. Am I wrong?
Thanks for your support, Bruce.
Bruce, I deleted the “comparative=”, “plural=”, and “conjugate=” parts of the DICT as you suggested. I also realized that CS will not build the DICT if any of them refers to a word that doesn’t exist in the DICT itself.
But I still have a doubt: do I need to put “conjugate=” on verb entries, or should I leave it out altogether?
Thanks again for your support, Bruce.
PS: I emailed you an example of the conjugation of a Spanish verb. Thanks again.
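The observation that CS will not build the DICT when plural=, comparative= or conjugate= points at a missing word can be checked ahead of time. The Python sketch below assumes one entry per line in the `word ( FLAGS ) key=value` layout quoted in this thread. `find_dangling_links` is a hypothetical helper, not a ChatScript tool.

```python
import re

LINK_KEYS = ("plural", "comparative", "conjugate")
# Matches key=value pairs such as conjugate=be after the closing parenthesis.
LINK_RE = re.compile(r"\b(" + "|".join(LINK_KEYS) + r")=(\S+)")


def find_dangling_links(text):
    """Return (word, key, target) for every link whose target has no entry."""
    entries = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        word = line.split()[0]
        # Only look for links after the flag group, if there is one.
        tail = line.rsplit(")", 1)[-1]
        entries[word] = LINK_RE.findall(tail)
    return [(word, key, target)
            for word, links in entries.items()
            for key, target in links
            if target not in entries]


if __name__ == "__main__":
    sample = "am ( VERB ) conjugate=be\nbe ( VERB ) conjugate=were"
    for word, key, target in find_dangling_links(sample):
        print(f"{word}: {key}={target} has no entry")
```

Running this before each build gives a list of links to either remove or back with a new entry.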
|
|
|
|
|
Posted: Mar 2, 2016 |
[ # 117 ]
|
|
Senior member
Total posts: 179
Joined: Feb 11, 2015
|
Thanks Bruce for your reply by mail. I quote:
“Ideally you would use conjugate=xxx to link them together, but it is probably not helpful to you since you will shut down parsing (English) and pos-tagging (English). So for now assume you don’t need them.”
“The dictionary requires both that you declare it a verb and that you give it a recognized tense. You can also use other bits that are NOT required to expand your tenses.
So seré ( SINGULAR_PERSON VERB VERB_INFINITIVE VERB_CONJUGATE1 ) was a fine thing to do. “Singular_person” is optional because CS is not going to care. Only if you need to use ~singular_person as a concept set in a rule will it matter. How VERB_CONJUGATE1 was used by CS is immaterial since it will not be used by you from within the engine.
Your $cs_token should NOT include parsing (since it is useless for Spanish) and maybe not include pos-tagging (since it may be useless as well, being based on English rules). So either change | #DO_PARSE to | #DO_POSTAG or remove it altogether.”
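The rule quoted above (a verb entry needs both VERB and a recognized tense) can be turned into a small pre-check that runs before CS does. This Python sketch only knows the tense flags that appear in this thread (VERB_INFINITIVE, VERB_PRESENT, VERB_PAST). The engine very likely accepts others, so treat `KNOWN_TENSES` as an assumption to extend. `verbs_lacking_tense` is a hypothetical helper.

```python
import re

# Tense flags seen in this thread; the engine's full list is probably longer.
KNOWN_TENSES = {"VERB_INFINITIVE", "VERB_PRESENT", "VERB_PAST"}

ENTRY_RE = re.compile(r"^\s*(\S+)\s+\(([^()]*)\)")


def verbs_lacking_tense(text):
    """Return words flagged VERB that carry none of the known tense flags."""
    missing = []
    for line in text.splitlines():
        match = ENTRY_RE.match(line)
        if match is None:
            continue  # skip lines that are not in the assumed entry shape
        word, flags = match.group(1), set(match.group(2).split())
        if "VERB" in flags and not flags & KNOWN_TENSES:
            missing.append(word)
    return missing


if __name__ == "__main__":
    sample = "\n".join([
        "sera ( SINGULAR_PERSON VERB VERB_CONJUGATE1 )",
        "fue ( SINGULAR_PERSON VERB VERB_PAST )",
    ])
    for word in verbs_lacking_tense(sample):
        print(f"Verb {word} has no known tense flag")
```

The check treats VERB_CONJUGATE1 as an extra marker rather than a tense, which matches the “lacks tenses” message reported for the first seré entry.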
|
|
|
|
|
Posted: Apr 18, 2016 |
[ # 118 ]
|
|
Senior member
Total posts: 179
Joined: Feb 11, 2015
|
Hi Bruce, I’m having a minor issue with my CS bot’s output.
All of the bot’s outputs start with a capital letter, like:
Hola [Hello]
Gusto en conocerte [Nice to meet you]
Trabajo en turismo [I work in tourism]
How can I make the bot send all its output in lower case?
Thanks in advance, Bruce, and thanks for your support.
|
|
|
|
|
Posted: Apr 18, 2016 |
[ # 119 ]
|
|
Moderator
Total posts: 2372
Joined: Jan 12, 2010
|
$cs_response controls output behavior. See the System Variables manual.
|
|
|
|
|
Posted: Apr 20, 2016 |
[ # 120 ]
|
|
Senior member
Total posts: 179
Joined: Feb 11, 2015
|
Thanks Bruce, it worked fine.
I’m having a hard time deciding exactly which files in the CS directories to delete so that I can enable $cs_token | #DO_SUBSTITUTE_SYSTEM and still have CS recognize the words in my Spanish DICT. I’m trying to build a chatbot that speaks only Spanish.
LIVEDATA/interjections seems to be a good feature and I would like to translate it to Spanish as well, but I need $cs_token | #DO_SUBSTITUTE_SYSTEM enabled for CS to build LIVEDATA\interjections. Right now I have $cs_token set to | DO_ESSENTIALS, because of the Spanish DICT words issue explained above.
Thanks for your support and for the great tool. I sent you an email about which directories are English-only. Thanks again.
|
|
|
|