What are you trying to accomplish with your bot database? Bots usually pair each response sentence with a pattern that must be matched before that response is given. What are you trying to do with just the sentences?
If you use ChatScript for your bot, it can read and process documents, and you decide what to do with the sentences in them. You could store each sentence it reads as a database record.
I am currently trying to isolate sentences from web pages for use with my bot using a custom C# program. I extract the contents of the paragraph tags and process the text with regex matches and substitutions. Currently I have 10-14 substitutions that give me a text file containing a list of sentences formatted as fact triples, which ChatScript can read while it is running. A ChatScript chatbot can also execute a command-line utility, so it can call my custom C# code. I would like my chatbot to respond to a question by “reading” the internet and inferring an intelligent response from the sentences it parses.
I like to use Simple Wikipedia because the sentence structures are simpler.
Processing the URL: https://simple.wikipedia.org/wiki/Guitar yields:
( webquery answer The_guitar_is_a_string_instrument_which_is_played_by_plucking_the_strings. )
( webquery answer The_main_parts_of_a_guitar_are_the_body,_the_fretboard,_the_headstock_and_the_strings. )
( webquery answer Guitars_are_usually_made_from_wood_or_plastic. )
( webquery answer Their_strings_are_made_of_steel_or_nylon. )
( webquery answer The_guitar_strings_are_plucked_with_the_fingers_and_fingernails_of_the_right_hand_openparen_or_left_hand,_for_left_handed_players_closedparen_,_or_a_small_pick_made_of_thin_plastic. )
( webquery answer This_type_of_pick_is_called_a_"plectrum"_or_guitar_pick. )
( webquery answer The_left_hand_holds_the_neck_of_the_guitar_while_the_fingers_pluck_the_strings. )
( webquery answer Different_finger_positions_on_the_fretboard_make_different_notes. )
( webquery answer Guitar-like_plucked_string_instruments_have_been_used_for_many_years. )
( webquery answer In_many_countries_and_at_many_different_time_periods,_guitars_and_other_plucked_string_instruments_have_been_very_popular,_because_they_are_light_to_carry_from_place_to_place,_they_are_easier_to_learn_to_play_than_many_other_instruments. )
( webquery answer Guitars_are_used_for_many_types_of_music,_from_Classical_to_Rock. )
( webquery answer Most_pieces_of_popular_music_that_have_been_written_since_the_1950s_are_written_with_guitars. )
( webquery answer There_are_many_different_types_of_guitars,_classified_on_how_they_are_made_and_the_type_of_music_they_are_used_for. )
...
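To give an idea of the formatting step, here is a rough Python sketch (not my actual C# code) that turns one sentence into a fact line like the ones above. The function name sentence_to_fact is made up for illustration, and I am assuming only two substitutions: spelling out parentheses and replacing whitespace with underscores. A real page needs more cleanup than this.

```python
import re

def sentence_to_fact(sentence, subject="webquery", verb="answer"):
    """Format one sentence as a fact triple line: ( subject verb object )."""
    text = sentence.strip()
    # Parentheses delimit the fact itself, so spell them out as words.
    text = text.replace("(", " openparen ").replace(")", " closedparen ")
    # Collapse each run of whitespace into one underscore so the whole
    # sentence becomes a single token.
    text = re.sub(r"\s+", "_", text.strip())
    return f"( {subject} {verb} {text} )"

print(sentence_to_fact("Their strings are made of steel or nylon."))
```

Writing one such line per sentence to a text file gives a list in the same shape as the output shown above.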
ChatScript includes a POS tagger and can identify the main subject, main verb and main object of a sentence, so rather than having the bot repeat static sentences verbatim, it should be possible to process the sentences for their meaning.
Parsing the HTML of websites is one way of gathering sentences on subjects that you or your bot are interested in, and the extraction can be automated.
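As a rough illustration of the HTML side, here is a minimal Python sketch using only the standard library's html.parser. The names ParagraphExtractor, split_sentences and sentences_from_html are hypothetical, and the sentence splitter is deliberately naive (it will break on abbreviations such as "Dr. Smith"). Fetching the page is left out; the function just takes the HTML as a string.

```python
import re
from html.parser import HTMLParser

class ParagraphExtractor(HTMLParser):
    """Collect the text found inside each <p>...</p> element."""

    def __init__(self):
        super().__init__()
        self.paragraphs = []
        self._in_p = False
        self._parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._in_p = True
            self._parts = []

    def handle_endtag(self, tag):
        if tag == "p" and self._in_p:
            # Join the text pieces and normalize whitespace.
            text = " ".join("".join(self._parts).split())
            if text:
                self.paragraphs.append(text)
            self._in_p = False

    def handle_data(self, data):
        # Text inside inline tags such as <b> or <a> is kept too.
        if self._in_p:
            self._parts.append(data)

def split_sentences(paragraph):
    # Naive rule: split after . ! or ? when followed by a capital letter.
    return re.split(r"(?<=[.!?])\s+(?=[A-Z])", paragraph)

def sentences_from_html(html):
    parser = ParagraphExtractor()
    parser.feed(html)
    parser.close()
    return [s for p in parser.paragraphs for s in split_sentences(p)]

page = ("<div><p>The <b>guitar</b> is a string instrument. "
        "Guitars are usually made from wood or plastic.</p>"
        "<p>Their strings are made of steel or nylon.</p></div>")
for sentence in sentences_from_html(page):
    print(sentence)
```

Each sentence returned this way could then be formatted as a fact triple or stored as a database record.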
Good luck!