I am looking for the most efficient algorithm to form all possible combinations of words from a string. For example:
Input String: forevercarrot
Output:
forever carrot
forever car rot
for ever carrot
for ever car rot
(All words should be from a dictionary).
I can think of a brute force approach. (find all possible substrings and match) but what would be better ways?
Use a prefix tree for your list of known words. Probably libs like myspell
already do so. Try using a ready-made one.
Once you found a match (e.g. 'car'), split your computation: one branch starts to look for a new word ('rot'), another continues to explore variants of current beginning ('carrot').
Effectively you maintain a queue of pairs (start_position, current_position)
of offsets into your string every time you split the computation. Several threads can pop from this queue in parallel and try to continue a word that starts from start_position
and is already known up to current_position
of the pair, but does not end there. When a word is found, it is reported and another pair is popped from the queue. When it's impossible, no result is generated. When a split occurs, a new pair is added to the end of the queue. Initially the queue contains a (0,0)
.