Friday, August 4th

One month blogging!

 Find the words

Given a string and a dictionary, find if the string can be broken into segments where each segment is a word in the dictionary.

Photo provided by Wikipedia, distributed under the Creative Commons Attribution 2.0 Generic license.  No changes made

Looking at the size constraints of the problem, with a string of up to 300 characters, the brute force approach of the past few days' problems is not appropriate here.  

Thinking about the problem, which looks for combinations of words, we can see how sub-problems may overlap.  For example, if the input string is "catsanddogs" and the dictionary contains the words "cat," "cats," "sand," and "dogs," there are 2 different ways to make up the substring "catsand."  We do not necessarily care what they are, nor are we concerned about the number of ways to get there, all that concerns us is if from "catsand" we can find a word for the remaining string "dogs."  The nature of overlapping subproblems leads to a dynamic programming approach

I had already solved the problem top-down, so let's do bottom-up.  We will iterate through the string and mark in a corresponding table if we can get to the current position of the string using words from the dictionary.

Begin with initializing the dp table.  We start without a letter position, so initialize the dp table to the size of the input string s size +1

    bool wordBreak(string s, vector<string>& wordDict) {
vector<int> dp(s.size()+1, 0);
        ...
    }

We start at index 0, so it's inherently reachable

    dp[0]=1;

Now beginning at index 0, we iterate through the dp array, and for each index in the dp array, we also iterate through words in the dictionary.  

    for(int i=0; i<dp.size(); i++) {
for(string word:wordDict) {
            ...   
        }
    }

If the current index of the dp table is true, then we test if the next letters in the string are equal to a word.  If they are, then we mark the corresponding index in the dp array as true, since that index in the dp array is reachable through words in the dictionary.

    int ss;
string sub;
    if(dp[i]) {
    ss=word.size();
sub=s.substr(i,ss);
if(sub==word) {
    dp[i+ss]=1;
}
}

This bottom approach starts at index 0 which is reachable from the start, then builds upon reachable parts of the string by comparing parts of the string against dictionary words.  If there is a match, then the index = current index + word size (i+ss in the code) is reachable.

Finally, we return the end of the dp table.  If we can reach the end of the dp table using words from the dictionary, then the answer is an affirmative

return dp[dp.size()-1];

Full code below

    bool wordBreak(string s, vector<string>& wordDict) {
vector<int> dp(s.size()+1, 0);
dp[0]=1;

int ss;
string sub;
for(int i=0; i<dp.size(); i++) {
for(string word:wordDict) {
if(dp[i]) {
ss=word.size();
sub=s.substr(i,ss);
if(sub==word) {
dp[i+ss]=1;
}
}
}
}
return dp[dp.size()-1];
}


Comments

Popular posts from this blog

Wednesday, September 20th

Monday, September 11th

Tuesday, September 19th