The task is to find the longest substring in a given string that is composed of any two unique repeating characters
Ex. in an input string "aabadefghaabbaagad", the longest such string is "aabbaa"
I came up with the following solution but wanted to see if there is a more efficient way to do the same
import java.util.*;
public class SubString {
public static void main(String[] args) {
//String inStr="defghgadaaaaabaababbbbbbd";
String inStr="aabadefghaabbaagad";
//String inStr="aaaaaaaaaaaaaaaaaaaa";
System.out.println("Input string is "+inStr);
StringBuilder sb = new StringBuilder(inStr.length());
String subStr="";
String interStr="";
String maxStr="";
int start=0,length=0, maxStart=0, maxlength=0, temp=0;
while(start+2<inStr.length())
{ int i=0;
temp=start;
char x = inStr.charAt(start);
char y = inStr.charAt(start+1);
sb.append(x);
sb.append(y);
while( (x==y) && (start+2<inStr.length()) )
{ start++;
y = inStr.charAt(start+1);
sb.append(y);
}
subStr=inStr.substring(start+2);
while(i<subStr.length())
{ if(subStr.charAt(i)==x || subStr.charAt(i)==y )
{ sb.append(subStr.charAt(i));
i++;
}
else
break;
}
interStr= sb.toString();
System.out.println("Intermediate string "+ interStr);
length=interStr.length();
if(maxlength<length)
{ maxlength=length;
length=0;
maxStr = new String(interStr);
maxStart=temp;
}
start++;
sb.setLength(0);
}
System.out.println("");
System.out.println("Longest string is "+maxStr.length()+" chars long "+maxStr);
}
}
Here's a hint that might guide you towards a linear-time algorithm (I assume that this is homework, so I won't give the entire solution): At the point where you have found a character that is neither equal to x
nor to y
, it is not necessary to go all the way back to start + 1
and restart the search. Let's take the string aabaaddaa
. At the point where you have seen aabaa
and the next character is d
, there is no point in restarting the search at index 1 or 2, because in those cases, you'll only get abaa
or baa
before hitting d
again. As a matter of fact, you can move start
directly to index 3 (the first index of the last group of a
s), and since you already know that there is a contiguous sequene of a
s up to d
, you can move i
to index 5 and continue.
Edit: Pseudocode below.
// Find the first letter that is not equal to the first one,
// or return the entire string if it consists of one type of characters
int start = 0;
int i = 1;
while (i < str.length() && str[i] == str[start])
i++;
if (i == str.length())
return str;
// The main algorithm
char[2] chars = {str[start], str[i]};
int lastGroupStart = 0;
while (i < str.length()) {
if (str[i] == chars[0] || str[i] == chars[1]) {
if (str[i] != str[i - 1])
lastGroupStart = i;
}
else {
//TODO: str.substring(start, i) is a locally maximal string;
// compare it to the longest one so far
start = lastGroupStart;
lastGroupStart = i;
chars[0] = str[start];
chars[1] = str[lastGroupStart];
}
i++;
}
//TODO: After the loop, str.substring(start, str.length())
// is also a potential solution.