I have 2 sequences, AACAGTTACC
and TAAGGTCA
, and I'm trying to find a global sequence alignment. I managed to create a 2D array and create the matrix, and I even filled it with semi-dynamic approach.
Here is my code to fill the matrix:
void process() {
for (int i = 1; i <= sequenceA.length; i++) {
for (int j = 1; j <= sequenceB.length; j++) {
int scoreDiag = opt[i-1][j-1] + equal(i, j);
int scoreLeft = opt[i][j-1] - 1;
int scoreUp = opt[i-1][j] - 1;
opt[i][j] = Math.max(Math.max(scoreDiag, scoreLeft), scoreUp);
}
}
}
private int equal(int i, int j) {
if (sequenceA[i - 1] == sequenceB[j - 1]) {
return 1;
} else {
return -1;
}
}
My main problem is that this code generates this output:
0 -1 -2 -3 -4 -5 -6 -7 -8
-1 -1 0 -1 -2 -3 -4 -5 -6
-2 -2 0 1 0 -1 -2 -3 -4
-3 -3 -1 0 0 -1 -2 -1 -2
-4 -4 -2 0 -1 -1 -2 -2 0
-5 -5 -3 -1 1 0 -1 -2 -1
-6 -4 -4 -2 0 0 1 0 -1
-7 -5 -5 -3 -1 -1 1 0 -1
-8 -6 -4 -4 -2 -2 0 0 1
-9 -7 -5 -5 -3 -3 -1 1 0
-10 -8 -6 -6 -4 -4 -2 0 0
But I want it to look like this (all i care are numbers in the pictures):
I got to apply penaltys: per mismatch 1 and per gap 2, if its match 0.
There are several things that you need to modify:
AACAGTTACC
and TAAGGTCA
, but CCATTGACAA
and ACTGGAAT
.The full solution is:
// Note that these sequences are reversed!
String sequenceA ="CCATTGACAA";
String sequenceB = "ACTGGAAT";
// The penalties to apply
int gap = 2, substitution = 1, match = 0;
int[][] opt = new int[sequenceA.length() + 1][sequenceB.length() + 1];
// First of all, compute insertions and deletions at 1st row/column
for (int i = 1; i <= sequenceA.length(); i++)
opt[i][0] = opt[i - 1][0] + gap;
for (int j = 1; j <= sequenceB.length(); j++)
opt[0][j] = opt[0][j - 1] + gap;
for (int i = 1; i <= sequenceA.length(); i++) {
for (int j = 1; j <= sequenceB.length(); j++) {
int scoreDiag = opt[i - 1][j - 1] +
(sequenceA.charAt(i-1) == sequenceB.charAt(j-1) ?
match : // same symbol
substitution); // different symbol
int scoreLeft = opt[i][j - 1] + gap; // insertion
int scoreUp = opt[i - 1][j] + gap; // deletion
// we take the minimum
opt[i][j] = Math.min(Math.min(scoreDiag, scoreLeft), scoreUp);
}
}
for (int i = 0; i <= sequenceA.length(); i++) {
for (int j = 0; j <= sequenceB.length(); j++)
System.out.print(opt[i][j] + "\t");
System.out.println();
}
The result is just as in the example you gave us (but reversed, remember!):
0 2 4 6 8 10 12 14 16
2 1 2 4 6 8 10 12 14
4 3 1 3 5 7 9 11 13
6 4 3 2 4 6 7 9 11
8 6 5 3 3 5 7 8 9
10 8 7 5 4 4 6 8 8
12 10 9 7 5 4 5 7 9
14 12 11 9 7 6 4 5 7
16 14 12 11 9 8 6 5 6
18 16 14 13 11 10 8 6 6
20 18 16 15 13 12 10 8 7
So the final alignment score is found at opt[sequenceA.length()][sequenceB.length()]
(7). If you really need to show the reversed matrix as in the image, do this:
for (int i = sequenceA.length(); i >=0; i--) {
for (int j = sequenceB.length(); j >= 0 ; j--)
System.out.print(opt[i][j] + "\t");
System.out.println();
}