I read Stefan Gustavson's excellent paper on simplex noise, in which I was promised that:
"Simplex noise has no noticeable directional artifacts"
in contrast with "classic" Perlin noise. I excitedly implemented it, only to find that the opposite appeared to be true. I do see artifacts in classic noise, but I see at least as many in simplex noise, aligned at 45 degrees to the main axes. They're especially noticeable when you map the noise through a step function.
To rule out a problem with my own implementation, I also tested someone else's JavaScript implementation. Compare some images:
And here's a gallery with all of them. In that last image, look for borders that are aligned at 45 degrees from horizontal/vertical. They're all over the place. I can highlight some of them if need be, but they seem really obvious to me. (And again, I see them in the classic noise image as well.)
EDIT: To be more quantitative, I sampled 1 million random points, and for each point I numerically computed the gradient of both classic and simplex noise, and took a histogram of the direction of the gradient projected onto the x-y plane. If there were no directional artifacts, the graph would be flat. But you can see that both classic and simplex noise spike every 45 degrees.
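For reference, the measurement described above can be sketched roughly like this. The `noise2D` function here is just a smooth placeholder so the sketch runs; any classic or simplex noise implementation can be dropped in its place. In the noise case, spikes in the histogram every 45 degrees are the directional bias.

```javascript
// Placeholder field standing in for a real 2D noise function.
function noise2D(x, y) {
  return Math.sin(x) * Math.cos(y);
}

// Sample random points, estimate the gradient by central differences,
// and histogram the direction of the gradient in the x-y plane.
function gradientAngleHistogram(samples, bins) {
  const eps = 1e-4;
  const hist = new Array(bins).fill(0);
  for (let i = 0; i < samples; i++) {
    const x = Math.random() * 1000;
    const y = Math.random() * 1000;
    // Central differences for d/dx and d/dy.
    const gx = (noise2D(x + eps, y) - noise2D(x - eps, y)) / (2 * eps);
    const gy = (noise2D(x, y + eps) - noise2D(x, y - eps)) / (2 * eps);
    const angle = Math.atan2(gy, gx); // in (-pi, pi]
    const bin = Math.floor(((angle + Math.PI) / (2 * Math.PI)) * bins) % bins;
    hist[bin]++;
  }
  return hist;
}

const hist = gradientAngleHistogram(100000, 72); // 5-degree bins
```

A perfectly isotropic noise function would produce a flat histogram up to sampling error.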
Is this a problem with the simplex noise algorithm? Is it something that can be fixed? Or am I the only one who sees this as a problem?
I just read the paper, and I think I have an idea what might be causing the artifacts. The gradients for each vertex of the grid are pseudorandomly chosen from a rather small lookup table. As Gustavson states on page 3:
"A good choice for 2D and higher is to pick gradients of unit length but different directions. For 2D, 8 or 16 gradients distributed around the unit circle is a good choice."
That is the method used in classic Perlin noise, but it is not what Perlin proposed for simplex noise in his 2001 paper (page 14):
"Rather than using a table lookup scheme to compute the index of a pseudo-random gradient at each surrounding vertex, the new method uses a bit-manipulation scheme that uses only a very small number of hardware gates."
However, Gustavson states on page 7:
"I will use a hybrid approach for clarity, using the gradient hash method from classic noise but the simplex grid and straight summation of noise contributions of simplex noise. That is actually a faster method in software."
His 2D implementation actually uses the 12 gradients from the 3D gradient table, discarding the z coordinate. In that scheme, each axis-aligned gradient appears twice, but each diagonal (corner) gradient appears only once, which would seem to introduce a bias at 90-degree intervals. But that's not relevant in your case, because the implementation you're using has only 8 gradients, which is quite suggestive of a bias at 45-degree intervals. The likelihood of visible patterns emerging from such minimal variance seems pretty high. It should be easy to adapt that algorithm to use 16 gradients with a mod 16 permutation lookup, which should help reduce the directional artifacts significantly.
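As a rough sketch of that adaptation (this is not Gustavson's code, just an illustration under my own naming): 16 unit-length gradients evenly spaced around the circle, selected per lattice vertex by hashing through a permutation table mod 16.

```javascript
// 16 unit-length gradients evenly distributed around the unit circle.
const GRADS_16 = Array.from({ length: 16 }, (_, i) => {
  const theta = (2 * Math.PI * i) / 16;
  return [Math.cos(theta), Math.sin(theta)];
});

// A shuffled 0..255 permutation table, doubled so perm[a + perm[b]]
// never indexes out of bounds.
const perm = (() => {
  const p = Array.from({ length: 256 }, (_, i) => i);
  for (let i = 255; i > 0; i--) {
    const j = Math.floor(Math.random() * (i + 1));
    [p[i], p[j]] = [p[j], p[i]];
  }
  return p.concat(p);
})();

// Hash a lattice vertex (ix, iy) to one of the 16 gradients.
function gradAt(ix, iy) {
  return GRADS_16[perm[(ix & 255) + perm[iy & 255]] % 16];
}
```

Because 16 does not divide 256 evenly into the hash range in any axis-aligned way, and all 16 gradients lie exactly on the unit circle, the angular coverage is twice as fine as the 8-gradient table.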
Ultimately, though, I think there will always be some visible patterns in a single octave of any gradient noise function, simply because they're band-limited by design: the narrow range of frequencies tends to align perturbations to the grid. Because simplex noise uses a triangular grid, it will probably exhibit some bias at 60-degree intervals even if the gradients were truly random. That's just conjecture, but the point is that these noise functions are really designed to be combined at different frequencies, which tends to break up any patterns you might see in a single octave.
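The usual way to combine frequencies is a fractal sum of octaves. A minimal sketch (again with a placeholder `noise2D`, and with the conventional but freely chosen parameter names `lacunarity` and `gain`):

```javascript
// Placeholder standing in for a real 2D noise function.
function noise2D(x, y) {
  return Math.sin(x * 1.7) * Math.cos(y * 2.3);
}

// Fractal sum: each octave doubles the frequency (lacunarity = 2)
// and halves the amplitude (gain = 0.5), then the result is
// normalized back to roughly [-1, 1].
function fbm(x, y, octaves, lacunarity = 2.0, gain = 0.5) {
  let sum = 0, amp = 1, freq = 1, norm = 0;
  for (let o = 0; o < octaves; o++) {
    sum += amp * noise2D(x * freq, y * freq);
    norm += amp;
    amp *= gain;
    freq *= lacunarity;
  }
  return sum / norm;
}
```

With several octaves, the grid-aligned artifacts of each octave land at different scales and tend to mask one another.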
EDIT:
Another point I just realized: corner gradients such as (1,1) are not of unit length; their length is sqrt(2). The first quote makes it clear that the gradients should lie on the unit circle, so this may be another source of bias. Interestingly, Gustavson uses these non-unit gradients too.
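Fixing that is a one-liner: normalize the table so the corner gradients lie on the unit circle, as the quoted passage recommends. A sketch using the classic 8-gradient set:

```javascript
// The classic 8-gradient table: 4 axis-aligned, 4 diagonal.
const RAW_GRADS = [
  [1, 0], [-1, 0], [0, 1], [0, -1],
  [1, 1], [-1, 1], [1, -1], [-1, -1],
];

// Normalize each gradient; the corner gradients have length sqrt(2),
// so this scales them down onto the unit circle.
const UNIT_GRADS = RAW_GRADS.map(([x, y]) => {
  const len = Math.hypot(x, y);
  return [x / len, y / len];
});
```

Whether this visibly changes the artifacts is a separate question, since the falloff kernel already weights contributions, but it at least removes the amplitude imbalance between axis and diagonal directions.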