Updated my question
I'm building a crawler in Java to compare prices online. However, I'm worried that my IP address could get banned, so I intend to use a proxy to change my IP dynamically, or some tool that rotates IPs automatically.
Many people say that Tor is a powerful tool for rotating IPs. However, I don't know how to use Tor or how to integrate it into a Java web application.
I've searched Google for examples but haven't found anything useful.
Can anyone help me?
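You just need to get Java to route its outgoing HTTP connections (use URLConnection) through Tor while the Tor service is running. By default the Tor daemon listens as a SOCKS proxy on localhost:9050 (the Tor Browser bundle uses 9150). Port 8118 is the default port of Privoxy, an HTTP proxy that is often configured to forward to Tor, so use localhost:8118 only if you have set that up. See here for how to use proxies in Java 8.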
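As a rough sketch of the URLConnection approach, the snippet below builds a java.net.Proxy of type SOCKS and passes it to URL.openConnection. It assumes a local Tor daemon on 127.0.0.1:9050; the class name TorProxyExample, the helper names torProxy and fetch, the target URL, and the timeout values are made up for illustration and are not from the original answer.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;

class TorProxyExample {

    // Describes a SOCKS proxy at the given host and port.
    // 127.0.0.1:9050 is an assumption; adjust it to match your Tor configuration.
    static Proxy torProxy(String host, int port) {
        return new Proxy(Proxy.Type.SOCKS, new InetSocketAddress(host, port));
    }

    // Opens the URL through the given proxy and returns the response body as text.
    static String fetch(String url, Proxy proxy) throws IOException {
        URLConnection conn = new URL(url).openConnection(proxy);
        conn.setConnectTimeout(10_000); // illustrative timeouts, in milliseconds
        conn.setReadTimeout(30_000);
        StringBuilder body = new StringBuilder();
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                body.append(line).append('\n');
            }
        }
        return body.toString();
    }

    public static void main(String[] args) {
        Proxy tor = torProxy("127.0.0.1", 9050);
        try {
            String page = fetch("http://example.com/", tor);
            System.out.println(page.length() + " characters received via " + tor);
        } catch (IOException e) {
            // Typically thrown when nothing is listening on the proxy port.
            System.err.println("Request through Tor failed (is the service running?): " + e);
        }
    }
}
```

Passing the proxy per connection, rather than setting the global socksProxyHost system property, keeps the rest of the application's traffic off Tor, which may or may not be what you want for a crawler.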
Edit: there is also this pure Java Tor library that you may be able to use, either directly or with minor modifications (if it behaves exactly like the normal native Tor service), but it hasn't been updated in a while, so it may not be compatible with the latest Tor specification.
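HttpClient example (Apache HttpClient 4.x, going through an HTTP proxy on port 8118):
import org.apache.http.HttpHost;
import org.apache.http.HttpResponse;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.conn.params.ConnRoutePNames;
import org.apache.http.impl.client.DefaultHttpClient;

// 8118 is Privoxy's default HTTP port; Privoxy must be set up to forward to Tor
HttpHost proxy = new HttpHost("127.0.0.1", 8118, "http");
DefaultHttpClient httpclient = new DefaultHttpClient();
try {
    // Route every request made by this client through the proxy
    httpclient.getParams().setParameter(ConnRoutePNames.DEFAULT_PROXY, proxy);
    HttpHost target = new HttpHost("www.google.com", 80, "http");
    HttpGet req = new HttpGet("/");
    System.out.println("executing request to " + target + " via " + proxy);
    HttpResponse rsp = httpclient.execute(target, req);
    // ...
} finally {
    // When HttpClient instance is no longer needed,
    // shut down the connection manager to ensure
    // immediate deallocation of all system resources
    httpclient.getConnectionManager().shutdown();
}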
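Note that the Tor service must be running for this to work. Because this example talks to an HTTP proxy on port 8118, an HTTP proxy such as Privoxy must also be listening there and forwarding to Tor.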