How to build a distributed java application?

Sami · Jul 4, 2012 · Viewed 20.1k times

First of all, I have a conceptual question: does the word "distributed" only mean that the application runs on multiple machines? Or are there other ways in which an application can be considered distributed (for example, if there are many independent modules interacting together on the same machine, is that distributed)?

Second, I want to build a system that executes four types of tasks. There will be multiple customers, and each one will have many tasks of each type to be run periodically. For example, customer1 will have task_type1 today, task_type2 after two days, and so on; customer2 might have a task_type1 to be executed at the same time as customer1's task_type1, i.e. there is a need for concurrency. The configuration for executing the tasks will be stored in a DB, and the outcomes of these tasks will be stored in the DB as well. The customers will use the system from a web browser (HTML pages) to interact with it (basically, to configure tasks and see the outcomes). I thought about using a REST web service (using JAX-RS) that the HTML pages would communicate with, and using threads on the backend for concurrent execution. Questions:

1. This sounds simple, but am I going in the right direction? Or should I be using other technologies or concepts, like Java Beans for example?

2. If my approach is fine, do I need to use a scripting language like JSP, or can I submit HTML forms directly to the REST URLs and get the result (using JSON, for example)?

3. If I want to make the application distributed, is it possible with my idea? If not, what would I need to use?

Sorry for having so many questions, but I am really confused about this.

Answer

Alex D · Jul 5, 2012

I just want to add one point to the already posted answers. Please take my remarks with a grain of salt, since all the web applications I have ever built have run on one server only (aside from applications deployed to Heroku, which may "distribute" your application for you).

If you feel that you may need to distribute your application for scalability, the first thing you should think about is not web services and multithreading and message queues and Enterprise JavaBeans and...

The first thing to think about is your application domain itself and what the application will be doing. Where will the CPU-intensive parts be? What dependencies are there between those parts? Do the parts of the system naturally break down into parallel processes? If not, can you redesign the system to make it so? IMPORTANT: what data needs to be shared between threads/processes (whether they are running on the same or different machines)?

The ideal situation is where each parallel thread/process/server can get its own chunk of data and work on it without any need for sharing. Even better is if certain parts of the system can be made stateless -- stateless code is infinitely parallelizable (easily and naturally). The more frequent and fine-grained data sharing between parallel processes is, the less scalable the application will be. In extreme cases, you may not even get any performance increase from distributing the application. (You can see this with multithreaded code -- if your threads constantly contend for the same lock(s), your program may even be slower with multiple threads+CPUs than with one thread+CPU.)
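To make the "own chunk of data, no sharing" idea concrete, here is a minimal sketch using a plain thread pool on one machine. The class name `ChunkedSum` and the method `parallelSum` are made up for illustration, and summing an array is only a stand-in for whatever CPU-intensive work your application does. Each worker reads its own slice and writes only a local variable, so there is no lock for the threads to contend on; the partial results are merged once at the end.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical example: each worker sums its own slice of the array and
// shares nothing with the other workers until the final merge.
final class ChunkedSum {

    static long parallelSum(final int[] data, int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            int chunkSize = (data.length + workers - 1) / workers; // ceiling division
            List<Future<Long>> partials = new ArrayList<>();
            for (int w = 0; w < workers; w++) {
                final int from = Math.min(w * chunkSize, data.length);
                final int to = Math.min(from + chunkSize, data.length);
                // The task reads only data[from..to) and writes only a local variable.
                Callable<Long> task = () -> {
                    long local = 0;
                    for (int i = from; i < to; i++) {
                        local += data[i];
                    }
                    return local;
                };
                partials.add(pool.submit(task));
            }
            long total = 0;
            for (Future<Long> partial : partials) {
                total += partial.get(); // the only point where threads wait on each other
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a worker failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int[] data = new int[1_000_000];
        for (int i = 0; i < data.length; i++) {
            data[i] = i % 10;
        }
        System.out.println(parallelSum(data, 4));
    }
}
```

If the workers instead added into one shared counter guarded by a lock, every addition would have to wait its turn, which is the contention case described above.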

The conceptual breakdown of the work to be done is more important than what tools or techniques you actually use to distribute the application. If your conceptual breakdown is good, it will be much easier to distribute the application later if you start with just one server.
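One way this could look in code, as a rough sketch only: model each unit of work as a self-contained task and hide how tasks get executed behind a small interface. All the names here (`TaskBreakdown`, `Task`, `TaskRunner`, `LocalTaskRunner`) are hypothetical, and the body of `run()` is a placeholder for real work. The single-server version uses a thread pool; because callers depend only on `TaskRunner`, an implementation that hands tasks to other machines could be swapped in later without touching them.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical names throughout; none of these types come from a library.
final class TaskBreakdown {

    // A self-contained unit of work: everything it needs is in its fields,
    // and run() touches no shared mutable state.
    static final class Task {
        final String customer;
        final int type;

        Task(String customer, int type) {
            this.customer = customer;
            this.type = type;
        }

        String run() {
            return customer + ":type" + type + ":done"; // placeholder for real work
        }
    }

    // The rest of the application depends only on this interface.
    interface TaskRunner {
        List<String> runAll(List<Task> tasks);
    }

    // Single-server implementation backed by a thread pool.
    static final class LocalTaskRunner implements TaskRunner {
        private final int threads;

        LocalTaskRunner(int threads) {
            this.threads = threads;
        }

        @Override
        public List<String> runAll(List<Task> tasks) {
            ExecutorService pool = Executors.newFixedThreadPool(threads);
            try {
                List<Future<String>> futures = new ArrayList<>();
                for (Task task : tasks) {
                    Callable<String> call = task::run;
                    futures.add(pool.submit(call));
                }
                List<String> results = new ArrayList<>();
                for (Future<String> future : futures) {
                    results.add(future.get()); // results come back in submission order
                }
                return results;
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for tasks", e);
            } catch (ExecutionException e) {
                throw new IllegalStateException("a task failed", e.getCause());
            } finally {
                pool.shutdown();
            }
        }
    }

    public static void main(String[] args) {
        List<Task> tasks = new ArrayList<>();
        tasks.add(new Task("customer1", 1));
        tasks.add(new Task("customer2", 1));
        TaskRunner runner = new LocalTaskRunner(2);
        System.out.println(runner.runAll(tasks));
    }
}
```

The point is not the thread pool itself but that a `Task` carries its own data and needs nothing from its neighbours, so where it runs becomes a detail you can change later.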