I'm interested in how the protocols (and game loop) work for this type of game; any pointers or insights are appreciated.
I guess the main loop would have a world state which would be advanced a few "ticks" per second, but how are the commands of the players executed? What kind of data needs to go back and forth?
I can go into a lot of detail about this, but first go read "1500 Archers" (http://www.gamasutra.com/view/feature/3094/1500_archers_on_a_288_network_.php), which will answer many of your questions. Here's a summary: first, most games use UDP due to the real-time nature of the game. The game loop looks something like this:
That's vastly simplified, and "messing with physics" could easily be a 200-page book on its own. In short, it involves predicting client-side where something is likely to be, getting data from the server that is old but tells you exactly where an object was (or should be), and then interpolating between those values so the object appears "close enough" to where it's actually supposed to be that no one notices. This is super-critical in first-person shooters but not as much for real-time strategy.
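To make the "interpolating those values" part a bit more concrete, here's a minimal sketch of plain linear interpolation between two server snapshots. The `(time, x, y)` snapshot layout and the `interpolate` name are my own assumptions for illustration; real games often use something fancier than a straight line.

```python
def interpolate(snap_a, snap_b, render_time):
    """Estimate a position at render_time between two snapshots.

    Each snapshot is a hypothetical (time, x, y) tuple; snap_a is the older one.
    """
    t_a, x_a, y_a = snap_a
    t_b, x_b, y_b = snap_b
    if t_b <= t_a:
        # Degenerate or out-of-order input: fall back to the newer sample.
        return (x_b, y_b)
    # Fraction of the way from snap_a to snap_b, clamped so we never overshoot.
    alpha = (render_time - t_a) / (t_b - t_a)
    alpha = max(0.0, min(1.0, alpha))
    return (x_a + (x_b - x_a) * alpha, y_a + (y_b - y_a) * alpha)
```

Clients that do this usually render slightly in the past so that there are always two snapshots to blend between.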
For real-time strategy, what typically happens is a turn-based system where time is divided into discrete chunks called "turns" that happen sequentially, and each turn has a number generated by a monotonic function that guarantees strictly increasing values with no duplicates. On any given turn n, each client sends a message to all other clients with its intended action for turn n + m, where m is an arbitrary number that is usually fairly small and is best determined through trial and error as well as playtesting. Once the actions for turn n + m have arrived from all the clients, each client executes them when that turn comes up. This introduces a small delay between when the user orders an action and when it executes; however, this is usually not noticeable.
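Here's a rough sketch of the bookkeeping for that turn scheme. The `LockstepBuffer` class and its method names are hypothetical, and it assumes every client submits something for every turn (an empty list when the player did nothing), which is one common way to let everyone know a turn is complete.

```python
from collections import defaultdict


class LockstepBuffer:
    """Holds commands per turn; a turn runs only once every player has reported."""

    def __init__(self, player_ids, delay=2):
        self.player_ids = set(player_ids)
        self.delay = delay  # the "m" from the description above
        self.pending = defaultdict(dict)  # turn -> {player_id: [commands]}

    def schedule_turn(self, current_turn):
        # Commands issued on turn n are executed on turn n + m.
        return current_turn + self.delay

    def submit(self, player_id, exec_turn, commands):
        # Called for local input and for every message received from a peer.
        self.pending[exec_turn][player_id] = list(commands)

    def ready(self, turn):
        return set(self.pending.get(turn, {})) == self.player_ids

    def pop_turn(self, turn):
        if not self.ready(turn):
            raise RuntimeError(f"turn {turn} is still waiting on a player")
        by_player = self.pending.pop(turn)
        # Fixed player order so every client applies commands identically.
        return [(pid, cmd) for pid in sorted(by_player) for cmd in by_player[pid]]
```

If `ready(turn)` is false when the simulation reaches that turn, the game has to stall until the missing message shows up, which is why tuning m matters so much.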
There are several techniques which can be used to fudge the time as well. For example, if you highlight a unit and then tell it to move, it will make a sound and play an animation when it starts moving but won't actually move right away. However, the network message with the intent to move that unit is sent immediately, so by the time the screen responds to the player's input, the network messages have already been sent and acknowledged. You can fudge it further by introducing a small delay (100ms or so) between the mouse click and the game object's response. This is usually not noticeable by the player, but 100ms is an eternity in a LAN game, and even with a broadband connection on a home computer the average ping is probably around 15-60ms or so, which gives you ample time to send the packet prior to the move.
As for data to send, there are two types of data in games: deterministic and non-deterministic. Deterministic actions are grounded in game physics, so that when the action starts, there is a 100% guarantee that I can predict the result of that action. This data never needs to be sent across the network since I can determine what it will be on the client based on the initial state. Note that using a random number generator with the same seed on every client turns "random" events into deterministic behavior. Non-deterministic data is usually user input, although it is possible to predict what a user's input is likely to be in many cases. The way these pair up in a real-time strategy game is that the non-deterministic event is some sort of order to one of my game objects. Once the game object has been ordered to move, the way in which it moves is 100% deterministic. Therefore, all you need to send on the network is the ID of the object, the command given (make this an enum to save bandwidth), and the target of the command, if any (a spell may have no target if it's an area of effect, but a move command has an end destination). If the user clicks 100 times to make a unit move, there is no need to send a separate move command for each click since they're all in the same general area, so be sure to filter these out as well, since they will kill your bandwidth.
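As a sketch of how small such a command can be, here's one possible wire layout using Python's `struct` module, plus a filter for repeated clicks. The field sizes, the `Command` values, the `ClickFilter` class, and the 8-unit radius are all assumptions I picked for illustration, not a standard format.

```python
import struct
from enum import IntEnum


class Command(IntEnum):
    MOVE = 1
    ATTACK = 2
    CAST_AREA = 3


# Network byte order: turn (uint32), unit id (uint16), command (uint8), x, y (int16).
_FORMAT = "!IHBhh"


def encode(turn, unit_id, command, target=(0, 0)):
    return struct.pack(_FORMAT, turn, unit_id, int(command), target[0], target[1])


def decode(payload):
    turn, unit_id, command, x, y = struct.unpack(_FORMAT, payload)
    return turn, unit_id, Command(command), (x, y)


class ClickFilter:
    """Drops a repeated order whose target is within `radius` of the last one sent."""

    def __init__(self, radius=8):
        self.radius = radius
        self.last = {}  # unit_id -> (command, target)

    def should_send(self, unit_id, command, target):
        prev = self.last.get(unit_id)
        if prev is not None and prev[0] == command:
            dx = target[0] - prev[1][0]
            dy = target[1] - prev[1][1]
            if dx * dx + dy * dy <= self.radius ** 2:
                return False  # same order, same general area: not worth a packet
        self.last[unit_id] = (command, target)
        return True
```

With this layout a full order fits in 11 bytes, and a player hammering the mouse on one spot produces a single packet instead of dozens.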
One final trick for handling a possible delay between a command and its execution is something called a local perception filter. If I get a move order some time t after the order was given, I know when the unit should have started moving and I know its end destination. Rather than teleporting the unit to get it where it's supposed to be, I can start its movement late and then mess with physics to speed it up slightly so that it can catch up to where it's supposed to be, and then slow it back down to put it in the correct place. The exact value you need to speed it up is also relative and playtesting is the only way to determine the correct value because it just has to "feel right" in order for it to be correct. You can do the same thing with firing bullets and missiles as well and it's highly effective for that. The reason this works is that humans aren't horribly good at seeing subtle changes in movement, particularly if an object is heading directly towards them or away from them, so they just don't notice.
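Here's a back-of-the-envelope sketch of that catch-up for a unit moving in a straight line at constant speed. The function names and the idea of a fixed catch-up `window` are my assumptions; as noted above, the actual numbers come from playtesting. If the order arrives `lag` seconds late and you want to be back in sync after `window` seconds, the unit has to cover `lag + window` seconds of travel in `window` seconds, so the boosted speed is `base_speed * (1 + lag / window)`.

```python
def catch_up_speed(base_speed, lag, window):
    """Speed needed during the catch-up window to erase a late start of `lag` seconds."""
    if window <= 0:
        raise ValueError("window must be positive")
    return base_speed * (1.0 + lag / window)


def late_unit_distance(base_speed, lag, window, t):
    """Distance covered by the late-starting unit, t seconds after it starts locally."""
    boosted = catch_up_speed(base_speed, lag, window)
    if t <= window:
        return boosted * t  # still catching up
    return boosted * window + base_speed * (t - window)  # back to normal speed


def on_time_distance(base_speed, lag, t):
    """Where the unit would be had it started moving when the order was given."""
    return base_speed * (t + lag)
```

A longer window means a smaller, less visible speed-up, at the cost of being out of position for longer. A real implementation would also stop the unit at its destination, which this sketch leaves out.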
The next thing to think about is cutting down on bandwidth. Don't send messages to clients that couldn't possibly see or interact with a unit that is moving. Don't send the same message over and over again because the user keeps clicking. Don't send messages immediately for events that have no immediate effect. Finally, don't require an acknowledgement for events that will be stale should they fail to be received. If I don't get a movement update, by the time I re-transmit that update, its value will be so old that it's no longer relevant, so it's better to just send another move and use a local perception filter to catch up, or use a cubic spline to interpolate the movement so that it looks more correct, or something of that nature. However, an event that's critical, such as "you're dead" or "your flag has been taken", should be acknowledged and re-transmitted if needed. I teach network game programming at DigiPen, so feel free to ask any other questions about this, as I can probably provide you with an answer. Network game programming can be quite complicated, but ultimately it's all about making choices in your implementation and understanding the consequences of those choices.
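To illustrate the split between "fire and forget" and "must arrive" messages, here's a minimal sketch of an outbox that only tracks acknowledgements for critical events. The `Outbox` class, the `send(seq, payload)` callback, and the 0.25 second resend interval are assumptions for the example; the actual socket code is left out on purpose.

```python
class Outbox:
    """Sends stale-prone updates once and re-sends critical ones until acknowledged."""

    def __init__(self, send, resend_after=0.25):
        self.send = send  # hypothetical transport callback: send(seq, payload)
        self.resend_after = resend_after
        self.next_seq = 0
        self.unacked = {}  # seq -> (payload, time last sent)

    def send_unreliable(self, payload):
        # Movement updates and the like: if it is lost, the next one replaces it.
        self.send(None, payload)

    def send_reliable(self, payload, now):
        # Critical events ("you're dead") get a sequence number and are remembered.
        seq = self.next_seq
        self.next_seq += 1
        self.unacked[seq] = (payload, now)
        self.send(seq, payload)
        return seq

    def on_ack(self, seq):
        self.unacked.pop(seq, None)

    def tick(self, now):
        # Re-send anything critical that has gone unacknowledged for too long.
        for seq, (payload, sent_at) in list(self.unacked.items()):
            if now - sent_at >= self.resend_after:
                self.unacked[seq] = (payload, now)
                self.send(seq, payload)
```

Calling `tick` once per game loop iteration is enough; the receiver just needs to echo the sequence number back and ignore duplicates it has already handled.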