Choosing the right network model for your multiplayer game
I'm Glenn Fiedler and welcome to Más Bandwidth, my new blog at the intersection of game network programming and scalable backend engineering.
I'm fresh back from lovely Malmö, Sweden where I hung out with the team from Coherence at their offices for the week around Nordic Game conference.
I'm happy to report that Coherence is an excellent multiplayer product for Unity with a very strong and talented team behind it. I spent a lot of time with their CTO Tadej (a close personal friend of mine) and over the week we had many discussions about network models and the pros and cons for the networking strategy they have chosen for their product.
And this leads to a very important point that I think most people are unaware of.
There are just so many different ways to network a game.
In the industry we call these network models. Each network model represents a different strategy for keeping your game in sync, and comes with significant pros and cons. Something absolutely trivial in one network model might be difficult or even impossible to implement in another, so it's extremely important to choose the right network model for your game.
So if you're a game developer trying to decide which network model to use – read on, and I'll share some helpful tips to help you pick the correct network model for your next game!
Important Factors to Consider
Let's start with the key inputs that will help you make your decision.
These include:
- Do you already have a lot of game code written? Are you starting from scratch with a blank slate, or are you converting an existing game to multiplayer?
- What platforms will your game ship on? Are all platforms closed (PS5, Xbox), or do you have to support open platforms, like PC and mobile phones?
- Is your game competitive or cooperative? Do people have a significant incentive to cheat, and will this cheating negatively impact other players?
- How competitive is your game? Is your game competitive in a casual sense, or hyper-competitive like Counterstrike or Street Fighter?
- How latency sensitive is your game? There is no point networking a game of Solitaire like it's Street Fighter, but from the other direction if your game is latency sensitive and you network it incorrectly, it will kill your game.
- Can you afford dedicated servers? Dedicated servers are extremely expensive and out of reach of many game developers. Choosing the correct network model can help reduce or eliminate this cost.
- How many players are in each match? Deterministic lockstep based approaches are good for lower player counts, but you'll never see an MMO networked this way.
- How large is your game world? It's usually not possible to make open world games with streaming levels deterministic, ruling out deterministic lockstep approaches.
- Is your game highly CPU bound? Games with lots of physically simulated objects, especially in large open worlds, often spend so much time on simulating physics that rolling back and re-simulating for GGPO style networking or client side prediction is impractical.
- Are player-to-player interactions tightly or loosely coupled? Examples of tightly coupled interactions include blocks, parries and dodge mechanics in fighting games. Loosely coupled interactions are things like instant hit projectile weapons at a distance with very short time to kill.
- Does your game have a fantastically high number of units? For example, a real-time strategy game like Starcraft may have many thousands of units. For games like this a deterministic lockstep strategy is a no brainer.
- What engine are you using? Certain solutions are only available for specific game engines, for example Coherence and Photon Quantum are only available for Unity, while Snapnet is available only for Unreal Engine. Of course, you can always write your own implementation of a network model, but it's extremely difficult to build the rocket at the same time as you are flying it to the moon (been there).
Distributed Authority
In this network model objects are distributed across player machines to achieve beneficial things like latency hiding and load balancing.
At its simplest, input delay is removed by simulating each player locally on their own machine, making each client have "authority" (acting as the server) for their own player character. Non-player objects like AIs can also be distributed, sharing the workload of simulating the world across all player machines.
This approach can also be extended to take authority over vehicles the player drives, turrets manned by the player, physics objects picked up and thrown by players, and even stacks of objects touched by the player or other objects under the authority of that player.
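To make this concrete, here is a minimal sketch of how per-object authority might be tracked and handed off. All names here (`NetObject`, `TakeAuthority`, `ResolveClaim`) are hypothetical and not from any particular engine; the sequence number rule for resolving conflicting claims is just one possible convention.

```cpp
#include <cassert>
#include <cstdint>

constexpr int ServerAuthority = -1;   // no client owns it; default owner

struct NetObject
{
    int authority = ServerAuthority;  // client index that simulates this object
    uint64_t sequence = 0;            // increases each time authority changes
};

// A client takes authority when it interacts with an object (eg. touches
// or throws it), so the interaction feels lag-free on that client.
void TakeAuthority( NetObject & object, int clientIndex )
{
    if ( object.authority != clientIndex )
    {
        object.authority = clientIndex;
        object.sequence++;            // sequence resolves conflicting claims
    }
}

// When two machines claim the same object, the higher sequence number wins;
// ties are broken by lower client index (arbitrary, but consistent everywhere).
bool ResolveClaim( const NetObject & local, const NetObject & remote )
{
    if ( local.sequence != remote.sequence )
        return local.sequence > remote.sequence;
    return local.authority <= remote.authority;
}
```

A real implementation would also return authority to a default owner when no client is interacting with the object, and would piggyback the authority and sequence fields on regular state updates.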
Examples:
- Journey
- God of War: Ascension
- Mercenaries 2
- GTA: Online
- Dark Souls
- Uncharted
- Destiny
When to consider:
- You are converting an existing singleplayer game to multiplayer and don't want to rewrite all your game code to network it.
- You don't want to, or cannot afford to pay for dedicated servers.
- You have a large open world game.
- Your game is not deterministic and rewriting it to make it deterministic is not feasible.
- Your game is very CPU bound (perhaps because of physics simulation) and rolling back and re-simulating multiple frames of the game simulation invisibly each frame is just not possible because it would use too much CPU.
- Your game is cooperative, or casually competitive on closed platforms only.
- You want to try interesting stuff like players drifting in/out of other players' games (Journey), or players invading other players' games (Dark Souls).
- You are creating a physics based game in VR, where the underlying physics simulation is not deterministic, and you have to trust the position and orientation inputs for the VR headset and controllers anyway.
When to exclude:
- Do not use for hyper competitive fighting games, even on closed platforms. The distributed authority creates variable timing and slop in attacks, blocks and parries which drives fighting game players crazy. Ask me how I know (I worked on God of War: Ascension multiplayer). Use a GGPO style deterministic lockstep model with rollback instead.
- Do not use for hyper-competitive first person shooters. You are shooting at a sloppy extrapolation with no real concept of time synchronization between machines. Professional players will notice this. Not to mention it's completely insecure because it trusts the client. Use the Quake network model with client side prediction and lag compensation instead.
Pros:
- Determinism is not required
- You don't need to host dedicated servers
- By distributing the simulation, you are able to distribute the cost of simulating the world across multiple machines, making it possible to have larger worlds.
- If you are not planning on hosting dedicated servers the choice is between having one player host the game (and being able to cheat), or distributing the game so now all players can cheat. YOLO.
Cons:
- Insecure. Trusts the client.
- Late join can be difficult to implement. Even when implemented, it may "work most of the time but not always". Can be challenging/impossible to get 100% reliable.
- Additional complexity because things run on different machines. Distributed programming is harder than just having everything run on a server, although this can be mitigated by using a solution like Coherence.
- Relatively high upload bandwidth because each player is simulating some objects and needs to send them to all other players, which can be a problem with asynchronous network connections (more download than upload). As player counts increase, this grows as O(n*m) where n is the number of players, and m is the number of objects. Can be solved somewhat by synchronizing nearby objects at a higher rate than faraway objects, and by having "nearly deterministic" physics simulation extrapolation where updates can happen at lower rates and still look visually correct.
Further reading:
- Coherence, a distributed authority networking solution for Unity.
- Networked Physics in Virtual Reality (2016)
- Networking for Physics Programmers (2010)
Pure Client/Server
In this network model the game runs exclusively on the server. Each client sends inputs to the server, and displays an interpolated view of the world reconstructed from a time series of snapshots sent back to it (state of all visual objects in the world at time t).
Examples:
- Quake (pre-Quakeworld)
- The Finals? (not 100% sure though, please correct me if I'm wrong).
When to consider:
- When your game is not particularly latency sensitive.
- When a small amount of fixed latency is acceptable (50-100ms or so).
When to exclude:
- When you don't want to run dedicated servers.
Pros:
- Low bandwidth from client to server (inputs only).
- Low CPU cost on the client, since no game simulation or physics is running. You can spend all your client CPU on rendering.
- Only need to synchronize visual components for objects, for example position, orientation, animation state etc. Don't have to synchronize "deep" object state required for prediction or extrapolation.
- Because all objects are in the same time stream, players can interact with other players and a destructible world with greater consistency than in a typical first person shooter with client side prediction.
Cons:
- There will be a delay between the client's inputs and the local player taking action in response to them, equal to at least the round trip time (RTT) between the client and server. And usually a bit more, since packets don't always arrive nicely spaced but have some jitter.
- It can be difficult to perform time synchronization between client and server, such that the client delivers inputs ahead of when they are needed, but does not add too much additional delay.
- The internet is inconsistent and gives players hairpin routes (routes with significantly higher latency than normal) according to the hash of the source IP and dest IP and port numbers around 5% of the time on average worldwide. Fixing this requires a network acceleration solution like Network Next.
Further reading:
Client side prediction and lag compensation
Start with the Quake network model and modify it such that each player is now in their own time stream. Now, instead of the whole world stepping forward together in fixed time steps, step players forward on the server only when player input + delta time is received from their client.
All remote objects are still interpolated on the client (and possibly extrapolated although I prefer not to), but local player objects are special cased and run with full simulation using local player inputs on the client machine. This removes the feeling of latency on player actions, eg. players can now move and shoot with no delay.
To keep the server authoritative, the server regularly sends back "deep state" corrections (containing internal non-visual state, eg. inventory, # of bullets in ammo clip, current weapon, firing delay between shots and so on) to each client, and the client applies that correction (which is effectively in the past) and invisibly re-simulates back up to current predicted time on the client. This is called client side prediction, or outside of game development, optimistic execution with rollback.
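Here is a toy sketch of that rollback loop, with hypothetical names and 1D integer movement standing in for a full player simulation: the client predicts immediately with local inputs, keeps unacknowledged inputs in a buffer, and when a server correction arrives it rewinds to the corrected state and invisibly resimulates back up to predicted time.

```cpp
#include <cassert>
#include <deque>

struct Input { int frame; int move; };        // move is -1, 0 or +1
struct State { int frame; int position; };

State Simulate( State state, const Input & input )
{
    state.position += input.move;
    state.frame = input.frame + 1;
    return state;
}

struct PredictedClient
{
    State state{};                 // current predicted state
    std::deque<Input> pending;     // inputs not yet acknowledged by the server

    void ApplyLocalInput( int move )
    {
        Input input { state.frame, move };
        pending.push_back( input );
        state = Simulate( state, input );      // predict immediately, no delay
    }

    // Server correction arrives (authoritative state at some past frame):
    // drop acknowledged inputs, rewind to it, invisibly resimulate the rest.
    void ApplyCorrection( const State & serverState )
    {
        while ( !pending.empty() && pending.front().frame < serverState.frame )
            pending.pop_front();
        state = serverState;
        for ( const Input & input : pending )
            state = Simulate( state, input );
    }
};
```

When the server agrees with the prediction, the correction is a no-op and the player never notices; when it disagrees (eg. a move was blocked), the client snaps or smooths to the resimulated result.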
Weapon firing and damage are applied exclusively on the server. Predicted weapon firing on the client is purely cosmetic. Since the client and the server have a strong concept of shared time, the server is able to reconstruct the state of the world (lag compensation) from the client's point of view when they fired their weapon, crediting hits to the player according to their point of view.
Reconstructing the world according to player point of view on the server is done using a ring buffer containing all visual state for the last ~second (eg. position, orientation and bone orientations) sent down to the client in snapshots and interpolating between them to match the interpolated state displayed on the client.
Lag compensation avoids needing to lead targets and allows more precise shots: for example, if another player is running and you shoot them in the knee at a specific point in their run cycle, it will hit. The cost is that the player being shot sometimes feels that, from their point of view, they were shot from behind cover, because effectively the shot was fired in their past.
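A minimal lag compensation sketch along those lines, with illustrative names and 1D positions standing in for full hitboxes: the server records a short history of positions, rewinds to the shooter's view time by interpolating between the bracketing entries (matching the interpolation the shooter's client performed), then validates the hit against the rewound position.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

struct HistoryEntry { double time; float position; };

struct LagCompensation
{
    std::vector<HistoryEntry> history;   // sorted by time, oldest first (~1 second)

    void Record( double time, float position )
    {
        history.push_back( { time, position } );
    }

    // Reconstruct the target position at the shooter's view time by
    // interpolating between the two bracketing history entries.
    bool RewindTo( double viewTime, float & out ) const
    {
        for ( size_t i = 0; i + 1 < history.size(); i++ )
        {
            const HistoryEntry & a = history[i];
            const HistoryEntry & b = history[i+1];
            if ( viewTime >= a.time && viewTime <= b.time )
            {
                const double t = ( viewTime - a.time ) / ( b.time - a.time );
                out = float( a.position + ( b.position - a.position ) * t );
                return true;
            }
        }
        return false;
    }
};

// Credit the hit if the shot lands within tolerance of the rewound position.
bool ValidateHit( const LagCompensation & lc, double viewTime,
                  float shotPosition, float tolerance )
{
    float rewound;
    if ( !lc.RewindTo( viewTime, rewound ) )
        return false;
    return std::fabs( rewound - shotPosition ) <= tolerance;
}
```

A real implementation rewinds full bone transforms for every potential target, and bounds how far back the server will rewind to limit the shot-behind-cover effect.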
Examples:
- Quakeworld (sans lag compensation, which was invented for Counterstrike).
- Counterstrike
- Call of Duty
- Titanfall 1 and 2
- Apex Legends
- Valorant
When to consider:
- You are creating an absolute top-tier eSports FPS that will compete with Counterstrike, Call of Duty, Valorant and Apex Legends.
When to exclude:
- You don't want to, or can't afford to host dedicated servers.
- You have tightly coupled melee mechanics between players (eg. block, parry, dodge etc). Use a GGPO style deterministic lockstep network model instead.
Pros:
- Absolute top tier shooting mechanics between players.
- Low bandwidth from client to server (inputs only).
- Almost everything runs on the server, which frees up client CPU for rendering.
Cons:
- Dedicated servers are expensive.
- It can be a significant change to how you write game code to implement shared client and server code such that it works with client side prediction and lag compensation.
- Lag compensation and shot behind cover paradox is a mindfuck.
- Vulnerable to latency injection so that players can shoot other players further in the past by faking additional latency between the server and client, while having low latency from client to server so they get their shot through first.
- Not a good fit for games with melee elements and tight coupling between players like melee attacks, blocks, parries and dodges.
- If you have a highly dynamic and destructible world, the cost of lag compensation reconstructing objects on the server could be prohibitive.
Further reading:
Client side prediction with client simulation in the remote view of objects
Instead of having remote objects on the client interpolated as in the Quake model, a subset of the simulation runs on the client for remote objects, so that objects continue simulating forward on the client between network updates.
Object updates are prioritized such that not every object is included in each packet, and objects are updated at varying rates. Typically, objects closer to the player are updated more frequently. This is done by having a priority accumulator per-object, and increasing the priority accumulator value for an object each frame it is not sent, scaled by some multiplier (which can be some function of the type of object, the state of the object and how far away it is from the player, and so on.)
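A minimal priority accumulator sketch (illustrative names and values): each time a packet is built, every object accumulates priority scaled by its per-object multiplier, the highest accumulated objects are included first, and included objects reset to zero so they must re-earn their place.

```cpp
#include <cassert>
#include <algorithm>
#include <vector>

struct SyncObject
{
    int id;
    float priorityScale;       // eg. higher for nearby or important objects
    float accumulator = 0.0f;
};

// Pick up to maxObjects for this packet, highest accumulated priority first.
std::vector<int> BuildPacket( std::vector<SyncObject> & objects, int maxObjects )
{
    for ( SyncObject & object : objects )
        object.accumulator += object.priorityScale;

    std::vector<SyncObject*> sorted;
    for ( SyncObject & object : objects )
        sorted.push_back( &object );
    std::sort( sorted.begin(), sorted.end(),
        [] ( const SyncObject * a, const SyncObject * b )
        { return a->accumulator > b->accumulator; } );

    std::vector<int> included;
    for ( SyncObject * object : sorted )
    {
        if ( (int) included.size() >= maxObjects )
            break;
        included.push_back( object->id );
        object->accumulator = 0.0f;    // sent: must re-earn priority
    }
    return included;
}
```

In practice the cutoff is a byte budget rather than an object count, and the scale is recomputed each frame from distance, object type and state.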
Because the client view of the remote players is no longer as accurate as in games like Counterstrike, you usually need to lead the shots. Sometimes this is compensated for by giving the client credit for some shots that are "credible" within some amount of tolerance of the server position, although this is not as robust as the lag compensation solution in Counterstrike style games.
This is the default mode of networking in Unreal Engine.
Examples:
- Unreal Tournament
- Fortnite
- Halo
- Battlefield
When to consider:
- You are in Unreal Engine and you are making a more casual competitive FPS or cooperative shooter with the default Unreal network model.
- You have a large world with many players and objects and you want to prioritize updates for nearby objects at a higher rate.
- Your objects have highly predictable motion, for example projectile motion or flying planes or spaceships or rigid bodies that can be extrapolated via the simulation between network updates.
When to exclude:
- If you are making a hyper-competitive first person shooter in Unreal, consider using Snapnet to get a Quake style network model with client side prediction and lag compensation inside Unreal Engine.
- If you are making a fighting game. Use a GGPO-style network model instead. The fighting games that you may have seen using Unreal Engine do not use the Unreal Engine network model; they have ripped it out and replaced it.
Pros:
- Lower latency in the view of remote objects on the client because there is no buffering for interpolation.
- Bandwidth usage can be dynamically scaled up easily by just changing the maximum packet size sent. Priority accumulator takes care of the rest.
- Some players prefer the network feel of Halo or Battlefield over the Quake model (Counterstrike, COD, Titanfall, Apex Legends...)
Cons:
- Players are shooting at an extrapolation.
- The server cannot really verify that the client hit exactly.
- Higher CPU cost on the client because it runs at least some simulation and collision detection for remote objects instead of just interpolating them.
- Significant complexity can be introduced because you may receive updates for one object in a packet without receiving an update for another object it depends on (for example, the object it is parented to).
- Implementing delta encoding is much more complicated due to not all objects being in every packet.
Further reading:
Peer-to-Peer Deterministic Lockstep
In this network model players connect directly to each other peer-to-peer and send inputs or commands to each other. Peers only step the simulation forward when they have received inputs/commands from all players for the current frame, hence the origin of the term "lock-step". In order for this approach to work, the game must be completely deterministic, such that a checksum of game state can be compared between peers at the end of each frame to check for desync.
Pros:
- No netcode needs to be written. "It just works".
- Only inputs/commands need to be sent, so bandwidth usage is very low.
- No servers required, so games don't cost anything to run.
- You can record the input or command stream for a game, and play the whole thing back.
Cons:
- Making the game deterministic can be incredibly difficult.
- Debugging causes of non-determinism can be incredibly frustrating. Full Spectrum Warrior from Pandemic shipped with a very rare non-determinism bug they couldn't fix, and it was solved with an instant sniper kill!
- It is difficult/impossible to get perfect determinism working across different compilers and platforms.
- NAT punch through is required, and may not be possible between all players depending on NAT types for each player. Usually ends up with some relay system being used as a fallback when NAT punch through doesn't work.
- Vulnerable to fog of war removal attacks in RTS games.
- Lag switches can be used to delay other players in the game indefinitely.
- Really only suitable for low player counts (eg. 4-8 players at most).
When to avoid:
- Always. Lag switch attacks are just too effective.
Examples:
- Age of Empires
- Command and Conquer
- Many fighting games, sports games and RTS games in the past...
Deterministic Lockstep with Relay Server
Start with the deterministic lockstep model, but instead of connecting players peer-to-peer, have them exchange packets via a relay server. Give this relay server a concept of time instead of just being a dumb reflector. Now if a client tries using a lag switch or messes around with input timing too much, it can be detected and they can be banned from the game.
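One way the relay's concept of time might be used, sketched below with hypothetical names and thresholds: track when each player's inputs arrive relative to the frame deadline, and flag players whose inputs are consistently late (the lag switch signature) rather than occasionally late (ordinary jitter).

```cpp
#include <cassert>

struct InputTimingMonitor
{
    double lateThreshold;      // seconds past the frame deadline that count as late
    int maxStrikes;            // consecutive late frames before flagging
    int strikes = 0;
    bool flagged = false;

    void OnInput( double arrivalTime, double frameDeadline )
    {
        if ( arrivalTime > frameDeadline + lateThreshold )
        {
            if ( ++strikes >= maxStrikes )
                flagged = true;    // candidate for kick/ban by the relay
        }
        else
        {
            strikes = 0;           // on-time input resets the streak
        }
    }
};
```

Real systems would be more forgiving (windows and ratios rather than consecutive strikes) to avoid punishing players on genuinely bad connections.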
Examples:
- Starcraft II
- Any modern RTS game
- Fighting games without rollback netcode
When to consider:
- You are making an RTS game and you have waaaaaaay too many units to synchronize any other way.
When to avoid:
- Fighting game players expect rollback netcode. Use GGPO style rollback netcode or face their wrath.
Pros:
- Don't have to worry about NAT issues.
- Lag switch attacks are no longer effective.
- All the benefits of deterministic lockstep.
Cons:
- Have to wait for the most lagged player, or that lagged player loses their input or is banned from the game. It's a fine line.
GGPO style deterministic lockstep
All the benefits of deterministic lockstep without the lag. Each frame, the client processes frames with inputs received from the relay server, up to the most recent server time. Then it copies the game state (forks it) and simulates the copy forward with local inputs to the present (predicted) client time. It continues predicting on the fork until the next update arrives from the network, then discards the predicted fork, rinse and repeat.
When to consider:
- You are making a fighting game.
When to exclude:
- Don't use for physics heavy games because they are CPU bound, although as a counterpoint to this advice, Little Big Planet was networked using something similar to this approach, but one of their team members was the author of the physics engine!
Pros:
- Can predict ahead using local inputs to hide lag.
- All the benefits of deterministic lockstep.
Cons:
- It can be difficult to make sure your game code keeps track of visual effects and sounds considering all the rollback that is going on.
- In practice some amount of input delay needs to be used (eg. 2-3 frames @ 60Hz), otherwise rollbacks can be very disruptive to players.
- If the game is CPU bound, you can get into a spiral of death where the simulation can't keep up with all the invisible resimulations.
- Only suitable for games with low player counts. I would recommend 4-8 being maximum but perhaps it can be pushed up to 32.
Further reading:
Deterministic lockstep with simulation server
Upgrade your relay server so that it doesn't just handle player inputs and time, but it also acts as an invisible player in your game and runs the same simulation with all player inputs. Things that happen in this headless server instance of the game can now securely call out to backend systems and grant players progression and items in the meta game without being vulnerable to modified code or memory hacking on the client.
Examples:
- Supercell games
Pros:
- All the benefits of deterministic lockstep, and you now have a place to stand to grant players items and points and score that is secure!
- You can now have a real meta game and progression, and players cannot just hack it easily.
Cons:
- You now have to pay for dedicated servers to run alongside your players, so your costs for infrastructure are now much higher.
- Low player counts only.
I want to make an FPS which is also a fighting game like Street Fighter!
It's not possible. The two network models are fundamentally different. Pick one.
I want to make a hybrid FPS/MMO with thousands/millions of players!
I'm working on this here, but it's likely to be very expensive to run.
I want to make a first person shooter with no latency, a fully destructible world and client side prediction for all physics objects!
The CPU cost makes this impossible today. Come back in 10 years.
I want to make a hybrid MMO/fighting game like Street Fighter meets World of Warcraft!
I hate you.
I want to make a metaverse with millions of players and I'm using Photon!
I hate you even more.