Is working as a developer in the gamedev sector different from working in other branches of the IT sector? What are its specifics? After almost a decade in IT, I ended up in gamedev. In this article, based on my talk at the Javeloper conference in September 2021, I would like to discuss the specifics of this industry and the challenges and opportunities it offers.
At Ten Square Games, we create mobile games in the Free-to-Play model, meaning that anyone can download the game for free. Our games don’t require any hardware other than a mobile phone. That fact makes it easier for us to put ourselves in our clients’ shoes, unlike, for example, a corporate position where we’re forced to optimize processes for internal purposes.
Despite popular preconceptions about mobile games being small games, their scale is anything but. Depending on circumstances, a given game can host anywhere from tens to hundreds of thousands of players every day. That’s where the first challenge comes in – ensuring dynamic product scalability.
The numbers above are crucial to us, and we measure them as DAU (Daily Active Users) or MAU (Monthly Active Users). On one hand, they’re great business metrics; on the other hand, they require an infrastructure that lets us monitor data on a large population of players in real time. We also measure many other aspects of the game, such as ARPU (average revenue per user) and N-day retention.
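To make these metrics concrete, here is a minimal sketch of how DAU, daily ARPU, and N-day retention could be computed from a list of raw events. The `PlayerEvent` record and all method names are hypothetical, and the retention cohort is simplified to "players active on a given day" rather than "players who installed on that day"; it illustrates the definitions and is not a description of our analytics pipeline.

```java
import java.time.LocalDate;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class GameMetrics {

    /** One recorded session or purchase. The shape of this record is an assumption. */
    record PlayerEvent(String playerId, LocalDate day, double revenue) {}

    /** Distinct players with at least one event on the given day. */
    static Set<String> activePlayers(List<PlayerEvent> events, LocalDate day) {
        Set<String> ids = new HashSet<>();
        for (PlayerEvent e : events) {
            if (e.day().equals(day)) {
                ids.add(e.playerId());
            }
        }
        return ids;
    }

    /** DAU: the number of distinct active players on the given day. */
    static int dau(List<PlayerEvent> events, LocalDate day) {
        return activePlayers(events, day).size();
    }

    /** Daily ARPU: revenue earned on the day divided by that day's active users. */
    static double arpu(List<PlayerEvent> events, LocalDate day) {
        int active = dau(events, day);
        if (active == 0) {
            return 0.0;
        }
        double revenue = 0.0;
        for (PlayerEvent e : events) {
            if (e.day().equals(day)) {
                revenue += e.revenue();
            }
        }
        return revenue / active;
    }

    /** N-day retention: share of the cohort day's players who are active again n days later. */
    static double retention(List<PlayerEvent> events, LocalDate cohortDay, int n) {
        Set<String> cohort = activePlayers(events, cohortDay);
        if (cohort.isEmpty()) {
            return 0.0;
        }
        Set<String> returned = activePlayers(events, cohortDay.plusDays(n));
        returned.retainAll(cohort); // keep only players who were in the cohort
        return (double) returned.size() / cohort.size();
    }
}
```

At real scale these aggregates would be computed incrementally rather than by scanning a list, but the definitions stay the same.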
Another important goal is securing the game economy by both ensuring that the players aren’t able to abuse glitches for nefarious means, and by ensuring that we don’t have any chaos-causing errors on our part, as they can be a lot of work to undo.
We also want to ensure user retention – this means that we constantly have to deliver them loads of fun in short cycles. To achieve this, we regularly add new locations, events or functionalities. We can’t afford to do long maintenance breaks, since our players are playing from all over the world, meaning our infrastructure has to be online most of the time.
We also use machine learning in our day-to-day work. The analytical, mathematical, and statistical solutions behind that statement help us constantly adjust the offers in our games to the players’ needs, to make sure they’re optimal. For example, the decision whether 3, 4, or 5 animals should appear simultaneously in a given Hunting Clash location is always based on a verified hypothesis.
Another challenge is constantly monitoring and responding to problems. To ensure that we can react to problems ASAP, we’ve created a large number of indicators to track in real time. We use Prometheus to capture data and then process it in Grafana, building automatic alerts that can be integrated with popular communication tools and email.
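One common way to verify a hypothesis like this is a two-proportion z-test that compares a success rate (for example, next-day retention) between two variants. The sketch below is only an illustration: the article does not say which statistical test is used, and the class, method, and parameter names are hypothetical.

```java
final class AbTest {

    /**
     * z statistic for the difference between two success rates,
     * using the pooled standard error of a two-proportion z-test.
     */
    static double zScore(long successesA, long playersA, long successesB, long playersB) {
        if (playersA <= 0 || playersB <= 0) {
            throw new IllegalArgumentException("both variants need at least one player");
        }
        double rateA = (double) successesA / playersA;
        double rateB = (double) successesB / playersB;
        double pooled = (double) (successesA + successesB) / (playersA + playersB);
        double standardError =
                Math.sqrt(pooled * (1 - pooled) * (1.0 / playersA + 1.0 / playersB));
        if (standardError == 0) {
            return 0.0; // both rates are exactly 0% or 100%, so there is no difference
        }
        return (rateB - rateA) / standardError;
    }

    /** Two-sided test at the 5% level: |z| must exceed 1.96. */
    static boolean isSignificant(double z) {
        return Math.abs(z) > 1.96;
    }
}
```

For instance, with 100 of 1,000 players retained in variant A and 150 of 1,000 in variant B, `zScore(100, 1000, 150, 1000)` comes out at roughly 3.4, so the difference would count as significant at the 5% level.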
Java – used in our entire server-side infrastructure, responsible for the persistence, communication, network layers and synchronization.
Redis – given the scale, choosing a database that could handle the traffic and changes was hard. NoSQL became a natural direction, and we chose Redis since it provides responsive access to the persistence layer. For auditing purposes, statistics and verification of player reports are kept in relational databases. We process those with an unacceptable disease to use it for the needs of a current
Netty – a generic solution that allows us to build our protocol out of small blocks and ensures that, for example, a player won’t lose connection if they switch from Wi-Fi to 3G. Netty also allows us to use JNI (Java Native Interface, which lets Java code call lower-level native code).
(No) Spring – no magic framework. A compromise between building something specific out of small blocks and using a framework that would be as broad as possible.
Although the main game servers that handle player traffic do not use Spring in their current architecture, many of our services shared between products are based on this technology.
We try to maintain a balance between simple internal solutions and generic tools, and we’re open to every tool that can provide value to our players and developers in their everyday work.
We also use common market solutions such as RabbitMQ or GCP – many services supporting products with different business characteristics use a different technology stack.
The following graphs show the traffic in our games. The first graph shows a regular day which has a sudden user peak. What is the reason for that sudden peak? Is it new content? Or is it perhaps a login error that makes players attempt to log in multiple times? A graph like this is always a reason for investigation.
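A peak like the one in the first graph can also be flagged automatically. The sketch below compares each new traffic sample with the average of the preceding window and reports a spike when the sample exceeds that average by a chosen factor. It is a plain-Java illustration of the idea, not our alerting setup; the class name, window size, and factor are all assumptions.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class TrafficSpikeDetector {
    private final int windowSize;
    private final double factor;
    private final Deque<Double> window = new ArrayDeque<>();
    private double sum;

    TrafficSpikeDetector(int windowSize, double factor) {
        if (windowSize <= 0 || factor <= 0) {
            throw new IllegalArgumentException("windowSize and factor must be positive");
        }
        this.windowSize = windowSize;
        this.factor = factor;
    }

    /**
     * Records a sample and reports whether it exceeds factor times the
     * average of the previous window. Nothing is flagged until the window is full.
     */
    boolean record(double value) {
        boolean spike = window.size() == windowSize && value > factor * (sum / windowSize);
        window.addLast(value);
        sum += value;
        if (window.size() > windowSize) {
            sum -= window.removeFirst(); // drop the oldest sample
        }
        return spike;
    }
}
```

A detector created with `new TrafficSpikeDetector(3, 2.0)` stays quiet for samples such as 100, 110, 90, and 105, and flags a following sample of 400. Whether the cause is new content or a login error still has to be investigated by a person.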
The second graph shows a large increase. It can be connected to the game getting a global launch after only being available in limited locations, to the game being recommended by Google Play or the App Store, or simply to an increase in our marketing spending on it.
However, the most important point this graph makes is that we always have to be ready to scale our infrastructure, sometimes from one day to the next, with our userbase increasing 10-fold. This means we need an expandable architecture that can handle more traffic simply by being scaled.
Resources, both ours and the players’, are a crucial aspect of game development. Imagine a game that takes more than 10 seconds to load, or one that downloads 2 MB of data every time the player logs in. Either might irritate players, as might higher battery usage when the device has to decompress data that was compressed to reduce transfer. These seemingly small matters have a real impact on a game’s business performance.
We have to avoid over-exerting the players’ resources while also taking care of server-side resources. If we want a persistence layer, we have to avoid storing too much data. For example, 1 TB isn’t a large amount of data in general, but as a cache it’s massive.
We store data needed in real-time on quickly accessible resources (e.g. RAM) on servers close to the game server (in a network sense). When it comes to relational databases, the internal infrastructure can be, for example, 50 TB in the data center where we store the entirety of a player’s history.
All this serves to ensure that the players can be supported well and that we have the data we need to develop the game in an appropriate direction.
The entire population of our players is divided into sectors, each assigned to a single server. This is where players can communicate with each other at the game level.
Server information is visible to players and lets us dynamically scale architecture based on traffic. We use HAProxy, and on the server side, adding support for it is handled by one block in Netty.
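The division into sectors can be pictured with a small sketch. It assumes a simple policy: a player keeps the sector they were first assigned to, new players go to the least-loaded sector that still has room, and a new sector is opened when all existing ones are full. The policy, the class name, and the capacity parameter are hypothetical and only illustrate the idea of mapping players to servers.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class SectorAssigner {
    private final int capacityPerSector;
    private final List<Integer> sectorLoad = new ArrayList<>(); // index = sector id
    private final Map<String, Integer> playerSector = new HashMap<>();

    SectorAssigner(int capacityPerSector) {
        if (capacityPerSector <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacityPerSector = capacityPerSector;
    }

    /** Returns the player's sector, assigning one on first contact. */
    int sectorFor(String playerId) {
        Integer known = playerSector.get(playerId);
        if (known != null) {
            return known; // assignments are stable
        }
        int target = -1;
        for (int i = 0; i < sectorLoad.size(); i++) {
            int load = sectorLoad.get(i);
            boolean hasRoom = load < capacityPerSector;
            if (hasRoom && (target < 0 || load < sectorLoad.get(target))) {
                target = i; // least-loaded sector with free capacity so far
            }
        }
        if (target < 0) {
            sectorLoad.add(0); // every sector is full: open a new one
            target = sectorLoad.size() - 1;
        }
        sectorLoad.set(target, sectorLoad.get(target) + 1);
        playerSector.put(playerId, target);
        return target;
    }

    /** Number of sectors (and therefore servers) currently in use. */
    int sectorCount() {
        return sectorLoad.size();
    }
}
```

With a capacity of 2, players `a` and `b` land in sector 0, player `c` opens sector 1, and asking for `a` again returns sector 0.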
At TSG we have a set of common services (the TSG bar) supporting product development, like our A/B testing tool Absinthe and others such as Rum, Whiskey, Martini, Tequila, and Vodka, which handle support tools, promo codes, machine learning and more.
When building an architecture like a game server, the priority has to be the functionalities, which at first glance seem too tightly coupled to split into separate domains. However, they can be divided easily and logically into modules with the help of tools like Maven or ArchUnit.
Game models also bring about many challenges. Imagine that the developers want to add a new object or event. How do we use our servers to communicate that to the players?
Some more challenges related to the game model include:
You have to remember that due to the global scale of our product, our infrastructure has to be available 99.99% of the time. We also want to reduce how often players need to download a new version from the store, since that can impact retention. The solutions we use are meant to allow us to constantly add new content to the game, while reducing the number of updates.
The diagram below shows how we can add new features to the game. It’s worth noting that our goal is to minimize the number of updates, while constantly adding new content to the game, potentially with every session.
If a new functionality requires a new version of the code, we wait for players to install the update. This can take up to 24 hours, so we have to be able to support two versions at the same time. If we’re adding new game model content, e.g. a new event, we prompt the players to restart their session. If it’s a small matter (like a new version of a translation), we don’t prompt the players, and the change is applied silently the next time the player launches the game.
Mobile games seem simple, but as you can see, the technological challenges make this a fully professional branch of the IT sector, one where we follow the market and create innovative products. What’s more, this branch lets us quickly validate whether a solution is working, and it gives us a lot of insight into our players.
Is it easy to enter game development? It’s not really that hard, and from a backend developer’s perspective, you have to look at it as another domain in which we work and which has its specific rules and work methodology.
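The three cases above can be summarized in a short sketch. The enum names and the integer version numbers are hypothetical; the code only restates the rules from the previous paragraph and is not taken from our servers.

```java
final class ContentUpdatePolicy {

    /** The kinds of change described above. */
    enum ChangeKind { CODE, GAME_MODEL, MINOR_ASSET }

    /** What the client is asked to do for each kind of change. */
    enum ClientAction { WAIT_FOR_STORE_UPDATE, PROMPT_SESSION_RESTART, APPLY_ON_NEXT_LAUNCH }

    static ClientAction actionFor(ChangeKind kind) {
        return switch (kind) {
            // New code ships through the store; the player installs it when ready.
            case CODE -> ClientAction.WAIT_FOR_STORE_UPDATE;
            // New game model content (e.g. an event) needs a fresh session.
            case GAME_MODEL -> ClientAction.PROMPT_SESSION_RESTART;
            // Small changes (e.g. a translation) are picked up silently on next launch.
            case MINOR_ASSET -> ClientAction.APPLY_ON_NEXT_LAUNCH;
        };
    }

    /**
     * While an update rolls out, the server accepts both the latest client
     * version and the one before it. Versions are modeled as plain integers here.
     */
    static boolean isSupported(int clientVersion, int latestVersion) {
        return clientVersion == latestVersion || clientVersion == latestVersion - 1;
    }
}
```

For example, `isSupported(41, 42)` is true during a rollout of version 42, while a client still on version 40 would be rejected.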
For more on the truths and myths about the industry, read our post Mobile games – truths and myths about the mobile gamedev industry.
And if you want to see for yourself what it’s like to work in gamedev, check out our current job openings.
Damian Dudek, Backend Development Lead – professionally creating IT systems for the gaming landscape, Damian credits his Maths degree with laying the foundations for the craft he’s been applying in the IT world for the past 10 years.