Jump to content
MakeWebGames

Scaling your Game


Zeggy

Recommended Posts

This is an article on how to code your game so it can scale to very large audiences. The techniques described in this article can be applied in any other programming language, I just chose PHP as an example since most people here will know it. The article describes two techniques to scale large amounts of data writes, and large amounts of data reads.

Sharding - heavy writing

Sharding is a technique used to handle frequent writes. While a database can contain large amounts of data, writing to a particular piece of data in the database is limited to x number of writes per second. While one row is being written, nobody else can write to that row at the same time. (Most storage engines on mysql lock the entire table, so sharding across rows won't help in any way. If you want to shard on myisam, you'll need to do it across different tables. An example of row level locking in mysql is InnoDB, which is what is discussed below.)

What type of data should sharding be used for?

Data that is updated very frequently (> x writes per second). For example, a specific player's level and stats should NOT need to be sharded since a player's actions are limited. What could be sharded is a counter for total page views, or total number of battles performed, etc. - data that might be updated by many players at the same moment.

What is sharding?

An example of how you would normally count the total number of page views would be something like this:

mysql_query("UPDATE `game` SET `count`=count+1 WHERE `key`='pageviews' ");

 

With sharding, instead of having the page view count stored in a single row, you split it among many rows. This way you can spread the write load among several rows:

 

$shards = 50;

function update_pageview() {
 //Pick a random shard to write to
 $shard = 'pageview_' . rand(0, $shards - 1);
 $shard_exists = mysql_num_rows(mysql_query("SELECT `count` FROM `pageviews` WHERE `key`='$shard' "));
 if ($shard_exists == 1)
   mysql_query("UPDATE `pageviews` SET `count`=count+1 WHERE `key`='$shard' ");
 else
   mysql_query("INSERT INTO `pageviews` (`key`, `count`) VALUES('$shard', 1) ");
}

function get_pageviews() {
 //Get the total count of every shard in the table
 $result = mysql_fetch_array(mysql_query("SELECT SUM(`count`) AS `count` FROM `pageviews` "));
 return $result['count'];
}

 

The code defines two functions you use for updating the pageview counter, and for getting the total number of page views. Whenever you update the pageview counter, it randomly chooses a shard to update, and writes it to the database. When you get the total number of pageviews, it will calculate the sum of counts in every shard in the table.

This way you spread the write load among your different shards, letting you handle far more writes to a particular piece of data. Like I said above, you will need to spread the load between different tables instead of rows if you are using table level locking, but I'm not sure how effective it is as you can't select the sum of data in different tables. Again, I only chose mysql because most people here are familiar with it, but it doesn't work so well as an example (table level locking is pretty fast). If you are using a different database that will benefit hugely from sharding, then you can apply the idea in whatever language and database.

 

Memcached - heavy reading

Memcached is an api to store data directly on your server's memory, available in many programming languages (http://code.google.com/p/memcached/wiki/Clients). Getting data from memory is much, much faster than fetching data from a database. Ideally you could store lots of game data in memory, however the data you store in memory is

a) temporary, not persistent - if your server restarts, all data is lost

b) not always available - if memory is needed for other applications on your server, data can be overwritten and lost, so your game must never depend on that data to be available. Having a backup source for the data is important (most often it's your database).

which is why it should be used a cache instead of permanent storage.

The ideal type of data to store with memcached is data that is read very often, but written less frequently (< 1 write per minute for example).

Before you read on, not every server has memcached installed. You will need to install it on your server first, and then install the PHP memcache package (there are two available, see the above link).

Here's an example of using memcache: http://uk.php.net/manual/en/memcache.examples-overview.php

However, for it to be useful for your game, you'll want to define your own functions for fetching certain data through memcache. Let's take the player's profile data for an example:

 

function get_player_profile($player, $memcache) {
 $key = 'profile_' . $player;
 $profile = $memcache>get($key); //Get from memcache
 if ($profile === false) //Memcache data doesn't exist?
   $profile = set_player_profile($player, $memcache);
 return $profile;
}

function set_player_profile($player, $memcache) {
 $key = 'profile_' . $player;
 $timeout = 600; //Number of seconds to cache the data
 $profile = mysql_fetch_array(mysql_query("SELECT * FROM `profiles` WHERE `player_id`=$player"));
 $memcache->set($key, $profile, false, $timeout); //Save data to memcache so it can be used in future calls to get_player_profile()
 return $profile;
}

 

We defined a function to get the player profile data from memcache. The function checks if the memcache data exists (because remember, you cannot depend on the data to be there). If it doesn't, then it makes a call to set_player_profile. Set_player_profile grabs the profile data from the database, then adds it to memcache.

I gave that particular piece of data a 10 minute time-out since it's unlikely that the player's profile data will have changed in the next 10 minutes. You can change the timeout to whatever you feel is appropriate. A shorter timeout means players will see profile changes much quicker, but in worst case you'll be making more database queries. A longer timeout means you'll make less database queries, profiles load faster in general, but profile updates will take a while to appear until the old data expires and memcache is updated.

Personally, I've used memcache to store data for as little as 60 seconds for fast changing data, and as long as a week, for data that will rarely change. I also use a slightly different updating mechanism that ensures my memcache data is usually up to date with the database, so I don't suffer any delays in seeing updates, but I'm not writing about that here.

Examples of memcache uses

Long term: template files, translation strings, item data, mail, map tiles/info

Short term: profile data, inventory, user logs, statistics

These examples might not apply to every game, it really just depends on your game mechanics. What data you store and for how long depends on the situation, and you should decide for yourself.

 

Finally

You can combine the two techniques I described above - sharding with memcache (good), and memcache with sharding (not so useful). If I think of other techniques used for scaling, I'll update my post here or make a new post. Feel free to ask any questions! :)

If you want to see some sample code that I actually use in my game, I can post some here but it's in python and uses a non relational database.

  • Thanks 1
Link to comment
Share on other sites

  • 10 months later...

Ok.. I dont think this question is worth a new thread, so Ill stick it in here.

So is there internal limits in MySQL that limits how much request per given time frame it can process? Or is it only limited by your hardware??

By the way.. Thanks for this tutorial! Really nice.

First I read it about a month ago, and understand nothing lol Now I read it again, and it all makes perfect sense.. So i suppose good indicator that I am making good progress in PHP/SQL lol

Link to comment
Share on other sites

It's limited by the hardware, the mysql soft itself, the engine you choose for the storage AND your own database design. From my own experience I must say MySQL with MyIsam is incredibly fast for INSERT / UPDATES but quiet slow for complex queries. There Oracle databases are generally better (by complex queries: group, sub queries and more). I saw somewhere that DB2 offers also very good performances but never worked with it.

However, none of the databases will work with bad designed tables, not optimized settings, and bad code. So most of the time you can more than double the speed of your queries by making a good design and optimizing the settings of the DB.

Link to comment
Share on other sites

Thanks for detailed reply Bertrand. I should probably avail of that code checking service you offer (Well I remember post or article about it, but cannot find it anywhere anymore). Is it still available? Can you give me a link to that article, or give some info here?

Link to comment
Share on other sites

  • 2 years later...

Question, why not write and update data into memcache, and use the sql database as a back end backup? What I was thinking was, do everything in memcached, and backup data to sql on an hourly or semi hourly basis (one giant query is better than a million small ones) worst case scenario server crashes players lose an hour of their game, on the other the game will handle more players more smoothly and would save server costs. Premium related actions can be handled purely by sql so that money would not be in jeopardy.

I also think if the game is for example an RPG game, and you only keep active players data in memcached, you might be able to fit the whole active part of the database in your memory, if each player is worth 100 KB of data, and you have 1000 players active at one time... so um yeah, I hope to get a reply on this.

Link to comment
Share on other sites

First of all you are posting on an old old thread... so maybe it would be smarter to open a new one? What do you think?

Second, why try to re-implement things when all DB I knows do have caches, so if you query 2x the same stuff it will not need to read back all again. Trying to make your own cache for things where a DB could do it better... not sure it makes sense.

Personally for high load, I would avoid PHP, go to Java or C# which both would allow you to handle all the caching as you may there very well keep threads and objects in ram between page loads. That would be the first step in the right direction => use something more professional that PHP, and think carefully what you need to do.

Link to comment
Share on other sites

I know it's old, but I also noticed it was a sticky, so I figured it's ok... And I don't have much expertise in databases and memcache, I am talking out of pure research. But I have read that writes on DB are a major bottleneck and that you would be making like 20 writes per second at best, while memcached can do thousands. And in a game, the user might change the same variable several times in under 10 min(lets say we were keeping track of his life and he attacks 2 players within 10 min etc...) I ran some basic tests on memcached with php, I was very impressed with the way it can modify data and the speed of it.

I never considered any other language than php... Guess I should look into that. I thought java as a language was too slow and heavy to be used to run a game server.

Link to comment
Share on other sites

of course memcache is fast => it's basically a memory table where you can store whatever you want... But that will not make your game persistent in case you reboot the memcache would be lost (at least as far as I understand it). So it's not meant to store data, instead it's meant to keep temporary things you access often. Memory tables in mysql could do basically the same trick, again at reboot you loose all.

So how you can solve that?

1) Avoid to modify too mucj

2) Make sure modifications / db access is as fast and simple as possible

3) Use a smart cache which do store after some time

Link to comment
Share on other sites

DB for writes is fine if you have 50 to 100 online, but if it gets more than that you have to take the gamble, you still have sql, but sql would be a backup, a little behind what is actually going on. You really believe twitter and facebook store every single thing on hard drive when it comes out? They take a gamble to save billions, sure some data might be lost, as long as it is nothing very vital and backups are not very old, nothing bad will really happen...

Link to comment
Share on other sites

SQL? SQL is just a language to access a database. And you can handle more than 50-100 people online with a database. Also, you could try MongoDB which works differently than MySQL and may react faster for what you need.

And I'm sure twitter and facebook store every modifications. Yes. BTW if you know all, why don't you go forward and use memcache if you are so convinced it is better?

Link to comment
Share on other sites

I don't know all. In fact as I mentioned I never made a game before so I don't know anything :P which is why I am here discussing and learning and I thank you for the valuable info. The thing is just that I want to make a small game (it has always been my dream :P) and I cannot afford to pay for than 70$ a month for a server and I was checking to see any way that I can squeeze every dollar from that :P

Link to comment
Share on other sites

2 GB of ram and dual core is what I am expecting for that kind of money, my friend uses gogrid for most of his online stuff and he trusts them a lot. I'm inclined to go with that. I am not worried about the actual game size as I am worried about how many players might potentially be on it... given my 0 experience in hosting a game, I find it difficult to predict/estimate server usage pattern. I'm making it on android phones as an app. It might get better visibility, competition is slightly less fierce, and you get the benefit of writing native code that can do its own calculations and store its own graphics, saving you server energy(of course you'll be checking with the server before every action, but not necessarily checking every time the player clicks his inventory to remember what item he has... etc) anyway I should start coding it already :D talk is cheap.

Link to comment
Share on other sites

  • 1 year later...

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...

×   Pasted as rich text.   Paste as plain text instead

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

×
×
  • Create New...