Sunday, 23 September 2012

The myth of scalable systems

This week, Prof Ben invited a special guest speaker from the NUS School of Computing, Mr Lai Zit Seng. He is currently an IT Architect, and his talk was about building scalable web apps, which is very interesting to me. I often wonder how web apps like Facebook and Twitter handle all their traffic. Of course, they cannot run on a single computer with a plain LAMP (Linux, Apache, MySQL, PHP) stack.
Website performance is not a simple problem that can be covered fully in a few hours, but thanks to Mr Lai Zit Seng, I did manage to get the big picture. Below are the 3 most important things I learnt from his talk:
1) Optimizing the database: I have never taken a module on databases, but I have some experience with them (mostly from 2 CS3216 assignments). The databases I have created were all too small for speed to be a concern. Now think of a social network like Facebook, where billions of activities happen every hour and all of them require database access (reads or writes). It should be no surprise that a site as high-scale as Facebook uses a variety of data management technologies. Each database product has its strengths, and Facebook needs all of them. They have also changed their data management from time to time as they found solutions that better met their needs. Other sites may use one or two techniques, such as a master-slave or master-master architecture, a clustered DB, a sharded DB, or NoSQL (not only SQL). The type of table also matters. It is not true that you should index every table: an index helps read queries a lot, but writes become slower because the indexes have to be updated on every write.
2) Memcache: Clever use of memcache will make your web app a lot faster. The reason is that disk I/O is generally slow compared with memory access (memory may be more than 100 times faster). But the amount of data you can cache on the server is quite limited. In addition, memcache comes with its own problems. It is only suitable for content that is frequently read and rarely updated, and it usually adds quite a lot of complexity to your system.
3) Web server architecture: Okay, what we are taught in CS3216 is that a single server does it all. But that is not usually the case for a real killer app. When your app grows big enough, you will feel the need for a tiered architecture (separate web server, application server, and database server). And if your website gets even bigger? It's time to use web server acceleration and a load balancer.
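To make the indexing trade-off from point 1 more concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `posts` table, the `idx_posts_user` index, and the `query_plan` helper are all made up for illustration; a real site would use a different database, but the idea should carry over: without an index the lookup scans the whole table, and with one the database can jump to the matching rows.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return SQLite's human-readable plan for a query as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each plan row holds the textual description.
    return " | ".join(row[-1] for row in rows)


# Hypothetical table: 10,000 posts spread over 1,000 users.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE posts (id INTEGER PRIMARY KEY, user_id INTEGER, body TEXT)"
)
conn.executemany(
    "INSERT INTO posts (user_id, body) VALUES (?, ?)",
    [(i % 1000, "post %d" % i) for i in range(10000)],
)

lookup = "SELECT body FROM posts WHERE user_id = ?"
plan_before = query_plan(conn, lookup, (42,))

# Adding the index speeds up this read, but every later INSERT or UPDATE
# on user_id now has to maintain the index as well.
conn.execute("CREATE INDEX idx_posts_user ON posts (user_id)")
plan_after = query_plan(conn, lookup, (42,))

print("without index:", plan_before)
print("with index:   ", plan_after)
```

The exact wording of the plan differs between SQLite versions, but the second plan should mention `idx_posts_user` while the first one should not.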
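The caching idea from point 2 can be sketched in the same spirit. The class below is not the memcache API; it is a hypothetical `CachedStore` that uses plain Python dicts to stand in for both the cache and the slow database, just to show the usual pattern: read from the cache first, fall back to the database on a miss, and throw the cached copy away when the data is updated.

```python
class CachedStore:
    """Read-through cache in front of a slow store (both are dicts here)."""

    def __init__(self, database):
        self.database = database  # stand-in for a slow, disk-backed database
        self.cache = {}           # stand-in for an in-memory cache
        self.db_reads = 0         # counts how often the slow path is taken

    def get(self, key):
        # Fast path: serve from memory if the value is already cached.
        if key in self.cache:
            return self.cache[key]
        # Slow path: read from the database, then remember the result.
        self.db_reads += 1
        value = self.database[key]
        self.cache[key] = value
        return value

    def set(self, key, value):
        # Write to the database and invalidate the cached copy so the
        # next read does not return stale data.
        self.database[key] = value
        self.cache.pop(key, None)


store = CachedStore({"user:1": "Alice"})
store.get("user:1")            # miss: goes to the database
store.get("user:1")            # hit: served from the cache
store.set("user:1", "Alicia")  # update: cached copy is dropped
store.get("user:1")            # miss again, because of the update
print("database reads:", store.db_reads)
```

This also hints at why frequently updated content is a poor fit: every write forces the next read back to the database, so the cache rarely gets to help.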
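For the load balancer in point 3, the talk did not go into a specific algorithm as far as I can tell, so the sketch below assumes the simplest one, round robin: requests are handed to the servers in turn. The `RoundRobinBalancer` class and the server names are hypothetical.

```python
import itertools


class RoundRobinBalancer:
    """Hand out backend servers one after another, in a fixed cycle."""

    def __init__(self, servers):
        servers = list(servers)
        if not servers:
            raise ValueError("at least one server is required")
        self._cycle = itertools.cycle(servers)

    def pick(self):
        # Each call returns the next server, wrapping around at the end.
        return next(self._cycle)


balancer = RoundRobinBalancer(["web1", "web2", "web3"])
for request_id in range(5):
    print("request", request_id, "->", balancer.pick())
```

A real load balancer would also have to notice dead servers and possibly weigh them by capacity, which this sketch ignores.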
In summary, designing a scalable system is a hard problem. There seems to be no complete solution, and for each solution you apply, you need to take into account its pros and cons. This site is a very good reference for those who want to learn more about this topic: http://highscalability.com. You can find many success stories there, like YouTube, Facebook, Twitter ….
