York's Blog

A Startup's Server Architecture


I have worked at EZTABLE for three years. The company is quite successful in Asia, and the engineering team grew from two people to almost fifty. There is no dedicated system administrator or DevOps engineer, so I spent about 5% of my time on DevOps work.

The following are some notes I took on the server architecture and the components used. Although not perfect, the setup works and actually generates revenue.

Servers

AWS EC2. Keep most instances in us-east-1d to reduce data transfer fees between availability zones. Keep one DB slave in us-east-1b to recover from an availability zone outage.

Currently not using VPC, which can cause performance and security issues. Try to move to VPC in the future.

Shared File System

AWS S3.

If random access is needed, use NFS.

If cheap data archiving is needed, use AWS Glacier.

DNS

Currently using GoDaddy. Try to migrate to Route 53 for better control.

Content Delivery Network (CDN)

AWS CloudFront. SSL support for CloudFront will cost you $600 USD per year. To avoid that cost, set the CDN host in the configuration file and reference static files with a protocol-relative URL such as the following, which works over both HTTP and HTTPS.

<img src="//d1gpbxqmt7wq2i.cloudfront.net/image.jpg" />

This can be done in AssetPipeline to support both local development and production.
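As a rough illustration of the idea, here is a minimal Python sketch of a helper that picks the asset host per environment. The function names and the local `/assets/` prefix are assumptions made for this example; the actual AssetPipeline code is not shown in these notes.

```python
# Hypothetical helper: names and the local "/assets/" prefix are illustrative only.
CDN_HOST = "d1gpbxqmt7wq2i.cloudfront.net"  # CloudFront domain from the example above


def asset_url(path: str, production: bool) -> str:
    """Return the URL for a static asset.

    In production the URL is protocol-relative (it starts with //), so the
    browser reuses the scheme of the current page, http or https.
    In local development the asset is served by the application itself.
    """
    clean_path = path.lstrip("/")
    if production:
        return f"//{CDN_HOST}/{clean_path}"
    return f"/assets/{clean_path}"


def image_tag(path: str, production: bool) -> str:
    """Render an <img> tag that points at the right host for the environment."""
    return f'<img src="{asset_url(path, production)}" />'
```

Because the host lives in one constant, switching the CDN domain only requires a configuration change, not a template change.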

If we really need SSL support with a custom domain name, use Nginx as a reverse proxy for S3 static hosting.

Legacy designs use static-host.eztable.com as the origin for CloudFront. Newer designs such as ImageService use S3 as the origin. Try to use S3 as the origin as much as possible to ease deployment.

Cluster, Data Processing

AWS Elastic MapReduce. One medium instance each for the MASTER and CORE groups, plus an arbitrary number of large spot instances, is enough for the current data scale.

Email

SMS

Database

MySQL, Percona distribution. Use the InnoDB storage engine to support master-master replication in the future. If professional support is really needed, buy Percona's service.

AWS RDS would be a second choice, since it costs much more for the same functionality.

For scaling issues, try to apply the following solutions:

  • Consider redesigning the data model, or not using a database at all.
  • High-IO EBS volumes and RAID 10.
  • A more powerful instance type.
  • Table partitioning.
  • Separate servers for vertically split databases.
  • Table sharding.
  • A NoSQL solution like MongoDB, HBase, or AWS DynamoDB, after careful analysis.
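
To make the table sharding option more concrete, here is a minimal Python sketch of hash-based shard routing. The shard host names and the key format are hypothetical; we have not actually sharded anything yet, so treat this as one possible approach rather than a description of our setup.

```python
import hashlib
from typing import Sequence

# Hypothetical shard hosts; the real topology is not described in these notes.
SHARDS = ("db-shard-0", "db-shard-1", "db-shard-2", "db-shard-3")


def shard_for(key: str, shards: Sequence[str] = SHARDS) -> str:
    """Pick a shard by hashing the key.

    A stable hash (MD5 here) is used instead of Python's built-in hash(),
    which is randomized per process for strings and would route the same
    key to different shards after a restart.
    """
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return shards[int(digest, 16) % len(shards)]
```

Note that plain modulo routing moves most keys to a different shard when the number of shards changes, so the shard count has to be planned ahead or replaced with something like consistent hashing.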

Cache, Memory Storage

Redis

Queue

php-resque

Search

InnoDB does not support full-text search (before MySQL 5.6).

Solr. Currently using 3.2, although the latest stable version is 4.0. We could not migrate because the native PHP extension does not support 4.x. Migrating to the Solarium client would solve the problem.

Log

Scribe

Scribe has not been actively maintained in recent years. However, it is still a solid choice (Facebook uses the same code on its production servers). Make sure the scribed process on job001 is always alive; otherwise the hard disks of the buffer servers will fill up.
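
A simple watchdog can cover both failure modes. The following is a minimal Python sketch, not what we actually run: it assumes a Linux host with `pgrep` available, and the function names and the 80% threshold are made up for the example. The process check would run on job001 and the disk check on each buffer server.

```python
import shutil
import subprocess


def is_process_running(name: str) -> bool:
    """Return True if a process with exactly this name exists.

    Assumes the pgrep command is installed; pgrep exits with 0 when at
    least one process matches.
    """
    result = subprocess.run(["pgrep", "-x", name], stdout=subprocess.DEVNULL)
    return result.returncode == 0


def disk_used_fraction(path: str) -> float:
    """Return the used fraction (0.0 to 1.0) of the disk that holds path."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total


def disk_is_too_full(path: str, max_used: float = 0.8) -> bool:
    """Return True if the disk holding path is fuller than max_used."""
    return disk_used_fraction(path) > max_used
```

Running something like `is_process_running("scribed")` from cron and sending an alert when it returns False would catch a dead scribed before the buffers grow.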

Flume might be a better choice, since it is actively maintained and can be integrated with many other components.

Node.js Socket.io Server

Combined with Redis pub/sub, this provides us with solid real-time messaging.

Load Balancer

Nginx

Web Server

  • PHP: Apache2 with mod-php5
  • Static Files: Nginx, S3, CDN
  • Node.js

You might want to use php-fpm to replace Apache2 with mod-php5 for better performance.
