Thomaschaaf

Reputation: 18196

How to get Google-like speeds with PHP?

I am using PHP with the Zend Framework, and database connects alone seem to take longer than the 0.02 seconds Google takes to do a query. The weird thing is that today I watched a video that said Google connects to 1000 servers for a single query. Given network latency, I would expect one server per query to be more efficient than having multiple servers in different datacenters handling it.

How do I get PHP, MySQL and the Zend Framework to work together and reach equally great speeds?

Is caching the only way? How do you optimize your code to take less time to "render"?

Upvotes: 5

Views: 1177

Answers (10)

macbirdie

Reputation: 16203

By default, PHP scripts are interpreted every time they are called by the HTTP server, so every request triggers script parsing and compilation by the Zend Engine. You can get rid of this bottleneck with an opcode cache such as APC. It keeps the compiled script in memory and reuses it for all subsequent requests. Gains are often significant, especially in PHP apps built on sophisticated frameworks like ZF.
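The compile-once idea can be sketched outside PHP. The following is a minimal illustration in Python, not how APC is implemented: `load`, `run`, `_compiled` and `compile_count` are hypothetical names, and the cache is keyed on the file's modification time so that a changed source file is recompiled.

```python
import os
import tempfile

# Hypothetical in-process cache: path -> (mtime, compiled code object).
_compiled = {}
compile_count = 0


def load(path):
    """Return compiled code for path, recompiling only if the file changed."""
    global compile_count
    mtime = os.stat(path).st_mtime_ns
    entry = _compiled.get(path)
    if entry is None or entry[0] != mtime:
        with open(path) as f:
            code = compile(f.read(), path, "exec")
        compile_count += 1
        _compiled[path] = (mtime, code)
    return _compiled[path][1]


def run(path):
    """Execute the cached code in a fresh scope, as one 'request'."""
    scope = {}
    exec(load(path), scope)
    return scope["result"]


# Write a tiny script to a temporary file and serve it five times.
fd, script_path = tempfile.mkstemp(suffix=".py")
with os.fdopen(fd, "w") as f:
    f.write("result = sum(range(10))\n")

outputs = [run(script_path) for _ in range(5)]
os.remove(script_path)
```

Five requests are served, but the source is parsed and compiled only once; every later request pays only for execution.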

By default, every request opens a new connection to the database, so you should use some kind of database connection pooling or persistent connections (which don't always work, depending on the HTTP server/PHP configuration). I have never tried, but maybe there's a way to use memcache to keep database connection handles.

You could also use memcache to keep session data, if it is used on every request. Its persistence is not that important, and memcache makes access very fast.

The actual "problem" is that PHP works a bit differently from other frameworks, because it works in an SSI (server-side includes) way: every request is handled by the HTTP server, and if it requires running a PHP script, the interpreter is initialized and the scripts are loaded, parsed, compiled and run. This can be compared to getting into the car, starting the engine and driving 10 meters.

The other way is, let's say, the application-server way, in which the web application itself handles the requests in its own loop, always sharing database connections and not initializing the runtime over and over. This solution gives much lower latency. It can be compared to already sitting in a running car and using it to drive the same 10 meters. ;)

The caching/precompiling and pooling solutions above are the best way to reduce the init overhead. PHP/MySQL is still an RDBMS-based solution though, and there's a good reason why BigTable is, well, just a big, sharded, massively distributed hashtable (a bit of an oversimplification, I know). Read up on High Scalability.

Upvotes: 2

vartec

Reputation: 134631

  • APC code caching;
  • Zend_Cache with APC or Memcache backend;
  • CDN for the static files;

Upvotes: 0

Toby Hede

Reputation: 37133

Google have a massive, highly distributed system that incorporates a lot of proprietary technology (including their own hardware, and operating, file and database systems).

The question is like asking "How can I make my car be a truck?" and is essentially meaningless.

Upvotes: 1

Baishampayan Ghose

Reputation: 20666

There are many techniques that Google uses to achieve the amount of throughput it delivers. MapReduce, Google File System, BigTable are a few of those.

There are a few very good Free & Open Source alternatives to these, namely Apache Hadoop, Apache HBase and Hypertable. Yahoo! is using and promoting the Hadoop projects quite a lot and thus they are quite actively maintained.

Upvotes: 8

James Brady

Reputation: 27492

I am using PHP with the Zend Framework, and database connects alone seem to take longer than the 0.02 seconds Google takes to do a query.

Database connect operations are heavyweight no matter who you are: use a connection pool so that you don't have to initialise resources for every request.
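A connection pool can be sketched in a few lines. This is only an illustration under stated assumptions: it is written in Python rather than PHP, an in-memory SQLite database stands in for MySQL, and `ConnectionPool`, `connect` and `handle_request` are hypothetical names, not part of any framework.

```python
import queue
import sqlite3
from contextlib import contextmanager


class ConnectionPool:
    """Open a fixed number of connections once and lend them out on demand."""

    def __init__(self, size, connect):
        self._idle = queue.Queue(maxsize=size)
        self.opened = 0
        for _ in range(size):
            self._idle.put(connect())
            self.opened += 1

    @contextmanager
    def connection(self):
        conn = self._idle.get()  # blocks while every connection is in use
        try:
            yield conn
        finally:
            self._idle.put(conn)  # hand it back instead of closing it


def connect():
    # Stand-in for an expensive network connect to a real database server.
    return sqlite3.connect(":memory:", check_same_thread=False)


def handle_request(pool):
    """One simulated request: borrow a connection, run a query, return it."""
    with pool.connection() as conn:
        return conn.execute("SELECT 1 + 1").fetchone()[0]


pool = ConnectionPool(size=2, connect=connect)
results = [handle_request(pool) for _ in range(100)]
```

One hundred requests are served with only two connects, so the connect cost is paid at startup rather than on every request.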

Performance is about architecture, not language.

Upvotes: 6

spoulson

Reputation: 21591

Memcached is a recommended solution for optimizing storage/retrieval in memory on Linux.

Upvotes: 2

Coltin

Reputation: 3794

A while ago Google decided to put everything into RAM.

http://googlesystem.blogspot.com/2009/02/machines-search-results-google-query.html

If you never have to query the hard drive, your response times will improve significantly. Caching helps because you don't query the hard drive as often, but you still do whenever there is a cache miss (unless you mean opcode caching for PHP, where the program is only recompiled when the source has been modified).
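The hit/miss behaviour described here can be made concrete with a small cache-aside sketch. It is written in Python for illustration only: a plain dictionary stands in for memcached or RAM, and `CacheAside` and `slow_lookup` are hypothetical names.

```python
class CacheAside:
    """Check the in-memory cache first; fall back to the slow store on a miss."""

    def __init__(self, load_from_store):
        self._load = load_from_store
        self._cache = {}
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self._cache:
            self.hits += 1
            return self._cache[key]
        # Cache miss: pay for the slow read once, then remember the result.
        self.misses += 1
        value = self._load(key)
        self._cache[key] = value
        return value


def slow_lookup(key):
    # Stand-in for a disk or database read.
    return key.upper()


cache = CacheAside(slow_lookup)
values = [cache.get(k) for k in ["a", "b", "a", "a", "b"]]
```

Only the first request for each key reaches the slow store; the three repeated requests are answered from memory.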

Upvotes: 5

krosenvold

Reputation: 77191

According to the link supplied by @Coltin, Google's response times are in the region of 0.2 seconds, not 0.02 seconds. As long as your application has an efficient design, I believe you should be able to achieve that on a lot of platforms. Although I do not know PHP, it would surprise me if 0.2 seconds were a problem.

Upvotes: 0

Julien

Reputation: 525

If it's for a search engine, the bottleneck is the database, depending on its size.

To speed up full-text search on a large data set, you can use Sphinx. It can be configured to run on one or multiple servers. However, you will have to adapt your existing querying code, as Sphinx runs as a search daemon (client libraries are available for most languages).

Upvotes: 1

jonstjohn

Reputation: 60276

It really depends on what you are trying to do, but here are some examples:

  • Analyze your queries with EXPLAIN. In your dev environment you can output your queries and their execution times at the bottom of the page. Reduce the number of queries and/or optimize those that are slow.

  • Use a caching layer. It looks like Zend can be memcache-enabled. This can potentially speed up your application greatly by sending requests to the ultra-fast caching layer instead of the db.

  • Look at your front-end loading time. Use Yahoo's YSlow add-on for Firebug. Limit HTTP requests, set far-future Expires headers to cache JS, CSS and images, etc.

You can get lightning speeds on your web app, though probably not as fast as Google, if you optimize each layer of your application. Your db connect times are probably not the slowest part of your app.
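As a rough illustration of the first point, the sketch below inspects a query plan before and after adding an index. It makes several assumptions: it uses Python with an in-memory SQLite database in place of MySQL, SQLite's `EXPLAIN QUERY PLAN` in place of MySQL's `EXPLAIN` (whose output format differs), and a hypothetical `users` table and `plan` helper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
conn.executemany(
    "INSERT INTO users (email) VALUES (?)",
    [(f"user{i}@example.com",) for i in range(1000)],
)


def plan(conn, sql, params=()):
    """Return the planner's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[3] for row in rows)  # the detail text is column 4


query = "SELECT id FROM users WHERE email = ?"

# Without an index the planner has to scan the whole table.
before = plan(conn, query, ("user42@example.com",))

# With an index on the filtered column it can search the index instead.
conn.execute("CREATE INDEX idx_users_email ON users (email)")
after = plan(conn, query, ("user42@example.com",))

print("before:", before)
print("after: ", after)
```

The exact wording of the plan depends on the SQLite version, but the first plan reports a table scan and the second names the new index.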

Upvotes: 3
