
Big Server Upgrade Lets Us Increase Index Size and Paves Way for New Algorithms and Public API

Friday, January 22nd, 2010

Yesterday I completed the promised server upgrades. The main server moved from a Linode 720 to a Linode 1080, and I added a new Linode 360 server that will host the API (I’ll tell you more about that later) and handle special data processing outside the normal indexing cycle.

So what does this mean for you?

The main thing you’ll notice is that the index will get bigger. I can’t tell you by how much, as that is impossible to predict, but I really hope we will be able to reach 300,000 pages with this setup.

There are two things that will allow the index to grow:

1. With more RAM (1080MB instead of 720MB), updating the index database will be quicker, letting the robot index more pages in a month.

2. As there are fewer VPS nodes sharing the same server, there should be more disk and CPU cycles available to us. This is not set in stone though, as it depends a lot on the usage profile of the other nodes, and I don’t know that yet.

The index will grow, great. What about that other server then?

The other server, the Linode 360, will be used to host the new API/Feed that I will announce soon. I will make a feed of the search results available for free, letting you build your own search engine, use it as additional content for your directory, or show it anywhere you want to give your visitors some relevant websites to visit. But more about this later.

The new server’s other mission will be to run calculations on data sets that support the main indexing.

Mission one will be to build a related keywords database, allowing us to find sites about “New York City” when someone writes “NYC”, or “Win XP” when someone writes “Windows XP”.
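
To give a rough idea of what a related keywords lookup could look like, here is a minimal Python sketch. It is an illustration only, not the actual implementation: the group list and the names `SYNONYM_GROUPS`, `build_lookup` and `expand_query` are all hypothetical, and a real database would be computed from data rather than typed in by hand.

```python
# Hypothetical groups of phrases that should be treated as related.
# In practice these would come from the data processing server.
SYNONYM_GROUPS = [
    {"nyc", "new york city"},
    {"win xp", "windows xp"},
]


def build_lookup(groups):
    """Map every phrase to the full group of related phrases it belongs to."""
    lookup = {}
    for group in groups:
        for phrase in group:
            lookup[phrase] = group
    return lookup


def expand_query(query, lookup):
    """Return the query plus any related phrases, as a sorted list."""
    # Normalize case and collapse repeated whitespace before the lookup.
    key = " ".join(query.lower().split())
    return sorted(lookup.get(key, {key}))


LOOKUP = build_lookup(SYNONYM_GROUPS)
```

With this sketch, `expand_query("NYC", LOOKUP)` returns both "new york city" and "nyc", so the search could match pages that use either form. A query with no known relatives simply comes back unchanged.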

I know I promised you an index of 300,000 pages the last time we upgraded servers, but things changed: I implemented a few new algorithms that improve quality but slow down the indexing process.

The main slowdown this year has been the addition of the search cache, a separate database arranged to make search queries ultra fast. You can now expect to get your results in under a second, sometimes two, whereas before the cache searches took anywhere between 4 and 40 seconds, as they had to dig through the main index.
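
The trade-off can be pictured with a small Python sketch. This is only an assumption about the general idea (precompute ranked results per keyword while indexing, so a search becomes a single lookup); the data layout and the names `MAIN_INDEX`, `build_search_cache` and `search` are hypothetical and do not describe the real cache.

```python
from collections import defaultdict

# Hypothetical main index: page URL -> {keyword: relevance score}.
MAIN_INDEX = {
    "example.com/a": {"linode": 0.9, "vps": 0.4},
    "example.com/b": {"linode": 0.5},
    "example.com/c": {"vps": 0.8},
}


def build_search_cache(main_index, top_n=10):
    """Precompute the best pages per keyword. This extra pass costs indexing time."""
    by_keyword = defaultdict(list)
    for url, scores in main_index.items():
        for keyword, score in scores.items():
            by_keyword[keyword].append((score, url))
    # Keep only the URLs of the top_n highest scoring pages for each keyword.
    return {
        keyword: [url for _score, url in sorted(hits, reverse=True)[:top_n]]
        for keyword, hits in by_keyword.items()
    }


def search(cache, keyword):
    """Answer a query with one dictionary lookup instead of scanning the main index."""
    return cache.get(keyword.lower(), [])


SEARCH_CACHE = build_search_cache(MAIN_INDEX)
```

In this sketch the work of scanning and ranking pages moves from query time to indexing time, which is the same kind of trade described above: slower indexing in exchange for faster searches.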

Now let’s just wait and see how much the index grows.

Simon Byholm
CEO and founder,
Secret Search Engine Labs

P.S. If you order a Linode through the links in this post, Linode will give us $20 in free server time.