Beta Testing Is Open, Did You See The Press Release Last Week?

January 12th, 2010

I forgot to blog about it, but in case you didn’t notice, we started the official beta testing period last week, and we even had a press release distributed through PRWeb.

Thanks to the press release we had about 400 extra visitors last Thursday and Friday, most of them making more than the average number of clicks on the site.

The number of visitors isn’t exactly earth-shattering, but nevertheless it was great to have you all here and I hope many of you will be back for more searches.

The beta testing period is planned to run all year and end in a public launch in January 2011, with the target of having over a million relevant pages indexed by then.

I also want to take the opportunity to apologize if you have stumbled upon some less than relevant results. We had a bit of a Cuil experience when, in anticipation of a stampede of visitors from the press release, we added a new cache database to speed up searches.

Yes, the searches are now about 30 times faster, with most searches ready in less than half a second.

The downside is it will take a whole month before the cache is fully populated with data and in the meantime the search results are based only on a subset of the whole index. For some search terms this means junk and garbage from the bottom of the barrel will show up.

Please enjoy the new super quick search results and check out the ping tool in the webmaster tools section.

What is CashRank? For Your Site?

January 12th, 2010

It’s amazing how much junk there is on the Internet. You let the spider loose and it just wanders so deep into a site that you never find it again.

I first made a system with levels to prevent the spider from going more than x link jumps (3 or 4) from the homepage of a site. This wasn’t enough though.
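A level limit like this can be sketched as a breadth-first crawl that stops following links once a page is a set number of jumps from the homepage. This is only an illustration, not the spider's actual code: the function names, the `get_links` callback and the toy site are all made up for the example.

```python
from collections import deque

def crawl_with_depth_limit(homepage, get_links, max_depth=3):
    """Return {url: depth} for every page within max_depth link jumps of homepage.

    get_links is a hypothetical callback that returns the links found on a page.
    """
    depth_of = {homepage: 0}
    queue = deque([homepage])
    while queue:
        url = queue.popleft()
        depth = depth_of[url]
        if depth == max_depth:
            continue  # pages at the limit are kept, but their links are not followed
        for link in get_links(url):
            if link not in depth_of:  # first visit is the shortest path in a BFS
                depth_of[link] = depth + 1
                queue.append(link)
    return depth_of

# Hypothetical toy site where one branch keeps going deeper and deeper.
toy_site = {
    "/": ["/a", "/b"],
    "/a": ["/a/1"],
    "/a/1": ["/a/1/x"],
    "/a/1/x": ["/a/1/x/y"],
    "/a/1/x/y": [],
    "/b": [],
}
reached = crawl_with_depth_limit("/", lambda url: toy_site.get(url, []), max_depth=3)
```

With a limit of 3 the spider reaches `/a/1/x` but never sees `/a/1/x/y`, which is the point of the levels.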

As I have a really small index, I don’t want that many pages from every domain but rather as many different domains as possible, so I invented something I call CashRank, which is a somewhat controversial way to rank pages.

Each page gets a dollar value indicating how much money has to be paid annually to keep that page online. This includes domain registration fees, hosting fees, the cost of an IP address, advertising costs, etc.

A page is only included in the index if it’s worth at least $1/year. This effectively limits auto-generated content that has no value, because no one is paying for its upkeep.

The CashRank is also propagated through links: a page keeps $1 for its own upkeep and then sends $1 through every link until it has used up all of its cash.

This means pages with many inbound links can have more pages in the index and only the first links on a page are counted.
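To make the propagation rule concrete, here is a minimal sketch of how a single page might spend its cash. It assumes whole-dollar values and that links are funded in the order they appear on the page. The function name is made up, and the post does not say what happens to cash that is left over when a page has fewer links than dollars, so the sketch simply reports it.

```python
def spend_cash(cash, links):
    """Return (funded_links, unspent) for one page.

    cash  -- the page's CashRank in whole dollars (an assumption of this sketch)
    links -- the links on the page, in page order
    """
    if cash < 1:
        return [], 0  # not worth $1/year: not indexed, sends nothing
    budget = cash - 1               # $1 stays with the page for its own upkeep
    funded = links[:budget]         # $1 per link, first links first
    unspent = budget - len(funded)  # what happens to this is not specified
    return funded, unspent

# A page worth $4 with five links can only fund the first three.
funded, unspent = spend_cash(4, ["/about", "/shop/", "/blog", "/contact", "/terms"])
```

Here `/contact` and `/terms` get nothing, because the page ran out of cash after the first three links.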

Please give me your opinion on CashRank by leaving a comment below.

Stinky Teddy Lets You Search Gossip Online

December 25th, 2009

I just found out that Stinky Teddy is following me on Twitter.

Stinky Teddy is a real-time gossip search engine which gives your search term a buzz-o-meter score and if the buzz is high enough it will show you the hot news stories relating to your search term.

This search engine is developed by David Hardtke, a “Physicist turned search engine innovator” according to his Twitter profile, and was launched on the 15th of December after half a year of development.

Stinky Teddy is a meta search engine and uses data from Bing, Yahoo, VideoSurf, Twitter, OneRiot and Collecta, which it then mashes up to become gossip results.

I added this one to the big list of search engines.

Simon

Link Counting Problem Fixed

December 25th, 2009

We had a slight problem with counting the final number of links from a page. When new links were discovered and there were several duplicates of the same link (same URL) on the page, we counted every duplicate.

This caused many pages to get a lower number of outlinks approved than what the CashRank would allow.

For example, if the CashRank of a page was 10 and there were 5 duplicates of a link to the /shop/ directory, those duplicates used up 5 of the available slots, so the page would effectively only get 5 links out even though the CashRank would allow 9 links out.
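The difference between the old and the new counting can be sketched like this. It is a reconstruction for illustration only, not the real indexer code: the function names are made up, and the example assumes the duplicate /shop/ links come first on the page.

```python
def approved_links_buggy(cash_rank, links):
    """Old behaviour: every duplicate uses up one of the available link slots."""
    budget = max(cash_rank - 1, 0)  # $1 is kept for the page itself
    return list(dict.fromkeys(links[:budget]))

def approved_links_fixed(cash_rank, links):
    """New behaviour: remove duplicates first, then apply the budget."""
    budget = max(cash_rank - 1, 0)
    unique = list(dict.fromkeys(links))  # keeps the first occurrence, in page order
    return unique[:budget]

# CashRank 10, five duplicate links to /shop/ followed by eight other links.
page_links = ["/shop/"] * 5 + [f"/page{i}" for i in range(1, 9)]
before = approved_links_buggy(10, page_links)
after = approved_links_fixed(10, page_links)
```

In this example `before` holds 5 distinct links and `after` holds 9, which matches the numbers above.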

The link counting is now fixed which should speed up discovery of new sites and allow indexing of more pages from sites that were affected by the bug.

Simon

Thanks to Server Upgrade We Can Now Index Up To 300,000 pages

December 16th, 2009

A couple of days ago we upgraded our VPS from a Linode 360 to a Linode 720, which means we have now doubled our indexing capacity, and you should see steady growth in index size during the coming month.

In reality, even though we doubled the server resources, the indexing speed has, according to the stats, increased from about 6,000 pages per day to around 10,000 pages per day. I believe the reason we don’t see a doubling of indexing speed is that we still run the indexer on a single thread, which means we spend too much time waiting for network transfers and disk access and don’t fully utilize the CPU.
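As a rough illustration of why a single thread hurts here, the sketch below compares fetching pages one at a time with fetching them from a small thread pool. It is not the planned indexer: `fetch_page` is a made-up stand-in that just sleeps to imitate network waiting, and the worker count is an arbitrary choice.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fetch_page(url):
    """Hypothetical stand-in for a download: sleeps instead of using the network."""
    time.sleep(0.05)
    return url, len(url)

def index_sequential(urls):
    """One page at a time: the CPU sits idle during every wait."""
    return [fetch_page(url) for url in urls]

def index_threaded(urls, workers=8):
    """Several waits overlap, so the same work finishes sooner."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fetch_page, urls))  # map keeps the input order

urls = [f"http://example.com/page{i}" for i in range(16)]
sequential_result = index_sequential(urls)
threaded_result = index_threaded(urls)
```

Both versions return the same results in the same order. The threaded one just spends less wall-clock time waiting, which is the kind of gain a multi-threaded indexer would be after.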

Which just makes it that much more important to get the multi-threaded indexer implemented… just need to get some marketing done first.

The migration to the new server went really smoothly. With the system that Linode uses, you just select the upgrade you want from the online interface, and once you have clicked the “Yes” button they instantly and automatically shut down the site and copy everything to the new, bigger system. It all took 12 minutes of downtime plus a couple of minutes to reconfigure MySQL for the larger memory size.

If you’re looking to get a VPS that is easy to upgrade, Linode is a great alternative, and you can support this search engine by signing up for a Linode using this link.

Linode will support us with $20 for every one of you that stays a Linode customer for more than 90 days.

Please tell me what you think about the quality of the search results by leaving a comment below!

Simon Byholm
CEO and founder,
Secret Search Engine Labs

Secret Search Engine Labs Now Have a Blog for News and Announcements

December 9th, 2009

Today I finalized the setup of a WordPress blog for Secret Search Engine Labs.

The new search engine blog will be used to send out news and announcements related to the work I do at Secret Search Engine Labs as well as for discussing new ideas for ranking and filtering data.

Feel free to add your ideas, criticism and praise by commenting on the blog posts. I love feedback. The search engine is meant to serve you in the best possible way, and that will only be possible when you tell me what you expect of the world’s most useful search engine.

Have a great day,

Simon Byholm
CEO and founder,
Secret Search Engine Labs

Test Post 2 for CSS Work

December 9th, 2009

Testing how it looks with two posts in the blog to complete the CSS tuning. Let’s see if I can write a little bit of text here to have something long enough that it looks like a real blog post. The trick is to keep typing even though I don’t really have anything to say, so don’t read on if you have better things to do.

This is the second paragraph of this nonsense text that has only one mission: to let me test out different CSS styles on this new blog for Secret Search Engine Labs.

This is a live link to the Add URL page to see how links in text are rendered.

Hello world!

December 8th, 2009

Welcome to WordPress. This is your first post. Edit or delete it, then start blogging!