
Alexa opens up Web search database and API

Article by Peter Sayer, IDG News Service, Paris Bureau

Alexa Internet Inc. is offering online computing capacity for US$1 an hour -- and throwing in access to the database of billions of Web pages that lurks behind its Alexa toolbar search service. Programmers who register for the beta version of Alexa Web Search Platform, released Tuesday, can use it to create specialized search engines for vertical markets, drawing results from the database of 4 billion Web pages crawled by Alexa, the company said. Alexa is a subsidiary of Amazon.com Inc.

Following in the footsteps of Google Inc., Alexa is opening up the API (application programming interface) to parts of its search engine, but going one better by offering to host applications that build on its database -- for a fee. Programmers remixing Google's search utilities must organize their own application hosting.

Alexa Web Search Platform gives programmers a way to specify a subset of documents from the archive, develop an application to search those documents, and publish the results as an XML (Extensible Markup Language) feed or a specialized search engine. The results returned can include simple text or HTML (Hypertext Markup Language) documents, or graphics, audio or video files.
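
To make that three-step workflow concrete, here is a minimal Python sketch. It is not Alexa's actual API. The sample records, the function names select_subset, search and to_xml_feed, and the XML element names are all hypothetical, chosen only for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for archive documents; the real platform's
# record format is not described in the article.
ARCHIVE = [
    {"url": "http://example.com/a.html", "type": "html", "text": "jazz guitar lessons"},
    {"url": "http://example.com/b.mid", "type": "audio", "text": "jazz standard midi"},
    {"url": "http://example.com/c.html", "type": "html", "text": "classical piano scores"},
]


def select_subset(docs, doc_type):
    """Step 1: restrict the archive to a subset of documents."""
    return [d for d in docs if d["type"] == doc_type]


def search(docs, keyword):
    """Step 2: search only within that subset (naive substring match)."""
    return [d for d in docs if keyword.lower() in d["text"].lower()]


def to_xml_feed(query, hits):
    """Step 3: publish the hits as an XML string."""
    root = ET.Element("results", {"query": query})
    for hit in hits:
        item = ET.SubElement(root, "result", {"type": hit["type"]})
        ET.SubElement(item, "url").text = hit["url"]
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    html_only = select_subset(ARCHIVE, "html")
    print(to_xml_feed("jazz", search(html_only, "jazz")))
```

A specialized search engine would present the same hits as a Web page rather than as a feed; only the last step changes.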

As an example of how to use the service, Alexa has built a photo search engine at http://photos.alexa.com/ that allows visitors to refine their search for photographs according to technical details such as the size of the image, the make and model of camera it was taken with, and even the aperture setting used.

While the photo search engine shows how the platform can be used to build a live service, a one-off search of the database content can also be used to seed another service. That's how Rainer Typke, a researcher at the University of Utrecht in the Netherlands, used the platform to expand Musipedia, his searchable melody directory.

Typke used the platform to extract around 1,000 MIDI files from Alexa's database, converted them to a monophonic form and stored them on his own server to make them easier to search. Musipedia doesn't use Alexa for its live search service, Typke said in an e-mail response to questions. Using the Alexa computer cluster, Typke plans to identify hundreds of thousands of MIDI files in the database and process them using an algorithm that extracts their characteristic melody. Those melody files will be used to expand the Musipedia directory. Later, he hopes to be able to process files containing audio recordings in the same way.
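
The article does not say how Typke reduces a MIDI file to a single melodic line. One simple heuristic for this kind of task is the "skyline" approach: where several notes start at the same time, keep only the highest. The Python sketch below illustrates that heuristic on hypothetical (onset, pitch) pairs. It is not Typke's algorithm, and parsing real MIDI files is left out.

```python
def skyline(notes):
    """Reduce polyphonic notes to one note per onset time.

    notes: iterable of (onset_time, midi_pitch) pairs.
    Returns a list of (onset_time, midi_pitch) sorted by onset,
    keeping only the highest pitch at each onset.
    """
    highest = {}
    for onset, pitch in notes:
        if onset not in highest or pitch > highest[onset]:
            highest[onset] = pitch
    return sorted(highest.items())


if __name__ == "__main__":
    # Hypothetical input: a C major chord, a single note, then a two-note chord.
    polyphonic = [(0, 60), (0, 64), (0, 67), (1, 62), (2, 55), (2, 65)]
    print(skyline(polyphonic))
```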

"For the more computationally expensive preprocessing that would be required, especially by audio, Alexa's fast and large computers will come in handy," he said.

Alexa will charge for hosting applications that use the platform. The charges include $1 per processor per hour for computing capacity, $1 per year for each gigabyte of storage, $1 per 50 gigabytes of data processed by the system, $1 per gigabyte of data transferred into or out of the system, and $1 for every 4,000 search requests the system responds to from published search engines using the service.
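
As a rough illustration of how those rates add up, the Python sketch below totals a bill from the five quoted charges. The Usage class and estimate_cost function are hypothetical names, the workload figures are invented for the example, and the sketch assumes charges scale linearly with no minimums or rounding, which the article does not confirm.

```python
from dataclasses import dataclass


@dataclass
class Usage:
    """Hypothetical usage record; field names are illustrative only."""
    processor_hours: float = 0.0   # $1 per processor per hour
    storage_gb_years: float = 0.0  # $1 per gigabyte per year
    processed_gb: float = 0.0      # $1 per 50 gigabytes processed
    transferred_gb: float = 0.0    # $1 per gigabyte in or out
    search_requests: int = 0       # $1 per 4,000 requests served


def estimate_cost(usage):
    """Return the estimated charge in US dollars at the quoted rates."""
    return (
        usage.processor_hours * 1.0
        + usage.storage_gb_years * 1.0
        + usage.processed_gb / 50
        + usage.transferred_gb * 1.0
        + usage.search_requests / 4000
    )


if __name__ == "__main__":
    # Invented workload: 10 processor-hours, 5 GB stored for a year,
    # 500 GB processed, 2 GB transferred, 40,000 search requests.
    job = Usage(processor_hours=10, storage_gb_years=5,
                processed_gb=500, transferred_gb=2, search_requests=40000)
    print(f"${estimate_cost(job):.2f}")  # prints $37.00
```

Under these assumptions, processing a smaller slice of the data or answering fewer search requests lowers the bill proportionally.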

Typke expects the pricing will "be okay for people like me," he said. He's identified a number of ways to control the cost of his melody search, including updating the core data less frequently, or restricting the search to a smaller subset of Alexa's total data.

"I still need to get a feeling for how much I can do with one hour of computing power," he said. "Getting the 1,000 files for the prototype took just minutes."

The API is designed for the C programming language. It can be used to build "Web services" which can be integrated into other systems or published through Amazon.com's Web services platform, Alexa said.


ITWorld
