Sunday, 25 October 2009

A contextual search experience for Wikipedia

Posted on 22:28 by Unknown
Wikipedia users can now configure a Custom Search skin to customize their Wikipedia search experience. Once configured, the skin lets you search all of Wikipedia, or only the articles that are contextually relevant to the page you are reading, from any Wikipedia page. This can make it easier to find relevant information, especially on Wikipedia pages with many links or when the topics you are researching are ambiguous. You can find instructions to configure the Custom Search skin at Wikipedia. It works with Wikipedia's Monobook and Beta Vector skins, and should work on Wikipedia domains globally. Remember that you need a user account and must log in to Wikipedia to use it.

With the skin configured, if you are reading the Wikipedia page on NASA and search for the query [mars], you are presented with inline results organized into 3 tabbed groups: All Wikipedia pages, Linked Wikipedia pages, and Linked non-Wikipedia pages. The first tab shows all Wikipedia articles that match, including those about the candy (Mars Bars) and the television series (Veronica Mars). The next 2 tabs provide contextually relevant results that are linked from the NASA page, such as information about various Mars rovers, orbiters, and space labs, as shown in the screenshot.



Here's what's going on under the covers:

Linked Custom Search enables the creation of dynamic search experiences, where the content being searched can be defined on the fly, and can change over time as new information becomes relevant. The Custom Search skin creates a Linked Custom Search engine on demand for every Wikipedia page that you navigate to.

The results from the current Wikipedia domain, as well as the results from the per-page dynamic search engine, are presented inline in tabbed categories via the AJAX search API. You can refine results by the category of your choice and quickly review them without having to open a new browser window or tab. This is handled by the JavaScript code in the skin. The skin's CSS defines the look and feel of the results.

As for the page-specific Linked Custom Search engine, it computes the contextual results within the Linked Wikipedia pages (on-domain) and Linked non-Wikipedia pages (off-domain) categories. These two tabs are technically very similar, so we'll just describe how one of them works.

Suppose you're visiting the NASA article and search for [mars]. The Linked Wikipedia tab sends the search query to Google Custom Search, along with a parameter that indicates that the search engine specification is at (view source in browser):

http://googlecustomsearch.appspot.com/wikipedia/spec.do?url=en.wikipedia.org/wiki/NASA

Google picks up this Linked CSE request and uses the above specification and the supplied query. You can simulate this process by visiting:

http://www.google.com/cse?cref=http://googlecustomsearch.appspot.com/wikipedia/spec.do?url=en.wikipedia.org/wiki/NASA&q=mars
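As a rough illustration of how such a request URL can be assembled, here is a minimal Python sketch. The two endpoints are taken from the URLs above; the function names are hypothetical and are not part of the skin or the webapp. Unlike the hand-written URL above, the sketch percent-encodes the nested specification URL, an assumption made so that its own `?` and `=` cannot be confused with the outer query string.

```python
from urllib.parse import urlencode

# Endpoints as they appear in this post.
SPEC_ENDPOINT = "http://googlecustomsearch.appspot.com/wikipedia/spec.do"
CSE_ENDPOINT = "http://www.google.com/cse"


def spec_url(page: str) -> str:
    """Return the per-page specification URL, e.g. for 'en.wikipedia.org/wiki/NASA'."""
    return f"{SPEC_ENDPOINT}?url={page}"


def linked_cse_url(page: str, query: str) -> str:
    """Return a Linked CSE request: 'cref' points at the spec, 'q' is the query."""
    # urlencode percent-encodes both values, including the nested spec URL.
    params = urlencode({"cref": spec_url(page), "q": query})
    return f"{CSE_ENDPOINT}?{params}"


if __name__ == "__main__":
    print(linked_cse_url("en.wikipedia.org/wiki/NASA", "mars"))
```

Swapping in a different article path yields a request against that article's own search engine, which is what lets the skin build a new engine for each page you visit.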

A different specification is generated for every Wikipedia page (based on its URL) by a tiny AppEngine application at http://googlecustomsearch.appspot.com. The specification defines a search engine with two facets, labeled "internal" (Linked Wikipedia pages) and "external" (Linked non-Wikipedia pages). The list of "internal" (and "external") webpages to search over is provided by this line in the specification:

<Include href="http://googlecustomsearch.appspot.com/wikipedia/annotations.do?url=en.wikipedia.org%2Fwiki%2FNASA" type="Annotations"/>

This causes Google to visit the webapp at a new URL (annotations.do). Our webapp now collects links from the NASA article, classifies them as "internal" or "external", and returns the annotations in an XML format. You can see the result at (view source in browser):

http://googlecustomsearch.appspot.com/wikipedia/annotations.do?url=en.wikipedia.org%2Fwiki%2FNASA
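The Include line and the annotations URL above differ only in the article path, so they are easy to generate per page. The following Python sketch shows one way this could be done; it is not the webapp's actual code, and the helper names are hypothetical. The element name, attribute names, and endpoint are copied from the specification line quoted above.

```python
import xml.etree.ElementTree as ET
from urllib.parse import quote

# Endpoint as it appears in this post.
ANNOTATIONS_ENDPOINT = "http://googlecustomsearch.appspot.com/wikipedia/annotations.do"


def annotations_url(page: str) -> str:
    """Return the annotations URL for a page such as 'en.wikipedia.org/wiki/NASA'."""
    # safe="" makes quote() encode '/' as %2F, matching the URL shown above.
    return f"{ANNOTATIONS_ENDPOINT}?url={quote(page, safe='')}"


def include_element(page: str) -> ET.Element:
    """Build the <Include> element that points the spec at the annotations URL."""
    return ET.Element(
        "Include",
        {"href": annotations_url(page), "type": "Annotations"},
    )


if __name__ == "__main__":
    element = include_element("en.wikipedia.org/wiki/NASA")
    print(ET.tostring(element, encoding="unicode"))
```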

Now Google can finish building the Custom Search engine for the NASA article, and compute the results for [mars]. The results are returned to your web browser and displayed in the appropriate tab.

But wait! Our little AppEngine webapp doesn't have the CPU horsepower or bandwidth to scan Wikipedia pages on-demand or in nearly-real-time for thousands of Wikipedia users. Instead, the webapp asks Google to scan the page, via a Custom Search tool called makeannotations. The request looks something like this:

http://google.com/cse/tools/makeannotations?url=en.wikipedia.org%2Fwiki%2FNASA&label=myLabel

After makeannotations returns the list of links in the NASA article in XML, the webapp simply rewrites the XML according to the domain of each link.
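The rewriting step can be pictured with a short Python sketch. It rests on several assumptions: that the links have already been extracted from the makeannotations response into a plain list, that "internal" means the link's host equals the host of the page being read, and that the output uses an `Annotations`/`Annotation`/`Label` layout modeled on the Custom Search annotations format. The function names are hypothetical; the open-sourced webapp may do this differently.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit


def classify(link: str, page_host: str) -> str:
    """Label a link 'internal' if it is on the page's own host, else 'external'."""
    # Links without a scheme (e.g. 'en.wikipedia.org/wiki/Mars') get a '//'
    # prefix so that urlsplit() parses the host instead of treating it as a path.
    host = urlsplit(link if "//" in link else "//" + link).hostname or ""
    return "internal" if host == page_host.lower() else "external"


def build_annotations(links, page_host: str) -> str:
    """Return annotations XML with one labeled entry per link (assumed layout)."""
    root = ET.Element("Annotations")
    for link in links:
        annotation = ET.SubElement(root, "Annotation", {"about": link})
        ET.SubElement(annotation, "Label", {"name": classify(link, page_host)})
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    sample_links = [  # illustrative links, not real makeannotations output
        "http://en.wikipedia.org/wiki/Mars_rover",
        "http://www.nasa.gov/",
    ]
    print(build_annotations(sample_links, "en.wikipedia.org"))
```

Because the labels match the two facets in the specification, each tab can restrict its results to the links carrying its label.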

Since we are creating the per-page search engines on demand, there can sometimes be a short delay in the creation of the search engine, e.g., for new or obscure pages. However, for popular Wikipedia pages, these definitions should be cached, and you should see no delays. In fact, we use a ping method to load up the Custom Search engine in advance before you search. Remember that if there are not many links on the Wikipedia page you are searching from, you may sometimes find no matches for linked pages.

We've open sourced the code for this application. Feel free to work with it. Feel free to extend the skin beyond Monobook and Vector. We built this skin with the help of Wikipedia, and hope that you will provide feedback on your experience. You can also provide your feedback directly to Wikipedia.

Posted by: Paul Komarek, Software Engineer and Jeffrey Scudder, Developer Programs Engineer