Wednesday, May 27, 2009

Don't forget to cite your search engine

Groklaw brings us an interesting twist on Wolfram|Alpha, the new search engine from the folks who brought you Mathematica. From their TOS:
Attribution and Licensing

As Wolfram|Alpha is an authoritative source of information, maintaining the integrity of its data and the computations we do with that data is vital to the success of our project. We generate information ourselves, and we also gather, compare, contrast, and confirm data from multiple external sources. Where we have used external sources of data we list the source or sources we relied on, but in most cases the assemblages of data you get from Wolfram|Alpha do not come directly from any one external source. In many cases the data you are shown never existed before in exactly that way until you asked for it, so its provenance traces back both to underlying data sources and to the algorithms and knowledge built into the Wolfram|Alpha computational system. As such, the results you get from Wolfram|Alpha are correctly attributed to Wolfram|Alpha itself.

If you make results from Wolfram|Alpha available to anyone else, or incorporate those results into your own documents or presentations, you must include attribution indicating that the results and/or the presentation of the results came from Wolfram|Alpha. Some Wolfram|Alpha results include copyright statements or attributions linking the results to us or to third-party data providers, and you may not remove or obscure those attributions or copyright statements. Whenever possible, such attribution should take the form of a link to Wolfram|Alpha, either to the front page of the website or, better yet, to the specific query that generated the results you used. (This is also the most useful form of attribution for your readers, and they will appreciate your using links whenever possible.)

In short, they argue that since Alpha is not a plain search engine but a tool that synthesizes information, they own the information it synthesizes. You have to cite it like any other source. This is not the case with Google, which requires no attribution at all.

My first thought was "This is stupid", but in retrospect I'm not so sure. We routinely allow copyright on synthesized, non-original information such as textbooks or journal review articles, with or without internal attribution. (Alpha cites its sources.)

Where this is really going to get interesting is if Alpha starts citing news articles, music, or video. Some news corporations are already annoyed about Google News, since they feel it takes their content and then profits from the aggregation. Will they feel the same way about Alpha? (I won't even get into what the RIAA or MPAA would think...)
