Nervous About Wolfram

Take a look at the screenshot below. This is from my first five minutes of playing with Wolfram/Alpha. Do you see what’s wrong with it? I’ll tell you in a minute…

Wolfram/Alpha is the new search engine that isn’t so much a search engine as a find-interesting-ways-to-analyze-data-and-show-it-to-me engine. It’s a closed system, as far as I can tell. It does some cool things. But I don’t understand how they will keep up with the data quality problem.

This worries me because the output from Wolfram/Alpha looks authoritative. I want to be able to trust it. But look at this slightly disturbing problem. I searched for Francis Bacon, but instead of getting a page about the various Francis Bacons of history and having an opportunity to disambiguate, I got the output below. As you see, it combines information from two different men: Francis Bacon, 1st Viscount St. Alban and Lord Chancellor of England under James I, and Francis Bacon, the painter. Furthermore, there appears to be no way to focus the search. Adding search terms that should distinguish between the two men appears to do nothing.

This tells me that there isn’t a lot of data in the system, yet, and that the data that is there may be mangled in ways that I may not notice unless I already know the thing I asked to learn about.

At least with Google and Wikipedia, it’s a relatively open system where I get a variety of results. So, beware, folks.

That said, I’m going back to playing with Wolfram/Alpha some more… Because it’s cool.


9 thoughts on “Nervous About Wolfram”

  1. Hi James,

    I’m not a huge fan of Wiki either, mainly because it’s so easy to alter the information. This I don’t think is a big problem in itself, as most people using Wiki are aware of the collaborative nature of the information. The scary thing is that journalists use Wiki as a definitive source, which then gets turned into print, and generally we take news in papers to be definitive.
    Shane Fitzgerald, an Irish sociology student, demonstrated this by adding a false quote to the recently deceased Maurice Jarre’s Wiki entry. Journalists didn’t bother to validate the data and used it in their articles.

    Two librarians wrote an interesting article on the subject

  2. Good spot there! I wonder how many more mistakes in newspapers that will lead to. Not only do they quote Wikipedia, but they usually also can’t manage to convert currencies. Annoys me all of the time. Still trying to find a business case for Wolfram though (apart from “let’s get bought up by Google”). It’s a novelty and interesting to some students I’d say, but day-to-day life….????


  3. Some of the folks working on Alpha have compared it to an almanac (crossed with a graphing calculator). I’d say it’s also a bit like a data mining CRC Handbook.

    If your day-to-day life involves little thinking about topics Alpha handles, there’s no drive to go there. If it does, you’re going to run afoul of what I call the “pretty oracle” or “reverse Cassandra” problem. “Look at that results page: it’s so tight and Tufte-looking! Of course it can’t be garbage?!”

    Because it’s curated, there are slightly different quality problems than those for Wikipedia. I don’t know if it’ll fall prey to the Google Knol effect if they extend the curation to other parties. I wonder about tying in the CYC knowledge base for an attempt at more “common sense” about questions, but Lenat and Guha would probably need a good bit of romancing. has a slightly breathless take on something Google Labs is cooking up that might be in the same experience-results space: “Google Squared”.

  4. I did a search for “James Bach”, and the results said “Wolfram|Alpha isn’t sure what to do with your input.”

    So you’re not sure what to make of Wolfram|Alpha. And Wolfram|Alpha isn’t sure what to make of you either, James. Kinda funny.

  5. Good catch – I wasn’t even thinking about the reliability of the source data, much less that they would “commingle” source data into one final output.

    I think many may also lose sight of this and be lost in the “wow” factor.

  6. Hi James,

    The test data you used was very effective :). I believe that it’s the first input you gave to Wolfram.
    As far as I have experimented, Wolfram feels more inclined toward scientific search and situations where the user needs a single answer instead of a list of links.
    But, in any case, as you pointed out, erroneous data is not what one expects.


  7. Interesting. I think there were a number of unfortunate design choices made here: 1. You give me something that looks like a search engine, I’m going to compare your results to Google (and find you lacking). 2. The interface is clean enough to make it seem like it’s good for more than it currently is (and doesn’t do a good enough job of letting me know what it IS good for). Conversely, it may be more of a black box than scientists and other really serious investigators will need from a tool. 3. (As you noted) don’t give me something that looks like The One Authoritative Result if you’re not VERY confident that you’ve disambiguated my query entirely, and have answered it well.

    That said, I think there’s definitely some cool stuff it can do. As I understand it, they mean to be a computational tool more than a way to learn about a particular thing. Here are a few queries I’ve tried that I was happy with:

    What does the weather tend to be like in Cancun — generally, or at a particular time of the year?

    What was the nutritional content of my breakfast today?

    What was the distance between two planets on a particular day?

    …And of course the ever popular:

    I don’t know yet what problems I’ll turn to it for, but I’m glad that Wolfram and his team are working on it.

  8. I have two additional issues with this site. 1) It is not consistent with its own data. If I perform a search on the name “Joseph”, I get pretty statistics on the use of the name in the U.S. I even get a link to Joseph as a female name (and get data represented similarly). My problem is that if I search for “Alex” I don’t get a link to that name as a female name. Now, I know several Alexes who are female and not a single female Joseph. Along the same lines, I also observed there is no obvious way to force WA to perform that style of search. FWIW, using “male given name” confuses WA, but “male name” does not. Which is funny, because “male given name” is the nomenclature used in the input interpretation box. However, using “Alex female name” does not return any of the data that I was expecting. Also, I observed that Jesus is represented as a religious figure but Abraham is not. 2) It makes authoritative claims it cannot back up. From its main category page you can search for various musical things, like chords. I can search for the C major 7th chord, but it can’t tell me what the 1st inversion of the C major 7th chord is. Sounding complete when it’s not is just as bad as being ambiguous. This is not a site for exploratory learning; you have to know what it is and is not capable of doing before you can even think of using it. Which means it looks cool, and I can use it with my daughter’s calculus homework (it plots great!), but for my style of learning, it doesn’t work.

  9. James, your blog and a subsequent discussion brought me to this application. Already my second search revealed amazing results: I asked for the weather in a small Bavarian town (Miesbach) in September 2008. For those who don’t know Miesbach – moderate climate, some 500m elevation, situated between Munich and Salzburg. I didn’t really expect an answer. The bad thing is, they returned one as below (copied from the screenshot I took, not sure how to attach the screenshot to this comment):

    (start of screenshot copy)

    average temperature: 12 C (1 to 74 C)
    relative humidity: 80% (4 to 167%)
    wind speed: average 2 m/s (-93 to 8 m/s)

    (end of screenshot copy)

    Now – Miesbach is not in a tropical desert (74 C?), and WolframAlpha itself clearly identifies it as “town in Bavaria, Germany”. Reporting an out-of-range relative humidity of 167% or a negative wind speed (-93 m/s) is just unacceptable for a “scientific” application. Makes you wonder about the validity of the computed data. Looks like they need to either add some code (sanity checker?) or remove some bugs in their database.

    Happy browsing …
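
The "sanity checker" suggested in the last comment could be as simple as a range test on each reported value. Here is a minimal sketch in Python, run against the figures quoted from the Miesbach screenshot. The field names and the plausibility bounds are assumptions chosen for illustration; nothing here reflects how Wolfram|Alpha actually stores or validates its data.

```python
# Assumed plausibility bounds (hypothetical, chosen for illustration only):
# roughly the range of surface conditions ever recorded on Earth.
BOUNDS = {
    "temperature_c": (-90.0, 60.0),
    "relative_humidity_pct": (0.0, 100.0),
    "wind_speed_m_s": (0.0, 120.0),
}


def out_of_range(readings):
    """Return (field, value) pairs whose value falls outside BOUNDS."""
    problems = []
    for field, values in readings.items():
        low, high = BOUNDS[field]
        for value in values:
            if not (low <= value <= high):
                problems.append((field, value))
    return problems


# Minimum, average, and maximum as quoted in the comment above.
reported = {
    "temperature_c": [1, 12, 74],
    "relative_humidity_pct": [4, 80, 167],
    "wind_speed_m_s": [-93, 2, 8],
}

flagged = out_of_range(reported)
for field, value in flagged:
    print(f"implausible {field}: {value}")
```

With these assumed bounds, the check flags the 74 C maximum, the 167% humidity, and the -93 m/s wind speed, and lets the averages through. A real system would probably want bounds that depend on location and season, but even a crude global check would have caught all three of the values in this example.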

