Sunday, March 8, 2015

Wikipedian Philosophy

A common legend on Wikipedia is the following: start on any Wikipedia page, click on the first link that is not in parentheses or brackets, and repeat.  The claim is that this always leads to the Philosophy page.  Is this true?  If so, why?

First of all, after some testing, I found that some pages don't lead to Philosophy.  Some, like Louisiana Voodoo, run into redirect crashes, which derail the process.  (If not for that odd hyperlink, that page would work too.  My OCD had me trying to edit it.)  But after running many examples (try xefer.com/wikipedia), I noticed that this works for around 95% of pages.  (The only ways a page fails to reach Philosophy are a loop or a redirect crash.)
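The procedure, and the two ways it can fail, can be sketched in a few lines of Python.  This is only a toy: the function name follow_first_links and the toy_links table are made up for illustration, the links in it are not real Wikipedia data, and a page with no entry in the table stands in for a redirect crash.

```python
def follow_first_links(first_link, start, target="Philosophy"):
    """Follow first links from `start` until we hit `target`, a loop, or a dead end.

    `first_link` maps a page title to the title of its first link.
    Returns (path, outcome), where outcome is one of
    "reached target", "loop", or "dead end".
    """
    path = [start]
    seen = {start}
    current = start
    while current != target:
        nxt = first_link.get(current)
        if nxt is None:
            # No usable first link (stands in for a redirect crash).
            return path, "dead end"
        path.append(nxt)
        if nxt in seen:
            # We have been here before, so the walk would cycle forever.
            return path, "loop"
        seen.add(nxt)
        current = nxt
    return path, "reached target"


# Hypothetical toy data, NOT real Wikipedia links.
toy_links = {
    "Calculus": "Mathematics",
    "Mathematics": "Knowledge",
    "Knowledge": "Philosophy",
    "Rabbit": "Hare",
    "Hare": "Rabbit",
}

if __name__ == "__main__":
    for page in ["Calculus", "Rabbit", "Broken Page"]:
        print(page, follow_first_links(toy_links, page))
```

Because every page has exactly one "first link", a walk can only end in one of these three ways, which is why loops and redirect crashes are the only failures.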

[Image: a graph of some pages leading to Philosophy]

Now, that's all good and dandy, but the question is why.  Well, it's a convention that a Wikipedia article states, in its first sentence, the category under which the article fits.  And literally everything falls under philosophy.

That reasoning is the generic one offered by Wikipedia, but I am not convinced yet.  Why aren't there more loops?  For example: a rabbit is similar to a [[hare]] -> a hare is similar to a [[rabbit]].  Why doesn't that happen more often?

The person who brought this phenomenon to wide public attention was Randall Munroe, the writer of the webcomic xkcd.  In his 903rd comic (see xkcd.com/903), Munroe challenged readers to find pages that don't go to Philosophy, leading to more OCDers like me, who removed redirects to make the pattern more consistent, raising it from about 90% to 95% of pages.

But even that is unsatisfying.  WHY???  When I am lost (I don't know about you), I turn to math.  For those of you who are interested in doing the math yourself: treat Wikipedia as a random, connected graph with large minimum degree, and derive the predicted length of the longest path.  After doing this, one finds that there is probably a path containing a fraction 1 - e^(-2), or roughly 87%, of the vertices (pages) in the graph.  In other words, it isn't surprising to find a page that has around 87% of the pages leading to it.
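If you want to check the arithmetic on that figure, here is a quick sketch in Python.  It only evaluates 1 - e^(-2); it does not derive the result, and the function name predicted_fraction is just an illustrative choice.

```python
import math


def predicted_fraction():
    """Evaluate 1 - e^(-2), the fraction of vertices quoted above."""
    return 1 - math.exp(-2)


if __name__ == "__main__":
    # Print the fraction as a percentage with one decimal place.
    print(f"{predicted_fraction():.1%}")
```

The exact value is a little under 0.865, which is where the "roughly 87%" comes from after rounding up.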

For a detailed mathematical explanation: http://math.mit.edu/~cb_lee/resource/mindegree-random-subgraph.pdf
