The Subject Wikis Blog

Recent improvements

The subject wikis are being upgraded to MediaWiki 1.16.0beta (see here for the security release). The high traffic wikis have already been upgraded; others should be upgraded in a few days. We’re also upgrading to Semantic MediaWiki 1.5.0.

The default skin on the upgraded wikis is the “Vector” skin, which is the same as Wikipedia’s new default skin. Those who want to change the skin appearance back to monobook should create an account and change their user settings. Users who already have accounts may still have their settings as “monobook” — so they need to manually change the settings to “vector.” For more information about why Wikipedia switched to a vector skin, see Usability Initiative.

Apart from these software changes, we’re also making changes to page content and appearance to make it easier to find relevant information. Most of these build further on the page design changes blogged about earlier. These include:

  • Front-ending the definitions: For long pages, the definition is being moved right to the top, above the table of contents and the article-tagging template boxes, which give information about the type of term and similar terms. This means that people in a hurry can quickly read the definition without scrolling down too much. This change may be rolled over even for shorter pages.

  • Increased use of tables: Tables allow for a compact expression of relates, correlates and analogies, and also make it easier to locate information. On the minus side, it becomes tedious to put detailed and lengthy paragraphs in a table. We are dealing with this by putting the most important summary information in tables, with further detailed information below expandable/collapsible “SHOW MORE”s. Even for definitions, we are switching from the earlier “Symbol-free definition” and “Definition with symbols” as separate subsections to a single tabular format for definitions where one column gives a shorthand phrase for the definition, one column gives the symbol-free definition, and another column gives the definition with symbols (for instance, pronormal subgroup (permalink to current version)). Additional columns may include applications of a particular definition to ways to prove/use the given term. We are also using tables for references, as seen in this example, making them easier to parse as well as look up.

  • Increased use of expandables/collapsibles: Expandable/collapsible “SHOW MORE”s allow for a lot of relevant information to be placed within pages without causing a cognitive overload or making the page too much effort to scroll down. Expandable/collapsibles are used both within pre-defined templates and on a discretionary basis within pages. Sometimes, the most important things are stated and the rest are hidden under a “SHOW MORE” — for instance, the list of properties stronger than characteristicity. The “SHOW MORE” feature uses the MediaWiki extension ToggleDisplay.

  • A continued shift away from categories to semantic constructions: We are continually trimming down the use of categories to the level where they help with broad “containment”-based navigations. We’re moving all relations and analogies to the semantic realm, which is much more flexible and allows for more powerful querying. For instance, variations on a particular term are no longer stored in a MediaWiki category, but can instead be accessed using a semantic query — for instance, here’s the query for variations of normal subgroup.

  • More suggested semantic queries: The pages now often contain links to semantic queries that might answer further questions the reader could or should have. Many of these links are generated automatically through templates.

Groupprops usage patterns update

I had started working on a report on usage pattern analytics for Groupprops, but for various reasons, will not have the time to complete the report in the near future. Also, I would like to subject some of the findings in my preliminary number-crunching to the test of more data — particularly data spanning across more than one year. Nonetheless, it might be worthwhile to note some of the findings on the blog. (What follows below is what I consider the most salient snippets from the current draft of the report).

Nature of variation between daily traffic across days

The variation is broadly of three kinds:

  • Intra-week variation: There is a clear pattern here: weekend traffic per day is generally about 50-70% of weekday traffic per day. The minimum usually occurs on Saturdays, and the second lowest is on Sundays, with the third lowest on Fridays (extended weekends?). So, it seems like visits to Groupprops are somehow better classified as “work” than “leisure”. This is further corroborated by the fact that in holiday seasons, traffic is low on all days and the difference between weekdays and weekends is less pronounced. (More data, as well as further segmentation of existing data, will allow the testing of further hypotheses about intra-week variation.)

  • Seasonal variation: There is a reasonably clear pattern here too: seasons that are “off” in colleges and universities see less traffic. Holidays see less traffic in the regions that observe those holidays. For instance, the Thursday of Thanksgiving saw a significant drop in U.S. traffic while U.K. traffic remained at usual weekday levels. Christmas week saw a worldwide traffic drop. Traffic is highest from mid-September to mid-December and from mid-January to mid-June, and lower from mid-December to mid-January and from mid-June to mid-September. (These trends will be better understood with more multi-year data available, because it is difficult to separate seasonal variation from a general upward trend in traffic. However, these observations are similar to the observations made in the 2005 full evaluation report for MIT OpenCourseWare.)

  • Secular increase (here, secular means over time, i.e., a long-run trend): Traffic has been increasing since May 2008, when the wiki was moved to this site, with most of the dips being accounted for by intra-week and seasonal variation. For instance, a comparison of the mid-December to mid-January of 2008-2009 with the mid-December to mid-January of 2009-2010 shows an increase of 260% (which means the new traffic quantity is 3.6 times the old). Interestingly, the same-time-of-year comparisons show that the proportional increase is least in holiday seasons and more during seasons when traffic is higher. This hypothesis needs to be tested further.
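The conversion between a percentage increase and a "times the old" multiple in the last bullet can be checked with a couple of lines. This is only an arithmetic sketch: the function names are made up for illustration, and the only figure taken from the post is the 260% increase.

```python
def growth_multiple(percent_increase: float) -> float:
    """Convert a percentage increase into the ratio new/old."""
    return 1 + percent_increase / 100


def percent_increase(old: float, new: float) -> float:
    """Percentage increase when going from old to new."""
    return (new - old) / old * 100


# Figure from the post: a 260% year-on-year increase.
multiple = growth_multiple(260)
print(f"A 260% increase means new traffic is {multiple:.1f} times the old")

# Round trip with a hypothetical baseline of 100 visits.
print(percent_increase(100, 100 * multiple))
```

An increase of 260% therefore corresponds to a multiple of 3.6, not 2.6, because the original quantity is still included in the new total.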

Visits and pageviews

For overall magnitude estimates, there was a total of about 20,000 visits and 48,000 pageviews from mid-September to mid-December of 2009, higher than over previous three-month periods.

The ratio of pageviews to visits has remained steadily in the range of 2.4-2.6, and this ratio has not shown much change despite the secular increase in both the number of visits and the number of pageviews. Moreover, the composition of visitors by depth of visit has remained remarkably similar over time. The breakdown is roughly as follows: 60% of visitors had one pageview, 14% had two pageviews, 8.5% had three pageviews, 4.5% had four pageviews, 3% had five pageviews, 2% had six pageviews, 1.5% had seven pageviews, 1% had eight pageviews, and so on. About 1% had eighteen or more pageviews.
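As a rough consistency check on the figures above, the sketch below recomputes the pageviews-per-visit ratio and asks what the depth breakdown implies. The numbers are the rounded ones quoted in this post; the variable names, and the choice to treat every unlisted visit as having exactly nine pageviews (the smallest depth it could have), are assumptions made for illustration.

```python
# Rounded totals quoted for mid-September to mid-December 2009.
visits = 20_000
pageviews = 48_000
ratio = pageviews / visits

# Share of visits (in percent) by number of pageviews, as listed above.
depth_share = {1: 60.0, 2: 14.0, 3: 8.5, 4: 4.5, 5: 3.0, 6: 2.0, 7: 1.5, 8: 1.0}

listed_share = sum(depth_share.values())   # visits with 1 to 8 pageviews
deeper_share = 100.0 - listed_share        # visits with 9 or more pageviews

# Pageviews per 100 visits contributed by the listed depths.
listed_pageviews = sum(depth * share for depth, share in depth_share.items())

# Lower bound: assume each deeper visit had exactly 9 pageviews.
min_pageviews_per_visit = (listed_pageviews + 9 * deeper_share) / 100.0

print(f"pageviews per visit: {ratio:.2f}")
print(f"visits with 1-8 pageviews: {listed_share:.1f}%")
print(f"visits with 9+ pageviews: {deeper_share:.1f}%")
print(f"lower bound on pageviews per visit: {min_pageviews_per_visit:.3f}")
```

The quoted totals give a ratio of 2.4, the listed depths cover 94.5% of visits, and the lower bound comes out at about 2.27 pageviews per visit. That sits just under the observed 2.4-2.6 range, which is what one would expect given that some of the remaining 5.5% of visits run to eighteen or more pageviews.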

Inter-country variation

This is another area with fertile analytical possibilities. Current results suggest the following picture.

  • In absolute numbers, in terms of visits (pageview rankings are almost the same), the top countries are, in decreasing order: United States, United Kingdom, Canada, India, Germany, Australia, Italy, Israel, Turkey and the Philippines. Note that there is likely to be quite a bias in favor of countries that are English-speaking (such as the United States, the United Kingdom, Canada, Australia) or countries where higher education and research is carried out in English, even though there are other local languages (such as India, Turkey, and perhaps the Philippines and (in the higher mathematical context) Israel). Despite this, Germany and Italy make it near the top of the list. However, the absence of countries such as Japan, Korea, China, and France from the top of the list may be explained by the language factor.

  • The picture looks a little different if we consider the number of visits per capita. Here, the United Kingdom comes out on top (largely due to the contributions of Cambridge and London), and other good performers include New Zealand, United States, Israel, Ireland, Singapore, Canada, and Australia. India falls very far down once we divide out by its huge population, though it still comes higher than China.

  • It is unclear how the per capita usage of Groupprops compares with other indicators, and more research needs to be done on the connection with such factors as wealth, Internet access, number of college students, etc. One clear finding seems to be that with the exception of Israel, the top nine countries in per capita visits are among the top ten in the 2007 Economic Freedom of the World rankings (download report as PDF). The connection with political freedom, as measured by Freedom House, seems more tenuous.

  • Top cities

    The cities that top the list (in absolute numbers, not on a per capita basis) include Cambridge (UK) (home to the University of Cambridge), London, Chicago (home to The University of Chicago, and also where I currently am), New York, and Cambridge (Massachusetts, USA) (home to MIT and Harvard). Other cities that do well include Oxford (home to Oxford University), Pasadena (home to Caltech), Portland (home to University of Oregon), Singapore, Atlanta, Charlottesville, Philadelphia, Don Mills, Stanford, Los Angeles, Chennai, Delhi, Ithaca, Sydney, Manchester, Austin, Champaign, Claremont, and Seoul. The fact that both the top cities are in the United Kingdom, and the clear lead enjoyed by Cambridge, UK, are as yet unexplained, though it seems that the high traffic from Cambridge is largely confined to the period from October 2009 onward.

    Browser/OS combinations

    In the analysis over one time period, the most popular browser/OS combination among Groupprops visitors appears to be Firefox/Windows (35%) followed by IE/Windows (32%). Other popular browser/OS combinations are Safari/Macintosh (8.37%), Firefox/Linux (7.36%), Firefox/Macintosh (6.11%), and Chrome/Windows (5.93%). 0.38% of the visits came from the Safari/iPhone combination. The changes in these proportions over time are potentially a subject of further study.

    Network locations: university networks and commercial service providers

    Among the top network locations, the universities were the University of Cambridge (rank 3), which accounts for about 97% of the traffic coming from Cambridge; the University of Chicago (rank 5), which accounts for about 60% of the traffic coming from Chicago; Harvard University (rank 11), which accounts for about 70% of the traffic coming from Cambridge, Massachusetts; Oxford University (rank 13), which accounts for about 90% of the traffic coming from Oxford; and Caltech (rank 15), which accounts for about 80% of the traffic coming from Pasadena. Note that the actual traffic from people affiliated with the university is probably higher, since many of the students and faculty may be using non-university Internet connections when at home.

    Most of the network locations at the top are non-university. The topper is Comcast Cable, an internet provider in the United States.

    Connection speeds

    The most used connection speed is T1, and it generates more than a third of the traffic. Other connection speeds commonly in use are cable (slightly more than a fifth of the traffic) and DSL (about a sixth of the traffic). A large amount of traffic was generated through unknown connection speeds.

    Traffic sources

    Most of the traffic (varying between 75% and 90%) is generated by search engines, with about 99% of the search traffic originating from Google. The remaining traffic includes both direct visits and referring sites, and the proportions of these vary with time. Long-term trends in these will be among the things to be studied in a more in-depth investigation.

Page design changes

Over the last few months, there have been some changes in page design on the subject wikis, with more changes scheduled for the coming months. Most of these changes have, as of now, been limited to Groupprops, but the ones that seem to be working well are being incorporated in other wikis as well.

One of the goals of the subject wikis is to provide a lot of information in a readily accessible and suggestive manner without “overloading” or “confusing” the person reading the information. This goal is challenging, particularly as more and more potentially useful information gets identified. Compare, for instance, the Wolfram Mathworld page on normal subgroup, the English Wikipedia page on normal subgroup, and the Groupprops page on normal subgroup. The Groupprops page has a much greater “raw quantity” of information, both in terms of content in the page and links from the page.

Some techniques to incorporate more content while minimizing overload and maximizing user control are discussed below.

Increased use of tables

When there are multiple items with similar information (attributes) for each item, it makes sense to organize them in a table, with the columns corresponding to the attributes. This simple idea has taken some time to implement in various contexts. In the page on normal subgroup on Groupprops, the “Relation with other properties” section and the “Metaproperties” section provide examples of the use of tables. In the “Relation with other properties” section, other properties related to normality, and more information on the relation, are organized in a tabular format. This makes it easier both for viewers to engage with the content and to ignore it, because rectangles are easier to ignore than uglier shapes.

Figuring out what attributes to choose for the table is a tricky task, in that it involves looking at the “typical attributes” discussed for most objects, then providing reasonable names for these attributes. This, in turn, requires some past work in expressing the relationships. That is one reason why the transition to tables is taking some effort. For instance, in the Related facts section of the normality is not transitive page, different kinds of factual relationships require different kinds of attribute expressions. On the other hand, the attribute choice for the arithmetic functions section of dihedral group:D8 is relatively straightforward.

Show/hide feature

This was long coming — I just needed to figure out and install the correct extension (ToggleDisplay is what I installed, though there are many other similar extensions), and have now done so on Groupprops. This enables less to be shown by default, and viewers can choose to show more. Interestingly, not only does this make the default page shorter, it might actually increase the chance that the viewer will interact with the content not shown — because the “show more” might act as a spur to curiosity for the particular not-shown content that the viewer is interested in, without the distraction of other not-shown content the viewer is not interested in. The normal subgroup page again is an example.

More sophisticated use of semantic relationships

There has been a steady move in Groupprops towards semantic relationships. Categories (in the MediaWiki sense of the term) are now reserved for very basic kinds of lists and semantic relationships are being increasingly used for lists that involve relationships to other terms. For instance, the collection of properties satisfying a particular (meta)-property is now not stored in a separate category, but is stored using the semantic relation satisfies metaproperty::metaproperty name.

This allows for much richer information storage without category profusion. In order to further this, it is necessary to spot the various kinds of relationships. Standard relationships used are “defining ingredient” (the referring object uses the referred-to object in its definition), “uses” (the referring object uses the referred-to object in its proof or statement), “fact about” (the referring object is a fact about the referred-to object). Others include “satisfies property”, “dissatisfies property”, “satisfies metaproperty”, “dissatisfies metaproperty”, “stronger than”, “weaker than”, “uses property satisfaction of”, and “proves property satisfaction of”. Once these have been identified, the next step is making sure that they are used wherever appropriate. Given the size of the wiki (over 3700 pages) this would not be an easy task. Luckily, it turns out to be relatively easy because of the extensive use of article-tagging templates. Tinkering with the article-tagging templates gets the right semantic relationships on large numbers of pages.
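To make the contrast with categories concrete, here is a toy model of semantic relations as (page, relation, target) triples. It is a sketch in plain Python, not how Semantic MediaWiki stores or queries data: the `SemanticStore` class and its methods are hypothetical, the relation names are the ones listed above, and the page names are illustrative examples rather than a dump of the wiki.

```python
from collections import defaultdict


class SemanticStore:
    """Toy store of (page, relation, target) triples, indexed by relation."""

    def __init__(self):
        self._by_relation = defaultdict(set)

    def add(self, page, relation, target):
        """Record that `page` stands in `relation` to `target`."""
        self._by_relation[relation].add((page, target))

    def pages(self, relation, target):
        """Return, in sorted order, all pages related to `target` by `relation`."""
        return sorted(p for p, t in self._by_relation[relation] if t == target)


store = SemanticStore()

# Illustrative annotations (page names are assumptions, not wiki exports).
store.add("normal subgroup", "satisfies metaproperty",
          "intersection-closed subgroup property")
store.add("characteristic subgroup", "satisfies metaproperty",
          "intersection-closed subgroup property")
store.add("characteristic subgroup", "satisfies metaproperty",
          "transitive subgroup property")
store.add("normal subgroup", "dissatisfies metaproperty",
          "transitive subgroup property")
store.add("characteristic subgroup", "stronger than", "normal subgroup")

# One generic query replaces a dedicated category per (relation, target) pair.
print(store.pages("satisfies metaproperty", "transitive subgroup property"))
print(store.pages("stronger than", "normal subgroup"))
```

With categories, each of these lists would need its own category page (for example, one per metaproperty, plus one per "stronger than" target). With relations, a new kind of list needs only a new query, which is why annotating pages through the article-tagging templates scales so well.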

Article rating

This was another long-awaited extension (ReaderFeedback): it allows viewers to rate articles by selecting from a drop-down, displayed both at the top and the bottom of the article page (the extension only does it at the bottom; I tinkered a bit with the code to show it at the top as well). The extension has been installed on Groupprops; it will be carried over to other wikis as well.

Top hundred cities for Groupprops

It’s both interesting and inspiring to know that content in Groupprops is being used (and hopefully liked) by people in many different regions of the world. Of course, both the choice of subject matter (i.e., group theory) and the choice of language medium (English, with a bias towards American spelling) limit the usefulness of the material to most people.

Here is a PDF file with a break-up of views over the past fourteen months for the top hundred cities. A few comments are in order:

  • A sequence of pages viewed in one shot is treated as one visit. This is what the number of “visits” counts. A “large depth visit” is a visit with five or more pageviews.
  • The top rank of Chicago is probably explained by the fact that I use the wiki from Chicago, significantly affecting the number of edits. I have not accessed the wiki from any other city (save a couple of times from Bangalore) so all other cities reflect genuine numbers.
  • Keep in mind that this is an English-language wiki, so viewership is biased towards countries where university-level mathematics is done in English. This explains to a large extent the dominance of the United Kingdom as opposed to France, Germany, Russia and other European countries. It might also explain the relatively lower numbers from cities in China and Iran compared to their higher education population.
  • The position of Indian cities may have also been inflated by the fact that many people in India have heard of the wiki directly or indirectly from me.
  • Within each country, we see that cities with more higher education universities do better. Hence, the good performance of Cambridge and Oxford in the United Kingdom. Similarly, the good performance of Chennai, Mumbai and Bangalore (which have the bulk of the top-quality math higher education and research) in India. In the United States, we see that, in addition to usual suspects like Chicago and New York, cities such as Ithaca (home to Cornell University) have done well.

Language issues

This is just a few quick notes about some of the issues of language and conventions that come up when dealing with subject wikis.

One of the most basic questions is what language to use. English is the only language I know where all the requisite terminology and ideas have been developed sufficiently, so that’s what I’m using, but there is still the tricky matter of what language variant to use. The current choice is American English as the default. This is partly opportunism: the bulk of traffic comes from the United States, so catering to American English is probably better for the bulk of traffic.

I’m not aware of any software features in the English MediaWiki that can automatically handle the minor spelling changes between different English language variants. I did recently discover that some pretty clever stuff has been done to handle issues of traditional versus simplified Chinese on the Chinese Wikipedia. Probably, the differences between different variants of English (the standard Queen’s, American, Canadian, Australian, and others) are too small to have prompted similar work.

I have made generous use of redirect features to automatically redirect between different language variants for the spellings of terms, but the current solution is unsatisfactory.

Even fixing a particular language variant doesn’t solve the problem because the need for mathematical terminology typically requires one to distort language anyway. For instance, I need to use “closedness” for describing whether something is “closed”, rather than the more pleasing “closure”, because the latter is used for the act of making something closed. Then, there are issues of hyphenation and capitalization. By and large, I have tried to follow rules that are reasonably consistent and unambiguous and also match up with the way mathematicians typically use the words. The butchery of language is likely to displease many. On the other hand, the good thing about a wiki structure is that the tools for navigating and searching for information can be so good that spelling discrepancies and weird-sounding word formations aren’t a hindrance to finding information quickly.

More disturbing are issues of convention within the subject. For instance, many people on the pure side of group theory use the “right exponentiation” notation for group actions, while representation theorists and many people in other parts of mathematics use the left action convention. Having a single convention for the wiki might be appealing, but it doesn’t seem quite right because results in a particular sub-discipline should be stated in the language that people of that sub-discipline are used to.

The solution I’m gradually coming to for this is roughly as follows: In most cases, try to have a statement that doesn’t get notational at all, so that people do not have to worry about issues of left and right. In addition to this, have notational versions of the statement, and give them in left-right pairs — for each left action convention version, give the corresponding right action convention version. For instance, consider the definition of pronormal subgroup.
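As an illustration of such a left-right pair, here is the standard definition of pronormality written out in both conventions. The notation (H^g for the right action, {}^g H for the left action) is a common choice and an assumption here; the wiki page may typeset it differently.

```latex
% Notation-free statement:
% H is pronormal in G if every conjugate of H in G is already
% conjugate to H inside the subgroup generated by H and that conjugate.

% Right action convention: H^g := g^{-1} H g
\[
  \forall g \in G \;\; \exists x \in \langle H, H^{g} \rangle
  \quad\text{such that}\quad H^{x} = H^{g}.
\]

% Left action convention: {}^{g}H := g H g^{-1}
\[
  \forall g \in G \;\; \exists x \in \langle H, {}^{g}H \rangle
  \quad\text{such that}\quad {}^{x}H = {}^{g}H.
\]
```

The two versions define the same property, since replacing g by its inverse turns one into the other; only the bookkeeping differs.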

Success parameters for the subject wikis

It’s been slightly more than a year since I booked the subwiki.org domain, and some of the people with whom I’ve discussed the idea of subject wikis have told me that it’s an extremely ambitious project that requires more hands and better ways of measuring success. At the end of the day, it might turn out to be flawed in some basic way, or to not be feasible for various reasons — the net benefit to people from reading subject wiki entries may just not be worth the effort of creating them.

Frankly, I don’t know. A few people have shown interest in contributing, but there isn’t a steady base of contributors yet, and obviously, for the project to scale, it needs contributors. On the other hand, the subject wikis need to get a lot better before people can look at them and make an informed decision of whether this is something good enough to contribute to.

As far as usage is concerned, the group theory wiki is enjoying decent usage — around 8000 pageviews per month, which is not bad. On the other hand, this definitely isn’t a spectacular amount of usage, and the question of whether the same level of usage can be achieved for other subject wikis remains a moot point.

So I’ve decided to create two lists. The first list indicates some completion goals, that largely involve efforts I need to put in (with possible future collaborators) on the subject wikis. The second list is a list of external success parameters; things that, if they happen, will make me say, “Okay, the subject wikis are successful.”

Here’s a list of completion goals:

  1. Groupprops guided tour for beginners: I plan to complete the tour by December 2009. A completed tour will be a way of ensuring that the wiki has all basic group theory material (something that it almost, but not quite, has right now). It will also be a valuable resource in its own right for group theory beginners and other students.

  2. A subject wiki in microeconomics: I’ve started on this one, and I plan to bring it to a reasonable stage by June 2009. Basically, I want it to reach a state where it starts becoming clear which of the subject wiki principles apply outside of mathematics.

  3. A Ramsey theory subject wiki: This is a challenge I took up in this blog post, and I plan to work on it, hopefully reaching an interesting stage by August 2009.

  4. Shape up the subject wikis reference guide to something that can be used to find the meaning of practically any mathematical term. I plan to be done with this by December 2009.

Here, now, are some events that would make me consider the subject wikis a success:

  1. Independent review of the subject wikis on a separate website, that does not trash them completely.

  2. A single day with more than 1,000 pageviews for any one subject wiki.

  3. A regular collaborator or somebody who decides to spearhead or take responsibility for one of the subject wikis, and develops it to a reasonable extent (say, over 100 articles).

I’ll add more success parameters as I think of them, and note if any of these get satisfied!

Proof style in the math subject wikis

Proofs are one of the defining features of mathematical content: practically any mathematical work is considered incomplete without a proof. A “proof” of a mathematical fact is a clear explanation of why the fact is true, starting from some previous known facts and agreed-upon definitions.

When I initially started up Groupprops, my idea was to create a repository of basic definitions of mathematical terms, organized using the property-theoretic paradigm. Proofs had a secondary role, and I initially didn’t intend to put separate pages about facts. However, within a month or so of editing, I realised the importance of having separate pages about facts. A fact here refers to a precisely formulated mathematical statement that has a proof. Each of the fact pages has its proof section.

Over time, I have worked to improve the fact pages to make them more useful and more actionable. Consider a wide range of fact pages in Groupprops: characteristic implies normal, normal not implies characteristic, normality is not transitive, and equivalence of definitions of group are some starting examples. There is also a categorized list of facts and a complete list of all facts, for those who want many more.

This subject wiki page describes the general layout of fact articles. In this blog post, I’ll explain a bit about why I’ve made certain choices about designing fact pages.

The first maxim I’ve tried to follow is that for simple but extremely important facts, I’ve striven to provide complete, hands-on proofs. This is so even when the proof is obvious or follows directly from some other proof. The reason is that a lot of time, a learner or researcher tends to get confused about a simple fact, and looking up a simple proof quickly can help dispel the confusion.

Second, in most cases of simple but important statements, I’ve also tried to provide alternate proofs to the hands-on proofs. Usually, these alternate proofs aren’t different: they just couch the hands-on proofs in a slightly different formal language that enables different directions of generalization. The characteristic implies normal page is an example.

In some cases of complicated statements, a hands-on proof (that assumes very few basic facts) is accompanied by another proof that simply combines a list of other facts, essentially a breakup of the proof. This gives learners flexibility between reading the direct proof and understanding the different components of the proof in a modular fashion. Learners who want to generalize some of the proof ideas may prefer the modular fashion. Learners who just want to understand the proof without being burdened by too many special ideas may prefer the hands-on proof. Of course, even learners who want to generalize the idea may prefer the hands-on proof if the way they’re generalizing is orthogonal to the way the breakup is done.

Third, for facts whose proofs use other facts, there is a “Facts used” section that lists all the other facts that need to be used for the proof. These are numbered and the facts are referred to by number in the actual proof. The main advantage I see for this is that even people who do not want to go through the proof can eyeball and get an idea of what facts are used in the proof. Here’s an example of the critical subgroup theorem where the list of facts used is rather long.

This is in contrast with textbook proofs and the typical presentation style of proofs, where the list of facts used is typically not stated before the proof, but rather, the facts are referred to by theorem or lemma number within the proof. I think the approach used in the subject wikis is clearer particularly when the facts used are of a diverse range. It’s also more suited to a wiki, that thrives on links, as opposed to a sequenced text, that is based on the order in which material appears.

A fourth thing I’ve tried to do is that, for step-by-step proofs that involve arguments within arguments, I’ve used an enumerated point presentation. Enumerated points allow one to refer more easily to previous steps. Further, the enumeration enables people to get a sketch or outline of the proof, because in most cases, I begin the enumerated point with a statement of the goal of that point. Thus, people can just read the goal of each point and get an idea of the proof direction. Learners who want to get practice with proving stuff can note down the goal of each point and then try to fill in details themselves without looking at what is given.

Fifth, most fact pages have lists of related facts, which include generalizations, applications, corollaries, similar statements, opposite statements, and other facts using similar proof ideas. They also give the proof in multiple versions, often using different notations (for instance, one proof using the left-action convention, another using the right-action convention). Finally, for proofs that are particularly easy if specific definitions are chosen for the terms involved in the proof, a separate section called “Definitions used” is present that describes the definitions of the terms used in the proofs.

Arguably, a disadvantage of these is that even simple facts that usually merit only a couple of lines (not even a formal proof) often have extremely long pages on the wiki. I don’t consider this a disadvantage because, in my experience, the simplest ideas are often launching pads for more complicated questions. If the content of the page provides a large number of starting points for further research, I think the goal of having a wiki page on such facts is met.

Another thing I’ve tried to do systematically is to treat counterexample pages at par with proof pages. Many mathematical facts are essentially just counterexamples to theorems that we might have hoped for. For instance, normality is not transitive. But each of these gets its own page just as much as the theorems do, even if the “proofs” of these essentially just involve describing one counterexample. Thus, for instance, Category:Subgroup property implications, which lists implication relations between subgroup properties, has a counterpart, Category:Subgroup property non-implications, which lists statements about implication relations that do not hold.

In the offing now are ideas about how to make proofs more interactive and actionable. For counterexample pages that can be verified computationally using software such as GAP, I’ve tried to give short GAP code that allows people to “check” the counterexample. This includes both specific counterexamples and general classes of counterexamples (see, for instance, the GAP implementation in normality is not transitive). My goal is to make this available for all specific counterexamples.
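To give a flavor of what such a computational check involves, here is a minimal sketch in Python rather than GAP. It is not the code on the wiki; the function names and the choice of the dihedral group of order 8 acting on the four vertices of a square are assumptions made for illustration. It checks that a subgroup H of order 2 is normal in a Klein four-subgroup K, that K is normal in the whole group G, and that H is nevertheless not normal in G.

```python
def compose(p, q):
    """Return the permutation 'apply q first, then p' (permutations as tuples)."""
    return tuple(p[i] for i in q)


def inverse(p):
    """Return the inverse permutation of p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


def generate(generators):
    """Return the finite group generated by the given permutations."""
    identity = tuple(range(len(generators[0])))
    elements = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for s in generators:
            h = compose(s, g)
            if h not in elements:
                elements.add(h)
                frontier.append(h)
    return frozenset(elements)


def is_normal(subgroup, group):
    """Check that g h g^-1 lies in subgroup for all g in group, h in subgroup."""
    return all(
        compose(compose(g, h), inverse(g)) in subgroup
        for g in group
        for h in subgroup
    )


# Symmetries of a square with vertices labelled 0, 1, 2, 3 (illustrative choice).
r = (1, 2, 3, 0)       # rotation by a quarter turn
s = (0, 3, 2, 1)       # reflection fixing vertices 0 and 2
r2 = compose(r, r)     # rotation by a half turn

G = generate([r, s])   # dihedral group of order 8
K = generate([s, r2])  # a Klein four-subgroup
H = generate([s])      # a subgroup of order 2

print(len(G), len(K), len(H))
print("H normal in K:", is_normal(H, K))
print("K normal in G:", is_normal(K, G))
print("H normal in G:", is_normal(H, G))
```

The brute-force conjugation test is only practical for small groups, which is all a specific counterexample of this kind needs.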

GAP-actionability would also be good even for theoretical proofs. For instance, there could be situations where key ideas in proofs can be illustrated computationally.

Another idea is to provide flow diagrams that encode the proof, particularly for complicated proofs. I don’t yet have samples of this.

A third idea, that I have partially implemented, is to guide readers towards general survey articles that discuss the fact as well as the ideas behind the proof. Some of these survey articles specialize in knitting together the different facts known about certain kinds of things, as well as how the proofs of these facts follow from some general ideas and from each other. Other survey articles specialize in proof techniques that apply to a wide range of things. In the former category is an article such as deducing basic facts about Sylow subgroups and Hall subgroups and in the latter category is disproving transitivity.

I’ll be interested in seeing how these ideas work out, and whether they work well on the other math subject wikis.

Short survey added

I’ve created a short survey on Groupprops. A link to the survey comes up randomly in the sitenotice on Groupprops, so I’m hoping that people who visit the site will click on the link and fill out the survey. This is a very preliminary survey, and it is my first experience with surveys, so I’ll use it as a way to learn about surveys, which will hopefully enable better design in future.

Usage analysis on Groupprops

The Group Properties Wiki has been at its new location since around May 2008, and Google Analytics has been operational on it since May 10. Using Google Analytics, I’ve been able to get a fairly good idea of what pages people prefer when they visit Groupprops, how they use the website, and how to make it better. Here, I’ll share some of my observations.

General visitor trends

First, there has been a largely steady increase in the number of visitors to the website. The initial increase can probably be attributed to the fact that search engines took time to index the site at its new location, and many people were using the old location (which is still up) because search engines pointed to it. From May to July, the number of distinct visits per day varied between zero and twenty. In August, it rose to around 20-40. After that, it rose slowly and steadily till the beginning of December. There was a slump in the second half of December and the beginning of January, something that I suspect is due at least in part to it being vacation season in many parts of the world. Since January 25, traffic has increased significantly, and it now averages around 80-120 visits per day on weekdays, and 30-70 visits per day on weekends.

Variation within the week is what one might expect of a typical academic or work-related site: high during weekdays, lower during weekends.

The bulk of visitors (around 78%) seem to find the content through a search engine, and the search engine driving the most visitors to the website is Google. Google has sent the site approximately 7000 visits, compared to a total of a few hundred for all other search engines. There are two possible reasons: first, Google shows Groupprops entries more frequently than other search engines do, and second, the user profile that is targeted by the website uses Google more frequently. I suspect that a combination of the two applies.

People from 105 countries have visited the wiki. The country-wise distribution of visits is roughly as follows: the United States accounts for the bulk of visits (around 5000), followed by the United Kingdom (around 1000), Canada, and India. Also high are Israel, Australia, Germany, France, Turkey, and Saudi Arabia. Part of the bias towards the United States and the United Kingdom may be due to language: an English-language website is likely to get more traffic from English-speaking countries.

Navigation patterns

Groupprops is a rather large site, with over 2000 pages, and not all pages receive equal traffic. Nonetheless, people often read pages on fairly abstruse topics. The depth of visits ranges widely, with the average visitor reading 2.50 pages per visit, and the bounce rate is about 60%. This means that most visitors just open one page, or read a couple of pages, and then explore no further. On the other hand, there are a number of visits of larger depth. As such, the proportion of visits of large depth has not changed significantly even as the total traffic has increased.
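The two figures above can be combined in a small back-of-the-envelope calculation, sketched here in Python. It assumes that a “bounce” means a visit that views exactly one page; the function name is made up for illustration. Under that assumption, the visits that do not bounce must average noticeably more than 2.50 pages each.

```python
def non_bounce_depth(avg_pages_per_visit, bounce_rate):
    """Average pages per visit among visits that view more than one page.

    Assumes every bounced visit views exactly one page, so that
    avg = bounce_rate * 1 + (1 - bounce_rate) * depth_of_other_visits.
    """
    return (avg_pages_per_visit - bounce_rate * 1) / (1 - bounce_rate)


# Figures quoted in the post: 2.50 pages per visit, about 60% bounce rate.
depth = non_bounce_depth(2.50, 0.60)
print(round(depth, 2))
```

With these inputs, the roughly 40% of visits that go beyond the first page work out to about 4.75 pages each.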

The most frequently visited pages seem to be the pages about specific groups. These pages are surprisingly popular considering that they are neither among the best developed nor among the most heavily linked to pages. Three of these pages are symmetric group:S3, symmetric group:S4, and dihedral group:D8. Other pages that seem to be extremely popular include nilpotent group and solvable group.

It also seems that few visitors are using most of the tools provided for locating and exploring information efficiently. It is possible that a number of visits can be categorized as follows: “have a question, type keywords into a search engine, follow a link to Groupprops, get (or not get) an answer, then close.” Nonetheless, it is likely that as people come to find the site more valuable, they’ll discover more of the many different ways of efficiently navigating it. As of now, these tools at any rate help search engines get a better idea of how the material on the site is related (not to mention that the tools help me get around the site).

Referring sites

There are very few sites referring to Groupprops, other than websites I own. A couple of websites link to the subject wikis reference guide, which is a central point for subject wikis. There was a sharp positive spike in traffic to the subject wikis reference guide in January when a user posted a link on Reddit.

I might say here that given the current developmental, pre-beta status of the subject wikis websites, it is to be expected that not too many external sites would point to them. I hope that with steady improvements, they eventually reach the point where people find them good enough to link to.

Lessons learned from visitor trends

The first lesson, which is obvious to most webmasters, is that the way people tend to use a website is usually very different from the way one might anticipate their using it. I hadn’t thought that pages on symmetric and dihedral groups would be the most viewed, and I’d expected a larger fraction of the visitors to use the many features I’d built for navigating the site more efficiently.

The second lesson I learned is about my own personal goals here. To some extent, I realized that the goal of subject wikis isn’t merely to cater to what users want. User feedback is useful: the fact that people are keen to learn more about specific groups indicates that adding more pages on such groups would answer the unspoken needs of many. This is something I plan to pursue. On the other hand, I have no intention of re-engineering the website to put excessive focus on particular groups. That’s because to me, the group properties wiki is also an expression of the way that I (and in many cases, other researchers, though there are some organizational aspects that cater to my idiosyncrasies) look at the subject. Here, specific groups are extremely important, but there are a number of general concepts and ideas that are also important. The goal of putting this up in the form of a wiki is to give people a chance to play with the way the subject is structured.

The goal of subject wikis

Having described the history of subject wikis in a previous post, I can now get to describing what I consider the ultimate goal, mission, and vision, of subject wikis.

This is emphatically not my first attempt at formulating a general goal for subject wikis. In March 2008, shortly before I booked the subwiki.org domain and moved the wikis, I brainstormed by myself about subject wikis. I came up with a long and enthusiastic statement of purpose for the subject wikis. This was on paper, while I was waiting for an appearance by Bill Gates. Later, I refined these ideas and wrote up a short private file describing the mission. Since then, I have thought about it considerably, but somehow didn’t get around to posting the mission statement publicly.

Prior to the general development of subject wikis, I had composed separate pages on Groupprops comparing Groupprops against other online math resources: Groupprops versus Wikipedia, Groupprops versus Mathworld, and Groupprops versus Planetmath. In addition, I had also written up a purpose statement and a what makes us special page. These pages are still present and haven’t been replaced by more generic pages on the subject wikis framework, largely because my new expanded insights are not mature enough yet. (After I’m done with this blog post, I might find them mature enough).

My basic description of the goal is as follows: I want to appeal to two fundamental attributes of people. These are curiosity, which makes them ask questions and seek answers, and laziness (or, more politely, thriftiness and economy), which makes them seek to minimize the effort they put in to get answers. I want to provide a knowledge resource that caters to people’s curiosity two-fold: it answers their immediate questions, but opens the fount further by raising several more questions and stimulating them further. At the same time I want to appeal to their laziness by providing everything: all the resources, all the details, and a big picture at the same time, while simultaneously giving them information on how they can get answers even faster.

Curiosity and laziness. Satisfaction and stimulation. Satisfy the user’s curiosity, stimulating it further. Satisfy the person’s desire for laziness and economy, and stimulate the person’s interest in learning quicker and more efficient ways to learn.

I first formulated this goal explicitly in March 2008. This explicit formulation led to many changes in the way I worked on the subject wikis. I realized, for instance, that if people are to be encouraged to explore with their given level of laziness, they need to be given easy options. This led me to create the careful text box quotations at the top that link to related articles in a systematic fashion. People, following their curiosity and using the easy links out of laziness, would soon “learn” the pattern of organization and develop a deeper intuition. See, for instance, an entry such as characteristic subgroup, where the boxes at the top provide many useful links. I also realized that working on good and easy-to-use, inviting overall organization was important.

Satisfying and stimulating curiosity and laziness does not completely define subject wikis. (In fact, the motto is so generic it could apply to anything such as food, sex, news, or entertainment). Rather, there is something more that describes subject wikis. These are fundamentally user-driven tools. By this, I mean that the path of exploration is chosen entirely by the user, at the user’s will, at any time and in any manner of the user’s choosing. For this, the tool itself should be available all the time, easy to locate, reliable both in terms of content and presentation, and helpful but not intrusive.

Some of the specific principles derived from this include:

  • Modularization with pinpoint referencing: Articles are the basic units, and each topic has its own article, achieving a high level of granularity and modularity, so it is easy and quick to get an answer to a specific question. Pinpoint referencing is achieved by canonical naming (the name of a page on a topic is precisely that topic), good redirection and disambiguation, excellent search features, and good-quality categorization.

  • Strong internal linking: Pages are linked to closely related pages in a way that symbolizes and explains the manner of the relationship. This allows for easier location of new facts, stimulation of curiosity, and expansion of knowledge.

  • Standardization: The page format is standardized for similar pages, leading to predictability and reliability.

  • Genericity: This principle is important and not obvious. Individual subject wiki pages should largely make sense as independent entry points into the wiki, so that people coming from outside can go straight there. While they should link to other subject wiki entries, they should not be dependent on them in a strong sense. Most important, there should be no forced sequencing of the entries as in a textbook, where future entries depend on earlier ones.

The genericity is illustrated by the difference between building a road network and building a bus system. A road network serves all directions: it allows for a plethora of routes that users can choose, whether by foot or car or bus. On top of this, a bus route system can be introduced. This route system operates buses along specific routes, and people who want to go along those routes can take the buses. However, the robust and generic road network allows people to freely choose other routes. This is the core of the idea of being user-driven: the users choose their direction.

Ultimately, this genericity, combined with the ease of use that comes from modularity and pinpoint referencing, can make subject wikis a useful starting point for learning, research and exploration, stimulating and satisfying curiosity, and its much-maligned cousin, laziness.

Before ending, I’d like to share an illustrative anecdote. All too often, it happens that in the middle of a mathematical discussion, one of the participants takes out his or her iPhone or goes to his or her laptop and checks the Wikipedia entry. This is usually better than nothing, but Wikipedia doesn’t usually stimulate the conversation, in the sense of whetting and further stimulating the curiosity of the curious people while answering their immediate questions. I think of each such occasion as an opportunity partly lost.

Nearly a year ago, two friends of mine were enjoying the University’s Happy Hour and talking about some math when they wondered how exactly the Artin-Tate lemma (a result in commutative algebra) is proven. They were having difficulty reconstructing some point in the proof. Well, one of my friends had an iPhone, and he whipped it out, did a Google Search, and landed at the Commalg entry (actually, they landed at the entry on the old version at Wiki-site, which still seems to be the top entry on a Google Search). Reading this entry helped them fill in the gap. The goal of subject wikis is to do this on a substantially wider scale, daring people to be more curious and ask more questions with the confidence that the answer, along with rewards in the form of more knowledge and more questions, is just a mouse click away.