Friday, May 20, 2011

Initial results

Data has been pouring in for the last few days, so here's where things stand. Here's a screenshot of the data:

[Screenshot of the data]

Total visits recorded: 441
Number of participants: 11
Average visit length: 3.64 page views
Domains visited by participants:

  • espn.go.com
  • facebook.com
  • huffingtonpost.com
  • hulu.com
  • news.ycombinator.com
  • nytimes.com
  • reddit.com
  • techcrunch.com
  • twitter.com
  • youtube.com
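
The plugin's logging format isn't shown in the post, but summary numbers like the ones above (total visits, participant count, average page views per visit, visits per domain) amount to a simple aggregation over the logged visits. Here's a minimal sketch of that kind of aggregation, assuming a hypothetical Visit record rather than the plugin's real schema:

    // Hypothetical shape of one logged visit; the plugin's actual schema isn't shown in the post.
    interface Visit {
      participantId: string;
      domain: string;
      pageViews: number;
    }

    // Aggregate the kinds of summary statistics reported above.
    function summarize(visits: Visit[]) {
      const totalVisits = visits.length;
      const participants = new Set(visits.map(v => v.participantId)).size;

      const views = visits.map(v => v.pageViews).sort((a, b) => a - b);
      const avgViews = views.reduce((sum, n) => sum + n, 0) / totalVisits;
      const medianViews = views[Math.floor(views.length / 2)];

      // Visits per domain, the breakdown behind the domain-visits graph.
      const visitsByDomain = new Map<string, number>();
      for (const v of visits) {
        visitsByDomain.set(v.domain, (visitsByDomain.get(v.domain) ?? 0) + 1);
      }

      return { totalVisits, participants, avgViews, medianViews, visitsByDomain };
    }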

[Graph of domain visits]

[Graph of participants]

[Graph of the page views per visit distribution]

There isn't enough data yet to draw some of the correlations I'll eventually want to examine. However, the data so far looks promising, and we can already make some judgments:

  • People are using the plugin: 11 participants have logged 441 total visits in just 2 days, so another week should yield plenty of data.
  • The majority of visits are to Facebook, which makes sense considering its traffic. I want to see how this bears out over a larger population, but if we can determine that Facebook is indeed one of the top productivity killers, then future efforts can focus on curbing procrastination on just that site.
  • The average visit length is just over 3 page views, which seems very short to me. Going further, the median visit length is 2 page views, with the majority of visits being either 1 or 2. I would have expected more, but perhaps this is a sign that the plugin is working, since it induces a delay on every page load (a rough sketch of that mechanism follows this list).
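
The plugin's source isn't part of this post, but a fixed page-load delay like the 2-second one discussed in the comments could be injected from a Chrome extension content script roughly as sketched below. This is only a minimal illustration, assuming the page is hidden until a timer fires; it isn't necessarily how the plugin actually implements the delay.

    // content-script.ts: assumed to run at document_start (illustrative sketch only).
    // Hide the page as soon as the document element exists, then reveal it after a fixed delay.
    const DELAY_MS = 2000; // the 2-second delay mentioned in the comments

    const hideStyle = document.createElement("style");
    hideStyle.textContent = "html { visibility: hidden !important; }";
    document.documentElement.appendChild(hideStyle);

    // Once the delay elapses, remove the style so the page becomes visible.
    window.setTimeout(() => hideStyle.remove(), DELAY_MS);

Hiding the page via a temporary style element (rather than delaying the network request itself) keeps the sketch self-contained, but other approaches are equally plausible.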

8 comments:

  1. Looks great Ravi! One note on your graph of participants: even if they are anonymized, I think it would still be useful to assign each one an id, or just an index 1-11, to place on the graph below them (and to use when discussing them specifically in any tables you might generate).

    What exactly is your "average visit length" counting? Is this a count of how many pages into some procrastination website people get before giving up / leaving? If so, I find the low number to be in line with Facebook being the number 1 site, since when I go to Facebook at least it's usually just to check the feed, and maybe click on one or two interesting new things there.

    I remember you had chosen a specific length of delay to give users. Have you given any thought to giving a larger delay for more frequent visits, or just trying out the plugin with a higher and lower constant delay to see if this changes the statistics you get? I think that could be pretty insightful.

  2. For your "Graph of domain visits," you may want to consider using a log scale so that it is easier to look at the differences between the websites that aren't Facebook.

    Also, I don't really understand what your "Graph of participants" is showing. What does each axis represent? Page views per person?

    Also, for chart clarity, you may want to consider sorting by bar height, unless the data needs to be in a specific order.

    Overall, pretty interesting data so far! I'm really interested to see how user behavior changes over time, once the novelty of the extension wears off.

  3. Hey Ravi,

    I know you've probably said this in class, but I can't remember: how are you dividing up your two conditions? Are you going to collect data about these subjects' normal behavior without the 2-second delay and then compare it to their behavior once you induce the delay? Or do you give half the participants a plugin with no delay and half a delay?

    I'm just wondering because it would be really nice to see this data in comparison to something. Obviously, you'll need your control group to draw any conclusions, but it would also help to visually understand your data. I'm definitely excited to see your results.

  4. First, a quick question: is the graph of participants depicting the visits by them (i.e., the distribution of the 441 visits)?

    I think your experiment is really interesting, especially for people like me who waste A LOT of time on Facebook and have tried every which way to get over the "addiction" :P

    As far as your experiment goes, the results look promising. However, I was just wondering: on what basis do you decide the induced delay? For instance, we can clearly see Facebook is the most popular site when it comes to procrastinating, so if a person keeps visiting the site often even after the 2-second delay, does the delay increase to further curb the number of visits? An unreasonable increase in delay would be bad, but I would be interested to see the delay threshold beyond which visits to popular sites (not much affected by a 2-second delay) begin to decrease!

  5. Your second hypothesis is that the longer the delay, the more likely users are to close the tab before the page loads. Do you have data for this? It seems like this must be a super infrequent event, possibly too rare to pick up anything meaningful on. Usually I'd just navigate somewhere else in the same tab. Do you catch that, or only if the tab is closed?

  6. Some graphs are missing labels on the x and y axes, which makes it difficult to quickly see what is going on.

    Can you get more test users? I see the current data is very heavily skewed toward the Stanford area, very heavy in tech/Hacker News and some sports as well. I highly doubt the rest of the world hits these same sites.

  7. I have the same question (the one Melissa asked): what is your baseline comparison? For this kind of experiment, I think you would need to do a within-subjects experiment (see people's behavior for a week without your plugin and then compare it with their behavior for another week with your plugin installed).

    A consistently higher delta (difference in time spent with your plugin vs. without your plugin) would really strengthen your hypotheses.

  8. Yo this is sweet-- I'll be very interested to see what the results are. I'm also a bit curious about your baseline. Are you comparing users to themselves over different amounts of latency, or are you only inducing latency on some clients, or what? The one thing that strikes me about comparing users to themselves is that you obviously have to be careful about latency sequencing so that the results aren't screwed up.

    Also, if you want to get some background / average data on some of these sites, I'd probably be able to help you out, so just let me know if you need even just like a ballpark baseline or just some other data to compare against.
