Friday, May 20, 2011

Initial results

So data has been pouring in for the last few days and here's the deal. Here's a screenshot of the data:




Total visits recorded: 441
Number of participants: 11
Average visit length: 3.64 page views
Domains visited by participants:

  • espn.go.com
  • facebook.com
  • huffingtonpost.com
  • hulu.com
  • news.ycombinator.com
  • nytimes.com
  • reddit.com
  • techcrunch.com
  • twitter.com
  • youtube.com

Graph of domain visits:



Graph of participants:



Page views per visit distribution:



There isn't enough data yet to draw some of the correlations I will eventually want. However, the data so far looks promising. We can make some judgments:

  • People are using the plugin: there are 11 participants who have logged 441 total visits in just 2 days, so with another week of collection there should be plenty of data.
  • The majority of visits are to Facebook, which makes sense considering its traffic. I want to see how this bears out over a larger population, but if we can determine that Facebook is indeed one of the top productivity killers, then future efforts can focus on curbing procrastination on just that site.
  • The average visit length is about 3.6 page views, which seems very short to me. Going further, the median visit length is 2, with the majority of visits being either 1 or 2 page views. I would have expected more, but perhaps this is a sign that the plugin is working, since it induces a delay on every page load.
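A mean of 3.64 sitting next to a median of 2 is what you would expect from a right-skewed distribution, where a handful of long visits pull the average up. Here is a minimal sketch of that effect. The list below is made up for illustration (it is not the study's data); I just picked numbers that happen to give the same two summary statistics.

```python
from statistics import mean, median

# Hypothetical page-view counts per visit (NOT the real log data).
# Mostly 1s and 2s, plus a few long visits that drag the mean upward.
page_views = [1, 1, 1, 2, 2, 2, 2, 3, 4, 8, 14]

print(f"mean   = {mean(page_views):.2f}")   # mean   = 3.64
print(f"median = {median(page_views)}")     # median = 2
```

The median is the better "typical visit" number here, since most visits really are 1 or 2 page views.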

Tuesday, May 17, 2011

Privacy issues solved. DOWNLOAD IT NOW!

So the plugin is good to go. Download it here: http://stanford.edu/~rparikh/cgi-bin/delaybot/delaybot.zip

What I've changed is that it no longer logs the whole URL, just the source website, so you don't have to fear me snooping on the specific pages you visit; I only see the domain. Also, it no longer logs the actual IP address, just a hash of the IP so that I can tell users apart.
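To make the two privacy changes concrete, here is a rough sketch of the idea in Python. This is not the extension's actual code: the function names, the salt, and the choice of SHA-256 are all assumptions made for illustration. One thing worth knowing is that a hash of a bare IPv4 address can be reversed by brute force (there are only about 4 billion of them), so the sketch mixes in a secret salt.

```python
import hashlib
from urllib.parse import urlsplit

# Hypothetical secret value; any real deployment would keep this private.
SALT = "delaybot-example-salt"


def domain_only(url: str) -> str:
    """Keep just the host part of a URL, dropping path and query string."""
    return urlsplit(url).hostname or ""


def hash_ip(ip: str, salt: str = SALT) -> str:
    """Return a short, stable pseudonym for an IP address."""
    digest = hashlib.sha256((salt + ip).encode("utf-8")).hexdigest()
    return digest[:16]  # truncated only to keep log lines short


print(domain_only("http://www.reddit.com/r/programming/comments/abc"))
print(hash_ip("192.0.2.1") == hash_ip("192.0.2.1"))  # same user, same hash
```

The same IP always maps to the same pseudonym, which is all that is needed to separate participants in the log.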

INSTRUCTIONS:

  1. Download the plugin here: http://stanford.edu/~rparikh/cgi-bin/delaybot/delaybot.zip
  2. Double-click the zip archive to open it up
  3. Open Google Chrome
  4. Go to "Window > Extensions"
  5. Click the + icon next to the label that says "developer mode" if it is not already open
  6. Click "load unpacked extension" and it will bring up a file select dialog
  7. Navigate to the "delaybot" folder that came out of the downloaded zip archive
  8. Click "select" to choose that folder
  9. Make sure you enable the plugin in chrome
  10. That's it!
I would really appreciate it if you wanted to download the extension and help me out! Please ask me any questions or concerns you have.

Friday, May 13, 2011

More updates


So I've finally set up my extension to log all user activity to an external log file. My log file is here: http://stanford.edu/~rparikh/cgi-bin/delaybot/log.txt

Essentially I'm logging user actions as comma-separated values. The fields I'm logging are:
datetime, delay length (ms), url, tabid, action, ip address

To explain further...delay length is the delay length in milliseconds that my extension has induced on that particular page load, url is the url they visited, tabid is Chrome's assigned tab id, and action is whether the user was opening or closing the tab.
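For anyone who wants to poke at a log like this, here is a small parsing sketch. The field order follows the list above, but the sample line, the timestamp format, and the "open" action label are my guesses for illustration; the real log may format its values differently.

```python
import csv
from dataclasses import dataclass
from io import StringIO


@dataclass
class LogEntry:
    when: str       # datetime, kept as text since the exact format is assumed
    delay_ms: int   # delay the extension induced on this page load
    url: str
    tab_id: int     # Chrome's assigned tab id
    action: str     # whether the tab was being opened or closed
    ip: str


def parse_log(text: str) -> list[LogEntry]:
    """Parse comma-separated log lines, skipping blank or malformed rows."""
    entries = []
    for row in csv.reader(StringIO(text), skipinitialspace=True):
        if len(row) != 6:
            continue
        when, delay_ms, url, tab_id, action, ip = row
        entries.append(LogEntry(when, int(delay_ms), url, int(tab_id), action, ip))
    return entries


# Hypothetical example line, not copied from the real log.
sample = "2011-05-13 14:02:11, 2000, http://www.reddit.com/, 42, open, 192.0.2.1\n"
for entry in parse_log(sample):
    print(entry.url, entry.delay_ms, entry.action)
```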

My extension is available for download here: http://stanford.edu/~rparikh/cgi-bin/delaybot/delaybot.zip

I haven't put it up on the extension store because I want to control the distribution for now. Right now I'm just going to offer it to friends and people in the class, and track usage for a few days to make sure that nothing fishy is happening and then distribute it more widely.

Friday, May 6, 2011

Updates

I've modified my browser extension so that it records every time a user visits a site on The List™, and sends this back to a server which I haven't set up yet (I've just been testing locally for now). The only thing that's missing right now is the ability to detect when a user uninstalls the extension, but I'll get that figured out in the next couple days. Over the weekend I will set up a server, and use the extension on myself and a couple friends to make sure that everything's working alright. Then, I plan to distribute the plugin to a wider audience early next week, in time to get lots of results over the next few weeks, and leave myself enough time to modify the study on the fly if my early results necessitate this.

So yeah, not much new this time around. But things will be going down soon enough.

Friday, April 29, 2011

Pilot Study

My hypotheses are as follows:
1. If a random delay of between 0 and 2 seconds is introduced before a page loads for a non-essential website (i.e. a site that is being visited for recreation purposes only), users will wait longer before visiting that site again, and this "visit delay" will be directly correlated with the previous delay length (perhaps as a Markov process, with the previous N delay lengths).
2. The longer a non-essential website takes to load, the more likely users will be to close the tab before the page finishes loading.

Methodology
For my pilot study I wanted to test how introducing a 2 second delay every time a certain subset of websites was visited would change the web browsing experience. I implemented this on myself. I found myself annoyed by the delay at first, but not so annoyed that I considered disabling the feature or opening an incognito window (where it would have been disabled). I quickly got used to it. I also noticed that my overall browsing of those sites went down a little bit, though this might be due to many other reasons. For example, yesterday I was getting ready to procrastinate but then a wasp flew into my room so I got scared and fled with my laptop. I fled all the way to Meyer, and then being in Meyer I felt guilty about goofing off on the internet with all these studious people around me so I just studied. The conclusion of my brief pilot study is that the extension will not be intrusive for users to the point that they significantly alter their behavior, but may be effective enough to change behavior in subtle ways.

Results!
With the extension (2 days data, Wednesday and Thursday):
23 total visits to facebook
151 total visits to reddit

Without the extension (2 days data, Monday and Tuesday):
59 total visits to facebook
145 total visits to reddit

My overall browsing on reddit went up, but that was mostly due to a reddit binge at the end, when all my work was actually done so I was allowed to. Without that binge, the total number of visits to reddit was 97. I did not measure the time in between visits. I have this data but I have not yet compiled it in any meaningful way.
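For a sense of scale, here are the relative changes implied by those counts. This is plain arithmetic on the numbers above, nothing more, and with two days per condition it says nothing about significance.

```python
def percent_change(before: int, after: int) -> float:
    """Relative change from the no-extension count to the with-extension count."""
    return 100.0 * (after - before) / before


print(f"facebook: {percent_change(59, 23):+.1f}%")                  # -61.0%
print(f"reddit: {percent_change(145, 151):+.1f}%")                  # +4.1%
print(f"reddit (excluding binge): {percent_change(145, 97):+.1f}%")  # -33.1%
```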

The results themselves are not very meaningful, but the main takeaway was that the extension does not pose any major obstacles or problems that would significantly alter what I am trying to test; that is, subtly influencing human behavior. I didn't think it would be meaningful at this point to perform statistical tests on them, given the extremely small amount of data and the fact that the circumstances of data collection in the two trials were so different. The pilot experiment mostly served as a sanity check on the viability of the extension and the experiment.

Thursday, April 28, 2011

Hey Buddy

My buddy is Jonathan Nation, whose blog can be viewed here: http://cs303-jn.blogspot.com/

To recap a bit, Jonathan's idea is to integrate quizzes into Wikipedia pages to test whether this increases knowledge retention. He's going to test whether quizzes work best at the beginning, end, or interspersed throughout Wikipedia content; also, does giving the user negative/positive feedback affect their scores? I think the premise is excellent; I remember reading a thread on Reddit (which Jonathan also references) where a lot of people suggested that Wikipedia quizzes would make the site more fun and increase their knowledge retention. I think there are several challenges in this experiment, however. Quiz design is not a trivial task; I think a lot of pilot testing will be needed to refine these down into fair assessments of ability.

Also, I think it's going to be necessary to have a large sample size, due to a few issues. First, there are several treatments; this project is a lot more ambitious than mine in terms of all the different hypotheses and treatments it is trying to test. This will multiply the number of subjects needed. Also, there is large potential for bias to work its way into this study; if one particular user already happens to know a lot about a particular subject, then clearly they will do disproportionately well on the quiz. Also, things like attentiveness, time of day, how interesting the article is in general, and other circumstances could greatly affect people's ability to answer quizzes. With a large sample size, subject-specific biases can be diminished. Another possible factor is that if volatile Wikipedia pages are chosen, then edits may happen between various treatments and thus make it more difficult to ensure consistency. This is probably easily mitigated by either choosing more stable Wikipedia articles (based on edit history), or by serving users a static version of the page that is pulled at a certain date and held constant.

All in all I'm looking forward to this experiment and the product itself. A Wikipedia quiz browser extension or website that users can edit themselves and share (like Wikipedia itself) would be a great companion tool to Wikipedia and something I would use.

Thursday, April 21, 2011

Methodology!

Prior Work

I looked at a lot of prior work in the study of procrastination. While my research question is about creating an interface to curb procrastination, I thought it might be important to research the psychology behind procrastination so that's what I did.

"Psychological antecedents of student procrastination," 1988, Beswick et al - This paper found positive correlations between low self-esteem and time spent/frequency of procrastination. The study also uncovered a negative correlation between self-reported procrastination and final course grade (gasp!). I don't think the paper was too helpful in the sense that I'm going for, i.e. addressing the moment of choosing to procrastinate, but it was still a good insight into the things that motivate procrastination on a larger scale.

  • "The nature of procrastination: A meta-analytic and theoretical review of quintessential self-regulatory failure," 2007, Steel. Another paper which addresses the macro psychological causes of long-term procrastination but never gets down to the resolution of what goes through someone's mind and drives procrastination in a given instance, the kind almost everyone engages in.


  • "Choice and Procrastination," 2006, O'Donoghue and Rabin. This one presents a pretty interesting economic model of how a person chooses to procrastinate over other possibilities. One of their key insights was that offering too much choice in the number of things one can do at a given time can induce procrastination (i.e. if I have only one homework assignment, I am less likely to procrastinate than if I have ten things to do). Also, more difficult and important tasks are more likely to induce procrastination.

The tl;dr of all these procrastination studies is that procrastination as a long-term and debilitating activity can be tied to a myriad of psychological factors, including depression and low self-esteem. Also, procrastination can be worsened by factors such as having too many tasks or tasks that are too difficult, which makes it very easy to just put them off and ignore them in favor of a bit more time spent internet-ing.

  • "Speed Matters for Google Web Search," 2009, Brutlag. This one is more immediately relevant to my research topic. Increasing search latency by even just 100 to 400 ms causes a statistically significant drop in the number of searches a user performs. That difference is small enough that no one would really notice it or even consciously register that Google seems to be going a bit slower than usual, yet at some level it still subtly discourages people from performing more web searches.

Hypothesis: Increasing the latency to access procrastination websites, such as Facebook and YouTube, will lower the number of times they are accessed by users and ultimately decrease procrastination.

The only factor I will test for is latency; some suggestions were to have a "guilt" factor as well, such as a popup that asks "are you sure you shouldn't be working right now?" every time they try to access a certain site. However, I decided against this because while it is an interesting question I don't want to have too many test conditions and I'm really more interested in whether a subtle effect like latency can have a noticeable impact on behavior.

Experimental Method

I will be writing a simple Chrome extension which forces users to wait 2 extra seconds before accessing certain greenlisted sites (they are "greenlisted" and not "blacklisted" because blacklisted implies that the sites are inaccessible or blocked, which they are not. They are just hard to access, just like if there was a thing you wanted on the other side of a swamp it would be hard to access, and swamps are green, so greenlisted). However, my experimental method will be somewhat different from the one I described before. While initially I wanted to conduct a within-subjects test that changed the condition from one week to the next, I realized through class discussion that there might be too many confounding variables for this to be feasible. Too much changes from week to week, and any observations I make might be due to different things entirely (maybe it's closer to finals week, so people procrastinate less, or something).

Instead I will conduct the experiment as follows. I will build the Chrome extension so that it induces latency at random, half the time. So going to a website sometimes leads to an extra 2 second wait, and sometimes goes through like normal. I'll conduct the experiment over hopefully at least 2 weeks, which should be enough to collect hundreds to thousands of visits to these types of websites per user and give a large enough sample size. The greenlist of websites will be the same for each user and will be determined by a user survey beforehand to find the most common sites. The browser extension will record each visit to one of these sites, whether or not the delay was induced, and whether or not the user closed the tab or went elsewhere before it loaded. It will also record the length of the visit. This is still a within-subjects design, just a different one from what I initially proposed.
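The per-visit coin flip can be sketched like this. It is a simulation in Python rather than extension code, and the record fields and names are hypothetical; they just mirror the things listed above that each visit should capture.

```python
import random
from dataclasses import dataclass

DELAY_MS = 2000          # the 2 second wait described above
DELAY_PROBABILITY = 0.5  # "half the time"


@dataclass
class VisitRecord:
    domain: str
    delayed: bool
    delay_ms: int
    closed_before_load: bool = False  # filled in later, once the outcome is known
    visit_seconds: float = 0.0        # filled in when the visit ends


def assign_condition(domain: str, rng: random.Random) -> VisitRecord:
    """Flip a fair coin for this page load and record which condition it got."""
    delayed = rng.random() < DELAY_PROBABILITY
    return VisitRecord(domain, delayed, DELAY_MS if delayed else 0)


rng = random.Random(0)  # fixed seed so the simulation is repeatable
records = [assign_condition("reddit.com", rng) for _ in range(1000)]
share = sum(r.delayed for r in records) / len(records)
print(f"{share:.1%} of simulated visits were delayed")
```

Because the condition is drawn independently on every page load, delayed and normal loads are spread across the same days and the same users, which is the point of dropping the week-to-week design.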

After collecting data, I will compare the rates of premature quitting on normal page loads versus delayed page loads. Also, I will compare total time spent procrastinating in both modes, to see if there is an appreciable difference here.
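One way this comparison could be run is a two-proportion z-test on the quit rates. That is my own choice for the sketch, not something settled above, and the counts are invented. It also treats every page load as independent, which is not quite true when many loads come from the same user, so I would treat it as a rough first look.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(quit_a: int, n_a: int, quit_b: int, n_b: int):
    """Return (z, two-sided p-value) for the difference between two quit rates."""
    p_a, p_b = quit_a / n_a, quit_b / n_b
    pooled = (quit_a + quit_b) / (n_a + n_b)
    standard_error = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical counts: tabs closed before load / total page loads.
z, p = two_proportion_z_test(quit_a=60, n_a=500,   # delayed loads
                             quit_b=35, n_b=500)   # normal loads
print(f"z = {z:.2f}, p = {p:.4f}")
```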

Some Anticipated Questions

Why a user survey to determine the greenlist? This is indeed susceptible to gamesmanship (i.e. not listing a site they don't want affected), but I think it is preferable to asking for permission to peruse browser history.

Why a 2 second wait? It's kind of arbitrary right now but I wanted it long enough to be perceptible, but not so long that the user might game the system by closing the tab and reopening it again hoping that the delay won't happen. 2 seconds is probably shorter than or of similar length to the expected value of going through all that, whereas 3 seconds might not be. It's mostly a guess for now. Perhaps a pilot study is needed to determine a better value here. It would even be simple for me to code this up and use it myself and get a "feel" for it.

The Future

The experiment I'm doing is very small of course, but there is a much larger opportunity here. There are companies such as RescueTime and others whose sole mission is time-related productivity. The reason some of these haven't gained mass adoption is that they require a conscious change in behavior. However, if things like latency can induce a change in behavior without any conscious effort on the user's part, that is a big win and could see much wider appeal. Given more time, I'd like to determine the "sweet spot" not just for latency but for the other possible roadblocks one could put in the way of procrastination: disabling autofill for certain sites, rearranging the order of your windows in the browser to obscure certain sites, de-prioritizing certain applications in alt-tab, etc. I'd like to spend time designing interfaces where certain activities are subtly prioritized over others, but not in a way that really annoys or frustrates.