From CrowdsourcingSeminar

Reading Responses

Kristal Curtis

Algorithmic Wage Negotiations: Applications to Paid Crowdsourcing: In this short paper, the authors present an idea plus a preliminary experiment exploring whether attempting to negotiate the price paid to Turkers will result in cheaper costs while retaining throughput. They found that Turkers were unlikely to offer a counter proposal to the wage offered them, independent of the actual wage amount. It would be interesting to see if Turkers behaved the same way if they were asked to negotiate on the price before doing a task rather than after.

Financial Incentives and the "Performance of Crowds": The authors of this work performed some experiments related to economic theory about the effect of compensation on work quantity and quality. Their findings were mostly consistent with theory. I was very interested in their findings from the second experiment (the word search puzzles). They found that many of their subjects really enjoyed working on word puzzles. It seemed that their level of enjoyment was a confounding variable. It would be interesting to group subjects' responses by their reported level of enjoyment and then redo the analysis.

Nicholas Kong

Financial Incentives and the "Performance of Crowds" The authors of this paper present two studies on how changing financial incentives affects the quantity and quality of work completed on MTurk. Their two tasks were sorting images of traffic in chronological order, and solving word jumble puzzles. Their main finding was that more pay resulted in faster work, but no increase in quality. They also found that a quota payment system, or payment per batch of tasks, resulted in more work-for-pay than a per-task payment system. I thought the demographics of their studies were very interesting: in both, they had more than 80% American respondents. I wonder if this was due to the time they posted their tasks, or that they ran their studies in late 2008 or early 2009 (for a mid-2009 publication). In any case, I'd be curious if the same results held up for other workers, such as Indian workers with more financial incentives. That being said, the result that more pay resulted in better quantity but not quality was duplicated by Heer and Bostock in the crowdsourcing graphical perception paper, so maybe the result is reasonably robust.

Algorithmic Wage Negotiations: Applications to Paid Crowdsourcing In this paper, Horton and Zeckhauser presented Hagglebot, an automated price negotiation agent that allows workers to make counteroffers for performing a task. They tested it with initial offers of 1 and 5 cents, but did not find a statistically significant difference in how many users in each group made a counter offer or ended negotiations. The idea seems like an effective way to allow workers to value their own work, though, so I'd be interested to see the results of a more thorough study with a wider variety of tasks.

Kurtis Heimerl

Financial Incentives and the "Performance of Crowds" Someone needs to run this EXACT SAME STUDY on oDesk. Every piece of it. This work, as far as I see, fundamentally shows the limitations of Turk, as it stands right now. While they argue that we should use intrinsic motivations, that's very very hard. It's a self-referencing cycle as well, where motivations change hourly. I don't think that gives a sustainable solution. This is a terrible factor for social scientists, where only interested parties put effort into your science. Makes me worried about the whole deal.

Anyhow, the work itself is a good citation. Anyone who has used mTurk should know that the market is fucked. It's clearly not a functioning market, not from the worker's perspective. It's nearly impossible to make good economic decisions given the interface.

What a long workshop paper.

Algorithmic Wage Negotiations: Applications to Paid Crowdsourcing Oh god, I remember being mad at this during Crowdconf. I mean, where to begin? First, these guys were essentially worried that we are overpaying turkers. What we need to do is haggle over the low end of prices. Secondly, they have to look at the bias inherent in this thing being recently deployed. Turkers have never seen anything like this before. They may just be paying to go through the process for the first time. It's new, it's interesting. Lastly, this is clearly a dysfunctional market, as seen in the prior paper. Does the difference between 1c and 5c really mean anything to the workers?

Still, I can see why people would want to look at this problem. Let's go straight to the bottom now, before we all get excited.

Chulki Lee

Financial Incentives and the "Performance of Crowds" discussed the effect of financial incentive, in particular the amount and the scheme of the incentive. They found that more payment yields more quantity but not better quality, and that the compensation scheme has a significant effect on quality. For me, the distinction between quality and quantity in the second experiment is a little bit confusing. For example, I think the total length of matched words would be more of a quality measure.

Algorithmic Wage Negotiations: Applications to Paid Crowdsourcing suggested a very interesting approach to determining wages in AMT. I’m interested in how users would feel about this system, especially about the agency of the computer. Did it feel like getting offers from a human? Or does online negotiation seem blunter, and therefore more forced?

Beth Trushkowsky

"Algorithmic Wage Negotiations" describes initial results for an automatic wage negotiator ("hagglebot") for crowdsourcing. After workers try an example task, they can attempt to negotiate a pay rate to do more of that task. While this paper only describes initial work, I was happy to hear that the authors are combatting the take-it-or-leave-it pricing model prevalent on mechanical turk. I had always been irked by the ad-hoc nature of choosing how much to pay for a particular task, and I can relate to how difficult it can be to price a task such that a minimum number of people will complete it within a given time frame. I'd be interested to see more analysis of workers' counter-offers for different types of tasks, as I wonder if higher counter-offers are associated more with time-consuming tasks or difficult tasks (or both). The results in the paper left much to be desired (as they admit), possibly because the task chosen wasn't the best example -- paying more than one cent seems a bit silly, so it's not surprising that most HIGH and LOW offers were accepted.

"Financial Incentives" demonstrates that increasing the payment for a task does not necessarily increase work quality (but sometimes work quantity), and that payment scheme affects quality as well. I found the concept of "anchoring" super fascinating, viz., that people base their perception on what they should be paid on what they have been paid. The paper suggests you shouldn't pay more to get quality, however I'm still left wondering what I *should* do to get the quality that I desire. I liked how the authors note that workers skipping words in the puzzle experiment was more psychological rather than a rational tradeoff between time and compensation. Maybe there are other psychological tricks to play to get workers to produce higher quality output (I guess games are one). While I thought the tasks chosen for the paper were pretty good, I am more interested in tasks where workers need to produce quality factual content, as opposed to the tasks described in the paper; perhaps the difference is that data entry tasks don't have a spectrum of correctness like the number of words found in a word jumble.

Sally Ahn

Financial Incentives and the "Performance of Crowds": This paper reveals how much influence financial compensation holds over the amount and quality of work done on AMT. Their experiments show that increasing pay increases quantity, but not quality of work. The observation about the "anchoring effect" with AMT workers was interesting; when asked to approximate the value for their labor, the Turkers' responses were relative to their current pay (all felt that the actual pay was lower than it should be). The authors acknowledge that the results could also have been influenced by the workers' intrinsic motivation (especially in the second experiment, where the tasks strongly resemble a game format). Nevertheless, their results indicate that previous findings regarding financial compensation and work quality apply in the AMT environment as well. The demographics of their participants were interesting. In particular, the distribution of income was much more balanced than I had expected. Since this experiment was conducted back around 2009, I wonder if the demographics have changed enough for the results to show significant change if the same experiment were to be run today.

Algorithmic Wage Negotiations: Applications to Paid Crowdsourcing: The authors of this paper present hagglebot, a program designed to automate payment negotiation in AMT. This sounds like a useful tool that may be beneficial to both workers and requesters. In the pilot experiment, the workers first perform the task at a fixed price, which could be low or high. It's interesting to see that while the "anchoring effect" mentioned in the first paper is evident for the majority in both groups, a small percentage of both groups seem to converge on the same higher price (~$10-12), well below hagglebot's maximum threshold ($20), regardless of their initial payment. The authors acknowledge that the sample size of the data collected is too small for conclusions to be made, but if this pattern persists in future experiments, this might provide a more accurate idea of the true cost of a particular task. I would be quite interested in seeing the results from future experiments with hagglebot.

Wesley Willett

Financial Incentives and the "Performance of Crowds" details a set of experiments in which workers on Mechanical Turk were paid varying amounts to perform the same task and graded in terms of the quantity and quality of results they produced. The big result is that paying workers more doesn't seem to increase the quality of results - just the quantity. Moreover, they show that contingent-pay schemes may not have as strong a result as hoped on performance - per-piece pay seems to provide less incentive than quota-based pay. This is an interesting set of results that seem to say something pretty counterintuitive about how these markets (or at least Mechanical Turk circa 2009) work.

It'd be interesting to see how robust their findings are across multiple platforms and task types. Other papers (including the graphical perception paper from last week) have reported results that seem to corroborate these, and a lot of people (myself included) now seem to be designing studies and pricing tasks under the assumption that pay rate doesn't impact performance. But there must be exceptions. It would be good to know more definitively where this is true and where it isn't.

The Algorithmic Wage Negotiations paper, by comparison, seems pretty empty. The authors present a system that was supposed to barter with participants in order to determine their payment for a task. However, the paper seems fraught with issues. It's not clear to me that the authors prove anything with their study - the most likely explanation of their results is that workers didn't understand their interface or what bargaining for a rate might look like, since Turk generally doesn't support this. The "result" that workers in the higher-paid condition ultimately make more than lower-paid workers also seems pretty meaningless given that almost none of their participants actually engaged with the system. The motivation for this seems problematic as well, since it isn't clear that bartering with an individual worker (as opposed to facilitating competition amongst workers) will have any impact on the quality or timeliness of their work.

Yaron Singer

Financial Incentives and the "Performance of Crowds". This paper makes an important contribution to the study of incentives and their role in the quality of online labor markets. In this paper the authors examine the effects of payments on the quality of the work performed on Mechanical Turk. The authors performed two different experiments. In one, workers sorted traffic images, and in the other they solved word puzzles. The authors conclude that the quality of the work is not affected by financial incentives, but the rate of the work is. In the first experiment, it seems like a different approach could have been taken to examine whether incentives affect performance. Perhaps a more useful experiment would have been to condition the payment that users get on the number of images they sort correctly. In the second experiment, it seems that the fact that fewer puzzles were solved when incentives were low indicates that the quality of the work decreases. In summary, this is an important paper that takes a deep and serious look into the issue.

Algorithmic wage negotiations. This paper examines a new method for pricing tasks in online labor markets like MTurk. While the previous paper examined the effects of payments on the quality of the work, the main objective of this work is to suggest new methods for payment in online labor markets that will be more efficient. The negotiation method suggested seems interesting, though it is not clear from the workshop version what the objective of the buyer is and how well the current method performs. In general, this seems like an interesting approach that can lead to new works in the area.

Manas Mittal

Financial Incentives and the Performance of Crowds: "Quality doesn't change". The main contribution of this work is that increased payment doesn't lead to increased quality of work.

Algorithmic Wage Negotiation: Applications to paid crowdsourcing. This paper illustrates software and a mechanism that enable workers to bid for incremental work, and attempts to determine the marginal cost of such work. The mechanics, i.e., we'll always pay for some tasks and then enable bidding, seem to be an interesting way to organize bid-related tasks on Turk (as opposed to paying everything through bonuses). I am curious whether the results would be any different if all the payment were in the form of a bonus.

Philipp Gutheim

Financial Incentives and the "Performance of Crowds" is a horrible paper. It analyzes the "[...] effect of compensation on performance in the context [...]" via experiments on MTurk. The paper has several arguments or visualizations that call the validity of the results into question. For instance, the section "3.1.2 Participants" presents the demographics of the participants in the first experiment. The experiment asks for income in US$ from a group of participants who range from the US to Vietnam (in total 42 countries). The conclusion they draw from the responses (a diverse set of answers within all three provided groups) is "the subject pool was therefore reasonably diverse etc". First, NO! Second, what is diversity (in which parameters)? Third, why is it relevant? Along that line, the authors seem to lack basic research/academic skills, visualizing discrete variables with a line graph that is appropriate for continuous variables.

The Algorithmic Wage Negotiations paper introduces a wage-bidding widget to MTurk but basically says that "it is interesting but our results are not statistically significant so we have to do them again". From an economic perspective, I don't know why this suggestion would make sense. Requesters want to get their work done (assuming high quality and appropriate time-to-complete) with minimal payment; workers want to maximize their payment (per hour). There is an optimal allocation of HITs between the worker and the requester, skimming off the willingness to work for a given payment. The widget of this paper introduces an unnecessary additional layer of interaction. It is as if Wall Street started negotiating with every single investor over the price at which a particular stock should be offered to them.

David Rolnitzky

Financial Incentives and the “Performance of Crowds”

The article investigates how financial incentives impact the quality and quantity of work performed on Mechanical Turk. The authors find that increasing financial incentives increases the quantity of the work done, but not the quality. They attribute the difference to what they call an "anchoring" effect. Furthermore, their findings are consistent with other lab studies that found the details of the compensation scheme do matter: a quota system results in better work for less pay than an equivalent 'piece rate' system.

This paper has some large implications for future crowdsourcing platforms and crowdsourced labor. It's not all about the money! The paper suggests that intrinsic motivators matter a great deal, more than just paying more money, if you are interested in a quality return (instead of just quantity). In fact, you can get away with paying little or nothing if the intrinsic rewards are high enough. This seems especially true for more involved tasks. The paper does talk briefly about real-world impacts (e.g., "…absolute pay rates are less important to performance than perceptions of relative value"), though the authors acknowledge these are quite speculative. Interesting nonetheless.

Algorithmic Wage Negotiations: Applications to Paid Crowdsourcing

The authors discuss how an automated bot that can bargain on behalf of buyers or sellers could actually help parties reach more beneficial agreements in a faster way. They describe a new software tool called "hagglebot" that does this negotiation, and they report on the results of an initial pilot experiment with this bot on M-Turk. Not sure how useful the results were, though I can understand the main point that negotiation when the stakes are low (as in the M-Turk environment) is quite time consuming and inefficient. But it seems that if a user is using an interface to negotiate, then it's still negotiating -- and it doesn't seem like much of a timesaving to me -- it's just adding another layer to the process. It certainly adds an aspect of negotiation to M-Turk that doesn't currently exist.

Prayag Narula

Financial Incentives and the “Performance of Crowds”

An interesting paper that tries to find the correlation between financial incentives and the quantity and quality of work being done. The paper concludes that although there is a direct correlation between the amount of work done and the extrinsic incentives, there is no direct correlation between the quality of work and these extrinsic incentives. This conclusion matches results from the offline world.

It would be interesting to do a qualitative survey with the participants and ask them why they focus on doing more work rather than doing quality work for more money. It would help us understand how to create incentives for doing better jobs.

Algorithmic Wage Negotiations: Applications to Paid Crowdsourcing

The paper proposed a bargaining algorithm for optimizing the process of coming up with a good price for work on MTurk. I feel the attempt fizzled when most of the participants refused to negotiate and mostly just accepted the amount of money they were getting.

The authors also seemed to have forgotten that these are people who are doing the work and deserve fair wages. It also supports the claim of MobileWorks that it's the prerogative of the platforms to make sure that the users get a fair wage for their work. Even when given a choice, the user would not negotiate a better wage for himself.

Siamak Faridani

Financial Incentives and the "Performance of Crowds": The main contribution of the first paper is that the amount of money that you pay to your Turkers is not correlated with the quality that you get from them. The authors argue that Turkers who are paid more believe that their qualifications put them in the higher paying bracket. The finding is very interesting, although they do not comment on how they capture the kind of task that is posted. My personal experiments on Turk show that you can in fact increase the quality by increasing the reward, but after you hit some threshold the quality improves only slowly as the reward grows.

Algorithmic Wage Negotiations: Applications to Paid Crowdsourcing: The second paper, I guess, largely misses the point that MTurk is a microtask market. If we assume that Turkers have been on the system long enough, they might just have trained their action maps (Prof. Canny's terminology) to accept tasks and not negotiate. I am not sure if negotiation in such a non-symmetric market even makes sense. I sometimes just approve HITs from users simply because the overhead of doing QA is much more expensive than the 1 cent that I have paid for the task. If MTurk were not a microtask market, where doing the work is actually cheaper than finding the work, QAing it, and making sense of it, negotiation might have worked.