Help and Tutorials

From CS260 Fall 2011

Bjoern's Slides


Extra Materials

  • Caitlin Kelleher and Randy Pausch. 2005. Stencils-based tutorials: design and evaluation. In Proceedings of the SIGCHI conference on Human factors in computing systems (CHI '05). ACM, New York, NY, USA, 541-550. DOI=10.1145/1054972.1055047
  • PauseandPlay: Automatically Linking Screencast Video Tutorials with Applications. Suporn Pongnumkul, Mira Dontcheva, Wilmot Li, Jue Wang, Lubomir Bourdev, Shai Avidan, Michael Cohen. UIST 2011 paper
  • ShadowDraw: Real-Time User Guidance for Freehand Drawing. Yong Jae Lee, Larry Zitnick, and Michael Cohen, SIGGRAPH 2011, paper | web | video
  • Justin Matejka, Wei Li, Tovi Grossman, and George Fitzmaurice. 2009. CommunityCommands: command recommendations for software applications. In Proceedings of the 22nd annual ACM symposium on User interface software and technology. ACM, New York, NY, USA, 193-202. paper | video
  • Jennifer Fernquist, Tovi Grossman, George Fitzmaurice. 2011. Sketch-Sketch Revolution: An Engaging Tutorial System for Guided Sketching and Application Learning. Proceedings of the ACM symposium on user interface software and technology. ACM, New York, NY, USA. paper on ACM DL
  • What Would Other Programmers Do? Suggesting Solutions to Error Messages. Hartmann, Björn, MacDougall, D., Brandt, J., and Klemmer, S.R. CHI 2010. paper

Discussant's Materials

Media:cs260-15-help_discussion.pdf by Peggy Chi

Reading Responses

Hanzhong (Ayden) Ye - 10/24/2011 22:45:39

Reading response for Wednesday Oct 26: Help and Tutorials. Help and tutorials are extremely important issues in human-computer interaction. The first paper discusses the topic of learnability broadly. After presenting a survey of previous definitions, metrics, and evaluation methodologies, the authors also conducted a study comparing the efficiency of issue identification between the traditionally used think-aloud protocol and a new protocol they call the ‘question-suggestion’ protocol. The part I like best about this work is the ‘question-suggestion’ protocol, because it is very effective not only for learnability evaluation but also as a way of learning software. I think more applications could leverage the efficiency of this protocol, such as a new approach to interactive software learning.

The second paper introduces a specific piece of work on tutorial generation (so far, I have found that specific examples and implementations give me more insight into CHI concepts than papers addressing general ideas broadly). This paper presents a demonstration-based system for automatically generating succinct step-by-step visual tutorials for photo manipulations. The implementation leverages many mature technologies, such as computer vision. The analysis for step classification and tutorial generation is very precise and is a key factor in the successful final implementation. The system design is also worth learning from, in that each sub-system works independently while the results are generated cooperatively. The study shows the generated tutorials are effective in helping users learn new techniques. However, user preference is also a key element that should not be ignored (for example, some users would prefer video tutorials). The macro generator is also very useful and has the potential to be developed with more artificial intelligence.

Valkyrie Savage - 10/25/2011 10:07:00

Main idea:

Users need to be supported in their use of a given tool. There are a variety of ways to do this, including tutorials. Before we can begin to play with the creation of such support tools, though, we need to create a common language for discussing the learnability of software at every stage.


The metrics paper by Grossman, et al., seemed well-motivated and well-balanced. As indicated by their breakdown of the current language being used in HCI papers, it seems high time somebody came along and tried to standardize it. Their experiment also seemed reasonable: I've learned much more from people prodding me about my poor habits than I have from exploring the literature about many tools (there's just so much!). I think transition learnability issues are underexplored. To be fair, it seems silly that there should be so many ways to accomplish the same thing in a program, but if we consider using programs to be like coding then it seems obvious that there should be. I wonder if anyone's done studies on how learnable different programming languages are, actually? It seems like that would be a fun study to do. My fiancé recently began interviewing candidates on behalf of Facebook, and some of the questions they explore in those interviews heavily weight candidates' transition learning of languages: e.g., what could be done in one line instead of 5 or 6. Hypothesis: Python experiences few learnability issues in the task flow, locating, and understanding categories, but its awareness and transition errors are probably high. It's so damn easy to use and so full of functions and modules that it seems impossible for someone to become a perfectly efficient or knowledgeable Python user, since most anything one types works as if by magic.

My criticism of this paper would be their conflict of interest. I mean, I understand that Autodesk researchers are supposed to do research to make Autodesk better, but this learnability definition issue that they've brought up is important to more than just AutoCAD. I guess I thought it seemed strange to split the paper between a survey/definition/concept paper and a study/implementation paper.

The tutorials paper from Grabler, et al., is a cool implementation paper. Only part of it seems well-motivated to me: the generation of tutorials. I'm unconvinced that automagically applying Photoshop processes to input images is something that users want: where's the challenge? I mean, the paper spends a goodly amount of time discussing whether their tutorials generalize to other processes that the user might wish to perform on photos, but they don't seem to be concerned about the fact that applying a macro can't generalize in any way, since it's just a button. Maybe making Photoshop smart is helpful; I reckon that's why the context-aware eraser was developed. Maybe I'm "old-fashioned", but I want to be a part of my own computer-based photo editing. I know, I'm just unbelievably super traditional. Then again, I don't make my bread and butter on beautifying photographs, and it's possible that these macros would help the sorts of people who do. I guess I don't actually know if it's well-motivated, after all, since its motivation isn't discussed in the paper.

Also, did I mention that the tutorials they chose to generate are strange? How often do people actually do these things? Why would someone want to change their eye color in a photo, or turn day into night?

Steve Rubin - 10/25/2011 15:50:55

The first paper presented a survey of "learnability" in the HCI literature, and then showed a meta-study that compared the "think-aloud" protocol to the "question-suggestion" protocol for conducting usability/learning studies. The second paper, an application, described a system for automatically generating photo-editing tutorials, mentioning many of the challenges inherent in instrumenting software.

The salient point made in the first paper is that a definition of "learnability" is unnecessary. Researchers can study many facets of how well people can learn and adapt to systems, but there is no one metric that is a silver bullet--the sheer number of metrics that they list is evidence of that. The important question from a researcher's point of view is then, "what questions should I ask in this situation?" Their study showed that different experimental styles resulted in different kinds of learnability feedback. A more comprehensive study should be done to better flesh out the design space of learnability studies.

The second paper, on automatically generating tutorials, was interesting in light of the first because it defines a certain approach to learnability. Notably, it more or less assumes that with good enough tutorials, users do not need other forms of documentation--more advanced knowledge of photo-editing software can be gained through aggregating tutorial-based workflows. The paper is, in effect, siding with an example-driven model of learnability. While I think the work is impressive and its results are promising, my one real problem is its deemphasis on "understanding" in the learning process. I personally find that when I am following a tutorial, if it is just a list of instructions, I don't internalize them properly. Sure, it'll get the job done, but I don't learn much. Explaining why to do something is, in my opinion, just as important as explaining how.

Laura Devendorf - 10/25/2011 17:08:38

Generating Photo Manipulation Tutorials by Demonstration details the motivation and implementation of an automated tutorial-making system. A Survey of Software Learnability discussed problems relating to the definition and evaluation of software "learnability."

I thought the article presented an interesting and useful tool for making tutorials for others to learn digital image manipulation. I think it's interesting to consider when and why tutorials are being generated: is it because the interfaces are difficult to use, or because the concepts are hard to understand? I have noticed that many apps have been successful at allowing users to manipulate image levels by offering them a limited set of tools, usually the most commonly used ones. Would an overlay that simplifies the interface and points to next steps, instead of having users follow the leader, be equally effective? I am also curious how the tutorials are followed. Since the exact positions of the controls are shown, do people feel that they need to follow those exactly? For instance, saturation levels will vary based on the image: is the user of the tutorial automatically going to set the saturation to what the teacher did, regardless of how it affects their own image? This wouldn't be intended by the system but might be an interesting side effect to investigate. Also, I thought an interface like this one would be interesting for teaching design tricks such as the rule of thirds or the art of good cropping. If a face is detected, could it also suggest an optimal position for the face in the frame that adheres to design conventions? Overall, I enjoyed the research, background, and lessons learned. I also appreciated the discussion of the interface's limitations.

Grossman's paper was interesting in a number of ways and, in a way, felt like two papers in one. The paper gives reasons for their argument that the term "learnability" is problematic and defined differently in many cases (the term is especially problematic to my spell checker). It then goes on to classify how it is defined. The classification is helpful in demonstrating that many people define it differently, but I don't find it particularly helpful in understanding which measure I would or should use. In other words, it classifies what has already happened but doesn't say much about a future direction. The paper then jumps into a study of "talk out loud" vs. a question-suggestion protocol in order to evaluate learnability. I found the study setup interesting and helpful. While it requires a certain amount of organizational overhead (finding a coach), it does seem particularly helpful for identifying learnability issues.

Amanda Ren - 10/25/2011 20:48:17

The CHI paper discusses a classification system of learnability issues and how the authors' user study was able to identify such issues.

This paper is important because it acknowledges the learnability issues that arise with software programs. The authors use a user study to create categories of observed learnability issues, and they also introduce the question-suggestion evaluation protocol. The categories that arise are: task flow, awareness, locating, understanding, and transition. This paper is relevant to today's technology because software engineers need a reliable way to evaluate learnability and isolate specific areas of learnability. Given that many users today are only familiar with the desktop, designers of smartphones/tablets need to make sure their software makes it easy for beginner users to pick up the system. One piece of software I can think of that this applies to is OS X. As a beginner user, the system is easy to begin using. But I am also aware that there are features that would have to be learned through extended use. I thought it was interesting they brought up Microsoft's "Clippy" in the awareness category - users benefit from having a coach, but only if the coach/intelligent agent is not intrusive.

The SIG paper presents a user study on a system that automatically generates step by step tutorials from photo manipulation demonstrations.

This paper starts from the observation that generating tutorials for photo manipulation is tedious. Instead, the authors present an easy way to record the demonstrator's actions and auto-generate tutorials. Although this paper focuses on a very specific type of tutorial (photo manipulation), there are ideas that can be applied to other types of tutorials (such as that a combination of text and images is better than either individually). I really think they should have further explored how well the users could apply what they learned from using the different tutorials, because these tutorials should help a user learn how to use the system, not just how to do a specific task with the system. I was actually surprised that the generated tutorials proved more effective in the tests than the book of Photoshop tutorials. I'm curious as to how the book tutorials differed from the generated tutorials.

Yun Jin - 10/25/2011 21:14:36

The first paper presents a survey of the previous definitions, metrics, and evaluation methodologies that have been used for software learnability. It also develops a new question-suggestion evaluation protocol to specifically identify existing learnability issues in an application, and compares it to a traditional think-aloud evaluation. Not only did this study show that the new methodology exposed a significantly larger number of learnability issues, it also allowed the authors to collect and categorize learnability issues. Based on the issues identified in the study, the paper also presents a classification system of learnability issues and demonstrates how these categories can lead to guidelines for addressing the associated challenges. The second paper draws on theories of embodiment—from psychology, sociology, and philosophy—synthesizing five themes the authors believe are particularly salient for interaction design: thinking through doing, performance, visibility, risk, and thick practice. It introduces aspects of human embodied engagement in the world with the goal of inspiring new interaction design approaches and evaluations that better integrate the physical and computational worlds.

Viraj Kulkarni - 10/25/2011 21:24:16

'A Survey of Software Learnability: Metrics, Methodologies and Guidelines' is a compilation from different sources of definitions and metrics of software learnability. Although there has been much talk about this topic, there is agreement on few definitions and terms. I feel this paper does a good job of putting everything together in a consistent manner. The authors also describe the 'question-suggestion protocol' and conduct a user study on it. They use it to identify existing learnability issues in an application, and it performed better and exposed more issues than the 'think-aloud protocol'.

'Generating Photo Manipulation Tutorials by Demonstration' is about automating the process of generating image manipulation tutorials. I like the concept, but I find all the computer vision machinery in it to be a bit superfluous. If you want to carry out the operation 'select the lip' as mentioned in the paper, the tutorial is going to mark out the region of the lip anyway. Do you really need computer vision algorithms to detect what region is being selected if it is marked out? This might be helpful for creating macros, but I doubt how much it is required if human users are going to use these tutorials.

Cheng Lu - 10/25/2011 21:56:28

It is well-accepted that learnability is an important aspect of usability, yet there is little agreement as to how learnability should be defined, measured, and evaluated. The first paper, “A Survey of Software Learnability”, presents a survey of the previous definitions, metrics, and evaluation methodologies that have been used for software learnability. The survey of evaluation methodologies leads to a new question-suggestion protocol, which, in a user study, was shown to expose a significantly higher number of learnability issues in comparison to the more traditional think-aloud protocol. Based on the issues identified in their study, the authors present a classification system of learnability issues and demonstrate how these categories can lead to guidelines for addressing the associated challenges.

The second paper, “Generating Photo Manipulation Tutorials by Demonstration”, presents a demonstration-based system for automatically generating succinct step-by-step visual tutorials of photo manipulations. An author first demonstrates the manipulation using an instrumented version of GIMP that records all changes in interface and application state. From the example recording, the system automatically generates tutorials that illustrate the manipulation using images, text, and annotations. It leverages automated image labeling to generate more precise text descriptions of many of the steps in the tutorials. A user study comparing the automatically generated tutorials to hand-designed tutorials and screen-capture video recordings finds that users are 20–44% faster and make 60–95% fewer errors using the generated tutorials. While the system focuses on tutorial generation, the authors also present some initial work on generating content-dependent macros that use image recognition to automatically transfer selection operations from the example image used in the demonstration to new target images. While the macros are limited to transferring selection operations, the authors demonstrate automatic transfer of several common retouching techniques, including eye recoloring, teeth whitening, and sunset enhancement.

Derrick Coetzee - 10/25/2011 23:22:54

The first work by Grossman et al. investigated defining and evaluating "learnability," how easy it is for users to learn to use a user interface. It made a powerful case that such a survey was needed, collecting a diversity of divergent opinions about how learnability should be defined and what metrics should be used to evaluate it into an effective common taxonomy. Even more impressively, it designed and evaluated a novel method of assessing learnability of a UI, the question/suggestion method, showing it was superior to the dominant think-aloud protocol; any researcher can put this method to use immediately.

The evaluation had some experimental problems: sample size and demographic variation were low, learnability issues were identified by researchers with an interest in a certain outcome, and p-levels for Likert scales were not given precisely (just p<0.05, which is relatively high). This casts some doubt on the generalizability of the results, although intuitively I expect them to be reproducible.

One interesting extension would be to consider an intermediate method where the system expert evaluates subjects after the fact. This would separate two effects of question/suggestion: learnability issues discovered by the coach, and the greater progress made on the task due to the coach's assistance.

The second work, by Grabler et al, described a specific method for generating static tutorials for photo manipulation by tracking the actions of a user in GIMP. This makes it easy to build a static tutorial that users can follow at their own pace, and gracefully edits out steps that are immediately undone.

Although promising, the scheme has a substantial number of issues. First, only the ability of subjects to follow tutorials was evaluated - no subjects were asked to use the system to generate tutorials. Even among these subjects, they measured only short-term success at the task, rather than long-term learning and the ability of the subject to generalize their learning, which are precisely the areas where a human-written tutorial with conceptual explanations would have an advantage. The researchers created the videos themselves - results may be substantially different for high-quality videos.

There was excessive focus on automation; an interactive system that can incorporate high-level explanations by the tutorial writer and allow editing of the tutorial would be superior. The content-based labels they endeavored to generate, while impressive, were visually obvious and added little. The system automatically captures irrelevant steps executed by mistake, and they don't describe a way of removing them.

Peggy Chi - 10/26/2011 0:40:16

We all need help. As long as software includes a certain amount of diverse features, users will need to learn it. Why? There is always a gap between software developers' mental models and users', so embedding "help" is essential to all software. The paper from Grossman et al. discussed how the idea of "learnability" should be defined, measured, and evaluated. They proposed a new protocol, "question-suggestion," going beyond think-aloud or question-asking methods. Grabler et al. presented a system that automatically captures users' actions and generates step-by-step visual tutorials using computer vision techniques. This also helps transfer workflows into macros for repetitive tasks.

It is interesting to see how learnability is defined for novice users (initial learning) and experienced users (extended, long-term learning). The former somehow affects whether a user wants to continue using the software, while the latter decides whether the user can get the maximum benefit from it. I enjoyed reading the paper; however, I'm not sure if the proposed question-suggestion protocol could be easily controlled to yield valid results to compare. By including an additional "coach" in the setup, the variables become more complicated. For example, could a coach fairly assist each user, or would he also "learn" how to suggest through the evaluation process? Would users perform differently because of the sit-aside pressure or their "trust" in the coach (e.g., being less willing to try and use the system)?

Grabler et al.'s paper shows a brilliant approach to capturing and understanding the workflow. The results seem promising. Nevertheless, I wonder how users as tutorial content providers might easily revise the generated material when the recognition goes wrong, or how the system could adjust the results if any action is incorrectly detected. Moreover, is the automatically generated description interesting enough to non-first-time users? This is a common issue for template-based text generation. (Though the book tutorials performed the worst in the user study, chances are that users enjoyed reading the more diverse content?) I think some issues between automated systems and human content were missing from the paper that we might want to notice for long-term use.

Yin-Chia Yeh - 10/26/2011 1:12:27

The two papers today are about learning and tutorials. The learnability paper surveys the literature on software learnability and proposes a new research strategy, question-suggestion. The question-suggestion model can help researchers find more learnability issues but is not a good simulation of practical software usage patterns. The photo manipulation tutorial paper proposes a system for generating image manipulation tutorials automatically.

Learnability paper: The first question I have in mind is whether learnability research focuses only on software. I can understand that software usually has a more complicated interface, so learnability really matters. However, I feel that applying these research approaches to other domains might also be very useful. For example, can we use these methods to find a better way to teach playing an instrument, riding a bicycle, or other sports? Another question is whether efficiency is the only measure of learnability. For example, can we measure whether a certain task can be done by the user or not, such as whether a user can lay out his paper to fit the format of a certain conference? Maybe this kind of measurement is outside the scope of learnability. The final issue I am interested in is that of a performance plateau at a suboptimal level. It would be interesting to see how people decide whether they need to learn/use a function or not.

Photo manipulation tutorial paper: I really like the idea of this paper, and I agree that video tutorials can sometimes be very frustrating, since they can be either too fast or too slow. For example, I learned Flash programming recently and found Adobe's video tutorials not very useful, since they separate their video tutorials and text/graphic tutorials on two different sites. On the other hand, the Microsoft Kinect SDK tutorials present video and accompanying text/graphics together, and I feel that is much better. One question about this paper is that automatic labeling might not always work. It could fail because of unstable behavior of the computer vision algorithms, or because the picture content is simply outside the scope of the labeling algorithm. Anyway, this could easily be resolved by asking the tutorial author to refine that part manually.

Alex Chung - 10/26/2011 1:41:03

“Generating Photo Manipulation Tutorials by Demonstration”

Summary: An automatically generated, storyboard-style tutorial that is easy to understand and follow.

Positive: Automated annotation. The proposed system addresses many pain points that I experience every day, such as the difficulty of following tutorial materials as well as the tediousness of creating the same tutorial. Since its error rate is low and it is easy to follow, it is perfectly applicable to Mechanical Turk jobs, where workers usually are required to follow instructions.

The paper did an excellent job of explaining the logistics behind the implementation as well as the rationales behind the design decisions. The description of capturing spatial operations and undo operations would have been helpful for our assignment #3. Like the proposed system, it was succinct and easy to understand.

Negative: Can users take what they learn and adapt it to a new image setting? The tutorial effectiveness experiment shows people can follow directions better on one system than another. However, the authors did not verify whether users learn better on the modified GIMP system. While users can often figure out how to adapt a tutorial to new images, this rests on the assumption that users learn equally well on the proposed system.

“A Survey of Software Learnability: Metrics, Methodologies and Guidelines”

Summary: UIs are difficult to use when the controls feel unnatural or hard to comprehend, yet there is little consensus among researchers on how to define or evaluate learnability. This leads to the development of a new “question-suggestion protocol” for learnability evaluation.

Positive: Excellent categorization of learnability metrics based on task performance. The taxonomy will help the HCI community tremendously in unifying the measurement of user interface designs. If normalization can be done successfully between two separate systems, then the researcher of one UI design can borrow another system into his/her own study.

Negative: I struggle to see the distinction between the question-asking and question-suggestion protocols. Both rely on dialogue between an expert and a novice, where the expert provides tips for the novice to overcome usability hurdles in order to extend the study. Also, there is no standardized measure of the quality of coaching, because the questions or suggestions could vary. While the protocol could script the suggestions, different users might need help at different stages of testing.

I also failed to find any discussion of how to measure learnability. Instead there was much discussion about usability design issues and how to avoid them. The whole novelty of this paper depends on the significance of the proposed question-suggestion protocol; however, it does not seem that interesting at all. Rather than discovering a solution for measuring learnability, the authors introduced another alternative to a cluster of metrics. Because the emphasis is on a single method, the title of this paper should be "question-suggestion protocol" rather than "a survey of software learnability."

Ali Sinan Koksal - 10/26/2011 2:05:29

This week's first paper presents a taxonomy of "learnability" definitions (on which there doesn't seem to be an agreement considering previous work), as well as a classification of metrics for evaluating learnability. It also argues for a new protocol for running user studies in order to improve learnability, the "question-suggestion" protocol where a user is assisted by an expert who answers questions and suggests efficiency improvements.

Building a framework for incorporating different definitions of learnability is a valuable step to take, which will help structure how one thinks about and achieves an increase in learnability in designing tools. Learnability (and more generally, usability) needs to be studied more for a number of systems -- I was reading about user reactions on the web on how unusable the Ubuntu desktop has become with the latest update to the distribution, and have been personally frustrated too about this.

In my opinion, the "question-suggestion" protocol is indeed useful for uncovering more learnability issues compared to "think-aloud," as it allows the user to explore more aspects of the program during the evaluation while guided by an expert. Meanwhile, this dialogue will probably interfere with the natural flow of work typically followed by users, and may prevent us from assessing the learnability of a given tool. I think another option could be an iterative application of "think-aloud," where the designers would be made aware of the initial learning barriers and could address these in order to reevaluate the refined program later.

The second paper presents a system for generating photo manipulation tutorials based on demonstrations of image manipulation tasks. An image processing tool is instrumented to capture the input commands, and images are labeled to infer the semantics of which region of an image is being worked on, to provide detailed annotations in tutorials, and, most interestingly, to automatically apply the same task to different inputs by establishing a mapping between the semantic regions of images via the labeling procedure.

Programming by demonstration is a very interesting domain, allowing end-users to create programs without even seeing them. This work is valuable as it could form the basis for an efficient PBD system for image processing. A typical challenge is to successfully generalize actions performed on a limited number of examples to a program that is able to operate on different input values. AI techniques such as version space algebras may help in keeping track of all possibilities of programs that can be inferred, and eventually removing all ambiguity by considering enough input examples (as in Tessa Lau's SMARTedit).
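The version-space idea mentioned above can be sketched in a few lines: maintain a set of candidate programs, and after each demonstration keep only the candidates consistent with all examples seen so far. A minimal illustration in Python, using a made-up hypothesis space of string transformations rather than SMARTedit's actual algebra:

```python
# Minimal version-space sketch of programming by demonstration:
# keep every candidate transformation that is consistent with all
# demonstrations seen so far, and drop the rest after each new example.
# The candidate set below is purely illustrative.

def make_candidates():
    """A tiny hypothesis space of string transformations."""
    return {
        "upper": str.upper,
        "lower": str.lower,
        "title": str.title,
        "strip": str.strip,
        "identity": lambda s: s,
    }

def refine(candidates, example_input, example_output):
    """Keep only candidates that agree with the new (input, output) demonstration."""
    return {name: fn for name, fn in candidates.items()
            if fn(example_input) == example_output}

candidates = make_candidates()
candidates = refine(candidates, "hello world", "Hello World")
candidates = refine(candidates, "ansel adams", "Ansel Adams")
print(sorted(candidates))  # the hypotheses still consistent with both examples
```

With enough examples the candidate set collapses toward a single hypothesis, which is exactly the "removing all ambiguity by considering enough input examples" that systems like SMARTedit rely on.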

Galen Panger - 10/26/2011 2:24:00

For this class, we read two studies, one exploring the concept of learnability and another exploring automatically generated tutorials. The learnability piece, I thought, suffers greatly from a failure to distinguish the concept from usability. I don’t see the distinction very clearly—the ability to use something well initially and the ability to master a piece of technology over time are usability to me. Perhaps the distinction is that learnability should focus on the value added to usability by support structures, like help documentation, interface highlighting, or intelligent agents. These support structures make us more able to learn an interface. Another distinction might be how memorable the interface is—how quickly and to what extent you remember patterns of operations for task completion, versus a simple usability measure of error rate or completion time. I also thought “think time” was an interesting measure for learnability.

Another problem in the learnability piece was that very little attention was paid to the concept of a learning curve. There are colloquial ways of describing learning curves—steep, gradual—that don’t really have a normative component. Some things are really hard and painful to learn but are quickly learned (snowboarding is an example, where after 3 days of falling down constantly I finally learned how to snowboard), while others are very easy to learn but are not learned well except over a long period of time (like skiing). Which one is more learnable? Well, a short but steep learning curve might be better in one situation, while a long but gradual learning curve might be better in another. In modern computer interfaces, you also have to deal with the fact that there are completely separate methods of accomplishing a task—through the toolbars, through the menus, through commands, or some combination. There may even be expert modes that are entirely different from the beginner mode. This suggests multiple (or discontinuous) learning curves, something that was not addressed in the paper.

I do think the question-suggestion methodology is interesting, but I can’t figure out how it’s different from Think Aloud except that people aren’t left to struggle and thus can do more of the task (thereby identifying more problems). And I don’t see how question-suggestion is better than Think Aloud for learnability as opposed to usability. The authors don’t attempt to explain that.

Finally, I don’t have much to comment on with regard to the tutorial generation piece. I thought what they did was interesting and appreciated that the computer recorded all of the necessary steps to accomplish a task—I have worked with Adobe’s books for Creative Studio and thought they often left out key steps that caused me to struggle with the software. When you just want to get something done, having the assurance that no step is left out, as with the tutorial generator, would be the way to go. But maybe struggling is better for learning—that’s a testable hypothesis. As the authors note, tutorial generators also can’t motivate the task very well, so generalizability is a concern.

Jason Toy - 10/26/2011 7:30:48

A Survey of Software Learnability: Metrics, Methodologies and Guidelines

"A Survey of Software Learnability" is about how learnability, an important part of usability, should be defined, measured, and evaluated.

The authors of the paper present the question-suggestion protocol, a new system by which extended learning can be measured. Current protocols deal only with the initial learnability of systems because it is generally easier to evaluate, i.e. it does not take as long. However, the question-suggestion protocol allows an expert to answer questions about how to best complete the task, addressing extended learnability issues. The ability to easily access expert advice for both initial usability issues (how to accomplish the task at hand) and extended learnability issues (optimality) is also present in the practice of pair programming. Similar to the experimental setup in this paper, two programmers look at the same screen; however, in pair programming both can learn from the other. To facilitate this kind of learning, the paper we read on Monday, "How Bodies Matter," discusses ways of using the workplace to facilitate collaboration and learning from others. By creating metrics for extended learning, this paper may influence future products by allowing for a focus on learnability problems that may have been previously ignored.

The paper does a good job following experimental procedure: having both an experimental and a control group (which used the question-answer protocol). To avoid bias in the results (perhaps participants would be less likely to self-record questions that made them look incompetent), the experimenters allowed the experts to report issues the participants were having as well. Finally, I like the idea of categories of learning issues. One thing the paper could have improved on is the discussion of level of experience as one of the dimensions to consider. For example, in the current experiment, experts help participants when they have problems accomplishing the goals. This would mean that the participants have gained a new skill set, and were they asked to repeat the experiment, they would have no problem accomplishing the same tasks, possibly convincing the experimenters that there were fewer learnability issues. These dimensions, and their variance among people, play a large role in learnability, and they could have been addressed more in depth in both the experiment and the paper itself.

Generating Photo Manipulation Tutorials by Demonstration

This paper is about a demonstration-based system for automatically generating step-by-step visual tutorials of photo manipulation.

Visual tutorials made with screenshots are known to be effective; however, they are hard to make. This new system can affect real-world systems by allowing an easier entry point into complex software like Photoshop, by allowing tutorial makers to be more prolific, and by making their tutorials more helpful through informative notes. This shift may convince users like me, who were previously using simplistic photo-editing software like Microsoft Paint to manipulate images, to use other alternatives. Since learning is not a closed system, improving the availability and quality of resources and documentation for users affects the idea of learnability described in "A Survey of Software Learnability".

The paper does a good job comparing generated tutorials to those in a book to see how robust their system was. The authors discovered areas in which they could improve the system in a second iteration. I liked the idea that people were not penalized for using better methods they knew previously. However, this introduces the question: how much of the users' success was based on the tutorials, and how much was based on knowledge or skill they acquired previously? Another problem with this paper is the question of generalization. How would this process generalize to different types or pieces of software? Would it have to be rebuilt for each different product? Would it work for non-photo-editing types of software?

Personally, I disagree that it is a limitation of the system that the tutorial cannot explain why users must perform each action, unlike what books do. What would stop this system from being used to help create tutorials for books, thus solving this problem?

Hong Wu - 10/26/2011 7:43:44

Main idea:

Both papers talked about how designers could make manuals and tutorials more effective.


“A Survey of Software Learnability” described the current arguments and understanding of the definition, metrics, and evaluation of learnability. Learnability in HCI is the ability of a user to learn how to use software without any formal training; the user can be an initial or continuing user. The paper mainly discussed a methodology for testing learnability, based on the questions and suggestions a user may ask for while using the software. To me, the question-suggestion protocol is hard to scale up, as it is almost impossible to assign a coach to every user.

“Generating Photo Manipulation Tutorials by Demonstration” used static screenshot tutorials instead of screencast videos. The authors argued that static screenshots are more efficient at helping users get along with the software. However, the authors also noticed a drawback of their static screenshot tutorials: it is hard to explain why a certain step is necessary.

Teaching users to use software quickly and easily is crucial. The core goal of software is to let users get used to it to the point that they cannot leave it. This is particularly true for programming languages, where it is hard for a programmer to understand a new method. MATLAB is very easy to use because most of its functions have a runnable example, which offers programmers an intuitive understanding.

Manas Mittal - 10/26/2011 7:44:30

The question answer metric that Grossman et al. propose is a good one - it attempts to capture how we naturally go about learning a new tool.

I would propose a new metric based on the number (and depth) of Google searches the user has to do to perform a similar task. This protocol has the benefit that i) it is more natural (this is how people typically learn things anyway), and ii) it also factors in 'community knowledge' as a contributor to learnability. This is important because people seldom learn things in isolation. The metric could be normalized for the user's adeptness with a search engine.
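That normalization might be made concrete as follows (the formula and every name here are my own invention, purely to sketch the proposal, not anything from the papers):

```python
def search_learnability_cost(num_searches, search_depths, adeptness=1.0):
    """
    Hypothetical metric: total search effort (number of searches plus the
    result links followed per search), normalized by the user's search
    adeptness (> 0). A higher cost suggests a less learnable tool.
    """
    total_effort = num_searches + sum(search_depths)
    return total_effort / adeptness

# A user who ran 3 searches, following 2, 0, and 1 result links:
cost = search_learnability_cost(3, [2, 0, 1])  # 6.0
```

Dividing by adeptness means two users who expend the same raw search effort are scored differently if one of them is simply better at searching.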

The second paper talks about automatically generating tutorials in GIMP. These tutorials are more like documented recordings. I am curious to know if the authors have considered doing this for other contexts (for example, for people learning new programming languages). The Watch-What-I-Do approach has a rich tradition in HCI.

Vinson Chuong - 10/26/2011 8:50:53

Grossman, Fitzmaurice, and Attar's "A Survey of Software Learnability: Metrics, Methodologies, and Guidelines" surveys the existing definitions and evaluation methods for learnability and offers a new evaluation method as well as some guidelines for improving interface learnability. Grabler, Agrawala, Li, Dontcheva and Igarashi's "Generating Photo Manipulation Tutorials by Demonstration" discusses a system for automatically generating annotated tutorials for using GIMP from recordings of a user performing the task.

Grossman, Fitzmaurice, and Attar address the lack of a "well-accepted definition of learnability" by surveying and categorizing the various definitions and operationalizations of learnability from past research. They go on to discuss a new "question-suggestion protocol" which shows promise in allowing the measurement of more aspects of learnability than the traditional "think-aloud protocol" used in usability studies. From the results of their user study of the new protocol, they extract several design guidelines for improving the learnability of interfaces. Their guidelines seem to focus on providing context-sensitive pointers and demonstrations of features and workflows that either help a user complete a task or increase his efficiency in that task. This resembles learning from an expert coach or working with a pair programming partner. I'm reminded of "mixed-initiative interaction", where an intelligent agent sits in the background and waits for opportune moments to offer assistance or automation; I wonder if such techniques can be used to build more context-sensitive help systems.

Grabler, Agrawala, Li, Dontcheva, and Igarashi observe that the process of manually creating and writing a photo manipulation tutorial is difficult and involves many concurrent tasks: recording steps taken, capturing visual aids, etc., all while performing the actual task. They present a system which automates much of that process, and allows the tutorial writer to focus on the task being performed. While they focus mostly on the quality of the tutorial output by the system, I believe that the main benefit lies with the person who is generating the tutorial. This system saves the tutorial writer a lot of time and effort in documenting actions in detail and capturing and editing visual aids. I imagine that if such a system is offered as a plugin or is baked into the popular photo manipulation software (along with a way to edit the generated tutorial), then we would see many more tutorials for a wider variety of tasks. This could lead to more detailed and relevant integrated documentation for users.

Sally Ahn - 10/26/2011 8:52:53

In the first paper, Grossman et al. present a framework of metrics for describing the learnability of various software, as well as a "question-suggestion" protocol for evaluating learnability. In the first part of the paper, the authors survey existing definitions of learnability and point out the need for a more concrete definition. It seems that many definitions used vague or relative terms (e.g. "maximal," "optimal," etc.) without first establishing a system of metrics on which these terms can be evaluated. Even in the new taxonomy presented by Grossman et al., we see words like "optimal performance," but how to measure whether performance is "optimal" is still unclear.

The second paper describes a system that automates the creation of image manipulation tutorials by recording the usage of software (e.g. GIMP or Photoshop) and automatically creating screenshots and labels from these recordings. This approach leverages computer vision techniques to label features of the images to aid the annotation of these screenshots. I think the authors identified a well-motivated problem; tutorial generation is a tedious task that can benefit greatly from automation. I think one of their greatest contributions is tackling the technical challenges (e.g. recording user actions and separating them into set operations, set parameters, and commit operations) to demonstrate that this task can be automated; it opens a niche in which many extensions and improvements can be made.
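The event grouping described above could be sketched roughly like this (the event names and structure are invented for illustration, not GIMP's or the paper's actual format): repeated parameter adjustments collapse into one step, and a commit closes the step out.

```python
def group_events(events):
    """Collapse a raw UI event stream into tutorial steps: later 'set'
    events for the same parameter (e.g. dragging a slider) overwrite
    earlier ones, and a 'commit' event finalizes the current step."""
    steps, params = [], {}
    for kind, name, value in events:
        if kind == "set":
            params[name] = value
        elif kind == "commit":
            steps.append((name, dict(params)))
            params = {}
    return steps

stream = [
    ("set", "brightness", 10),
    ("set", "brightness", 25),   # the user kept dragging the slider
    ("set", "contrast", 5),
    ("commit", "Brightness-Contrast", None),
]
steps = group_events(stream)
# One tutorial step remains, holding only the final slider values.
```

The point of the grouping is that a tutorial should show one screenshot per committed operation, not one per slider twitch.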

Allie - 10/26/2011 8:54:10

In "A Survey of Software Learnability: Metrics, Methodologies, and Guidelines", Grossman, Fitzmaurice, and Attar tried to establish how learnability should be defined, measured, and evaluated. In trying to understand what causes users not just to acquire new abilities, but to improve their usage behaviors by finding more efficient strategies, they introduced a taxonomy of learnability definitions. An AutoCAD user study was performed, where 4 purposefully high-level scenarios were carried out to cover the application's functionality. The categories used to analyze the results were 1) task flow, 2) awareness, 3) locating, 4) understanding, and 5) transition. Further, the coach was found to be useful, for he was able to identify inadequate behavior that a less experienced evaluator might not recognize.

In "Generating Photo Manipulation Tutorials by Demonstration", Grabler et al. instrumented the GNU Image Manipulation Program (GIMP) to record the changes users made during the user study. The findings were that participants using the generated tutorials performed 20-44% faster and reduced errors by 60-95% compared to video instructions.

It's interesting both articles are user studies performed in a visual context. Perhaps computation is most obvious when presented in a graphical manner.

Apoorva Sachdev - 10/26/2011 8:58:39

This reading was focused on Help and Tutorials and we covered two papers where one dealt more with the analysis of learnability of the system and another dealt with creating effective automated tutorials from demonstration.

Grossman et al. presented their survey of numerous papers related to learnability and claimed that no single definition or assessment criterion exists. Their aim was to provide a classification system and highlight some of the issues related to learnability. They described how evaluation methods can be broken down into two categories, formative and summative evaluation, where the former is used to learn about usability problems in an effort to improve the interface, while the latter is used either to compare the current system with another or to determine whether it meets requirements. They presented a new “question and suggestion” protocol in comparison to the traditional “think aloud” methodology. Although this approach is interesting, it seems a little removed from realistic scenarios, since in a realistic setting the user is usually problem solving alone and an expert is not present. However, using this approach one might be able to identify useful hints that could be provided in the help/tip section to make use of the application more efficient. I felt like the paper started off strong but fizzled out a little in the end, as I expected the authors to present a more elaborate framework for analyzing learnability in various interfaces and to describe a common definition/analysis methodology.

The second paper, “Generating Photo Manipulation Tutorials by Demonstration,” was very interesting. I think the approach it describes of automating the process of making tutorials not only makes the tutorials accurate and detailed but also makes their creation a lot faster. From personal experience, I prefer going through screenshots and seeing what steps are being taken rather than watching a video or reading a book; in such cases a picture (screenshot) really is worth a thousand words. As their studies showed, the generated tutorials help reduce users' error rates. I think macro generation would be a very useful application, as most new or moderate users would like to use template-style designs to modify and enhance their images rather than manually changing variables individually. However, a lot more work needs to be done to make the macros work on general cases; as is evident in the paper's eye-coloring example, the output is satisfactory with a light iris but not with dark ones.

Rohan Nagesh - 10/26/2011 9:00:47

This set of readings discussed the issues of learnability and tutorials in HCI. The first paper, "Generating Photo Manipulation Tutorials by Demonstration," discusses the authors' tool for automatically generating static picture-by-picture tutorials that demonstrate a particular image manipulation task. The second paper, "A Survey of Software Learnability," presents metrics, taxonomies, and an evaluation method for identifying learnability problems in software.

In the first reading, I was surprised to learn that people preferred static picture-by-picture tutorials over video-screencast tutorials. The paper mentioned this was due to the fact that videos often forced users to learn at the video's pace, whereas they could learn at their own pace with a picture-by-picture tutorial. Given that research, I agree with the authors' premise that picture-by-picture tutorials are painful to create and annotate, and I find a definite need for the authors' automation software.

As for the second reading, I found the question-suggestion protocol to be quite interesting. I agree that it appears to be more productive to the designer than the think-aloud method which I think is not focused on the task at hand enough. The instructions for the coach were a bit different between the two protocols, but the instructions for the participant differed the most, and I found that to be interesting.

Shiry Ginosar - 10/26/2011 9:10:16

These two papers discuss tutorials and learning from two different perspectives. Grossman et al. conduct a survey of previous work in order to consolidate the different definitions of learnability, and then proceed to propose a new evaluation mechanism for testing the learnability of a system. Grabler et al. develop a tutorial authoring system that aids experts in the process of capturing and distributing their knowledge.

In their paper, Grossman et al. describe a new method for evaluating learnability in a lab environment. This method relies on an expert working with the subject and providing suggestions when the subject is stuck. It is presented as useful for getting the subject to reach more parts of the interface in the allotted time and for helping the subject at the points that are especially hard for him. While this method indeed simulates a learning environment found in practice (when an expert is available as a mentor), it seems to me that it does not precisely model the learnability of the interface itself, which is what the authors claim to be testing. Is it not the case that with good mentorship even the most obscure interface can be learned? And on the other hand, is it not the case that an inadequate mentor can hinder the learning curve of a user of a well-designed interface? All in all, this method seems to be quite dependent on the expert used in the test.

Grabler et al. take a different approach to learnability. Rather than provide the user with suggestions at the points where he is stuck, they author a complete tutorial for a given task and expect the user to follow it precisely in order to perform that task. Moreover, they measure any discrepancy between the user's actions and the prescribed recipe as error in their user studies. This is a different and interesting approach, but it raises a question about the process of learning itself. What part of a human learning process derives from trial and error, from exploration, trying out different things and correcting our mistakes? While following a prescribed set of steps may lead the user to accomplish a task in the fewest steps, how much of this knowledge would be retained for next time, versus how much would have been learned if the user had been allowed to explore freely?